Python Hash File Builder/Checker

From: Sam Baskinger
Date: 08/20/04

  • Next message: Miroslaw Slawek Chorazy: "RE: MS binary integrity baseline"
    Date: Fri, 20 Aug 2004 09:52:21 -0400

    A few folks asked about this, so inlined below is a short and simple
    Python script that hashes the files on a system and sticks the hashes
    in a simple text file. Works on Linux and Windows. (I love interpreted
    languages.)

    It currently uses SHA1 though it'll do MD5 with just a few tweaks.
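
    As a rough sketch of what the MD5 tweak could look like (this is not
    part of the script itself): assuming a Python that ships the standard
    hashlib module, the algorithm can be picked by name. The helper name
    hash_bytes is made up for illustration.

```python
import hashlib

def hash_bytes(data, algorithm="sha1"):
    """Return the hex digest of data using the named algorithm."""
    h = hashlib.new(algorithm)  # e.g. "sha1" or "md5"
    h.update(data)
    return h.hexdigest()

# Same input, two algorithms: only the name changes.
print(hash_bytes(b"abc"))
print(hash_bytes(b"abc", "md5"))
```

    A SHA1 digest is 40 hex characters and an MD5 digest is 32, so a hash
    file built with one algorithm will not verify against the other.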

    I do apologize because it isn't terribly well commented. All the
    configuration is in the form of global variables found at the top of the
    script.

    Hope folks find this useful. USE AT YOUR OWN RISK.


    import sys # for argv
    import os # for files
    import hashlib # for hashes (replaces the old "sha" module)
    import stat
    from stat import *
    # Case sensitive list of names and paths that will not be hashed.
    exceptions = [ "/dev" ]
    # Case insensitive list of names and paths that will not be hashed.
    # All entries should be lowercase.
    CIexceptions = [ "tmp", "temp" ]
    # Directories that we will try to hash.
    # Note that there are both Windows and Unix directories here.
    defaultDirectories = [ "C:\\WINDOWS",
                           "C:\\Program Files",
                           "C:\\Documents and Settings" ]
    defaultLog = "hashes.log"
    defaultDB = "hashes.db"
    # Report when a file hash matches.
    # Often we want explicit confirmation but other times we just want
    # to see the problems.
    reportOK = False
    verbose = True
    # This is how we hash a file! :D
    def hashFile(name):
        try:
            hash = hashlib.sha1()
            # Binary mode, so Windows does not translate line endings.
            with open(name, "rb") as f:
                # The original read size was lost; 64 KB is an assumed value.
                d = f.read(65536)
                while(len(d) > 0):
                    hash.update(d)
                    d = f.read(65536)
            result = hash.hexdigest()
        except IOError:
            # Unreadable files are recorded with a marker instead of a hash.
            result = "IOError"
        return result
    def hashTree(root, output):
        try:
            # If the path is not in the exceptions lists.
            if root.lower() not in CIexceptions and root not in exceptions:
                # We use lstat so that we don't follow links to directories.
                if stat.S_ISDIR(os.lstat(root)[ST_MODE]):
                    for d in os.listdir(root):
                        # Don't hash exceptions if the name matches.
                        if d.lower() not in CIexceptions and d not in exceptions:
                            hashTree(os.path.join(root, d), output)
                # We only hash regular files.
                elif stat.S_ISREG(os.lstat(root)[ST_MODE]):
                    if verbose:
                        sys.stderr.write("ADDING: "+root+"\n")
                    h = hashFile(root)
                    output.write(h+" "+root+"\n")
                    if verbose:
                        sys.stderr.write("ADDED: "+h+" "+root+"\n")
                elif verbose:
                    sys.stderr.write("SKIPPED: "+root+"\n")
        except OSError:
            # lstat or listdir failed (permissions, vanished path, ...).
            output.write("TreeError "+root+"\n")
    def verifyTree(input, output):
        line = input.readline()
        while(len(line) > 0):
            # Split on the first space only; file names may contain spaces.
            [origHash, fileName] = line.split(' ',1)
            # Drop the trailing newline.
            fileName = fileName[0:-1]
            if(not os.path.isfile(fileName)):
                output.write("MISSING: "+origHash+" "+fileName+"\n")
            else:
                newHash = hashFile(fileName)
                if(origHash != newHash):
                    output.write("MISMATCH: "+ origHash + " " + newHash + " "
                                 + fileName + "\n")
                elif reportOK:
                    output.write("OK: "+origHash+" "+fileName+"\n")
            line = input.readline()
    def buildDB(dirs = defaultDirectories, dbfile=defaultDB):
        # "-" means write the hashes to standard output.
        if(dbfile == "-"):
            f = sys.stdout
        else:
            f = open(dbfile, "w+")
        for t in dirs:
            hashTree(t, f)
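    def checkDB(dbfile = defaultDB, logfile = defaultLog):
        # "-" means read the hashes from standard input.
        if(dbfile == "-"):
            db = sys.stdin
        else:
            db = open(dbfile, "r")
        # "-" means write the report to standard output.
        if(logfile == "-"):
            log = sys.stdout
        else:
            log = open(logfile, "w+")
        verifyTree(db, log)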
    def usage():
        print("Usage: "+sys.argv[0]+
              " [ mk [db [directories...]] | ch [log [db]] ]")
    if(len(sys.argv) >= 2):
        if(sys.argv[1] == "mk"):
            if(len(sys.argv) == 2):
                buildDB()
            elif(len(sys.argv) == 3):
                buildDB(dbfile = sys.argv[2])
            elif(len(sys.argv) > 3):
                buildDB(dbfile = sys.argv[2], dirs=sys.argv[3:])
        elif(sys.argv[1] == 'ch'):
            if(len(sys.argv) == 2):
                checkDB()
            elif(len(sys.argv) == 3):
                checkDB(logfile=sys.argv[2])
            elif(len(sys.argv) == 4):
                checkDB(logfile=sys.argv[2], dbfile=sys.argv[3])
            else:
                usage()
        else:
            usage()
    else:
        usage()
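
    For anyone post-processing the hash file: each line is the digest (or an
    error marker such as IOError), one space, then the path. The split has
    to stop at the first space, otherwise paths like C:\Program Files fall
    apart. A minimal sketch of reading one line back; parse_db_line and the
    sample path are hypothetical and not part of the script.

```python
def parse_db_line(line):
    """Split one 'digest path' line into a (digest, path) tuple."""
    # maxsplit=1 keeps any spaces inside the path intact.
    digest, path = line.rstrip("\n").split(" ", 1)
    return digest, path

sample = "a9993e364706816aba3e25717850c26c9cd0d89d C:\\Program Files\\app.exe\n"
digest, path = parse_db_line(sample)
print(digest)
print(path)
```

    Because the first field may be a marker rather than a hex digest,
    anything consuming the file should check for that before comparing.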

