Authenticating a Reconstructed Binary

As previously noted, a binary reconstructed from a memory dump may not match the original file on disk. This raises the question of how hash creation and file authentication procedures must change in order to cope with such binaries.

As things stand, the process to authenticate a binary recovered from memory is cumbersome:

  1. reconstruct the binary from memory
  2. hash the sections known to be invariant
  3. locate the originating file on disk
  4. hash the same sections in the originating file
  5. compare the hashes
  6. proceed if the hashes match, else give up
  7. hash the complete originating file
  8. search for the hash in a database of known files
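
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sections are assumed to be already extracted into dictionaries that map a section name to its bytes, the list of invariant sections is supplied by the caller, and SHA-1 merely stands in for whatever algorithm the hash database uses. All function and parameter names are hypothetical.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as a hex string."""
    return hashlib.sha1(data).hexdigest()


def hash_invariant(sections: dict, invariant: list) -> dict:
    """Hash only the sections assumed to be invariant (steps 2 and 4)."""
    return {name: sha1_hex(sections[name])
            for name in invariant if name in sections}


def authenticate(recon_sections, disk_sections, disk_file,
                 known_hashes, invariant):
    """Walk through steps 2 to 8 for an already reconstructed binary.

    Returns "mismatch" if the invariant sections differ, "unknown" if
    the originating file is not in the database, otherwise "known".
    """
    recon = hash_invariant(recon_sections, invariant)
    disk = hash_invariant(disk_sections, invariant)
    # Steps 5 and 6: compare, give up on any difference.
    if not recon or recon != disk:
        return "mismatch"
    # Step 7: hash the complete originating file.
    file_hash = sha1_hex(disk_file)
    # Step 8: look it up in the set of known file hashes.
    return "known" if file_hash in known_hashes else "unknown"
```

Note that the database lookup still depends on the file on disk; the reconstructed binary alone never reaches step 8.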

So the ability to reconstruct binaries from memory should have some impact on hashing and hash databases.

The first question is what to hash. So far only complete files are hashed, which is unsuitable for authenticating a reconstructed binary, as shown earlier.

The most flexible solution would be to hash the PE header and every single section. However, that would require a slight change to the format of hash distributions. Until now, most hash sets have come as CSV (comma-separated values) text files with a fixed number of values per record. Every record would then be extended with the number of pairs to follow and a list of (section name, hash value) pairs. As this needs to be done for every supported hash algorithm, it would significantly inflate the distribution.
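
To make the extended record concrete, here is a minimal sketch of how such a line could be built and read back. The layout is an assumption for illustration only: two fixed fields (file name and whole-file hash), then the pair count, then the pairs. The label "HEADER" for the PE header and the function names are hypothetical, and only SHA-1 is shown.

```python
import csv
import hashlib
import io


def build_record(fixed_fields, parts):
    """Append a pair count and (name, hash) pairs to the fixed fields.

    parts is a list of (name, data) tuples, e.g. the PE header
    followed by every section in section table order.
    """
    row = list(fixed_fields)
    row.append(str(len(parts)))
    for name, data in parts:
        row.extend([name, hashlib.sha1(data).hexdigest()])
    return row


def parse_record(row, n_fixed):
    """Split a record into its fixed fields and its (name, hash) pairs."""
    count = int(row[n_fixed])
    flat = row[n_fixed + 1:n_fixed + 1 + 2 * count]
    if len(flat) != 2 * count:
        raise ValueError("record is shorter than its pair count claims")
    return row[:n_fixed], list(zip(flat[0::2], flat[1::2]))


def to_csv_line(row):
    """Serialize one record as a single CSV line."""
    buf = io.StringIO()
    csv.writer(buf).writerow(row)
    return buf.getvalue()
```

A record for a file with a header and two sections then carries seven extra values instead of one. With several hash algorithms the pair list would have to be repeated or widened for each of them, which is where the growth in size comes from.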

Another solution would require a consensus within the forensic community on which sections to hash and which to skip. There could be a "whitelist" of sections to hash or a "blacklist" of sections not to hash. In either case there are uncommon names to be dealt with, such as the section names used by the various EXE packers (e.g. ASPack and UPX). This approach requires only one additional entry per hash algorithm and per record. On the other hand, compatibility issues will arise as soon as the consensus list of hashed sections has to be changed.
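
The list-based approach could look roughly like the following sketch, which produces one combined hash per file. The sections are assumed to be given as (name, data) tuples in section table order, and the example lists are made up; no agreed list exists.

```python
import hashlib


def combined_hash(sections, whitelist=None, blacklist=None):
    """Hash all selected sections into a single SHA-1 value.

    With a whitelist, only listed sections are hashed.
    With a blacklist, every section except the listed ones is hashed.
    """
    digest = hashlib.sha1()
    for name, data in sections:
        if whitelist is not None and name not in whitelist:
            continue
        if blacklist is not None and name in blacklist:
            continue
        digest.update(data)
    return digest.hexdigest()


# A UPX-packed file uses section names such as UPX0 and UPX1.
packed = [("UPX0", b""), ("UPX1", b"packed code"), (".rsrc", b"icons")]
by_whitelist = combined_hash(packed, whitelist={".text", ".rdata"})
by_blacklist = combined_hash(packed, blacklist={".rsrc"})
```

The packed example shows the problem with uncommon names: under the assumed whitelist none of the sections is selected, so the result is just the hash of empty input, while the blacklist variant does cover the packed code. Changing either list changes every hash computed with it, which is the compatibility issue mentioned above.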

Despite its higher storage demands, I'm still in favour of the flexible solution, that is, to hash the header and every single section.



This blog is a project of:
Andreas Schuster
Im Äuelchen 45
D-53177 Bonn

Copyright © 2005-2012 by
Andreas Schuster
All rights reserved.