FTK 2.0 - Performance

I've just completed another dry run of FTK 2.0: preprocessing a 256 MB thumb drive produced a full-text index of more than 3 GB and filled about 200 MB of table space in the Oracle database. However, the whole operation took more than 4 hours! So let's take a closer look at the process and see what exactly is so time-consuming.
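To put these figures into perspective, here is a small back-of-the-envelope calculation in Python. It uses only the numbers quoted above; since the index was "more than" 3 GB and the run took "more than" 4 hours, the index ratio is a lower bound and the throughput an upper bound. The function name is made up for this sketch.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def preprocessing_stats(evidence_bytes, index_bytes, db_bytes, seconds):
    """Relate index size, table space and run time to the evidence size."""
    return {
        "index_ratio": index_bytes / evidence_bytes,
        "db_ratio": db_bytes / evidence_bytes,
        "throughput_kb_s": evidence_bytes / 1024 / seconds,
    }


# Figures from the dry run: 256 MB evidence, >3 GB index, ~200 MB tables, >4 h
stats = preprocessing_stats(256 * MB, 3 * GB, 200 * MB, 4 * 3600)

print(f"index is at least {stats['index_ratio']:.0f}x the evidence size")
print(f"table space is about {stats['db_ratio']:.2f}x the evidence size")
print(f"effective throughput is at most {stats['throughput_kb_s']:.1f} KB/s")
```

In other words, the index is at least twelve times larger than the evidence, and the evidence is processed at no more than roughly 18 KB per second.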

In contrast to my other experiments, there was no shortage of main memory this time. CPU utilization didn't reach its possible maximum either. During the second half of the preprocessing the CPU was mostly idle, while one of the hard drives was rattling frantically.

So this time mass storage is the bottleneck. But what exactly is the cause and is there a workaround?

In general I'm trying to distribute concurrent operations among different hard drives (spindles, not just volumes). So, for example, data will be read from one disk, processed and the results then written to a different drive.

Therefore my analysis workstation is usually configured as follows:

  • drive 1: operating system, temporary files of the OS and page file
  • drive 2: forensic images
  • drive 3: case file (EnCase), exported files, database (FTK 2)


During the first phase everything works as planned. FTK reads the images from the second drive, stores the results in the Oracle DB on the third drive and, if there's a shortage of main memory, the page file grows on the first drive.

But then the second phase starts. Again FTK reads the images from the second drive. But then it extracts some data into a temporary directory on the first drive! There dtSearch picks the data up and compiles it into its full-text index. Fragments of the index are merged periodically. Finally the complete index is copied into C:\ftk-data. This directory is also accessible through a network share named "ftk-data". (Access permissions will be discussed in a later post. I've opened a ticket with AccessData's support but haven't received a response yet.)

As can be seen in Windows XP's performance monitor applet (perfmon.msc), concurrent reads and writes make the disk command queue grow larger than usual.
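If you'd rather compare the queue lengths of several drives in numbers than by eyeballing the graph, a counter log can be averaged per disk. The following is a minimal sketch, assuming the "Avg. Disk Queue Length" counters were logged to a CSV file whose first column is the timestamp and whose remaining columns hold one counter each. The host name, drive letters and sample values below are synthetic placeholders, not measurements from this run.

```python
import csv
import io


def mean_queue_lengths(csv_text):
    """Return the mean of every counter column in a counter-log CSV."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, samples = rows[0], rows[1:]
    means = {}
    for col in range(1, len(header)):      # column 0 is the timestamp
        values = []
        for row in samples:
            try:
                values.append(float(row[col]))
            except (ValueError, IndexError):
                continue                   # skip blank or missing samples
        if values:
            means[header[col]] = sum(values) / len(values)
    return means


# Synthetic sample in the assumed layout (hypothetical host "WS")
SAMPLE = "\n".join([
    r'"(PDH-CSV 4.0)","\\WS\PhysicalDisk(0 C:)\Avg. Disk Queue Length",'
    r'"\\WS\PhysicalDisk(1 D:)\Avg. Disk Queue Length"',
    '"01/01/2008 12:00:00.000","4.0","0.5"',
    '"01/01/2008 12:00:01.000","6.0","0.5"',
])

for counter, mean in mean_queue_lengths(SAMPLE).items():
    print(f"{mean:5.2f}  {counter}")
```

A drive whose mean stays far above the others during the indexing phase is the one that serves too many concurrent requests.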

One would like to untangle the concurrent operations by directing them to distinct spindles. For "ftk-data" there's a UNC name stored somewhere in the registry. So in theory one should be able to direct that data even onto a different machine. Unfortunately this option gets reset to the local machine name every time the "AccessData Database Monitor" (service_db.exe) starts. Possibly this option is reserved for the Lab and Professional versions of FTK, which have already been announced by AccessData.

So your only option is to configure the temporary directory used by the FTK application. However, you can't untangle concurrent drive access during the creation of the full-text index. Therefore the only way to reduce processing time is to make use of the various "Index refinement" options. You can define them for the case as a whole and also individually for every evidence file. In the latter case you can also filter on file size and on a certain time interval.

Index refinement options help to cut down on index size and preprocessing time.
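To illustrate why such filters save time, here is a minimal sketch of a size and date filter in Python. This is not FTK's implementation or API; the class names, fields and sample files are hypothetical and only model the idea that every file excluded up front never reaches the indexer.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class FileEntry:
    name: str
    size: int              # size in bytes
    modified: datetime     # last-modified timestamp


@dataclass
class Refinement:
    """Hypothetical refinement rule: size limit plus time interval."""
    max_size: Optional[int] = None
    not_before: Optional[datetime] = None
    not_after: Optional[datetime] = None

    def should_index(self, entry: FileEntry) -> bool:
        if self.max_size is not None and entry.size > self.max_size:
            return False
        if self.not_before is not None and entry.modified < self.not_before:
            return False
        if self.not_after is not None and entry.modified > self.not_after:
            return False
        return True


# Made-up files, for illustration only
files = [
    FileEntry("report.doc", 120_000, datetime(2007, 11, 5)),
    FileEntry("backup.zip", 90_000_000, datetime(2007, 11, 6)),
    FileEntry("old_notes.txt", 4_000, datetime(2003, 2, 1)),
]

rule = Refinement(max_size=10_000_000, not_before=datetime(2007, 1, 1))
selected = [f.name for f in files if rule.should_index(f)]
print(selected)
```

Here only report.doc would be handed to the indexer: the archive is too large and the notes fall outside the time interval.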

I really like FTK for the dtSearch engine. Ironically it's the integration of this engine that causes so much trouble.

Imprint

This blog is a project of:
Andreas Schuster
Im Äuelchen 45
D-53177 Bonn
impressum@forensikblog.de

Copyright © 2005-2012 by
Andreas Schuster
All rights reserved.