Big file diffing with DarunGrim

 

One of the challenges with patch analysis is diffing big files. The definition of a big file varies, but here it usually means a file larger than a few megabytes. Windows system32 files are usually relatively small (Figure 1).

 

Figure 1 Usual Windows dll files

 

One of the main issues with big file diffing is memory usage. DarunGrim keeps many internal data structures in memory during the whole process. When memory usage exceeds what the process can handle, DarunGrim may crash with a memory error.

 

Figure 2 Memory and address space limits (source: http://msdn.microsoft.com/en-us/library/windows/desktop/aa366778(v=vs.85).aspx#memory_limits)

Figure 2 shows the memory limits depending on process type (64-bit or 32-bit) and Windows version. By setting the IMAGE_FILE_LARGE_ADDRESS_AWARE option when building, you can raise the memory limit for 32-bit processes, and a 64-bit process can use up to 128TB of memory on some Windows flavors. So with this pre-alpha release, we enabled the IMAGE_FILE_LARGE_ADDRESS_AWARE option on all binaries and added a DarunGrim command line tool with 64-bit support. You can download the new distribution from the following link.

DarunGrim4Setup.exe

So just by using the new binaries, you get the benefit of a larger address space (up to 4GB on 64-bit Windows). But some binaries are so huge that you still need more memory, which you can get by using the 64-bit version of DarunGrimC.exe. The main GUI program (DarunGrim.exe) depends on some 3rd party binaries that lack 64-bit support. That is one of the reasons why we separated the main logic into DarunGrimC.exe and built a 64-bit version of the core command line tool.
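To see which of the installed binaries actually carry the large-address-aware flag, you can read it straight from the PE header. The following is a minimal Python sketch, not part of DarunGrim: the function names are made up for illustration, and it only relies on the documented PE/COFF layout (the e_lfanew field at offset 0x3C, and the Characteristics field of the COFF header, where IMAGE_FILE_LARGE_ADDRESS_AWARE is bit 0x0020).

```python
import struct

IMAGE_FILE_LARGE_ADDRESS_AWARE = 0x0020  # bit in the COFF Characteristics field
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64"}


def read_pe_info(data: bytes) -> dict:
    """Return the machine type and large-address-aware flag of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset 0x3C in the DOS header) points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 20-byte COFF header follows the signature: Machine is its first
    # field and Characteristics its last one (18 bytes into the header).
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    (characteristics,) = struct.unpack_from("<H", data, pe_offset + 22)
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "large_address_aware": bool(characteristics & IMAGE_FILE_LARGE_ADDRESS_AWARE),
    }


def read_pe_info_from_file(path: str) -> dict:
    """Convenience wrapper (hypothetical helper): inspect a file on disk."""
    with open(path, "rb") as f:
        # The headers sit at the start of the file; 4 KB is normally enough.
        return read_pe_info(f.read(4096))
```

For example, calling read_pe_info_from_file on the 32-bit DarunGrimC.exe from this release should report large_address_aware as True if the option was applied as described above.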

 

Example

I used this process for an analysis on my blog. First, the unpatched and patched binaries are shown in Figure 3. Each binary is more than 18MB, which is huge compared to other normal files. Office binaries are usually huge.

 

Figure 3 Diffing target files

When you follow the instructions from my previous DarunGrim blog post, you will have two DGF files generated (Figure 4).

 

Figure 4 DGF files

Now, instead of running DarunGrim.exe, you can use the DarunGrimC.exe command line tool. From the folder where the two DGF files exist, run a command line like the following. After the -f option, put the unpatched and patched DGF file names, followed by the output diff file name.

 

"C:\Program Files (x86)\DarunGrim4\x64\DarunGrimC.exe" -f "wwlib-14.0.7113.5001.dgf" "wwlib-14.0.7121.5004.dgf" "wwlib-14.0.7113.5001-14.0.7121.5004-diff.dgf"
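If you diff many file pairs, the same invocation can be scripted. The following is a rough Python sketch under a few assumptions: the install path is the one from the command above, the only option used is -f as described in this post, and the function names are made up for illustration. Whether DarunGrimC.exe returns a non-zero exit code on failure is not something this post confirms, so treat the error check as best effort.

```python
import subprocess
from pathlib import Path

# Install path taken from the command line above; adjust it for your setup.
DARUNGRIMC_X64 = r"C:\Program Files (x86)\DarunGrim4\x64\DarunGrimC.exe"


def build_diff_command(unpatched_dgf, patched_dgf, output_dgf, exe=DARUNGRIMC_X64):
    """Build the argument list: -f <unpatched DGF> <patched DGF> <output diff>."""
    return [exe, "-f", str(unpatched_dgf), str(patched_dgf), str(output_dgf)]


def run_diff(work_dir, unpatched_dgf, patched_dgf, output_dgf):
    """Run the 64-bit DarunGrimC.exe from the folder that holds the DGF files."""
    cmd = build_diff_command(unpatched_dgf, patched_dgf, output_dgf)
    # check=True raises CalledProcessError on a non-zero exit code; this only
    # helps if the tool actually reports failures that way (an assumption).
    subprocess.run(cmd, cwd=work_dir, check=True)
    return Path(work_dir) / output_dgf
```

Passing the arguments as a list avoids quoting problems with the space in "Program Files (x86)".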

 

It will take some time to finish the whole analysis, and if everything goes well, you will get a diff file like the one in Figure 5.

 

Figure 5 Diff file

 

I tested on the machine shown in Figure 6 (2.30GHz CPU, 16GB of physical memory), and the analysis took about 1 hour 40 minutes to finish (Figure 7).

Figure 6 Tested system

 

 

 

Figure 7 Test result

 

You can double-click the diff file and the GUI program (DarunGrim.exe) will display the results. Everything else is the same except for the diffing process. I might come up with a 64-bit GUI (DarunGrim.exe) sometime later, when all the dependency issues are figured out. The memory footprint issue will also be worked on in the future, but for now the 64-bit command line DarunGrimC.exe is the best option for binary diff analysis on big binary files.
