Big file diffing with DarunGrim


One of the challenges in patch analysis is diffing big files. The definition of a big file varies, but usually we are talking about files larger than a few megabytes. Windows system32 files are usually relatively small. (Figure 1)


Figure 1 Usual Windows dll files
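If you want to check quickly which files in a folder count as "big" before diffing, a small script can list them by size. This is only a sketch: the function name `find_big_files` and the 5MB default threshold are my own assumptions for illustration, not part of DarunGrim.

```python
from pathlib import Path


def find_big_files(folder, min_bytes=5 * 1024 * 1024, pattern="*.dll"):
    """Return (size, path) pairs for files of at least min_bytes, largest first.

    The 5MB default is an arbitrary stand-in for "a few megabytes".
    """
    hits = []
    for path in Path(folder).glob(pattern):
        if path.is_file():
            size = path.stat().st_size
            if size >= min_bytes:
                hits.append((size, path))
    # Sort on size only, so the largest candidates come first.
    return sorted(hits, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Example folder; point this at wherever your target binaries live.
    for size, path in find_big_files(r"C:\Windows\System32"):
        print(f"{size / (1024 * 1024):8.1f} MB  {path.name}")
```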


One of the main issues with big file diffing is memory usage. DarunGrim keeps a lot of internal data structures in memory during the whole process. When memory usage goes over what the process can handle, you might see DarunGrim crash with a memory error.


Figure 2 Memory and address space limits (source:

Figure 2 shows the memory limits depending on process type (64-bit or 32-bit) and Windows version. You can see that by using the IMAGE_FILE_LARGE_ADDRESS_AWARE option at build time, you can increase the memory limit for 32-bit processes, and that 64-bit processes can have up to 128TB of memory on some Windows flavors. So with this pre-alpha release, we set the IMAGE_FILE_LARGE_ADDRESS_AWARE option on all binaries and added a DarunGrim command line tool with 64-bit support. You can download the new distribution from the following link.


So just by using the new binaries, you get the benefit of a larger memory limit (up to 4GB for 32-bit processes on 64-bit Windows). But some binaries are so huge that you still need more memory, which you can get by using the 64-bit version of DarunGrimC.exe. The main GUI program (DarunGrim.exe) depends on some third-party binaries without 64-bit support. That is one of the reasons why we separated the main logic into DarunGrimC.exe and built a 64-bit version of the core command line tool.
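To confirm which kind of binary you are actually running, you can read the flags straight from the PE header. The sketch below reads the COFF header's Machine and Characteristics fields and checks the IMAGE_FILE_LARGE_ADDRESS_AWARE bit (0x0020). The function names `read_pe_info` and `inspect_file` are hypothetical helpers of mine, and reading only the first 4KB assumes the PE header sits near the start of the file, which is typical but not guaranteed.

```python
import struct

IMAGE_FILE_LARGE_ADDRESS_AWARE = 0x0020  # bit in COFF Characteristics
MACHINE_NAMES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}


def read_pe_info(data):
    """Parse the COFF header out of the leading bytes of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # COFF header: Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader,
    # Characteristics (20 bytes in total).
    fields = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    machine, characteristics = fields[0], fields[6]
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "large_address_aware": bool(
            characteristics & IMAGE_FILE_LARGE_ADDRESS_AWARE
        ),
    }


def inspect_file(path, header_bytes=4096):
    """Read the start of a file and report its PE flags."""
    with open(path, "rb") as handle:
        return read_pe_info(handle.read(header_bytes))
```

Calling `inspect_file` on the 32-bit and 64-bit copies of DarunGrimC.exe should show the difference in the "machine" field, while "large_address_aware" tells you whether a 32-bit binary can use the larger limit.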



I used this process for an analysis on my blog. First, the unpatched and patched binaries are shown in Figure 3. Each binary is more than 18MB, which is huge compared to normal files. We all know that Office binaries are usually huge.


Figure 3 Diffing target files

When you follow the instructions from my previous DarunGrim blog post, you will have two DGF files generated. (Figure 4)


Figure 4 DGF files

Now, instead of running DarunGrim.exe, you can use the DarunGrimC.exe command line tool. From the folder where the two DGF files exist, run a command line like the following. After the -f option, put the unpatched and patched DGF file names, followed by the output diff file name.


"C:\Program Files (x86)\DarunGrim4\x64\DarunGrimC.exe" -f "wwlib-14.0.7113.5001.dgf" "wwlib-14.0.7121.5004.dgf" "wwlib-14.0.7113.5001-14.0.7121.5004-diff.dgf"
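If you diff many file pairs, the command above can be wrapped in a short script. This is a sketch under assumptions: the helper names `build_diff_command` and `run_diff` are mine, the install path is the one from the command above, and only the -f option shown in this post is used. How DarunGrimC.exe reports failure through its exit code is not covered here, so the script just returns whatever code it gets.

```python
import subprocess
import time

# Install path taken from the command line above; adjust for your machine.
DARUNGRIMC_X64 = r"C:\Program Files (x86)\DarunGrim4\x64\DarunGrimC.exe"


def build_diff_command(unpatched_dgf, patched_dgf, output_dgf,
                       exe=DARUNGRIMC_X64):
    """Build the argument list: exe -f <unpatched> <patched> <output>."""
    return [exe, "-f", unpatched_dgf, patched_dgf, output_dgf]


def run_diff(unpatched_dgf, patched_dgf, output_dgf, workdir=".",
             exe=DARUNGRIMC_X64):
    """Run the diff from workdir and return (exit_code, elapsed_seconds)."""
    command = build_diff_command(unpatched_dgf, patched_dgf, output_dgf, exe)
    started = time.monotonic()
    # Passing a list avoids quoting problems with spaces in the paths.
    completed = subprocess.run(command, cwd=workdir)
    return completed.returncode, time.monotonic() - started


if __name__ == "__main__":
    code, seconds = run_diff(
        "wwlib-14.0.7113.5001.dgf",
        "wwlib-14.0.7121.5004.dgf",
        "wwlib-14.0.7113.5001-14.0.7121.5004-diff.dgf",
    )
    print(f"exit code {code}, took {seconds / 60:.1f} minutes")
```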


It will take some time to finish the whole analysis, and if everything goes well, you will get a diff file like the one in Figure 5.


Figure 5 Diff file


I tested on a machine with the spec shown in Figure 6 (2.30GHz, 16GB physical memory), and the analysis took about one hour and forty minutes. (Figure 7)

Figure 6 Tested system




Figure 7 Test result


You can double-click the diff file and the GUI program (DarunGrim.exe) will display the results. Everything else is the same except the diffing process. I might come up with a 64-bit GUI (DarunGrim.exe) sometime later, when all the dependency issues are figured out. The memory footprint issue will also be worked on in the future, but for now the 64-bit command line DarunGrimC.exe is the best option for binary diff analysis on big binary files.