Use Case: Windows Explorer

SURVOL USECASE : WINDOWS EXPLORER™

This investigation scenario runs on a plain Windows machine, where the File Explorer program is always available. The goal is to retrieve as much information as possible about the internal behavior of a well-known executable. The images are taken from the static SVG output, which is more stable to copy and paste than a fully dynamic D3 display. This is a simple experiment on a process whose side effects and interactions with other resources are more or less known in advance. Note that no command has to be entered and no specific IT knowledge is necessary.

The first step is to find the process using the "Process tree" script (sources_types/enumerate_CIM_Process.py). Because the graph of processes and subprocesses is an SVG document, one can use the browser's usual "find" command to locate one or several processes running "explorer.exe", but it is also possible to scroll through the graph. This object belongs to the CIM class CIM_Process. (To be perfectly accurate, it is an instance of the class Win32_Process, which derives from CIM_Process.)
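The same lookup can be sketched in a few lines of Python. This is only an illustration of the idea, not the code of the "Process tree" script: the process rows below are hypothetical sample data, and on a live machine they would have to come from a source such as the Win32_Process WMI class.

```python
from collections import defaultdict

# Hypothetical snapshot of (pid, parent pid, executable name) rows.
SAMPLE_PROCESSES = [
    (4, 0, "System"),
    (620, 4, "smss.exe"),
    (1180, 620, "winlogon.exe"),
    (2304, 1180, "userinit.exe"),
    (2348, 2304, "explorer.exe"),
    (5120, 2348, "SnippingTool.exe"),
]


def build_children(rows):
    """Map each parent pid to the list of its child pids."""
    children = defaultdict(list)
    for pid, ppid, _name in rows:
        children[ppid].append(pid)
    return children


def find_by_name(rows, name):
    """Return the pids whose executable name matches, ignoring case."""
    return [pid for pid, _ppid, exe in rows if exe.lower() == name.lower()]


def ancestors(rows, pid):
    """Walk up the tree from pid and return the chain of known parent pids."""
    parent_of = {p: pp for p, pp, _name in rows}
    chain = []
    while pid in parent_of:
        parent = parent_of[pid]
        # Stop at an unknown parent, and guard against self-parenting or cycles.
        if parent not in parent_of or parent == pid or parent in chain:
            break
        chain.append(parent)
        pid = parent
    return chain


for pid in find_by_name(SAMPLE_PROCESSES, "explorer.exe"):
    print(pid, "ancestors:", ancestors(SAMPLE_PROCESSES, pid))
    print(pid, "children:", build_children(SAMPLE_PROCESSES)[pid])
```

The name comparison ignores case because Windows executable names are case-insensitive.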

Of course, the scripts related to Java processes are not worth running, as Explorer is written only in C. However, a process running a C program may load Java libraries, so in some situations these scripts are usable.

The scripts in the folder "Regex matching in heap" scan the process memory to detect specific strings, using regular expressions. Let's have a look at some of the HTTP URLs contained in the process memory.
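As a rough idea of what such a scan involves, here is a minimal sketch that applies a regular expression to a bytes buffer. The pattern and the sample buffer are assumptions made for illustration; they are not the expression Survol uses, and reading the memory of a live process is left out.

```python
import re

# Deliberately simple pattern: a scheme followed by a run of URL characters.
# Illustrative assumption only, not the expression used by Survol.
URL_PATTERN = re.compile(rb"https?://[A-Za-z0-9._~:/?#\[\]@!$&'()*+,;=%-]+")


def find_urls(buffer):
    """Return the distinct URLs found in a bytes buffer, in order of appearance."""
    seen = []
    for match in URL_PATTERN.finditer(buffer):
        url = match.group().decode("ascii")
        if url not in seen:
            seen.append(url)
    return seen


# Hypothetical memory chunk: binary noise around two ASCII URLs, one repeated.
chunk = (b"\x00\x01http://example.com/a\x00\xff\xfe"
         b"https://example.org/b?x=1\x00http://example.com/a\x00")
print(find_urls(chunk))
```

This sketch only sees single-byte ASCII text. Windows programs often keep strings as UTF-16, so a real scan would also need a pattern for that encoding.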

URLs do not have a class name in the CIM terminology. This is not a difficulty for Survol, which easily adds its own classes to the CIM ones: each Survol class is a Python module, with very few requirements.

The script displaying sockets does not return anything. On the other hand, the list of memory-mapped files is really huge; it would be nice to display them in a list instead of a graph. Several of them are related to TortoiseSVN and TortoiseGit, for example this library.

We will see why we think it is dynamically loaded. Memory-mapped files are different entities from plain files, because they convey more information, such as the amount of memory mapped, the addresses, etc. Still, there is a one-to-one correspondence with their data file names, which is why they have the same name.

It is interesting to see that this DLL is also opened by several other processes. But why does Snipping Tool access this DLL? This can only be an indirect load, because it is not part of the static dependencies of this executable. It is of course possible to click on each process or any other resource to get a better idea of possible interactions.

One can see the CIM_DataFile object associated with the mapped file. It has a standard CIM class.

Let's go back to Windows Explorer. Which files does it open? Unsurprisingly, cache files containing icons, if we believe the hint given by their names. At the moment, there is no script to decode these (they are not SQLite database files, despite their extensions).

As expected, we can see the file names contained in the process memory. We can find the files which are currently being displayed in the user interface and, of course, their names appear in the heap.

What is more interesting is that they apparently are not stored in a process cache, but simply reside in the operating system memory. A possible explanation is that Explorer uses the operating system cache.

By default, the script selects only the filenames whose files actually exist, so it is possible to click on each filename and investigate further, or simply display it if it has a MIME type. It is also possible to toggle this flag, in which case non-existent filenames are displayed too.
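The effect of this flag can be pictured with a small sketch. The function name and its only_existing parameter are hypothetical and chosen for this example; they are not taken from the Survol script.

```python
import os
import tempfile


def select_filenames(candidates, only_existing=True):
    """Deduplicate candidate paths; optionally keep only those present on disk."""
    selected = []
    for path in candidates:
        if path in selected:
            continue
        if only_existing and not os.path.isfile(path):
            continue
        selected.append(path)
    return selected


# Demonstration with one real temporary file and one made-up name.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    real_path = handle.name
candidates = [real_path, real_path + ".missing", real_path]

print(select_filenames(candidates))                       # existing file only
print(select_filenames(candidates, only_existing=False))  # both names
os.remove(real_path)
```

Filtering on existence removes most false positives, since many byte sequences that look like a path do not name a real file. Turning the filter off is still useful to spot files that have been deleted or that live on an unreachable drive.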

The logic of searching the process memory for specific patterns is very easy to extend to new expressions. At the moment, this can detect SQL queries, HTTP URLs, ODBC connection strings, file names and COM classes; supporting a new kind of string only requires defining its character pattern.
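One possible way to organise such an extensible scan is a table of named patterns, sketched below. The patterns are simplified assumptions written for this example (a basic SELECT statement, a GUID in braces, a DSN= fragment), not the expressions shipped with Survol, and the sample buffer is made up.

```python
import re

# Hypothetical pattern table: each entry is a name and a bytes regular expression.
PATTERNS = {
    "sql_query": re.compile(rb"(?i)select\s+[\w*,\s]+\s+from\s+\w+"),
    "com_class": re.compile(
        rb"\{[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}"),
}


def register(name, expression):
    """Add a new kind of string to detect: only its pattern is needed."""
    PATTERNS[name] = re.compile(expression)


def scan(buffer):
    """Return {pattern name: list of decoded matches} for a bytes buffer."""
    return {
        name: [m.group().decode("ascii", "replace") for m in rx.finditer(buffer)]
        for name, rx in PATTERNS.items()
    }


# Extending the scan to a (simplified) ODBC connection string fragment.
register("odbc_dsn", rb"(?i)\bDSN=[^;\x00]+")

# Hypothetical memory chunk with NUL-separated strings.
chunk = (b"\x00SELECT name, id FROM users\x00"
         b"{12345678-ABCD-4EF0-9876-0123456789AB}\x00"
         b"DSN=Sales;UID=x\x00")
for name, matches in scan(chunk).items():
    print(name, matches)
```

Keeping the patterns in one table means the scanning loop never changes when a new kind of string is added.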

Explorer.exe contains several icons. This is not very useful information in itself, but the methodology can be extended to extract any type of information from a CIM_DataFile object.

There is also a script which displays the complete graph of DLLs (Dynamic Link Libraries) and imported symbols from a PE (Portable Executable) file. This is quite heavy but exhaustive, and shows, for example, that the Tortoise library can only have been loaded dynamically, as it cannot be found here.
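The reasoning behind this conclusion is a set difference: any module mapped into the process that cannot be reached through the static import tables must have been loaded some other way. A minimal sketch follows, using hypothetical data. The import graph and the list of mapped modules are invented for the example, and "hypothetical_overlay.dll" stands in for the Tortoise library.

```python
def static_closure(imports, root):
    """Return every module reachable from root through the import tables."""
    reached = set()
    pending = [root]
    while pending:
        module = pending.pop()
        for dependency in imports.get(module, ()):
            if dependency not in reached:
                reached.add(dependency)
                pending.append(dependency)
    return reached


def dynamically_loaded(imports, root, mapped_modules):
    """Modules mapped in the process but absent from the static import graph."""
    return sorted(set(mapped_modules) - static_closure(imports, root) - {root})


# Hypothetical data; names are lowercased so comparisons ignore case.
IMPORTS = {
    "explorer.exe": ["kernel32.dll", "shell32.dll"],
    "shell32.dll": ["kernel32.dll", "user32.dll"],
    "user32.dll": ["gdi32.dll"],
}
MAPPED = ["explorer.exe", "kernel32.dll", "shell32.dll", "user32.dll",
          "gdi32.dll", "hypothetical_overlay.dll"]

print(dynamically_loaded(IMPORTS, "explorer.exe", MAPPED))
# ['hypothetical_overlay.dll']
```

This treats only the regular import tables as static dependencies. Delay-loaded imports are recorded in a separate PE table, so a more careful version would have to consider them as well.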

Each DLL listed in this diagram can be further browsed by clicking on it. Each symbol of each DLL can also be explored, because it is an object type with its own properties and scripts.

A future plan is to disassemble the code of each entry point to extract the call graph (internal jumps and calls to external functions), but this is not available yet.