Docker® is a tool that creates lightweight, portable and self-sufficient containers for applications, allowing them to run on another infrastructure, such as a cloud. Dockerizing an application, or an entire information system, means converting it into a Docker container image, to run within a Docker container. A Docker container image is a stand-alone software package including everything needed to run its application: code, run-time, tools, libraries, settings.
Dockerization is described by a Dockerfile, a text document in the format specified by Docker, which contains all the commands needed to assemble a Docker container image. To deploy the application on another infrastructure, only the image is needed.
Dockerizing requires knowledge of the internal structure of the application, or of the set of applications to dockerize. It requires knowing which programs are running, their data files, their various connections, etc. However, a strategic application may have a long technical history, made of many different technologies, undocumented details and complex features. Therefore, estimating the cost of dockerization is difficult because it needs expert knowledge in many technical domains.
DockIT is an open source tool from Primhill Computer that helps convert legacy applications into Docker container images. It is a Python command-line tool which monitors running applications and their inputs/outputs, and generates a Dockerfile containing all the commands needed by Docker to build the container image of these applications.
DockIT does not need the source code. It can work with any binary program, whatever its programming language. It detects all resources, languages and libraries created or accessed by an application and any of its sub-components, during the execution of a command, or when attaching to a running batch, without stopping it.
The kinds of resources detected include, for example, executables, loaded libraries, files read, written or created, and sub-processes.
Documentation or technical expertise is no longer necessary to obtain an accurate description of all IT resources needed by the target application. The result is a standard, documented Dockerfile, with extra information added as comments. This file can be edited to adjust to specific needs.
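To make the idea of a documented Dockerfile concrete, here is a minimal sketch in Python of how a list of detected files could be rendered as a commented skeleton. It is not DockIT's actual code: the function name `render_dockerfile`, the shape of its arguments and the choice of `COPY` instructions are assumptions made for illustration, and the build context is assumed to mirror the file system root.

```python
import json


def render_dockerfile(base_image, accessed_files, command):
    """Render a commented Dockerfile skeleton (illustrative, hypothetical helper).

    accessed_files maps an absolute path to a short note on how it was used.
    """
    lines = [
        "# Skeleton built from one monitored execution (illustration only)",
        f"FROM {base_image}",
        "",
    ]
    for path, usage in sorted(accessed_files.items()):
        # The note becomes a comment, so the file documents itself.
        lines.append(f"# {usage}")
        # COPY sources are relative to the build context, hence lstrip("/").
        lines.append(f"COPY {path.lstrip('/')} {path}")
    lines.append("")
    # Exec form of CMD is a JSON array.
    lines.append("CMD " + json.dumps(list(command)))
    return "\n".join(lines) + "\n"
```

Calling `render_dockerfile("ubuntu:22.04", {"/opt/app/run.sh": "script read at start-up"}, ["/opt/app/run.sh"])` returns a text that can then be edited by hand, which is the workflow described above.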
You do not have to spend days studying an application to know what is needed to dockerize it: you just need to monitor it briefly with DockIT.
When a user is discovering a legacy application, DockIT gives the significant advantage of showing the overall scenario of its execution. This makes software design recovery much simpler.
DockIT is a command-line tool, which can be used in two scenarios: starting a command and monitoring its execution, or attaching to an already running process.
Although DockIT noticeably slows down the execution of the target process, it remains usable in a production context, because only some system calls are monitored.
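As an illustration of what monitoring only some system calls can look like, the following Python sketch filters lines in the usual strace output format and keeps the paths of files that were successfully opened or executed. It is a simplified example, not DockIT's parser: the set of monitored calls, the regular expressions and the function name `opened_files` are assumptions.

```python
import re

# Only these calls are inspected; everything else in the trace is skipped.
MONITORED_CALLS = {"open", "openat", "execve"}

# Optional "[pid 123] " or "123 " prefix, then name(args) = return value.
_CALL = re.compile(
    r'^(?:\[pid\s+\d+\]\s+|\d+\s+)?(?P<name>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>-?\d+)'
)
# First double-quoted argument, allowing backslash escapes.
_PATH = re.compile(r'"((?:[^"\\]|\\.)*)"')


def opened_files(trace_lines):
    """Return, in order of first appearance, paths opened or executed successfully."""
    found = []
    for line in trace_lines:
        match = _CALL.match(line)
        if not match or match.group("name") not in MONITORED_CALLS:
            continue
        if int(match.group("ret")) < 0:
            continue  # failed call, e.g. ENOENT
        path = _PATH.search(match.group("args"))
        if path and path.group(1) not in found:
            found.append(path.group(1))
    return found
```

Restricting the work to a small set of calls is what keeps the overhead bounded: lines such as `read(...)` or signal notifications are discarded after a single pattern match.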
There are many command-line options:
-h,--help                      This message.
-v,--verbose                   Verbose mode (Cumulative).
-w,--warning                   Display warnings (Cumulative).
-s,--summary <CIM class>       Prints a summary at the end: Start and end time stamps, executable name,
                               loaded libraries, read/written/created files and timestamps, subprocesses tree.
                               Examples: -s 'Win32_LogicalDisk.DeviceID="C:",Prop1="Value1",Prop2="Value2"'
                                         -s 'CIM_DataFile:Category=["Others","Shared libraries"]'
-D,--dockerfile                Generates a dockerfile.
-p,--pid <pid>                 Monitors a running process instead of starting an executable.
-f,--format TXT|CSV|JSON       Output format. Default is TXT.
-F,--summary-format TXT|XML    Summary output format. Default is XML.
-i,--input <file name>         trace command input file.
-l,--log <filename prefix>     trace command log output file.
-t,--tracer strace|ltrace|cdb  command for generating trace log
-S,--server <Url>              Survol url for CIM objects updates. Ex: http://127.0.0.1:80/survol/event_put.py
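The options above can be combined from a script. The sketch below only assembles an argument list using flags taken from the listing; it makes two assumptions that the listing does not confirm: that the tool is launched as a script named `dockit.py`, and that the command to start is passed after the options. The helper name `dockit_argv` is hypothetical.

```python
import sys


def dockit_argv(pid=None, command=None, tracer="strace",
                dockerfile=True, script="dockit.py"):
    """Build an argument list for one of the two usage scenarios.

    'script' and the position of 'command' are assumptions, not documented facts.
    """
    if (pid is None) == (command is None):
        raise ValueError("give either a pid to attach to or a command to start")
    argv = [sys.executable, script, "-t", tracer]
    if dockerfile:
        argv.append("-D")  # ask for a Dockerfile to be generated
    if pid is not None:
        argv += ["-p", str(pid)]  # attach to a running process
    else:
        argv += list(command)  # assumed: command follows the options
    return argv
```

The resulting list can be passed to `subprocess.run`, once the script name and argument order have been checked against the installed version.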
The execution outputs are:
DockIT is able to generate a Dockerfile skeleton out of any execution of a process. Depending on the target batch or process, this skeleton can be just a draft or a complete enumeration of the resources to dockerize. It enumerates resources at the lowest possible level, that of system calls, so any resource used during the monitored execution is listed. On the other hand, it might misinterpret how some resources are used; one reason is that it does not have the source code. Despite this, an exhaustive list of resources, properly catalogued in a Dockerfile, makes dockerization much easier and more reliable. For example, DockIT provides:
Once resources are properly identified, some manual adjustments to another infrastructure, such as a grid, are possible. For example:
Dockerization of applications is not an exact science. Several tools share the same purpose. They all have their pros and cons, and their own specific technologies:
None of them brings a general solution, because the problem is extremely complicated and relies on many arbitrary choices. Because DockIT's analysis is as close as possible to the operating system, it provides results that cannot be obtained with other tools, and it always provides them. This unique perspective, analyzing system calls on the fly, complements the Dockerfile parameters obtained by other means.
This is why it has an option to edit an existing Dockerfile, adding what is missing and was not detected before.
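The idea of completing an existing Dockerfile can be sketched in a few lines of Python. The example below adds a `COPY` line for each detected path that no existing instruction mentions, placing the additions before the first `CMD` or `ENTRYPOINT`. It is an illustration under simple assumptions, not the tool's implementation: the function name `add_missing_resources` is hypothetical, and the substring test used to decide whether a path is already covered is deliberately crude.

```python
def add_missing_resources(dockerfile_text, detected_paths):
    """Return the Dockerfile text with COPY lines added for uncovered paths."""
    lines = dockerfile_text.splitlines()
    # Comments do not count as coverage, only real instructions do.
    instructions = [l for l in lines if not l.lstrip().startswith("#")]
    missing = [p for p in detected_paths
               if not any(p in l for l in instructions)]
    if not missing:
        return dockerfile_text
    block = ["# Detected at run time, missing from the original file"]
    block += [f"COPY {p.lstrip('/')} {p}" for p in missing]
    # Keep the start-up instruction last: insert before CMD/ENTRYPOINT if present.
    index = next(
        (i for i, l in enumerate(lines)
         if l.lstrip().upper().startswith(("CMD", "ENTRYPOINT"))),
        len(lines),
    )
    return "\n".join(lines[:index] + block + lines[index:]) + "\n"
```

Existing lines are never rewritten, so manual adjustments made to the original file are preserved.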
At the moment, DockIT runs on Linux® and is being ported to Windows®. It can be ported to other platforms as long as system calls can be intercepted, for example by some hooking feature of the target operating system.
DockIT is a distinct tool from Survol. Internally, they handle the same type of objects and resources as described by the CIM industrial standard. Survol and DockIT are two orthogonal technologies, based on the same concepts, to address and understand the behavior of running applications in-situ and in-vivo.
Survol displays snapshots, whereas DockIT traces behaviour on a time scale. During its execution, DockIT reports the life-cycle of detected objects: their creation, how they are used by system calls, and their destruction.
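The difference between a snapshot and a life-cycle can be shown with a small Python sketch that groups time-stamped system call events by object. The event format `(timestamp, call, object name)` and the function name `life_cycles` are assumptions chosen for the example; they do not describe DockIT's internal data model.

```python
from collections import defaultdict


def life_cycles(events):
    """Group (timestamp, call, object_name) events into one timeline per object."""
    timeline = defaultdict(list)
    # Sorting by timestamp puts each object's calls in chronological order.
    for timestamp, call, name in sorted(events):
        timeline[name].append((timestamp, call))
    return {
        name: {
            "first_seen": calls[0][0],
            "last_seen": calls[-1][0],
            "calls": [call for _, call in calls],
        }
        for name, calls in timeline.items()
    }
```

A snapshot would only say whether an object exists at a given instant, whereas this structure records when it appeared, what was done to it, and when it was last touched.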