
17 January 2012

[C++] - The duty of Compiler and Linker on EXE file creation

1. Introduction

We know that a project delivers its output as exe files. There are other forms too, such as dll and ocx, but in this write-up I will discuss only the exe. EXE stands for executable, which means the operating system, with the help of the processor, executes the instructions specified in the exe file. The exe is a binary file, and its file header carries the information required by the operating system. The majority of its content, however, is meant for the processor of the machine that runs it.

2. Exe File headers

The header of the exe file contains the information required by the operating system. From this header, the OS knows how much memory to allocate for the data segment and how much code segment memory is required to load the instruction set packed inside the executable. As already mentioned in earlier posts, the data segment holds the program-scoped constants, global variables and so on, whereas the code segment holds the instruction set that the processor understands. On Windows this header is part of the Portable Executable (PE) file format. It should not be confused with the Process Control Block, which is the structure the OS maintains for a process once it is running. You can do a web search to know more about both.
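To make the idea of a header concrete, here is a minimal sketch that checks whether a buffer of bytes starts like a Windows exe. It assumes the PE layout: the file begins with the two characters "MZ", and the 4-byte little-endian value at offset 0x3C gives the position of the "PE\0\0" signature. The function names readLe32 and looksLikePeExecutable are made up for this illustration; a real loader validates far more than this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a 32-bit little-endian value starting at `offset`.
// Hypothetical helper; the caller must guarantee 4 readable bytes.
std::uint32_t readLe32(const std::vector<std::uint8_t>& bytes, std::size_t offset) {
    assert(offset + 4 <= bytes.size());
    return  static_cast<std::uint32_t>(bytes[offset])
         | (static_cast<std::uint32_t>(bytes[offset + 1]) << 8)
         | (static_cast<std::uint32_t>(bytes[offset + 2]) << 16)
         | (static_cast<std::uint32_t>(bytes[offset + 3]) << 24);
}

// Returns true when `image` starts like a PE executable:
// "MZ" at offset 0, and "PE\0\0" at the offset stored at 0x3C.
// This only checks the two signatures, nothing else in the header.
bool looksLikePeExecutable(const std::vector<std::uint8_t>& image) {
    if (image.size() < 0x40) return false;                 // too small to hold the first header
    if (image[0] != 'M' || image[1] != 'Z') return false;  // first signature missing

    const std::size_t peOffset = readLe32(image, 0x3C);
    if (peOffset > image.size() || image.size() - peOffset < 4) return false;

    return image[peOffset]     == 'P' &&
           image[peOffset + 1] == 'E' &&
           image[peOffset + 2] == 0   &&
           image[peOffset + 3] == 0;
}
```

To try it, read an exe file into a std::vector<std::uint8_t> and pass it to looksLikePeExecutable. The sizes of the code and data sections that the OS needs are stored in fields that follow the "PE\0\0" signature; this sketch does not parse them.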

After the header comes the actual exe content, which is nothing but the instruction set and its data in the microprocessor's opcode format. This is machine code, the binary encoding of assembly language, so opening an exe file in a word processor will not reveal readable assembly code.

3. How is the Exe generated?

Let us consider a simple example. Say you have an exe project with three header files and three implementation files. Once the coding is done without any errors, you build the project, and the build operation creates the exe output. Have a look at the picture below.

When you build the project, the development IDE (say, VS2005) takes the following actions:

1)      The compiler conducts a pre-processing operation before doing its actual job of compilation. The pre-processing is conducted on the source files, that is, the cpp files. It replaces each macro with its content, each #include directive with the content of the named header file, and so on.
2)      Once the above operation is completed for a single file, say a.cpp, the compiler compiles that file to generate a.obj. This continues until all the cpp files in the project (exe, dll or ocx, whatever it is) are converted to object files.
3)      Now the linker comes into the picture. The linker takes a more compact form of input: the object file produced for each cpp file in the previous step. It combines all the object files, resolves the references between them, and generates the required binary, the exe file in our case.

In short, we feed input to the compiler in the form of C++ source files, and it is the linker that actually generates the output binary.
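The three build steps can be sketched with a small example. To keep it compilable as a single piece, the content of three hypothetical files (area.h, a.cpp and b.cpp) is collapsed into one listing; the file names, the macros and the functions are all invented for illustration. In a real project each part would live in its own file, and a.cpp and b.cpp would each begin with #include "area.h".

```cpp
#include <cassert>

// ---- would live in area.h -------------------------------------------
// Pre-processing pastes this text into every cpp file that includes it.
#define PI_APPROX 3.14159            // macro: replaced textually before compilation
#define SQUARE(x) ((x) * (x))        // function-like macro, also expanded textually
double circleArea(double radius);    // declaration only: a promise that a definition exists

// ---- would live in a.cpp (compiled into a.obj) -----------------------
// The compiler accepts this call knowing only the declaration above.
// a.obj records circleArea as an unresolved symbol for the linker.
double areaOfUnitCircle() {
    return circleArea(1.0);
}

// ---- would live in b.cpp (compiled into b.obj) -----------------------
// After pre-processing, the return statement below reads:
//     return 3.14159 * ((radius) * (radius));
double circleArea(double radius) {
    assert(radius >= 0.0);           // a negative radius is a caller mistake
    return PI_APPROX * SQUARE(radius);
}
```

When the parts are kept in separate files, neither a.obj nor b.obj is a complete program. Only when the linker combines them is the call in areaOfUnitCircle connected to the body of circleArea. Leaving b.obj out of the link step typically produces an "unresolved external symbol" error rather than a compiler error.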

If your solution workspace contains 57 projects with the dependencies properly set, the compilation and linking described above take place for each project. When you build such a big solution, just sit back and watch the output window. For each project, you will see the build operation display the cpp files being processed, and at the end the linker generates the output.

4. Swapping the EXE process

The operating system reads the information from the header of the exe file and loads the exe's data and instruction set into memory. Once loaded, the operating system sees your exe as a running process. The processor executes the instruction sets of all the loaded processes (in a multi-tasking OS like Windows). When there is not enough memory to run a new process waiting in the queue, and that process has high priority, the OS swaps the content (the instruction set part) and the state (all the global variable values) of an already loaded exe out to the physical disc. That process (your exe) is not terminated; it is only suspended for some time. Once it needs to execute its instruction set again, the OS takes the image of the process from the disc and puts it back in main memory. This is called swapping. Some people call it virtual memory, although strictly speaking swapping is one of the mechanisms that virtual memory relies on. This is shown below:

In the above picture, the processor executes the instruction set from the code segment of an exe, which is allocated in main memory. When there are multiple processes competing for the limited main memory, the OS sometimes swaps an exe process to disc. In the above case, exe processes P1, P2 and P3 are in main memory, while P4 and P5 are kept on the physical disc, temporarily suspended.
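The swapping idea can be played with in a toy simulation. The sketch below is not how Windows or any real OS does it: real systems move memory pages rather than whole named processes, and they use more elaborate policies than "swap out whichever process has been resident longest". The class ToySwapper and its member names are invented for this illustration, and main memory is assumed to hold a fixed number of processes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Toy model only: main memory holds at most `capacity` whole processes.
class ToySwapper {
public:
    explicit ToySwapper(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Make `process` resident in main memory. If memory is full, the
    // process that has been resident longest is swapped out to disc.
    void schedule(const std::string& process) {
        if (std::find(memory_.begin(), memory_.end(), process) != memory_.end()) {
            return;  // already in main memory, nothing to do
        }
        // If it was swapped out earlier, its image is taken back from disc.
        disc_.erase(std::remove(disc_.begin(), disc_.end(), process), disc_.end());
        if (memory_.size() == capacity_) {
            disc_.push_back(memory_.front());  // suspended, not terminated
            memory_.pop_front();
        }
        memory_.push_back(process);
    }

    std::vector<std::string> inMemory() const {
        return std::vector<std::string>(memory_.begin(), memory_.end());
    }
    std::vector<std::string> onDisc() const { return disc_; }

private:
    std::size_t capacity_;
    std::deque<std::string> memory_;   // oldest resident process at the front
    std::vector<std::string> disc_;    // swapped-out process images
};
```

With a capacity of three, scheduling P4, P5, P1, P2 and P3 in that order ends in the situation described above: P1, P2 and P3 are in main memory, and P4 and P5 are on the disc. Scheduling P4 again brings its image back and pushes P1 out.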