Thursday, June 13, 2013

Life of C : Part 1 - Compilation

Hi Guys,

For a long time I have been thinking of putting together some material on the basics of C, the kind that gives a better understanding not only of C but also of the OS side of things. So here is the start (oh please... "Hello World!" again).


0. Prerequisites 
Throughout the post I am assuming that the reader has some knowledge of C. All explanations/examples below are tested/explained on a Linux system. To prepare the system, install the following packages,

1. Compilation
Bear with the stupid example :)
This is a basic C program; almost all programmers have written it at least once in their life.

OK, so this is the C syntax to tell the computer to write/display "Hello Guys!". What happens next?
For a program to become an executable, it needs to go through certain steps,

1. Pre-processing
2. Compilation
3. Assembly
4. Linking


To see all the intermediate files produced during compilation, compile the above program so that the compiler keeps them (with gcc, the -save-temps option does this),

This will save all the intermediate files in the current directory.
1. Pre-processing
This is the very first step in compiling a C program. It mainly does the following tasks,

1. Macro substitution
2. Comments stripping
3. Expansion of included files


The output of the pre-processor for our Hello Guys program (saved as helloguys.i) is as shown below,

File explanation:
1. The first observation is that at the end of the file the argument to printf() has been replaced by the string "Hello Guys!" instead of the macro.
2. The second is that it does not contain any of the comments we added; they have been stripped out.
3. The third observation is that "#include<stdio.h>" has been replaced by lots of other stuff. The header has been expanded and added to the source file so that the compiler can clearly see the declaration of the "printf()" function.

Among that expanded text is the declaration of printf(), which says that printf() is defined somewhere externally.


2. Compilation
Next, the compiler takes helloguys.i as input and converts it into the compiler's intermediate output file. This is assembly code. You can see the "helloguys.s" file below. It consists of assembly instructions for your processor. Next, the assembler takes these instructions and converts them into machine code.

3. Assembly
In this stage the "helloguys.s" file is taken as input and the intermediate file "helloguys.o" is generated. This is the object file.

This file contains machine-level instructions, but calls to external functions are not resolved yet. Since this is machine code it's not readable. But have a look anyway; it will look as below,

The "ELF" string in the file tells you that it is in Executable and Linkable Format.

4. Linking
So, this is the final stage of the compilation process. It links function calls with their definitions, printf() for example. Until now the code doesn't know where printf() lives; the object file only holds a placeholder at the call site. At this stage the call to printf() gets resolved and the function's address gets plugged in.

Functions of the linker:
1. Plugs in the actual function addresses.
2. It also adds some extra code/info to your file. To run our code we need to submit it to the system, so this stage adds some extra code to yours: at start-up, the environment variables and command line arguments need to be set up, plus some OS-specific code so that the OS can signal your application. Also, while exiting, the code needs to return a status to the OS, which helps it keep track of the state of the code and its environment. All these extra things are added by the linker.

To verify it, compare the sizes of the object file and the executable (for example with ls -l),



At least you can see that the executable is larger than the object file.

So, now you know the Life of C: the stages your code needs to go through before becoming an executable. This is just the tip of the iceberg; there is a lot more if you want to go into details.

The next part of Life of C will cover the life of your executable: everything that happens when you submit it to the OS to run.



Cheers!!!  
