Thursday, June 13, 2013

Life of C : Part 1 - Compilation

Hi Guys,

For a long time I have been thinking of putting together some material on the basics of C, the kind that gives a better understanding not only of C but also of how things look from the OS perspective. So here is the start (oh please... "Hello World!" again).


0. Prerequisites
Throughout this post I assume the reader has some knowledge of C. All explanations and examples below were tested on a Linux system. To prepare your system, install the following packages:

$ sudo apt-get install gcc vim

1. Compilation
Bear with the stupid example :)
This is a basic C program; almost every programmer has written it at least once in their life.

#include <stdio.h>
#define MSG "Hello Guys!"
int main()
{
    /* printing msg */
    printf(MSG);
    return 0;
}

OK, so this is the C syntax for telling the computer to display "Hello Guys!". What happens next?
For a program to become an executable, it needs to go through certain steps:

1. Pre-processing
2. Compilation
3. Assembly
4. Linking


To see all the intermediate files produced during compilation, compile the above program as follows:

$ gcc -Wall -save-temps helloguys.c -o helloguys
$ ls
helloguys.c
helloguys.i
helloguys.o
helloguys.s
helloguys
$

This saves all the intermediate files in the current directory.

1. Pre-processing
This is the very first step in compiling a C program. It mainly performs the following tasks:

1. Macro substitution
2. Comments stripping
3. Expansion of included files


The output of the pre-processor for the above program is shown below:

...
...
...
# 870 "/usr/include/stdio.h" 3 4
extern FILE *popen (__const char *__command, __const char *__modes) ;
extern int pclose (FILE *__stream);
extern char *ctermid (char *__s) __attribute__ ((__nothrow__ , __leaf__));
# 910 "/usr/include/stdio.h" 3 4
extern void flockfile (FILE *__stream) __attribute__ ((__nothrow__ , __leaf__));
extern int ftrylockfile (FILE *__stream) __attribute__ ((__nothrow__ , __leaf__)) ;
extern void funlockfile (FILE *__stream) __attribute__ ((__nothrow__ , __leaf__));
# 940 "/usr/include/stdio.h" 3 4
# 2 "helloguys.c" 2
int main()
{
printf("Hello Guys!");
return 0;
}

File explanation:
1. First, at the end of the file you can see that the argument to printf() is now the string "Hello Guys!" rather than the macro.
2. Second, the file does not contain any of the comments we added; they have been stripped out.
3. Third, "#include<stdio.h>" has been replaced by lots of other stuff. The header has been expanded and added to the source file so that the compiler can see the declaration of the "printf()" function.
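To convince yourself that macro substitution really is plain text replacement done before the compiler proper runs, here is a minimal sketch. The STR/XSTR helper macros and the function expanded_msg() are hypothetical names I made up for this illustration; they are not part of helloguys.c.

```c
#include <string.h>

#define MSG "Hello Guys!"

/* Two-level stringification: XSTR(MSG) first expands MSG, then STR turns
 * the resulting tokens into a string literal. */
#define STR(x) #x
#define XSTR(x) STR(x)

/* Returns the exact text the pre-processor substitutes for MSG,
 * including the surrounding double quotes. */
const char *expanded_msg(void)
{
    return XSTR(MSG);
}
```

Printing the result of expanded_msg() with puts() should show the string with its quotes, which is the same token you see as the printf() argument in helloguys.i.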

Among the expanded lines is the following declaration, which says that printf() is defined somewhere externally:

extern int printf (__const char *__restrict __format, ...);
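A declaration like this is all the compiler needs in order to compile a call; the definition is only required later, at link time. As a small sketch of that idea, the code below declares puts() by hand instead of including <stdio.h>. The function name greet() is made up for this example, and puts() is used rather than printf() only to keep the hand-written declaration short.

```c
/* No #include here: this hand-written declaration stands in for the one
 * that <stdio.h> would normally provide. */
extern int puts(const char *s);

/* Compiles with only the declaration above in scope; the definition of
 * puts() is supplied by the C library when the program is linked.
 * Returns a non-negative value on success, as puts() does. */
int greet(void)
{
    return puts("Hello Guys!");
}
```

If you compile this file with "gcc -c" it should produce an object file without complaint, even though nothing in it says where puts() actually lives.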

2. Compilation
Next, the compiler takes helloguys.i as input and converts it into the compiler's intermediate output file, which is assembly code. You can see the "helloguys.s" file below. It consists of assembly instructions for your processor. After that, the assembler takes these instructions and converts them into machine code.

.file "helloguys.c"
.section .rodata
.LC0:
.string "Hello Guys!"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
andl $-16, %esp
subl $16, %esp
movl $.LC0, %eax
movl %eax, (%esp)
call printf
movl $0, %eax
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
.section .note.GNU-stack,"",@progbits

3. Assembly
In this stage the "helloguys.s" file is taken as input and the intermediate file "helloguys.o" is generated. This is the object file.

This file contains machine-level instructions, but calls to external functions are not resolved yet. Since this is machine code, it is not human-readable. Still, if you open it, it looks like this:

^?ELF^A^A^A^@^@^@^@^@^@^@^@^@^A^@^C^@^A^@^@^@^@^@^@^@^@^@^@^@$^A^@^@^@^@^@^@4^@^@^@^@^@(^@^M^@
^@U<89>å<83>äð<83>ì^P¸^@^@^@^@<89>^D$èüÿÿÿ¸^@^@^@^@ÉÃ^@^@^@Hello Guys!^@^@GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3^@^@^T^@^@^@^@^@^@^@^AzR^@^A|^H^A^[^L^D^D<88>^A^@^@^\^@^@^@^\^@^@^@^@^@^@^@^]^@^@^@^@A^N^H<85>^BB^M^EYÅ^L^D^D^@^@^@.symtab^@.strtab^@.shstrtab^@.rel.text^@.data^@.bss^@.rodata^@.comment^@.note.GNU-stack^@.rel.eh_frame^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
...
...
...

The "ELF" string at the start of the file tells us that it is in the Executable and Linkable Format.
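The same check can be done in code. An ELF file starts with the four bytes 0x7f, 'E', 'L', 'F' (the "^?" in the dump above is the 0x7f byte). Below is a minimal sketch; has_elf_magic() and file_is_elf() are hypothetical helper names, and the check looks only at those four bytes, nothing else in the header.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if buf (of length n) starts with the 4-byte ELF magic
 * number 0x7f 'E' 'L' 'F', and 0 otherwise. */
int has_elf_magic(const unsigned char *buf, size_t n)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    return n >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}

/* Returns 1 if the file at path starts with the ELF magic, 0 if it does
 * not, and -1 if the file cannot be opened. */
int file_is_elf(const char *path)
{
    unsigned char head[4];
    size_t got;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    got = fread(head, 1, sizeof head, fp);
    fclose(fp);
    return has_elf_magic(head, got);
}
```

Calling file_is_elf("helloguys.o") should return 1 for the object file, while calling it on helloguys.c should return 0.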

4. Linking
So, this is the final stage of the compilation process. It links function calls to their definitions. Take printf(): until now the code knows nothing about where printf() lives; the object file only holds a placeholder at the call site. At this stage the call to printf() gets resolved and the function's address gets plugged in.
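The placeholder is visible in the helloguys.o dump shown earlier: the bytes rendered as "èüÿÿÿ" are 0xE8 0xFC 0xFF 0xFF 0xFF, that is, the x86 call opcode followed by a 32-bit displacement. The sketch below decodes that displacement. The helper name call_rel32() is hypothetical, and reading the value as a pre-link placeholder assumes the usual 32-bit x86 relocation scheme, where the linker patches this field using the entry in .rel.text.

```c
#include <stdint.h>

/* Decodes the signed 32-bit little-endian displacement that follows the
 * one-byte opcode of an x86 near call (0xE8). insn must point at 5 bytes. */
int32_t call_rel32(const unsigned char *insn)
{
    uint32_t raw = (uint32_t)insn[1]
                 | ((uint32_t)insn[2] << 8)
                 | ((uint32_t)insn[3] << 16)
                 | ((uint32_t)insn[4] << 24);

    /* Convert two's complement by hand to avoid an implementation-defined cast. */
    if (raw & 0x80000000u)
        return -(int32_t)(~raw) - 1;
    return (int32_t)raw;
}
```

For the five bytes from helloguys.o this gives -4, which is not a real distance to printf(); it is the value the linker overwrites once it knows where the call should go.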

Functions of the linker:
1. It plugs in the actual function addresses.
2. It also adds some extra code and information to your file. To run our code we need to submit it to the system, so this stage adds startup code: before main() runs, the environment variables and command-line arguments have to be set up, along with some OS-specific code so that the OS can signal your application. Likewise, on exit the code needs to return a status to the OS, which helps the OS keep track of the program and its environment. All of these extra pieces are added by the linker.

To verify this, do the following:

$ size helloguys.o
   text    data     bss     dec     hex filename
     97       0       0      97      61 helloguys.o
$ size helloguys
   text    data     bss     dec     hex filename
   1149     256       8    1413     585 helloguys


At the very least, you can see that the executable is larger than the object file.
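If you are wondering what the text, data and bss columns count, here is a rough sketch. The variable and function names are made up for this example, and the placement described in the comments is what I would typically expect from GCC on Linux, not a guarantee for every toolchain.

```c
/* Non-zero initializer: typically counted under "data". */
int initialized_counter = 42;

/* No initializer, so it starts as all zeros: typically counted under "bss",
 * which takes no space in the file itself. */
int zeroed_buffer[256];

/* Read-only constant: size usually counts this under "text",
 * together with the machine code. */
const char banner[] = "Hello Guys!";

/* Machine code for functions: counted under "text". */
int bump(void)
{
    return ++initialized_counter;
}
```

Compiling this with "gcc -c" and running size on the object file, then changing the array length or the initializer, is an easy way to watch the columns move.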

So, now you know the Life of C: the stages your code goes through before becoming an executable. This is just the tip of the iceberg; there is a lot more if you want to go into the details.

The next part of Life of C will cover the life of your executable: everything that happens when you submit it to the OS to run.

$ ./helloguys
Hello Guys!


Cheers!!!  
