On this page we will explore the process of generating machine code from one or more .c files.
- The preprocessor first handles any preprocessor directives (e.g., #include).
- The compiler then generates assembly code targeted for the ATmega32 microcontroller.
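- The assembler translates the assembly code into object code, and the linker combines it with the C runtime startup code to produce the actual machine code to be downloaded to the microcontroller.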
- In addition, we can generate an .lss file (equivalent to the .lst file generated with the Atmel assembler).
Trivial C Program
The following C program, wk1.c, does nothing:
int main() { }
Even so, it produces the following .lss file:
wk1.elf:     file format elf32-avr

Sections:
Idx Name             Size      VMA       LMA       File off  Algn
  0 .text            000000a2  00000000  00000000  00000054  2**1
                     CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .stab            00000378  00000000  00000000  000000f8  2**2
                     CONTENTS, READONLY, DEBUGGING
  2 .stabstr         0000005f  00000000  00000000  00000470  2**0
                     CONTENTS, READONLY, DEBUGGING
  3 .debug_aranges   00000020  00000000  00000000  000004cf  2**0
                     CONTENTS, READONLY, DEBUGGING
  4 .debug_pubnames  0000001b  00000000  00000000  000004ef  2**0
                     CONTENTS, READONLY, DEBUGGING
  5 .debug_info      0000008a  00000000  00000000  0000050a  2**0
                     CONTENTS, READONLY, DEBUGGING
  6 .debug_abbrev    00000034  00000000  00000000  00000594  2**0
                     CONTENTS, READONLY, DEBUGGING
  7 .debug_line      00000055  00000000  00000000  000005c8  2**0
                     CONTENTS, READONLY, DEBUGGING
  8 .debug_frame     00000020  00000000  00000000  00000620  2**2
                     CONTENTS, READONLY, DEBUGGING

Disassembly of section .text:

00000000 <__vectors>:
   0:  0c 94 2a 00  jmp 0x54 ; 0x54 <__ctors_end>
   4:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
   8:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
   c:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  10:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  14:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  18:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  1c:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  20:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  24:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  28:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  2c:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  30:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  34:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  38:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  3c:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  40:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  44:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  48:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  4c:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>
  50:  0c 94 47 00  jmp 0x8e ; 0x8e <__bad_interrupt>

00000054 <__ctors_end>:
  54:  11 24        eor r1, r1
  56:  1f be        out 0x3f, r1 ; 63
  58:  cf e5        ldi r28, 0x5F ; 95
  5a:  d8 e0        ldi r29, 0x08 ; 8
  5c:  de bf        out 0x3e, r29 ; 62
  5e:  cd bf        out 0x3d, r28 ; 61

00000060 <__do_copy_data>:
  60:  10 e0        ldi r17, 0x00 ; 0
  62:  a0 e6        ldi r26, 0x60 ; 96
  64:  b0 e0        ldi r27, 0x00 ; 0
  66:  e2 ea        ldi r30, 0xA2 ; 162
  68:  f0 e0        ldi r31, 0x00 ; 0
  6a:  02 c0        rjmp .+4 ; 0x70 <.do_copy_data_start>

0000006c <.do_copy_data_loop>:
  6c:  05 90        lpm r0, Z+
  6e:  0d 92        st X+, r0

00000070 <.do_copy_data_start>:
  70:  a0 36        cpi r26, 0x60 ; 96
  72:  b1 07        cpc r27, r17
  74:  d9 f7        brne .-10 ; 0x6c <.do_copy_data_loop>

00000076 <__do_clear_bss>:
  76:  10 e0        ldi r17, 0x00 ; 0
  78:  a0 e6        ldi r26, 0x60 ; 96
  7a:  b0 e0        ldi r27, 0x00 ; 0
  7c:  01 c0        rjmp .+2 ; 0x80 <.do_clear_bss_start>

0000007e <.do_clear_bss_loop>:
  7e:  1d 92        st X+, r1

00000080 <.do_clear_bss_start>:
  80:  a0 36        cpi r26, 0x60 ; 96
  82:  b1 07        cpc r27, r17
  84:  e1 f7        brne .-8 ; 0x7e <.do_clear_bss_loop>
  86:  0e 94 49 00  call 0x92 ; 0x92 <main>
  8a:  0c 94 50 00  jmp 0xa0 ; 0xa0 <_exit>

0000008e <__bad_interrupt>:
  8e:  0c 94 00 00  jmp 0 ; 0x0 <__vectors>

00000092 <main>:
  92:  cf 93        push r28
  94:  df 93        push r29
  96:  cd b7        in r28, 0x3d ; 61
  98:  de b7        in r29, 0x3e ; 62
  9a:  df 91        pop r29
  9c:  cf 91        pop r28
  9e:  08 95        ret

000000a0 <_exit>:
  a0:  ff cf        rjmp .-2 ; 0xa0 <_exit>
- It is important to note that the .lss file shows byte addresses, while the .lst files use word addresses, so the addresses in the .lss file are 2 times the values found in the .lst files.
- For the same reason, the values found in the .lss file do not match those in the disassembler file. (The disassembler file can be accessed from the View menu in AVR Studio when running the debugger.)
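The factor of two can be captured in a pair of small helpers. This is only a host-side sketch for checking addresses by hand; the function names are made up for illustration and are not part of any toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a byte address, as printed in the .lss file, to the word
   address shown in the .lst file and the AVR Studio disassembler.
   AVR instructions are 16 bits wide, so every instruction starts on
   an even byte address. */
uint16_t lss_to_word_address(uint16_t byte_address)
{
    assert(byte_address % 2 == 0);
    return (uint16_t)(byte_address / 2);
}

/* The reverse mapping: word address to .lss byte address. */
uint16_t word_to_lss_address(uint16_t word_address)
{
    return (uint16_t)(word_address * 2u);
}
```

The same relationship is visible inside the listing itself: the instruction call 0x92 is encoded as 0e 94 49 00, and 0x0049 is the word address of main.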
Vectors
- The startup code supplied with the compiler initializes all of the interrupt jump vectors.
- For our program we only make use of the reset jump vector.
- The reset vector jumps to __ctors_end when the microcontroller is powered on or reset.
- The remaining jump vectors all contain jumps to __bad_interrupt.
- The __bad_interrupt code just jumps back to the reset vector.
- Therefore, if any interrupt is unintentionally triggered, the program starts over.
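The layout of the vector table explains why the startup code begins at 0x54. A rough sketch of the arithmetic, assuming the 21 entries counted in the listing above and the 4-byte jmp instruction used for each one (the names below are illustrative only):

```c
#include <assert.h>
#include <stdint.h>

enum {
    VECTOR_COUNT = 21, /* entries in the listing, including reset */
    VECTOR_BYTES = 4   /* each entry is one two-word jmp instruction */
};

/* Byte address (as shown in the .lss file) of vector table entry
   `index`, where entry 0 is the reset vector. */
uint16_t vector_address(uint16_t index)
{
    assert(index < VECTOR_COUNT);
    return (uint16_t)(index * VECTOR_BYTES);
}

/* First byte address after the table, where __ctors_end begins. */
uint16_t vector_table_size(void)
{
    return (uint16_t)(VECTOR_COUNT * VECTOR_BYTES);
}
```

The last entry sits at 0x50 and the table ends at 0x54, which matches the target of the reset vector.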
ctors_end
00000054 <__ctors_end>:
  54:  11 24        eor r1, r1
  56:  1f be        out 0x3f, r1 ; 63 SREG
  58:  cf e5        ldi r28, 0x5F ; 95
  5a:  d8 e0        ldi r29, 0x08 ; 8
  5c:  de bf        out 0x3e, r29 ; 62 SPH
  5e:  cd bf        out 0x3d, r28 ; 61 SPL
- Clears R1 and the status register (SREG), then initializes the stack pointer to 0x085F, the last address in data memory.
- See p. 299 of the ATmega32 Datasheet for the addresses of the I/O registers.
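The two bytes written to SPH and SPL form one 16-bit stack pointer. The sketch below recombines them on a host machine and compares the result with the top of SRAM, assuming the ATmega32 memory map of 2 KB of SRAM starting at 0x0060. The function and constant names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

enum {
    SRAM_START = 0x0060, /* first SRAM address on the ATmega32 */
    SRAM_SIZE  = 0x0800  /* 2 KB of internal SRAM */
};

/* Last valid SRAM address (often called RAMEND). */
uint16_t ram_end(void)
{
    return (uint16_t)(SRAM_START + SRAM_SIZE - 1);
}

/* Combine the high and low stack pointer bytes into one address. */
uint16_t stack_pointer(uint8_t sph, uint8_t spl)
{
    return (uint16_t)(((uint16_t)sph << 8) | spl);
}

/* Check the values loaded into r29 (SPH) and r28 (SPL) above. */
void check_startup_stack(void)
{
    assert(stack_pointer(0x08, 0x5F) == ram_end());
}
```

Starting the stack at the highest address leaves the largest possible gap between the stack, which grows downward, and the variables stored at the bottom of data memory.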
do_copy_data
Initializes X and Z registers for copying data from program memory into data memory.
00000060 <__do_copy_data>:
  60:  10 e0        ldi r17, 0x00 ; 0
  62:  a0 e6        ldi r26, 0x60 ; 96
  64:  b0 e0        ldi r27, 0x00 ; 0
  66:  e2 ea        ldi r30, 0xA2 ; 162
  68:  f0 e0        ldi r31, 0x00 ; 0
  6a:  02 c0        rjmp .+4 ; 0x70 <.do_copy_data_start>
- X points to the beginning of data memory (0x0060).
- Z points to the beginning of the data currently stored in program memory (0x00A2, immediately after the end of the code).
- We'll revisit the do_copy_data_whatever when we actually have data to be copied.
do_clear_bss
- The .bss section is used to store uninitialized global or static variables.
- The do_clear_bss_whatever sections are used to clear out the data values where these global/static variables will be stored.
- We'll revisit the do_clear_bss_whatever when we actually have data to be cleared.
Empty main and Beyond
00000092 <main>:
  92:  cf 93        push r28
  94:  df 93        push r29
  96:  cd b7        in r28, 0x3d ; 61
  98:  de b7        in r29, 0x3e ; 62
  9a:  df 91        pop r29
  9c:  cf 91        pop r28
  9e:  08 95        ret

000000a0 <_exit>:
  a0:  ff cf        rjmp .-2 ; 0xa0 <_exit>
- main automatically reads the stack pointer into the Y register.
- Since main uses the Y register, R28 and R29 get pushed onto the stack at the beginning and popped off the stack at the end.
- Upon returning (back to 0x8a) we JMP to _exit which just loops forever.
Non-Trivial C Program
#include <avr/io.h>

int y[4];

int main()
{
    char word[] = "funness";
    int x[3];
    int i = 1;

    PORTB = x[i];
    for(i=0; i<3; ++i) {
        x[i] = PINB;
    }
    for(i=0; i<3; ++i) {
        PORTB = x[3-i];
    }
    return 0;
}
Below are parts of the listing file generated with no compiler optimization.
- The compiler has picked the following locations to store variable data:
- word: the initializer string is stored in initialized data memory, and main copies it into a local array on the stack.
- y: stored in uninitialized data memory for global/static variables (bss section).
- x: stored on the stack.
- i: stored on the stack.
Data Copy
00000060 <__do_copy_data>:
  60:  10 e0        ldi r17, 0x00 ; 0
  62:  a0 e6        ldi r26, 0x60 ; 96
  64:  b0 e0        ldi r27, 0x00 ; 0
  66:  ee ea        ldi r30, 0xAE ; 174
  68:  f1 e0        ldi r31, 0x01 ; 1
  6a:  02 c0        rjmp .+4 ; 0x70 <.do_copy_data_start>

0000006c <.do_copy_data_loop>:
  6c:  05 90        lpm r0, Z+
  6e:  0d 92        st X+, r0

00000070 <.do_copy_data_start>:
  70:  a8 36        cpi r26, 0x68 ; 104
  72:  b1 07        cpc r27, r17
  74:  d9 f7        brne .-10 ; 0x6c <.do_copy_data_loop>
- A total of eight bytes are copied from program memory to data memory (for word).
- These eight bytes are the seven characters in funness and the null terminator.
- The do_copy_data_loop copies a value from program memory into R0 and then into data memory.
- The do_copy_data_start code checks whether the X pointer has reached the end of the data memory reserved for initialized data (0x0068); if it has not, another byte is copied.
- The compiler will store some other program variables on the stack (which you'll see in a minute... if you stay awake).
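The copy loop can be modeled in ordinary C that runs on a PC. The sketch below is not the startup code itself; it treats program memory and data memory as two byte arrays, and the function name and parameters are invented for illustration. The comments map each statement back to the instruction it stands for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side model of the __do_copy_data loop.
   flash : simulated program memory
   z     : address in flash of the first initializer byte (Z register)
   sram  : simulated data memory
   x     : first destination address in sram (X register)
   x_end : address one past the last destination byte
   Returns the number of bytes copied. */
uint16_t copy_data(const uint8_t *flash, uint16_t z,
                   uint8_t *sram, uint16_t x, uint16_t x_end)
{
    uint16_t copied = 0;

    assert(flash != NULL && sram != NULL);
    assert(x <= x_end);

    /* The real code jumps to the comparison first, so nothing is
       copied when x already equals x_end. */
    while (x != x_end) {         /* cpi, cpc, brne */
        uint8_t r0 = flash[z++]; /* lpm r0, Z+ */
        sram[x++] = r0;          /* st X+, r0 */
        ++copied;
    }
    return copied;
}
```

With the values from the listing (Z = 0x01AE, X = 0x0060, end = 0x0068) the loop body runs eight times. In the trivial program the start and end addresses were both 0x60, so the same loop copied nothing.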
Clearing Data Area for Uninitialized Global/Static Variables
- All global/static variables are stored in data memory since they need to be available for the life of the program.
- Uninitialized global/static variables are cleared on initialization.
- In this program y is an uninitialized global variable requiring eight bytes (four ints, each 16 bits wide).
00000076 <__do_clear_bss>:
  76:  10 e0        ldi r17, 0x00 ; 0
  78:  a8 e6        ldi r26, 0x68 ; 104
  7a:  b0 e0        ldi r27, 0x00 ; 0
  7c:  01 c0        rjmp .+2 ; 0x80 <.do_clear_bss_start>

0000007e <.do_clear_bss_loop>:
  7e:  1d 92        st X+, r1

00000080 <.do_clear_bss_start>:
  80:  a0 37        cpi r26, 0x70 ; 112
  82:  b1 07        cpc r27, r17
  84:  e1 f7        brne .-8 ; 0x7e <.do_clear_bss_loop>
  86:  0e 94 49 00  call 0x92 ; 0x92 <main>
  8a:  0c 94 d6 00  jmp 0x1ac ; 0x1ac <_exit>
- Here we just cycle through the eight bytes of data memory (0x68 through 0x6F), writing zero to each location.
- Note: By compiler convention, R1 always has the value of zero.
- Once the uninitialized global/static memory has been cleared, main is called (address 0x86).
- If main returns, the controller gets sent to a tight loop that cycles forever.
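The clearing loop can be modeled on a PC in the same spirit. This is a sketch only: data memory is simulated with a byte array, and the function name and parameters are invented for the example. The start and end addresses correspond to the value loaded into X and the value tested by the cpi/cpc pair.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side model of the __do_clear_bss loop.
   sram  : simulated data memory
   x     : first address of the .bss area (X register)
   x_end : address one past the end of the .bss area
   Returns the number of bytes cleared. */
uint16_t clear_bss(uint8_t *sram, uint16_t x, uint16_t x_end)
{
    const uint8_t r1 = 0; /* R1 holds zero by compiler convention */
    uint16_t cleared = 0;

    assert(sram != NULL);
    assert(x <= x_end);

    while (x != x_end) { /* cpi, cpc, brne */
        sram[x++] = r1;  /* st X+, r1 */
        ++cleared;
    }
    return cleared;
}
```

Only the bytes between the two addresses are touched; everything below the start address, including the initialized data copied earlier, is left alone.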
Main
The main function is much more exciting now. Here are a few things to keep in mind when viewing this code:
- Non-static local variables are stored either in a dedicated register or on the stack.
- The Y register is used as a local stack pointer for the function.
- Interrupts are disabled while the stack pointer (addresses 0x3e and 0x3d) is being written, so that an interrupt never sees a half-updated stack pointer.
00000092 <main>:
  92:  cf 93        push r28
  94:  df 93        push r29
  96:  cd b7        in r28, 0x3d ; 61
  98:  de b7        in r29, 0x3e ; 62
  9a:  65 97        sbiw r28, 0x15 ; 21
  9c:  0f b6        in r0, 0x3f ; 63
  9e:  f8 94        cli
  a0:  de bf        out 0x3e, r29 ; 62
  a2:  0f be        out 0x3f, r0 ; 63
  a4:  cd bf        out 0x3d, r28 ; 61
  a6:  ae 01        movw r20, r28
  a8:  4d 5f        subi r20, 0xFD ; 253
  aa:  5f 4f        sbci r21, 0xFF ; 255
  ac:  5a 8b        std Y+18, r21 ; 0x12
  ae:  49 8b        std Y+17, r20 ; 0x11
  b0:  80 e6        ldi r24, 0x60 ; 96
  b2:  90 e0        ldi r25, 0x00 ; 0
  b4:  9c 8b        std Y+20, r25 ; 0x14
  b6:  8b 8b        std Y+19, r24 ; 0x13
  b8:  98 e0        ldi r25, 0x08 ; 8
  ba:  9d 8b        std Y+21, r25 ; 0x15
  bc:  eb 89        ldd r30, Y+19 ; 0x13
  be:  fc 89        ldd r31, Y+20 ; 0x14
  c0:  00 80        ld r0, Z
  c2:  4b 89        ldd r20, Y+19 ; 0x13
  c4:  5c 89        ldd r21, Y+20 ; 0x14
  c6:  4f 5f        subi r20, 0xFF ; 255
  c8:  5f 4f        sbci r21, 0xFF ; 255
  ca:  5c 8b        std Y+20, r21 ; 0x14
  cc:  4b 8b        std Y+19, r20 ; 0x13
  ce:  e9 89        ldd r30, Y+17 ; 0x11
  d0:  fa 89        ldd r31, Y+18 ; 0x12
  d2:  00 82        st Z, r0
  d4:  49 89        ldd r20, Y+17 ; 0x11
  d6:  5a 89        ldd r21, Y+18 ; 0x12
  d8:  4f 5f        subi r20, 0xFF ; 255
  da:  5f 4f        sbci r21, 0xFF ; 255
  dc:  5a 8b        std Y+18, r21 ; 0x12
  de:  49 8b        std Y+17, r20 ; 0x11
  e0:  5d 89        ldd r21, Y+21 ; 0x15
  e2:  51 50        subi r21, 0x01 ; 1
  e4:  5d 8b        std Y+21, r21 ; 0x15
  e6:  8d 89        ldd r24, Y+21 ; 0x15
  e8:  88 23        and r24, r24
  ea:  41 f7        brne .-48 ; 0xbc <main+0x2a>
  ec:  81 e0        ldi r24, 0x01 ; 1
  ee:  90 e0        ldi r25, 0x00 ; 0
  f0:  9a 83        std Y+2, r25 ; 0x02
  f2:  89 83        std Y+1, r24 ; 0x01
  f4:  a8 e3        ldi r26, 0x38 ; 56
  f6:  b0 e0        ldi r27, 0x00 ; 0
  f8:  89 81        ldd r24, Y+1 ; 0x01
  fa:  9a 81        ldd r25, Y+2 ; 0x02
  fc:  9c 01        movw r18, r24
  fe:  22 0f        add r18, r18
 100:  33 1f        adc r19, r19
 102:  ce 01        movw r24, r28
 104:  01 96        adiw r24, 0x01 ; 1
 106:  82 0f        add r24, r18
 108:  93 1f        adc r25, r19
 10a:  fc 01        movw r30, r24
 10c:  3a 96        adiw r30, 0x0a ; 10
 10e:  80 81        ld r24, Z
 110:  91 81        ldd r25, Z+1 ; 0x01
 112:  8c 93        st X, r24
 114:  1a 82        std Y+2, r1 ; 0x02
 116:  19 82        std Y+1, r1 ; 0x01
 118:  16 c0        rjmp .+44 ; 0x146 <main+0xb4>
 11a:  29 81        ldd r18, Y+1 ; 0x01
 11c:  3a 81        ldd r19, Y+2 ; 0x02
 11e:  e6 e3        ldi r30, 0x36 ; 54
 120:  f0 e0        ldi r31, 0x00 ; 0
 122:  80 81        ld r24, Z
 124:  48 2f        mov r20, r24
 126:  50 e0        ldi r21, 0x00 ; 0
 128:  22 0f        add r18, r18
 12a:  33 1f        adc r19, r19
 12c:  ce 01        movw r24, r28
 12e:  01 96        adiw r24, 0x01 ; 1
 130:  82 0f        add r24, r18
 132:  93 1f        adc r25, r19
 134:  fc 01        movw r30, r24
 136:  3a 96        adiw r30, 0x0a ; 10
 138:  51 83        std Z+1, r21 ; 0x01
 13a:  40 83        st Z, r20
 13c:  89 81        ldd r24, Y+1 ; 0x01
 13e:  9a 81        ldd r25, Y+2 ; 0x02
 140:  01 96        adiw r24, 0x01 ; 1
 142:  9a 83        std Y+2, r25 ; 0x02
 144:  89 83        std Y+1, r24 ; 0x01
 146:  89 81        ldd r24, Y+1 ; 0x01
 148:  9a 81        ldd r25, Y+2 ; 0x02
 14a:  83 30        cpi r24, 0x03 ; 3
 14c:  91 05        cpc r25, r1
 14e:  2c f3        brlt .-54 ; 0x11a <main+0x88>
 150:  1a 82        std Y+2, r1 ; 0x02
 152:  19 82        std Y+1, r1 ; 0x01
 154:  1b c0        rjmp .+54 ; 0x18c <main+0xfa>
 156:  a8 e3        ldi r26, 0x38 ; 56
 158:  b0 e0        ldi r27, 0x00 ; 0
 15a:  23 e0        ldi r18, 0x03 ; 3
 15c:  30 e0        ldi r19, 0x00 ; 0
 15e:  89 81        ldd r24, Y+1 ; 0x01
 160:  9a 81        ldd r25, Y+2 ; 0x02
 162:  f9 01        movw r30, r18
 164:  e8 1b        sub r30, r24
 166:  f9 0b        sbc r31, r25
 168:  cf 01        movw r24, r30
 16a:  9c 01        movw r18, r24
 16c:  22 0f        add r18, r18
 16e:  33 1f        adc r19, r19
 170:  ce 01        movw r24, r28
 172:  01 96        adiw r24, 0x01 ; 1
 174:  82 0f        add r24, r18
 176:  93 1f        adc r25, r19
 178:  fc 01        movw r30, r24
 17a:  3a 96        adiw r30, 0x0a ; 10
 17c:  80 81        ld r24, Z
 17e:  91 81        ldd r25, Z+1 ; 0x01
 180:  8c 93        st X, r24
 182:  89 81        ldd r24, Y+1 ; 0x01
 184:  9a 81        ldd r25, Y+2 ; 0x02
 186:  01 96        adiw r24, 0x01 ; 1
 188:  9a 83        std Y+2, r25 ; 0x02
 18a:  89 83        std Y+1, r24 ; 0x01
 18c:  89 81        ldd r24, Y+1 ; 0x01
 18e:  9a 81        ldd r25, Y+2 ; 0x02
 190:  83 30        cpi r24, 0x03 ; 3
 192:  91 05        cpc r25, r1
 194:  04 f3        brlt .-64 ; 0x156 <main+0xc4>
 196:  80 e0        ldi r24, 0x00 ; 0
 198:  90 e0        ldi r25, 0x00 ; 0
 19a:  65 96        adiw r28, 0x15 ; 21
 19c:  0f b6        in r0, 0x3f ; 63
 19e:  f8 94        cli
 1a0:  de bf        out 0x3e, r29 ; 62
 1a2:  0f be        out 0x3f, r0 ; 63
 1a4:  cd bf        out 0x3d, r28 ; 61
 1a6:  df 91        pop r29
 1a8:  cf 91        pop r28
 1aa:  08 95        ret
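One way to read the listing is to write down where each item lives relative to Y. The sketch below is an interpretation of the Y+n offsets seen in the unoptimized code above, not something the compiler reports. All of the names are invented, and the three temporaries (a destination pointer, a source pointer, and a byte counter used to copy the string into word) are inferred from the loop at 0xbc through 0xea.

```c
#include <assert.h>
#include <stdint.h>

enum {
    AVR_INT_SIZE = 2, /* int is 16 bits wide */
    AVR_PTR_SIZE = 2, /* data pointers are 16 bits wide */
    WORD_LEN     = 8, /* "funness" plus the null terminator */
    X_LEN        = 3  /* elements in x */
};

/* Offsets from Y; the first local variable sits at Y+1. */
enum {
    OFF_I      = 1,                              /* Y+1,  Y+2  */
    OFF_WORD   = OFF_I + AVR_INT_SIZE,           /* Y+3..Y+10  */
    OFF_X      = OFF_WORD + WORD_LEN,            /* Y+11..Y+16 */
    OFF_DST    = OFF_X + X_LEN * AVR_INT_SIZE,   /* Y+17, Y+18 */
    OFF_SRC    = OFF_DST + AVR_PTR_SIZE,         /* Y+19, Y+20 */
    OFF_COUNT  = OFF_SRC + AVR_PTR_SIZE,         /* Y+21       */
    FRAME_SIZE = OFF_COUNT + 1 - OFF_I           /* 21 bytes   */
};

/* Offset from Y of x[index], as computed at 0xfc through 0x10c:
   (Y + 1) + 2 * index + 10. */
int x_offset(int index)
{
    assert(index >= 0 && index < X_LEN);
    return OFF_X + index * AVR_INT_SIZE;
}
```

Under this reading the frame is 21 bytes, which matches the sbiw r28, 0x15 at the start of main and the adiw r28, 0x15 at the end. Note that x[3-i] in the source evaluates to x[3] when i is 0, which lies outside the array; the assert in x_offset would reject that index.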
Exercise for Students
Recompile the same program with optimization enabled (-Os, which optimizes for size) and see how things change. Is this what you would expect?