Lab 6 -- macros and subroutines

Have you noticed how often we need to repeat the same code? Printing a value is a good example. To print a value, we define a format string and a data value (here, "sum") in the data section.


    mystr: db "%d", 10, 0   ; Format string (decimal), then newline (10) and terminating 0
    sum:   dq 0             ; The value to print

Then in the code section, we pass the format string and the value to printf as follows.

                      ; Now print the result out
   mov   rdi, mystr   ; 1st argument: address of the format string
   mov   rsi, [sum]   ; 2nd argument: the value to print
   mov   rax, 0       ; No floating-point arguments are passed
   call  printf

Every time we print something, we need a sequence of commands like the one above. To print another value, we repeat the same code with a minor variation. To print a third value, we repeat it yet again. Isn't there a way to make this easier?

There are a couple of ways, and that is the subject of this lab. First, we will define and use a macro in part 1. Then we will define and use a subroutine in part 2.

Part 1

A macro is a kind of shorthand that the assembler (technically, its preprocessor) expands before assembling the program. Higher-level languages have this too, such as the "#define" directive in C. It works like a smart find-and-replace: wherever the macro's name is found, it is "expanded" into whatever the programmer defined. This is perhaps best explained with an example.

    %macro  print 2 
    
                          ; Print arg2 using string arg1
       mov   rdi, %1      ; Format of the string to print
       mov   rsi, %2      ; Value to print
       mov   rax, 0
       call  printf

    %endmacro

The "%macro" and "%endmacro" directives mark where the macro begins and ends. The macro is named "print" and takes 2 arguments. If you examine the code within the macro, you should recognize it as the commands that we have used to print a value. The only detail left to notice is that the macro contains "%1" and "%2", which stand for the first and second arguments, respectively.

To use the macro, we put the macro's name, followed by the label of the format string and the memory location to print.


   print mystr, [sum]

The assembler "expands" this line into the code defined in the macro, substituting "mystr" for "%1" and "[sum]" for "%2". Instead of typing out the four lines needed to call printf, we write a single line with "print". Whether you use the macro or type out the equivalent lines, the assembled result is the same.
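
To make the substitution concrete, here is a sketch of what the assembler ends up assembling for that one line. It is a fragment meant to sit inside main, not a complete program, and it assumes "mystr" and "sum" are defined in the data section as shown earlier.

```nasm
   ; Source line as written by the programmer:
   ;     print mystr, [sum]
   ;
   ; Code the assembler actually assembles after expansion:
   mov   rdi, mystr   ; %1 was replaced by mystr
   mov   rsi, [sum]   ; %2 was replaced by [sum]
   mov   rax, 0
   call  printf
```

If you are using NASM, running it with the -E option prints the preprocessed source, which is a handy way to check what a macro expands to.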

If we have several values to print with different format strings, our code might look like this.


   print mystr1, [val1]
   print mystr2, [val2]
   print mystr3, [sum]

It should be easy to see that working with macros can save the programmer a lot of time and energy. It can also make the code easier to debug, since every print follows the same pattern. Imagine that we do not use a macro and instead type the commands for printf several times, and suppose that one copy contains a subtle mistake, like switching rdi and rsi. Would you be able to spot the difference?
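
Putting the pieces together, a complete program using the macro might look like the sketch below. Several things here are assumptions rather than part of the lab text: the format strings, the values 7 and 35, a 64-bit Linux system, and a build along the lines of "nasm -f elf64" followed by "gcc -no-pie". The "sub rsp, 8" and "add rsp, 8" pair is also an addition: it keeps the stack 16-byte aligned at each call to printf, which the Linux calling convention requires.

```nasm
        extern  printf
        global  main

; The macro must be defined before its first use.
%macro  print 2
        mov     rdi, %1         ; Format of the string to print
        mov     rsi, %2         ; Value to print
        mov     rax, 0          ; No floating-point arguments
        call    printf
%endmacro

        section .data
mystr1: db  "val1 = %d", 10, 0  ; hypothetical format strings
mystr2: db  "val2 = %d", 10, 0
val1:   dq  7                   ; hypothetical values
val2:   dq  35

        section .text
main:
        sub     rsp, 8          ; align the stack for the calls below
        print   mystr1, [val1]  ; expands to four instructions
        print   mystr2, [val2]  ; expands to four more instructions
        add     rsp, 8          ; undo the alignment adjustment
        mov     rax, 0          ; return 0 from main
        ret
```

Note that each use of "print" adds its own copy of the four instructions to the program. The source gets shorter, but the assembled code does not.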

See this link for more information about macros.

Copy your code from the last lab, and replace any instances of calling printf with a macro as defined above. Call the result "lab6_pt1.asm". As with all labs, show the code (use "cat"), show the compilation, and show that it runs.

Part 2

Another option for making repetitive code easier to use is the subroutine. Like a macro, a subroutine defines code that you can use again and again. However, a subroutine is a function, similar to a method in OOP: its code appears once in the program, and control jumps to it each time it is used. When you want to use a subroutine, you issue a call instruction to it, like the following.

   call mysubroutine

The subroutine must be defined in the code section. If you recall, all examples include a "ret" instruction as the last command. This returns control to whatever called your program, such as the OS shell. A subroutine is no different: it must end with a return instruction. When the computer calls the subroutine, it must remember where to come back to. It does this by pushing the return address (the address of the instruction right after the call, which is what the Instruction Pointer (IP) holds at that moment) on the stack, then setting the IP to the subroutine's address. When the CPU gets to the return instruction, it pops that address from the stack and puts it in the IP.
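
The call and return mechanism can be seen in a very small program. This is only a sketch: the names "set_answer" and "after_call" are made up for illustration, and it assumes the program is linked with the C library (for example with gcc) so that the value main leaves in rax becomes the exit status.

```nasm
        global  main

        section .text
main:
        call    set_answer      ; push the address of after_call, then jump to set_answer
after_call:                     ; the ret in set_answer brings us back here
        ret                     ; return to whatever called main; rax is the exit status

set_answer:
        mov     rax, 42         ; the "work" done by the subroutine
        ret                     ; pop the saved address into the instruction pointer
```

After running the program, "echo $?" in the shell shows the exit status, which here is the value that set_answer placed in rax.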

We can create a subroutine for printing an integer and call it like this.


   call print_int

It could use a pre-defined format string from the data section; if all that it prints is an integer, the string "%d" would work. But this raises a question: how does the subroutine know what value to print? We have to communicate the value somehow. One possible solution is to reserve a specific data value, defined with a label in the data section, and move the value to that location before calling "print_int". While this would work, you (the programmer) would need to remember which label to use for the move. Another solution is to use a register, such as the A register (rax): move the value to print to rax, then call the subroutine. This helps with efficiency, especially if the subroutine needs the value in a register anyway.
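
For comparison, the first solution (passing the value through a labeled memory location) could be sketched as follows. The names "print_arg" and "print_int_mem" and the value 42 are hypothetical, chosen only for this illustration, and the same Linux build assumptions as in part 1 apply.

```nasm
        extern  printf
        global  main

        section .data
int_format: db  "%d", 10, 0
print_arg:  dq  0               ; hypothetical location the subroutine reads from
sum:        dq  42              ; hypothetical value to print

        section .text
main:
        mov     rax, [sum]
        mov     [print_arg], rax    ; store the value where the subroutine expects it
        call    print_int_mem       ; this call also leaves the stack 16-byte
                                    ; aligned for the printf call inside
        mov     rax, 0              ; return 0 from main
        ret

print_int_mem:
        mov     rdi, int_format     ; Format of the string to print
        mov     rsi, [print_arg]    ; Value to print, fetched from memory
        mov     rax, 0              ; No floating-point arguments
        call    printf
        ret
```

The caller needs an extra store and the subroutine an extra load, which is the inefficiency that the register approach avoids.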

The subroutine should be located after the main function's "ret", so that execution never falls into it by accident. Make sure that "int_format" is defined in the data section. It could be the same as "mystr". The program should look like this:


   main:
       ; ... code goes here
       ; Put value to print in A register, if it is not there already
       mov   rax, [sum]
       call  print_int
       ; ... more code
       ret

   print_int:
       ; Instructions to print an int value go here
       mov   rdi, int_format  ; Format of the string to print
       mov   rsi, rax         ; Value to print (copy it before rax is overwritten)
       mov   rax, 0           ; No floating-point arguments are passed
       call  printf
       ret
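
One consequence of passing the value in rax is worth seeing in a full program: rax does not survive the call, because print_int sets it to 0 and printf then leaves its own return value there. The sketch below therefore reloads rax before every call. The labels "val1" and "val2" and their values are assumptions for illustration, as are the Linux build details from part 1.

```nasm
        extern  printf
        global  main

        section .data
int_format: db  "%d", 10, 0
val1:       dq  7               ; hypothetical values
val2:       dq  35

        section .text
main:
        mov     rax, [val1]     ; value to print goes in rax
        call    print_int
        mov     rax, [val2]     ; rax was overwritten, so load it again
        call    print_int
        mov     rax, 0          ; return 0 from main
        ret

print_int:
        mov     rdi, int_format ; Format of the string to print
        mov     rsi, rax        ; Value to print
        mov     rax, 0          ; No floating-point arguments
        call    printf
        ret
```

Unlike the macro version, the instructions that call printf appear only once in the assembled program, no matter how many times print_int is called.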

Copy your code from the previous lab (or part 1), and replace any instances of calling printf with a subroutine as defined above. Call the result "lab6_pt2.asm". As with all labs, show the code (use "cat"), show the compilation, and show that it runs.

Questions:

What we learned