This lab uses the SNOWBALL server. You will need to log in and create a log like you did in previous labs. Remember to turn in a version of the log that has the control characters removed. Also, there are prompts/questions to answer in bold.
For your information, NASM is the "Netwide ASseMbler" for x86 CPUs. It is commonly used to program x86-based computers in assembly. This lab uses the nasm program.
After this lab, we will use the Venus simulator. You will notice that the code in future labs is a bit different from this and the previous labs, and you will likely find it to be simpler. After this lab, we will use the Venus assembler for this course unless specifically stated otherwise.
First, we will use the pound-sign ("#") for comments.
NASM uses the semi-colon (";") for comments.
Assembly language code contains directives, which inform the
assembler of what to do, but are not actually assembly language commands
since they do not have a machine-language equivalent.
Here are some points to observe about the directives.
Venus | gcc output | NASM | Comment |
.data | .section .rodata | section .data | The data section |
.asciiz, .string | .string "hello world." | db "Hello world", 0 | Defining a string |
.text | .text | section .text | The code section |
.globl | .globl | global | Associated label will be global |
.globl main | global main | Where the code actually starts | |
pushq %rbp | push rbp | Put the stack frame reference on the stack. We are going to change rbp. | |
movl $.LC0, %edi | mov rdi,fmt | di holds a pointer to the start of the format string | |
la a1, msg | movq %rsi, -16(%rbp) | mov rsi,msg | a1 holds a pointer to the start of string that "msg" defines si holds a pointer to the start of the string to print |
li a0, 4 ecall | call puts | call printf | Call the function that prints the text |
li a0, 0 | movl $0, %eax | mov rax,0 | Put the value 0 into the A (or a0) register |
leave | pop rbp | Restore the rbp value from the start. | |
.byte | db | 8 bit value | |
.word | dw | 32 bit value |
There is a 0 (also called a null-character) after the "Hello world" string, which is a convention for strings. We know the start of the string, specified with a label. In non-object oriented languages, how do you know the length or end of a string? One way is to encode the end of the string with a special character, and here it is the NUL character (0). Given the start of a string, any function can then iterate over the string's characters until it comes to a 0, and it then know that the string's end has been reached.
Programs often use "main:" as a label, defining where the code starts. While this is mandatory with some assemblers, Venus does not appear to need it.
Both "puts" and "printf" are functions to send output to stdout. It's interesting to note that the original lab1.c program specified printf, but that the compiler generated code to call puts instead. Changes like this happen when a compiler optimizes our code for us; the result may in fact be more efficient. However, as the programmer you are responsible for your code. If there is some obscure bug on your particular system in puts but not in printf, looking at the higher-level language (HLL) code you might conclude "it cannot be that because my code uses printf". A bug that hides at the HLL level does not hide at the assembly language level.
Compiling the gcc version is done with "gcc -c lab1.s", then linking it is done with "gcc lab1.o -o lab1". Compiling the NASM version is done with "nasm -f elf64 hello_64.asm", then linking it with "gcc hello_64.o -o hello_64". Using "-f elf64" specifies the file format for the output. On SNOWBALL, omitting the "-f elf64" will generate some errors. There is an optional "-l hello_64.lst" that creates a "listing" file, with both the assembly language instructions and the machine language results.
; Assemble: nasm -f elf64 AddTwoSum_64.asm
; Link: gcc AddTwoSum_64.o -o AddTwoSum_64
; AddTwoSum_64.asm - Example from Chapter 3 of the book by Kip Irvine.
; See http://www.asmirvine.com/gettingStartedVS2019/index.htm
; This is adapted for NASM.
section .data ; Data section, initialized variables
sum: dq 0
section .text
global main
main:
mov rax, 5
add rax, 6
mov [sum], rax
mov rax, 0
ret
You can download this here.
Once you have a copy on SNOWBALL, use the "nasm" command to assemble it.
Then use the "gcc" command to link it.
Then run it.
Describe what this program does from the "main:" label to the end.
What do you observe when you run it? Does the program work?
Now let's look at an expanded version.
; Assemble: nasm -f elf64 AddTwoSum_64_pt2.asm
; Link: gcc AddTwoSum_64_pt2.o -o AddTwoSum_64_pt2
; Based on AddTwoSum_64.asm (by Kip Irvine)
; This is adapted for NASM.
extern printf ; We will use this external function
section .data ; Data section, initialized variables
mystr: db "%d", 10, 0 ; String format to use (decimal), followed by NL
sum: dq 0
section .text
global main
main:
mov rax,5
add rax,6
mov [sum], rax
; Now print the result out
mov rdi, mystr ; Format of the string to print
mov rsi, [sum] ; Value to print
mov rax, 0
call printf
mov rax, 0
ret
You can download this here.
Once you have a copy on SNOWBALL, use the "nasm" command to assemble it.
Then use the "gcc" command to link it.
Then run it.
What do you observe? Does the program work?
What does this program do differently from the first one? (Describe
what the assembly language commands do.)
Compile this again, only this time use
"nasm -f elf64 -l AddTwoSum_64_pt2.lst AddTwoSum_64_pt2.asm".
Use "cat" to show the AddTwoSum_64_pt2.lst file.
What do you observe in the file?
Run the command "xxd AddTwoSum_64_pt2" (which creates a hexadecimal dump
of the file's contents). What do you observe there, and how does it
relate to the .lst file? (Hint: look for the values B8 in the AddTwoSum_64_pt2.lst and b8 in the xxd output.)
Now we'll see a slightly different version, called AddTwoSum_64_pt3.asm. Download it, put it on SNOWBALL, and use the "nasm" command to assemble it then gcc to link it. Then run it. Do you observe any differences between this and AddTwoSum_64_pt2.asm? Use the "diff" command to show the differences between them, then explain what they are.
Now run AddTwoSum_64_pt2, and then enter the command
echo $?
Next, run AddTwoSum_64_pt3,
and then enter the command
echo $?
What do you observe about the output from these two commands?
Look up what a "return value" value is under Unix/Linux,
describe what it is, and say how it relates to this lab.
Be sure to document where you got your answer, and use double-quotes
for anything you do not say yourself.
Remember that we will grade your lab report so it is vital to turn that in. The other files (your code, a text version of any log file, etc.) are to document your work in case we need more information.
In this lab, we have: