Lab 13 -- Machine Language

By now you have worked with the assembler enough to see that it translates assembly language instructions into machine language. It makes some things easy for us, such as allowing labels. You may have wondered how the assembler works, and we will explore this in this lab.

Part 1

We will use a program called "print_regs", which as the name suggests prints the values of the A, B, C, and D registers. Instead of working with an assembly language file, we will be working on the executable.

The first step is to get the "print_regs" file. Here is what the program does when run.


    [mweeks@gsuad.gsu.edu@snowball ~]$ ./print_regs
    A = 0
    B = 0
    C = 0
    D = 0
    [mweeks@gsuad.gsu.edu@snowball ~]$

You should be able to download this file, then use sftp to copy it to SNOWBALL. The following shows the steps you need to take to get it to run. Note that "print_regs_downloaded" is a copy of "print_regs" that was downloaded (using Firefox), then put on the server under a new name. You do not need to rename your copy.

    [mweeks@gsuad.gsu.edu@snowball ~]$ ./print_regs_downloaded
    -bash: ./print_regs_downloaded: Permission denied
    [mweeks@gsuad.gsu.edu@snowball ~]$ ls -l print_regs_downloaded
    -rw-r--r--. 1 mweeks@gsuad.gsu.edu mweeks@gsuad.gsu.edu 8656 Mar 13 12:10 print_regs_downloaded
    [mweeks@gsuad.gsu.edu@snowball ~]$ chmod 755 print_regs_downloaded
    [mweeks@gsuad.gsu.edu@snowball ~]$ ./print_regs_downloaded
    A = 0
    B = 0
    C = 0
    D = 0
    [mweeks@gsuad.gsu.edu@snowball ~]$ 

Trying to run the program generates a "Permission denied" error, which is a security feature: you normally do not want to download and run someone else's compiled program. Using "ls -l" shows that the file has read and write permissions for the owner and read permission for the group and world, but nobody has execute permission. The "chmod 755" command changes the permissions to read, write, and execute for the file owner, and read and execute for the group and world. Finally, running the program shows that it outputs the same information as "print_regs" above. Notice that the file size is "8656" bytes, which is something that you can use to verify a copy if you run into problems.
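
The permission string from "ls -l" and the octal argument to "chmod" describe the same nine bits. The following Python sketch is not part of the lab; the helper name "describe" is made up for illustration. It uses the standard "stat" module to show how 644 and 755 map to the strings that "ls -l" prints.

```python
import stat

def describe(octal_mode):
    """Return the ls-style permission string for a regular file."""
    # S_IFREG marks a regular file, which produces the leading "-".
    return stat.filemode(stat.S_IFREG | octal_mode)

if __name__ == "__main__":
    print(describe(0o644))  # permissions of the freshly downloaded file
    print(describe(0o755))  # permissions after "chmod 755"
```

Each octal digit is one group of three bits (read = 4, write = 2, execute = 1), so 7 means rwx and 5 means r-x. The trailing "." in the "ls -l" output above is not one of the permission bits.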

Your first task is simple: get a copy of the "print_regs" program, and run it on SNOWBALL. If you were able to do this, skip to Part 2.

If you run into trouble getting the file on SNOWBALL, you can instead download the file print_regs.xxd and put it on the server. That file is the result of running "xxd" on "print_regs". In other words, it is the same file written out as text, with each byte shown as a hexadecimal value. On the server, you can reverse ("-r") the encoding, and redirect the output to a file.


    xxd -r print_regs.xxd > print_regs
You will still need to do the "chmod" command from above. Note that if you get an error like "xxd: Disk quota exceeded", then delete the file that xxd created, and check that the command (and .xxd file) are correct. You may need to reach out to a TA if this happens.




Part 2

We will use the hexadecimal version of the "print_regs" file, too. You can easily create it with the following command.


    xxd print_regs > print_regs.xxd

Next, examine the "print_regs.xxd" file. It is over 500 lines long (8656 bytes at 16 bytes per line gives 541 lines), so using a command like "more" or even "cat" can be wasteful. Instead, use the following command.

    grep "00005[3456]0:" < print_regs.xxd
This redirects the "print_regs.xxd" file as input to the "grep" command. Grep finds a pattern within a file, and prints all lines matching that pattern. In this case, the pattern is "00005" followed by 3, 4, 5, or 6, then "0:", so any line containing "0000530:", "0000540:", etc., will be printed. When you use the "grep" command above, you should see 4 lines. Show the "grep" output in the log.
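
If the bracket expression is new to you, the following Python sketch applies the same pattern to a handful of offset columns. The list of offsets is made up for illustration; it stands in for lines of print_regs.xxd. For a pattern this simple, Python's "re" module and "grep" agree.

```python
import re

# The same pattern given to grep; [3456] matches exactly one of those digits.
PATTERN = re.compile(r"00005[3456]0:")

# Hypothetical offset columns, standing in for lines of print_regs.xxd.
offsets = ["0000520:", "0000530:", "0000540:", "0000550:", "0000560:", "0000570:"]

# Keep only the entries that contain the pattern, as grep would.
matches = [o for o in offsets if PATTERN.search(o)]

if __name__ == "__main__":
    for o in matches:
        print(o)
```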

The file offset 530 (hexadecimal) is where the code under the "main:" label starts. You should see that the first three bytes are 48 31 c0, which encode the instruction "xor rax, rax". After this is 48 31 db, which is "xor rbx, rbx". Next is 48 31 c9, which is "xor rcx, rcx".
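
To see why only the third byte differs between these instructions, it helps to split it into bit fields. In each case, 48 is a prefix that selects 64-bit operands, 31 is the XOR opcode, and the third byte (called ModRM) names the two registers. The following Python sketch handles only this one register-to-register form; the function name "decode_xor_rr" is made up for illustration, and this is not how a real disassembler is organized.

```python
# Register numbering used in the ModRM fields (64-bit names).
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]

def decode_xor_rr(code):
    """Decode a 3-byte 'xor r64, r64' of the form 48 31 ModRM."""
    rex, opcode, modrm = code
    if rex != 0x48 or opcode != 0x31:
        raise ValueError("not a 48 31 xor instruction")
    mod = modrm >> 6             # top 2 bits: 0b11 means both operands are registers
    reg = (modrm >> 3) & 0b111   # middle 3 bits: source register
    rm = modrm & 0b111           # low 3 bits: destination register
    if mod != 0b11:
        raise ValueError("memory operands are outside this sketch")
    return f"xor {REGS64[rm]},{REGS64[reg]}"

if __name__ == "__main__":
    for text in ("4831c0", "4831db", "4831c9"):
        print(text, "->", decode_xor_rr(bytes.fromhex(text)))
```

For example, db is 11 011 011 in binary, and register number 3 (011) is rbx in both fields.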

Questions:




Part 3

The following is partial output from the command "objdump -d print_regs -M intel". You do not have to use this command for this lab. Notice that it disassembles the code of the "print_regs" program.

    0000000000400530 <main>:
      400530:    48 31 c0                 xor    rax,rax
      400533:    48 31 db                 xor    rbx,rbx
      400536:    48 31 c9                 xor    rcx,rcx
      400539:    48 31 d2                 xor    rdx,rdx
      40053c:    90                       nop
      40053d:    90                       nop
      40053e:    90                       nop

Notice that the byte value "90" is repeated many times. Each 90 encodes the "NOP" instruction, which literally means "no operation". These NOPs give us some room to work with.

At this point, it is a good idea for you to create a back-up of the print_regs.xxd file. You can do this simply by copying it to another filename. Next, change the print_regs.xxd file to replace 5 consecutive NOPs with the following values, in this order.


    bb 45 00 00 00

Each line of the xxd file has three sections, as the following line shows.

    0000530: 4831 c048 31db 4831 c948 31d2 9090 9090  H1.H1.H1.H1.....
The left-most column specifies the offset. The next 8 columns contain 2 bytes of data each. The last column shows the equivalent data in ASCII. Ignore the equivalent data in ASCII, since the xxd program will ignore it, too. When finished, use xxd to "reverse" the hex encoding with "-r", and redirect the output to a file.

    xxd -r print_regs.xxd > print_regs
Again, xxd does not look at the last column when reversing the hex encoding. The new copy of "print_regs" may not be executable, so issue the "chmod" command (from part 1) if needed. Note that if you get an error like "xxd: Disk quota exceeded", then delete the file that xxd created, and check that the command (and .xxd file) are correct.
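
To make the three sections concrete, the following Python sketch splits one xxd line into its offset and data bytes and drops the ASCII column. The function name "parse_xxd_line" is made up for illustration, and this is only a sketch of the idea, not how xxd itself is implemented.

```python
def parse_xxd_line(line):
    """Split one xxd line into (offset, data bytes); the ASCII column is dropped."""
    offset_text, rest = line.split(":", 1)
    # Hex groups are separated by single spaces; two spaces start the ASCII column.
    hex_text = rest.lstrip().split("  ", 1)[0]
    return int(offset_text, 16), bytes.fromhex(hex_text)

LINE = "0000530: 4831 c048 31db 4831 c948 31d2 9090 9090  H1.H1.H1.H1....."

if __name__ == "__main__":
    offset, data = parse_xxd_line(LINE)
    print(hex(offset), len(data), data.hex())
```

Changing the last column of LINE has no effect on the result, which mirrors the behavior described above.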

Now run the new "print_regs" program. What do you observe?

As you probably figured out, the values "45 00 00 00" specify a 32-bit immediate value, stored least-significant byte first. The rest of the instruction ("bb") specifies what to do, i.e. put that immediate value into a register. Revisit the print_regs.xxd file, and also replace some of the NOPs with b8 34 00 00 00, b9 56 00 00 00, and ba 67 00 00 00. Be careful when you do this, since the size of the file should stay the same: overwrite NOP bytes rather than inserting new bytes. Use the "xxd -r" command to make a new program, give it execute permission, and run it.
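
The two ideas in this part, overwriting NOPs without changing the size and reading a 5-byte move instruction, can be sketched in Python as follows. The function names are made up for illustration, and the number of NOPs in the demo data is an assumption rather than a copy of the real file.

```python
NOP = 0x90

# mov r32, imm32 uses opcode b8 + register number, then 4 immediate bytes.
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_mov_imm32(code):
    """Decode a 5-byte 'mov r32, imm32' (opcodes b8 through bf)."""
    if len(code) != 5 or not 0xB8 <= code[0] <= 0xBF:
        raise ValueError("not a 5-byte mov r32, imm32")
    value = int.from_bytes(code[1:], "little")  # least-significant byte first
    return f"mov {REGS32[code[0] - 0xB8]},{value:#x}"

def overwrite_nops(data, offset, new_bytes):
    """Return a copy of data with NOPs at offset replaced; the length never changes."""
    old = data[offset:offset + len(new_bytes)]
    if len(old) != len(new_bytes) or any(b != NOP for b in old):
        raise ValueError("target range is not all NOPs")
    return data[:offset] + new_bytes + data[offset + len(new_bytes):]

# The four xor instructions from main, followed by 8 NOPs (count assumed).
DEMO = bytes.fromhex("4831c04831db4831c94831d2") + bytes([NOP] * 8)

if __name__ == "__main__":
    patched = overwrite_nops(DEMO, 12, bytes.fromhex("bb45000000"))
    print(len(DEMO), len(patched))            # same size before and after
    print(decode_mov_imm32(patched[12:17]))
```

Refusing to overwrite anything other than NOPs is a design choice for this sketch: it catches an off-by-one offset before it damages a real instruction.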

Questions:




Part 4

Let's try an experiment with a new file. Copy the following line, and paste it into a file called "lab13_pt4.xxd".


    0000000: 9048 31c9 bb45 0000 0000 0000 0000 0000  .ABC............
Now use "xxd -r" as above, and redirect the output to "lab13_pt4". This will create a binary file, and while it has valid machine language code in it, the OS is not going to run it. However, we can still disassemble it.

Trying to use "objdump" on the "lab13_pt4" file will generate an error ("file format not recognized"). We can instead use "ndisasm", the Netwide Disassembler. Try the following.


    ndisasm lab13_pt4

Question: Does the result match what you expected? Why not?

Now try the following.

    ndisasm -b64 lab13_pt4

Question: Does the result match what you expected? What does the "-b64" specify, and why does it help?




Part 5

The command "mov ebx, eax" assembles to "89 c3". Let's copy this into the file "lab13_pt5.xxd", as follows.


    0000000: 9048 31c9 bb45 0000 89c3 0000 0000 0000  .ABC............
Use "xxd -r" to make "lab13_pt5". Now use "ndisasm" on it ("ndisasm -b64 lab13_pt5").

Question: Is the output what you expect? Why or why not?

This sort of thing can happen when you use a disassembler on a file that contains more than just code. Let's examine that again, only this time using a "sync point" with the disassembler. This is the "-s008h" argument, which tells the disassembler to synchronize, i.e. to start a new instruction, at offset 008 in hexadecimal. Yes, we could specify it a bit more simply as "-s8".

    ndisasm -b64 -s008h lab13_pt5
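
A disassembler reads one instruction, then starts the next one wherever the previous one ended. The following Python sketch imitates that walk on the bytes of "lab13_pt5", with and without a sync point. The length table is a toy that covers only the byte patterns used in this lab, the function names are made up for illustration, and the way ndisasm prints the bytes left over before a sync point is not modeled here.

```python
def insn_length(first_byte):
    """Toy length table covering only the byte patterns used in this lab."""
    if first_byte in (0x90, 0xC3):      # nop, ret
        return 1
    if first_byte in (0x00, 0x89):      # opcode + ModRM (assumed: no displacement)
        return 2
    if first_byte == 0x48:              # 64-bit prefix + opcode + register ModRM
        return 3
    if 0xB8 <= first_byte <= 0xBF:      # mov r32, imm32
        return 5
    return 1                            # anything else: treat as one data byte

def sweep(code, sync=None):
    """Return (offset, length) pairs from a linear walk, optionally honoring a sync point."""
    pieces, pos = [], 0
    while pos < len(code):
        length = min(insn_length(code[pos]), len(code) - pos)
        if sync is not None and pos < sync < pos + length:
            length = sync - pos         # stop short so the next piece starts at sync
        pieces.append((pos, length))
        pos += length
    return pieces

CODE = bytes.fromhex("9048 31c9 bb45 0000 89c3 0000 0000 0000")

if __name__ == "__main__":
    print(sweep(CODE))
    print(sweep(CODE, sync=8))
```

Compare the two lists with the two ndisasm outputs: look for the offset where a piece starts in one list but not in the other.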

Questions:


More Questions:

In this lab, we have learned: