4 – Developing an Operating System – Tutorial – Episode 3 – 7c00 Memory addressing

In Episode 2, we saw how we can write a simple boot sector and print Welcome to LearnOS on the screen.

Links to Previous Episodes:

Episode 1 – Introduction
Episode 2 – BIOS, UEFI, Boot loader

In this episode, we are going to learn about memory addressing. I was planning to get into kernel writing, but thought it would be better to cover some more basics first instead of jumping directly into the kernel. So, let’s get right into it.

YouTube video of this Episode – Episode 3:

So, what is 7c00? You might have seen other boot sector code that uses 7c00 as the origin of the boot sector. But why? Have a look at the following figure:

When the BIOS loads the first sector of our disk into memory, it places it at a particular memory address: 7c00 (hexadecimal, written 0x7c00 or 7c00h in code).

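Let us verify whether the BIOS really loads our program at 7c00. Along the way we will figure out why boot loader / boot sector programs start with ORG 7c00h or ORG 0x7c00. This directive does not change any CPU register; it tells the assembler at which address the code will be loaded, so that it can compute the addresses of labels correctly.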

Let’s write the following code in our boot.asm file:

 mov     ah, 0Eh
 
 mov     al, "1"    ;Print 1
 int        10h
 mov     al, data_text
 int        10h

 jmp $        ;Infinite loop

 data_text:
        db      "0"

 times     510-($-$$)    db    0
 dw         0xAA55

If you compile the above code and run it in QEMU (not sure how to compile and run? Read here), you will see the following output:

We expect the code to print 1 followed by 0 on the screen, but instead we see 1 and a junk character. This happens because mov al, data_text loads the address of data_text (the low byte of the label's offset) into al, not the byte stored at that address.

Let us try the correct approach and add code that prints the content of the memory at data_text, using square brackets to read from the address instead of loading the address itself. The code will look like the following:

 mov     ah, 0Eh
 
 mov     al, "1"    ;Print 1
 int        10h
 mov     al, data_text
 int        10h

 mov     al, "2"    ;Print 2
 int        10h
 mov     al, [data_text]
 int        10h

 jmp $        ;Infinite loop

 data_text:
        db      "0"

 times     510-($-$$)    db    0
 dw         0xAA55

Now, if you compile and run the code, the final output will be as follows:

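The output is blank after 2. This happens because the assembler does not know where our program will be loaded. Without an ORG directive it assumes the code starts at address 0, so [data_text] reads from the label's offset within the file rather than from 7c00 plus that offset, which is where the BIOS actually placed the byte.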

Now let’s improve it further by accounting for the load address ourselves: we load 7c00 into the bx register (the base register) and add the offset of data_text to it when reading the byte. Our code will look like the following:

mov     ah, 0Eh
 
 mov     al, "1"    ;Print 1
 int        10h
 mov     al, data_text
 int        10h

 mov     al, "2"    ;Print 2
 int        10h
 mov     al, [data_text]
 int        10h

 mov     al, "3"    ;Print 3
 int        10h
 mov     bx, 0x7c00        ;bx = address where BIOS loaded our code
 mov     si, data_text        ;si = offset of data_text within our code
 mov     al, [bx+si]        ;read the byte at 7c00 + offset of data_text
 int        10h

 jmp $        ;Infinite loop

 data_text:
        db      "0"

 times     510-($-$$)    db    0
 dw         0xAA55

Compile the above program and run it, and you should see the following output:

This works. Now let’s do one more verification: we will access the memory address directly and print its content instead of going through the data_text label.

mov     ah, 0Eh
 
mov     al, "1"    ;Print 1
int        10h
mov     al, data_text
int        10h

mov     al, "2"    ;Print 2
int        10h
mov     al, [data_text]
int        10h

mov     al, "3"    ;Print 3
int        10h
mov     bx, 0x7c00        ;bx = address where BIOS loaded our code
mov     si, data_text        ;si = offset of data_text within our code
mov     al, [bx+si]        ;read the byte at 7c00 + offset of data_text
int        10h

mov     al, "4"
int        10h
mov     al, [0x7c2c]
int        10h   

jmp $        ;Infinite loop

data_text:
        db      "0"

times     510-($-$$)    db    0
dw         0xAA55 

After compiling and running the above code, we get the following output:

This confirms that the BIOS loads our code at memory address 7c00. Notice that we read from 7c2c and not 7c00, because the instructions above data_text occupy the first 0x2c (44) bytes of the sector. You can check the hex dump of our compiled file and look for the character 0 (byte 0x30): you will notice that it is at offset 2c. If we add 2c to 7c00 we get 7c2c. The hex dump is as follows:
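The addition of 7c00 and the offset does not have to be done by hand: the assembler already knows the offset of data_text, so the load address can be added to the label in the address expression. The following is only a minimal sketch of that idea, not code from the LearnOS repository. It assumes, like the examples above, that the DS segment register is 0 when the BIOS starts the boot sector.

```nasm
; Sketch: no ORG, so data_text is an offset from the start of the file.
; Assumption: DS is 0, as in the other examples of this episode.
        mov     ah, 0Eh                     ; BIOS teletype output function

        mov     al, [0x7c00 + data_text]    ; assembler adds load address + offset
        int     10h                         ; print the byte that was read

        jmp     $                           ; infinite loop

data_text:
        db      "0"

        times   510-($-$$) db 0             ; pad the sector up to byte 510
        dw      0xAA55                      ; boot signature
```

If the sector really is loaded at 7c00, this should print 0 no matter how many instructions are placed above data_text, because the offset is no longer hard-coded.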

This works, but computing memory addresses by hand and adding the offset everywhere is not practical. To avoid this, we put ORG 0x7c00 at the start of our code, which tells the assembler that the code will be loaded at 7c00 so that it adds this base to every label for us. Let’s modify our boot.asm file with the following code:

org     0x7c00        ;tell the assembler our code is loaded at 7c00

mov     ah, 0Eh

mov     al, "1"    ;Print 1
int        10h
mov     al, [data_text]    ;Print content of data_text
int        10h

jmp     $        ;Infinite loop

data_text:
        db      "0"

times     510-($-$$)    db    0
dw         0xAA55

The output of the above code will be:

The output is as expected. I hope you have understood what I am trying to explain here. If you are not really confident about this, feel free to ask any questions in the comment section. I will be happy to answer them.
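One assumption runs through all of the examples in this episode: memory operands such as [data_text] are read through the DS segment register, and in real mode the physical address is DS × 16 plus the offset. The examples only reach 7c00 plus the offset if DS is 0. That evidently held in the runs shown here, but a boot sector does not have to rely on it. The following is a minimal sketch, not code from the LearnOS repository, that sets DS explicitly before reading data_text.

```nasm
; Sketch: same idea as the ORG example, but DS is set explicitly
; instead of relying on the value the BIOS left in it.
        org     0x7c00              ; labels are computed relative to 7c00

        xor     ax, ax              ; ax = 0
        mov     ds, ax              ; ds = 0, so [data_text] is read from segment 0

        mov     ah, 0Eh             ; BIOS teletype output function
        mov     al, [data_text]     ; byte at 0000:data_text
        int     10h                 ; print it

        jmp     $                   ; infinite loop

data_text:
        db      "0"

        times   510-($-$$) db 0     ; pad the sector up to byte 510
        dw      0xAA55              ; boot signature
```

A segment register cannot be loaded with an immediate value, which is why the zero goes through ax first.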

Next, I am planning to switch from real mode into protected mode, initialize the video adapter, and load a second stage, which will load / initialize our own file system.

Source repository for LearnOS is at following URL:

https://github.com/dhavalhirdhav/LearnOS
