7 – Developing an Operating System – Tutorial – Episode 3 – Protected Mode, Read Disk, GDT and Initiate Kernel (C)

In the previous episode we learnt about memory offsets. In this episode, we are going to reach two very important milestones.


1 – Switching from Real Mode to Protected Mode and defining the Global Descriptor Table (GDT)
2 – Reading the kernel from disk and starting a kernel written in C.

Let us have a quick look at what we are going to do:

After our boot sector loads, we will read the kernel from disk and load it into memory. After that we will switch from real mode to protected mode. What is real mode? What is protected mode? Don't worry, read on. Once we are in protected mode with a Global Descriptor Table (GDT) defined, we will start our kernel, which is written in C, and from this point onward we will move away from assembly language. Most likely we will still develop the file system in assembly, but that is not decided yet. So let’s understand the things mentioned in this paragraph.

Real Mode and Protected Mode

Every x86 CPU, whether 32-bit or 64-bit, boots into 16-bit real mode for backward compatibility. 16-bit processors were in use for a long time, and many applications were developed for them, including some operating systems (DOS, Windows 3.11, etc.). When 32-bit CPUs arrived, manufacturers wanted 16-bit applications to keep running, so a 32-bit CPU starts in real mode (16-bit) and software must explicitly switch it to protected (32-bit) mode. Similarly, 64-bit CPUs still start in 16-bit mode and are later switched to Long Mode. Now, why do we need 32-bit or 64-bit? Put simply, a 16-bit application can directly address 64 KB of memory through a single segment, and with segmentation about 1 MB at most. So if a computer has more than 1 MB of RAM, real-mode code cannot use the rest. Similarly, a 32-bit CPU can address at most 4 GB of memory.
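To make the 64 KB and 1 MB figures concrete, here is a minimal sketch in C of how real mode combines a 16-bit segment and a 16-bit offset into an address. The function name real_mode_linear is hypothetical and used only for illustration; the sketch ignores address wrap-around on machines with only 20 address lines.

```c
#include <stdint.h>

/* Real-mode address translation: the 16-bit segment is shifted left by
   4 bits (multiplied by 16) and the 16-bit offset is added to it.
   real_mode_linear is an illustrative name, not part of boot.asm. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

With a fixed segment, the offset can only span 0x0000 to 0xFFFF, which is the 64 KB limit. Segment 0x07C0 with offset 0 and segment 0 with offset 0x7C00 both name the address 0x7C00, where the BIOS loads the boot sector.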

Apart from access to more memory, protected (32-bit) mode offers several other benefits over real mode. Some of them are:

  1. Privilege Levels (Rings) – you can prevent certain types of applications from accessing certain memory addresses.
  2. Global Descriptor Table (GDT) – we will learn about this later in the post.
  3. Paging – the operating system can restrict and control an application's or task's access to memory, for example by preventing one application from accessing the memory of another.
  4. Multitasking – rings and the Task State Segment make multitasking easier; we will discuss this much later, when we implement multitasking in our operating system.

These are the major benefits of Protected Mode; there are others as well. You can read more about Protected Mode on Wikipedia.

So now let’s load the kernel from the disk into a specified memory address and then switch from real mode to protected mode. One of the reasons we switch to protected mode is our kernel: it is written in C, and the C compiler we are going to use will produce 32-bit binaries. So we need to switch to protected mode before we can start our kernel.

So, let’s start coding.

org 0x7c00
KERNEL_OFFSET equ 0x1000 ;We will load our kernel at 0x1000

mov [BOOT_DRIVE], dl ;Store the drive number on which system has booted in BOOT_DRIVE variable.
mov bp, 0x9000
mov sp, bp

mov bx, MSG_REAL_MODE ;We will display 'Started in 16-bit real mode' message to the user.
call print ;print the message.
call print_nl ;add a new line.

jmp $     ;infinite loop

BOOT_DRIVE db 0
MSG_REAL_MODE db "Started in 16-bit real mode", 0

times     510-($-$$)     db   0 ;pad the boot sector with zeros up to byte 510.
dw   0xaa55 ;boot signature in the last two bytes (512 bytes in total)

Let us understand the code above. It basically defines the sequence of steps our boot sector will follow. If you try to assemble it now, it will give errors, because the code is not complete: the print and print_nl routines are not defined yet. The comments in the code explain what each line does.
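The last two lines of the boot sector pad it to 510 bytes and append the signature word 0xaa55, which is stored little-endian as the bytes 0x55 and 0xAA. As a rough sketch, a host-side check of an assembled boot.bin buffer could look like this in C. The function name has_boot_signature is hypothetical and not part of the tutorial's code.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512

/* Returns 1 when the buffer is exactly one sector long and ends with the
   bytes 0x55, 0xAA (the little-endian encoding of `dw 0xaa55`).
   has_boot_signature is an illustrative helper, not part of boot.asm. */
int has_boot_signature(const uint8_t *sector, size_t length)
{
    if (sector == NULL || length != SECTOR_SIZE)
        return 0;
    return sector[510] == 0x55 && sector[511] == 0xAA;
}
```

A BIOS typically refuses to boot a sector that lacks this signature, which is why the padding and the dw line must stay at the very end of the file.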

jmp $     ;infinite loop

print:
     pusha

start:
     mov  al, [bx]
     cmp  al, 0
     je   done

     mov  ah, 0x0e
     int  0x10

     add  bx, 1
     jmp  start

done:
     popa
     ret

print_nl:
    pusha
    
    mov ah, 0x0e
    mov al, 0x0a ; newline char
    int 0x10
    mov al, 0x0d ; carriage return
    int 0x10
    
    popa
    ret

We will write the code above just below line number 12 (the jmp $ line) in boot.asm.

Next, we will read the kernel from the disk and load it at the KERNEL_OFFSET (0x1000) memory offset.

call print
call print_nl

call load_kernel

call switch_to_pm

In the boot.asm file, we will add line number 4 of the snippet above (call load_kernel), which loads the kernel. Here we are just going to read the kernel from disk and load it into memory. NOTE: We are only loading it into memory, not executing it.

print_nl:
    pusha
    
    mov ah, 0x0e
    mov al, 0x0a ; newline char
    int 0x10
    mov al, 0x0d ; carriage return
    int 0x10
    
    popa
    ret

load_kernel:
     mov  bx, MSG_LOAD_KERNEL
     call print
     call print_nl

     mov  bx, KERNEL_OFFSET ;read from disk and store in 0x1000
     mov  dh, 1 ;read only 1 sector from HDD or bootable disk
     mov  dl, [BOOT_DRIVE]
     call disk_load
     ret

We will write the code above inside the boot.asm file, after the print_nl block. Also, at the bottom of the file, just below our MSG_REAL_MODE message, add the following line:

MSG_LOAD_KERNEL db "Loading kernel into memory", 0

The two code blocks above first print the message “Loading kernel into memory” on the screen and then set certain register values so that the data read from disk is loaded at the specified memory offset. You will notice that the code uses a disk_load routine; this routine reads from the disk and is shown below. We will write the disk_load routine below the load_kernel routine.

; load 'dh' sectors from drive 'dl' into ES:BX
disk_load:
    pusha
    ; reading from disk requires setting specific values in all registers
    ; so we will overwrite our input parameters from 'dx'. Let's save it
    ; to the stack for later use.
    push dx

    mov ah, 0x02 ; ah <- int 0x13 function. 0x02 = 'read'
    mov al, dh   ; al <- number of sectors to read (0x01 .. 0x80)
    mov cl, 0x02 ; cl <- sector (0x01 .. 0x11)
                 ; 0x01 is our boot sector, 0x02 is the first 'available' sector
    mov ch, 0x00 ; ch <- cylinder (0x0 .. 0x3FF, upper 2 bits in 'cl')
    ; dl <- drive number. Our caller sets it as a parameter and gets it from BIOS
    ; (0 = floppy, 1 = floppy2, 0x80 = hdd, 0x81 = hdd2)
    mov dh, 0x00 ; dh <- head number (0x0 .. 0xF)

    ; [es:bx] <- pointer to buffer where the data will be stored
    ; caller sets it up for us, and it is actually the standard location for int 13h
    int 0x13      ; BIOS interrupt
    jc disk_error ; if error (stored in the carry bit)

    pop dx
    cmp al, dh    ; BIOS also sets 'al' to the # of sectors read. Compare it.
    jne sectors_error
    popa
    ret

disk_error:
    mov bx, DISK_ERROR
    call print
    call print_nl
    mov dh, ah ; ah = error code, dl = disk drive that dropped the error
    call print_hex ; check out the code at http://stanislavs.org/helppc/int_13-1.html
    jmp disk_loop

sectors_error:
    mov bx, SECTORS_ERROR
    call print

disk_loop:
    jmp $

; receiving the data in 'dx'
; For the examples we'll assume that we're called with dx=0x1234
print_hex:
    pusha

    mov cx, 0 ; our index variable

; Strategy: get the last hex digit of 'dx', then convert it to ASCII
; Digits 0-9: '0' (ASCII 0x30) to '9' (0x39), so just add 0x30 to the digit N.
; Digits A-F: 'A' (ASCII 0x41) to 'F' (0x46), so add 0x30 and then a further 7 (0x37 in total)
; Then, move the ASCII byte to the correct position on the resulting string
hex_loop:
    cmp cx, 4 ; loop 4 times
    je end_hex
    
    ; 1. convert last char of 'dx' to ascii
    mov ax, dx ; we will use 'ax' as our working register
    and ax, 0x000f ; 0x1234 -> 0x0004 by masking first three to zeros
    add al, 0x30 ; add 0x30 to N to convert it to ASCII "N"
    cmp al, 0x39 ; if > '9', add an extra 7 to reach 'A' to 'F'
    jle step2
    add al, 7 ; 'A' is ASCII 65 instead of 58, so 65-58=7

step2:
    ; 2. get the correct position of the string to place our ASCII char
    ; bx <- base address + string length - index of char
    mov bx, HEX_OUT + 5 ; base + length
    sub bx, cx  ; our index variable
    mov [bx], al ; copy the ASCII char on 'al' to the position pointed by 'bx'
    ror dx, 4 ; 0x1234 -> 0x4123 -> 0x3412 -> 0x2341 -> 0x1234

    ; increment index and loop
    add cx, 1
    jmp hex_loop

end_hex:
    ; prepare the parameter and call the function
    ; remember that print receives parameters in 'bx'
    mov bx, HEX_OUT
    call print

    popa
    ret

HEX_OUT:
    db '0x0000',0 ; reserve memory for our new string
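The digit conversion in print_hex can be easier to follow in C. The following is a minimal sketch of the same idea, assuming a caller-supplied 7-byte buffer in place of HEX_OUT; the name format_hex16 is hypothetical, and the sketch shifts the value right where the assembly rotates dx.

```c
#include <stdint.h>

/* Writes value as "0xNNNN" plus a terminating zero into out (7 bytes).
   format_hex16 is an illustrative name; it mirrors print_hex's logic. */
void format_hex16(uint16_t value, char out[7])
{
    out[0] = '0';
    out[1] = 'x';
    for (int i = 0; i < 4; i++) {
        uint8_t digit = (uint8_t)(value & 0x000F); /* lowest hex digit */
        char c = (char)(digit + 0x30);             /* 0-9 -> '0'-'9' */
        if (c > 0x39)
            c = (char)(c + 7);                     /* 10-15 -> 'A'-'F' */
        out[5 - i] = c;       /* fill the string from right to left */
        value >>= 4;          /* move on to the next hex digit */
    }
    out[6] = '\0';
}
```

For the value 0x1234 this fills the buffer with the text 0x1234, and the digit 12 becomes 'C' because 12 + 0x30 + 7 equals 0x43.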

We will also put two messages at the bottom of our code file:

DISK_ERROR db "Disk read error", 0
SECTORS_ERROR db "Incorrect number of sectors read", 0
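The disk_load routine addresses the disk by cylinder, head, and sector (CHS). To see why cylinder 0, head 0, sector 2 is the sector right after the boot sector, here is a small sketch in C of the usual CHS-to-linear conversion. The function names and the geometry parameters are assumptions for illustration; the real geometry depends on the drive the BIOS reports.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Conventional CHS -> linear sector number. Sectors are numbered from 1,
   cylinders and heads from 0. The geometry arguments are assumed values
   supplied by the caller, not something boot.asm queries. */
uint32_t chs_to_lba(uint32_t cylinder, uint32_t head, uint32_t sector,
                    uint32_t heads_per_cylinder, uint32_t sectors_per_track)
{
    return (cylinder * heads_per_cylinder + head) * sectors_per_track
           + (sector - 1);
}

/* Byte offset of a linear sector inside a flat disk image file. */
uint32_t lba_to_image_offset(uint32_t lba)
{
    return lba * SECTOR_SIZE;
}
```

Whatever the geometry, cylinder 0, head 0, sector 2 maps to linear sector 1, which starts at byte 512 of the disk image, immediately after the 512-byte boot sector.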

Now comes the most important part: switching to protected mode and printing on the screen. Add the call switch_to_pm line to the boot.asm file after call load_kernel:

call load_kernel

call switch_to_pm

jmp $     ;infinite loop

The switch_to_pm routine looks like this:

switch_to_pm:
    cli ; 1. disable interrupts
    lgdt [gdt_descriptor] ; 2. load the GDT descriptor
    mov eax, cr0
    or eax, 0x1 ; 3. set the PE (Protection Enable) bit in cr0 to enter protected mode
    mov cr0, eax
    jmp CODE_SEG:init_pm ; 4. far jump by using a different segment

We first disable interrupts: the BIOS interrupt routines are 16-bit real-mode code, so we can no longer call them once we switch to protected mode, and an interrupt arriving before we install our own handlers would fault. The new instruction here is lgdt, which loads the Global Descriptor Table Register (GDTR) with the size and address of our GDT. So before we switch to protected mode, we need to define our GDT (Global Descriptor Table). We will define it with the code below:

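gdt_start:
     dd   0x0  ;4 bytes (the first GDT entry must be a null descriptor)
     dd   0x0  ;4 bytes

gdt_code:
     dw   0xffff    ;segment length, bits 0-15
     dw   0x0       ;segment base, bits 0-15
     db   0x0       ;segment base, bits 16-23
     db   10011010b ;access byte: present, ring 0, code segment, executable, readable
     db   11001111b ;flags (4 bits: 4 KB granularity, 32-bit) + segment length, bits 16-19
     db   0x0       ;segment base, bits 24-31

gdt_data:
     dw   0xffff    ;segment length, bits 0-15
     dw   0x0       ;segment base, bits 0-15
     db   0x0       ;segment base, bits 16-23
     db   10010010b ;access byte: present, ring 0, data segment, writable
     db   11001111b ;flags (4 bits: 4 KB granularity, 32-bit) + segment length, bits 16-19
     db   0x0       ;segment base, bits 24-31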

gdt_end:

gdt_descriptor:
     dw   gdt_end - gdt_start - 1  ;size (16-bit), always one less of its true size
     dd   gdt_start                ;address (32-bit)

CODE_SEG  equ  gdt_code - gdt_start
DATA_SEG  equ  gdt_data - gdt_start

To understand Protected Mode and the GDT in more depth, read https://en.wikipedia.org/wiki/Protected_mode. I have also taken a lot of inspiration from an existing GitHub repository; for the basics, you can read about the GDT here as well – https://github.com/cfenollosa/os-tutorial/tree/master/09-32bit-gdt
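The byte layout of a GDT entry is scattered: the base and the limit are each split across several fields. As a rough sketch, the C code below packs one descriptor in the same byte order as the dw and db lines of gdt_code and gdt_data, and computes a segment selector from a table index. The names encode_gdt_entry and gdt_selector are hypothetical and do not appear in boot.asm.

```c
#include <stdint.h>

/* Packs one 8-byte GDT descriptor in memory order.
   limit is 20 bits wide, flags is the upper 4 bits of byte 6.
   encode_gdt_entry is an illustrative name. */
void encode_gdt_entry(uint8_t out[8], uint32_t base, uint32_t limit,
                      uint8_t access, uint8_t flags)
{
    out[0] = (uint8_t)(limit & 0xFF);          /* limit bits 0-7   */
    out[1] = (uint8_t)((limit >> 8) & 0xFF);   /* limit bits 8-15  */
    out[2] = (uint8_t)(base & 0xFF);           /* base bits 0-7    */
    out[3] = (uint8_t)((base >> 8) & 0xFF);    /* base bits 8-15   */
    out[4] = (uint8_t)((base >> 16) & 0xFF);   /* base bits 16-23  */
    out[5] = access;                           /* access byte      */
    out[6] = (uint8_t)(((flags & 0x0F) << 4) | ((limit >> 16) & 0x0F));
    out[7] = (uint8_t)((base >> 24) & 0xFF);   /* base bits 24-31  */
}

/* A selector is the entry index times 8, with the low bits holding the
   table indicator (0 = GDT here) and the requested privilege level. */
uint16_t gdt_selector(uint16_t index, uint16_t rpl)
{
    return (uint16_t)((index << 3) | (rpl & 0x3));
}
```

Encoding base 0, limit 0xFFFFF, access 0x9A, and flags 0xC reproduces the gdt_code bytes (0x9A is 10011010b and the sixth byte becomes 0xCF, i.e. 11001111b). Index 1 gives selector 0x08 and index 2 gives 0x10, which are the values of CODE_SEG and DATA_SEG.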

Now we are in protected mode and need to print a message on the screen. We can no longer use our previous print routine, because it relies on BIOS interrupt 0x10, which is not available in protected mode. Instead, we will write characters directly into video memory. To do this, we will write the code below:

VIDEO_MEMORY   equ  0xb8000
WHITE_ON_BLACK equ 0x0f ; the color byte for each character

print_string_pm:
     pusha
     mov  edx, VIDEO_MEMORY

print_string_pm_loop:
    mov al, [ebx] ; [ebx] is the address of our character
    mov ah, WHITE_ON_BLACK

    cmp al, 0 ; check if end of string
    je print_string_pm_done

    mov [edx], ax ; store character + attribute in video memory
    add ebx, 1 ; next char
    add edx, 2 ; next video memory position

    jmp print_string_pm_loop

print_string_pm_done:
    popa
    ret

Also we need to put below line at bottom of the code.

MSG_PROT_MODE db "Loaded 32-bit protected mode", 0
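In text mode, each character on screen occupies two bytes of video memory: the character code followed by the attribute (color) byte, which is why print_string_pm advances edx by 2. Below is a minimal sketch in C of that layout, assuming the standard 80-column text mode; the names vga_cell and vga_cell_address are hypothetical.

```c
#include <stdint.h>

#define VGA_BASE    0xB8000u
#define VGA_COLUMNS 80u   /* assumes the standard 80x25 text mode */

/* Builds the 16-bit value stored per character: the low byte is the
   character, the high byte is the attribute (like al and ah in ax). */
uint16_t vga_cell(char character, uint8_t attribute)
{
    return (uint16_t)(((uint16_t)attribute << 8) | (uint8_t)character);
}

/* Address of the cell at a given row and column (both counted from 0). */
uint32_t vga_cell_address(uint32_t row, uint32_t column)
{
    return VGA_BASE + 2u * (row * VGA_COLUMNS + column);
}
```

With WHITE_ON_BLACK (0x0f), the letter X becomes the word 0x0F58. print_string_pm always starts at row 0, column 0, so its message appears at the top-left corner of the screen.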

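This completes our boot.asm file. The entire file now looks as follows; note that it also contains the init_pm and BEGIN_PM routines, which set up the segment registers and the stack after the far jump and then call the kernel. (The file should eventually be restructured and split into multiple files.)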

org 0x7c00
KERNEL_OFFSET equ 0x1000

mov [BOOT_DRIVE], dl
mov bp, 0x9000
mov sp, bp

mov bx, MSG_REAL_MODE
call print
call print_nl

call load_kernel

call switch_to_pm

jmp $     ;infinite loop

print:
     pusha

start:
     mov  al, [bx]
     cmp  al, 0
     je   done

     mov  ah, 0x0e
     int  0x10

     add  bx, 1
     jmp  start

done:
     popa
     ret

print_nl:
    pusha
    
    mov ah, 0x0e
    mov al, 0x0a ; newline char
    int 0x10
    mov al, 0x0d ; carriage return
    int 0x10
    
    popa
    ret

load_kernel:
     mov  bx, MSG_LOAD_KERNEL
     call print
     call print_nl

     mov  bx, KERNEL_OFFSET ;read from disk and store in 0x1000
     mov  dh, 1 ;read only 1 sector from HDD or bootable disk
     mov  dl, [BOOT_DRIVE]
     call disk_load
     ret

; load 'dh' sectors from drive 'dl' into ES:BX
disk_load:
    pusha
    ; reading from disk requires setting specific values in all registers
    ; so we will overwrite our input parameters from 'dx'. Let's save it
    ; to the stack for later use.
    push dx

    mov ah, 0x02 ; ah <- int 0x13 function. 0x02 = 'read'
    mov al, dh   ; al <- number of sectors to read (0x01 .. 0x80)
    mov cl, 0x02 ; cl <- sector (0x01 .. 0x11)
                 ; 0x01 is our boot sector, 0x02 is the first 'available' sector
    mov ch, 0x00 ; ch <- cylinder (0x0 .. 0x3FF, upper 2 bits in 'cl')
    ; dl <- drive number. Our caller sets it as a parameter and gets it from BIOS
    ; (0 = floppy, 1 = floppy2, 0x80 = hdd, 0x81 = hdd2)
    mov dh, 0x00 ; dh <- head number (0x0 .. 0xF)

    ; [es:bx] <- pointer to buffer where the data will be stored
    ; caller sets it up for us, and it is actually the standard location for int 13h
    int 0x13      ; BIOS interrupt
    jc disk_error ; if error (stored in the carry bit)

    pop dx
    cmp al, dh    ; BIOS also sets 'al' to the # of sectors read. Compare it.
    jne sectors_error
    popa
    ret

disk_error:
    mov bx, DISK_ERROR
    call print
    call print_nl
    mov dh, ah ; ah = error code, dl = disk drive that dropped the error
    call print_hex ; check out the code at http://stanislavs.org/helppc/int_13-1.html
    jmp disk_loop

sectors_error:
    mov bx, SECTORS_ERROR
    call print

disk_loop:
    jmp $

; receiving the data in 'dx'
; For the examples we'll assume that we're called with dx=0x1234
print_hex:
    pusha

    mov cx, 0 ; our index variable

; Strategy: get the last hex digit of 'dx', then convert it to ASCII
; Digits 0-9: '0' (ASCII 0x30) to '9' (0x39), so just add 0x30 to the digit N.
; Digits A-F: 'A' (ASCII 0x41) to 'F' (0x46), so add 0x30 and then a further 7 (0x37 in total)
; Then, move the ASCII byte to the correct position on the resulting string
hex_loop:
    cmp cx, 4 ; loop 4 times
    je end_hex
    
    ; 1. convert last char of 'dx' to ascii
    mov ax, dx ; we will use 'ax' as our working register
    and ax, 0x000f ; 0x1234 -> 0x0004 by masking first three to zeros
    add al, 0x30 ; add 0x30 to N to convert it to ASCII "N"
    cmp al, 0x39 ; if > '9', add an extra 7 to reach 'A' to 'F'
    jle step2
    add al, 7 ; 'A' is ASCII 65 instead of 58, so 65-58=7

step2:
    ; 2. get the correct position of the string to place our ASCII char
    ; bx <- base address + string length - index of char
    mov bx, HEX_OUT + 5 ; base + length
    sub bx, cx  ; our index variable
    mov [bx], al ; copy the ASCII char on 'al' to the position pointed by 'bx'
    ror dx, 4 ; 0x1234 -> 0x4123 -> 0x3412 -> 0x2341 -> 0x1234

    ; increment index and loop
    add cx, 1
    jmp hex_loop

end_hex:
    ; prepare the parameter and call the function
    ; remember that print receives parameters in 'bx'
    mov bx, HEX_OUT
    call print

    popa
    ret

HEX_OUT:
    db '0x0000',0 ; reserve memory for our new string

gdt_start:
     dd   0x0  ;4 bytes
     dd   0x0  ;4 bytes

gdt_code:
     dw   0xffff    ;segment length, bits 0-15
     dw   0x0       ;segment base, bits 0-15
     db   0x0       ;segment base, bits 16-23
     db   10011010b ;flags (8 bits)
     db   11001111b ;flags (4 bits) + segment length, bits 16-19
     db   0x0       ;segment base, bits 24-31

gdt_data:
     dw   0xffff
     dw   0x0
     db   0x0
     db   10010010b
     db   11001111b
     db   0x0

gdt_end:

gdt_descriptor:
     dw   gdt_end - gdt_start - 1  ;size (16-bit), always one less of its true size
     dd   gdt_start                ;address (32-bit)

CODE_SEG  equ  gdt_code - gdt_start
DATA_SEG  equ  gdt_data - gdt_start

switch_to_pm:
    cli ; 1. disable interrupts
    lgdt [gdt_descriptor] ; 2. load the GDT descriptor
    mov eax, cr0
    or eax, 0x1 ; 3. set the PE (Protection Enable) bit in cr0 to enter protected mode
    mov cr0, eax
    jmp CODE_SEG:init_pm ; 4. far jump by using a different segment

use32

init_pm:
     mov  ax, DATA_SEG
     mov  ds, ax
     mov  ss, ax
     mov  es, ax
     mov  fs, ax
     mov  gs, ax
     
     mov  ebp, 0x90000
     mov  esp, ebp

     call BEGIN_PM

BEGIN_PM:
     mov  ebx, MSG_PROT_MODE
     call print_string_pm
     call KERNEL_OFFSET
     jmp $

VIDEO_MEMORY   equ  0xb8000
WHITE_ON_BLACK equ 0x0f ; the color byte for each character

print_string_pm:
     pusha
     mov  edx, VIDEO_MEMORY

print_string_pm_loop:
    mov al, [ebx] ; [ebx] is the address of our character
    mov ah, WHITE_ON_BLACK

    cmp al, 0 ; check if end of string
    je print_string_pm_done

    mov [edx], ax ; store character + attribute in video memory
    add ebx, 1 ; next char
    add edx, 2 ; next video memory position

    jmp print_string_pm_loop

print_string_pm_done:
    popa
    ret

BOOT_DRIVE db 0
MSG_REAL_MODE db "Started in 16-bit real mode", 0
MSG_PROT_MODE db "Loaded 32-bit protected mode", 0
MSG_LOAD_KERNEL db "Loading kernel into memory", 0
DISK_ERROR db "Disk read error", 0
SECTORS_ERROR db "Incorrect number of sectors read", 0

times     510-($-$$)     db   0
dw   0xaa55

Now that we are in protected mode, we can start our kernel. We are developing our kernel as 32-bit code in C. If you look inside the BEGIN_PM routine in the code above, we call KERNEL_OFFSET, which transfers control to address 0x1000, where the kernel was loaded. This executes our kernel.

Create a file called kernel.c and write below code:

void main() {
	char* video_memory = (char*) 0xb8000;
	*video_memory = 'X';
}

So that is going to be our kernel as of now. Tiny little kernel which will print X on the screen.

We also need a middleman that takes control from the boot loader and passes it to the kernel. For this we will create a kernel loader. Let’s create a file called loader.asm and write the code below:

format ELF ;instruct assembler to produce ELF (Executable and Linkable Format) file.

extrn main ;main is defined in another object file (kernel.c); the linker will resolve it.

public _start

_start:
  call main ;call external main function.
  jmp $

Now we use the assembler and the C compiler to build our code and link the pieces together so that the loader can call the kernel correctly. For this we need a toolchain that can produce 32-bit ELF files (a cross-compiler). We are going to use GCC. I am doing all of this on Windows, so I use WSL. You can go through my previous blog post to install and set up WSL as well as the cross-compiler.

To compile the code above, create a bat file called compile.bat containing the following commands:

echo off
echo "clean all binaries"
del *.bin
del *.o
del *.elf

echo "compile boot.asm"
fasm boot.asm

echo "compile loader.asm"
fasm loader.asm

echo "compile kernel.c"
wsl gcc -m32 -ffreestanding -c kernel.c -o kernel.o

wsl objcopy kernel.o -O elf32-i386 kernel.elf
wsl /usr/local/i386elfgcc/bin/i386-elf-ld -o kernel.bin -Ttext 0x1000 loader.o kernel.elf --oformat binary
type boot.bin kernel.bin > os_image.bin
qemu-system-x86_64 os_image.bin

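Now execute the compile.bat file from the command line. QEMU should start and boot the image; since our kernel writes X to the first cell of video memory, you should see an X in the top-left corner of the screen.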

This will put a big smile on our face. We have achieved a great milestone: booting, reading the kernel from disk, switching to protected mode, and executing the kernel.
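One thing to keep in mind as the kernel grows: load_kernel sets dh to 1, so only the first 512 bytes of kernel.bin are read from disk. The sketch below, in C, shows how the required sector count could be computed from the file size. The function name sectors_needed is hypothetical, and wiring the result into boot.asm is left as an assumption.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Number of whole 512-byte sectors needed to hold kernel_bytes bytes,
   rounding up. This is the value that would have to go into dh before
   calling disk_load. sectors_needed is an illustrative name. */
uint32_t sectors_needed(uint32_t kernel_bytes)
{
    return (kernel_bytes + SECTOR_SIZE - 1) / SECTOR_SIZE;
}
```

A kernel.bin of up to 512 bytes fits in the single sector read today; at 513 bytes the result becomes 2, and the value in dh would need to change accordingly.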

In the next chapter, we will create a video driver entirely in C, so that we can show a blinking cursor, clear the screen, and print messages on the screen using functions like printf. We will create our own printf function. Exciting, so stay tuned for the next blog entry, which will come up very soon.

You can access the LearnOS code repository at: https://github.com/dhavalhirdhav/LearnOS
