Write OS kernel from scratch – GDT and protected mode

Time: 2021-10-16

From MBR to loader

Following the previous article, BIOS boot to real mode, this article begins work on the loader. First, review the disk image and memory layout diagram:

[Figure: disk image and memory layout diagram]

For now, we only need to pay attention to the memory layout below 1MB, mainly the yellow mbr part and the blue loader part. In the previous article, the mbr was loaded into memory, and its last instruction, jmp LOADER_BASE_ADDR (0x8000), transferred control to the loader. Next, we need to implement the loader itself.

Work of loader

Generally speaking, the work of loader mainly includes the following:

  • Establish the GDT (Global Descriptor Table), initialize the kernel code and data segments and the segment registers, and bring the CPU into protected mode;
  • Create the kernel page directory and page tables, enable virtual memory, and enter paging mode;
  • Load the kernel image into memory, then jump into the kernel code. At this point, control of the system is transferred to the kernel.

As you can see, the loader has a lot of work to do, and it touches several core parts of the x86 architecture. To understand and implement the loader, you therefore need the following background:

  • GDT, segmented memory addressing, segment registers, protected mode;
  • Virtual memory, page directory, page tables;
  • The ELF file format, because the kernel is compiled and linked into a file of this format.
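
As a rough illustration of the first item, the following C sketch contrasts how a linear address is formed in real mode and in protected mode (with paging off). The `segment_t` type and both function names are hypothetical and simplified for illustration; they do not come from the project code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Real mode: linear address = segment * 16 + offset. */
static uint32_t real_mode_linear(uint16_t segment, uint16_t offset) {
    return ((uint32_t)segment << 4) + offset;
}

/* Hypothetical, simplified view of what a GDT descriptor provides. */
typedef struct {
    uint32_t base;   /* linear address where the segment starts */
    uint32_t limit;  /* highest valid offset, in bytes */
} segment_t;

/* Protected mode: the segment register holds a selector that picks a
 * descriptor; linear address = descriptor base + offset.
 * Returns 0 on success, -1 if the offset exceeds the limit (where a real
 * CPU would raise a protection fault). */
static int protected_mode_linear(const segment_t *seg, uint32_t offset,
                                 uint32_t *out) {
    assert(seg != NULL && out != NULL);
    if (offset > seg->limit) {
        return -1;
    }
    *out = seg->base + offset;
    return 0;
}
```

For example, the real-mode pairs 0x0000:0x8000 and 0x0800:0x0000 both resolve to 0x8000, the address where the loader sits. With a flat descriptor (base 0, limit 0xFFFFFFFF), the protected-mode offset is simply the linear address.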

Loader implementation

As before, here is a link to my project code first, src/boot/loader.S, for your reference.

This source file is fairly long, all the more so because it is written in assembly, and it also contains many utility and print-related functions. To avoid getting lost, the most important functions are extracted here; they correspond to the loader tasks listed above:

# Entry point
loader_start

# Initialize GDT and enter protected mode
setup_protection_mode
protection_mode_entry

# Initialize the kernel page directory and page tables
setup_page

# Load and enter the kernel
init_kernel

Next, we implement these functions one by one. In this article, we first initialize the GDT and enter 32-bit protected mode.

Enter loader

Before we start, let's look at the beginning of the loader code. As with the MBR, the starting memory address of the loader code is defined first, here 0x8000. We decided in advance that the MBR loads the loader from disk to memory address 0x8000 and jumps there, so the loader's addresses must start from 0x8000.

; LOADER_BASE_ADDR = 0x8000
SECTION loader vstart=LOADER_BASE_ADDR

Next comes the first instruction of the loader, jmp loader_start. It is a simple jump to loader_start, where the loader actually starts executing:

loader_entry:
  jmp loader_start

;  Global data
; ...

loader_start:
  call clear_screen
  call setup_protection_mode

If you are not familiar with this style of assembly, you may wonder why there is a jmp here and what it skips over. The answer is that the data we want to define sits in between, similar to global variables defined in a .c file. It holds a number of strings for printing, as well as the most important item, the GDT.

You may have realized that instructions and data in assembly source can be arranged freely, and that their order in the assembled binary follows the order of the source exactly. So you can place your instructions and data wherever you like, as long as execution flows through the instructions correctly and never runs off into data. Of course, the very start of the loader, i.e. 0x8000, must be entry code, because this is the jump address agreed with the mbr. Everything after that can be arranged freely.

Initialize GDT table

Coming back to the global data mentioned above, you can skip the print strings I added and go directly to the definition of the GDT. Four GDT entries are defined here, and each entry occupies 8 bytes, i.e. 64 bits. For the meaning and field layout of a GDT entry, see the linked reference or JamesM's kernel development tutorials, which I recommended earlier. These are the historical baggage of the x86 architecture. I won't spend words explaining them again, but our code must follow these rules.

The first entry of the GDT is reserved and not used. The fourth is the video memory segment descriptor used for display, which is not essential and can be ignored. So we only need to focus on the second and third entries, which are:

  • The kernel code segment (kernel code) descriptor;
  • The kernel data segment (kernel data) descriptor.

We use the dd pseudo-instruction to define these two segment descriptors:

CODE_DESC:
  dd DESC_CODE_LOW_32
  dd DESC_CODE_HIGH_32

DATA_DESC:
  dd DESC_DATA_LOW_32
  dd DESC_DATA_HIGH_32

DESC_CODE_LOW_32, DESC_CODE_HIGH_32, DESC_DATA_LOW_32 and DESC_DATA_HIGH_32 are all defined in src/boot/boot.inc, and you can verify each bit against the reference documents given above. As I said, this is tedious, fiddly, detail-heavy work that cannot be avoided. There is nothing difficult about it. What it needs is the patience to read the manual.
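
To make the bit layout easier to check, here is a minimal C sketch that packs a base, limit, access byte and flags nibble into the two 32-bit halves of a descriptor, following the standard x86 segment descriptor format. The names `gdt_entry_t` and `make_descriptor` are hypothetical. The sample values in the usage note (flat 4GB segments with access bytes 0x9A and 0x92) are a common choice and an assumption here; the actual constants in boot.inc may differ.

```c
#include <assert.h>
#include <stdint.h>

/* The two dwords of one 8-byte GDT entry, in the order they are stored. */
typedef struct {
    uint32_t low;   /* bits 0..31  of the descriptor */
    uint32_t high;  /* bits 32..63 of the descriptor */
} gdt_entry_t;

/* base:   32-bit segment base address
 * limit:  20-bit segment limit (in 4KB units when the G flag is set)
 * access: P, DPL, S and type bits (descriptor bits 40..47)
 * flags:  G, D/B, L, AVL as a 4-bit value (descriptor bits 52..55) */
static gdt_entry_t make_descriptor(uint32_t base, uint32_t limit,
                                   uint8_t access, uint8_t flags) {
    gdt_entry_t e;
    assert(limit <= 0xFFFFFu);
    assert(flags <= 0xFu);
    /* low dword: limit 0..15 in bits 0..15, base 0..15 in bits 16..31 */
    e.low = (limit & 0xFFFFu) | ((base & 0xFFFFu) << 16);
    /* high dword: base 16..23, access byte, limit 16..19, flags, base 24..31 */
    e.high = ((base >> 16) & 0xFFu)
           | ((uint32_t)access << 8)
           | (limit & 0xF0000u)
           | ((uint32_t)flags << 20)
           | (base & 0xFF000000u);
    return e;
}
```

Under those assumed values, `make_descriptor(0, 0xFFFFF, 0x9A, 0xC)` yields low = 0x0000FFFF and high = 0x00CF9A00 for a code segment, and replacing the access byte with 0x92 yields high = 0x00CF9200 for a data segment.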


For readers who are not very familiar with assembly, it is worth explaining the dd pseudo-instruction again. dd means define double word (4 bytes); similarly there are db (byte) and dw (word, 2 bytes). When they appear in assembly source, the data that follows them is written directly into the assembled binary. Here you can see once more how close assembly source is to the resulting binary: it is almost a literal translation.
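
The effect of these pseudo-instructions can be mimicked in C as a sketch: each helper below appends a value to a byte buffer in little-endian order, which is the byte order x86 uses. The `emit_*` helpers are hypothetical names for illustration, not part of NASM or of the project code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* db: write one byte at out[pos], return the next free position. */
static size_t emit_db(uint8_t *out, size_t pos, uint8_t value) {
    assert(out != NULL);
    out[pos] = value;
    return pos + 1;
}

/* dw: write a 2-byte word, least significant byte first. */
static size_t emit_dw(uint8_t *out, size_t pos, uint16_t value) {
    pos = emit_db(out, pos, (uint8_t)(value & 0xFFu));
    pos = emit_db(out, pos, (uint8_t)(value >> 8));
    return pos;
}

/* dd: write a 4-byte double word, least significant byte first. */
static size_t emit_dd(uint8_t *out, size_t pos, uint32_t value) {
    pos = emit_dw(out, pos, (uint16_t)(value & 0xFFFFu));
    pos = emit_dw(out, pos, (uint16_t)(value >> 16));
    return pos;
}
```

Two consecutive dd lines therefore produce 8 contiguous bytes. With the assumed example values 0x0000FFFF and 0x00CF9A00, the bytes in the binary would be FF FF 00 00 00 9A CF 00.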

Enter protected mode

With the GDT set up, we can enter protected mode:

; enable A20
in al, 0x92
or al, 0000_0010b
out 0x92, al

; load GDT
lgdt [gdt_ptr]

; enable protected mode - set cr0 bit 0 (PE)
mov eax, cr0
or eax, 0x00000001
mov cr0, eax

; far jump - loads cs and flushes the pipeline
jmp dword SELECTOR_CODE:protection_mode_entry

Note that the lgdt instruction is used here to load the GDT, and setting the protected mode bit of the cr0 register officially switches the CPU into protected mode. A far jump then initializes the cs segment register to the kernel code segment. Be aware that the value of cs cannot be set directly with a mov instruction; it must be set implicitly through a jump.
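
The definition of gdt_ptr is not shown in the excerpt above. As a sketch of what lgdt expects in 16/32-bit mode, the operand is a 6-byte structure: a 16-bit limit (table size in bytes minus 1) followed by the 32-bit linear base address of the table. The helper name `make_gdt_ptr` and the base address used in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define GDT_ENTRY_SIZE 8u  /* each GDT entry is 8 bytes */

/* Fill a 6-byte buffer in the layout lgdt reads:
 * bytes 0..1: limit = table size in bytes - 1 (little-endian)
 * bytes 2..5: linear base address of the GDT (little-endian) */
static void make_gdt_ptr(uint8_t out[6], uint32_t gdt_base,
                         unsigned entry_count) {
    uint16_t limit;
    assert(entry_count >= 1 && entry_count <= 8192);
    limit = (uint16_t)(entry_count * GDT_ENTRY_SIZE - 1u);
    out[0] = (uint8_t)(limit & 0xFFu);
    out[1] = (uint8_t)(limit >> 8);
    out[2] = (uint8_t)(gdt_base & 0xFFu);
    out[3] = (uint8_t)((gdt_base >> 8) & 0xFFu);
    out[4] = (uint8_t)((gdt_base >> 16) & 0xFFu);
    out[5] = (uint8_t)((gdt_base >> 24) & 0xFFu);
}
```

For the four entries described in this article, the limit would be 4 * 8 - 1 = 31. The base depends on where the table ends up inside the loader, so any concrete address, such as 0x8010, is only an assumed example.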

After the jump, execution continues at protection_mode_entry, where the data segment registers are initialized to the kernel data segment:

protection_mode_entry:
  ; set data segments
  mov ax, SELECTOR_DATA
  mov ds, ax
  mov es, ax
  mov ss, ax

  ; set video segment
  mov ax, SELECTOR_VIDEO
  mov gs, ax
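
The SELECTOR_* constants used above are segment selectors, not addresses. As a sketch of the standard x86 selector format: bits 3..15 hold the index into the descriptor table, bit 2 selects GDT (0) or LDT (1), and bits 0..1 hold the requested privilege level. The helper name `make_selector` is hypothetical, and the values in the usage note assume the entry order described in this article with privilege level 0; the project's own definitions may differ.

```c
#include <assert.h>
#include <stdint.h>

#define TI_GDT 0u  /* table indicator: use the GDT */
#define TI_LDT 1u  /* table indicator: use the LDT */

/* Build a 16-bit segment selector from its three fields. */
static uint16_t make_selector(unsigned index, unsigned ti, unsigned rpl) {
    assert(index < 8192u);  /* 13-bit index */
    assert(ti <= 1u);       /* 1-bit table indicator */
    assert(rpl <= 3u);      /* 2-bit requested privilege level */
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}
```

Under those assumptions, the code segment (entry 1) gets selector 0x08, the data segment (entry 2) gets 0x10, and the video segment (entry 3) gets 0x18. Because each entry is 8 bytes, a selector with TI = 0 and RPL = 0 equals the byte offset of its entry within the GDT.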

This completes the initialization of protected mode. Next comes a key part of the loader, the setup_page function, which starts building the kernel's virtual memory. That is left for the next article.