Creating an OS using Rust: [Part-2] Creating a minimal Rust Kernel

Tapas Das · Published in Dev Genius · Nov 26, 2023

In Part-1 of this blog series, we discussed creating a freestanding Rust executable that can run on bare metal. In this blog post, we will build on that freestanding executable to create a bootable disk image that prints something to the screen.

As always, the basic assumption is that the reader is already well accustomed to the Rust programming language, and understands Rust syntax and concepts like cargo, rustc, rustup, raw pointers, and unsafe blocks.

Let’s get started.

Introduction

When you turn on a computer, it begins executing firmware code that is stored in motherboard ROM. This code performs a power-on self-test, detects available RAM, and pre-initializes the CPU and hardware. Afterwards, it looks for a bootable disk and starts booting the operating system kernel.

On x86, there are two firmware standards: the “Basic Input/Output System” (BIOS) and the newer “Unified Extensible Firmware Interface” (UEFI). The BIOS standard is old and outdated, but simple and well-supported on any x86 machine since the 1980s. UEFI, in contrast, is more modern and has many more features, but is more complex to set up.

For now, we will only provide BIOS support in our Rust kernel.

BIOS vs UEFI

UEFI and BIOS are two types of motherboard firmware used during startup to initialize the hardware and load the operating system. They also determine the device boot priority and allow users to customize hardware and software settings.

Both firmware types serve the same purpose, but UEFI is newer and offers more customization options and features.

A typical BIOS setup utility is a text-mode, keyboard-driven interface, while a typical UEFI setup screen is graphical and supports mouse input.

The table below compares some of the key features of BIOS and UEFI.
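
Feature            BIOS                           UEFI
Introduced         1980s                          Mid-2000s (BIOS successor)
Setup interface    Text-based, keyboard-driven    Graphical, mouse support
Partition scheme   MBR (disks up to 2 TB)         GPT (disks beyond 2 TB)
Firmware mode      16-bit real mode               32-bit or 64-bit
Boot security      None                           Secure Boot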

BIOS Boot Process

Below are the steps in the BIOS booting process when a computer is turned on.

  1. First, the BIOS gets loaded from special flash memory located on the motherboard.
  2. The BIOS runs self-test and initialization routines of the hardware, then it looks for bootable disks.
  3. If it finds a bootable disk, control is transferred to the disk’s bootloader, a 512-byte portion of executable code stored at the disk’s beginning. Most bootloaders are larger than 512 bytes, so they are commonly split into a small first stage, which fits into 512 bytes, and a second stage, which is subsequently loaded by the first stage.
  4. The bootloader has to determine the location of the kernel image on the disk and load it into memory.
  5. The bootloader also needs to switch the CPU from the 16-bit real mode first to the 32-bit protected mode, and then to the 64-bit long mode, where 64-bit registers and the complete main memory are available.
  6. The bootloader’s third job is to query certain information (such as a memory map) from the BIOS and pass it to the OS kernel.

Writing a bootloader is a bit cumbersome as it requires assembly language and a lot of non-insightful steps like “write this magic value to this processor register”. Therefore, instead of building a bootloader from scratch, we will use a tool named bootimage that automatically prepends a bootloader to the Rust kernel.

CPU Operating Modes

Real Mode

Real mode, also called real address mode, is an operating mode of all x86-compatible CPUs. The mode gets its name from the fact that addresses in real mode always correspond to real locations in memory.

Real mode is characterized by a 20-bit segmented memory address space (giving 1 MB of addressable memory) and unlimited direct software access to all addressable memory, I/O addresses, and peripheral hardware. A physical address is computed as segment × 16 + offset; for example, segment 0xB800 with offset 0x0000 yields 0xB8000, the VGA text buffer address we will use later in this post.

Real mode provides no support for memory protection, multitasking, or code privilege levels.

Protected mode

Protected mode, also called protected virtual address mode, is an operational mode of x86-compatible central processing units (CPUs). It allows system software to use features such as segmentation, virtual memory, paging, and safe multi-tasking designed to increase an operating system’s control over application software.

When a processor that supports x86 protected mode is powered on, it begins executing instructions in real mode, in order to maintain backward compatibility with earlier x86 processors.

Protected mode may only be entered after the system software sets up a descriptor table (the Global Descriptor Table) and enables the Protection Enable (PE) bit in control register 0 (CR0).
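
As a small illustration using Rust’s inline assembly, here is a sketch (assuming an x86_64 kernel running at privilege level 0; the function name is hypothetical) that reads CR0 and tests the PE bit. Actually setting the bit from real mode is the bootloader’s job, not the kernel’s:

use core::arch::asm;

// Returns true if the Protection Enable (PE) bit in CR0 is set.
// Reading CR0 is a privileged instruction, so this only works in ring 0.
fn protection_enabled() -> bool {
    let cr0: u64;
    unsafe { asm!("mov {}, cr0", out(reg) cr0) };
    cr0 & 1 != 0 // bit 0 of CR0 is PE
}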

Long mode

Long mode is the mode where a 64-bit operating system can access 64-bit instructions and registers. 64-bit programs are run in a sub-mode called 64-bit mode, while 32-bit programs and 16-bit protected mode programs are executed in a sub-mode called compatibility mode.

Real mode or virtual 8086 mode programs cannot be natively run in long mode.

With a computer running a legacy BIOS, the BIOS and the bootloader run in real mode. After execution passes to an operating system kernel that supports x86-64, the kernel verifies CPU support for long mode and then executes the instructions to enter it.
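
That check is a CPUID query: the Long Mode (LM) flag is bit 29 of EDX in extended leaf 0x8000_0001. Below is a sketch using Rust’s CPUID intrinsics; in a real boot flow the bootloader performs this test from 32-bit code before the switch, but the leaf and bit are the same:

use core::arch::x86_64::__cpuid;

// Returns true if the CPU advertises long mode support.
fn cpu_supports_long_mode() -> bool {
    unsafe {
        // First confirm that the extended leaf 0x8000_0001 exists.
        if __cpuid(0x8000_0000).eax < 0x8000_0001 {
            return false;
        }
        // Bit 29 of EDX is the Long Mode (LM) flag.
        __cpuid(0x8000_0001).edx & (1 << 29) != 0
    }
}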

With a computer running UEFI, the UEFI firmware (except CSM and legacy Option ROM), any UEFI boot loader, and the operating system kernel all run in long mode.

Target Specification

Cargo supports different target systems through the --target parameter. The target is described by a so-called target triple, which describes the CPU architecture, the vendor, the operating system, and the ABI.

For our target system, however, we require some special configuration parameters (e.g., no underlying OS), so none of the existing target triples that Rust supports fits. Instead, we will define our own custom target through a JSON file (x86_64-corrode_os.json).

{
    "llvm-target": "x86_64-unknown-none",
    "data-layout": "e-m:e-i64:64-f80:128-n8:16:32:64-S128",
    "arch": "x86_64",
    "target-endian": "little",
    "target-pointer-width": "64",
    "target-c-int-width": "32",
    "os": "none",
    "executables": true,
    "linker-flavor": "ld.lld",
    "linker": "rust-lld",
    "panic-strategy": "abort",
    "disable-redzone": true,
    "features": "-mmx,-sse,+soft-float"
}
  • Since we’re targeting bare metal (no underlying OS), we have set the OS part of the llvm-target field and the separate os field to none.
  • We’ve also defined the linker-flavor and linker fields, so as not to use the platform’s default linker (which might not support Linux targets), and instead we use the cross-platform LLD linker that is shipped with Rust for linking our kernel.
  • We’ve set the panic-strategy field to “abort” to specify that the target doesn’t support stack unwinding on panic, so instead the program should abort directly.
  • We also need to handle interrupts at some point during the kernel development process. To do that safely, we have to disable a certain stack pointer optimization called the “red zone”, because it would cause stack corruption otherwise. We’ve set the disable-redzone field to true for this purpose.
  • The features field enables/disables target features. We disable the mmx and sse features by prefixing them with a minus and enable the soft-float feature by prefixing it with a plus. Note that there must be no spaces between different flags, otherwise LLVM fails to interpret the features string.
  • The data-layout field defines the size of various integer, floating point, and pointer types.

Building the kernel

For building an operating system, we will need some experimental features that are only available on the nightly channel, so we need to install a nightly version of Rust.

rustup override set nightly

The nightly compiler allows us to opt-in to various experimental features by using so-called feature flags at the top of our file. Note that such experimental features are completely unstable, which means that future Rust versions might change or remove them without prior warning. For this reason, we will only use them if absolutely necessary.

Once Rust nightly version is installed, we can start building the kernel for our new target.
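
Assuming the x86_64-corrode_os.json file sits in the project root, the build command looks like this:

cargo build --target x86_64-corrode_os.json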

And we hit our first failure. The error tells us that the Rust compiler no longer finds the core library, which contains basic Rust types such as Result, Option, and iterators, and is implicitly linked to all no_std crates.

The problem is that the core library is distributed together with the Rust compiler as a precompiled library. So, it is only valid for supported host triples (e.g., x86_64-unknown-linux-gnu) but not for our custom target. If we want to compile code for other targets, we need to recompile core for these targets first.

“build-std” Option

The build-std feature of cargo allows us to recompile core and other standard library crates on demand, instead of using the precompiled versions shipped with the Rust installation. This feature is very new and still unfinished, so it is marked as “unstable” and only available on nightly Rust compilers.

To use the feature, we need to create a local cargo configuration file at .cargo/config.toml (the .cargo folder should sit next to the src folder) with the following content.
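
[unstable]
build-std = ["core", "compiler_builtins"]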

This tells cargo that it should recompile the core and compiler_builtins libraries (the latter is required because it is a dependency of core). To recompile these libraries, cargo needs access to the Rust source code, so let’s install that first.

rustup component add rust-src

Next, let’s rerun the build command.

Et voilà! The build finished successfully for our custom target.

Enabling mem features

There are some memory-related functions (examples below) that are not enabled by default by Rust, because they are normally provided by the C library on the system.

  • memset, which sets all bytes in a memory block to a given value
  • memcpy, which copies one memory block to another
  • memcmp, which compares two memory blocks

Since we can’t link to the C library of the operating system, we need an alternative way to provide these functions to the compiler.

One possible approach for this could be to implement our own memset etc. functions and apply the #[no_mangle] attribute to them (to prevent the compiler from mangling their names during compilation). However, this is dangerous, since the slightest mistake in the implementation of these functions could lead to undefined behavior.

Fortunately, the compiler_builtins crate already contains implementations for all the needed functions; they are just disabled by default so as not to collide with the implementations from the C library. We can enable them by setting cargo’s build-std-features flag to [“compiler-builtins-mem”].
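
With that flag added, the [unstable] section of our .cargo/config.toml looks like this:

[unstable]
build-std = ["core", "compiler_builtins"]
build-std-features = ["compiler-builtins-mem"]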

With this change, our kernel has valid implementations for all compiler-required functions, so it will continue to compile even if our code gets more complex.

We’ve also added a default target to our cargo configuration, so that we don’t have to pass the --target parameter on every invocation of cargo build:
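
[build]
target = "x86_64-corrode_os.json"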

However, our _start entry point, which will be called by the bootloader, is still empty. It’s time to print something to the screen from it, to confirm that the kernel is working.

Printing to screen

We’ll use the VGA text buffer for now to display content on screen. It’s a special memory area consisting of 25 lines that each contain 80 character cells. Each character cell displays an ASCII character with configurable foreground and background colors.

The buffer is located at address 0xb8000 and each character cell consists of an ASCII byte and a color byte.
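
Building on the freestanding binary from Part-1 (no_std, no_main, and a panic handler already in place), a minimal _start in src/main.rs that prints “Hello World!” might look like the following sketch, which the walkthrough below refers to:

static HELLO: &[u8] = b"Hello World!";

#[no_mangle]
pub extern "C" fn _start() -> ! {
    // The VGA text buffer lives at address 0xb8000.
    let vga_buffer = 0xb8000 as *mut u8;

    for (i, &byte) in HELLO.iter().enumerate() {
        unsafe {
            // Each cell is two bytes: the ASCII character, then a color byte.
            *vga_buffer.offset(i as isize * 2) = byte;
            *vga_buffer.offset(i as isize * 2 + 1) = 0xb; // light cyan
        }
    }

    loop {}
}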

First, we cast the integer 0xb8000 into a raw pointer. Then we iterate over the bytes of the static HELLO byte string. In the body of the for loop, we use the offset method to write the string byte and the corresponding color byte (0xb is light cyan).

Note that we’re using an unsafe block here to tell the Rust compiler that we are absolutely sure that the operations are valid. The reason is that the Rust compiler can’t prove that the raw pointers we create are valid. They could point anywhere and lead to data corruption.

Running the kernel

First, we will turn our compiled kernel into a bootable disk image by linking it with a bootloader. Then we can run the disk image in the QEMU virtual machine or boot it on real hardware using a USB stick. We will opt for the first option for now.

Instead of writing our own bootloader, which is a project on its own, we use the bootloader crate. This crate implements a basic BIOS bootloader without any C dependencies, just Rust and inline assembly. To use it for booting our kernel, we need to add a dependency on it in Cargo.toml (use version 0.9.8, since the build process described here fails with later versions):
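
[dependencies]
bootloader = "0.9.8"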

Next, we need to link our kernel with the bootloader after compilation. We will use the tool bootimage that first compiles the kernel and bootloader, and then links them together to create a bootable disk image.

cargo install bootimage

rustup component add llvm-tools-preview

The first command installs the bootimage tool, and the second command installs the llvm-tools-preview rustup component, required for building the boot loader.

Once both installations are done, we can go ahead and start the boot loader build process.

cargo bootimage

After executing the command, we should be able to see a bootable disk image named bootimage-corrode-os.bin in the target/x86_64-corrode_os/debug directory. We can boot it in a virtual machine or copy it to a USB drive to boot it on real hardware.

Note that this is not a CD image, which has a different format, so burning it to a CD doesn’t work.

Booting the image in QEMU

To boot the image in QEMU, we will execute the below command.

qemu-system-x86_64 -drive format=raw,file=target\x86_64-corrode_os\debug\bootimage-corrode-os.bin

This will open a separate window with “Hello World!” visible on the screen.

Summary

Finally, we’ve been able to create a minimal Rust kernel that prints “Hello World!” to the screen when booted.

For next steps, I’m exploring the VGA text buffer in more detail and working on writing a safe interface for it, so as to avoid using the Rust unsafe block.

I’ll cover the detailed implementation steps in the next post.
