Creating an OS using Rust: [Part-3] Playing with VGA Text Mode

Tapas Das
Published in Dev Genius, Dec 17, 2023

In Part-2 of this blog series, we created a minimal Rust kernel that prints “Hello World!” to the screen when booted. In this post, we will build on that kernel and write a safe interface for accessing and writing to the VGA text buffer, so that unsafe code is confined to a single place.

As always, the basic assumption is that the reader is already comfortable with the Rust programming language and understands Rust syntax and concepts such as cargo, raw pointers, statics, structs, enums, and unsafe blocks.

Let’s get started.

VGA Text Buffer

The VGA text buffer is a two-dimensional array with typically 25 rows and 80 columns, which is directly rendered to the screen. Each array entry describes a single screen character through the following format:

  • Bits 0–7: ASCII code point
  • Bits 8–11: Foreground color
  • Bits 12–14: Background color
  • Bit 15: Blink

The first byte holds the character that should be printed. The second byte defines how the character is displayed: its lowest four bits set the foreground color, the next three bits set the background color, and the highest bit controls whether the character blinks.
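To make the layout concrete, here is a minimal sketch that packs one cell by hand. The function name vga_cell and its parameters are made up for illustration and are not part of the kernel code.

```rust
/// Packs one VGA text cell into a 16-bit value (hypothetical helper).
/// Low byte: character code. High byte: blink | background | foreground.
fn vga_cell(ascii: u8, foreground: u8, background: u8, blink: bool) -> u16 {
    let attribute = ((blink as u8) << 7)      // bit 15 of the cell
        | ((background & 0x07) << 4)          // bits 12-14 of the cell
        | (foreground & 0x0f);                // bits 8-11 of the cell
    ((attribute as u16) << 8) | ascii as u16  // bits 0-7 hold the character
}
```

For example, a yellow (0xe) letter 'A' (0x41) on a blue (0x1) background packs to 0x1e41.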

The characters belong to the “Code Page 437” character set (also known as “OEM font”, “high ASCII” or “Extended ASCII”). The following tables show code page 437. Each character is shown with its equivalent Unicode code point (when it is not equal to the character’s code).

Pic Reference: Code page 437 — Wikipedia

The VGA text buffer is accessible via memory mapped I/O to the address 0xb8000.

This means that reads and writes to that address don’t go to RAM but directly to the text buffer on the VGA hardware, so we can read and write it through normal memory operations.

Representing VGA colors in Rust

We will define an enum that explicitly specifies the number for each color. Because of the repr(u8) attribute, each enum variant is stored as a u8.

Next, we will create a newtype on top of u8 to represent a full color code that specifies both the foreground and the background color.

The #[repr(transparent)] attribute ensures that the layout and ABI of the whole struct are guaranteed to be the same as those of its single field.
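A sketch of what these two types might look like is given here. The variant names follow the standard 16-color VGA palette; the type names Color and ColorCode and the constructor new are assumptions for illustration.

```rust
#[allow(dead_code)] // not every color is used in this sketch
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)] // store each variant as a u8
enum Color {
    Black = 0,
    Blue = 1,
    Green = 2,
    Cyan = 3,
    Red = 4,
    Magenta = 5,
    Brown = 6,
    LightGray = 7,
    DarkGray = 8,
    LightBlue = 9,
    LightGreen = 10,
    LightCyan = 11,
    LightRed = 12,
    Pink = 13,
    Yellow = 14,
    White = 15,
}

/// Full attribute byte: background in the high nibble, foreground in the low.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(transparent)] // same layout as the wrapped u8
struct ColorCode(u8);

impl ColorCode {
    fn new(foreground: Color, background: Color) -> ColorCode {
        // Note: a background of DarkGray (8) or above also sets the blink bit.
        ColorCode((background as u8) << 4 | (foreground as u8))
    }
}
```

With this encoding, yellow on black is 0x0e and white on blue is 0x1f.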

Representing text buffer in Rust

Each screen character is represented by a ScreenChar struct. Since the field ordering of structs with the default representation is undefined in Rust, we have used the #[repr(C)] attribute to guarantee that the struct’s fields are laid out exactly as in a C struct, which gives us the correct field ordering.

For the Buffer struct, we have used the #[repr(transparent)] attribute again to ensure that it has the same memory layout as its single field.
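The two structs could be sketched as follows. The field names and the constants BUFFER_HEIGHT and BUFFER_WIDTH are assumptions, ColorCode is reduced to a bare newtype to keep the block self-contained, and layout_sizes is a hypothetical helper that only exists to check the layout.

```rust
use core::mem::size_of;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(transparent)]
struct ColorCode(u8);

/// One cell of the text buffer: character byte first, attribute byte second.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(C)] // keep the fields in declaration order, as the hardware expects
struct ScreenChar {
    ascii_character: u8,
    color_code: ColorCode,
}

const BUFFER_HEIGHT: usize = 25;
const BUFFER_WIDTH: usize = 80;

#[repr(transparent)] // same layout as the nested array
struct Buffer {
    chars: [[ScreenChar; BUFFER_WIDTH]; BUFFER_HEIGHT],
}

/// Returns the sizes of a single cell and of the whole buffer in bytes.
fn layout_sizes() -> (usize, usize) {
    (size_of::<ScreenChar>(), size_of::<Buffer>())
}
```

Each cell takes 2 bytes, so the full 80 x 25 buffer occupies 4000 bytes.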

Next, we will create the writer type to actually print characters on the screen.

  • The writer will always write to the last line and shift lines up when a line is full (or on \n new line character).
  • The column_position field keeps track of the current position in the last row.
  • The current foreground and background colors are specified by the color_code field, and a reference to the VGA buffer is stored in the buffer field.

Note that we have defined an explicit ‘static lifetime here to tell the compiler that the reference is valid for the whole program run time (which is true for the VGA text buffer).

Printing characters on screen

Next, we will create a method for the Writer type to modify the VGA text buffer and write a single byte.

If the byte is the newline byte \n, the writer does not print anything. Instead, it calls a new_line method. Other bytes get printed to the screen in the second match case.

When printing a byte, the writer checks if the current line is full. If yes, then a new_line call is used to wrap the line. Then it writes a new ScreenChar to the buffer at the current position. Finally, the current column position is advanced.

To print whole strings, we can convert them to bytes and print them one-by-one.

Note: Rust strings are UTF-8 by default, so they might contain bytes that are not supported by the VGA text buffer. This is the reason that we’re using match to differentiate printable “Code Page 437” bytes and unprintable bytes. For unprintable bytes, we print a ■ character, which has the hex code 0xfe on the VGA hardware.
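Putting the Writer, write_byte and write_string together, a minimal sketch might look like this. To keep it runnable on a normal machine, the attribute is a plain u8 rather than a color newtype, new_line is only a placeholder, and the hypothetical host_writer helper backs the writer with leaked heap memory instead of the real VGA buffer.

```rust
const BUFFER_HEIGHT: usize = 25;
const BUFFER_WIDTH: usize = 80;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(C)]
struct ScreenChar {
    ascii_character: u8,
    color_code: u8, // plain attribute byte in this sketch
}

#[repr(transparent)]
struct Buffer {
    chars: [[ScreenChar; BUFFER_WIDTH]; BUFFER_HEIGHT],
}

struct Writer {
    column_position: usize,
    color_code: u8,
    buffer: &'static mut Buffer,
}

impl Writer {
    fn write_byte(&mut self, byte: u8) {
        match byte {
            b'\n' => self.new_line(),
            byte => {
                // Wrap to a new line when the current one is full.
                if self.column_position >= BUFFER_WIDTH {
                    self.new_line();
                }
                let row = BUFFER_HEIGHT - 1; // always write to the last line
                let col = self.column_position;
                self.buffer.chars[row][col] = ScreenChar {
                    ascii_character: byte,
                    color_code: self.color_code,
                };
                self.column_position += 1;
            }
        }
    }

    fn write_string(&mut self, s: &str) {
        for byte in s.bytes() {
            match byte {
                // Printable ASCII byte or newline.
                0x20..=0x7e | b'\n' => self.write_byte(byte),
                // Anything else is shown as the block character 0xfe.
                _ => self.write_byte(0xfe),
            }
        }
    }

    fn new_line(&mut self) {
        // Placeholder: only resets the column; scrolling is not handled here.
        self.column_position = 0;
    }
}

/// Hypothetical host-side helper: a Writer over leaked heap memory.
fn host_writer() -> Writer {
    let blank = ScreenChar { ascii_character: b' ', color_code: 0x0e };
    let buffer = Box::leak(Box::new(Buffer {
        chars: [[blank; BUFFER_WIDTH]; BUFFER_HEIGHT],
    }));
    Writer { column_position: 0, color_code: 0x0e, buffer }
}
```

Because write_string works byte by byte, a multi-byte UTF-8 character produces one block character per byte.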

Now that we have everything in place, let’s try and print “Hello World!” again but this time with safe Rust.

The print_something function first creates a new Writer that points to the VGA buffer at 0xb8000. To do so, we cast the integer 0xb8000 to a mutable raw pointer. Then we convert it to a mutable reference by dereferencing it (through *) and immediately borrowing it again (through &mut). This conversion requires an unsafe block, since the compiler can’t guarantee that the raw pointer is valid.

Then it writes the byte b'H', followed by the strings “ello ” and “Wörld!”. To see the output, we need to call the print_something function from our _start function.

Let’s build the bootimage again and run it in QEMU to check the result.

We can see that the ö is printed as two ■ characters. That’s because ö is represented by two bytes in UTF-8, neither of which falls into the printable “Code Page 437” range.
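The pointer-to-reference step can be sketched in isolation. The helper names buffer_at and host_buffer_addr are hypothetical, and the cells are plain u16 values for brevity. On a normal machine the address 0xb8000 is not mapped, so this sketch uses the address of a heap allocation; in the kernel the same conversion would be applied to VGA_BUFFER_ADDR.

```rust
#[repr(transparent)]
struct Buffer {
    chars: [[u16; 80]; 25],
}

/// Address of the memory-mapped text buffer on real hardware.
const VGA_BUFFER_ADDR: usize = 0xb8000;

/// Converts an integer address into a mutable reference (hypothetical helper).
///
/// Safety: the caller must guarantee that `addr` points to valid memory that
/// holds a `Buffer`, lives for the rest of the program, and is not aliased.
unsafe fn buffer_at(addr: usize) -> &'static mut Buffer {
    // Cast to a raw pointer, dereference it, and immediately reborrow.
    unsafe { &mut *(addr as *mut Buffer) }
}

/// Host-side stand-in: leaks a heap buffer and returns its address.
fn host_buffer_addr() -> usize {
    Box::into_raw(Box::new(Buffer { chars: [[0; 80]; 25] })) as usize
}
```

Everything after the single unsafe call works on an ordinary &mut Buffer, which is what keeps the rest of the interface safe.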

Volatile Reads and Writes

So far, the Rust compiler is unaware that we are really accessing VGA buffer memory (instead of normal RAM) and knows nothing about the side effect that some characters appear on the screen.

As a result, future versions of the Rust compiler that optimize more aggressively might decide that these writes are unnecessary and omit them. To avoid this erroneous optimization, we need to mark these writes as volatile. This tells the compiler that the write has side effects and must not be optimized away.

In computer programming, volatile means that a value is prone to change over time, outside the control of some code. Volatility has implications within function calling conventions, and also impacts how variables are stored, accessed and cached.

In order to achieve this, we will use the “Volatile” crate that provides a Volatile wrapper type with read and write methods. These methods internally use the read_volatile and write_volatile functions of the core library and thus guarantee that the reads/writes are not optimized away.

Next, we will update the buffer type to use “Volatile”.

Next, we have to update the Writer::write_byte method to use the “write” method for printing characters, instead of a plain assignment with =. This guarantees that the Rust compiler will never optimize away this write.
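To show what such a wrapper does under the hood, here is a minimal stand-in built directly on core::ptr::read_volatile and core::ptr::write_volatile. It is not the crate’s actual source; the type below is a simplified assumption that only mirrors the read and write methods described above.

```rust
use core::ptr;

/// Simplified stand-in for a volatile wrapper type (not the real crate).
#[repr(transparent)]
struct Volatile<T: Copy>(T);

impl<T: Copy> Volatile<T> {
    fn new(value: T) -> Self {
        Volatile(value)
    }

    /// Volatile read: the compiler may not elide or merge this access.
    fn read(&self) -> T {
        // Safe to call: the reference guarantees a valid, aligned pointer.
        unsafe { ptr::read_volatile(&self.0) }
    }

    /// Volatile write: the compiler may not optimize this store away.
    fn write(&mut self, value: T) {
        // Safe to call: the reference guarantees a valid, aligned pointer.
        unsafe { ptr::write_volatile(&mut self.0, value) }
    }
}
```

With a buffer of such cells, a write changes from an assignment like cell = value to a call like cell.write(value).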

Formatting Macros

Wouldn’t it be nice if we could easily print different types, like integers or floats, without jumping through too many hoops? Rust’s formatting macros allow us to do just that. But first, we need to implement the core::fmt::Write trait.

The only required method of the core::fmt::Write trait is write_str, which looks quite similar to our write_string method, just with a fmt::Result return type. Now we can use Rust’s built-in write!/writeln! formatting macros.
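A sketch of the trait implementation follows. To keep it runnable outside the kernel, the Writer here is a hypothetical stand-in that collects bytes in a Vec instead of writing to the VGA buffer, and demo is an illustrative helper.

```rust
use core::fmt::{self, Write};

/// Host-side stand-in for the kernel's Writer: collects the output bytes.
struct Writer {
    written: Vec<u8>,
}

impl Writer {
    fn write_byte(&mut self, byte: u8) {
        self.written.push(byte);
    }

    fn write_string(&mut self, s: &str) {
        for byte in s.bytes() {
            match byte {
                0x20..=0x7e | b'\n' => self.write_byte(byte),
                _ => self.write_byte(0xfe),
            }
        }
    }
}

impl fmt::Write for Writer {
    // The only required method; write! and writeln! are built on top of it.
    fn write_str(&mut self, s: &str) -> fmt::Result {
        self.write_string(s);
        Ok(())
    }
}

/// Formats an integer and a float through the Write implementation.
fn demo() -> Vec<u8> {
    let mut writer = Writer { written: Vec::new() };
    writer.write_string("Hello World! ");
    write!(writer, "The numbers are {} and {}", 42, 1.0 / 3.0).unwrap();
    writer.written
}
```

The write! call returns a fmt::Result, which the sketch simply unwraps because write_str never fails here.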

This should print “Hello World! The numbers are 42 and 0.3333333333333333” at the bottom of the screen.

Handling new lines

In case the characters don’t fit into a line anymore, we would want to move every character one line up (the top line gets deleted) and start at the beginning of the last line again. To achieve this, we will add a new_line implementation.

We iterate over all the screen characters and move each character one row up. We also omit the 0th row (the first range starts at 1) because it’s the row that is shifted off screen.

Next, we will add the clear_row implementation, which will clear a row by overwriting all of its characters with a space character. This will work in conjunction with new_line method to shift characters one row up and then clear the current row.
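Both methods can be sketched as follows. This version uses plain reads and assignments on a heap-allocated grid so that it runs on a normal machine; in the kernel the same loops would go through the volatile read and write methods. The host_writer helper and the u8 attribute are assumptions for illustration.

```rust
const BUFFER_HEIGHT: usize = 25;
const BUFFER_WIDTH: usize = 80;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(C)]
struct ScreenChar {
    ascii_character: u8,
    color_code: u8,
}

struct Writer {
    column_position: usize,
    color_code: u8,
    // Host-side stand-in for the reference to the VGA buffer.
    buffer: Box<[[ScreenChar; BUFFER_WIDTH]; BUFFER_HEIGHT]>,
}

impl Writer {
    /// Shifts every row up by one, then clears the last row.
    fn new_line(&mut self) {
        // Start at row 1: row 0 is the one that scrolls off the screen.
        for row in 1..BUFFER_HEIGHT {
            for col in 0..BUFFER_WIDTH {
                let character = self.buffer[row][col];
                self.buffer[row - 1][col] = character;
            }
        }
        self.clear_row(BUFFER_HEIGHT - 1);
        self.column_position = 0;
    }

    /// Overwrites every cell of `row` with a space in the current color.
    fn clear_row(&mut self, row: usize) {
        let blank = ScreenChar {
            ascii_character: b' ',
            color_code: self.color_code,
        };
        for col in 0..BUFFER_WIDTH {
            self.buffer[row][col] = blank;
        }
    }
}

/// Hypothetical helper that builds a Writer over a blank grid.
fn host_writer() -> Writer {
    let blank = ScreenChar { ascii_character: b' ', color_code: 0x0e };
    Writer {
        column_position: 0,
        color_code: 0x0e,
        buffer: Box::new([[blank; BUFFER_WIDTH]; BUFFER_HEIGHT]),
    }
}
```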

Creating global writer

Ideally, we would like to be able to print characters on screen from any module, without carrying a “Writer” instance around. So, we will create a static WRITER that can be reused by all modules.

If we try to compile this code, we will get the below errors.

As we know, Rust statics are initialized at compile time, in contrast to normal variables, which are initialized at run time. The issue here is that the Rust compiler is unable to convert raw pointers to references at compile time.

In order to solve this problem, we’ll use the crate “lazy_static”, which lazily initializes statics when accessed for the first time. Thus, initialization happens at runtime, so arbitrarily complex initialization code is possible.

Note: We need the spin_no_std feature, since we don’t link the standard library.

Next, let’s define our static WRITER using lazy_static to solve the compilation errors.

However, we won’t be able to use the WRITER static yet, since it is immutable. This means that we can’t write anything to it (since all the write methods take &mut self). One possible solution would be to use a mutable static. But then every read and write to it would be unsafe since it could easily introduce data races and other bad things.

Leveraging Spinlocks

Ideally, we would use the standard library’s “Mutex” to achieve synchronized interior mutability; it provides mutual exclusion by blocking threads when the resource is already locked. But our basic kernel does not have any blocking support or even a concept of threads, so we can’t use it.

Instead, we can use a “spinlock”, a very basic kind of mutex that requires no operating system features.

A spinlock is a lock that causes a thread trying to acquire it to simply wait in a loop (“spin”) while repeatedly checking whether the lock is available. Since the thread remains active but is not performing a useful task, the use of such a lock is a kind of busy waiting. Once acquired, spinlocks will usually be held until they are explicitly released, although in some implementations they may be automatically released if the thread being waited on (the one that holds the lock) blocks or “goes to sleep”.
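To make the idea concrete, here is a minimal spinlock sketch built only on core atomics. It is a teaching example under simplified assumptions, not the implementation of any particular crate; the names SpinLock, SpinGuard, COUNTER and bump are made up for illustration.

```rust
use core::cell::UnsafeCell;
use core::hint::spin_loop;
use core::ops::{Deref, DerefMut};
use core::sync::atomic::{AtomicBool, Ordering};

struct SpinLock<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// Sharing the lock is fine as long as the protected value can be sent.
unsafe impl<T: Send> Sync for SpinLock<T> {}

impl<T> SpinLock<T> {
    const fn new(value: T) -> Self {
        SpinLock { locked: AtomicBool::new(false), value: UnsafeCell::new(value) }
    }

    /// Busy-waits until the lock is free, then returns a guard.
    fn lock(&self) -> SpinGuard<'_, T> {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            spin_loop(); // hint to the CPU that we are spinning
        }
        SpinGuard { lock: self }
    }
}

/// Gives access to the value and releases the lock when dropped.
struct SpinGuard<'a, T> {
    lock: &'a SpinLock<T>,
}

impl<T> Deref for SpinGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        // The guard exists only while the lock is held, so access is exclusive.
        unsafe { &*self.lock.value.get() }
    }
}

impl<T> DerefMut for SpinGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        unsafe { &mut *self.lock.value.get() }
    }
}

impl<T> Drop for SpinGuard<'_, T> {
    fn drop(&mut self) {
        self.lock.locked.store(false, Ordering::Release);
    }
}

// An immutable static that can still be mutated through the lock.
static COUNTER: SpinLock<u32> = SpinLock::new(0);

fn bump() -> u32 {
    let mut guard = COUNTER.lock();
    *guard += 1;
    *guard
}
```

The static itself stays immutable; all mutation goes through the guard returned by lock, which is the interior mutability we need for a global writer.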

To use spinlocks, let’s add the “spin” crate to the dependencies.

Next, we will use the spinlocks to add safe interior mutability to our static WRITER.

This lets us print characters directly from our _start function, without going through the print_something function.

Let’s build the bootimage once more and run it in QEMU to check the result.

As expected, we now see a “Hello Again! some numbers: 42 1.337” on the screen.

Summary

Finally, we got rid of most of the unsafe blocks that we had written in the last blog post. We now have a single unsafe block in our code, which is needed to create a Buffer reference pointing to 0xb8000.

Afterwards, all operations are safe. Rust uses bounds checking for array accesses by default, so we can’t accidentally write outside the buffer. Thus, we have encoded the required conditions in the type system and are able to provide a safe interface to the outside.

As a next step, I’m exploring recreating the println! macro to make it easier to write characters and strings at boot.

Stay tuned for the next post.
