Dev Genius

Coding, Tutorials, News, UX, UI and much more related to development

What the Fork Does fork() Do?

Daniel Kogan
Published in Dev Genius
6 min read · Jun 3, 2023

Why is multiprocessing important and how can we utilize it to its fullest potential?

Original Photo by Kari Shea on Unsplash, Edited by Daniel Kogan

This article will require cursory knowledge of C syntax to follow along with the programming examples used towards the end.

At its best, multiprocessing can massively improve the efficiency of your programs by delegating responsibilities appropriately. Some example use cases of multiprocessing are:

  • Every Chrome tab running in a separate process to provide and display different content
  • Downloading multiple files at the same time
  • Your operating system managing background tasks
  • Breaking down images into multiple sections and then applying a filter to each section individually

However, multiprocessing also comes with some drawbacks. Creating a new process is expensive compared with creating a thread, and sharing data between different processes (interprocess communication) is often complicated. In these situations, it can be better to use threads, which are lightweight and share the memory of the process they belong to, so passing data between them is simpler. A process is a better fit when you need isolation: it does not share its memory with other processes by default, and unlike a thread it can be terminated without affecting the operation of the others.

What is a Process?

A process can be thought of as a task. A process has an ID (also known as a PID), its own stack and heap, code to execute, and a reference to its ‘parent’ process. Every process except the very first one has a parent process, which acts as a task manager. The parent is entrusted with responsibility for the child from birth to death, and unfortunately for these parents, they are expected to outlive their child processes.

In this sense, the process hierarchy can be viewed as a tree, where the higher levels of the tree have broader responsibilities. For instance, the top-level process can manage all the currently running applications, and each application in turn manages the processes that handle its own details.

On startup in a Unix system, the kernel creates an ‘init’ process, with the PID of 1.

Simplified Process Hierarchy for Chrome Application with 3 Tabs

Processes are also broken down into groups. A process group is a collection of related processes that are grouped together for certain administrative purposes, and a child process will inherit its group from its parent. In the example above, all the processes underneath Google Chrome will be in the same process group.

You can also view the process table, the list of every currently active process on your Unix device, by running the following command: ps aux
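The same identifiers can be read from inside a C program. As a minimal sketch, the standard getpid() and getppid() calls return the PID of the current process and of its parent; the helper name describe_process is made up for this illustration and is meant to be called from your own main.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes this process's PID and its parent's PID
 * into buf and returns the number of characters snprintf() produced. */
int describe_process(char *buf, size_t size) {
    pid_t self = getpid();     /* ID of the calling process */
    pid_t parent = getppid();  /* ID of the process that created it */
    return snprintf(buf, size, "pid=%ld ppid=%ld", (long)self, (long)parent);
}
```

When the program is started directly from a terminal, the ppid value is normally the PID of your shell, which you can confirm in the ps aux listing.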

How Do I Program it?

In C, you can create a new process with the fork() function. fork() is very interesting because a single call returns twice, once in each process: the child process receives a value of 0 from fork(), while the parent receives the PID of the child. If the call fails, no child is created and the parent receives -1.

You may be tempted to immediately try something like this and be confused about why it doesn’t work.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
    int pid = fork();
    if (pid == -1) {
        printf("there was an error\n");
        exit(-1);
    }
    if (pid == 0) {
        printf("I am the child process - Hello World\n");
    }
    else {
        // note: the parent does not wait for the child here
        printf("I am the parent process - Hello World\n");
    }
    return 0;
}

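After running it a few times, you will notice that the output is inconsistent: the order of the two messages can change, and the child’s message sometimes shows up only after your shell prompt has returned, which makes it look as if just one message was printed. Why is this? This is caused by a race condition in this program.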

A race condition occurs when the behavior of a program depends on the relative timing of concurrent operations, which leads to unpredictable results.

The race exists because, once fork() returns, the operating system’s scheduler decides which process runs first. Creating a process takes time, so the parent often prints and returns before the child has had a chance to run. Here is a corrected version of the code:

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/wait.h>

int main() {
    int pid = fork();
    if (pid == -1) {
        printf("there was an error\n");
        exit(-1);
    }
    if (pid == 0) {
        printf("I am the child process - Hello World\n");
    }
    else {
        printf("I am the parent process - Hello World\n");
        wait(NULL); // wait for the child process to finish executing
    }
    return 0;
}

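Now the parent process always waits for the child and terminates after it, so both messages are printed before control returns to the shell. The order of the two messages can still vary, because the parent prints before it calls wait().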
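If the order of the two messages matters as well, one option is for the parent to call wait() before it prints. The sketch below assumes that "child first, then parent" is the behavior you want; greet_child_first is a hypothetical name, and the function returns 0 on success so that it can be called from main.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child always prints before the parent.
 * Returns 0 on success and -1 if fork() or wait() fails. */
int greet_child_first(void) {
    fflush(stdout);            /* do not copy pending output into the child */
    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        printf("I am the child process - Hello World\n");
        fflush(stdout);
        _exit(0);              /* leave without returning into the caller */
    }
    if (wait(NULL) == -1) {    /* block until the child has finished */
        perror("wait");
        return -1;
    }
    printf("I am the parent process - Hello World\n");
    return 0;
}
```

Because the parent only prints after wait() has returned, the child’s line is always written first.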

How Many Processes are Created?

To work out how many processes a program creates, it helps to follow its process creation graph, that is, which processes exist each time fork() is called.

Take a look at the following program and analyze how many processes are created:

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
int main() {
    fork();
    fork();
    return 0;
}

The answer is 3 new processes, for a total of 4 including the original. The first fork() creates one child; then both the parent and that child execute the second fork(), and each of them creates another child. Here is the process creation graph:
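The count can also be checked empirically. The following is a rough sketch, not part of the original program: it calls fork() n times in a row and has every resulting process write one byte into a pipe, so the original process can count them afterwards. The pipe-based counting and the name count_processes_after_forks are assumptions made for this illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: calls fork() n times in a row and returns how many
 * processes exist afterwards (the original included), or -1 on error.
 * Keep n small: the count doubles with every fork(). */
int count_processes_after_forks(int n) {
    int fds[2];
    if (pipe(fds) == -1) {
        return -1;
    }

    pid_t original = getpid();
    for (int i = 0; i < n; i++) {
        if (fork() == -1) {
            perror("fork");    /* this process carries on without a new child */
        }
    }

    /* Every process, old or new, reports in with a single byte. */
    char byte = 'x';
    if (write(fds[1], &byte, 1) != 1) {
        perror("write");
    }
    close(fds[1]);

    /* Reap our own children so that none is left behind as a zombie. */
    while (wait(NULL) > 0) {
    }

    if (getpid() != original) {
        _exit(0);              /* forked copies stop here */
    }

    /* All write ends are closed by now, so read() reaches end-of-file. */
    int count = 0;
    while (read(fds[0], &byte, 1) == 1) {
        count++;
    }
    close(fds[0]);
    return count;
}
```

For two forks the function returns 4, which is the original process plus the 3 new ones. In general, n consecutive forks leave 2^n processes.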

Graph of created processes

The Fork Bomb

You may have heard of a type of attack called the ‘fork bomb.’ This would be a program that looks something like this:

while (1) fork();

This will create new child processes until the machine is out of resources to allocate. Because the loop is infinite, every process keeps trying to fork, which stalls the machine and makes it basically unusable.

It is called a ‘bomb’ because every process created by fork() also continues the loop and forks again, so the number of processes roughly doubles with each pass, quickly ‘exploding’ and consuming system resources.
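To get a feel for how fast that growth is, here is a small arithmetic sketch that is safe to run because it never calls fork(). It assumes an idealized model in which every live process forks exactly once per round; real scheduling is not that tidy. The function name and the limit passed to it are hypothetical.

```c
#include <limits.h>

/* Idealized model: every live process forks once per round, so the
 * process count doubles each round. Returns how many rounds it takes to
 * reach or exceed `limit` processes, starting from a single process. */
int rounds_until_limit(unsigned long limit) {
    unsigned long processes = 1;
    int rounds = 0;
    /* The second condition stops the doubling before it could overflow. */
    while (processes < limit && processes <= ULONG_MAX / 2) {
        processes *= 2;   /* each existing process adds one new copy */
        rounds++;
    }
    return rounds;
}
```

With a made-up limit of 4096 processes, rounds_until_limit(4096) returns 12: a dozen doublings are enough to hit it.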

Zombie Processes

I mentioned earlier that it is the parent’s responsibility to monitor its child until the child’s death, but what if it doesn’t? A child that has terminated but has not yet been waited for by its parent becomes a zombie process, which still has an entry in the process table but is not doing anything. Here is an example of how to create a zombie process:

#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>

int main() {
    pid_t pid = fork();

    if (pid == 0) {
        // Child process
        exit(0); // die
    } else if (pid > 0) {
        // Parent process
        sleep(10); // parent sleeps on the job while child dies
    }

    return 0;
}

While the parent is still sleeping, you can run this command in another terminal to view the zombie process in your system: ps -e -o pid,ppid,state,cmd. The zombie is the entry whose state is Z.

Original Photo by cottonbro studio from Pexels, Edited by Daniel Kogan

As the name implies, we want to avoid zombie processes: each one keeps its entry in the process table, and with it a PID, while not doing anything, and both are in limited supply. Here is the fixed version of our code, which will reap the child process and appropriately clean up system resources:

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdlib.h>

int main() {
    pid_t pid = fork();

    if (pid == 0) {
        // Child process
        exit(0); // die
    } else if (pid > 0) {
        // Parent process
        int status;
        waitpid(pid, &status, 0); // Wait for this child to die and reap it
    }
    return 0;
}

The waitpid() function blocks the calling process until one of its child processes dies. It takes 3 parameters: pid, status, and options.

pid can be less than -1, -1, 0, or any value greater than 0. If it is -1, it will wait for any child process to die. If it is 0, it will wait for any child process in the same process group as the caller to die. If it is greater than 0, it will wait for the specific child process with the given ID to die. If it is less than -1, it will wait for any child whose process group ID equals the absolute value of pid.

status is a pointer to an integer where information about how the child terminated will be stored. Macros such as WIFEXITED and WEXITSTATUS extract the child’s exit code from it, and you can pass NULL if you do not need it.

options allows you to pass additional flags into waitpid. For example, the WNOHANG option prevents waitpid from suspending execution: it returns immediately, with a value of 0, if no child process has died yet.
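To make the status parameter concrete, here is a minimal sketch of reading a child’s exit code back in the parent. WIFEXITED and WEXITSTATUS are the standard macros from <sys/wait.h>; the helper name child_exit_code is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that exits with `code` and returns
 * the exit code the parent reads back, or -1 if anything goes wrong. */
int child_exit_code(int code) {
    pid_t pid = fork();
    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        _exit(code);                       /* child: terminate immediately */
    }
    int status;
    if (waitpid(pid, &status, 0) == -1) {  /* wait for this specific child */
        return -1;
    }
    if (WIFEXITED(status)) {               /* true if the child exited normally */
        return WEXITSTATUS(status);        /* low 8 bits of its exit code */
    }
    return -1;                             /* e.g. the child was killed by a signal */
}
```

For example, child_exit_code(7) returns 7, because the child calls _exit(7) and the parent decodes that value from status.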

Fun Fact: waitpid(-1, NULL, 0) is actually equivalent to wait(NULL) in C!
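Returning to the WNOHANG option described above, the sketch below shows a non-blocking check followed by a normal blocking wait. To make the result predictable, it assumes a pipe is used to keep the child alive until the parent has polled it; that pipe and the name poll_running_child are illustrative additions, not part of the earlier examples.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns what waitpid(..., WNOHANG) reports while
 * the child is still running (expected: 0), or -1 on error. */
int poll_running_child(void) {
    int fds[2];
    if (pipe(fds) == -1) {
        return -1;
    }
    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        char byte;
        close(fds[1]);
        /* Child: block until the parent closes its end of the pipe. */
        while (read(fds[0], &byte, 1) > 0) {
        }
        _exit(0);
    }
    close(fds[0]);

    /* The child cannot exit before the pipe is closed, so this call
     * returns 0 straight away instead of blocking. */
    pid_t polled = waitpid(pid, NULL, WNOHANG);

    close(fds[1]);                         /* let the child finish */
    if (waitpid(pid, NULL, 0) != pid) {    /* the blocking wait reaps it */
        return -1;
    }
    return (int)polled;
}
```

A parent that has other work to do can call waitpid with WNOHANG periodically and only handle the child once the return value is the child’s PID.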

Conclusion

In conclusion, multiprocessing can be a very powerful tool for efficiency, and understanding it helps you understand how different tasks are managed on your computer.

If you are interested in learning more about low level programming, check out the following article on programming in Assembly!


Written by Daniel Kogan

Computer Science student at Stony Brook University | Software Engineering Intern at J.P. Morgan Chase
