- A process is a program in execution on a virtual machine. We call it "runnable" rather than "running" because a process is not on the CPU at every moment; the kernel switches the CPU among processes.
- A process can execute ordinary instructions and system calls.
- A process cannot directly access information that belongs to another process.
graph TD
    A[Processes] -->|System Calls| B(Kernel)
    B --> C{Hardware}
Principles of Designing System Calls
Orthogonality:
- In the context of system calls, orthogonality refers to the design principle where each operation does a single thing well, reducing the overlap in functionality.
- System calls should ideally function independently, with each handling specific tasks without unnecessary interdependencies. This makes the system easier to understand and less prone to errors.
- The principle encourages system calls to be versatile, potentially handling various resources such as processes, files, and network connections.
Handling Files (System Calls):
- lseek():
  - The lseek(fd, offset, whence) system call is used to change the location of the read/write pointer in a file, with “fd” being the file descriptor, “offset” specifying the number of bytes, and “whence” determining the position from which to move the pointer.
  - While lseek() is logical for storage devices or regular files where random access is possible, it doesn’t align with the orthogonality principle when used with stream-based files (like network sockets) where such random access doesn’t make sense.
- open() and close():
  - The open() system call allocates a file descriptor for a file or device, allowing the process to read from or write to that file or device.
  - Conversely, the close() system call deallocates the file descriptor, releasing the resource and making the descriptor available for future open() calls.
  - Implementation Insight: Within the process table, there’s a specific area that tracks the file descriptors associated with each process. When open() is called, it allocates a slot in this table. When close() is called, it deallocates the slot, thus freeing that resource.
A file descriptor leak occurs when a program forgets to close files it has opened, so descriptors accumulate until the process runs out of slots in its file descriptor table and further open() calls fail.
Handling Processes
pid_t fork(void): creates a new process by cloning the calling process; this is done via a system call. The return value tells the caller what happened:
- -1: The fork failed, for example due to insufficient resources.
- 0: The fork succeeded, and the current process is the child process.
- Positive value: The fork succeeded, and the current process is the parent process. The returned value is the process ID (PID) of the child process.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == -1) {
        // Fork failed
        perror("fork");
        exit(1);
    } else if (pid == 0) {
        // Child process
        printf("I am the child process (PID: %d)\n", (int)getpid());
    } else {
        // Parent process
        printf("I am the parent process (PID: %d)\n", (int)getpid());
        printf("My child's PID is %d\n", (int)pid);
    }
    return 0;
}
- int kill(pid_t pid, int sig):
  - The kill() system call is used to send the signal specified by “sig” to the process or group of processes specified by “pid.”
  - Despite its name, kill() doesn’t always cause a process to exit immediately. Its behavior depends on the signal sent. For example, SIGTERM is a polite request for termination, while SIGKILL is a forceful one. Processes can handle or ignore some signals, but SIGKILL cannot be caught, blocked, or ignored.
  - The purpose of kill() extends beyond just terminating processes; it’s a general-purpose signal-sending interface. Signals can indicate various things, from urgent interrupts to process controls.
- void _exit(int status):
  - The _exit() function causes the current process to terminate and doesn’t return.
  - The “status” argument is returned to the parent process as the exit status and can indicate success if it’s zero (or another value agreed upon by the program’s documentation) and an error if it’s non-zero. _exit() doesn’t run any functions registered with atexit() or perform any standard I/O cleanups.
  - It’s considered more abrupt compared to the standard exit() function, which performs various cleanups before terminating the program:
    - exit() flushes all standard I/O buffers (stdout, stderr, etc.).
    - exit() calls the functions registered with atexit() and, in C++, destructors for static/global objects.
  - You would want to use _exit() over exit() in a child after a fork() system call, because the child inherits copies of the parent’s stdio buffers and atexit() handlers. If the child calls exit(), these buffers get flushed, which can lead to duplicate output if the buffers were not empty, and the parent’s handlers run a second time. Both calls close the child’s file descriptors, but that does not close the parent’s copies.
- pid_t waitpid(pid_t pid, int *status, int options):
  - The waitpid() system call suspends execution of the calling process until a child specified by the pid argument has changed state. It’s used for synchronizing the state of parent and child processes.
    - You can only wait for your immediate children, not grandchildren or any other process.
  - The “pid” argument specifies the set of child processes for which to wait. For example, if pid is -1, the call waits for any child process.
  - The “status” argument is a pointer to an int where the child’s exit status is stored. It records how the child process ended its execution (normally, by a signal, etc.).
  - The “options” argument influences the behavior of waitpid(). For instance, WNOHANG makes the call non-blocking.
  - A child that has terminated but has not yet been waited for is a “zombie” process. A zombie process is one that has completed execution but still has an entry in the process table. This entry is kept around to hold the exit status of the process so that the parent can retrieve it later.
Neither kill() nor _exit() deallocates the process: its process table entry stays allocated after it terminates. The entry has to be kept because the exit status lives in that slot, and the parent needs to read it from there. The slot is freed only once the parent collects the status with waitpid().
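// Fork bomb: every process, parent and child alike, keeps forking,
// so the process tree grows exponentially until fork() fails. Do not run this.
while (fork() >= 0)
    continue;
execvp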
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

void printdate(void) {
    pid_t p = fork(); // Create a new process
    if (p < 0) {
        // Handle error: fork failed
        perror("fork failed");
        exit(EXIT_FAILURE);
    }
    if (p == 0) { // This block is executed by the child process
        char *args[] = {"date", NULL}; // argv for the new program: "date" and a NULL terminator
        execvp("/bin/date", args); // Replace the child's image; the name contains a slash, so PATH is not searched
        // If execvp returns, it must have failed
        perror("execvp failed");
        _exit(EXIT_FAILURE); // _exit: do not flush stdio buffers inherited from the parent
    }
    int status;
    if (waitpid(p, &status, 0) < 0) { // Parent process waits for the child to complete
        // Handle error: waitpid failed
        perror("waitpid failed");
        exit(EXIT_FAILURE);
    }
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) { // Child was killed by a signal or exited non-zero
        // errno is not meaningful here, so use fprintf rather than perror
        fprintf(stderr, "child process failed\n");
        exit(EXIT_FAILURE);
    }
}

int main(void) {
    printdate(); // Function call to print the current date
    return 0;
}
- execvp is a variant of the exec() family of functions that replaces the current process image with a new process image. This is used to run a new program within an existing process.
  - The “v” in execvp indicates that it accepts an array of char* (also known as a vector in C terminology), allowing the programmer to specify the new program’s command-line arguments.
  - The “p” means it uses the PATH environment variable to find the file to be executed if the specified file name does not contain a slash, thereby simplifying command execution.
- execvp does not create a new process; it merely replaces the current process’s memory space and execution context with those of the new program. The process ID remains the same, but the machine code, static variables, stack, and other execution resources are replaced.
  - One of the key things to remember about execvp and its sibling functions is that they do not return on success. If the function does return, an error occurred (e.g., the file was not found, or the current process lacks the necessary permissions to execute the target program), which is why there’s error-handling code immediately following execvp.
    - In the example above, execvp is used to replace the child process’s image (created by fork()) with the “date” command. This command prints the current date and time to the console, then exits, at which point the parent’s waitpid() call returns and the parent resumes execution.
- You use execvp when the number of command-line arguments is known only at run time, because the argument array can be built dynamically.
- You use execlp when the number of command-line arguments is fixed at compile time, because the arguments are written out one by one in the call itself, ending with a NULL pointer.
Forking
When a process is forked using the fork() system call, it creates a child process that is a nearly identical clone of the parent. However, there are distinctions:
- Registers: All of the registers are duplicated in the child process, except for %rax (on x86-64) or the equivalent return-value register, because it holds the return value of fork(), which is the new process ID in the parent and 0 in the child.
- File Descriptors: The child gets its own copy of the descriptor table, and each copied descriptor refers to the same open file as the parent’s, so the two processes share the file offset. Closing a descriptor in one process does not close it in the other.
- Memory: The child receives a copy of the parent’s entire address space, including the values of all variables at the moment of the fork. After the fork, the two address spaces are separate, so a write in one process is not visible in the other.
Process Isolation and Interaction
Although processes in an OS are designed to run in isolation to protect memory and prevent interference, they often need to interact for various functional purposes. Different systems also have different isolation levels:
- Partial Isolation Violation: There are mechanisms that intentionally violate this isolation principle to allow inter-process communication (IPC) and process management.
  - waitpid(): Allows a parent process to pause until a child process terminates. It can also retrieve the exit status of the child, indicating that the parent is not completely isolated from the child.
  - kill(): Enables one process to send a signal to another process or a group of processes, often to terminate, pause, or otherwise control them.
  - File Operations: Processes can read from or write to files, potentially allowing communication between processes through a shared file system.
- Shared Memory: Processes can also communicate with each other by establishing a shared memory segment:
  - shm* system calls: These functions (shmget, shmat, shmdt, shmctl) are used to create, access, detach, and control shared memory segments. Shared memory is a powerful feature but must be used cautiously. Since it allows different processes to access the same memory area, it could lead to race conditions or data inconsistency if one process modifies data while another is reading it.
Concurrency Concerns
- Shared Memory Risks: When using shared memory, processes may access and modify the same data concurrently, leading to potential issues like:
- Race Conditions: When multiple processes access and manipulate the same data concurrently, and the final result depends on the sequence of their execution.
- Data Inconsistency: Without proper synchronization, one process might be reading data while another is modifying it, leading to unpredictable results.
- Synchronization Solutions: To mitigate these risks, synchronization mechanisms like mutexes, semaphores, or condition variables can be employed to ensure that only one process at a time can access a particular section of shared memory.
- Covert channels are methods of transferring information between entities in a system that the system’s security policy does not allow to communicate.
- Covert Communication: The sender process could write a specific pattern of data to a disk file or manipulate the availability of a system resource. The receiver process, while not communicating directly with the sender, could monitor these changes (e.g., file changes, system load) and decode the information being sent.
Modeling OS Resources
- Handles and Descriptors (System Calls)
- Purpose: They serve as an abstract reference to a resource, be it a file, a window, or some I/O device, etc.
- Usage: Programs interact with these handles or descriptors to request the OS to perform operations on the actual resource.
- Benefit: This indirection adds a layer of protection and namespace isolation, preventing direct memory access or manipulation and potential corruption.
Within an ordinary program, we refer to objects through pointers; across the system call boundary, we refer to kernel resources through handles, which adds another layer of abstraction.