Unix Kernel Architecture: System Calls, Process States in Unix and Kernel Operations

System Calls

The kernel lies between the underlying hardware and the higher-level processes. There are therefore two interfaces: one between the hardware and the kernel, and the other between the kernel and the higher-level processes. System calls provide the latter interface.

A system call interacts with the kernel through a syscall vector. In this vector each system call has a fixed position. Note that each version of Unix may differ in the way it organizes the syscall vector. For example, fork() is a system call. It would use a format as shown below.

syscall fork-number

Here fork-number is the position of fork in the syscall vector. All system calls are executed in kernel mode. Typically, Unix kernels execute the following seven steps on a system call:

1. Arguments (if present) for the system call are determined.

2. Arguments (if present) for the system call are pushed onto a stack.

3. The state of the calling process is saved in a user structure.

4. The process switches to kernel mode.

5. The syscall vector is used as an interface to the kernel routine.

6. The kernel initiates the service routine and a return value is obtained from the kernel service routine.

7. The return value is converted to a C value (usually an integer or a long integer).

The value is returned to the process which initiated the call. The system also logs the userid of the process that initiated the call.
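The role of the syscall vector in step 5 can be sketched as a table indexed by call number. This is only a toy model in user space: the numbers SYS_GETPID, SYS_GETPPID and SYS_GETCWD and the helper syscall() are invented for illustration, and a real kernel assigns its own numbers and switches to kernel mode before dispatching.

```python
import errno
import os

# Hypothetical call numbers; every Unix version assigns its own.
SYS_GETPID = 0
SYS_GETPPID = 1
SYS_GETCWD = 2

# The "vector": each call occupies a fixed position in the table.
SYSCALL_VECTOR = (os.getpid, os.getppid, os.getcwd)


def syscall(number, *args):
    """Look up the routine at position `number` and run it with `args`."""
    if not 0 <= number < len(SYSCALL_VECTOR):
        # An unknown number is reported as "function not implemented".
        raise OSError(errno.ENOSYS, os.strerror(errno.ENOSYS))
    service_routine = SYSCALL_VECTOR[number]
    return service_routine(*args)


if __name__ == "__main__":
    print("pid via vector:", syscall(SYS_GETPID))
```

Because callers only know the number, the routine behind a position can change between kernel versions without the callers being rebuilt, as long as the position stays fixed.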

14.2.1 An Example of a System Call

Let us trace the sequence when a system call to open a file occurs.

• The user process executes a syscall to open a file.

• The user process links to a C runtime library for open and sets up the needed parameters in registers.

• A software trap is executed now and the operation switches to kernel mode.

- The kernel looks up the syscall vector to call "open".

- The kernel tables are modified to open the file.

- Return to the point of call with exit status.

• Return to the user process with value and status.

• The user process may resume now with modified status on file or abort on error with exit status.
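Seen from the user process, the sequence above comes down to one library call that either yields a file descriptor or reports an error code. The sketch below uses Python's os.open, which wraps the open system call; the helper name open_status and the (-1, errno) convention are assumptions made for this example.

```python
import errno
import os
import tempfile


def open_status(path):
    """Try to open `path` read-only; return (fd, 0) or (-1, errno code)."""
    try:
        return os.open(path, os.O_RDONLY), 0
    except OSError as exc:
        # The kernel refused the request; the error code is the exit status.
        return -1, exc.errno


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        existing = os.path.join(tmp, "demo.txt")
        with open(existing, "w") as handle:
            handle.write("hello\n")

        fd, err = open_status(existing)
        print("existing file:", fd >= 0, err)
        os.close(fd)

        fd, err = open_status(os.path.join(tmp, "missing.txt"))
        print("missing file:", fd, errno.errorcode[err])
```

The caller can then resume with the descriptor or abort on the error code, as the last step of the trace describes.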

14.3 Process States in Unix

Unix has the following process state transitions:

• idle ----> runnable ----> running.

• running ----> sleeping (usually when a process seeks an event like I/O, it sleeps awaiting event completion).

• running ----> suspended (suspended on a signal).

• running ----> zombie (the process has terminated but has yet to return its exit code to its parent. In Unix every process reports its exit status to its parent.)

• sleeping ----> runnable

• suspended ----> runnable

Note that it is the sleep operation which gives the user process an illusion of synchronous operation. The Unix notion of being suspended on a signal gives a very efficient mechanism for a process to respond to awaited signals in inter-process communication.
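The transitions listed above can be written down as a small lookup table, which makes it easy to check whether a given life history of a process is legal. This is a minimal sketch that encodes only the transitions named in this section; the names TRANSITIONS, can_move and is_valid_history are invented for the example.

```python
# Allowed next states for each state, exactly as listed in this section.
TRANSITIONS = {
    "idle": {"runnable"},
    "runnable": {"running"},
    "running": {"sleeping", "suspended", "zombie"},
    "sleeping": {"runnable"},
    "suspended": {"runnable"},
    "zombie": set(),  # terminal: waits for the parent to collect the status
}


def can_move(src, dst):
    """Return True if a process may go directly from `src` to `dst`."""
    return dst in TRANSITIONS.get(src, set())


def is_valid_history(states):
    """Return True if every consecutive pair in `states` is allowed."""
    return all(can_move(a, b) for a, b in zip(states, states[1:]))


if __name__ == "__main__":
    history = ["idle", "runnable", "running", "sleeping",
               "runnable", "running", "zombie"]
    print(is_valid_history(history))
```

Note that a sleeping process cannot go straight back to running: it first becomes runnable and then has to be picked again by the scheduler.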

14.4 Kernel Operations

The Unix kernel is a main memory resident "process". It has an entry in the process table and has its own data area in the primary memory. The kernel, like any other process, can also use devices on the system and run processes. The kernel differs from a user process in three major ways.

1. The first major difference between the kernel process and other processes lies in the fact that the kernel also maintains the needed data structures on behalf of Unix. The kernel maintains most of these data structures in the main memory itself. The OS-based paging or segmentation cannot swap these data structures in or out of the main memory.

2. Another way the kernel differs from the user processes is that it can access the scheduler. The kernel also maintains a disk cache, basically a buffer, which is synchronized every so often (usually every 30 seconds) to maintain disk file consistency. During this period all the other processes except the kernel are suspended.

3. Finally, the kernel can also issue signals to kill any process (just as a parent process can send a signal to a child to abort it). Also, no other process can abort the kernel.
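The parent-to-child case mentioned in the third point can be tried from user space. The sketch below assumes a Unix system where os.fork is available; the helper name terminate_child is invented, and SIGKILL is chosen here only because it cannot be caught or ignored.

```python
import os
import signal
import time


def terminate_child():
    """Fork a child, kill it with a signal, and report how it ended."""
    pid = os.fork()
    if pid == 0:
        # Child: idle until the signal arrives, and never return to the caller.
        try:
            time.sleep(30)
        finally:
            os._exit(0)

    # Parent: ask the kernel to deliver SIGKILL to the child.
    os.kill(pid, signal.SIGKILL)
    # Collect the child's status so it does not remain a zombie.
    _, status = os.waitpid(pid, 0)
    return os.WIFSIGNALED(status), os.WTERMSIG(status)


if __name__ == "__main__":
    killed, signo = terminate_child()
    print("killed by signal:", killed, signo)
```

The status returned by os.waitpid is the report that every terminating process makes to its parent; until the parent collects it, the child stays in the zombie state.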

A fundamental data structure in main memory is the page table, which maps pages in the virtual address space to the main memory. Typically, a page table entry may have the following information.

1. The page mapping as a page frame number, i.e. which disk area it mirrors.

2. The date the page was created.

3. Page protection bit for read/write protections.

4. Bits to indicate if the page was modified following the last read.

5. The current page address validity (vis-a-vis the disk).

Usually the page table area information is stored in files such as immu.h or vmmac.h. Processes operating in "user mode" cannot access the page table. At best they can obtain a copy of the page table. They cannot write into the page table. This can be done only when they are operating in kernel mode.
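To make the entry fields more concrete, the sketch below packs a frame number together with valid, protection and modified bits into one integer and uses it to translate an address. The bit layout, the 4096-byte page size and all names are assumptions for illustration and are not taken from immu.h or vmmac.h; the creation date field is left out for brevity.

```python
PAGE_SIZE = 4096      # assumed page size in bytes
PTE_VALID = 0x1       # entry currently maps a page in main memory
PTE_WRITE = 0x2       # protection bit: writes are allowed
PTE_DIRTY = 0x4       # page was modified since it was last read in
FRAME_SHIFT = 12      # frame number is stored above the flag bits


def make_pte(frame, valid=True, writable=False, dirty=False):
    """Pack a page frame number and its flag bits into one integer."""
    pte = frame << FRAME_SHIFT
    if valid:
        pte |= PTE_VALID
    if writable:
        pte |= PTE_WRITE
    if dirty:
        pte |= PTE_DIRTY
    return pte


def translate(page_table, vaddr, write=False):
    """Map a virtual address to a physical address, or raise on a fault."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    pte = page_table.get(page, 0)
    if not pte & PTE_VALID:
        raise LookupError("page fault: page %d is not resident" % page)
    if write and not pte & PTE_WRITE:
        raise PermissionError("protection fault: page %d is read-only" % page)
    return (pte >> FRAME_SHIFT) * PAGE_SIZE + offset


if __name__ == "__main__":
    table = {0: make_pte(5, writable=True), 1: make_pte(9)}
    print(hex(translate(table, 0x10)))
```

A dictionary stands in for the page table here; an actual kernel would use arrays that the memory management hardware can walk directly.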

User processes use the following areas:

• Code area: Contains the executable code.

• Data area: Contains the static data used by the process.

• Stack area: Usually contains temporary storages needed by the process.

• User area: Stores the housekeeping data.

• Page tables: Used for memory management and accessed by the kernel.

The memory is often divided into four quadrants as shown in Figure 14.2. The vertical line shows division between the user and the kernel space. The horizontal line shows the swappable and memory resident division. Some Unix versions hold a higher level data structure in the form of region table entries. A region table stores the following information.

• Pointers to i-nodes of files in the region.

• The type of region (the four kinds of files in Unix).

• Region size.

• Pointers to page tables that store the region.

• Bit indicating if the region is locked.

• The process numbers currently accessing the region.

Generally, the above information is stored in region.h files. The relevant flags that help manage the region are:

1. RT_UNUSED : Region not being used

2. RT_PRIVATE: Only private access permitted, i.e. non-shared region

3. RT_STEXT: Shared text region

4. RT_SHMEM: Shared memory region

Figure 14.2: User and kernel space.

The types of regions that can be attached to a process, with their definitions, are as follows:

1. PT_TEXT: Text region

2. PT_DATA: Data region

3. PT_STACK: Stack region

4. PT_SHMEM: Shared memory region

5. PT_DMM: Double mapped memory

6. PT_LIBTEXT: Shared library text region

7. PT_LIBDAT: Shared library data region

In a region-based system, lower-level functions that are needed to be able to use the regions are as follows:

1. *allocreg(): To allocate a region

2. freereg(): To free a region

3. *attachreg(): Attach region to the process

4. *detachreg(): Detach region from the process

5. *dupreg(): Duplicate region in a fork

6. growreg(): Increase the size of region

7. findreg(): Find from virtual address

8. chprot(): Change protection for the region

9. reginit(): Initialise the region table

The functions listed above are available to the kernel only. They are useful because the virtual memory space of a user process needs regions for text, data and stack. These functions cannot be called in user mode. (If they were available in user mode they would be system calls. They are not system calls.)
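How such region functions fit together can be illustrated with a toy region table. This is a user-space sketch only: the class and method names, the argument lists and the numeric flag values are invented for the example and merely echo allocreg(), attachreg(), detachreg(), dupreg() and freereg(), whose real signatures are not given here.

```python
import enum
from dataclasses import dataclass, field


class RegionType(enum.Enum):
    # Flag names follow the text; the numeric values are placeholders.
    RT_UNUSED = 0
    RT_PRIVATE = 1
    RT_STEXT = 2
    RT_SHMEM = 3


@dataclass
class Region:
    rtype: RegionType = RegionType.RT_UNUSED
    size: int = 0                              # region size in pages
    locked: bool = False
    users: set = field(default_factory=set)    # process numbers attached


class RegionTable:
    def __init__(self, slots):
        # Comparable to reginit(): start with every slot unused.
        self.regions = [Region() for _ in range(slots)]

    def alloc(self, rtype, size):
        """Claim the first unused slot (comparable to allocreg())."""
        for region in self.regions:
            if region.rtype is RegionType.RT_UNUSED:
                region.rtype, region.size = rtype, size
                return region
        raise MemoryError("region table is full")

    def free(self, region):
        """Return a slot to the unused pool (comparable to freereg())."""
        region.rtype, region.size = RegionType.RT_UNUSED, 0
        region.users.clear()

    def attach(self, region, pid):
        region.users.add(pid)

    def detach(self, region, pid):
        """Drop one user; free the region when nobody is left."""
        region.users.discard(pid)
        if not region.users:
            self.free(region)

    def dup(self, region, child_pid):
        """On fork: share shared regions, copy private ones."""
        if region.rtype is RegionType.RT_PRIVATE:
            copy = self.alloc(region.rtype, region.size)
            self.attach(copy, child_pid)
            return copy
        self.attach(region, child_pid)
        return region
```

In this model a forked child shares its parent's text region but receives its own copy of a private data region, which is the distinction that the RT_PRIVATE and RT_STEXT flags express.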
