Implementing a system call in the Linux kernel

    1. Preface:
      After starting to read about and dig into the world of Linux kernel development, and going over the procedure of compiling and installing the kernel, I’m now going to implement a basic system call.
      I should mention in advance that this is mainly an educational exercise, so the system call I’m going to implement may not have any real practical use.
    2. Some logistics notes:
      1. I am compiling and using kernel version 4.18.4
      2. My OS is Ubuntu 18.04.1
      3. My machine architecture is x86_64 – as we shall see later on, this means that the system call I will implement is for this architecture.
    3. Background:
      System calls are a layer (or an API) between the system resources (the hardware) managed by the OS and the user-space programs. This layer has three main purposes:

      1. HW abstraction: It provides an abstract HW interface – the user-space application does not care about the underlying HW and how it does what it does (for example, how the file system implements operations like read/write).
      2. System security and stability: It allows the kernel, which is the “middleman” between user-space applications and the system resources, to stay in charge and to validate that each process (user-space application) that invokes a system call indeed has the privileges to invoke it.
      3. Enable virtualization of the system: If every application could perform the service that a system call provides on its own, the system would not be able to share its resources correctly between the different user-space applications, which is fundamental to mechanisms such as virtual memory and multitasking.
    4. System calls and the C library:
      An important connection to note here is the one between system calls and the C library. In practice, system calls are usually (but not always) invoked via C library functions, so in a way one can say that the C library functions are wrappers around the system calls.

      1. A nice figure that illustrates the “big picture” of (mainly) the kernel <–> system calls <–> C library <–> user-space application relationship is as follows:
        system_call_1
        As can be seen here, user-space applications can also invoke (call) system calls without going through the C library routines.
      2. A high-level flow of invoking a system call (via its C library wrapper) is depicted in the figure below:
        syscall_1.png
        NOTES:
        – The transitions marked by the red numbers in the figure deserve further discussion.
        – As mentioned earlier, the C library is one way (and usually the preferred way) to invoke a system call, but it is not the only way: it is possible to invoke a system call directly from the user-space application.
    5. Design considerations: In addition to the usual considerations that apply when implementing “normal” user-space functions, there are some further points to note when adding/implementing a (new) system call:
      1. Portability: Do not make assumptions about an architecture’s word size or endianness, for example.
      2. Input parameters: Like regular functions, system calls must carefully verify all their parameters; the only difference is which validity checks they perform. For example:
      – File I/O system calls must check whether the file descriptor is valid.
      – Process related system calls must check whether the provided PID is valid.
      3. Output parameters: Before following a pointer “back” to user-space, the system call must ensure that the pointer points to a region of memory in user-space and, in particular, in the invoking process’s address space.
      4. Permissions: The system call must verify that the calling process has valid permissions to do what it attempts to do.
      5. System call context: When executing a system call, the kernel runs in process context on behalf of the process that invoked it (i.e. current points to the task_struct of the invoking process).
      6. Preemption of the kernel: Code running in process context is preemptible, so, as in user-space, the current task may be preempted by another task while it is inside a system call, and the new task may then enter the same system call. In other words, code that was executing on behalf of process_1 may start executing on behalf of process_2 before process_1 has finished, so care must be taken to ensure that system calls are re-entrant.
    6. Implementation phase:
      1. System call folder: I chose to keep all the files required for the new system call in a dedicated folder of their own, named info. Note that this is not mandatory; I did so only for simplicity. This folder was added at the root folder of the “working” Linux kernel tree (linux-4.18.4).
      sys_1.png
      2. Updating the “main” Makefile: Because I chose to add the system call’s files in a new folder (info), I need to update the “main” Makefile of the entire Linux kernel project to tell it that an additional folder was added, so that during the build procedure it will ALSO look for files to build (compile) in that folder. This is achieved by adding the name of the folder at the end of the core-y += line within the main Makefile, as shown below:
      sys_2.png
      Note: info/ was added at the end of the list.
      3. System call declaration: Add to the info folder a header file called processInfo.h. It will hold the declaration of the system call function; in this case, add the line:

      asmlinkage long sys_getProcessInfo(void);
      

      Notes:
      – The name of the system call function MUST start with the sys_ prefix, i.e. sys_function_name.
      – The return type MUST be long.
      – The asmlinkage MUST be added to the system call’s signature.
      If any additional declarations of functions, structs, macros, etc. are required as part of the system call’s implementation, this header is the place for them.
      4. System call implementation: Add to the info folder also the respective source file, processInfo.c, that will hold the system call’s implementation:

      #include <linux/kernel.h>
      #include "processInfo.h"
      
      /* Minimal system call: writes a message to the kernel log and returns success. */
      asmlinkage long sys_getProcessInfo(void)
      {
      	printk(KERN_INFO "getProcessInfo - hey there\n");
      	return 0;
      }
      

      5. System call’s Makefile creation: Add a Makefile to the info folder for our system call implementation code. It should contain the following:

      obj-y := processInfo.o
      

      6. System call table update: As mentioned earlier, my machine is based on the x86_64 architecture, so we need to update the respective system call table for this architecture. It is located within the arch/x86/entry/syscalls folder and the file name is syscall_64.tbl:
      sys_3.png
      The new system call entry needs to be added at the end of the “already” defined system calls section – which is, as its name states – the section of the system calls relevant for the 64 bit version of the x86 architecture. Note that the system call’s number MUST be the number of the last system call in that table + 1. In this case it is 335 (the last system call in this kernel version is 334 for the rseq system call):
      sys_2.png
      IMPORTANT NOTE: Although the entry point (last column) of the other system calls starts with the __x64_sys prefix, DO NOT add this prefix to your system call’s entry. Use the name exactly as it appears in its declaration in the syscalls.h file. (That prefix is generated by the kernel’s SYSCALL_DEFINE macros, which are not used here; our function is defined directly under the name sys_getProcessInfo.)
      7. Updating the syscalls header file: The final place we need to update regarding the new system call is the syscalls.h file. This file contains the declarations of all the system calls of the Linux kernel. To be honest, I did not entirely understand why, but I saw in many other tutorials that the declaration of the new system call should (must?) be located at the very end of this header file, just before the #endif directive, see below:
      sys_3.png
      8. Compatibility issues (optional): In order to make this system call work on other architectures and configurations, some additional declarations (followed by some more implementation) would need to be added. Here, for the sake of simplicity, I have implemented this system call ONLY for the x86_64 platform.
      9. Building the kernel: As in the previous post, now we should build the kernel (for only the above modifications/additions no new/updated configuration is required), so you can perform step 8 here.
      10. Reboot the system: In order for the modifications to take effect.
      11. Verify the system call was added to the kernel: Type in the following command and you should see the output:
      sys_1.png
      –> This means that the new system call was added successfully.
      12. Testing the new system call: In order to invoke the system call we need to pass its number to the syscall function, see below:

      #include <iostream>
      #include <unistd.h>
      #include <sys/syscall.h>
      #include <errno.h>
       
      #define GET_PID_NR 39       // getpid on x86_64
      #define MY_SYS_CALL_NR 335  // the number we added to syscall_64.tbl
      
      using namespace std;
      
      int main(int argc, char** argv)
      {
      	cout << "main - start" << endl;
      
      	// Call an existing syscall as a sanity test
      	long pid = syscall(GET_PID_NR);
      	cout << "PID is:" << pid << endl;
      	 
      	// Call our syscall; syscall() returns -1 and sets errno on failure
      	long res = syscall(MY_SYS_CALL_NR);
      	if (res == -1)
      		cout << "syscall failed, errno is:" << errno << endl;
      	cout << "res is:" << res << endl;
      	cout << "main - end" << endl;
      	return 0;
      }
      

      Also, we can verify that the message it wrote to the kernel log indeed exists:
      sys_4.png

    7. Conclusion: It is worth mentioning that the topic of system calls can (and should) be explored much further; this was only a very simple introduction to it.

    8. Useful links:
      – Nice question & answer from SO
      – Official Linux kernel documentation: How to add system call

      The picture: Nazca lines, Nazca desert, Peru.
