|
| 1 | +--- |
| 2 | +path: '/blog/2025/09/16/kernel-basics-for-tetragon.' |
| 3 | +date: '2025-09-16T12:00:00.000Z' |
| 4 | +title: 'Linux Kernel Fundamentals for Writing Effective Tetragon Tracing Policies' |
| 5 | +ogImage: cover.png |
| 6 | +isFeatured: false |
| 7 | +ogSummary: 'Learn the basics of the Linux Kernel needed to effectively write good Tetragon tracing policies' |
| 8 | +categories: |
| 9 | + - Community |
| 10 | +tags: |
| 11 | + - Tetragon |
| 12 | +--- |
| 13 | + |
| 14 | +**_Author: Paul Arah, Isovalent@Cisco_** |
| 15 | + |
| 16 | + |
| 17 | + |
| 18 | +When you write Tetragon tracing policies, you’re not writing arbitrary sets of rules; you're programming against the kernel execution path itself. Every policy you create hooks directly into kernel functions, intercepts system calls, and examines kernel data structures. This power comes with responsibility. Without an understanding of how the Linux kernel works, you'll find yourself writing policies that are ineffective, overly broad, or worse, missing the exact events you are trying to find. |
| 19 | + |
| 20 | +This blog is meant to be a pointer guide; it won’t make you a kernel hacker overnight, but it aims to cover some core Linux knowledge essential for crafting effective Tetragon tracing policies. We’ll connect kernel fundamentals such as user vs. kernel space, system calls, process structures, namespaces, and more to practical tracing policy examples. A basic familiarity with Linux and a base-level understanding of what Tetragon is will be enough to follow along. |
| 21 | + |
| 22 | +## Kernel Space vs User Space |
| 23 | + |
| 24 | +One of the most fundamental concepts in Linux system programming is the distinction between kernel space and user space. |
| 25 | + |
| 26 | +**User space** is where applications like Bash, Nginx, and VS Code run. User-space programs run with restricted privileges: they cannot access arbitrary memory locations, execute privileged instructions, directly control hardware, or access kernel data structures. |
| 27 | + |
| 28 | +**Kernel space** is where kernel code runs with unrestricted access to all system resources. Here, code can access any memory location, execute privileged CPU instructions, directly control hardware devices, modify system-wide data structures, and more. |
| 29 | + |
| 30 | +## The System Call Interface |
| 31 | + |
| 32 | +User programs can’t manipulate kernel resources directly. Instead, they use system calls (syscalls) like `open`, `write`, or `execve`. System calls are the controlled entry points that allow user-space programs to request kernel services. When a user-space program wants to open a file, allocate memory, or create a network connection, it must go through the system call interface. For example, when a program calls `open("name.text")`, it isn’t the library function itself that touches the file. Behind the scenes, the call becomes a system call (`open`, or `openat` on most modern systems) that the kernel handles in a function such as `sys_open` or `sys_openat`. The kernel then processes the request, checks permissions, and returns a file descriptor that the program can use. |
| 33 | + |
| 34 | +Most programming languages offer some sort of standard library that provides a high-level abstraction over the system call interface; this way, application developers typically never have to access the system call interface directly. When you write tracing policies, you’re working from the kernel’s perspective, where syscalls are the events you actually observe. |
| 35 | + |
| 36 | + |
| 37 | + |
| 38 | +## Hook Points |
| 39 | + |
| 40 | +When you write a tracing policy, you have to tell Tetragon where to look in the system. These attachment points are called hook points. There are two complementary perspectives we can view this from: how the Linux kernel itself defines them, and how Tetragon exposes them for us. Understanding both perspectives helps us choose the right hook point for the security observability scenario. |
| 41 | + |
| 42 | +### The Linux Kernel’s Point of View |
| 43 | + |
| 44 | +In the kernel, a hook point is just a place in the execution flow where code can be instrumented. Different subsystems provide different mechanisms for instrumentation: |
| 45 | + |
| 46 | +- **Kprobes** provide dynamic probes on almost any kernel function. They let you intercept functions like `fd_install()` whenever they’re called. Kprobes are powerful, but are tightly coupled with your kernel version since kernel functions can change across versions. |
| 47 | +- **Tracepoints** are essentially built-in static markers inside the kernel. For example, `sched_process_exec` fires every time a process runs a new program. Tracepoints are more stable than kprobes and work across kernel versions. |
| 48 | +- **Uprobes** are like kprobes, but for user-space programs. For example, you can hook into the `readline()` function in Bash to see when someone types a command. |
| 49 | +- **BPF LSM** essentially allows instrumenting Linux Security Module (LSM) hooks at runtime. A good way to think of LSM hooks is as some kind of built-in checkpoint that asks, “Is this action allowed?” before letting a process do something sensitive. Security systems like SELinux or AppArmor use LSM hooks. Tetragon can also use LSM hooks for access control and observability. LSM hooks are reliable, less prone to race conditions like TOCTOU, and always represent real enforcement points. |
| 50 | + |
| 51 | +The blog post [Linux tracing systems & how they fit together](https://jvns.ca/blog/2017/07/05/linux-tracing-systems/) by Julia Evans is a good resource for learning about the tracing systems in the Linux kernel. |
| 52 | + |
| 53 | +### Tetragon’s Point of View |
| 54 | + |
| 55 | +Tetragon abstracts these raw attachment points into policy targets. **A hook point in Tetragon is simply a declaration of where you want to monitor and what arguments you want extracted.** |
| 56 | +With a kprobe spec, for example, you declare the kernel function you want to hook, and Tetragon takes care of attaching to it and extracting the arguments. |
| 57 | + |
| 58 | +```yaml |
| 59 | +spec: |
| 60 | +  kprobes: |
| 61 | +    - call: 'fd_install' |
| 62 | +      syscall: false |
| 63 | +``` |
| 64 | + |
| 65 | +With a tracepoint spec, Tetragon subscribes to the stable kernel tracepoints, decoding the arguments. |
| 66 | + |
| 67 | +```yaml |
| 68 | +spec: |
| 69 | +  tracepoints: |
| 70 | +    - subsystem: 'sched' |
| 71 | +      event: 'sched_process_exec' |
| 72 | +``` |
| 73 | + |
| 74 | +This same principle applies to every hook point in Tetragon. With LSM hooks, for example, you point at security checks (`file_open`, `bprm_check_security`) and Tetragon handles the attachment. |
| 75 | + |
| 76 | +From the Tetragon point of view, **a hook point is a declarative contract that specifies what to watch (functions, tracepoints, LSM hooks), which arguments to pull out (pid, file, etc.), and how to filter or act on the event.** |
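
As a rough sketch of that three-part contract, the policy below combines the `fd_install` hook used in this post with an argument filter and an action. The policy name and the `/tmp/tetragon-demo` path are hypothetical, and `Sigkill` is used only to illustrate the "act" part; treat this as an illustration under those assumptions rather than a recommended production policy.

```yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: 'hook-contract-sketch' # hypothetical name
spec:
  kprobes:
    # 1. What to watch: the kernel function fd_install
    - call: 'fd_install'
      syscall: false
      # 2. Which arguments to extract
      args:
        - index: 0
          type: 'int' # file descriptor number
        - index: 1
          type: 'file' # file being installed
      # 3. How to filter and act on the event
      selectors:
        - matchArgs:
            - index: 1
              operator: 'Prefix'
              values:
                - '/tmp/tetragon-demo' # hypothetical path
          matchActions:
            - action: Sigkill # terminate the process that triggered the match
```

Dropping the `matchActions` block turns the same policy into a purely observational one, which is usually the safer starting point while you tune the filter.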
| 77 | + |
| 78 | +### Writing Kprobe-based Tracing Policies for Monitoring System Calls |
| 79 | + |
| 80 | +Before rounding off this section, it is important to highlight one interesting abstraction Tetragon provides for kprobe-based policies that monitor system calls. Different CPU architectures implement system calls differently, and this can create portability challenges. On x86_64, system call handlers have names like `__x64_sys_write`, while on ARM64, they're named `__arm64_sys_write`. |
| 81 | +Tetragon provides an elegant abstraction here. Instead of writing architecture-specific policies like this: |
| 82 | + |
| 83 | +```yaml |
| 84 | +# Architecture-specific (don't do this) |
| 85 | +spec: |
| 86 | +  kprobes: |
| 87 | +    - call: '__x64_sys_write' # Only works on x86_64 |
| 88 | +      syscall: true |
| 89 | +``` |
| 90 | + |
| 91 | +You can write portable policies that work across different architectures. |
| 92 | + |
| 93 | +```yaml |
| 94 | +# Portable across architectures |
| 95 | +spec: |
| 96 | +  kprobes: |
| 97 | +    - call: 'sys_write' # Works on any architecture |
| 98 | +      syscall: true |
| 99 | +``` |
| 100 | + |
| 101 | +Tetragon automatically translates `sys_write` to the correct architecture-specific function name. This abstraction is crucial for policies that need to work across diverse environments. |
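
To make the portable form a bit more concrete, here is a sketch of the same `sys_write` hook with argument extraction added. The argument layout follows the `write(fd, buf, count)` signature. The `char_buf` type and the `sizeArgIndex` field are written as they appear in the Tetragon documentation at the time of writing, so verify them against the version you run. The policy name is hypothetical.

```yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: 'sys-write-sketch' # hypothetical name
spec:
  kprobes:
    - call: 'sys_write' # resolved to the architecture-specific symbol by Tetragon
      syscall: true
      args:
        - index: 0
          type: 'int' # fd: the file descriptor being written to
        - index: 1
          type: 'char_buf' # buf: the user buffer holding the data
          sizeArgIndex: 3 # buffer length comes from the third argument (count), counted from 1
        - index: 2
          type: 'size_t' # count: number of bytes requested
```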
| 102 | + |
| 103 | +### Choosing the Right Hook Point |
| 104 | + |
| 105 | +The best hook point for your policy depends on your specific security observability objective and how much stability you need across environments. If your goal prioritizes portability, tracepoints are often the safest choice. They are built into the kernel source and tend to remain stable across kernel versions. |
| 106 | +If you’re writing security-sensitive policies, LSM hooks are a better option. Because they operate on kernel-owned memory after user input has already been validated, they naturally avoid time-of-check to time-of-use (TOCTOU) pitfalls. |
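
A minimal sketch of an LSM-based policy, assuming a kernel with BPF LSM enabled: it watches the `file_open` hook and reports opens of two sensitive files. The `lsmhooks` layout mirrors the kprobe specs shown earlier; the policy name and the chosen paths are illustrative.

```yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: 'lsm-file-open-sketch' # hypothetical name
spec:
  lsmhooks:
    - hook: 'file_open'
      args:
        - index: 0
          type: 'file' # the struct file being opened
      selectors:
        - matchArgs:
            - index: 0
              operator: 'Equal'
              values:
                - '/etc/passwd'
                - '/etc/shadow'
```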
| 107 | + |
| 108 | +When you need fine-grained insight into kernel internals, kprobes give you the flexibility to attach almost anywhere. The tradeoff is that they are more tightly coupled to your kernel version, since function names and prototypes can change between releases. To use them effectively, you need to be comfortable browsing kernel symbols and understanding the calling conventions of the functions you hook. |
| 109 | +Finally, if your focus is on application-level behavior, uprobes let you trace functions inside user-space binaries and libraries. The prerequisite here is being able to explore how a particular program is laid out at the binary level. |
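
For the uprobe case, a small sketch of the Bash `readline` hook mentioned earlier might look like the following. It assumes Bash is installed at `/bin/bash` and that the binary exports a `readline` symbol; the policy name is hypothetical.

```yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: 'bash-readline-sketch' # hypothetical name
spec:
  uprobes:
    - path: '/bin/bash' # binary to instrument; adjust to where Bash lives on your system
      symbols:
        - 'readline' # user-space function to hook
```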
| 110 | + |
| 111 | +Whichever option you choose, the common theme is that you’re navigating the Linux kernel (or user-space program) at the level of functions, structures, and symbols. An understanding of these internals is what allows you to pick the right hook. |
| 112 | + |
| 113 | +## Process Management |
| 114 | + |
| 115 | +In the kernel's view, every running program is represented by a `task_struct` data structure. This massive structure (over 1,000 lines in recent kernels) contains everything the kernel needs to know about a process, including the process and thread group IDs (PID/TGID), memory management information, file descriptor table, security credentials, scheduling information, and signal handling state. |
| 116 | +When Tetragon examines process-related information, it is often looking at fields within the current process's `task_struct`. Process lifecycle monitoring is a core use case for Tetragon: by default, without any additional tracing policy deployed, Tetragon already observes the process lifecycle. The process section of the [Linux kernel teaching lab](https://linux-kernel-labs.github.io/refs/heads/master/lectures/processes.html) covers in detail how processes work in Linux. |
| 117 | + |
| 118 | +### Process Monitoring with Tetragon |
| 119 | + |
| 120 | +Consider a policy that tracks program execution using the `sched_process_exec` tracepoint: |
| 121 | + |
| 122 | +```yaml |
| 123 | +spec: |
| 124 | +  tracepoints: |
| 125 | +    - subsystem: 'sched' |
| 126 | +      event: 'sched_process_exec' |
| 127 | +      args: |
| 128 | +        - index: 0 |
| 129 | +          type: 'int' |
| 130 | +          resolve: 'pid' |
| 131 | +        - index: 2 |
| 132 | +          type: 'linux_binprm' |
| 133 | +``` |
| 134 | + |
| 135 | +This policy hooks into the scheduler subsystem when a new program is executed. The `resolve: "pid"` directive tells Tetragon to extract the PID from the first argument (a `task_struct` pointer), while the `linux_binprm` type captures information about the binary being executed. |
| 136 | + |
| 137 | +Understanding that process creation involves multiple kernel subsystems (the scheduler, memory manager, and file system) helps you choose the right hook points for your monitoring and enforcement objectives. |
| 138 | + |
| 139 | +### Process Hierarchies and Namespaces |
| 140 | + |
| 141 | +Container environments add complexity to process management through namespaces. Containers use chroot, namespaces, and cgroups to isolate processes. A good developer-centric resource that covers the Linux internals of how containers work is [Crafting Containers By Hand – What Are Containers?](https://btholt.github.io/complete-intro-to-containers/what-are-containers) |
| 142 | + |
| 143 | +A process might have different PIDs in different PID namespaces; PID 1 inside a container might be PID 12345 from the host perspective. Tetragon policies need to account for this. When you're filtering by PID, consider whether you want the namespace PID or the host PID: |
| 144 | + |
| 145 | +```yaml |
| 146 | +selectors: |
| 147 | +  - matchPIDs: |
| 148 | +      - operator: In |
| 149 | +        followForks: true |
| 150 | +        isNamespacePID: true # Use container-internal PID |
| 151 | +        values: |
| 152 | +          - 1 |
| 153 | +``` |
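
PID filtering is not the only namespace-aware option. As a hedged sketch, a `matchNamespaces` selector can restrict a hook to processes outside the host PID namespace, which roughly corresponds to containerized workloads. The `host_ns` value is assumed here to be Tetragon's shorthand for the host namespace; check the selector documentation for your version before relying on it.

```yaml
selectors:
  - matchNamespaces:
      - namespace: Pid
        operator: NotIn
        values:
          - 'host_ns' # assumed shorthand for the host's PID namespace
```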
| 154 | + |
| 155 | +## File System and File Descriptors |
| 156 | + |
| 157 | +Whenever a program works with files (for example, opening `/etc/passwd`, writing logs, or reading configs), it goes through the Linux file system layer. The kernel doesn’t let programs touch files directly. Instead, it hands them a file descriptor: a small number like 3, 4, or 5 that represents an open file. By convention, file descriptors 0, 1, and 2 are reserved for standard input, output, and error (`stdin`, `stdout`, `stderr`). |
| 158 | + |
| 159 | +Tetragon can hook into the kernel functions that create or use these file descriptors. For example, the `fd_install()` function runs whenever the kernel adds a new open file to a process. By attaching a kprobe here, you can see which files are being opened and by which process: |
| 160 | + |
| 161 | +```yaml |
| 162 | +spec: |
| 163 | +  kprobes: |
| 164 | +    - call: 'fd_install' |
| 165 | +      syscall: false |
| 166 | +      args: |
| 167 | +        - index: 0 |
| 168 | +          type: int # the file descriptor number |
| 169 | +        - index: 1 |
| 170 | +          type: file # the file being opened |
| 171 | +      selectors: |
| 172 | +        - matchArgs: |
| 173 | +            - index: 1 |
| 174 | +              operator: 'Prefix' |
| 175 | +              values: |
| 176 | +                - '/etc' |
| 177 | +``` |
| 178 | + |
| 179 | +The big idea here is that every file action in Linux is funneled through file descriptors, and by watching them, you can write focused policies that only report on the paths you care about. |
| 180 | + |
| 181 | +## Networking and Sockets |
| 182 | + |
| 183 | +In Linux, all network traffic flows through sockets. A socket works like a file descriptor, but instead of pointing to a file, it represents a network endpoint (an IP address and port). When an app calls `connect()` or `send()`, the kernel manages the socket. This means every HTTP request, DNS lookup, or database call is visible at the socket layer. |
| 184 | + |
| 185 | + |
| 186 | + |
| 187 | +With Tetragon, you can hook into functions like `__sys_connect` to catch new connections and filter them by port, address, or namespace. |
| 188 | + |
| 189 | +```yaml |
| 190 | +spec: |
| 191 | +  kprobes: |
| 192 | +    - call: '__sys_connect' |
| 193 | +      syscall: false # internal kernel helper, not a syscall entry point |
| 194 | +      args: |
| 195 | +        - index: 1 # arg 0 is the socket fd; arg 1 is the sockaddr |
| 196 | +          type: 'sockaddr' |
| 197 | +      selectors: |
| 198 | +        - matchArgs: |
| 199 | +            - index: 1 |
| 200 | +              operator: 'DPortPriv' |
| 201 | +``` |
| 202 | + |
| 203 | +Sockets are the kernel’s gateway for networking, and Tetragon gives you the hooks to watch or restrict how they’re used. |
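
As an alternative sketch, you can also hook deeper in the TCP stack. The policy below attaches to `tcp_connect`, extracts the kernel `sock` argument, and skips connections to the loopback address. The policy name is hypothetical, and the choice of `tcp_connect` means only outbound TCP connections are covered.

```yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: 'tcp-connect-sketch' # hypothetical name
spec:
  kprobes:
    - call: 'tcp_connect'
      syscall: false
      args:
        - index: 0
          type: 'sock' # kernel socket carrying source/destination address and port
      selectors:
        - matchArgs:
            - index: 0
              operator: 'NotDAddr'
              values:
                - '127.0.0.1' # ignore loopback connections
```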
| 204 | + |
| 205 | +## Conclusion: Building Your Kernel Knowledge |
| 206 | + |
| 207 | +Learning the Linux kernel fundamentals behind effective Tetragon tracing policies is an ongoing journey. The kernel is a complex, evolving system, and effective policy development requires understanding how its various subsystems interact. |
| 208 | + |
| 209 | +Start by focusing on the areas most relevant to your monitoring objectives: |
| 210 | + |
| 211 | +- For file system security: study VFS, file descriptors, and path resolution |
| 212 | +- For process monitoring: understand task structures, process lifecycle, and namespaces |
| 213 | +- For network security: learn socket structures, network stack flow, and connection tracking |
| 214 | +- For container security: understand namespaces, cgroups, and container runtime interactions |
| 215 | + |
| 216 | +The investment in kernel knowledge pays dividends in the form of more effective, efficient, and reliable tracing policies. Remember that kernel internals can change between versions, so staying current with kernel development and testing your policies across different kernel versions is essential for production deployments. |
| 217 | + |
| 218 | +## Resources and References: |
| 219 | + |
| 220 | +- [Linux Kernel Teaching](https://linux-kernel-labs.github.io/refs/heads/master/index.html) |
| 221 | +- [Linux tracing systems & how they fit together](https://jvns.ca/blog/2017/07/05/linux-tracing-systems/) |
| 222 | +- [Learning eBPF by Liz Rice](https://isovalent.com/books/learning-ebpf/) |
| 223 | +- [Crafting Containers By Hand – What Are Containers?](https://btholt.github.io/complete-intro-to-containers/what-are-containers) |
| 224 | +- [File Monitoring with eBPF and Tetragon](https://isovalent.com/blog/post/file-monitoring-with-ebpf-and-tetragon-part-1/) |
| 225 | +- [Tetragon Documentation](https://tetragon.io/docs/) |
| 226 | +- [The Linux Programming Interface](https://www.amazon.com/Linux-Programming-Interface-System-Handbook/dp/1593272200) |