Published by Addison-Wesley Professional (September 25, 2014) © 2013
Marshall McKusick | George Neville-Neil | Robert Watson
The most complete, authoritative technical guide to the FreeBSD kernel’s internal structure has now been extensively updated to cover all major improvements between Versions 5 and 11. Approximately one-third of this edition’s content is completely new, and another one-third has been extensively rewritten.
Three long-time FreeBSD project leaders begin with a concise overview of the FreeBSD kernel’s current design and implementation. Next, they cover the FreeBSD kernel from the system-call level down, from the interface to the kernel to the hardware. Explaining key design decisions, they detail the concepts, data structures, and algorithms used in implementing each significant system facility, including process management, security, virtual memory, the I/O system, filesystems, socket IPC, and networking.
This Second Edition
• Explains highly scalable and lightweight virtualization using FreeBSD jails, and virtual-machine acceleration with Xen and Virtio device paravirtualization
• Describes new security features such as Capsicum sandboxing and GELI cryptographic disk protection
• Fully covers NFSv4 and OpenSolaris ZFS support
• Introduces FreeBSD’s enhanced volume management and new journaled soft updates
• Explains DTrace’s fine-grained process debugging/profiling
• Reflects major improvements to networking, wireless, and USB support
Readers can use this guide as both a working reference and an in-depth study of a leading contemporary, portable, open source operating system. Technical and sales support professionals will discover both FreeBSD’s capabilities and its limitations. Applications developers will learn how to effectively and efficiently interface with it; system administrators will learn how to maintain, tune, and configure it; and systems programmers will learn how to extend, enhance, and interface with it.
Marshall Kirk McKusick writes, consults, and teaches classes on UNIX- and BSD-related subjects. While at the University of California, Berkeley, he implemented the 4.2BSD fast filesystem. He was a research computer scientist at the Berkeley Computer Systems Research Group (CSRG), overseeing development and release of 4.3BSD and 4.4BSD. He is a FreeBSD Foundation board member and a long-time FreeBSD committer. Twice president of the Usenix Association, he is also a member of ACM, IEEE, and AAAS.
George V. Neville-Neil hacks, writes, teaches, and consults on security, networking, and operating systems. A FreeBSD Foundation board member, he served on the FreeBSD Core Team for four years. Since 2004, he has written the “Kode Vicious” column for Queue and Communications of the ACM. He is vice chair of ACM’s Practitioner Board and a member of the Usenix Association, ACM, IEEE, and AAAS.
Robert N.M. Watson is a University Lecturer in systems, security, and architecture in the Security Research Group at the University of Cambridge Computer Laboratory. He supervises advanced research in computer architecture, compilers, program analysis, operating systems, networking, and security. A FreeBSD Foundation board member, he served on the Core Team for ten years and has been a committer for fifteen years. He is a member of the Usenix Association and ACM.
Preface xxi
About the Authors xxix
Part I: Overview 1
Chapter 1: History and Goals 3
1.1 History of the UNIX System 3
1.2 BSD and Other Systems 7
1.3 The Transition of BSD to Open Source 9
1.4 The FreeBSD Development Model 14
References 17
Chapter 2: Design Overview of FreeBSD 21
2.1 FreeBSD Facilities and the Kernel 21
2.2 Kernel Organization 23
2.3 Kernel Services 26
2.4 Process Management 26
2.5 Security 29
2.6 Memory Management 36
2.7 I/O System Overview 39
2.8 Devices 44
2.9 The Fast Filesystem 45
2.10 The Zettabyte Filesystem 49
2.11 The Network Filesystem 50
2.12 Interprocess Communication 50
2.13 Network-Layer Protocols 51
2.14 Transport-Layer Protocols 52
2.15 System Startup and Shutdown 52
Exercises 54
References 54
Chapter 3: Kernel Services 57
3.1 Kernel Organization 57
3.2 System Calls 62
3.3 Traps and Interrupts 64
3.4 Clock Interrupts 65
3.5 Memory-Management Services 69
3.6 Timing Services 73
3.7 Resource Services 75
3.8 Kernel Tracing Facilities 77
Exercises 84
References 85
Part II: Processes 87
Chapter 4: Process Management 89
4.1 Introduction to Process Management 89
4.2 Process State 92
4.3 Context Switching 99
4.4 Thread Scheduling 114
4.5 Process Creation 126
4.6 Process Termination 128
4.7 Signals 129
4.8 Process Groups and Sessions 136
4.9 Process Debugging 142
Exercises 144
References 146
Chapter 5: Security 147
5.1 Operating-System Security 148
5.2 Security Model 149
5.3 Process Credentials 151
5.4 Users and Groups 154
5.5 Privilege Model 157
5.6 Interprocess Access Control 159
5.7 Discretionary Access Control 161
5.8 Capsicum Capability Model 174
5.9 Jails 180
5.10 Mandatory Access-Control Framework 184
5.11 Security Event Auditing 200
5.12 Cryptographic Services 206
5.13 GELI Full-Disk Encryption 212
Exercises 217
References 217
Chapter 6: Memory Management 221
6.1 Terminology 221
6.2 Overview of the FreeBSD Virtual-Memory System 227
6.3 Kernel Memory Management 230
6.4 Per-Process Resources 244
6.5 Shared Memory 250
6.6 Creation of a New Process 258
6.7 Execution of a File 262
6.8 Process Manipulation of Its Address Space 263
6.9 Termination of a Process 266
6.10 The Pager Interface 267
6.11 Paging 276
6.12 Page Replacement 289
6.13 Portability 298
Exercises 308
References 310
Part III: I/O System 313
Chapter 7: I/O System Overview 315
7.1 Descriptor Management and Services 316
7.2 Local Interprocess Communication 333
7.3 The Virtual-Filesystem Interface 339
7.4 Filesystem-Independent Services 344
7.5 Stackable Filesystems 352
Exercises 358
References 359
Chapter 8: Devices 361
8.1 Device Overview 361
8.2 I/O Mapping from User to Device 367
8.3 Character Devices 370
8.4 Disk Devices 374
8.5 Network Devices 378
8.6 Terminal Handling 382
8.7 The GEOM Layer 391
8.8 The CAM Layer 399
8.9 Device Configuration 402
8.10 Device Virtualization 414
Exercises 428
References 429
Chapter 9: The Fast Filesystem 431
9.1 Hierarchical Filesystem Management 431
9.2 Structure of an Inode 433
9.3 Naming 443
9.4 Quotas 451
9.5 File Locking 454
9.6 Soft Updates 459
9.7 Filesystem Snapshots 480
9.8 Journaled Soft Updates 487
9.9 The Local Filestore 496
9.10 The Berkeley Fast Filesystem 501
Exercises 517
References 519
Chapter 10: The Zettabyte Filesystem 523
10.1 Introduction 523
10.2 ZFS Organization 527
10.3 ZFS Structure 532
10.4 ZFS Operation 535
10.5 ZFS Design Tradeoffs 547
Exercises 549
References 549
Chapter 11: The Network Filesystem 551
11.1 Overview 551
11.2 Structure and Operation 553
11.3 NFS Evolution 567
Exercises 586
References 587
Part IV: Interprocess Communication 591
Chapter 12: Interprocess Communication 593
12.1 Interprocess-Communication Model 593
12.2 Implementation Structure and Overview 599
12.3 Memory Management 601
12.4 IPC Data Structures 606
12.5 Connection Setup 612
12.6 Data Transfer 615
12.7 Socket Shutdown 620
12.8 Network-Communication Protocol Internal Structure 621
12.9 Socket-to-Protocol Interface 626
12.10 Protocol-to-Protocol Interface 631
12.11 Protocol-to-Network Interface 634
12.12 Buffering and Flow Control 643
12.13 Network Virtualization 644
Exercises 646
References 648
Chapter 13: Network-Layer Protocols 649
13.1 Internet Protocol Version 4 650
13.2 Internet Control Message Protocols (ICMP) 657
13.3 Internet Protocol Version 6 659
13.4 Internet Protocols Code Structure 670
13.5 Routing 675
13.6 Raw Sockets 686
13.7 Security 688
13.8 Packet-Processing Frameworks 700
Exercises 715
References 717
Chapter 14: Transport-Layer Protocols 721
14.1 Internet Ports and Associations 721
14.2 User Datagram Protocol (UDP) 723
14.3 Transmission Control Protocol (TCP) 725
14.4 TCP Algorithms 732
14.5 TCP Input Processing 741
14.6 TCP Output Processing 745
14.7 Stream Control Transmission Protocol (SCTP) 761
Exercises 768
References 770
Part V: System Operation 773
Chapter 15: System Startup and Shutdown 775
15.1 Firmware and BIOSes 776
15.2 Boot Loaders 777
15.3 Kernel Boot 782
15.4 User-Level Initialization 798
15.5 System Operation 800
Exercises 805
References 806
Glossary 807
Index 847