When APIs Are Not Enough: The Clash Between Kernel Improvements and TCMalloc's Reliance on Undocumented Behavior

Last updated: 2026-05-03 19:45:24 · Programming

Introduction: Hyrum's Law in Action

Hyrum's Law holds that, given enough users, every observable behavior of a system will eventually be depended on by someone, whether or not that behavior was ever promised. The Linux kernel community is currently grappling with a vivid illustration of this principle. Recent enhancements to restartable sequences (rseq) in the 6.19 release aimed to address performance bottlenecks while preserving the documented API. Even so, Google's TCMalloc library, which relies on undocumented behavior, broke under the new kernel. The clash underscores the tension between kernel developers' commitment to avoiding regressions and the unintended dependencies of user-space libraries.

Restartable Sequences and the 6.19 Updates

Restartable sequences are a kernel feature that lets user-space code run short critical sections on per-CPU data without locks, atomic instructions, or system calls. If the thread is preempted, migrated to another CPU, or interrupted by a signal partway through a section, the kernel diverts execution to an abort handler so the section can be retried. In Linux 6.19, developers optimized rseq to reduce overhead in high-frequency scenarios, such as per-CPU caching. The changes were designed to be backward-compatible: the documented API remained unchanged, and all documented usage patterns were still supported. However, the internal implementation shifted in subtle ways.
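
The restart-on-interruption idea can be sketched with a toy model. The Python below is not the kernel's rseq interface: `ToyScheduler`, `Preempted`, and `percpu_increment` are hypothetical names invented for illustration, and the "preemption" is simulated deterministically. It only shows the shape of the pattern: read the current CPU, prepare the update without side effects, and commit with a single store, starting over if interrupted before the commit.

```python
# Toy model of a restartable sequence. This is NOT the kernel's rseq API;
# every name here is hypothetical and exists only for illustration.

class Preempted(Exception):
    """Raised by the simulated scheduler to interrupt a critical section."""


class ToyScheduler:
    """Pretends to be the kernel: tracks the current CPU and can preempt."""

    def __init__(self, preempt_on_attempts=()):
        self.cpu = 0
        self.attempt = 0
        self._preempt_on = set(preempt_on_attempts)

    def begin_attempt(self):
        self.attempt += 1
        return self.cpu  # snapshot of the CPU the section starts on

    def maybe_preempt(self):
        # Called just before the commit step. If this attempt is marked for
        # preemption, migrate the thread to the other CPU and abort.
        if self.attempt in self._preempt_on:
            self.cpu = (self.cpu + 1) % 2
            raise Preempted


def percpu_increment(counters, sched):
    """Increment the counter for the current CPU, retrying if interrupted."""
    while True:
        cpu = sched.begin_attempt()       # 1. read the current CPU
        new_value = counters[cpu] + 1     # 2. prepare the update, no side effects
        try:
            sched.maybe_preempt()         # an interruption can land here
        except Preempted:
            continue                      # abort path: restart from step 1
        counters[cpu] = new_value         # 3. commit with a single store
        return cpu


counters = [0, 0]
sched = ToyScheduler(preempt_on_attempts={1})
cpu_used = percpu_increment(counters, sched)
print(cpu_used, counters, sched.attempt)
```

In this run the first attempt is interrupted and the thread is "migrated" from CPU 0 to CPU 1, so the second attempt updates CPU 1's counter and CPU 0's counter is never touched. The key property is that nothing visible changes until the final store, which is what makes a restart safe.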

Documented API vs. Actual Behavior

The kernel's documentation specified only the expected use of rseq: registering a per-thread structure with the kernel through the rseq() system call and then following the documented rules for critical sections. But TCMalloc, Google's memory allocator, had come to depend on specific, unspecified timing and ordering of rseq operations. For instance, it assumed that certain kernel-side states would persist longer than guaranteed. When the 6.19 kernel altered those internal details, while staying within the documented contract, TCMalloc began to malfunction.
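
A generic sketch may help show how a change can honor the documented contract and still break a caller. The classes below are hypothetical and do not model the actual rseq fields or TCMalloc's code; they assume a provider whose only documented entry point is `current()`, plus a caller that reads an internal attribute directly.

```python
# Hypothetical illustration of a documented contract versus incidental
# behavior. None of these classes correspond to real kernel or TCMalloc code.

class ProviderV1:
    """Documented contract: call current() to get the up-to-date value."""

    def __init__(self):
        self._cached = 0          # internal detail, not part of the contract

    def on_event(self, value):
        self._cached = value      # v1 happens to refresh the cache eagerly

    def current(self):
        return self._cached


class ProviderV2:
    """Same documented contract, optimized internals."""

    def __init__(self):
        self._cached = 0
        self._pending = None

    def on_event(self, value):
        self._pending = value     # v2 defers the work until someone asks

    def current(self):
        if self._pending is not None:
            self._cached, self._pending = self._pending, None
        return self._cached


def compliant_reader(provider):
    return provider.current()     # uses only the documented call


def shortcut_reader(provider):
    return provider._cached       # depends on v1's eager refresh


results = {}
for cls in (ProviderV1, ProviderV2):
    provider = cls()
    provider.on_event(7)
    # Read via the shortcut first so the compliant call cannot mask the bug.
    shortcut = shortcut_reader(provider)
    results[cls.__name__] = (shortcut, compliant_reader(provider))
print(results)
```

Both versions return 7 through `current()`, so the documented contract holds. The shortcut reader sees 7 under the first version and a stale 0 under the second, even though nothing it was ever promised has changed.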

TCMalloc's Violation of the Documented API

Closer analysis revealed that TCMalloc was not just relying on undocumented behavior; it was actively bypassing the intended use of rseq. The library would manipulate the rseq control block in ways the kernel never anticipated, effectively monopolizing the feature and preventing other code from using it correctly. This meant that even if TCMalloc's own operations seemed fine, any other library or application in the same process attempting to use rseq could fail or behave unpredictably.

Consequences for the Ecosystem

The violation had broader repercussions. Because TCMalloc is widely deployed, particularly in server workloads, its improper use of rseq created a de facto standard that other software had to match or risk incompatibility. The 6.19 kernel change inadvertently broke this hidden contract, causing TCMalloc to crash or degrade performance. More critically, it highlighted a systemic issue: the kernel's no-regressions rule, which demands that a new release must not break existing user-space programs, now forced developers to accommodate TCMalloc's non-compliant behavior.

The Kernel Community's Response

Under the no-regressions policy, the kernel developers had to find a solution that restored TCMalloc's functionality without sacrificing the improvements for everyone else. This involved adding compatibility shims: extra logic that detects when a process is using rseq in the old, undocumented way and then emulates the earlier behavior for that process. These shims carry a small performance cost, but they preserve the broader ecosystem's stability.
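
The general shape of such a shim can be sketched as follows. This is an illustration under stated assumptions, not the kernel's actual mechanism: `ShimmedProvider`, the `legacy_client` flag, and the eager/deferred split are all hypothetical, and the real detection logic is not described in this article.

```python
# Hypothetical compatibility shim. The detection rule and all names are
# assumptions made for illustration, not the kernel's actual mechanism.

class ShimmedProvider:
    def __init__(self, legacy_client=False):
        self.legacy_client = legacy_client   # outcome of some detection step
        self._cached = 0
        self._pending = None
        self.eager_updates = 0               # counts the extra work the shim does

    def on_event(self, value):
        if self.legacy_client:
            # Slow path: emulate the old eager behavior for legacy clients.
            self._cached = value
            self.eager_updates += 1
        else:
            # Fast path: defer the update, as the optimized design intends.
            self._pending = value

    def current(self):
        # Documented entry point: apply any deferred update, then answer.
        if self._pending is not None:
            self._cached, self._pending = self._pending, None
        return self._cached


legacy = ShimmedProvider(legacy_client=True)
modern = ShimmedProvider(legacy_client=False)
for value in (1, 2, 3):
    legacy.on_event(value)
    modern.on_event(value)

legacy_view = legacy._cached          # legacy code peeks at internals and still works
modern_view = modern.current()        # compliant code uses the documented call
print(legacy_view, modern_view, legacy.eager_updates, modern.eager_updates)
```

Both clients end up seeing the latest value, but only the legacy client pays for three eager updates. Compliant clients keep the optimized path, which is the trade-off the shim approach is meant to achieve.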

Lessons Learned

This episode is a textbook example of Hyrum's Law. It demonstrates that even the most carefully maintained API can be undermined by dependencies on incidental implementation details. Developers are now discussing how to better document not just the API, but also the intended usage patterns, and how to detect violations early. For TCMalloc, fixes are being planned to bring it into alignment with the documented interface, but the kernel's accommodation serves as a pragmatic stopgap.

Conclusion: Navigating Unwritten Contracts

The clash between kernel improvements and TCMalloc's undocumented dependencies is far from unique. It echoes similar conflicts in other ecosystems, from web browsers to operating systems. The Linux kernel's approach, which prioritizes avoiding regressions while pushing for long-term compliance, offers a balanced path forward. However, it also serves as a warning: any observable behavior, no matter how obscure, can become a de facto dependency. For library authors, the lesson is to stick to the documented interface. For kernel developers, proactive communication about implementation changes can help prevent such surprises.