NativeLink is a runtime abstraction layer that accelerates modern build systems. We designed it to help developers optimize their build infrastructure for efficiency and replace CI/CD systems by making deployment pipelines more like production environments. Additionally, it runs high-fidelity simulations under near real-world conditions. NativeLink achieves this through its core components: Content Addressable Storage (CAS), which stores data blobs retrievable via their content-derived hashes, and the Action Cache (AC), which stores action results keyed by command hashes. Clients, such as build and test systems like Bazel and Buck2, interact with NativeLink by invoking commands, uploading necessary data to CAS, and executing tasks based on DAG traversal. NativeLink’s architecture ensures precise, reproducible task execution, facilitating faster and more accurate validation of builds and designs.

NativeLink Overview Diagram

Let’s go over some major components in NativeLink’s architecture:

NativeLink Components

Content Addressable Storage (CAS)

The CAS is a deterministic blob store. Data is stored and retrieved based on its content rather than its location or name, using hash keys generated from the data itself. Because identical content always produces the same key, identical data is stored only once.
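To make the idea concrete, here is a minimal in-memory sketch of content-addressed storage. It is an illustration only, not NativeLink's implementation: the class name `ContentAddressableStore` is hypothetical, and SHA-256 is an assumed choice of hash function.

```python
import hashlib


class ContentAddressableStore:
    """Toy in-memory CAS: blobs are keyed by the SHA-256 of their bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content itself (SHA-256 is an assumption).
        digest = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same key, so it is stored only once.
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

    def __len__(self) -> int:
        return len(self._blobs)


cas = ContentAddressableStore()
first = cas.put(b"int main() { return 0; }")
second = cas.put(b"int main() { return 0; }")  # same bytes, same key
```

Uploading the same bytes twice yields the same digest and leaves a single stored blob, which is the deduplication property described above.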

Action Cache (AC)

The AC in NativeLink stores the results of previously executed commands using a unique hash generated from the command and its parameters as the key. When a command is executed, a hash is created based on the command's content, inputs, and environment variables. This hash is then used to store the command's result in the AC, linking the specific command to its result. When the same command is issued again, the system generates the hash and checks the AC for a matching entry. If one is found, the cached result is returned from the AC and the output files it references are fetched from the CAS, so the command does not need to run again.
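The keying scheme can be sketched in a few lines. This is a simplified illustration under stated assumptions: the helper `action_digest` is hypothetical, JSON stands in for whatever canonical serialization a real system uses, and SHA-256 is an assumed hash function.

```python
import hashlib
import json


def action_digest(command, input_digests, env):
    """Hash the command, its input digests, and its environment variables."""
    # JSON with sorted keys is a stand-in for a canonical serialization.
    payload = json.dumps(
        {"command": command, "inputs": sorted(input_digests), "env": env},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


action_cache = {}  # action digest -> cached result

key = action_digest(["gcc", "-c", "main.c"], ["abc123"], {"PATH": "/usr/bin"})
action_cache[key] = {"exit_code": 0, "output_digests": ["def456"]}

# The same command, inputs, and environment produce the same key (cache hit).
same = action_digest(["gcc", "-c", "main.c"], ["abc123"], {"PATH": "/usr/bin"})
# Changing an environment variable produces a different key (cache miss).
changed = action_digest(["gcc", "-c", "main.c"], ["abc123"], {"PATH": "/opt/bin"})
```

Because the environment is part of the key, two otherwise identical commands run under different environments never share a cache entry.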

Scheduler

The scheduler manages task execution. It ensures each task is processed in the correct order and keeps track of all pending, running, and completed tasks. It also balances workloads across multiple workers to optimize performance.
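As a rough sketch of these responsibilities, the toy scheduler below tracks pending, running, and completed tasks and hands each task to the least-loaded worker. The class `ToyScheduler`, the first-in-first-out ordering, and the least-loaded policy are all assumptions chosen for illustration; they do not describe NativeLink's actual scheduling policy.

```python
from collections import deque


class ToyScheduler:
    def __init__(self, worker_names):
        self.pending = deque()   # tasks waiting to run, oldest first
        self.running = {}        # task -> worker currently executing it
        self.completed = []      # finished tasks, in completion order
        self.load = {name: 0 for name in worker_names}

    def submit(self, task):
        self.pending.append(task)

    def dispatch(self):
        """Assign the oldest pending task to the least-loaded worker."""
        if not self.pending:
            return None
        task = self.pending.popleft()
        worker = min(self.load, key=self.load.get)
        self.load[worker] += 1
        self.running[task] = worker
        return task, worker

    def complete(self, task):
        worker = self.running.pop(task)
        self.load[worker] -= 1
        self.completed.append(task)


scheduler = ToyScheduler(["worker-a", "worker-b"])
for name in ["compile foo.c", "compile bar.c", "link app"]:
    scheduler.submit(name)
assignments = [scheduler.dispatch() for _ in range(3)]
scheduler.complete("compile foo.c")
```

A production scheduler would also account for task dependencies and worker capabilities; this sketch only shows the bookkeeping and load balancing.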

Worker

The workers execute the commands or build tasks assigned to them by the scheduler, such as compiling source code to machine code. Workers are provisioned with the current toolchains to compile and link code, creating build artifacts. The worker pool consists of multiple workers running on powerful remote machines, which give local development environments access to more compute.

NativeLink’s Architecture

1. Client Checks the AC to see if the build artifacts exist

a. Before executing the build tasks, the client calls the AC. This call includes a digest, which is a hash that uniquely identifies the action whose results the client is interested in. The AC checks its own store to see if it has an entry corresponding to the provided digest. This store contains the action results, which include references (digests) to the output files stored in the CAS.

b. If the action result is found, the AC returns the action result to the client. The action result contains the digests of the output files that can be fetched from the CAS. If the action result is not found (a cache miss), the AC communicates to the client that the action result is not available.
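The lookup in steps a and b can be sketched as follows. The dictionaries `cas` and `action_cache` are hypothetical in-memory stand-ins for the two stores, `fetch_cached_outputs` is an illustrative helper, and SHA-256 is an assumed hash function.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Hypothetical in-memory stand-ins for the two stores.
cas = {}           # blob digest -> bytes
action_cache = {}  # action digest -> {"output_digests": [...]}

# Seed the stores as if an earlier build had already run this action.
obj = b"object file bytes"
cas[sha256_hex(obj)] = obj
known_action = sha256_hex(b"gcc -c main.c")
action_cache[known_action] = {"output_digests": [sha256_hex(obj)]}


def fetch_cached_outputs(action_digest):
    """Return the output blobs on a cache hit, or None on a cache miss."""
    result = action_cache.get(action_digest)
    if result is None:
        return None  # miss: the client must request execution instead
    # Hit: the action result only holds digests; the bytes live in the CAS.
    return [cas[d] for d in result["output_digests"]]


hit = fetch_cached_outputs(known_action)
miss = fetch_cached_outputs(sha256_hex(b"gcc -c other.c"))
```

The action result carries only references, so the client makes a second hop to the CAS to obtain the actual output bytes.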

2. If the build artifacts do not exist, the request is routed through the CAS

The client uploads necessary input files to the CAS and requests execution of the action through the scheduler and workers.

a. The client sends an execute request to the scheduler. This request includes the action digest and all necessary information for the execution. The scheduler queues the action, assigns it to a worker, and manages its execution.

b. The worker executes the action and generates the outputs.

c. The worker uploads the output files to the CAS.

d. The worker prepares an ActionResult object that contains metadata about the execution, such as the output files' digests.

e. The worker then updates the AC with the new action result, associating the action digest with the output digests.

f. The worker tells the scheduler that the task is completed.

g. The scheduler informs the client about the completion of the task and provides references to the output files.
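The cache-miss path in steps a through g can be simulated end to end with a small sketch. Everything here is a simplification under stated assumptions: the function names are hypothetical, the stores are plain dictionaries, the scheduler has a single worker, and the "build" is a stand-in that upper-cases its input.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


cas = {}           # blob digest -> bytes
action_cache = {}  # action digest -> action result


def upload(data: bytes) -> str:
    digest = sha256_hex(data)
    cas[digest] = data
    return digest


def worker_execute(action_digest, input_digests):
    """Run the action, upload outputs, and record the result in the AC."""
    inputs = [cas[d] for d in input_digests]
    output = b"".join(inputs).upper()      # stand-in for real compilation
    output_digest = upload(output)         # outputs go to the CAS
    result = {"exit_code": 0, "output_digests": [output_digest]}
    action_cache[action_digest] = result   # action digest -> action result
    return result                          # reported back to the scheduler


def scheduler_execute(action_digest, input_digests):
    """Single-worker scheduler: hand the action over and relay the result."""
    return worker_execute(action_digest, input_digests)


def client_build(command: str, source: bytes):
    input_digest = sha256_hex(source)
    action_digest = sha256_hex(command.encode() + input_digest.encode())
    cached = action_cache.get(action_digest)
    if cached is not None:
        return cached, "hit"
    upload(source)                         # miss: inputs go to the CAS first
    return scheduler_execute(action_digest, [input_digest]), "miss"


first_result, first_status = client_build("compile", b"hello")
second_result, second_status = client_build("compile", b"hello")
```

The first build misses, executes, and populates both stores; the second identical build is answered from the AC without any execution.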

That wraps up the NativeLink overview! In subsequent blog posts, we’ll dive deeper into the NativeLink components and take a closer look at the mechanisms involved in remote execution and remote caching. If you have questions or want to learn more about NativeLink, please join the community Slack channel.

A quick overview of NativeLink's architecture


© Trace Machina 2024