Compile-Time Evaluation as Language Design

Much production software pays for runtime overhead that is invisible to the programmer. A garbage collector pause. A JIT warmup phase. A scheduler context switch. An unpredictable allocation pattern. These are not inherent to the problems being solved. They are the cost of deferring decisions to runtime.

Sarif is a compiled systems programming language that defers less. It compiles directly to native machine code (or WebAssembly) with no VM, no JIT, no garbage collector, and no hidden runtime allocations. The compiler resolves as much as possible at compile time; what remains runs against a minimal C runtime that talks directly to the operating system. The result: programs that start instantly, use predictable memory, and ship as small binaries.

This post explains how and why Sarif was designed this way. Whether you are evaluating Sarif for a project, curious about alternative systems language designs, or looking for insight into how languages can eliminate runtime overhead, the sections below walk through each decision.

How Lowering Becomes Evaluation

Traditional compilers have an optimization phase that runs after IR construction. Sarif treats lowering itself as evaluation.

When the compiler lowers HIR to MIR, it runs pure functions, evaluates constants, and propagates known values. The lowering is the evaluation. There is no separate pass that simplifies expressions after the IR is built. The boundary between compile-time and runtime is structural.
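To make "the lowering is the evaluation" concrete, here is a minimal sketch in Rust. The two tiny IRs, the `lower` function, and the wrapping-arithmetic choice are all assumptions made for illustration; Sarif's real HIR and MIR are richer and are not shown in this post. The point is only that folding happens while the MIR node is being built, not in a later pass.

```rust
// Hypothetical, simplified IRs for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Hir {
    Int(i64),
    Var(String),
    Add(Box<Hir>, Box<Hir>),
    Mul(Box<Hir>, Box<Hir>),
}

#[derive(Debug, Clone, PartialEq)]
enum Mir {
    Const(i64),
    Load(String),
    Add(Box<Mir>, Box<Mir>),
    Mul(Box<Mir>, Box<Mir>),
}

// Lower an HIR expression to MIR, evaluating as we go: when both operands
// lower to constants, emit the computed constant instead of an operation.
// Wrapping arithmetic is an assumption of this sketch.
fn lower(e: &Hir) -> Mir {
    match e {
        Hir::Int(n) => Mir::Const(*n),
        Hir::Var(name) => Mir::Load(name.clone()),
        Hir::Add(a, b) => match (lower(a), lower(b)) {
            (Mir::Const(x), Mir::Const(y)) => Mir::Const(x.wrapping_add(y)),
            (l, r) => Mir::Add(Box::new(l), Box::new(r)),
        },
        Hir::Mul(a, b) => match (lower(a), lower(b)) {
            (Mir::Const(x), Mir::Const(y)) => Mir::Const(x.wrapping_mul(y)),
            (l, r) => Mir::Mul(Box::new(l), Box::new(r)),
        },
    }
}
```

In this sketch, lowering `6 * 7 + 0` yields a single `Mir::Const`, while `x + 2 * 3` keeps the `Add` but folds its right operand.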

Concretely: loops over known ranges become unrolled sequences. Table lookups against known data become direct constants. Configuration parameters that are constant become compile-time branches. Type specialization that other languages do at runtime via generics or dynamic dispatch happens at compile time via monomorphization.
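As a rough analogy for "table lookups against known data become direct constants", the Rust sketch below builds a lookup table in a `const fn`. This shows Rust's constant evaluation, not Sarif's mechanism or syntax, and the names `build_squares`, `SQUARES`, and `square_small` are invented for the example.

```rust
// Build a 16-entry table of squares. Because this is a `const fn` used in a
// `const` item, the loop runs inside the compiler, not at program start.
const fn build_squares() -> [u32; 16] {
    let mut table = [0u32; 16];
    let mut i = 0;
    while i < 16 {
        table[i] = (i * i) as u32;
        i += 1;
    }
    table
}

// The finished table is baked into the binary as data.
const SQUARES: [u32; 16] = build_squares();

// A lookup with a constant index is itself a compile-time constant.
const NINE_SQUARED: u32 = SQUARES[9];

// A lookup with a runtime index is a bounds check plus one load.
fn square_small(n: usize) -> Option<u32> {
    SQUARES.get(n).copied()
}
```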

The MIR is not just an internal representation. It is the normative specification of the language. Every backend, native and WebAssembly, must produce behavior identical to the MIR interpreter. The specification is executable and testable, and users can inspect what the compiler sees at each stage with commands such as sarifc check main.sarif --dump-ir=mir.

The Runtime as Shared Cost

The argument for a managed runtime is that it enables abstractions: automatic memory management, dynamic dispatch, thread scheduling. These are real conveniences. The problem is that the runtime is a shared resource that cannot be specialized. Every allocation goes through the same GC. Every virtual call goes through the same vtable dispatch. Every thread shares the same scheduler state. The runtime must be correct for all programs, so it cannot be optimal for any particular one.

Consider a language with a tracing GC. Allocating a small temporary object is nearly free in terms of code complexity. At runtime, that object must be tracked, its references traced, and in many collectors threads are eventually paused to reclaim it. For soft real-time workloads such as audio processing, game loops, and packet processing at line rate, even short pauses are a problem when their timing cannot be predicted.

Sarif’s arena model makes allocation explicit but nearly free. Allocation is a pointer bump. Deallocation costs nothing until the arena rewinds, and the rewind itself only resets a pointer to a saved position. There is no GC because there is nothing to trace. This is not a minor optimization. It is a different cost model.
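The cost model can be sketched with a minimal bump arena. This is an illustration in safe Rust, not Sarif's C runtime: the `Arena` type and its `alloc`, `mark`, and `rewind` methods are hypothetical names, offsets stand in for pointers, and a single fixed buffer stands in for the chunked arena.

```rust
// A minimal bump arena over one fixed buffer (hypothetical names).
struct Arena {
    buf: Vec<u8>,
    top: usize, // offset of the next free byte
}

impl Arena {
    fn with_capacity(bytes: usize) -> Self {
        Arena { buf: vec![0; bytes], top: 0 }
    }

    // Allocation: round `top` up to `align` (a power of two) and bump it.
    fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        debug_assert!(align.is_power_of_two());
        let start = self.top.checked_add(align - 1)? & !(align - 1);
        let end = start.checked_add(size)?;
        if end > self.buf.len() {
            return None; // out of space; a chunked arena would add a chunk here
        }
        self.top = end;
        Some(start)
    }

    // Save the current position.
    fn mark(&self) -> usize {
        self.top
    }

    // Deallocation: rewind to a saved position. Everything allocated after
    // the mark is released by this single assignment.
    fn rewind(&mut self, mark: usize) {
        assert!(mark <= self.top);
        self.top = mark;
    }
}
```

Note that `rewind` does no per-object work, which is why the number of objects freed does not affect its cost.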

Scoped Allocation with Effect Tracking

Sarif manages memory through explicit scoped allocation. Functions that allocate are marked with effects [alloc]. Within those functions, allocations come from a chunked arena. Lifetime is controlled structurally via with_arena { ... } blocks, which automatically manage pushing and popping allocation scopes upon entry and exit (explicit alloc_push() and alloc_pop() calls are deprecated).

Critically, allocations made within an [alloc] function can be returned to callers. A function can build a tree, a list, or any data structure and return it. The caller receives a valid pointer. The arena does not automatically reclaim memory when a function returns; it only rewinds when the enclosing with_arena block exits.

Warning: This design requires callers to respect scope boundaries. If a caller references a returned pointer outside the with_arena block in which it was created, subsequent allocations can reuse that memory and corrupt the returned data. MIR-level escape analysis, implemented with interprocedural fixed-point iteration, catches this class of bug: functions returning scalars extracted from allocations are safe; values directly referencing arena memory trigger warnings in Core and Total, and hard errors in RT.
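The scope rule and its hazard can be modeled in a few lines of Rust. This is a sketch under assumptions: `ScopedArena`, `with_scope`, and `build_pair` are invented names, handles are plain indices rather than pointers, and a closure stands in for Sarif's with_arena { ... } block.

```rust
// Hypothetical arena of i64 cells; a handle is an index into `cells`.
struct ScopedArena {
    cells: Vec<i64>,
}

impl ScopedArena {
    fn new() -> Self {
        ScopedArena { cells: Vec::new() }
    }

    fn alloc(&mut self, v: i64) -> usize {
        self.cells.push(v);
        self.cells.len() - 1
    }

    fn get(&self, handle: usize) -> Option<i64> {
        self.cells.get(handle).copied()
    }

    // Stand-in for a with_arena block: remember the position on entry,
    // run the body, and rewind on exit.
    fn with_scope<R>(&mut self, body: impl FnOnce(&mut ScopedArena) -> R) -> R {
        let mark = self.cells.len();
        let result = body(self);
        self.cells.truncate(mark);
        result
    }
}

// Allocates in the caller's scope and returns handles. They stay valid
// after this function returns, until the enclosing scope exits.
fn build_pair(arena: &mut ScopedArena) -> (usize, usize) {
    (arena.alloc(10), arena.alloc(20))
}
```

Returning a scalar computed from the pair out of `with_scope` is safe. Returning one of the handles is the bug the warning describes: after the scope exits, the next allocation reuses the same slot, so the stale handle silently refers to unrelated data.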

Text follows the same rule. Builders, concatenation, slicing, and fixed-precision float formatting allocate through the scoped arena. Argument text (arg_text()) uses process-lifetime malloc because argv is OS-provided memory that persists for the process lifetime. stdin_text remains heap-cached by design. Callers must continue to respect allocation scope boundaries.

The effect system tracks which functions can allocate, and the type checker enforces that callers of [alloc] functions declare the effect themselves. If a function uses text_builder_new, list_push, or other allocating builtins without declaring effects [alloc], the compiler rejects it.
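One way to picture the rule is as a check over function summaries. The Rust sketch below is an assumption about shape, not Sarif's type checker: `FnInfo` and `missing_alloc_effect` are hypothetical, and only the builtin names `list_push` and `text_builder_new` come from the text above.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical summary of one function for effect checking.
struct FnInfo {
    declares_alloc: bool,     // the function is marked effects [alloc]
    calls: Vec<&'static str>, // names of callees, including builtins
}

// Return, in sorted order, the functions that call something allocating
// without declaring the effect themselves.
fn missing_alloc_effect(
    fns: &HashMap<&'static str, FnInfo>,
    allocating_builtins: &HashSet<&'static str>,
) -> Vec<&'static str> {
    let mut errors = Vec::new();
    for (name, info) in fns {
        let calls_alloc = info.calls.iter().any(|callee| {
            allocating_builtins.contains(callee)
                || fns.get(callee).map_or(false, |f| f.declares_alloc)
        });
        if calls_alloc && !info.declares_alloc {
            errors.push(*name);
        }
    }
    errors.sort();
    errors
}
```

Because each function is checked against the declared effects of its callees, the check in this sketch is local: no iteration over the call graph is needed, and a missing declaration is reported at the function that omitted it.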

Memory Safety Without Borrow Checking

Memory safety is often framed as binary: either you have a GC or you have undefined behavior. Rust introduced a third option: the borrow checker proves memory safety at compile time with zero runtime cost.

That guarantee is paid for at compile time. Every reference requires lifetime verification. Every mutable borrow requires exclusion checking. For large Rust codebases, this accumulates, and in some projects lifetime and trait analysis can become a meaningful part of compile time.

Sarif takes a different trade-off. Compile time is fast because the compiler does not reason about reference lifetimes. Memory management relies on scoped arena discipline, not compile-time lifetime proof. Allocation happens only in functions marked [alloc], and lifetime is bounded by structured with_arena { ... } block scopes. When the block exits and the scope is automatically popped, everything allocated inside it is reclaimed at once.

MIR-level escape analysis catches cases where arena-allocated memory escapes its allocation scope. The analysis is monotonic (false → true only), converging in at most N passes for N functions. Functions returning only scalars extracted from allocations are safe; values directly referencing arena memory trigger diagnostics. RT enforces this as a hard error, while Core and Total emit warnings.

This is not a full Rust-style borrow checker. The analysis is simpler: each function body is analyzed in a single pass, and the results are propagated through the call graph until they reach a fixed point. It targets the dominant class of arena-use-after-free bugs without the complexity of lifetime-generic verification.
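The monotonic fixed-point propagation described above can be sketched as follows. The `FnSummary` shape and the function `escaping_results` are hypothetical simplifications: the real analysis runs on MIR, whereas here each function is reduced to two facts.

```rust
// Hypothetical per-function summary.
struct FnSummary {
    // The function returns a value directly referencing arena memory
    // that it allocated itself.
    returns_fresh_alloc: bool,
    // Indices of callees whose result this function returns unchanged
    // (pass-through wrappers).
    returns_call_to: Vec<usize>,
}

// For each function, decide whether its result may reference arena memory.
// Flags only ever flip from false to true, so every pass that changes
// anything makes progress and the loop must terminate.
fn escaping_results(fns: &[FnSummary]) -> Vec<bool> {
    let mut escapes: Vec<bool> = fns.iter().map(|f| f.returns_fresh_alloc).collect();
    loop {
        let mut changed = false;
        for (i, f) in fns.iter().enumerate() {
            if !escapes[i] && f.returns_call_to.iter().any(|&c| escapes[c]) {
                escapes[i] = true;
                changed = true;
            }
        }
        if !changed {
            return escapes;
        }
    }
}
```

A function that calls an allocator but returns only a scalar has an empty `returns_call_to` list in this model, so it is never flagged. A self-recursive function that never returns fresh arena memory also stays unflagged, because nothing ever seeds its flag.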

Enums and Exhaustive Pattern Matching

Sarif’s enums are algebraic data types, closed at compile time. Every variant is known. Every match is exhaustive.

When the compiler knows all possible variants, it generates direct control flow. Dense variant sets become jump tables. Sparse ones become balanced decision trees. There is no dynamic dispatch, no vtable lookup, no interface indirection.
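Rust's enums follow the same closed-world model, so they make a convenient stand-in for showing what exhaustive matching looks like. The `Shape` type and `area` function below are invented for the example and say nothing about Sarif's own syntax.

```rust
// A closed set of variants (hypothetical example type).
enum Shape {
    Circle { radius: f64 },
    Rect { w: f64, h: f64 },
    Unit,
}

// There is no wildcard arm. If a new variant is added to `Shape`, this
// function stops compiling until the new case is handled.
fn area(s: &Shape) -> f64 {
    match s {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { w, h } => w * h,
        Shape::Unit => 1.0,
    }
}
```

The match compiles to a branch on the variant tag. No function pointer is loaded and no vtable is consulted.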

You might object that dynamic dispatch is needed for polymorphic code. The response is that most polymorphism resolves at compile time via monomorphization. Functions that must be polymorphic at runtime use explicit enum tagging or effect handlers, both with known bounded cost.

Open polymorphism via interfaces is usually a workaround for the inability to monomorphize at compile time. When the compiler can see all call sites and all implementations, it specializes. When it cannot, explicit tagging makes the dynamic behavior visible and bounded.

Byte-First I/O

Sarif’s I/O model is byte-oriented. Raw data arrives as Bytes, not as decoded Unicode strings.

This matters for high-throughput workloads: genomics processing of 100 GB DNA files, network protocol parsing at line rate, and binary data formats where string allocation overhead is non-trivial.

Many languages' default I/O APIs assume text and decode eagerly. For data that is genuinely text, this is convenient. For binary protocols, genomic files, and network packets, it is pure overhead. Consider packet processing at 100 Gbps: every packet must be parsed, and decoding bytes into strings that will immediately be matched as bytes is wasted work.

Sarif provides bytes_slice for zero-copy views into memory buffers and text_* functions for when decoding is genuinely needed. Programs operating on binary data pay no decoding cost.
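The idea of operating on borrowed byte views can be illustrated with Rust slices. This is an analogy rather than Sarif's bytes_slice API: `gc_count` and `split_record` are hypothetical helpers, and the FASTA-like record layout (a `>` header line followed by sequence bytes) is assumed for the example.

```rust
// Count G and C bases directly on raw bytes; nothing is decoded to text.
fn gc_count(seq: &[u8]) -> usize {
    seq.iter()
        .filter(|&&b| matches!(b, b'G' | b'C' | b'g' | b'c'))
        .count()
}

// Split a record of the form ">header\nbody" into two borrowed sub-slices.
// No bytes are copied; both results point into `record`.
fn split_record(record: &[u8]) -> Option<(&[u8], &[u8])> {
    let (&first, rest) = record.split_first()?;
    if first != b'>' {
        return None;
    }
    let newline = rest.iter().position(|&b| b == b'\n')?;
    Some((&rest[..newline], &rest[newline + 1..]))
}
```

Only the header would ever need decoding, and only if the program wants to display it. The sequence body is scanned as bytes from start to finish.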

Concurrency Belongs in the Runtime Layer

Sarif has no async/await, no spawn(), no channels, no locks in the language syntax.

This is not naivety. It is the recognition that concurrency primitives in a language commit the language to a particular concurrency model. If the language has spawn, it needs a thread scheduler. If it has channels, it needs a runtime to manage them.

Sarif’s position is that concurrency should be managed by an external orchestrator or runtime system, not built into the language itself. Programs are pure deterministic kernels. An external orchestrator, written in Rust, C, or any systems language, schedules Sarif functions across cores or network nodes. From within Sarif, the computation is single-threaded and deterministic.

This separation has a practical benefit. It becomes possible to reason about performance at the function level. There are no hidden scheduling decisions, no non-deterministic thread interleavings, no race conditions that only appear under load.

Parallel scaling at the orchestration layer is future work. A DAG runtime can schedule pure Sarif functions as work items, parallelizing across cores or machines, without turning Sarif source into a threaded programming model. That runtime still needs explicit effect boundaries and deterministic scheduler rules; hidden ambient concurrency would undermine the predictability Sarif is designed to preserve.
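Since the text names Rust as one possible orchestrator language, here is a minimal sketch of that division of labor. Everything in it is hypothetical: `checksum` stands in for a pure Sarif kernel, and `run_parallel` is a toy orchestrator that uses one scoped thread per chunk rather than a real DAG scheduler.

```rust
use std::thread;

// A pure kernel: the output depends only on the input slice.
fn checksum(chunk: &[u8]) -> u64 {
    chunk
        .iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u64))
}

// The orchestrator: run the kernel on each chunk in its own thread and
// collect results in chunk order, so the output does not depend on how
// the threads happen to be scheduled.
fn run_parallel(data: &[u8], chunk_len: usize) -> Vec<u64> {
    assert!(chunk_len > 0);
    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || checksum(chunk)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("kernel panicked"))
            .collect::<Vec<u64>>()
    })
}
```

Because the kernel is pure and results are joined in a fixed order, the parallel run produces the same vector as mapping `checksum` over the chunks sequentially. The threading lives entirely in the orchestrator.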

The Normative MIR Oracle

Most compilers treat their intermediate representation as an internal detail: useful for optimization passes, but not the source of truth. Sarif inverts this. The MIR interpreter is the normative semantic oracle for the language.

Every maintained backend, native (Cranelift) and WebAssembly, must produce behavior identical to the MIR interpreter. If the native backend produces different output than the interpreter for the same input, the backend is wrong. This is not a convention; it is an architectural constraint that eliminates entire classes of backend bugs.

The practical benefit: you can debug your program using the MIR interpreter (sarifc run), then compile the same code to native (sarifc build), and trust that the behavior is identical. The interpreter is not a reference implementation; it is the specification.
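The oracle relationship lends itself to differential testing. The Rust sketch below is a toy model, not Sarif's test harness: `MirNode` is a made-up miniature IR, `interpret` plays the role of the MIR interpreter, and a small stack machine (`compile` plus `execute`) plays the role of a backend.

```rust
// Hypothetical miniature MIR: integer expressions over one input `x`.
enum MirNode {
    Const(i64),
    Input,
    Add(Box<MirNode>, Box<MirNode>),
    Mul(Box<MirNode>, Box<MirNode>),
}

// The oracle: a direct tree-walking interpreter.
fn interpret(e: &MirNode, x: i64) -> i64 {
    match e {
        MirNode::Const(n) => *n,
        MirNode::Input => x,
        MirNode::Add(a, b) => interpret(a, x).wrapping_add(interpret(b, x)),
        MirNode::Mul(a, b) => interpret(a, x).wrapping_mul(interpret(b, x)),
    }
}

// A stand-in backend: compile to stack-machine code, then execute it.
enum Op {
    Push(i64),
    PushInput,
    Add,
    Mul,
}

fn compile(e: &MirNode, out: &mut Vec<Op>) {
    match e {
        MirNode::Const(n) => out.push(Op::Push(*n)),
        MirNode::Input => out.push(Op::PushInput),
        MirNode::Add(a, b) => {
            compile(a, out);
            compile(b, out);
            out.push(Op::Add);
        }
        MirNode::Mul(a, b) => {
            compile(a, out);
            compile(b, out);
            out.push(Op::Mul);
        }
    }
}

fn execute(code: &[Op], x: i64) -> i64 {
    let mut stack = Vec::new();
    for op in code {
        match op {
            Op::Push(n) => stack.push(*n),
            Op::PushInput => stack.push(x),
            Op::Add | Op::Mul => {
                let r = stack.pop().expect("stack underflow");
                let l = stack.pop().expect("stack underflow");
                let v = if matches!(op, Op::Add) {
                    l.wrapping_add(r)
                } else {
                    l.wrapping_mul(r)
                };
                stack.push(v);
            }
        }
    }
    stack.pop().expect("empty program")
}

// Differential check: the backend must agree with the oracle on every
// input tried. A disagreement is, by definition, a backend bug.
fn backend_matches_oracle(e: &MirNode, inputs: &[i64]) -> bool {
    let mut code = Vec::new();
    compile(e, &mut code);
    inputs.iter().all(|&x| execute(&code, x) == interpret(e, x))
}
```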

Profiles: One Language, Different Strictness

Sarif provides profiles that restrict the same language rather than creating dialects. Core is the base language. Total is stricter, intended to remove partiality and unbounded execution. RT is stricter still, intended to bound resource use and preserve predictability.

These are not different language modes. They are enforced subsets. A program that compiles under the RT profile is valid Core Sarif; it just happens to satisfy stricter constraints.
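The subset relationship can be modeled as a strictness threshold per construct. In the Rust sketch below, the `Profile` ordering follows the text, and the rules are taken from statements elsewhere in this post (Total forbids while and non-constant repeat; RT turns escaping arena references into hard errors). The `Construct` list, `rejected_from`, and `accepts` are hypothetical, and the assumption that RT inherits every Total restriction is mine.

```rust
// Profiles in increasing order of strictness.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Profile {
    Core,
    Total,
    Rt,
}

// Hypothetical constructs a profile check might look at.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Construct {
    RepeatConst,   // repeat N with a compile-time constant N
    RepeatDynamic, // repeat with a count known only at runtime
    While,
    ArenaEscape,   // a value referencing arena memory leaves its scope
}

// The least strict profile that rejects the construct, if any.
fn rejected_from(c: Construct) -> Option<Profile> {
    match c {
        Construct::RepeatConst => None,
        Construct::RepeatDynamic | Construct::While => Some(Profile::Total),
        Construct::ArenaEscape => Some(Profile::Rt),
    }
}

// A program is accepted if the chosen profile is below the rejection
// threshold of every construct it uses.
fn accepts(profile: Profile, program: &[Construct]) -> bool {
    program.iter().all(|&c| match rejected_from(c) {
        None => true,
        Some(first_rejecting) => profile < first_rejecting,
    })
}
```

With thresholds, the subset property holds by construction: anything accepted under `Rt` is also accepted under `Total` and `Core`, which mirrors the claim that a valid RT program is valid Core Sarif.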

Build and Runtime Performance

Sarif leads the retained bnch decision profiles in the maintained report: balanced, speed, memory, build, and deploy. The benchmark suite evaluates Sarif against other systems languages across canonical workloads, and all benchmarks pass with no failures or mismatches. Full results are in the bnch report.

What matters is not the specific numbers but what they imply. Sarif leads the build profile by a wide margin. The compiler does not spend time on borrow checking or complex lifetime analysis. The language does not defer decisions to runtime that could be made at compile time. The current toolchain compiles retained Sarif programs quickly.

For large projects, build speed directly affects developer productivity. Every minute saved in build time is a minute saved on every iteration. Over thousands of iterations, this compounds.

Sarif also leads in speed because compile-time evaluation means less runtime work. No GC pauses, no JIT warmup, no scheduler overhead. Memory usage is predictable and bounded because the arena model makes allocation cost explicit and deallocation cost zero. Binary sizes are small because there is no managed runtime library to bundle, no GC to include, no standard library overhead. The minimal C runtime is pruned at link time so only used functions are included.

Current Stage 0 Status

Sarif is currently at Stage 0. The compiler is written in Rust and emits native binaries via Cranelift, but the formatter is now a self-hosted Sarif program. sarifc format delegates to the bootstrap Sarif interpreter by default, passing parity tests against the Rust formatter for all shipped inputs. This tests the bootstrap compiler against a serious, real workload.

Stage 0 continues moving checking and documentation authority into Sarif-hosted tooling before the compiler backend itself is self-hosted. Full compiler self-hosting remains a later milestone, but the same test applies: Sarif has to express its own serious tooling without weakening the maintained semantics.

The most important current milestones:

Bootstrap formatter: sarifc format runs the Sarif-hosted formatter by default and passes parity tests against the Rust formatter.
HIR→MIR lowering: control flow (if/while/repeat/match), data access (field/index/record/array), and record/array creation are lowered in the bootstrap compiler.
MIR constant folding: integer and float binary operations fold at compile time, with tests covering arithmetic, bitwise operators, and comparisons.
MIR-level escape analysis: interprocedural fixed-point analysis replaces the earlier conservative has_alloc heuristic. Each function’s result is analyzed through the call graph until stable, eliminating false positives from pass-through wrappers. RT enforces escaping arena references as hard errors.
MIR interpreter performance: call handling is iterative rather than recursively unbounded, and avoidable cloning in branch execution has been removed.
Runtime memory model: text arena ownership is finalized. arg_text uses process-lifetime malloc; builders, concatenation, slicing, and formatting allocate through the scoped arena.
Runtime verification: the C runtime is fuzzed with Clang libFuzzer and ASan across runtime opcodes, with no known memory safety violations in the maintained corpus.
Total profile: compile-time constant repeat N is statically terminating and allowed; non-constant repeats and while remain forbidden.
Link-time runtime pruning: all conditional compilation flags (RuntimeFeatures::detect and C runtime #ifndef blocks) have been removed, so the native link stage (-Wl,--gc-sections or equivalent) prunes unused runtime functions from generated binaries.
Binary size work: release flags suppress debug info and strip at link time, reducing minimal native binary size without sacrificing benchmark performance.
Semantic analyzer stability: ownership analysis deduplicates function definitions by name during fixpoint iteration, preventing oscillation.
Native build coverage: integration tests cover list_sort_text, link-time runtime pruning, scoped allocation, process arguments, stdin, stdout streaming, WebAssembly output, and package builds.
CI pipeline: GitHub Actions runs format, lint, and test checks on pushes and pull requests to main.
Rust pipeline fuzzing: the Rust fuzz target covers lexer through escape analysis and continues to run against a retained corpus with no known crashes.
Allocation scopes: bounded with_arena { ... } block scopes are fully supported and manage arena push/pop automatically. Explicit alloc_push/alloc_pop builtins are deprecated with compile-time warnings and will be removed in a future milestone.

Remaining Stage 0 work:

Bootstrap semantic analysis parity: sarifc check, doc, and format all default to self-hosted bootstrap paths. However, the bootstrap semantic analysis (type inference, ownership checking, borrow analysis on expressions) is still simplified compared to the Rust reference implementation. Full parity means the Sarif-hosted checker can validate the same programs the Rust checker does, with equivalent diagnostics.

The compiler demonstrates the properties the language optimizes for: fast compilation, predictable memory usage, small binaries. These are not accidental. They are the result of deliberate design decisions about what belongs in the language and what belongs in the runtime.

The Core Insight

The managed runtime is a cost that most programs pay whether they need it or not. Garbage collection, dynamic dispatch, and thread scheduling add overhead that is often invisible to the programmer but present at runtime.

Sarif’s bet is that this overhead is unnecessary for most systems programming problems. By pushing evaluation to compile time and providing a minimal runtime for what cannot be evaluated ahead of time, Sarif produces programs that start immediately, run predictably, and deploy as small binaries.

The evidence is in the benchmarks. The design is in the language. The implementation is in the compiler.

Getting Started

Sarif is open source. To try it yourself:

Clone the repository: git clone https://github.com/ninji-research/sarif.git
Build the compiler: cd sarif && cargo build --release
Write a Sarif file (see examples/ for sample projects; each contains a Sarif.toml manifest)
Compile and run: cargo run --release -p sarifc -- build examples/hello-package -o hello (or, with the built binary, target/release/sarifc build examples/hello-package -o hello), then execute ./hello.

The best way to understand Sarif’s compile-time evaluation philosophy is to write a small program and inspect the stages with --dump-ir=mir (MIR), --dump-ir=hir (HIR), or --dump-ir=semantic (ownership and type metadata). Sarif is early-stage and evolving, but the core compilation pipeline works today and produces working native binaries on Linux and macOS.
