Rumoca Internals
This book explains how the Rumoca compiler works and how to contribute to it. It is written for humans first: the chapters tell the story of a model moving through the pipeline, then point you at the code and the specs that own each piece.
Rumoca is a Modelica compiler written in Rust. It compiles Modelica models
to differential-algebraic equations (DAE), simulates them with pluggable
solver backends, and generates code for symbolic, compiled, and packaged
targets. The same compiler runs natively, in the rumoca-lsp language
server, and in WebAssembly in the browser.
How to Use This Book
- New to the codebase? Read the How the Compiler Works part in order — it follows a model from source text to simulation results.
- Fixing a bug or adding a feature? Start with Getting Started, then read the chapter for the pipeline stage you are touching, then read the spec that owns it.
- Looking up a rule? Skip this book and go straight to
spec/.
This Book Explains; the Specs Decide
Normative architecture and contribution rules live in
spec/, and
AGENTS.md is
the routing index from task to spec. This book deliberately does not
restate spec rules — it gives you the mental model that makes the specs easy
to read, and links to them. When this book and a spec disagree, the spec
wins (and a docs fix is welcome).
Philosophy in One Paragraph
Rumoca’s core scope ends at DAE generation (SPEC_0031): the compiler builds a stable, deterministic, solver-agnostic DAE, and everything downstream — solvers, runtimes, viewers, bindings, code generation targets — is a replaceable extension layered on that contract. Most architectural decisions in the codebase trace back to defending that boundary.
Prerequisites
You should be comfortable with Rust, basic compiler concepts (parsing, ASTs, type checking), and Git/GitHub workflows. Familiarity with Modelica helps but is not required — the user guide and the chapters here explain the concepts where they matter.
Pipeline Overview
A model moves through the compiler in a fixed sequence of phases, each owned
by a rumoca-phase-* crate, each producing or refining one of four
intermediate representations:
Modelica source (.mo)
│ parse rumoca-phase-parse
▼
AST ───────────────────── rumoca-ir-ast
│ resolve rumoca-phase-resolve
│ typecheck rumoca-phase-typecheck
│ instantiate rumoca-phase-instantiate
│ flatten rumoca-phase-flatten
▼
Flat ──────────────────── rumoca-ir-flat
│ DAE lowering rumoca-phase-dae
▼
DAE ───────────────────── rumoca-ir-dae ◄── the stable contract
│ structural prep + rumoca-phase-structural,
│ solve lowering rumoca-phase-solve
▼
Solve ─────────────────── rumoca-ir-solve
│
├── simulate rumoca-sim, rumoca-solver-*
└── generate code rumoca-phase-codegen + targets
The stage contracts — what each IR may contain, what each phase may and may not do — are normative in SPEC_0007. The narrative below is the mental model.
The Story of a Compile
- Parse turns source text into an AST that faithfully represents the text — comments, spans, syntax structure — with no semantic knowledge.
- Resolve builds the scope tree and assigns every definition a stable DefId. From this point on, compiler identity is DefId-keyed, never string-keyed.
- Typecheck checks the resolved tree and evaluates structural parameters (the ones that determine array sizes and loop ranges).
- Instantiate applies modifications and builds the instance tree for the requested model: Tank tank1(area = 2.0) becomes a concrete instance with area bound.
- Flatten walks the instance tree into a single flat model: one list of variables with fully qualified names, one list of equations, connect sets expanded into equality and flow-sum equations. Arrays stay symbolic; der, pre, sample are still present as expressions.
- DAE lowering eliminates Modelica-specific operators and produces the MLS Appendix B canonical form: pure functions over the variable vector, with events, relations, and clocks as explicit metadata. reinit becomes guarded update equations; pre becomes explicit __pre__.* slots; assert/terminate become event actions.
- Structural preparation (on the way to simulation) matches equations to unknowns, sorts into block lower-triangular (BLT) order, tears algebraic loops, performs index reduction with dummy derivatives, and selects states.
- Solve lowering converts the prepared system into a register-machine representation — scalar programs and tensor program nodes — that execution backends consume directly.
You can watch every step on a real model:
rumoca compile Model.mo --emit ast-mo # or flat-mo, dae-mo, *-json
rumoca compile Model.mo --emit solve-json
rumoca compile Model.mo --inspect structure
rumoca compile Model.mo --target sympy -o /tmp/out -v # phase timing lines
Try It Here
This block runs the same pipeline in your browser; Show DAE displays the lowered system (stage 6) for whatever you type:
model Mixer "Two tanks exchanging fluid"
parameter Real k = 0.4 "Exchange coefficient";
Real h1(start = 1.0);
Real h2(start = 0.0);
Real q "Exchange flow";
equation
q = k * (h1 - h2);
der(h1) = -q;
der(h2) = q;
annotation(experiment(StopTime = 10.0));
end Mixer;
Where the Boundaries Bite
Three boundary rules explain most review feedback on pipeline changes:
- Codegen targets the lowest IR it needs — no lower. A formatter reads AST; FMI export and symbolic backends read DAE; numeric kernels read Solve. Reaching down for convenience couples a backend to representation details it should not know.
- The DAE is lean. Mass matrices, Jacobians, BLT orderings, tearing choices — anything that is a solver work product — belongs in structural analysis results or Solve artifacts, never stored inside the DAE.
- Phases fail loudly in their own stage. If source information (a span, a name structure, a type) is lost at a phase boundary, the fix is to preserve it at that boundary — not to reconstruct it downstream by parsing strings.
The Four IRs
Each IR is a serializable data structure crate (rumoca-ir-*) with a
schema version. This chapter says what each one is for; the binding
contracts live in
SPEC_0007.
AST (rumoca-ir-ast)
The parser’s output: concrete syntax with comments and spans. It represents
text, not semantics — no name resolution, no types. Formatters,
pretty-printers, and doc generators work here because they need the original
syntax. Every node carries a Span, and that provenance must survive any
AST merging.
Dump it: rumoca compile Model.mo --emit ast-mo (or ast-json).
Flat (rumoca-ir-flat)
The instantiated, modified, flattened model: variables and equations with fully qualified names, no unresolved references, no pending modification chains. Two things are deliberately not done yet:
- Arrays stay symbolic. Scalarization happens later, with shape metadata, only for backends that need it.
- Modelica operators survive. der(), pre(), sample(), initial() are still expression nodes; their semantic lowering is the DAE phase’s job.
Flat is the right level for flat-Modelica export and for structural transformations that preserve Modelica expression form.
Dump it: --emit flat-mo / flat-json.
DAE (rumoca-ir-dae)
The stable contract of the whole project: the MLS Appendix B canonical DAE.
Pure functions over the variable vector v := [p; t; ẋ; x; y; z; m; pre(z); pre(m)], partitioned by kind:
| ID | Function | Role |
|---|---|---|
| B.1a | fx(v, c) = 0 | Continuous DAE residual |
| B.1b | fz(v, c) = 0 | Discrete real update |
| B.1c | fm(v, c) = 0 | Discrete-valued update |
| B.1d | fc(relation(v)) | Event conditions |
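As a worked instance, take the Mixer model from Try It Here. Assuming the lowering keeps q as an algebraic variable and writes each equation as left-hand side minus right-hand side (both are assumptions about ordering and sign conventions, not something this chapter specifies), the continuous partition would look roughly like this:

```latex
\begin{aligned}
p &= (k), \qquad x = (h_1, h_2), \qquad y = (q), \\
f_x(v, c) &=
\begin{pmatrix}
q - k\,(h_1 - h_2) \\
\dot{h}_1 + q \\
\dot{h}_2 - q
\end{pmatrix} = 0 .
\end{aligned}
```

The model has no discrete variables, relations, or when-clauses, so f_z, f_m, and f_c would be empty here.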
By the time a model is DAE, no source temporal operator survives — not
in any partition. pre has become explicit __pre__.* parameter slots;
reinit has become guarded update equations; sample has become event and
clock metadata; assert/terminate have become event actions. A validation
pass (appendix_b_validation) enforces this positively rather than relying
on downstream code to cope.
The DAE is also deliberately lean: it represents Modelica semantics and source identity, never solver work products. If you are tempted to cache a Jacobian, a BLT ordering, or a scalarized variant inside the DAE — that belongs in structural analysis results or Solve artifacts.
Dump it: --emit dae-mo / dae-json. The dae-mo form is what the
Show DAE buttons in both guides render.
Solve (rumoca-ir-solve)
A register-machine representation of the DAE functions: ComputeBlock
graphs mixing ScalarProgramBlocks (flat register programs, one scalar
output each) with tensor program nodes (MatMul, LinSolve, …) that carry
explicit shape/layout metadata and a scalar fallback. Solve adds no new
mathematics — it changes format so execution backends (interpreter,
Cranelift JIT, MLIR, CUDA, generated C/Rust) can consume the system
directly, choosing scalar expansion or native tensor kernels.
Dump it: --emit solve-json (there is no Modelica rendering of Solve).
Schema Versions
Serialized DAE and Solve payloads carry a mandatory root schema_version;
deserializers reject unsupported versions. The policy is in
IR Schema Versioning.
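The rejection rule can be sketched as a check that runs before any other field of the payload is interpreted. This is a minimal illustration, not Rumoca's deserializer: the integer version type, the supported set, and the names SchemaError and check_schema_version are all assumptions made for the example.

```rust
use std::fmt;

/// Hypothetical set of versions this reader understands (an assumption).
pub const SUPPORTED_SCHEMA_VERSIONS: &[u32] = &[1, 2];

#[derive(Debug, PartialEq, Eq)]
pub enum SchemaError {
    /// The root object had no `schema_version` field at all.
    Missing,
    /// The field was present but names a version this reader cannot handle.
    Unsupported { found: u32 },
}

impl fmt::Display for SchemaError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SchemaError::Missing => write!(f, "payload has no root schema_version"),
            SchemaError::Unsupported { found } => {
                write!(f, "unsupported schema_version {found}")
            }
        }
    }
}

/// Validate the root version before touching the rest of the payload.
/// `found` is whatever the caller extracted from the root object.
pub fn check_schema_version(found: Option<u32>) -> Result<u32, SchemaError> {
    match found {
        None => Err(SchemaError::Missing),
        Some(v) if SUPPORTED_SCHEMA_VERSIONS.contains(&v) => Ok(v),
        Some(v) => Err(SchemaError::Unsupported { found: v }),
    }
}
```

Treating a missing version as its own error, rather than defaulting to the newest one, is what makes the field mandatory in practice.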
Front End: Parse to Flat
The front end takes source text to the Flat IR. Five phases, each a crate.
Parse (rumoca-phase-parse)
Produces the AST: syntax structure, comments, spans. The parser assigns each
file a stable source identity (Span.source derives from the source name,
not an insertion index), which is what lets diagnostics, the LSP, and the
formatter agree about locations across sessions.
The parser is also the foundation of the editor experience — it is fast and
error-tolerant enough to run on every keystroke in rumoca-lsp and the
browser playground.
Resolve (rumoca-phase-resolve)
Builds the scope tree and assigns every definition a DefId — a stable,
structural identity for classes, components, and variables. Name lookup
walks the scope tree
(SPEC_0002);
the DefId design and its invariants are
SPEC_0001.
The rule that shapes everything downstream: after resolution, identity is
DefId, never a rendered name string. Hashing or comparing flattened
name strings ("a.b.c") inside semantic code means structure was lost too
early; carry the DefId instead. Textual path parsing is allowed only at
true source/protocol/config/display boundaries.
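A small sketch of what identity-keyed code looks like in practice. The DefId, VarInfo, and VarTable types below are hypothetical stand-ins written for this example, not the definitions from rumoca-core or SPEC_0001; the point is only that lookups go through the identity and the rendered name appears at the display boundary alone.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the compiler's definition identity.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct DefId(pub u32);

#[derive(Debug, Clone, PartialEq)]
pub struct VarInfo {
    /// Rendered name, kept only for display.
    pub display_name: String,
    pub start: f64,
}

/// Semantic table keyed by identity; BTreeMap keeps iteration deterministic.
pub type VarTable = BTreeMap<DefId, VarInfo>;

/// Semantic update: addressed by DefId, never by a rendered name string.
/// Returns false when the identity is unknown.
pub fn set_start(table: &mut VarTable, id: DefId, start: f64) -> bool {
    match table.get_mut(&id) {
        Some(info) => {
            info.start = start;
            true
        }
        None => false,
    }
}

/// Display boundary: the only place a name string is handed out.
pub fn render(table: &VarTable, id: DefId) -> Option<&str> {
    table.get(&id).map(|info| info.display_name.as_str())
}
```

Nothing in set_start splits or compares "a.b.c" strings, so renaming how paths are rendered cannot change which variable is updated.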
Typecheck (rumoca-phase-typecheck)
Checks the resolved tree and evaluates structural parameters — the values
that determine array dimensions and for-equation ranges, which must be
known before instantiation can size anything. Type errors carry spans and
phase-local error codes.
Instantiate (rumoca-phase-instantiate)
Builds the instance tree for the requested model: applies modification
chains (Tank tank1(area = 2.0, h(start = 1.0))), handles extends,
redeclarations, and conditional components, producing an
InstanceOverlay/InstancedTree. Instantiation and flattening are
deliberately separate phases — the production path runs instanced
typechecking between them, and no cross-phase shortcuts are allowed.
Flatten (rumoca-phase-flatten)
Traverses the instance overlay into flat::Model: one variable list with
fully qualified names, one equation list, connect sets expanded into
potential-equality and flow-sum equations, for-equations unrolled. What
flattening does not do is equally important: arrays stay symbolic,
function bodies stay structured, and Modelica operators (der, pre,
sample) survive untouched for the DAE phase.
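To make the connect-set expansion concrete, here is a simplified sketch that turns one connection set into potential-equality and flow-sum equations. It is an illustration under assumptions: the function name is made up, equations are rendered as display strings purely for readability, and every connector is treated as an inside connector so all flows enter the sum with a positive sign.

```rust
/// Expand one connection set into equation strings (display form only).
/// `potentials` and `flows` hold the fully qualified names of the potential
/// and flow variables of the connected connectors. Hypothetical helper.
pub fn expand_connect_set(potentials: &[&str], flows: &[&str]) -> Vec<String> {
    let mut equations = Vec::new();
    // Potential variables: a chain of equalities p0 = p1, p1 = p2, ...
    for pair in potentials.windows(2) {
        equations.push(format!("{} = {}", pair[0], pair[1]));
    }
    // Flow variables: a single zero-sum equation over the whole set.
    if !flows.is_empty() {
        equations.push(format!("{} = 0", flows.join(" + ")));
    }
    equations
}
```

A set of n connectors therefore contributes n - 1 equality equations per potential variable and one sum equation per flow variable, which is a useful count to check against flat-mo output.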
Seeing It
rumoca compile Model.mo --emit ast-mo # what the parser saw
rumoca compile Model.mo --emit flat-mo # what flattening produced
Comparing flat-mo against your mental model of the hierarchy is the
fastest way to debug modification and connection handling.
DAE Lowering and Structural Analysis
DAE Lowering (rumoca-phase-dae)
DAE lowering eliminates the Modelica-specific surface and produces the MLS Appendix B canonical system (see The Four IRs). The transformations with the most moving parts:
- pre() elimination. Every pre(x) becomes an explicit __pre__.* parameter slot; the runtime writes those slots at event entry. This must hold in all partitions (f_x, f_z, f_m, f_c): pre exists only in AST and Flat.
- when/reinit lowering. When-clauses become event conditions plus guarded discrete update equations; reinit(x, e) becomes a guarded state update over current/pre slots.
- Relation extraction. Every relational expression that can generate an event gets a relation/condition variable; conditions.relations is the single owner of that surface. Synthetic numeric roots (from abs, sign) live separately in events.synthetic_root_conditions.
- sample/clocks. Scheduled events are explicit DAE metadata, not expressions the runtime has to discover.
- assert/terminate. Lowered to guarded events.event_actions with source spans, keeping the compute graphs pure.
A positive validation gate (appendix_b_validation) rejects any surviving
source temporal operator — the contract is enforced, not assumed.
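The pairing of a rewrite with a positive check can be sketched on a toy expression tree. The Expr type and both functions are hypothetical and far smaller than the real IR; in particular the slot is modeled as a plain variable named __pre__.<name>, which is an assumption about naming only.

```rust
/// Toy expression tree, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Var(String),
    Const(f64),
    /// Source-level `pre(x)`; must not survive DAE lowering.
    Pre(String),
    Add(Box<Expr>, Box<Expr>),
    Gt(Box<Expr>, Box<Expr>),
}

/// Rewrite every `pre(x)` into a reference to an explicit slot variable.
pub fn eliminate_pre(expr: Expr) -> Expr {
    match expr {
        Expr::Pre(name) => Expr::Var(format!("__pre__.{name}")),
        Expr::Add(a, b) => Expr::Add(
            Box::new(eliminate_pre(*a)),
            Box::new(eliminate_pre(*b)),
        ),
        Expr::Gt(a, b) => Expr::Gt(
            Box::new(eliminate_pre(*a)),
            Box::new(eliminate_pre(*b)),
        ),
        leaf @ (Expr::Var(_) | Expr::Const(_)) => leaf,
    }
}

/// Positive validation: report whether any `pre` node is still present.
pub fn contains_pre(expr: &Expr) -> bool {
    match expr {
        Expr::Pre(_) => true,
        Expr::Add(a, b) | Expr::Gt(a, b) => contains_pre(a) || contains_pre(b),
        Expr::Var(_) | Expr::Const(_) => false,
    }
}
```

Both matches are exhaustive with no wildcard arm, so adding a new expression variant forces a decision in the rewrite and in the check at compile time.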
Structural Analysis (rumoca-phase-structural)
Between the DAE and a runnable system sits structural preparation:
- Matching. Assign each equation to the unknown it will compute (maximum bipartite matching). Failure here is the structurally singular system diagnostic users see, which names unmatched equations and unknowns.
- Sorting (BLT). Order the matched system into block lower-triangular form: a sequence of scalar assignments and strongly connected components (simultaneous blocks).
- Tearing. Within coupled blocks, choose tearing variables so a small nonlinear core is iterated while the rest is evaluated explicitly.
- Index reduction. Higher-index DAEs are reduced with the Mattsson–Söderlind dummy-derivative method: constraint equations are differentiated and some derivative-defined states are demoted to dummy variables. State selection decides which candidates remain integrator states.
- Scalarization (when a backend requires it) expands symbolic arrays using shape metadata — never by parsing display strings.
The analysis products (matching, BLT blocks, tearing choices, state
selection reports) are returned as separate artifacts — they are inputs to
Solve lowering and to --inspect structure, and deliberately not stored
in the DAE.
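The matching step can be illustrated with the textbook augmenting-path algorithm (Kuhn's algorithm) over an equation-to-unknown incidence list. This is a sketch of the general technique, not the implementation in rumoca-phase-structural; the function names and the index-based representation are assumptions.

```rust
/// Try to find an unknown for equation `eq`, re-routing earlier choices
/// along an augmenting path if needed (Kuhn's algorithm).
fn try_assign(
    eq: usize,
    incidence: &[Vec<usize>],
    owner: &mut [Option<usize>],
    visited: &mut [bool],
) -> bool {
    for &unknown in &incidence[eq] {
        if visited[unknown] {
            continue;
        }
        visited[unknown] = true;
        let current = owner[unknown];
        let free = match current {
            None => true,
            // Ask the current owner to move to a different unknown.
            Some(other) => try_assign(other, incidence, owner, visited),
        };
        if free {
            owner[unknown] = Some(eq);
            return true;
        }
    }
    false
}

/// `incidence[e]` lists the unknowns appearing in equation `e`.
/// Ok: the unknown matched to each equation. Err: equations left unmatched.
pub fn match_equations(
    incidence: &[Vec<usize>],
    n_unknowns: usize,
) -> Result<Vec<usize>, Vec<usize>> {
    let mut owner: Vec<Option<usize>> = vec![None; n_unknowns];
    let mut unmatched = Vec::new();
    for eq in 0..incidence.len() {
        let mut visited = vec![false; n_unknowns];
        if !try_assign(eq, incidence, &mut owner, &mut visited) {
            unmatched.push(eq);
        }
    }
    if !unmatched.is_empty() {
        return Err(unmatched);
    }
    let mut assignment = vec![usize::MAX; incidence.len()];
    for (unknown, eq) in owner.iter().enumerate() {
        if let Some(eq) = eq {
            assignment[*eq] = unknown;
        }
    }
    Ok(assignment)
}
```

For the Mixer equations with unknowns numbered q = 0, der(h1) = 1, der(h2) = 2, the incidence [[0, 1], [0, 2], [0]] matches completely, while two equations that both mention only unknown 0 produce the Err case that a structurally-singular diagnostic would be built from.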
Watching It Work
rumoca compile Model.mo --inspect structure # matching, BLT, SCCs, tearing
rumoca sim Model.mo --inspect eval # values + derivatives at a point
rumoca sim Model.mo --inspect jacobian --at "x=1@0"
These are the same tools the user guide teaches for debugging models — as a compiler developer you will mostly use them to verify that a lowering or matching change did what you intended on a specific model.
Solve IR and Execution
Solve Lowering (rumoca-phase-solve)
Solve lowering converts the structurally prepared DAE into the Solve IR: a
register-machine format that execution backends consume directly. The MLS
B.1 functions become ComputeBlock graphs mixing two kinds of nodes:
| Node | Contents |
|---|---|
| ScalarProgramBlock | Flat register programs (Vec<LinearOp>), one scalar output per program |
| Tensor program nodes (ComputeNode::MatMul, LinSolve, …) | Tensor kernels with explicit shape/layout metadata and a scalar fallback |
Solve adds no mathematics; it changes format. Keeping tensor structure
explicit above the scalar layer is what lets a backend choose between
scalar expansion (embedded C) and native kernels (BLAS/faer, CUDA, MLIR
linalg) without re-deriving structure.
SolveProblem is the base lowered problem. Expensive or non-canonical
products (mass-matrix form, output projections) are separate artifacts
requested by the backends that need them.
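To give a feel for what a flat register program is, here is a toy interpreter for one scalar program. The Op enum is invented for this example and should not be read as the shape of the real LinearOp; the register layout and the input ordering [k, h1, h2] are likewise assumptions.

```rust
/// Hypothetical register-machine instruction; the real `LinearOp` may differ.
#[derive(Debug, Clone, Copy)]
pub enum Op {
    /// regs[dst] = inputs[index]
    Load { dst: usize, index: usize },
    /// regs[dst] = regs[a] - regs[b]
    Sub { dst: usize, a: usize, b: usize },
    /// regs[dst] = regs[a] * regs[b]
    Mul { dst: usize, a: usize, b: usize },
}

/// Run a flat program and return the value left in register `out`.
pub fn run(program: &[Op], n_regs: usize, out: usize, inputs: &[f64]) -> f64 {
    let mut regs = vec![0.0; n_regs];
    for op in program {
        match *op {
            Op::Load { dst, index } => regs[dst] = inputs[index],
            Op::Sub { dst, a, b } => regs[dst] = regs[a] - regs[b],
            Op::Mul { dst, a, b } => regs[dst] = regs[a] * regs[b],
        }
    }
    regs[out]
}

/// q = k * (h1 - h2), with inputs laid out as [k, h1, h2].
pub fn mixer_flow_program() -> Vec<Op> {
    vec![
        Op::Load { dst: 0, index: 0 },  // k
        Op::Load { dst: 1, index: 1 },  // h1
        Op::Load { dst: 2, index: 2 },  // h2
        Op::Sub { dst: 1, a: 1, b: 2 }, // h1 - h2
        Op::Mul { dst: 0, a: 0, b: 1 }, // k * (h1 - h2)
    ]
}
```

The program has no branches or names, only register indices, which is the property that lets an interpreter, a JIT, or a generated C function consume the same data without re-deriving anything from expressions.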
Execution Adapters
Execution adapters wrap toolchains and runtime APIs around generated or lowered code. They must not own semantics — no DAE lowering, no structural rewrites, no template policy:
| Crate | Role |
|---|---|
| rumoca-exec-cranelift | In-process JIT via Cranelift |
| rumoca-exec-mlir | MLIR-based compilation path |
| rumoca-exec-wasm | WASM execution backend |
The generated-code targets (rust-solve, c-solve, embedded-c,
cuda-c, cuda-nvrtc-solve-jit, fmi2/fmi3) consume Solve through the
codegen engine instead — see
Code Generation Engine.
Adding a New Backend
The pathway for a new execution backend (the same one a future WebGPU/WGSL backend would take):
- Decide the consumption model: a codegen target (templates rendering kernels, like cuda-c) or an execution adapter (an API wrapper, like rumoca-exec-cranelift), or both, like the NVRTC JIT.
- Consume Solve IR. If the backend needs tensor structure, use the tensor program nodes; every node guarantees a scalar fallback, so a backend can start scalar-only and specialize incrementally.
- Declare capabilities honestly in the target manifest: readiness level and per-feature support columns are what rumoca targets reports.
- Keep language/toolchain specifics in the target’s target.toml and templates, never in phase logic (SPEC_0029).
Diagnostics and Spans
Good diagnostics are a feature of every phase, not a layer bolted on at the end. The normative rules are SPEC_0008; this page is the working model.
Phase-Local Errors
Each phase defines its own error enum with phase-specific error codes
(EM001 duplicate class, ED013 unsupported algorithm lowering, …),
defined next to the code that emits them. There is no central error enum:
errors evolve with their phase, ownership is obvious, and code ranges stay
consistent per phase.
When you add a failure path, add it to the owning phase’s enum with a span and a code — do not widen a generic error or stringify early.
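A phase-local error type along these lines might look like the sketch below. Everything here is hypothetical: the Span fields, the enum name, and the second variant with its code ED999 are invented for illustration. Only the code ED013 and its description come from the text above, and placing it in this particular enum is an assumption.

```rust
use std::fmt;

/// Hypothetical span: a stable source identity plus a byte range.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Span {
    pub source: String,
    pub start: usize,
    pub end: usize,
}

/// Hypothetical error enum owned by one phase (here, DAE lowering).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DaeLoweringError {
    /// Code ED013 is quoted in the text; its home in this enum is assumed.
    UnsupportedAlgorithmLowering { span: Span },
    /// Invented variant and code, for illustration only.
    SurvivingTemporalOperator { operator: String, span: Span },
}

impl DaeLoweringError {
    /// Phase-specific error code.
    pub fn code(&self) -> &'static str {
        match self {
            Self::UnsupportedAlgorithmLowering { .. } => "ED013",
            Self::SurvivingTemporalOperator { .. } => "ED999",
        }
    }

    /// Every variant carries the span of the offending source.
    pub fn span(&self) -> &Span {
        match self {
            Self::UnsupportedAlgorithmLowering { span }
            | Self::SurvivingTemporalOperator { span, .. } => span,
        }
    }
}

impl fmt::Display for DaeLoweringError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let span = self.span();
        let detail = match self {
            Self::UnsupportedAlgorithmLowering { .. } => {
                "unsupported algorithm lowering".to_string()
            }
            Self::SurvivingTemporalOperator { operator, .. } => {
                format!("`{operator}` survived DAE lowering")
            }
        };
        write!(
            f,
            "[{}] {}:{}..{}: {}",
            self.code(), span.source, span.start, span.end, detail
        )
    }
}
```

Because each variant is a distinct type-level case, a test can assert the exact variant with matches! instead of checking that some error occurred.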
Spans and Source Identity
Span.source is a stable identity derived from the source name, assigned
by the parser. Spans must be carried from source data through every
diagnostic-producing IR. Fallback spans are acceptable only when no source
exists; if source data exists and a diagnostic lacks a span, the correct
fix is to preserve the span at the phase boundary where it was lost — not
to synthesize one downstream.
This is why diagnostics work identically across the CLI (rich terminal rendering), VS Code (Problems panel via LSP ranges), and the browser editors (Monaco markers): they all consume the same span-carrying diagnostics, only the presentation differs.
Failing Early and Loudly
The repository’s defensive-coding posture: catch errors in the phase that
owns the invariant, with a typed error — never weaken a check because a
downstream consumer can “cope”. Validation gates (such as the DAE’s
appendix_b_validation) are positive enforcement of contracts, and tests
assert the specific expected error, not just “some error”.
Tracing
Phases use the tracing crate for structured diagnostics during
development. Debugging knobs are documented CLI flags — Rumoca has a
zero-RUMOCA_*-environment-variable policy
(SPEC_0018),
so a debugging affordance worth keeping becomes a flag (like --inspect
and -v), not an env var.
Simulation Runtime
The simulation runtime turns a compiled model into trajectories. It is organized around shared solver interfaces: generic simulation policy lives in the shared runtime layer, and solver backends are thin adapters.
Layering
| Crate | Role |
|---|---|
| rumoca-sim | Simulation orchestration over the compiled model |
| rumoca-solver | Shared solver API, result types, report payloads |
| rumoca-solver-rk45 | Explicit Runge–Kutta-style backend (rk-like) |
| rumoca-solver-diffsol | Implicit backends via diffsol (bdf, esdirk34, trbdf2) |
| rumoca-input, rumoca-input-keyboard, rumoca-input-gamepad | Interactive input devices |
| rumoca-signal-frame | Signal payload types |
| rumoca-transport-udp, rumoca-transport-websocket | External coupling and viewer transport |
| rumoca-viz-web | Browser viewer assets |
Shared Responsibilities
These belong in the shared runtime/solver API layer, never duplicated in a backend:
- event schedules and root handling (zero-crossing location, event iteration, __pre__.* slot updates at event entry),
- input routing and zero-order-hold behavior,
- result collection and the report payload,
- termination (terminate) and assertion handling,
- pacing: as_fast_as_possible, realtime, and lockstep modes.
If two solvers need the same code, it moves into the shared layer. Solver backends implement the integration method and consume resolved input values — they must not know about keyboards, gamepads, or transports.
Events at Runtime
The DAE hands the runtime explicit metadata: relations (event condition
surfaces), scheduled events (sample), event actions (assert,
terminate), and the pre-slot bindings. The runtime’s job is mechanical:
detect or schedule the event, advance to the event instant, write pre
slots, apply discrete updates, and restart integration. No backend
rediscovers event structure from expressions.
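The "detect, then advance to the event instant" step can be illustrated with plain bisection on a relation's root function between two accepted time points. This is a generic sketch of zero-crossing location, not the root finder the shared runtime uses; the function name, the closure-based interface, and the iteration cap are assumptions.

```rust
/// Locate a sign change of `g` inside [t0, t1] by bisection.
/// Assumes t0 < t1 and tol > 0. Returns None when the endpoint values do
/// not bracket a root. The returned time lies at or just after the crossing.
pub fn locate_crossing<F: Fn(f64) -> f64>(
    g: F,
    mut t0: f64,
    mut t1: f64,
    tol: f64,
) -> Option<f64> {
    let mut g0 = g(t0);
    let g1 = g(t1);
    if g0 == 0.0 {
        return Some(t0);
    }
    if g0 * g1 > 0.0 {
        return None; // same sign at both ends: no crossing detected
    }
    // Iteration cap guards against tolerances below floating-point spacing.
    for _ in 0..200 {
        if t1 - t0 <= tol {
            break;
        }
        let mid = 0.5 * (t0 + t1);
        let gm = g(mid);
        if gm == 0.0 {
            return Some(mid);
        }
        if g0 * gm < 0.0 {
            t1 = mid; // crossing is in the left half
        } else {
            t0 = mid; // crossing is in the right half
            g0 = gm;
        }
    }
    Some(t1)
}
```

Returning the right end of the final bracket means the relation has already changed sign at the reported instant, which is the point where pre slots would be written and discrete updates applied.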
Failure Diagnostics
When a simulation produces a non-finite value, the runtime re-runs with NaN tracing to name the offending variables — the user-facing behavior documented in the handbook’s troubleshooting chapter. Keep that path working when touching solver internals; it is the difference between a useful bug report and “solver error”.
Code Generation Engine
rumoca-phase-codegen renders text from IRs through Jinja templates. A
target is a directory: a target.toml manifest plus templates. Built-in
targets are bundled into the binary; users can supply their own directory
or a raw template (see the user guide’s
Custom Targets
chapter for the user-facing workflow).
Ownership Rules
- The target owns its IR choice. A manifest declares which stage it consumes (ast/flat/dae/solve); individual templates may not silently switch IRs. Targets consume the lowest IR they need, and no lower.
- The engine renders; it does not decide semantics. Scalarization, structural analysis, and lowering happen in compiler phases; templates receive prepared data.
- No language special-casing in Rust. Phase code must not branch on “is this C/CUDA/Python”. Language-specific behavior is expressed in target.toml metadata and the templates themselves (SPEC_0029).
Capability Declarations
rumoca targets prints, for every built-in target, the IR it consumes,
its generation mode (symbolic / compiled / source-transform / packaged),
deployment class, readiness level (0 experimental … 2 validated), and
per-feature support columns (scalar, matmul, linsolve, sparse, dynamic
control flow, events, forward/reverse AD). These come from the target
manifests — keep them honest when extending a target; the table is the
user-facing contract.
Template Runtime Tests
Targets whose output is executable are covered by opt-in template-runtime
regression tests (cargo xtask verify template-runtimes), which actually
run the generated code. Adding a target with runnable output should come
with such a test.
Adding a Target
- Start from a worked example: examples/codegen/standalone_web/ is a complete custom bundle; the built-in target directories show the full manifest vocabulary.
- Declare the IR stage and capabilities in target.toml.
- Write templates against the serialized IR (--emit <stage>-json shows the exact shape).
- Wire a runtime regression test if the output executes.
WASM, LSP, and Editors
The same compiler serves three editor surfaces: the rumoca-lsp language
server (native, bundled with the VS Code extension), the WASM bindings
(browser playground and the live examples in these books), and the CLI’s
terminal diagnostics. They share rumoca-tool-lsp for language smarts, so
completion and diagnostics behave identically everywhere.
Crates and Directories
| Location | Role |
|---|---|
| rumoca-tool-lsp | Language-server logic: diagnostics, completion, hover, semantic tokens, definitions, code actions |
| rumoca-tool-fmt, rumoca-tool-lint | Formatter and linter (CLI + LSP + WASM) |
| rumoca-bind-wasm | wasm-bindgen API: compile, simulate, steppers, source-root management, LSP functions |
| rumoca-bind-python | Python bindings (pip install rumoca) |
| editors/vscode | VS Code extension (TypeScript) |
| editors/wasm | Browser playground (Monaco workbench over the WASM package) |
| docs/user-guide/live/ | The books’ live-example runner (mini Monaco editors over the same WASM package) |
The WASM API Surface
rumoca-bind-wasm exposes the pipeline to JavaScript:
- compile(source, model) → DAE JSON; render_target(...) renders a DAE-level target (the books’ Show DAE uses the dae-modelica target).
- simulate_model(source, model, t_end, dt, solver) → result payload with names/allData/nStates plus request/timing metadata. Passing t_end = 0, dt = 0, solver = "" defers to the model’s experiment annotation.
- WasmStepper: step-at-a-time simulation with set_input/get for interactive use.
- lsp_diagnostics / lsp_completion / lsp_hover / lsp_semantic_tokens …: thin wrappers over rumoca-tool-lsp used by the playground workers and the books’ editors.
- Source-root management (load_source_roots, parsed-document caches, bundled archives) so browser sessions can host package trees.
Build and Test
cargo xtask wasm build # wasm-pack build into pkg/<profile>/
cargo xtask wasm test # CI gate: build + browser smoke tests (Playwright)
cargo xtask vscode test # VS Code extension gate
The Pages deployment copies the WASM package, the playground, and both books into one artifact — see Docs and Pages.
One Language Definition
Monaco’s Modelica language definition (tokenizer, comments, brackets) lives
in editors/wasm/src/modules/modelica_language.js and is imported by both
the playground and the books’ live runner. Edit it there only — the books
load it dynamically from the deployed site layout.
Getting Started
Clone and Build
Install the Rust toolchain pinned by rust-toolchain.toml, then:
git clone https://github.com/CogniPilot/rumoca
cd rumoca
cargo build --workspace
Install the Developer CLI
The repository uses an xtask developer CLI (the
cargo-xtask convention) for every
verification, packaging, and maintenance workflow. Install the standalone
launcher and the repo hooks once:
cargo xtask repo cli install # `xtask` on PATH + shell completions
cargo xtask repo hooks install # git hooks
After that, xtask ... works from anywhere in the workspace (the
cargo xtask ... form always works too).
Fetch Modelica Dependencies
cargo xtask repo modelica-deps ensure
Downloads the pinned MSL and CMM versions into target/, which the
examples, tests, and committed VS Code settings expect.
Sanity Check
cargo xtask verify quick
This runs the same verification surface as GitHub CI except the slow
full-MSL parity gate. It expects the local prerequisites CI installs:
cargo-llvm-cov, Node/npm, and the wasm Rust target/tooling. For the
narrow loops you will actually iterate with, see
Testing and Quality Gates.
Editor Setup
Open the repository root in VS Code with the Rumoca Modelica extension.
For compiler development, enable rumoca.useSystemServer and put your
locally built rumoca-lsp on PATH so the editor exercises your changes.
Launch the extension from source in editors/vscode when working on the
extension itself.
Find Your Bearings
- Read Pipeline Overview if you have not yet.
- Read Where the Rules Live — five minutes that will save your first review round.
- Pick the chapter for the area you are changing; it links the owning spec.
Where the Rules Live: Specs
Rumoca keeps one source of truth per rule: the specs in
spec/.
AGENTS.md is a
thin routing index from “what you are touching” to “which spec to read” —
it contains no rules itself, and neither does this book.
The Map
| If you are touching… | Read |
|---|---|
| Compiler pipeline / any IR / any phase | SPEC_0007 |
| Crate dependencies, foundation types, re-exports | SPEC_0029 |
| Modelica semantics (anything MLS-affecting) | SPEC_0022 |
| Name lookup, scopes, DefId | SPEC_0001, SPEC_0002 |
| Diagnostics, spans, error codes, tracing | SPEC_0008 |
| Tool config (rumoca-tool-*, env-var policy) | SPEC_0018 |
| Function length, nesting, file size, determinism | SPEC_0021 |
| Opening a PR | SPEC_0025 |
| Scope/philosophy questions (“should this live in the compiler?”) | SPEC_0031 |
Start at
spec/README.md
for the full index with statuses;
SPEC_0000
explains how specs themselves work.
Working With Specs
- Active specs (ACCEPTED/REFERENCE) are mandatory; archived ones are history.
- If a change you want violates a spec, propose the spec change first. Architecture tests enforce important boundaries, and bypassing a test is never the move.
- If you cannot find the spec for what you are about to change, stop and ask: the rule either exists somewhere you have not looked, or it needs to be written before the code.
- Do not duplicate spec content into other documents (including this one); one source of truth per rule is what keeps the spec set trustworthy.
House Norms Worth Knowing Early
These recur in review and each traces to a spec:
- Prefer the correct long-term fix over a short-term hack; fix root causes in the owning phase rather than weakening checks downstream.
- No clippy allow attributes; address the lint.
- No behavior-changing RUMOCA_* environment variables; knobs are documented CLI flags or config keys.
- Debugging is tracing-based, not eprintln! behind env vars.
- Deterministic collections and complexity limits per SPEC_0021.
Testing and Quality Gates
The Verification Surface
cargo xtask verify is the umbrella for everything CI runs:
| Command | Scope |
|---|---|
| cargo xtask verify quick | Full CI surface except the slow full-MSL parity gate |
| cargo xtask verify full | Everything, including full-MSL parity |
| cargo xtask verify lint | Formatting + clippy |
| cargo xtask verify workspace | Workspace build/tests |
| cargo xtask verify docs | Documentation build (rustdoc + mdBook books) |
| cargo xtask verify msl-parity | MSL parity gate on its own |
| cargo xtask verify template-runtimes | Opt-in execution tests for generated target code |
Editor surfaces have their own gates:
cargo xtask vscode test # extension compile + tests
cargo xtask wasm test # wasm build + browser smoke tests
verify quick/full include the coverage, VS Code, and wasm gates, so
they need the same prerequisites CI installs: cargo-llvm-cov, Node/npm,
and the wasm Rust target/tooling.
During Development
Plain Cargo works for tight loops:
cargo test -p rumoca-phase-dae
cargo test -p rumoca-phase-structural some_test_name
When testing failure paths, assert the specific phase error you expect — the codebase’s expect-vs-error discipline exists so a passing test means the right thing failed for the right reason.
The MSL Quality Gate
The strongest regression net is the Modelica Standard Library gate: CI compiles and simulates a large MSL model population and compares against recorded baselines, blocking silent regressions in compile success, simulation success, and trace parity. Details, baseline policy, and promotion workflow: MSL Quality Gate.
For compiler changes that could affect MSL behavior, run the parity gate
(or at minimum verify quick plus a targeted MSL model) before opening the
PR — SPEC_0025
defines what evidence a PR needs.
Coverage
cargo xtask coverage report
CI enforces a coverage gate; locally you need cargo-llvm-cov.
Architecture Tests
Dependency boundaries from SPEC_0029 are enforced by tests. If one fails on your change, the answer is a design conversation (possibly a spec change) — not loosening the test.
Pull Requests
The PR process — required verification, metrics, MSL evidence, and done criteria — is normative in SPEC_0025. Read it before opening your first PR. The short practical version:
Before Opening
- cargo xtask verify quick passes locally (use verify full when your change can affect MSL behavior).
- The change follows the owning spec; if it required bending one, the spec change is part of the discussion, not an afterthought.
- Diagnostics added or changed carry spans and phase-local error codes.
- Tests assert specific expected errors, not just failure.
Commit Hygiene
- Commits are signed off (git commit -s); the DCO sign-off is required.
- Write commit messages about why, not just what.
Review Flow
PRs run the full CI matrix: lint, workspace tests, coverage gate, MSL quality gate, docs build, VS Code extension gate, and the WASM gate. The MSL gate posts a comparison comment on the PR so reviewers see compile/simulate/parity movement at a glance.
Upstream-first: when a bug traces to an earlier phase, fix it there rather than compensating downstream — reviewers will ask for the root-cause fix (see Diagnostics and Spans for the same principle applied to error handling).
Issues and Discussion
Bugs and feature requests: https://github.com/CogniPilot/rumoca/issues.
A minimal .mo reproduction (the
playground makes these easy to
verify) turns a vague report into a fixable one.
Crate Map
The workspace is deliberately granular — phases, IRs, tools, backends, and bindings are separate crates with enforced dependency edges. The normative boundary rules are SPEC_0029; this page is the orientation map.
Families
| Family | Crates | Role |
|---|---|---|
| Foundation | rumoca-core, rumoca-contracts, rumoca-codec, rumoca-codec-flatbuffers | Shared low-level types, contracts, serialization |
| IRs | rumoca-ir-ast, rumoca-ir-flat, rumoca-ir-dae, rumoca-ir-solve | The four stage data structures |
| Phases | rumoca-phase-parse, -resolve, -typecheck, -instantiate, -flatten, -dae, -structural, -solve, -codegen | One transformation each |
| Facade | rumoca-compile | Session/compilation API the tools use |
| Evaluators | rumoca-eval-ast, -eval-flat, -eval-dae, -eval-solve | Stage-appropriate evaluation |
| Runtime | rumoca-sim, rumoca-solver, rumoca-solver-rk45, rumoca-solver-diffsol, rumoca-worker | Simulation orchestration and solver backends |
| Execution adapters | rumoca-exec-cranelift, rumoca-exec-mlir, rumoca-exec-wasm | JIT/compiled execution over Solve |
| Interactive I/O | rumoca-input, rumoca-input-keyboard, rumoca-input-gamepad, rumoca-signal-frame, rumoca-transport-udp, rumoca-transport-websocket, rumoca-viz-web | Devices, signals, transports, viewer |
| Tools | rumoca-tool-fmt, rumoca-tool-lint, rumoca-tool-lsp | Formatter, linter, language server logic |
| Bindings | rumoca (CLI), rumoca-bind-wasm, rumoca-bind-python | User-facing entry points |
| Testing/dev | rumoca-test-msl, xtask | MSL gates, developer CLI |
The Rules That Matter Daily
- Foundation types live in low-level crates; phases depend forward through explicit IR contracts.
- Tools and bindings use the rumoca-compile facade instead of reaching into phase internals.
- Backend/runtime crates stay thin around the shared solver and execution APIs; shared policy lives in the shared layer.
- Target-language specifics live in target.toml + templates, never in phase logic.
Architecture tests enforce the important edges. If a dependency you want violates a spec, update the spec first — see Where the Rules Live.
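To make the idea of an enforced edge concrete, here is a rough sketch of one such check: tools and bindings must not depend directly on phase crates. The function name, the edge-list input, and the prefix matching are all assumptions made for illustration. The real architecture tests may be structured quite differently, and this simplified prefix check does not cover the rumoca CLI crate.

```rust
/// Returns every (crate, dependency) pair where a tool or binding crate
/// depends directly on a phase crate instead of going through the facade.
/// Hypothetical helper: crate families are recognized by name prefix only.
fn facade_violations<'a>(edges: &[(&'a str, &'a str)]) -> Vec<(&'a str, &'a str)> {
    edges
        .iter()
        .copied()
        .filter(|(krate, dep)| {
            let is_tool_or_binding =
                krate.starts_with("rumoca-tool-") || krate.starts_with("rumoca-bind-");
            is_tool_or_binding && dep.starts_with("rumoca-phase-")
        })
        .collect()
}
```

A test built on this would collect the workspace dependency edges and assert that the returned list is empty, so a forbidden edge fails with the offending pair in the message.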
Docs and Pages
Site Layout
GitHub Pages serves one artifact assembled by the WASM build job in CI:
https://cognipilot.github.io/rumoca/ ← playground (editors/wasm)
https://cognipilot.github.io/rumoca/user-guide/ ← mdBook, docs/user-guide
https://cognipilot.github.io/rumoca/dev-guide/ ← mdBook, docs/dev-guide
https://cognipilot.github.io/rumoca/pkg/<subdir>/ ← rumoca-bind-wasm package
https://cognipilot.github.io/rumoca/src/ ← playground JS modules
The deploy job publishes whatever the WASM build job staged into
gh-pages/; any new public static content must be copied there in that CI
step (.github/workflows/ci.yml, “Prepare GitHub Pages content”).
Books
Both books are mdBook projects (docs/user-guide, docs/dev-guide).
mdbook build docs/user-guide
mdbook build docs/dev-guide
cargo xtask verify docs # the docs CI gate
Live Examples
Fenced blocks annotated modelica,interactive become editable, runnable
mini editors backed by the WASM package. The runner is
docs/user-guide/live/rumoca-live.js (+ .css), wired into both books via
additional-js/additional-css in their book.tomls —
docs/dev-guide/live is a symlink to the user-guide copy so there is a
single source.
How it works:
- Editors are Monaco (same CDN as the playground), using the shared Modelica language definition editors/wasm/src/modules/modelica_language.js, with a plain-textarea fallback when the CDN is unreachable.
- Language services (completion, hover, diagnostics-as-markers) call the lsp_* functions of rumoca-bind-wasm directly on the main thread; book examples are small, so no worker is needed.
- Simulate calls simulate_model(source, model, 0, 0, ""), deferring to the model’s experiment annotation; results render as an inline SVG plot. Show DAE renders the dae-modelica target.
- Package discovery probes <site>/pkg/<subdir>/ (deployed layout) and <repo>/pkg/<subdir>/ (local repo-root serve), overridable with window.RUMOCA_LIVE_PKG_BASE. The WASM download happens lazily on first interaction.
- Visualizations: a viz-radial fence annotation adds a built-in animated cross-section for 1-D array states; a following js,rumoca-viz fence becomes an editable visualization script that receives { payload, times, names, data, container, api } (see the turkey and 2-D wave pages in the user guide for worked examples).
Authoring a live example:
```modelica,interactive
model M ... end M;
```
Keep embedded models validated — simulate them with the native CLI before
committing, and give them an experiment annotation so browser runs have
sensible defaults.
Local testing with the WASM parts active: build the package
(cargo xtask wasm build), build the books, then serve the repository
root (e.g. python3 -m http.server) and open
docs/user-guide/book/index.html from there. A manual browser smoke test
covering the live widgets lives at
editors/wasm/tests/book_live_smoke.mjs.
Updating the Pages Artifact
mdbook build runs inside the WASM CI job for both books; broken book
builds therefore block the Pages deployment rather than shipping. The
symlinked live/ assets are copied into each book’s output with hashed
filenames automatically.
Scenario Config and VS Code
Scenario and tool configuration are covered by
spec/SPEC_0018_TOOL_CONFIG.md.
Rumoca now prefers colocated rum.toml files for runnable scenarios. A rum.toml file
uses TOML content, lives next to the example it controls, and describes one
task: simulation, code generation, or another explicit tool action.
VS Code should run rum.toml scenarios directly. The play action belongs on the scenario,
not on a .mo file that requires guessing the model, source roots, solver, and
viewer.
Workspace Modelica paths are editor settings. Keep repo-committable dependency
paths in workspace .vscode/settings.json files when that is useful for
examples. Scenario-specific paths belong in the rum.toml scenario.
MSL Quality Gate
Rumoca’s main MSL baseline is the MSL 4.1.0 root-example set selected by
Modelica.*.Examples.*. Helper packages under Examples such as Utilities,
BaseClasses, Internal, and Interfaces are excluded.
Run the gate with:
cargo xtask verify msl-parity
The raw test command is:
cargo test --release --package rumoca-test-msl --features msl-full-test \
--test msl_tests balance_pipeline::balance_pipeline_core::test_msl_all -- --nocapture
The gate writes the current run to:
- target/msl/results/msl_quality_current.json
- target/msl/results/msl_package_pass_rates.md
- target/msl/results/msl_package_trace_accuracy.md
- target/msl/results/mls_contract_coverage.md
- target/msl/results/omc_simulation_reference.json
Local full runs also generate OMC compile/flatten reference data in
target/msl/results/omc_reference.json unless
RUMOCA_MSL_SKIP_OMC_COMPILE_REFERENCE=1 is set. CI sets that flag because
cold GitHub runners repeatedly reload MSL for the compile reference; the CI
gate still checks Rumoca stage counts and OMC simulation trace parity.
CI compares the current run against
crates/rumoca-test-msl/tests/msl_tests/msl_quality_baseline.json.
The stage checks are cumulative over the fixed root-example denominator:
parse/IR-AST, flatten/IR-flat, DAE/IR-DAE, solve/IR-Solve,
initial-condition solve, and simulation. Increasing an early-stage pass count
is always treated as an improvement; the gate fails when any cumulative stage
count drops below the committed baseline for the same target set.
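The comparison rule can be sketched as follows. The stage labels, the fixed-size arrays, and the function name are assumptions chosen for illustration; the real snapshot JSON carries more fields and the gate's actual implementation may differ.

```rust
/// Cumulative stage labels in pipeline order (hypothetical names).
static STAGES: [&str; 6] = ["parse", "flatten", "dae", "solve", "init_solve", "simulation"];

/// Returns the stages whose current pass count fell below the baseline.
/// An empty result means the gate passes; higher counts are improvements.
fn regressed_stages(baseline: &[u32; 6], current: &[u32; 6]) -> Vec<&'static str> {
    STAGES
        .iter()
        .zip(baseline.iter().zip(current.iter()))
        .filter(|(_, (base, cur))| cur < base)
        .map(|(name, _)| *name)
        .collect()
}
```

Because both runs share the same fixed denominator, comparing raw counts is enough; no pass-rate normalization is needed in this sketch.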
msl_quality_current.json also records release review metadata:
- omc_version records the OpenModelica build used for OMC trace parity; the quality gate compares the upstream release version and tolerates distro package rebuild suffix drift.
- mls_contract_coverage groups per-model stage, Solve-IR, balance, simulation, and error-code counts by MLS contract category (ARR, CONN_STRM, FUNC, EQN_ALG_SIM, CLK_SM, DECL_TYPE, PKG, OTHER). The same data is written as mls_contract_coverage.{json,md,txt} so release reviews can inspect category coverage without manually querying the quality snapshot JSON.
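One way to picture the suffix-tolerant omc_version comparison is to reduce each version string to its leading dotted numeric release before comparing. This is only a sketch under assumptions: the helper names are invented here, and the exact format of the recorded version string and of distro rebuild suffixes is not specified in this guide.

```rust
/// Extracts the first dotted numeric run from a version string, for example
/// "1.26.7" from a string that also carries a product name or a packaging
/// suffix. Returns None when the string contains no digit at all.
fn upstream_release(version: &str) -> Option<&str> {
    let start = version.find(|c: char| c.is_ascii_digit())?;
    let rest = &version[start..];
    let end = rest
        .find(|c: char| !(c.is_ascii_digit() || c == '.'))
        .unwrap_or(rest.len());
    Some(rest[..end].trim_end_matches('.'))
}

/// Two builds match when both expose the same upstream release.
fn same_upstream_release(a: &str, b: &str) -> bool {
    match (upstream_release(a), upstream_release(b)) {
        (Some(x), Some(y)) => x == y,
        _ => false,
    }
}
```

Treating an unparseable version as a mismatch keeps the check conservative: a snapshot without usable version metadata never compares equal by accident.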
On pull requests, CI also generates
target/msl/results/msl_pr_comment.md with cargo xtask repo msl pr-comment and
publishes it as a sticky PR comment. The comment embeds the package pass-rate,
MLS contract coverage, and OMC trace-accuracy markdown tables so reviewers can
inspect the MSL gate without downloading artifacts first. Its top summary also
shows deltas against the committed MSL quality baseline. Forked pull requests
receive the uploaded artifacts from the read-only CI run, then a separate
workflow_run publisher comments from the artifact using repository write
permissions.
When a full run is promoted, use reviewed full-run data and keep the committed
stage counts conservative enough to absorb compile-timeout jitter. Do not
promote focused subsets or one-off explicit target files as the baseline.
Promotion requires a full-run snapshot with non-empty omc_version metadata.
Focused debugging runs can use RUMOCA_MSL_SIM_MATCH,
RUMOCA_MSL_SIM_LIMIT, RUMOCA_MSL_SIM_TARGETS_FILE, or
RUMOCA_MSL_TARGET_SCOPE=committed-targets, but those runs are not baseline
updates.
For commit-to-commit regression diffs, run both worktrees with the same focused target JSON, then generate machine-readable buckets with:
cargo xtask repo msl parity-manifest \
--rumoca-results-file <worktree>/target/msl/results/msl_results.json \
--omc-simulation-reference-file <worktree>/target/msl/results/omc_simulation_reference.json \
--output-file <worktree>/target/msl/results/parity_fail_manifest.json
Compare msl_quality_current.json, parity_fail_manifest.json, and the
per-model [sim_*] log lines before inspecting emitted IR artifacts.
OMC reference pool and compile-speed comparison
cargo xtask repo msl omc-simulation-reference generates the OMC simulation baseline
(omc_simulation_reference.json) that the trace gate compares rumoca against,
and emits the rumoca-vs-OMC compile-speed report. It runs a pool of persistent
omc --interactive=zmq worker sessions (the OMC analogue of the rumoca warm
worker): each worker loads the MSL once, pulls per-model jobs, and is killed +
respawned (with its whole process group, so hung simulation grandchildren are
reaped) on a per-model timeout.
| Concern | Behavior |
|---|---|
| Pool size | One worker per physical core, minus headroom on large hosts (--workers 0 = auto); each worker pinned to a core. |
| Per-model timeout | --batch-timeout-seconds (wall, compile+simulate). Kept equal to the rumoca per-model budget so timing is fair. |
| Caching | Results are reused while the OMC version and MSL source are unchanged (cache_key in the JSON). --force re-runs everything. |
| Scope | All targets by default (so models that later pass rumoca already have a baseline); --rumoca-sim-ok-only is the CI fast subset. |
| Subsetting | --model-regex '<re>' scopes a run to matching models — the fast path for local iteration. |
Compile-speed artifacts
Restricted to models where the OMC and rumoca traces agree (high/near band), so only matching results are timed:
- msl_speed_comparison.json: the single data contract. Its _about block defines every metric (OMC compile = timeTotal - timeSimulation; speedup = omc_compile / rumoca_compile, >1 = rumoca faster; scaling binned by scalar_equations, the flattened system size, not states, which are 0 for most MSL examples).
- msl_speed_scaling.html: a self-contained local scatter plot (one point per model, x = scalar equations, y = compile seconds, rumoca vs OMC) rendered with the same embedded uPlot backend as plot-compare. Open it in a browser.
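The two metric definitions from the _about block translate directly into arithmetic. The function names below are illustrative, and the guard against a non-positive rumoca time is an assumption of this sketch rather than documented behavior.

```rust
/// OMC compile time: total wall time minus the simulation portion.
fn omc_compile_seconds(time_total: f64, time_simulation: f64) -> f64 {
    time_total - time_simulation
}

/// Speedup above 1.0 means rumoca compiled faster than OMC.
/// Returns None for a non-positive rumoca time so the ratio is never
/// computed from a zero or negative denominator (assumed policy).
fn speedup(omc_compile: f64, rumoca_compile: f64) -> Option<f64> {
    if rumoca_compile > 0.0 {
        Some(omc_compile / rumoca_compile)
    } else {
        None
    }
}
```

For instance, an OMC run with a 3.0 s total and 1.0 s of simulation spent 2.0 s compiling; against a 0.5 s rumoca compile that is a speedup of 4.0.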
The plot is rendered two ways from that one JSON:
- Local: omc-simulation-reference writes msl_speed_scaling.html (uPlot).
- PR comment: cargo xtask repo msl pr-comment reads the JSON and renders the table plus a mermaid xychart. GitHub cannot execute JS, so the PR plot is mermaid, not the uPlot viewer, and it is produced only by pr-comment, not on every OMC run.
Fast local subset
# Scope to a regex; reuses cached OMC + existing rumoca traces, then writes
# msl_speed_comparison.json + msl_speed_scaling.html for just that subset.
cargo xtask repo msl omc-simulation-reference \
--model-regex 'Mechanics\.Translational\.Examples'
MSL Baseline Promotion Analysis, 2026-06-05
This note compares the recent full MSL quality artifacts from PRs #191, #192,
#198, and #199 against the committed baseline. PR #202 is excluded from the
promotion decision because its MSL job was cancelled before producing
msl_quality_current.json; the uploaded artifact only contains the placeholder
PR comment.
Quality Snapshot
All four comparable PR artifacts are full-scope MSL 4.1.0 runs using OpenModelica 1.26.7 and report the same quality counts. That repeated result is strong enough to promote the quality baseline from a successful full run.
| Metric | Baseline | Recent full runs | Delta |
|---|---|---|---|
| Parse | 566 | 566 | 0 |
| Flatten | 556 | 562 | +6 |
| DAE / compiled | 414 | 477 | +63 |
| IR-Solve | 336 | 387 | +51 |
| Balanced | 401 | 464 | +63 |
| Initial balanced | 401 | 464 | +63 |
| Initial-condition solve OK | 182 | 224 | +42 |
| Simulation OK | 132 | 153 | +21 |
The trace snapshot also improves on the gate-relevant model counts:
| Trace metric | Baseline | Recent full runs | Delta |
|---|---|---|---|
| Models compared | 123 | 143 | +20 |
| High agreement | 65 | 89 | +24 |
| Minor agreement | 16 | 20 | +4 |
| Deviation agreement | 42 | 34 | -8 |
| Models with any bad channel | 46 | 44 | -2 |
| Bad channels | 1047 | 678 | -369 |
The severe-channel total rises from 35 to 151 because the current run compares 20 more models and 1363 more channels. The model-level gate signal is still better: acceptable trace models increase from 81 to 109, and models without a severe channel increase from 117 to 131.
Speed Snapshot
The rendered PR comments include comparable aggregate speed tables for 108 trace-agreeing models:
| PR | Total throughput | Total median | Compile throughput | Compile median | Sim throughput | Sim median |
|---|---|---|---|---|---|---|
| #191 | 3.49 | 5.96 | 5.80 | 6.82 | 0.18 | 1.22 |
| #192 | 3.25 | 5.51 | 5.05 | 5.96 | 0.19 | 1.57 |
| #198 | 3.17 | 5.30 | 4.80 | 5.64 | 0.19 | 1.65 |
| #199 | 3.31 | 5.58 | 5.02 | 5.74 | 0.19 | 1.47 |
These values support a stable median-based speed gate more than a mean-based gate. Mean speedup is more sensitive to outlier models and tiny OMC simulation times, while the existing spec defines a median tolerance for system and wall runtime ratios.
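The outlier sensitivity is easy to demonstrate with a small sketch. The speedup values used in the note after the code are made up for illustration and are not taken from the PR artifacts.

```rust
/// Arithmetic mean. Precondition: `values` is non-empty.
fn mean(values: &[f64]) -> f64 {
    values.iter().sum::<f64>() / values.len() as f64
}

/// Median of a sample. Precondition: `values` is non-empty.
fn median(values: &[f64]) -> f64 {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    }
}
```

With the hypothetical speedups 1.0, 1.5, 2.0, 2.5, 3.0, both the mean and the median are 2.0. Replacing the last value with 100.0, as can happen when an OMC simulation time is tiny, leaves the median at 2.0 but moves the mean to 21.4.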
The 2026-06-05 artifacts cannot directly promote runtime speed baselines because
the generated msl_quality_current.json snapshots did not serialize
runtime_ratio_stats. The gate now preserves those stats from the OMC parity
input and checks both system and wall medians once a promoted baseline includes
them. A follow-up full MSL run after that change can safely promote the speed
baseline from the machine-readable snapshot.
Recommendation
Promote the MSL quality baseline from the successful #198 full MSL artifact.
Do not synthesize speed baseline values from rendered markdown. Instead, use the
next full MSL snapshot generated after runtime ratio stats are serialized into
msl_quality_current.json to promote speed medians through the normal baseline
promotion command.
Strict Eval Rollout
This page records the roll-forward plan for removing silent default-to-zero
evaluation from production compiler/runtime paths. Normative phase and crate
rules remain in spec/; this guide explains the migration sequence.
Goal
Production evaluation should return an error when an expression, variable reference, runtime slot, table, or backend input is missing or unsupported. Defaulting to zero is allowed only at explicitly named caller boundaries, such as a solver initial-guess policy.
The first targets are:
- rumoca-eval-dae internal helpers that currently default unhandled expressions or missing bindings to zero.
- rumoca-eval-solve runtime reads that currently default missing state, parameter, seed, or register slots to zero.
- Backend interpreter paths that mirror the permissive Solve-IR behavior.
Roll Forward
The migration is intentionally not compatibility-preserving. New production evaluator APIs should expose errors directly; callers that still need a fallback must choose an explicit policy at the boundary.
- Convert exported evaluator APIs to return Result<T, EvalError> or a runtime layout error.
- Replace implicit defaults with explicit caller policies such as EvalGuess::SolverInitialGuess.
- Add source guards that prevent the known default-zero fallback surface from growing while each helper is converted.
- Run the MSL quality gates after each behavioral slice and promote baselines only through the normal reviewed promotion flow.
- Remove each permissive fallback once its caller has an explicit error or initial-guess policy.
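The shape of the migration can be sketched as below. The names EvalError and EvalGuess::SolverInitialGuess come from the steps above, but their definitions here, the HashMap environment, and both functions are assumptions invented for illustration, not the actual evaluator API.

```rust
use std::collections::HashMap;

// Assumed shape: the real EvalError has its own variants and payloads.
#[derive(Debug, PartialEq)]
enum EvalError {
    MissingBinding(String),
}

// Assumed shape: a caller-chosen fallback policy, named at the boundary.
#[derive(Debug, Clone, Copy, PartialEq)]
enum EvalGuess {
    SolverInitialGuess,
}

/// Strict read: a missing binding is an error, never a silent 0.0.
fn read_var(env: &HashMap<String, f64>, name: &str) -> Result<f64, EvalError> {
    env.get(name)
        .copied()
        .ok_or_else(|| EvalError::MissingBinding(name.to_string()))
}

/// Boundary helper: the zero fallback exists only because the caller asked
/// for it by naming a policy.
fn read_var_or_guess(env: &HashMap<String, f64>, name: &str, policy: EvalGuess) -> f64 {
    match policy {
        EvalGuess::SolverInitialGuess => read_var(env, name).unwrap_or(0.0),
    }
}
```

The point of the split is that the default lives in exactly one visible place, so a grep for the policy name finds every caller that still tolerates a missing value.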
Acceptance Checks
- No production evaluator helper silently substitutes 0.0 for an unsupported expression or missing binding.
- Solve/runtime APIs validate vector dimensions before residual or Jacobian evaluation.
- Any remaining default-to-zero behavior is named as an initial-guess or recovery policy at the caller boundary.
- MSL trace and balance gates stay active while strict behavior rolls through the evaluator stack.
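For the dimension check in particular, a minimal sketch might look like this. The Layout and LayoutError types and the toy residual for x' = -x are hypothetical; they only illustrate validating vector lengths before any element is read.

```rust
// Hypothetical error type for layout problems.
#[derive(Debug, PartialEq)]
enum LayoutError {
    DimensionMismatch { what: &'static str, expected: usize, actual: usize },
}

// Hypothetical layout: only the number of states matters in this sketch.
struct Layout {
    n_states: usize,
}

fn check_len(what: &'static str, expected: usize, actual: usize) -> Result<(), LayoutError> {
    if expected == actual {
        Ok(())
    } else {
        Err(LayoutError::DimensionMismatch { what, expected, actual })
    }
}

/// Toy residual for x' = -x, written as F(x, x') = x' + x.
/// Both vectors are validated before any element is read.
fn residual(layout: &Layout, states: &[f64], derivatives: &[f64]) -> Result<Vec<f64>, LayoutError> {
    check_len("states", layout.n_states, states.len())?;
    check_len("derivatives", layout.n_states, derivatives.len())?;
    Ok(states.iter().zip(derivatives).map(|(x, dx)| dx + x).collect())
}
```

Without the checks, zip would silently truncate to the shorter vector, which is exactly the kind of quiet misbehavior the strict rollout aims to remove.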
Test Expectations
Each converted evaluator branch should add a negative test that proves the new error is reported. MSL parity should be run before promoting any baseline after behavior changes.
IR Schema Versioning
Rumoca’s serialized IRs are compatibility contracts, not debug dumps. DAE and
Solve JSON must carry an explicit schema_version, and deserializers reject
unsupported versions instead of guessing.
The policy is:
- Keep the same schema version only for additive fields that have a semantic default and are annotated with #[serde(default)].
- Bump the schema version for renamed fields, removed fields, changed enum tags, changed units, changed indexing conventions, or changed interpretation of an existing field.
- Do not add silent in-crate migrators to Deserialize. Migration should be an explicit tool or phase so stale fixtures and artifacts fail visibly.
- Commit or update golden fixtures with every intentional schema-shape change.
Worked example:
use serde::{Deserialize, Serialize};

pub const DAE_SCHEMA_VERSION: u16 = 1;

// ClockSchedule and Expression are existing IR types defined elsewhere.
#[derive(Deserialize, Serialize)]
pub struct DaeClockPartition {
    pub schedules: Vec<ClockSchedule>,
    // Same-version additive change: old artifacts deserialize as no triggers.
    #[serde(default)]
    pub triggered_conditions: Vec<Expression>,
}
If schedules is renamed to periodic_schedules, the change is incompatible:
use serde::Deserialize;

pub const DAE_SCHEMA_VERSION: u16 = 2;

// Wire-level mirror of the serialized form; converted into Dae after the version check.
#[derive(Deserialize)]
struct DaeWire {
    schema_version: u16,
    periodic_schedules: Vec<ClockSchedule>,
}

impl<'de> Deserialize<'de> for Dae {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>,
    {
        let wire = DaeWire::deserialize(deserializer)?;
        // Reject any version other than the current one instead of guessing.
        if wire.schema_version != DAE_SCHEMA_VERSION {
            return Err(serde::de::Error::custom(format!(
                "unsupported DAE schema_version {} (expected {})",
                wire.schema_version, DAE_SCHEMA_VERSION
            )));
        }
        Ok(Self::from_wire(wire))
    }
}
Review checklist:
- Update the relevant *_SCHEMA_VERSION constant for incompatible changes.
- Keep old versions rejected by the primary IR deserializer.
- Add JSON and bincode round-trip coverage for the changed shape.
- Update committed goldens so reviewers can see the exact wire-format delta.