# model-ledger — full documentation

> git for models — the open, agent-native source of truth that discovers every model, rule, and pipeline across all your platforms as one immutable graph.

---

# source: glossary.md

# Glossary

The whole system is a handful of nouns. (These terms also get hover-definitions wherever they appear in the docs.)

`Backend`
:   Pluggable storage behind the `LedgerBackend` protocol — in-memory, SQLite, JSON files, Snowflake, or a remote HTTP service. Swapping it never changes your code.

`Composite`
:   A governed group whose members are themselves models — a business-level entity (e.g. a "Credit Decision System") that rolls up its scorecard, rules, and ETL. See [Composites](concepts/composite.md).

`Connector`
:   A source that emits `DataNode`s from a platform (SQL, REST, GitHub, …) via the `SourceConnector` protocol. See [Connectors & discovery](guides/connectors.md).

`DataNode`
:   The core graph primitive: anything with typed input/output ports — an ML model, a heuristic rule, an ETL job, an alert queue. See [DataNode & the graph](concepts/datanode.md).

`DataPort`
:   A named connection point on a `DataNode`, optionally carrying schema so identically named outputs from different models don't falsely link.

`Dependency graph`
:   The links between nodes, built automatically when an output port name matches an input port name (`connect()`).

`Event log`
:   The inventory itself — an append-only sequence of immutable Snapshots. Nothing is overwritten, so history is always reconstructable.

`ModelRef`
:   A model's stable identity: name, owner, type, risk `tier`, purpose, status. The minimum a regulator needs. See [Snapshots & the event log](concepts/snapshot.md).

`Point-in-time`
:   Reconstruction of the inventory as it stood on any past date, via `inventory_at()`.

`Profile`
:   A pluggable compliance check (`sr_11_7`, `eu_ai_act`, `nist_ai_rmf`) that validates a model's completeness against a framework. See [Governance](governance.md).

`Snapshot`
:   An immutable, content-addressed record of one thing that happened to a model — a registration, a retrain, a validation. The unit of the event log.

`Tag`
:   A mutable named pointer to a specific Snapshot (e.g. `production`, `latest-validated`).

---

# source: governance.md

# Governance

Model-risk regimes change their names and their numbers. What they *ask for* barely changes. Strip away the acronyms and every regime — US banking, EU, insurance — wants the same six things from your model inventory. model-ledger is built to produce them as a byproduct of normal use, not as a separate compliance chore.

## What every regime actually asks for

| The durable need | What an examiner says | The model-ledger primitive |
|---|---|---|
| **Complete inventory** | "Show me *every* model — including the shadow ones." | Cross-platform [discovery & connectors](guides/connectors.md) — ML models, rules, and ETL as one graph |
| **Risk tiering** | "Which are high-materiality?" | `tier` on every [`ModelRef`](reference/index.md); business systems roll up as [composites](concepts/composite.md) |
| **Change control + audit trail** | "What changed, when, and who did it?" | Immutable, content-addressed [Snapshots](concepts/snapshot.md) — append-only, tamper-evident |
| **Dependency & lineage** | "How do these components feed each other?" | The [dependency graph](concepts/datanode.md), built from port matching |
| **Validation records** | "Prove this was validated, and find what wasn't." | `record_validation()` events live in the same immutable log |
| **Point-in-time reconstruction** | "Show me the inventory as it stood on December 31." | [`inventory_at(date)`](recipes/point-in-time.md) replays the log |

That's the whole compliance story: **nothing is overwritten, so the answer to "what was true then?" is always reconstructable.**

## It falls out of normal use

```python
from model_ledger import Ledger

ledger = Ledger.from_sqlite("./inventory.db")

# Identity + risk tier — the minimum a regulator needs
ledger.register(
    name="credit_scorecard",
    owner="risk-team",
    model_type="ml_model",
    tier="high",
    purpose="Consumer credit decisioning",
)

# Validation outcomes are just events in the same immutable log
ledger.record("credit_scorecard", event="validated", actor="mrm-team",
              payload={"result": "pass", "validator": "second-line"})

# The full, ordered, tamper-evident history an examiner can replay
for snap in ledger.history("credit_scorecard"):
    print(snap.timestamp, snap.event_type, snap.actor)
```

## Frameworks it maps to

The primitives above satisfy the documentation and inventory expectations of the major model-risk and AI-governance regimes:

- **US banking — SR 26‑2 / OCC Bulletin 2026‑13** (the 2026 revision that superseded SR 11‑7): tiered model inventory, materiality classification, lifecycle documentation, and validation status.
- **EU AI Act — Annex IV**: version-tracked technical documentation, component dependencies, and change history for high-risk systems.
- **NIST AI RMF** and **ISO/IEC 42001**: inventory, risk management, and lifecycle governance practices.

model-ledger ships **pluggable validation profiles** (`sr_11_7`, `eu_ai_act`, `nist_ai_rmf`) that check a model's completeness against a framework, and you can add your own — profiles are a plugin layer, not the core. Run them with `model-ledger validate --profile ` (see the [CLI guide](guides/cli.md)).

!!! note "Framework-agnostic on purpose"
    model-ledger is a model inventory for *any* organization with deployed models — not a single-regulation tool. The frameworks above are examples of what the underlying capability is good for; they are a thin, swappable layer over a durable foundation. When a regulator renumbers a rule, you update a profile — not your inventory.

---

# source: index.md
Open-source model governance

git for models.

Know what models you have deployed, where they run, what they depend on, and what changed — across every platform, as one immutable, queryable graph. Built for the regulator’s real question: show me everything that ever changed.

*Figure: a dependency graph with nodes `raw_txns`, `features`, `rules`, `fraud_model`, and `review_queue`.*

declare nodes · connect() · the graph builds itself

`model-ledger` is a model inventory for any organization with deployed models. It **discovers** models, heuristic rules, and ETL across your platforms, **maps the dependency graph** automatically, and **records every change as an immutable event**. Unlike registries tied to one platform (MLflow, SageMaker, W&B), it spans all of them — and it's built to be driven by AI agents through a native MCP server.

[Get started in 60 seconds :octicons-arrow-right-24:](quickstart.md){ .md-button .md-button--primary }
[Why a ledger, not a registry? :octicons-arrow-right-24:](#why-a-ledger-not-a-registry){ .md-button }

## Four ways in

-   :material-language-python:{ .lg .middle } __Python SDK__

    ---

    Declare nodes; the graph connects itself. The whole API is tool-shaped.

    ```bash
    pip install model-ledger
    ```

    [:octicons-arrow-right-24: Quickstart](quickstart.md)

-   :material-robot-outline:{ .lg .middle } __MCP Server__

    ---

    Talk to your inventory. The agent surface is the product — 8 tools, 3 resources.

    ```bash
    pip install "model-ledger[mcp]"
    claude mcp add model-ledger -- model-ledger mcp --demo
    ```

    [:octicons-arrow-right-24: Agent guide](guides/agents.md)

-   :material-api:{ .lg .middle } __REST API__

    ---

    Auto-generated OpenAPI for frontends and dashboards. Same tools over HTTP.

    ```bash
    pip install "model-ledger[rest-api]"
    model-ledger serve --demo
    ```

    [:octicons-arrow-right-24: Backends & serving](guides/backends.md)

-   :material-console:{ .lg .middle } __CLI__

    ---

    Launch the MCP server or REST API from anywhere — zero config to start.

    ```bash
    model-ledger mcp      # for agents
    model-ledger serve    # for HTTP
    ```

    [:octicons-arrow-right-24: Reference](reference/index.md)
## The graph builds itself

Every model is a [`DataNode`](concepts/datanode.md) with typed input and output ports. When an output name matches an input name, [`connect()`](reference/index.md) creates the dependency edge — no hand-wiring.

```python
from model_ledger import Ledger, DataNode

ledger = Ledger.from_sqlite("./inventory.db")

ledger.add([
    DataNode("segmentation", platform="etl", outputs=["customer_segments"]),
    DataNode("fraud_scorer", platform="ml", inputs=["customer_segments"], outputs=["risk_scores"]),
    DataNode("fraud_alerts", platform="alerting", inputs=["risk_scores"]),
])
ledger.connect()

ledger.trace("fraud_alerts")
# ['segmentation', 'fraud_scorer', 'fraud_alerts']
```

```mermaid
graph LR
    A["segmentation<br/>ETL"] -->|customer_segments| B["fraud_scorer<br/>ML model"]
    B -->|risk_scores| C["fraud_alerts<br/>Alert queue"]
    classDef etl fill:#607D8B,color:#fff,stroke:#455A64;
    classDef ml fill:#7a1a1a,color:#fff,stroke:#5a1010;
    classDef alert fill:#C8884E,color:#fff,stroke:#9c6a3a;
    class A etl; class B ml; class C alert;
```

## One operation, every surface

The SDK, the REST API, and the MCP tools are the **same six verbs** — `discover`, `record`, `investigate`, `query`, `trace`, `changelog` (plus `tag`/`list_tags`). Registering a model looks like this everywhere:

=== "Python"

    ```python
    from model_ledger import Ledger

    ledger = Ledger.from_sqlite("./inventory.db")
    ledger.register(
        name="fraud_scoring",
        owner="risk-team",
        model_type="ml_model",
        tier="high",
        purpose="Real-time fraud detection",
    )
    ```

=== "MCP (what the agent calls)"

    ```json
    {
      "tool": "record",
      "arguments": {
        "model_name": "fraud_scoring",
        "event": "registered",
        "owner": "risk-team",
        "model_type": "ml_model",
        "purpose": "Real-time fraud detection"
      }
    }
    ```

=== "REST"

    ```bash
    curl -X POST localhost:8000/record \
      -H 'content-type: application/json' \
      -d '{"model_name":"fraud_scoring","event":"registered",
           "owner":"risk-team","model_type":"ml_model",
           "purpose":"Real-time fraud detection"}'
    ```

## Why a ledger, not a registry

A registry answers *"what is the current state?"* A regulator asks *"show me the **complete history** of every change, approval, and validation."* Those are different data structures.

model-ledger treats the inventory as an **append-only event log**. A model is an identity ([`ModelRef`](concepts/snapshot.md)); everything else — every retrain, every config change, every validation — is an immutable, content-addressed [`Snapshot`](concepts/snapshot.md). You get full history and point-in-time reconstruction for free, because nothing is ever overwritten.

That's exactly what a model-risk program needs — see how it maps to SR 26‑2, the EU AI Act, and NIST in [**Governance**](governance.md).
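To make the "different data structures" point concrete, here is a minimal standard-library sketch of an append-only, content-addressed log with point-in-time replay. It is an illustration of the idea only, not model-ledger's implementation: `Event`, `ToyLog`, `digest`, and `state_at` are hypothetical names invented for this example.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Event:
    """One immutable log entry (a hypothetical stand-in for a Snapshot)."""
    model: str
    event_type: str
    timestamp: datetime
    payload: dict
    parent: Optional[str]  # digest of the previous entry, chaining the log

    @property
    def digest(self) -> str:
        # Content address: a hash over every field, so any edit changes it.
        body = json.dumps(
            [self.model, self.event_type, self.timestamp.isoformat(),
             self.payload, self.parent],
            sort_keys=True,
        )
        return hashlib.sha256(body.encode()).hexdigest()


class ToyLog:
    """Append-only: entries are added, never updated or removed."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, model, event_type, timestamp, payload=None) -> Event:
        parent = self._events[-1].digest if self._events else None
        event = Event(model, event_type, timestamp, payload or {}, parent)
        self._events.append(event)
        return event

    def state_at(self, when: datetime) -> dict:
        """Replay entries up to `when`; map each model to its latest event type."""
        state = {}
        for event in self._events:
            if event.timestamp <= when:
                state[event.model] = event.event_type
        return state


log = ToyLog()
first = log.append("fraud_scoring", "registered",
                   datetime(2025, 1, 10, tzinfo=timezone.utc))
log.append("fraud_scoring", "retrained",
           datetime(2025, 6, 1, tzinfo=timezone.utc), {"accuracy": 0.94})

state_in_march = log.state_at(datetime(2025, 3, 1, tzinfo=timezone.utc))
state_in_july = log.state_at(datetime(2025, 7, 1, tzinfo=timezone.utc))
print(state_in_march, state_in_july)
```

In this toy, the March replay sees only the registration while the July replay also sees the retrain. A mutable registry row would have kept just the latest value, which is why the two questions need different structures.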
:material-graph-outline: **Cross-platform** — ML models, heuristic rules, ETL, and queues are all one `DataNode`. The graph spans MLflow, SageMaker, your warehouse, your scheduler.
{ .card }

:material-history: **Change is the point** — every mutation is an immutable Snapshot. Reconstruct your inventory as it stood on any date.
{ .card }

:material-robot-happy-outline: **Agent-native** — the MCP server is a first-class surface, not an afterthought. Ask Claude *"if we deprecate `customer_features`, what breaks?"*
{ .card }

:material-puzzle-outline: **Bring your own everything** — storage backends, source connectors, and compliance profiles are all pluggable protocols.
{ .card }

---

Built in the open by [Block](https://opensource.block.xyz/) · Apache-2.0 · [Source](https://github.com/block/model-ledger) · [PyPI](https://pypi.org/project/model-ledger/) · [`/llms.txt`](llms.txt) for agents

---

# source: installation.md

# Installation

model-ledger requires **Python 3.10+**. The core is deliberately tiny (`httpx` + `pydantic` only); everything else is an opt-in extra, so you install just the surfaces and backends you use.

```bash
pip install model-ledger   # core: SDK + dependency graph + connectors
# or
uv add model-ledger
```

## Extras

| Install | Adds | For |
|---|---|---|
| `model-ledger` | SDK, graph, SQL/REST/GitHub connectors | the core library |
| `model-ledger[mcp]` | MCP server (`model-ledger mcp`) | AI agents — Claude, Goose, Cursor |
| `model-ledger[rest-api]` | FastAPI app (`model-ledger serve`) | frontends, dashboards |
| `model-ledger[cli]` | Typer + Rich CLI | terminal use |
| `model-ledger[snowflake]` | Snowflake backend | production storage |
| `model-ledger[introspect-sklearn]` | scikit-learn introspector | extract algorithm/features from fitted models |
| `model-ledger[introspect-xgboost]` | XGBoost introspector | " |
| `model-ledger[introspect-lightgbm]` | LightGBM introspector | " |
| `model-ledger[excel]` | openpyxl | spreadsheet import/export |
| `model-ledger[all]` | Snowflake + pandas + httpx | the common production set |

Combine them: `pip install "model-ledger[mcp,rest-api,snowflake]"`.

## Which extra for which surface

- **Python SDK** — core install is enough.
- **Talk to it from an agent** — `[mcp]`, then `claude mcp add model-ledger -- model-ledger mcp` (see the [Agent guide](guides/agents.md)).
- **Serve it over HTTP** — `[rest-api]`, then `model-ledger serve` (see [Backends](guides/backends.md)).
- **From the terminal** — `[cli]` (see the [CLI guide](guides/cli.md)).

Next: the [60-second quickstart](quickstart.md).

---

# source: quickstart.md

# Quickstart

Zero infrastructure. Zero credentials.
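As a rough illustration of what "pluggable protocols" can mean in Python, the sketch below uses structural typing from the standard library. The docs name a `SourceConnector` protocol and show connectors exposing `discover()`; everything else here (`ToyConnector`, `CsvLikeConnector`, `collect`, and the dict shape returned) is a hypothetical stand-in and should not be read as model-ledger's actual interface.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ToyConnector(Protocol):
    """Hypothetical protocol: any object with a discover() method qualifies."""

    def discover(self) -> list: ...


class CsvLikeConnector:
    """Satisfies ToyConnector structurally; no base class is inherited."""

    def __init__(self, platform: str, rows: list) -> None:
        self.platform = platform
        self._rows = rows

    def discover(self) -> list:
        # One dict per non-empty row; a real connector would emit graph nodes.
        return [
            {"name": row.strip(), "platform": self.platform}
            for row in self._rows
            if row.strip()
        ]


def collect(connectors: list) -> list:
    """Run every connector and merge what it found."""
    found = []
    for connector in connectors:
        found.extend(connector.discover())
    return found


source = CsvLikeConnector("scheduler", ["nightly_features\n", "\n", "fraud_model\n"])
found = collect([source])
print(isinstance(source, ToyConnector), found)
```

The design point is that the caller depends only on the shape of the object, so a new source can be added without touching `collect` or subclassing anything.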
From `pip install` to a working dependency graph in under a minute.

=== "Python SDK"

    ```bash
    pip install model-ledger
    ```

    ```python
    from model_ledger import Ledger, DataNode

    ledger = Ledger()  # in-memory; swap for Ledger.from_sqlite("inv.db") to persist

    ledger.add([
        DataNode("raw_txns", platform="warehouse", outputs=["transactions"]),
        DataNode("feature_build", platform="etl", inputs=["transactions"], outputs=["features"]),
        DataNode("fraud_model", platform="ml", inputs=["features"], outputs=["risk_scores"]),
        DataNode("review_queue", platform="alerting", inputs=["risk_scores"]),
    ])
    ledger.connect()  # ports match → edges appear

    print(ledger.trace("review_queue"))
    # ['raw_txns', 'feature_build', 'fraud_model', 'review_queue']

    print(ledger.upstream("fraud_model"))
    # ['raw_txns', 'feature_build']
    ```

    That's the whole idea: **declare nodes, the graph connects itself.** Next, give a node an identity and a history → [Register a model](#register-a-model).

=== "Talk to it (MCP)"

    ```bash
    pip install "model-ledger[mcp]"

    # Register the server with Claude Code (one time)
    claude mcp add model-ledger -- model-ledger mcp --demo
    ```

    Then just ask:

    > **You:** what models are in my inventory?
    >
    > **Claude:** 7 models across 5 platforms. `fraud_scoring` was retrained and
    > deployed this week. Want me to dig into anything?
    >
    > **You:** if we deprecate `customer_features`, what breaks?
    >
    > **Claude:** 3 models consume it directly, 2 more transitively.

    The `--demo` flag loads a sample inventory so you can explore before connecting your own data. See the [Agent guide](guides/agents.md) for the full tool surface.

=== "REST API"

    ```bash
    pip install "model-ledger[rest-api]"
    model-ledger serve --demo --port 8000
    ```

    Open **http://localhost:8000/docs** for live, auto-generated OpenAPI docs, or:

    ```bash
    curl "localhost:8000/query?limit=5"
    curl "localhost:8000/trace/fraud_scoring?direction=upstream"
    curl "localhost:8000/overview"
    ```

## Register a model

A `DataNode` gives you the graph. [`register()`](reference/index.md) gives a model an **identity** and starts its **history** — the two things a regulator asks for.

```python
from model_ledger import Ledger

ledger = Ledger.from_sqlite("./inventory.db")

ledger.register(
    name="fraud_scoring",
    owner="risk-team",
    model_type="ml_model",
    tier="high",
    purpose="Real-time card fraud detection",
)

# Record an event — any payload you like, no schema to maintain
ledger.record("fraud_scoring", event="retrained", actor="ml-pipeline",
              payload={"accuracy": 0.94, "features_added": ["velocity_24h"]})

for snap in ledger.history("fraud_scoring"):
    print(snap.timestamp, snap.event_type)
# ... registered
# ... retrained
```

Every call appends an immutable [Snapshot](concepts/snapshot.md). Nothing is overwritten — that's what makes the inventory auditable.

## Choose where it lives

Storage is a one-line decision and never changes your code:

```python
from model_ledger import Ledger
from model_ledger.backends.json_files import JsonFileLedgerBackend

Ledger()                                               # in-memory — tests & demos
Ledger.from_sqlite("./inventory.db")                   # zero-infra, single file
Ledger(JsonFileLedgerBackend("./inventory"))           # git-friendly JSON files
Ledger.from_snowflake(conn, schema="DB.MODEL_LEDGER")  # production
```

[More on backends :octicons-arrow-right-24:](guides/backends.md)

## Where to next
- :material-cube-outline: __[Concepts](concepts/index.md)__ — DataNode, Snapshot, Composite. The whole model in three ideas.
- :material-robot-outline: __[Agent guide](guides/agents.md)__ — the 8 MCP tools and a worked multi-tool transcript.
- :material-book-open-variant: __[Recipes](recipes/index.md)__ — copy-paste solutions to real tasks.
- :material-api: __[API reference](reference/index.md)__ — generated from source, never out of date.
---

# source: recipes/discover-sql.md

# Recipe № 3   Discover from a SQL registry

**Problem.** Your models already live in a database table (a registry, a job scheduler). You want them in the ledger — and kept in sync — without hand-entering anything.

**Approach.** `sql_connector()` runs a query and turns each row into a [`DataNode`](../concepts/datanode.md). `add()` is idempotent (it content-hashes nodes), so re-running on a schedule only records genuine changes.

```python
import sqlite3
from model_ledger import Ledger, sql_connector

ledger = Ledger.from_sqlite("./inventory.db")
source = sqlite3.connect("./ml_platform.db")

models = sql_connector(
    name="model_registry",
    connection=source,
    query="SELECT name, owner, framework FROM ml_models WHERE active = 1",
    name_column="name",
)

added = ledger.add(models.discover())
ledger.connect()
print(f"discovered {len(added)} models")
```

## Extract dependencies from SQL automatically

If a row carries the SQL a job runs, point `sql_column` at it. The connector parses `FROM`/`JOIN` as inputs and `INSERT`/`CREATE` as outputs — so the graph links your ETL to the models that consume it:

```python
etl = sql_connector(
    name="etl_scheduler",
    connection=source,
    query="SELECT job_name, raw_sql FROM scheduled_jobs",
    name_column="job_name",
    sql_column="raw_sql",
)
ledger.add(etl.discover())
ledger.connect()  # ETL outputs now link to model inputs across platforms
```

## Run it on a schedule

Wrap the discover-and-connect in your scheduler of choice (cron, Airflow, Prefect):

```python
def sync():
    ledger = Ledger.from_snowflake(conn, schema="DB.MODEL_LEDGER")
    ledger.add(models.discover())
    ledger.connect()
```

Because `add()` skips unchanged nodes and refreshes a `last_seen` timestamp every run, you get two things for free: a clean changelog (only real changes are recorded) and the ability to spot models that have **gone silent** — discovered before, but missing from the latest run.

!!! tip "Other sources"
    The same pattern works for REST APIs (`rest_connector`) and GitHub pipelines-as-code (`github_connector`), or write your own with the `SourceConnector` protocol — see [Connectors & discovery](../guides/connectors.md).

---

# source: recipes/impact-analysis.md

# Recipe № 1   Impact analysis

**Problem.** You want to deprecate `customer_features` (or change its schema). What breaks?

**Approach.** Models declare their inputs and outputs; `connect()` builds the edges. `downstream()` then returns everything that depends on a node — directly or transitively.

```python
from model_ledger import Ledger, DataNode

ledger = Ledger()
ledger.add([
    DataNode("customer_features", platform="feature-store", outputs=["customer_features"]),
    DataNode("fraud_scorer", platform="ml", inputs=["customer_features"], outputs=["risk_scores"]),
    DataNode("churn_scorer", platform="ml", inputs=["customer_features"], outputs=["churn_scores"]),
    DataNode("review_queue", platform="alerting", inputs=["risk_scores"]),
])
ledger.connect()

# Everything that depends on customer_features, directly or transitively:
blast_radius = ledger.downstream("customer_features")
print(blast_radius)
# ['fraud_scorer', 'churn_scorer', 'review_queue']
```

**Expected output.** Three consumers: two models directly (`fraud_scorer`, `churn_scorer`) and one queue transitively (`review_queue`). Don't deprecate until those are handled.
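For intuition about how a transitive blast radius can fall out of port matching, here is a minimal standard-library sketch. It is not model-ledger's code: the plain-dict node layout and the helpers `build_edges` and `downstream` are hypothetical, and the breadth-first order is simply one reasonable choice for listing direct consumers before transitive ones.

```python
from collections import deque

# Toy declarations mirroring the recipe: name -> (inputs, outputs).
nodes = {
    "customer_features": ([], ["customer_features"]),
    "fraud_scorer": (["customer_features"], ["risk_scores"]),
    "churn_scorer": (["customer_features"], ["churn_scores"]),
    "review_queue": (["risk_scores"], []),
}


def build_edges(nodes):
    """Link producer -> consumer wherever an output name equals an input name."""
    producers = {}
    for name, (_, outputs) in nodes.items():
        for port in outputs:
            producers.setdefault(port, []).append(name)

    edges = {name: [] for name in nodes}
    for name, (inputs, _) in nodes.items():
        for port in inputs:
            for producer in producers.get(port, []):
                edges[producer].append(name)
    return edges


def downstream(edges, start):
    """Breadth-first walk: direct consumers first, then transitive ones."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for consumer in edges[current]:
            if consumer not in seen:
                seen.add(consumer)
                order.append(consumer)
                queue.append(consumer)
    return order


edges = build_edges(nodes)
blast_radius = downstream(edges, "customer_features")
print(blast_radius)
```

Tracking `seen` keeps the walk from revisiting a node that is reachable along two paths, which matters once several models share an upstream table.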
```mermaid
graph LR
    CF["customer_features"] --> FS["fraud_scorer"] --> RQ["review_queue"]
    CF --> CS["churn_scorer"]
    classDef hot fill:#7a1a1a,color:#fff,stroke:#5a1010;
    classDef dep fill:#efe8da,stroke:#7a1a1a,color:#1c1a17;
    class CF hot; class FS,CS,RQ dep;
```

## The same question, from an agent

```json
// trace(name="customer_features", direction="downstream")
{
  "nodes": [
    {"name": "fraud_scorer", "depth": 1},
    {"name": "churn_scorer", "depth": 1},
    {"name": "review_queue", "depth": 2}
  ]
}
```

> **Claude:** Deprecating `customer_features` breaks 3 things — `fraud_scorer` and
> `churn_scorer` consume it directly, and `review_queue` depends on it one hop further.

## Variations

- `ledger.upstream("review_queue")` — the reverse: everything that feeds a node.
- `ledger.trace("review_queue")` — the full path from roots to a node.
- Use [`DataPort`](../concepts/datanode.md#dataport-precision) when several models write a table with the same name, so the blast radius is precise rather than over-broad.

---

# source: recipes/index.md

# Recipes

Self-contained, copy-paste solutions to real tasks. Each one runs against the in-memory or SQLite backend with no setup.
-   Recipe № 1 __[Impact analysis](impact-analysis.md)__

    ---

    "If we deprecate this, what breaks?" Walk the dependency graph downstream to find the full blast radius before you change anything.

-   Recipe № 2 __[Point-in-time inventory](point-in-time.md)__

    ---

    Reconstruct exactly which models were active — and in what state — on any past date. The answer an examiner actually wants.

-   Recipe № 3 __[Discover from a SQL registry](discover-sql.md)__

    ---

    Point a connector at a database table and pull models into the ledger on a schedule, idempotently.
!!! note "More on the way"
    This gallery grows. Recipes are verified against the SDK so they can't quietly rot — if a release breaks one, the build fails.

---

# source: recipes/point-in-time.md

# Recipe № 2   Point-in-time inventory

**Problem.** An examiner asks: *"Show me your model inventory as it stood on December 31."* A registry that overwrites state can't answer this. An event log can.

**Approach.** Because every change is an immutable [Snapshot](../concepts/snapshot.md), the inventory at any date is just a replay of the log up to that moment. `inventory_at()` does it for you.

```python
from datetime import datetime, timezone, timedelta
from model_ledger import Ledger

ledger = Ledger.from_sqlite("./inventory.db")
ledger.register(name="fraud_scoring", owner="risk-team", model_type="ml_model",
                tier="high", purpose="Card fraud detection")
ledger.record("fraud_scoring", event="retrained",
              payload={"accuracy": 0.94}, actor="ml-pipeline")

now = datetime.now(timezone.utc)

# The inventory as it stands now — fraud_scoring is present:
for ref in ledger.inventory_at(now):
    print(ref.name, ref.status)

# ...and as it stood a year ago — empty; the model didn't exist yet.
ledger.inventory_at(now - timedelta(days=365))
```

**Expected output.** `fraud_scoring active` for *now*, and nothing for a year ago — the model didn't exist then. Pass any timestamp (e.g. an examiner's "as of December 31") and `inventory_at` replays the event log up to that moment, returning each model with the `status` and metadata it carried *then* — not its state today. Nothing is overwritten, so history is always reconstructable.

## Why this matters

| Question an auditor asks | Registry (mutable) | Ledger (event log) |
|---|---|---|
| What's the current state? | ✅ | ✅ |
| What did it look like 6 months ago? | ❌ overwritten | ✅ replay the log |
| When exactly did this change, and who did it? | ❌ | ✅ `history()` |
| Prove nothing was edited after the fact | ❌ | ✅ content-addressed snapshots |

## Pair with history

For one model's full timeline:

```python
for snap in ledger.history("fraud_scoring"):
    print(snap.timestamp, snap.event_type, snap.actor)
```

Every line is immutable and ordered. That timeline *is* the audit trail — no separate logging system to keep in sync.

---

# source: reference/index.md

# API Reference

Everything below is generated from the source at build time with [mkdocstrings](https://mkdocstrings.github.io/) + [Griffe](https://mkdocstrings.github.io/griffe/). It reflects the exact installed version — there is no hand-maintained copy to fall out of date.

## Ledger

The one object you'll use most. Every method is tool-shaped — usable directly, over REST, or as an MCP tool.

::: model_ledger.Ledger
    options:
      show_root_heading: false
      heading_level: 3

## Data models

The event-log primitives. A model is a `ModelRef`; every change is a `Snapshot`; a `Tag` is a mutable pointer.

::: model_ledger.ModelRef
    options:
      heading_level: 3

::: model_ledger.Snapshot
    options:
      heading_level: 3

::: model_ledger.Tag
    options:
      heading_level: 3

## Graph

::: model_ledger.DataNode
    options:
      heading_level: 3

::: model_ledger.DataPort
    options:
      heading_level: 3

## Connectors

Factory functions that emit `DataNode`s from external sources. See [Connectors & discovery](../guides/connectors.md) for usage.

::: model_ledger.sql_connector
    options:
      heading_level: 3

::: model_ledger.rest_connector
    options:
      heading_level: 3

::: model_ledger.github_connector
    options:
      heading_level: 3

## Introspection

::: model_ledger.introspect
    options:
      heading_level: 3

::: model_ledger.register_introspector
    options:
      heading_level: 3

---

# source: includes/abbreviations.md

*[DataNode]: The core graph primitive — anything with typed input/output ports (model, rule, ETL, queue).
*[DataPort]: A named connection point on a DataNode; dependency edges form when port names match.
*[Snapshot]: An immutable, content-addressed record of one thing that happened to a model.
*[ModelRef]: A model's stable identity — name, owner, type, risk tier, purpose, status.
*[Composite]: A governed group whose members are themselves models (e.g. a "Credit Decision System").
*[MCP]: Model Context Protocol — the agent-native interface; model-ledger's primary surface.
*[SR 26-2]: 2026 US interagency model-risk-management guidance (OCC 2026-13), which superseded SR 11-7.
*[Annex IV]: The EU AI Act's technical-documentation requirements for high-risk AI systems.

---

# source: guides/agents.md

# Agents (MCP)

model-ledger is built agents-first. The Python SDK and REST API are first-class, but the surface we optimize for is the **MCP server** — because the most natural way to ask *"which high-risk models changed this week and haven't been validated?"* is to just ask.

## Connect it

```bash
pip install "model-ledger[mcp]"

# Claude Code (one time). Drop --demo to start empty; add a backend to persist.
claude mcp add model-ledger -- model-ledger mcp --demo
claude mcp add model-ledger -- model-ledger mcp --backend sqlite --path ./inventory.db
```

The server speaks stdio and works with any MCP client (Claude Desktop, Goose, Cursor). Point it at a remote deployment with `--backend http --path https://your-ledger:8000`, and the tools call that REST API directly.

## The eight tools

Each tool is a plain function with Pydantic I/O, designed per [Anthropic's tool-writing guidance](https://www.anthropic.com/engineering/writing-tools-for-agents) — consolidated verbs, not a sprawl of endpoints.
| Tool | What the agent uses it for |
|------|----------------------------|
| **`discover`** | Bulk-import models from inline dicts or a JSON file |
| **`record`** | Register a model (`event="registered"`) or append any event to its history |
| **`investigate`** | Deep-dive one model — identity, recent events, upstream/downstream deps, group memberships |
| **`query`** | Search & filter the inventory (text + `platform`/`model_type`/`owner`/`status`, paginated) |
| **`trace`** | Walk the dependency graph — `upstream`, `downstream`, or `both` |
| **`changelog`** | What changed across the whole inventory in a time range (defaults to last 7 days) |
| **`tag`** | Create or move a named pointer (`production`, `latest-validated`) to a model's latest snapshot |
| **`list_tags`** | List the tags on a model |

And three **resources** — read-only orientation an agent can pull without a tool call:

| Resource | Contents |
|----------|----------|
| `ledger://overview` | Model count, event count, type breakdown |
| `ledger://schema` | JSON Schema for every tool's I/O — the agent learns the contract |
| `ledger://backends` | Which backend is active |

## A worked conversation

What a multi-tool exchange actually looks like under the hood:

> **You:** Anything risky change this week?

```json
// changelog(since="2026-06-01")
{
  "events": [
    {"model_name": "fraud_scoring", "event_type": "retrained", "timestamp": "2026-06-03T14:11Z"},
    {"model_name": "aml_threshold_rule", "event_type": "threshold_changed", "timestamp": "2026-06-04T09:02Z"}
  ],
  "total": 2
}
```

> **Claude:** Two changes. `fraud_scoring` was retrained, and `aml_threshold_rule`'s
> threshold changed. Want the blast radius of either?
>
> **You:** The threshold rule — what depends on it?

```json
// trace(name="aml_threshold_rule", direction="downstream")
{
  "nodes": [
    {"name": "sar_queue", "depth": 1, "relationship": "depends_on"},
    {"name": "compliance_dashboard", "depth": 2, "relationship": "depends_on"}
  ]
}
```

> **Claude:** It feeds `sar_queue` directly and `compliance_dashboard` one hop further.
> Let me check whether the rule has been validated since the change.

```json
// investigate(model_name="aml_threshold_rule", detail="full")
{
  "owner": "compliance-eng",
  "model_type": "heuristic",
  "tier": "high",
  "recent_events": [{"event_type": "threshold_changed", "actor": "j.lee"}],
  "validations": []
}
```

> **Claude:** ⚠️ `aml_threshold_rule` is tier **high**, changed yesterday by `j.lee`,
> feeds the SAR queue — and has **no validation recorded since the change**. Worth
> flagging to MRM.

That chain — `changelog → trace → investigate` — is the whole point: the agent reasons over the event log, not a static table.

## Discovering at scale

The `discover` tool imports inline model lists, a JSON file, or a config-drivable connector:

```json
// discover(source_type="inline", models=[{"name": "...", "platform": "..."}])
{ "added": 12, "skipped": 0, "links_created": 8 }

// discover(source_type="connector", connector_name="rest",
//          connector_config={"name": "mlflow", "url": "...", "items_path": "...", "name_field": "..."})
{ "models_added": 40, "links_created": 12, "errors": [] }
```

!!! info "Which connectors an agent can run"
    `rest` and `prefect` are pure-config connectors, so an agent can run them directly through `discover`. `sql` and `github` need a live database connection or a parser callable that can't be expressed as JSON — for those, `discover` returns a message in the result's `errors` field pointing you to the SDK (see [Connectors & discovery](connectors.md)). Connector problems come back as `errors` rather than raising, so the agent always gets a usable response.
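The "errors as data" behavior described above can be sketched in a few lines of standard-library Python. This is an assumption-laden illustration, not the `discover` tool's source: `DiscoverResult` and `run_discovery` are hypothetical names, and only the `models_added` and `errors` field names are borrowed from the JSON shown earlier.

```python
from dataclasses import dataclass, field


@dataclass
class DiscoverResult:
    """Hypothetical result shape: counts plus a list of error messages."""
    models_added: int = 0
    errors: list = field(default_factory=list)


def run_discovery(fetch) -> DiscoverResult:
    """Call a source; report failures in the result instead of raising."""
    result = DiscoverResult()
    try:
        rows = fetch()
    except Exception as exc:  # deliberately broad: the caller is an agent
        result.errors.append(f"{type(exc).__name__}: {exc}")
        return result
    result.models_added = len(rows)
    return result


def broken_source():
    raise ConnectionError("registry unreachable")


ok = run_discovery(lambda: [{"name": "fraud_scoring"}, {"name": "churn_scorer"}])
failed = run_discovery(broken_source)
print(ok)
print(failed)
```

Returning a well-formed result in both cases means the calling agent can read the `errors` list and decide what to try next, rather than losing the turn to an exception.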
## Your docs are an agent surface, too

These docs publish [`/llms.txt`](../llms.txt) and [`/llms-full.txt`](../llms-full.txt), and every page is fetchable as raw Markdown by appending `.md` to its path. Point an IDE agent at them and it learns model-ledger without leaving the editor — fitting for a tool whose product is an MCP server.

---

# source: guides/backends.md

# Choosing a backend

Storage is a `LedgerBackend` protocol, so the choice is one line and never leaks into your code. Start simple; upgrade when you need scale.

| Backend | Use it for | One-liner |
|---------|-----------|-----------|
| **In-memory** | Tests, demos, throwaway exploration | `Ledger()` |
| **SQLite** | Local persistence, single user, zero infra | `Ledger.from_sqlite("inv.db")` |
| **JSON files** | Git-friendly, human-readable, diff-able inventory | `Ledger(JsonFileLedgerBackend("./inv"))` |
| **Snowflake** | Production, org-scale, shared truth | `Ledger.from_snowflake(conn, schema="DB.MODEL_LEDGER")` |
| **HTTP** | Talk to a remote model-ledger REST service | `Ledger(HttpLedgerBackend(url))` |

```python
from model_ledger import Ledger
from model_ledger.backends.json_files import JsonFileLedgerBackend
from model_ledger.backends.http import HttpLedgerBackend

Ledger()                                                # in-memory
Ledger.from_sqlite("./inventory.db")                    # SQLite
Ledger(JsonFileLedgerBackend("./inventory"))            # JSON files
Ledger.from_snowflake(conn, schema="DB.MODEL_LEDGER")   # Snowflake
Ledger(HttpLedgerBackend("https://model-ledger:8000"))  # remote REST
```

## JSON files are git-friendly

The default JSON layout is meant to be inspected, diffed, and version-controlled — your inventory as plain text:

```
inventory/
├── models/
│   ├── fraud_scoring.json
│   └── churn_predictor.json
├── snapshots/
│   ├── a1b2c3d4.json
│   └── e5f6g7h8.json
└── tags/
    └── {model_hash}/production.json
```

## Serving and the CLI

The CLI launches either agent or HTTP surfaces over any backend:

```bash
model-ledger serve --backend sqlite --path ./inventory.db --port 8000
model-ledger mcp --backend snowflake --schema DB.MODEL_LEDGER
```

Snowflake reads credentials from the environment (`SNOWFLAKE_ACCOUNT`, `SNOWFLAKE_USER`, and either `SNOWFLAKE_PASSWORD` or `SNOWFLAKE_AUTHENTICATOR=externalbrowser` for SSO). Install the extra first: `pip install "model-ledger[snowflake]"`.

## Bring your own

Anything that satisfies the `LedgerBackend` protocol works — Postgres, DynamoDB, a graph DB. Implement the protocol methods and pass an instance to `Ledger(...)`. See the [API reference](../reference/index.md) for the protocol surface.

---

# source: guides/cli.md

# CLI

Install the CLI extra, then `model-ledger --help` lists everything:

```bash
pip install "model-ledger[cli]"
model-ledger --help
```

The CLI has two jobs: **launch the agent and HTTP surfaces** (the bridge to the rest of this documentation), and **work with a local inventory** from the terminal.

## Launch a surface

These serve the [Ledger](../reference/index.md) over any [backend](backends.md) — in-memory, SQLite, JSON, Snowflake, or a remote HTTP service.

=== "MCP (for agents)"

    ```bash
    model-ledger mcp                                       # in-memory
    model-ledger mcp --demo                                # sample inventory
    model-ledger mcp --backend sqlite --path ./inv.db      # persistent
    model-ledger mcp --backend snowflake --schema DB.MODEL_LEDGER
    model-ledger mcp --backend http --path https://model-ledger.internal:8000
    ```

=== "REST API"

    ```bash
    model-ledger serve --demo --port 8000
    # → OpenAPI docs at http://localhost:8000/docs
    ```

`--backend` accepts `memory` · `sqlite` · `json` · `snowflake` · `http`; `--path` is the file path (sqlite/json) or URL (http); Snowflake reads credentials from the environment (see [Choosing a backend](backends.md)).

## Work with a local inventory

These commands operate on a local file-based inventory (`--db`, default `inventory.db` or `$MODEL_LEDGER_DB`) and render as a table or `--format json`.

| Command | What it does |
|---|---|
| `model-ledger list` | List registered models |
| `model-ledger show <name>` | Show one model's details and versions |
| `model-ledger validate <name> --profile <profile>` | Check a model against a compliance profile (`sr_11_7`, `eu_ai_act`, `nist_ai_rmf`) |
| `model-ledger audit-log <name>` | Print the model's audit trail |
| `model-ledger export --output <path>` | Export an audit pack |
| `model-ledger introspect --allow-pickle` | Extract algorithm/features from a fitted model file |

```bash
model-ledger list --format json
model-ledger validate credit_scorecard --profile sr_11_7
model-ledger audit-log credit_scorecard
```

!!! info "Which command for which surface"
    `mcp` and `serve` expose the full [event-log Ledger](../concepts/snapshot.md) — the one the [SDK](../quickstart.md), [agents](agents.md), and [REST API](backends.md) all share. Use them to point Claude or a dashboard at your inventory. The `validate` profiles map to the frameworks in the [Governance guide](../governance.md).

---

# source: guides/connectors.md

# Connectors & discovery

A connector emits `DataNode`s from a source system. Add them to the ledger and call `connect()` — the cross-platform graph assembles itself from port matching. Three factory connectors ship in core; anything else is a small protocol implementation.

## SQL databases

```python
from model_ledger import Ledger, sql_connector

ledger = Ledger.from_sqlite("./inventory.db")

# Simple: read a registry table
models = sql_connector(
    name="model_registry",
    connection=my_db,
    query="SELECT name, owner, status FROM ml_models WHERE active = true",
    name_column="name",
)

# Advanced: auto-parse SQL to extract table dependencies
etl_jobs = sql_connector(
    name="etl_scheduler",
    connection=my_db,
    query="SELECT job_name, raw_sql, cron FROM scheduled_jobs",
    name_column="job_name",
    sql_column="raw_sql",  # FROM/JOIN → inputs, INSERT/CREATE → outputs
)

ledger.add(models.discover())
ledger.add(etl_jobs.discover())
ledger.connect()  # links ETL outputs to model inputs automatically
```

## REST APIs

Works with MLflow, SageMaker, Vertex AI, or any JSON API:

```python
from model_ledger import rest_connector

ml_models = rest_connector(
    name="mlflow",
    url="https://mlflow.internal/api/2.0/mlflow/registered-models/list",
    headers={"Authorization": "Bearer ..."},
    items_path="registered_models",
    name_field="name",
)
ledger.add(ml_models.discover())
```

## GitHub repos (pipelines-as-code)

Discover Airflow DAGs, dbt projects, or scoring pipelines from config files:

```python
from model_ledger import github_connector

pipelines = github_connector(
    name="ml_pipelines",
    repos=["myorg/ml-scoring"],
    token="ghp_...",
    project_path="projects",
    config_file="deploy.yaml",
    parser=my_yaml_parser,  # (project_name, file_content) -> DataNode
)
ledger.add(pipelines.discover())
```

## Custom connectors

Implement the `SourceConnector` protocol — a `name` and a `discover()` returning `DataNode`s — for anything the factories don't cover:

```python
import boto3  # AWS SDK, provides the SageMaker client used below

from model_ledger import DataNode

class SageMakerConnector:
    name = "sagemaker"

    def discover(self) -> list[DataNode]:
        endpoints = boto3.client("sagemaker").list_endpoints()["Endpoints"]
        return [
            DataNode(ep["EndpointName"], platform="sagemaker",
                     outputs=[ep["EndpointName"]],
                     metadata={"status": ep["EndpointStatus"]})
            for ep in endpoints
        ]

ledger.add(SageMakerConnector().discover())
ledger.connect()
```

!!! tip "Every connector is a growth event"
    Each new connector extends the discovery surface — a node in your warehouse links to a model in MLflow links to a queue in your alerting system, with no shared ID scheme. That's how one graph spans every platform.

## Recurring discovery

Run connectors on a schedule (cron, Airflow, Prefect) writing to a shared backend. `add()` is idempotent — it content-hashes nodes and skips unchanged ones — and a `last_seen` timestamp is updated every run, so you can detect models that have gone silent.

See the recipe: [Discover from a SQL registry](../recipes/discover-sql.md).

---

# source: concepts/composite.md

# Composites

A regulator doesn't approve "a SQL job." They approve a **Credit Decision System**. But that system is really a scorecard, some policy rules, and an ETL pipeline — each of which deserves its own governance.

A **composite** is the business-level entity that aggregates technical components.
Critically, a member *is itself a model* — so it has its own owner, history, and validation. Composites are the layer no plain registry or catalog models.

## Register a group and its members

`register_group()` creates the composite and links each member with the `member_of` relationship:

```python
from model_ledger import Ledger

ledger = Ledger.from_sqlite("./inventory.db")

group = ledger.register_group(
    name="Credit Scorecard",
    owner="risk-team",
    model_type="ml_model",
    tier="high",
    purpose="Credit risk scoring pipeline",
    members=["feature_pipeline", "scoring_model", "alert_queue"],
    actor="system",
)
```

```mermaid
graph TD
    G["Credit Scorecard<br/>composite · tier: high"]
    G --- M1["feature_pipeline"]
    G --- M2["scoring_model"]
    G --- M3["alert_queue"]
    classDef ink fill:#1c1a17,color:#f7f3ec,stroke:#000;
    classDef ox fill:#efe8da,stroke:#7a1a1a,color:#1c1a17;
    class G ink;
    class M1,M2,M3 ox;
```

## Membership is an event, too

Add and remove members over time — each change is recorded as a snapshot, so you can ask *who belonged to this system on any past date*:

```python
ledger.add_member("Credit Scorecard", "challenger_model", role="challenger", actor="risk-team")
ledger.remove_member("Credit Scorecard", "scoring_model", actor="risk-team")

ledger.members("Credit Scorecard")   # current members (replayed from the event log)
ledger.groups("scoring_model")       # which composites a model belongs to

from datetime import datetime
ledger.membership_at("Credit Scorecard", datetime(2025, 12, 31))  # membership as of a date
```

## Roll-up view

`composite_summary()` aggregates a composite and its members into a single governance view — tiers, statuses, open observations, and validation state across the whole system:

```python
summary = ledger.composite_summary("Credit Scorecard")
```

This is what makes composites the **primary inventory entry** for governance: an examiner reads ~one entry per business system, and every technical component beneath it remains individually traceable.

!!! note "Observations & validations"
    Composites also carry governance events — `record_observation()`, `resolve_observation()`, and `record_validation()` — so findings and validation outcomes live in the same immutable log as everything else. See the [API reference](../reference/index.md).

---

# source: concepts/datanode.md

# DataNode & the graph

The core insight: **a model, a rule, an ETL job, and an alert queue are the same shape.** Each consumes some things and produces others. So they're all one type — `DataNode` — and the dependency graph falls out of matching what they produce to what others consume.
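
To make "matching what they produce to what others consume" concrete, here is a minimal pure-Python sketch of the idea. It is an illustration only, not model-ledger's implementation: `Node` and `match_ports` are hypothetical names, and the case-insensitive comparison follows the matching rule this page describes for ports.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a DataNode: a name plus port names."""
    name: str
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)


def match_ports(nodes: list[Node]) -> list[tuple[str, str, str]]:
    """Return (producer, port, consumer) edges wherever an output name equals an input name."""
    # Index every output port by its lowercased name.
    producers: dict[str, list[str]] = {}
    for node in nodes:
        for port in node.outputs:
            producers.setdefault(port.lower(), []).append(node.name)

    # Each input port links back to every node that produces that name.
    edges = []
    for node in nodes:
        for port in node.inputs:
            for producer in producers.get(port.lower(), []):
                if producer != node.name:  # skip self-loops
                    edges.append((producer, port, node.name))
    return edges


nodes = [
    Node("segmentation", outputs=["customer_segments"]),
    Node("fraud_scorer", inputs=["customer_segments"], outputs=["risk_scores"]),
    Node("fraud_alerts", inputs=["risk_scores"]),
]
print(match_ports(nodes))
# [('segmentation', 'customer_segments', 'fraud_scorer'), ('fraud_scorer', 'risk_scores', 'fraud_alerts')]
```

No node names another node. The edges come entirely from the port names, which is why nodes discovered on different platforms can link up without a shared ID scheme.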

## A node is what it reads and writes

```python
from model_ledger import DataNode

DataNode(
    name="fraud_scorer",
    platform="ml",
    inputs=["customer_features"],   # what it consumes
    outputs=["risk_scores"],        # what it produces
    metadata={"framework": "xgboost", "owner": "risk-team"},
)
```

`inputs` and `outputs` are **ports** — the names of the data flowing in and out. A plain string becomes a [`DataPort`](#dataport-precision) automatically.

## The graph builds itself

You never draw edges. You call `connect()`, and every place an output port name matches an input port name becomes a dependency:

```python
from model_ledger import Ledger, DataNode

ledger = Ledger()
ledger.add([
    DataNode("segmentation", platform="etl", outputs=["customer_segments"]),
    DataNode("fraud_scorer", platform="ml", inputs=["customer_segments"], outputs=["risk_scores"]),
    DataNode("fraud_alerts", platform="alerting", inputs=["risk_scores"]),
])
ledger.connect()

ledger.trace("fraud_alerts")       # ['segmentation', 'fraud_scorer', 'fraud_alerts']
ledger.upstream("fraud_alerts")    # everything that feeds it
ledger.downstream("segmentation")  # everything that depends on it
```

```mermaid
graph LR
    A["segmentation"] -->|customer_segments| B["fraud_scorer"] -->|risk_scores| C["fraud_alerts"]
    classDef n fill:#efe8da,stroke:#7a1a1a,color:#1c1a17;
    class A,B,C n;
```

This is why discovery scales: a connector just emits `DataNode`s with their ports, and the cross-platform graph assembles itself — an ETL job in your warehouse links to a model in MLflow links to a queue in your alerting system, with no shared ID scheme.

## DataPort precision

When two models legitimately write a table with the same name, a bare port name would collide. `DataPort` carries optional schema to disambiguate — edges only form when the schema matches too:

```python
from model_ledger import DataNode, DataPort

DataNode("check_rules", outputs=[DataPort("alerts", model_name="checks")])
DataNode("card_rules", outputs=[DataPort("alerts", model_name="cards")])
DataNode("check_queue", inputs=[DataPort("alerts", model_name="checks")])
# check_queue connects to check_rules only — model_name must match.
```

Port matching is case-insensitive, and schema values support `%` wildcards.

## From node to governed model

A `DataNode` gives you structure. To give a node an **identity and history** — owner, risk tier, purpose, and an audit trail — you [`register()`](../reference/index.md) it as a [`ModelRef`](snapshot.md) and [`record()`](snapshot.md) events against it. Discovery and registration are two views of the same inventory: the graph (what connects to what) and the ledger (what each thing *is* and how it changed).

[Next: Snapshots & the event log :octicons-arrow-right-24:](snapshot.md)

---

# source: concepts/index.md

# Concepts

model-ledger is small on purpose. Three ideas carry the whole system.

- :material-graph-outline:{ .lg }  __[DataNode & the graph](datanode.md)__

    ---

    Everything is a `DataNode` with typed input/output ports. Declare what a node reads and writes; the dependency graph builds itself from port matching.

- :material-history:{ .lg }  __[Snapshots & the event log](snapshot.md)__

    ---

    A model is an identity (`ModelRef`). Everything that happens to it is an immutable, content-addressed `Snapshot`. The inventory is an append-only log.

- :material-layers-outline:{ .lg }  __[Composites](composite.md)__

    ---

    Governed groups whose members are themselves models. A "credit decision system" that rolls up its scorecard, policy rules, and ETL — each governed in its own right.

## How they fit together

```mermaid
graph TB
    subgraph identity ["Identity"]
        REF["ModelRef<br/>name · owner · type · tier · purpose"]
    end
    subgraph history ["History (append-only)"]
        S1["Snapshot<br/>registered"] --> S2["Snapshot<br/>retrained"] --> S3["Snapshot<br/>validated"]
    end
    subgraph graph ["Graph"]
        N1["DataNode"] -->|port match| N2["DataNode"]
    end
    REF --- S1
    REF -.is a node in.- N1
    classDef ink fill:#1c1a17,color:#f7f3ec,stroke:#000;
    classDef ox fill:#7a1a1a,color:#fff,stroke:#5a1010;
    class REF ink;
    class S1,S2,S3 ox;
```

- **Identity** is the minimum a regulator needs: who owns it, what kind of model, how risky, what it's for.
- **History** is every change, immutable and ordered. You can ask the inventory what it looked like on any past date.
- **Graph** is how models relate. Declare ports; dependencies follow.

A fourth idea — **compliance profiles** (SR 11-7, EU AI Act, NIST AI RMF) — reads this data to check completeness. It's a pluggable layer, not part of the core model; see the [API reference](../reference/index.md).

---

# source: concepts/snapshot.md

# Snapshots & the event log

Most registries store *current state* and overwrite it. model-ledger stores *what happened* and never overwrites anything. The inventory is an **append-only event log** — which is exactly the shape an auditor asks for.

## Identity vs. history

A model splits into two things:

| | What it is | Mutable? |
|---|---|---|
| [`ModelRef`](../reference/index.md) | The regulatory identity: `name`, `owner`, `model_type`, `tier`, `purpose`, `status` | A stable identity (`model_hash`) |
| [`Snapshot`](../reference/index.md) | One immutable observation: an event with a `timestamp`, `actor`, `event_type`, and a free-form `payload` | Never — content-addressed |

```python
from model_ledger import Ledger

ledger = Ledger.from_sqlite("./inventory.db")

ref = ledger.register(
    name="fraud_scoring",
    owner="risk-team",
    model_type="ml_model",
    tier="high",
    purpose="Real-time fraud detection",
)
ref.model_hash  # stable identity, derived from name + owner + created_at
```

## Every change is an event

`record()` appends a Snapshot.
The `payload` is **schema-free** — record whatever matters, no migrations:

```python
ledger.record("fraud_scoring", event="retrained", actor="ml-pipeline",
              payload={"accuracy": 0.94, "auc": 0.98, "features_added": ["velocity_24h"]})

ledger.record("fraud_scoring", event="validated", actor="mrm-team",
              payload={"profile": "sr_11_7", "validator": "mrm-team", "result": "pass"})

for s in ledger.history("fraud_scoring"):
    print(s.timestamp, s.event_type, s.payload)
```

Each Snapshot is **content-addressed**: its `snapshot_hash` is derived from the model hash, the timestamp, and the payload. Identical content can't be silently duplicated, and the chain is tamper-evident.

```mermaid
graph LR
    R["ModelRef<br/>fraud_scoring"]
    R --> A["registered"] --> B["retrained<br/>acc 0.94"] --> C["validated<br/>sr_11_7 · pass"]
    classDef ink fill:#1c1a17,color:#f7f3ec,stroke:#000;
    classDef ox fill:#7a1a1a,color:#fff,stroke:#5a1010;
    class R ink;
    class A,B,C ox;
```

## Point-in-time reconstruction

Because nothing is overwritten, you can ask the inventory what it looked like on any date — the answer an examiner actually wants:

```python
from datetime import datetime, timezone

inventory = ledger.inventory_at(datetime.now(timezone.utc))
# pass any datetime — a past date reconstructs the inventory as it stood then
```

See the recipe: [Point-in-time inventory](../recipes/point-in-time.md).

## Tags: mutable pointers over an immutable log

The log is immutable, but you still want moving labels like `production` or `latest-validated`. A [`Tag`](../reference/index.md) is a named pointer to a specific Snapshot; moving it forward is itself recorded.

```python
ledger.tag("fraud_scoring", "production")  # points at the current latest snapshot
```

[Next: Composites :octicons-arrow-right-24:](composite.md)