Otterdog Architecture
Fair warning
This looks complex. It is complex. But reading this page a few times is better than randomly poking at the code for weeks.
otterdog plan & otterdog apply
Overview
Otterdog is a stateless GitHub Infrastructure-as-Code tool.
At a high level it:
- Reads org-level configuration (JSON / Jsonnet)
- Builds a desired in-memory model
- Fetches the actual state from GitHub via providers
- Computes a structured diff in the form of LivePatch operations
- Either prints the diff (plan) or applies it (apply) via GitHub APIs
flowchart TB
subgraph shared
cli_apply([otterdog plan / apply])
cli_apply --> load_config(Load org config)
load_config --> desired_config[Desired State]
cli_apply --> load_live(Fetch from GitHub)
load_live --> current_state[Current State]
desired_config --> live_gen(Generate Live Patch)
current_state --> live_gen
live_gen --> live_patch[Live Patch]
end
subgraph plan
live_patch --> print(Print Diff)
end
subgraph apply
live_patch --> apply_changes(Apply Changes)
end
Important note
Unlike most other infrastructure-as-code tools, apply does not apply a previously generated plan. There is no persistent plan or state file — apply recomputes the diff from the current GitHub state and the config at runtime. This stateless design is intentional. While it might seem vulnerable to race conditions, applying an outdated plan would be even more problematic.
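The stateless behaviour can be illustrated with a toy sketch. Nothing here is Otterdog's actual code: compute_patches, plan, apply, and the dict-based state are hypothetical stand-ins for config loading, provider calls, and LivePatch generation.

```python
State = dict  # hypothetical: field name -> value


def compute_patches(desired: State, live: State) -> list:
    """Return (field, current, desired) for every field that differs."""
    return [(k, live.get(k), v) for k, v in desired.items() if live.get(k) != v]


def plan(load_desired, fetch_live) -> list:
    # plan only computes and returns the diff; nothing is persisted
    return compute_patches(load_desired(), fetch_live())


def apply(load_desired, fetch_live, write) -> list:
    # apply does not read a saved plan: it recomputes the diff from scratch
    patches = compute_patches(load_desired(), fetch_live())
    for field, _current, desired in patches:
        write(field, desired)
    return patches


# Toy "GitHub" whose state changes between plan and apply
github = {"description": "old", "has_wiki": True}
config = {"description": "new", "has_wiki": False}

planned = plan(lambda: config, lambda: dict(github))
github["has_wiki"] = False  # someone changes a setting out of band
applied = apply(lambda: config, lambda: dict(github), github.__setitem__)
```

In this toy run, plan reports two differences, but apply performs only one write, because it sees the out-of-band change rather than replaying a stale plan.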
Load Org Config & Desired State
This phase loads the desired state purely from configuration files.
- Entry point: GitHubOrganization.load_from_file() loads the org-specific config file
- Model construction:
    - Configuration data is deserialized via from_model_data() (inherited from ModelObject)
    - Each model class recursively deserializes itself and nested resources by default, treating all members as optional nested objects
    - Override get_mapping_from_model() in individual model classes to customize deserialization (rare)
    - The Jsonnet configuration format is designed to align closely with the ModelObject structure
- At this point:
    - No GitHub API calls have occurred
    - The in-memory model represents what GitHub should look like
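A minimal sketch of this recursive deserialization pattern follows. ToyModelObject, ToyRepository, ToyOrganization, and the NESTED lookup table are hypothetical; only the method names from_model_data and get_mapping_from_model come from the description above, and their real signatures may differ.

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class ToyModelObject:
    # Hypothetical: field name -> nested model class (unannotated, so not a dataclass field)
    NESTED = {}

    @classmethod
    def get_mapping_from_model(cls, data):
        # Default mapping: every declared field is optional; nested lists recurse
        mapping = {}
        for f in dataclasses.fields(cls):
            value = data.get(f.name)
            nested = cls.NESTED.get(f.name)
            if nested is not None and value is not None:
                value = [nested.from_model_data(item) for item in value]
            mapping[f.name] = value
        return mapping

    @classmethod
    def from_model_data(cls, data):
        return cls(**cls.get_mapping_from_model(data))


@dataclasses.dataclass
class ToyRepository(ToyModelObject):
    name: Optional[str] = None
    description: Optional[str] = None


@dataclasses.dataclass
class ToyOrganization(ToyModelObject):
    NESTED = {"repositories": ToyRepository}
    github_id: Optional[str] = None
    repositories: Optional[list] = None


# Parsed config data as it might look after Jsonnet evaluation (shape is an assumption)
org = ToyOrganization.from_model_data(
    {"github_id": "example-org", "repositories": [{"name": "docs"}]}
)
```

Because the config keys match the field names, the default mapping is enough here, which mirrors why overriding the mapping is described as rare for this direction.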
Fetch from GitHub & Current State
This phase loads the actual (live) state purely from GitHub.
- Entry point:
    - GitHubOrganization.load_from_provider() is called with the organization name and a JsonnetConfig for defaults
    - Multiple independent tasks are launched concurrently to retrieve all organization data
- Provider:
    - GitHubProvider (otterdog/providers/github/__init__.py) manages API interactions
    - Three sub-components query distinct APIs:
        - REST API (rest/): classic GitHub REST endpoints
        - GraphQL API (graphql.py): used for complex queries
        - Web UI (web.py): for settings only available through the web interface
    - Provides get_* methods that return raw API responses (untyped dicts/lists)
- Model construction:
    - Each model class deserializes itself and nested resources via from_provider_data() (inherited from ModelObject)
    - By default, recursively processes all class members as optional nested objects
    - Override get_mapping_from_provider() in individual model classes to customize (typically required)
- Result:
    - A second in-memory model mirroring GitHub's current state
    - Same class structure, different data source
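The fetch phase can be sketched with asyncio. ToyProvider, its get_org_settings and get_repos methods, the payload keys, and map_settings_from_provider are all illustrative assumptions, not the real GitHubProvider API; the point is only the shape: concurrent raw lookups followed by a mapping step from provider keys to model fields.

```python
import asyncio


class ToyProvider:
    """Hypothetical provider returning raw, untyped API-style payloads."""

    async def get_org_settings(self, org_id):
        await asyncio.sleep(0)  # stands in for a network round trip
        return {"login": org_id, "two_factor_requirement_enabled": True}

    async def get_repos(self, org_id):
        await asyncio.sleep(0)
        return [{"name": "docs", "has_wiki": False}]


def map_settings_from_provider(raw):
    # Custom mapping: provider key names rarely match model field names
    return {
        "github_id": raw["login"],
        "two_factor_requirement": raw["two_factor_requirement_enabled"],
    }


async def load_from_provider(org_id, provider):
    # Independent lookups are started together and awaited as a group
    raw_settings, raw_repos = await asyncio.gather(
        provider.get_org_settings(org_id),
        provider.get_repos(org_id),
    )
    return {
        "settings": map_settings_from_provider(raw_settings),
        "repositories": raw_repos,
    }


live = asyncio.run(load_from_provider("example-org", ToyProvider()))
```

The renaming in map_settings_from_provider hints at why a custom provider mapping is typically required: API payloads are shaped by GitHub, not by the model.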
Diff & LivePatch Generation
This is the core of Otterdog — where desired and live models are compared and converted into actionable patches.
- Entry point: DiffOperation.generate_diff() (otterdog/operations/diff_operation.py)
- Patch generation:
    - Each model inherits generate_live_patch from ModelObject (otterdog/models/__init__.py)
    - Models can override this method for custom behavior; otherwise the inherited implementation applies
    - The inherited generate_live_patch method calls get_difference_from on the model to compare fields and produce Change objects:
        - Fields marked with model_only are excluded from diff computation
        - Custom filtering logic can be implemented by overriding the include_field_for_diff_computation and is_key_valid_for_diff_computation methods
    - Differences are collected and stored in the changes field within the LivePatch object
    - Model classes serialize themselves to provider format via to_provider_data() (inherited from ModelObject)
    - By default, to_provider_data() recurses into all class members as optional nested objects; override get_mapping_to_provider in individual model classes to customize this behavior (typically required)
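The field-wise comparison can be sketched as below. This is a simplified illustration: ToyRepo, the metadata-based model_only flag, and the rule that None means "not specified" are assumptions, and the Change class here is a reduced stand-in for the real one.

```python
import dataclasses
from typing import Any, Optional


@dataclasses.dataclass(frozen=True)
class Change:
    # Simplified stand-in for the Change objects mentioned above
    from_value: Any
    to_value: Any


@dataclasses.dataclass
class ToyRepo:
    name: str
    description: Optional[str] = None
    has_wiki: Optional[bool] = None
    # Assumption: a metadata flag marks fields that exist only in the config
    template: Optional[str] = dataclasses.field(default=None, metadata={"model_only": True})

    def include_field_for_diff_computation(self, field):
        return not field.metadata.get("model_only", False)

    def get_difference_from(self, other):
        changes = {}
        for field in dataclasses.fields(self):
            if not self.include_field_for_diff_computation(field):
                continue
            desired, current = getattr(self, field.name), getattr(other, field.name)
            # Assumption: None in the desired model means "not specified"
            if desired is not None and desired != current:
                changes[field.name] = Change(current, desired)
        return changes


desired = ToyRepo("docs", description="Project docs", has_wiki=False, template="base")
live = ToyRepo("docs", description="Docs", has_wiki=False)
changes = desired.get_difference_from(live)
```

Only description ends up in changes: has_wiki already matches, and template is skipped because it is flagged as model-only even though the two values differ.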
Plan: Display Diff
The LivePatch operations are formatted and shown to the user.
- Entry point: otterdog/operations/plan.py (PlanOperation class)
- Responsibilities:
    - Format LivePatch operations into human-readable output
    - Show legends and summaries
    - Warn about read-only or ignored fields
    - Models contribute via __str__ and custom formatting
This is a rendering step only — no GitHub changes happen here. The user reviews the diff and decides whether to apply it.
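As a rough illustration of a pure rendering step, the sketch below turns patch data into text without touching any API. The output layout, format_patch, and summarize are invented for this example and do not reproduce Otterdog's actual plan output.

```python
def format_patch(resource, changes):
    """Render one patch as diff-style lines; changes maps field -> (current, desired)."""
    lines = [f"~ {resource} {{"]
    for field, (current, desired) in sorted(changes.items()):
        lines.append(f"~   {field} = {current!r} -> {desired!r}")
    lines.append("~ }")
    return lines


def summarize(patches):
    # Count changed fields across all patches for the closing summary line
    total = sum(len(changes) for _resource, changes in patches)
    return f"Plan: {total} field(s) to change in {len(patches)} resource(s)."


patches = [
    ("repository[name=docs]", {"description": ("Docs", "Project docs"), "has_wiki": (True, False)}),
]
output = [line for resource, changes in patches for line in format_patch(resource, changes)]
output.append(summarize(patches))
print("\n".join(output))
```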
Apply: Execute Changes
The user approved the changes. Time to execute them against GitHub.
- Entry point: otterdog/operations/apply.py (ApplyOperation class)
- Execution flow:
    - For each LivePatch, invoke apply_live_patch() on the corresponding model object
    - Each model class implements apply_live_patch() to translate the patch into the correct GitHub API operations
    - Operations are executed via the provider (REST, GraphQL, or web UI as appropriate)
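The dispatch from patch to provider call can be sketched as follows. RecordingProvider, ToyPatch, ToyRepoModel, and the add_repo and update_repo methods are hypothetical; the real apply_live_patch signature and the set of patch types may differ.

```python
import dataclasses
from typing import Any


class RecordingProvider:
    """Hypothetical provider that records calls instead of contacting GitHub."""

    def __init__(self):
        self.calls = []

    def add_repo(self, org_id, data):
        self.calls.append(("add_repo", org_id, data))

    def update_repo(self, org_id, name, data):
        self.calls.append(("update_repo", org_id, name, data))


@dataclasses.dataclass
class ToyPatch:
    kind: str  # "add" or "change" (simplified set of patch types)
    name: str
    data: dict[str, Any]


class ToyRepoModel:
    @classmethod
    def apply_live_patch(cls, patch, org_id, provider):
        # Translate the patch type into the matching provider operation
        if patch.kind == "add":
            provider.add_repo(org_id, {"name": patch.name, **patch.data})
        elif patch.kind == "change":
            provider.update_repo(org_id, patch.name, patch.data)
        else:
            raise ValueError(f"unsupported patch kind: {patch.kind}")


provider = RecordingProvider()
patches = [
    ToyPatch("add", "website", {"private": False}),
    ToyPatch("change", "docs", {"has_wiki": False}),
]
for patch in patches:
    ToyRepoModel.apply_live_patch(patch, "example-org", provider)
```

Keeping the translation inside the model class means the apply operation itself only iterates over patches; each resource type decides which provider calls a patch requires.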