Orion¶
Research / training / benchmarking for Constellation’s research stack.
Orion is the package researchers work in day to day. Built on top of Ursa and Virgo, it wraps torch_brain modules in `orion.models.*` so iteration speed isn't gated on upstream PRs.
Key features:

- Pydantic + Tyro configs — typed, validated, CLI-overridable, serialized into the run record
- Lightning trainer with pre-wired callbacks (`OrionTrainer`); DDP by default, FSDP exposed, PyTorch Monarch opt-in
- Lance-streaming dataloader that resolves an Ursa `QuerySpec` to a streaming scan
- SkyPilot for cloud orchestration; Slurm on Polaris
- ClearML for run tracking + model registry (self-hosted, R2-backed)
- Rich checkpoints with bit-exact resume state, plus a `data_hashes/manifest.json` answering "what data was this trained on?"
- Benchmarks as first-class artifacts — content-addressed in Ursa, configurable compute boundaries, partial subsets for fast in-training metrics
- Multi-stage training pipelines with full lineage propagation
- Aggressive pre-flight validation so doomed runs never reach a GPU
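The typed-config item above can be sketched as a minimal Pydantic model. This is a hedged illustration: the `TrainConfig` class and its fields (`lr`, `batch_size`, `precision`) are invented for this sketch, not Orion's actual schema, and the Tyro CLI hookup is shown only as a comment.

```python
from pydantic import BaseModel, Field, ValidationError

class TrainConfig(BaseModel):
    """Hypothetical run config; field names are illustrative, not Orion's real schema."""
    lr: float = Field(3e-4, gt=0)       # validated: must be positive
    batch_size: int = Field(64, gt=0)
    precision: str = "bf16-mixed"

# With Tyro, CLI overrides would look roughly like:
#   cfg = tyro.cli(TrainConfig)         # e.g. `python train.py --lr 1e-3`
cfg = TrainConfig(lr=1e-3)

# Serialized into the run record for reproducibility.
run_record = cfg.model_dump()

# Validation rejects doomed values before any GPU is touched.
try:
    TrainConfig(lr=-1.0)
except ValidationError:
    bad_config_rejected = True
```

The same model serves three roles at once: CLI surface (via Tyro), pre-flight validator, and the serialized config stored alongside the run.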
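The checkpoint data-manifest idea can be sketched with the standard library alone. The manifest layout and the choice of sha256 here are assumptions for illustration, not Orion's actual `data_hashes/manifest.json` format.

```python
import hashlib
import json
from pathlib import Path
from tempfile import TemporaryDirectory

def data_manifest(files):
    """Map each data file to a content hash, so a checkpoint can answer
    'what data was this trained on?' (sha256 is an assumed hash choice)."""
    return {str(p): hashlib.sha256(p.read_bytes()).hexdigest() for p in files}

with TemporaryDirectory() as tmp:
    # A stand-in for a training data shard (name is illustrative).
    shard = Path(tmp) / "shard-000.lance"
    shard.write_bytes(b"fake training data")
    manifest = data_manifest([shard])
    # Would be persisted next to the checkpoint, e.g. data_hashes/manifest.json
    manifest_json = json.dumps(manifest, indent=2)

# Content addressing: identical bytes always yield the identical hash.
same_bytes_same_hash = (
    hashlib.sha256(b"fake training data").hexdigest() == next(iter(manifest.values()))
)
```

Because hashes are content-derived, two runs trained on byte-identical data produce identical manifests, which is what makes the "what data was this trained on?" question answerable after the fact.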
Where this fits¶
Orion is one of three packages in Constellation’s research stack:
- Ursa — database / data layer
- Virgo — DAG-based preprocessing
- Orion (this site) — research / training / benchmarking
Full architecture: Research Stack Architecture (Notion).
Status¶
🌱 Early bootstrap. Implementation tracked in the Linear Orion project.
Phasing:
- M1 — Foundations (in progress)
- M2 — MVP (Phase 3)
- M3 — Benchmarking Framework (Phase 5)
- M4 — Multi-stage Training & Lineage (Phase 5)
- M5 — Production-scale (Phase 4)
- M6 — Polish & onboarding