Building a Type-Safe Ontology Runtime

The Problem with Ontology Tooling

Open Protégé. Define your classes and relations. Export a diagram. Hand it to a development team. Watch them build something different in code. The ontology lives in an OWL file that nobody reads after the initial design phase. Within a quarter, the code and the model have diverged. Within a year, the OWL file is fiction.

When we started building P3, we wanted to avoid that drift. The goal was to generate runtime code directly from the ontology: TypeScript types, runtime validators, and API contracts, so that constructing an invalid entity in code fails at compile time. If the ontology says a WorkOrder must reference a Machine, then code that constructs a WorkOrder without a Machine reference should not compile.

What We Needed

The requirements were specific:

Compile-time safety. If a developer tries to assign a string to a field that the ontology defines as a reference to another entity, the TypeScript compiler rejects it. Not a runtime error. Not a test failure. A red squiggle in the IDE before the code is even saved.
Runtime validation. Compile-time types only protect you from your own code. Data coming in from external systems — API requests, message queues, file imports — arrives as untyped JSON. The runtime must validate it against the same ontological constraints that the compiler enforces.
Single source of truth. The ontology definition must be the single artifact from which types, validators, and API schemas are generated. No manual synchronization. No "update the types when you change the ontology" checklist items that someone will inevitably forget.
Evolvability. Ontologies change. New classes are added, relations are refined, constraints are tightened. The generated code must handle schema evolution gracefully — old consumers should not break when new properties are added.

The Pipeline: Ontology to Types to Runtime

We use a four-stage code generation pipeline. Each stage transforms the ontology into a progressively more concrete artifact.

Step 1: Ontology Parsing

The formal ontology — classes, object properties, data properties, axioms, cardinality constraints — is defined in a structured format. We parse it into an intermediate representation (IR) that captures the full semantics: class hierarchies, property domains and ranges, cardinality restrictions, disjointness axioms, and defined classes (classes defined by necessary and sufficient conditions).

This IR is the backbone of everything downstream. If a class WorkOrder has a mandatory object property assignedTo with range Machine and cardinality exactly 1, the IR captures all of that. If Machine is a subclass of Asset, the IR captures the full inheritance chain.
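To make this concrete, here is a minimal sketch of what such an IR might look like in TypeScript. The shape and every name in it (OntologyIR, ClassDef, ObjectPropertyDef, ancestorsOf) are illustrative assumptions, not P3's actual representation, and it models only single inheritance and object properties.

```typescript
// Hypothetical IR shape: all names here are illustrative assumptions.
interface Cardinality {
  min: number;
  max: number | null; // null means unbounded
}

interface ObjectPropertyDef {
  name: string;
  range: string; // name of the target class
  cardinality: Cardinality;
}

interface ClassDef {
  name: string;
  superClass: string | null; // single inheritance, for simplicity
  objectProperties: ObjectPropertyDef[];
}

type OntologyIR = Map<string, ClassDef>;

// Walk the superClass links to recover the full inheritance chain.
// The `seen` set guards against accidental cycles in the input.
function ancestorsOf(ir: OntologyIR, className: string): string[] {
  const chain: string[] = [];
  const seen = new Set<string>([className]);
  let current = ir.get(className)?.superClass ?? null;
  while (current !== null && !seen.has(current)) {
    chain.push(current);
    seen.add(current);
    current = ir.get(current)?.superClass ?? null;
  }
  return chain;
}

const ir: OntologyIR = new Map<string, ClassDef>([
  ["Asset", { name: "Asset", superClass: null, objectProperties: [] }],
  ["Machine", { name: "Machine", superClass: "Asset", objectProperties: [] }],
  ["WorkOrder", {
    name: "WorkOrder",
    superClass: null,
    objectProperties: [
      // "mandatory, exactly 1" is encoded as min = max = 1
      { name: "assignedTo", range: "Machine", cardinality: { min: 1, max: 1 } },
    ],
  }],
]);
```

Every downstream generator can then work from lookups like ancestorsOf(ir, "Machine") rather than re-parsing the ontology source.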

Step 2: TypeScript Type Generation

The IR feeds a code generator that emits TypeScript interfaces and type aliases. This is where the design gets interesting. Consider a simplified example:

// Generated from ontology — DO NOT EDIT

interface Asset {
  readonly id: EntityId<"Asset">;
  name: string;
  location: EntityRef<"Facility">;
}

interface Machine extends Asset {
  readonly id: EntityId<"Machine">;
  machineType: MachineType;
  status: MachineStatus;
  currentWorkOrder: EntityRef<"WorkOrder"> | null;
  capabilities: EntityRef<"Operation">[];
}

interface WorkOrder {
  readonly id: EntityId<"WorkOrder">;
  assignedTo: EntityRef<"Machine">;  // mandatory — no null
  operations: [EntityRef<"Operation">, ...EntityRef<"Operation">[]]; // non-empty
  priority: Priority;
  status: WorkOrderStatus;
  dueDate: ISODateTime;
}

A few things to notice. EntityId is a branded type — an EntityId<"Machine"> is not assignable to an EntityId<"WorkOrder">, even though both are strings at runtime. This prevents a common class of bugs where entity IDs are mixed up. EntityRef is similarly branded, ensuring you cannot accidentally reference the wrong entity type.
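The definitions of EntityId and EntityRef are not shown above. One common way to encode such brands, given here as a sketch under our own assumptions (the phantom __entity property, the wrapper shape of EntityRef, and the helpers entityIdOf and refTo are all hypothetical), looks like this:

```typescript
// Assumed brand encoding: a phantom property that exists only at the type level.
type EntityId<T extends string> = string & { readonly __entity: T };

// Assumed reference shape: a small wrapper object carrying a branded ID.
type EntityRef<T extends string> = { readonly ref: EntityId<T> };

// Hypothetical constructor: the single place where a raw string gets branded.
function entityIdOf<T extends string>(_entity: T, raw: string): EntityId<T> {
  return raw as EntityId<T>;
}

// Hypothetical helper that wraps a branded ID into a reference.
function refTo<T extends string>(id: EntityId<T>): EntityRef<T> {
  return { ref: id };
}

const machineId = entityIdOf("Machine", "m-001");
const machineRef: EntityRef<"Machine"> = refTo(machineId);

// Uncommenting the next line should produce a compile error, because the
// "Machine" brand is not assignable to the "WorkOrder" brand:
// const wrong: EntityId<"WorkOrder"> = machineId;

// At runtime the brand is erased and the ID is an ordinary string.
const isPlainString = typeof machineId === "string";
```

Under this particular encoding, EntityId<"Machine"> is also not assignable to EntityId<"Asset">, so a subclass interface that narrows id would need a brand that is aware of the class hierarchy.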

The operations field on WorkOrder uses a tuple type with one required element followed by a rest element. The ontology says a WorkOrder must have at least one Operation, and the TypeScript type enforces it: assigning an empty array literal to operations is a compile error.
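A small sketch of how that tuple type behaves, with plain strings standing in for EntityRef<"Operation"> (the NonEmptyArray alias and the isNonEmpty helper are names we introduce for illustration):

```typescript
// A non-empty array type: one required element followed by any number more.
type NonEmptyArray<T> = [T, ...T[]];

// Hypothetical narrowing helper for arrays whose length is only known at runtime.
function isNonEmpty<T>(items: T[]): items is NonEmptyArray<T> {
  return items.length > 0;
}

const single: NonEmptyArray<string> = ["op-drill"];

// Uncommenting the next line should fail to compile, since the empty tuple
// is missing the required first element:
// const none: NonEmptyArray<string> = [];

// The first element is typed as string, not string | undefined.
const first: string = single[0];

// Arrays built dynamically are typed string[] and need the guard.
const fromQueue: string[] = [];
const accepted = isNonEmpty(fromQueue); // false at runtime
```

The type only describes the value at the point of assignment. Later mutation such as pop() is not tracked by the compiler, which is one more reason the runtime validators still check the constraint.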

Inheritance works as expected: Machine extends Asset, so every Machine has a name and location. This mirrors the ontology's class hierarchy directly in the type system.

Step 3: Runtime Validation

Types disappear at runtime. TypeScript compiles to JavaScript, and JavaScript does not enforce interfaces. So the same code generator that produces types also produces Zod schemas:

// Generated from ontology — DO NOT EDIT

const MachineSchema = AssetSchema.extend({
  id: entityId("Machine"),
  machineType: MachineTypeSchema,
  status: MachineStatusSchema,
  currentWorkOrder: entityRef("WorkOrder").nullable(),
  capabilities: z.array(entityRef("Operation")),
});

const WorkOrderSchema = z.object({
  id: entityId("WorkOrder"),
  assignedTo: entityRef("Machine"),
  operations: z.array(entityRef("Operation")).nonempty(),
  priority: PrioritySchema,
  status: WorkOrderStatusSchema,
  dueDate: isoDateTime(),
});

If you have used Prisma, this pattern will feel familiar. Prisma reads a schema file and generates both TypeScript types and a runtime client. The idea is similar here, but the source of truth is the ontology rather than a database schema. The generated validators also enforce semantic constraints: axioms that go beyond structural validity.

For example, an axiom like "a WorkOrder's dueDate must be after its createdDate" (createdDate is omitted from the simplified interface above) becomes a Zod .refine() call in the generated schema. These cross-field validations are derived directly from ontology axioms and enforced at every system boundary where data enters.
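As a dependency-free sketch of that kind of cross-field check, the axiom can be written as a plain predicate plus the issue it reports. The createdDate field, the constraint name WorkOrder_dueAfterCreated, and the issue shape are assumptions for illustration. In a generated schema, a predicate like dueAfterCreated is the sort of function that would be passed to .refine().

```typescript
// Only the two fields the axiom mentions; createdDate is assumed here.
interface WorkOrderDates {
  createdDate: string; // ISO 8601 date-time
  dueDate: string;     // ISO 8601 date-time
}

// Hypothetical issue shape reported when an axiom is violated.
interface AxiomIssue {
  constraint: string;
  path: string[];
  message: string;
}

// Predicate derived from the axiom "dueDate must be after createdDate".
// Unparseable dates yield NaN, so the comparison is false and the check fails closed.
function dueAfterCreated(wo: WorkOrderDates): boolean {
  return Date.parse(wo.dueDate) > Date.parse(wo.createdDate);
}

// Returns an empty list when the axiom holds.
function checkDueAfterCreated(wo: WorkOrderDates): AxiomIssue[] {
  if (dueAfterCreated(wo)) return [];
  return [{
    constraint: "WorkOrder_dueAfterCreated", // hypothetical constraint name
    path: ["dueDate"],
    message: "dueDate must be after createdDate",
  }];
}
```

Keeping the predicate separate from the issue it produces means the same function can back both the schema refinement and any direct checks in business logic.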

Step 4: API Enforcement

The generated Zod schemas become the validation layer for every API endpoint. An incoming request to create a WorkOrder is parsed through WorkOrderSchema before any business logic executes. If validation fails, the client gets a structured error response that maps directly to the ontology constraint that was violated. Not "Bad Request." But: "WorkOrder.assignedTo is required: a WorkOrder must reference exactly one Machine (constraint: WorkOrder_assignedTo_cardinality)."
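One way to produce such messages, sketched under our own assumptions: the generator emits a lookup from field paths to constraint metadata, and a small function joins validator issues against it. The ValidationIssue shape, the constraintsByField registry, and toOntologyErrors are hypothetical names, not part of any library.

```typescript
// Hypothetical issue shape: a path into the payload plus a short message.
interface ValidationIssue {
  path: string[];
  message: string;
}

interface ConstraintInfo {
  constraint: string;
  explanation: string;
}

interface OntologyError {
  field: string;
  constraint: string | null; // null when no ontology constraint is registered
  message: string;
}

// Assumed to be emitted by the generator alongside the schemas.
const constraintsByField = new Map<string, ConstraintInfo>([
  ["WorkOrder.assignedTo", {
    constraint: "WorkOrder_assignedTo_cardinality",
    explanation: "a WorkOrder must reference exactly one Machine",
  }],
]);

// Enrich raw validation issues with the ontology constraint they violate.
function toOntologyErrors(entity: string, issues: ValidationIssue[]): OntologyError[] {
  return issues.map((issue): OntologyError => {
    const field = [entity, ...issue.path].join(".");
    const info = constraintsByField.get(field);
    if (info === undefined) {
      return { field, constraint: null, message: `${field} ${issue.message}` };
    }
    return {
      field,
      constraint: info.constraint,
      message: `${field} ${issue.message}: ${info.explanation} (constraint: ${info.constraint})`,
    };
  });
}

const errors = toOntologyErrors("WorkOrder", [
  { path: ["assignedTo"], message: "is required" },
]);
```

Issues on fields without a registered constraint fall back to a plain message, so structural failures are still reported.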

This approach draws from GraphQL code generation tools like graphql-codegen, which generate TypeScript types from GraphQL schemas. But GraphQL schemas express structure. Our ontology expresses semantics. The generated code enforces not just "this field is a string" but "this field is a reference to a Machine that must be in Active status and must have the capability to perform the Operations listed in this WorkOrder."

What the Generated Types Look Like

Here is a more realistic example of generated code, showing how ontology axioms translate to type-level constraints:

// Enumerated class from ontology
type MachineStatus = "idle" | "running" | "maintenance" | "faulted";

// Defined class: ActiveMachine ≡ Machine ⊓ status ∈ {idle, running}
type ActiveMachine = Machine & { status: "idle" | "running" };

// Type guard generated from defined class axiom
function isActiveMachine(m: Machine): m is ActiveMachine {
  return m.status === "idle" || m.status === "running";
}

// Builder with ontological constraint enforcement
function createWorkOrder(
  params: Omit<WorkOrder, "id" | "status"> & {
    assignedTo: EntityRef<"Machine">;  // must be ActiveMachine at runtime
  }
): Result<WorkOrder, OntologyViolation[]> {
  const machine = resolve(params.assignedTo);
  if (!isActiveMachine(machine)) {
    return err([{
      axiom: "WorkOrder_requiresActiveMachine",
      message: `Machine ${machine.id} has status "${machine.status}" — `
             + `WorkOrders can only be assigned to active machines`,
    }]);
  }
  // ... additional axiom checks, which yield the fully checked `validated: WorkOrder`
  return ok(validated);
}

The pattern is consistent throughout: the ontology defines a constraint, the code generator translates it into a type-level restriction where possible and a runtime check where necessary, and the developer writing business logic gets immediate feedback — from the compiler, from the IDE, or from the validation layer — when they try to violate the ontology.
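The builder above relies on Result, ok, err, and OntologyViolation without defining them. A minimal sketch of how they could be defined, and of how the type guard narrows a value before it is returned, follows. These definitions, along with MachineLike and checkAssignable, are our assumptions for illustration rather than the generated code itself.

```typescript
// Assumed definitions for names the generated builder relies on.
interface OntologyViolation {
  axiom: string;
  message: string;
}

type Result<T, E> =
  | { readonly ok: true; readonly value: T }
  | { readonly ok: false; readonly error: E };

function ok<T>(value: T): Result<T, never> {
  return { ok: true, value };
}

function err<E>(error: E): Result<never, E> {
  return { ok: false, error };
}

// Cut-down stand-ins for the generated Machine and ActiveMachine types.
type MachineStatus = "idle" | "running" | "maintenance" | "faulted";
interface MachineLike {
  id: string;
  status: MachineStatus;
}
type ActiveMachineLike = MachineLike & { status: "idle" | "running" };

function isActive(m: MachineLike): m is ActiveMachineLike {
  return m.status === "idle" || m.status === "running";
}

// Runtime check for the part of the constraint the compiler cannot see.
function checkAssignable(m: MachineLike): Result<ActiveMachineLike, OntologyViolation[]> {
  if (!isActive(m)) {
    return err([{
      axiom: "WorkOrder_requiresActiveMachine",
      message: `Machine ${m.id} has status "${m.status}"`,
    }]);
  }
  return ok(m); // m is narrowed to ActiveMachineLike here
}
```

Callers must inspect the ok flag before touching value, so a violation cannot be ignored by accident.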

The Trade-Offs

Strictness vs. Flexibility

Maximum type safety has a cost: rigidity. If every ontological constraint is enforced at compile time, adding a new property to a class requires regenerating types, recompiling all consumers, and deploying everything in lockstep. That is not practical in a distributed system with multiple teams.

The compromise we settled on: mandatory constraints are strict, extensions are flexible. The core properties of a class (the ones that appear in ontology axioms) are enforced at compile time. Extended properties (those added by domain-specific ontology layers) use a typed-but-optional pattern that allows consumers to ignore properties they do not care about. The result is a hybrid: the core ontology is treated as closed (you must conform), while extensions follow the spirit of the Open World Assumption (a property a consumer does not know about is not an error, so you may add without breaking).
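A sketch of the typed-but-optional pattern, assuming a hypothetical energy-management extension layer (MachineCore, MachineEnergyExtension, and the helper functions are illustrative names):

```typescript
// Core properties come from ontology axioms and are required.
interface MachineCore {
  id: string;
  status: "idle" | "running" | "maintenance" | "faulted";
}

// Hypothetical extension layer: every property is typed but optional.
interface MachineEnergyExtension {
  ratedPowerKw?: number;
  energyClass?: "A" | "B" | "C";
}

type MachineWithExtensions = MachineCore & MachineEnergyExtension;

// A consumer written against the core compiles unchanged when extensions appear.
function describeCore(m: MachineCore): string {
  return `${m.id} is ${m.status}`;
}

// A consumer that cares about the extension must handle its absence.
function ratedPowerOrDefault(m: MachineWithExtensions, fallbackKw: number): number {
  return m.ratedPowerKw ?? fallbackKw;
}

const press: MachineWithExtensions = { id: "m-001", status: "idle", ratedPowerKw: 7.5 };
const lathe: MachineWithExtensions = { id: "m-002", status: "running" };
```

Omitting a core property such as status is still a compile error, while lathe is valid without any extension data.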

Schema Evolution and Backward Compatibility

Ontologies evolve. The question is how to evolve the generated types without breaking existing code.

We use a versioning strategy inspired by API versioning. Each generated type carries a schema version. When a breaking change occurs (a mandatory property is added, a type is narrowed, a cardinality constraint is tightened), the generator emits both the new type and a migration function that transforms old instances to the new shape. Consumers can adopt the new schema at their own pace, and the runtime coerces old-format data through the migration on read.
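A minimal sketch of that scheme, assuming a breaking change that adds a mandatory priority property in version 2. The type names, the schemaVersion discriminant, and the default value chosen by the migration are all assumptions for illustration.

```typescript
// Hypothetical versioned shapes. Here v2 adds a mandatory `priority`
// property, which counts as a breaking change.
interface WorkOrderV1 {
  schemaVersion: 1;
  id: string;
  assignedTo: string;
}

interface WorkOrderV2 {
  schemaVersion: 2;
  id: string;
  assignedTo: string;
  priority: "low" | "normal" | "high";
}

type StoredWorkOrder = WorkOrderV1 | WorkOrderV2;

// Migration emitted together with the new type. The default value is an
// assumption; in practice it would have to be specified with the ontology change.
function migrateV1toV2(old: WorkOrderV1): WorkOrderV2 {
  return { ...old, schemaVersion: 2, priority: "normal" };
}

// Coerce on read: callers always see the latest shape.
function readWorkOrder(stored: StoredWorkOrder): WorkOrderV2 {
  switch (stored.schemaVersion) {
    case 1:
      return migrateV1toV2(stored);
    case 2:
      return stored;
  }
}

const legacy: StoredWorkOrder = { schemaVersion: 1, id: "wo-1", assignedTo: "m-001" };
const upgraded = readWorkOrder(legacy);
```

Because the union is discriminated on schemaVersion, adding a WorkOrderV3 without a matching case makes readWorkOrder fail to compile, which keeps the migration chain complete.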

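Non-breaking changes (new optional properties, new subclasses, new enumeration values) do not trigger version bumps. Old consumers simply ignore the new properties. New enumeration values need more care: they are only non-breaking for consumers that tolerate unknown values, since an exhaustive switch or a strict enum validator will reject a value it has never seen. This matches how OWL ontologies naturally evolve: adding knowledge is non-breaking, removing or constraining knowledge is breaking.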

Why This Matters

The typical failure mode goes like this: an architect designs an ontology in Protégé. A developer looks at it, nods, and writes a PostgreSQL schema that captures some of the semantics. The rest (the axioms, the constraints, the inheritance) gets "handled in application logic," which in practice means scattered across if-statements that gradually fall out of sync with the ontology.

Code generation from the ontology is our attempt to close that gap. The ontology becomes a specification that a compiler transforms into code. The axioms become validations. The class hierarchy becomes an inheritance tree in the type system.

The result is that the formal model directly shapes the code developers write, the APIs that systems expose, and the validations that data must pass. The ontology is not a separate artifact from the code; it is the source from which the code is derived. How far this scales is an open question. We have not yet stress-tested it against ontologies with thousands of classes, and the generated type files can get large enough to slow down IDE type-checking. Those are the engineering problems we are working on next.