Engineering Log: Testing round-trip validation for lossless ingestion
Stategraph turns Terraform state files into queryable infrastructure graphs. That means taking JSON, deserializing it into OCaml types, storing it in PostgreSQL, reading it back, and serializing it to JSON again. Four transformations, four opportunities to drop a field or mangle a value. If any of those transformations is lossy, the resulting infrastructure graph diverges from reality. That's not acceptable.
Data integrity is not negotiable
Everything in Stategraph flows through one pipeline: we ingest state files, store them in PostgreSQL, and then reconstruct them for queries and operations. The entire system rests on a single assumption: nothing gets lost in translation. If a value disappears during ingestion, or a field gets transformed incorrectly during reification (the process of reconstructing OCaml values from the database), then the graph we're building is fiction. The whole point of having a structured graph instead of JSON blobs is precision, and precision requires correctness.
So we built a test to prove it. Not a theoretical proof, not a manual spot check, but an automated round-trip validation that takes a real Terraform state file, runs it through the entire ingestion pipeline, writes it back out as JSON, and diffs the result against the original. If the files differ, the test fails. If they match, we know every transformation in the pipeline preserved the data exactly.
Four transformations that need to be perfect
The test validates four distinct operations, each with its own failure modes.
1. Deserialization (JSON → OCaml)
Raw JSON from disk gets deserialized into OCaml values using a set of type definitions and JSON decoders. This step validates that our type definitions accurately match the Terraform state schema. If the schema drifts or we miss a field in the type definition, deserialization fails or silently drops data.
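To make the shape of this step concrete, here is a minimal sketch using yojson's `Yojson.Safe.Util` combinators. The type and field names below are hypothetical illustrations, not Stategraph's actual definitions, and real Terraform state carries far more fields than this:

```ocaml
(* Hypothetical, simplified types for a Terraform state file.
   Real state has many more fields; this only sketches the pattern. *)
type resource = {
  mode : string;
  type_ : string;             (* "type" is a keyword in OCaml *)
  name : string;
  attributes : Yojson.Safe.t; (* kept as a raw JSON blob here *)
}

type state = {
  version : int;
  terraform_version : string;
  serial : int;
  lineage : string;
  resources : resource list;
}

(* Decoders: any key missing from the JSON raises, so schema drift
   fails loudly instead of silently dropping data. *)
let resource_of_json (json : Yojson.Safe.t) : resource =
  let open Yojson.Safe.Util in
  {
    mode = json |> member "mode" |> to_string;
    type_ = json |> member "type" |> to_string;
    name = json |> member "name" |> to_string;
    attributes = json |> member "attributes";
  }

let state_of_json (json : Yojson.Safe.t) : state =
  let open Yojson.Safe.Util in
  {
    version = json |> member "version" |> to_int;
    terraform_version = json |> member "terraform_version" |> to_string;
    serial = json |> member "serial" |> to_int;
    lineage = json |> member "lineage" |> to_string;
    resources =
      json |> member "resources" |> to_list |> List.map resource_of_json;
  }
```

Note the failure mode split: an unexpected JSON shape raises a `Type_error`, but a field we simply never declared in the record type is silently ignored by `member`-style decoding. The round-trip test catches the second case, which no amount of decoder-level error handling can.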
2. Ingestion (OCaml → PostgreSQL)
OCaml values get inserted into PostgreSQL through the ingestion code. This is mostly boilerplate (SQL queries and corresponding OCaml bindings), but boilerplate is exactly where subtle bugs hide. A missing column, an incorrect type cast, a null constraint that doesn't match the schema. All of those would result in lost data.
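As a sketch of what that boilerplate looks like, here is a hypothetical insert using caqti 1.x-style typed requests (the table layout, column names, and use of caqti are all assumptions, not Stategraph's actual schema):

```ocaml
(* Hypothetical ingestion boilerplate: one typed SQL request per table.
   Assumes caqti 1.x (Caqti_request / Caqti_type) and a table
   resources(state_id, mode, type, name, attributes). *)
let insert_resource =
  Caqti_request.exec
    Caqti_type.(tup2 int (tup4 string string string string))
    "INSERT INTO resources (state_id, mode, type, name, attributes) \
     VALUES (?, ?, ?, ?, ?)"

let ingest_resource (module Db : Caqti_lwt.CONNECTION)
    (state_id : int) (r : resource) =
  Db.exec insert_resource
    (state_id,
     (r.mode, r.type_, r.name, Yojson.Safe.to_string r.attributes))
```

The typed parameter tuple is exactly where the subtle bugs live: swap two `string` columns in the VALUES list and everything still typechecks, compiles, and inserts. Only comparing the read-back data against the original catches it.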
3. Reification (PostgreSQL → OCaml)
We read the data back from PostgreSQL and reconstruct the OCaml values through a process called reification. This is the mirror image of ingestion: if ingestion writes data into specific tables and columns, reification reads from those same tables and rebuilds the original structures. Any mismatch between the write path and the read path shows up here.
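Continuing the same hypothetical caqti sketch, reification is the SELECT that mirrors the INSERT, rebuilding the record the decoder originally produced (again, column and table names are illustrative assumptions):

```ocaml
(* Hypothetical reification: the read path mirroring the write path.
   The ORDER BY on an assumed surrogate id column keeps row order
   stable, so a later textual diff is deterministic. *)
let select_resources =
  Caqti_request.collect
    Caqti_type.int
    Caqti_type.(tup4 string string string string)
    "SELECT mode, type, name, attributes FROM resources \
     WHERE state_id = ? ORDER BY id"

let reify_resources (module Db : Caqti_lwt.CONNECTION) (state_id : int) =
  Db.collect_list select_resources state_id
  |> Lwt_result.map
       (List.map (fun (mode, type_, name, attrs) ->
            { mode; type_; name;
              attributes = Yojson.Safe.from_string attrs }))
```

Because the write and read paths are written separately, nothing forces them to agree; a column read in the wrong position produces a well-typed, plausible-looking, wrong record. That asymmetry is what the round trip exists to police.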
4. Serialization (OCaml → JSON)
We serialize the reconstructed OCaml values back into JSON and write them to disk. This validates that the JSON serialization logic is the inverse of the deserialization logic. If serialization omits fields that deserialization expects, or formats values differently, the round-trip breaks.
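The encoder is the hand-written inverse of the decoder from step 1. Sticking with the same hypothetical types, a sketch looks like this:

```ocaml
(* Hypothetical encoders mirroring the decoders: every field the
   decoder reads must be emitted here, with the same key names. *)
let json_of_resource (r : resource) : Yojson.Safe.t =
  `Assoc [
    ("mode", `String r.mode);
    ("type", `String r.type_);
    ("name", `String r.name);
    ("attributes", r.attributes);
  ]

let json_of_state (s : state) : Yojson.Safe.t =
  `Assoc [
    ("version", `Int s.version);
    ("terraform_version", `String s.terraform_version);
    ("serial", `Int s.serial);
    ("lineage", `String s.lineage);
    ("resources", `List (List.map json_of_resource s.resources));
  ]
```

One practical wrinkle: a byte-for-byte file diff is sensitive to key order and whitespace, so either the encoder has to emit fields in the original order, or both sides need to be canonicalized before comparing.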
All four of these operations have to be perfect for the test to pass. That's the point. We're not testing individual components in isolation. We're testing the entire pipeline as an integrated system.
Running the test and proving it works
The test drives the entire pipeline automatically: it reads a real Terraform state file from disk, deserializes it, ingests it into PostgreSQL, reifies it back into OCaml values, serializes those to JSON, and then diffs the output against the original file. Any divergence, in any of the four steps, fails the test.
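Wired together, the harness is only a few lines. This is a sketch of the shape, not Stategraph's actual test: `ingest` and `reify` stand in for the pipeline functions, and `Yojson.Safe.equal` stands in for the file diff (it compares structurally, and is order-sensitive for object keys, matching the spirit of a textual diff):

```ocaml
(* Hypothetical round-trip harness composing the four steps.
   ingest/reify are stand-ins for the real pipeline functions. *)
let run_round_trip ~db ~path =
  let original = Yojson.Safe.from_file path in  (* 1. deserialize *)
  let state = state_of_json original in
  let state_id = ingest db state in             (* 2. ingest      *)
  let reified = reify db state_id in            (* 3. reify       *)
  let output = json_of_state reified in         (* 4. serialize   *)
  Yojson.Safe.to_file "round_trip.json" output;
  if Yojson.Safe.equal original output
  then print_endline "round-trip: OK"
  else failwith "round-trip: output diverges from original"
```

A single pass proves fidelity for that input, so the natural extension is running the harness over a corpus of real-world state files, since each file exercises different corners of the schema.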
Why this matters
It's not glamorous work. Writing deserialization logic, generating SQL boilerplate, testing round-trip fidelity. None of that is exciting compared to building user-facing features or designing novel algorithms. But it's the kind of foundational correctness that we want everywhere in Stategraph, and it's the kind of rigor that infrastructure tooling should have but often doesn't. This is what it looks like to build data-correct systems from the ground up.
What comes next
The next step is integrating this more tightly with the Stategraph Server, using the same API that production calls will use. Once that's done, we can test the ingestion pipeline in a configuration that mirrors the real deployment, with real API boundaries and real server behavior.
And once that's validated, we can finally start doing something interesting with the data. Insights about resource sprawl, blast radius, dependency graphs. All of that depends on having a correct foundation, and this test is the first milestone toward that foundation.
Follow along as we build Stategraph
This is the first engineering log in what will be an ongoing series. We're building Stategraph in the open, sharing our progress, our technical decisions, and the engineering challenges we hit along the way. If you want to follow the journey, subscribe for updates or reach out about becoming a design partner.