Engineering · 9 min read

The chain behind every report.

A non-cryptographer's walk through what those eight nodes mean and what they actually prove.

Every Felarity meeting report ships with an attestation block at the bottom. It contains eight node hashes, a Merkle root, and an Ed25519 signature over that root. People look at it, nod respectfully, and then ask the only question that matters: what does this actually prove?

This post is the answer. It is written for the partner who will read the report, the general counsel who will decide whether to put it in front of a judge, and the engineer who has to maintain the verifier on the other side. No cryptography background assumed. No marketing fog. We will be specific about what each node attests to, and equally specific about what it does not.

Why a chain at all

A single hash over the whole report would tell you the report has not been altered since it was signed. That is useful but thin. It cannot tell you which stage of the pipeline produced which fact, and it cannot tell you whether the speaker attribution was bolted on after the contradiction was found. Those are exactly the questions that come up under cross-examination.

So we hash each stage separately, then chain the stages together. Every node hash includes the previous node's hash as one of its inputs. If anyone edits the transcript after the fact, the transcription hash changes, which changes every downstream hash, which changes the Merkle root, which invalidates the signature. You cannot quietly retouch one stage without breaking the seal on every stage after it.
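To make the propagation concrete, here is a minimal sketch of hash chaining. The payload fields, the canonical-JSON encoding, and the empty predecessor for the first node are all illustrative assumptions, not the production encoding.

```python
import hashlib
import json


def node_hash(payload: dict, prev_hash: str) -> str:
    """Hash one stage's payload together with the previous node's hash."""
    # Canonical JSON (sorted keys, fixed separators) so equal payloads hash equally.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def build_chain(payloads: list[dict]) -> list[str]:
    """Return one hash per stage, each chained to the one before it."""
    hashes = []
    prev = ""  # assumed convention: the first node has no predecessor
    for payload in payloads:
        prev = node_hash(payload, prev)
        hashes.append(prev)
    return hashes


original = build_chain([
    {"stage": "AUDIO_CAPTURE", "audio_sha256": "aa11"},
    {"stage": "TRANSCRIPTION", "text": "We shipped on Tuesday."},
    {"stage": "CONTRADICTION_DETECTION", "contradictions": []},
])
tampered = build_chain([
    {"stage": "AUDIO_CAPTURE", "audio_sha256": "aa11"},
    {"stage": "TRANSCRIPTION", "text": "We shipped on Friday."},
    {"stage": "CONTRADICTION_DETECTION", "contradictions": []},
])

# The untouched first stage still matches; the edited stage and every later one do not.
changed = [a != b for a, b in zip(original, tampered)]
print(changed)  # [False, True, True]
```

One edited word in the transcript changes the transcription hash and every hash after it, while the audio hash before it is untouched.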

The eight nodes, in order

The pipeline runs in a fixed order. Each node attests to one transformation. We will walk through each one: what it takes as input, what it outputs, and what its hash actually covers.

1. AUDIO_CAPTURE

Inputs: the raw audio chunks that arrived from the browser, in arrival order, with their byte offsets and millisecond timestamps. Output: a single canonical audio artifact (the concatenation, in Phase 2) plus its length and sample rate. The hash covers the SHA-256 of the concatenated audio bytes, the chunk boundary table, and the session start timestamp. This is the anchor. Everything downstream is derived from this audio. If the audio gets swapped, this hash changes and the whole chain falls apart.
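As a rough sketch of what "covers" means for this node, the function below commits to the three things listed above. The record layout, field names, and JSON canonicalization are hypothetical; only the three covered items come from the description.

```python
import hashlib
import json


def audio_capture_hash(chunks: list[bytes], arrival_ms: list[int], session_start: str) -> str:
    """Commit to the audio bytes, the chunk boundary table, and the session start."""
    audio = b"".join(chunks)  # concatenation in arrival order
    boundaries = []
    offset = 0
    for chunk, ts in zip(chunks, arrival_ms):
        boundaries.append({"byte_offset": offset, "length": len(chunk), "timestamp_ms": ts})
        offset += len(chunk)
    record = {
        "audio_sha256": hashlib.sha256(audio).hexdigest(),
        "chunk_boundaries": boundaries,
        "session_start": session_start,
    }
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


start = "2026-03-14T19:00:00Z"
h1 = audio_capture_hash([b"abcd", b"ef"], [0, 250], start)
h2 = audio_capture_hash([b"abc", b"def"], [0, 250], start)

# Same concatenated bytes, different chunk boundaries: the node hash still differs.
print(h1 != h2)  # True
```

Because the boundary table is inside the hash, re-chunking identical audio is still a detectable change.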

2. TRANSCRIPTION

Inputs: the audio artifact from node 1 plus the Whisper large-v3 model identifier and decoding parameters. Output: a time-aligned transcript with per-segment confidence. The hash covers the transcript JSON, the model version string, and the previous node's hash. We commit to the model version on purpose. If you later upgrade Whisper and re-transcribe, you get a different node 2 hash, which means a different chain. Old reports are not silently re-described by new models.

3. CONTRADICTION_DETECTION (pre-attribution)

This is the node that gets the most questions, so read it carefully. Contradiction detection runs before speakers are attributed. The council reads the transcript, finds statements that conflict, and records each contradiction with character offsets back into the transcript. At this point we do not yet know who said what. We only know that statement A and statement B disagree, and where in the transcript they live.

This ordering is deliberate. We want the model that finds contradictions to find them based on the words, not based on who said them. Detection before attribution removes a class of bias and makes the later attribution step independently auditable. The node hash covers the contradiction list, each contradiction's offsets, the council model identifier, and the previous node's hash.

4. SPEAKER_DIARIZATION

Inputs: the audio artifact from node 1 (not the transcript) plus the pyannote model identifier. Output: a list of speaker turns with start time, end time, and an anonymous speaker label (SPEAKER_00, SPEAKER_01, and so on). Diarization decides where one person stops speaking and another starts. It does not name people. The hash covers the turn table and the model version.

Diarization runs on the audio in parallel with the transcript, not on top of it. This matters because if you wanted to forge a contradiction onto a specific speaker, you would have to fake the audio in node 1 in a way that also produces consistent turn boundaries in node 4. That is a much harder forgery than editing a transcript.

5. ATTRIBUTION_BINDING

This is where the transcript (node 2), the contradictions (node 3), and the diarization (node 4) are joined. For each contradiction, we look up which speaker turn covers the contradicting statement and bind the contradiction to that anonymous speaker. The output is the same contradiction list as node 3, now annotated with speaker labels and a confidence score for each binding. The hash covers the bound contradictions and the previous node's hash.
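The lookup can be sketched as follows. This assumes transcript segments carry both character offsets and timestamps, picks the turn with the largest time overlap, and uses the overlap fraction as the confidence score. The `Segment` and `Turn` types and the scoring rule are illustrative, not the pipeline's actual binding logic.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One time-aligned transcript segment (node 2)."""
    char_start: int
    char_end: int
    t_start: float
    t_end: float


@dataclass
class Turn:
    """One diarization turn with an anonymous label (node 4)."""
    speaker: str
    t_start: float
    t_end: float


def bind_statement(char_start, char_end, segments, turns):
    """Return (speaker, confidence) for a statement given by transcript offsets."""
    # 1. Map character offsets to a time span via the segments they touch.
    hit = [s for s in segments if s.char_start < char_end and s.char_end > char_start]
    if not hit:
        return None, 0.0
    t0 = min(s.t_start for s in hit)
    t1 = max(s.t_end for s in hit)
    # 2. Pick the turn that overlaps that time span the most.
    best, best_overlap = None, 0.0
    for turn in turns:
        overlap = min(t1, turn.t_end) - max(t0, turn.t_start)
        if overlap > best_overlap:
            best, best_overlap = turn, overlap
    if best is None:
        return None, 0.0
    # 3. Confidence is the fraction of the statement's time span the turn covers.
    return best.speaker, best_overlap / (t1 - t0)


segments = [Segment(0, 40, 0.0, 4.0), Segment(40, 90, 4.0, 9.0)]
turns = [Turn("SPEAKER_00", 0.0, 4.5), Turn("SPEAKER_01", 4.5, 9.0)]
print(bind_statement(45, 80, segments, turns))  # ('SPEAKER_01', 0.9)
```

The statement falls in a segment spanning 4.0 to 9.0 seconds. SPEAKER_01 covers 4.5 of those 5 seconds, so the binding goes to SPEAKER_01 with confidence 0.9.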

Splitting attribution out as its own node lets a verifier ask a precise question: was this contradiction found because of who said it, or was it found first and attributed afterward? The chain answers: found at node 3, attributed at node 5, in that order, and you can hash-check both.

6. ACOUSTIC_ANALYSIS

Inputs: the audio artifact and the diarization turns. Output: per-speaker acoustic markers — speech rate, pause distribution, vocal stress indicators, and turn-level energy. These are signals, not verdicts. We do not claim to detect lies. We measure observable acoustic properties of speech that humans already pay attention to, and we record them so they can be reviewed alongside the transcript. The hash covers the marker table.

7. TOPOLOGY_ANALYSIS

Inputs: the attributed contradictions from node 5. Output: a NetworkX-derived contradiction graph and a pattern classification (isolated incident, recurring with one speaker, mutual disagreement, escalating chain, and so on). Topology is what turns a list of contradictions into a shape. Three contradictions between the same two people on the same topic over forty minutes is a different shape from three contradictions distributed across six speakers. The hash covers the graph adjacency, the classification, and the previous node's hash.
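The difference between those two shapes shows up in a few graph statistics. The sketch below uses a plain `collections.Counter` as a stand-in for the NetworkX graph, and the summary fields are hypothetical. It does not reproduce the pattern classifier.

```python
from collections import Counter


def contradiction_shape(pairs):
    """Summarise a weighted, undirected contradiction graph from speaker pairs."""
    # Each edge is an unordered speaker pair; its weight counts contradictions on it.
    edges = Counter(tuple(sorted(pair)) for pair in pairs)
    speakers = {speaker for edge in edges for speaker in edge}
    return {
        "speakers": len(speakers),
        "distinct_pairs": len(edges),
        "heaviest_pair": max(edges.values(), default=0),
    }


# Three contradictions between the same two people.
concentrated = contradiction_shape([
    ("SPEAKER_00", "SPEAKER_01"),
    ("SPEAKER_01", "SPEAKER_00"),
    ("SPEAKER_00", "SPEAKER_01"),
])
# Three contradictions spread across six speakers.
dispersed = contradiction_shape([
    ("SPEAKER_00", "SPEAKER_01"),
    ("SPEAKER_02", "SPEAKER_03"),
    ("SPEAKER_04", "SPEAKER_05"),
])

print(concentrated)  # {'speakers': 2, 'distinct_pairs': 1, 'heaviest_pair': 3}
print(dispersed)     # {'speakers': 6, 'distinct_pairs': 3, 'heaviest_pair': 1}
```

Both inputs contain three contradictions. Only the graph view shows that one is a single heavy edge and the other is three unrelated light ones.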

8. FINAL_REPORT

Inputs: everything above. Output: the rendered report — narrative, citations back into the transcript, speaker credibility arcs, and the appendix containing the chain itself. The hash covers the report bytes. This is the last node. The Merkle root is computed over all eight node hashes in order, and the Ed25519 signature is computed over that root.
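For readers who have not met a Merkle root before, here is a minimal sketch of one common construction: a binary tree that hashes adjacent pairs with SHA-256 until one value remains. The pairing rule, the lack of leaf and interior prefixes, and the placeholder leaves are assumptions; the exact tree layout is not specified here.

```python
import hashlib

NODE_NAMES = [
    "AUDIO_CAPTURE", "TRANSCRIPTION", "CONTRADICTION_DETECTION", "SPEAKER_DIARIZATION",
    "ATTRIBUTION_BINDING", "ACOUSTIC_ANALYSIS", "TOPOLOGY_ANALYSIS", "FINAL_REPORT",
]


def merkle_root(leaf_hashes: list[bytes]) -> bytes:
    """Fold an ordered list of leaf hashes into a single root hash."""
    if not leaf_hashes:
        raise ValueError("at least one leaf hash is required")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumed rule for odd levels; never triggered with exactly eight leaves.
            level.append(level[-1])
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Placeholder leaves: real leaves would be the eight node hashes from the chain.
leaves = [hashlib.sha256(name.encode("ascii")).digest() for name in NODE_NAMES]
root = merkle_root(leaves)

# Swapping two leaves changes the root, so the order of stages is committed too.
swapped = leaves[:]
swapped[2], swapped[3] = swapped[3], swapped[2]
print(merkle_root(swapped) != root)  # True
```

With eight leaves the tree has three levels of pairing: eight hashes become four, then two, then one root.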

What the verifier sees

A verifier (your auditor, your lawyer, anyone with the public key) does three things. First, it recomputes each node hash from the artifacts we shipped with the report. Second, it recomputes the Merkle root from the eight node hashes. Third, it verifies the Ed25519 signature against that root using the public key published at https://felarity.com/.well-known/felarity-signing-key.pem. If all three pass, the report is exactly the report we signed. The verify endpoint reports each check separately:

curl -X POST https://api.felarity.com/v1/verify \
  -H "Content-Type: application/json" \
  -d @report.json

# response
{
  "verified": true,
  "signing_key_fingerprint": "ed25519:6f3b...",
  "signed_at": "2026-03-14T19:22:08Z",
  "nodes": [
    {"name": "AUDIO_CAPTURE",            "hash_ok": true},
    {"name": "TRANSCRIPTION",            "hash_ok": true},
    {"name": "CONTRADICTION_DETECTION",  "hash_ok": true},
    {"name": "SPEAKER_DIARIZATION",      "hash_ok": true},
    {"name": "ATTRIBUTION_BINDING",      "hash_ok": true},
    {"name": "ACOUSTIC_ANALYSIS",        "hash_ok": true},
    {"name": "TOPOLOGY_ANALYSIS",        "hash_ok": true},
    {"name": "FINAL_REPORT",             "hash_ok": true}
  ],
  "merkle_root_ok": true,
  "signature_ok": true
}
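A client should not stop at the top-level verified flag. The sketch below walks every field in the response shown above and names whatever failed. The function name is hypothetical, and it assumes the nodes array always lists the eight stages in pipeline order.

```python
EXPECTED_NODES = [
    "AUDIO_CAPTURE", "TRANSCRIPTION", "CONTRADICTION_DETECTION", "SPEAKER_DIARIZATION",
    "ATTRIBUTION_BINDING", "ACOUSTIC_ANALYSIS", "TOPOLOGY_ANALYSIS", "FINAL_REPORT",
]


def failed_checks(response: dict) -> list[str]:
    """Return the names of every check that did not pass; empty means fully verified."""
    failures = []
    nodes = response.get("nodes", [])
    # The eight nodes must all be present, in pipeline order.
    if [n.get("name") for n in nodes] != EXPECTED_NODES:
        failures.append("node_order")
    failures += [n.get("name", "?") for n in nodes if n.get("hash_ok") is not True]
    for flag in ("merkle_root_ok", "signature_ok", "verified"):
        if response.get(flag) is not True:
            failures.append(flag)
    return failures


good = {
    "verified": True,
    "nodes": [{"name": name, "hash_ok": True} for name in EXPECTED_NODES],
    "merkle_root_ok": True,
    "signature_ok": True,
}
bad = {
    **good,
    "verified": False,
    "nodes": [dict(n, hash_ok=(n["name"] != "TRANSCRIPTION")) for n in good["nodes"]],
}

print(failed_checks(good))  # []
print(failed_checks(bad))   # ['TRANSCRIPTION', 'verified']
```

Reporting the failing node by name tells the reader which stage's artifact no longer matches, not just that something is wrong.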

Why Ed25519

Ed25519 is fast, has small keys and small signatures, and has a clean, widely audited reference implementation. It is supported by mainstream crypto libraries, including the ones your auditor's team is likely already using. The signature is 64 bytes. The public key is 32 bytes. A verifier can be written in a screen of code. The boring choice is the right choice here.
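Here is roughly what that screen of code looks like, using the third-party Python `cryptography` package. It assumes the signature is computed over the raw Merkle root bytes and that the published key is a standard PEM public key. Both are assumptions, and the key pair in the demo is a throwaway generated on the spot.

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)


def signature_ok(public_key_pem: bytes, merkle_root: bytes, signature: bytes) -> bool:
    """Check an Ed25519 signature over the Merkle root with a PEM public key."""
    key = serialization.load_pem_public_key(public_key_pem)
    if not isinstance(key, Ed25519PublicKey):
        raise ValueError("expected an Ed25519 public key")
    try:
        key.verify(signature, merkle_root)  # raises InvalidSignature on mismatch
    except InvalidSignature:
        return False
    return True


# Demo with a throwaway key pair standing in for the real signing key.
private_key = Ed25519PrivateKey.generate()
pem = private_key.public_key().public_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
)
root = bytes(32)  # placeholder for a 32-byte Merkle root
signature = private_key.sign(root)

print(len(signature))                                # 64
print(signature_ok(pem, root, signature))            # True
print(signature_ok(pem, b"\x01" + root[1:], signature))  # False
```

Flipping a single byte of the root is enough to make verification fail, which is the property the rest of the chain relies on.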

What this does not prove

Be careful here. The chain proves the pipeline ran in the order we say it ran, on the bytes we say it ran on, and that the report you are holding is the report we signed. It does not prove the following, and we will not say it does:

That a person is lying. Acoustic markers are signals, not verdicts. A nervous honest person and a calm dishonest person both exist.
That the speaker we bound a statement to was the real human in the room. Diarization labels are anonymous. Mapping SPEAKER_01 to "Jane Doe" is a human decision made outside the chain.
That the transcript captured every word correctly. Whisper is excellent and not perfect. The chain commits to the transcript we produced; it does not certify the transcript against ground truth.
That the contradictions identified are the only ones in the meeting. The council is good. It is not omniscient.

Court-friendly language

When the report goes into evidence, the affidavit language we recommend is narrow and exact:

"The attached report is the output of the Felarity intelligence pipeline applied to the audio recorded on [date]. The attestation chain attached as Exhibit A was generated by the pipeline at the time of analysis and signed using the Ed25519 private key whose corresponding public key is published at https://felarity.com/.well-known/felarity-signing-key.pem. Verification of the chain confirms that the report has not been altered since signing and that the eight enumerated processing stages ran in the order recorded."

Notice what that paragraph carefully does not say. It does not say the conclusions are correct. It does not say any speaker lied. It says the pipeline ran, in this order, on these bytes, and produced this report, and the report has not been edited since. That is what the math supports. Anything beyond that is a human judgment, made by a human, who can be cross-examined.

Infrastructure, not judgment

The chain exists so that the boring questions — was this edited, who ran what when, can I trust the artifact in my hand — have boring answers. It does not exist to settle the question of who was right in the meeting. That part is still yours. The chain just makes sure you and the other side are arguing over the same record.

Private. Remembered. Defended. The defense in that line is not rhetorical. It is eight hashes, a Merkle root, and a signature, every time.