Teleox.ai

Framework

Derived Data Abundance keeps the corpus fixed.

A DDA pipeline holds raw observations constant and materializes structured supervision through frozen embedders and pairwise interaction features.

Diagram showing raw inputs projected through frozen embedders and pairwise interactions.
identity/signals <= n * (N + choose(N, 2))

Figure 1. The DDA identity counts per-embedder projections plus pairwise interaction features.

Figure description

The diagram starts with n raw inputs. Each input is passed through N frozen embedders, producing N direct projections. Pairwise interaction features are then computed for every embedder pair, producing choose(N, 2) additional structured signals per input. The displayed identity is signals less than or equal to n * (N + choose(N, 2)); the pairwise term is a constructive upper bound pending mutual-information audit.
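The counting identity can be checked with a few lines of Python. This is a minimal sketch; the function name `dda_signal_upper_bound` is hypothetical and not taken from the DDA sources. It computes only the constructive upper bound, not the effective count after any mutual-information audit.

```python
from math import comb


def dda_signal_upper_bound(n_inputs: int, n_embedders: int) -> int:
    """Upper bound on structured signals: n * (N + choose(N, 2))."""
    if n_inputs < 0 or n_embedders < 0:
        raise ValueError("counts must be non-negative")
    # N direct projections plus one interaction feature per embedder pair.
    per_input = n_embedders + comb(n_embedders, 2)
    return n_inputs * per_input


print(dda_signal_upper_bound(1, 13))  # 91
print(dda_signal_upper_bound(1, 7))   # 28
```

The per-input term grows quadratically in N, so adding embedders raises the bound faster than adding inputs does.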

Fixed data, more measurements

A DDA pipeline holds the raw corpus fixed and projects it through frozen embedders. The derived dataset contains per-embedder projections and pairwise interaction features attached to real observations.
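The shape of such a derived record can be sketched as follows. Everything here is an illustrative assumption: the toy embedders are deterministic hash functions standing in for real frozen models, the dot product is just one possible pairwise interaction feature, and the names `make_toy_embedder` and `derive_record` are hypothetical.

```python
import hashlib
from itertools import combinations
from typing import Callable, Dict, List

Vector = List[float]
Embedder = Callable[[str], Vector]


def make_toy_embedder(seed: str, dim: int = 4) -> Embedder:
    """Deterministic stand-in for a frozen embedder (hypothetical)."""
    def embed(text: str) -> Vector:
        digest = hashlib.sha256((seed + "|" + text).encode("utf-8")).digest()
        # Map the first `dim` digest bytes to floats in [0, 1].
        return [b / 255.0 for b in digest[:dim]]
    return embed


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def derive_record(text: str, embedders: Dict[str, Embedder]) -> dict:
    """Attach direct projections and pairwise interactions to one raw input."""
    projections = {name: embed(text) for name, embed in embedders.items()}
    # One interaction feature per unordered embedder pair (assumed: dot product).
    interactions = {
        (a, b): dot(projections[a], projections[b])
        for a, b in combinations(sorted(projections), 2)
    }
    return {"input": text, "projections": projections, "interactions": interactions}


embedders = {f"e{i}": make_toy_embedder(f"e{i}") for i in range(3)}
record = derive_record("one raw observation", embedders)
print(len(record["projections"]), len(record["interactions"]))  # 3 3
```

The raw input is stored unchanged in the record, and the embedders are never updated. Only the number of measurements attached to each observation grows.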

Context Graph example

Context Graph at N=13 yields up to 91 structured signals per input under the DDA counting identity; the effective pairwise count remains pending a mutual-information audit.

Context Graph diagram with 13 embedders and pairwise interactions.

Figure 2. Context Graph is the text-side DDA witness: N=13, up to 91 signals per input as a constructive upper bound.

Figure description

The figure represents Context Graph as a text-side DDA witness with 13 frozen embedders. The direct projections contribute 13 signals per input and the pairwise interactions contribute up to 78 more, for up to 91 structured signals per input as a constructive upper bound pending mutual-information audit.

text-dda-witness

Context Graph

Text-side DDA witness using 13 frozen embedders, RocksDB storage, and MCP retrieval tools.

Embedder panel
N=13
Structured signals
up to 91 per input

ClipCannon example

ClipCannon at N=7 yields up to 28 structured signals per source clip as a constructive upper bound (7 direct projections plus choose(7, 2) = 21 pairwise interactions), and persists those records through a video-side provenance chain.

ClipCannon pipeline diagram showing staged analysis from source video to multimodal records.

Figure 3. ClipCannon materializes video into multi-modal records through a 23-stage analysis DAG.

Figure description

The figure shows source video moving through a staged ClipCannon DAG. Analysis stages extract visual, semantic, emotion, speaker, prosody, sentiment, and voice-identity records. The outputs are persisted as multi-modal training and provenance records rather than one unstructured video blob.
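To see where the count of 28 comes from, the signal slots can be enumerated directly. This sketch assumes that the seven record types named above correspond to the N=7 modality panel, which the page implies but does not state outright. The identifier spellings and the helper name `signal_slots` are hypothetical.

```python
from itertools import combinations

# Assumed mapping of the seven record types to the N=7 modality panel.
MODALITIES = [
    "visual", "semantic", "emotion", "speaker",
    "prosody", "sentiment", "voice_identity",
]


def signal_slots(modalities):
    """Return direct slots (1-tuples) followed by pairwise slots (2-tuples)."""
    direct = [(m,) for m in modalities]
    pairs = list(combinations(modalities, 2))
    return direct + pairs


slots = signal_slots(MODALITIES)
print(len(slots))  # 28
```

Each slot is only a place where a signal could be stored. Whether a given pair, such as prosody and sentiment, adds information beyond its two direct projections is exactly what the upper-bound wording leaves open.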

video-dda-witness

ClipCannon

Video-side DDA/TCT witness using a 23-stage DAG, seven modalities, and per-project provenance records.

Pipeline
23-stage DAG
Modality panel
N=7

What remains open

The count is structural: it bounds how many signals can be materialized, not how much independent information they carry. The important empirical question is how much independent signal remains after redundancy, overlap, and task-specific relevance are measured.
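One way to probe redundancy between two derived signals is a plug-in mutual-information estimate over discretized values. The page does not specify how its mutual-information audit is carried out, so this is only an assumed illustration. The function name `mutual_information` is hypothetical, and real embedding outputs would first need to be binned or otherwise discretized.

```python
from collections import Counter
from math import log2
from typing import Hashable, Sequence


def mutual_information(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y).
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


# A signal compared with an exact copy of itself is fully redundant.
print(mutual_information([0, 1, 0, 1], [0, 1, 0, 1]))  # 1.0
# Two signals whose joint distribution factorizes share no information.
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0
```

High mutual information between two signals suggests they count as less than two independent measurements. The plug-in estimator is biased upward on small samples, so values from a small corpus should be read with caution.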

Related videos

00:11:23 / 6umU6kuXR3s

Meaning compression and Derived Data Abundance

Plain-language talk track for the fixed-data DDA move and the meaning-compression ratio.

The formal site claim is narrower than the talk title: DDA is fixed-corpus decomposition and meaning compression is a proposed structured-signal-density measure.

00:11:49 / mXAJgE2G87Q

Shakespeare LoRA as a signal-density case study

Concrete public-domain text example for how a small corpus can produce many derived training records.

Sources

  • docs2/PAPER.md#3.1
  • docs2/alldata.md#derived-data-abundance