Teleox.ai

Video

Voice-cloning SECS measurement case

Case-study video for the one-speaker WavLM SECS measurement and its scope limits.

00:03:27 / JGFFOTL35wA

The formal site claim is 0.961 mean WavLM SECS under an encoder-matched protocol for one speaker and 10 held-out English sentences.

Voice protocol diagram showing centroid enrollment, best-of-12 generation, and WavLM scoring.

Figure 1. The voice case reports 0.961 mean WavLM SECS for one speaker on held-out English sentences under an encoder-matched protocol.

Figure description

The figure shows a one-speaker voice protocol. A centroid built from 50 enrollment clips represents the target speaker. For each held-out English sentence, the pipeline generates 12 candidates and keeps the best one according to WavLM scoring. The reported result is 0.961 mean WavLM SECS under an encoder-matched protocol, meaning the same encoder family is used for selection and for the reported score; cross-encoder triangulation remains pending.
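The protocol described above can be sketched in a few lines. This is an illustrative sketch only, not the ClipCannon implementation: it assumes SECS is the cosine similarity between a candidate's speaker embedding and the enrollment centroid, that the centroid is a plain mean of clip embeddings, and it uses random vectors in place of real WavLM embeddings. All function names (`cosine`, `centroid`, `best_of_n`, `mean_secs`, `fake_embedding`) are hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def centroid(embeddings):
    """Element-wise mean of the enrollment embeddings (assumed: plain mean)."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def best_of_n(candidates, target):
    """Return (index, score) of the candidate most similar to the target."""
    scores = [cosine(c, target) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return best, scores[best]


def mean_secs(per_sentence_candidates, target):
    """Average the best candidate's score over all sentences."""
    best_scores = [best_of_n(c, target)[1] for c in per_sentence_candidates]
    return sum(best_scores) / len(best_scores)


# Stand-in data: random vectors instead of real speaker embeddings.
rng = random.Random(0)
DIM = 16  # arbitrary illustrative dimension, not the encoder's real size


def fake_embedding():
    return [rng.gauss(0.0, 1.0) for _ in range(DIM)]


enrollment = [fake_embedding() for _ in range(50)]   # 50-clip enrollment
target = centroid(enrollment)
sentences = [[fake_embedding() for _ in range(12)]   # 12 candidates each
             for _ in range(10)]                     # 10 held-out sentences
score = mean_secs(sentences, target)
print(f"mean best-of-12 similarity on random stand-ins: {score:.3f}")
```

Because the best candidate is chosen and scored with the same similarity function, the mean is a best-of-12 figure under a matched encoder. That is why the page treats cross-encoder triangulation as a separate, still-pending check.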

Related publications

preprint / March 2026

ClipCannon: 0.961 Mean Cross-Encoder Speaker Similarity via Pipeline Engineering for Personalized Voice Cloning

Measured voice case: 0.961 mean WavLM SECS under an encoder-matched one-speaker protocol with held-out English sentences.

DOI 10.13140/RG.2.2.33842.16324

  • Preprint status; not peer reviewed.
  • One speaker and held-out English sentences.

Related systems

video-dda-witness

ClipCannon

Video-side DDA/TCT witness using a 23-stage DAG, seven modalities, and per-project provenance records.

Pipeline: 23-stage DAG
Modality panel: N=7

Sources