Chris Royse field notes

The Meeting Bot Is A Multimodal Alignment Problem

A real-time avatar has to keep voice, face, expression, timing, and conversation state consistent, all at once and within the latency a live meeting allows.

Media / alldata.md / 7:49

Audience

Multimodal model teams, agent teams, safety reviewers

Core idea

The hard part is not making an avatar move. The hard part is keeping every modality in distribution while it responds live.

Founder source

Phoenix + VoiceAgent

This is why the constellation framing matters beyond a demo: a live agent needs runtime checks, not just a trained prior.

What to take from it

The videos are raw build context. These notes translate them into the shortest useful frame for creators, companies, and AI lab readers.

Identity drift can occur in face, voice, affect, or timing.

Latency pressure should not remove verification.

A meeting bot needs explicit consent and scope controls.
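
One way to picture these three points together is a per-frame gate that runs before the avatar emits anything. The sketch below is illustrative only and is not taken from the video: the names `Scope`, `decide`, and `cosine_distance`, the embedding inputs, the thresholds, and the "emit / degrade / block" actions are all assumptions. It assumes each modality can be summarized as an embedding vector and compared against a consented reference.

```python
import math
from dataclasses import dataclass

# Modalities named in the note; how each one is embedded is an assumption.
MODALITIES = ("face", "voice", "affect", "timing")


@dataclass
class Scope:
    """Hypothetical consent record for one meeting."""
    consented: bool
    allowed_modalities: frozenset = frozenset(MODALITIES)


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def decide(reference, live, scope, thresholds, elapsed_ms, budget_ms):
    """Return (action, reasons) for one output frame.

    Verification always runs. Going over the latency budget changes the
    action to "degrade"; it never skips the consent or drift checks.
    """
    if not scope.consented:
        return "block", ["no consent on record"]

    reasons = []
    for name, embedding in live.items():
        if name not in scope.allowed_modalities:
            reasons.append(f"{name}: outside consented scope")
        elif cosine_distance(reference[name], embedding) > thresholds[name]:
            reasons.append(f"{name}: drift above threshold")
    if reasons:
        return "block", reasons

    if elapsed_ms > budget_ms:
        return "degrade", ["latency budget exceeded"]
    return "emit", []
```

The ordering is the design choice worth noting: consent is checked first, drift second, and latency last, so a slow frame can only ever be downgraded after it has already passed verification. What "degrade" means in practice (for example, falling back to audio only) is left open here.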

Make it concrete.

Send the audience, data type, target task, proof bar, and sharing limits.