Tavus Introduces Sparrow-1, Advancing Human-Level Conversational Timing in Real-Time Voice and Video
SAN FRANCISCO--(BUSINESS WIRE)--Tavus, the human computing company building lifelike AI humans that can see, hear, and respond in real time, today launched Sparrow-1, a conversational-flow control model designed to bring human-level timing to real-time voice and video AI.
Sparrow-1 enables AI systems to determine when to listen, wait, or speak, responding at the moment a human listener would rather than as fast as possible. The model is now generally available across Tavus APIs and products and already powers conversational experiences in Tavus PALs and enterprise deployments.
Conversational AI has made rapid progress in language generation and speech synthesis, yet timing remains a persistent challenge. Most voice systems rely on silence-based endpoint detection, waiting for speech to stop before responding. This approach introduces delays, causes premature interruptions, and breaks conversational flow.
Sparrow-1 takes a different approach. Instead of reacting to silence, it models conversational timing continuously, allowing responses to arrive immediately when intent is clear while deliberately waiting when uncertainty remains. This results in conversations that feel attentive, natural, and human.
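The contrast between the two strategies can be illustrated with a minimal sketch. It is not Sparrow-1's implementation, which the release does not describe. The 20 ms frame size, the 700 ms silence threshold, the wait bounds, and the per-frame `p_done` turn-completion probability are all assumptions chosen for illustration.

```python
from typing import List, Optional

FRAME_MS = 20  # assumed frame duration, for illustration only


def silence_endpoint(is_speech: List[bool], silence_ms: int = 700) -> Optional[int]:
    """Respond once a fixed amount of silence follows speech (classic endpointing)."""
    needed = silence_ms // FRAME_MS
    run, seen_speech = 0, False
    for i, speech in enumerate(is_speech):
        if speech:
            seen_speech, run = True, 0
        elif seen_speech:
            run += 1
            if run >= needed:
                return (i + 1) * FRAME_MS  # response time in ms
    return None


def certainty_endpoint(
    is_speech: List[bool],
    p_done: List[float],  # hypothetical per-frame probability that the turn is over
    min_wait_ms: int = 40,
    max_wait_ms: int = 1500,
) -> Optional[int]:
    """Respond after a wait that shrinks as the turn-completion estimate rises."""
    run, seen_speech = 0, False
    for i, (speech, p) in enumerate(zip(is_speech, p_done)):
        if speech:
            seen_speech, run = True, 0
        elif seen_speech:
            run += 1
            wait_ms = max_wait_ms - p * (max_wait_ms - min_wait_ms)
            if run * FRAME_MS >= wait_ms:
                return (i + 1) * FRAME_MS
    return None


# A speaker pauses for 800 ms mid-thought, resumes, then finishes.
speech = [True] * 5 + [False] * 40 + [True] * 5 + [False] * 40
p_done = [0.1] * 50 + [0.95] * 40  # low certainty during the pause, high at the end

fixed = silence_endpoint(speech)               # fires during the mid-thought pause
adaptive = certainty_endpoint(speech, p_done)  # waits, then answers soon after the real end
```

In this toy input the fixed threshold fires during the hesitation and cuts the speaker off, while the certainty-weighted wait holds back through the pause and then responds shortly after the real end of turn.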
A New Model for Conversational Timing
Sparrow-1 is a conversational-flow control model built for real-time conversational video in Tavus’s Conversational Video Interface (CVI). Rather than treating turn-taking as an endpoint-detection problem, Sparrow-1 predicts conversational floor ownership at the frame level, enabling proactive, interruption-aware turn transitions.
Key capabilities include:
- Audio-native, streaming inference that preserves prosody and timing cues
- Explicit modeling of conversational floor ownership
- Real-time speaker adaptation without calibration or fine-tuning
- Graceful handling of interruptions, overlap, and hesitation
- Dynamic response latency based on conversational certainty
Sparrow-1 functions as a standalone timing and control layer that integrates with modular voice pipelines while restoring natural conversational flow.
Benchmarking Human Conversation
To evaluate real conversational behavior, Tavus benchmarked Sparrow-1 against leading turn-taking systems using 28 challenging real-world conversational audio samples designed to surface hesitation, overlap, and ambiguous turn endings.
Across these evaluations, Sparrow-1 achieved:
- 100 percent precision and recall
- Zero interruptions
- 55 millisecond median response latency
By comparison, existing systems were forced to choose between avoiding interruptions by waiting several seconds and responding quickly at the cost of frequent cut-offs. The results indicate that the tradeoff between speed and correctness commonly observed in conversational AI is a consequence of silence-based design rather than an inherent property of conversation.
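For readers who want to see how such figures are typically computed, the following is a minimal sketch. The release does not specify Tavus's scoring procedure, so the definitions here are assumptions: a true positive is a response at a genuine end of turn, a false positive is a response while the speaker still holds the floor (an interruption), and a false negative is a missed end of turn. The `Sample` type and the toy values are hypothetical and are not benchmark data.

```python
from statistics import median
from typing import Dict, List, NamedTuple, Optional


class Sample(NamedTuple):
    turn_ended: bool              # ground truth: did the speaker finish the turn?
    responded: bool               # did the system take the floor?
    latency_ms: Optional[float]   # delay after the true end of turn, when applicable


def score(samples: List[Sample]) -> Dict[str, Optional[float]]:
    tp = sum(s.turn_ended and s.responded for s in samples)
    fp = sum((not s.turn_ended) and s.responded for s in samples)  # interruptions
    fn = sum(s.turn_ended and (not s.responded) for s in samples)  # missed turn ends
    latencies = [
        s.latency_ms
        for s in samples
        if s.turn_ended and s.responded and s.latency_ms is not None
    ]
    return {
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "interruptions": float(fp),
        "median_latency_ms": median(latencies) if latencies else None,
    }


# Toy, made-up samples purely to exercise the function.
toy = [
    Sample(True, True, 60.0),
    Sample(True, True, 40.0),
    Sample(False, True, None),   # spoke over the user
    Sample(True, False, None),   # never responded
]
result = score(toy)
```

Under these definitions, a system that waits several seconds before every response can reach zero interruptions only by inflating median latency, which is the tradeoff described above.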
Built for How Humans Actually Talk
Teaching AI the art of being human requires it to learn the dance of conversation. At runtime, Sparrow-1 adapts continuously to each speaker, learning cadence, pause duration, and hesitation patterns as a conversation unfolds. The model incorporates fillers, trailing vocalizations, prosodic rhythm, and emotional cadence into its timing decisions.
When interruptions occur, Sparrow-1 resolves them in real time, distinguishing deliberate bids for the conversational floor from incidental overlap within tens of milliseconds. Over the course of a conversation, timing progressively synchronizes to the speaker, producing smoother and more natural interactions.
Availability
Sparrow-1 is generally available today across Tavus APIs and products and already powers conversational experiences in Tavus PALs and enterprise deployments.
To see Sparrow-1 in action, visit the demo at https://www.tavus.io.
About Tavus
Tavus is a San Francisco-based AI research company pioneering human computing, the next era of computing built around adaptive and emotionally intelligent AI humans. Tavus develops foundational models that enable machines to see, hear, respond, and act in ways that feel natural to people.
In addition to consumer PALs, Tavus provides APIs and enterprise solutions for deploying lifelike AI humans at scale.
Learn more at https://www.tavus.io
Contacts
For Contact:
Leigh Disher
leigh@gmkcommunications.com

