Tavus Introduces Sparrow-1, Advancing Human-Level Conversational Timing in Real-Time Voice and Video

Sparrow-1 is the leading model for streaming turn detection

SAN FRANCISCO--(BUSINESS WIRE)--Tavus, the human computing company building lifelike AI humans that can see, hear, and respond in real time, today launched Sparrow-1, a conversational-flow control model designed to bring human-level timing to real-time voice and video AI.

Sparrow-1 enables AI systems to determine when to listen, wait, or speak, responding at the moment a human listener would rather than as fast as possible. The model is now generally available across Tavus APIs and products and already powers conversational experiences in Tavus PALs and enterprise deployments.

Conversational AI has made rapid progress in language generation and speech synthesis, yet timing remains a persistent challenge. Most voice systems rely on silence-based endpoint detection, waiting for speech to stop before responding. This approach introduces delays, causes premature interruptions, and breaks conversational flow.
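To make the limitation concrete, here is a minimal Python sketch of a silence-timeout endpointer of the kind described above. It is illustrative only and not taken from any particular product: the function name, the 20 ms frame size, and the 700 ms timeout are assumptions, and the per-frame speech flags are assumed to come from an upstream voice activity detector.

```python
from typing import Optional, Sequence


def silence_endpoint(
    is_speech: Sequence[bool],
    frame_ms: int = 20,      # assumed frame size
    silence_ms: int = 700,   # assumed silence timeout
) -> Optional[int]:
    """Return the time in ms at which a silence-timeout endpointer fires.

    Fires once `silence_ms` of continuous non-speech follows any speech.
    Returns None if that never happens in the given frames.
    """
    needed = silence_ms // frame_ms  # consecutive silent frames required
    run = 0
    heard_speech = False
    for i, speech in enumerate(is_speech):
        if speech:
            heard_speech = True
            run = 0
        elif heard_speech:
            run += 1
            if run >= needed:
                return (i + 1) * frame_ms
    return None


# Hypothetical utterance: 1,000 ms of speech, an 800 ms thinking pause,
# then another 1,000 ms of speech.
utterance = [True] * 50 + [False] * 40 + [True] * 50
fired_at = silence_endpoint(utterance)  # 1700
```

In this example the endpointer fires at 1,700 ms, inside the pause and before the speaker has finished. Shortening the timeout makes such cut-offs more likely, while lengthening it adds delay after every genuine turn ending.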

Sparrow-1 takes a different approach. Instead of reacting to silence, it models conversational timing continuously, allowing responses to arrive immediately when intent is clear while deliberately waiting when uncertainty remains. This results in conversations that feel attentive, natural, and human.

A New Model for Conversational Timing

Sparrow-1 is a conversational-flow control model built for real-time conversational video in Tavus’s Conversational Video Interface (CVI). Rather than treating turn-taking as an endpoint-detection problem, Sparrow-1 predicts conversational floor ownership at the frame level, enabling proactive, interruption-aware turn transitions.

Key capabilities include:

Audio-native, streaming inference that preserves prosody and timing cues
Explicit modeling of conversational floor ownership
Real-time speaker adaptation without calibration or fine-tuning
Graceful handling of interruptions, overlap, and hesitation
Dynamic response latency based on conversational certainty

Sparrow-1 functions as a standalone timing and control layer that integrates with modular voice pipelines while restoring natural conversational flow.
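The idea of response latency that varies with conversational certainty can be sketched in a few lines of Python. This is not Sparrow-1's implementation, which the release does not describe. The function names, the linear mapping, the 0.5 cutoff, and the 50 ms and 1,500 ms bounds are all assumptions chosen for illustration, and the per-frame probabilities that the user has yielded the floor are assumed to come from some upstream model.

```python
from typing import Optional, Sequence


def response_delay_ms(p_user_done: float, min_ms: int = 50, max_ms: int = 1500) -> int:
    """Map certainty that the user has yielded the floor to a wait time.

    High certainty gives a short wait; low certainty gives a long one.
    The linear mapping and the bounds are illustrative assumptions.
    """
    p = min(max(p_user_done, 0.0), 1.0)
    return round(max_ms - p * (max_ms - min_ms))


def first_response_time(
    p_done_per_frame: Sequence[float], frame_ms: int = 20
) -> Optional[int]:
    """Return the time in ms at which the agent would start speaking, or None."""
    waited = 0  # ms since the user last appeared to hold the floor
    for i, p in enumerate(p_done_per_frame):
        if p < 0.5:  # user still seems to hold the floor: reset the wait
            waited = 0
            continue
        waited += frame_ms
        if waited >= response_delay_ms(p):
            return (i + 1) * frame_ms
    return None


# Hypothetical traces: 200 ms of speech, then 200 ms of a clear or unclear ending.
clear_ending = [0.1] * 10 + [0.98] * 10
hesitation = [0.1] * 10 + [0.6] * 10
```

With these made-up traces, `first_response_time(clear_ending)` returns 280, a response 80 ms after the user stops, while `first_response_time(hesitation)` returns `None` because the required wait of 630 ms has not yet elapsed.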

Benchmarking Human Conversation

To evaluate real conversational behavior, Tavus benchmarked Sparrow-1 against leading turn-taking systems using 28 challenging real-world conversational audio samples designed to surface hesitation, overlap, and ambiguous turn endings.

Across these evaluations, Sparrow-1 achieved:

100 percent precision and recall
Zero interruptions
55 millisecond median response latency
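The release does not specify how these figures were computed. One common way to derive such metrics from labeled trials is sketched below in Python. The `Trial` structure, the `summarize` function, and the sample values are hypothetical and are not the benchmark data.

```python
from statistics import median
from typing import List, NamedTuple, Optional


class Trial(NamedTuple):
    responded: bool              # system decided the user's turn had ended
    turn_ended: bool             # ground truth: the user really had finished
    latency_ms: Optional[float]  # delay from true turn end to response, if any


def summarize(trials: List[Trial]) -> dict:
    """Compute precision, recall, interruption count, and median latency."""
    tp = sum(t.responded and t.turn_ended for t in trials)
    fp = sum(t.responded and not t.turn_ended for t in trials)  # interruptions
    fn = sum(not t.responded and t.turn_ended for t in trials)  # missed endings
    latencies = [
        t.latency_ms
        for t in trials
        if t.responded and t.turn_ended and t.latency_ms is not None
    ]
    return {
        "precision": tp / (tp + fp) if tp + fp else None,
        "recall": tp / (tp + fn) if tp + fn else None,
        "interruptions": fp,
        "median_latency_ms": median(latencies) if latencies else None,
    }


# Made-up sample: three correct responses and one interruption.
sample = [
    Trial(True, True, 40),
    Trial(True, True, 60),
    Trial(True, True, 80),
    Trial(True, False, None),
]
report = summarize(sample)
```

Under this definition, a system scores 100 percent precision only if it never responds before the user has finished, and 100 percent recall only if it never misses a genuine turn ending.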

By comparison, existing systems were forced to choose between avoiding interruptions by waiting several seconds and responding quickly at the cost of frequent cut-offs. The results suggest that the tradeoff between speed and correctness commonly observed in conversational AI is a consequence of silence-based design rather than an inherent property of conversation.

Built for How Humans Actually Talk

Teaching AI the art of being human requires teaching it the dance of conversation. At runtime, Sparrow-1 adapts continuously to each speaker, learning cadence, pause duration, and hesitation patterns as a conversation unfolds. The model incorporates fillers, trailing vocalizations, prosodic rhythm, and emotional cadence into its timing decisions.

When interruptions occur, Sparrow-1 resolves them in real time, distinguishing deliberate bids for the conversational floor from incidental overlap within tens of milliseconds. Over the course of a conversation, timing progressively synchronizes to the speaker, producing smoother and more natural interactions.

Availability

Sparrow-1 is generally available today across Tavus APIs and products and already powers conversational experiences in Tavus PALs and enterprise deployments.

To see Sparrow-1 in action, visit the demo at https://www.tavus.io.

About Tavus

Tavus is a San Francisco-based AI research company pioneering human computing, the next era of computing built around adaptive and emotionally intelligent AI humans. Tavus develops foundational models that enable machines to see, hear, respond, and act in ways that feel natural to people.

In addition to consumer PALs, Tavus provides APIs and enterprise solutions for deploying lifelike AI humans at scale.

Learn more at https://www.tavus.io

Contacts

For Contact:
Leigh Disher
leigh@gmkcommunications.com
