Assets Guide
Learn how to use faces, voices, and driving motions to customize your realtime avatar. You can use standard assets or upload your own custom assets.
Overview
rtAV supports three types of assets:
- Faces - Visual appearance of the avatar
- Voices - Audio voice for text-to-speech
- Driving Motions - Animation style and gestures
Standard Assets
rtAV provides a variety of standard assets that you can use immediately. List available assets via the API:
List Standard Faces
// Browser JS / Node.js
const response = await fetch('https://api.rtav.io/v1/faces');
const { faces } = await response.json();
console.log('Available faces:', faces);
List Standard Voices
// Browser JS / Node.js
const response = await fetch('https://api.rtav.io/v1/voices');
const { voices } = await response.json();
console.log('Available voices:', voices);
List Standard Driving Motions
// Browser JS / Node.js
const response = await fetch('https://api.rtav.io/v1/driving');
const { driving } = await response.json();
console.log('Available driving motions:', driving);
Using Assets in Sessions
Specify assets when creating a session:
// Browser JS / Node.js
const response = await fetch('https://api.rtav.io/v1/sessions', {
method: 'POST',
headers: {
'Authorization': 'Bearer rtav_ak_your_api_key',
'Content-Type': 'application/json'
},
body: JSON.stringify({
model: 'gpt-5.2',
voice: 'alice', // Standard voice ID
face: 'face1', // Standard face ID
driving: 'idle' // Standard driving motion ID
})
});
Updating Assets During Session
Update assets dynamically via session.update. You can change voice, face, or driving motion at any time during an active session:
// Browser JavaScript - WebSocket
ws.send(JSON.stringify({
type: 'session.update',
session: {
voice: 'anthony', // Change voice
face: 'face2', // Change face
driving: 'SmileTeeth' // Change driving motion
}
}));
// Browser JavaScript - WebRTC (via data channel)
dataChannel.send(JSON.stringify({
type: 'session.update',
session: {
voice: 'anthony',
face: 'face2',
driving: 'SmileTeeth'
}
}));
Driving Motion Sequences
You can update the driving motion with a single motion or with a sequence of motions. When you send a sequence, the avatar transitions smoothly between motions, playing each intermediate motion once and looping the final motion.
Single Driving Change
Change to a single driving motion (permanent change):
// Browser JavaScript
ws.send(JSON.stringify({
type: 'session.update',
session: {
driving: 'Wink' // Single string - transitions to Wink and loops
}
}));
Driving Sequence (Multiple Motions)
Send an array of driving motions to create a sequence. The avatar will:
- Transition from the current motion to the first motion in the sequence
- Play all intermediate motions once (in order)
- Transition between intermediate motions
- Loop the final motion continuously
Permanent Change with Intermediate Motion
Play an intermediate motion, then transition to a final motion:
// Browser JavaScript
ws.send(JSON.stringify({
type: 'session.update',
session: {
driving: ['AgreeYesTotaly', 'default']
// Plays AgreeYesTotaly once, then transitions to default and loops
}
}));
Repeating Intermediate Motion
Repeat an intermediate motion multiple times, then transition to a final motion:
// Browser JavaScript
ws.send(JSON.stringify({
type: 'session.update',
session: {
driving: ['Wink', 'Wink', 'Wink', 'default']
// Plays Wink 3 times, then transitions to default and loops
}
}));
Note: All transitions are smooth and real-time. The system uses motion template frames (not pre-rendered video) to ensure low latency and seamless transitions.
Custom Assets
Upload your own custom assets via the platform dashboard:
- Navigate to Assets in your dashboard
- Upload your asset (image for face, audio for voice, video for driving motion)
- Wait for processing to complete
- Use the asset UUID in your sessions
Using Custom Assets
Use the UUID of your custom asset:
// Browser JS / Node.js
const response = await fetch('https://api.rtav.io/v1/sessions', {
method: 'POST',
headers: {
'Authorization': 'Bearer rtav_ak_your_api_key',
'Content-Type': 'application/json'
},
body: JSON.stringify({
model: 'gpt-5.2',
voice: '550e8400-e29b-41d4-a716-446655440000', // Custom voice UUID
face: '660e8400-e29b-41d4-a716-446655440001', // Custom face UUID
driving: '770e8400-e29b-41d4-a716-446655440002' // Custom driving UUID
})
});
Asset Requirements
Face Images
- Format: JPEG or PNG
- Recommended size: 512x512 pixels or larger
- Clear, front-facing portrait
- Good lighting and contrast
Voice Audio
- Format: WAV, MP3, or other common audio formats
- Duration: 10-60 seconds recommended
- Clear speech, minimal background noise
- Single speaker
Driving Motion Videos
- Format: MP4 or other common video formats
- Duration: 5-30 seconds recommended
- Front-facing subject
- Clear facial expressions and movements
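If you want to sanity-check a face image in the browser before uploading it through the dashboard, the sketch below checks it against the format and minimum-size recommendations above using the standard File and Image APIs. checkFaceImage is a hypothetical helper, not part of an rtAV SDK, and the dashboard still runs its own validation during processing.
// Browser JavaScript - hypothetical pre-upload check for a face image (sketch only)
function checkFaceImage(file) {
  return new Promise((resolve, reject) => {
    // Format: JPEG or PNG
    if (!['image/jpeg', 'image/png'].includes(file.type)) {
      reject(new Error('Face images must be JPEG or PNG'));
      return;
    }
    const url = URL.createObjectURL(file);
    const img = new Image();
    img.onload = () => {
      URL.revokeObjectURL(url);
      // Recommended size: 512x512 pixels or larger
      if (img.naturalWidth < 512 || img.naturalHeight < 512) {
        reject(new Error('Image should be at least 512x512 pixels'));
      } else {
        resolve({ width: img.naturalWidth, height: img.naturalHeight });
      }
    };
    img.onerror = () => {
      URL.revokeObjectURL(url);
      reject(new Error('File could not be decoded as an image'));
    };
    img.src = url;
  });
}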
Popular Standard Assets
Voices
- default - Default voice
- alice - Female voice
- anthony - Male voice
- alloy - Neutral voice
- echo - Clear, professional voice
- nova - Warm, friendly voice
Driving Motions
- idle - Idle animation
- IdleListeningNatural - Natural listening pose
- AgreeYesTotaly - Agreeing gesture
- SmileTeeth - Smiling expression
- Thinking - Thinking pose
- Surprised - Surprised expression
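As an example of combining these standard IDs with the sequence behavior described earlier, the snippet below plays a one-off reaction and then settles back into a listening loop. It assumes an open WebSocket session (ws) as in the earlier examples; motion availability may vary by account.
// Browser JavaScript - play a one-off reaction, then return to a listening loop
ws.send(JSON.stringify({
  type: 'session.update',
  session: {
    driving: ['Surprised', 'IdleListeningNatural']
    // Plays Surprised once, then transitions to IdleListeningNatural and loops
  }
}));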
Next Steps
- Check the Assets API Reference for complete endpoint documentation
- Upload custom assets via the Assets Dashboard
- Try different combinations in the WebSocket Demo or WebRTC Demo