Assets Guide

Learn how to use faces, voices, and driving motions to customize your realtime avatar. You can use standard assets or upload your own custom assets.

Overview

rtAV supports three types of assets:

Faces - Visual appearance of the avatar
Voices - Audio voice for text-to-speech
Driving Motions - Animation style and gestures

Standard Assets

rtAV provides standard faces, voices, and driving motions that you can use immediately. List the available assets via the API:

List Standard Faces

// Browser JS / Node.js
const response = await fetch('https://api.rtav.io/v1/faces');
const { faces } = await response.json();
console.log('Available faces:', faces);

List Standard Voices

// Browser JS / Node.js
const response = await fetch('https://api.rtav.io/v1/voices');
const { voices } = await response.json();
console.log('Available voices:', voices);

List Standard Driving Motions

// Browser JS / Node.js
const response = await fetch('https://api.rtav.io/v1/driving');
const { driving } = await response.json();
console.log('Available driving motions:', driving);
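
The three listing calls differ only in the final path segment, so they can share one helper that also checks the HTTP status. The following is a minimal sketch, assuming each endpoint returns a JSON object keyed by the same name as its path segment (as the examples above destructure it). The names API_BASE, ASSET_KINDS, assetListUrl, listAssets, and the fetchImpl parameter are illustrative and not part of the rtAV API.

```javascript
// Hypothetical helper names; only the three URLs come from this guide.
const API_BASE = 'https://api.rtav.io/v1';
const ASSET_KINDS = ['faces', 'voices', 'driving'];

// Build the listing URL for one asset kind.
function assetListUrl(kind) {
  if (!ASSET_KINDS.includes(kind)) {
    throw new Error(`Unknown asset kind: ${kind}`);
  }
  return `${API_BASE}/${kind}`;
}

// Fetch one asset list. fetchImpl is injectable so the helper can be
// exercised without network access.
async function listAssets(kind, fetchImpl = fetch) {
  const response = await fetchImpl(assetListUrl(kind));
  if (!response.ok) {
    throw new Error(`Listing ${kind} failed with HTTP ${response.status}`);
  }
  const body = await response.json();
  // Assumption: the payload key matches the path segment.
  return body[kind];
}
```

To load all three lists concurrently, one option is Promise.all(ASSET_KINDS.map((kind) => listAssets(kind))). The arrow function matters here: passing listAssets directly to map would hand the array index to fetchImpl.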

Using Assets in Sessions

Specify assets when creating a session:

// Browser JS / Node.js
const response = await fetch('https://api.rtav.io/v1/sessions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer rtav_ak_your_api_key',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'gpt-5.2',
    voice: 'alice',        // Standard voice ID
    face: 'face1',         // Standard face ID
    driving: 'idle'        // Standard driving motion ID
  })
});
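
The request above does not inspect the response. The sketch below wraps the same call in a function that fails loudly on a non-2xx status. It is one possible shape, not the official client: buildSessionRequest, createSession, and fetchImpl are hypothetical names, and the response body is returned as parsed JSON without assuming any particular fields.

```javascript
// Hypothetical helper: builds the request shown above from a set of asset IDs.
function buildSessionRequest(apiKey, { model, voice, face, driving }) {
  return {
    url: 'https://api.rtav.io/v1/sessions',
    options: {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
      },
      // JSON.stringify drops properties whose value is undefined,
      // so omitted assets are simply left out of the body.
      body: JSON.stringify({ model, voice, face, driving })
    }
  };
}

// Send the request and surface HTTP errors instead of ignoring them.
async function createSession(apiKey, assets, fetchImpl = fetch) {
  const { url, options } = buildSessionRequest(apiKey, assets);
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    throw new Error(`Session creation failed with HTTP ${response.status}`);
  }
  // The response shape is not described in this guide; return it as-is.
  return response.json();
}
```

Whether the API applies defaults for omitted asset fields is not stated in this guide, so passing all three IDs explicitly is the safer choice.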

Updating Assets During Session

Send a session.update message to change the voice, face, or driving motion at any time during an active session:

// Browser JavaScript - WebSocket
ws.send(JSON.stringify({
  type: 'session.update',
  session: {
    voice: 'anthony',     // Change voice
    face: 'face2',        // Change face
    driving: 'SmileTeeth' // Change driving motion
  }
}));

// Browser JavaScript - WebRTC (via data channel)
dataChannel.send(JSON.stringify({
  type: 'session.update',
  session: {
    voice: 'anthony',
    face: 'face2',
    driving: 'SmileTeeth'
  }
}));
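
Because a WebSocket and a WebRTC data channel both expose a send method that accepts a string, the two snippets above can share one helper. The sketch below uses hypothetical names (UPDATABLE_FIELDS, buildSessionUpdate, sendSessionUpdate) and restricts itself to the three asset fields covered in this guide. It includes only the fields you actually pass, so a partial update leaves the other assets untouched in the message.

```javascript
// Hypothetical helper names; the message shape matches the examples above.
const UPDATABLE_FIELDS = ['voice', 'face', 'driving'];

// Build a session.update message containing only the fields that were given.
function buildSessionUpdate(changes) {
  const session = {};
  for (const field of UPDATABLE_FIELDS) {
    if (changes[field] !== undefined) {
      session[field] = changes[field];
    }
  }
  if (Object.keys(session).length === 0) {
    throw new Error('session.update needs at least one of: voice, face, driving');
  }
  return { type: 'session.update', session };
}

// channel is any object with send(string): a WebSocket or an RTCDataChannel.
function sendSessionUpdate(channel, changes) {
  channel.send(JSON.stringify(buildSessionUpdate(changes)));
}
```

For example, sendSessionUpdate(ws, { voice: 'anthony' }) and sendSessionUpdate(dataChannel, { voice: 'anthony' }) send the same payload over either transport.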

Driving Motion Sequences

You can update the driving motion with a single motion or a sequence of motions. When you send a sequence, the avatar transitions smoothly between motions, plays each intermediate motion once, and loops the final motion.

Single Driving Change

Pass a single motion ID to switch to that motion. The change is permanent: the avatar loops the new motion.

// Browser JavaScript
ws.send(JSON.stringify({
  type: 'session.update',
  session: {
    driving: 'Wink'  // Single string - transitions to Wink and loops
  }
}));

Driving Sequence (Multiple Motions)

Send an array of driving motions to create a sequence. The avatar will:

Transition from the current motion to the first motion in the sequence
Play all intermediate motions once (in order)
Transition between intermediate motions
Loop the final motion continuously
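
The rules above can be modelled as a small function that splits a driving value into the motions played once and the motion that loops. This is an illustrative client-side model only, under the assumption that a single string behaves like a one-element sequence. planDrivingPlayback is a hypothetical name and does not reflect how the rtAV server is implemented.

```javascript
// Illustrative model of the sequence rules described above.
function planDrivingPlayback(driving) {
  // Treat a single motion ID as a sequence of length one.
  const sequence = Array.isArray(driving) ? driving : [driving];
  if (sequence.length === 0) {
    throw new Error('Driving sequence must contain at least one motion');
  }
  return {
    // Every motion except the last is played once, in order.
    playOnce: sequence.slice(0, -1),
    // The last motion loops continuously.
    loop: sequence[sequence.length - 1]
  };
}
```

For the input ['AgreeYesTotaly', 'default'] the function returns playOnce ['AgreeYesTotaly'] and loop 'default'. For the single string 'Wink' it returns an empty playOnce list and loop 'Wink'.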

Permanent Change with Intermediate Motion

Play an intermediate motion, then transition to a final motion:

// Browser JavaScript
ws.send(JSON.stringify({
  type: 'session.update',
  session: {
    driving: ['AgreeYesTotaly', 'default']
    // Plays AgreeYesTotaly once, then transitions to default and loops
  }
}));

Repeating Intermediate Motion

Repeat an intermediate motion multiple times, then transition to a final motion:

// Browser JavaScript
ws.send(JSON.stringify({
  type: 'session.update',
  session: {
    driving: ['Wink', 'Wink', 'Wink', 'default']
    // Plays Wink 3 times, then transitions to default and loops
  }
}));
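
Writing the same motion ID several times by hand is error-prone for longer repeats. A small builder can generate the array instead. The sketch below uses a hypothetical helper name, repeatThen, and produces exactly the kind of array shown above.

```javascript
// Hypothetical helper: repeat one motion `times` times, then end on finalMotion.
function repeatThen(motion, times, finalMotion) {
  if (!Number.isInteger(times) || times < 1) {
    throw new Error('times must be a positive integer');
  }
  // Array(times).fill(motion) yields the repeated intermediate motions.
  return [...Array(times).fill(motion), finalMotion];
}

// Builds the same payload as the example above.
const winkThreeTimes = {
  type: 'session.update',
  session: { driving: repeatThen('Wink', 3, 'default') }
};
```

The resulting message can be passed to ws.send(JSON.stringify(winkThreeTimes)) in the same way as the hand-written version.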

Note: All transitions happen in real time. The system uses motion template frames (not pre-rendered video), which keeps latency low and transitions seamless.

Custom Assets

Upload your own custom assets via the platform dashboard:

Navigate to Assets in your dashboard
Upload your asset (image for face, audio for voice, video for driving motion)
Wait for processing to complete
Use the asset UUID in your sessions

Using Custom Assets

Pass the UUID of your custom asset in place of a standard asset ID when creating a session:

// Browser JS / Node.js
const response = await fetch('https://api.rtav.io/v1/sessions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer rtav_ak_your_api_key',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'gpt-5.2',
    voice: '550e8400-e29b-41d4-a716-446655440000',  // Custom voice UUID
    face: '660e8400-e29b-41d4-a716-446655440001',    // Custom face UUID
    driving: '770e8400-e29b-41d4-a716-446655440002'   // Custom driving UUID
  })
});
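
Since standard assets use short names and custom assets use UUIDs, client code can tell the two apart by shape, for example to log which assets in a session are custom. The sketch below assumes that standard asset IDs are never UUID-shaped, which this guide implies but does not state. UUID_PATTERN and isCustomAssetId are hypothetical names.

```javascript
// Matches the 8-4-4-4-12 hexadecimal layout of the UUIDs shown above.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Assumption: any UUID-shaped ID refers to a custom asset.
function isCustomAssetId(id) {
  return typeof id === 'string' && UUID_PATTERN.test(id);
}

// Example: list which fields of a session config point at custom assets.
function customAssetFields(config) {
  return ['voice', 'face', 'driving'].filter((field) => isCustomAssetId(config[field]));
}
```

This check only inspects the format of the ID. It does not confirm that the asset exists or has finished processing.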

Asset Requirements

Face Images

Format: JPEG or PNG
Recommended size: 512x512 pixels or larger
Clear, front-facing portrait
Good lighting and contrast

Voice Audio

Format: WAV, MP3, or other common audio formats
Duration: 10-60 seconds recommended
Clear speech, minimal background noise
Single speaker

Driving Motion Videos

Format: MP4 or other common video formats
Duration: 5-30 seconds recommended
Front-facing subject
Clear facial expressions and movements
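
The numeric recommendations above can be checked on the client before uploading. The sketch below is a hedged example rather than a platform feature: RECOMMENDED and checkAsset are hypothetical names, the info object (width, height, durationSeconds) is assumed to be measured by the caller, and the function returns warnings instead of rejecting, because the guide lists these values as recommendations.

```javascript
// Thresholds taken from the recommendations listed above.
const RECOMMENDED = {
  face: { minSide: 512 },
  voice: { minSeconds: 10, maxSeconds: 60 },
  driving: { minSeconds: 5, maxSeconds: 30 }
};

// Return a list of human-readable warnings; an empty list means
// the asset is within the recommended ranges.
function checkAsset(kind, info) {
  const rule = RECOMMENDED[kind];
  if (!rule) {
    throw new Error(`Unknown asset kind: ${kind}`);
  }
  const warnings = [];
  if (kind === 'face') {
    if (info.width < rule.minSide || info.height < rule.minSide) {
      warnings.push(`Image is ${info.width}x${info.height}; 512x512 or larger is recommended`);
    }
  } else if (info.durationSeconds < rule.minSeconds || info.durationSeconds > rule.maxSeconds) {
    warnings.push(
      `Duration is ${info.durationSeconds}s; ${rule.minSeconds}-${rule.maxSeconds}s is recommended`
    );
  }
  return warnings;
}
```

Qualitative requirements such as lighting, background noise, and a front-facing subject cannot be checked this way and still need a manual review.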

Popular Standard Assets

Voices

default - Default voice
alice - Female voice
anthony - Male voice
alloy - Neutral voice
echo - Clear, professional voice
nova - Warm, friendly voice

Driving Motions

idle - Idle animation
IdleListeningNatural - Natural listening pose
AgreeYesTotaly - Agreeing gesture
SmileTeeth - Smiling expression
Thinking - Thinking pose
Surprised - Surprised expression

Next Steps

Check the Assets API Reference for complete endpoint documentation
Upload custom assets via the Assets Dashboard
Try different combinations in the WebSocket Demo or WebRTC Demo