# HeyGen LiveAvatar Integration for AI Agent
This document explains how to use HeyGen's streaming avatars with the AI Agent system.
## Overview
HeyGen LiveAvatar provides photorealistic AI avatars that can speak and respond in real-time via WebRTC streaming. This integration allows you to swap out the 3D avatar for a HeyGen streaming avatar displayed on a video screen in VR.
There are two ways to use HeyGen LiveAvatar with SightLab:
1. Browser ScreenCast Method (Simple) - Run LiveAvatar in a browser and screencast it into SightLab
2. API Integration Method (Advanced) - Direct API integration for programmatic control
## Method 1: Browser ScreenCast (Recommended for Quick Setup)
This is the simplest way to use HeyGen LiveAvatar with SightLab or E-Learning Lab. You run the avatar in your browser and screencast the window into the VR environment.
### Step 1: Create a LiveAvatar Account
- Go to LiveAvatar and sign up for an account
- Log in to access the LiveAvatar dashboard
### Step 2: Choose or Create an Avatar
- Browse the available avatars in the LiveAvatar library
- Select an existing avatar or create a new one
- Customize your avatar's appearance and voice settings as desired
### Step 3: Configure Your Avatar Session
You can either:
- Embed in a webpage: Get the embed code and run it on a local webpage
- Run directly from LiveAvatar: Use the LiveAvatar web interface directly
### Step 4: Configure Browser Settings
Important: Turn off hardware acceleration in your browser for proper screen capture.
For Chrome:
1. Go to chrome://settings
2. Scroll to "System"
3. Toggle "Use hardware acceleration when available" OFF
4. Restart Chrome
### Step 5: Select Your Microphone
Note: Using the default microphone option often doesn't work properly. You must specifically select your microphone device in the LiveAvatar settings.
- In the LiveAvatar interface, open audio/microphone settings
- Select your specific microphone device from the dropdown (do not use "Default")
- Test that your audio is being captured
### Step 6: Run the ScreenCast Script

Note: For the E-Learning Lab, you just need to drag in the "Cast" object from the Videos tab.

- Start the LiveAvatar session in your browser
- Run the SightLab screencast script: `python HeyGen_ScreenCast.py`
- A window selection dialog will appear - select "liveavatarapp" from the dropdown
- Put on your VR headset
### Step 7: Start Your Session

| Key | Action |
|---|---|
| `Space` | Start/end trial (begins recording) |
| `m` | Scale video screen up |
| `n` | Scale video screen down |
| `t` | Print current screen position/rotation/scale |

- Press `Space` to start the trial and begin recording
- Interact with the LiveAvatar in your browser - have a conversation
- Eye tracking data and Biopac data (if connected) will be collected automatically
- Press `Space` again to end the trial
### Step 8: View Your Data

After the session, your data files are saved in the `data/` folder:
- `experiment_summary.csv` - Overview of the experiment
- `trial_data/` - Individual trial CSV files with eye tracking, biometric data, etc.
- `replay_data/` - Replay files for playback
- `recordings/` - Video recordings of the screencast
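The trial CSVs can also be inspected programmatically. Below is a minimal sketch using only the standard library; it assumes the `data/trial_data/` layout listed above, and since the column names inside the CSVs are not specified here, `DictReader` simply keeps whatever headers the files contain:

```python
import csv
import glob
import os

def latest_trial_rows(data_dir="data"):
    """Return the rows of the most recently written trial CSV as dicts."""
    pattern = os.path.join(data_dir, "trial_data", "*.csv")
    files = sorted(glob.glob(pattern), key=os.path.getmtime)
    if not files:
        return []
    with open(files[-1], newline="") as f:
        return list(csv.DictReader(f))
```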
### Step 9: Replay Your Session

Run the replay script to view the recorded session with synchronized data:

```
python HeyGen_ScreenCast_Replay.py
```
- Select the video file from the dialog (most recent files appear first)
- The replay will show the 3D VR scene with the video feed playing back
- View synchronized eye tracking visualizations and interactions
## Method 2: API Integration (Advanced)
For programmatic control and tighter integration, you can use the HeyGen API directly.
### Requirements

#### Python Packages

Install the required packages:

```
pip install livekit requests numpy opencv-python Pillow pyaudio
```
#### HeyGen Account
- Sign up for a HeyGen/LiveAvatar account at https://www.liveavatar.com/
- Get your API key from the LiveAvatar settings page
- Get an Avatar ID from the available avatars
### Configuration

#### Method 1: Environment Variables (Recommended)

Set these environment variables:

```
# Windows (PowerShell)
$env:HEYGEN_API_KEY = "your-api-key-here"
$env:HEYGEN_AVATAR_ID = "your-avatar-id"

# Windows (Command Prompt)
setx HEYGEN_API_KEY your-api-key-here
setx HEYGEN_AVATAR_ID your-avatar-id

# Linux/Mac
export HEYGEN_API_KEY="your-api-key-here"
export HEYGEN_AVATAR_ID="your-avatar-id"
```
#### Method 2: Config File

Edit `Config_Global.py` and set:

```python
USE_HEYGEN_AVATAR = True
HEYGEN_API_KEY = "your-api-key-here"
HEYGEN_AVATAR_ID = "your-avatar-id"
```
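If both mechanisms are used, the integration has to pick one value. The helper below illustrates one common resolution order (environment variable first, then the `Config_Global.py` value, then a default); the actual precedence SightLab applies is an assumption here, so check the source if it matters:

```python
import os

def resolve_setting(env_name, config_value, default=None):
    """Prefer the environment variable, then the config value, then a default."""
    return os.environ.get(env_name) or config_value or default

# e.g. api_key = resolve_setting("HEYGEN_API_KEY", HEYGEN_API_KEY)
```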
#### Optional Configuration

In `Config_Global.py`:

```python
# HeyGen Settings
HEYGEN_VOICE_ID = None         # Optional: specific voice ID
HEYGEN_CONTEXT_ID = None       # Optional: persona context
HEYGEN_LANGUAGE = "en"         # Language code
HEYGEN_MODE = "FULL"           # FULL or CUSTOM mode
HEYGEN_USE_BUILTIN_AI = False  # Use HeyGen's AI or your own

# Video Screen Position in VR
HEYGEN_VIDEO_SCREEN_POSITION = [0, 1.5, 2]   # [x, y, z]
HEYGEN_VIDEO_SCREEN_SCALE = [1.5, 1.5, 1.5]  # Scale
HEYGEN_VIDEO_SCREEN_EULER = [0, 180, 0]      # Rotation

# Development mode (uses fewer credits)
HEYGEN_SANDBOX_MODE = False
```
### Usage

#### Running with HeyGen Avatar

```
# Run with HeyGen avatar
python AI_Agent_HeyGen.py

# Or run the traditional 3D avatar version
python AI_Agent.py
```
#### Controls

| Key | Action |
|---|---|
| `c` | Start/stop speaking to avatar |
| `t` | Open text chat window |
| `m` | Scale video up |
| `n` | Scale video down |
| `h` | Take screenshot |
| `Space` | Start/end trial |
## Architecture

### How It Works

1. Session Creation: The system creates a HeyGen session via the REST API
2. LiveKit Connection: Connects to a LiveKit room for WebRTC streaming
3. Video Display: Receives video frames and displays them on a quad in VR
4. Audio Playback: Receives audio from the avatar for spatial playback
5. Commands: Sends text to the avatar via LiveKit data channels
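The first two steps above can be sketched as follows. This is illustrative only: the endpoint path, header name, payload fields, and response keys are assumptions (confirm them against the LiveAvatar API docs), and the LiveKit step is shown as the call shape from the `livekit` SDK rather than the integration's actual code:

```python
import json
import urllib.request

API_BASE = "https://api.heygen.com/v1"  # assumed base URL

def build_session_request(api_key, avatar_id):
    """Assemble headers and payload for the session-creation POST."""
    headers = {"X-Api-Key": api_key, "Content-Type": "application/json"}
    payload = {"avatar_id": avatar_id}
    return headers, payload

def create_session(api_key, avatar_id):
    """POST to the streaming endpoint; the response is expected to carry
    a session id plus a LiveKit URL and access token for step 2."""
    headers, payload = build_session_request(api_key, avatar_id)
    req = urllib.request.Request(
        f"{API_BASE}/streaming.new",  # assumed endpoint name
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

# Step 2 (LiveKit connection) would then look roughly like:
#   room = livekit.rtc.Room()
#   await room.connect(session["url"], session["access_token"])
```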
### Modes
FULL Mode (Recommended):
- HeyGen handles speech recognition and can use its built-in AI
- You can also provide your own AI responses
- Full transcription and event support
CUSTOM Mode:
- You provide all audio input via websocket
- More control but more complex setup
### Using Your Own AI vs HeyGen's AI

Your Own AI (`HEYGEN_USE_BUILTIN_AI = False`):

```python
# Your AI generates the response
response = your_ai_model.generate(user_input)
# Send to HeyGen just for speech synthesis
heygen_session.speak_text(response)
```

HeyGen's Built-in AI (`HEYGEN_USE_BUILTIN_AI = True`):

```python
# HeyGen handles AI response generation
heygen_session.speak_response(user_input)
```
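The two paths can be combined behind a single dispatcher. A sketch, where the `session` methods mirror the snippets above and `ai_model` is a hypothetical object exposing a `generate` method:

```python
def respond(user_input, session, use_builtin_ai, ai_model=None):
    """Route a user utterance to HeyGen's built-in AI or to your own model."""
    if use_builtin_ai:
        # HeyGen generates the reply and speaks it
        session.speak_response(user_input)
        return None
    # Your model generates the reply; HeyGen only synthesizes speech
    reply = ai_model.generate(user_input)
    session.speak_text(reply)
    return reply
```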
## API Reference

### HeyGen_Avatar Module

```python
import HeyGen_Avatar

# Initialize session
session = HeyGen_Avatar.initialize_heygen(
    api_key="your-key",
    avatar_id="your-avatar"
)

# Set up video display in Vizard
display = HeyGen_Avatar.setup_heygen_display(viz, vizfx)

# Make avatar speak
HeyGen_Avatar.speak("Hello, how can I help you?")

# Use HeyGen's AI for response
HeyGen_Avatar.speak_response("What is the weather?")

# Interrupt avatar
HeyGen_Avatar.interrupt()

# Clean up
HeyGen_Avatar.cleanup_heygen()
```
### Event Handling

```python
# Register event callbacks
session.on_event("avatar.speak_started", lambda msg: print("Speaking..."))
session.on_event("avatar.speak_ended", lambda msg: print("Done"))
session.on_event("user.transcription", lambda msg: print(f"User: {msg['text']}"))
```
### Available Events (FULL Mode)

Command Events (you send):
- `avatar.speak_text` - Speak specific text
- `avatar.speak_response` - Generate AI response to text
- `avatar.interrupt` - Stop current speech
- `avatar.start_listening` - Start listening mode
- `avatar.stop_listening` - Stop listening mode

Server Events (you receive):
- `user.speak_started` - User started talking
- `user.speak_ended` - User stopped talking
- `avatar.speak_started` - Avatar started talking
- `avatar.speak_ended` - Avatar stopped talking
- `user.transcription` - User speech transcribed
- `avatar.transcription` - Avatar response text
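Command events travel as JSON over the LiveKit data channel. A minimal payload builder is sketched below; the `{"type": ..., ...fields}` wire shape is an assumption, so confirm it against the FULL Mode event documentation:

```python
import json

def make_command(event_type, **fields):
    """Encode a command event for publishing on the data channel."""
    return json.dumps({"type": event_type, **fields}).encode("utf-8")

# Publishing would then look roughly like:
#   await room.local_participant.publish_data(
#       make_command("avatar.speak_text", text="Hello!"))
payload = make_command("avatar.interrupt")
```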
## Troubleshooting

### Common Issues

"livekit package not installed"

```
pip install livekit
```
"Session creation failed"
- Check that your API key is valid
- Check your internet connection
- Verify that your avatar ID exists
"No video appearing"
- Check that the video screen position is within view
- Try adjusting `HEYGEN_VIDEO_SCREEN_POSITION`
- Check the console for frame processing errors
"Avatar not speaking"
- Ensure the session is connected
- Check that the text is not empty
- Look for errors in the console
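Those checks can also be made programmatically before sending text. A small guard sketch, where the `connected` attribute and `speak_text` method are assumptions about the session object:

```python
def safe_speak(session, text):
    """Validate the common failure conditions above before speaking."""
    if not text or not text.strip():
        raise ValueError("Refusing to send empty text")
    if not getattr(session, "connected", False):
        raise RuntimeError("Session is not connected")
    session.speak_text(text)
```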
### Debug Mode

```python
# Print detailed status
HeyGen_Avatar.print_heygen_status()
```
## Resources
- LiveAvatar Documentation
- LiveAvatar Quick Start
- FULL Mode Events
- LiveKit Python SDK
- HeyGen Community
## License
This integration is provided as-is for use with SightLab VR.
HeyGen/LiveAvatar services require a separate subscription.