# HeyGen LiveAvatar Integration for AI Agent
This document explains how to use HeyGen's streaming avatars with the AI Agent system.
## Overview
HeyGen LiveAvatar provides photorealistic AI avatars that can speak and respond in real-time via WebRTC streaming. This integration allows you to swap out the 3D avatar for a HeyGen streaming avatar displayed on a video screen in VR.
There are two ways to use HeyGen LiveAvatar with SightLab:
1. Browser ScreenCast Method (Simple) - Run LiveAvatar in a browser and screencast it into SightLab
2. API Integration Method (Advanced) - Direct API integration for programmatic control
## Method 1: Browser ScreenCast (Recommended for Quick Setup)
This is the simplest way to use HeyGen LiveAvatar with SightLab or E-Learning Lab. You run the avatar in your browser and screencast the window into the VR environment.
### Step 1: Create a LiveAvatar Account
- Go to LiveAvatar and sign up for an account
- Log in to access the LiveAvatar dashboard
### Step 2: Choose or Create an Avatar
- Browse the available avatars in the LiveAvatar library
- Select an existing avatar or create a new one
- Customize your avatar's appearance and voice settings as desired
### Step 3: Configure Your Avatar Session
You can either:
- Embed in a webpage: Get the embed code and run it on a local webpage
- Run directly from LiveAvatar: Use the LiveAvatar web interface directly
### Step 4: Configure Browser Settings
Important: Turn off hardware acceleration in your browser for proper screen capture.
For Chrome:
1. Go to chrome://settings
2. Scroll to "System"
3. Toggle "Use hardware acceleration when available" OFF
4. Restart Chrome
### Step 5: Select Your Microphone
Note: Using the default microphone option often doesn't work properly. You must specifically select your microphone device in the LiveAvatar settings.
- In the LiveAvatar interface, open audio/microphone settings
- Select your specific microphone device from the dropdown (do not use "Default")
- Test that your audio is being captured
### Step 6: Run the ScreenCast Script

Note: For the E-Learning Lab, you just need to drag in the "Cast" object from the Videos tab.

- Start the LiveAvatar session in your browser
- Run the SightLab screencast script: `python HeyGen_ScreenCast.py`
- A window selection dialog will appear - select "liveavatarapp" from the dropdown
- Put on your VR headset
### Step 7: Start Your Session

| Key | Action |
|---|---|
| `Space` | Start/end trial (begins recording) |
| `m` | Scale video screen up |
| `n` | Scale video screen down |
| `t` | Print current screen position/rotation/scale |

- Press `Space` to start the trial and begin recording
- Interact with the LiveAvatar in your browser - have a conversation
- Eye tracking data and Biopac data (if connected) will be collected automatically
- Press `Space` again to end the trial
### Step 8: View Your Data

After the session, your data files are saved in the `data/` folder:
- `experiment_summary.csv` - Overview of the experiment
- `trial_data/` - Individual trial CSV files with eye tracking, biometric data, etc.
- `replay_data/` - Replay files for playback
- `recordings/` - Video recordings of the screencast
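The trial CSVs can also be inspected programmatically. Below is a minimal sketch using only the standard library; it assumes the `data/trial_data/` layout listed above, and since the column names inside the CSVs are not specified here, `DictReader` simply keeps whatever headers the files contain:

```python
import csv
import glob
import os

def latest_trial_rows(data_dir="data"):
    """Return the rows of the most recently written trial CSV as dicts."""
    pattern = os.path.join(data_dir, "trial_data", "*.csv")
    files = sorted(glob.glob(pattern), key=os.path.getmtime)
    if not files:
        return []
    with open(files[-1], newline="") as f:
        return list(csv.DictReader(f))
```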
### Step 9: Replay Your Session

Run the replay script to view the recorded session with synchronized data:

```
python HeyGen_ScreenCast_Replay.py
```
- Select the video file from the dialog (most recent files appear first)
- The replay will show the 3D VR scene with the video feed playing back
- View synchronized eye tracking visualizations and interactions
## Method 2: API Integration (Advanced)
For programmatic control and tighter integration, you can use the HeyGen API directly.
### Requirements

#### Python Packages

Install the required packages:

```
pip install livekit requests numpy opencv-python Pillow pyaudio
```
#### HeyGen Account
- Sign up for a HeyGen/LiveAvatar account at https://www.liveavatar.com/
- Get your API key from the LiveAvatar settings page
- Get an Avatar ID from the available avatars
### Configuration

#### Method 1: Environment Variables (Recommended)

Set these environment variables:

```
# Windows (PowerShell)
$env:HEYGEN_API_KEY = "your-api-key-here"
$env:HEYGEN_AVATAR_ID = "your-avatar-id"

# Windows (Command Prompt)
setx HEYGEN_API_KEY your-api-key-here
setx HEYGEN_AVATAR_ID your-avatar-id

# Linux/Mac
export HEYGEN_API_KEY="your-api-key-here"
export HEYGEN_AVATAR_ID="your-avatar-id"
```
#### Method 2: Config File

Edit `Config_Global.py` and set:

```python
USE_HEYGEN_AVATAR = True
HEYGEN_API_KEY = "your-api-key-here"
HEYGEN_AVATAR_ID = "your-avatar-id"
```
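If both mechanisms are used, the integration has to pick one value. The helper below illustrates one common resolution order (environment variable first, then the `Config_Global.py` value, then a default); the actual precedence SightLab applies is an assumption here, so check the source if it matters:

```python
import os

def resolve_setting(env_name, config_value, default=None):
    """Prefer the environment variable, then the config value, then a default."""
    return os.environ.get(env_name) or config_value or default

# e.g. api_key = resolve_setting("HEYGEN_API_KEY", HEYGEN_API_KEY)
```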
#### Optional Configuration

In `Config_Global.py`:

```python
# HeyGen Settings
HEYGEN_VOICE_ID = None         # Optional: specific voice ID
HEYGEN_CONTEXT_ID = None       # Optional: persona context
HEYGEN_LANGUAGE = "en"         # Language code
HEYGEN_MODE = "FULL"           # FULL or CUSTOM mode
HEYGEN_USE_BUILTIN_AI = False  # Use HeyGen's AI or your own

# Video Screen Position in VR
HEYGEN_VIDEO_SCREEN_POSITION = [0, 1.5, 2]   # [x, y, z]
HEYGEN_VIDEO_SCREEN_SCALE = [1.5, 1.5, 1.5]  # Scale
HEYGEN_VIDEO_SCREEN_EULER = [0, 180, 0]      # Rotation

# Development mode (uses fewer credits)
HEYGEN_SANDBOX_MODE = False
```
### Usage

#### Running with HeyGen Avatar

```
# Run with HeyGen avatar
python AI_Agent_HeyGen.py

# Or run the traditional 3D avatar version
python AI_Agent.py
```
#### Controls

| Key | Action |
|---|---|
| `c` | Start/stop speaking to avatar |
| `t` | Open text chat window |
| `m` | Scale video up |
| `n` | Scale video down |
| `h` | Take screenshot |
| `Space` | Start/end trial |
## Architecture

### How It Works

1. Session Creation: The system creates a HeyGen session via the REST API
2. LiveKit Connection: Connects to a LiveKit room for WebRTC streaming
3. Video Display: Receives video frames and displays them on a quad in VR
4. Audio Playback: Receives audio from the avatar for spatial playback
5. Commands: Sends text to the avatar via LiveKit data channels
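The first two steps above can be sketched as follows. This is illustrative only: the endpoint path, header name, payload fields, and response keys are assumptions (confirm them against the LiveAvatar API docs), and the LiveKit step is shown as the call shape from the `livekit` SDK rather than the integration's actual code:

```python
import json
import urllib.request

API_BASE = "https://api.heygen.com/v1"  # assumed base URL

def build_session_request(api_key, avatar_id):
    """Assemble headers and payload for the session-creation POST."""
    headers = {"X-Api-Key": api_key, "Content-Type": "application/json"}
    payload = {"avatar_id": avatar_id}
    return headers, payload

def create_session(api_key, avatar_id):
    """POST to the streaming endpoint; the response is expected to carry
    a session id plus a LiveKit URL and access token for step 2."""
    headers, payload = build_session_request(api_key, avatar_id)
    req = urllib.request.Request(
        f"{API_BASE}/streaming.new",  # assumed endpoint name
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

# Step 2 (LiveKit connection) would then look roughly like:
#   room = livekit.rtc.Room()
#   await room.connect(session["url"], session["access_token"])
```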
### Modes
FULL Mode (Recommended):
- HeyGen handles speech recognition and can use its built-in AI
- You can also provide your own AI responses
- Full transcription and event support
CUSTOM Mode:
- You provide all audio input via websocket
- More control but more complex setup
### Using Your Own AI vs HeyGen's AI

Your Own AI (`HEYGEN_USE_BUILTIN_AI = False`):

```python
# Your AI generates the response
response = your_ai_model.generate(user_input)
# Send to HeyGen just for speech synthesis
heygen_session.speak_text(response)
```

HeyGen's Built-in AI (`HEYGEN_USE_BUILTIN_AI = True`):

```python
# HeyGen handles AI response generation
heygen_session.speak_response(user_input)
```
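The two paths can be combined behind a single dispatcher. A sketch, where the `session` methods mirror the snippets above and `ai_model` is a hypothetical object exposing a `generate` method:

```python
def respond(user_input, session, use_builtin_ai, ai_model=None):
    """Route a user utterance to HeyGen's built-in AI or to your own model."""
    if use_builtin_ai:
        # HeyGen generates the reply and speaks it
        session.speak_response(user_input)
        return None
    # Your model generates the reply; HeyGen only synthesizes speech
    reply = ai_model.generate(user_input)
    session.speak_text(reply)
    return reply
```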
## API Reference

### HeyGen_Avatar Module

```python
import HeyGen_Avatar

# Initialize session
session = HeyGen_Avatar.initialize_heygen(
    api_key="your-key",
    avatar_id="your-avatar"
)

# Set up video display in Vizard
display = HeyGen_Avatar.setup_heygen_display(viz, vizfx)

# Make avatar speak
HeyGen_Avatar.speak("Hello, how can I help you?")

# Use HeyGen's AI for response
HeyGen_Avatar.speak_response("What is the weather?")

# Interrupt avatar
HeyGen_Avatar.interrupt()

# Clean up
HeyGen_Avatar.cleanup_heygen()
```
### Event Handling

```python
# Register event callbacks
session.on_event("avatar.speak_started", lambda msg: print("Speaking..."))
session.on_event("avatar.speak_ended", lambda msg: print("Done"))
session.on_event("user.transcription", lambda msg: print(f"User: {msg['text']}"))
```
### Available Events (FULL Mode)

Command Events (you send):
- `avatar.speak_text` - Speak specific text
- `avatar.speak_response` - Generate AI response to text
- `avatar.interrupt` - Stop current speech
- `avatar.start_listening` - Start listening mode
- `avatar.stop_listening` - Stop listening mode

Server Events (you receive):
- `user.speak_started` - User started talking
- `user.speak_ended` - User stopped talking
- `avatar.speak_started` - Avatar started talking
- `avatar.speak_ended` - Avatar stopped talking
- `user.transcription` - User speech transcribed
- `avatar.transcription` - Avatar response text
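Command events travel as JSON over the LiveKit data channel. A minimal payload builder is sketched below; the `{"type": ..., ...fields}` wire shape is an assumption, so confirm it against the FULL Mode event documentation:

```python
import json

def make_command(event_type, **fields):
    """Encode a command event for publishing on the data channel."""
    return json.dumps({"type": event_type, **fields}).encode("utf-8")

# Publishing would then look roughly like:
#   await room.local_participant.publish_data(
#       make_command("avatar.speak_text", text="Hello!"))
payload = make_command("avatar.interrupt")
```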
## Troubleshooting

### Common Issues

"livekit package not installed"

```
pip install livekit
```
"Session creation failed"
- Check that your API key is valid
- Check your internet connection
- Verify that your avatar ID exists
"No video appearing"
- Check that the video screen position is within view
- Try adjusting `HEYGEN_VIDEO_SCREEN_POSITION`
- Check the console for frame processing errors
"Avatar not speaking"
- Ensure the session is connected
- Check that the text is not empty
- Look for errors in the console
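Those checks can also be made programmatically before sending text. A small guard sketch, where the `connected` attribute and `speak_text` method are assumptions about the session object:

```python
def safe_speak(session, text):
    """Validate the common failure conditions above before speaking."""
    if not text or not text.strip():
        raise ValueError("Refusing to send empty text")
    if not getattr(session, "connected", False):
        raise RuntimeError("Session is not connected")
    session.speak_text(text)
```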
### Debug Mode

```python
# Print detailed status
HeyGen_Avatar.print_heygen_status()
```
## Resources
- LiveAvatar Documentation
- LiveAvatar Quick Start
- FULL Mode Events
- LiveKit Python SDK
- HeyGen Community
## License
This integration is provided as-is for use with SightLab VR.
HeyGen/LiveAvatar services require a separate subscription.