AI Agents for VR & XR

The SightLab AI Agent adds interactive, conversational AI characters to VR and XR environments. Connect to major large language models, customize avatar appearance and personality, and use voice or text to have real-time conversations — all within SightLab.
Overview
The AI Agent places an avatar in your scene that you can talk to using speech recognition or text input. The avatar responds using a connected LLM, speaks back with synthesized speech, and reacts with animations and facial expressions. It works for training simulations, research studies, educational tools, interactive demos, and more.
Supported AI Models
Connect to multiple LLM providers through a single interface:
- OpenAI (GPT-4o, GPT-4, and more)
- Anthropic (Claude)
- Google (Gemini)
- Hundreds of offline models via Ollama — including DeepSeek, Gemma, Llama, Mistral, and more
Online models require an API key; Ollama models run fully offline, with no internet connection or API key needed.
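
SightLab handles provider selection through its GUI and settings, but the underlying mechanics are easy to sketch with off-the-shelf packages. In the sketch below, the `chat` helper, model names, and provider flag are illustrative, not SightLab's actual API:

```python
# Illustrative provider switch: online requests go through the openai
# client; offline requests go to a local Ollama server, which needs no
# internet connection or API key.
import requests

def chat(prompt, provider="ollama"):
    if provider == "openai":
        from openai import OpenAI  # reads OPENAI_API_KEY from the environment
        client = OpenAI()
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content
    # Ollama serves local models at http://localhost:11434
    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "llama3",
            "messages": [{"role": "user", "content": prompt}],
            "stream": False,
        },
    )
    return resp.json()["message"]["content"]

print(chat("Introduce yourself in one sentence."))
```
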
Custom Avatars
Use avatars from Avaturn, ReadyPlayerMe, Mixamo, Rocketbox, Reallusion, and other sources. Avatars can be dragged and dropped into any environment using SightLab's Inspector tool. Supported avatar features:
- Idle and talking animations
- Facial expressions (smile, sad, neutral) triggered during conversation
- Head tracking — the avatar looks at and follows the user
- Blinking and lip-sync mouth movement
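
SightLab is built on Vizard, and Inspector handles avatar placement with drag and drop; if you are curious what the head-tracking behavior involves underneath, here is a rough plain-Vizard sketch. The avatar file and bone name depend on the avatar source, and SightLab's internal wiring differs:

```python
# Rough Vizard-only sketch of an idle avatar that watches the user.
import viz
import vizact

viz.go()
avatar = viz.addAvatar('vcc_male2.cfg')  # sample avatar shipped with Vizard
avatar.setPosition([0, 0, 2])
avatar.state(1)                          # loop the idle animation

head = avatar.getBone('Bip01 Head')      # bone names differ between rigs
head.lock()                              # take control from the animation

def watch_user():
    # Re-aim the head bone at the viewpoint every frame
    head.lookAt(viz.MainView.getPosition(), mode=viz.AVATAR_WORLD)

vizact.ontimer(0, watch_user)
```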

Personality & Context
Each agent's personality, backstory, expertise, and conversational style are defined through a text prompt file. Examples of what you can create:
- A tutor that adapts explanations to the learner
- A medical professional for clinical training
- A historical figure for immersive education
- A guide for onboarding or orientation
Custom agents can be saved and reused across projects. AI agents can also serve as instructional tutors trained on your own educational content — see the E-Learning Lab for a ready-made tool built around this.
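
A personality prompt file is plain text. Something along these lines works as a starting point; the persona below is invented for illustration:

```
You are Dr. Alvarez, a triage nurse with 15 years of emergency-room
experience. You are patient and precise. Speak in short, calm sentences
and ask one clarifying question at a time. Never give a diagnosis;
instead, walk the trainee through the assessment steps you would take.
```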


Voice Interaction
Use speech recognition to talk to the agent, or type responses via text input. The agent responds with synthesized speech from one of several supported TTS engines:
- Edge TTS — high quality, built in
- Kokoro — high-quality offline voice synthesis
- Piper — lightweight and fast offline TTS
- OpenAI TTS — cloud-based voices
- ElevenLabs — cloud-based voice synthesis and cloning
All TTS engines support 40+ languages and automatically adjust to the selected language.
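
Both halves of the voice loop are also available as standalone Python packages if you want to experiment outside SightLab. A minimal sketch follows; the keyless Google web recognizer is used only for convenience, and the voice name is one of the many that `edge-tts --list-voices` prints:

```python
# Standalone voice-loop sketch: recognize speech, then synthesize a reply.
# SightLab's built-in pipeline replaces all of this.
import asyncio
import speech_recognition as sr  # microphone capture needs the PyAudio backend
import edge_tts

def listen():
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio)  # keyless web recognizer

async def speak(text, voice="en-US-JennyNeural"):
    await edge_tts.Communicate(text, voice).save("reply.mp3")

heard = listen()
asyncio.run(speak(f"You said: {heard}"))
```
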
Vision Capabilities
The agent can analyze what's visible in the scene. Ask "What do you see?" or "What are we looking at?" and the agent will process a screenshot and respond based on its surroundings. This is useful for guided tours, spatial reasoning tasks, environmental assessments, and training scenarios.
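
Mechanically, this amounts to handing a screenshot to a vision-capable model. Here is a standalone sketch with the `openai` package, where `screenshot.png` stands in for the frame SightLab captures:

```python
# Send an image plus a question to a vision-capable model.
import base64
from openai import OpenAI

client = OpenAI()
with open("screenshot.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What do you see in this scene?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```
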
Event System
The agent can trigger actions during conversation based on context:
- Facial expressions that match the tone of the conversation
- Animations triggered by context (waving, nodding, gesturing)
- Scene interactions — change lighting, move objects, play sounds, and more
You can also create your own custom events to extend agent behavior.
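
SightLab's own event registration is covered in its documentation; one generic pattern for letting an LLM drive scene actions is tag dispatch, sketched below with invented tags and handlers:

```python
# Ask the model to embed tags like [WAVE] in its replies, then strip
# and dispatch them before the text reaches the TTS engine.
import re

EVENT_HANDLERS = {
    "WAVE": lambda: print("play wave animation"),
    "DIM_LIGHTS": lambda: print("lower scene lighting"),
}

def dispatch_events(reply: str) -> str:
    """Run any [TAG] events found in the reply and return the clean text."""
    for tag in re.findall(r"\[([A-Z_]+)\]", reply):
        handler = EVENT_HANDLERS.get(tag)
        if handler:
            handler()
    return re.sub(r"\[[A-Z_]+\]\s*", "", reply).strip()

print(dispatch_events("[WAVE] Hi there! [DIM_LIGHTS] Let me set the mood."))
```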

Multi-Agent Conversations
Place multiple AI agents in the same scene. They can converse with each other, and you can join in and talk to any of them. Use cases include:
- Group dynamics and social scenario simulations
- Multi-character training exercises
- Research on conversational behavior and turn-taking
Each agent has its own personality, voice, avatar, and AI model configuration.
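
Stripped of avatars and voices, a two-agent exchange reduces to alternating model calls with separate system prompts. A bare sketch against a local Ollama server, with illustrative personas and model:

```python
# Two personas take turns; each sees the shared history plus its own
# system prompt. SightLab layers avatars, voices, and user turn-taking
# on top of this loop.
import requests

def reply(system_prompt, history):
    resp = requests.post("http://localhost:11434/api/chat", json={
        "model": "llama3", "stream": False,
        "messages": [{"role": "system", "content": system_prompt}] + history,
    })
    return resp.json()["message"]["content"]

agents = {
    "Guide": "You are a cheerful museum guide. Reply in one sentence.",
    "Skeptic": "You are a skeptical visitor. Reply in one sentence.",
}
history = [{"role": "user", "content": "Discuss the new VR exhibit."}]
speaker = "Guide"
for _ in range(4):
    text = reply(agents[speaker], history)
    print(f"{speaker}: {text}")
    history.append({"role": "user", "content": f"{speaker} said: {text}"})
    speaker = "Skeptic" if speaker == "Guide" else "Guide"
```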

Augmented Reality & Passthrough
The AI Agent supports passthrough AR on:
- Meta Quest Pro
- Meta Quest 3
- Varjo headsets
This places the AI agent in your real-world environment for mixed-reality use cases.
Research & Data Collection
The AI Agent integrates with SightLab's data collection and analytics tools:
- Full conversation transcripts saved automatically
- Eye tracking data and gaze analytics on the AI agent
- Behavioral metrics and interaction logging
- Visual analytics and heatmaps
- Session replay to review interactions
- Adaptive learning — the agent uses conversation history to improve responses
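
The adaptive-learning bullet above boils down to feeding accumulated conversation history back into each model call. A generic sketch of that loop, shown with the `openai` client (any supported provider works the same way):

```python
# History-carrying chat loop; SightLab manages this history for you and
# saves the transcript automatically.
from openai import OpenAI

client = OpenAI()
history = [{"role": "system", "content": "You are a helpful VR lab guide."}]

def converse(user_text):
    history.append({"role": "user", "content": user_text})
    resp = client.chat.completions.create(model="gpt-4o", messages=history)
    answer = resp.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer

print(converse("Remind me what we covered earlier."))
```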

Setup & Compatibility
- GUI included — configure settings visually without writing code
- Add to any SightLab project with a few lines of Python
- Publish as a standalone executable for distribution
- Works with all SightLab-supported hardware — desktop, VR headsets, and AR devices
At a Glance
- Real-time conversation with AI agents in VR or XR
- Multiple LLM models — online or fully offline
- Custom avatars with animations, expressions, and head tracking
- Customizable personalities, backstories, and conversational styles
- Voice and text input with speech recognition
- Multiple TTS engines with 40+ language support
- Adaptive learning — the agent evolves with each conversation
- Full SightLab integration: data collection, analytics, transcripts, and more

Ready to get started? See the full AI Agent documentation for setup instructions, configuration details, and examples.