Educational Interaction Tool - AI

This document describes how to use the Educational Interaction Tool in SightLab. The tool can be connected to an interactive AI agent powered by large language models such as GPT-4 and Claude Opus. You can customize the agent's personality, use speech recognition, and use high-quality text-to-speech models. You can also record your own annotations that work with these features, and either attach them to a virtual avatar or play them as voice-over.

Tagged objects can display 3D text annotations and trigger audio explanations. With AI integration, users can ask follow-up questions. Any 3D scene object should automatically be taggable for interactions and conversational information.

Location: ExampleScripts > Education_Application_AI


Key Features

  • Interact and converse with custom AI Large Language Models in real-time VR or XR simulations.
  • Choose from OpenAI models (including GPT-4 and Assistants connected through the OpenAI API) and Anthropic models (e.g., Claude 3 Opus).
  • Customize the agent's personality, contextual awareness, emotional state, interactions, and more. Save your creations as custom agents.
  • Use speech recognition for voice or text-based interaction.
  • Select high-quality voices from OpenAI TTS or Eleven Labs (requires API).
  • Train the agent to adapt using conversation history and interactions.
  • Works seamlessly with all SightLab features, including data collection, visualizations, and transcript saving.
  • Automatically tag objects in scenes to prompt questions and present information.

Instructions

1. Installation

  • Install the required libraries using the Vizard Package Manager. These include:

  • openai (for OpenAI GPT agents)
  • anthropic (for Anthropic Claude agent)
  • elevenlabs (for ElevenLabs text-to-speech)
  • SpeechRecognition
  • sounddevice (pyaudio for older versions)
  • python-vlc
  • numpy (included in SightLab)

  • Install VLC Player (minimum version 3.0.20).

  • Download VLC 3.0.20

  • Note: An active internet connection is required.

  • If using Vizard 8 or higher, copy the contents of the "updated speech recognition files" into C:\Program Files\WorldViz\Vizard8\bin\lib\site-packages\speech_recognition, overwriting __init__.py and audio.py.
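
After installing the libraries, a quick check can confirm that each one is visible to the Python interpreter. The following is a minimal sketch, not part of SightLab: the helper name find_missing is hypothetical, and the import names assume the standard mapping for these packages (SpeechRecognition imports as speech_recognition, python-vlc imports as vlc). Run it from within Vizard so that it checks Vizard's own site-packages.

```python
import importlib.util

# Install name -> import name for the libraries listed above.
# The two names differ for SpeechRecognition and python-vlc.
REQUIRED_MODULES = {
    "openai": "openai",
    "anthropic": "anthropic",
    "elevenlabs": "elevenlabs",
    "SpeechRecognition": "speech_recognition",
    "sounddevice": "sounddevice",
    "python-vlc": "vlc",
    "numpy": "numpy",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the install names of packages whose module cannot be located."""
    missing = []
    for package, module in modules.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only confirms that the modules can be found; it does not verify the VLC Player installation or its version.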

2. API Keys

  • Create a "keys" folder in your SightLab root directory and add text files:
  • key.txt: OpenAI API key.
  • elevenlabs_key.txt: Eleven Labs API key.
  • ffmpeg_path.txt: Path to ffmpeg's bin folder (if needed).
  • anthropic_key.txt: Anthropic API key for using Anthropic models.
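
To verify that the key files are in place before running the script, something like the following sketch may help. It is illustrative only and does not reflect how SightLab itself reads the files: the function names load_key and report_keys are hypothetical, and the "keys" path assumes the script is run from the SightLab root directory.

```python
from pathlib import Path

# File names taken from the list above.
KEY_FILES = ["key.txt", "elevenlabs_key.txt", "ffmpeg_path.txt", "anthropic_key.txt"]


def load_key(keys_dir, filename):
    """Return the stripped contents of a key file, or None if absent or empty."""
    path = Path(keys_dir) / filename
    if not path.is_file():
        return None
    text = path.read_text(encoding="utf-8").strip()
    return text or None


def report_keys(keys_dir="keys"):
    """Map each expected file name to True if it exists and is non-empty."""
    return {name: load_key(keys_dir, name) is not None for name in KEY_FILES}


if __name__ == "__main__":
    for name, present in report_keys().items():
        print(f"{name}: {'found' if present else 'missing or empty'}")
```

Only the files for the services you actually use need to be present; for example, anthropic_key.txt is needed only when using Anthropic models.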

3. Configuration

  • Open AI_Agent_Config_Education.py (in the configs folder) to configure options:

  • AI_MODEL: 'CHAT_GPT' or 'CLAUDE'.
  • OPENAI_MODEL: OpenAI model name (e.g., "gpt-4").
  • ANTHROPIC_MODEL: Anthropic model name (e.g., "claude-3-opus-20240229").
  • MAX_TOKENS: Maximum number of tokens per exchange.
  • USE_SPEECH_RECOGNITION: Toggle for voice interaction.
  • SPEECH_MODEL: Choose OpenAI TTS or Eleven Labs.
  • USE_GUI: Enable or disable the GUI for environment selection.

  • Adjust avatar properties, environment settings, and more as needed.
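
As an illustration, the options above might be set as follows. This is a hypothetical excerpt, not the shipped configuration file: the model names are the examples given above, while the MAX_TOKENS value and the use of True/False for the toggles are assumptions. SPEECH_MODEL is left out because the exact strings it accepts are not listed here; check the comments in AI_Agent_Config_Education.py for the valid values.

```python
# Hypothetical excerpt of AI_Agent_Config_Education.py (illustrative values).

AI_MODEL = 'CHAT_GPT'                       # 'CHAT_GPT' or 'CLAUDE'
OPENAI_MODEL = "gpt-4"                      # used when AI_MODEL is 'CHAT_GPT'
ANTHROPIC_MODEL = "claude-3-opus-20240229"  # used when AI_MODEL is 'CLAUDE'

MAX_TOKENS = 200                # assumed value; token limit per exchange
USE_SPEECH_RECOGNITION = True   # assumed boolean; False for text-only input
USE_GUI = False                 # assumed boolean; True to pick the environment in the GUI
```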


4. Running the Script

Run AI_Agent_Education.py to start.

5. Interaction

  • Press and hold the 'c' key or the right-hand (RH) grip button to speak. Release it to stop recording and let the AI respond.
  • Use the mouse or the RH trigger to select objects and prompt information about them.
  • 3D text will appear based on head position or eye gaze.

Modifying Environment and Avatars

  • Place environment models in resources/environments or update the path in the configuration file.
  • Use the SightLab VR GUI to select which objects in the scene will be interactive.

Obtaining API Keys

  • OpenAI:
      • Visit OpenAI.
      • Sign up or log in and navigate to the API section.
      • Generate a new secret key and save it in key.txt.
  • Eleven Labs:
      • Log in at Eleven Labs.
      • Copy your API key into elevenlabs_key.txt.
  • Anthropic:
      • Go to Anthropic, sign up, and verify your account.
      • Save the API key in anthropic_key.txt.

Additional Information

  • Prompts for OpenAI (GPT) models should be enclosed in quotation marks, for example "I am..."; prompts for Anthropic models do not require quotes.
  • Refer to ElevenLabs GitHub for more information.
  • See Connecting Assistants for integrating assistants through OpenAI.

Issues and Troubleshooting

  • Microphone Settings: Errors may occur if the microphone selected for the VR headset conflicts with the system's audio device settings.
  • Character Limits: Eleven Labs' free tier limits output to 10,000 characters (paid plans offer higher limits).
  • ffmpeg/mpv Errors: Ensure ffmpeg and mpv are installed and their paths are added to Vizard's environment path.
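
For the ffmpeg/mpv item above, a short check can show whether the executables are reachable on the current PATH. This is a minimal sketch using only the Python standard library; the helper name find_tools is hypothetical. It inspects the PATH of whichever interpreter runs it, so run it from within Vizard to check Vizard's environment.

```python
import shutil


def find_tools(names=("ffmpeg", "mpv")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If a tool is reported as not found, add its folder to the environment path as described above, or record ffmpeg's bin folder in ffmpeg_path.txt.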

Tips

  • Environment Awareness: Take screenshots in SightLab (/ key), upload them to ChatGPT, and use the descriptions in prompts.
  • Custom Event Mapping: Modify vizconnect settings in settings.py under sightlab_utils/vizconnect_configs for speaking button events. See Vizconnect Events for more.
  • Assistants vs. Custom GPTs: You can connect Assistants through the OpenAI API, but not custom GPTs.