
AI Data Analysis Tools


This document provides an overview, configuration, dependencies, and usage examples for the AI-driven data analysis scripts in this toolkit.


1. AI_Data_Analysis.py

Purpose:

  • Interactively query a single CSV file from your experiment’s data tree using natural language via either OpenAI or a local Ollama model.

Key Features:

  • Choice of AI backend at launch: CHAT_GPT or OFFLINE_OLLAMA.
  • Configurable offline model (default: gemma3).
  • Automatic context generation, including:

      • First 20 rows of the dataset (adjustable via MAX_PREVIEW_ROWS). If you are parsing many files, or trial data containing many timestamps, you may need to increase this number significantly.
      • Summary statistics (DataFrame.describe()).
      • Top 10 numeric correlation pairs (adjustable via MAX_CORR_PAIRS).

  • Optional custom study-context file to guide data summaries.
  • Text-to-speech (TTS) support when using cloud GPT (Nova voice).

Configuration Constants:

OFFLINE_MODEL       = "gemma3"        # default Ollama model name
PRODUCE_SUMMARY     = True            # toggle automatic summary on startup
STUDY_CONTEXT       = """"""          # inline fallback context (optional)
STUDY_CONTEXT_FILE  = "study_context.txt"  # external notes (set "" to disable)
MAX_PREVIEW_ROWS    = 20              # rows of raw data to include in context
MAX_CORR_PAIRS      = 10              # top correlation pairs
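The preview/statistics/correlation context described above can be sketched roughly as follows with pandas. Note that `build_context` and its exact output format are illustrative assumptions, not the script's actual internals:

```python
import pandas as pd

MAX_PREVIEW_ROWS = 20
MAX_CORR_PAIRS = 10

def build_context(df: pd.DataFrame) -> str:
    """Assemble preview rows, summary stats, and top correlation pairs."""
    parts = [
        "Preview:\n" + df.head(MAX_PREVIEW_ROWS).to_string(),
        "Summary statistics:\n" + df.describe().to_string(),
    ]
    numeric = df.select_dtypes("number")
    if numeric.shape[1] >= 2:
        pairs = numeric.corr().abs().stack()
        # Keep each column pair once (drops the diagonal and mirror pairs)
        pairs = pairs[[a < b for a, b in pairs.index]]
        top = pairs.sort_values(ascending=False).head(MAX_CORR_PAIRS)
        lines = [f"{a} ~ {b}: {v:.3f}" for (a, b), v in top.items()]
        parts.append("Top correlations:\n" + "\n".join(lines))
    return "\n\n".join(parts)
```

A context string like this would typically be prepended to the user's question before it is sent to the chosen model.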

Dependencies:

  • pandas, vizinput, openai (for cloud), vlc (for audio playback)
  • Environment variable OPENAI_API_KEY for cloud usage.

Usage:

  1. Place the script within your project.
  2. Run:

AI_Data_Analysis.py

  3. Select an AI backend when prompted.
  4. Choose a CSV file from the scanned directory.
  5. If PRODUCE_SUMMARY is set to True, the script automatically produces a summary, followed by the option to ask follow-up questions. Note: the summary may take some time to generate depending on the amount of data (around 15-20 seconds, if not longer).
  6. Enter natural-language questions at the prompt; type exit to quit.


2. AI_Data_Analysis_Entire_Folder.py

Purpose:

  • Sweep the entire experiment data tree for CSV files, build compact summaries per file, and allow natural-language interrogation across the full collection.

Key Features:

  • Scans Data/.
  • Summarizes each CSV with limited preview rows and correlation pairs.
  • AI backend options: CHAT_GPT (with TTS) or OFFLINE_OLLAMA.
  • Adjustable token-budget knobs:

      • MAX_PREVIEW_ROWS (default 5)
      • MAX_CORR_PAIRS (default 5)

Configuration Constants:

MAX_PREVIEW_ROWS = 5
MAX_CORR_PAIRS   = 5
OFFLINE_MODEL    = "gemma3"
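The tree sweep and per-file summarization might look roughly like the sketch below. The function name, summary format, and root-directory argument are illustrative assumptions:

```python
from pathlib import Path

import pandas as pd

MAX_PREVIEW_ROWS = 5

def summarize_tree(root: str) -> dict:
    """Build a compact summary string for every CSV found under root."""
    summaries = {}
    for csv_path in sorted(Path(root).rglob("*.csv")):
        df = pd.read_csv(csv_path)
        summaries[str(csv_path)] = (
            f"{csv_path.name}: {len(df)} rows, columns={list(df.columns)}\n"
            + df.head(MAX_PREVIEW_ROWS).to_string()
        )
    return summaries
```

Joining all of these summaries into one prompt is why the preview and correlation limits default lower here than in the single-file script.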

Dependencies:

  • Same as AI_Data_Analysis.py.

Usage:

  1. Place script in your experiment root.
  2. Run:

AI_Data_Analysis_Entire_Folder.py

  3. Choose an AI backend when prompted.
  4. Ask questions across all CSVs; type exit to quit.


3. Auto_AI_View_Detection.py

Purpose:

  • Analyze extracted video frames using OpenAI’s Vision API (GPT-4o-mini) to detect objects of interest and compute dwell times based on frame counts.

Key Features:

  • Encodes PNG frames as Base64 and sends to the Vision API.
  • Threshold-based dwell detection (FRAMES_FOR_DWELL, default 30 frames).
  • Aggregates object mention counts to approximate attention/dwell metrics.
  • Logs analysis results to timestamped text files in Analysis_Outputs/.
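Encoding a frame for a Vision-style chat message can be sketched as below. The message shape follows OpenAI's base64 data-URL convention for image inputs, but check the current API documentation for the exact field names; the helper itself is an illustrative assumption:

```python
import base64

def frame_to_image_part(png_bytes: bytes) -> dict:
    """Wrap raw PNG bytes as a base64 data-URL image part for a chat message."""
    b64 = base64.b64encode(png_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:image/png;base64,{b64}"},
    }
```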

Configuration Constants:

FRAMES_FOR_DWELL = 30
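The threshold-based aggregation might work roughly as follows, assuming each analyzed frame yields a list of detected object names and a fixed frame rate for the seconds conversion; the function and its output shape are illustrative assumptions:

```python
from collections import Counter

FRAMES_FOR_DWELL = 30

def dwell_counts(frame_detections, fps=30.0):
    """Count frames per object; report objects seen in at least
    FRAMES_FOR_DWELL frames along with an approximate dwell time."""
    counts = Counter(obj for frame in frame_detections for obj in frame)
    return {
        obj: {"frames": n, "approx_seconds": n / fps}
        for obj, n in counts.items()
        if n >= FRAMES_FOR_DWELL
    }
```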

Dependencies:

  • openai, pandas, Python standard libraries, and environment variable OPENAI_API_KEY.

Workflow:

  1. Record session: Use SightLab’s built-in screen recording to capture your VR session.
  2. Add video: Place the recorded .mp4/.avi file into the replay_videos/ folder.
  3. Extract frames: Run Convert Video to Images.py to generate individual PNG frames in extracted_frames/.
  4. Detect views: Execute Auto_AI_View_Detection.py, which analyzes those frames and saves output to Analysis_Outputs/analysis_output_<timestamp>.txt.
  5. Summarize interactions: Run object_interactions.py on the same frame set or analysis files to compute summary metrics of viewed objects.
  6. Follow up: Use Follow_Up_Questions.py to generate deeper AI-driven follow-up questions based on the saved analysis.

Usage:

  1. After extracting frames via the workflow above, run:

Auto_AI_View_Detection.py

  2. Enter the number of frames to analyze (0 to process all).
  3. Review the generated analysis in Analysis_Outputs/.


4. Convert Video to Images.py

Purpose:

  • Extract frames from session videos saved in replay_videos/ (.mp4/.avi) into individual PNG images.

Key Features:

  • GUI prompt to select a video file from replay_videos/ using vizinput.
  • Batch extraction of frames via OpenCV.
  • Saves frames sequentially as extracted_frames/frame_####.png.

Dependencies:

  • opencv-python (cv2), vizinput.

Usage:

  1. Place your .mp4/.avi files in replay_videos/.
  2. Run in Vizard:

"Convert Video to Images.py"

  3. Select a video when prompted.
  4. Frames are saved in extracted_frames/.


5. Follow_Up_Questions.py

Purpose:

  • Generate AI-driven follow-up questions or deeper analysis based on existing analysis output texts.

Key Features:

  • Lists and selects analysis files from Analysis_Outputs/.
  • Loads optional context_prompt.txt to seed the AI prompt.
  • Uses OpenAI Chat Completion with TTS playback (optional).
  • Writes follow-up answers to Analysis_Answers/ with timestamped filenames.

Configuration Constants:

SPEAK_RESPONSE = True
OPEN_AI_TTS_MODEL = "tts-1"
OPEN_AI_VOICE = "nova"
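The timestamped-output pattern used for Analysis_Answers/ can be sketched as below; the exact filename format and helper name are assumptions, not the script's actual code:

```python
import os
from datetime import datetime

def save_answer(text: str, out_dir: str = "Analysis_Answers") -> str:
    """Write AI-generated follow-up text to a timestamped file; return its path."""
    os.makedirs(out_dir, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    path = os.path.join(out_dir, f"follow_up_{stamp}.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return path
```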

Dependencies:

  • openai, vlc, vizinput, and environment variable OPENAI_API_KEY.

Usage:

  1. Ensure Analysis_Outputs/ contains .txt files.
  2. (Optional) Create context_prompt.txt for custom AI context.
  3. Run:

Follow_Up_Questions.py

  4. Choose a file; AI-generated follow-ups appear and are saved in Analysis_Answers/.


6. object_interactions.py

Purpose:

  • Similar to Auto_AI_View_Detection.py, this script analyzes object interactions from frame sequences, detecting and counting object appearances via the Vision API.

Key Features:

  • Reads frames from extracted_frames/.
  • Uses a local key.txt for the OpenAI API key.
  • Logs dwell-based interaction counts to timestamped files in Analysis_Outputs/.

Configuration Constants:

FRAMES_FOR_DWELL = 30  # frames threshold
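Since this script reads the API key from a local key.txt while the usage steps below set the OPENAI_API_KEY environment variable, key resolution might look like the sketch below. The fallback order is an assumption:

```python
import os

def load_api_key(key_file: str = "key.txt") -> str:
    """Prefer a local key.txt; fall back to the OPENAI_API_KEY env var."""
    if os.path.exists(key_file):
        with open(key_file, "r", encoding="utf-8") as f:
            key = f.read().strip()
        if key:
            return key
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("No API key found in key.txt or OPENAI_API_KEY")
    return key
```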

Dependencies:

  • openai, pandas, and Python standard libraries.

Usage:

  1. After obtaining an OpenAI API key, open a Command Prompt (search for "cmd" in Windows), run setx OPENAI_API_KEY "your-api-key", then restart Vizard.
  2. Ensure extracted_frames/ contains frames.
  3. Run:

object_interactions.py

  4. Enter the number of frames to analyze.
  5. Check Analysis_Outputs/ for results.

