AI Data Analysis Tools

This document provides an overview, configuration, dependencies, and usage examples for the AI-driven data analysis scripts in this toolkit.
1. AI_Data_Analysis.py
Purpose:
- Interactively query a single CSV file from your experiment’s data tree using natural language via either OpenAI or a local Ollama model.
Key Features:
- Choice of AI backend at launch: `CHAT_GPT` or `OFFLINE_OLLAMA` (note: offline models may take a long time to produce the summary).
- Automatic context generation, including:
  - First 20 rows of the dataset (adjustable; if you are parsing many files, or trial data with many timestamps, you may need to increase this number significantly).
  - Summary statistics (`DataFrame.describe()`).
  - Top 10 numeric correlation pairs (adjustable).
- Custom study context file to guide data summaries.
- Text-to-speech (TTS) support when using cloud GPT (Nova voice).
Configuration Constants:

```python
OFFLINE_MODEL = "gemma3"                  # default Ollama model name
PRODUCE_SUMMARY = True                    # toggle automatic summary on startup
STUDY_CONTEXT = """"""                    # inline fallback context (optional)
STUDY_CONTEXT_FILE = "study_context.txt"  # external notes (set "" to disable)
MAX_PREVIEW_ROWS = 20                     # rows of raw data to include in context
MAX_CORR_PAIRS = 10                       # top correlation pairs
```
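The context-generation steps above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: `build_context` is a hypothetical helper name, and it assumes `pandas` and `numpy` are available.

```python
import numpy as np
import pandas as pd

MAX_PREVIEW_ROWS = 20  # rows of raw data to include in context
MAX_CORR_PAIRS = 10    # top correlation pairs

def build_context(df: pd.DataFrame) -> str:
    """Assemble the text context sent to the model: preview rows,
    summary statistics, and the strongest numeric correlations."""
    parts = [
        f"First {MAX_PREVIEW_ROWS} rows:",
        df.head(MAX_PREVIEW_ROWS).to_string(),
        "Summary statistics:",
        df.describe().to_string(),
    ]
    corr = df.corr(numeric_only=True)
    # Upper triangle only (k=1 excludes the diagonal), so each pair appears once.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna().abs().sort_values(ascending=False)
    for (a, b), r in pairs.head(MAX_CORR_PAIRS).items():
        parts.append(f"corr({a}, {b}) = {r:.3f}")
    return "\n".join(parts)
```

Keeping the preview and correlation lists small is what keeps the context within the model's token budget; raising the constants trades tokens for detail.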
Dependencies:
- `pandas`, `vizinput`, `openai` (for cloud), `vlc` (for audio playback).
- Environment variable `OPENAI_API_KEY` for cloud usage.
Usage:
1. Place the script within your project (or run the `SightLab_VR.py` script in the folder).
2. Run `AI_Data_Analysis.py`.
3. Select an AI backend when prompted.
4. Choose a CSV file from the scanned directory.
5. If `PRODUCE_SUMMARY` is set to `True`, a summary is produced automatically, followed by the option to ask follow-up questions. Note: generating the summary may take some time depending on the amount of data (at least 15-20 seconds, if not longer).
6. Enter natural-language questions at the prompt; type `exit` to quit.

2. AI_Data_Analysis_Entire_Folder.py
Purpose:
- Sweep the entire experiment data tree for CSV files, build compact summaries per file, and allow natural-language interrogation across the full collection.
Key Features:
- Scans `Data/`.
- Summarizes each CSV with limited preview rows and correlation pairs.
- AI backend options: `CHAT_GPT` (with TTS) or `OFFLINE_OLLAMA`.
- Adjustable token budget knobs:
  - `MAX_PREVIEW_ROWS` (default 5)
  - `MAX_CORR_PAIRS` (default 5)
Configuration Constants:

```python
MAX_PREVIEW_ROWS = 5
MAX_CORR_PAIRS = 5
OFFLINE_MODEL = "gemma3"
```
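The folder sweep described above can be sketched as below. This is a minimal illustration under stated assumptions: `summarize_folder` is a hypothetical name, and the exact summary format is invented for the example.

```python
from pathlib import Path

import pandas as pd

MAX_PREVIEW_ROWS = 5  # small previews keep the combined context compact

def summarize_folder(root: str = "Data") -> dict:
    """Walk the data tree, building one compact text summary per CSV so
    the full collection fits in a single model context."""
    summaries = {}
    for csv_path in sorted(Path(root).rglob("*.csv")):
        try:
            df = pd.read_csv(csv_path)
        except Exception as exc:  # skip unreadable files, keep sweeping
            summaries[str(csv_path)] = f"could not read: {exc}"
            continue
        summaries[str(csv_path)] = (
            f"{csv_path.name}: {len(df)} rows x {len(df.columns)} cols\n"
            f"{df.head(MAX_PREVIEW_ROWS).to_string()}"
        )
    return summaries
```

The per-file summaries would then be concatenated into the prompt, which is why the default preview and correlation limits are smaller here than in the single-file script.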
Dependencies:
- Same as `AI_Data_Analysis.py`.
Usage:
1. Place the script in your experiment root.
2. Run `AI_Data_Analysis_Entire_Folder.py`.
3. Choose an AI backend when prompted.
4. Ask questions across all CSVs; type `exit` to quit.

3. Auto_AI_View_Detection.py
Purpose:
- Analyze extracted video frames using OpenAI’s Vision API (GPT-4o-mini) to detect objects of interest and compute dwell times based on frame counts.
Key Features:
- Encodes PNG frames as Base64 and sends them to the Vision API.
- Threshold-based dwell detection (`FRAMES_FOR_DWELL`, default 30 frames).
- Aggregates object mention counts to approximate attention/dwell metrics.
- Logs analysis results to timestamped text files in `Analysis_Outputs/`.
Configuration Constants:

```python
FRAMES_FOR_DWELL = 30
```
Dependencies:
- `openai`, `pandas`, Python standard libraries, and environment variable `OPENAI_API_KEY`.
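The encode-then-count pipeline can be sketched roughly as follows. This is an illustration, not the script's actual code: the helper names and prompt text are invented, the Vision call follows the OpenAI Python SDK's image-input format for chat completions, and only the offline dwell-counting step is exercised here.

```python
import base64
from collections import Counter

FRAMES_FOR_DWELL = 30  # frames an object must appear in to count as a dwell

def encode_frame(path: str) -> str:
    """Base64-encode a PNG frame for the Vision API's data-URL format."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")

def ask_vision(client, b64_png: str) -> str:
    """One Vision call per frame (sketch; prompt wording is illustrative)."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": [
            {"type": "text", "text": "List the objects visible in this frame."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64_png}"}},
        ]}],
    )
    return resp.choices[0].message.content

def dwell_counts(per_frame_objects: list, fps: int = 30) -> dict:
    """Count the frames in which each object is mentioned, then convert
    totals above the threshold to approximate dwell time in seconds."""
    counts = Counter(obj for frame in per_frame_objects for obj in set(frame))
    return {obj: n / fps for obj, n in counts.items() if n >= FRAMES_FOR_DWELL}
```

Because dwell is approximated from mention counts rather than gaze data, the `fps` assumption (30 here) directly scales the reported seconds.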
Workflow:
- Record session: Use SightLab's built-in screen recording to capture your VR session.
- Add video: Place the recorded `.mp4`/`.avi` file into the `replay_videos/` folder.
- Extract frames: Run `Convert Video to Images.py` to generate individual PNG frames in `extracted_frames/`.
- Detect views: Run `Auto_AI_View_Detection.py`, which analyzes those frames and saves output to `Analysis_Outputs/analysis_output_<timestamp>.txt`.
- Summarize interactions: Run `object_interactions.py` on the same frame set or analysis files to compute summary metrics of viewed objects.
- Follow up: Use `Follow_Up_Questions.py` to generate deeper AI-driven follow-up questions based on the saved analysis.
Usage:
1. After extracting frames via the workflow above, run `Auto_AI_View_Detection.py`.
2. Enter the number of frames to analyze (`0` to process all).
3. Review the generated analysis in `Analysis_Outputs/`.

4. Convert Video to Images.py
Purpose:
- Extract frames from session videos saved in `replay_videos/` (`.mp4`/`.avi`) into individual PNG images.

Key Features:
- GUI prompt to select a video file from `replay_videos/` using `vizinput`.
- Batch extraction of frames via OpenCV.
- Saves frames sequentially as `extracted_frames/frame_####.png`.
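The extraction loop can be sketched with OpenCV roughly like this. It is a minimal illustration, not the script itself: `extract_frames` is a hypothetical name, and the four-digit zero padding is an assumption based on the `frame_####.png` pattern above.

```python
import os

def frame_name(i: int) -> str:
    """Zero-padded file name, e.g. frame_0007.png (4 digits assumed)."""
    return f"frame_{i:04d}.png"

def extract_frames(video_path: str, out_dir: str = "extracted_frames") -> int:
    """Read every frame from the video and save it as a numbered PNG.
    Returns the number of frames written."""
    import cv2  # opencv-python; imported here so frame_name stays standalone
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    i = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of video (or a read error)
            break
        cv2.imwrite(os.path.join(out_dir, frame_name(i)), frame)
        i += 1
    cap.release()
    return i
```

Note that at 30 fps even a short session produces hundreds of PNGs, which is why the downstream analysis scripts let you cap the number of frames processed.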
Dependencies:
- `opencv-python` (`cv2`), `vizinput`.

Usage:
1. Place your `.mp4`/`.avi` files in `replay_videos/`.
2. Run in Vizard: `Convert Video to Images.py`.
3. Select a video when prompted.
4. Frames are saved in `extracted_frames/`.

5. Follow_Up_Questions.py
Purpose:
- Generate AI-driven follow-up questions or deeper analysis based on existing analysis output texts.
Key Features:
- Lists and selects analysis files from `Analysis_Outputs/`.
- Loads optional `context_prompt.txt` to seed the AI prompt.
- Uses OpenAI Chat Completion with TTS playback (optional).
- Writes follow-up answers to `Analysis_Answers/` with timestamped filenames.
Configuration Constants:

```python
SPEAK_RESPONSE = True
OPEN_AI_TTS_MODEL = "tts-1"
OPEN_AI_VOICE = "nova"
```
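The file-selection, prompt-seeding, and timestamped-output steps above can be sketched as follows. Function names and the `followup_` filename prefix are illustrative assumptions, not the script's actual identifiers.

```python
from datetime import datetime
from pathlib import Path

def list_analysis_files(folder: str = "Analysis_Outputs") -> list:
    """Return the .txt analysis files the user can choose from."""
    return sorted(Path(folder).glob("*.txt"))

def build_prompt(analysis_text: str,
                 context_file: str = "context_prompt.txt") -> str:
    """Seed the AI prompt with the optional custom context, then the
    selected analysis text."""
    ctx = Path(context_file)
    context = ctx.read_text() if ctx.exists() else ""
    return (f"{context}\n\nAnalysis:\n{analysis_text}\n\n"
            "Suggest follow-up questions.")

def save_answer(text: str, folder: str = "Analysis_Answers") -> Path:
    """Write the AI's answer to a timestamped file in Analysis_Answers/."""
    out = Path(folder)
    out.mkdir(exist_ok=True)
    path = out / f"followup_{datetime.now():%Y%m%d_%H%M%S}.txt"
    path.write_text(text)
    return path
```

Timestamped filenames mean repeated runs never overwrite earlier answers, matching the behavior described above.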
Dependencies:
- `openai`, `vlc`, `vizinput`, and environment variable `OPENAI_API_KEY`.
Usage:
1. Ensure `Analysis_Outputs/` contains `.txt` files.
2. (Optional) Create `context_prompt.txt` for custom AI context.
3. Run `Follow_Up_Questions.py`.
4. Choose a file; AI-generated follow-ups appear and are saved in `Analysis_Answers/`.

6. object_interactions.py
Purpose:
- Similar to `Auto_AI_View_Detection.py`, this script analyzes object interactions from frame sequences, detecting and counting objects via the Vision API.

Key Features:
- Reads frames from `extracted_frames/`.
- Uses a local `key.txt` for the OpenAI API key.
- Logs dwell-based interaction counts to timestamped files in `Analysis_Outputs/`.
Configuration Constants:

```python
FRAMES_FOR_DWELL = 30  # frames threshold
```
Dependencies:
- `openai`, `pandas`, and Python standard libraries.
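Since this script reads a local `key.txt` while the other tools use the `OPENAI_API_KEY` environment variable, key loading might be reconciled with a small sketch like the following (`load_api_key` is a hypothetical helper, not part of the script):

```python
import os
from pathlib import Path

def load_api_key(key_file: str = "key.txt") -> str:
    """Prefer a local key.txt (as this script uses); fall back to the
    OPENAI_API_KEY environment variable used by the other tools."""
    path = Path(key_file)
    if path.exists():
        return path.read_text().strip()
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("No key.txt found and OPENAI_API_KEY is not set")
    return key
```

Keeping the key in a file or an environment variable, rather than in the source, avoids accidentally committing it alongside the scripts.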
Usage:
1. After obtaining an OpenAI API key, open a Command Prompt (search for "cmd" in Windows), run `setx OPENAI_API_KEY "your-api-key"`, then restart Vizard.
2. Ensure `extracted_frames/` contains frames.
3. Run `object_interactions.py`.
4. Enter the number of frames to analyze.
5. Check `Analysis_Outputs/` for results.