This document covers the purpose, configuration, dependencies, and usage of the AI-driven data analysis scripts in this toolkit.
1. AI_Data_Analysis.py
Purpose:
Interactively query a single CSV file from your experiment’s data tree using natural language via either OpenAI or a local Ollama model.
Key Features:
Choice of AI backend at launch: CHAT_GPT or OFFLINE_OLLAMA.
Configurable offline model (default: gemma3).
Automatic context generation including:
First 20 rows of the dataset (adjustable). When parsing many files, or trial_data files that contain many timestamps, you may need to increase this number significantly.
Summary statistics (DataFrame.describe()).
Top 10 numeric correlation pairs (adjustable).
Optional custom study context file to guide data summaries.
Text-to-speech (TTS) support when using cloud GPT (Nova voice).
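The automatic context described above (preview rows, summary statistics, top correlation pairs) could be assembled roughly as follows. This is a minimal sketch assuming pandas is installed; the function names top_correlation_pairs and build_context are hypothetical and are not taken from the script itself.

```python
import pandas as pd

MAX_PREVIEW_ROWS = 20  # mirrors the documented default
MAX_CORR_PAIRS = 10    # mirrors the documented default


def top_correlation_pairs(df, max_pairs=MAX_CORR_PAIRS):
    """Return (col_a, col_b, r) tuples, strongest absolute correlation first.

    Hypothetical helper: the real script may rank pairs differently.
    """
    corr = df.select_dtypes(include="number").corr()
    cols = list(corr.columns)
    pairs = []
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:  # upper triangle only, so each pair appears once
            r = corr.loc[a, b]
            if pd.notna(r):
                pairs.append((a, b, float(r)))
    pairs.sort(key=lambda p: abs(p[2]), reverse=True)
    return pairs[:max_pairs]


def build_context(df, max_rows=MAX_PREVIEW_ROWS, max_pairs=MAX_CORR_PAIRS):
    """Combine preview rows, describe() output, and correlations into one text block."""
    corr_lines = [
        f"{a} vs {b}: r={r:.3f}" for a, b, r in top_correlation_pairs(df, max_pairs)
    ]
    parts = [
        "Preview rows:\n" + df.head(max_rows).to_csv(index=False),
        "Summary statistics:\n" + df.describe().to_string(),
        "Top correlations:\n" + "\n".join(corr_lines),
    ]
    return "\n\n".join(parts)
```

Raising max_rows makes the prompt longer, which is the trade-off to keep in mind when increasing the preview size for timestamp-heavy files.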
Configuration Constants:
OFFLINE_MODEL = "gemma3"  # default Ollama model name
PRODUCE_SUMMARY = True  # toggle automatic summary on startup
STUDY_CONTEXT = """"""  # inline fallback context (optional)
STUDY_CONTEXT_FILE = "study_context.txt"  # external notes (set "" to disable)
MAX_PREVIEW_ROWS = 20  # rows of raw data to include in context
MAX_CORR_PAIRS = 10  # top correlation pairs
Environment variable OPENAI_API_KEY for cloud usage.
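One plausible way the study context settings and the API key check fit together is sketched below. The precedence (external file first, inline string as fallback) is an assumption inferred from the comments on the constants, and the helper names load_study_context and has_openai_key are hypothetical.

```python
import os
from pathlib import Path

STUDY_CONTEXT = """"""                    # inline fallback context (optional)
STUDY_CONTEXT_FILE = "study_context.txt"  # set "" to disable the external file


def load_study_context(file_path=STUDY_CONTEXT_FILE, fallback=STUDY_CONTEXT):
    """Return the external notes if present and non-empty, else the inline fallback.

    Assumed precedence; the actual script may combine the two differently.
    """
    if file_path:  # an empty string disables the file lookup
        path = Path(file_path)
        if path.is_file():
            text = path.read_text(encoding="utf-8").strip()
            if text:
                return text
    return fallback.strip()


def has_openai_key(env=None):
    """Report whether OPENAI_API_KEY is set to a non-empty value."""
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())
```

A check like has_openai_key() before selecting the cloud backend lets the script fail early with a clear message instead of failing on the first request.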
Usage:
1. Place the script within your project.
2. Run:
python AI_Data_Analysis.py
3. Select AI backend when prompted.
4. Choose a CSV file from the scanned directory.
5. If PRODUCE_SUMMARY is True, the script automatically produces a summary and then offers the option to ask follow-up questions. Note: producing the summary can take 15-20 seconds or longer, depending on the amount of data.
6. Enter natural-language questions at the prompt; type exit to quit.
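The question prompt in the last step can be pictured as a simple loop like the one below. This is an illustrative sketch only: question_loop is a hypothetical name, and ask stands in for whichever backend call (OpenAI or Ollama) the script actually makes.

```python
def question_loop(ask, read_line=input, write=print):
    """Forward each question to `ask` until the user types 'exit'.

    `ask` is any callable that takes a question string and returns an answer
    string; `read_line` and `write` are injectable so the loop can be tested.
    Returns the number of questions answered.
    """
    answered = 0
    while True:
        try:
            question = read_line("Ask a question (type exit to quit): ").strip()
        except EOFError:  # end of input (for example Ctrl-D) also ends the session
            break
        if not question:
            continue      # ignore blank lines
        if question.lower() == "exit":
            break
        write(ask(question))
        answered += 1
    return answered
```

Passing the backend in as a callable keeps the loop identical for CHAT_GPT and OFFLINE_OLLAMA.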
2. AI_Data_Analysis_Entire_Folder.py
Purpose:
Sweep the entire experiment data tree for CSV files, build compact summaries per file, and allow natural-language interrogation across the full collection.
Key Features:
Scans Data/.
Summarizes each CSV with limited preview rows and correlation pairs.
AI backend options: CHAT_GPT (with TTS) or OFFLINE_OLLAMA.
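The folder sweep could look something like the following, using only the standard library. The function name find_csv_files is hypothetical, and the sketch assumes the data tree is rooted at a Data/ directory relative to the working directory.

```python
from pathlib import Path


def find_csv_files(root="Data"):
    """Recursively collect CSV files under `root`, sorted for a stable order.

    Returns an empty list if the directory does not exist. The "*.csv" pattern
    may miss files named with an upper-case ".CSV" extension on
    case-sensitive file systems.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    return sorted(p for p in root_path.rglob("*.csv") if p.is_file())
```

Each returned path can then be loaded and summarized individually before the combined summaries are passed to the chosen backend.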