Welcome to Actyze
Actyze is a fully open-source, AI-native analytics platform that lets anyone ask questions across live data and act on insights — without ETL, license keys, or vendor lock-in. Licensed under AGPL v3.
Why Actyze?
Most analytics tools either lack AI capabilities (Metabase, Superset) or lock AI features behind proprietary clouds (Snowflake Cortex, Databricks Genie). Actyze is different:
- Fully open source under AGPL v3 — no feature gates, no license keys, no paywalls
- NL-to-SQL with a smart semantic layer — FAISS-powered schema matching, relationship graph that learns your JOINs, ML-based intent detection
- Predictive intelligence built in — forecast revenue, detect anomalies, classify customers, and estimate values with no-code ML pipelines (XGBoost, LightGBM, AutoGluon)
- Scheduled KPIs on autopilot — materialize gold-layer tables on a schedule, trigger ML retraining when data arrives
- Voice AI assistant — talk to your data, hands-free
- Federated querying via Trino — query any database (PostgreSQL, MySQL, Snowflake, MongoDB, and more) from a single interface
- Self-host in minutes with Docker Compose — your data never leaves your infrastructure
"Show me total revenue by region for Q4 2025"
↓
SELECT region, SUM(revenue) as total_revenue
FROM sales
WHERE quarter = 'Q4' AND year = 2025
GROUP BY region
How Actyze Compares
| Capability | Actyze | Metabase | Superset | Snowflake Cortex | Databricks Genie |
|---|---|---|---|---|---|
| Open source | AGPL v3 | AGPL v3 | Apache 2.0 | Proprietary | Proprietary |
| NL-to-SQL | 50+ languages | No | No | English only | English only |
| Voice AI assistant | Built in | No | No | No | No |
| Predictive intelligence (ML) | Built in (XGBoost, LightGBM, AutoGluon) | No | No | Snowflake ML (paid) | MLflow (separate) |
| Scheduled KPIs | Built in | No | No | Tasks (paid) | Jobs (paid) |
| Smart relationship graph | Auto-learns JOINs | No | No | No | No |
| Federated queries (Trino) | Any database | Limited | Limited | Snowflake only | Databricks only |
| LLM flexibility | 100+ providers | N/A | N/A | Snowflake LLM | Databricks LLM |
| Self-hosted | Docker + K8s | Docker + K8s | Docker + K8s | Cloud only | Cloud only |
| Feature gates / license keys | None | None | None | Per-credit pricing | Per-credit pricing |
Platform Capabilities
AI Query Engine
Natural Language to SQL — Ask questions in plain English (or 50+ other languages) and get accurate SQL. Recommended: Claude Sonnet 4.5, GPT-4o, or equivalent models.
Smart Intent Detection — ML-based classification understands whether you want a new query, a refinement, or a correction. Prevents redundant LLM calls and reduces hallucination.
Voice AI Assistant — Speak your question. Actyze converts speech to SQL and reads back results. Supports Web Speech API and OpenAI Whisper.
100+ LLM Providers — Bring your own LLM via LiteLLM — OpenAI, Anthropic, Google, Groq, Together AI, Ollama, Azure, AWS Bedrock, and more. No vendor lock-in.
Smart Semantic Layer
The semantic layer is what makes Actyze's SQL generation accurate instead of guessed.
FAISS Schema Matching — Multilingual MPNet embeddings encode your entire schema. When you ask a question, FAISS finds the most relevant tables in under 100ms.
Smart Relationship Graph — Actyze builds a living map of how your tables connect through three layers:
- Inferred — auto-detects foreign key patterns from column names (
customer_id → customers.id) - Mined — parses your successful query history to discover proven JOIN patterns
- Admin-verified — admins can create, verify, or disable relationships through the UI
When a query spans multiple tables, BFS traversal finds the optimal join path and passes it to the LLM — so JOINs are correct, not hallucinated.
Entity Recognition — spaCy NER detects people, locations, products, and dates in your query, boosting the right tables in recommendations.
Metadata & Descriptions — Add business context at catalog, schema, table, and column levels. Descriptions are embedded into FAISS vectors, improving semantic matching.
Preferred Tables — Mark frequently-used tables as preferred. The AI receives their full metadata on every query, prioritizing them over other tables.
Schema Governance — Admins can hide tables from AI recommendations, manage exclusions at database/schema/table level, and control what the LLM can see.
→ Relationship Graph Guide | → Metadata Guide | → Preferred Tables Guide
Predictive Intelligence
Build ML prediction pipelines from your data without writing code. Choose an outcome, pick your data, and Actyze handles model selection, training, and deployment.
Four prediction types:
| Type | What it does | Example | Model |
|---|---|---|---|
| Forecast | Predict future values over time | Revenue for next 30 days | AutoGluon, XGBoost |
| Classify | Predict categories | Which customers will churn | LightGBM, XGBoost |
| Estimate | Predict continuous values | Customer lifetime value | LightGBM, XGBoost |
| Detect | Find anomalies (unsupervised) | Unusual transactions | Isolation Forest |
How it works:
- Choose prediction type and data source (KPI table or custom SQL)
- Actyze validates data quality, selects the best model, and trains automatically
- Predictions land in a queryable table — ask "show churn predictions" and the AI finds them
Three ML workers — deployed as separate containers, independently scalable:
- XGBoost — tabular classification, regression, anomaly detection
- LightGBM — fast gradient boosting for large datasets (>100K rows)
- AutoGluon — multivariate time-series forecasting with ensemble models
Data quality gates block training on insufficient data and warn about class imbalance, missing values, or low sample sizes.
Training triggers — retrain after KPI collection (automatic), on a schedule, or manually.
Business-friendly accuracy — "Predictions within ±8% of actual values" instead of raw RMSE.
→ Predictive Intelligence Guide
Scheduled KPIs
Define SQL queries that run on a recurring schedule and materialize results into real, typed PostgreSQL tables — creating a reliable gold layer without ETL.
- Collection interval from 1 to 24 hours — data appends with
collected_attimestamps - Real typed columns — proper PostgreSQL types, not JSONB blobs
- AI-discoverable — tables register with FAISS automatically, queryable via natural language
- Prediction triggers — link a KPI to an ML pipeline; when fresh data arrives, the model retrains automatically
Schedule collects data → KPI table updated → ML model retrains → Fresh predictions available
Dashboards & Visualization
- Interactive dashboards with multiple tiles, each powered by a saved query
- AI-recommended charts — the LLM suggests bar, line, scatter, or pie based on your data
- Plotly.js charts with drill-down, zoom, and export
- CSV/Excel export from any query result
Multi-Database Federation
Connect to any database through Trino:
- Relational: PostgreSQL, MySQL, Oracle, SQL Server
- Cloud warehouses: Snowflake, Databricks, BigQuery, Redshift
- NoSQL: MongoDB, Cassandra, Elasticsearch
- Data lakes: Iceberg, Delta Lake, Hudi
Query across all of them in a single natural language question — no data movement required.
CSV & Excel Upload
Upload CSV or Excel files for instant analysis. Files are stored as queryable tables that can be joined with your live databases.
Security & Access Control
- Self-hosted — deploy in your infrastructure, data never leaves your network
- RBAC — four roles (ADMIN, EDITOR, VIEWER, READONLY) with granular permissions
- Data access control at catalog, schema, and table levels
- Air-gapped deployment support for compliance-heavy environments
Architecture
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ Frontend │ │ Nexus API │ │ PostgreSQL │
│ (React+nginx) │───▶│ (FastAPI) │───▶│ (Metadata DB) │
└─────────────────┘ └──────────────────┘ └─────── ──────────┘
│
┌─────────┼──────────┐
▼ ▼
┌──────────────┐ ┌──────────────────┐
│Schema Service│ │ Prediction Workers│
│ (FAISS+spaCy)│ │ (XGBoost/LightGBM │
└──────────────┘ │ /AutoGluon) │
│ └──────────────────┘
▼
┌──────────────┐
│ Trino │
│ (Optional) │
└──────────────┘
│
┌─────────┬──┴───┬──────────┐
▼ ▼ ▼ ▼
┌──────────┐┌──────┐┌──────────┐┌─────────┐
│PostgreSQL││MySQL ││Snowflake ││ MongoDB │
│ Your Data││ ││ ││ │
└──────────┘└──────┘└──────────┘└─────────┘
Components
- Frontend — React web application with voice input, dashboard builder, and data intelligence UI
- Nexus API — FastAPI orchestration service handling LLM integration, query management, KPI scheduling, prediction pipelines, and relationship graph
- PostgreSQL — stores user data, query history, metadata, KPI definitions, prediction configs, and relationship graph
- Schema Service — FAISS vector search with multilingual MPNet embeddings and spaCy NER for schema recommendations and intent detection
- Prediction Workers — XGBoost, LightGBM, and AutoGluon containers for ML training and inference (independently deployable)
- Trino — optional distributed SQL engine connecting to multiple data sources
- Your Databases — connect existing databases through Trino connectors for unified querying
Get Running in Minutes
Docker Compose (recommended for getting started)
git clone https://github.com/actyze/dashboard-docker.git
cd dashboard
cp docker/env.example docker/.env
# Add your LLM API key to docker/.env
./docker/start.sh
Access at http://localhost:3000 — no license key needed.
Kubernetes (production)
helm install dashboard ./dashboard \
--namespace actyze \
--create-namespace \
--values dashboard/values.yaml \
--values dashboard/values-secrets.yaml
LLM Provider Support
Actyze supports any LLM provider with flexible authentication:
| Provider | Best For | Response Time |
|---|---|---|
| Claude Sonnet 4.5 | Recommended — superior accuracy | 2-5s |
| GPT-4o | Recommended — excellent accuracy | 3-6s |
| Perplexity | Fast inference | 2-4s |
| Groq | Ultra-fast inference | <1s |
| Together AI | Open source models | 1-3s |
Any equivalent or higher-performance model will also provide excellent results.
See: LLM Provider Configuration
Performance
- Query generation: 1-3 seconds with external LLMs
- Schema recommendations: under 100ms (FAISS vector search)
- ML training: minutes to hours depending on data size and model
- KPI collection: runs on schedule with sub-second materialization
- Kubernetes-native: horizontal scaling for all services
Quick Start
Get started with Actyze in three steps:
- Install Actyze — choose Docker or Helm
- Configure LLM Provider — set up your AI model
- Connect Your Data Sources — configure Trino connectors
Then explore the advanced capabilities:
- Smart Relationship Graph — teach the AI how your tables connect
- Scheduled KPIs — automate gold-layer materialization
- Predictive Intelligence — build no-code ML pipelines
Use Cases
Business Intelligence
Enable business analysts to query data without SQL knowledge:
- "What were our top 10 products by revenue last quarter?"
- "Show customer churn rate by region"
Predictive Analytics
Let business users build ML models without a data science team:
- "Forecast daily revenue for the next 30 days"
- "Which customers are likely to churn?"
- "Detect anomalous transactions in payments"
Operational Analytics
Empower operations teams with real-time insights:
- "How many orders are pending shipment?"
- "Show inventory levels below reorder point"
Data Exploration
Help data scientists explore datasets faster:
- "Find all tables containing customer information"
- "Show me the schema for the sales database"
Automated KPI Monitoring
Track key metrics on autopilot:
- Schedule daily revenue, hourly active users, or periodic inventory snapshots
- Link KPIs to predictions for automatic retraining
Support & Resources
- Source Code: github.com/actyze/dashboard
- Docker Quickstart: github.com/actyze/dashboard-docker
- Helm Charts: github.com/actyze/helm-charts
- Issues: Report bugs and request features on GitHub
- License: AGPL v3 — License FAQ
Next Steps
Ready to get started? Follow our installation guide: