Features Overview
Actyze delivers enterprise-grade features for secure, real-time analytics across your live data — without ETL pipelines, vendor lock-in, or feature gates. Every feature listed below is free and unlimited under the AGPL v3 open-source license.
Core Capabilities
Natural Language to SQL
Ask questions in plain English, or in any of the other 50+ supported languages, and get accurate SQL queries back. For best accuracy, use Claude Sonnet 4.5, GPT-4o, or an equivalent high-performance model.
"Show me top 10 customers by revenue in Q4"
↓
SELECT customer_name, SUM(revenue) as total_revenue
FROM sales
WHERE quarter = 'Q4'
GROUP BY customer_name
ORDER BY total_revenue DESC
LIMIT 10
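To see what a query like this returns, it can be run against a small sample table. The sketch below is only an illustration: it uses an in-memory SQLite database as a stand-in for Trino, and the `sales` rows and the `run_example` helper are made up for the example.

```python
import sqlite3

# The SQL from the example above, unchanged apart from keyword casing.
GENERATED_SQL = """
SELECT customer_name, SUM(revenue) AS total_revenue
FROM sales
WHERE quarter = 'Q4'
GROUP BY customer_name
ORDER BY total_revenue DESC
LIMIT 10
"""

def run_example():
    """Load hypothetical sales rows and execute the generated query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (customer_name TEXT, quarter TEXT, revenue REAL)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("Acme", "Q4", 1200.0),
            ("Acme", "Q4", 300.0),
            ("Globex", "Q4", 900.0),
            ("Globex", "Q3", 5000.0),  # excluded by the Q4 filter
            ("Initech", "Q4", 100.0),
        ],
    )
    rows = conn.execute(GENERATED_SQL).fetchall()
    conn.close()
    return rows

print(run_example())
```

With these sample rows the Q3 record is filtered out, so Acme ranks first with 1500.0, followed by Globex and Initech.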
Multilingual Support
Query your data in 50+ languages using our advanced multilingual semantic search:
Supported Languages:
- European: English, German, French, Spanish, Italian, Portuguese, Dutch, Polish, Russian, Czech, Swedish, Danish, Norwegian, Finnish, Romanian, Bulgarian, Greek, Ukrainian
- Asian: Chinese (Simplified & Traditional), Japanese, Korean, Thai, Vietnamese, Indonesian, Malay, Hindi, Bengali, Tamil, Telugu, Marathi, Urdu, Nepali
- Middle Eastern: Arabic, Hebrew, Persian (Farsi), Turkish
- Others: Afrikaans, Albanian, Azerbaijani, Basque, Belarusian, Bosnian, Catalan, Croatian, Estonian, Galician, Georgian, Gujarati, Hungarian, Icelandic, Irish, Kannada, Kazakh, Kurdish, Kyrgyz, Latvian, Lithuanian, Macedonian, Malayalam, Maltese, Mongolian, Punjabi, Serbian, Sinhala, Slovak, Slovenian, Somali, Swahili, Tagalog, Tajik, Tatar, Uzbek, Welsh, Yiddish
Powered by:
- Multilingual MPNet (paraphrase-multilingual-mpnet-base-v2) - Primary semantic search engine that understands queries across all 50+ languages
- spaCy NER (English) - Lightweight entity extraction for enhanced query understanding
The heavy lifting is performed by our multilingual embedding model, enabling accurate schema matching regardless of the query language.
Intelligent Schema Detection
FAISS-powered semantic search combined with spaCy Named Entity Recognition automatically identifies the right tables and columns for your queries.
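The ranking idea behind schema matching can be sketched without any external dependencies. This is not Actyze's implementation: it swaps the embedding model for a toy bag-of-words vector and the FAISS index for a brute-force cosine search, and the table names and descriptions in `SCHEMA_DOCS` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical table descriptions. A real deployment would embed these with
# a sentence-embedding model and store the vectors in a vector index.
SCHEMA_DOCS = {
    "sales": "sales orders revenue customer quarter",
    "customers": "customer name email signup date",
    "inventory": "product stock warehouse quantity",
}

def embed(text: str) -> Counter:
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_tables(question: str, top_k: int = 2) -> list:
    """Return the top_k table names most similar to the question."""
    query_vec = embed(question)
    ranked = sorted(
        SCHEMA_DOCS,
        key=lambda table: cosine(query_vec, embed(SCHEMA_DOCS[table])),
        reverse=True,
    )
    return ranked[:top_k]

print(rank_tables("revenue by customer in quarter"))
```

A real embedding model also matches paraphrases and other languages, which word counts cannot do; the sketch only shows how a question is scored against each table and the best candidates are kept.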
Multi-Database Support
Connect to multiple data sources through Trino connectors:
- Relational: PostgreSQL, MySQL, Oracle, SQL Server
- Cloud Warehouses: Snowflake, Databricks, BigQuery, Redshift
- NoSQL: MongoDB, Cassandra, Elasticsearch
- Data Lakes: Iceberg, Delta Lake, Hudi
Key Features
Scheduled KPIs (Gold Layer)
Define SQL-based KPIs that are collected on a schedule (every 1-24 hours) and materialized as real typed Postgres tables in the kpi_data schema. Each KPI creates a first-class queryable table discoverable by the AI via FAISS indexing.
- Define any SQL query as a recurring KPI
- Collection interval from 1 to 24 hours
- Results materialized as real typed columns (not JSONB blobs)
- Auto-registered with FAISS for AI/natural language discovery
- Time-series data with collected_at timestamp for trend analysis
- Import SQL from saved queries
- RBAC enforced: ADMIN and USER can create, READONLY can view
Use cases: pre-aggregated daily revenue, hourly active users, periodic inventory snapshots — like Databricks gold tables or Power BI import mode, using PostgreSQL.
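A minimal sketch of one collection run, assuming an in-memory SQLite database in place of PostgreSQL. The `validate_interval` and `collect_kpi` helpers, the `daily_revenue` table, and the sample rows are hypothetical, and SQLite's loose typing does not model the typed Postgres columns described above.

```python
import sqlite3
from datetime import datetime, timezone

def validate_interval(hours: int) -> int:
    """Enforce the documented 1 to 24 hour collection interval."""
    if not 1 <= hours <= 24:
        raise ValueError("collection interval must be between 1 and 24 hours")
    return hours

def collect_kpi(conn, kpi_table: str, kpi_sql: str) -> int:
    """Run the KPI query once and append its rows with a collected_at stamp.

    Identifiers are interpolated directly, so kpi_table must be trusted input.
    """
    cur = conn.execute(kpi_sql)
    columns = [desc[0] for desc in cur.description]
    rows = cur.fetchall()
    col_defs = ", ".join(f'"{c}"' for c in columns)
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{kpi_table}" ({col_defs}, collected_at TEXT)'
    )
    stamp = datetime.now(timezone.utc).isoformat()
    placeholders = ", ".join("?" for _ in range(len(columns) + 1))
    conn.executemany(
        f'INSERT INTO "{kpi_table}" VALUES ({placeholders})',
        [(*row, stamp) for row in rows],
    )
    return len(rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2024-01-01", 100.0), ("2024-01-01", 50.0), ("2024-01-02", 75.0)],
)
validate_interval(24)
collected = collect_kpi(
    conn,
    "daily_revenue",
    "SELECT day, SUM(revenue) AS total_revenue FROM sales GROUP BY day",
)
```

Each later run appends another batch with a newer collected_at value, which is what makes trend queries over the KPI table possible.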
Predictive Intelligence
Create ML-powered predictions from your KPI data — no data science knowledge required. Choose what you want to predict, select your data, and the system handles model selection, training, and output automatically.
How it works:
- Choose a prediction type: Forecast (future values), Classify (group membership), Estimate (expected numbers), or Detect (anomalies)
- Select data source: an existing KPI or a custom SQL query (supports multi-table JOINs via Trino)
- System auto-validates data quality, selects the best ML model, and trains
- Predictions stored in the prediction_data schema, queryable via NL→SQL just like any other table
Models (per-image, independently deployable via Helm):
- XGBoost — tabular classification, regression, and anomaly detection via Isolation Forest (churn, fraud, CLV, scoring, outlier identification)
- LightGBM — fast tabular predictions, preferred for large datasets (>100K rows)
- AutoGluon TimeSeries — multivariate ensemble forecasting with ARIMA + ETS + Theta + tree models. Supports covariates — forecast revenue considering ad spend, promotions, and seasonality together
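One way to picture the automatic model selection is a simple routing function. The rules below are an assumption pieced together from the list above (forecasts go to AutoGluon TimeSeries, anomaly detection to Isolation Forest, large tabular datasets to LightGBM); the actual selection logic is not documented here, and `choose_model` is a hypothetical name.

```python
LARGE_DATASET_ROWS = 100_000  # threshold quoted in the Models list

def choose_model(prediction_type: str, n_rows: int) -> str:
    """Route a prediction request to a model family (assumed rules)."""
    kind = prediction_type.lower()
    if kind == "forecast":
        return "AutoGluon TimeSeries"
    if kind == "detect":
        return "Isolation Forest"
    if kind in ("classify", "estimate"):
        # Tabular tasks: prefer LightGBM once the dataset is large.
        return "LightGBM" if n_rows > LARGE_DATASET_ROWS else "XGBoost"
    raise ValueError(f"unknown prediction type: {prediction_type}")

print(choose_model("classify", 250_000))
```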
Data quality gate:
- Blocks training when data is insufficient (e.g., "Need 60+ data points for 30-day forecast")
- Warns about missing values, class imbalance, limited data, and time-aggregated data used for classification
- Business-friendly accuracy display: "Predictions within ±8% of actual values" instead of raw ML metrics
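The gate and the accuracy message can be sketched as two small functions. Both rest on assumptions: the "60+ points for a 30-day forecast" example is read here as a 2x-horizon rule, and the plain-language accuracy figure is taken to be the mean absolute percentage error (an average, not a strict bound). The function names are hypothetical.

```python
def check_forecast_data(n_points: int, horizon_days: int, min_ratio: int = 2):
    """Block training when history is shorter than min_ratio x horizon."""
    required = horizon_days * min_ratio
    if n_points < required:
        return False, f"Need {required}+ data points for {horizon_days}-day forecast"
    return True, "OK"

def friendly_accuracy(actual, predicted) -> str:
    """Render mean absolute percentage error as a plain-language sentence."""
    # Skip zero actuals, where a percentage error is undefined.
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        return "Accuracy unavailable"
    mape = sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs) * 100
    return f"Predictions within ±{round(mape)}% of actual values"

print(check_forecast_data(45, 30))
print(friendly_accuracy([100, 200], [92, 216]))
```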
Training triggers:
- After KPI collection (automatic retrain when new data arrives)
- On a schedule (daily/weekly)
- Manual (on-demand)
Post-prediction actions:
- "Explore in Queries" — full query page with CSV export, charts, and NL→SQL
- "Get Recommendations" — LLM analyzes prediction results and suggests actionable next steps
Industry use cases:
| Industry | Forecast | Classify | Estimate | Detect |
|---|---|---|---|---|
| E-commerce | Demand forecasting, revenue projections | Customer churn, fraud detection | Customer lifetime value, lead scoring | Unusual transactions, pricing anomalies |
| Retail | Inventory planning, seasonal demand | Promotion effectiveness, store clustering | Price optimization, basket value | Inventory shrinkage, POS irregularities |
| SaaS | MRR/ARR forecasting, usage trends | Churn risk, trial-to-paid conversion | Expansion revenue, support ticket priority | Usage spikes, abnormal API patterns |
| Finance | Cash flow forecasting, market trends | Credit risk, transaction fraud | Portfolio value, risk scoring | Suspicious transactions, market anomalies |
| Healthcare | Patient volume, resource planning | Readmission risk, diagnosis classification | Treatment cost, length of stay | Abnormal lab results, billing outliers |
| Manufacturing | Production demand, supply planning | Quality defect detection, equipment failure | Maintenance cost, yield optimization | Sensor anomalies, process deviations |
| Logistics | Shipment volume, delivery time | Route optimization, delay risk | Freight cost, warehouse capacity | Delivery exceptions, fleet irregularities |
CSV & Excel Upload
Upload data files directly into Actyze for instant analysis:
- CSV files with UTF-8 encoding
- Excel files (.xlsx) with single worksheet
- Automatic schema detection
- Custom column type mapping
- Temporary or permanent tables
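Automatic schema detection with optional overrides might look roughly like the following. This is a sketch under stated assumptions: the type names, the "narrowest type that fits every value" rule, and the `infer_type` and `infer_schema` helpers are illustrative, not the product's actual detection logic.

```python
import csv
import io
from typing import Optional

def infer_type(values) -> str:
    """Pick the narrowest type that fits every non-empty value."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        return "TEXT"
    for sql_type, caster in (("BIGINT", int), ("DOUBLE PRECISION", float)):
        try:
            for value in non_empty:
                caster(value)
            return sql_type
        except ValueError:
            continue
    return "TEXT"

def infer_schema(csv_text: str, overrides: Optional[dict] = None) -> dict:
    """Infer a column -> type mapping, then apply custom type overrides."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    schema = {}
    for name in reader.fieldnames or []:
        # Short rows yield None for missing cells; treat them as empty.
        schema[name] = infer_type([row[name] or "" for row in rows])
    for name, sql_type in (overrides or {}).items():
        if name in schema:
            schema[name] = sql_type
    return schema

sample = "id,price,name\n1,9.99,apple\n2,5,pear\n"
print(infer_schema(sample))
print(infer_schema(sample, overrides={"id": "TEXT"}))
```

The override step corresponds to custom column type mapping: inference gives a starting point, and the user's explicit choice wins for any column they name.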
Role-Based Access Control (RBAC)
Three-tier permission system for secure data access:
- ADMIN: Full system access, user management, schema configuration
- USER: Create dashboards, upload files, query accessible data
- READONLY: View-only access to data intelligence
Preferred Tables
Mark up to 25 tables (configurable) as preferred. The AI receives full column metadata and descriptions for your preferred tables on every query, prioritizing them over other recommendations.
Metadata Descriptions
Add business context to your data with organization-level metadata:
- Describe catalogs, schemas, tables, and columns
- Improve AI understanding of your data
- Better query recommendations
- Team-wide knowledge sharing
Semantic Relationship Graph
The relationship graph stores verified table-to-table JOIN relationships and provides them to the LLM for accurate multi-table queries:
- Convention inference: auto-detects *_id / *_fk column patterns
- Query history mining: extracts proven JOINs from past queries
- Admin curation: manual creation with confidence scoring
- BFS join path resolution: finds optimal paths across tables
- Usage tracking: relationships that get used rank higher over time
Manage via Data Intelligence → Relationships tab. Enabled by default (RELATIONSHIP_GRAPH_ENABLED=true).
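Two of these mechanisms, convention inference and BFS join path resolution, can be sketched in a few lines. The matching rule (strip the suffix, then look for a table with that name or its plural) and the sample tables are assumptions for illustration; "optimal" is interpreted here as the fewest JOIN hops.

```python
from collections import deque

def infer_relationships(tables: dict) -> list:
    """Guess (child_table, column, parent_table) links from *_id / *_fk names."""
    links = []
    for table, columns in tables.items():
        for col in columns:
            for suffix in ("_id", "_fk"):
                if not col.endswith(suffix):
                    continue
                stem = col[: -len(suffix)]
                # Try the bare stem, then a naive plural ("order" -> "orders").
                for candidate in (stem, stem + "s"):
                    if candidate in tables and candidate != table:
                        links.append((table, col, candidate))
                        break
    return links

def join_path(links, start: str, goal: str):
    """Breadth-first search for the shortest chain of joinable tables."""
    graph = {}
    for child, _col, parent in links:
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in sorted(graph.get(path[-1], ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no join path between the two tables

TABLES = {
    "orders": ["id", "customer_id", "total"],
    "customers": ["id", "name"],
    "order_items": ["id", "order_id", "product_id"],
    "products": ["id", "name"],
}
links = infer_relationships(TABLES)
print(join_path(links, "customers", "products"))
```

Because BFS explores paths in order of length, the first path that reaches the goal uses the fewest joins. Confidence scores and usage-based ranking would need weighted edges, which this sketch leaves out.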
Feature Matrix
| Feature | USER | ADMIN | READONLY |
|---|---|---|---|
| Query with natural language | ✅ | ✅ | ✅ |
| View query results | ✅ | ✅ | ✅ |
| Create/edit dashboards | ✅ | ✅ | No |
| Create scheduled KPIs | ✅ | ✅ | No |
| Create prediction pipelines | ✅ | ✅ | No |
| Upload CSV/Excel files | ✅ | ✅ | No |
| Mark preferred tables | ✅ | ✅ | No |
| Add metadata descriptions | ✅ | ✅ | No |
| Manage table relationships | No | ✅ | No |
| User management | No | ✅ | No |
| Configure data access | No | ✅ | No |
| Manage schema visibility | No | ✅ | No |
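A role check of this kind is commonly modeled as a deny-by-default lookup table. The sketch below covers only actions whose roles are stated in the text (for example, ADMIN and USER can create KPIs while READONLY can view them); the action names and the `is_allowed` helper are hypothetical and not Actyze's API.

```python
from enum import Enum

class Role(Enum):
    ADMIN = "ADMIN"
    USER = "USER"
    READONLY = "READONLY"

ALL_ROLES = frozenset(Role)
EDITORS = frozenset({Role.ADMIN, Role.USER})
ADMIN_ONLY = frozenset({Role.ADMIN})

# Hypothetical action names mapped to the roles allowed to perform them.
PERMISSIONS = {
    "view_kpi": ALL_ROLES,
    "create_kpi": EDITORS,
    "create_dashboard": EDITORS,
    "upload_file": EDITORS,
    "manage_relationships": ADMIN_ONLY,
    "manage_users": ADMIN_ONLY,
}

def is_allowed(role: Role, action: str) -> bool:
    """Deny by default: unknown actions are refused for every role."""
    return role in PERMISSIONS.get(action, frozenset())

print(is_allowed(Role.USER, "create_kpi"))
print(is_allowed(Role.READONLY, "create_kpi"))
```

Keeping the mapping in one table means a new action is denied to everyone until it is explicitly listed.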
Security Features
Data Access Control
- Granular permissions at catalog, schema, and table levels
- User-level and group-level access rules
- Automatic enforcement through Trino access control
Authentication
- JWT-based authentication
- Session management
- Secure password hashing
Audit Logging
- Query history tracking
- User action logging
- Access attempt monitoring
Performance Features
Query Optimization
- Intelligent query caching
- Automatic query rewriting
- Connection pooling
Scalability
- Horizontal pod autoscaling (Kubernetes)
- Load balancing support
- Efficient resource management
Next Steps
Explore individual feature guides:
- Smart Relationship Graph - How Actyze learns table connections for accurate JOINs
- Predictive Intelligence - No-code ML predictions (forecast, classify, estimate, detect)
- Scheduled KPIs - Automated gold-layer materialization
- Database Connectors - Connect to your data sources
- CSV Upload - Import data files
- RBAC - Configure user permissions
- Preferred Tables - Personalize AI recommendations
- Metadata - Add business context
Or learn how to configure Actyze:
- LLM Providers - Set up AI models
- Database Connectors - Configure Trino
- Environment Variables - All settings