
Features Overview

Actyze delivers enterprise-grade features for secure, real-time analytics across your live data — without ETL pipelines, vendor lock-in, or feature gates. Every feature listed below is free and unlimited under the AGPL v3 open-source license.

Core Capabilities

Natural Language to SQL

Ask questions in plain English (or 50+ other languages) and get accurate SQL queries. For best accuracy, use Claude Sonnet 4.5, GPT-4o, or an equivalent high-performance model.

"Show me top 10 customers by revenue in Q4"

SELECT customer_name, SUM(revenue) as total_revenue
FROM sales
WHERE quarter = 'Q4'
GROUP BY customer_name
ORDER BY total_revenue DESC
LIMIT 10

Multilingual Support

Query your data in 50+ languages using our advanced multilingual semantic search:

Supported Languages:

  • European: English, German, French, Spanish, Italian, Portuguese, Dutch, Polish, Russian, Czech, Swedish, Danish, Norwegian, Finnish, Romanian, Bulgarian, Greek, Ukrainian
  • Asian: Chinese (Simplified & Traditional), Japanese, Korean, Thai, Vietnamese, Indonesian, Malay, Hindi, Bengali, Tamil, Telugu, Marathi, Urdu, Nepali
  • Middle Eastern: Arabic, Hebrew, Persian (Farsi), Turkish
  • Others: Afrikaans, Albanian, Azerbaijani, Basque, Belarusian, Bosnian, Catalan, Croatian, Estonian, Galician, Georgian, Gujarati, Hungarian, Icelandic, Irish, Kannada, Kazakh, Kurdish, Kyrgyz, Latvian, Lithuanian, Macedonian, Malayalam, Maltese, Mongolian, Punjabi, Serbian, Sinhala, Slovak, Slovenian, Somali, Swahili, Tagalog, Tajik, Tatar, Uzbek, Welsh, Yiddish

Powered by:

  • Multilingual MPNet (paraphrase-multilingual-mpnet-base-v2) - Primary semantic search engine that understands queries across all 50+ languages
  • spaCy NER (English) - Lightweight entity extraction for enhanced query understanding

The heavy lifting is performed by our multilingual embedding model, enabling accurate schema matching regardless of the query language.

Intelligent Schema Detection

FAISS-powered semantic search combined with spaCy Named Entity Recognition automatically identifies the right tables and columns for your queries.
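The matching idea can be illustrated with a minimal sketch of nearest-neighbour search by cosine similarity. This is not Actyze's implementation: the table names and the three-dimensional vectors are hand-written assumptions standing in for the output of the multilingual embedding model, and a plain sorted list stands in for the FAISS index.

```python
import math

# Hypothetical, hand-written "embeddings" for three table descriptions.
# A real deployment would obtain these from the embedding model.
TABLE_VECTORS = {
    "sales.orders": [0.9, 0.1, 0.0],
    "hr.employees": [0.0, 0.2, 0.9],
    "crm.customers": [0.6, 0.7, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_tables(query_vector, k=2):
    """Return the k table names whose vectors are most similar to the query."""
    ranked = sorted(
        TABLE_VECTORS,
        key=lambda name: cosine(query_vector, TABLE_VECTORS[name]),
        reverse=True,
    )
    return ranked[:k]

# Assumed query vector for a question such as "top customers by revenue".
print(top_tables([0.8, 0.5, 0.0]))  # ['crm.customers', 'sales.orders']
```

Because the query and the schema descriptions live in the same vector space, the language of the question does not change the lookup itself, only the vector the model produces for it.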

Multi-Database Support

Connect to multiple data sources through Trino connectors:

  • Relational: PostgreSQL, MySQL, Oracle, SQL Server
  • Cloud Warehouses: Snowflake, Databricks, BigQuery, Redshift
  • NoSQL: MongoDB, Cassandra, Elasticsearch
  • Data Lakes: Iceberg, Delta Lake, Hudi

Database Connectors Guide

Key Features

Scheduled KPIs (Gold Layer)

Define SQL-based KPIs that are collected on a schedule (every 1-24 hours) and materialized as real typed Postgres tables in the kpi_data schema. Each KPI creates a first-class queryable table discoverable by the AI via FAISS indexing.

  • Define any SQL query as a recurring KPI
  • Collection interval from 1 to 24 hours
  • Results materialized as real typed columns (not JSONB blobs)
  • Auto-registered with FAISS for AI/natural language discovery
  • Time-series data with collected_at timestamp for trend analysis
  • Import SQL from saved queries
  • RBAC enforced: ADMIN and USER can create, READONLY can view

Use cases: pre-aggregated daily revenue, hourly active users, periodic inventory snapshots. The approach is comparable to Databricks gold tables or Power BI import mode, with the results stored in PostgreSQL.
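As a rough illustration of what one collection run involves, the sketch below executes a KPI query and appends the rows to a typed table together with a collected_at timestamp. It rests on several assumptions: SQLite (from the Python standard library) stands in for PostgreSQL, and the sales table, the kpi_daily_revenue table, and the collect_kpi function are hypothetical names, not part of Actyze.

```python
import sqlite3
from datetime import datetime, timezone

def collect_kpi(conn, kpi_table, kpi_sql):
    """Run kpi_sql and append its rows to kpi_table with a collected_at stamp.

    kpi_table is interpolated into the SQL text, so it must be a trusted name.
    """
    rows = conn.execute(kpi_sql).fetchall()
    collected_at = datetime.now(timezone.utc).isoformat()
    if rows:
        placeholders = ", ".join("?" for _ in range(len(rows[0]) + 1))
        conn.executemany(
            f"INSERT INTO {kpi_table} VALUES ({placeholders})",
            [row + (collected_at,) for row in rows],
        )
    conn.commit()
    return len(rows)

# Hypothetical source data and a typed target table (not a JSON blob).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EU", 100.0), ("EU", 50.0), ("US", 70.0)],
)
conn.execute(
    "CREATE TABLE kpi_daily_revenue "
    "(region TEXT, total_revenue REAL, collected_at TEXT)"
)

KPI_SQL = "SELECT region, SUM(revenue) AS total_revenue FROM sales GROUP BY region"
collect_kpi(conn, "kpi_daily_revenue", KPI_SQL)  # one scheduled run
```

Each run appends a new batch rather than overwriting the previous one, which is what makes trend analysis over collected_at possible.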

Predictive Intelligence

Create ML-powered predictions from your KPI data — no data science knowledge required. Choose what you want to predict, select your data, and the system handles model selection, training, and output automatically.

How it works:

  1. Choose a prediction type: Forecast (future values), Classify (group membership), Estimate (expected numbers), or Detect (anomalies)
  2. Select data source: an existing KPI or a custom SQL query (supports multi-table JOINs via Trino)
  3. System auto-validates data quality, selects the best ML model, and trains
  4. Predictions stored in prediction_data schema — queryable via NL→SQL just like any other table

Models (per-image, independently deployable via Helm):

  • XGBoost: tabular classification and regression (churn, fraud, CLV, scoring), with anomaly detection handled by Isolation Forest (outlier identification)
  • LightGBM: fast tabular predictions, preferred for large datasets (>100K rows)
  • AutoGluon TimeSeries: multivariate ensemble forecasting with ARIMA + ETS + Theta + tree models. Supports covariates, so a revenue forecast can account for ad spend, promotions, and seasonality together

Data quality gate:

  • Blocks training when data is insufficient (e.g., "Need 60+ data points for 30-day forecast")
  • Warns about missing values, class imbalance, limited data, and time-aggregated data used for classification
  • Business-friendly accuracy display: "Predictions within ±8% of actual values" instead of raw ML metrics
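A minimal sketch of such a gate follows. Two points are assumptions rather than documented behaviour: the rule of two data points per forecast day is inferred only from the "60+ data points for 30-day forecast" example, and the accuracy sentence is derived here from mean absolute percentage error (MAPE), which may not be the metric Actyze uses. The function names are hypothetical.

```python
def check_forecast_data(n_points, horizon_days, min_ratio=2):
    """Return (ok, message) for a forecast request.

    min_ratio=2 is an assumption inferred from the 60-points-for-30-days example.
    """
    required = horizon_days * min_ratio
    if n_points < required:
        return False, f"Need {required}+ data points for {horizon_days}-day forecast"
    return True, "OK"

def friendly_accuracy(actual, predicted):
    """Phrase mean absolute percentage error in business terms."""
    # Skip zero actuals, where a percentage error is undefined.
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted) if a != 0]
    if not errors:
        return "Accuracy unavailable"
    mape = 100 * sum(errors) / len(errors)
    return f"Predictions within ±{mape:.0f}% of actual values"

print(check_forecast_data(45, 30))
print(friendly_accuracy([100, 200], [92, 216]))
```

Note that MAPE is an average, so a sentence built this way describes typical error, not a guaranteed bound on every prediction.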

Training triggers:

  • After KPI collection (automatic retrain when new data arrives)
  • On a schedule (daily/weekly)
  • Manual (on-demand)

Post-prediction actions:

  • "Explore in Queries" — full query page with CSV export, charts, and NL→SQL
  • "Get Recommendations" — LLM analyzes prediction results and suggests actionable next steps

Industry use cases:

Industry | Forecast | Classify | Estimate | Detect
E-commerce | Demand forecasting, revenue projections | Customer churn, fraud detection | Customer lifetime value, lead scoring | Unusual transactions, pricing anomalies
Retail | Inventory planning, seasonal demand | Promotion effectiveness, store clustering | Price optimization, basket value | Inventory shrinkage, POS irregularities
SaaS | MRR/ARR forecasting, usage trends | Churn risk, trial-to-paid conversion | Expansion revenue, support ticket priority | Usage spikes, abnormal API patterns
Finance | Cash flow forecasting, market trends | Credit risk, transaction fraud | Portfolio value, risk scoring | Suspicious transactions, market anomalies
Healthcare | Patient volume, resource planning | Readmission risk, diagnosis classification | Treatment cost, length of stay | Abnormal lab results, billing outliers
Manufacturing | Production demand, supply planning | Quality defect detection, equipment failure | Maintenance cost, yield optimization | Sensor anomalies, process deviations
Logistics | Shipment volume, delivery time | Route optimization, delay risk | Freight cost, warehouse capacity | Delivery exceptions, fleet irregularities

CSV & Excel Upload

Upload data files directly into Actyze for instant analysis:

  • CSV files with UTF-8 encoding
  • Excel files (.xlsx) with single worksheet
  • Automatic schema detection
  • Custom column type mapping
  • Temporary or permanent tables
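Automatic schema detection can be pictured as trying progressively wider types on each column. The sketch below is one simple way to do that with the Python standard library; the type names (INTEGER, DOUBLE, TEXT) and the functions infer_type and detect_schema are illustrative assumptions, not the rules Actyze applies.

```python
import csv
import io

def infer_type(values):
    """Pick the narrowest type that fits every non-empty value in a column."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        return "TEXT"
    for sql_type, caster in (("INTEGER", int), ("DOUBLE", float)):
        try:
            for v in non_empty:
                caster(v)
            return sql_type
        except ValueError:
            continue  # at least one value did not parse; try the next type
    return "TEXT"

def detect_schema(csv_text):
    """Map each header name to an inferred column type."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = list(reader)
    return {name: infer_type([row[i] for row in rows]) for i, name in enumerate(header)}

sample = "id,amount,city\n1,9.5,Berlin\n2,12,Paris\n"
print(detect_schema(sample))  # {'id': 'INTEGER', 'amount': 'DOUBLE', 'city': 'TEXT'}
```

A custom column type mapping would then simply override individual entries of the returned dictionary before the table is created.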

CSV Upload Guide

Role-Based Access Control (RBAC)

Three-tier permission system for secure data access:

  • ADMIN: Full system access, user management, schema configuration
  • USER: Create dashboards, upload files, query accessible data
  • READONLY: View-only access to data intelligence
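A three-tier model like this is commonly enforced with a lookup from action to allowed roles. The sketch below assumes that shape; the action names and the can helper are hypothetical, and the permission sets are read off the role descriptions above instead of being taken from Actyze's code.

```python
from enum import Enum

class Role(Enum):
    ADMIN = "ADMIN"
    USER = "USER"
    READONLY = "READONLY"

# Hypothetical action names mapped to the roles allowed to perform them.
PERMISSIONS = {
    "view_results": {Role.ADMIN, Role.USER, Role.READONLY},
    "create_dashboard": {Role.ADMIN, Role.USER},
    "upload_file": {Role.ADMIN, Role.USER},
    "manage_users": {Role.ADMIN},
}

def can(role, action):
    """Return True if the role may perform the action; unknown actions are denied."""
    return role in PERMISSIONS.get(action, set())

print(can(Role.USER, "upload_file"))       # True
print(can(Role.READONLY, "upload_file"))   # False
```

Denying unknown actions by default means a newly added feature stays locked until someone grants it explicitly.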

RBAC Guide

Preferred Tables

Mark up to 25 tables (configurable) as preferred. The AI receives full column metadata and descriptions for your preferred tables on every query, prioritizing them over other recommendations.

Preferred Tables Guide

Metadata Descriptions

Add business context to your data with organization-level metadata:

  • Describe catalogs, schemas, tables, and columns
  • Improve AI understanding of your data
  • Better query recommendations
  • Team-wide knowledge sharing

Metadata Guide

Semantic Relationship Graph

The relationship graph stores verified table-to-table JOIN relationships and provides them to the LLM for accurate multi-table queries:

  • Convention inference — auto-detects *_id/*_fk column patterns
  • Query history mining — extracts proven JOINs from past queries
  • Admin curation — manual creation with confidence scoring
  • BFS join path resolution — finds optimal paths across tables
  • Usage tracking — relationships that get used rank higher over time

Manage via Data Intelligence → Relationships tab. Enabled by default (RELATIONSHIP_GRAPH_ENABLED=true).
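BFS join path resolution can be sketched as a breadth-first search over an undirected graph whose nodes are tables and whose edges carry JOIN conditions. The tables, conditions, and function names below are hypothetical, and the sketch only minimizes the number of joins; it does not model the confidence scores or usage-based ranking described above.

```python
from collections import deque

# Hypothetical verified relationships: (left table, right table, join condition).
RELATIONSHIPS = [
    ("orders", "customers", "orders.customer_id = customers.id"),
    ("orders", "order_items", "order_items.order_id = orders.id"),
    ("order_items", "products", "order_items.product_id = products.id"),
]

def build_graph(relationships):
    """Build an undirected adjacency list keyed by table name."""
    graph = {}
    for left, right, condition in relationships:
        graph.setdefault(left, []).append((right, condition))
        graph.setdefault(right, []).append((left, condition))
    return graph

def join_path(graph, start, goal):
    """Return the join conditions along a shortest path, or None if unreachable."""
    if start == goal:
        return []
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        table, path = queue.popleft()
        for neighbour, condition in graph.get(table, []):
            if neighbour in visited:
                continue
            new_path = path + [condition]
            if neighbour == goal:
                return new_path
            visited.add(neighbour)
            queue.append((neighbour, new_path))
    return None

graph = build_graph(RELATIONSHIPS)
for condition in join_path(graph, "customers", "products"):
    print(condition)
```

Here customers and products share no direct relationship, so the search returns the three conditions that chain them through orders and order_items.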

Feature Matrix

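Feature | USER | ADMIN | READONLY
Query with natural language | Yes | Yes | Yes
View query results | Yes | Yes | Yes
Create/edit dashboards | Yes | Yes | No
Create scheduled KPIs | Yes | Yes | No
Create prediction pipelines | Yes | Yes | No
Upload CSV/Excel files | Yes | Yes | No
Mark preferred tables | Yes | Yes | No
Add metadata descriptions | Yes | Yes | No
Manage table relationships | No | Yes | No
User management | No | Yes | No
Configure data access | No | Yes | No
Manage schema visibility | No | Yes | No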

Security Features

Data Access Control

  • Granular permissions at catalog, schema, and table levels
  • User-level and group-level access rules
  • Automatic enforcement through Trino access control

Authentication

  • JWT-based authentication
  • Session management
  • Secure password hashing

Audit Logging

  • Query history tracking
  • User action logging
  • Access attempt monitoring

Performance Features

Query Optimization

  • Intelligent query caching
  • Automatic query rewriting
  • Connection pooling

Scalability

  • Horizontal pod autoscaling (Kubernetes)
  • Load balancing support
  • Efficient resource management

Next Steps

Explore individual feature guides:

  1. Smart Relationship Graph - How Actyze learns table connections for accurate JOINs
  2. Predictive Intelligence - No-code ML predictions (forecast, classify, estimate, detect)
  3. Scheduled KPIs - Automated gold-layer materialization
  4. Database Connectors - Connect to your data sources
  5. CSV Upload - Import data files
  6. RBAC - Configure user permissions
  7. Preferred Tables - Personalize AI recommendations
  8. Metadata - Add business context

Or learn how to configure Actyze: