Contributing to the Semantic Layer

Actyze's semantic layer is what turns natural-language questions into accurate SQL. It relies on community-contributed knowledge packs that teach the system about industry-specific terminology, relationships, metrics, and query patterns.

This guide explains each contribution type and how to submit yours.

Synonym Packs

Synonym packs map business terms to database column names so the AI understands domain-specific language.

Each pack is a YAML file organised by industry:

# synonyms/ecommerce.yaml
synonyms:
  - term: "revenue"
    columns: ["total_amount", "order_total", "gross_revenue", "net_sales"]
  - term: "customer"
    columns: ["buyer", "client", "account", "purchaser"]
  - term: "SKU"
    columns: ["product_id", "item_code", "product_code"]
  - term: "churn"
    columns: ["is_churned", "churn_flag", "subscription_status"]

Guidelines:

Group synonyms by industry (e-commerce, SaaS, finance, healthcare, etc.).
Include common abbreviations and acronyms.
Keep mappings general enough to apply across different schemas.

Relationship Graph (Implemented)

The relationship graph is a PostgreSQL-backed store of table-to-table JOIN relationships that the LLM uses during SQL generation. Instead of guessing JOINs from column names, it gets verified relationship paths with confidence scores.

How It Works

Relationships are populated through three layers:

Layer	How	Confidence	Effort
Convention inference	Auto-detects `_id`, `_uuid`, `*_fk` patterns	0.4–0.6	Zero (runs in background)
Query history mining	Parses JOINs from successful past SQL	0.5–0.9	Zero (runs in background)
Admin curation	Manual entry via Data Intelligence → Relationships UI	1.0	Low (point and click)

Managing Relationships

Go to Data Intelligence → Relationships tab
Use Auto-Infer to detect relationships from column naming conventions
Use Mine History to extract JOIN patterns from past queries
Use + Add to manually create relationships
Verify (✓) correct relationships → confidence becomes 100%
Disable (🚫) incorrect ones → excluded from LLM prompts

Configuration

# Enable/disable in docker-compose.yml or Helm values
RELATIONSHIP_GRAPH_ENABLED=true   # Default: true

When disabled, falls back to column-name inference (the previous behavior).

API Endpoints

GET  /api/relationships                    # List (filterable by catalog/method)
GET  /api/relationships/graph/path?tables=a,b  # Find optimal join path (BFS)
POST /api/relationships                    # Create (admin)
POST /api/relationships/{id}/verify        # Verify → confidence=1.0 (admin)
POST /api/relationships/{id}/disable       # Soft delete (admin)
POST /api/relationships/infer              # Trigger convention inference (admin)
POST /api/relationships/mine               # Mine query history (admin)

Verified Query Templates

Verified query templates are pre-approved SQL patterns for common business questions. They serve as few-shot examples that improve accuracy for frequently asked queries.

# templates/saas_metrics.yaml
templates:
  - question: "What is the monthly recurring revenue?"
    sql: |
      SELECT
        DATE_TRUNC('month', payment_date) AS month,
        SUM(amount) AS mrr
      FROM subscriptions
      WHERE status = 'active'
      GROUP BY 1
      ORDER BY 1
    tags: ["saas", "revenue", "mrr"]

  - question: "What is the customer churn rate?"
    sql: |
      SELECT
        DATE_TRUNC('month', churned_at) AS month,
        COUNT(*) * 100.0 / LAG(COUNT(*)) OVER (ORDER BY DATE_TRUNC('month', churned_at)) AS churn_rate
      FROM customers
      WHERE churned_at IS NOT NULL
      GROUP BY 1
      ORDER BY 1
    tags: ["saas", "churn", "retention"]

Guidelines:

Use standard SQL (ANSI) where possible; note dialect-specific syntax.
Include multiple phrasings of the same question in the question field or as aliases.
Tag templates so the system can match them to the right context.

KPI Definition Packs

KPI packs define standard metrics per industry with their formulas, so the AI can compute them correctly without guessing.

# kpis/saas.yaml
kpis:
  - name: "MRR (Monthly Recurring Revenue)"
    formula: "SUM(amount) WHERE status = 'active' GROUP BY month"
    description: "Total recurring revenue normalised to a monthly period."

  - name: "CAC (Customer Acquisition Cost)"
    formula: "SUM(marketing_spend + sales_spend) / COUNT(new_customers)"
    description: "Average cost to acquire a new customer."

  - name: "LTV (Lifetime Value)"
    formula: "AVG(revenue_per_customer) * AVG(customer_lifespan_months)"
    description: "Predicted total revenue from a customer over their lifetime."

# kpis/ecommerce.yaml
kpis:
  - name: "AOV (Average Order Value)"
    formula: "SUM(order_total) / COUNT(DISTINCT order_id)"
    description: "Average revenue per order."

  - name: "Cart Abandonment Rate"
    formula: "(COUNT(carts_created) - COUNT(orders_completed)) / COUNT(carts_created) * 100"
    description: "Percentage of shopping carts that do not convert to orders."

Available industry packs: SaaS, E-commerce, Finance, Healthcare.

How to Test Your Contributions

Fork the repository and create a branch for your contribution.

Validate YAML syntax:

pip install yamllint
yamllint your-file.yaml

Run the test suite to ensure your additions do not break existing behaviour:
```
make test-semantic
```
Test with sample queries — connect a database with a matching schema and verify that your synonyms, heuristics, or templates improve query accuracy.
Submit a pull request with:
- A clear description of the industry or use case your contribution covers.
- Example questions that are now answered correctly thanks to your changes.
- Any edge cases or limitations you are aware of.

File Organisation

semantic/
├── synonyms/
│   ├── ecommerce.yaml
│   ├── saas.yaml
│   ├── finance.yaml
│   └── healthcare.yaml
├── relationships/
│   └── naming_conventions.yaml
├── templates/
│   ├── saas_metrics.yaml
│   ├── ecommerce_analytics.yaml
│   └── finance_reporting.yaml
└── kpis/
    ├── saas.yaml
    ├── ecommerce.yaml
    ├── finance.yaml
    └── healthcare.yaml

Synonym Packs​

Relationship Graph (Implemented)​

How It Works​

Managing Relationships​

Configuration​

API Endpoints​

Verified Query Templates​

KPI Definition Packs​

How to Test Your Contributions​

File Organisation​