Smart Relationship Graph

The relationship graph is Actyze's semantic layer for table connections. It stores verified JOIN relationships between tables and provides them to the LLM at query time, so AI-generated SQL uses correct JOINs instead of guessing.

Why It Matters

Without relationship context, LLMs frequently hallucinate JOIN conditions — guessing column names, inventing foreign keys, or joining on the wrong columns. The relationship graph solves this by giving the LLM a map of how your tables actually connect.

Before (without graph):

-- LLM guesses the join condition
SELECT * FROM orders JOIN customers ON orders.cust = customers.customer_id
-- Wrong: the actual column is orders.customer_id = customers.id

After (with graph):

-- LLM uses the verified relationship
SELECT * FROM orders JOIN customers ON orders.customer_id = customers.id
-- Correct: relationship graph provided the exact join condition

Three-Layer Discovery

Relationships are discovered and managed through three layers, each with increasing confidence:

Layer 1: Inferred (Automatic)

On first schema load, Actyze scans column names for foreign-key naming patterns:

Pattern	Example	Inferred Relationship
`*_id` suffix	`orders.customer_id`	`orders.customer_id → customers.id`
`*_fk` suffix	`invoices.order_fk`	`invoices.order_fk → orders.id`
Exact table name match	`orders.product`	`orders.product → products.id`

Confidence: 0.4–0.6 (medium — naming conventions aren't always correct)

Layer 2: Mined from Query History

Actyze parses every successful query using sqlglot to extract JOIN patterns:

-- This successful query:
SELECT o.id, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.total > 100

-- Produces this mined relationship:
-- orders.customer_id → customers.id (confidence: 0.55)

Confidence grows with frequency: min(0.9, 0.5 + 0.05 × occurrence_count)

After 8 successful queries using the same JOIN, confidence reaches 0.9.

Key behavior:

Mining is triggered manually via the UI or API (POST /api/relationships/mine)
Only successful queries are mined (failed queries are ignored)
Duplicate JOINs increase confidence rather than creating new entries

Layer 3: Admin-Verified

Admins can create, verify, or disable relationships through the Data Intelligence → Relationships tab:

Create — add a relationship with a custom join condition
Verify — mark an inferred or mined relationship as verified (highest confidence)
Disable — disable a problematic relationship without deleting it
Bulk import — upload relationships from CSV

Confidence: verified relationships get the highest priority regardless of score.

How the Graph Powers Queries

When a user asks a natural language question that spans multiple tables, Actyze uses the relationship graph in a three-tier fallback:

Tier 1: Direct Relationships

If the tables are directly connected, the exact join condition and relationship type are passed to the LLM:

"Show me orders with customer names"
→ Graph provides: orders.customer_id → customers.id (1:N, confidence: 0.85)
→ LLM generates: SELECT o.*, c.name FROM orders o JOIN customers c ON o.customer_id = c.id

Tier 2: Multi-Hop Path Finding (BFS)

If the tables aren't directly connected, BFS traversal finds the shortest path:

"Show orders with country names"
→ No direct relationship between orders and nations
→ BFS finds: orders → customers → nations
→ Graph provides both join conditions:
   orders.customer_id → customers.id
   customers.nation_id → nations.id
→ LLM generates the correct two-JOIN query

Tier 3: Schema Fallback

If no relationship path exists, the LLM falls back to full schema context and attempts to generate JOINs from column names alone. When this query succeeds, the JOIN pattern is captured for future mining.

New tables with no relationships yet
→ LLM uses full schema context
→ Query succeeds
→ Next mining run captures the JOIN pattern
→ Future queries use the mined relationship

This creates a self-improving loop: the more you query, the smarter the graph gets.

Relationship Metadata

Each relationship stores:

Field	Description
`source_table`	The table with the foreign key
`target_table`	The referenced table
`join_condition`	Exact SQL condition (e.g., `source.customer_id = target.id`)
`relationship_type`	`1:1`, `1:N`, `N:1`, or `M:N`
`source_method`	How discovered: `inferred`, `mined`, or `admin`
`confidence`	Score from 0.0 to 1.0
`is_verified`	Whether an admin has verified this relationship
`is_disabled`	Whether this relationship is excluded from LLM context
`usage_count`	How many times used in AI-generated queries
`last_used_at`	Timestamp of most recent use

Admin UI

Access via Data Intelligence → Relationships tab.

Filtering

Filter by source method (inferred / mined / admin)
Filter by table name
Filter by confidence threshold
Filter by verified / unverified status

Actions

Verify — mark a relationship as admin-verified
Disable — exclude from LLM context without deleting
Edit — modify join condition or relationship type
Delete — permanently remove a relationship
Bulk Import — upload from CSV with columns: source_catalog, source_schema, source_table, target_catalog, target_schema, target_table, join_condition, relationship_type

Usage Insights

The table shows usage_count and last_used_at for each relationship, so you can see which connections the AI relies on most.

API Endpoints

List Relationships

GET /api/relationships
GET /api/relationships?source_method=mined&min_confidence=0.7

Find Join Path

GET /api/relationships/graph/path?tables=orders,nations
# Returns: orders → customers → nations with join conditions

Trigger Mining

POST /api/relationships/mine
# Parses query history and creates/updates relationships

Create Relationship

POST /api/relationships
Content-Type: application/json

{
  "source_catalog": "postgresql",
  "source_schema": "public",
  "source_table": "orders",
  "target_catalog": "postgresql",
  "target_schema": "public",
  "target_table": "customers",
  "join_condition": "orders.customer_id = customers.id",
  "relationship_type": "N:1"
}

Configuration

Variable	Default	Description
`RELATIONSHIP_GRAPH_ENABLED`	`true`	Include relationship context in LLM prompts

When disabled, the LLM falls back to schema-only context (Tier 3 behavior).

Best Practices

Getting Started

Enable the graph (on by default) — inferred relationships are created on first schema load
Run mining after you have some query history — click "Mine Query History" in the Relationships tab
Verify key relationships — verify the most important joins so they always rank highest

Ongoing Maintenance

Re-mine periodically — as users write new queries, new JOIN patterns emerge
Disable bad relationships — if the AI generates wrong JOINs, check for incorrect relationships and disable them
Review low-confidence entries — relationships with confidence < 0.5 may be unreliable; verify or disable them

For Large Schemas

Focus on high-traffic tables — verify relationships for tables that appear in the most queries
Use bulk import — if you have ERD documentation or a data dictionary, export it as CSV and import
Monitor usage tracking — relationships with zero usage after weeks may be candidates for cleanup

Why It Matters​

Three-Layer Discovery​

Layer 1: Inferred (Automatic)​

Layer 2: Mined from Query History​

Layer 3: Admin-Verified​

How the Graph Powers Queries​

Tier 1: Direct Relationships​

Tier 2: Multi-Hop Path Finding (BFS)​

Tier 3: Schema Fallback​

Relationship Metadata​

Admin UI​

Filtering​

Actions​

Usage Insights​

API Endpoints​

List Relationships​

Find Join Path​

Trigger Mining​

Create Relationship​

Configuration​

Best Practices​

Getting Started​

Ongoing Maintenance​

For Large Schemas​