DQ Suite

The full data-quality analytics layer that wraps DQX, expectations, and ad-hoc rules — plus AI-driven authoring and intelligent alert routing.

Overview

The DQ Suite is the umbrella for everything beyond "did this single check pass?". It answers higher-order questions:

  • How trustworthy is this table overall? → Trust Score
  • Which tables aren't even monitored? → Coverage Map
  • What does poor quality cost us in dollars? → COPQ (FinOps guide)
  • Are these anomalies related to the same root cause? → Correlation Engine
  • Can I describe a rule in plain English? → NL Rule Builder
  • How do I avoid alert fatigue? → Alert Routing & Digest

Each is a first-class page under the Data Quality portal, and all share the same underlying DQ event stream from api/routers/data_quality.py.

Trust Scores

Source: src/trust_score.py · /api/trust-scores · UI /data-quality/trust-scores.

A composite 0–100 score per table from six dimensions:

| Dimension | Default weight | Source |
|---|---|---|
| DQ pass rate | 0.30 | governance.dq_results over last 7d |
| Freshness | 0.20 | latest lastModificationTime vs SLA |
| Anomaly history | 0.15 | data_quality.anomalies count over 7d |
| PII coverage | 0.15 | pii_scans × tagged columns |
| Schema stability | 0.10 | governance.schema_changes over 30d |
| Lineage completeness | 0.10 | upstream/downstream nodes resolved |

Weights are configurable

curl -X PUT "$CLXS_HOST/api/trust-scores/config" \
  -H "Content-Type: application/json" \
  -d '{"weights": {"dq": 0.4, "freshness": 0.25, "anomaly": 0.10, "pii": 0.10, "schema": 0.10, "lineage": 0.05}}'

Weights must sum to 1.0 — the API validates and rejects otherwise.
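
As a rough illustration of how the weights combine, the sketch below assumes the composite is a plain weighted sum of six per-dimension scores that have already been normalized to 0–100. The function names, the normalization, and the float tolerance are assumptions made for illustration; they are not taken from src/trust_score.py.

```python
import math

# Default weights from the table above; keys follow the config payload.
DEFAULT_WEIGHTS = {
    "dq": 0.30, "freshness": 0.20, "anomaly": 0.15,
    "pii": 0.15, "schema": 0.10, "lineage": 0.10,
}


def validate_weights(weights: dict[str, float]) -> None:
    """Raise ValueError unless the weights are non-negative and sum to 1.0."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    # The tolerance is an assumption: float weights rarely sum to exactly 1.0.
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"weights sum to {total}, expected 1.0")


def trust_score(dimension_scores: dict[str, float],
                weights: dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of per-dimension scores, each assumed to be in 0-100."""
    validate_weights(weights)
    return sum(weights[name] * dimension_scores[name] for name in weights)


# Illustrative per-dimension scores for one table (made-up values).
scores = {"dq": 90, "freshness": 100, "anomaly": 80,
          "pii": 50, "schema": 100, "lineage": 70}
print(round(trust_score(scores), 1))
```

Under this reading, a low score on a lightly weighted dimension (here, lineage) moves the composite far less than the same drop in DQ pass rate would.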

Compute & view

curl -X POST $CLXS_HOST/api/trust-scores/compute/{catalog}
curl $CLXS_HOST/api/trust-scores/scores/{catalog}

A trend view tracks score history per table — useful for spotting degradations: GET /api/trust-scores/trend?table_fqn=cat.sch.tbl&days=30.

Coverage Map

Source: src/coverage_map.py · /api/coverage · UI /data-quality/coverage.

The Coverage Map cross-references information_schema.tables against:

  • DQ rules in governance.dq_rules
  • DQX checks in governance.dqx_checks
  • SLA monitors in governance.sla_rules
  • PII scans in governance.pii_scans
  • Profiling results in data_quality.profiles
  • ODCS contracts in governance.odcs_contracts

…and computes a per-table coverage percentage: how many of those six monitoring layers cover the table.
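
The percentage can be pictured with a small sketch. It assumes each layer counts equally and that a table is either covered by a layer or not; the short layer labels and the bucket names are illustrative, not the identifiers used in src/coverage_map.py.

```python
# The six monitoring layers listed above (labels are illustrative).
LAYERS = ("dq_rules", "dqx_checks", "sla_rules",
          "pii_scans", "profiles", "odcs_contracts")


def coverage_pct(covered_layers: set[str]) -> float:
    """Share of the six layers that cover a table, as a percentage."""
    hits = sum(1 for layer in LAYERS if layer in covered_layers)
    return 100.0 * hits / len(LAYERS)


def classify(pct: float) -> str:
    """Bucket a table by coverage; the bucket boundaries are an assumption."""
    if pct == 0:
        return "unmonitored"
    return "covered" if pct == 100 else "partial"


# A table with DQ rules, a PII scan, and a profile: 3 of 6 layers.
pct = coverage_pct({"dq_rules", "pii_scans", "profiles"})
print(pct, classify(pct))
```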

GET /api/coverage/{catalog}/summary returns aggregates (covered_pct, unmonitored_count, partial_count). GET /api/coverage/{catalog} returns a per-table breakdown so the UI can render a heatmap.

The dashboard sorts by table size × access frequency so that the most expensive unmonitored tables float to the top. In practice this often surfaces one or two business-critical fact tables that no one had attached a rule to.
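
The ordering can be sketched as a sort on the product of the two factors. The field names and units (bytes, reads per day) are assumptions for illustration; the source does not say how size or access frequency is measured.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    fqn: str
    size_bytes: int        # assumed unit for "table size"
    reads_per_day: int     # assumed measure of "access frequency"
    coverage_pct: float


def priority(table: TableStats) -> int:
    """Size × access frequency; larger means more urgent to monitor."""
    return table.size_bytes * table.reads_per_day


def rank_unmonitored(tables: list[TableStats]) -> list[TableStats]:
    """Unmonitored tables, highest priority first."""
    unmonitored = [t for t in tables if t.coverage_pct == 0]
    return sorted(unmonitored, key=priority, reverse=True)


# Made-up tables for illustration.
tables = [
    TableStats("cat.sch.small_hot", 1_000, 500, 0.0),
    TableStats("cat.sch.big_fact", 9_000_000, 200, 0.0),
    TableStats("cat.sch.monitored", 5_000_000, 900, 66.7),
]
print([t.fqn for t in rank_unmonitored(tables)])
```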

COPQ — Cost of Poor Data Quality

See the FinOps guide for the full spec; in short, COPQ takes DQ failure events and converts them into dollars (rerun cost + SLA penalty + engineer triage time + downstream multiplier). Use it to make the business case for remediation playbooks.
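
The authoritative formula is in the FinOps guide. Purely as an illustration of the four ingredients named above, the sketch below assumes the three cost terms are summed and the total is then scaled by the downstream multiplier; the field names, the hourly rate, and that way of applying the multiplier are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class FailureEvent:
    rerun_cost_usd: float
    sla_penalty_usd: float
    triage_hours: float
    downstream_multiplier: float = 1.0  # 1.0 means no downstream impact


def copq_usd(event: FailureEvent, hourly_rate_usd: float = 100.0) -> float:
    """Assumed model: (rerun + SLA penalty + triage time) × multiplier."""
    direct = (event.rerun_cost_usd
              + event.sla_penalty_usd
              + event.triage_hours * hourly_rate_usd)
    return direct * event.downstream_multiplier


# Made-up failure: $40 rerun, $500 penalty, 2 hours of triage, 1.5x multiplier.
event = FailureEvent(rerun_cost_usd=40, sla_penalty_usd=500,
                     triage_hours=2, downstream_multiplier=1.5)
print(copq_usd(event))
```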

Anomaly Correlation

Source: src/anomaly_correlation.py · /api/anomaly-correlations · UI /data-quality/correlations.

When dozens of anomalies fire in the same window, you usually have one root cause. The correlation engine groups anomalies by:

  • Time proximity: anomalies within ±5 minutes of each other become candidates for the same group
  • Lineage proximity: candidate anomalies on tables connected via lineage are merged into the same group
  • Statistical similarity: anomalies with similar magnitude and direction stay grouped

Each group surfaces a probable root cause table (the upstream-most table in the group) so the responder triages one anomaly, not 30.
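
A minimal sketch of the first two grouping steps and the root-cause pick follows. It assumes lineage is available as an upstream-to-downstream adjacency map, uses a simple greedy pass rather than whatever src/anomaly_correlation.py actually does, and leaves out the statistical-similarity step. All names and the sample tables are illustrative.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)  # the ±5 minute window described above


@dataclass(frozen=True)
class Anomaly:
    table: str
    at: datetime


def descendants(table: str, children: dict[str, list[str]]) -> set[str]:
    """All tables downstream of `table` in the lineage graph."""
    seen: set[str] = set()
    stack = [table]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def related(a: str, b: str, children: dict[str, list[str]]) -> bool:
    """True when the tables are the same or connected via lineage."""
    return a == b or b in descendants(a, children) or a in descendants(b, children)


def correlate(anomalies: list[Anomaly],
              children: dict[str, list[str]]) -> list[list[Anomaly]]:
    """Greedy pass: join the first group holding a close, related anomaly."""
    groups: list[list[Anomaly]] = []
    for anomaly in sorted(anomalies, key=lambda a: a.at):
        for group in groups:
            if any(abs(anomaly.at - m.at) <= WINDOW
                   and related(anomaly.table, m.table, children)
                   for m in group):
                group.append(anomaly)
                break
        else:
            groups.append([anomaly])
    return groups


def root_cause(group: list[Anomaly], children: dict[str, list[str]]) -> str:
    """Upstream-most table: the one with the most group tables downstream."""
    tables = sorted({a.table for a in group})
    return max(tables, key=lambda t: len(descendants(t, children) & set(tables)))


t0 = datetime(2024, 1, 1, 9, 0)
lineage = {"raw.orders": ["silver.orders"], "silver.orders": ["gold.revenue"]}
events = [
    Anomaly("gold.revenue", t0 + timedelta(minutes=4)),
    Anomaly("raw.orders", t0),
    Anomaly("silver.orders", t0 + timedelta(minutes=2)),
    Anomaly("crm.contacts", t0 + timedelta(minutes=3)),  # unrelated table
]
groups = correlate(events, lineage)
print([[a.table for a in g] for g in groups])
print(root_cause(groups[0], lineage))
```

The greedy pass never merges two existing groups, so a production implementation would more likely use a union-find or graph-component approach; the sketch is only meant to show how time and lineage jointly decide membership.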

GET /api/anomaly-correlations/groups lists current groups with the implicated upstream + downstream impact set; GET /api/anomaly-correlations/root-causes ranks them by downstream blast radius.

To force a fresh correlation run (rather than waiting for the scheduled hourly sweep):

curl -X POST $CLXS_HOST/api/anomaly-correlations/correlate

NL Rule Builder

Source: src/nl_rule_builder.py · /api/nl-rules · UI /governance/nl-rules.

Authors describe a rule in plain English. The AI backend (the Anthropic API or Databricks Model Serving, selected via the X-Databricks-Model header or the DATABRICKS_MODEL / ANTHROPIC_API_KEY env vars) returns a structured DQ rule config plus a confidence score and a one-line explanation.

curl -X POST "$CLXS_HOST/api/nl-rules/from-natural-language" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "order_total must always be positive and less than 100000",
    "table_fqn": "prod.ecommerce.orders"
  }'

Returns:

{
  "rule": {
    "name": "order_total_range",
    "rule_type": "range",
    "column": "order_total",
    "params": {"min": 0, "max": 100000, "exclusive_min": true},
    "severity": "warning"
  },
  "confidence": 0.92,
  "explanation": "Range rule on order_total — 'positive' interpreted as exclusive_min=true."
}
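
To make the returned params concrete, here is a sketch of how a range rule like this might be evaluated against a value. The evaluation semantics are an assumption, and exclusive_max is a hypothetical counterpart to exclusive_min that does not appear in the sample response.

```python
def check_range(value: float, params: dict) -> bool:
    """Return True when `value` satisfies the range params (assumed semantics)."""
    lo, hi = params.get("min"), params.get("max")
    if lo is not None:
        if value < lo or (params.get("exclusive_min") and value == lo):
            return False
    if hi is not None:
        # exclusive_max is hypothetical; the sample response does not set it.
        if value > hi or (params.get("exclusive_max") and value == hi):
            return False
    return True


# Params copied from the sample response above.
params = {"min": 0, "max": 100000, "exclusive_min": True}
for value in (0, 0.01, 100000, 100001):
    print(value, check_range(value, params))
```

Under these assumed semantics, 100000 itself passes, although the prompt said "less than 100000". That kind of boundary detail is worth checking when reviewing a generated rule before saving it.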

Three operating modes worth knowing:

  • Single rule — one NL → one rule. Save with POST /api/governance/dq/rules.
  • Batch — paste a markdown bullet list, get one rule per bullet, review & save in bulk.
  • Explain — paste an existing rule JSON, get a plain-English description (great for code review).
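
For Batch mode, the pasted list has to be split into one description per bullet at some point. The sketch below shows one way that split could look on the client side; the accepted bullet markers and the function name are assumptions, and the source does not say where the real split happens.

```python
import re

# Matches "-", "*", "+", "•" or "1." / "1)" style bullets (assumed marker set).
BULLET = re.compile(r"^\s*(?:[-*+•]|\d+[.)])\s+(.*\S)\s*$")


def split_bullets(markdown: str) -> list[str]:
    """Return the text of each bullet line, ignoring non-bullet lines."""
    return [m.group(1) for line in markdown.splitlines()
            if (m := BULLET.match(line))]


pasted = """\
- order_total must be positive
* customer_id is never null

Some free text that is not a rule.
1. order_id is unique
"""
for text in split_bullets(pasted):
    print(text)
```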

Alert Routing

Source: src/alert_routing.py · /api/alerts · UI /data-quality/alert-routing.

The DQ event stream produces too much noise to paste verbatim into Slack. The Alert Routing layer adds:

  • Inbox — every event lands here first; humans triage critical/warning/info; one-click acknowledge / resolve / snooze
  • Routing rules — pattern → team + channel (e.g. prod.fraud.* + critical → #fraud-oncall (PagerDuty))
  • Digests — daily / weekly digest emails per recipient that aggregate non-critical events
  • Analytics — alert volume, MTTA, MTTR, escalation rate per team
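
As a sketch of the Analytics metrics, MTTA (mean time to acknowledge) and MTTR (mean time to resolve) can be derived from per-alert timestamps. The field names are hypothetical, and measuring both from alert creation is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Iterable, Optional


@dataclass
class Alert:
    created_at: datetime
    acknowledged_at: Optional[datetime] = None
    resolved_at: Optional[datetime] = None


def mean_minutes(pairs: Iterable[tuple[datetime, Optional[datetime]]]) -> Optional[float]:
    """Mean elapsed minutes over pairs whose end time is set, else None."""
    durations = [(end - start).total_seconds() / 60
                 for start, end in pairs if end is not None]
    return mean(durations) if durations else None


def mtta(alerts: list[Alert]) -> Optional[float]:
    return mean_minutes((a.created_at, a.acknowledged_at) for a in alerts)


def mttr(alerts: list[Alert]) -> Optional[float]:
    return mean_minutes((a.created_at, a.resolved_at) for a in alerts)


# Made-up alerts: the second one is acknowledged but not yet resolved.
alerts = [
    Alert(datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 5),
          datetime(2024, 1, 1, 9, 45)),
    Alert(datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 15)),
]
print(mtta(alerts), mttr(alerts))
```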

A typical routing rule

table_pattern: "prod.fraud.*"
severity_filter: critical
route_to_team: fraud-oncall
channel: pagerduty:fraud
enabled: true

Create the rule with POST /api/alerts/routing-rules, sending the YAML above. Human actions on an individual alert use POST /api/alerts/{id}/acknowledge, POST /api/alerts/{id}/resolve, and POST /api/alerts/{id}/snooze.
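
How a rule like this might be matched against an incoming event can be sketched as follows. Shell-style glob matching on table_pattern and first-match-wins ordering are assumptions; the source does not specify the pattern syntax or how overlapping rules are resolved.

```python
from fnmatch import fnmatchcase
from typing import Optional

# The routing rule shown above, as a Python dict.
RULE = {
    "table_pattern": "prod.fraud.*",
    "severity_filter": "critical",
    "route_to_team": "fraud-oncall",
    "channel": "pagerduty:fraud",
    "enabled": True,
}


def matches(rule: dict, table_fqn: str, severity: str) -> bool:
    """True when an enabled rule matches both the table and the severity."""
    return bool(rule["enabled"]
                and rule["severity_filter"] == severity
                and fnmatchcase(table_fqn, rule["table_pattern"]))


def route(rules: list[dict], table_fqn: str, severity: str) -> Optional[str]:
    """Channel of the first matching rule (assumed first-match-wins), else None."""
    for rule in rules:
        if matches(rule, table_fqn, severity):
            return rule["channel"]
    return None


print(route([RULE], "prod.fraud.transactions", "critical"))
print(route([RULE], "prod.fraud.transactions", "warning"))
```

Events that match no rule return None here; in the layer described above they would presumably stay in the Inbox or roll into a digest.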

Expectation Suites

Source: src/expectation_suites.py · /api/data-quality/suites · UI /data-quality/expectations.

Group multiple DQ rules + DQX checks into a named suite, then run the whole thing end-to-end:

name: orders_bronze_landing
description: Checks every Bronze→Silver hand-off in ecommerce/orders
checks:
  - type: dq_rule
    rule_id: not_null_order_id
  - type: dq_rule
    rule_id: order_total_range
  - type: dqx_check
    check_id: orders_referential_integrity

POST /api/data-quality/suites/{id}/run executes the suite and returns a per-check pass/fail array.
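
A caller will usually want to reduce that array to a single verdict. The sketch below assumes each entry carries an id and a boolean passed field; the real response schema is not shown in this document, so both field names are hypothetical.

```python
def summarize(results: list[dict]) -> dict:
    """Collapse a per-check array into one verdict plus the failing ids."""
    failed = [r["id"] for r in results if not r["passed"]]
    return {"passed": not failed, "total": len(results), "failed": failed}


# Hypothetical response shape for the orders_bronze_landing suite.
results = [
    {"id": "not_null_order_id", "passed": True},
    {"id": "order_total_range", "passed": False},
    {"id": "orders_referential_integrity", "passed": True},
]
print(summarize(results))
```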