Web UI

Clone-Xs includes a full web interface with 60+ pages across 8 portals for managing Unity Catalog operations. Start it with:

make web-start

The UI runs at http://localhost:3001 and connects to the API server, which listens at http://localhost:8000 by default.
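If the UI loads but shows a red connection dot, it can help to probe the API server directly. The sketch below calls the /api/health endpoint that the connection status indicator polls. It assumes a healthy server answers with HTTP 200, which this page does not state, and the function name probe_health is purely illustrative.

```python
import urllib.error
import urllib.request

API_BASE = "http://localhost:8000"  # default API address from this page


def probe_health(base_url=API_BASE, timeout=3.0, opener=urllib.request.urlopen):
    """Return "green" if /api/health answers with HTTP 200, else "red".

    Assumption: a healthy server replies 200; the response body is ignored.
    `opener` is injectable so the function can be exercised without a server.
    """
    url = base_url.rstrip("/") + "/api/health"
    try:
        with opener(url, timeout=timeout) as response:
            return "green" if response.status == 200 else "red"
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return "red"
```

Calling probe_health() with no arguments checks the default address; pass a different base_url if the API server runs elsewhere.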

Features

  • Command palette search — search any page by name or keyword (e.g., type "terraform" to jump to Generate, or "pii" to jump to PII Scanner)
  • 10 built-in themes — Light, Dark, Midnight, Sunset, High Contrast, Ocean, Forest, Solarized, Rose, and Slate; pick from a visual grid in Settings or the HeaderBar theme picker; all themes use CSS variables for consistent sidebar, header, and content colors
  • Real-time connection status indicator — compact status bar with a green/red dot; polls /api/health every 15 seconds
  • Notification center — bell icon in the header bar showing unread clone events sourced from Delta tables with time-ago formatting (e.g., "3 minutes ago"); opens a slide-out panel with event details. The badge count tracks only new events since you last opened the panel (uses a "last seen" timestamp stored in localStorage), so it resets to zero when you check your notifications
  • Pinned catalog pairs — favorite source/destination pairs on the Dashboard for quick access
  • Collapsible sidebar — five collapsible navigation groups with 33 pages total; sidebar collapses to an icon-only rail via a button at the bottom or a toggle in Settings; Databricks-style density (13px font, 16px icons, compact padding) with theme-aware colors
  • Page state persistence — every analysis/management page (PII Scanner, Schema Drift, Preflight, Diff & Compare, Cost Estimator, Profiling, Impact Analysis, Compliance, Monitor, Storage Metrics, plus all DQ Suite, Governance, FinOps, MDM, Federation pages) preserves its results, filters, tabs, catalog/schema pickers, and view modes when you navigate away — backed by sessionStorage with a 30-minute TTL so coming back never re-queries Databricks within the freshness window. Form fields about to be POSTed (notes, descriptions, YAML/SQL, credentials, typed-confirm) intentionally stay local so they don't restore stale values
  • Durable in-flight job tracking — long-running operations (clone, sync, incremental sync, demo-data batch & streaming, IaC generation, batch reconciliation) survive page navigation and browser refresh. The useDurableJob hook reconnects to the server-side job on remount, so coming back mid-run resumes the same progress bar, log tail, and (for streaming demos) throughput chart instead of resetting to a blank form
  • Resizable panels — Main sidebar, Catalog Browser, Table Detail Drawer, and Lineage Graph all support drag-to-resize with widths persisted in localStorage
  • Session persistence — server-side session store keeps all auth methods alive across page refreshes; credentials are stored in localStorage so no re-authentication is needed after reload
  • Dedicated login page — full-screen dark login with Clone-Xs branding, PAT and Azure auth tabs, an Azure multi-step wizard (Login, Tenant, Subscription, Workspace), and an "Explore Clone-Xs" bypass button for demo mode
  • Portal switcher — positioned in the right corner of the header with full keyboard support (arrow keys, Escape, Enter)
  • Databricks-style compact layout — 18px h1, 15px h2, 13px body; 48px header height; content max-width capped at 1400px and centered; cards, inputs, and buttons all tightened for density
  • Accessibility (WCAG 2.1 AA) — focus-visible outlines on all interactive elements, print styles (hides nav, sidebar, header), proper ARIA tab patterns with keyboard navigation on the login page, required field indicators with aria-required, loading states with aria-busy and screen reader text, 32px minimum touch targets for Databricks density, and prefers-reduced-motion media query support
  • Warehouse management — "Start" button for stopped warehouses in Settings, instant fail (no retry) for invalid or missing warehouses, and toast notifications for warehouse errors
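The page state persistence behavior described above boils down to a keyed store with a freshness check. The following is a minimal sketch of that idea in Python, not the UI's actual code: a plain dict stands in for sessionStorage, and the class and method names are hypothetical.

```python
import json
import time

TTL_SECONDS = 30 * 60  # the 30-minute freshness window described above


class PageStateStore:
    """Dict-backed stand-in for sessionStorage with per-entry expiry."""

    def __init__(self, ttl=TTL_SECONDS, clock=time.time):
        self._ttl = ttl
        self._clock = clock
        self._items = {}  # key -> JSON string, as a browser store would hold

    def save(self, page, state):
        entry = {"saved_at": self._clock(), "state": state}
        self._items[page] = json.dumps(entry)

    def load(self, page):
        """Return the saved state, or None if it is missing or stale."""
        raw = self._items.get(page)
        if raw is None:
            return None
        entry = json.loads(raw)
        if self._clock() - entry["saved_at"] > self._ttl:
            del self._items[page]  # expired: the caller should re-query
            return None
        return entry["state"]
```

A page would call load() on mount and fall back to a fresh query only when it returns None. Fields about to be POSTed would simply never be passed to save().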

Login Page

The login page is the first screen users see when Clone-Xs starts. It uses a full-screen dark design with centered Clone-Xs branding.

Two authentication methods are available as tabs:

  • PAT (Personal Access Token) — enter a Databricks host URL and token to connect immediately.
  • Azure — a multi-step wizard that walks through Login, Tenant selection, Subscription selection, and Workspace selection.

An "Explore Clone-Xs" button at the bottom of the login form lets users bypass authentication and enter demo mode to browse the interface without a live Databricks connection.

The login page implements a full ARIA tab pattern with keyboard navigation, required field indicators (* markers plus aria-required), and loading states with aria-busy and screen reader announcements.

Theme System

Clone-Xs ships with 10 themes, selectable from the Settings page (Interface section) or the HeaderBar theme picker. Both controls stay in sync.

Theme | Style
Light | Clean white background with neutral accents
Dark | Standard dark mode with muted tones
Midnight | Deep blue-black for low-light environments
Sunset | Warm amber and orange tones
High Contrast | Maximum contrast for readability
Ocean | Cool blue palette
Forest | Green-tinted dark mode
Solarized | Ethan Schoonover's Solarized palette
Rose | Soft pink accents on a light or dark base
Slate | Blue-gray neutral tones

All themes define their colors through CSS variables, so sidebars, headers, cards, and content areas automatically pick up the correct accent and background colors.

Settings Page

The Settings page (/settings) uses a two-panel layout similar to VS Code Settings: a left sidebar with navigation links and a right content panel that scrolls to the selected section.

Sections

Section | Contents
Connection | Databricks host URL with a compact connection status bar (green/red dot)
Authentication | Pill-style tabs for PAT, OAuth, Azure, and Service Principal. Once authenticated, a "✓ Logged in as <user>" line appears under the Save & Connect button so you can spot wrong-account / wrong-token mistakes immediately.
Warehouses | Radio-button warehouse selection with Start and Test buttons for each warehouse
Target Workspaces | Manage saved cross-workspace clone targets. See Target Workspaces below.
Audit & Logs | Audit table catalog and schema, loaded from the application YAML config
Interface | Theme picker grid (10 themes), Sidebar Navigation toggle (collapse/expand), Export Buttons visibility, Catalog Browser visibility
Performance | Cost Estimation Settings with configurable storage price per GB/month, currency selection (10 currencies)
Features | Feature flags and experimental toggles

Target Workspaces

Clone-Xs supports cross-workspace and cross-cloud catalog migration via Delta Sharing + DEEP CLONE (see the cross-workspace clone guide). Rather than typing target host + PAT + warehouse_id on every clone, save target connections once here and pick them from a dropdown on /clone.

Click + Add target to open the dialog:

Field | Notes
Name | Slug used to identify this connection (e.g. prod-azure, dev-aws). Letters, digits, -, _.
Target Host | Full https URL of the destination workspace
Auth Method | Personal Access Token, Service Principal, or CLI Profile
Token / Client ID + Secret / Profile | Credentials for the chosen method. The token field becomes a masked password input on edit (*** placeholder = "keep existing")
Target SQL Warehouse | Click Browse to populate the dropdown from the target workspace. Required.
Default data sync mode | snapshot_once (default), incremental, or force_full
Auto-handle column masks & row filters | When ticked, Clone-Xs drops masks/filters before adding tables to the share and re-applies them on the target after the clone
Keep migration share after clone | Leaves the Delta Share intact for audit/debugging
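The field rules in the table above can be checked before a connection is saved. This is a hedged sketch of such a check rather than the dialog's real validation: the function name and error messages are made up, and it assumes "letters" means ASCII letters.

```python
import re
from urllib.parse import urlparse

# Assumption: "letters" in the Name rule means ASCII letters only.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")
SYNC_MODES = ("snapshot_once", "incremental", "force_full")


def validate_target(name, host, warehouse_id, sync_mode="snapshot_once"):
    """Return a list of problems; an empty list means the form looks valid."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("name: use letters, digits, '-' or '_' only")
    parsed = urlparse(host)
    if parsed.scheme != "https" or not parsed.netloc:
        problems.append("host: must be a full https URL")
    if not warehouse_id:
        problems.append("warehouse: required")
    if sync_mode not in SYNC_MODES:
        problems.append("sync mode: unknown value")
    return problems
```

Returning every problem at once, instead of stopping at the first, lets a form highlight all invalid fields in a single pass.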

Each saved connection card shows:

  • Name + a ✓ Connected · WH RUNNING badge after a successful Test
  • Host (monospace, truncated if long)
  • ✓ Logged in as <user> — auto-fetched on page mount via the lightweight POST /api/target/whoami endpoint, so you see who you're authenticated as without having to click Test
  • Auth method · Warehouse ID · sync mode (one-line summary)
  • Test / Edit / Delete buttons. Test triggers a full check (auth, metastore sharing, warehouse existence, non-blocking warehouse-start if STOPPED) and surfaces toast notifications with the result.

Where credentials are stored: browser localStorage, key clxs_target_connections. The server is intentionally stateless — saved targets never persist to disk, never appear in clone_config.yaml, and never travel through git. Each clone request sends the picked target's credentials inline. To export/import (e.g., onboard a teammate) copy the JSON from the localStorage key. To clear all saved targets, delete the key in browser devtools → Application → Local Storage.
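Because the exported JSON carries live credentials, you may want to mask them before pasting the export into a chat or ticket. The sketch below does that without assuming a particular layout for the stored data. The secret field names are guesses, since the schema of clxs_target_connections is not documented here, so check them against your own export.

```python
import json

# Hypothetical secret-bearing field names; inspect your own export to confirm.
SECRET_KEYS = {"token", "client_secret", "password"}


def redact(value, secret_keys=SECRET_KEYS):
    """Return a copy of parsed JSON with non-empty secret fields masked."""
    if isinstance(value, dict):
        return {
            key: "<redacted>"
            if key.lower() in secret_keys and item
            else redact(item, secret_keys)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item, secret_keys) for item in value]
    return value


def redact_export(json_text):
    """Mask secrets in text copied from the clxs_target_connections key."""
    return json.dumps(redact(json.loads(json_text)), indent=2)
```

A teammate importing a redacted export would need to re-enter their own credentials, which is usually preferable to sharing a token.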

On /clone the saved targets appear in a dropdown when you tick "Clone to a different workspace." See the cross-workspace clone guide for the full flow.

Sidebar

The sidebar supports two modes:

  • Expanded — full-width sidebar showing icons and labels for all navigation groups and pages.
  • Collapsed (rail) — icon-only rail that saves horizontal space while keeping navigation accessible.

Toggle between modes using the collapse/expand button at the bottom of the sidebar or the "Sidebar Navigation" toggle in Settings under the Interface section.

The sidebar uses Databricks-style density: 13px font size, 16px icons, and compact padding. Colors are driven by CSS variables so they adapt automatically to the active theme.

Accessibility

Clone-Xs targets WCAG 2.1 AA compliance across the entire interface:

  • Focus indicators — all interactive elements show a visible focus outline when navigated via keyboard.
  • ARIA patterns — the login page uses a proper ARIA tab pattern; forms include aria-required on mandatory fields; loading states set aria-busy and provide screen-reader-only status text.
  • Touch targets — interactive elements maintain a minimum 32px touch target, matching Databricks compact density.
  • Reduced motion — a prefers-reduced-motion media query disables animations and transitions for users who request it.
  • Print styles — printing any page hides the sidebar, header, and navigation so only the main content appears.

Pages

Overview

Page | Path | Description
Dashboard | / | Analytics dashboard with 10 stat cards, 5 charts (area, pie, bar, line, and trend), 2 insight tables, a Catalog Health Score gauge, Pinned Catalog Pairs for quick-access favorites, and a Notification Center bell icon for recent events. Uses GET /api/dashboard/stats and GET /api/catalog-health.
Audit Trail | /audit | Audit page with a summary stats bar at the top. Filters include free-text search, status dropdown, catalog filter, and date range picker. Entries render as expandable rows that reveal a detail grid of operation metadata. A Log Detail Panel displays color-coded execution logs (info/warn/error) with a Download Full Log button for offline review. Uses GET /api/audit and GET /api/audit/{jobId}/logs.
Metrics | /metrics | Performance metrics showing total clones, success rate, average duration, and tables-per-hour throughput with a status breakdown bar chart. Uses GET /api/monitor/metrics.

Operations

Page | Path | Description
Clone | /clone | Multi-step wizard: select source/destination catalogs, configure clone type (deep/shallow), set options like parallel and validate, preview the plan, then execute. Polls job progress in real time. Reads template URL parameters to pre-fill all checkboxes and options when launched from the Templates page, and auto-populates the Storage Location from the selected source catalog. Uses POST /api/clone and GET /api/clone/{jobId}.
Sync | /sync | Two-way catalog synchronization with dry-run and drop-extra options. Shows a diff preview of ADD/UPDATE/REMOVE actions before executing, then tracks job progress. Uses POST /api/sync and GET /api/clone/{jobId}.
Incremental Sync | /incremental-sync | Syncs only tables changed since the last clone. Checks for delta changes first, then runs an incremental sync with volume support. Uses POST /api/incremental/check and POST /api/incremental/sync.
Generate | /generate | Generates IaC artifacts: Databricks workflow YAML or Terraform/Pulumi configurations for clone jobs. Tracks generation progress. Uses POST /api/generate/workflow and POST /api/generate/terraform.
Rollback | /rollback | Lists rollback logs from previous operations and lets you revert a clone by selecting a log entry and confirming execution. Uses Delta RESTORE TABLE instead of DROP, showing a per-table rollback plan with RESTORE vs DROP badges. Records pre-clone table versions so each table can be individually restored. Uses GET /api/rollback/logs and POST /api/rollback.
Templates | /templates | Template browser with category filter pills, unique icons and accent colors per template, configuration badges showing key options at a glance, and expandable long descriptions. Click anywhere on a card to launch the Clone page with all template configuration passed as URL parameters. Uses GET /api/templates.
Create Job | /create-job | Create a persistent Databricks Job with auto-populated storage location, Clone-Xs job dropdown for updates, cron schedule, email notifications, retries, and full clone options. Uses POST /api/generate/create-job and GET /api/generate/clone-jobs.
Multi-Clone | /multi-clone | Clone a single source catalog to multiple destination workspaces in parallel. Add/remove destination rows, then execute all clones concurrently. Uses POST /api/clone (one per destination).
Advanced Tables | /advanced-tables | Manage advanced Unity Catalog table types: materialized views, streaming tables, online tables, vector search indexes, and feature tables. Create, inspect, and refresh advanced tables with full metadata display. Uses GET /api/advanced-tables and POST /api/advanced-tables.
Demo Data | /demo-data | Generate realistic demo catalogs with synthetic data. 10 industries (healthcare, financial, retail, telecom, manufacturing, energy, education, real_estate, logistics, insurance), each with 20 tables/views/UDFs. Template presets (Quick/Sales/Full), medallion architecture toggle, Create UDFs checkbox (toggle UDF creation), Create Volumes checkbox (toggle volume and sample file creation), date range inputs (start/end date pickers for controlling the generated data time range), destination catalog input (auto-clone to a second catalog after generation), generation preview with cost estimates, per-industry progress bars with estimated time remaining (ETA based on elapsed time and industries completed), cleanup button, and direct link to Explorer. Uses POST /api/generate/demo-data. The same page also hosts six unstructured-data tabs: Documents (PDF / DOCX / PPTX / XLSX / EML, with optional AI-drafted narrative content via Databricks Model Serving), Media (PNG / WAV / MP4), Knowledge (markdown wiki / Q&A JSON / JSONL chat), Logs (NGINX / JSON / syslog / OTel traces), Code (Python / JS / Java repos), and Live Capture (browser webcam → UC Volume + Delta with inline BINARY, image-grounded multimodal AI for caption / alt-text / summary / tags / OCR / scene category, with a Strict-vs-Permissive description style toggle). The first five share a destination radio (Volume / Volume + catalog / direct Delta table) and a dynamic catalog/schema/volume picker; Live Capture is synchronous (no JobManager) and writes one row per multipart upload. See Unstructured Demo Data.
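The Demo Data row describes an ETA "based on elapsed time and industries completed". One plausible reading is a linear extrapolation from the average time per finished industry. The sketch below shows that reading only; the function name is hypothetical and the real calculation may differ.

```python
def estimate_remaining_seconds(elapsed_seconds, completed, total):
    """Linear ETA: average time per finished industry times industries left.

    Returns None until at least one industry has finished, because there is
    no average to extrapolate from yet.
    """
    if total <= 0 or not 0 <= completed <= total:
        raise ValueError("completed must be between 0 and total")
    if completed == 0:
        return None
    average = elapsed_seconds / completed
    return average * (total - completed)
```

For example, with 3 of 10 industries finished after 90 seconds, the average is 30 seconds per industry, so the remaining 7 are estimated at 210 seconds.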

Discovery

Page | Path | Description
Data Lab | /data-lab | Interactive SQL query editor with catalog browser, 12 chart types, auto-visualization, deep data profiler, execution plan analysis, schema diagrams, and 4 AI features (Fix, Analyze, Explain, Generate). See the Data Lab guide. Uses POST /api/reconciliation/execute-sql, POST /api/profile-table, POST /api/profile-results, and POST /api/ai/summarize.
Notebooks | /notebooks | Multi-cell SQL + Markdown notebook for interactive data exploration. Features: catalog browser sidebar, execution counter, AI per cell (fix/explain/generate), parameterized cells with {{variable}} syntax, cell duplication, auto-save, table of contents, drag-and-drop reorder, find across cells, undo/redo, presentation mode, HTML export, notebook templates, and temp view chaining. See the Data Lab guide. Uses POST /api/reconciliation/execute-sql and /api/notebooks CRUD.
Explorer | /explore | Full catalog exploration page with a Databricks-style Catalog Browser tree sidebar (catalogs → schemas → tables, lazy loading, search filter, resizable, hideable via Settings). Tabs include: Overview (8 stat cards with Monthly/Yearly cost estimates, schema size donut, table type distribution donut, Top Used Tables, Most Used Columns, schema filter pills), UC Objects (External Locations, Storage Credentials, Connections, Registered Models, Metastore, Shares, Recipients), Views (all views with column counts), Functions (all UDFs with lazy loading), Volumes (type and path), PII Detection (inline scanner), and Feature Store (auto-detected feature tables). Click any table to open the Table Detail Drawer (columns, properties, owner, storage location, dates). Also offers per-table quick actions (Preview, Clone, Profile), a Compare shortcut to the Diff page, and Export CSV. Uses POST /api/search, POST /api/stats, POST /api/column-usage, GET /api/uc-objects, POST /api/table-usage, and GET /api/catalogs/{catalog}/{schema}/{table}/info.
Diff & Compare | /diff | Compare two catalogs side-by-side to see which tables are missing, extra, or different. Also supports validation to verify row counts match. Uses POST /api/diff and POST /api/validate.
Config Diff | /config-diff | Side-by-side comparison of two clone configurations (paste YAML/JSON or load from profiles). Highlights added, removed, and changed keys. Uses POST /api/config/diff.
Lineage | /lineage | Interactive lineage graph with multi-hop tracing (up to 5 hops), upstream/downstream tabs, column-level lineage, notebook/job attribution, time range filtering, and export (JSON/CSV). Powered by system.access.table_lineage, system.access.column_lineage, and Clone-Xs audit logs. Insights tab shows most connected tables, root sources, terminal sinks, top columns by usage, and active users. Uses POST /api/lineage and POST /api/column-usage.
Dependencies | /view-deps | Analyze view and function dependencies within a schema, producing a dependency graph and recommended creation order. Uses POST /api/dependencies/views and POST /api/dependencies/functions.
Impact Analysis | /impact | Assess the downstream blast radius of changes to a catalog, schema, or table. Shows affected views, functions, and risk level (low/medium/high). Uses POST /api/impact.
Data Preview | /preview | Sample and compare rows from source and destination tables side-by-side. Supports single-table preview or cross-catalog comparison mode. Uses POST /api/sample and POST /api/sample/compare.
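The "recommended creation order" mentioned for the Dependencies page is, conceptually, a topological sort of the dependency graph: every view or function is created only after the objects it reads from. The sketch below illustrates the idea with Python's standard graphlib module. It is not the Clone-Xs implementation, and the input shape (a name mapped to the names it depends on) is an assumption.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def creation_order(dependencies):
    """Return object names ordered so each appears after what it depends on.

    `dependencies` maps an object name to the names it reads from.
    Raises graphlib.CycleError (a ValueError) if objects depend on each
    other in a loop, since no valid creation order exists in that case.
    """
    return list(TopologicalSorter(dependencies).static_order())
```

Given a hypothetical schema where v_daily_sales reads from orders and v_clean_customers, and v_clean_customers reads from customers, the result places customers before v_clean_customers and both base tables before v_daily_sales. Objects with no ordering constraint between them may appear in either order.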

Analysis

Page | Path | Description
Reports | /reports | View clone job history with options to export reports, create snapshots, estimate costs, and trigger rollbacks from a single page. Uses GET /api/clone-jobs, POST /api/export, and POST /api/snapshot.
PII Scanner | /pii | Scan a catalog for columns containing personally identifiable information (emails, phone numbers, SSNs) and flag sensitive data. Uses POST /api/pii-scan.
Schema Drift | /schema-drift | Detect schema differences between source and destination catalogs: added/removed/modified columns, type changes, and nullability mismatches. Uses POST /api/schema-drift.
Profiling | /profiling | Profile data quality across a catalog or schema: null percentages, distinct counts, min/max values, and data type distributions. Uses POST /api/profile.
Cost Estimator | /cost | Estimate the storage and DBU cost of cloning a catalog. Displays Total Size in GB/TB, Tables Scanned count, projected Monthly and Yearly Cost, a Deep vs Shallow cost comparison, and a Top 10 Largest Tables breakdown. Uses POST /api/estimate.
Storage Metrics | /storage-metrics | Analyze table storage: file counts, sizes, vacuum candidates, and predictive optimization status. Supports running VACUUM and OPTIMIZE directly. Uses POST /api/storage-metrics and POST /api/check-predictive-optimization.
Compliance | /compliance | Generate governance and compliance reports for a catalog covering permissions, access patterns, and policy adherence. Uses POST /api/compliance.
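The projected Monthly and Yearly Cost on the Cost Estimator page presumably derives from the total size and the storage price per GB/month configured in Settings. The arithmetic can be sketched as below. This covers storage only (not DBUs), the function name is hypothetical, and treating a GB as 1024^3 bytes is an assumption.

```python
def storage_cost(total_bytes, price_per_gb_month):
    """Project monthly and yearly storage cost for a given data size."""
    size_gb = total_bytes / 1024**3  # assumption: binary gigabytes
    monthly = size_gb * price_per_gb_month
    return {"size_gb": size_gb, "monthly": monthly, "yearly": monthly * 12}
```

With an illustrative price of 0.25 per GB/month, a 400 GB catalog works out to 100 per month and 1,200 per year in the selected currency.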

Management

Page | Path | Description
Monitor | /monitor | Continuous monitoring of catalog sync status: compares source and destination in real time, tracks drift, and shows sync freshness for each table. Uses POST /api/monitor.
Preflight | /preflight | Run prerequisite checks before a clone: validates catalog access, warehouse connectivity, permissions, and schema compatibility. Uses POST /api/preflight.
Config | /config | View and edit the application YAML configuration with a built-in editor. Lists available profiles and allows saving changes. Uses GET /api/config, GET /api/config/profiles, and PUT /api/config.
Settings | /settings | Two-panel layout (sidebar nav + content panel) for managing Databricks connection, authentication (PAT or Azure with pill-style tabs), warehouse selection (radio buttons with Start and Test actions), audit table configuration, theme and interface preferences, cost estimation settings, and feature flags. See the Settings Page section above for full details. Uses POST /api/auth/azure-login, POST /api/auth/azure/connect, and GET /api/auth/warehouses.
Warehouse | /warehouse | View, start, and stop SQL warehouses in your Databricks workspace. Shows real-time status with auto-refresh every 10 seconds. Uses GET /api/auth/warehouses, POST /api/warehouse/start, and POST /api/warehouse/stop.
RBAC | /rbac | Manage role-based access control policies for clone operations. Create and view policies that restrict which users can clone specific catalogs. Uses POST /api/rbac/policies.
Plugins | /plugins | Browse installed plugins and toggle them on or off. Each plugin extends Clone-Xs with additional hooks and capabilities. Uses GET /api/plugins and POST /api/plugins/{id}/{enable|disable}.

Governance Portal

Accessed via Portal Switcher. Includes RBAC, RTBF, DSAR, data dictionary, certifications, data contracts, SLA monitoring, and change history.

Page | Path | Description
RBAC | /governance/rbac | Role-based access control policies.
RTBF / Erasure | /governance/rtbf | GDPR Article 17 erasure workflow. See RTBF guide.
DSAR / Access | /governance/dsar | GDPR Article 15 access request workflow. See DSAR guide.

Security Portal

PII detection, compliance validation, and pre-clone security checks.

Page | Path | Description
PII Scanner | /security/pii | Detect personally identifiable information across catalogs.
Compliance | /security/compliance | Generate governance and compliance reports.
Preflight Checks | /security/preflight | Validate permissions and config before cloning.

Automation Portal

Pipelines, job scheduling, templates, and workspace job management.

Page | Path | Description
Pipelines | /automation/pipelines | Chain operations into reusable workflows.
Templates | /automation/templates | Pre-built clone configurations and recipes.
Create Job | /automation/create-job | Schedule persistent Databricks clone jobs.
Clone Jobs | /automation/jobs | List, clone, compare, and back up Databricks Jobs across workspaces. Clone within the same workspace, or cross-workspace with a host and token. Includes a job diff view and JSON backup/restore.
DLT Pipelines | /automation/dlt | Discover, clone, and monitor Delta Live Tables pipelines.

Infrastructure Portal

Warehouse management, cross-workspace federation, and data sharing.

Page | Path | Description
Warehouse | /infrastructure/warehouse | View, start, and manage SQL warehouses.
Lakehouse Monitor | /infrastructure/lakehouse-monitor | Monitor lakehouse table quality and metrics.
Federation | /infrastructure/federation | Cross-workspace catalog federation.
Delta Sharing | /infrastructure/delta-sharing | Share data across organizations.

MDM Portal (Master Data Management)

First open-source Databricks-native MDM — 19 pages covering entity resolution, golden records, stewardship, and compliance.

Page | Path | Description
Overview | /mdm | Dashboard with entity stats, charts, global search, and table initialization.
Golden Records | /mdm/golden-records | Master entities with entity 360 drawer (attributes, source records, timeline).
Match & Merge | /mdm/match-merge | 5 tabs: Duplicates, Rules, Survivorship, Source Trust, Ingest. Match tuning tester.
Relationships | /mdm/relationship-graph | Interactive SVG entity graph with zoom, filter, and detail panel.
Merge History | /mdm/merge-history | All merge/split decisions with undo capability.
Data Stewardship | /mdm/stewardship | Review queue with side-by-side compare, bulk ops, SLA timer, comments.
Hierarchies | /mdm/hierarchies | Create and manage parent-child entity trees.
Industry Templates | /mdm/templates | Healthcare (MPI), Financial (KYC), Retail (360), Manufacturing; one-click apply.
Reference Data | /mdm/reference-data | Code lists with aliases and cross-system mapping tables.
Negative Match | /mdm/negative-match | "Do not link" rules: pairs that should never be merged.
Settings | /mdm/settings | Thresholds, SLA, notifications, retention, defaults.
DQ Scorecards | /mdm/scorecards | Per-entity-type accuracy, completeness, and active rate.
Data Profiling | /mdm/profiling | Attribute fill rates and distinct value analysis.
Cross-Domain | /mdm/cross-domain | Match across entity types (Customer ↔ Supplier).
Consent | /mdm/consent | GDPR consent matrix: 7 consent types per entity.
Audit Log | /mdm/audit-log | Unified event log with search, filter, CSV export.
Reports | /mdm/reports | Compliance reports with JSON/Markdown export.