Quickstart

Clone your first Unity Catalog catalog in 5 minutes.

Prerequisites

  • Python 3.13+
  • Access to a Databricks workspace with Unity Catalog enabled
  • A SQL warehouse (serverless or pro) with permissions to execute queries
  • Databricks credentials configured (PAT, service principal, or CLI profile)
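The prerequisites above can be checked with a short script before installing. This is a minimal sketch, not part of clone-xs: it assumes credentials follow the standard Databricks conventions (the DATABRICKS_HOST, DATABRICKS_TOKEN, DATABRICKS_CLIENT_ID and DATABRICKS_CLIENT_SECRET environment variables, or a ~/.databrickscfg profile file). Whether clone-xs reads exactly these sources is an assumption, and the function names are hypothetical.

```python
import os
import sys
from pathlib import Path


def python_ok(version_info=sys.version_info, minimum=(3, 13)):
    """Return True if the interpreter meets the documented minimum version."""
    return tuple(version_info[:2]) >= minimum


def detect_auth(env, cfg_path):
    """Return a label for the first credential style found, or None.

    Assumption: the standard Databricks environment variables and
    ~/.databrickscfg are the credential sources in use.
    """
    host = env.get("DATABRICKS_HOST")
    if host and env.get("DATABRICKS_TOKEN"):
        return "pat"
    if host and env.get("DATABRICKS_CLIENT_ID") and env.get("DATABRICKS_CLIENT_SECRET"):
        return "service-principal"
    if cfg_path.is_file():
        return "cli-profile"
    return None


if __name__ == "__main__":
    print("Python 3.13+:", python_ok())
    print("Credentials :", detect_auth(os.environ, Path.home() / ".databrickscfg"))
```

A result of None from detect_auth only means none of the assumed sources were found; it does not check workspace access or warehouse permissions.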

Step 1: Install

pip install clone-xs

Verify the installation:

clxs --help

Step 2: Initialise config

clxs init

This creates config/clone_config.yaml with default values. Edit it to set your warehouse ID and catalog names:

source_catalog: "production"
destination_catalog: "production_clone"
sql_warehouse_id: "your-warehouse-id"
clone_type: "DEEP"
load_type: "FULL"
max_workers: 4
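
Before running anything, it can help to confirm that the placeholder values in the config have been replaced. The sketch below is an illustration only and is not part of clone-xs: parse_flat_yaml and check_config are hypothetical helpers, and the parser handles only flat `key: value` lines like the ones shown above, not general YAML.

```python
PLACEHOLDER = "your-warehouse-id"
REQUIRED_KEYS = ("source_catalog", "destination_catalog", "sql_warehouse_id")


def parse_flat_yaml(text):
    """Parse top-level `key: value` lines; comments and blank lines are skipped."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip().strip('"').strip("'")
    return config


def check_config(config):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in REQUIRED_KEYS:
        if not config.get(key):
            problems.append(f"{key} is missing")
    if config.get("sql_warehouse_id") == PLACEHOLDER:
        problems.append("sql_warehouse_id is still the placeholder")
    src = config.get("source_catalog")
    dst = config.get("destination_catalog")
    if src and src == dst:
        problems.append("source and destination catalogs are identical")
    return problems
```

To use it, read config/clone_config.yaml as text, pass the result of parse_flat_yaml to check_config, and fix anything it reports before moving on to the pre-flight checks.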

Finding your warehouse ID

In your Databricks workspace, open SQL Warehouses and click your warehouse. The ID appears in the page URL and in the details panel (e.g. 1a86a25830e584b7).
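
If you have copied the warehouse page URL, the ID can be pulled out of it programmatically. The following is a minimal sketch that assumes the URL path has the form /sql/warehouses/<id>. The host name in the usage comment is made up, and warehouse_id_from_url is a hypothetical helper, not a clone-xs function.

```python
import re
from urllib.parse import urlparse


def warehouse_id_from_url(url):
    """Return the warehouse ID from a SQL warehouse page URL, or None.

    Assumption: the ID is the hexadecimal path segment that follows
    /sql/warehouses/ in the URL.
    """
    path = urlparse(url).path
    match = re.search(r"/sql/warehouses/([0-9a-f]+)", path)
    return match.group(1) if match else None


# Example with a made-up workspace host:
# warehouse_id_from_url("https://example.cloud.databricks.com/sql/warehouses/1a86a25830e584b7?o=123")
```

If the function returns None, copy the ID from the details panel instead.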

Step 2b: Select your warehouse (Web UI)

If you are using the Web UI, go to Settings and select your SQL warehouse from the dropdown. This persists to the backend config and is used as the default for all operations. You can also set the active warehouse from the Warehouse page by clicking Set as Active on any running warehouse.

Tip: The Settings page loads all configuration from the API backend, which is the single source of truth. Changes made in Settings are immediately available to the Clone page and all other operations.

Step 3: Run pre-flight checks

clxs preflight \
--source production \
--dest production_clone \
--warehouse-id 1a86a25830e584b7

This validates connectivity, permissions, source/destination access, and warehouse status before you clone.

Step 4: Clone

clxs clone \
--source production \
--dest production_clone \
--warehouse-id 1a86a25830e584b7

You'll see a progress bar with real-time status:

  Schemas |██████████████████████████████| 8/8 (100%) [8ok/0fail/0skip] ETA: done

Step 5: Validate (optional)

clxs validate \
--source production \
--dest production_clone \
--warehouse-id 1a86a25830e584b7

This compares row counts, schema structure, and metadata between source and destination.
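
The preflight, clone, and validate commands can also be chained from a script so that a failed step stops the run. This is a sketch under one loud assumption: that clxs exits with a non-zero status when a step fails, which this page does not state. The flags are the ones used in the commands above; build_command and run_pipeline are hypothetical helper names.

```python
import subprocess

STEPS = ("preflight", "clone", "validate")


def build_command(step, source, dest, warehouse_id):
    """Build the argument list for one clxs step using the flags shown above."""
    return [
        "clxs", step,
        "--source", source,
        "--dest", dest,
        "--warehouse-id", warehouse_id,
    ]


def run_pipeline(source, dest, warehouse_id, runner=subprocess.run):
    """Run each step in order and return the list of steps that succeeded.

    Assumption: a non-zero return code means the step failed, so the
    remaining steps are skipped.
    """
    completed = []
    for step in STEPS:
        result = runner(build_command(step, source, dest, warehouse_id))
        if result.returncode != 0:
            break
        completed.append(step)
    return completed


# Usage, with the example values from this page:
# run_pipeline("production", "production_clone", "1a86a25830e584b7")
```

The runner parameter exists so the sequencing can be exercised with a stub in place of subprocess.run, without a workspace or warehouse.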

What's next