CH-UICH-UI

Models (SQL Transformations)

dbt-style SQL models with dependency resolution, DAG execution, and scheduling

Models let you define SQL transformations that materialize as ClickHouse views or tables. Dependencies between models are declared with $ref() syntax, resolved into a DAG, and executed in topological order.

Included in the free Community edition.

Overview

Each model has:

  • A name (valid ClickHouse identifier, unique per connection)
  • A SQL body (with optional $ref() references to other models)
  • A materialization (view or table)
  • A target database (where the view/table is created)

Models are scoped to a ClickHouse connection. Execution runs through the tunnel gateway using credentials from an active user session.

Creating a Model

Navigate to Models in the sidebar and click New Model.

FieldDescriptionDefault
NameValid ClickHouse identifier (^[a-zA-Z_][a-zA-Z0-9_]*$)Required
DescriptionFree-text descriptionEmpty
Target DatabaseClickHouse database for the outputdefault
Materializationview or tableview
SQL BodyThe SELECT query (may contain $ref() calls)Required
Table EngineClickHouse engine (only for table)MergeTree
ORDER BYClickHouse ORDER BY clause (only for table)tuple()

Materialization Types

TypeDDL Generated
viewCREATE OR REPLACE VIEW db.name AS {your SQL}
tableDROP TABLE IF EXISTS db.name then CREATE TABLE db.name ENGINE = engine ORDER BY order_by AS {your SQL}

Tables are recreated on every run (drop + create). Views are replaced in-place.

$ref() Syntax

Reference other models by name using $ref(). At execution time, references are resolved to fully-qualified table names.

SELECT
  a.user_id,
  a.total_orders,
  b.lifetime_value
FROM $ref(user_orders) a
JOIN $ref(user_revenue) b ON a.user_id = b.user_id

Resolves to:

SELECT
  a.user_id,
  a.total_orders,
  b.lifetime_value
FROM `analytics`.`user_orders` a
JOIN `analytics`.`user_revenue` b ON a.user_id = b.user_id

Resolution Rules

  • $ref(name) becomes `target_database`.`name` (backtick-escaped)
  • References inside SQL comments (-- and /* */) are ignored — comments are stripped before extraction
  • Each unique reference creates a dependency edge in the DAG
  • Unresolved references (referencing a model that doesn't exist) cause a validation error

DAG & Dependencies

Models form a Directed Acyclic Graph (DAG). CH-UI uses Kahn's algorithm for topological sorting and cycle detection.

How It Works

  1. Extract $ref() calls from each model's SQL body
  2. Build a dependency graph (model A depends on model B if A references B)
  3. Topologically sort using Kahn's algorithm (process models with zero in-degree first)
  4. If fewer models are processed than exist in the graph, a cycle is detected

Validation Checks

CheckError
Unknown referencereferences unknown model 'foo' via $ref()
Self-referencecannot reference itself via $ref(model_name)
Circular dependencycycle detected in model dependencies

Run validation anytime from the UI or via GET /api/models/validate.

Execution

Three execution modes are available. All acquire a per-connection lock to prevent concurrent runs.

Run All

Executes every model in the connection in topological order.

POST /api/models/run

Run Single

Executes one model plus all its transitive upstream dependencies (recursive walk).

POST /api/models/{id}/run

Use this when working on a specific model — its dependencies are automatically rebuilt first.

Run Pipeline

Executes all models in the same connected component (see Pipelines below).

POST /api/models/pipelines/{anchorId}/run

Execution Steps (Per Model)

  1. Check upstream failures — if any dependency failed, skip this model
  2. Resolve $ref() calls — replace with fully-qualified table names
  3. Generate DDLCREATE OR REPLACE VIEW or DROP TABLE + CREATE TABLE ... AS
  4. Execute via tunnel — 5-minute timeout per statement
  5. Record result — status, elapsed time, error (if any)

Cascade Failures

If an upstream model fails, all downstream models are marked as skipped. The run status becomes partial (mix of success and failure) or error (all failed/skipped).

Pipelines

A pipeline is a group of models that are transitively connected — they directly or indirectly depend on each other. Models with no connections to each other form separate pipelines.

How Pipelines Are Detected

CH-UI treats the dependency graph as undirected and finds connected components via BFS. Each component is an independent pipeline.

Anchor Model

Each pipeline is identified by its anchor model — the first model in topological order within the component. The anchor ID is used for scheduling and pipeline execution.

Example

Pipeline 1: raw_events → cleaned_events → daily_aggregates
Pipeline 2: user_signups → user_cohorts (independent, separate schedule)

Scheduling

Each pipeline can have its own cron schedule. The scheduler checks for due schedules every 30 seconds.

Creating a Schedule

Navigate to the pipeline view and click Schedule on a pipeline card.

FieldDescription
Cron5-field cron expression (minute hour dom month dow)
EnabledToggle to pause/resume

Cron Syntax

Standard 5-field format:

FieldRangeSpecial
Minute0-59*, ,, -, /
Hour0-23*, ,, -, /
Day of month1-31*, ,, -, /
Month1-12*, ,, -, /
Day of week0-6 (Sun=0)*, ,, -, /

Examples:

0 9 * * *     # Daily at 9:00 AM
0 */6 * * *   # Every 6 hours
30 2 * * 0    # Sundays at 2:30 AM
0 0 1 * *     # First of month at midnight

Execution

When a schedule fires, the scheduler calls RunPipeline for the anchor model's connected component. The triggered_by field is set to "scheduler".

Credentials are borrowed from an active user session (up to 3 recent sessions are tried). If no valid session exists, the run fails.

DAG Visualization

The DAG View renders models as an interactive graph using XyFlow.

  • Upstream models appear on the left, downstream on the right
  • Nodes show model name, materialization type, status, and target database
  • Edges show dependency direction
  • Layout: x = depth * 300px, y = index_at_depth * 120px

Run History

Every execution creates a run record with per-model results.

Run Status

StatusMeaning
runningExecution in progress
successAll models succeeded
partialSome models succeeded, some failed
errorAll models failed or were skipped

Per-Model Result Status

StatusMeaning
pendingNot yet started
runningCurrently executing
successCompleted without errors
errorExecution failed (error in error field)
skippedSkipped due to upstream dependency failure

Results include the resolved SQL (with $ref() replaced), elapsed time in milliseconds, and any error message.

Viewing History

Navigate to Models > Runs to see recent runs with their status, model counts, trigger source, and timestamps. Click a run to see per-model results.

Concurrency

  • Only one run per connection at a time (mutex lock)
  • Attempting a second run returns: a model run is already in progress for this connection
  • Each DDL statement has a 5-minute execution timeout

On this page