CH-UICH-UI

Governance

Metadata harvesting, access matrix, lineage, policies, incidents, alerts, and audit log

Governance provides ClickHouse visibility: metadata catalog, access control analysis, data lineage, policy enforcement, incident management, alerting, and audit logging.

Requires a Pro license.

Overview

Governance modules:

  • Dashboard — inventory metrics and sync status
  • Tables — metadata, comments, notes, sensitivity tags
  • Query Audit — query history and top query surfaces
  • Lineage — auto-detected table-level lineage from SQL
  • Access — users, roles, grants, access matrix, over-permission detection
  • Incidents — incident queue from policy violations with comments
  • Policies — guardrail definitions with warn/block enforcement
  • Query Log — ClickHouse system.query_log snapshot
  • Alerts — channels, rules, routing, escalation, and dispatch
  • Audit Log — privileged action audit trail

Sync Model

Governance caches ClickHouse metadata and access state in SQLite via three background sync workers.

WorkerSourceStrategyTick
metadatasystem.databases, system.tables, system.columns, system.partsFull resync with soft deletes5 min
query_logsystem.query_logWatermark-based incremental (timestamp)5 min
accesssystem.users, system.roles, system.role_grants, system.grantsFull resync5 min

Workers wake every 5 minutes and trigger a sync if data is older than 10 minutes (staleness threshold). You can also trigger syncs manually — full or single type.

Credential Borrowing

Sync workers borrow credentials from active user sessions (tries up to 3 recent sessions). No hardcoded passwords are stored.

Sync Status

Each worker tracks: status (idle/running/error), last_synced_at, watermark (query_log only), row_count, and last_error.

Metadata Harvesting

Metadata sync runs in three phases:

Phase 1: Databases

Queries system.databases (excludes system and INFORMATION_SCHEMA). Tracks name, engine, first seen, and last updated. Detects database_added and database_removed changes.

Phase 2: Tables

Queries system.tables joined with system.parts for statistics. Tracks table name, engine, UUID, total rows, total bytes, and partition count. Detects table_added and table_removed changes.

Phase 3: Columns

Queries system.columns. Tracks column name, type, position, default expression, and comment. Detects column_added, column_removed, and column_type_changed events.

All removals use soft deletes (is_deleted flag) to preserve history.

Schema Changes

Seven change types are tracked with old/new values and a detected_at timestamp:

  • database_added, database_removed
  • table_added, table_removed
  • column_added, column_removed, column_type_changed

View recent changes via Governance > Tables or GET /api/governance/schema-changes.

Sensitivity Tags

Tag tables or columns with classification labels for catalog and compliance purposes.

TagPurpose
PIIPersonally identifiable information
FINANCIALFinancial data
INTERNALInternal/confidential
PUBLICPublic data
CRITICALCritical/sensitive

Tags are stored in SQLite and displayed alongside metadata in the governance UI. Each tag records who applied it and when.

Comments & Notes

Turn CH-UI into a lightweight data catalog by annotating tables and columns.

Comments

  • Table comments — synced back to ClickHouse via ALTER TABLE ... MODIFY COMMENT
  • Column comments — stored in SQLite, displayed alongside column metadata
  • Max 4000 characters per comment

Notes

Free-form notes on tables or columns. Multiple notes per object, each with author and timestamp. Admin-only creation and deletion.

Access Control

Users & Roles

Harvested from system.users and system.roles. Tracks auth type (no_password, plaintext_password, sha256_password, double_sha1_password), host restrictions, and default roles.

Grants

Harvested from system.grants. Tracks access_type (SELECT, INSERT, ALTER, CREATE, DROP, etc.), target database/table/column, partial revokes, and grant options.

Access Matrix

After each access sync, an access matrix is rebuilt combining direct user grants and role-based grants into a unified view:

FieldDescription
UserClickHouse username
RoleGranted role (if role-based)
DatabaseTarget database
TableTarget table
PrivilegeAccess type (SELECT, INSERT, etc.)
DirectWhether the grant is direct or via role
Last QueryLast time user queried this object

Over-Permission Detection

Identifies privileges that haven't been used within a configurable threshold.

  • Default: 30 days of inactivity
  • Configurable via ?days=N query parameter (1-3650)
  • Helps identify privilege cleanup candidates

Lineage

CH-UI auto-detects table-level lineage by parsing SQL from the query log.

Edge Types

TypeSource Pattern
select_fromSELECT ... FROM table
insert_selectINSERT INTO target SELECT ... FROM source
create_as_selectCREATE TABLE/MV target AS SELECT ... FROM source

How It Works

  • Regex-based extraction from FROM and JOIN clauses
  • Self-references and system tables are excluded
  • One edge per source-target pair per query
  • Stored with query ID, ClickHouse user, and detection timestamp

Viewing Lineage

  • Table detail view — upstream and downstream edges for a specific table
  • Full graph — all nodes and edges across the connection

Policy Engine

Define guardrails that check queries against access rules before or after execution.

Policy Fields

FieldDescription
NamePolicy identifier
DescriptionWhat the policy enforces
Object Typedatabase, table, or column
Object TargetDatabase, table, and/or column name
Required RoleClickHouse role needed to access the object
Severityinfo, warn, error, critical
Enforcement Modewarn or block
EnabledToggle

Enforcement Modes

ModeBehavior
warnLog violation, allow query execution (post-execution detection)
blockReject query before execution (pre-execution guardrail)

Granularity

  • Database-level — protects all tables in the database
  • Table-level — protects a specific table
  • Column-level — protects a specific column (word-boundary regex match against query text)

Evaluation

  1. Check if the query touches a protected object (tables extracted from SQL + text heuristic)
  2. Verify the user's roles from the access matrix
  3. If the user lacks the required role, create a violation
  4. If enforcement mode is block, reject the query with a 403

Guardrails

The guardrail service runs pre-execution checks on all query endpoints. If access state is stale (>10 minutes), an alert event is emitted but the query is allowed.

Incidents

Incidents track policy violations through a resolution workflow.

Auto-Creation

When a policy violation occurs, an incident is automatically created (or deduplicated into an existing one).

Deduplication key: violation:{policy_name}:{user}:{severity} (lowercased)

If an open/triaged/in-progress incident with the same key exists:

  • occurrence_count is incremented
  • last_seen_at is updated
  • No duplicate incident is created

Source Types

SourceDescription
violationAuto-created from policy violation
over_permissionCreated from over-permission detection
manualManually created by admin

Status Workflow

open → triaged → in_progress → resolved (or dismissed)

resolved_at is auto-set when status changes to resolved or dismissed.

Comments

Admins can add threaded comments to incidents. Each comment updates the incident's updated_at timestamp.

Query Log

The Query Log tab shows a snapshot from system.query_log (falls back to query_thread_log if unavailable).

Ingested Query Log

Synced incrementally via watermark. Includes:

  • Query text, user, event time
  • Normalized hash (SHA256 for fingerprinting similar queries)
  • Query kind (Select/Insert/Create/Alter/Drop/Other)
  • Metrics: duration, read rows/bytes, result rows, written rows/bytes, memory usage
  • Tables used (extracted from FROM/JOIN clauses)
  • Error tracking: is_error, exception code, exception message

Batch limit: 5000 queries per sync cycle.

Top Queries

GET /api/governance/query-log/top returns queries ranked by frequency using the normalized hash.

Live Query Log

GET /api/governance/clickhouse-query-log queries system.query_log directly with filters: time range, query kind, status, and free-text search.

Alerts

Governance includes a full alerting system for governance events.

Channels

Three provider types for delivering notifications:

ProviderConfiguration
SMTPHost, port, username, password, from address (plaintext or StartTLS)
ResendAPI key, from address
BrevoAPI key, from address

Channel configs are encrypted at rest. Use the Test button to verify delivery before saving.

Rules

Rules match events to channels.

FieldDescription
NameRule identifier
Event Typepolicy.violation, schedule.failed, schedule.slow, or * (wildcard)
Severity MinimumOnly match events at or above this severity
Cooldown (seconds)Deduplication window per fingerprint
Max AttemptsRetry limit per delivery
Subject TemplateSupports {{placeholders}}
Body TemplateSupports {{placeholders}}

Available template variables: {{event_type}}, {{severity}}, {{title}}, {{message}}, {{payload_json}}, {{channel_name}}, {{created_at}}, {{event_id}}, {{rule_name}}.

Routing

Each rule has one or more routes that determine where and how to deliver.

FieldDescription
ChannelWhich alert channel to use
RecipientsEmail addresses
Delivery Modeimmediate or digest
Digest WindowMinutes to batch (default 15, for digest mode)

Escalation

Routes can define an escalation path:

FieldDescription
Escalation ChannelFallback channel after failures
Escalation RecipientsFallback email addresses
Escalation After FailuresNumber of failed attempts before escalating

Dispatch

The dispatcher runs every 8 seconds (Pro edition only):

  1. Materialize — matches new events to rules, creates dispatch jobs (up to 100 per tick)
  2. Process due jobs — sends immediate notifications (up to 30 per tick)
  3. Process digests — sends batched digest notifications (up to 30 per tick)

Retry uses exponential backoff: 10s * 2^(attempt-1), max 30 minutes.

Audit Log

Tracks privileged actions across governance and alerts.

Captured Actions

  • governance.sync — full sync triggered
  • governance.table.comment.updated — table comment change
  • governance.column.comment.updated — column comment change
  • governance.table.note.created / governance.column.note.created — note added
  • governance.object.note.deleted — note deleted
  • governance.tag.created / governance.tag.deleted — tag change
  • governance.access.user.created / governance.access.user.deleted — user change
  • governance.policy.created / governance.policy.updated / governance.policy.deleted — policy change
  • governance.incident.from_violation — incident auto-created
  • alerts.channel.created / alerts.channel.updated / alerts.channel.deleted — channel change
  • alerts.rule.created / alerts.rule.updated / alerts.rule.deleted — rule change

Filtering

Filter by time range, action type, username, or free-text search. Default limit 100, max 1000.

On this page