Governance

Metadata harvesting, access matrix, lineage, policies, incidents, alerts, and audit log

Governance provides ClickHouse visibility: metadata catalog, access control analysis, data lineage, policy enforcement, incident management, alerting, and audit logging.

Requires a Pro license.

Overview

Governance modules:

Dashboard — inventory metrics and sync status
Tables — metadata, comments, notes, sensitivity tags
Query Audit — query history and top query surfaces
Lineage — auto-detected table-level lineage from SQL
Access — users, roles, grants, access matrix, over-permission detection
Incidents — incident queue from policy violations with comments
Policies — guardrail definitions with warn/block enforcement
Query Log — ClickHouse system.query_log snapshot
Alerts — channels, rules, routing, escalation, and dispatch
Audit Log — privileged action audit trail

Sync Model

Governance caches ClickHouse metadata and access state in SQLite via three background sync workers.

Worker	Source	Strategy	Tick
`metadata`	`system.databases`, `system.tables`, `system.columns`, `system.parts`	Full resync with soft deletes	5 min
`query_log`	`system.query_log`	Watermark-based incremental (timestamp)	5 min
`access`	`system.users`, `system.roles`, `system.role_grants`, `system.grants`	Full resync	5 min

Workers wake every 5 minutes and trigger a sync if data is older than 10 minutes (staleness threshold). You can also trigger syncs manually — full or single type.

Credential Borrowing

Sync workers borrow credentials from active user sessions (tries up to 3 recent sessions). No hardcoded passwords are stored.

Sync Status

Each worker tracks: status (idle/running/error), last_synced_at, watermark (query_log only), row_count, and last_error.

Metadata Harvesting

Metadata sync runs in three phases:

Phase 1: Databases

Queries system.databases (excludes system and INFORMATION_SCHEMA). Tracks name, engine, first seen, and last updated. Detects database_added and database_removed changes.

Phase 2: Tables

Queries system.tables joined with system.parts for statistics. Tracks table name, engine, UUID, total rows, total bytes, and partition count. Detects table_added and table_removed changes.

Phase 3: Columns

Queries system.columns. Tracks column name, type, position, default expression, and comment. Detects column_added, column_removed, and column_type_changed events.

All removals use soft deletes (is_deleted flag) to preserve history.

Schema Changes

Seven change types are tracked with old/new values and a detected_at timestamp:

database_added, database_removed
table_added, table_removed
column_added, column_removed, column_type_changed

View recent changes via Governance > Tables or GET /api/governance/schema-changes.

Sensitivity Tags

Tag tables or columns with classification labels for catalog and compliance purposes.

Tag	Purpose
`PII`	Personally identifiable information
`FINANCIAL`	Financial data
`INTERNAL`	Internal/confidential
`PUBLIC`	Public data
`CRITICAL`	Critical/sensitive

Tags are stored in SQLite and displayed alongside metadata in the governance UI. Each tag records who applied it and when.

Comments & Notes

Turn CH-UI into a lightweight data catalog by annotating tables and columns.

Comments

Table comments — synced back to ClickHouse via ALTER TABLE ... MODIFY COMMENT
Column comments — stored in SQLite, displayed alongside column metadata
Max 4000 characters per comment

Notes

Free-form notes on tables or columns. Multiple notes per object, each with author and timestamp. Admin-only creation and deletion.

Access Control

Users & Roles

Harvested from system.users and system.roles. Tracks auth type (no_password, plaintext_password, sha256_password, double_sha1_password), host restrictions, and default roles.

Grants

Harvested from system.grants. Tracks access_type (SELECT, INSERT, ALTER, CREATE, DROP, etc.), target database/table/column, partial revokes, and grant options.

Access Matrix

After each access sync, an access matrix is rebuilt combining direct user grants and role-based grants into a unified view:

Field	Description
User	ClickHouse username
Role	Granted role (if role-based)
Database	Target database
Table	Target table
Privilege	Access type (SELECT, INSERT, etc.)
Direct	Whether the grant is direct or via role
Last Query	Last time user queried this object

Over-Permission Detection

Identifies privileges that haven't been used within a configurable threshold.

Default: 30 days of inactivity
Configurable via ?days=N query parameter (1-3650)
Helps identify privilege cleanup candidates

Lineage

CH-UI auto-detects table-level lineage by parsing SQL from the query log.

Edge Types

Type	Source Pattern
`select_from`	`SELECT ... FROM table`
`insert_select`	`INSERT INTO target SELECT ... FROM source`
`create_as_select`	`CREATE TABLE/MV target AS SELECT ... FROM source`

How It Works

Regex-based extraction from FROM and JOIN clauses
Self-references and system tables are excluded
One edge per source-target pair per query
Stored with query ID, ClickHouse user, and detection timestamp

Viewing Lineage

Table detail view — upstream and downstream edges for a specific table
Full graph — all nodes and edges across the connection

Policy Engine

Define guardrails that check queries against access rules before or after execution.

Policy Fields

Field	Description
Name	Policy identifier
Description	What the policy enforces
Object Type	`database`, `table`, or `column`
Object Target	Database, table, and/or column name
Required Role	ClickHouse role needed to access the object
Severity	`info`, `warn`, `error`, `critical`
Enforcement Mode	`warn` or `block`
Enabled	Toggle

Enforcement Modes

Mode	Behavior
`warn`	Log violation, allow query execution (post-execution detection)
`block`	Reject query before execution (pre-execution guardrail)

Granularity

Database-level — protects all tables in the database
Table-level — protects a specific table
Column-level — protects a specific column (word-boundary regex match against query text)

Evaluation

Check if the query touches a protected object (tables extracted from SQL + text heuristic)
Verify the user's roles from the access matrix
If the user lacks the required role, create a violation
If enforcement mode is block, reject the query with a 403

Guardrails

The guardrail service runs pre-execution checks on all query endpoints. If access state is stale (>10 minutes), an alert event is emitted but the query is allowed.

Incidents

Incidents track policy violations through a resolution workflow.

Auto-Creation

When a policy violation occurs, an incident is automatically created (or deduplicated into an existing one).

Deduplication key: violation:{policy_name}:{user}:{severity} (lowercased)

If an open/triaged/in-progress incident with the same key exists:

occurrence_count is incremented
last_seen_at is updated
No duplicate incident is created

Source Types

Source	Description
`violation`	Auto-created from policy violation
`over_permission`	Created from over-permission detection
`manual`	Manually created by admin

Status Workflow

open → triaged → in_progress → resolved (or dismissed)

resolved_at is auto-set when status changes to resolved or dismissed.

Comments

Admins can add threaded comments to incidents. Each comment updates the incident's updated_at timestamp.

Query Log

The Query Log tab shows a snapshot from system.query_log (falls back to query_thread_log if unavailable).

Ingested Query Log

Synced incrementally via watermark. Includes:

Query text, user, event time
Normalized hash (SHA256 for fingerprinting similar queries)
Query kind (Select/Insert/Create/Alter/Drop/Other)
Metrics: duration, read rows/bytes, result rows, written rows/bytes, memory usage
Tables used (extracted from FROM/JOIN clauses)
Error tracking: is_error, exception code, exception message

Batch limit: 5000 queries per sync cycle.

Provider	Configuration
SMTP	Host, port, username, password, from address (plaintext or StartTLS)
Resend	API key, from address
Brevo	API key, from address

Channel configs are encrypted at rest. Use the Test button to verify delivery before saving.

Rules

Rules match events to channels.

Field	Description
Name	Rule identifier
Event Type	`policy.violation`, `schedule.failed`, `schedule.slow`, or `*` (wildcard)
Severity Minimum	Only match events at or above this severity
Cooldown (seconds)	Deduplication window per fingerprint
Max Attempts	Retry limit per delivery
Subject Template	Supports `{{placeholders}}`
Body Template	Supports `{{placeholders}}`

Available template variables: {{event_type}}, {{severity}}, {{title}}, {{message}}, {{payload_json}}, {{channel_name}}, {{created_at}}, {{event_id}}, {{rule_name}}.

Routing

Each rule has one or more routes that determine where and how to deliver.

Field	Description
Channel	Which alert channel to use
Recipients	Email addresses
Delivery Mode	`immediate` or `digest`
Digest Window	Minutes to batch (default 15, for digest mode)

Escalation

Routes can define an escalation path:

Field	Description
Escalation Channel	Fallback channel after failures
Escalation Recipients	Fallback email addresses
Escalation After Failures	Number of failed attempts before escalating

Dispatch

The dispatcher runs every 8 seconds (Pro edition only):

Materialize — matches new events to rules, creates dispatch jobs (up to 100 per tick)
Process due jobs — sends immediate notifications (up to 30 per tick)
Process digests — sends batched digest notifications (up to 30 per tick)

Retry uses exponential backoff: 10s * 2^(attempt-1), max 30 minutes.

Audit Log

Tracks privileged actions across governance and alerts.

Captured Actions

governance.sync — full sync triggered
governance.table.comment.updated — table comment change
governance.column.comment.updated — column comment change
governance.table.note.created / governance.column.note.created — note added
governance.object.note.deleted — note deleted
governance.tag.created / governance.tag.deleted — tag change
governance.access.user.created / governance.access.user.deleted — user change
governance.policy.created / governance.policy.updated / governance.policy.deleted — policy change
governance.incident.from_violation — incident auto-created
alerts.channel.created / alerts.channel.updated / alerts.channel.deleted — channel change
alerts.rule.created / alerts.rule.updated / alerts.rule.deleted — rule change

Filtering

Filter by time range, action type, username, or free-text search. Default limit 100, max 1000.

Governance

On this page