Governance
Metadata harvesting, access matrix, lineage, policies, incidents, alerts, and audit log
Governance provides ClickHouse visibility: metadata catalog, access control analysis, data lineage, policy enforcement, incident management, alerting, and audit logging.
Requires a Pro license.
Overview
Governance modules:
- Dashboard — inventory metrics and sync status
- Tables — metadata, comments, notes, sensitivity tags
- Query Audit — query history and top query surfaces
- Lineage — auto-detected table-level lineage from SQL
- Access — users, roles, grants, access matrix, over-permission detection
- Incidents — incident queue from policy violations with comments
- Policies — guardrail definitions with warn/block enforcement
- Query Log — ClickHouse
system.query_logsnapshot - Alerts — channels, rules, routing, escalation, and dispatch
- Audit Log — privileged action audit trail
Sync Model
Governance caches ClickHouse metadata and access state in SQLite via three background sync workers.
| Worker | Source | Strategy | Tick |
|---|---|---|---|
metadata | system.databases, system.tables, system.columns, system.parts | Full resync with soft deletes | 5 min |
query_log | system.query_log | Watermark-based incremental (timestamp) | 5 min |
access | system.users, system.roles, system.role_grants, system.grants | Full resync | 5 min |
Workers wake every 5 minutes and trigger a sync if data is older than 10 minutes (staleness threshold). You can also trigger syncs manually — full or single type.
Credential Borrowing
Sync workers borrow credentials from active user sessions (tries up to 3 recent sessions). No hardcoded passwords are stored.
Sync Status
Each worker tracks: status (idle/running/error), last_synced_at, watermark (query_log only), row_count, and last_error.
Metadata Harvesting
Metadata sync runs in three phases:
Phase 1: Databases
Queries system.databases (excludes system and INFORMATION_SCHEMA). Tracks name, engine, first seen, and last updated. Detects database_added and database_removed changes.
Phase 2: Tables
Queries system.tables joined with system.parts for statistics. Tracks table name, engine, UUID, total rows, total bytes, and partition count. Detects table_added and table_removed changes.
Phase 3: Columns
Queries system.columns. Tracks column name, type, position, default expression, and comment. Detects column_added, column_removed, and column_type_changed events.
All removals use soft deletes (is_deleted flag) to preserve history.
Schema Changes
Seven change types are tracked with old/new values and a detected_at timestamp:
database_added,database_removedtable_added,table_removedcolumn_added,column_removed,column_type_changed
View recent changes via Governance > Tables or GET /api/governance/schema-changes.
Sensitivity Tags
Tag tables or columns with classification labels for catalog and compliance purposes.
| Tag | Purpose |
|---|---|
PII | Personally identifiable information |
FINANCIAL | Financial data |
INTERNAL | Internal/confidential |
PUBLIC | Public data |
CRITICAL | Critical/sensitive |
Tags are stored in SQLite and displayed alongside metadata in the governance UI. Each tag records who applied it and when.
Comments & Notes
Turn CH-UI into a lightweight data catalog by annotating tables and columns.
Comments
- Table comments — synced back to ClickHouse via
ALTER TABLE ... MODIFY COMMENT - Column comments — stored in SQLite, displayed alongside column metadata
- Max 4000 characters per comment
Notes
Free-form notes on tables or columns. Multiple notes per object, each with author and timestamp. Admin-only creation and deletion.
Access Control
Users & Roles
Harvested from system.users and system.roles. Tracks auth type (no_password, plaintext_password, sha256_password, double_sha1_password), host restrictions, and default roles.
Grants
Harvested from system.grants. Tracks access_type (SELECT, INSERT, ALTER, CREATE, DROP, etc.), target database/table/column, partial revokes, and grant options.
Access Matrix
After each access sync, an access matrix is rebuilt combining direct user grants and role-based grants into a unified view:
| Field | Description |
|---|---|
| User | ClickHouse username |
| Role | Granted role (if role-based) |
| Database | Target database |
| Table | Target table |
| Privilege | Access type (SELECT, INSERT, etc.) |
| Direct | Whether the grant is direct or via role |
| Last Query | Last time user queried this object |
Over-Permission Detection
Identifies privileges that haven't been used within a configurable threshold.
- Default: 30 days of inactivity
- Configurable via
?days=Nquery parameter (1-3650) - Helps identify privilege cleanup candidates
Lineage
CH-UI auto-detects table-level lineage by parsing SQL from the query log.
Edge Types
| Type | Source Pattern |
|---|---|
select_from | SELECT ... FROM table |
insert_select | INSERT INTO target SELECT ... FROM source |
create_as_select | CREATE TABLE/MV target AS SELECT ... FROM source |
How It Works
- Regex-based extraction from
FROMandJOINclauses - Self-references and system tables are excluded
- One edge per source-target pair per query
- Stored with query ID, ClickHouse user, and detection timestamp
Viewing Lineage
- Table detail view — upstream and downstream edges for a specific table
- Full graph — all nodes and edges across the connection
Policy Engine
Define guardrails that check queries against access rules before or after execution.
Policy Fields
| Field | Description |
|---|---|
| Name | Policy identifier |
| Description | What the policy enforces |
| Object Type | database, table, or column |
| Object Target | Database, table, and/or column name |
| Required Role | ClickHouse role needed to access the object |
| Severity | info, warn, error, critical |
| Enforcement Mode | warn or block |
| Enabled | Toggle |
Enforcement Modes
| Mode | Behavior |
|---|---|
warn | Log violation, allow query execution (post-execution detection) |
block | Reject query before execution (pre-execution guardrail) |
Granularity
- Database-level — protects all tables in the database
- Table-level — protects a specific table
- Column-level — protects a specific column (word-boundary regex match against query text)
Evaluation
- Check if the query touches a protected object (tables extracted from SQL + text heuristic)
- Verify the user's roles from the access matrix
- If the user lacks the required role, create a violation
- If enforcement mode is
block, reject the query with a 403
Guardrails
The guardrail service runs pre-execution checks on all query endpoints. If access state is stale (>10 minutes), an alert event is emitted but the query is allowed.
Incidents
Incidents track policy violations through a resolution workflow.
Auto-Creation
When a policy violation occurs, an incident is automatically created (or deduplicated into an existing one).
Deduplication key: violation:{policy_name}:{user}:{severity} (lowercased)
If an open/triaged/in-progress incident with the same key exists:
occurrence_countis incrementedlast_seen_atis updated- No duplicate incident is created
Source Types
| Source | Description |
|---|---|
violation | Auto-created from policy violation |
over_permission | Created from over-permission detection |
manual | Manually created by admin |
Status Workflow
open → triaged → in_progress → resolved (or dismissed)resolved_at is auto-set when status changes to resolved or dismissed.
Comments
Admins can add threaded comments to incidents. Each comment updates the incident's updated_at timestamp.
Query Log
The Query Log tab shows a snapshot from system.query_log (falls back to query_thread_log if unavailable).
Ingested Query Log
Synced incrementally via watermark. Includes:
- Query text, user, event time
- Normalized hash (SHA256 for fingerprinting similar queries)
- Query kind (Select/Insert/Create/Alter/Drop/Other)
- Metrics: duration, read rows/bytes, result rows, written rows/bytes, memory usage
- Tables used (extracted from FROM/JOIN clauses)
- Error tracking: is_error, exception code, exception message
Batch limit: 5000 queries per sync cycle.
Top Queries
GET /api/governance/query-log/top returns queries ranked by frequency using the normalized hash.
Live Query Log
GET /api/governance/clickhouse-query-log queries system.query_log directly with filters: time range, query kind, status, and free-text search.
Alerts
Governance includes a full alerting system for governance events.
Channels
Three provider types for delivering notifications:
| Provider | Configuration |
|---|---|
| SMTP | Host, port, username, password, from address (plaintext or StartTLS) |
| Resend | API key, from address |
| Brevo | API key, from address |
Channel configs are encrypted at rest. Use the Test button to verify delivery before saving.
Rules
Rules match events to channels.
| Field | Description |
|---|---|
| Name | Rule identifier |
| Event Type | policy.violation, schedule.failed, schedule.slow, or * (wildcard) |
| Severity Minimum | Only match events at or above this severity |
| Cooldown (seconds) | Deduplication window per fingerprint |
| Max Attempts | Retry limit per delivery |
| Subject Template | Supports {{placeholders}} |
| Body Template | Supports {{placeholders}} |
Available template variables: {{event_type}}, {{severity}}, {{title}}, {{message}}, {{payload_json}}, {{channel_name}}, {{created_at}}, {{event_id}}, {{rule_name}}.
Routing
Each rule has one or more routes that determine where and how to deliver.
| Field | Description |
|---|---|
| Channel | Which alert channel to use |
| Recipients | Email addresses |
| Delivery Mode | immediate or digest |
| Digest Window | Minutes to batch (default 15, for digest mode) |
Escalation
Routes can define an escalation path:
| Field | Description |
|---|---|
| Escalation Channel | Fallback channel after failures |
| Escalation Recipients | Fallback email addresses |
| Escalation After Failures | Number of failed attempts before escalating |
Dispatch
The dispatcher runs every 8 seconds (Pro edition only):
- Materialize — matches new events to rules, creates dispatch jobs (up to 100 per tick)
- Process due jobs — sends immediate notifications (up to 30 per tick)
- Process digests — sends batched digest notifications (up to 30 per tick)
Retry uses exponential backoff: 10s * 2^(attempt-1), max 30 minutes.
Audit Log
Tracks privileged actions across governance and alerts.
Captured Actions
governance.sync— full sync triggeredgovernance.table.comment.updated— table comment changegovernance.column.comment.updated— column comment changegovernance.table.note.created/governance.column.note.created— note addedgovernance.object.note.deleted— note deletedgovernance.tag.created/governance.tag.deleted— tag changegovernance.access.user.created/governance.access.user.deleted— user changegovernance.policy.created/governance.policy.updated/governance.policy.deleted— policy changegovernance.incident.from_violation— incident auto-createdalerts.channel.created/alerts.channel.updated/alerts.channel.deleted— channel changealerts.rule.created/alerts.rule.updated/alerts.rule.deleted— rule change
Filtering
Filter by time range, action type, username, or free-text search. Default limit 100, max 1000.