catalog: add per-timeframe console cluster utilization overview views by jubrad · Pull Request #37307 · MaterializeInc/materialize

jubrad · 2026-06-25T20:43:08Z

Motivation

The Console's cluster-detail page polls a replica-utilization rollup that, for every timeframe except "Last 14 days", is recomputed ad-hoc on each request: it builds the whole-fleet rollup (per-replica metrics aggregation, five per-bucket arg-max top-1s, a multi-way join) and only filters down to the one selected cluster at the very end.

That recompute is CPU-bound on mz_catalog_server, so it serializes under concurrent users and its cost grows linearly with the number of clusters in the deployment. In a local repro on a scale=1,workers=1 cluster (matching mz_catalog_server) with synthetic fleets at a realistic 1-min metric cadence:

clusters	recompute (1 query)	recompute @ 16 sessions	indexed lookup @ 16 sessions
25	3.5 s	p50 51.8 s / 0.33 qps	p50 124 ms / 125 qps
100	13 s	replica OOM-crash	p50 124 ms / 117 qps
400	52 s	—	p50 124 ms / 125 qps

The indexed lookup is flat in both fleet size and concurrency; the recompute collapses.

What this does

Adds two new indexed views alongside the existing 14-day overview, so the Console can read a maintained, per-cluster indexed lookup for every timeframe it offers:

mz_console_cluster_utilization_overview_3h — 1-minute buckets, 3-hour window
mz_console_cluster_utilization_overview_24h — 5-minute buckets, 24-hour window
mz_console_cluster_utilization_overview — now 1-hour buckets / 14 days (was 8-hour)

All three share one SQL body (console_cluster_utilization_overview_sql) and output relation so they stay in sync, and each is indexed on cluster_id in mz_catalog_server. The shared body also drops a redundant re-join of replica_history that the binning step never needed.

Cost / rollout

The two new views only read a 3h/24h window, so they are cheap to maintain (~10 MB and ~510 MB of arrangements; ~1–10 s hydration at 100 replicas). The 14-day view dominates cost (input-bound on 14 days of 1-minute samples — ~1.7 GB / ~33 s hydration per 50 replicas with the GROUP SIZE hints); the 8h→1h change roughly doubles its arrangement memory but leaves hydration unchanged. Incremental cost of the whole change is ~18 MB/replica.

Because the cost is modest for typical deployments and significant only for very large fleets, these ship always-on; the companion Console PR gates use of the views on the environment version and falls back to the ad-hoc query on older environments. A system-flag kill-switch can be added as a fast-follow if a large environment needs it.

Tests

Updates the catalog snapshot tests (oid, information_schema_tables, mz_catalog_server_index_accounting, catalog_server_explain, autogenerated mz_internal) and the catalog / indexes / explain-analyze testdrive files to cover the new objects.

The Console's cluster-detail page polls a replica-utilization rollup that, for every timeframe except "Last 14 days", was recomputed ad-hoc on each request: it builds the whole-fleet rollup (per-replica metrics aggregation, five per-bucket arg-max top-1s, a multi-way join) and only filters to the one selected cluster at the very end. That recompute is CPU-bound on mz_catalog_server, so it serializes under concurrent users and its cost grows linearly with the number of clusters in the deployment. This adds two new indexed views alongside the existing 14-day overview so the Console can read a maintained, per-cluster indexed lookup for every timeframe it offers instead of recomputing: - mz_console_cluster_utilization_overview_3h (1-minute buckets, 3h window) - mz_console_cluster_utilization_overview_24h (5-minute buckets, 24h window) - mz_console_cluster_utilization_overview (now 1-hour buckets, 14d; previously 8-hour buckets) All three share one SQL body (console_cluster_utilization_overview_sql) and output relation so they stay in sync, and each is indexed on cluster_id in mz_catalog_server. The view body is also cleaned up to drop a redundant re-join of replica_history that the binning step never needed. The two new views read only a 3h/24h window of metrics, so they are cheap to maintain (~10MB and ~510MB of arrangements, ~1-10s hydration at 100 replicas). The 14-day view dominates cost (input-bound on 14 days of 1-minute samples); the 8h->1h change roughly doubles its arrangement memory but leaves hydration unchanged. The Console gates use of these views on the environment version, falling back to the ad-hoc query on older environments. Updates the catalog snapshot tests (oid, information_schema_tables, mz_catalog_server_index_accounting, catalog_server_explain, autogenerated mz_internal) and the catalog/indexes/explain-analyze testdrive files to cover the new objects.

The cluster-detail page's replica-utilization graph previously recomputed the whole-fleet rollup ad-hoc for every timeframe except "Last 14 days", which serializes on mz_catalog_server under concurrent users and scales with the number of clusters in the deployment. This points the graph at the pre-materialized, per-cluster indexed overview views (added in the companion catalog PR): a new selector picks the finest view whose window covers the selected range — _3h (1-min), _24h (5-min), or the 14-day view (now 1-hour) — so each load is an indexed lookup keyed on cluster_id. Ranges longer than any view (30 days) still use the ad-hoc query. Use of the new views is gated on the environment version via useEnvironmentGate (CLUSTER_UTILIZATION_OVERVIEW_VIEWS_VERSION); older environments keep today's behavior (ad-hoc for short ranges, the pre-existing 8-hour view for 14 days). The remaining ad-hoc query (buildReplicaUtilizationHistoryQuery, used for the 30-day range and on older environments) is also optimized: the cluster filter is pushed down into replica_history so the rollup only processes the target cluster's replicas, the redundant replica_history re-join is removed, and the offline-events scan is restricted to the cluster's replicas. Also fixes a latent bug where the optional endDate bound used startDate. types/materialize.d.ts is hand-edited to add the new view types; it should be regenerated with `yarn gen:types` against an environment that has the views. Depends on MaterializeInc#37307.

jubrad · 2026-06-26T15:44:56Z

cc @leedqin

Cluster-utilization overview — performance findings

Backing the Console's cluster-detail replica-utilization graphs. The Console used
to run a whole-fleet utilization rollup on every cluster page load against
mz_catalog_server. This summarizes what we measured and weighs the options.

TL;DR — ship the three maintained indexed views + cluster_id point lookup.
It turns a single-worker, throughput-capped fleet-wide recompute into a shared,
sub-second lookup: ~0.33 → ~125 qps, p50 102 s → 236 ms at 32 concurrent
clients, holding up at 800 replicas. The two alternatives we tested lose: a
single 1-min base is a nicer storage shape but collapses under concurrent
SUBSCRIBE; a base + 1h chain is a memory wash today.

Numbers from a local environmentd, dedicated bench cluster scale=1,workers=1
(1 worker, like mz_catalog_server), synthetic fleet, 14 d of metrics at 1/min.
Memory = settled maintained-arrangement bytes (mz_object_arrangement_sizes).
2 replicas per cluster (so "200 replicas" = "100 clusters").

The problem

Each page load issued the full rollup (five per-bucket arg-max top-1s, a 6-way
join, a name lateral, the offline jsonb_agg) and only then filtered to the one
cluster. The filter sat after the top-1s, so it couldn't push into a lookup —
every query recomputed the whole fleet, and on the single-worker catalog server N
concurrent loads serialize.

The fix, and what it buys (before/after)

Three maintained indexed views — _3h (1-min/3 h), _24h (5-min/24 h),
_overview (1-h/14 d) — with cluster_id as the leading key so
WHERE cluster_id = ? is a point lookup. The Console reads the finest view
covering the timeframe.

SELECT … WHERE cluster_id = ? from N concurrent clients, 25 clusters:

N clients	before (no index)	after (indexed)
1	0.33 qps / 3.50 s	3.5 qps / 354 ms
4	0.35 qps / 12.9 s	58.8 qps / 58 ms
16	0.33 qps / 51.8 s	125.1 qps / 124 ms
32	0.32 qps / 102.2 s	118.1 qps / 236 ms

Before is pinned at ~0.33 qps regardless of concurrency (clients just queue; p50
grows linearly). After sustains ~125 qps, sub-300 ms p50 at N=32. Scale: at
400 clusters (800 replicas) the indexed path still does 4.3 → 131 qps (N=1→32);
the unindexed path at 100 clusters degrades to 0.08 qps and drops server
connections at N≥16.

Weighing the options

	1. Three maintained views (shipped)	2. Single 1-min base	3. Base + 1h chain
Maintained objects	3	1	2
Memory @200 replicas	7230 MB	4760 MB	7080 MB
Hydration	baseline	~40% faster	~40% faster (base)
Single-user query p50	88–192 ms	320–330 ms (re-bin)	minute data + client re-bin
Concurrent SUBSCRIBE (N=32)	230 ms, flat mem	18.9 s, +628 MB	230 ms, flat mem
<24 h resolution	5-min	1-min	1-min
Verdict	✅ ship	❌ SUBSCRIBE-hostile	⚠️ wash today

Option 2 — single 1-min base. Tempting: at 1/min scrape each 1-min bucket
holds one sample, so the five top-1s collapse to identity and vanish; one index
subsumes all three views, ~40% faster hydration, less memory at scale. But the
intended use is SUBSCRIBE, and a SUBSCRIBE can't share an arrangement unless a
catalog object maintains it. Re-binning coarse ranges at query time means each
subscriber builds its own re-bin dataflow:

N SUBSCRIBEs	Opt 1 (maintained)	Opt 2 (re-bin)
1	27 ms / flat	611 ms / flat
16	168 ms / flat	9.8 s / +258 MB
32	230 ms / flat	18.9 s / +628 MB

Opt 2 serializes on the worker and blows up — 82× worse snapshots at N=32. Rules
it out for the SUBSCRIBE use case.

queries

See Q4 (base) and Q5 (unmaterialized re-bin) in the appendix.

Option 3 — base + a maintained 1h view on top. Both arrangements are
maintained (SUBSCRIBE-safe), 1-min resolution for <24 h, two objects not three.
Measured, it's a memory wash: 3460 MB @100 replicas (+0.9%) / 7080 MB @200
(−2%) vs the shipped 7230 MB, and apples-to-apples both grow ~2.05×/doubling —
no divergence in the measured range. The genuine scaling signal lives in the
base shape, not the chain total: the 14d top-1 rollup grows super-linearly
(~2.45×/doubling) while the 1-min base grows ~linearly, so the base alone is
29% cheaper than the lean 14d view at 100 clusters — but adding the 1h object to
make it usable eats that back. A material chain win only appears once the base's
near-linear growth dominates, extrapolated past ~200 clusters (~400 replicas);
not measured here (data stops at 100 clusters). The 1h view does reuse the base
(1170 MB increment vs ~2690 MB standalone). Worth revisiting if fleets approach
that scale or if 1-min <24 h resolution is wanted.

query

See Q7 in the appendix.

Key supporting findings

Pushdown is everything. With cluster_id after the top-1s, a coarse
re-bin full-scans the fleet: 7.1 s / 14.3 s at 50 / 100 replicas. Making
cluster_id the leading DISTINCT ON key turns it into a lookup —
330 ms, flat across fleet size. Same trap the original Console query fell
into.
The DISTINCT ON GROUP SIZE hint is mandatory. Without it the 14d view
hydrates in 545.8 s; with it, ~67 s at 100 replicas (~8×).
Bin width is the overview's main cost knob. At 50 replicas the 14d view is
930 MB at 8-h bins vs 1700 MB at 1-h bins — the finer overview resolution
roughly doubles its memory. (We moved 14 d from 8 h → 1 h deliberately.)
The offline jsonb_agg is a non-issue in the indexed design. Measured with
the offline part on vs off, coexisting in one cluster state, settled:
+0.3% memory at realistic volume, +3.3% at a 20× stress load; the jsonb
arrangement in isolation is ~0 MB (90 MB at stress). EXPLAIN confirms the
variants differ (13 vs 0 jsonb refs). It aggregates the tiny status table and
is maintained once, so it adds nothing per query — no need to split it out.

why the earlier A/B looked wrong

An initial A/B measured the two variants in separate cluster states (build →
measure → drop, twice) and read the first non-zero arrangement size, producing
an impossible "without offline uses more memory." Measuring them coexisting
and settled gives the correct, necessarily ≥ 0 deltas above.

Conclusion

Ship the three maintained indexed views with the cluster_id point lookup. The
1-min base is a better single-user storage shape but is SUBSCRIBE-hostile; the
base + 1h chain is a memory wash today (scaling upside only past ~200 clusters);
the offline jsonb_agg is negligible. This exploration strengthens the shipped
design rather than replacing it.

Methodology

All runs on a local environmentd, dedicated bench cluster scale=1,workers=1
(one worker, matching mz_catalog_server), synthetic fleet shadowing the real
system relations (replica_sizes, replica_history, replica_name_history,
replica_status_history, replica_metrics_history) with 14 days of metrics at a
1/min scrape (20,160 samples/replica). Memory = sum of maintained-arrangement
bytes from mz_internal.mz_object_arrangement_sizes, read after the arrangement
settles. Throughput from N concurrent clients issuing point lookups for a fixed
duration. Raw per-run stdout for every number here is preserved alongside this
doc.

Appendix: queries

The harness shadows the real system relations with user tables of identical
shape. Two CTEs are shared by several designs and shown once.

rmh — per-sample cross-process rollup:

SELECT m.occurred_at, m.replica_id, r.size, r.cluster_id,
  (SUM(m.cpu_nano_cores::float8)/(NULLIF(s.cpu_nano_cores,0)*s.processes))   AS cpu_percent,
  (SUM(m.memory_bytes::float8)/(NULLIF(s.memory_bytes,0)*s.processes))       AS memory_percent,
  (SUM(m.disk_bytes::float8)/(NULLIF(s.disk_bytes,0)*s.processes))           AS disk_percent,
  SUM(m.memory_bytes::float8) AS memory_bytes, SUM(m.disk_bytes::float8) AS disk_bytes,
  s.disk_bytes*s.processes AS total_disk_bytes, s.memory_bytes*s.processes AS total_memory_bytes,
  MAX(m.heap_bytes::float8) AS heap_bytes, MAX(m.heap_limit) AS heap_limit,
  COALESCE(MAX(m.heap_bytes::float8/NULLIF(m.heap_limit,0)),
           SUM(m.memory_bytes::float8)/(NULLIF(s.memory_bytes,0)*s.processes)) AS heap_percent
FROM replica_history AS r
JOIN replica_sizes AS s ON r.size = s.size
JOIN replica_metrics_history AS m ON m.replica_id = r.replica_id
GROUP BY m.occurred_at, m.replica_id, r.size, r.cluster_id,
         s.cpu_nano_cores, s.memory_bytes, s.disk_bytes, s.processes

five_max(src, GS) — five per-bucket arg-max top-1s + their join, over a
relation src with the rmh columns plus a bucket column b. cluster_id is
carried through every branch, so WHERE cluster_id = ? pushes into all five reads
(a point lookup when src is indexed on cluster_id):

WITH binned AS (SELECT * FROM <src>),
max_memory AS (SELECT DISTINCT ON (cluster_id, b, replica_id) cluster_id, b, replica_id, memory_percent, occurred_at
  FROM binned OPTIONS (DISTINCT ON INPUT GROUP SIZE = <GS>) ORDER BY cluster_id, b, replica_id, COALESCE(memory_bytes,0) DESC),
max_disk AS (SELECT DISTINCT ON (cluster_id, b, replica_id) cluster_id, b, replica_id, disk_percent, occurred_at
  FROM binned OPTIONS (DISTINCT ON INPUT GROUP SIZE = <GS>) ORDER BY cluster_id, b, replica_id, COALESCE(disk_bytes,0) DESC),
max_cpu AS (SELECT DISTINCT ON (cluster_id, b, replica_id) cluster_id, b, replica_id, cpu_percent, occurred_at
  FROM binned OPTIONS (DISTINCT ON INPUT GROUP SIZE = <GS>) ORDER BY cluster_id, b, replica_id, COALESCE(cpu_percent,0) DESC),
max_heap AS (SELECT DISTINCT ON (cluster_id, b, replica_id) cluster_id, b, replica_id, heap_percent, occurred_at
  FROM binned OPTIONS (DISTINCT ON INPUT GROUP SIZE = <GS>) ORDER BY cluster_id, b, replica_id, COALESCE(heap_bytes,0) DESC),
max_mad AS (SELECT DISTINCT ON (cluster_id, b, replica_id) cluster_id, b, replica_id, memory_and_disk_percent, occurred_at
  FROM binned OPTIONS (DISTINCT ON INPUT GROUP SIZE = <GS>) ORDER BY cluster_id, b, replica_id, COALESCE(memory_and_disk_percent,0) DESC)
SELECT max_memory.b, max_memory.replica_id, max_memory.cluster_id,
  max_memory.memory_percent, max_memory.occurred_at AS mem_at,
  max_disk.disk_percent, max_cpu.cpu_percent, max_heap.heap_percent, max_mad.memory_and_disk_percent
FROM max_memory
JOIN max_disk USING (b, replica_id, cluster_id) JOIN max_cpu  USING (b, replica_id, cluster_id)
JOIN max_heap USING (b, replica_id, cluster_id) JOIN max_mad USING (b, replica_id, cluster_id)

Q1. Shipped rollup (maintained indexed view)

Byte-for-byte the builtin body (real definition in
src/catalog/src/builtin/mz_internal.rs). The three shipped views are this body
at (bin, retention, GROUP SIZE) = ('1 minute',3h,1), ('5 minutes',24h,5),
('1 hour',14d,60). Shown at 1-minute / 14 days:

CREATE VIEW util AS
WITH replica_metrics_history AS (
  SELECT m.occurred_at, m.replica_id, r.size,
    (SUM(m.cpu_nano_cores::float8)/(NULLIF(s.cpu_nano_cores,0)*s.processes))   AS cpu_percent,
    (SUM(m.memory_bytes::float8)/(NULLIF(s.memory_bytes,0)*s.processes))       AS memory_percent,
    (SUM(m.disk_bytes::float8)/(NULLIF(s.disk_bytes,0)*s.processes))           AS disk_percent,
    SUM(m.disk_bytes::float8) AS disk_bytes, SUM(m.memory_bytes::float8) AS memory_bytes,
    s.disk_bytes*s.processes AS total_disk_bytes, s.memory_bytes*s.processes AS total_memory_bytes,
    MAX(m.heap_bytes::float8) AS heap_bytes, MAX(m.heap_limit) AS heap_limit,
    COALESCE(MAX(m.heap_bytes::float8/NULLIF(m.heap_limit,0)),
             SUM(m.memory_bytes::float8)/(NULLIF(s.memory_bytes,0)*s.processes)) AS heap_percent
  FROM replica_history AS r
  JOIN replica_sizes AS s ON r.size = s.size
  JOIN replica_metrics_history AS m ON m.replica_id = r.replica_id
  GROUP BY m.occurred_at, m.replica_id, r.size, s.cpu_nano_cores, s.memory_bytes, s.disk_bytes, s.processes
),
binned AS (
  SELECT *, date_bin('1 minute', occurred_at, '1970-01-01'::timestamp) AS bucket_start
  FROM replica_metrics_history
  WHERE mz_now() <= date_bin('1 minute', occurred_at, '1970-01-01'::timestamp) + INTERVAL '14 days'
),
max_memory AS (SELECT DISTINCT ON (bucket_start, replica_id) bucket_start, replica_id, memory_percent, occurred_at
  FROM binned OPTIONS (DISTINCT ON INPUT GROUP SIZE = 60) ORDER BY bucket_start, replica_id, COALESCE(memory_bytes,0) DESC),
max_disk AS (SELECT DISTINCT ON (bucket_start, replica_id) bucket_start, replica_id, disk_percent, occurred_at
  FROM binned OPTIONS (DISTINCT ON INPUT GROUP SIZE = 60) ORDER BY bucket_start, replica_id, COALESCE(disk_bytes,0) DESC),
max_cpu AS (SELECT DISTINCT ON (bucket_start, replica_id) bucket_start, replica_id, cpu_percent, occurred_at
  FROM binned OPTIONS (DISTINCT ON INPUT GROUP SIZE = 60) ORDER BY bucket_start, replica_id, COALESCE(cpu_percent,0) DESC),
max_memory_and_disk AS (SELECT DISTINCT ON (bucket_start, replica_id) bucket_start, replica_id, memory_percent, disk_percent, memory_and_disk_percent, occurred_at
  FROM (SELECT *, CASE WHEN disk_bytes IS NULL AND memory_bytes IS NULL THEN NULL
        ELSE (COALESCE(memory_bytes,0)+COALESCE(disk_bytes,0))/NULLIF((total_memory_bytes+total_disk_bytes),0) END AS memory_and_disk_percent
        FROM binned) AS inner_mad
  OPTIONS (DISTINCT ON INPUT GROUP SIZE = 60) ORDER BY bucket_start, replica_id, COALESCE(memory_and_disk_percent,0) DESC),
max_heap AS (SELECT DISTINCT ON (bucket_start, replica_id) bucket_start, replica_id, heap_percent, occurred_at
  FROM binned OPTIONS (DISTINCT ON INPUT GROUP SIZE = 60) ORDER BY bucket_start, replica_id, COALESCE(heap_bytes,0) DESC),
offline AS (
  SELECT date_bin('1 minute', occurred_at, '1970-01-01'::timestamp) AS bucket_start, replica_id,
    jsonb_agg(jsonb_build_object('replicaId', rsh.replica_id, 'occurredAt', rsh.occurred_at,
      'status', rsh.status, 'reason', rsh.reason)) AS offline_events
  FROM replica_status_history AS rsh
  WHERE process_id = 0 AND status = 'offline'
    AND mz_now() <= date_bin('1 minute', occurred_at, '1970-01-01'::timestamp) + INTERVAL '14 days'
  GROUP BY bucket_start, replica_id
)
SELECT bucket_start, replica_id, max_memory.memory_percent, max_memory.occurred_at AS max_memory_at,
  max_disk.disk_percent, max_disk.occurred_at AS max_disk_at,
  max_memory_and_disk.memory_and_disk_percent, max_memory_and_disk.memory_percent AS mad_mem,
  max_memory_and_disk.disk_percent AS mad_disk, max_memory_and_disk.occurred_at AS mad_at,
  max_heap.heap_percent, max_heap.occurred_at AS max_heap_at, max_cpu.cpu_percent, max_cpu.occurred_at AS max_cpu_at,
  offline.offline_events, bucket_start + INTERVAL '1 minute' AS bucket_end,
  replica_name_history.new_name AS name, replica_history.cluster_id, replica_history.size
FROM max_memory
JOIN max_disk USING (bucket_start, replica_id)
JOIN max_cpu USING (bucket_start, replica_id)
JOIN max_memory_and_disk USING (bucket_start, replica_id)
JOIN max_heap USING (bucket_start, replica_id)
JOIN replica_history USING (replica_id)
CROSS JOIN LATERAL (
  SELECT new_name FROM replica_name_history
  WHERE replica_id = replica_name_history.id
    AND bucket_start + INTERVAL '1 minute' >= COALESCE(replica_name_history.occurred_at, '1970-01-01'::timestamp)
  ORDER BY replica_name_history.occurred_at DESC LIMIT 1
) AS replica_name_history
LEFT JOIN offline USING (bucket_start, replica_id);

CREATE INDEX util_idx IN CLUSTER bench ON util (cluster_id);

Q2. Per-cluster point lookup (before vs after)

The per-pageload query, and the one swept for throughput. "before" = no index on
util; "after" = identical statement once util_idx exists:

SELECT * FROM util WHERE cluster_id = $1;

Q3. Design A — 14d/1h view alone

CREATE VIEW view_a AS
  -- five_max(src, 60), src = (SELECT *, date_bin('1 hour', occurred_at, '1970-01-01') AS b
  --   FROM (SELECT *, <memory_and_disk_percent CASE> FROM ( <rmh> ) rmh
  --         WHERE mz_now() <= date_bin('1 hour', occurred_at, '1970-01-01') + INTERVAL '14 days') z);
CREATE INDEX view_a_idx IN CLUSTER bench ON view_a (cluster_id);

Q4. Design B — 1-min base

CREATE VIEW base_1min AS
  SELECT date_bin('1 minute', occurred_at, '1970-01-01'::timestamp) AS bucket_start,
    replica_id, cluster_id, size, occurred_at,
    cpu_percent, memory_percent, disk_percent, heap_percent, memory_bytes, disk_bytes, heap_bytes,
    total_memory_bytes, total_disk_bytes,
    CASE WHEN disk_bytes IS NULL AND memory_bytes IS NULL THEN NULL
      ELSE (COALESCE(memory_bytes,0)+COALESCE(disk_bytes,0))/NULLIF((total_memory_bytes+total_disk_bytes),0) END AS memory_and_disk_percent
  FROM ( <rmh> ) rmh
  WHERE mz_now() <= date_bin('1 minute', occurred_at, '1970-01-01'::timestamp) + INTERVAL '14 days';
CREATE INDEX base_1min_idx IN CLUSTER bench ON base_1min (cluster_id);

Q5. Design B — unmaterialized re-bin (collapses under SUBSCRIBE)

CREATE VIEW rebin_1h AS
  -- five_max(src, 60), src = (SELECT *, date_bin('1 hour', occurred_at, '1970-01-01') AS b FROM base_1min)
SELECT * FROM rebin_1h WHERE cluster_id = $1;   -- per-query top-1, NOT maintained

Q6. SUBSCRIBE probe

-- design A: v = maintained five_max view (indexed); design B: v = unmaterialized rebin_1h
BEGIN;
DECLARE cur CURSOR FOR SUBSCRIBE (SELECT * FROM v WHERE cluster_id = 'c<i>');
FETCH ALL cur;   -- time to first full snapshot

Q7. Base + 1h chain

-- base_1min + index (Q4), plus:
CREATE VIEW hour AS
  -- five_max(src, 60), src = (SELECT *, date_bin('1 hour', occurred_at, '1970-01-01') AS b FROM base_1min)
CREATE INDEX hour_idx IN CLUSTER bench ON hour (cluster_id);

Q8. Offline jsonb on vs off

The "off" variant drops exactly the offline CTE, its offline.offline_events
projection, and the trailing LEFT JOIN offline USING (bucket_start, replica_id)
from Q1. EXPLAIN: with = 13 jsonb refs / 3 Reduce / 6 Join; without = 0 / 2 / 5.

jubrad mentioned this pull request Jun 25, 2026

console: read indexed cluster utilization overview views per timeframe #37308

Draft

jubrad force-pushed the console-cluster-utilization-views branch from c052d01 to ccd2a38 Compare June 25, 2026 23:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

catalog: add per-timeframe console cluster utilization overview views#37307

catalog: add per-timeframe console cluster utilization overview views#37307
jubrad wants to merge 1 commit into
MaterializeInc:mainfrom
jubrad:console-cluster-utilization-views

jubrad commented Jun 25, 2026

Uh oh!

jubrad commented Jun 26, 2026

Q1. Shipped rollup (maintained indexed view)

Q2. Per-cluster point lookup (before vs after)

Q3. Design A — 14d/1h view alone

Q4. Design B — 1-min base

Q5. Design B — unmaterialized re-bin (collapses under SUBSCRIBE)

Q6. SUBSCRIBE probe

Q7. Base + 1h chain

Q8. Offline jsonb on vs off

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

jubrad commented Jun 25, 2026

Motivation

What this does

Cost / rollout

Tests

Uh oh!

jubrad commented Jun 26, 2026

Cluster-utilization overview — performance findings

The problem

The fix, and what it buys (before/after)

Weighing the options

Key supporting findings

Conclusion

Q1. Shipped rollup (maintained indexed view)

Q2. Per-cluster point lookup (before vs after)

Q3. Design A — 14d/1h view alone

Q4. Design B — 1-min base

Q5. Design B — unmaterialized re-bin (collapses under SUBSCRIBE)

Q6. SUBSCRIBE probe

Q7. Base + 1h chain

Q8. Offline jsonb on vs off

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant