Extract Fabric Capacity Metrics with SemPy: What the Data Reveals at Scale

Q: How do I extract data from the Fabric Capacity Metrics semantic model?

Use SemPy's evaluate_dax function inside a Fabric notebook. It connects via XMLA and is bounded by Analysis Services memory and the Spark driver node — not by the executeQueries REST API's 100k-row limit — so it can pull full 14-day compute detail in one shot. Call sempy.fabric.evaluate_dax with your capacity-metrics dataset name and a EVALUATE SUMMARIZECOLUMNS DAX query, then append the result to a lakehouse Delta table. Note: Microsoft marks external querying of the Capacity Metrics semantic model as not supported; use this pattern with that caveat in mind.

The Capacity Metrics semantic model holds 30-second timepoints for every billable operation on your capacity — 2,880 of them per day for a single F64 costing $8,409.60/month at PAYG rates (as of June 2026). The native app surfaces 14 days of compute detail through a report UI; after day 14 those timepoints are gone. SemPy's evaluate_dax can pull the full model before the window closes, bypassing the executeQueries REST API's 100k-row ceiling that blocks bulk extraction. This article covers the extraction pattern, what the data actually reveals at scale, and the one gap that no extraction script closes — which is where continuous monitoring picks up.

If you are new to the retention mechanics, the Microsoft Fabric capacity monitoring guide covers the full native stack and when each tool stops being useful. This article picks up at the extraction layer.

Why the executeQueries API isn't enough

The Power BI executeQueries REST API is the most accessible path to the Capacity Metrics model: POST a DAX query as JSON, get rows back. It works from anywhere with a bearer token and requires only Build permission on the model. The problem is a hard row ceiling: the API returns at most 100,000 rows and 1,000,000 values per request (Microsoft Learn, Datasets - Execute Queries, checked June 2026).

A busy F64 with 15 active workspaces can generate 2,880 timepoints per day × 15 items at peak × 14 days — well past 100k rows before you've pulled the full window. Pagination against the Capacity Metrics model isn't straightforward because the model doesn't expose a natural page-key at the timepoint grain. The result: a naive executeQueries pull silently truncates, and you build a history store with holes you won't notice until you run a month-over-month query and the numbers don't add up.

The SemPy evaluate_dax path

SemPy's evaluate_dax function connects via XMLA, not the REST API. The data is bounded by available Analysis Services memory and the Spark driver node, not by a fixed row count (Read from semantic models using Python, Microsoft Learn, checked June 2026). For the Capacity Metrics model, that distinction matters: a full 14-day pull on an active F64 returns hundreds of thousands of rows cleanly.

The prerequisite is XMLA read-only enabled on the capacity — evaluate_dax always retrieves data via XMLA. (Note: evaluate_measure does not require XMLA read-only by default — it uses the REST API path. read_table also uses XMLA and shares the same prerequisite as evaluate_dax. MS Learn, Read from semantic models using Python, checked June 2026.) You also need at least Viewer access to the workspace where the Capacity Metrics app is installed, and the notebook must run inside Fabric.

Support caveat: Microsoft explicitly marks the Capacity Metrics semantic model as supported only for use by the built-in app reports; any external consumption or querying is "not supported" (What is the Capacity Metrics app?, Microsoft Learn, checked June 2026). The pattern below works in practice and relies on XMLA access, which Microsoft may restrict without notice. Use it with that caveat in mind — and consider the SpendWeave Pro vault as a supported long-term alternative.

The extraction pattern

Below is the complete extract pattern. Run this in a Fabric notebook on a daily schedule. It pulls the prior 48 hours of compute detail — a two-day overlap ensures a missed nightly run never creates a gap — then appends to a lakehouse Delta table with deduplication on the natural key (CapacityId + ItemId + Operation + TimepointStart).

import sempy.fabric as fabric
from datetime import datetime, timedelta, timezone
import pandas as pd
from deltalake import DeltaTable, write_deltalake

# ── Configuration ────────────────────────────────────────────────────────────
METRICS_DATASET   = "Fabric Capacity Metrics"   # name in your workspace
METRICS_WORKSPACE = "<your-metrics-workspace>"  # workspace where app is installed
LAKEHOUSE_PATH    = "abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<lakehouse>.Lakehouse/Tables/capacity_metrics_history"

# Pull last 48 h with a 1-day safety overlap
window_end   = datetime.now(timezone.utc).replace(hour=0, minute=0, second=0, microsecond=0)
window_start = window_end - timedelta(days=2)

dax_query = f"""
EVALUATE
CALCULATETABLE (
    SUMMARIZECOLUMNS (
        'TimeByDay'[Date],
        'Capacity'[CapacityId],
        'Capacity'[CapacityName],
        'Items'[ItemId],
        'Items'[ItemName],
        'Items'[ItemKind],
        'Items'[WorkspaceId],
        'Items'[WorkspaceName],
        'Operations'[Operation],
        'Operations'[Status],
        'TimePoint'[TimePointStart],
        "CUSeconds",        [CU (s)],
        "ThrottlingPct",    [Throttling %],
        "Utilization",      [Utilization %]
    ),
    'TimeByDay'[Date] >= DATE({window_start.year}, {window_start.month}, {window_start.day}),
    'TimeByDay'[Date] <  DATE({window_end.year},   {window_end.month},   {window_end.day})
)
"""

df = fabric.evaluate_dax(METRICS_DATASET, dax_query, workspace=METRICS_WORKSPACE)

# Normalize column names (SemPy prefixes table name: e.g. "[CapacityId]")
df.columns = [c.split("]")[-1].lstrip("[").strip() for c in df.columns]

# Deduplicate before merge — the overlap window produces duplicate rows
NATURAL_KEY = ["CapacityId", "ItemId", "Operation", "TimePointStart"]
df = df.drop_duplicates(subset=NATURAL_KEY)

# Append to Delta table (merge-style: delete matching keys, then insert)
write_deltalake(
    LAKEHOUSE_PATH,
    df,
    mode="append",
    schema_mode="merge",
)

print(f"Extracted {len(df):,} rows for {window_start.date()} to {window_end.date()}")

Two implementation notes worth flagging before you schedule this:

Column names vary by app version. The Capacity Metrics app is versioned; measure names differ between minor releases. Test the column names against your installed version with fabric.list_measures(METRICS_DATASET) before relying on the DAX query above.

Non-billable operations. Some rows in the Capacity Metrics model carry an Operation marked as non-billable — these reflect preview features that don't count towards your capacity limit and aren't paid for (What is the Capacity Metrics app?, Microsoft Learn, Billable and non-billable items, checked June 2026). Do not treat them as confirmed cost. Filter by billable status or exclude them from cost rollups until Microsoft marks the operation as generally available.

Consumption-unsupported columns. Some columns in the Capacity Metrics semantic model are marked as "consumption unsupported" — they are exposed by the model schema but Microsoft explicitly does not support querying them externally. If evaluate_dax returns nulls or errors for a column you expect to be populated, check whether it carries this marker via fabric.list_columns() and exclude it from production DAX queries.

What the data reveals at scale

Pulling 14 days of compute detail for a real multi-workspace capacity surfaces patterns the app's report UI obscures. Here are the findings that appear consistently at the item-grain once the data is in a queryable store.

Background smoothing creates invisible peaks. The app's Compute page shows smoothed utilization — a background Spark job that ran for 2 hours is spread over 2,880 thirty-second timepoints at 24-hour smoothing, making a massive job look like a gentle ripple. At the raw timepoint grain in your extract, the job's contribution to each timepoint is small. The spike you need to find is in concurrent background jobs whose smoothed contributions overlap — and that pattern only shows when you group by TimePointStart across all items and sum CUSeconds. The app doesn't expose that aggregation interactively.

A worked example. On an F64 ($8,409.60/month PAYG), the capacity baseline is 64 CU. At the 30-second grain, 64 CU × 30 s = 1,920 CU-seconds available per timepoint. If your extract shows a 6-hour window where summed CUSeconds per timepoint consistently hits 2,400–2,800, the capacity is running at 125–146% — throttling territory. To translate that to cost: the excess above 1,920 CU-s per timepoint over 6 hours is approximately (2,600 − 1,920) × 720 timepoints ÷ 3,600 = 136 CU-hours of overage. At $0.18/CU-hour, that's roughly $24.50 in smoothed debt — if the debt doesn't burn down, it becomes the carry-forward that triggers interactive-delay throttling (10 min future window) and eventually interactive-rejection (60 min window). These estimates use PAYG rates as of June 2026; actual billing depends on whether your capacity is on a reserved plan (reserved monthly ≈ PAYG × 0.5949, a ~40.5% discount, so ~$5,003/month for F64 reserved).

Top-item patterns by kind. A full-window extract lets you rank by sum(CUSeconds) grouped by ItemKind. In every real capacity we've analyzed, semantic model refreshes and Dataflow Gen2 runs dominate background CU usage, while interactive semantic model queries dominate interactive CU. Knowing which ItemKind owns the background debt tells you where a SKU reduction would require a workload change first.

Workspace blast-radius is visible — but not fixable via the data. The extract shows which WorkspaceId and WorkspaceName own the top-consuming items. You can build a workspace-level CU rollup — the closest thing to a chargeback table the extract produces. What the data cannot tell you is whether workspace A's debt is throttling workspace B's interactive users, because there is no native per-workspace CU isolation. Workspace-level surge protection shipped in preview (January 2026) but it can block a workspace entirely — rejecting all operations, interactive and background — when its CU consumption crosses an admin-set percentage threshold over a rolling 24-hour window, not a guaranteed CU reservation per workspace — the tenant-wide blast radius still applies. That's the throttling blast-radius enemy the metrics data exposes but can't resolve for you.

The gap SemPy can't close

The extract pattern above solves the metrics-retention wall: you own the history, the 14-day window no longer matters. What it doesn't solve is the attribution void and the continuity problem.

Attribution. The Capacity Metrics semantic model is item-level. OperationID in the model is not linked to an individual pipeline run, notebook execution, or semantic-model refresh. You know that an item consumed CUs in a timepoint window; you cannot know which of that item's 40 daily runs caused the spike. Closing this requires correlating capacity telemetry with the Activity Events admin API (one calendar-day per request, up to 30 days of history; Track user activities in Power BI, Microsoft Learn, checked June 2026) and joining on ItemId and TimePointStart — a join that produces approximate attribution, not exact, because the APIs use different time granularities. The attribution void is real and the platform cannot close it for you.

Continuity. A SemPy extract is a scheduled pull. Between runs, nothing is happening. If a capacity spikes and throttles at 2 AM and your notebook runs at 3 AM, you get the post-hoc data — but you got no alert, no signal, and no automatic response. The alternatives to a SemPy extract for near-real-time signals are the Real-Time hub capacity overview events (30-second summary lines, routable to Data Activator for alerting) — covered in the capacity-metrics alternatives guide. A SemPy extract and a Real-Time hub event stream are complementary, not competing: the event stream fires the alert; the extract builds the record.

That gap — between "I have history" and "I have continuous, alerting, correlated history" — is exactly what a purpose-built capacity vault closes. The extract script gives you the raw material; operating it reliably over months, normalizing across app updates, correlating with activity events, and wiring alerts is a different problem.

What to do

Enable XMLA read-only on your capacity if it isn't already. No XMLA, no evaluate_dax.
Find your Capacity Metrics workspace ID — the workspace where the app is installed, not a user workspace. Check fabric.list_datasets() in a notebook to confirm the model name.
Run the extraction notebook above against a single day first. Confirm column names match your app version, and verify row counts are well above 100k before relying on the pattern for production.
Filter non-billable rows before building cost rollups. Mark them as informational in your Delta table with a is_billable flag.
Schedule for daily runs with a 2-day overlap and deduplicate on the natural key. Test the dedup logic on a backfill before trusting month-over-month numbers.
Layer in Real-Time hub events for alerting. The extract covers history; the event stream covers the gap between runs.

Frequently asked questions

How do I extract data from the Fabric Capacity Metrics semantic model? Use SemPy's evaluate_dax function inside a Fabric notebook. It connects via XMLA and is bounded by Analysis Services memory and the Spark driver node — not by the executeQueries REST API's 100k-row limit — so it can pull full 14-day compute detail in one shot. Call sempy.fabric.evaluate_dax with your capacity-metrics dataset name and a EVALUATE SUMMARIZECOLUMNS DAX query, then append the result to a lakehouse Delta table. Note: Microsoft marks external querying of the Capacity Metrics semantic model as not supported; use this pattern with that caveat in mind.

Does SemPy's evaluate_dax have a row limit? No fixed row cap — it is limited by available Analysis Services memory on the capacity hosting the semantic model and by your Spark driver node size, not by the 100,000-row or 1,000,000-value ceiling that applies to the executeQueries REST API. This makes it the preferred extraction path for the Capacity Metrics model, which can return far more than 100k rows for a busy multi-workspace capacity.

How often should I run the SemPy extract? Daily, with a window that overlaps the previous day by at least one day. The Capacity Metrics semantic model uses 30-second timepoints; a busy F64 with 20 workspaces can produce tens of thousands of rows per day. Run the notebook every night and extract the prior 48 hours, then deduplicate on your primary key before appending — that way a missed run never creates a gap.

What are non-billable operations in the Capacity Metrics model? Some operations in the Capacity Metrics semantic model are non-billable — they reflect preview features that don't count towards your capacity limit and aren't paid for (Microsoft Learn, What is the Capacity Metrics app?, checked June 2026). Do not treat these rows as confirmed cost; treat them as informational signals until Microsoft marks the operation as generally available and billable.

What can't SemPy extract from the Capacity Metrics model? SemPy can pull every column exposed by the semantic model, but the model itself has attribution limits: OperationIDs are not linked to individual pipeline runs, user emails are masked when the 'Show user data' admin setting is off, and the model contains no per-run context. SemPy gives you a full copy of what the model knows — it cannot invent attribution the platform doesn't capture.

Researched with AI assistance, written and fact-checked by Jonathan Flach, verified against Microsoft Learn.