Predicting Fabric Throttling Before It Happens

By Jonathan Flach · Published 2026-06-20 · Reviewed 2026-06-20

An F64 capacity running at 80% average utilization can still hit interactive rejection — every new interactive request refused, though background jobs keep running — if a burst of background jobs pushes carry-forward debt above the 60-minute future-capacity threshold. By the time the Capacity Metrics app shows the throttle event, it is 10–15 minutes old and users have been complaining for five minutes already (What is the Microsoft Fabric Capacity Metrics app?, Microsoft Learn, checked June 2026). You can predict throttling — not react to it — by tracking how fast the capacity's carry-forward debt is accumulating relative to the rate at which idle capacity can burn it down.

Throttling in Fabric is not triggered by utilization percentage; it is triggered by how many minutes of future capacity have been pre-committed by smoothed carry-forward debt (Understand your Fabric capacity throttling, Microsoft Learn, checked June 2026). That distinction is the foundation of any forecasting approach. For the full mechanics of how carry-forward accumulates, see the complete guide to Microsoft Fabric capacity monitoring — this article focuses on what to measure and how to project onset. For what to do once throttling has already started, see Fabric throttling triage; for the underlying smoothing physics, see fabric throttling explained.

The three leading indicators

The Capacity Metrics app's Throttling chart exposes three gauges, each measuring how full one of the three future-capacity windows is. The same three figures appear — updated every 30 seconds — in the Real-Time hub capacity overview event stream (Explore Fabric capacity overview events, Microsoft Learn, checked June 2026):

Metric | Event field | Window | Trigger at
Interactive delay % | interactiveDelayThresholdPercentage | 10 min (20 timepoints) | 100%
Interactive rejection % | interactiveRejectionThresholdPercentage | 60 min (120 timepoints) | 100%
Background rejection % | backgroundRejectionThresholdPercentage | 24 h (2,880 timepoints) | 100%

These percentages are not current-second utilization. They express the fraction of each window's total CU budget that is already committed by carry-forward debt. A 10-minute percentage of 75% means the capacity has pre-committed 75% of the next 10 minutes' CU allowance — 2.5 minutes of future capacity left before the interactive-delay throttle engages (Understand the metrics app compute page, Microsoft Learn, checked June 2026).
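The headroom arithmetic above can be sketched in a few lines of Python. The function name and its parameters are illustrative only and are not part of any Fabric API; the inputs are one of the three threshold percentages and the length of its window in minutes.

```python
def headroom_minutes(threshold_pct: float, window_minutes: float) -> float:
    """Minutes of a future-capacity window not yet committed to carry-forward debt."""
    # Clamp at zero: once the percentage passes 100, no headroom is left.
    remaining_fraction = max(0.0, 1.0 - threshold_pct / 100.0)
    return remaining_fraction * window_minutes


# 75% of the 10-minute window committed leaves 2.5 minutes of headroom.
print(headroom_minutes(75.0, 10))
```

The same function applies to the 60-minute and 24-hour windows by passing 60 or 1,440 as the window length.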

The event also carries overageTotalCapacityUnitMs — the net cumulative carry-forward in CU-milliseconds, the raw number behind all three percentages. Tracking its rate of change is the core of the forecasting method below.

Why the metrics app alone is not a forecasting tool

The Metrics app has two limitations that prevent it from serving as an early-warning system:

  1. 10–15 minute data lag. "Usage data becomes available within 10 to 15 minutes after the activity occurs," per Microsoft Learn. At the interactive-delay stage, that lag means you are reading a throttle event that started roughly 10 minutes ago.
  2. No alerting engine. The app shows history; it does not fire alerts. The only native alert is an email at 100% utilization — sent after the fact.

The metrics lag limits your reaction time to essentially zero: by the time the chart shows the 10-minute window crossing 100%, interactive delay has been active for several minutes. The Real-Time hub events close this gap — 30-second granularity, a live stream — but they still require you to watch them or build alerting on top. Both paths are reactive unless you build forward-looking logic on the trend.

The debt-trajectory forecast method

The core of this article is the debt-trajectory forecast method: a structured approach to predicting throttle onset by computing the slope of carry-forward accumulation and projecting when each threshold will be crossed.

The core mechanics

At every 30-second timepoint, the capacity either gains carry-forward debt or burns it down:

  • Gain: overageAddCapacityUnitMs, the CU-ms carried forward because this timepoint's usage exceeded the budget.
  • Burndown: overageBurndownCapacityUnitMs, the CU-ms paid down because this timepoint had spare capacity.
  • Net: overageTotalCapacityUnitMs, the running total.

The interactiveDelayThresholdPercentage crossing 100% is equivalent to:

overageTotalCapacityUnitMs >= baseCapacityUnits × 1000 × 30 × 20 timepoints

For an F64 (baseCapacityUnits = 64): the 10-minute threshold = 64 × 1,000 × 30 × 20 = 38,400,000 CU-ms.

The 60-minute threshold = 64 × 1,000 × 30 × 120 = 230,400,000 CU-ms.
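As a minimal sketch, the threshold formula can be wrapped in two helpers. The function and dictionary names are hypothetical; the 24-hour entry assumes the same formula extends to the 2,880-timepoint background window.

```python
TIMEPOINT_SECONDS = 30

# Timepoints per future-capacity window (30-second timepoints).
WINDOW_TIMEPOINTS = {
    "interactive_delay": 20,       # 10 minutes
    "interactive_rejection": 120,  # 60 minutes
    "background_rejection": 2880,  # 24 hours (assumed to follow the same formula)
}


def threshold_cu_ms(base_capacity_units: int, window: str) -> int:
    """Carry-forward debt, in CU-ms, at which the window reaches 100%."""
    return base_capacity_units * 1000 * TIMEPOINT_SECONDS * WINDOW_TIMEPOINTS[window]


def threshold_pct(overage_total_cu_ms: float, base_capacity_units: int, window: str) -> float:
    """Share of the window already committed by the given debt, as a percentage."""
    return 100.0 * overage_total_cu_ms / threshold_cu_ms(base_capacity_units, window)


for window in WINDOW_TIMEPOINTS:
    print(window, f"{threshold_cu_ms(64, window):,}")
```

Passing 64 as the capacity size reproduces the F64 figures above; any other SKU only changes the first argument.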

The forecast calculation

Track overageTotalCapacityUnitMs across a rolling window of N timepoints (6 = 3 minutes is a practical minimum). Compute the net change per timepoint:

netSlope = (overageTotal[now] - overageTotal[N timepoints ago]) / N

Project the minutes until each threshold:

minutesToInteractiveDelay = (threshold_10min - overageTotal[now]) / netSlope / 2
  (divide by 2 to convert timepoints to minutes)

If netSlope is negative (burndown exceeds new additions), the capacity is self-healing and no throttle is imminent. If netSlope is zero, debt is flat and no crossing is projected. If netSlope is positive, the result is a usable forecast of the crossing time.
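One way to implement the rolling calculation is sketched below. The class and method names are hypothetical, and the readings are assumed to arrive one per 30-second timepoint from whatever consumer you attach to the event stream. A zero or negative slope returns None rather than a time.

```python
from collections import deque
from typing import Optional


class DebtTrajectory:
    """Rolling slope of overageTotalCapacityUnitMs over the last N timepoints."""

    def __init__(self, n_timepoints: int = 6):
        # Keep N + 1 readings so the oldest one is exactly N timepoints back.
        self.readings = deque(maxlen=n_timepoints + 1)

    def add(self, overage_total_cu_ms: float) -> None:
        self.readings.append(overage_total_cu_ms)

    def net_slope(self) -> Optional[float]:
        """Net CU-ms change per timepoint, or None with fewer than two readings."""
        if len(self.readings) < 2:
            return None
        intervals = len(self.readings) - 1
        return (self.readings[-1] - self.readings[0]) / intervals

    def minutes_to(self, threshold_cu_ms: float) -> Optional[float]:
        """Projected minutes until the threshold is crossed, or None if no crossing."""
        slope = self.net_slope()
        if slope is None or slope <= 0:
            return None  # flat or self-healing: nothing to project
        remaining = threshold_cu_ms - self.readings[-1]
        if remaining <= 0:
            return 0.0  # already past the threshold
        return remaining / slope / 2  # two timepoints per minute


trajectory = DebtTrajectory(n_timepoints=6)
for reading in [0, 1_000_000, 2_000_000]:
    trajectory.add(reading)
print(trajectory.minutes_to(6_000_000))
```

Until N + 1 readings have arrived, the slope is computed over however many intervals are available, which makes early projections noisier than steady-state ones.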

Worked example: F32 capacity under morning load

Concrete numbers. F32 PAYG = $4,204.80/month as of June 2026. baseCapacityUnits = 32.

10-minute threshold in CU-ms = 32 × 1,000 × 30 × 20 = 19,200,000 CU-ms.

Suppose at 08:00 the capacity begins a batch load. Over five 30-second windows, overageTotalCapacityUnitMs reads as follows (illustrative, based on the accumulation math from Microsoft Learn throttling mechanics, checked June 2026):

Window | Time | overageTotalCapacityUnitMs | interactiveDelay%
W1 | 08:00:00 | 1,800,000 | 9.4%
W2 | 08:00:30 | 3,900,000 | 20.3%
W3 | 08:01:00 | 6,400,000 | 33.3%
W4 | 08:01:30 | 9,200,000 | 47.9%
W5 | 08:02:00 | 12,400,000 | 64.6%

Net slope over 4 intervals: (12,400,000 − 1,800,000) / 4 = 2,650,000 CU-ms per timepoint.

Minutes to 10-minute threshold from W5: (19,200,000 − 12,400,000) / 2,650,000 / 2 = 1.28 minutes.

Interactive delay begins at approximately 08:03:17 — known at 08:02:00, a 77-second lead time.

Minutes to 60-minute threshold (32 × 1,000 × 30 × 120 = 115,200,000 CU-ms): (115,200,000 − 12,400,000) / 2,650,000 / 2 = 19.4 minutes.

Interactive rejection is projected at roughly 08:21:24 — known more than 19 minutes in advance.

This lead time is the practical value of the forecast. At 08:02:00, the Capacity Metrics app is still showing data from between 07:47 and 07:52. The Real-Time hub event stream is current, so you have the numbers you need, but only the trajectory calculation tells you the crossing is 77 seconds away.
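The worked example can be checked with a short script. This is a sketch using the illustrative readings from the table; the calendar date is arbitrary and the constant and function names are hypothetical.

```python
from datetime import datetime, timedelta

BASE_CU = 32  # F32
READINGS = [1_800_000, 3_900_000, 6_400_000, 9_200_000, 12_400_000]  # W1..W5
NOW = datetime(2026, 6, 20, 8, 2, 0)  # time of W5; the date itself is arbitrary

# Net slope per 30-second timepoint over the four intervals from W1 to W5.
slope = (READINGS[-1] - READINGS[0]) / (len(READINGS) - 1)


def project(window_timepoints: int):
    """Return (minutes until crossing, clock time of crossing) for one window."""
    threshold = BASE_CU * 1000 * 30 * window_timepoints
    minutes = (threshold - READINGS[-1]) / slope / 2
    # Round to the nearest second for a readable clock time.
    return minutes, NOW + timedelta(seconds=round(minutes * 60))


delay_minutes, delay_at = project(20)     # 10-minute window
reject_minutes, reject_at = project(120)  # 60-minute window

print(f"slope: {slope:,.0f} CU-ms per timepoint")
print(f"interactive delay in {delay_minutes:.2f} min, at {delay_at:%H:%M:%S}")
print(f"interactive rejection in {reject_minutes:.1f} min, at {reject_at:%H:%M:%S}")
```

Run as written, it reproduces the 08:03:17 and 08:21:24 projections. Both assume the slope stays constant, which a real batch load rarely does, so the projection should be recomputed at every timepoint.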

Alert thresholds that give real lead time

Rather than alerting at 100%, set thresholds that give actionable lead time:

Alert target | Threshold % | Typical lead time (moderate-slope scenario)*
Pre-interactive-delay warning | 70% on 10-min % | ~4 min before delay onset
Interactive delay imminent | 85% on 10-min % | ~1–2 min before delay onset
Interactive rejection warning | 60% on 60-min % | ~8 min before rejection onset
Background rejection warning | 50% on 24-h % | Hours before rejection onset

*At the aggressive slope in the F32 worked example above (2,650,000 CU-ms/timepoint), the lead time from the 70% alert to the 10-minute threshold crossing is only ~1 minute; from 85%, only ~0.5 minutes. The 4-minute figure assumes a gentler slope (~720,000 CU-ms/timepoint), roughly 3.7× slower accumulation. The ~8-minute interactive-rejection figure, by contrast, matches the aggressive slope; at the gentler slope, the 60% alert fires about 32 minutes before rejection onset. Actual lead time varies with the slope; compute it from your live overageTotalCapacityUnitMs trend.
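The dependence of lead time on slope is easy to compute directly. The helper below is a sketch with a hypothetical name; it assumes the slope holds steady between the alert firing and the threshold crossing.

```python
def lead_time_minutes(alert_pct: float, threshold_cu_ms: float,
                      slope_cu_ms_per_timepoint: float) -> float:
    """Minutes between an alert at alert_pct and the 100% crossing."""
    if slope_cu_ms_per_timepoint <= 0:
        raise ValueError("lead time is only defined for a positive slope")
    remaining_cu_ms = (100.0 - alert_pct) / 100.0 * threshold_cu_ms
    return remaining_cu_ms / slope_cu_ms_per_timepoint / 2  # two timepoints per minute


F32_10_MIN = 32 * 1000 * 30 * 20   # 19,200,000 CU-ms
F32_60_MIN = 32 * 1000 * 30 * 120  # 115,200,000 CU-ms

for label, slope in [("gentle", 720_000), ("aggressive", 2_650_000)]:
    print(label,
          round(lead_time_minutes(70, F32_10_MIN, slope), 2),
          round(lead_time_minutes(85, F32_10_MIN, slope), 2),
          round(lead_time_minutes(60, F32_60_MIN, slope), 2))
```

Feeding in your own capacity's observed slope gives a more honest lead-time estimate than any fixed table.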

Wire a Data Activator rule on the Real-Time hub capacity overview events at 70% on interactiveDelayThresholdPercentage. That fires while 30% of the interactive-delay window is still uncommitted, leaving time to intervene by scaling up the SKU or draining the queue.

The blast-radius problem

The blast radius of throttling is capacity-wide: every workspace on the capacity shares the same carry-forward debt accumulator. One overnight batch that generates 200M CU-ms of background carry-forward pushes backgroundRejectionThresholdPercentage to only about 7% on an F32, but it raises interactiveRejectionThresholdPercentage to roughly 174%. The 60-minute interactive-rejection window is already blown past, so the next morning's interactive workloads (Power BI report queries, direct DAX calls, notebook interactive sessions) face immediate rejection. Spark batch pipelines are background operations (SparkCore = Background) and are unaffected at 174% interactiveRejectionThresholdPercentage; Spark jobs are blocked only if backgroundRejectionThresholdPercentage itself exceeds 100%. The debt is capacity-wide, and so is the pain.
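The asymmetry between the windows follows from dividing the same debt by three very different budgets. A minimal sketch, with hypothetical names and assuming the 24-hour window uses the same formula as the other two:

```python
WINDOW_TIMEPOINTS = {
    "interactive_delay": 20,       # 10 minutes
    "interactive_rejection": 120,  # 60 minutes
    "background_rejection": 2880,  # 24 hours
}


def window_percentages(debt_cu_ms: float, base_capacity_units: int) -> dict:
    """Express one carry-forward total as a percentage of each window's budget."""
    per_timepoint_cu_ms = base_capacity_units * 1000 * 30
    return {
        name: 100.0 * debt_cu_ms / (per_timepoint_cu_ms * timepoints)
        for name, timepoints in WINDOW_TIMEPOINTS.items()
    }


# 200M CU-ms of carry-forward on an F32.
for name, pct in window_percentages(200_000_000, 32).items():
    print(f"{name}: {pct:.1f}%")
```

The same debt that barely registers against the 24-hour budget is far beyond both interactive budgets, which is why a background-only workload can lock out interactive users.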

Workspace-level surge protection (preview, January 2026) can throttle or limit a workspace's background usage against an admin-set threshold, but it does not give each workspace a guaranteed CU reservation. The cumulative debt trajectory is still a capacity-level number. Per-workspace forecasting would require tracking each workspace's contribution to overageTotalCapacityUnitMs over time — possible only if you persist the 30-second event stream to an Eventhouse and join it with workspace-attribution data.

What the Metrics app's Overage chart shows

The Metrics app Compute page has an Overage (Carryforward) chart with three series (Understand the metrics app compute page, Microsoft Learn, checked June 2026):

  • Add % (green): Carry-forward percent added in the 30-second window.
  • Burndown % (blue): Carry-forward percent reduced in the 30-second window.
  • Cumulative % (red): Running total — the number the threshold percentages are derived from.

When the Cumulative % line is rising and Add % bars are consistently taller than Burndown % bars, throttle onset is imminent. This visual is the in-app version of the debt-trajectory logic above. Its limitation: with 10–15 minutes of lag, what looks imminent in the chart has usually already happened.

Why pausing is not the fix

Pausing a capacity does halt throttling immediately, because it resets the carry-forward accumulator to zero. But pausing is a trap, not a fix: when you pause, all accumulated smoothed carry-forward debt settles immediately as a one-time PAYG charge (Pause and resume your Fabric capacity, Microsoft Learn, checked June 2026). On an F32 with 115,200,000 CU-ms of outstanding debt (the 60-minute threshold), the settlement is 115,200 CU-seconds = 32 CU-hours. At $0.18/CU-hour, that is $5.76 in one billing event, on top of any reserved-capacity fee if you're on a reservation. On an F64 with debt at its own 60-minute threshold, the settlement doubles to $11.52.

For an overnight batch that has built 24 hours of background carry-forward, the settlement can be substantial. This is the pause trap: pausing converts accumulated debt into an immediate bill, not an eliminated cost. The capacity resumes clean, but you have paid for every CU-ms of smoothed overage. If you pause purely to clear throttling without fixing the underlying workload, you will rebuild the same debt on the next run. The correct response to predicted throttling is to intervene before it peaks — scale up the SKU temporarily, drain the queue, or reschedule the offending workload.
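The settlement arithmetic can be sketched as a single conversion from CU-ms to CU-hours. The function name is hypothetical, and the $0.18/CU-hour default is the PAYG rate quoted in this article; substitute your region's rate.

```python
def pause_settlement_usd(debt_cu_ms: float, rate_per_cu_hour: float = 0.18) -> float:
    """One-time charge for settling outstanding carry-forward debt on pause."""
    cu_hours = debt_cu_ms / 1000 / 3600  # CU-ms -> CU-seconds -> CU-hours
    return cu_hours * rate_per_cu_hour


f32_60_min_debt = 32 * 1000 * 30 * 120   # debt at the 60-minute threshold
f32_24_h_debt = 32 * 1000 * 30 * 2880    # debt at the 24-hour threshold

print(f"${pause_settlement_usd(f32_60_min_debt):.2f}")
print(f"${pause_settlement_usd(f32_24_h_debt):.2f}")
```

Comparing the two results shows how the bill grows with the amount of debt outstanding at the moment of the pause.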

What to do when the slope turns positive

The debt-trajectory forecast tells you a crossing is coming. The response depends on how far out the crossing is:

  1. More than 10 minutes out: Identify the top item on the Metrics app's Compute page matrix. If it is a pipeline or background job, determine whether it can be rescheduled or paused. A single large overnight batch contributing disproportionate debt is often better rescheduled to a lower-utilization window.
  2. 2–10 minutes out: Initiate a temporary SKU scale-up. F32 → F64 doubles the per-timepoint CU budget and doubles the burndown rate, which can absorb a 50% debt increase without crossing the interactive-delay threshold.
  3. Under 2 minutes: Interactive delay is almost certain. Notify affected teams, let in-flight interactive requests complete (throttling never interrupts them), and do not submit new large background workloads until the debt clears.

Never scale down during a positive slope. Halving the SKU halves the absolute CU-ms threshold and slows burndown simultaneously — existing debt that was below the F32 threshold may already exceed the F16 threshold, instantly crossing into interactive delay or rejection.
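The effect of a resize on existing debt can be previewed before committing to it. This sketch assumes the outstanding CU-ms total carries over unchanged across the resize, as the scale-down warning above implies; the function name is hypothetical.

```python
def interactive_delay_pct(overage_total_cu_ms: float, base_capacity_units: int) -> float:
    """Debt as a percentage of the 10-minute threshold for a given SKU size."""
    threshold_cu_ms = base_capacity_units * 1000 * 30 * 20
    return 100.0 * overage_total_cu_ms / threshold_cu_ms


debt = 12_400_000  # the W5 reading from the F32 worked example
for sku, capacity_units in [("F16", 16), ("F32", 32), ("F64", 64)]:
    print(f"{sku}: {interactive_delay_pct(debt, capacity_units):.1f}%")
```

For the W5 debt, the F16 result is above 100%, so a scale-down at that moment would cross into interactive delay immediately, while the F64 result is half the F32 reading.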

Frequently asked questions

How do you predict Fabric throttling before it happens? Track the three throttling-percentage fields in the capacity overview events: interactiveDelayThresholdPercentage (10-minute window), interactiveRejectionThresholdPercentage (60-minute window), and backgroundRejectionThresholdPercentage (24-hour window). Compute the slope of overageTotalCapacityUnitMs over a rolling 3-minute window (6 timepoints). A positive slope that projects an interactive-delay crossing within the next few minutes is your earliest reliable warning, typically 1–5 minutes before users feel it.

What is PercentageOfBaseCapacity in Fabric? The Metrics app's item-detail drill-through shows how much of the capacity's baseline an individual operation consumed in a 30-second window — 250% means an operation used 2.5× the SKU's total CU allowance in that interval. In the Real-Time hub capacity overview events, the equivalent signals are interactiveDelayThresholdPercentage and interactiveRejectionThresholdPercentage — percentages of the 10-minute and 60-minute future-capacity windows already consumed by smoothed carry-forward debt (Troubleshooting guide — capacity limit exceeded, Microsoft Learn, checked June 2026).

How long before throttling does the Capacity Metrics app give a warning? The app gives no alert: it has no alerting engine, and its data lags 10–15 minutes. The earliest native signal is the Real-Time hub capacity overview event, emitted every 30 seconds, which carries the current interactiveDelayThresholdPercentage. A Data Activator alert at 70–80% gives roughly 2–5 minutes of lead time at moderate accumulation slopes, and less when debt is building quickly.

Can you predict background rejection in Fabric? Yes. Background rejection triggers when backgroundRejectionThresholdPercentage exceeds 100%, meaning 24 hours of future capacity has been consumed. A capacity running 5% over baseline continuously will reach background rejection in roughly 20 days; a severe overload at 200% above baseline can reach it in under 15 hours. Tracking the daily slope of overageTotalCapacityUnitMs lets you project that crossing well in advance — often 12–18 hours out for severe chronic overloads.
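The background-rejection estimates above follow from a simple ratio, sketched here under the assumption of a constant overload and no intervening burndown. The function name and the optional existing-debt parameter are hypothetical.

```python
from typing import Optional

WINDOW_HOURS = 24.0  # background rejection triggers at 24 hours of committed capacity


def hours_to_background_rejection(overload_pct_over_baseline: float,
                                  existing_debt_hours: float = 0.0) -> Optional[float]:
    """Hours until 24 hours of future capacity is committed, at a constant overload."""
    if overload_pct_over_baseline <= 0:
        return None  # at or under baseline, debt does not grow
    remaining_hours = max(0.0, WINDOW_HOURS - existing_debt_hours)
    # Each wall-clock hour adds (overload / 100) hours of debt.
    return remaining_hours / (overload_pct_over_baseline / 100.0)


print(hours_to_background_rejection(5) / 24)  # days, for a 5% overload
print(hours_to_background_rejection(200))     # hours, for a 200% overload
```

A capacity that already carries debt reaches the threshold sooner; pass the current debt, expressed in hours of capacity, as the second argument.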

Why does the metrics app show throttling 10–15 minutes after it started? Because usage data becomes available within 10 to 15 minutes after the activity occurs, per Microsoft Learn. The app ingests smoothed utilization on a scheduled basis, not in real time. The only near-real-time native path is the Fabric capacity overview events in Real-Time hub, which emit a smoothed summary every 30 seconds while the capacity is active.

Researched with AI assistance, written and fact-checked by Jonathan Flach, verified against Microsoft Learn.