Fabric Throttling Explained: Why One Query Slows Your Whole Tenant
By Jonathan Flach · Published 2026-06-20 · Reviewed 2026-06-20
An F64 capacity costs $8,409.60 per month at PAYG rates as of June 2026, and a single background pipeline that runs 300 short copy activities can push its 24-hour background rejection chart past 100%, blocking every user across the entire capacity until the debt burns down. That's Fabric capacity throttling in practice: staged across future-capacity time windows, applied capacity-wide, and driven by smoothed debt that accrued minutes or hours before users ever saw an error.
Throttling in Microsoft Fabric is not a circuit breaker on a utilization percentage. It is a future-capacity accounting system: operations borrow CUs from timepoints the capacity hasn't yet reached, and throttling kicks in when that borrowing exceeds defined time-window thresholds of 10 minutes, 60 minutes, and 24 hours. Understanding those windows, and understanding that they apply to a shared CU pool with no native per-workspace isolation, is the foundation of every capacity-sizing and cost-reduction decision. The complete monitoring picture (what the Capacity Metrics app shows, where its 14-day compute retention wall sits, and how to build alerting around it) lives in the Microsoft Fabric capacity monitoring guide.
Why throttling is a future-capacity problem, not a real-time one
Fabric uses bursting and smoothing to let operations run faster than the SKU's baseline ceiling. Bursting lets a job temporarily consume more CUs than the capacity nominally provides; smoothing distributes that extra CU cost across future 30-second timepoints rather than billing it all at the moment the job ran (Fabric throttling policy, Microsoft Learn, checked June 2026).
Interactive operations smooth over a minimum of 5 minutes, up to 64 minutes. Background operations (pipelines, Spark jobs, semantic-model refreshes) smooth over a full 24-hour window of 2,880 timepoints, each 30 seconds long. A single background job consuming 1 CU-hour (3,600 CU-seconds) contributes only about 1.25 CU-seconds to each timepoint; on an F2, whose budget is 2 CUs × 30 seconds = 60 CU-seconds per timepoint, that is roughly 2.1% of each slot (Metrics app calculations, Microsoft Learn, checked June 2026). That's the design intent: spreading the cost across the day so jobs don't instantly breach the ceiling.
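The smoothing arithmetic above can be reproduced in a few lines. This is a minimal sketch, not Fabric's actual implementation: the function name and constants are illustrative, and it assumes a background job's cost is spread evenly across all 2,880 timepoints.

```python
TIMEPOINT_SECONDS = 30
BACKGROUND_WINDOW_HOURS = 24


def background_share_per_timepoint(job_cu_hours: float, sku_cus: int) -> tuple[float, float]:
    """Return (CU-seconds added to each timepoint, fraction of the timepoint budget).

    Assumes even smoothing across the full 24-hour background window.
    """
    timepoints = BACKGROUND_WINDOW_HOURS * 3600 // TIMEPOINT_SECONDS  # 2,880
    job_cu_seconds = job_cu_hours * 3600
    per_timepoint = job_cu_seconds / timepoints
    timepoint_budget = sku_cus * TIMEPOINT_SECONDS  # CU-seconds available per slot
    return per_timepoint, per_timepoint / timepoint_budget


# The article's example: a 1 CU-hour background job on an F2 (2 CUs).
per_tp, share = background_share_per_timepoint(1.0, 2)
print(f"{per_tp:.2f} CU-seconds per timepoint, {share:.1%} of the F2 budget")
```

Running the same function with several concurrent jobs shows how individually small shares add up to a full timepoint budget.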
The problem surfaces when multiple jobs run simultaneously. Each one individually looks harmless after smoothing. Together they fill every timepoint's budget, carry-forward CUs accumulate faster than idle capacity can absorb them, and the debt clock starts running. The capacity's actual CPU right now could be at 40% — but it's busy throttling because the debt is in the future, already committed, and every new operation makes it larger.
The throttling stages table
The four throttling stages are triggered by how much future capacity has already been consumed, expressed as minutes or hours of the SKU's CU allowance. This is the authoritative table, sourced from both the throttling policy and the metrics app calculations docs (checked June 2026).
| Stage | Future-capacity window crossed | Experience | What's still allowed |
|---|---|---|---|
| Overage protection | Up to 10 min of future CUs | No user impact — burst absorbed silently | All operations |
| Interactive delay | 10 min < usage ≤ 60 min | A 20-second throttle added to every new interactive request (report clicks, DAX queries, XMLA) | Background ops continue to accumulate debt |
| Interactive rejection | 60 min < usage ≤ 24 h | Interactive requests rejected; users see CapacityLimitExceeded error | Background ops still start and run |
| Background rejection | Usage > 24 h of future CUs | All requests rejected — interactive and background | Nothing |
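The table's boundaries translate directly into a lookup. The sketch below is illustrative only: `throttling_stage` is a hypothetical helper, and it assumes you already have the amount of future capacity consumed expressed in minutes.

```python
def throttling_stage(future_minutes_consumed: float) -> str:
    """Map minutes of future capacity consumed to the stage names in the table."""
    if future_minutes_consumed <= 10:
        return "overage protection"
    if future_minutes_consumed <= 60:
        return "interactive delay"
    if future_minutes_consumed <= 24 * 60:
        return "interactive rejection"
    return "background rejection"


for minutes in (5, 30, 600, 2000):
    print(minutes, "->", throttling_stage(minutes))
```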
Reading the percentages: The Capacity Metrics app shows three throttling charts — 10-minute interactive %, 60-minute interactive %, and 24-hour background %. Each is a ratio of smoothed carry-forward to that window's total capacity allowance. 100% on the 10-minute chart means the interactive delay throttle is now active. 250% on the background rejection chart means the capacity has consumed 2.5× its daily CU allowance — the minimum recovery time with no new operations is ((250 − 100) ÷ 100) × 24 hours = 36 hours (Metrics app calculations, Microsoft Learn, checked June 2026).
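The recovery formula quoted above is easy to wrap in a helper for quick what-if checks. This is a sketch under the same assumption the formula makes (no new operations arrive during recovery); the function name is hypothetical, and returning zero at or below 100% is a convention chosen here, not something the docs specify.

```python
def min_recovery_hours(background_pct: float) -> float:
    """Minimum hours of quiet needed to bring the 24-hour chart back to 100%."""
    if background_pct <= 100:
        return 0.0  # not in background rejection, nothing to recover from
    return (background_pct - 100) / 100 * 24


print(min_recovery_hours(250))  # the article's example
```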
The blast-radius qualifier: These stages apply at the capacity level, to every workspace on the shared CU pool. There is no native per-workspace CU isolation. Workspace-level surge protection shipped in preview in January 2026 and lets admins set a CU percentage limit on background usage per workspace over a rolling 24-hour window, with the ability to tag high-priority workspaces as "Mission Critical" to exempt them from capacity-level surge protection rules (Surge protection, Microsoft Learn, checked June 2026). But that feature only limits a workspace's background usage against an admin-set threshold. It does not give each workspace a guaranteed CU reservation, and capacity-level surge protection explicitly does not guarantee that interactive requests aren't delayed or rejected. The capacity-wide blast radius still applies.
The blast-radius mechanism: why your Power BI dashboard fails when a pipeline runs
Every workload on a capacity draws from one shared CU pool. There are no walls between workspaces. This is the throttling blast radius: one workload's smoothed debt throttles everyone on the capacity, including people who never touched the offending pipeline and whose own reports consume no meaningful CUs.
A concrete scenario: a metadata-driven ingestion pipeline fires 300 back-to-back copy activities. Fabric Data Factory charges copy activities based on run duration and the number of intelligent-optimization throughput resources allocated, at 1.5 CU-hours per resource per hour of execution (Data Factory pricing: pipelines, Microsoft Learn, checked June 2026); the pricing scenario docs confirm the formula as resources × 1.5 × duration_in_hours (Pricing scenario: load Parquet to data warehouse, Microsoft Learn, checked June 2026).
Critically, practitioners report that billing rounds up to the next whole minute, so a 14-second copy activity bills as a full 1-minute run (community-observed behavior, not explicitly stated in the Microsoft Learn pricing reference). If accurate, a 4-throughput-resource copy rounded to 1 minute costs 4 × 1.5 × (1/60) ≈ 0.10 CU-hours even for sub-minute actual work, and that effective minimum-billing floor multiplies across 300 activities with no batching discount.
The dangerous pattern is not one slow copy; it is hundreds of rapid-fire short copies, each contributing its own metered CU charge and each independently smoothed over the 24-hour background window. On an F32 capacity (PAYG $4,204.80/month, 32 CUs), if those stacked charges fill the per-timepoint budget faster than overnight idle capacity can burn them down, every workspace on the capacity enters interactive delay, then rejection, regardless of which workspace ran the pipeline.
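To put numbers on the scenario, here is a rough sketch of the copy-activity formula. It bakes in the community-reported round-up-to-the-minute behavior as a switchable assumption, and the `runs_per_day` schedule is purely hypothetical; none of the function names come from a Fabric API.

```python
import math

CU_HOURS_PER_RESOURCE_HOUR = 1.5  # rate quoted in the pricing docs


def copy_activity_cu_hours(throughput_resources: int, duration_seconds: float,
                           round_up_to_minute: bool = True) -> float:
    """resources x 1.5 x duration_in_hours, optionally billing whole minutes."""
    billed_seconds = duration_seconds
    if round_up_to_minute:  # assumption: community-observed, not documented
        billed_seconds = math.ceil(duration_seconds / 60) * 60
    return throughput_resources * CU_HOURS_PER_RESOURCE_HOUR * billed_seconds / 3600


def daily_allowance_share(cu_hours_per_run: float, runs_per_day: int, sku_cus: int) -> float:
    """Fraction of the SKU's 24-hour CU allowance used by a repeating pipeline."""
    return cu_hours_per_run * runs_per_day / (sku_cus * 24)


per_copy = copy_activity_cu_hours(4, 14)        # one 14-second copy, 4 resources
per_run = 300 * per_copy                        # one pass of the pipeline
once = daily_allowance_share(per_run, 1, 32)    # single pass on an F32
every_15_min = daily_allowance_share(per_run, 96, 32)  # hypothetical schedule
print(f"{per_copy:.2f} CU-hours per copy, {per_run:.0f} per pass")
print(f"one pass: {once:.1%} of F32 daily allowance; every 15 min: {every_15_min:.0%}")
```

Under these assumptions a single pass is about 30 CU-hours, roughly 3.9% of an F32's 768 CU-hour daily allowance, so the risk comes from how often the pattern repeats and what else shares the capacity. The hypothetical 15-minute schedule alone reaches about 375%.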
The F64 threshold ($8,409.60/month PAYG, $5,002.87/month reserved as of June 2026) is where Power BI report viewers no longer need Pro or PPU licenses. That makes F64 an attractive target for organizations consolidating BI and data engineering onto one capacity. It also concentrates the blast radius: a single runaway pipeline can now throttle the entire BI fleet. For a deep treatment of isolation strategies and multi-capacity topologies, see workload isolation and the Fabric blast radius.
How to read a live throttling event in the Capacity Metrics app
The Metrics app's Compute page > Throttling section shows three tabs:
- Interactive delay (10-minute %) — crosses 100% when the interactive delay throttle is active. Every new report click gets a 20-second artificial delay.
- Interactive rejection (60-minute %) — crosses 100% when reports and DAX queries are being rejected outright.
- Background rejection (24-hour %) — crosses 100% when all operations are blocked. A value of 250% means the capacity consumed 2.5× its daily allowance; at minimum 36 additional hours of quiet are needed to clear it.
The System events table on the same page shows a timestamped history of throttling state changes. The Overages tab shows the carryforward, add, and burndown curves over time.
What the app cannot show: which individual pipeline run (as distinct from which pipeline item) created the debt. Attribution in Fabric is item-level — you can see that Pipeline X was the source of CU debt at a given timepoint, and you can see the user who triggered the operation, but the OperationID is not linked to a specific pipeline run ID, so you cannot match the metric to a run in pipeline monitoring history (Metrics app timepoint item detail page, Microsoft Learn, checked June 2026). That attribution void is a separate wall from throttling itself.
The Compute page's Throttling charts hold 14 days of history. A preview Item History page, available since August 2025, extends item-level compute analysis to 30 days — so a three-week-old incident is recoverable natively if it falls within that window (Item history page preview, Microsoft Learn, checked June 2026). For incidents older than 30 days, or for continuous cross-capacity alerting that spans multiple retention windows, native tooling still has no answer. The monitoring gap this creates is covered in predicting Fabric throttling before it lands.
What to do when throttling is active
The Metrics app itself flags three self-service options for active throttling (Fabric throttling policy, Microsoft Learn, checked June 2026):
1. Temporarily scale up the SKU. This is the correct first move in most situations. A larger SKU has more CUs per 30-second timepoint, so each idle timepoint contributes more burndown. Moving from F32 to F64 doubles the idle CU floor per slot and roughly halves the time to clear the debt. The SKU can be scaled back down once the carry-forward clears. This costs real money for the duration, but the cost is predictable and the operation stays under admin control.
2. Stop submitting new operations. Every new workload adds smoothed CUs to future timepoints. A quiet period lets the existing carry-forward drain. In practice this is hard to enforce across a shared tenant, but temporarily pausing scheduled pipelines or moving refresh windows can help burndown catch up.
3. Fix the offending workload. The only permanent fix is eliminating or rescheduling the debt-creating pattern. High-frequency short copy activities, hot Dataflow Gen2 loops, and misconfigured Data Activator rules are the typical offenders — they generate disproportionate CU debt relative to their compute value. This is the only option that reduces baseline carry-forward accrual rate rather than just increasing the burndown speed.
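The effect of a temporary scale-up on burndown time can be estimated with simple division. This is a simplified sketch: it assumes the debt is known in CU-hours, that no new operations arrive, and that the whole capacity (or a stated idle fraction of it) is available for burndown. The helper name is illustrative.

```python
def hours_to_burn_down(debt_cu_hours: float, sku_cus: int, idle_fraction: float = 1.0) -> float:
    """Hours to clear carry-forward when idle_fraction of the capacity is free."""
    if not 0 < idle_fraction <= 1:
        raise ValueError("idle_fraction must be in (0, 1]")
    return debt_cu_hours / (sku_cus * idle_fraction)


# Debt equal to the excess over 100% when an F32 sits at 250% on the 24-hour chart.
excess_debt = 1.5 * 32 * 24  # 1,152 CU-hours
print("F32:", hours_to_burn_down(excess_debt, 32), "hours")
print("F64:", hours_to_burn_down(excess_debt, 64), "hours")
```

Lowering `idle_fraction` models a capacity that stays partly busy during recovery, which stretches both figures proportionally.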
The pause-and-clear trap. Pausing a capacity immediately ends throttling: it resets carry-forward to zero and the capacity restarts clean. Microsoft documents this as a self-service recovery mechanism (Fabric throttling policy, Microsoft Learn, checked June 2026). But the mechanism comes with a billing consequence: pausing settles all accumulated smoothed CU debt as a one-time billing event at full PAYG rates. On a reserved capacity, that charge is paid on top of the reservation. An organization running a reserved F64 ($5,002.87/month) that pauses with significant smoothed background debt can see a spike charge of thousands of dollars land on a single billing event. Community users have reported seeing capacity usage spike to millions of apparent CUs in the Metrics app at the moment of pause, as all the smoothed future debt collapses into one point. Treat this as an emergency option only, used when downtime is acceptable and you have confirmed the cost. The correct fix is to scale up or fix the workload.
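One way to confirm the cost before pausing is to price the outstanding carry-forward at the PAYG rate. The sketch below derives a per-CU-hour rate from the F64 monthly price quoted in this article, assuming a 730-hour billing month; the 10,000 CU-hour debt figure is a made-up input, and actual invoices may differ by region and billing detail.

```python
HOURS_PER_MONTH = 730        # assumption: standard monthly-hours convention
F64_PAYG_MONTHLY = 8409.60   # from this article, June 2026
F64_CUS = 64


def payg_rate_per_cu_hour(monthly_price: float, cus: int,
                          hours_per_month: int = HOURS_PER_MONTH) -> float:
    """Derive the PAYG price of one CU-hour from a monthly SKU price."""
    return monthly_price / (cus * hours_per_month)


def pause_settlement_cost(carry_forward_cu_hours: float, rate_per_cu_hour: float) -> float:
    """Estimated one-time charge if all carry-forward is settled at PAYG rates."""
    return carry_forward_cu_hours * rate_per_cu_hour


rate = payg_rate_per_cu_hour(F64_PAYG_MONTHLY, F64_CUS)
cost = pause_settlement_cost(10_000, rate)  # hypothetical debt
print(f"${rate:.2f} per CU-hour; settling 10,000 CU-hours costs about ${cost:,.2f}")
```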
Capacity overage (preview). A separate admin feature — Capacity Overage — allows the capacity to absorb overages beyond the SKU ceiling by billing for them at 3× the normal CU rate rather than throttling. This prevents throttling entirely at the cost of predictable but elevated billing. It's worth knowing about for mission-critical production environments that cannot tolerate any interactive delay, but it is not a substitute for right-sizing.
Sizing to reduce carry-forward accrual
The right-sizing objective is keeping the 24-hour background percentage below 100% with genuine headroom — not just fitting at 95%. Background smoothing is what determines whether debt burns down during quiet overnight hours or compounds forward. If overnight idle time isn't sufficient to clear each day's accumulated debt, carry-forward grows day over day until the capacity enters background rejection.
A practical heuristic from the throttling policy: one SKU tier up on the doubling ladder doubles the per-timepoint idle CU budget and roughly doubles the burndown rate. If an F32 regularly shows 80–90% on its 24-hour background chart, moving to F64 (doubling CUs from 32 to 64) adds enough idle headroom per timepoint to let overnight burndown clear the debt. The estimated PAYG cost difference is $8,409.60 − $4,204.80 = $4,204.80/month. That is an estimate scoped to PAYG rates as of June 2026; reserved pricing applies the 0.5949 factor.
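The cost side of that trade-off, and a first-order view of the headroom it buys, can be sketched as follows. Prices and the 0.5949 reserved factor are the figures quoted in this article; the projected percentage assumes the workload stays exactly the same after the move, which real workloads rarely do.

```python
RESERVED_FACTOR = 0.5949
PAYG_MONTHLY = {"F32": 4204.80, "F64": 8409.60}  # June 2026 figures from this article
SKU_CUS = {"F32": 32, "F64": 64}


def upgrade_cost_delta(from_sku: str, to_sku: str, reserved: bool = False) -> float:
    """Monthly cost difference of moving between SKUs, PAYG or reserved."""
    delta = PAYG_MONTHLY[to_sku] - PAYG_MONTHLY[from_sku]
    if reserved:
        delta *= RESERVED_FACTOR
    return round(delta, 2)


def projected_background_pct(current_pct: float, from_sku: str, to_sku: str) -> float:
    """Same CU consumption measured against a larger daily allowance."""
    return current_pct * SKU_CUS[from_sku] / SKU_CUS[to_sku]


print("PAYG delta:", upgrade_cost_delta("F32", "F64"))
print("Reserved delta:", upgrade_cost_delta("F32", "F64", reserved=True))
print("85% on F32 becomes", projected_background_pct(85, "F32", "F64"), "% on F64")
```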
For a decision framework on when one capacity versus multiple smaller capacities makes financial sense — including the blast-radius trade-off — see workload isolation and the Fabric blast radius. For the forecasting angle of throttling — reading carry-forward trends to predict the next incident before it happens — see predicting Fabric throttling.
What to do next
- Open the Capacity Metrics app Throttling charts. Check all three tabs: 10-minute interactive %, 60-minute interactive %, and 24-hour background %. If any exceeds 80%, burndown headroom is thin.
- Read the System events table. Look for the timestamps of recent Interactive Delay or Background Rejection events — that's your incident history.
- Identify the top item on the Overages tab. That's the workload generating the most smoothed debt. Whether it's a pipeline pattern, a hot Dataflow Gen2, or a misconfigured refresh, that item is the correct target.
- Scale the SKU temporarily, not permanently, if you're in active rejection. Double the CU count, let it burn down, scale back. Quantify the cost before doing it.
- Never pause to clear throttling on a reserved capacity without confirming the billing impact first. It ends the throttle, but it converts all carry-forward debt to an immediate PAYG-rate charge on top of your reservation.
- Set an alert at 80% on the 10-minute interactive chart — not 100%. At 80% you still have time to act; at 100% users are already experiencing the 20-second delay.
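The threshold checks in this list could be automated along these lines. This is a hypothetical sketch: it assumes you have already exported the three chart percentages from the Capacity Metrics app by some means of your own, and the function is not part of any Fabric API.

```python
WARN_PCT = 80.0  # the early-warning level suggested above


def headroom_warnings(interactive_10min_pct: float, interactive_60min_pct: float,
                      background_24h_pct: float, warn_at: float = WARN_PCT) -> list[str]:
    """Return a message for each chart that is throttling or running thin."""
    readings = {
        "10-minute interactive delay": interactive_10min_pct,
        "60-minute interactive rejection": interactive_60min_pct,
        "24-hour background rejection": background_24h_pct,
    }
    warnings = []
    for name, pct in readings.items():
        if pct >= 100:
            warnings.append(f"{name}: throttling active ({pct:.0f}%)")
        elif pct > warn_at:
            warnings.append(f"{name}: headroom thin ({pct:.0f}%)")
    return warnings


for message in headroom_warnings(85, 40, 120):
    print(message)
```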
The core problem this article addresses is the throttling blast radius: one workload's debt throttles every workspace on the capacity because there is no native per-workspace CU isolation, only a preview workspace-level surge protection feature that limits background spending per workspace rather than guaranteeing each workspace a CU slice. SpendWeave tracks throttling stages and carry-forward debt continuously, flagging the moment the 10-minute interactive window fills rather than waiting for users to report errors.
Frequently asked questions
What triggers throttling in Microsoft Fabric? Throttling is triggered when carry-forward CU debt crosses future-capacity time-window thresholds: overage protection covers up to 10 minutes; interactive delay (20-second added latency) from 10 to 60 minutes; interactive rejection from 60 minutes to 24 hours; background rejection beyond 24 hours (Fabric throttling policy, Microsoft Learn, checked June 2026). These are future-capacity windows, not utilization percentages.
Does Fabric throttle the whole tenant or just one workspace? Throttling applies at the capacity level — every workspace assigned to that capacity is affected. Workspace-level surge protection (preview, January 2026) can limit how much background CU a single workspace consumes, but it does not give workspaces a guaranteed CU reservation, and it does not prevent capacity-level interactive delays or rejections.
How long does Fabric throttling last? Until idle capacity burns carry-forward CUs to zero. A background rejection at 250% means the capacity used 2.5× its daily CU allowance; the minimum recovery time with no new operations is ((250 − 100) / 100) × 24 hours = 36 hours. A temporary SKU scale-up is the fastest legitimate remedy.
What does a 250% background rejection mean in Fabric? It means the capacity consumed 2.5 times its full 24-hour daily CU allowance. All requests are rejected. Minimum recovery time is 36 hours with no new operations. It does not mean instantaneous utilization is at 250%; it means 2.5 days' worth of future capacity has been pre-spent (Metrics app calculations, Microsoft Learn, checked June 2026).
Can pausing a Fabric capacity clear throttling? Yes. It resets carry-forward to zero immediately. But it also triggers a one-time billing event for all accumulated smoothed debt at full PAYG rates, paid on top of any reservation. Treat it as an emergency-only option and confirm the dollar amount before using it; the correct fix is a temporary SKU scale-up or eliminating the debt-creating workload.
Researched with AI assistance, written and fact-checked by Jonathan Flach, verified against Microsoft Learn.