Smoothing, Bursting & Carry-Forward Debt: The Physics of Fabric Compute

Q: What is smoothing in Microsoft Fabric?

Smoothing is the mechanism that spreads a workload's CU consumption across future timepoints rather than charging it all at the moment of execution. Interactive operations are smoothed over a minimum of 5 minutes and up to 64 minutes, depending on how many CU-seconds the operation consumed; background operations are smoothed over 24 hours across 2,880 timepoints. A 30-second interval called a timepoint is the smallest unit of smoothing — there are 2,880 of them in a 24-hour window. Smoothing is what lets a small capacity run a heavy pipeline without instantly exceeding its ceiling.

Q: What triggers throttling in Microsoft Fabric?

Throttling is triggered when the carry-forward CU debt — accumulated overage after smoothing — crosses time-window thresholds. Up to 10 minutes of future capacity consumed: no throttling (overage protection). 10–60 minutes: a 20-second delay is applied to all new interactive requests. 60 minutes–24 hours: interactive requests are rejected outright. Over 24 hours: all requests, including background pipelines, are rejected. Throttling lifts when idle capacity burns down the carryforward debt.

Q: How long does Fabric throttling last?

Throttling lasts until idle capacity burns down the accumulated carry-forward CUs to zero. Because background smoothing spreads CUs over 2,880 timepoints in 24 hours, burndown can be slow if new operations keep adding to the debt. The fastest remedies are: temporarily scale up the SKU (each timepoint has more idle capacity to burn debt), or fix the offending workload. Pausing the capacity does immediately end throttling (Microsoft's documented self-service mechanism), but it triggers a one-time billing event for all accumulated smoothed debt at PAYG rates — on a reserved capacity that charge is paid on top of the reservation. Use pause as an emergency option only when downtime is acceptable and the billing hit is understood.

Q: What is carry-forward CU debt in Fabric?

Carry-forward CUs are the accumulated overages that exceed the 10-minute overage-protection window. Once overage runs past 10 minutes of future capacity, it becomes carryforward that is applied against each subsequent timepoint. If a timepoint has idle CUs, those idle CUs reduce (burn down) the carryforward. If there are no idle CUs, the carryforward grows and throttling escalates through the staged windows.

A background pipeline that bursts to 4× the F64's baseline for 15 minutes doesn't spike your utilization chart in real time. It quietly distributes ~$11.52 of smoothed CU cost (256 CUs × 0.25 hr × $0.18 per CU-hour) across the next 2,880 timepoints (the 24-hour background window), each timepoint 30 seconds long. That works out to 80 CU-seconds per timepoint, about 4.2% of the F64's 1,920 CU-second budget per timepoint. Most of the time that cost burns down harmlessly overnight because each timepoint still has idle headroom. But stack enough of those pipelines on top of an already busy capacity and the math changes: every timepoint's budget fills, carry-forward CUs accumulate faster than idle capacity can absorb them, and only then does that smoothed cost become genuine carry-forward debt. The 10-minute overage-protection window fills, and the throttle cascade begins: first a 20-second delay on every report click, then full interactive rejection, then a blocked pipeline queue. Understanding the physics behind that cascade is the prerequisite for everything in the Microsoft Fabric cost-reduction playbook.

The building block: a 30-second timepoint

Fabric doesn't measure capacity usage in minutes or hours in real time. It divides time into 30-second intervals called timepoints — 2,880 of them per 24-hour window (Fabric throttling policy, Microsoft Learn, checked June 2026). Every CU consumed by every operation gets attributed to one or more of those timepoints. The timepoint is the atomic unit: throttling, smoothing, and carryforward are all computed at timepoint granularity.

Why does that matter? Because it means the smoothing math is discrete, not continuous. When Fabric smooths a background job, it literally divides the job's total CU-seconds by 2,880 and places that fraction into each of the next 2,880 slots. The job's CU contribution per timepoint is tiny — but it's there in each slot, and it adds to every other job's contribution in that same slot.
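A minimal Python sketch of that discrete placement, assuming a plain list stands in for the timeline of 30-second slots. The function name `place_background_job` and the job sizes are hypothetical, chosen only to show how overlapping jobs add up in a shared slot; this is an illustration of the arithmetic, not Fabric's implementation.

```python
BACKGROUND_TIMEPOINTS = 2_880  # 24 hours of 30-second timepoints


def place_background_job(slots, start, total_cu_seconds):
    """Add an equal share of a job's CU-seconds to each of the next 2,880 slots.

    `slots` is a list of per-timepoint CU-second totals (a hypothetical model).
    Returns the per-slot share that was added.
    """
    share = total_cu_seconds / BACKGROUND_TIMEPOINTS
    end = min(start + BACKGROUND_TIMEPOINTS, len(slots))
    for i in range(start, end):
        slots[i] += share
    return share


# Two days of empty timepoints, then two illustrative jobs.
slots = [0.0] * (2 * BACKGROUND_TIMEPOINTS)
place_background_job(slots, start=0, total_cu_seconds=3_600)      # 1 CU-hour at slot 0
place_background_job(slots, start=1_440, total_cu_seconds=7_200)  # 2 CU-hours, 12 h later

print(slots[0])      # only the first job contributes: 1.25
print(slots[1_440])  # both jobs overlap: 3.75
print(slots[2_880])  # the first job has aged out: 2.5
```

Each job is small in any one slot, but the slot total is the sum over every job whose 24-hour window covers it.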

Bursting: borrowing from the future

Fabric is deliberately designed to let operations run faster than the SKU's baseline ceiling. This is bursting: when a Spark notebook, warehouse query, or pipeline copy activity starts, the platform allocates extra CUs above the nominal SKU limit to complete it quickly. Those extra CUs are real compute — they aren't hypothetical. They are borrowed from future capacity, meaning they are debited against the CU allowance that the capacity would otherwise accumulate in upcoming timepoints (Fabric throttling policy, Microsoft Learn, checked June 2026).

Bursting is purely a performance feature. It doesn't appear as a separate billing line. What it does is concentrate CU consumption into fewer timepoints. A job that would have drawn 60 CU-seconds per timepoint at exactly baseline speed might draw 180 CU-seconds per timepoint at 3× speed: three times as much usage in each timepoint it runs in, with the excess smoothed away across future timepoints.

Smoothing: spreading the debt forward

Smoothing is the mechanism that makes bursting sustainable. Rather than charging the full CU cost of a burst operation to the timepoint where it ran, Fabric distributes the cost across future timepoints. How far the cost spreads depends on the operation classification:

Operation type	Smoothing window	Why
Interactive (report renders, DAX queries, XMLA)	Minimum 5 minutes, up to 64 minutes depending on CU usage	Short operations, but users notice delays — smooth just enough to absorb spikes
Background (pipelines, Spark jobs, semantic-model refreshes, Copilot)	24 hours (2,880 timepoints)	Long runtimes, large CU totals — spread across the day's idle capacity

Source: Fabric throttling policy, Microsoft Learn (checked June 2026).

The worked example in the Microsoft docs makes this concrete: a 1 CU-hour background job on an F2 contributes just 1.25 CU-seconds per 30-second timepoint (3,600 CU-seconds ÷ 2,880), which is about 2.1% of the F2's 60 CU-second per-timepoint ceiling. Even though the job consumed 3× what the F2 can deliver in a 10-minute window (2 CUs × 600 s = 1,200 CU-seconds), smoothing prevents it from triggering throttling because only a tiny slice hits each slot.
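The same arithmetic as a short Python sketch. The helper name `timepoint_share` is hypothetical; the only inputs are the job size in CU-hours and the SKU's CU count, and the constants come from the 30-second timepoint and 24-hour background window described above.

```python
TIMEPOINT_SECONDS = 30
BACKGROUND_TIMEPOINTS = 2_880  # 24 h / 30 s


def timepoint_share(job_cu_hours, sku_cus):
    """Return (CU-seconds added per timepoint, fraction of the per-timepoint budget)."""
    per_timepoint = job_cu_hours * 3_600 / BACKGROUND_TIMEPOINTS
    budget = sku_cus * TIMEPOINT_SECONDS  # CU-seconds the SKU provides per timepoint
    return per_timepoint, per_timepoint / budget


per_tp, fraction = timepoint_share(job_cu_hours=1, sku_cus=2)  # 1 CU-hour on an F2
print(f"{per_tp} CU-seconds per timepoint, {fraction:.1%} of budget")
# prints: 1.25 CU-seconds per timepoint, 2.1% of budget
```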

That's the design intent. The problem arises when multiple large jobs run simultaneously — each individually harmless after smoothing, but collectively filling every timepoint's budget.

Carry-forward debt: the invisible accumulator

When a timepoint's allocated CUs are fully consumed — when the sum of all smoothed contributions in that slot reaches or exceeds the SKU's per-timepoint limit — an overage is computed. That overage doesn't vanish; it becomes carry-forward CUs (Fabric throttling policy, Microsoft Learn, checked June 2026).

Carry-forward CUs roll into the next timepoint, and the next, and the next. If subsequent timepoints have idle headroom, that idle capacity burns down the carryforward — the CU debt shrinks. If subsequent timepoints are also full (because new jobs keep running), the carryforward grows and the outstanding future-capacity consumed keeps increasing.
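A simplified Python sketch of that accumulator, assuming every CU-second above a timepoint's budget is carried forward and every idle CU-second burns debt down one for one. It deliberately ignores the overage-protection nuance and is not Fabric's actual algorithm; `simulate_carryforward` and the usage figures are hypothetical.

```python
def simulate_carryforward(usage_per_timepoint, budget_per_timepoint):
    """Track carry-forward CU-seconds across consecutive timepoints.

    usage_per_timepoint: smoothed CU-seconds landing in each timepoint.
    budget_per_timepoint: CU-seconds the SKU provides per timepoint.
    """
    carry = 0.0
    history = []
    for used in usage_per_timepoint:
        # A positive (used - budget) is overage; a negative value is idle headroom.
        carry = max(0.0, carry + used - budget_per_timepoint)
        history.append(carry)
    return history


f64_budget = 64 * 30  # 1,920 CU-seconds per timepoint
usage = [2_400, 2_400, 1_440, 1_440, 1_440]  # two full slots, then quieter ones
print(simulate_carryforward(usage, f64_budget))
# prints: [480.0, 960.0, 480.0, 0.0, 0.0]
```

The debt keeps shrinking only while usage stays under budget; replace the quiet slots with full ones and the list never returns to zero.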

This is the debt accrual mechanism that drives throttling. It's invisible in the Azure portal and in most monitoring views because it accumulates at sub-minute resolution. You see the result — throttling events — but the debt buildup happened 5 to 20 minutes earlier.

The debt-accrual timeline: how throttling stages trigger

The following original timeline shows how carry-forward debt grows from a single burst event through to full background rejection. It assumes an F64 that runs a heavy batch at 9:00 AM, then moderate interactive traffic through the morning. All timestamps are illustrative estimates based on the Microsoft throttling thresholds.

TIME      FUTURE CAPACITY CONSUMED   THROTTLE STATE
────────  ─────────────────────────  ─────────────────────────────────────────
09:00     0 min (clean)              None — capacity at baseline
09:02     +2 min (burst begins)      None — inside overage-protection window
09:05     +6 min (pipeline running)  None — still inside 10-min window
09:10     +10 min (window fills)     ⚠ INTERACTIVE DELAY — 20s added to every
                                     new report click, DAX query, XMLA call
09:25     +18 min (debt growing)     ⚠ Interactive delay continues; users notice
                                     slow slicers and dashboard loads
10:00     +50 min (still accruing)   ⚠ Approaching interactive rejection threshold
10:10     +60 min (threshold hit)    ✖ INTERACTIVE REJECTION — reports fail to
                                     load; users see CapacityLimitExceeded error
10:10 to  Debt grows if new ops run  ✖ Interactive requests blocked; background
~09:00    above idle burndown rate   jobs (pipelines, refreshes) still run
next day
Tomorrow  +24 h threshold hit        ✖✖ BACKGROUND REJECTION — all requests
09:00     (if debt never burned)     blocked, including scheduled pipelines

The critical insight from this timeline: the throttle doesn't start at 09:10 because utilization hit 100% at 09:10. It starts because 10 minutes of future CU budget was consumed by 09:10 — the debt is in the future, not the present. The capacity's actual CPU could be at 40% right now while it's busy throttling, because the outstanding debt is measured in future timepoints, not current load.

This is why "interactive delay at 10 minutes" and "interactive rejection at 60 minutes" are future-capacity time windows, not utilization percentages. The Capacity Metrics app's Throttling chart measures "minutes of future capacity consumed," which is exactly this debt.
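The stage boundaries can be sketched as a small Python lookup. The function name `throttle_stage` is hypothetical, and treating a value exactly on a threshold as the next stage follows the timeline above; how the service handles the exact boundary is an assumption here.

```python
def throttle_stage(future_minutes):
    """Map minutes of future capacity consumed to a throttle stage."""
    if future_minutes >= 24 * 60:
        return "background rejection"
    if future_minutes >= 60:
        return "interactive rejection"
    if future_minutes >= 10:
        return "interactive delay"
    return "overage protection"


for minutes in (6, 10, 18, 60, 1_440):
    print(minutes, throttle_stage(minutes))
```

Note that the input is debt expressed in future minutes, not a utilization percentage, which is the point of the paragraph above.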

What the Capacity Metrics app shows (and doesn't)

The Metrics app's Compute page exposes three debt gauges on its Throttling chart: 10-minute interactive %, 60-minute interactive %, and 24-hour background %. Each measures cumulative carry-forward as a fraction of that window's total capacity allowance. 100% on the 10-minute chart means the interactive-delay throttle is now active.
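As a rough Python sketch of what those three percentages express, assuming each gauge is simply carry-forward CU-seconds divided by the CU-seconds the SKU provides over that window. The function `throttle_gauges` and the sample debt figure are hypothetical, and the Metrics app's internal calculation may differ in detail.

```python
WINDOW_SECONDS = {
    "10-minute interactive %": 10 * 60,
    "60-minute interactive %": 60 * 60,
    "24-hour background %": 24 * 60 * 60,
}


def throttle_gauges(carry_cu_seconds, sku_cus):
    """Express carry-forward debt as a percentage of each window's CU allowance."""
    return {
        name: 100 * carry_cu_seconds / (sku_cus * seconds)
        for name, seconds in WINDOW_SECONDS.items()
    }


# Hypothetical F64 carrying 76,800 CU-seconds of debt (20 minutes of future capacity).
gauges = throttle_gauges(carry_cu_seconds=76_800, sku_cus=64)
for name, pct in gauges.items():
    print(f"{name}: {pct:.1f}")
```

In this example the 10-minute gauge is well past 100 while the 60-minute gauge sits near a third, which corresponds to the interactive-delay stage.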

What you cannot see in the Metrics app without drilling down to the Overages tab:

Which specific operation created the debt (item-level attribution, not run-level).
How much of the current carryforward will burn down versus accrue — that depends on whether quiet periods follow.
The per-workspace contribution: there is no native per-workspace CU reservation or isolation (a workspace-level surge protection feature shipped in preview in January 2026, but it throttles or limits a workspace's background usage against an admin-set threshold; it does not give each workspace a guaranteed CU slice). The capacity-wide blast radius still applies.

The Metrics app keeps 14 days of 30-second compute detail on its Compute page; the preview Item History page adds 30 days of daily, item-level compute trends (storage and workspace monitoring are 30 days too). So a throttling incident from three weeks ago still shows up as a daily Item History trend, but the sub-minute timepoint detail you need to actually diagnose it is gone after 14 days, and anything older than 30 days is no longer available natively at all. That retention ceiling is why correlating a throttle to a workload change weeks later means extracting the data to your own store.

For how to read the 14-day window and what to do about the retention wall, see Microsoft Fabric capacity monitoring.

The blast-radius enemy

This is where the throttling blast radius comes from: one capacity, one shared CU pool, one debt accumulator. A single greedy pipeline (a metadata-driven pattern that fires 250 short copy activities back-to-back, for instance) can fill the 10-minute interactive window and block every user's Power BI report across the entire capacity, regardless of which workspace those reports live in. Users who never touched the pipeline get the 20-second delay. If the pipeline keeps running, they graduate to full rejection.

The right size to avoid this is the SKU at which your 24-hour smoothed background usage stays below 100% with real headroom, not the SKU that barely fits at 95%. Background smoothing is what determines whether debt burns down during quiet hours or compounds overnight. The right-sizing mechanics — and how to read the Capacity Metrics app to find that number — are in right-sizing your Fabric capacity.

How throttling lifts (and what makes it faster)

Throttling lifts when idle capacity burns the carryforward to zero. Each timepoint that runs below the SKU's ceiling contributes idle CUs to burndown. The rate of burndown depends on how much idle headroom exists per timepoint.

Three approaches accelerate burndown:

Temporarily scale up the SKU. A larger SKU has more CUs per 30-second timepoint, so more idle capacity is available each slot for burndown. If you're at F32 and throttled, scaling to F64 doubles the CUs per timepoint. On an otherwise idle capacity that halves the burndown time, and under steady load the improvement is larger because all of the added capacity is idle headroom.
Stop submitting new operations. Every new workload adds to the smoothed debt in future timepoints. A quiet period lets the existing carryforward drain without new additions.
Fix the offending workload. If the debt accrued from a pathological job pattern (a pipeline run that fires 250 short copy activities back to back, a hot Dataflow Gen2 loop, a misconfigured Data Activator alert), eliminating the source stops debt accrual immediately.
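A back-of-the-envelope Python sketch of how these levers interact, assuming burndown equals idle CU-seconds per timepoint and that ongoing load is constant. The function `burndown_timepoints` and the debt and load figures are hypothetical; real burndown depends on the actual usage in every timepoint.

```python
import math


def burndown_timepoints(carry_cu_seconds, sku_cus, load_cu_seconds_per_timepoint=0.0):
    """Estimate how many 30-second timepoints are needed to clear carry-forward debt.

    Returns None when ongoing load leaves no idle headroom, so debt never clears.
    """
    idle = sku_cus * 30 - load_cu_seconds_per_timepoint
    if idle <= 0:
        return None
    return math.ceil(carry_cu_seconds / idle)


debt = 96_000  # hypothetical carry-forward, in CU-seconds

print(burndown_timepoints(debt, sku_cus=32))  # idle F32: 100 timepoints (50 min)
print(burndown_timepoints(debt, sku_cus=64))  # idle F64: 50 timepoints (25 min)
print(burndown_timepoints(debt, sku_cus=32, load_cu_seconds_per_timepoint=1_000))  # None
print(burndown_timepoints(debt, sku_cus=64, load_cu_seconds_per_timepoint=1_000))  # 105
```

The third case shows why waiting alone can fail: if new work keeps filling every timepoint, there is no idle capacity left to pay the debt down.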

What does not reliably clear throttling:

Waiting without changing anything — works eventually if no new operations run, but can take hours if background smoothing is spreading a large job's debt across the next 24 hours.
Pausing the capacity without understanding the billing hit — pausing does immediately end throttling and is Microsoft's documented self-service recovery mechanism (Fabric pause and resume, Microsoft Learn, checked June 2026). But it also settles all accumulated smoothed debt as a one-time billing event at full PAYG rates (Fabric throttling policy, Microsoft Learn, checked June 2026). On a reserved capacity, that charge is paid on top of the reservation. Reserve this option for genuine emergencies where brief downtime is acceptable and you have confirmed the billing impact. The full trap mechanics and the reserved-capacity worst case are in the Fabric pause-resume trap.

A worked number: how much debt can a hot pipeline create?

Consider an F16 capacity (16 CUs baseline, PAYG $2,102.40/month as of June 2026). A poorly written metadata-driven pipeline runs 300 copy activities back to back, each sub-minute. Fabric Data Factory's intelligent throughput optimization (ITO) ranges from 4 to 256 units, and data movement is billed at 1.5 CU-hours per ITO-unit-hour (Data Factory pricing — pipelines, Microsoft Learn, checked June 2026). At the minimum of 4 ITO units, each 20-second copy activity billed for its actual duration consumes 4 ITO × 1.5 CU-hr/ITO-hr × (20 s ÷ 3,600 s/hr) × 3,600 s/hr = 120 CU-seconds per copy (a conservative floor that assumes the duration is not rounded up).

300 such copies = 36,000 CU-seconds of compute. Because DataMovement operations are classified as background, those CU-seconds are smoothed over 24 hours — across all 2,880 timepoints (Fabric throttling policy, Microsoft Learn, checked June 2026). That yields 36,000 ÷ 2,880 ≈ 12.5 CU-seconds per timepoint. The F16 provides 16 CUs × 30 seconds = 480 CU-seconds per timepoint, so this single pipeline contributes roughly 2.6% of each timepoint's budget — well below the 100% threshold needed to generate carry-forward debt. One run of this pipeline, after smoothing, does not cause throttling on an F16.

The scenario where throttling appears is stacking. To fill every F16 timepoint (480 CU-seconds each) with 12.5 CU-seconds-per-run contributions, you need roughly 39 runs of the same pipeline whose 24-hour smoothing windows overlap (480 ÷ 12.5 = 38.4), or fewer if other workloads already occupy part of each timepoint. At that point every timepoint is consumed, carry-forward debt accumulates faster than idle CUs can absorb it, the 10-minute overage-protection window fills, and the cascade of interactive delay, then rejection, follows. That's the blast-radius problem: not one rogue pipeline, but a scheduling pattern that packs many pipeline runs into the same smoothing window. This is an estimate, scoped to the described load profile and the rates above; your actual figure comes from your own Capacity Metrics data.
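The worked number above can be reproduced with a few lines of Python. This is a sketch under the same stated assumptions (4 ITO units, 1.5 CU-hours per ITO-hour, 20-second copies billed at actual duration, 24-hour background smoothing); the function name `pipeline_debt` and its parameter names are hypothetical.

```python
import math


def pipeline_debt(copies, copy_seconds, ito_units, cu_hours_per_ito_hour, sku_cus):
    """Estimate the smoothed per-timepoint footprint of one pipeline run."""
    # CU-hr per ITO-hr times ITO units times seconds gives CU-seconds (hours cancel).
    cu_seconds_per_copy = ito_units * cu_hours_per_ito_hour * copy_seconds
    run_cu_seconds = copies * cu_seconds_per_copy
    per_timepoint = run_cu_seconds / 2_880  # background smoothing over 24 hours
    budget = sku_cus * 30                   # CU-seconds per 30-second timepoint
    return {
        "cu_seconds_per_copy": cu_seconds_per_copy,
        "run_cu_seconds": run_cu_seconds,
        "per_timepoint": per_timepoint,
        "budget_share": per_timepoint / budget,
        "runs_to_fill": math.ceil(budget / per_timepoint),
    }


result = pipeline_debt(
    copies=300, copy_seconds=20, ito_units=4, cu_hours_per_ito_hour=1.5, sku_cus=16
)
for key, value in result.items():
    print(key, value)
```

Changing `copy_seconds` to 60 is a quick way to see how much the estimate moves if short activities were billed for a longer duration than they actually ran.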

What to do

Read the Throttling chart before right-sizing. The 10-minute interactive % and 24-hour background % metrics are the ground truth. Peak utilization % alone doesn't tell you whether throttling is occurring or imminent.
Size to 24-hour smoothed background under 100%. Not peak CU, not average CU — smoothed background. If that metric regularly hits 90%+, move up a SKU. One tier on the doubling ladder doubles your burndown rate.
Identify debt-creating patterns. Short high-frequency copy activities, hot Dataflow Gen2 items, unmanaged Data Activator loops — these generate disproportionate debt relative to their business value. Fix the pattern, not the SKU.
Pause only as a last resort — and know the billing hit first. Pausing immediately ends throttling (Microsoft's documented self-service mechanism), but it converts all accumulated carry-forward debt into an immediate bill at PAYG rates. On a reserved capacity that charge is paid on top of the reservation. Only pause if downtime is acceptable and you have confirmed the cost; otherwise scale up or fix the workload instead.
Set a carryforward alert. The Capacity Metrics app lets admins set email alerts at 100% utilization. Set a secondary alert at 80% on the 10-minute interactive chart so you have lead time before the interactive delay kicks in.

Frequently asked questions

What is smoothing in Microsoft Fabric? Smoothing spreads a workload's CU consumption across future 30-second timepoints rather than charging it all at once. Interactive operations smooth over a minimum of 5 minutes and up to 64 minutes, depending on how many CU-seconds the operation consumed; background operations smooth over 24 hours across 2,880 timepoints. Smoothing is what lets a small capacity run a heavy pipeline without instantly exceeding its ceiling.

What is bursting in Microsoft Fabric? Bursting lets a Fabric operation temporarily use more CUs than the capacity's baseline SKU provides, completing jobs quickly by borrowing from future capacity. Smoothing then distributes that borrowed cost across future timepoints.

What triggers throttling in Microsoft Fabric? Throttling triggers when carry-forward CU debt crosses future-capacity time-window thresholds: overage protection covers up to 10 minutes; interactive delay (20-second added latency) from 10 to 60 minutes; interactive rejection from 60 minutes to 24 hours; background rejection beyond 24 hours (Fabric throttling policy, Microsoft Learn, checked June 2026).

How long does Fabric throttling last? Until idle capacity burns carry-forward CUs to zero. The fastest fix is a temporary SKU increase, which raises the idle CU floor per timepoint and accelerates burndown. Pausing does immediately end throttling (Microsoft's documented self-service option), but triggers a one-time billing event for all accumulated smoothed debt at PAYG rates — confirm the cost before using it as a recovery mechanism.

What is carry-forward CU debt in Fabric? Carry-forward CUs are overages that exceed the 10-minute overage-protection window. They roll into each subsequent timepoint and are reduced by idle capacity (burndown) or grow if new operations keep adding smoothed debt.

Researched with AI assistance, written and fact-checked by Jonathan Flach, verified against Microsoft Learn.