Fabric Capacity Sizing Guide: Pick the Right F-SKU First Time

Q: What happens if I under-size a Fabric capacity?

Throttling kicks in through three escalating stages keyed to future-capacity time windows: interactive delay (a 20-second throttle added to interactive requests) after 10 minutes of smoothed overage, interactive rejection after 60 minutes, and background rejection after 24 hours. Under-sizing doesn't crash the platform immediately, but sustained overage blocks all new requests — including background pipelines — until the smoothed debt clears.

Q: How much headroom should I leave in a Fabric capacity?

Keep your 24-hour smoothed background usage under 100% with at least 10–20% room for spikes. For interactive workloads, aim to keep the 10-min interactive % metric in the Capacity Metrics app below 80% during business hours — that leaves burst room before the 20-second throttle triggers. If you regularly hit 90%+ smoothed background, move up a SKU; the doubling ladder means one tier up doubles your headroom and doubles your cost, so it is a clean trade-off to evaluate.

An F64 capacity gives you 64 capacity units (CUs) of baseline compute and costs $8,409.60/month on pay-as-you-go (64 × $0.18 × 730 h, as of June 2026). Pick one SKU too small and throttling blocks your users' reports; pick one too large and you burn double the budget for headroom you never use. The good news: Fabric's bursting and smoothing mechanics mean the right size is almost never your peak CPU. It is the smallest SKU whose 24-hour smoothed background usage stays under 100% with room for interactive spikes. This guide gives you the heuristic table to pick a starting point by workload profile, the smoothing mechanics that make it work, and the one case where the F64 viewer-licensing threshold overrides the load math entirely. For the full F-SKU price table and billing model comparison, see the Microsoft Fabric pricing and capacity planning guide.

Sizing is a smoothing question, not a peak-CPU question

The single most common sizing mistake is buying for the peak. A team sees a 15-minute Spark job spike the metrics app to 4× their baseline and concludes they need four times the capacity. They don't — they need to understand what smoothing actually does.

Fabric uses two mechanisms to absorb short bursts above your SKU's baseline:

Bursting — When a workload starts, Fabric can allocate more CUs than your SKU provides at that moment, completing the job faster. Burst consumption is tracked in 30-second intervals (Understand your Fabric capacity throttling, Microsoft Learn, checked June 2026).

Smoothing — That burst cost is then averaged over a window rather than charged as a spike:

Interactive operations (report renders, DAX queries, XMLA): smoothed over a minimum of 5 minutes and up to 64 minutes depending on how much CU usage they consume. (The "10-min interactive %" metric in the Capacity Metrics app is the trigger threshold for the Interactive Delay throttle — it is not the smoothing window itself.)
Background operations (pipelines, Spark jobs, semantic-model refreshes, Dataflows Gen2): smoothed over a rolling 24-hour window (Understand your Fabric capacity throttling, Microsoft Learn, checked June 2026).

The 24-hour background window is why a modest F8 or F16 can absorb a heavy nightly pipeline: the capacity "borrows" compute from quiet daytime hours and repays the debt across the rest of the day. If that debt stays under 100% of the SKU's daily CU budget, no throttling fires — even if the job briefly used 4× the baseline CU rate during its 15-minute run.
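
The arithmetic behind that claim can be sketched in a few lines. This is a simplified illustration, not Fabric's actual accounting: it assumes the job runs at a constant multiple of the SKU's baseline rate and that it is the only background work in the window. The function name and parameters are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def smoothed_background_share(sku_cus: int, burst_multiple: float, run_minutes: float) -> float:
    """Fraction of the SKU's 24-hour CU budget that one background job consumes.

    Assumption: the job draws sku_cus * burst_multiple CUs for its whole run.
    """
    job_cu_seconds = sku_cus * burst_multiple * run_minutes * 60
    daily_budget_cu_seconds = sku_cus * SECONDS_PER_DAY
    return job_cu_seconds / daily_budget_cu_seconds


# The example from the text: an F8 running a 15-minute job at 4x baseline.
share = smoothed_background_share(sku_cus=8, burst_multiple=4, run_minutes=15)
print(f"{share:.2%} of the daily budget")  # 4.17% of the daily budget
```

A spike that looks like 400% on a per-minute view costs about 4% of the day's budget once it is spread over 24 hours, which is why the peak alone is a poor sizing signal.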

Throttling only starts when smoothed usage exceeds the SKU's allowance, and it escalates in stages keyed to future-capacity time windows (Metrics app calculations, Microsoft Learn, checked June 2026):

Stage	Window crossed	What happens
Overage protection	Up to 10 min future usage	Burst absorbed silently; no user impact
Interactive delay	10 min smoothed	20-second throttle added to interactive requests
Interactive rejection	60 min smoothed	New interactive requests rejected; users see errors
Background rejection	24 h smoothed	All requests rejected, including background jobs

The percentages shown in the Capacity Metrics app measure future capacity consumed, not CPU utilization. A 150% reading on the 10-minute interactive chart means the capacity has already consumed 1.5× the CUs available in the next 10 minutes, not that a CPU ran at 150%.
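
As a rough mental model, the stage table can be read as a lookup on how many minutes of future capacity have already been consumed. The sketch below assumes a single "minutes consumed" number and inclusive upper bounds at each threshold; the real service tracks interactive and background usage separately, so treat the function names and the single-number simplification as illustrative only.

```python
def future_minutes(reading_pct: float, window_minutes: float) -> float:
    """Convert a Metrics app % reading into minutes of future capacity consumed."""
    return reading_pct / 100 * window_minutes


def throttling_stage(minutes_consumed: float) -> str:
    """Map minutes of future capacity consumed to the stage names in the table."""
    if minutes_consumed <= 10:
        return "overage protection"
    if minutes_consumed <= 60:
        return "interactive delay"
    if minutes_consumed <= 24 * 60:
        return "interactive rejection"
    return "background rejection"


# A 150% reading on the 10-minute chart is 15 minutes of future capacity.
print(throttling_stage(future_minutes(150, 10)))  # interactive delay
```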

The sizing rule: read the Capacity Metrics app for at least 14 days (its compute-detail retention limit; Microsoft Learn, checked June 2026). Find the peak 24-hour smoothed background % and the peak 10-minute interactive %. Pick the smallest SKU where both stay under 100%, with 15–20% margin. That is your load-based size. Then apply the viewer-licensing check in the section below — it may override the answer.
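
One way to turn that rule into a quick calculation is sketched below. It rests on a loud assumption that is not from Microsoft's documentation: that a percentage measured on one SKU scales inversely with SKU size (a reading of 60% on an F16 becomes 30% on an F32). Real workloads may not scale that cleanly, so use this only to pick the next SKU to test. The function name and the margin default are hypothetical.

```python
F_SKU_LADDER = [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048]


def smallest_fitting_sku(measured_on_cus, peak_background_pct, peak_interactive_pct, margin=0.20):
    """Return the smallest SKU (in CUs) where both projected peaks leave the margin.

    Assumption: percentages scale linearly with the ratio of SKU sizes.
    Returns None if no SKU on the ladder fits.
    """
    ceiling_pct = 100 * (1 - margin)
    for cus in F_SKU_LADDER:
        scale = measured_on_cus / cus
        if peak_background_pct * scale <= ceiling_pct and peak_interactive_pct * scale <= ceiling_pct:
            return cus
    return None


# Measured on an F16: background peaked at 130%, interactive at 70%.
print(smallest_fitting_sku(16, 130, 70))  # 32
# Measured on an F16 that is mostly idle: the rule suggests sizing down.
print(smallest_fitting_sku(16, 30, 35))   # 8
```

The result is the load-based size only; the viewer-licensing check still has to be applied afterwards.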

The F64 licensing override

Pure load sizing can be overridden by viewer economics. At F64 and above, anyone with a free Fabric license can view Power BI reports on that capacity. Below F64 (F2 through F32), every viewer needs a Power BI Pro ($14/user/mo) or PPU ($24/user/mo) license (Understand Microsoft Fabric Licenses, Microsoft Learn, checked June 2026: "Viewing Power BI content requires a Pro or PPU license when the F SKU is below F64.").

Run the cliff math before committing to a sub-F64 SKU:

An F32 ($4,204.80/mo PAYG) serving 500 Pro viewers costs $4,204.80 + $7,000 (500 × $14) = $11,204.80/mo.
An F64 ($8,409.60/mo PAYG) serves the same 500 viewers at $0 in Pro licenses.

The F64 wins by $2,795.20/mo and gives you twice the compute. The crossover from F32 to F64 sits at 301 Pro viewers (⌈($8,409.60 − $4,204.80) ÷ $14⌉ = ⌈300.34⌉ = 301); the crossover from F8 to F64 is 526 viewers (⌈($8,409.60 − $1,051.20) ÷ $14⌉ = ⌈525.60⌉ = 526). Below those thresholds, the smaller SKU plus Pro is cheaper; above them, F64 wins outright.
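
The cliff math above is easy to rerun for your own viewer count. The sketch below uses only the figures quoted in this guide ($0.18 per CU-hour, 730 hours per month, $14 per Pro viewer); the function names are illustrative, and regional pricing may differ.

```python
import math

PAYG_RATE_PER_CU_HOUR = 0.18
HOURS_PER_MONTH = 730
PRO_PRICE_PER_VIEWER = 14.00


def payg_monthly(cus: int) -> float:
    """Monthly PAYG cost of an F-SKU with the given number of CUs."""
    return cus * PAYG_RATE_PER_CU_HOUR * HOURS_PER_MONTH


def all_in_monthly(cus: int, viewers: int) -> float:
    """Capacity cost plus Pro licenses; viewers need no Pro license at F64 and above."""
    licenses = 0 if cus >= 64 else viewers * PRO_PRICE_PER_VIEWER
    return payg_monthly(cus) + licenses


def crossover_viewers(small_cus: int, large_cus: int = 64) -> int:
    """Smallest viewer count at which the larger SKU is cheaper all-in."""
    gap = payg_monthly(large_cus) - payg_monthly(small_cus)
    return math.ceil(gap / PRO_PRICE_PER_VIEWER)


print(round(all_in_monthly(32, 500), 2))  # 11204.8
print(crossover_viewers(32))              # 301
print(crossover_viewers(8))               # 526
```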

This is the case where the F64 viewer threshold overrides pure-load sizing: your compute may fit in an F16, but your viewer count makes F64 cheaper all-in. Run this check before you finalize any sub-F64 decision.

The sizing heuristic by workload profile

This table maps four real-world workload profiles to a starting F-SKU, with smoothing and bursting headroom notes. Use it as a starting point for a 60-day trial or a short PAYG burst, then validate against your own Capacity Metrics app readings.

Workload profile	What runs on the capacity	Starting F-SKU	Smoothing / bursting notes
BI-only	Primarily Power BI report renders, DAX queries, semantic-model scheduled refreshes; no Spark, no heavy pipelines	F4–F8	The 10-min interactive % metric is the binding constraint. DAX queries burst hard but smooth quickly (5–64 min depending on CU load), so watch the 10-min interactive % in the Metrics app, not the 24-h background %. If you have more than roughly 526 Pro viewers (the F8-to-F64 crossover at $14/user/mo), the F64 licensing override applies regardless of load.
ETL-heavy	Nightly or windowed Data Factory pipelines, Dataflows Gen2, Copy jobs; light or no interactive BI	F8–F16	Background smoothing (24 h) is the binding constraint. Pipelines bill on duration: roughly 1.5 CU-h per throughput resource per hour of copy activity (a typical parallel copy may use 4–16 resources; check the Metrics app for actual throughput units), plus 0.0056 CU-h per orchestration run (Pricing for pipelines, Microsoft Learn, checked June 2026). Dataflows Gen2 standard compute on the current CI/CD engine bills at a two-tier rate: 12 CU/s for the first 10 minutes of a query, then 1.5 CU/s beyond 10 minutes. That is far heavier than pipelines for comparable runtimes on short queries, but the rate drops sharply for long-running queries. The non-CI/CD rate, a flat 16 CU/s, is legacy-only since April 2026: new Dataflow Gen2 items can no longer be created on the legacy engine, so all new ETL workloads should assume the CI/CD two-tier rate. Existing legacy items continue to run at 16 CU/s until migrated (Dataflows Gen2 overview, Microsoft Learn, checked June 2026). SQL stored procedures in a Warehouse consume fewer CUs for the same transformation (Pricing for Dataflow Gen2, Microsoft Learn, checked June 2026). Size so the 24-h smoothed background % stays under ~80% on a normal night, leaving burst room for large ad-hoc loads.
Data-science / Spark-heavy	Spark notebooks, Spark job definitions, ML training runs, large-scale data engineering; may include autoscale Spark	F16–F64	Spark jobs burst intensely and smooth over 24 h. Each Spark session start has overhead; shared sequential sessions reduce waste. Consider enabling Autoscale Billing for Spark (Spark jobs offload to a separate meter) if your Spark demand is spiky and your interactive BI must stay throttle-free — but validate the separate-meter cost first. Size the base capacity for your interactive BI floor, then let autoscale absorb Spark peaks.
Mixed (BI + ETL + Spark)	All of the above on one shared capacity	F32–F64	The shared CU pool means one greedy Spark job can consume background-smoothing headroom that a simultaneous pipeline needs — the throttling blast-radius. Size to the sum of peak 24-h smoothed background (ETL + Spark) and peak 10-min interactive (BI), not the max of either alone. At this scale, splitting into two capacities (one for interactive BI, one for background engineering) often provides better isolation than sizing up one SKU. At or above F64 the viewer-licensing benefit kicks in, which frequently tips the math toward staying on one large capacity.
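
The ETL rates quoted in the table can be compared with a short calculation. The sketch below takes the rates exactly as stated in this guide and assumes they apply per query (Dataflows Gen2) and per throughput resource-hour (pipeline copy); the function names are hypothetical, and orchestration charges are ignored.

```python
def dataflow_gen2_cu_seconds(query_seconds: float, legacy: bool = False) -> float:
    """CU-seconds for one Dataflow Gen2 query under the rates quoted above.

    CI/CD engine: 12 CU/s for the first 600 s, then 1.5 CU/s.
    Legacy engine: a flat 16 CU/s.
    """
    if legacy:
        return 16 * query_seconds
    first_tier = min(query_seconds, 600)
    second_tier = max(query_seconds - 600, 0)
    return 12 * first_tier + 1.5 * second_tier


def pipeline_copy_cu_hours(throughput_resources: int, hours: float) -> float:
    """CU-hours for a copy activity at 1.5 CU-h per throughput resource per hour."""
    return 1.5 * throughput_resources * hours


print(dataflow_gen2_cu_seconds(300))                # 3600  (5-minute query)
print(dataflow_gen2_cu_seconds(3600))               # 11700.0  (60-minute query)
print(dataflow_gen2_cu_seconds(3600, legacy=True))  # 57600
print(pipeline_copy_cu_hours(4, 1))                 # 6.0 CU-h, i.e. 21600 CU-seconds
```

Under these assumptions a 60-minute query costs roughly a fifth on the CI/CD engine of what it costs on the legacy engine, which is the "drops sharply" effect described in the ETL-heavy row.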

All monthly figures below assume PAYG at $0.18/CU-hour × 730 h, as of June 2026:

Starting F-SKU	PAYG / month	Reserved (1-yr) / month	Typical workload fit
F4	$525.60	$312.68	BI-only, small team, < 50 viewers
F8	$1,051.20	$625.36	BI-only larger, or light nightly ETL
F16	$2,102.40	$1,250.72	ETL-heavy (moderate pipeline volume)
F32	$4,204.80	$2,501.44	Heavy ETL or light Spark; check viewer cliff
F64	$8,409.60	$5,002.87	Mixed workloads or > ~301 Pro viewers (vs F32 at $14/user/mo)

Reserved figures are estimates using the 0.5949 factor; validate against the Azure portal quote for your region.
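
The price table can be regenerated from the three inputs stated in this guide, which is handy if the rate for your region differs. This is a minimal sketch: the 0.5949 reserved factor is this guide's estimate, not an official multiplier, and the variable names are illustrative.

```python
PAYG_RATE_PER_CU_HOUR = 0.18   # USD, as quoted in this guide
HOURS_PER_MONTH = 730
RESERVED_FACTOR = 0.5949       # estimate used in this guide, not an official figure

price_table = {}
for cus in (4, 8, 16, 32, 64):
    payg = cus * PAYG_RATE_PER_CU_HOUR * HOURS_PER_MONTH
    reserved = payg * RESERVED_FACTOR
    price_table[cus] = (round(payg, 2), round(reserved, 2))
    print(f"F{cus}: ${payg:,.2f} PAYG, ${reserved:,.2f} reserved (1-yr)")
```

Swap in the per-CU-hour rate from the Azure portal for your region and the rest of the table follows.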

How to validate your starting size

A heuristic gets you to the trial; the Capacity Metrics app closes the decision. Follow these steps:

Provision your starting SKU on PAYG. Run it for at least two full weeks, including your heaviest scheduled windows. Base the decision on these readings, not on a vendor estimate or a benchmark from a different workload profile.
Read the two binding metrics. In the Capacity Metrics app, check the 10-minute interactive % during business hours and the 24-hour background % at end of your heaviest pipeline window. If the background % stays under 80% on normal days and under 95% on heavy days, you're correctly sized with headroom. If either regularly crosses 90%, move up a tier.
Run the viewer-licensing check. Count your Power BI report viewers. If viewer count × $14 Pro + current SKU cost approaches or exceeds the next F-SKU price, recalculate the all-in cost. The F64 licensing threshold can pull you up a SKU even if your compute fits below it.
Choose the billing model after validating the size. If your capacity runs at a high baseline (> ~60% of hours across the year), a one-year reservation saves roughly 40.5%. If your workload is windowed or spiky, PAYG with scheduled pause is cheaper. Do not pause to clear throttling, though: pausing bills all accumulated smoothed debt immediately at PAYG rates, and SpendWeave does not recommend it as a throttling escape. Evaluate that cost before choosing it over waiting for natural burndown or scaling up temporarily (Pause and resume your Fabric capacity, Microsoft Learn, checked June 2026). For the full billing-model decision after size validation, see the Fabric reserved vs. PAYG comparison.
Reassess after any major workload change. Adding a new Spark pipeline, onboarding a new business unit's reports, or enabling Copilot features (which run as background operations and consume CUs) all shift the smoothed baseline. The metrics app holds 14 days of compute detail — check it before and after any significant change.
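
The billing-model step reduces to a break-even on running hours. The sketch below assumes a paused PAYG capacity bills nothing, that no smoothed debt is settled at pause time, and that the reservation costs 0.5949 of the full-month PAYG price regardless of usage. All three are simplifications, and the function names are hypothetical.

```python
PAYG_RATE_PER_CU_HOUR = 0.18
HOURS_PER_MONTH = 730
RESERVED_FACTOR = 0.5949  # estimate used in this guide

# Running hours per month above which the reservation becomes cheaper.
break_even_hours = HOURS_PER_MONTH * RESERVED_FACTOR


def cheaper_billing_model(cus: int, running_hours_per_month: float) -> str:
    """Compare PAYG with scheduled pause against a 1-year reservation."""
    payg_cost = cus * PAYG_RATE_PER_CU_HOUR * running_hours_per_month
    reserved_cost = cus * PAYG_RATE_PER_CU_HOUR * HOURS_PER_MONTH * RESERVED_FACTOR
    return "PAYG with pause" if payg_cost < reserved_cost else "1-year reservation"


print(round(break_even_hours, 1))        # 434.3
print(cheaper_billing_model(16, 300))    # PAYG with pause
print(cheaper_billing_model(16, 730))    # 1-year reservation
```

About 434 of 730 hours is just under 60% of the month, which matches the threshold quoted in the billing-model step.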

The named enemy: the throttling blast-radius

Sizing too small creates one of the clearest enemies in Fabric FinOps: the throttling blast-radius. Because all workloads share one CU pool per capacity, there is no native per-workspace CU isolation. A single Spark notebook that exceeds the 24-hour smoothed background budget can trigger background rejection, which then also blocks the nightly pipeline, the semantic-model refresh, and any new background jobs on the whole capacity. Users don't see which job caused the problem; they see "capacity is throttled."

The blast-radius framing cuts both ways when comparing F64 against two F32s. Both options total 64 CUs at the same PAYG cost ($8,409.60/mo); the CU headroom is identical. What differs: a single F64 capacity pool means the viewer-licensing benefit applies (free viewer access for the whole org) and you eliminate the per-capacity burst ceiling that caps each F32 independently. Two F32s, by contrast, provide better workload isolation — a runaway Spark notebook on one capacity cannot consume the background-smoothing budget of the other — but you lose the F64 viewer-licensing floor and halve each capacity's individual burst ceiling. Splitting into two smaller capacities is the right isolation move when a runaway notebook regularly torpedoes report performance; consolidating onto one F64 is the right move when viewer-license savings or burst headroom outweigh the blast-radius risk. There is no universally right answer — it depends on whether your interactive BI users and your engineering pipelines can coexist on one schedule.

SpendWeave's stance: size to the 24-hour smoothed background peak, verify with 14 days of real data, apply the viewer-licensing check, and treat the blast-radius as a governance problem before it becomes a capacity-size problem. Throttling that recurs on a correctly-sized capacity is almost always a workload-optimization problem — not a signal to buy more CUs. Read the detailed monthly-bill estimate methodology in the Fabric monthly-bill estimator walkthrough.

Frequently asked questions

How do I size a Microsoft Fabric capacity? Size to your 24-hour smoothed background CU usage, not to peak CPU. Run a 60-day Fabric trial (F4 or F64 depending on eligibility) or a short PAYG burst, read the Capacity Metrics app for at least two weeks, and pick the smallest F-SKU whose smoothed background stays under 100%. Bursting and smoothing give you roughly 10 minutes of interactive headroom and 24 hours of background headroom above baseline before throttling starts.

What is Fabric capacity bursting and smoothing? Bursting lets Fabric allocate more CUs than your SKU's baseline to complete a job quickly — metered in 30-second intervals. Smoothing averages that burst cost over a window: interactive operations are smoothed over a minimum of 5 minutes and up to 64 minutes depending on how much CU usage they consume (the "10-min interactive %" in the Metrics app is the throttle-trigger threshold, not the smoothing window); background operations are smoothed over 24 hours. Smoothing is what lets a small capacity run a heavy nightly pipeline — the cost spreads across quiet hours rather than hitting as a spike.

Does the F64 licensing threshold affect capacity sizing? Yes — and it can override pure-load sizing. If your Power BI viewer count pushes "smaller SKU plus Pro licenses" past the F64 PAYG price, F64 becomes the right size regardless of compute load. At F64 and above, anyone with a free Fabric license can view Power BI reports. The crossover against an F32 baseline sits at 301 Pro viewers (at $14/user/mo).

What happens if I under-size a Fabric capacity? Throttling kicks in through three escalating stages: interactive delay (a 20-second throttle) after 10 minutes of smoothed overage, interactive rejection after 60 minutes, and background rejection after 24 hours. Sustained overage blocks all new requests until the smoothed debt clears.

How much headroom should I leave in a Fabric capacity? Keep 24-hour smoothed background usage under 100% with at least 10–20% margin. For interactive workloads, keep the 10-min interactive % metric in the Capacity Metrics app below 80% during peak hours — that leaves burst room before the 20-second throttle triggers.

Researched with AI assistance, written and fact-checked by Jonathan Flach, verified against Microsoft Learn.