
ZFS ARC Tuning: Caps, Limits and What to Measure

11 min read - June 24, 2026

Table of contents
  • Measure ARC before you tune anything
  • The four ARC tunables that matter
  • Tuning ARC by workload
  • Diagnosing problems and knowing when to stop

ZFS ARC tuning by workload. Which tunables matter, how to set zfs_arc_max on Linux and FreeBSD, and how to tell when you are done.

On Linux, ZFS has long defaulted to letting its read cache grow to roughly half of system RAM, and FreeBSD's default ceiling is higher still. On the wrong kind of server that means swap activity, OOM kills, or a database competing with the filesystem for memory. ZFS ARC tuning is about deciding how much of that RAM the ARC is actually allowed to keep, and what you give up by setting the limit. This post covers how ARC uses memory, what to measure before you touch anything, the handful of tunables worth changing, and sensible starting points for file servers, hypervisors, databases, and backup targets. For the snapshot side of ZFS, see our guide to ZFS snapshots.

Measure ARC before you tune anything

Do not change a single tunable until you have baseline numbers from a normal busy period. Samples taken during quiet periods will send you in the wrong direction. Nightly backups, weekly reports, and batch jobs are usually where ARC behaviour gets interesting, so capture data across several days.

Three tools cover most of what you need:

  • arcstat 1 gives a live scrolling view of hit and miss counters, demand versus prefetch activity, and current ARC size. Use it during load tests and backup windows.
  • arc_summary prints a single snapshot: ARC size and target, the MFU/MRU split, metadata ratios, and active tunables. Run arc_summary -s arc for the ARC section only.
  • Raw counters live in /proc/spl/kstat/zfs/arcstats on Linux and under the kstat.zfs.misc and vfs.zfs sysctl trees on FreeBSD. Scrape these from monitoring rather than parsing formatted output.
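As a rough illustration of scraping the raw counters, the sketch below parses the Linux arcstats file into a dictionary. It assumes the usual kstat layout (header rows followed by name, type, value rows); the function names are illustrative and not part of any ZFS tooling.

```python
from pathlib import Path

# Linux location of the raw ARC counters, as named in the list above.
ARCSTATS_PATH = Path("/proc/spl/kstat/zfs/arcstats")


def parse_arcstats(text: str) -> dict[str, int]:
    """Parse kstat text into {counter_name: value}.

    Data rows have three whitespace-separated fields: name, type, value.
    Header rows are skipped because they either have a different number
    of fields or a non-integer last field.
    """
    stats: dict[str, int] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        name, _kind, value = fields
        try:
            stats[name] = int(value)
        except ValueError:
            continue  # e.g. the "name type data" column header
    return stats


def read_arcstats(path: Path = ARCSTATS_PATH) -> dict[str, int]:
    """Read and parse the live counters (only works on a host running ZFS)."""
    return parse_arcstats(path.read_text())
```

A monitoring agent could call read_arcstats() once a minute and store size, c, c_max and the hit and miss counters as time series.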

The counters worth recording before any change:

| Metric | Where to find it | Why it matters |
|---|---|---|
| ARC size, target, max (size, c, c_max) | arcstat, kstat | Tells you whether ARC is pinned at its ceiling or still has room to grow |
| Demand data and metadata hit ratios | arcstat, arc_summary | Demand misses translate directly into application latency |
| Available memory and swap activity (si/so) | free -h, vmstat 1 | Sustained swap-in/out while ARC is large is the clearest sign of memory pressure |
| Disk service time (await) and utilisation | iostat -x | Connects ARC misses to actual storage bottlenecks |
| memory_throttle_count | /proc/spl/kstat/zfs/arcstats | A rising count confirms ZFS is being throttled because of memory pressure |

Two things people commonly get wrong here. Watch available memory, not free memory; Linux happily reports low free RAM as a steady state and that alone is not a problem. The signal that matters is available memory near zero combined with sustained swap activity (the Linux memory management primer covers why). And treat hit ratio as a trend, not a target. A 99% hit ratio on a box that is swapping is a tuning failure, not a success.
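Both points can be sketched in a few lines. The ARC counters are cumulative, so a trend comes from the difference between two samples, and available memory comes from the MemAvailable field of /proc/meminfo, not MemFree. The function names below are illustrative; the counter names (demand_data_hits, demand_data_misses and their metadata equivalents) are the ones exposed in arcstats.

```python
from typing import Optional


def interval_hit_ratio(before: dict, after: dict,
                       kind: str = "demand_data") -> Optional[float]:
    """Hit ratio between two counter samples, or None if nothing was read.

    `kind` selects the counter pair, e.g. "demand_data" or "demand_metadata".
    """
    hits = after[f"{kind}_hits"] - before[f"{kind}_hits"]
    misses = after[f"{kind}_misses"] - before[f"{kind}_misses"]
    total = hits + misses
    return hits / total if total > 0 else None


def mem_available_fraction(meminfo_text: str) -> float:
    """Fraction of RAM that Linux reports as available (not merely free)."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])  # values are in kB
    return fields["MemAvailable"] / fields["MemTotal"]
```

Plotting the interval ratio next to the available fraction and the si/so columns from vmstat makes it obvious when a high hit ratio is being bought with swap.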


 

The four ARC tunables that matter

Most production tuning comes down to four settings. Match the setting to the pressure you actually measured in the baseline. Swap activity points to zfs_arc_max. Reclaim churn that keeps wiping a hot cache points to zfs_arc_min. Slow directory walks point to the metadata limit. Sudden large allocations that stall while memory is reclaimed point to the free-memory target.

| Tunable | What it does | When to change it | Risk if wrong |
|---|---|---|---|
| zfs_arc_max | Hard upper limit on ARC RAM usage | Co-hosting databases or VMs that need reserved RAM | Too low: more disk I/O and latency. Too high: swap pressure or OOM. |
| zfs_arc_min | Floor that stops ARC shrinking aggressively | Workloads with short memory spikes that keep wiping the cache | Too high: starves applications during genuine memory pressure |
| zfs_arc_meta_limit_percent | Share of ARC available to metadata, as a percentage (OpenZFS 2.1 and earlier; OpenZFS 2.2 and later drop it in favour of zfs_arc_meta_balance) | Millions of small files, deep directory trees, slow ls/find | Too low: directory lookups crawl. Too high: starves data caching. |
| zfs_arc_free_target | How much free system memory ZFS tries to keep available (FreeBSD: vfs.zfs.arc_free_target, in pages; the closest Linux parameter is zfs_arc_sys_free, in bytes) | Servers with sudden large allocation bursts (VM start-up, big query plans) | Too high: ARC stays small even when RAM is available |

Start with the smallest change that addresses the pressure you can see. For zfs_arc_max, the right ceiling depends on workload (covered in the next section). For zfs_arc_min, a floor of 25% to 50% of zfs_arc_max is a reasonable starting point if you need one at all. For metadata, OpenZFS 2.1 and earlier default zfs_arc_meta_limit_percent to 75% of ARC, which is generous for most workloads; OpenZFS 2.2 removed that fixed limit in favour of zfs_arc_meta_balance. Either way, only touch the metadata setting when metadata misses are clearly visible in arcstat.
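To make the 25% to 50% floor concrete, here is a small worked calculation. The helper name is illustrative only.

```python
GIB = 1024 ** 3


def arc_min_candidates(arc_max_bytes: int) -> tuple[int, int]:
    """Return the 25% and 50% floor candidates for a given zfs_arc_max."""
    return arc_max_bytes // 4, arc_max_bytes // 2


low, high = arc_min_candidates(16 * GIB)
print(low, high)  # 4294967296 8589934592, i.e. 4 GiB and 8 GiB
```

With a 16 GiB ceiling, a floor would start somewhere between 4 GiB and 8 GiB, and only if reclaim churn was actually measured.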

Applying changes on Linux and FreeBSD

On Linux, test a change at runtime by writing to the sysfs parameter file. No reboot needed:

echo 17179869184 > /sys/module/zfs/parameters/zfs_arc_max

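That sets the zfs_arc_max ceiling to 16 GiB (16 × 1024³ bytes) straight away, although an ARC that is already larger can take a while to shrink to the new limit. To make the change survive a reboot, add it to /etc/modprobe.d/zfs.conf (and regenerate the initramfs if the zfs module is loaded from it, as with root on ZFS):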

options zfs zfs_arc_max=17179869184

On FreeBSD, runtime changes use sysctl:

sysctl vfs.zfs.arc_max=17179869184

Persist the same value in /boot/loader.conf:

vfs.zfs.arc_max="17179869184"
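All four lines take the limit in bytes, which is easy to mistype. The sketch below converts a GiB figure and renders the same commands and config lines shown above; the function and key names are illustrative, and it only builds strings rather than applying anything.

```python
GIB = 1024 ** 3


def arc_max_settings(gib: int) -> dict[str, str]:
    """Render the runtime and persistent zfs_arc_max settings for `gib` GiB."""
    value = gib * GIB  # the tunable is expressed in bytes
    return {
        "linux_runtime": f"echo {value} > /sys/module/zfs/parameters/zfs_arc_max",
        "linux_persist": f"options zfs zfs_arc_max={value}",
        "freebsd_runtime": f"sysctl vfs.zfs.arc_max={value}",
        "freebsd_persist": f'vfs.zfs.arc_max="{value}"',
    }


for line in arc_max_settings(16).values():
    print(line)
```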

Change one setting at a time, in small steps of around 10% of total RAM. Watch the problem window. Keep the change only if swap stays at zero and latency is stable. Persist only after the runtime test passes.
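One way to plan those steps is to list the intermediate caps up front. This is a minimal sketch assuming a step of 10% of total RAM and a final step clamped to the target; the function name and the step_percent parameter are illustrative.

```python
def cap_steps(current: int, target: int, total_ram: int,
              step_percent: int = 10) -> list[int]:
    """Intermediate zfs_arc_max values from `current` to `target`.

    Each step moves by step_percent of total RAM; the last step lands
    exactly on the target. Returns an empty list if nothing changes.
    """
    step = total_ram * step_percent // 100
    if step <= 0:
        raise ValueError("step must be positive")
    direction = 1 if target > current else -1
    steps = []
    value = current
    while value != target:
        value += direction * step
        overshoot = value > target if direction == 1 else value < target
        if overshoot:
            value = target
        steps.append(value)
    return steps
```

Each value in the list would be applied at runtime, observed through one full problem window, and only then followed by the next.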

Tuning ARC by workload

Total RAM is the wrong place to start. ARC sizing should follow the workload mix on the box.

| Workload | Starting zfs_arc_max | ARC priority | Notes | Key metric |
|---|---|---|---|---|
| Dedicated file server / NAS | 75% to 80% of RAM | Data and metadata | Prefetch on. Aggressive cache is the point. | Overall hit ratio |
| Virtualisation host | 30% to 40% of RAM | Balanced | Leave headroom for guest RAM and host tasks. Any non-zero si/so means cap further. | Host swap (si/so) |
| Database server | 25% to 50% of RAM | Metadata-leaning | Reserve memory for the DB engine first. Set primarycache=metadata if the engine handles its own buffer cache. | Demand misses |
| Backup / archive target | Conservative cap | Metadata only | Set primarycache=metadata so one-pass scans do not evict useful blocks. | Prefetch misses, metadata hit rate |
| Analytics / repeated read | Higher cap after other caches reserved | MFU-heavy | L2ARC on NVMe can keep the hot working set across query runs. | Demand misses |

A VM host needs to share memory with its guests, so a 30% to 40% cap is a safe default and 50% is already too high on most builds. Databases like PostgreSQL and MySQL manage their own buffer caches, so you reserve memory for the engine first and let ARC have what is left. Backup targets benefit from primarycache=metadata because the data being read is rarely needed again, and you do not want a nightly backup walking the entire pool and flushing the rest of the cache as it goes. Across every workload, swap activity while ARC is pinned at zfs_arc_max means the cap is too high; that rule does not change.
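The percentage ranges above translate into byte values as follows. This is a sketch only: the workload keys and the reserved parameter (memory promised to guests or to the database engine) are hypothetical names, and the backup and analytics rows are left out because the table gives no percentage for them.

```python
GIB = 1024 ** 3

# Starting ranges as whole percentages of total RAM, from the workload table.
STARTING_PERCENT = {
    "file_server": (75, 80),
    "virtualisation_host": (30, 40),
    "database": (25, 50),
}


def starting_arc_max(total_ram: int, workload: str,
                     reserved: int = 0) -> tuple[int, int]:
    """Return (low, high) starting caps in bytes for a workload.

    `reserved` is a hypothetical input: bytes set aside for guests or the
    database engine. Neither bound may exceed what is left after it.
    """
    low_pct, high_pct = STARTING_PERCENT[workload]
    headroom = max(total_ram - reserved, 0)
    low = min(total_ram * low_pct // 100, headroom)
    high = min(total_ram * high_pct // 100, headroom)
    return low, high
```

For example, a database host with 100 GiB of RAM and 70 GiB reserved for the engine gets a range of 25 GiB to 30 GiB, because the reservation wins over the 50% upper bound.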

Diagnosing problems and knowing when to stop

An undersized ARC shows up as high read IOPS, low demand hit rates, and slow directory browsing while the system still has free RAM. An oversized ARC is less obvious. The hit ratio looks fine, but the box starts swapping, load averages climb, processes block in D state while the kernel reclaims ARC pages on demand, and in the worst case the OOM killer starts choosing victims. The cache looks healthy and the server feels terrible.

Metadata pressure shows up as demand metadata misses (demand_metadata_misses in arcstats) climbing while demand data hits stay healthy. On OpenZFS 2.1 and earlier, arc_meta_used sitting at arc_meta_limit confirms it. That is when metadata is fighting data for space, and the metadata limit is worth raising.

Match what you see to the first setting to check:

| Symptom | Likely cause | First tunable to check | Next step |
|---|---|---|---|
| High await with high demand misses | Working set exceeds ARC | zfs_arc_max | Add RAM or add L2ARC |
| Swap activity while ARC is large | ARC starving the OS or apps | zfs_arc_max | Lower the cap |
| Performance dies after memory spikes | Aggressive eviction during reclaim | zfs_arc_min | Set a floor at 25% to 50% of zfs_arc_max |
| Slow ls, find, small-file ops | Metadata cache starvation | zfs_arc_meta_limit_percent (zfs_arc_meta_balance on OpenZFS 2.2 and later) | Raise the metadata share |
| Rising memory_throttle_count | System-wide memory pressure | zfs_arc_max | Lower the cap; check for L2ARC index bloat |

L2ARC is not free. The index for L2ARC entries lives in primary ARC, and if that overhead climbs past about a third of total ARC capacity, the secondary cache does more harm than good. Reach for L2ARC only when the working set is bigger than RAM but still fits on a fast NVMe device, and only when the primary ARC hit ratio is already healthy.
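The one-third rule can be checked from the raw counters. The sketch below reads l2_hdr_size (the memory used by L2ARC headers) and compares it with c_max; treating c_max as "total ARC capacity" is an assumption, and the function names are illustrative.

```python
def l2arc_header_share(stats: dict[str, int]) -> float:
    """Fraction of the ARC ceiling (c_max) consumed by L2ARC headers."""
    return stats["l2_hdr_size"] / stats["c_max"]


def l2arc_overhead_too_high(stats: dict[str, int],
                            limit: float = 1 / 3) -> bool:
    """True when the L2ARC index costs more than `limit` of ARC capacity."""
    return l2arc_header_share(stats) > limit
```

The stats dictionary is expected to hold integer values keyed by counter name, as found in /proc/spl/kstat/zfs/arcstats on Linux.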

The right time to stop tuning is when latency is flat, swap stays at zero through the same busy window that caused the original problem, and further changes no longer improve anything. A high hit ratio means nothing if the server is swapping. Past that point, stop adjusting settings and only revisit them if the same problem comes back under the same workload.
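That stopping rule can be written down as a check so it does not depend on mood. This is a sketch under stated assumptions: swap samples are (si, so) pairs from the busy window, latencies are whatever measurement you already track, and "no longer improves" is taken to mean the last change lowered mean latency by less than a tolerance of 5%. The function name, inputs and tolerance are all hypothetical.

```python
from statistics import mean


def can_stop_tuning(swap_samples, latency_before_ms, latency_after_ms,
                    tolerance: float = 0.05) -> bool:
    """True when swap stayed at zero and the last change gained nothing.

    swap_samples: iterable of (si, so) pairs captured during the busy window.
    latency_*_ms: latency samples before and after the most recent change.
    """
    no_swap = all(si == 0 and so == 0 for si, so in swap_samples)
    before = mean(latency_before_ms)
    after = mean(latency_after_ms)
    # Stop only if latency did not improve by more than the tolerance.
    no_further_gain = after >= before * (1 - tolerance)
    return no_swap and no_further_gain
```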

If you need a server with the RAM headroom to run ZFS properly without fighting your VMs or databases for memory (how much RAM do you actually need is worth a read first), take a look at FDC dedicated servers.
