GeoTIFF is the lingua franca of raster geodata. It is well-supported, widely understood, and entirely adequate for single-file workflows. But GeoTIFF was designed before cloud object storage existed, and its structure reflects that: metadata at the front, data tiled internally, a single file that must be read sequentially or via cloud-optimised extensions.
Zarr takes a different approach. Rather than a single file, a Zarr store is a directory (or object storage prefix) of small chunk files. Each chunk is a fixed-size piece of the array that can be read independently. This means any chunk can be fetched in parallel, with no coordination between readers. A client can request exactly the spatial and temporal slice it needs — two HTTP range requests, in parallel, from an S3 bucket — without downloading or scanning the rest of the dataset.
# Zarr Store Structure
A Zarr store for a 3D array (time, y, x) looks like:
monthly_ndvi.zarr/
├── .zattrs # dataset-level attributes (CRS, description, etc.)
├── .zgroup # signals this is a Zarr group
└── ndvi/
├── .zarray # array metadata: shape, dtype, chunks, compressor
├── .zattrs # variable-level attributes
├── 0.0.0 # chunk: time[0:30], y[0:512], x[0:512]
├── 0.0.1 # chunk: time[0:30], y[0:512], x[512:1024]
├── 0.1.0 # chunk: time[0:30], y[512:1024], x[0:512]
├── ...
├── 1.0.0 # chunk: time[30:60], y[0:512], x[0:512]
└── ...
Each chunk file is independently compressed. Reading a spatial subset requires fetching only the chunks that overlap that subset — no scanning, no parsing of irrelevant data.
# Writing XArray Datasets to Zarr
import xarray as xr
import numpy as np
import pandas as pd
# Build a sample 3-year daily NDVI array (1095 × 2000 × 2000)
time_index = pd.date_range("2020-01-01", periods=365*3, freq="D")
ndvi = xr.DataArray(
data=np.random.default_rng(0).uniform(0.0, 0.9, (len(time_index), 2000, 2000)).astype(np.float32),
dims=["time", "y", "x"],
coords={
"time": time_index,
"y": np.linspace(-27.0, -30.0, 2000),
"x": np.linspace(152.5, 155.5, 2000),
},
attrs={
"long_name": "Normalised Difference Vegetation Index",
"units": "dimensionless",
"crs": "EPSG:4326"
}
)
ds = xr.Dataset({"ndvi": ndvi})
# Write to Zarr with explicit chunking
ds.chunk({"time": 30, "y": 512, "x": 512}).to_zarr(
"monthly_ndvi.zarr",
mode="w",
encoding={
"ndvi": {
"dtype": "float32",
"compressor": zarr.Blosc(cname="zstd", clevel=5, shuffle=zarr.Blosc.BITSHUFFLE)
}
}
)
# Chunk Strategy: The Most Important Decision
The chunk strategy determines which access patterns are fast and which are slow. There is no universally correct chunk shape — it depends on your dominant query pattern.
# Time-series queries (pixel time series)
If you frequently extract the full time history of individual pixels or small regions:
# Chunk optimised for pixel time series: entire time axis per chunk
ds.chunk({"time": -1, "y": 64, "x": 64}).to_zarr("ndvi_timeseries_opt.zarr", mode="w")
# Each chunk: (1095, 64, 64) = 1095 × 64 × 64 × 4 bytes ≈ 17 MB
With this chunking, fetching a pixel’s time series requires one chunk read. Fetching a 64×64 spatial window’s time series requires one chunk read. Fetching a monthly mean across all pixels requires reading all chunks (the time axis is not partitioned).
# Spatial queries (single-time maps)
If you frequently read full spatial extent at a single time step:
# Chunk optimised for spatial maps: entire spatial extent per time chunk
ds.chunk({"time": 1, "y": -1, "x": -1}).to_zarr("ndvi_spatial_opt.zarr", mode="w")
# Each chunk: (1, 2000, 2000) = 16 MB
Each time step is a single chunk. Reading one month’s NDVI map requires exactly one chunk read. Extracting a pixel time series requires one read per time step — 1095 reads.
# Balanced (default for most workflows)
# Balanced: works reasonably well for both access patterns
ds.chunk({"time": 30, "y": 512, "x": 512}).to_zarr("ndvi_balanced.zarr", mode="w")
# Each chunk: (30, 512, 512) ≈ 30 MB
This is the typical default. Time series extraction over 3 years costs ~37 reads per pixel window. Spatial map extraction costs ~16 reads (2000/512 × 2000/512 rounded up). Neither is optimal, but both are reasonable.
# Reading from Zarr: Lazy Access
import xarray as xr
# Open lazily — reads only metadata, not data
ds = xr.open_zarr("ndvi_balanced.zarr")
print(ds)
# <xarray.Dataset>
# Dimensions: (time: 1095, y: 2000, x: 2000)
# Coordinates: time, y, x
# Data variables: ndvi (time, y, x) float32 dask.array<chunksize=(30, 512, 512)>
# Select a spatial subset — still lazy
brisbane_ndvi = ds['ndvi'].sel(
x=slice(152.9, 153.2),
y=slice(-27.3, -27.7)
)
print(brisbane_ndvi.shape) # (1095, ~33, ~33) — shape calculated from coords
# Compute: reads only the overlapping chunks
import time
t0 = time.perf_counter()
result = brisbane_ndvi.compute()
print(f"Brisbane NDVI (1095 time steps): {time.perf_counter() - t0:.2f}s")
# → ~0.4s (reads ~6 chunks covering the spatial extent)
# Compression Options
Zarr supports pluggable compressors. For float32 spatial data, the most effective options:
import zarr
import numcodecs
# Blosc/Zstd with bit-shuffle: best compression for float arrays
# BITSHUFFLE rearranges bits within each element before compression,
# dramatically improving compression ratio for floating-point data
zstd_blosc = zarr.Blosc(cname="zstd", clevel=5, shuffle=zarr.Blosc.BITSHUFFLE)
# LZ4: faster decompression, slightly worse ratio
lz4_blosc = zarr.Blosc(cname="lz4", clevel=5, shuffle=zarr.Blosc.SHUFFLE)
# For near-lossless compression with scale-offset encoding:
scale_offset = numcodecs.FixedScaleOffset(offset=0, scale=10000, dtype="float32", astype="int16")
# Stores NDVI * 10000 as int16 — 50% smaller, precision to 0.0001
Compression benchmark for a (30, 512, 512) chunk of float32 NDVI data:
| Compressor | Ratio | Write speed | Read speed |
|---|---|---|---|
| None | 1.0× | 2.1 GB/s | 2.1 GB/s |
| Blosc/LZ4 + SHUFFLE | 2.8× | 1.4 GB/s | 1.6 GB/s |
| Blosc/Zstd L5 + BITSHUFFLE | 4.2× | 0.6 GB/s | 0.9 GB/s |
| FixedScaleOffset → int16 + Zstd | 7.1× | 0.5 GB/s | 0.8 GB/s |
For cloud storage, higher compression ratio reduces both storage cost and network transfer time.
# Writing to S3 (Cloud Zarr)
Zarr stores can live in S3 buckets, where each chunk is a separate object:
import s3fs
import xarray as xr
# Connect to S3
s3 = s3fs.S3FileSystem(key="YOUR_KEY", secret="YOUR_SECRET")
store = s3fs.S3Map("s3://your-bucket/ndvi.zarr", s3=s3)
# Write to S3 — chunks uploaded as individual S3 objects
ds.chunk({"time": 30, "y": 512, "x": 512}).to_zarr(store, mode="w")
# Read from S3 (lazily — only fetches chunks needed for computation)
ds_remote = xr.open_zarr(store)
With a bucket in the same region as your compute, S3 Zarr reads achieve 1–2 GB/s effective throughput by parallelising chunk requests across Dask workers.
# Consolidating Metadata
For stores with many chunks (thousands of files), metadata lookup can be slow because each .zarray and .zattrs file requires a separate object-storage HEAD request. Consolidate metadata into a single file:
import zarr
# Write consolidated metadata after creating the store
zarr.consolidate_metadata("ndvi_balanced.zarr")
# Read with consolidated metadata — single metadata request
ds = xr.open_zarr("ndvi_balanced.zarr", consolidated=True)
For stores with 10,000+ chunk files on S3, consolidated metadata reduces open-time from ~30 seconds (one HEAD request per file) to ~0.1 seconds.
# The GeoTIFF → Zarr Migration Pattern
For datasets you use repeatedly for analytics:
import rioxarray
import xarray as xr
import glob
# Step 1: Load GeoTIFFs lazily
tifs = sorted(glob.glob("landsat_ndvi/*.tif"))
ds = xr.open_mfdataset(tifs, engine="rasterio", concat_dim="time", combine="nested", parallel=True)
# Step 2: Set correct chunking
ds_chunked = ds.chunk({"time": 30, "y": 512, "x": 512})
# Step 3: Write to Zarr once
ds_chunked.to_zarr("landsat_ndvi.zarr", mode="w")
# After this: always open from Zarr, not from individual GeoTIFFs
ds = xr.open_zarr("landsat_ndvi.zarr")
Opening 1,000 GeoTIFFs via open_mfdataset typically takes 10–30 seconds (one file header parse per file). Opening the equivalent Zarr store takes under 0.5 seconds. For daily-run analytics pipelines, this one-time conversion pays dividends on every subsequent run.
Related reading: XArray GroupBy for Temporal Raster Aggregation · stackstac: Lazy Satellite Time Series from STAC Catalogues · Rasterio Windowed Reading for Large Rasters