Zarr (data format)

Zarr
Filename extension	.zarr
Latest release	3
Type of format	Multidimensional array
Open format?	Yes
Free format?	Yes
Website	zarr.dev

Last updated September 23, 2025

Zarr is an open standard for storing large multidimensional array data. It specifies a protocol and data format, and is designed to be "cloud ready": data is divided into subsets called chunks, which allows random access to parts of an array.^[1]^[2] Zarr can be used from many programming languages, including Python, Java, JavaScript, C++, Rust and Julia.^[3] Organizations such as Google and Microsoft have used it to publish large datasets.^[4]^[5] Alistair Miles released the first versions of Zarr in 2015.^[6]^[7]

Format description

Figure (Zarr-scipy2019-storage.png): An illustration of Zarr's chunking data format.

The main data structure in Zarr is the multidimensional array. To allow parallel access, each array is stored and accessed as a grid of so-called "chunks". The actual data format on disk depends on the compressor and storage plugins selected by the user.^[8]
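The chunk grid can be illustrated with a little index arithmetic. The following is a minimal sketch in plain Python, not the Zarr library's API: the function names and the example array and chunk shapes are assumptions chosen for illustration. It shows how an element index maps to a chunk, and which chunks a rectangular region touches, which is what lets a reader fetch only part of an array.

```python
from itertools import product
from math import ceil


def chunk_grid_shape(array_shape, chunk_shape):
    """Number of chunks along each dimension (edge chunks may be partial)."""
    return tuple(ceil(n / c) for n, c in zip(array_shape, chunk_shape))


def locate(index, chunk_shape):
    """Map an element index to (chunk coordinates, offset inside that chunk)."""
    chunk = tuple(i // c for i, c in zip(index, chunk_shape))
    offset = tuple(i % c for i, c in zip(index, chunk_shape))
    return chunk, offset


def chunks_for_region(start, stop, chunk_shape):
    """Coordinates of every chunk overlapping the half-open region [start, stop)."""
    ranges = [
        range(a // c, (b - 1) // c + 1)
        for a, b, c in zip(start, stop, chunk_shape)
    ]
    return list(product(*ranges))


# Hypothetical example: a 10000 x 10000 array stored as 1000 x 1000 chunks.
shape, chunks = (10000, 10000), (1000, 1000)
print(chunk_grid_shape(shape, chunks))                    # (10, 10)
print(locate((2500, 999), chunks))                        # ((2, 0), (500, 999))
print(chunks_for_region((900, 0), (1100, 500), chunks))   # [(0, 0), (1, 0)]
```

In this example, reading the 200 x 500 region requires only two of the hundred chunks, and those two can be fetched and decompressed independently of each other.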

Zarr's design was influenced by that of HDF5, and so it includes similar features for metadata and grouping: arrays can be grouped into named hierarchies, and they can also be annotated with key-value metadata stored alongside the array.^[8]
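The idea of named hierarchies with attached key-value metadata can be sketched with a toy in-memory store. This is a simplified illustration, not Zarr's actual metadata layout or API: the `store` dictionary, the helper functions, the JSON field names and the example paths are all hypothetical.

```python
import json

# Hypothetical key-value store: node path -> JSON metadata document.
store = {}


def put_node(path, node_type, attributes=None, **fields):
    """Record a group or array at `path` together with its user attributes."""
    doc = {"node_type": node_type, "attributes": attributes or {}, **fields}
    store[path] = json.dumps(doc)


def attributes(path):
    """Return the key-value metadata stored alongside a node."""
    return json.loads(store[path])["attributes"]


def children(path):
    """List the direct children of a group, by path prefix."""
    prefix = path.rstrip("/") + "/"
    return sorted(
        key for key in store
        if key.startswith(prefix) and "/" not in key[len(prefix):]
    )


# A group containing two arrays, each annotated with its own metadata.
put_node("weather", "group", {"source": "example"})
put_node("weather/temperature", "array", {"units": "K"},
         shape=[365, 180, 360], chunks=[1, 180, 360])
put_node("weather/pressure", "array", {"units": "Pa"},
         shape=[365, 180, 360], chunks=[1, 180, 360])

print(children("weather"))                 # ['weather/pressure', 'weather/temperature']
print(attributes("weather/temperature"))   # {'units': 'K'}
```

Because every node is addressed by a path, the hierarchy maps naturally onto directories in a file system or key prefixes in an object store.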

Applications

Because it handles tensors efficiently, Zarr is used to publish weather and satellite data,^[9] and energy data,^[10] among other datasets.

For bioimaging such as microscopy, the Open Microscopy Environment (OME) consortium created a format called "OME-Zarr", based on Zarr with some discipline-specific extensions.^[11] The specification allows a granular representation of the outputs of complex experiments, such as high-content screening assays. Each plate read by the microscope contains multiple wells, and scanning each well requires multiple fields. Each image may have up to five dimensions (time points, imaging channels and the three spatial dimensions). It may also include resolution pyramids, which improve the performance of visualization tools. Because Zarr organizes data in multiple directories, each of these fields can be addressed and retrieved independently, for example by requesting a specific URL from an object store.^[11]
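As a rough illustration of addressing one field independently, the sketch below builds a URL prefix from plate, well and field identifiers. The row/column/field/level nesting, the function name and the base URL are assumptions made for this example and are not taken from the sources cited here; the OME-Zarr specification defines the actual layout.

```python
from urllib.parse import quote


def field_url(base_url, row, column, field, level=0):
    """Build a URL prefix for one resolution level of one field in one well.

    The path order row/column/field/level is a hypothetical layout.
    """
    parts = [str(row), str(column), str(field), str(level)]
    return base_url.rstrip("/") + "/" + "/".join(quote(p) for p in parts)


# Hypothetical plate stored in an object store bucket.
base = "https://example.org/bucket/plate.zarr"
print(field_url(base, "B", 3, 0))            # https://example.org/bucket/plate.zarr/B/3/0/0
print(field_url(base, "B", 3, 1, level=2))   # https://example.org/bucket/plate.zarr/B/3/1/2
```

A viewer that needs only a low-resolution overview of one field could then request chunks under a single prefix, without touching the rest of the plate.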

References

[1] "Zarr - chunked, compressed, N-dimensional arrays". zarr.dev. Retrieved 2024-09-12.
[2] "Cloud-Optimized Geospatial Formats Guide: Zarr". guide.cloudnativegeo.org. Retrieved 2024-09-12.
[3] "Zarr Implementations". zarr.dev. Retrieved 2025-01-09.
[4] "Google Cloud: ERA5 data". cloud.google.com. Retrieved 2024-09-12.
[5] "Microsoft Planetary Computer: Reading Zarr Data". planetarycomputer.microsoft.com. Retrieved 2024-09-12.
[6] "zarr - PyPI". Retrieved 2025-02-10.
[7] Alistair Miles (2016-04-14). "To HDF5 and beyond". Retrieved 2025-02-10.
[8] "Zarr - Tutorial". zarr.readthedocs.io. Retrieved 2024-09-12.
[9] "Lazy loading: Making it easier to access vast datasets of weather & satellite data". openclimatefix.org. Archived from the original on 2024-09-12. Retrieved 2024-09-12.
[10] Sansal, Altay; Kainkaryam, Sribharath; Lasscock, Ben; Valenciano, Alejandro (2023). "MDIO: Open-source format for multidimensional energy data". The Leading Edge. 42 (7). Society of Exploration Geophysicists: 465–473. Bibcode:2023LeaEd..42..465S. doi:10.1190/tle42070465.1. ISSN 1938-3789.
[11] Moore, Josh (2023). "OME-Zarr: a cloud-optimized bioimaging file format with international community support". Histochemistry and Cell Biology. 160 (3). Springer Science and Business Media LLC: 223–251. doi:10.1007/s00418-023-02209-1. hdl:1721.1/151126. ISSN 1432-119X. PMC 10492740. PMID 37428210.

External links

Official website



