H5CPP
v1.14.0
Modern C++ templates for HDF5 serial and parallel I/O
Read and write any STL type, linear-algebra container, registered POD, or std::tuple to HDF5 in one line of C++. No manual H5Tcreate, no offset arithmetic, no error-code dance.
RAII handles close on scope exit. The on-disk format is canonical HDF5 — readable by Python, R, MATLAB, Julia, Fortran, or any other HDF5-capable tool — and for sparse matrices it's byte-compatible with scipy.sparse.csc_matrix, Julia HDF5.jl, and the 10x Genomics / Loompy convention out of the box.
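To make the sparse layout concrete: a CSC matrix is three flat arrays, which scipy.sparse.csc_matrix exposes as data, indices, and indptr. The sketch below builds them from a small dense matrix in plain C++. The `csc_arrays` struct and `to_csc` function are illustrative names, not H5CPP API, and the dataset names H5CPP uses inside the HDF5 group are not shown on this page, so they are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Compressed sparse column (CSC) arrays, named after the attributes of
// scipy.sparse.csc_matrix. Hypothetical helper type, not part of H5CPP.
struct csc_arrays {
    std::size_t rows = 0, cols = 0;
    std::vector<double> data;          // non-zero values, column by column
    std::vector<std::size_t> indices;  // row index of each stored value
    std::vector<std::size_t> indptr;   // column j occupies [indptr[j], indptr[j+1])
};

// Build the CSC arrays from a dense matrix given as a vector of rows.
inline csc_arrays to_csc(const std::vector<std::vector<double>>& dense) {
    csc_arrays m;
    m.rows = dense.size();
    m.cols = dense.empty() ? 0 : dense.front().size();
    for (const auto& row : dense) assert(row.size() == m.cols);  // rectangular input only

    m.indptr.push_back(0);
    for (std::size_t j = 0; j < m.cols; ++j) {
        for (std::size_t i = 0; i < m.rows; ++i) {
            if (dense[i][j] != 0.0) {
                m.data.push_back(dense[i][j]);
                m.indices.push_back(i);
            }
        }
        m.indptr.push_back(m.data.size());  // running count closes column j
    }
    return m;
}
```

Because the layout is just three one-dimensional arrays plus a shape, any HDF5-capable tool that reads them can rebuild the matrix without knowing which C++ library wrote it.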
H5CPP began from a practical requirement: efficient storage for large numerical datasets with both indexed block access and sequential streaming. Existing serialisation systems handled streams reasonably well, but not the kind of partial, multidimensional, file-backed access needed in numerical computing and financial engineering. HDF5 already had most of the right storage primitives — partial I/O, extendable datasets, compression, broad interoperability across operating systems and scientific environments. What was missing was a modern C++ interface.
| | Direct HDF5 in C/C++ | H5CPP |
|---|---|---|
| Datatype + dataspace | Manual H5Tcreate / H5Screate_simple per type | h5::write(fd, "x", value) — type deduced from T |
| Shape bookkeeping | Hand-tracked hsize_t[] arrays | Derived from the value's access_traits_t::size |
| Resource lifetime | Paired H5Fopen / H5Fclose, error-prone on early return | h5::fd_t — RAII close on scope exit |
| Container interop | Memcpy into intermediate buffers | Direct on std::vector, arma::Mat, Eigen::SparseMatrix, … |
| Error handling | Negative return codes + global error stack | 48-leaf typed exception hierarchy with HDF5 stack folded in |
| Parallel I/O | Separate FAPL setup; raw MPI driver calls | h5::fapl{ h5::driver::mpi{comm, info} } — same call shape |
| User structs | Hand-written H5T_COMPOUND per field | H5CPP_REGISTER_STRUCT(T) (POD) or h5cpp-compiler (non-POD); C++26 reflection on the roadmap |
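The "Resource lifetime" row is easiest to see in code. The following is a minimal sketch of the RAII idea only, assuming a hypothetical C-style API (`fake_open` / `fake_close`) in place of H5Fopen / H5Fclose; `scoped_handle` is an illustrative class and not the implementation of h5::fd_t.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a C-style resource API. It only counts open
// handles so that the effect of the wrapper is observable.
inline int& open_count() { static int n = 0; return n; }
inline int fake_open() { ++open_count(); return 42; }
inline void fake_close(int) { assert(open_count() > 0); --open_count(); }

// Minimal RAII wrapper: acquire in the constructor, release in the destructor.
class scoped_handle {
public:
    scoped_handle() : id_(fake_open()) {}
    ~scoped_handle() { if (id_ >= 0) fake_close(id_); }

    // Copying would close the same handle twice, so only moves are allowed.
    scoped_handle(const scoped_handle&) = delete;
    scoped_handle& operator=(const scoped_handle&) = delete;
    scoped_handle(scoped_handle&& other) noexcept : id_(std::exchange(other.id_, -1)) {}
    scoped_handle& operator=(scoped_handle&& other) noexcept {
        if (this != &other) {
            if (id_ >= 0) fake_close(id_);
            id_ = std::exchange(other.id_, -1);
        }
        return *this;
    }

    int get() const { return id_; }

private:
    int id_;
};

// Both return paths release the handle; neither contains a close call.
inline bool process(bool fail_early) {
    scoped_handle h;
    if (fail_early) return false;  // destructor still runs here
    return h.get() >= 0;
}
```

With paired open/close calls, the early return in `process` would need its own close; with the wrapper, the destructor covers every exit path, including exceptions.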
This documentation site is organised along three axes plus a project scorecard. Pick the one that matches what you're trying to do:
| Axis | Answers | Start here |
|---|---|---|
| I/O | "I have a `FILE` / `DATASET` / `ATTRIBUTE` / `GROUP` — what API do I call?" | API reference organised by HDF5 object kind |
| TOPICS | "What does h5cpp *support*, and how does it work under the hood?" | Capability reference: properties, linalg, STL, reflection, filters, MPI, error handling, type system, CDash, reports |
| COOKBOOK | "Just show me a runnable example." | 28 worked examples — basics, datasets, attributes, groups, compound, linalg, sparse, mdspan, transform, optimised, packet table, custom pipeline, references, smart pointers, reflection, multi-TU, half-float, CSV, S3, MPI, … |
| USABILITY | "What's actually supported in v1.14.0?" | Per-category feature scorecard (✔ / ◇ / ✘) — core I/O, type coverage, linalg, advanced HDF5, parallel & async, sparse, error handling, STL, type system, reflection, build & CI |
A few of the differentiators worth knowing about up front:
| Capability | One-liner | Reference |
|---|---|---|
| Linear algebra | Armadillo, Eigen, Blaze, Blitz++, dlib, IT++, Boost uBLAS, xtensor — same h5::write / h5::read on every library | LINEAR ALGEBRA |
| STL (sequence + associative + ragged) | std::vector, std::map, std::list, std::vector<std::vector<T>>, std::vector<std::string> — all routed via the unified dispatch | STL |
| Sparse matrices | arma::SpMat and Eigen::SparseMatrix<T, ColMajor> round-trip through canonical CSC group layout (scipy / Julia / 10x / Loompy interop) | LINEAR ALGEBRA |
| Streaming I/O (C++20 ranges) | for (auto v : h5::view<float>(ds)) … — iterate multi-GB datasets one chunk at a time, no materialisation | curated_io_api_dataset_view |
| Compiler-assisted reflection | Pre-build Clang tool emits compound descriptors + scatter/gather for any non-POD struct — no intrusive macros. C++26 reflection on the roadmap. | REFLECTION |
| Walter Brown feature detection | The dispatch asks structural questions about T (does it have data()? value_type? tuple_size?) — third-party container libraries (Abseil, Folly, EASTL, Boost.Container) work via the same path | STL |
| Filters | gzip, shuffle, fletcher32, nbit, Gorilla (Facebook time-series codec), custom plugins — composed via the \| operator | FILTERS |
| Parallel I/O | MPI-IO collective and independent transfer; works without a parallel filesystem (laptop testing, NFS-backed cluster) | MPI |
| Async mode | h5::async::create returns type-level-discriminated descriptors — operator hid_t() = delete'd so raw-CAPI escape fails at compile time | Async-mode handles |
| Context-sensitive exceptions | 48 typed leaves — catch(h5::error::io::attribute::open&) distinguishes from catch(h5::error::io::dataset::read&) | ERROR |
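The "structural questions" in the feature-detection row can be sketched with the C++17 `std::void_t` detection idiom. This is an illustration of the technique under stated assumptions, not H5CPP's actual dispatch: the trait names, the `storage_kind` enum, and the `classify` function are all hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>
#include <vector>

// Each primary template answers "no"; the partial specialisation is chosen
// only when the expression inside std::void_t is well-formed for T.
template <class T, class = void> struct has_data : std::false_type {};
template <class T>
struct has_data<T, std::void_t<decltype(std::declval<T&>().data())>> : std::true_type {};

template <class T, class = void> struct has_value_type : std::false_type {};
template <class T>
struct has_value_type<T, std::void_t<typename T::value_type>> : std::true_type {};

template <class T, class = void> struct is_tuple_like : std::false_type {};
template <class T>
struct is_tuple_like<T, std::void_t<decltype(std::tuple_size<T>::value)>> : std::true_type {};

// Hypothetical categories a dispatcher might route to.
enum class storage_kind { contiguous, tuple_like, iterable, scalar };

// Pick a category from the structure of T alone, never from its name.
template <class T>
constexpr storage_kind classify() {
    if constexpr (has_data<T>::value && has_value_type<T>::value)
        return storage_kind::contiguous;   // e.g. std::vector, std::string
    else if constexpr (is_tuple_like<T>::value)
        return storage_kind::tuple_like;   // e.g. std::tuple, std::pair
    else if constexpr (has_value_type<T>::value)
        return storage_kind::iterable;     // e.g. std::map, std::list
    else
        return storage_kind::scalar;       // e.g. double
}
```

Since the traits look only at members and expressions, a third-party container with a `data()` member and a `value_type` alias would land in the same branch as std::vector without being listed anywhere.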
Pre-built packages for v1.14.0, one row per platform — the H5CPP column is the header-only library (required); the COMPILER column is the optional Clang LibTooling-based reflection toolchain (needed only for non-POD struct persistence — see REFLECTION).
| Platform | Arch | Format | H5CPP (lib) | COMPILER (optional) |
|---|---|---|---|---|
| Ubuntu / Debian | x86_64 / amd64 | .deb | download (22 MB) | download (17 MB) |
| Ubuntu / Debian | aarch64 / arm64 | .deb | download (22 MB) | download (17 MB) |
| RHEL / Fedora / openSUSE | x86_64 | .rpm | download (22 MB) | download (17 MB) |
| RHEL / Fedora / openSUSE | aarch64 | .rpm | download (22 MB) | download (17 MB) |
| Linux (musl, distro-agnostic) | x86_64 | .tar.gz | — | download (33 MB) |
| macOS | Apple Silicon (arm64) | .pkg | download (23 MB) | download (2 KB ⚠) |
| Windows | x64 | .exe | download (25 MB) | download (9 MB) |
⚠ The macOS COMPILER .pkg is currently a 2 KB stub; a full macOS package is in the works. Build from source via the h5cpp-compiler repo if you need it on Apple Silicon today.
SHA-256 checksums on each release page: h5cpp v1.14.0 · h5cpp-compiler v1.12.6 · browse all h5cpp releases / all compiler releases for older versions and pre-release builds.
Write your first program — see the snippet at the top of this page.
CI matrix, ASan / UBSan / TSan, code coverage, and the public CDash dashboard:
| OS / Compiler | GCC 13 | GCC 14 | Clang 17–20 | Apple Clang | MSVC |
|---|---|---|---|---|---|
| Ubuntu 22.04 / 24.04 | ✔ | ✔ | ✔ | ∅ | ∅ |
| macOS 15 arm64 | ∅ | ∅ | ∅ | ✔ | ∅ |
| Windows | ∅ | ∅ | ∅ | ∅ | ✔ |
std::mdspan); C++26 reflection on the roadmap

The v1.14.0 usability scorecard has the full picture. Highlights since the earlier evaluation:

- std::map, std::list, std::set, ragged vector<vector<T>> all routed through the unified Walter Brown trait-based dispatch
- h5::async::* descriptor types with operator hid_t() = delete so the type system prevents raw-CAPI escape paths
- h5::view<T>: C++20 ranges view over rank-1 chunked datasets, decompressed through the filter pipeline
- .pkg for Apple Silicon

| Coming from… | Start at | Why |
|---|---|---|
| HDF5 CAPI | IO API | One-to-one mapping from HDF5 verbs (H5Fcreate, H5Dwrite, H5Aopen) to h5cpp's typed equivalents |
| h5py (Python) | IO API | Side-by-side syntax comparison |
| Boost.Serialization / Cereal / nlohmann::json | REFLECTION | How h5cpp's compiler-assisted reflection avoids intrusive macros |
| HighFive | the bench/ suite | Methodology used to compare h5cpp against HighFive and raw CAPI |
| A new C++ project | The quickstart above + Basic H5CPP Operations | 60 seconds to a working h5::write / h5::read round-trip |
| A linear-algebra-heavy codebase | LINEAR ALGEBRA | Per-library mapper headers and storage-order rules |
| An LLM-tooling pipeline (RPC schemas, JSON Schema) | h5cpp-compiler Multi-Backend Architecture | One struct → HDF5 + Protobuf + JSON Schema + SQL + Avro from a single source of truth |
H5CPP began as a small collection of templates. Early contact with The HDF Group helped shape the first H5CPP11 project, and later informed the C++17 version. I am especially grateful to Gerd Heber for his sustained guidance, generosity, and friendship over the years; to Elena Pourmal and David Pareah for their encouragement and support on the project’s direction; and to Mark Paterno and Chris at Fermilab for their thoughtful input. Many of H5CPP’s stronger ideas were sharpened through those discussions. Any mistakes, omissions, or rough edges remain entirely my own.
H5CPP has been presented in HDF5 and C++ community venues over multiple years, including HUG sessions, HDF Group events, C++ community talks, and ISC-related material. Topics covered include compiler-assisted reflection, POD introspection, MPI/parallel I/O, throughput/latency trade-offs, and practical HDF5 workflows.
| Source | https://github.com/vargalabs/h5cpp |
| h5cpp-compiler | https://github.com/vargalabs/h5cpp-compiler |
| License | MIT |
| Citation | Zenodo DOI 10.5281/zenodo.20123216 |
| Contact | Steven Varga · steven@vargalabs.com · https://steven-varga.ca/ |
Documentation built with Doxygen + the doxygen-awesome-css theme.