H5CPP
v1.14.0
Modern C++ templates for HDF5 serial and parallel I/O
C++ already knows the layout of your structs. HDF5 does not. Bridging the two by hand means maintaining a parallel schema in H5Tinsert calls that drifts out of sync every time a field is added, renamed, or reordered.
h5cpp removes that tax. You write ordinary C++ structs with small [[h5::...]] annotations; the H5CPP compiler scans them, classifies each struct by its fields, and emits the correct HDF5 compound descriptors and serialisation bodies into generated.h. Your C++ type stays the single source of truth.
Storing a struct in HDF5 without reflection looks like this:
Change one member and the descriptor silently corrupts. Add std::string or std::vector<T> and the problem escalates: you now need hvl_t relays, chunked extendable datasets, append logic, unpack logic, and HDF5 memory reclamation — all by hand.
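The drift described above can be made concrete without linking HDF5. The sketch below is a stand-in under stated assumptions: `sensor_t`, `field_desc`, `manual_descriptor`, and `within_bounds` are hypothetical names, and each `field_desc` entry records only what one `H5Tinsert(type, name, offset, member_type)` call would be told (a name, a byte offset, and a size).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record type, used only for this illustration.
struct sensor_t {
    int    id;
    double temperature;
    float  voltage;
};

// Stand-in for the arguments of one H5Tinsert call (not the HDF5 API itself).
struct field_desc {
    std::string name;
    std::size_t offset;
    std::size_t size;
};

// Hand-maintained schema: every entry must be kept in step with sensor_t.
inline std::vector<field_desc> manual_descriptor() {
    return {
        {"id",          offsetof(sensor_t, id),          sizeof(int)},
        {"temperature", offsetof(sensor_t, temperature), sizeof(double)},
        {"voltage",     offsetof(sensor_t, voltage),     sizeof(float)},
    };
}

// Returns true when every described field lies inside an object of `total` bytes.
// It cannot detect a field that was never described.
inline bool within_bounds(const std::vector<field_desc>& desc, std::size_t total) {
    for (const auto& f : desc)
        if (f.offset + f.size > total) return false;
    return true;
}
```

If a member were added to `sensor_t` without a matching entry in `manual_descriptor()`, this code would still compile and `within_bounds` would still return true; nothing flags the missing field.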
Write the struct once, with annotations where the default is wrong:
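A sketch of what such a declaration might look like, using only attribute names that appear in this README. The struct `reading_t`, its fields, the argument values, and the exact attribute placement are assumptions for illustration. A compiler that does not know the `h5` attribute namespace ignores these attributes (it may emit a warning), so the type remains ordinary C++.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record; attribute placement is an assumption, not confirmed h5cpp syntax.
struct [[h5::chunk(1024)]] [[h5::compress("gzip", 6)]] reading_t {
    std::uint64_t timestamp;                      // stored under its C++ name
    [[h5::name("temp_c")]] double temperature;    // renamed on disk
    std::string label;                            // variable-length text field
    std::vector<double> samples;                  // variable-length numeric field
    [[h5::ignore]] int scratch;                   // not persisted
};
```

Because `label` and `samples` are `std::string` and `std::vector<T>` fields, a struct like this would fall under the tier-2 classification described in this README.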
The compiler emits the descriptor into generated.h:
You do not write, review, or maintain that code. The compiler generates it when types.h changes.
h5cpp classifies every struct into one of two tiers before emission.
Tier-1 covers arithmetic fields, fixed C arrays, nested POD structs, and the serialize_full escape hatch. The emitted register_struct<T>() registers a compound type with h5cpp's normal h5::write / h5::read path.
Tier-2 covers structs with std::string or std::vector<T> fields. The compiler emits a row_t mirror, a compound_type() helper that creates H5T_VARIABLE and H5T_VLEN base types, and scatter<T> / gather<T> bodies that marshal the live object. Example:
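The marshalling idea can be sketched in plain C++ without HDF5. Everything named here is hypothetical: `event_t` is an invented record, `vlen_t` is a stand-in with the same shape as HDF5's `hvl_t`, and `to_row` / `from_row` play the role this README assigns to the generated scatter / gather bodies. The real generated code is templated and may differ in naming and direction.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical live record with variable-length fields.
struct event_t {
    int                 id;
    std::string         label;
    std::vector<double> samples;
};

// Stand-in with the same layout idea as HDF5's hvl_t { size_t len; void* p; }.
struct vlen_t {
    std::size_t len;
    void*       p;
};

// Flat mirror of event_t: only scalars, pointers, and lengths.
struct row_t {
    int         id;
    const char* label;
    vlen_t      samples;
};

// Live object -> flat row. The row borrows pointers from `src`,
// so it is valid only while `src` is alive and unmodified.
inline row_t to_row(const event_t& src) {
    return row_t{src.id, src.label.c_str(),
                 vlen_t{src.samples.size(),
                        const_cast<double*>(src.samples.data())}};
}

// Flat row -> owning object. Deep-copies the borrowed buffers.
inline event_t from_row(const row_t& row) {
    const double* first = static_cast<const double*>(row.samples.p);
    event_t out;
    out.id    = row.id;
    out.label = row.label ? row.label : "";
    if (first) out.samples.assign(first, first + row.samples.len);
    return out;
}
```

In a real read path, the buffers behind the row would be allocated by HDF5 and would need to be reclaimed after the copy; that step is omitted from this sketch.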
| File | Purpose |
|---|---|
| `types.h` | Annotated C++ record declarations: tier-1 POD, tier-2 VLEN, nested POD, `serialize_full`, `name_all`, `on_missing` |
| `reflection.cpp` | Round-trip demo for reflected structs and generic library types |
| `generated.h` | H5CPP compiler output: compound descriptors, VLEN mirrors, scatter/gather bodies |
Include order:
The executable writes reflection.h5 in the current directory. Inspect the schema:
| Attribute | Scope | Effect |
|---|---|---|
| `[[h5::doc("...")]]` | struct | Documentation propagated into `generated.h` |
| `[[h5::alias("...")]]` | struct | Logical name for the generated namespace |
| `[[h5::version("...")]]` | struct | Schema version metadata in generated output |
| `[[h5::name("...")]]` | field | Rename one field on disk; overrides `name_all` |
| `[[h5::name_all("pre", "suf")]]` | struct | Prefix/suffix applied to every field name |
| `[[h5::ignore]]` | field | Omit the field from persistence |
| `[[h5::chunk(N)]]` | struct | Chunk size for tier-2 datasets |
| `[[h5::compress("gzip", N)]]` | struct | Compression filter for tier-2 datasets |
| `[[h5::on_missing("create")]]` | struct | Create dataset when missing |
| `[[h5::on_missing("ignore")]]` | struct | Return early when dataset is missing |
| `[[h5::serialize_full]]` | struct | Force tier-1 emission; non-POD fields are silently skipped |
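The interaction between `[[h5::name]]` and `[[h5::name_all]]` can be sketched as a small name-resolution rule. `disk_name` is a hypothetical helper, not part of h5cpp, and it assumes that a per-field override is used verbatim, with no prefix or suffix applied; the table only states that `name` overrides `name_all`.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: resolves the on-disk name of one field.
// Assumption: a [[h5::name]] override is taken verbatim and bypasses name_all.
inline std::string disk_name(const std::string& member,
                             const std::optional<std::string>& name_override,
                             const std::string& prefix = "",
                             const std::string& suffix = "") {
    if (name_override) return *name_override;  // per-field rename wins
    return prefix + member + suffix;           // otherwise apply name_all
}
```

Under that assumption, a field `temp` in a struct annotated with `name_all("s_", "_v1")` resolves to `s_temp_v1`, unless it also carries its own `name`.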
The second half of the example demonstrates h5cpp's generic access_traits_t dispatch. These types round-trip through h5::write / h5::read without compiler assistance:
| Type | Storage model | Mechanism |
|---|---|---|
| `std::tuple<int, double, char>` | scalar compound | pack / unpack flat buffer |
| `std::vector<std::tuple<int, float>>` | rank-1 compound | `elem_traits::pack` each element |
| `std::vector<std::complex<double>>` | rank-1 compound/native | direct write/read |
| `std::map<int, double>` | key-value compound | `{ key, value }` rows |
| `std::set<int>` | rank-1 dataset | staged iterator write |
| `std::vector<std::string>` | VLEN text dataset | `char*` relay + reclaim |
| `std::vector<std::vector<double>>` | ragged VLEN dataset | `hvl_t` relay + reclaim |
The compiler is for your domain structs — the things HDF5 cannot infer and C++17 cannot reflect. Standard containers are already handled by the library.
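The "pack / unpack flat buffer" mechanism listed for `std::tuple` can be illustrated with a short C++17 sketch. `pack_tuple` and `unpack_tuple` are hypothetical names, and the packed layout chosen here (elements back to back, no padding) is an assumption; the library's actual `access_traits_t` implementation may lay out the buffer differently.

```cpp
#include <cstddef>
#include <cstring>
#include <tuple>
#include <type_traits>
#include <vector>

// Copies each tuple element, in order, into a contiguous byte buffer.
template <class... Ts>
std::vector<unsigned char> pack_tuple(const std::tuple<Ts...>& t) {
    static_assert((std::is_trivially_copyable_v<Ts> && ...),
                  "memcpy-based packing needs trivially copyable elements");
    std::vector<unsigned char> buf((sizeof(Ts) + ... + 0));
    std::size_t off = 0;
    std::apply([&](const Ts&... v) {
        ((std::memcpy(buf.data() + off, &v, sizeof(Ts)), off += sizeof(Ts)), ...);
    }, t);
    return buf;
}

// Rebuilds the tuple from a buffer produced by pack_tuple<Ts...>.
// Precondition: buf holds at least the summed size of all element types.
template <class... Ts>
std::tuple<Ts...> unpack_tuple(const std::vector<unsigned char>& buf) {
    static_assert((std::is_trivially_copyable_v<Ts> && ...),
                  "memcpy-based unpacking needs trivially copyable elements");
    std::tuple<Ts...> t{};
    std::size_t off = 0;
    std::apply([&](Ts&... v) {
        ((std::memcpy(&v, buf.data() + off, sizeof(Ts)), off += sizeof(Ts)), ...);
    }, t);
    return t;
}
```

A flat buffer like this maps naturally onto a compound type whose member offsets are the running sums of the element sizes, which is why no compiler assistance is needed for tuples.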
| Change | Manual HDF5 cost | Compiler-assisted cost |
|---|---|---|
| Add POD field | Add `H5Tinsert` manually | Rebuild |
| Rename field on disk | Edit string literal manually | Add `[[h5::name("...")]]` |
| Add fixed array | Create `H5Tarray_create` | Rebuild |
| Add nested struct | Build nested compound | Rebuild |
| Add `std::string` | Write VLEN string plumbing | Rebuild |
| Add `std::vector<T>` | Write `hvl_t` packing/unpacking | Rebuild |
| Add chunking | Edit DCPL manually | Add `[[h5::chunk(N)]]` |
| Add compression | Edit DCPL manually | Add `[[h5::compress("gzip", N)]]` |
| Skip internal field | Remember not to insert it | Add `[[h5::ignore]]` |
| Change missing-path policy | Edit scatter/gather logic | Add `[[h5::on_missing("...")]]` |
Today's implementation uses a Clang-based compiler tool that parses [[h5::...]] attributes and emits generated.h. Under C++26 (P2996 + P3394), the same work moves into the language itself:
Library code will call std::meta::members_of(^sensor_t) to enumerate fields at compile time, read annotations via std::meta::annotations_of, and produce the same dt_t<T> and scatter<T> specialisations that generated.h contains today. The H5CPP compiler becomes an optional convenience; the vocabulary stays identical. This example is therefore both a practical C++17 tool and a preview of the zero-tooling path that C++26 unlocks.
The GENERATED line tells CMake to invoke the H5CPP compiler before compiling the example. When types.h changes, generated.h is re-emitted automatically.
| Target | Status |
|---|---|
| `examples-reflection` | OK: tier-1, tier-2, annotations, and type-system round-trips verified |
No external dependencies.
- `examples/compound/`: smaller tier-1 / tier-2 reflection example
- `examples/attributes/`: exhaustive type-check matrix for `register_struct`
- `examples/container/`: generic STL and structural container dispatch
- `examples/multi-tu/`: generated descriptor use across multiple translation units
- `tasks/h5cpp-compiler-h5-attribute-taxonomy.md`: full annotation vocabulary
- `tasks/h5cpp-type-system-architecture-notes.md`: kind × storage dispatch matrix
- `generated.h`: rendered with syntax highlighting
- `reflection.cpp`: rendered with syntax highlighting
- `types.h`: rendered with syntax highlighting