H5CPP v1.14.0: Modern C++ templates for HDF5 serial and parallel I/O
- Date: 2026-05-21
- Authors: Steven Varga, Winston (Architecture)
- Status: Verified; years confirmed via web search 2026-05-21; novel-combination claim verified
- Repo: vargaconsulting/h5cpp-compiler
- Use: Related-work section for papers, pitch-deck context, reference when explaining the project's position in the landscape
The technique h5cpp-compiler uses — Clang-AST-walking codegen for type persistence — has a clear precedent: CERN ROOT's rootcling (2014), which generates per-class streamers for ROOT files. The pattern itself is older still: Qt MOC (~1995), SWIG (1996).
What appears to be unique to h5cpp-compiler is the combination: Clang-AST-walking plus HDF5 as the target plus zero-copy scatter/gather for std::vector-and-similar fields plus the strict "guaranteed only for recursively contiguous types" semantic. No exact match was found via web search. The closest HDF5 alternative, HighFive (BlueBrain, 2015), is explicitly a manual wrapper — the user declares HDF5 compound types by hand.
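Project lineage matters for the claim. The h5cpp work itself has multiple generations:

- First generation: a precursor predating h5cpp11, not yet documented here.
- Second generation: h5cpp11 (earliest commits Dec 2017), a C++11 header-only, purely template-based HDF5 library with no Clang AST codegen.
- Current generation: h5cpp + h5cpp-compiler (Fall 2018), which adds the Clang AST walker on top of the template foundation.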
So the templates have been refined since 2017; the Clang AST walker is the 2018 addition. Both the long template lineage and the codegen step matter when positioning the work.
All years below were confirmed via web search 2026-05-21. Confidence indicators have been removed; every entry below is supported by at least one cited source (see "Sources" at the bottom).
| Year | Tool | Notes |
|---|---|---|
| ~1992–95 | Qt MOC | First Qt release 1995 (Qt 0.9); MOC has been in Qt from the beginning. Walks Q_OBJECT classes via own preprocessor, generates metadata + signal/slot dispatch |
| 1994–95 | CERN ROOT | Development started 1994 (Brun & Rademakers, CERN); first public release v0.5 in November 1995 |
| Feb 1996 | SWIG (David Beazley) | Originated at Los Alamos National Laboratory; walks C++ headers, generates language-binding glue using own parser |
| 2014 | rootcling / Cling (CERN ROOT 6) | Clang/LLVM-based replacement for the older rootcint; closest direct analog to h5cpp-compiler's approach. Generates per-class streamer code from C++ class declarations annotated via LinkDef.h |
| 2018 | C++ static reflection — P1240R0 | Sutton, Vali, Vandevoorde; first revision October 8, 2018. Succeeded by P2996 for C++26. Would let h5cpp-compiler do its work inside the user's compilation without Clang Tooling |
| 2019 | refl-cpp (Veselin Karaganev) | Header-only manual reflection library with macros declaring fields |
| Dec 2017 | h5cpp11 (Steven Varga) — second-generation precursor to current h5cpp | C++11 header-only HDF5 library; pure template-based, no Clang AST codegen. Earliest commits Dec 2017; repository published to GitHub July 19, 2018; archived 2022. Same lineage as current h5cpp (templates between linear-algebra libraries — armadillo, eigen3, ublas, blitz++ — and HDF5 datasets) but without the compiler. There is also a first-generation precursor predating h5cpp11; not yet documented here |
| Fall 2018 | h5cpp + h5cpp-compiler (Steven Varga) — current generation | First introduced at Chicago C++ Usergroup meeting, Fall 2018. Presented at ISC'19 BOF (Frankfurt). Already framed at debut as *"low latency MPI capable persistence"* — the MPI angle is part of the original pitch. Collaboration with The HDF Group. This is the generation that added the Clang AST walker on top of the template foundation laid down by h5cpp11. |
| Year | Tool | Notes |
|---|---|---|
| 2007 | Apache Thrift | Facebook open-sourced in April 2007; donated to Apache 2008; TLP October 2010. Multi-language IDL codegen |
| Jul 2008 | Protocol Buffers | Public release July 7, 2008 (internal at Google since ~2001); .proto files compiled by protoc |
| Apr 2013 | Cap'n Proto (Kenton Varda) | Released April 1, 2013 by the primary author of Protocol Buffers v2. Zero-copy schema-first |
| Jun 2014 | FlatBuffers (Google) | Released June 17, 2014. Zero-copy schema-first, games/mobile focus |
| Year | Tool | Notes |
|---|---|---|
| 2002–04 | Boost.Serialization (Robert Ramey) | Development from 2002; first Boost release in 1.32 on Nov 19, 2004. Intrusive serialize() methods + visitors |
| ~2013 | Cereal (Voorhies / Grant) | Modern C++11 serialization library; intrusive macros |
| 2014 | Boost.PFR / Magic Get (Antony Polukhin) | Originally "magic_get"; compile-time POD reflection via structured-bindings tricks. Tier-1 only by our taxonomy, no compiler tooling |
| ~2014–15 | Rust serde (David Tolnay) | Pre-1.0 development from ~2014–15; serde 1.0 released May 2017. Proc-macro derive; closest cross-language analog: "compiler generates per-type serializer from the type definition" |
| Year | Format / Tool | Notes |
|---|---|---|
| 1998 | HDF5 (NCSA → HDF Group) | The target format itself. NCSA released with DOE/NASA/NCSA support. HDF Group spun off in 2006 |
| Oct 2007 | JSON Schema (Kris Zyp first proposal) | First formal draft December 2009 |
| 2008 | MessagePack (Sadayuki Furuhashi) | Announced August 16, 2008 |
| 2009 | Apache Avro | Initial release 2009; v1.0.0 + Apache TLP May 2010 |
| 2008–09 | h5py (Andrew Collette) | Python HDF5 with automatic runtime type mapping |
| 2011 | Swagger | Renamed OpenAPI ~2015 |
| Oct 2013 | CBOR (RFC 7049) | Binary JSON, IoT-focused |
| Jun 2014 | FlatBuffers | (also in schema-first table) |
| 2015 | GraphQL (Facebook open-source) | Internal since 2012, public release 2015 |
| 2015 | HighFive (BlueBrain) | Started 2015 as part of the Blue Brain Project. Closest HDF5-space alternative to h5cpp. Manual struct-to-compound mapping, no AST codegen |
| Feb / Oct 2016 | Apache Arrow | Announced February 17, 2016; first release v0.1.0 October 7, 2016 |
Sifting the timeline above, the components h5cpp-compiler combines all exist as prior art individually:
- HDF5 variable-length datatypes (H5Tvlen_create, hvl_t) exist, but using them automatically from a Clang AST walk has no published precedent we found

The unique combination:
Clang AST walker discovers user struct types via h5::write call sites → emits a per-type shim that builds HDF5 compound-with-VLEN descriptors pointing directly at vector.data() → library issues a single H5Dwrite with zero intermediate buffer.
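The descriptor step of this pipeline can be illustrated without HDF5 itself. What follows is a minimal sketch, assuming a stand-in struct `vlen_desc` that mirrors the two fields of HDF5's `hvl_t` (`len` and `p`). The names `sample_t`, `sample_image`, and `make_image` are hypothetical and not taken from h5cpp. It demonstrates only the zero-copy property: the descriptor points at the vector's own storage, so no intermediate buffer is filled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in with the same two fields as HDF5's hvl_t.
struct vlen_desc {
    std::size_t len;  // number of elements in the sequence
    void*       p;    // pointer to the first element
};

// Example user type: one fixed-size field and one std::vector field.
struct sample_t {
    std::uint32_t       id;
    std::vector<double> values;
};

// Memory image of one record as a compound-with-VLEN type would describe it.
struct sample_image {
    std::uint32_t id;
    vlen_desc     values;
};

// Hypothetical per-type shim of the kind a generator might emit.
// It copies the scalar field and borrows the vector's buffer;
// the element data itself is never copied.
inline sample_image make_image(sample_t& s) {
    return sample_image{ s.id, vlen_desc{ s.values.size(), s.values.data() } };
}
```

The image is valid only while the source vector is alive and not reallocated. In an actual write path, such images would presumably be handed to H5Dwrite together with a compound datatype whose variable-length member was built with H5Tvlen_create; that part is omitted here.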
A focused web search ("HDF5 zero-copy scatter gather C++ struct compound type automatic generation") returned Steven Varga's own blog post as the canonical reference — Zero-Cost C++ Structs to HDF5 Compound Types with H5CPP. No competing publications found.
HighFive (BlueBrain) is the most-similar HDF5 C++ library. The key distinction:
| Feature | HighFive (2015) | h5cpp-compiler (2018) |
|---|---|---|
| Library | Header-only C++14/17 wrapper around HDF5 C API | Header-only C++17 library + LLVM-based source-transformation tool |
| Struct → compound mapping | Manual — user calls HighFive::CompoundType::create<MyStruct>(…) and registers each field explicitly | Automatic — Clang AST walker discovers struct definitions via call sites; emits register_struct<T>() specializations |
| Scatter for std::vector fields | Not implemented at this level; user handles indirection manually | Designed-in (target state for tier 2; see scatter/gather design doc) |
| MPI claim | None made | Explicit from the project's debut |
| Codegen tooling | None | Clang Tooling-based |
HighFive is the high-quality manual alternative; h5cpp-compiler is the automatic alternative. They are complementary rather than competing.
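The "Struct → compound mapping" row can be made concrete with a library-free sketch. It assumes a hypothetical `field_desc` record and a `register_struct<T>()` in a `sketch` namespace that returns a plain field list instead of an HDF5 datatype handle. Only the idea of a per-type specialization comes from the comparison above; the signature and all other names are illustrative. The specialization body is what a user would write by hand in the manual approach, and what a generator could emit after seeing the struct definition.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical description of one struct member (name, byte offset, byte size).
struct field_desc {
    std::string name;
    std::size_t offset;
    std::size_t size;
};

// Primary template is declared but not defined: a type without a
// specialization fails at link time instead of being silently accepted.
template <class T>
std::vector<field_desc> register_struct();

// Example user type (standard layout, so offsetof is well defined).
struct particle_t {
    int    id;
    double mass;
    float  charge;
};

// Specialization of the kind a code generator might emit for particle_t.
template <>
inline std::vector<field_desc> register_struct<particle_t>() {
    return {
        {"id",     offsetof(particle_t, id),     sizeof(particle_t::id)},
        {"mass",   offsetof(particle_t, mass),   sizeof(particle_t::mass)},
        {"charge", offsetof(particle_t, charge), sizeof(particle_t::charge)},
    };
}

}  // namespace sketch
```

With HDF5 in the picture, each entry would correspond to one member inserted into a compound datatype; the sketch stops at the field list so that it stays self-contained.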
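rootcling is the strongest direct analog. Both tools use a Clang-based front end to read C++ class declarations and generate per-type persistence code from them.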
Differences:
| | rootcling | h5cpp-compiler |
|---|---|---|
| Target file format | ROOT .root | HDF5 .h5 |
| Discovery mechanism | User-supplied LinkDef.h listing classes to wrap | AST walker latches onto h5::write / h5::read call sites |
| Runtime model | TClass + TStreamerInfo (per-class runtime dictionary) | Compile-time template specialization (register_struct<T>, future scatter<T> / gather<T>) |
| Container support | STL containers via TStreamerInfo | std::vector + extensible via adapter trait |
| Polymorphism | Yes, via RTTI + class hierarchies | Out of scope (tier 4 reserved with [[h5cpp::serialize_full]]) |
rootcling is older, more mature, and broader in scope (polymorphism, multi-format I/O via ROOT's TFile). h5cpp-compiler is narrower (HDF5 only) and sharper (zero-copy scatter as a first-class concern).
Claims we can make confidently, with this verification behind us:
Claims we should be careful about:
All cited URLs verified accessible 2026-05-21.
gh api repos/steven-varga/h5cpp11); earliest commits Dec 2017