H5CPP
v1.14.0
Modern C++ templates for HDF5 serial and parallel I/O
h5cpp recognises dense and sparse linear-algebra containers from eight upstream libraries as first-class template arguments to h5::read, h5::write, and h5::create. Each library is opted into the dispatch by a single mapper header — until the header is included, the dispatch leaves the library's types out of storage_representation_v<T> and h5::write(file, "ds", x) will not compile.
Sparse matrices and vectors land in HDF5 as a group containing four datasets and two scalar string attributes. The layout is canonical Compressed Sparse Column (CSC) and is byte-for-byte compatible with scipy.sparse.csc_matrix, Julia's HDF5.jl sparse reader, 10x Genomics' Cell Ranger output, and the Loompy file format.
| Component | Disk dtype | Length | Carries |
|---|---|---|---|
| data | element T | nnz | non-zero values in column-major order |
| indices | uint32 | nnz | row index of each non-zero |
| indptr | uint32 | n_cols + 1 | offset into data / indices where each column starts; last entry equals nnz |
| shape | uint64 | 2 | [n_rows, n_cols], the full logical extent |
| @format | string | scalar attr | always "csc" |
| @axis | string | scalar attr | always "column" |
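To make the four datasets concrete, the following is a minimal sketch that derives data, indices, indptr, and shape from a small dense column-major buffer. The names csc_parts and to_csc are hypothetical and are not part of h5cpp; only the field widths are taken from the table above, and the element type is fixed to double for brevity.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory mirror of the on-disk group (not an h5cpp type).
struct csc_parts {
    std::vector<double>          data;     // non-zero values, column by column
    std::vector<std::uint32_t>   indices;  // row index of each non-zero
    std::vector<std::uint32_t>   indptr;   // n_cols + 1 column offsets
    std::array<std::uint64_t, 2> shape;    // {n_rows, n_cols}
};

// Build the CSC parts from a dense column-major buffer of n_rows * n_cols values.
inline csc_parts to_csc(const std::vector<double>& dense,
                        std::size_t n_rows, std::size_t n_cols) {
    assert(dense.size() == n_rows * n_cols);
    csc_parts out;
    out.shape = {{n_rows, n_cols}};
    out.indptr.push_back(0);
    for (std::size_t c = 0; c < n_cols; ++c) {
        for (std::size_t r = 0; r < n_rows; ++r) {
            const double v = dense[c * n_rows + r];
            if (v != 0.0) {
                out.data.push_back(v);
                out.indices.push_back(static_cast<std::uint32_t>(r));
            }
        }
        // Offset where the next column starts; the final push equals nnz.
        out.indptr.push_back(static_cast<std::uint32_t>(out.data.size()));
    }
    return out;
}
```

For a 3×3 matrix whose columns are (1, 0, 4), (0, 0, 0), and (2, 3, 0), this yields data = [1, 4, 2, 3], indices = [0, 2, 0, 1], and indptr = [0, 2, 2, 4]; the empty middle column shows up as two equal consecutive offsets.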
Index width is fixed uint32 on disk regardless of the upstream library's index type (arma::uword, Eigen::Index, std::ptrdiff_t). h5cpp converts in or out of uint32 at the file boundary, preserving the in-memory precision but matching the 10x/Loompy on-disk convention. An overflow guard at write time throws h5::error::io::dataset::write when any of {nnz, n_rows, n_cols} exceeds 2^32 - 1.
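The guard and the narrowing step can be sketched as follows. This is an illustration only: check_fits_u32 and narrow_indices are hypothetical names, and std::overflow_error stands in for h5::error::io::dataset::write so the snippet compiles without h5cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <limits>
#include <stdexcept>
#include <vector>

// Largest value a uint32 index can hold: 2^32 - 1.
constexpr std::uint64_t u32_limit = std::numeric_limits<std::uint32_t>::max();

// Hypothetical write-time guard: reject extents that cannot be stored as uint32.
inline void check_fits_u32(std::uint64_t nnz, std::uint64_t n_rows, std::uint64_t n_cols) {
    for (std::uint64_t v : {nnz, n_rows, n_cols})
        if (v > u32_limit)
            throw std::overflow_error("sparse extent exceeds uint32 range");
}

// Narrow 64-bit in-memory indices to the on-disk width. Once the guard has
// passed, every row index is below n_rows and every offset is at most nnz,
// so the cast cannot lose information.
inline std::vector<std::uint32_t> narrow_indices(const std::vector<std::uint64_t>& src) {
    std::vector<std::uint32_t> out;
    out.reserve(src.size());
    for (std::uint64_t v : src) {
        assert(v <= u32_limit);  // precondition established by check_fits_u32
        out.push_back(static_cast<std::uint32_t>(v));
    }
    return out;
}
```

Checking the three extents once is cheaper than checking every index, which is presumably why the guard is phrased in terms of nnz, n_rows, and n_cols.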
Sparse vectors are promoted to single-column matrices — arma::SpCol<T> and Eigen::SparseVector<T,ColMajor,I> round-trip as N×1 CSC matrices; arma::SpRow<T> rolls up to a 1×N CSC matrix. This keeps cross-tool consumers (scipy, Julia, 10x) on a single canonical reader path.
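As a rough illustration of the promotion, the sketch below turns the same sparse vector into an N×1 and a 1×N CSC matrix. The types csc_vec and entries and both functions are hypothetical; they only mirror the on-disk fields described above and do not use the Armadillo or Eigen APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical mirror of the CSC group for a promoted vector.
struct csc_vec {
    std::vector<double>        data;
    std::vector<std::uint32_t> indices;
    std::vector<std::uint32_t> indptr;
    std::uint64_t n_rows = 0, n_cols = 0;
};

// (position, value) pairs, sorted by position, with distinct positions.
using entries = std::vector<std::pair<std::uint32_t, double>>;

// Column vector of length n -> N x 1: one column holds every non-zero.
inline csc_vec promote_column(const entries& e, std::uint32_t n) {
    csc_vec out;
    out.n_rows = n;
    out.n_cols = 1;
    for (const auto& kv : e) {
        assert(kv.first < n);
        out.indices.push_back(kv.first);  // position becomes the row index
        out.data.push_back(kv.second);
    }
    out.indptr = {0u, static_cast<std::uint32_t>(e.size())};
    return out;
}

// Row vector of length n -> 1 x N: each non-zero sits in row 0 of its own column.
inline csc_vec promote_row(const entries& e, std::uint32_t n) {
    csc_vec out;
    out.n_rows = 1;
    out.n_cols = n;
    out.indptr.assign(static_cast<std::size_t>(n) + 1, 0u);
    for (const auto& kv : e) {
        assert(kv.first < n);
        out.indices.push_back(0u);
        out.data.push_back(kv.second);
        out.indptr[kv.first + 1] = 1u;    // one non-zero in this column
    }
    for (std::size_t c = 1; c <= n; ++c)  // prefix sum: counts -> offsets
        out.indptr[c] += out.indptr[c - 1];
    return out;
}
```

For a length-4 vector with non-zeros at positions 1 and 3, the column form has indptr = [0, 2] while the row form has indptr = [0, 0, 1, 1, 2]; the data array is identical in both.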
Preconditions before write:

- Armadillo: SpMat::sync() must have completed.
- Eigen: src.makeCompressed() must have been called, and the matrix must be ColMajor. RowMajor sparse triggers a compile-time static_assert.

h5cpp does not call sync() or makeCompressed() implicitly: either would mutate a const matrix and break thread-safety.
Sparse types are routed by the h5::meta::is_sparse_v<T> SFINAE guard onto a dedicated overload set in h5cpp/H5Dsparse.hpp — they sit outside the storage_representation_t enum that dense and STL paths use, since the on-disk shape is a group rather than a single dataset. The generic sparse_traits<T> accessor contract is declared in h5cpp/H5Tsparse.hpp; per-library specialisations live with each mapper header (H5Marma.hpp for arma::SpMat/SpRow/SpCol, H5Meigen.hpp for Eigen::SparseMatrix/SparseVector).
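The routing idea can be sketched with a plain SFINAE pair. Everything here lives in a made-up namespace sketch: the trait mirrors the name is_sparse_v from the text, but toy_sparse, the write overloads, and their string return values are assumptions for illustration and say nothing about the real signatures in H5Dsparse.hpp.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

namespace sketch {

// Primary template: no type is sparse until a mapper specialises the trait.
template <class T> struct is_sparse : std::false_type {};
template <class T> constexpr bool is_sparse_v = is_sparse<T>::value;

// Hypothetical stand-in for a sparse matrix type such as arma::SpMat<T>.
struct toy_sparse {};
template <> struct is_sparse<toy_sparse> : std::true_type {};

// Dense overload: participates in overload resolution only when T is not sparse.
template <class T>
typename std::enable_if<!is_sparse_v<T>, std::string>::type
write(const T&) { return "dataset"; }

// Sparse overload: participates only when T is sparse.
template <class T>
typename std::enable_if<is_sparse_v<T>, std::string>::type
write(const T&) { return "group"; }

}  // namespace sketch
```

Because exactly one overload survives substitution for any given T, the call site stays the same for dense and sparse arguments while the two paths produce different on-disk shapes.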
Every dense container listed on this page is a contiguous buffer of T with one pointer (memptr(), data(), valuePtr(), …) into a single heap allocation. h5cpp passes that pointer straight to H5Dwrite/H5Dread with no copy and no transpose, so the on-disk layout matches the upstream container's in-memory layout byte-for-byte.
- Vectors: N consecutive elements regardless of rowVector / columnVector labels in the type. Round-trip between any of arma::Row, Eigen::VectorXd, blaze::DynamicVector, ublas::vector, etc. on the same dataset is byte-exact.
- Column-major matrices: arma::Mat, Eigen::Matrix<…,ColMajor> (default), blaze::DynamicMatrix<…,columnMajor>.
- Row-major matrices: dlib::matrix (with row_major_layout, the only dlib layout in the dispatch), Eigen::Matrix<…,RowMajor>, blaze::DynamicMatrix<…,rowMajor>, ublas::matrix (default), itpp::Mat. Reading a column-major dataset into a row-major container (or vice versa) does not transpose; it reinterprets the buffer, which yields the mathematical transpose. Cross-library reads must match orientation.
- Higher-rank arrays (xt::xarray, xt::xtensor, blitz::Array<T,N>, arma::Cube) follow the upstream library's default storage order (xtensor row-major; blitz row-major; arma::Cube column-major slice-by-slice).

All dense linalg containers resolve to storage_representation_t::linear_value_dataset, the same dispatch path used by std::vector<T> and STL sequence containers.
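The reinterpretation effect is easy to verify by hand. The sketch below uses plain index arithmetic rather than any of the libraries; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element (i, j) of a matrix with n_rows rows, stored column-major.
inline double at_col_major(const std::vector<double>& buf, std::size_t n_rows,
                           std::size_t i, std::size_t j) {
    return buf[j * n_rows + i];
}

// Element (i, j) of a matrix with n_cols columns, stored row-major.
inline double at_row_major(const std::vector<double>& buf, std::size_t n_cols,
                           std::size_t i, std::size_t j) {
    return buf[i * n_cols + j];
}

// True when the buffer of a column-major (n_rows x n_cols) matrix A, viewed
// unchanged as a row-major (n_cols x n_rows) matrix B, satisfies B(j, i) == A(i, j).
inline bool reinterprets_as_transpose(const std::vector<double>& buf,
                                      std::size_t n_rows, std::size_t n_cols) {
    assert(buf.size() == n_rows * n_cols);
    for (std::size_t i = 0; i < n_rows; ++i)
        for (std::size_t j = 0; j < n_cols; ++j)
            // The row-major view has n_rows columns, hence the second argument.
            if (at_col_major(buf, n_rows, i, j) != at_row_major(buf, n_rows, j, i))
                return false;
    return true;
}
```

Both accessors reduce to the same offset j * n_rows + i, which is why no bytes move and yet the logical matrix comes out transposed.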
| Library | Mapper header | Dense vectors / matrices | Sparse |
|---|---|---|---|
| Armadillo | h5cpp/H5Marma.hpp | h5::arma::rowvec, colvec, colmat, cube | arma::SpMat, SpRow, SpCol |
| Eigen | h5cpp/H5Meigen.hpp | Eigen::Matrix<T,R,C,O>, Eigen::Array<T,R,C,O> | Eigen::SparseMatrix, SparseVector (CSC) |
| Blitz++ | h5cpp/H5Mblitz.hpp | h5::blitz::array<T,N> (rank-N) | — |
| Blaze | h5cpp/H5Mblaze.hpp | h5::blaze::rowvec, colvec, rowmat, colmat | — |
| dlib | h5cpp/H5Mdlib.hpp | h5::dlib::rowmat<T> | — |
| IT++ | h5cpp/H5Mitpp.hpp | h5::itpp::rowvec, rowmat | — |
| Boost uBLAS | h5cpp/H5Mublas.hpp | h5::ublas::rowvec, rowmat | — |
| xtensor | h5cpp/H5Mxtensor.hpp | xt::xarray<T>, xt::xtensor<T,N> | — |
std::valarray is documented under Supported Types § STL sequence containers; its mapper header h5cpp/H5Mvalarray.hpp follows the same opt-in pattern as the linalg mappers above.
Include h5cpp/H5Marma.hpp. The mapper exposes alias names in the h5::arma:: namespace that pin the upstream arma:: types:
| h5cpp alias | Upstream type | Rank | Layout | Notes |
|---|---|---|---|---|
| h5::arma::rowvec<T> | arma::Row<T> | 1 | contiguous | shape (N,) on disk; layout-orientation invariant |
| h5::arma::colvec<T> | arma::Col<T> | 1 | contiguous | shape (N,) on disk; round-trips with rowvec |
| h5::arma::colmat<T> | arma::Mat<T> | 2 | column-major | the canonical arma matrix; default elsewhere |
| h5::arma::cube<T> | arma::Cube<T> | 3 | col-major slices | shape (rows, cols, slices) on disk |
Sparse (CSC group layout):
| Type | Upstream | Layout | Notes |
|---|---|---|---|
| arma::SpMat<T> | arma::SpMat<T> | CSC | native CSC: values + row_indices + col_ptrs |
| arma::SpRow<T> | arma::SpRow<T> | CSC | derives from SpMat; serialised as 1×N |
| arma::SpCol<T> | arma::SpCol<T> | CSC | derives from SpMat; serialised as N×1 |
Precondition: SpMat::sync() must have completed before write. h5cpp does not call sync() implicitly — that would require a const_cast on a user-supplied const SpMat&. See the arma docs for direct-access requirements on values, row_indices, col_ptrs.
Include h5cpp/H5Meigen.hpp. Eigen's matrix and array families are recognised for any combination of rows/cols/options (Dynamic, fixed-size, ColMajor or RowMajor):
| Type | Rank | Layout | Notes |
|---|---|---|---|
| Eigen::Matrix<T,R,C,O> | 2 | O & RowMajorBit decides | MatrixXd, Matrix3f, RowVectorXf etc. all covered |
| Eigen::Array<T,R,C,O> | 2 | as above | element-wise array variant |
| Eigen::VectorXd, RowVectorXd, … | 1/2 | contiguous | aliases for Matrix<…,Dynamic,1> / <…,1,Dynamic> |
Eigen sparse types live in the <Eigen/SparseCore> module, not <Eigen/Core>. The mapper guards the sparse path on EIGEN_SPARSECORE_MODULE_H, so users who only include <Eigen/Core> don't pay for the sparse trait specialisations.
| Type | Layout | Constraint |
|---|---|---|
| Eigen::SparseMatrix<T,ColMajor,I> | CSC | RowMajor is rejected by a static_assert at compile time |
| Eigen::SparseVector<T,ColMajor,I> | CSC | always a single inner vector |
Precondition: src.makeCompressed() has been called. h5cpp does not call it implicitly to avoid mutating a const SparseMatrix&. RowMajor sparse triggers a clear compile-time error rather than silently producing a transposed file.
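The split between a compile-time layout gate and a caller-side precondition might look like the sketch below. The types layout and toy_spmat and the function ready_for_write are hypothetical stand-ins, not Eigen or h5cpp API, and returning false for an uncompressed matrix is an assumption made only to keep the example observable.

```cpp
#include <cassert>

enum class layout { col_major, row_major };

// Hypothetical stand-in for a sparse matrix with a layout template parameter.
template <layout L>
struct toy_spmat {
    bool compressed = false;
    void make_compressed() { compressed = true; }  // caller's responsibility
};

// Layout is checked when the template is instantiated; compression state is
// only inspected through the const reference, never changed.
template <layout L>
bool ready_for_write(const toy_spmat<L>& m) {
    static_assert(L == layout::col_major,
                  "row-major sparse is not supported; convert to column-major first");
    return m.compressed;
}
```

Instantiating ready_for_write with toy_spmat<layout::row_major> stops the build with the message above, whereas a column-major matrix that was never compressed is only detectable at run time.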
Include h5cpp/H5Mblaze.hpp. Blaze's DynamicVector and DynamicMatrix carry their orientation as a non-type template parameter, and h5cpp pins both orientations explicitly:
| h5cpp alias | Upstream type | Rank | Layout |
|---|---|---|---|
| h5::blaze::rowvec<T> | blaze::DynamicVector<T, blaze::rowVector> | 1 | contiguous |
| h5::blaze::colvec<T> | blaze::DynamicVector<T, blaze::columnVector> | 1 | contiguous |
| h5::blaze::rowmat<T> | blaze::DynamicMatrix<T, blaze::rowMajor> | 2 | row-major |
| h5::blaze::colmat<T> | blaze::DynamicMatrix<T, blaze::columnMajor> | 2 | column-major |
Blaze's BLAS wrappers conflict with Armadillo's by default; define BLAZE_BLAS_MODE=0 (Blaze's own macro) before including <blaze/Math.h>, or include Blaze before Armadillo so the latter's LAPACK adapters win.
Include h5cpp/H5Mblitz.hpp. Blitz exposes a single variable-rank array template; rank is a non-type parameter on the type, not a runtime quantity.
| h5cpp alias | Upstream type | Rank | Layout |
|---|---|---|---|
| h5::blitz::array<T, N> | blitz::Array<T, N> | N | row-major |
Blitz supports ranks 1–11 (its native limit). Element type T must itself be one of the elementary scalars, strings, or compound types from Supported Types.
Include h5cpp/H5Mdlib.hpp. dlib's matrix template carries memory manager and layout as template parameters; the mapper pins row_major_layout and the default stateless memory manager:
| h5cpp alias | Upstream type | Rank | Layout |
|---|---|---|---|
| h5::dlib::rowmat<T> | dlib::matrix<T, 0, 0, dlib::memory_manager_stateless_kernel_1<char>, dlib::row_major_layout> | 2 | row-major |
dlib matrices declared with column_major_layout or a non-default memory manager are not in the dispatch; to store them, copy into an h5::dlib::rowmat<T> at the call site.
Include h5cpp/H5Mitpp.hpp. IT++ matrices are row-major by convention (the name Mat does not carry an orientation tag):
| h5cpp alias | Upstream type | Rank | Layout |
|---|---|---|---|
| h5::itpp::rowvec<T> | itpp::Vec<T> | 1 | contiguous |
| h5::itpp::rowmat<T> | itpp::Mat<T> | 2 | row-major |
Include h5cpp/H5Mublas.hpp. uBLAS is header-only — see boost/numeric/ublas/matrix.hpp and vector.hpp. The default matrix<T> and vector<T> types are row-major / contiguous:
| h5cpp alias | Upstream type | Rank | Layout |
|---|---|---|---|
| h5::ublas::rowvec<T> | boost::numeric::ublas::vector<T> | 1 | contiguous |
| h5::ublas::rowmat<T> | boost::numeric::ublas::matrix<T> | 2 | row-major |
uBLAS's column_major matrix variant is not in the dispatch.
Include h5cpp/H5Mxtensor.hpp. xtensor splits rank into dynamic (xarray, rank decided at runtime by reshape) and static (xtensor, rank fixed in the type):
| Type | Rank | Layout | Notes |
|---|---|---|---|
| xt::xarray<T> | dynamic | row-major | rank discovered from the dataset shape on read |
| xt::xtensor<T, N> | static N | row-major | static rank must match the on-disk rank exactly |
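The difference between the two rank rules can be sketched without xtensor itself. The helpers below are hypothetical, and throwing std::invalid_argument on a mismatch is an assumption; the table only states that the ranks must match, not how h5cpp reports a violation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

using dims_t = std::vector<std::size_t>;  // dataset extent as read from the file

// Dynamic rank (xarray-like): adopt whatever rank the dataset has.
inline dims_t shape_for_dynamic(const dims_t& on_disk) { return on_disk; }

// Static rank (xtensor<T, N>-like): the on-disk rank must equal N exactly.
template <std::size_t N>
std::array<std::size_t, N> shape_for_static(const dims_t& on_disk) {
    if (on_disk.size() != N)
        throw std::invalid_argument("on-disk rank does not match static rank");
    std::array<std::size_t, N> out{};
    for (std::size_t i = 0; i < N; ++i) out[i] = on_disk[i];
    return out;
}
```

A rank-3 dataset therefore reads into the dynamic form unconditionally, into the static form with N = 3, and is rejected for any other N.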
The xtensor header has no single root XTENSOR_HPP guard; the mapper detects xtensor presence through <xtensor/xarray.hpp>, which is the canonical entry point.
Multiple mapper headers can be active in the same translation unit. The dispatches are partitioned by type — h5::write(fd, "a", arma_x) and h5::write(fd, "b", eigen_y) co-exist without conflict.
The historical ABI conflict was Armadillo's LAPACK wrappers vs Blaze's BLAS path. h5cpp does not pull in either library's LAPACK adapters; whichever header you include first determines which set of BLAS/LAPACK symbol names enters the translation unit. For mixed arma + blaze code, define BLAZE_BLAS_MODE=0 before including blaze.
The mapper headers specialise three traits in h5::meta:
- is_contiguous<T>: declares that T::memptr() / data() / valuePtr() returns a pointer to a contiguous buffer of decay<T>::type elements.
- storage_representation_impl<T>: pins the value to storage_representation_t::linear_value_dataset (dense) or routes through is_sparse<T> for sparse.
- access_traits_t<T>: exposes the rank, dimensions, and element pointer that H5Dread.hpp / H5Dwrite.hpp consume.

Once those three are in scope, the type joins the same generic read/write/create dispatch as std::vector<T>, with no per-library special case at the call site. See the Type System Architecture Notes for the design rationale.
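A rough picture of what such a trio of specialisations could look like for a user-defined container is given below. All declarations sit in a made-up namespace sketch; the trait names echo the ones in the text, but their members (value, rank, dims, ptr), the unsupported enumerator, and toy_matrix are assumptions and may not match the real h5::meta declarations.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

namespace sketch {

// Hypothetical user container: contiguous row-major matrix of double.
struct toy_matrix {
    std::size_t rows, cols;
    std::vector<double> buf;
    const double* data() const { return buf.data(); }
};

// 1. Contiguity flag: false unless a mapper specialises it.
template <class T> struct is_contiguous : std::false_type {};
template <> struct is_contiguous<toy_matrix> : std::true_type {};

// 2. Storage representation tag.
enum class storage_representation_t { unsupported, linear_value_dataset };
template <class T> struct storage_representation_impl {
    static constexpr storage_representation_t value = storage_representation_t::unsupported;
};
template <> struct storage_representation_impl<toy_matrix> {
    static constexpr storage_representation_t value = storage_representation_t::linear_value_dataset;
};

// 3. Access traits: rank, dimensions, element pointer.
template <class T> struct access_traits;
template <> struct access_traits<toy_matrix> {
    static constexpr std::size_t rank = 2;
    static std::array<std::size_t, 2> dims(const toy_matrix& m) { return {{m.rows, m.cols}}; }
    static const double* ptr(const toy_matrix& m) { return m.data(); }
};

// Generic consumer: works for any T that provides the three traits.
template <class T>
std::size_t element_count(const T& x) {
    static_assert(is_contiguous<T>::value, "T must expose a contiguous buffer");
    static_assert(storage_representation_impl<T>::value ==
                      storage_representation_t::linear_value_dataset,
                  "T must map to a single linear dataset");
    std::size_t n = 1;
    for (std::size_t d : access_traits<T>::dims(x)) n *= d;
    return n;
}

}  // namespace sketch
```

The generic function never names toy_matrix; adding another container means adding three specialisations and leaving the consumer untouched, which is the property the paragraph above describes.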