My hdf5 cheatsheet.
import h5py
import numpy as np
Create a file
f = h5py.File('demo.hdf5', 'w')  # 'w' creates the file and truncates it if it already exists
data = np.arange(10)
data
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
<HDF5 dataset "array": shape (10,), type "<i8">
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
array([1, 2, 5])
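The outputs above are shown without the code that produced them. Here is a minimal sketch of calls that would give similar results, assuming the array was stored under the key 'array' and then read back in full and with an index list. The in-memory file (`driver='core'`, `backing_store=False`) and the file name are my own choices for illustration, so that nothing is written to disk.

```python
import h5py
import numpy as np

# In-memory file: an assumption for this sketch, the cheatsheet writes 'demo.hdf5'.
f = h5py.File('sketch_array.hdf5', 'w', driver='core', backing_store=False)

data = np.arange(10)

# Assigning an array to a new key creates a dataset with that name.
f['array'] = data
dset = f['array']

shape = dset.shape            # shape of the stored dataset
everything = dset[:]          # read the whole dataset back as a NumPy array
picked = dset[[1, 2, 5]]      # index list; h5py expects it in increasing order

print(dset)
print(everything)
print(picked)

f.close()
```

Slicing a dataset returns a plain NumPy array, so anything read this way stays usable after the file is closed.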
Add additional data
f['full/dataset'] = data  # the intermediate group 'full' is created automatically
['array', 'dataset', 'full']
True
['dataset']
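The three outputs above look like a key listing, a membership test and a listing of the 'full' group. This is a sketch under that assumption. The top-level 'dataset' entry and the exact membership test are guesses on my part, and the in-memory file is only there to keep the example free of side effects.

```python
import h5py
import numpy as np

# In-memory file, used only for this sketch.
f = h5py.File('sketch_groups.hdf5', 'w', driver='core', backing_store=False)
data = np.arange(10)

f['array'] = data
f['dataset'] = data          # assumed: a top-level dataset named 'dataset'
f['full/dataset'] = data     # creates the group 'full' along the way

top_level = sorted(f.keys())         # names directly under the root group
has_nested = 'full/dataset' in f     # membership tests accept paths
in_full = list(f['full'].keys())     # names inside the 'full' group

print(top_level)
print(has_nested)
print(in_full)

f.close()
```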
Create dataset
dset = f.create_dataset('/full/bigger', (10000, 1000, 1000, 1000), compression='gzip')
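A dataset that large can be created because compression implies chunked storage, and chunks are only written once data is assigned to them. The sketch below shows the same call on a much smaller, hypothetical shape so it runs quickly. The explicit `dtype='f4'`, the shape and the in-memory file are my assumptions.

```python
import h5py
import numpy as np

# In-memory file, used only for this sketch.
f = h5py.File('sketch_big.hdf5', 'w', driver='core', backing_store=False)

# Much smaller shape than in the cheatsheet, chosen for illustration.
dset = f.create_dataset('/full/bigger', (100, 100, 100), dtype='f4', compression='gzip')

chunk_shape = dset.chunks        # h5py picks a chunk shape when compression is on
compression = dset.compression   # name of the compression filter in use

# Write one 2-D slab; the rest of the dataset stays unwritten.
dset[0, :, :] = np.ones((100, 100), dtype='f4')

first = dset[0, 0, :5]       # values from the slab just written
untouched = dset[1, 0, :5]   # unwritten regions read back as the fill value (0 by default)

print(chunk_shape, compression)
print(first, untouched)

f.close()
```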
Set attributes
<Attributes of HDF5 object at 140618810188336>
Attributes also behave like a dictionary, so you can add one like so:
dset.attrs['sampling frequency'] = 'Every other week between 1 Jan 2001 and 7 Feb 2010'
dset.attrs['PI'] = 'Fabian'
list(dset.attrs.items())
for i in dset.attrs.items():
    print(i)
('PI', 'Fabian')
('sampling frequency', 'Every other week between 1 Jan 2001 and 7 Feb 2010')
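Since attributes behave like a dictionary, the usual dictionary operations should work on them as well. Here is a small sketch, assuming a throwaway dataset in an in-memory file. The dataset shape, the shortened attribute value and the 'units' key are hypothetical.

```python
import h5py

# In-memory file and a small dataset, used only for this sketch.
f = h5py.File('sketch_attrs.hdf5', 'w', driver='core', backing_store=False)
dset = f.create_dataset('bigger', (10,), dtype='f4')

dset.attrs['PI'] = 'Fabian'
dset.attrs['sampling frequency'] = 'Every other week'   # shortened for the sketch

has_pi = 'PI' in dset.attrs                # membership test
pi = dset.attrs['PI']                      # lookup by key
units = dset.attrs.get('units', 'n/a')     # default for a missing (hypothetical) key

del dset.attrs['PI']                       # remove an attribute
remaining = list(dset.attrs.keys())

print(has_pi, pi, units, remaining)

f.close()
```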
Open file
f = h5py.File('demo.hdf5', 'r')  # 'r' opens an existing file read-only
['array', 'dataset', 'full']
HDF5 files are organised in a hierarchy of groups and datasets; that's what the “H” in HDF5 stands for.
'/array'
['array', 'dataset', 'full']
['bigger', 'dataset']
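The outputs above look like a dataset's `.name` followed by key listings for the root and for the 'full' group. This is a sketch under that assumption. The temporary directory, the file name and the use of `visit` to collect every path are my additions, not part of the original session.

```python
import os
import tempfile

import h5py
import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sketch_demo.hdf5')   # hypothetical file name

    # Write a small hierarchy, then close the file.
    with h5py.File(path, 'w') as f:
        f['array'] = np.arange(10)
        f['full/dataset'] = np.arange(10)

    # Reopen read-only and look around.
    with h5py.File(path, 'r') as f:
        array_name = f['array'].name            # absolute path of the object
        root_keys = sorted(f.keys())            # members of the root group
        full_keys = sorted(f['full'].keys())    # members of the 'full' group

        all_paths = []
        f.visit(all_paths.append)               # called once per object, with its relative path
        all_paths.sort()

print(array_name)
print(root_keys)
print(full_keys)
print(all_paths)
```

Opening the file in a `with` block closes it automatically, which avoids leaving a handle open between the write and the read.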
Sources