zea.data.dataloader
H5 dataloader for loading images from zea datasets.
Functions
generate_h5_indices | Generate indices for h5 files.
Classes
H5Generator | H5Generator class for iterating over hdf5 files in an advanced way.
- class zea.data.dataloader.H5Generator(file_paths, key='data/image', n_frames=1, shuffle=True, return_filename=False, limit_n_samples=None, limit_n_frames=None, seed=None, cache=False, additional_axes_iter=None, sort_files=True, overlapping_blocks=False, initial_frame_axis=0, insert_frame_axis=True, frame_index_stride=1, frame_axis=-1, validate=True, **kwargs)
Bases: Dataset
H5Generator class for iterating over hdf5 files in an advanced way. Mostly used internally; you might want to use the Dataloader class instead. Loads one item at a time and always outputs NumPy arrays.
- load(file, key, indices)
Extract data from hdf5 file.
- Parameters:
file_name (str) – Name of the file to extract the image from.
key (str) – Key of the hdf5 dataset to grab data from.
indices (tuple | str) – Indices to extract the image from (tuple of slices).
- Returns:
Image extracted from hdf5 file and indexed by indices.
- Return type:
np.ndarray
- zea.data.dataloader.generate_h5_indices(file_paths, file_shapes, n_frames, frame_index_stride, key='data/image', initial_frame_axis=0, additional_axes_iter=None, sort_files=True, overlapping_blocks=False, limit_n_frames=None)
Generate indices for h5 files.
Generates a list of indices to extract images from hdf5 files. Length of this list is the length of the extracted dataset.
- Parameters:
file_paths (list) – List of file paths.
file_shapes (list) – List of file shapes.
n_frames (int) – Number of frames to load from each hdf5 file.
frame_index_stride (int) – Interval between frames to load.
key (str, optional) – Key of hdf5 dataset to grab data from. Defaults to “data/image”.
initial_frame_axis (int, optional) – Axis to iterate over. Defaults to 0.
additional_axes_iter (list, optional) – Additional axes to iterate over in the dataset. Defaults to None.
sort_files (bool, optional) – Sort files by number. Defaults to True.
overlapping_blocks (bool, optional) – If True, take n_frames from the sequence, then move by 1 frame, so that consecutive blocks overlap. Defaults to False.
limit_n_frames (int, optional) – Limit the number of frames to load from each file. Only the first limit_n_frames frames of each file are used. Defaults to None.
- Returns:
List of tuples with indices to extract images from hdf5 files. Each tuple has the form (file_name, key, indices), with indices being a tuple of slices.
- Return type:
list
Example
    [
        (
            "/folder/path_to_file.hdf5",
            "data/image",
            [range(0, 1), slice(None, 256, None), slice(None, 256, None)],
        ),
        (
            "/folder/path_to_file.hdf5",
            "data/image",
            [range(1, 2), slice(None, 256, None), slice(None, 256, None)],
        ),
        ...,
    ]
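To make the interplay of n_frames, frame_index_stride, overlapping_blocks and limit_n_frames more concrete, here is a minimal pure-Python sketch that produces entries of the same shape as the example above. It is not the zea implementation: the helper names frame_blocks and build_indices are hypothetical, the frame axis is assumed to be axis 0, and the block length is assumed to be n_frames * frame_index_stride. File sorting and additional_axes_iter are left out.

```python
def frame_blocks(n_total, n_frames, frame_index_stride=1,
                 overlapping_blocks=False, limit_n_frames=None):
    """Return one range of frame indices per sample (hypothetical helper)."""
    if limit_n_frames is not None:
        # Keep only the first `limit_n_frames` frames of the file.
        n_total = min(n_total, limit_n_frames)
    # Assumption: a block covers n_frames * frame_index_stride frames.
    block_size = n_frames * frame_index_stride
    # Overlapping blocks advance by one frame, otherwise by a whole block.
    step = 1 if overlapping_blocks else block_size
    return [
        range(start, start + block_size, frame_index_stride)
        for start in range(0, n_total - block_size + 1, step)
    ]


def build_indices(file_paths, file_shapes, n_frames, frame_index_stride=1,
                  key="data/image", **block_kwargs):
    """Build (file_name, key, indices) entries, iterating over axis 0."""
    entries = []
    for path, shape in zip(file_paths, file_shapes):
        blocks = frame_blocks(shape[0], n_frames, frame_index_stride,
                              **block_kwargs)
        for block in blocks:
            # Take the full extent of every remaining axis.
            indices = [block] + [slice(None, size, None) for size in shape[1:]]
            entries.append((path, key, indices))
    return entries


entries = build_indices(
    ["/folder/path_to_file.hdf5"], [(4, 256, 256)], n_frames=1
)
for entry in entries[:2]:
    print(entry)
```

Under these assumptions, a file with 4 frames and n_frames=1 yields 4 entries, one per frame. Setting overlapping_blocks=True with n_frames=2 yields blocks starting at frames 0, 1 and 2.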