Utility functions
RosettaSciIO provides certain utility functions that are applicable for multiple formats, e.g. for the HDF5-format on which a number of plugins are based.
HDF5 utility functions
HDF5 file inspection.
- rsciio.utils.hdf5.list_datasets_in_file(filename, dataset_key=None, hardlinks_only=False, verbose=True)
Read from a NeXus or .hdf file and return a list of the dataset paths.
This method is used to inspect the contents of an HDF5 file. It iterates through group attributes and returns a list of NXdata entries, together with HDF datasets of size >= 2 that are not already NXdata blocks. It is a convenient way to list the datasets present in a file without loading them all as signals.
- Parameters:
- filename : str, pathlib.Path
Filename of the file to read or corresponding pathlib.Path.
- dataset_key : str, list of str, None, default=None
If a str or list of strings is provided, only return items whose path contains one of the strings. For example, dataset_key = [“instrument”, “Fe”] will only return hdf entries with “instrument” or “Fe” somewhere in their hdf path.
- hardlinks_only : bool, default=False
If True, any links (soft or external) will be ignored when loading.
- verbose : bool, default=True
Print the results to screen.
- Returns:
list: List of paths to datasets.
See also
rsciio.utils.hdf5.read_metadata_from_file: Convenience function to read metadata present in a file.
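The dataset_key filter described above can be pictured with a small sketch. The helper `path_matches` and the example paths are hypothetical and only mimic the documented "path contains one of the strings" rule; they are not RosettaSciIO's implementation, and the assumption that None matches everything is ours. The wrapper `list_matching_datasets` (also a made-up name) shows one way the real function might be called, using only the signature given above.

```python
def path_matches(path, dataset_key=None):
    """Illustrative only: True if `path` contains any of the key strings."""
    if dataset_key is None:
        # Assumption: no key means no filtering
        return True
    keys = [dataset_key] if isinstance(dataset_key, str) else list(dataset_key)
    return any(key in path for key in keys)


# Hypothetical HDF5 paths, used purely for illustration
paths = [
    "/entry/instrument/detector/data",
    "/entry/sample/Fe_map",
    "/entry/sample/temperature",
]
selected = [p for p in paths if path_matches(p, ["instrument", "Fe"])]
print(selected)


def list_matching_datasets(filename):
    """Call the real function; requires RosettaSciIO to be installed."""
    # Imported here so the sketch above runs without RosettaSciIO
    from rsciio.utils.hdf5 import list_datasets_in_file

    return list_datasets_in_file(
        filename, dataset_key=["instrument", "Fe"], verbose=False
    )
```

Passing verbose=False keeps the call quiet when the returned list is processed programmatically rather than read on screen.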
- rsciio.utils.hdf5.read_metadata_from_file(filename, lazy=False, metadata_key=None, verbose=False, skip_array_metadata=False)
Read the metadata from a NeXus or .hdf file.
This method iterates through the HDF5 file and returns a dictionary of the entries. It is a convenient way to inspect a file for a value without loading the file as a signal.
- Parameters:
- filename : str, pathlib.Path
Filename of the file to read or corresponding pathlib.Path.
- lazy : bool, default=False
Whether to open the file lazily or not.
- metadata_key : None, str, list of str, default=None
None will return all datasets found, including linked data. Providing a string or list of strings will only return items which contain the string(s). For example, metadata_key = [“instrument”, “Fe”] will return hdf entries with “instrument” or “Fe” in their hdf path.
- verbose : bool, default=False
Pretty print the results to screen.
- skip_array_metadata : bool, default=False
Whether to skip loading array metadata. This is useful because many large arrays may be present in the metadata, and they are redundant with the dataset itself.
- Returns:
dict: Metadata dictionary.
See also
rsciio.utils.hdf5.list_datasets_in_file: Convenience function to list datasets present in a file.
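Because the function returns a dictionary, looking up a single value usually means walking it. The sketch below assumes the returned dictionary is nested along the HDF5 hierarchy, which the text above does not state explicitly. The helper `iter_leaves`, the wrapper `load_instrument_metadata` and the example dictionary are hypothetical names and data, not part of RosettaSciIO.

```python
def iter_leaves(metadata, prefix=""):
    """Yield (path, value) pairs for every non-dict entry of a nested dict."""
    for key, value in metadata.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            # Descend into sub-dictionaries, extending the path
            yield from iter_leaves(value, path)
        else:
            yield path, value


# Hypothetical metadata, standing in for what a file might contain
example = {
    "entry": {
        "instrument": {"energy": 12.0},
        "sample": {"name": "Fe foil"},
    }
}
leaves = dict(iter_leaves(example))
for path, value in leaves.items():
    print(path, "=", value)


def load_instrument_metadata(filename):
    """Call the real function; requires RosettaSciIO to be installed."""
    # Imported here so the sketch above runs without RosettaSciIO
    from rsciio.utils.hdf5 import read_metadata_from_file

    return read_metadata_from_file(
        filename, metadata_key="instrument", skip_array_metadata=True
    )
```

Setting skip_array_metadata=True in the wrapper follows the note above about large arrays being redundant with the dataset itself.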
Test utility functions
- rsciio.tests.registry_utils.download_all(pooch_object=None, ignore_hash=None, show_progressbar=True)
Download all test data if they are not already locally available in the rsciio.tests.data folder.
- Parameters:
- pooch_object : pooch.Pooch or None, default=None
The registry to be used. If None, a RosettaSciIO registry will be used.
- ignore_hash : bool or None, default=None
Don’t compare the hash of the downloaded file with the corresponding hash in the registry. On Windows, the hash comparison fails for non-binary files because of differences in line endings. If None, the comparison is only performed on Unix systems.
- show_progressbar : bool, default=True
Whether to show the progress bar or not.
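The line-ending problem behind ignore_hash can be reproduced with the standard library alone. In the sketch below, SHA-256 is chosen purely for illustration (the text above does not say which hash the registry uses), and `fetch_test_data` is a hypothetical wrapper around the documented call.

```python
import hashlib

# The same text file content with Unix and with Windows line endings
unix_bytes = b"x,y\n1,2\n"
windows_bytes = unix_bytes.replace(b"\n", b"\r\n")

unix_hash = hashlib.sha256(unix_bytes).hexdigest()
windows_hash = hashlib.sha256(windows_bytes).hexdigest()

# The bytes differ, so the digests differ and a registry check would fail
hashes_match = unix_hash == windows_hash
print(hashes_match)


def fetch_test_data(ignore_hash=None):
    """Call the real function; requires RosettaSciIO and its test dependencies."""
    # Imported here so the demonstration above runs without RosettaSciIO
    from rsciio.tests.registry_utils import download_all

    download_all(ignore_hash=ignore_hash, show_progressbar=False)
```

Binary files are unaffected because their bytes are not rewritten when line endings are converted.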
- rsciio.tests.registry_utils.make_registry(directory, output, recursive=True, exclude_pattern=None)
Make a registry of files and hashes for the given directory.
This is helpful if you have many files in your test dataset as it keeps you from needing to manually update the registry.
- Parameters:
- directory : str
Directory of the test data to put in the registry. All file names in the registry will be relative to this directory.
- output : str
Name of the output registry file.
- recursive : bool
If True, will recursively look for files in subdirectories of directory.
- exclude_pattern : list or None
List of patterns to exclude.
Notes
Adapted from fatiando/pooch (BSD-3-Clause license).
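To make the idea of a registry of files and hashes concrete, here is a minimal standard-library sketch. It is not the RosettaSciIO or pooch implementation: the function name `build_registry_lines`, the "relative/path hash" line layout, the use of SHA-256 and the treatment of exclude_pattern as shell-style globs are all assumptions made for illustration.

```python
import fnmatch
import hashlib
import tempfile
from pathlib import Path


def build_registry_lines(directory, recursive=True, exclude_pattern=None):
    """Return sorted 'relative/path hash' lines for the files in `directory`."""
    directory = Path(directory)
    pattern = "**/*" if recursive else "*"
    lines = []
    for path in directory.glob(pattern):
        if not path.is_file():
            continue
        # File names are stored relative to the registry directory
        relative = path.relative_to(directory).as_posix()
        # Assumption: patterns are shell-style globs
        if any(fnmatch.fnmatch(relative, pat) for pat in exclude_pattern or []):
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{relative} {digest}")
    return sorted(lines)


# Build a throwaway directory tree and list its registry entries
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.txt").write_bytes(b"alpha\n")
    (root / "sub" / "b.bin").write_bytes(b"\x00\x01")
    (root / "notes.md").write_bytes(b"skip me\n")
    registry = build_registry_lines(root, exclude_pattern=["*.md"])

print("\n".join(registry))
```

Generating the lines from the directory contents is what removes the need to edit the registry by hand whenever a test file is added or changed.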
- rsciio.tests.registry_utils.update_registry()
Update the rsciio.tests.registry.txt file, which is required after adding or updating test data files.
Unix systems only. This is not supported on Windows, because the hash comparison fails for non-binary files owing to differences in line endings.
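Since the update is limited to Unix systems, a script that calls it may want to check the platform first. The sketch below is one possible guard; `can_update_registry` and `refresh_registry` are hypothetical helpers, and testing `sys.platform` for a "win" prefix is our assumption about how to detect Windows, not something the function is documented to do.

```python
import sys


def can_update_registry(platform=sys.platform):
    """Return False on Windows, where text-file hashes would not match."""
    return not platform.startswith("win")


def refresh_registry():
    """Update the registry, refusing to run on Windows."""
    if not can_update_registry():
        raise RuntimeError("Updating the registry is only supported on Unix systems.")
    # Imported here so the guard can be used without RosettaSciIO installed
    from rsciio.tests.registry_utils import update_registry

    update_registry()


print(can_update_registry("linux"), can_update_registry("win32"))
```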