MRCZ format
The mrcz format is an extension of the CCP-EM MRC2014 file format.
It uses the blosc meta-compression library to bitshuffle and compress files in
a blocked, multi-threaded environment. The supported data types are float32,
int8, uint16, int16 and complex64.
It supports arbitrary meta-data, which is serialized into JSON.
MRCZ also supports asynchronous reads and writes.
Support for this format is not enabled by default. To enable it, install the mrcz library; to use compression, also install blosc.
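Whether the optional dependencies are present can be checked before calling the reader or writer. The following is a minimal sketch using only the standard library. The helper name `mrcz_support` is hypothetical and not part of RosettaSciIO, and the sketch assumes the two packages are importable under the names `mrcz` and `blosc`.

```python
from importlib.util import find_spec


def mrcz_support():
    """Report which optional packages are importable (hypothetical helper)."""
    return {
        # Assumption: the libraries are importable as "mrcz" and "blosc".
        "mrcz": find_spec("mrcz") is not None,
        "blosc": find_spec("blosc") is not None,
    }


# Prints a dictionary of booleans, one entry per optional package.
print(mrcz_support())
```

If `mrcz` is reported as missing, the format cannot be used at all; if only `blosc` is missing, writing with `compressor=None` should be the safe choice.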
API functions
- rsciio.mrcz.file_reader(filename, lazy=False, endianess='<', mmap_mode='c', **kwds)
File reader for the MRCZ format for tomographic data.
- Parameters:
filename (str, pathlib.Path) – Filename of the file to read or corresponding pathlib.Path.
lazy (bool, Default=False) – Whether to open the file lazily or not.
endianess (str, Default="<") – "<" or ">", depending on how the bits are written to the file.
mmap_mode (str, Default="c") – The MRCZ reader currently only supports C-ordering memory-maps.
- Returns:
List of dictionaries containing the following fields:
‘data’ – multidimensional numpy array
’axes’ – list of dictionaries describing the axes containing the fields ‘name’, ‘units’, ‘index_in_array’, and either ‘size’, ‘offset’, and ‘scale’ or a numpy array ‘axis’ containing the full axes vector
‘metadata’ – dictionary containing the parsed metadata
‘original_metadata’ – dictionary containing the full metadata tree from the input file
- Return type:
list of dicts
Examples
>>> from rsciio.mrcz import file_reader
>>> new_signal = file_reader('file.mrcz')
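The returned list of dictionaries can be walked as in the sketch below. `describe` is a hypothetical helper, not part of RosettaSciIO, and the stand-in dictionary only mimics the documented fields (a nested list takes the place of the numpy array, and all values are made up) so that the snippet runs without a file on disk.

```python
def describe(signal_dicts):
    """Return one summary line per signal dictionary (hypothetical helper)."""
    lines = []
    for d in signal_dicts:
        # Sort the axes by their position in the data array.
        axes = sorted(d["axes"], key=lambda ax: ax["index_in_array"])
        names = [f"{ax['name']} ({ax['units']})" for ax in axes]
        lines.append(", ".join(names))
    return lines


# Stand-in for the output of file_reader('file.mrcz'); values are invented.
example = [{
    "data": [[0.0, 1.0], [2.0, 3.0]],
    "axes": [
        {"name": "x", "units": "nm", "index_in_array": 1,
         "size": 2, "offset": 0.0, "scale": 0.5},
        {"name": "y", "units": "nm", "index_in_array": 0,
         "size": 2, "offset": 0.0, "scale": 0.5},
    ],
    "metadata": {},
    "original_metadata": {},
}]

print(describe(example))
```

With a real file, the same helper could be called directly on the value returned by `file_reader`, provided the axes carry the ‘name’, ‘units’ and ‘index_in_array’ fields listed above.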
- rsciio.mrcz.file_writer(filename, signal, do_async=False, compressor=None, clevel=1, n_threads=None, **kwds)
Write signal to MRCZ format.
- Parameters:
filename (str, pathlib.Path) – Filename of the file to write to or corresponding pathlib.Path.
signal (dict) –
Dictionary containing the signal object. Should contain the following fields:
‘data’ – multidimensional numpy array
’axes’ – list of dictionaries describing the axes containing the fields ‘name’, ‘units’, ‘index_in_array’, and either ‘size’, ‘offset’, and ‘scale’ or a numpy array ‘axis’ containing the full axes vector
‘metadata’ – dictionary containing the metadata tree
endianess (str, Default="<") – "<" or ">", depending on how the bits are written to the file.
do_async (bool, Default=False) – Currently supported within RosettaSciIO for writing only, this will save the file in a background thread and return immediately. Warning: there is currently no method implemented within RosettaSciIO to tell whether an asynchronous write has finished.
compressor ({None, "zlib", "zstd", "lz4"}, Default=None) – The compression codec.
clevel (int, Default=1) – The compression level, an int from 1 to 9.
n_threads (int) – The number of threads to use for blosc compression. Defaults to the maximum number of virtual cores (including Intel Hyperthreading) on your system, which is recommended for best performance. If do_async = True you may wish to leave one thread free for the Python GIL.
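The advice on `n_threads` can be sketched as a small calculation. This is only an illustration: `writer_threads` is a hypothetical helper, and it assumes that `os.cpu_count()` is an acceptable stand-in for the number of virtual cores the library would use by default.

```python
import os


def writer_threads(do_async):
    """Suggest a value for n_threads (hypothetical helper)."""
    # os.cpu_count() reports logical cores and may return None.
    cores = os.cpu_count() or 1
    if do_async:
        # Leave one thread free for the Python interpreter,
        # but never go below a single compression thread.
        return max(1, cores - 1)
    return cores


print(writer_threads(do_async=True))
```

The result could then be passed as `n_threads=writer_threads(True)` alongside `do_async=True` in a call to `file_writer`.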
Notes
The recommended compression codec is zstd (zStandard) with clevel=1 for general use. If speed is critical, use lz4 (LZ4) with clevel=9. Integer data compresses more readily than floating-point data, and in general the histogram of values in the data reflects how compressible it is.
To save files that are compatible with other programs that can use MRC, such as GMS, IMOD, Relion, MotionCorr, etc., save with compressor=None and the extension .mrc. JSON metadata will not be recognized by other MRC-supporting software but should not cause crashes.
Examples
>>> from rsciio.mrcz import file_writer
>>> file_writer('file.mrcz', signal, do_async=True, compressor='zstd', clevel=1)
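The codec recommendations from the Notes can be collected into a small lookup, sketched below. `compression_kwargs` and its `priority` and `mrc_compatible` arguments are hypothetical names introduced for illustration; only the `compressor` and `clevel` values come from the text above.

```python
def compression_kwargs(priority="general", mrc_compatible=False):
    """Map a use case to file_writer keyword arguments (hypothetical helper)."""
    if mrc_compatible:
        # Uncompressed output, intended to be saved with the .mrc extension.
        return {"compressor": None}
    if priority == "speed":
        return {"compressor": "lz4", "clevel": 9}
    if priority == "general":
        return {"compressor": "zstd", "clevel": 1}
    raise ValueError(f"unknown priority: {priority!r}")


print(compression_kwargs("speed"))
```

The resulting dictionary could be unpacked into the writer call, for example `file_writer('file.mrcz', signal, **compression_kwargs("speed"))`.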