datasafe subpackage
datasafe.datasafe module
Datasafe module for the labinform package.
The datasafe is a key feature of labinform which serves to safely store data. Functionality includes directory generation and checksum creation.
-
class labinform.datasafe.datasafe.Datasafe
Bases: object
Data handler for moving data in the context of a datasafe.
The operations performed include generating a directory structure, storing data in and retrieving data from these directories, as well as verifying the integrity of and providing general information about the data stored.
-
static add_directory(path)
Create a directory at a specified path.
- Parameters
path (str) – path of the directory that should be created
-
compare_checksum(loi='', with_meta=False)
Create local checksum and compare with checksum file.
-
static dir_is_empty(path='')
Check whether a directory is empty.
- Parameters
path (str) – path of the directory which should be checked
-
static find_highest(path='')
Find a numbered directory with the highest number.
For a given path, find the numbered directory (i.e. a directory with an integer as its name) with the highest number. If the directory that the path leads to doesn’t exist, is empty, or contains no ‘numbered’ subdirectories, an error is raised.
- Parameters
path (str) – path of the directory that should be searched
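The lookup described above can be sketched with the standard library alone. This is a minimal illustration and not labinform's implementation: the function name `find_highest_numbered`, the integer return value, and the exception types are assumptions made for the example.

```python
import os


def find_highest_numbered(path):
    """Return the largest integer-named subdirectory of ``path`` (hypothetical helper)."""
    if not os.path.isdir(path):
        raise FileNotFoundError(f"No such directory: {path}")
    # Keep only subdirectories whose names consist of decimal digits.
    numbers = [
        int(name)
        for name in os.listdir(path)
        if name.isdecimal() and os.path.isdir(os.path.join(path, name))
    ]
    if not numbers:
        # Covers both an empty directory and one without numbered subdirectories.
        raise ValueError(f"No numbered subdirectories in: {path}")
    return max(numbers)
```

Comparing the names as integers rather than as strings matters here: a plain string sort would rank "9" above "10".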
-
generate(experiment='', sample_id='')
Generate directory structure and return identifier.
Verify to what extent the relevant directory structure is present and create directories as required. For consecutive measurements on a given sample, the measurement number is incremented automatically.
Return a unique identifier for the respective measurement and sample, including the directory path.
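A rough sketch of how such a structure could be created and the measurement number incremented is given below. The layout `root/experiment/sample_id/number` and the helper name `next_measurement_dir` are assumptions for illustration; the actual directory layout and the format of the returned identifier are defined by labinform and are not reproduced here.

```python
import os


def next_measurement_dir(root, experiment, sample_id):
    """Create and return the directory for the next measurement (assumed layout)."""
    sample_dir = os.path.join(root, experiment, sample_id)
    # Create only the levels that are missing; existing ones are left alone.
    os.makedirs(sample_dir, exist_ok=True)
    existing = [int(name) for name in os.listdir(sample_dir) if name.isdecimal()]
    # Start at 1 for a new sample, otherwise continue after the highest number.
    number = max(existing, default=0) + 1
    target = os.path.join(sample_dir, str(number))
    os.mkdir(target)
    return target
```

Calling the helper twice for the same experiment and sample yields directories numbered 1 and 2.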
-
static has_dir(path='')
Check whether a directory exists.
- Parameters
path (str) – path of the directory which should be checked
-
static increment(number=0)
Increment an integer by one.
- Parameters
number (int) – integer that should be incremented
-
index(loi='')
Retrieve meta information about a dataset from the datasafe.
Retrieves the meta information (Manifest.yaml file) for a dataset if it is present at the target directory specified in the LOI; raises an exception otherwise.
- Parameters
loi (str) – unique identifier of the dataset for which the meta information should be retrieved
- Returns
manifest_dict – retrieved meta information (Manifest.yaml) as ordered dict
- Return type
-
loi_to_path(loi='')
Retrieve a file’s datasafe directory path from the data’s LOI.
Retrieves the data’s path (including the datasafe’s root path) which is included in the LOI. If the LOI is not correctly formatted, an exception is raised.
-
make_both_checksum_files(path='', ignore_control_files=True)
Create files containing hashes for files in target directory.
Wrapper method: creates two checksums for the files present at the target directory and writes each to a checksum file; raises an exception if no files are present. One checksum includes the metadata, the other doesn’t.
-
make_checksum_file(path='', with_meta=False, ignore_control_files=True)
Create a file containing a hash for files in target directory.
Creates a checksum for the files present at the target directory and writes it to a checksum file; raises an exception if no files are present.
- Parameters
- Returns
checksum – checksum (currently MD5)
- Return type
-
static make_checksum_for_file(path='')
Create a hash (currently MD5) for a file at a given path.
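For orientation, an MD5 hash of a single file can be computed with Python's `hashlib` as sketched below. The function name `md5_for_file` and the chunked reading are choices made for this example and are not taken from labinform's source.

```python
import hashlib


def md5_for_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at ``path`` (illustrative helper)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large data files need not fit into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```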
-
make_checksum_for_files(path='', with_meta=False, ignore_control_files=True)
Create a cryptographic hash (currently MD5) for multiple files.
All files in the directory are sorted and included in the checksum with the option to exclude control files, i.e. the manifest file and checksum files.
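One possible way to build a single checksum over several files is to feed their contents, in sorted order, into one hash object. The sketch below does this under stated assumptions: the names in `CONTROL_FILES` are hypothetical placeholders, the `with_meta` option is left out, and labinform may combine the individual files differently.

```python
import hashlib
import os

# Hypothetical control file names; the real names are defined by labinform.
CONTROL_FILES = {"MANIFEST.yaml", "CHECKSUM", "CHECKSUM_data"}


def md5_for_directory(path, ignore_control_files=True):
    """Return one MD5 digest over all regular files in ``path`` (illustrative)."""
    digest = hashlib.md5()
    # Sorting makes the result independent of the order in which the OS lists files.
    for name in sorted(os.listdir(path)):
        full_name = os.path.join(path, name)
        if not os.path.isfile(full_name):
            continue
        if ignore_control_files and name in CONTROL_FILES:
            continue
        with open(full_name, "rb") as handle:
            digest.update(handle.read())
    return digest.hexdigest()
```

Excluding the control files keeps the checksum stable when a manifest or checksum file is later written into the same directory.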
-
make_tgz(path='')
Pack directory content to *.tgz file.
Pack all files in the directory into a *.tgz file, without including the folder itself.
- Parameters
path (str) – path of the directory containing the files
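Packing directory content "without the folder itself" can be done with the standard `tarfile` module by setting `arcname` for each entry, as in the following sketch. The function name `pack_directory` and the separate `archive_path` argument are assumptions for this example; `make_tgz()` itself takes only `path`.

```python
import os
import tarfile


def pack_directory(path, archive_path):
    """Pack the content of ``path`` into a gzipped tar archive (illustrative).

    ``archive_path`` should lie outside ``path``, otherwise the archive
    would be listed as part of its own content.
    """
    with tarfile.open(archive_path, "w:gz") as archive:
        for name in sorted(os.listdir(path)):
            # arcname drops the parent folder, so entries sit at the archive root.
            archive.add(os.path.join(path, name), arcname=name)
    return archive_path
```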
-
moveto(data='', experiment='', sample_id='')
Prepare directory in datasafe and move data there.
This is a wrapper function that calls generate() to create the directory structure if necessary and computes a local checksum of the file to be moved. It then moves the file to the datasafe and computes a second checksum. The two checksums are compared and the result of the comparison is returned.
- Parameters
- Returns
results – list containing the generated LOI and the result of the checksum comparison
- Return type
-
multi_push(path='', loi='')
Move data (all files in one directory) into the datasafe.
Wrapper around push() for moving all files in a single directory. The files are packed into a tgz archive before moving and unpacked afterwards. The data’s checksums computed before packing and after unpacking are compared.
-
property path
Get or set the path of the datasafe’s top level directory.
The directory is checked for existence and is set as the path only if it exists.
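A property that validates a directory before accepting it could look like the sketch below. The class `DatasafeSketch` and the locally defined `NoSuchDirectoryError` are stand-ins for this example; whether the real setter raises exactly this exception for a missing directory is an assumption based on the exception's description.

```python
import os


class NoSuchDirectoryError(Exception):
    """Local stand-in for the exception of the same name (assumption)."""


class DatasafeSketch:
    """Minimal stand-in class showing a validating ``path`` property."""

    def __init__(self):
        self._path = ""

    @property
    def path(self):
        return self._path

    @path.setter
    def path(self, value):
        # Accept the new path only if the directory actually exists.
        if not os.path.isdir(value):
            raise NoSuchDirectoryError(value)
        self._path = value
```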
-
pull(loi='', target='')
Retrieve data from the datasafe.
Retrieves the data if present at the directory specified in the LOI and moves it to the given target directory; raises an exception otherwise.
-
push(data='', loi='', check_empty=True)
Move data (one file) into the datasafe.
Before moving, the target directory (as specified in the LOI) is checked for existence and emptiness. The data’s checksums before and after moving are compared.
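The check-move-verify sequence can be illustrated with the standard library as follows. This is a sketch under stated assumptions, not the method's implementation: `move_verified` takes a target directory directly instead of resolving an LOI, and it raises built-in exceptions rather than the module's own.

```python
import hashlib
import os
import shutil


def _md5(path):
    """Return the MD5 hex digest of a (small) file."""
    with open(path, "rb") as handle:
        return hashlib.md5(handle.read()).hexdigest()


def move_verified(source, target_dir, check_empty=True):
    """Move ``source`` into ``target_dir`` and report whether the checksum survived."""
    if not os.path.isdir(target_dir):
        raise FileNotFoundError(f"No such directory: {target_dir}")
    if check_empty and os.listdir(target_dir):
        raise OSError(f"Directory not empty: {target_dir}")
    before = _md5(source)
    # shutil.move returns the final location of the moved file.
    moved = shutil.move(source, target_dir)
    after = _md5(moved)
    return moved, before == after
```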
-
retrieve_checksum(loi='', with_meta=False)
Return checksum from checksum file for a given LOI.
-
exception labinform.datasafe.datasafe.DirectoryNotEmptyError
Bases: labinform.datasafe.datasafe.Error
Raised on an attempt to push data to a non-empty directory.
-
exception labinform.datasafe.datasafe.Error
Bases: Exception
Base class for exceptions in this module.
-
exception labinform.datasafe.datasafe.IncorrectLoiError
Bases: labinform.datasafe.datasafe.Error
Raised when an incorrect LOI is provided.
-
exception labinform.datasafe.datasafe.NoChecksumFilePresentError
Bases: labinform.datasafe.datasafe.Error
Raised when the checksum file cannot be retrieved because it does not exist.
-
exception labinform.datasafe.datasafe.NoPathForThisLoiError
Bases: labinform.datasafe.datasafe.Error
Raised when the path corresponding to a given LOI doesn’t exist.
-
exception labinform.datasafe.datasafe.NoSuchDirectoryError
Bases: labinform.datasafe.datasafe.Error
Raised when an invalid path is set.
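Because all exceptions above derive from a common Error base class, callers can handle a specific failure first and fall back to the base class for everything else. The sketch below mirrors part of the hierarchy with locally defined stand-in classes so that it runs without labinform installed; `describe_failure` is a hypothetical helper.

```python
class Error(Exception):
    """Stand-in for the module's base exception."""


class IncorrectLoiError(Error):
    """Stand-in: raised for a malformed LOI."""


class NoPathForThisLoiError(Error):
    """Stand-in: raised when the LOI's path does not exist."""


def describe_failure(action):
    """Run ``action`` and translate datasafe-style errors into short messages."""
    try:
        action()
    except IncorrectLoiError:
        # Most specific handler first.
        return "fix the LOI"
    except Error as exc:
        # Any other exception from the same hierarchy ends up here.
        return f"datasafe error: {type(exc).__name__}"
    return "ok"
```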
datasafe.manifest module
Module used for creation of manifest files.
-
class labinform.datasafe.manifest.ManifestWriter
Bases: object
Tool for automated creation of manifest files.
-
manifest_dict
Ordered dict that is filled with information and finally saved as a manifest file.
- Type
-
complete
Indication whether the data of the dataset are complete.
Sometimes measurements get cancelled, but the data measured so far are still useful. In such cases, however, some of the metadata may not match the actual dimensions of the numerical data.
- Type
-