miranda package

Copyright 2019-2023 Trevor James Smith and Ouranos Inc.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

class miranda.DataBase(source, *, destination: Path | str | None = None, common_path: Path | str | None = None, file_pattern: str | list[str] = '*.nc', project_name: str = None, recursive: bool = True)

Bases: object

Database management class.

archive()

Not yet implemented.

group_by(*, common_path: Path | str = None, subdirectories: bool = True, dates: bool = True, size: int = 10737418240)

Grouping meta-function.

Notes

Not yet implemented.

items()

Show items.

keys()

Show keys.

target(target: Path | str)

Target directory or server address.

transfer()

Not yet implemented.

values()

Show values.

class miranda.FileMeta(path: str, size: int = -1)

Bases: object

File path and size.

django = {'path': ['CharField', 'max_length=512'], 'size': ['IntegerField', 'null=True', 'blank=True']}

class miranda.StorageState(base_path, capacity=-1, used_space=-1, free_space=-1)

Bases: object

Information regarding the storage capacity of a disk.

Subpackages

Submodules

miranda.cv module

Controlled Vocabulary module.

miranda.data module

Database Management module.

class miranda.data.DataBase(source, *, destination: Path | str | None = None, common_path: Path | str | None = None, file_pattern: str | list[str] = '*.nc', project_name: str = None, recursive: bool = True)

Bases: object

Database management class.

archive()

Not yet implemented.

group_by(*, common_path: Path | str = None, subdirectories: bool = True, dates: bool = True, size: int = 10737418240)

Grouping meta-function.

Notes

Not yet implemented.

items()

Show items.

keys()

Show keys.

target(target: Path | str)

Target directory or server address.

transfer()

Not yet implemented.

values()

Show values.

Show values.

miranda.scripting module

Scripting Helpers module.

miranda.storage module

Disk space management

Classes:

  • DiskSpaceError - the exception raised on failure.

  • FileMeta - file and its size.

  • StorageState - storage capacity and availability of a medium.

Functions:

  • total_size() - get total size of a list of files.

  • size_division() - divide files based on number and size restrictions.

exception miranda.storage.DiskSpaceError

Bases: Exception

DiskSpaceError Exception.

class miranda.storage.FileMeta(path: str, size: int = -1)

Bases: object

File path and size.

django = {'path': ['CharField', 'max_length=512'], 'size': ['IntegerField', 'null=True', 'blank=True']}

class miranda.storage.StorageState(base_path, capacity=-1, used_space=-1, free_space=-1)

Bases: object

Information regarding the storage capacity of a disk.

miranda.storage.file_size(file_path_or_bytes_or_dict: Path | str | int | list[str | Path] | generator | dict[str, Path | list[Path]]) → int

Return size of object in bytes.

Parameters:

file_path_or_bytes_or_dict (Path or str or int, list of str or Path, GeneratorType, or dict[str, Path or list of Path])

Returns:

int
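The dispatch implied by the signature can be sketched roughly as follows. This is only an illustration under stated assumptions, not miranda's implementation: `sketch_file_size` is a hypothetical name, an integer is assumed to already be a byte count, and dict values are assumed to be paths or lists of paths.

```python
from pathlib import Path
from types import GeneratorType


def sketch_file_size(obj) -> int:
    """Return a size in bytes for an int, a path, or a collection of paths."""
    if isinstance(obj, int):
        # Assumption: an integer is already a size in bytes.
        return obj
    if isinstance(obj, (str, Path)):
        # A single file: ask the file system for its size.
        return Path(obj).stat().st_size
    if isinstance(obj, dict):
        # Assumption: values are paths or lists of paths; keys are ignored.
        return sum(sketch_file_size(value) for value in obj.values())
    if isinstance(obj, (list, tuple, GeneratorType)):
        # Collections are summed element by element.
        return sum(sketch_file_size(item) for item in obj)
    raise TypeError(f"Unsupported input type: {type(obj).__name__}")
```

Recursing on each element keeps the handling of nested inputs (for example, a dict whose values are lists) in one place.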

miranda.storage.report_file_size(file_path_or_bytes_or_dict: Path | str | int | list[str | Path] | generator | dict[str, Path | list[Path]], use_binary: bool = True, significant_digits: int = 2) → str

Report file size in a human-readable format.

This function parses the contents of a list or generator of files and returns the size of a file or list of files as human-readable text.

Parameters:
  • file_path_or_bytes_or_dict (Path or str or int, list of str or Path, GeneratorType, or dict[str, Path or list of Path])

  • use_binary (bool)

  • significant_digits (int)
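One plausible way to turn a byte count into human-readable text is sketched below. The function name `sketch_report_size` and the unit labels are assumptions, and `significant_digits` is interpreted here as the number of digits after the decimal point, which may not match miranda's behaviour.

```python
def sketch_report_size(
    num_bytes: int, use_binary: bool = True, significant_digits: int = 2
) -> str:
    """Format a byte count with binary (KiB) or decimal (kB) units."""
    base = 1024 if use_binary else 1000
    units = (
        ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
        if use_binary
        else ["B", "kB", "MB", "GB", "TB", "PB"]
    )
    size = float(num_bytes)
    index = 0
    # Step up one unit at a time until the value drops below the base.
    while size >= base and index < len(units) - 1:
        size /= base
        index += 1
    # Assumption: `significant_digits` means digits after the decimal point.
    return f"{size:.{significant_digits}f} {units[index]}"


print(sketch_report_size(10737418240))  # the default `size` of group_by
```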

miranda.storage.size_division(files_to_divide: list | FileMeta | Path, size_limit: int = 0, file_limit: int = 0, check_name_repetition: bool = False, preserve_order: bool = False) → list[list]

Divide files according to size and number limits.

Parameters:
  • files_to_divide (list of str or Path, FileMeta, Path) – Files to be sorted.

  • size_limit (int) – Size limit of divisions in bytes. Default: 0 (no limit).

  • file_limit (int) – Number of files limit of divisions. Default: 0 (no limit).

  • check_name_repetition (bool) – Flag to prevent file name repetitions. Default: False.

  • preserve_order (bool) – Flag to force files to be restored in the order they are given. Default: False.

Returns:

list[list] – list of divisions (each division is a list of FileMeta objects).
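The idea of dividing files under size and count limits can be illustrated with a simple greedy pass. This is a sketch, not the library's algorithm: `SketchFile` is a hypothetical stand-in for FileMeta, files are always kept in the order given, and the `check_name_repetition` option is not modelled.

```python
from dataclasses import dataclass


@dataclass
class SketchFile:
    """Hypothetical stand-in for FileMeta: a path and a size in bytes."""

    path: str
    size: int


def sketch_size_division(files, size_limit=0, file_limit=0):
    """Greedily pack files into divisions; a limit of 0 means no limit."""
    divisions = []
    current, current_size = [], 0
    for f in files:
        # Close the current division if adding this file would break a limit.
        # A single oversized file still gets a division of its own.
        too_big = size_limit and current and current_size + f.size > size_limit
        too_many = file_limit and len(current) >= file_limit
        if too_big or too_many:
            divisions.append(current)
            current, current_size = [], 0
        current.append(f)
        current_size += f.size
    if current:
        divisions.append(current)
    return divisions
```

A greedy pass is easy to reason about but does not minimise the number of divisions; a packing heuristic that reorders files could do better when order does not matter.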

miranda.storage.size_evaluation(file_list: list[str | FileMeta | Path]) → int

Total size of files.

Parameters:

file_list (list of str or Path or FileMeta)

Returns:

int – total size of files in bytes.

miranda.units module

Special Time Units-Handling submodule.

miranda.units.get_time_frequency(d: Dataset, expected_period: str | None = None, minimum_continuous_period: str = '1M') → tuple[list[int | str], str]

Try to understand the Dataset frequency.

If the frequency cannot be inferred with xarray.infer_freq(), the function tries to:

  • look for a “freq” attribute in the global or time variable attributes;

  • infer a monthly frequency if all time steps are between 27 and 32 days.

In the event that an expected_period is supplied, special handling will be called allowing for determining data that may be internally discontinuous (e.g. discontinuous overall, but continuous for minimum_continuous_period). This is provided for instances where input data in a multifile dataset is sparse.

Parameters:
  • d (xr.Dataset) – An xarray.Dataset.

  • expected_period (str) – An xarray-compatible time period (e.g. “1H”, “1D”, “7D”, “1M”, “1A”). The time period expected of the input dataset. The “1M” period is specially-handled.

  • minimum_continuous_period (str) – An xarray-compatible time period (e.g. “1H”, “1D”, “7D”, “1M”, “1A”). The minimum expected granular period that data should have continuous values for. The “1M” period is specially-handled.

Returns:

  • offset (list of int or str) – The offset, as a list of (multiplier, base).

  • offset_meaning (str) – The offset meaning (single word).
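The monthly fallback described above (all time steps between 27 and 32 days) can be sketched with the standard library alone. `looks_monthly` is a hypothetical helper, and treating the bounds as inclusive is an assumption; the real function works on an xarray.Dataset rather than a list of datetimes.

```python
from datetime import datetime, timedelta


def looks_monthly(times: list[datetime]) -> bool:
    """Return True if every consecutive time step spans 27 to 32 days."""
    if len(times) < 2:
        # A single time stamp carries no frequency information.
        return False
    steps = [later - earlier for earlier, later in zip(times, times[1:])]
    # Assumption: both bounds are inclusive.
    return all(timedelta(days=27) <= step <= timedelta(days=32) for step in steps)


monthly = [datetime(2020, month, 1) for month in range(1, 5)]
print(looks_monthly(monthly))
```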

miranda.units.parse_offset(freq: str) → Sequence[str]

Parse an offset string.

Parse a frequency offset and, if needed, convert to cftime-compatible components.

Parameters:

freq (str) – Frequency offset.

Returns:

  • multiplier (int) – Multiplier of the base frequency. “[n]W” is always replaced with “[7n]D”, as xarray doesn’t support “W” for cftime indexes.

  • offset_base (str) – Base frequency. “Y” is always replaced with “A”.

  • is_start_anchored (bool) – Whether coordinates of this frequency should correspond to the beginning of the period (True) or its end (False). Can only be False when base is A, Q or M; in other words, xclim assumes frequencies finer than monthly are all start-anchored.

  • anchor (str or None) – Anchor date for bases A or Q. As xarray doesn’t support “W”, neither does xclim (anchor information is lost when given).
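The multiplier and base conversions listed above can be illustrated with a small regular-expression sketch. `sketch_parse_offset` is a hypothetical name; it covers only the “W” to “D” and “Y” to “A” replacements and deliberately leaves out start anchoring and anchor dates.

```python
import re


def sketch_parse_offset(freq: str) -> tuple[int, str]:
    """Split a frequency string into (multiplier, base)."""
    match = re.fullmatch(r"(\d*)([A-Za-z]+)", freq)
    if match is None:
        raise ValueError(f"Cannot parse offset: {freq!r}")
    # A missing multiplier means 1, e.g. "D" is the same as "1D".
    multiplier = int(match.group(1) or 1)
    base = match.group(2)
    if base == "W":
        # "[n]W" becomes "[7n]D".
        multiplier, base = 7 * multiplier, "D"
    elif base == "Y":
        # "Y" becomes "A".
        base = "A"
    return multiplier, base


print(sketch_parse_offset("2W"))
```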

miranda.utils module

Miscellaneous Helper Utilities module.

class miranda.utils.HiddenPrints

Bases: object

Special context manager for hiding print statements.

Notes

Solution from https://stackoverflow.com/a/45669280/7322852. Credit to Alexander C (https://stackoverflow.com/users/2039471/alexander-c), CC-BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0/).
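A context manager of this kind is commonly written by swapping sys.stdout for a handle on the null device. The sketch below shows that pattern under the hypothetical name `SketchHiddenPrints`; it is not necessarily identical to miranda's class.

```python
import os
import sys


class SketchHiddenPrints:
    """Silence print() calls inside a `with` block."""

    def __enter__(self):
        # Remember the real stdout and send output to the null device.
        self._original_stdout = sys.stdout
        sys.stdout = open(os.devnull, "w")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Close the null handle and restore stdout, even after an exception.
        sys.stdout.close()
        sys.stdout = self._original_stdout


with SketchHiddenPrints():
    print("this line is not shown")
print("this line is shown")
```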

miranda.utils.chunk_iterables(iterable: Sequence, chunk_size: int) → Iterable

Generate lists of chunk_size elements from iterable.

Notes

Adapted from eidord (2012) https://stackoverflow.com/a/12797249/7322852 (https://creativecommons.org/licenses/by-sa/4.0/)
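Chunking a sequence is typically done by slicing at fixed strides. The following is a minimal sketch of that approach; `sketch_chunks` is a hypothetical name and the validation of `chunk_size` is an added assumption.

```python
from collections.abc import Iterator, Sequence


def sketch_chunks(iterable: Sequence, chunk_size: int) -> Iterator[list]:
    """Yield successive lists of at most `chunk_size` elements."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(iterable), chunk_size):
        # The last chunk may be shorter than chunk_size.
        yield list(iterable[start:start + chunk_size])


print(list(sketch_chunks([1, 2, 3, 4, 5], 2)))
```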

miranda.utils.generic_extract_archive(resources: str | Path | list[bytes | str | Path], output_dir: str | Path | None = None) → list[Path]

Extract archives (tar/zip) to a working directory.

Parameters:
  • resources (str or Path or list of bytes or str or Path) – List of archive files. Any netCDF files in the list are passed through and included in the returned list.

  • output_dir (str or Path, optional) – string or Path to a working location (default: temporary folder).

Returns:

list – List of original or of extracted files

miranda.utils.list_paths_with_elements(base_paths: str | list[str], elements: list[str]) → list[dict]

List a given path structure.

Parameters:
  • base_paths (list of str) – List of paths from which to start the search.

  • elements (list of str) – Ordered list of the expected elements.

Returns:

list of dict – The keys are ‘path’ and each member of the given elements; ‘path’ holds the absolute path.

Notes

Suppose you have the following structure: /base_path/{color}/{shape}. The resulting list would look like:

[{'path': '/base_path/red/square', 'color': 'red', 'shape': 'square'},
 {'path': '/base_path/red/circle', 'color': 'red', 'shape': 'circle'},
 {'path': '/base_path/blue/triangle', 'color': 'blue', 'shape': 'triangle'},
 ...]

Obviously, ‘path’ should not be in the input list of elements.
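The structure described in these notes can be reproduced with a level-by-level directory walk. This sketch assumes a single base path and that every level consists of directories; `sketch_list_paths` is a hypothetical name, not the library function.

```python
from pathlib import Path


def sketch_list_paths(base_path, elements):
    """Map each directory level under base_path to one named element."""
    results = [{"path": str(Path(base_path).resolve())}]
    for element in elements:
        next_level = []
        for entry in results:
            # Descend one level and record the directory name under `element`.
            for child in sorted(Path(entry["path"]).iterdir()):
                if child.is_dir():
                    next_level.append(
                        {**entry, "path": str(child), element: child.name}
                    )
        results = next_level
    return results
```

Because each entry stores its location under the ‘path’ key, an element named ‘path’ would overwrite it, which is why that name has to stay out of the input list.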

miranda.utils.publish_release_notes(style: str = 'md', file: PathLike | StringIO | TextIO | None = None) → str | None

Format release history in Markdown or ReStructuredText.

Parameters:
  • style ({“rst”, “md”}) – Use ReStructuredText formatting or Markdown. Default: Markdown.

  • file ({os.PathLike, StringIO, TextIO}, optional) – If provided, prints to the given file-like object. Otherwise, returns a string.

Returns:

str, optional

Notes

This function is solely for development purposes.

miranda.utils.single_item_list(iterable: Iterable) → bool

Ascertain whether a list has exactly one entry.

See: https://stackoverflow.com/a/16801605/7322852

Parameters:

iterable (Iterable)

Returns:

bool
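Read literally, the check “exactly one entry” can be done without materialising the iterable. The sketch below follows that literal reading under the hypothetical name `sketch_single_item`; the linked answer, and miranda itself, may use a different test.

```python
def sketch_single_item(iterable) -> bool:
    """Return True if the iterable yields exactly one entry."""
    iterator = iter(iterable)
    sentinel = object()
    # An empty iterable has no first entry.
    if next(iterator, sentinel) is sentinel:
        return False
    # Exactly one entry means there is no second one.
    return next(iterator, sentinel) is sentinel


print(sketch_single_item(["only"]))
```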

miranda.utils.working_directory(directory: str | Path) → None

Change the working directory within a context object.

This function temporarily changes the working directory within the context and reverts to the original working directory when the code block exits.

Parameters:

directory (str or pathlib.Path)

Returns:

None
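A context manager with this behaviour is usually built on contextlib.contextmanager with a try/finally block. The following is a sketch of that pattern under the hypothetical name `sketch_working_directory`, not the library's source.

```python
import os
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def sketch_working_directory(directory):
    """Temporarily change the working directory inside a `with` block."""
    previous = Path.cwd()
    os.chdir(directory)
    try:
        yield
    finally:
        # Restore the previous directory even if the block raised.
        os.chdir(previous)
```

The `finally` clause is what guarantees the revert when the body of the `with` block fails.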

miranda.validators module

Data Validation module.

miranda.validators.url_validate(target: str) → Match[str] | None

Validate whether a supplied URL is reliably written.

Parameters:

target (str)

References

https://stackoverflow.com/a/7160778/7322852
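URL validation that returns a match object, as the signature suggests, is typically a compiled regular expression. The pattern below is a simplified assumption for illustration and is much less thorough than the one in the referenced answer; `sketch_url_validate` is a hypothetical name.

```python
import re

# Simplified pattern (an assumption): scheme, dotted domain, optional port and path.
_URL_PATTERN = re.compile(
    r"^(?:http|ftp)s?://"          # http, https, ftp or ftps scheme
    r"(?:[A-Z0-9-]+\.)+[A-Z]{2,}"  # domain such as example.org
    r"(?::\d+)?"                   # optional port
    r"(?:/\S*)?$",                 # optional path without whitespace
    re.IGNORECASE,
)


def sketch_url_validate(target: str):
    """Return a match object for a plausible URL, otherwise None."""
    return _URL_PATTERN.match(target)


print(bool(sketch_url_validate("https://example.org/data/file.nc")))
```

Returning the match object rather than a bool lets callers use the result directly in an `if` test while keeping the matched text available.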