miranda package#

Copyright 2019-2023 Trevor James Smith and Ouranos Inc.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

class miranda.DataBase(source, *, destination: str | Path | None = None, common_path: str | Path | None = None, file_pattern: str | List[str] = '*.nc', project_name: str | None = None, recursive: bool = True)[source]#

Bases: object

archive()[source]#
group_by(*, common_path: Path | str | None = None, subdirectories: bool = True, dates: bool = True, size: int = 10737418240)[source]#
items()[source]#
keys()[source]#
target(target: Path | str)[source]#
transfer()[source]#
values()[source]#
class miranda.FileMeta(path: str, size: int = -1)[source]#

Bases: object

File path and size.

django = {'path': ['CharField', 'max_length=512'], 'size': ['IntegerField', 'null=True', 'blank=True']}#
class miranda.StorageState(base_path, capacity=-1, used_space=-1, free_space=-1)[source]#

Bases: object

Information regarding the storage capacity of a disk.

Subpackages#

Submodules#

miranda.cv module#

miranda.data module#

class miranda.data.DataBase(source, *, destination: str | Path | None = None, common_path: str | Path | None = None, file_pattern: str | List[str] = '*.nc', project_name: str | None = None, recursive: bool = True)[source]#

Bases: object

archive()[source]#
group_by(*, common_path: Path | str | None = None, subdirectories: bool = True, dates: bool = True, size: int = 10737418240)[source]#
items()[source]#
keys()[source]#
target(target: Path | str)[source]#
transfer()[source]#
values()[source]#

miranda.scripting module#

miranda.storage module#

Disk space management#

Classes:

  • DiskSpaceError - the exception raised on failure.

  • FileMeta - file path and its size.

  • StorageState - storage capacity and availability of a medium.

Functions:

  • size_evaluation() - get total size of a list of files.

  • size_division() - divide files based on number and size restrictions.

exception miranda.storage.DiskSpaceError[source]#

Bases: Exception

class miranda.storage.FileMeta(path: str, size: int = -1)[source]#

Bases: object

File path and size.

django = {'path': ['CharField', 'max_length=512'], 'size': ['IntegerField', 'null=True', 'blank=True']}#
class miranda.storage.StorageState(base_path, capacity=-1, used_space=-1, free_space=-1)[source]#

Bases: object

Information regarding the storage capacity of a disk.

miranda.storage.file_size(file_path_or_bytes_or_dict: Path | str | int | List[str | Path] | generator | Dict[str, Path | List[Path]]) int[source]#
Parameters:

file_path_or_bytes_or_dict (Union[Path, str, int, List[Union[str, Path]], GeneratorType, Dict[str, Union[Path, List[Path]]]])

Returns:

int

miranda.storage.report_file_size(file_path_or_bytes_or_dict: Path | str | int | List[str | Path] | Dict[str, Path | List[Path]], use_binary: bool = True, significant_digits: int = 2) str[source]#

Report the size in bytes of a file, or of a collection of files (a list or generator), as pretty-formatted text.
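As a rough illustration of what pretty-formatted size text can look like, here is a minimal sketch. The helper name `format_size` and its output format are assumptions made for this example and are not miranda's implementation; only the `use_binary` and `significant_digits` parameter names come from the signature above. The sketch also treats `significant_digits` as a number of decimal places, which is a guess.

```python
def format_size(num_bytes: int, use_binary: bool = True, significant_digits: int = 2) -> str:
    """Hypothetical helper: render a byte count as human-readable text."""
    # Binary units step by 1024, decimal (SI) units step by 1000.
    base = 1024 if use_binary else 1000
    units = (
        ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
        if use_binary
        else ["B", "kB", "MB", "GB", "TB", "PB"]
    )
    size = float(num_bytes)
    index = 0
    # Step up one unit at a time until the value drops below the base.
    while size >= base and index < len(units) - 1:
        size /= base
        index += 1
    # Assumption: significant_digits is interpreted as decimal places.
    return f"{size:.{significant_digits}f} {units[index]}"


print(format_size(10737418240))  # the default `size` of DataBase.group_by
```

With binary units, 10737418240 bytes is exactly 10 GiB, which is why that value appears as the default `size` in `group_by`.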

miranda.storage.size_division(files_to_divide: List | FileMeta | Path, size_limit: int = 0, file_limit: int = 0, check_name_repetition: bool = False, preserve_order: bool = False) List[list][source]#

Divide files according to size and number limits.

Parameters:
  • files_to_divide (Union[List, FileMeta, Path])

  • size_limit (int) – Size limit of divisions in bytes. Default: 0 (no limit).

  • file_limit (int) – Number of files limit of divisions. Default: 0 (no limit).

  • check_name_repetition (bool) – Flag to prevent file name repetitions. Default: False.

  • preserve_order (bool) – Flag to force files to be restored in the order they are given. Default: False.

Returns:

List[list] – list of divisions (each division is a list of FileMeta objects).
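The interaction between `size_limit` and `file_limit` can be sketched with a simple greedy pass. This is an illustration under stated assumptions, not miranda's algorithm: the function name `divide_by_limits` is hypothetical, files are represented as `(name, size)` tuples instead of FileMeta objects, input order is always preserved, and `check_name_repetition` is not modelled.

```python
from typing import List, Tuple

FileEntry = Tuple[str, int]  # (name, size in bytes); stands in for FileMeta


def divide_by_limits(
    files: List[FileEntry], size_limit: int = 0, file_limit: int = 0
) -> List[List[FileEntry]]:
    """Hypothetical greedy division; a limit of 0 means "no limit"."""
    divisions: List[List[FileEntry]] = []
    current: List[FileEntry] = []
    current_size = 0
    for name, size in files:
        # Close the current division if adding this file would break a limit.
        too_big = size_limit > 0 and bool(current) and current_size + size > size_limit
        too_many = file_limit > 0 and len(current) >= file_limit
        if too_big or too_many:
            divisions.append(current)
            current, current_size = [], 0
        current.append((name, size))
        current_size += size
    if current:
        divisions.append(current)
    return divisions


files = [("a.nc", 4), ("b.nc", 4), ("c.nc", 4), ("d.nc", 1)]
print(divide_by_limits(files, size_limit=8))
```

A file larger than `size_limit` still gets its own division in this sketch; whether the real function behaves the same way is not stated in the documentation above.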

miranda.storage.size_evaluation(file_list: List[str | FileMeta | Path]) int[source]#

Total size of files.

Parameters:

file_list (List[Union[str, Path, FileMeta]])

Returns:

int – total size of files in bytes.

miranda.units module#

miranda.units.get_time_frequency(d: Dataset, expected_period: str | None = None, minimum_continuous_period: str = '1M') Tuple[List[int | str], str][source]#

Try to understand the Dataset frequency.

If the frequency can’t be inferred with xarray.infer_freq(), it tries to:

  • look for a “freq” attribute in the global or time variable attributes;

  • infer a monthly frequency if all time steps are between 27 and 32 days.

In the event that an expected_period is supplied, special handling will be called allowing for determining data that may be internally discontinuous (e.g. discontinuous overall, but continuous for minimum_continuous_period). This is provided for instances where input data in a multifile dataset is sparse.

Parameters:
  • d (xr.Dataset)

  • expected_period (str) – An xarray-compatible time period (e.g. “1H”, “1D”, “7D”, “1M”, “1A”). The time period expected of the input dataset. The “1M” period is specially handled.

  • minimum_continuous_period (str) – An xarray-compatible time period (e.g. “1H”, “1D”, “7D”, “1M”, “1A”). The minimum expected granular period that data should have continuous values for. The “1M” period is specially handled.

Returns:

  • offset (List[Union[int, str]]) – The offset as a list of (multiplier, base).

  • offset_meaning (str) – The offset meaning (single word)
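The monthly fallback described above (all time steps between 27 and 32 days) can be sketched with the standard library alone. The function name `looks_monthly` is hypothetical, plain `datetime` objects stand in for the Dataset's time coordinate, and treating both bounds as inclusive is an assumption.

```python
from datetime import datetime, timedelta
from typing import Sequence


def looks_monthly(times: Sequence[datetime]) -> bool:
    """Hypothetical check: are all consecutive steps 27 to 32 days long?"""
    if len(times) < 2:
        # A single time step has no spacing to examine.
        return False
    steps = [later - earlier for earlier, later in zip(times, times[1:])]
    # Assumption: both bounds are inclusive.
    return all(timedelta(days=27) <= step <= timedelta(days=32) for step in steps)


month_starts = [datetime(2001, month, 1) for month in range(1, 6)]
print(looks_monthly(month_starts))
```

The 27 to 32 day window covers every calendar month length (28 to 31 days) with a small tolerance on either side.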

miranda.units.parse_offset(freq: str) Sequence[str][source]#

Parse an offset string.

Parse a frequency offset and, if needed, convert to cftime-compatible components.

Parameters:

freq (str) – Frequency offset.

Returns:

  • multiplier (int) – Multiplier of the base frequency. “[n]W” is always replaced with “[7n]D”, as xarray doesn’t support “W” for cftime indexes.

  • offset_base (str) – Base frequency. “Y” is always replaced with “A”.

  • is_start_anchored (bool) – Whether coordinates of this frequency should correspond to the beginning of the period (True) or its end (False). Can only be False when base is A, Q or M; in other words, xclim assumes frequencies finer than monthly are all start-anchored.

  • anchor (str or None) – Anchor date for bases A or Q. As xarray doesn’t support “W”, neither does xclim (anchor information is lost when given).
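The multiplier and base conversions described above can be sketched with a small regular expression. This is only an illustration: the name `split_offset` is hypothetical, and the sketch handles plain offsets such as “2W” or “6H” only, leaving out start anchoring and anchor dates (e.g. “AS-JUL”).

```python
import re
from typing import Tuple

# Optional integer multiplier followed by an alphabetic base, e.g. "7D" or "M".
_OFFSET = re.compile(r"^(\d*)([A-Za-z]+)$")


def split_offset(freq: str) -> Tuple[int, str]:
    """Hypothetical helper: split a plain offset into (multiplier, base)."""
    match = _OFFSET.match(freq)
    if match is None:
        raise ValueError(f"Unsupported offset in this sketch: {freq!r}")
    multiplier = int(match.group(1) or 1)  # a missing multiplier means 1
    base = match.group(2)
    if base == "W":
        # "[n]W" becomes "[7n]D", as described in the Returns section above.
        multiplier, base = 7 * multiplier, "D"
    elif base == "Y":
        # "Y" is replaced with "A".
        base = "A"
    return multiplier, base


print(split_offset("2W"))
```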

miranda.utils module#

class miranda.utils.HiddenPrints[source]#

Bases: object

miranda.utils.chunk_iterables(iterable: Sequence, chunk_size: int) Iterable[source]#

Generate lists of chunk_size elements from iterable.

Notes

Adapted from eidord (2012) https://stackoverflow.com/a/12797249/7322852 (https://creativecommons.org/licenses/by-sa/4.0/)
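A minimal sketch of chunking a sequence into lists of `chunk_size` elements follows. It uses slicing and is not necessarily the approach taken in the linked answer or in miranda; the name `chunked` is hypothetical.

```python
from typing import Iterator, List, Sequence


def chunked(sequence: Sequence, chunk_size: int) -> Iterator[List]:
    """Hypothetical helper: yield consecutive lists of up to chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(sequence), chunk_size):
        # The final chunk may be shorter than chunk_size.
        yield list(sequence[start:start + chunk_size])


print(list(chunked([1, 2, 3, 4, 5], 2)))  # [[1, 2], [3, 4], [5]]
```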

miranda.utils.generic_extract_archive(resources: str | Path | List[bytes | str | Path], output_dir: str | Path | None = None) List[Path][source]#

Extract archives (tar/zip) to a working directory.

Parameters:
  • resources (Union[str, Path, List[Union[bytes, str, Path]]]) – list of archive files (netCDF files in the list are passed through and returned as well).

  • output_dir (Optional[Union[str, Path]]) – string or Path to a working location (default: temporary folder).

Returns:

list – List of original or extracted files.

miranda.utils.list_paths_with_elements(base_paths: str | List[str], elements: List[str]) List[Dict][source]#

List a given path structure.

Parameters:
  • base_paths (List[str]) – list of paths from which to start the search.

  • elements (List[str]) – ordered list of the expected elements.

Returns:

List[Dict] – The keys are ‘path’ and each member of the given elements; ‘path’ holds the absolute path.

Notes

Suppose you have the following structure: /base_path/{color}/{shape}. The resulting list would look like:

[{'path': '/base_path/red/square', 'color': 'red', 'shape': 'square'},
{'path': '/base_path/red/circle', 'color': 'red', 'shape': 'circle'},
{'path': '/base_path/blue/triangle', 'color': 'blue', 'shape': 'triangle'},
...]

Obviously, ‘path’ should not be in the input list of elements.
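The /base_path/{color}/{shape} example can be reproduced with a short pathlib sketch. The function name `list_structure` is hypothetical and it accepts a single base path; it shows one way such a listing could be built and is not miranda's implementation.

```python
from pathlib import Path
from typing import Dict, List


def list_structure(base_path: str, elements: List[str]) -> List[Dict[str, str]]:
    """Hypothetical helper: map each directory level under base_path to an element name."""
    if not elements:
        raise ValueError("elements must name at least one directory level")
    base = Path(base_path).resolve()
    # One wildcard per expected element, e.g. "*/*" for ["color", "shape"].
    pattern = "/".join("*" for _ in elements)
    results = []
    for candidate in sorted(base.glob(pattern)):
        if not candidate.is_dir():
            continue
        entry = {"path": str(candidate)}  # absolute, because base was resolved
        entry.update(zip(elements, candidate.relative_to(base).parts))
        results.append(entry)
    return results
```

Calling `list_structure("/base_path", ["color", "shape"])` on the structure above would yield one dictionary per leaf directory, sorted by path.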

miranda.utils.publish_release_notes(style: str = 'md', file: PathLike | StringIO | TextIO | None = None) str | None[source]#

Format release history in Markdown or ReStructuredText.

Parameters:
  • style ({“rst”, “md”}) – Use ReStructuredText formatting or Markdown. Default: Markdown.

  • file ({os.PathLike, StringIO, TextIO}, optional) – If provided, prints to the given file-like object. Otherwise, returns a string.

Returns:

str, optional

Notes

This function is solely for development purposes.

miranda.utils.single_item_list(iterable: Iterable) bool[source]#

Ascertain whether a list has exactly one entry.

See: https://stackoverflow.com/a/16801605/7322852

Parameters:

iterable (Iterable)

Returns:

bool

miranda.utils.working_directory(directory: str | Path) None[source]#

Change the working directory within a context object.

This function temporarily changes the working directory within the context and reverts to the original working directory when the code block it acts upon exits.

Parameters:

directory (Union[str, Path])

Returns:

None
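The pattern described above is commonly written with `contextlib.contextmanager`. The following is a minimal sketch of that pattern under a hypothetical name, `temporary_cwd`; it is not a copy of miranda's code.

```python
import os
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator, Union


@contextmanager
def temporary_cwd(directory: Union[str, Path]) -> Iterator[Path]:
    """Hypothetical helper: run a block inside `directory`, then restore the old cwd."""
    previous = Path.cwd()
    os.chdir(directory)
    try:
        yield Path(directory)
    finally:
        # Restore the original directory even if the block raised.
        os.chdir(previous)
```

Usage follows the usual `with` form, for example `with temporary_cwd("/tmp"): ...`; the `try`/`finally` ensures the original directory is restored after an exception.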

miranda.validators module#

miranda.validators.url_validate(target: str) Match[str] | None[source]#

Validate whether a supplied URL is well-formed.

Parameters:

target (str)

References

https://stackoverflow.com/a/7160778/7322852
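A regex-based check of this kind can be sketched as follows. The pattern is deliberately simplified and is an assumption for illustration: it is not the expression from the referenced answer or from miranda, and it does not accept `localhost` or bare IP addresses. The name `looks_like_url` is hypothetical.

```python
import re
from typing import Optional

_URL = re.compile(
    r"^(?:http|ftp)s?://"         # scheme: http, https, ftp or ftps
    r"(?:[A-Z0-9-]+\.)+[A-Z]{2,}"  # dotted host name ending in a TLD
    r"(?::\d+)?"                  # optional port
    r"(?:/\S*)?$",                # optional path without whitespace
    re.IGNORECASE,
)


def looks_like_url(target: str) -> Optional[re.Match]:
    """Hypothetical helper: return a match object for a well-formed URL, else None."""
    return _URL.match(target)


print(bool(looks_like_url("https://example.com/path")))
```

Returning the match object, as the signature above does, lets callers use the result directly in a boolean test.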