simple_dvc.sidecar module

Custom implementation of sidecar files.

Sidecar files can reference “Out” objects, which either correspond directly to cached assets or to “directories”, which are lists of assets. Our implementation assumes there is only one level of indirection, i.e. a directory “out” will only reference file-based outs.

simple_dvc.sidecar.find_unreferenced_data(self)[source]

TODO: move this elsewhere.

class simple_dvc.sidecar.Out[source]

Bases: UDict

Wrapper for a DVC out dictionary.

An “Out” is a dictionary that contains the expected hash of some piece of data as well as a relative path that indicates where it should live.

Example

>>> out = Out({'md5': 'badbeaf', 'path': 'baz'})
>>> out.is_dir
>>> out.rel_cache_fpath
property is_dir
property rel_cache_fpath
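
As a rough illustration of what these two properties might compute, the sketch below uses plain dictionaries instead of the real Out class. The helper names are hypothetical, and the layout (first two hash characters as a subdirectory, a ".dir" suffix marking directory outs) follows the DVC 2.x cache convention; the actual implementation may differ, and DVC 3.x places objects under an additional files/md5 prefix.

```python
from pathlib import PurePosixPath


def sketch_rel_cache_fpath(out: dict) -> PurePosixPath:
    """Hypothetical helper: map an out's hash to a cache-relative path."""
    md5 = out['md5']
    # Assumed DVC 2.x layout: the first two characters name the subdirectory.
    return PurePosixPath(md5[:2]) / md5[2:]


def sketch_is_dir(out: dict) -> bool:
    """Hypothetical helper: directory outs carry a '.dir' suffix on the hash."""
    return out['md5'].endswith('.dir')


file_out = {'md5': 'badbeaf', 'path': 'baz'}
dir_out = {'md5': 'badbeaf.dir', 'path': 'baz_dir'}

print(sketch_rel_cache_fpath(file_out))
print(sketch_is_dir(file_out), sketch_is_dir(dir_out))
```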
class simple_dvc.sidecar.Sidecar(fpath, dvc)[source]

Bases: NiceRepr

Class that handles information stored in a .dvc sidecar file.

Given the additional context of a DVC repo, this provides the ability to check if the referenced data exists, pull it, or check it out. This does not perform any safety checks, which makes it faster than regular DVC, but the user must be careful because the lack of safety means you can break things.

_main_loaded

Note: A sidecar contains two direct references:

  • Reference to specific-file outs (usually just one)

  • Reference to directory outs, each of which is a file that contains a list of references

The directory outs are a list of indirect references, and these are stored in subdir_outs.
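
The single level of indirection described in the note can be sketched as follows. This is not the Sidecar implementation: resolve_outs and cache_path are hypothetical helpers, the outs are plain dictionaries, and the code assumes the DVC 2.x convention that a directory out's cache entry is a JSON list of records with "md5" and "relpath" keys.

```python
import json
import tempfile
from pathlib import Path


def cache_path(cache_dpath, md5):
    # Assumed DVC 2.x cache layout: <cache>/<first two chars>/<rest of hash>
    return Path(cache_dpath) / md5[:2] / md5[2:]


def resolve_outs(main_outs, cache_dpath):
    """Split main outs into direct file outs and expanded directory outs."""
    file_outs, subdir_outs = [], {}
    for out in main_outs:
        if out['md5'].endswith('.dir'):
            # A directory out points at a listing file in the cache.
            entries = json.loads(cache_path(cache_dpath, out['md5']).read_text())
            # One level of indirection: every entry is a plain file out.
            subdir_outs[out['path']] = [
                {'md5': e['md5'], 'path': f"{out['path']}/{e['relpath']}"}
                for e in entries
            ]
        else:
            file_outs.append(out)
    return file_outs, subdir_outs


with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny fake cache holding one directory listing.
    listing = [
        {'md5': 'aa11', 'relpath': 'a.txt'},
        {'md5': 'bb22', 'relpath': 'sub/b.txt'},
    ]
    listing_fpath = cache_path(tmp, 'cc33.dir')
    listing_fpath.parent.mkdir(parents=True)
    listing_fpath.write_text(json.dumps(listing))

    main_outs = [
        {'md5': 'dd44', 'path': 'model.pt'},
        {'md5': 'cc33.dir', 'path': 'images'},
    ]
    file_outs, subdir_outs = resolve_outs(main_outs, tmp)

print(file_outs)
print(subdir_outs)
```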

property num_subdirs
property num_main_files
summary()[source]
_load_all()[source]
_load_main()[source]

Loads pointers from the main sidecar file.

If this contains directories, then there still may be more data to load.

_group_file_outs()[source]

Determine which file outputs exist and which are missing.

_group_dir_outs()[source]

Determine which directory outputs exist and which are missing.
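
A minimal sketch of this kind of grouping, assuming that "exists" means the out's file is present in the local cache. The function name, the group keys, and the cache layout are illustrative assumptions, not the names used by the private methods above.

```python
import tempfile
from pathlib import Path


def group_by_existence(outs, cache_dpath):
    """Hypothetical helper: bucket outs by whether their cache file exists."""
    groups = {'exists': [], 'missing': []}
    for out in outs:
        md5 = out['md5']
        # Assumed DVC 2.x cache layout.
        fpath = Path(cache_dpath) / md5[:2] / md5[2:]
        key = 'exists' if fpath.exists() else 'missing'
        groups[key].append(out)
    return groups


with tempfile.TemporaryDirectory() as tmp:
    # Only the first out is actually present in the fake cache.
    present = Path(tmp) / 'aa' / '11'
    present.parent.mkdir(parents=True)
    present.write_text('data')

    outs = [{'md5': 'aa11', 'path': 'a.txt'}, {'md5': 'bb22', 'path': 'b.txt'}]
    groups = group_by_existence(outs, tmp)

print(groups)
```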

_load_and_group_subdir_files()[source]

For existing directory outputs, loads them and determines if their contents are missing

print('self._sub_file_out_groups = {}'.format(ub.urepr(self._sub_file_out_groups, nl=3)))

_iter_linked_pairs()[source]
Yields:
Tuple[Path, Path] - the target checkout link path and the cache path it should point to
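
The shape of the yielded pairs might look like the sketch below. It is a standalone generator, not the method itself, and it assumes that each out's path is relative to the directory containing the sidecar file and that the cache uses the DVC 2.x layout. All names and example paths are hypothetical.

```python
from pathlib import Path


def iter_linked_pairs(sidecar_fpath, outs, cache_dpath):
    """Yield (link_fpath, cache_fpath) for each file out (illustrative only)."""
    base_dpath = Path(sidecar_fpath).parent
    for out in outs:
        md5 = out['md5']
        # Where the checked-out link should appear in the working tree.
        link_fpath = base_dpath / out['path']
        # Where the real data lives in the cache (assumed layout).
        cache_fpath = Path(cache_dpath) / md5[:2] / md5[2:]
        yield link_fpath, cache_fpath


pairs = list(iter_linked_pairs(
    'repo/data/baz.dvc',
    [{'md5': 'badbeaf', 'path': 'baz'}],
    'repo/.dvc/cache',
))
for link_fpath, cache_fpath in pairs:
    print(f'{link_fpath} -> {cache_fpath}')
```

A checkout built on such pairs would create each link and point it at the cache path; doing so without verifying hashes or existing files is what makes the approach fast but unsafe.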

_group_linked_pairs()[source]
unsafe_checkout()[source]

Unsafe custom checkout logic

class simple_dvc.sidecar.SidecarCollection(iterable=(), /)[source]

Bases: list

classmethod from_paths(paths, dvc)[source]
_load_sidecars(check_links=False)[source]
unsafe_pull(remote_name)[source]

Custom simple implementation of pull that cuts corners

Pulls data from one cache to this local one.
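
A cache-to-cache pull of this kind can be sketched with plain file copies. The example assumes both caches are local directories sharing the DVC 2.x layout; copy_missing is a hypothetical helper, and it skips hash verification and locking entirely, which is the corner-cutting the description refers to.

```python
import shutil
import tempfile
from pathlib import Path


def copy_missing(md5s, remote_cache, local_cache):
    """Copy objects that exist remotely but not locally; return copied hashes."""
    copied = []
    for md5 in md5s:
        rel = Path(md5[:2]) / md5[2:]
        src = Path(remote_cache) / rel
        dst = Path(local_cache) / rel
        if dst.exists() or not src.exists():
            # Already present locally, or unavailable on the remote.
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # no hash check: fast but unsafe
        copied.append(md5)
    return copied


with tempfile.TemporaryDirectory() as remote, tempfile.TemporaryDirectory() as local:
    # The fake remote holds one object; the second hash exists nowhere.
    src = Path(remote) / 'aa' / '11'
    src.parent.mkdir(parents=True)
    src.write_text('payload')

    first = copy_missing(['aa11', 'bb22'], remote, local)
    second = copy_missing(['aa11', 'bb22'], remote, local)
    pulled_text = (Path(local) / 'aa' / '11').read_text()

print(first, second, pulled_text)
```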

unsafe_checkout()[source]
simple_dvc.sidecar.simple_checkout()[source]
simple_dvc.sidecar.unsafe_pull_and_checkout()[source]
simple_dvc.sidecar.find_and_fix_missing_files()[source]

Find DVC files where the data hasn’t been checked out yet.