canns.data.datasets
Universal data loading utilities for CANNs.
This module provides generic functions to download and load data from URLs, with specialized support for CANNs example datasets.
Functions
- Compute SHA256 hash of a file.
- Detect file type based on extension.
- download_dataset: Download a specific dataset.
- download_file_with_progress: Download a file with progress bar.
- Get the data directory, creating it if necessary.
- get_dataset_path: Get path to a dataset, downloading/setting up if necessary.
- get_huggingface_upload_guide: Get guide for uploading datasets to Hugging Face.
- get_left_right_data_session: Download and return files for a Left_Right_data_of session.
- get_left_right_npz: Download and return a specific Left_Right_data_of NPZ file.
- List available datasets with descriptions.
- load: Universal data loading function that downloads and reads data from URLs.
- load_file: Load data from file based on file type.
- quick_setup: Quick setup function to get datasets ready.
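The function summary mentions a helper that computes the SHA256 hash of a file, which is typically used to check that a download is intact. The following is a minimal sketch of that technique using only the standard library. The name `sha256_of_file` and its `chunk_size` parameter are hypothetical and are not taken from canns; the module's own helper may differ.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=8192):
    """Return the hex SHA256 digest of a file (hypothetical helper, not the canns API)."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the returned digest against a published checksum is one way to decide whether a cached file should be downloaded again.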
Module Contents
- canns.data.datasets.download_dataset(dataset_key, force=False)
Download a specific dataset.
- canns.data.datasets.download_file_with_progress(url, filepath, chunk_size=8192)
Download a file with progress bar.
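To illustrate what the `chunk_size` argument controls, here is a rough sketch of a chunked copy that reports progress after each chunk. It works on any pair of binary file-like objects so it can run without network access. The function `copy_with_progress` and its `total` and `report` parameters are assumptions made for this example and are not part of canns.

```python
import io


def copy_with_progress(source, dest, total=None, chunk_size=8192, report=print):
    """Copy source to dest in chunks and report progress (hypothetical helper)."""
    written = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break  # an empty read means the stream is exhausted
        dest.write(chunk)
        written += len(chunk)
        if total:
            report(f"{written}/{total} bytes ({100 * written // total}%)")
        else:
            report(f"{written} bytes")
    return written


# Usage with in-memory streams standing in for a response body and a file.
payload = b"x" * 20000
src, dst = io.BytesIO(payload), io.BytesIO()
copy_with_progress(src, dst, total=len(payload))
```

A smaller `chunk_size` gives finer-grained progress updates at the cost of more read and write calls.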
- canns.data.datasets.get_dataset_path(dataset_key, auto_setup=True)
Get path to a dataset, downloading/setting up if necessary.
- canns.data.datasets.get_huggingface_upload_guide()
Get guide for uploading datasets to Hugging Face.
- Returns:
Upload guide text.
- Return type:
str
- canns.data.datasets.get_left_right_data_session(session_id, auto_download=True, force=False)
Download and return files for a Left_Right_data_of session.
- canns.data.datasets.get_left_right_npz(session_id, filename, auto_download=True, force=False)
Download and return a specific Left_Right_data_of NPZ file.
- Parameters:
- Returns:
Path to the requested file if available, None otherwise.
- Return type:
Path or None
- canns.data.datasets.load(url, cache_dir=None, force_download=False, file_type=None)
Universal data loading function that downloads and reads data from URLs.
- Parameters:
url (str) – URL to download data from.
cache_dir (str or Path, optional) – Directory to cache downloaded files. If None, a temporary directory is used.
force_download (bool) – Force re-download even if file exists in cache.
file_type (str, optional) – Force specific file type (‘text’, ‘numpy’, ‘json’, ‘pickle’, ‘hdf5’). If None, auto-detect from file extension.
- Returns:
Loaded data.
- Return type:
Any
Examples
>>> # Load numpy data
>>> data = load('https://example.com/data.npz')
>>>
>>> # Load text data with custom cache
>>> data = load('https://example.com/data.txt', cache_dir='./cache')
>>>
>>> # Force specific file type
>>> data = load('https://example.com/data.bin', file_type='numpy')
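The interplay of `cache_dir` and `force_download` can be pictured as a two-step decision: work out where the file would live locally, then download only if it is missing or a re-download is forced. The sketch below shows that idea under stated assumptions. Both helper names are hypothetical, and deriving the cached file name from the last URL path segment is an assumption; the source does not say how canns names cached files.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def cached_path(url, cache_dir=None):
    """Map a URL to a local cache path (hypothetical naming scheme)."""
    # Assumption: the cached file is named after the last URL path segment.
    name = Path(urlparse(url).path).name or "download"
    base = Path(cache_dir) if cache_dir is not None else Path(tempfile.gettempdir())
    return base / name


def needs_download(path, force_download=False):
    """Download when forced, or when nothing is cached at `path` yet."""
    return force_download or not Path(path).exists()
```

With this split, a loader would call `needs_download(cached_path(url, cache_dir), force_download)` before fetching, then read the file at the cached path.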
- canns.data.datasets.load_file(filepath, file_type=None)
Load data from file based on file type.
- Parameters:
filepath (Path) – Path to the data file.
file_type (str, optional) – Force specific file type. If None, auto-detect from extension.
- Returns:
Loaded data.
- Return type:
Any
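Loading "based on file type" amounts to choosing a reader from the file extension unless the caller overrides it with `file_type`. The following sketch shows that dispatch for three of the types named above (text, json, pickle) using only the standard library. The extension table and the names `detect_type` and `load_local` are illustrative assumptions, not the mapping canns actually uses.

```python
import json
import pickle
from pathlib import Path

# Assumed extension table for illustration only.
EXTENSION_TO_TYPE = {".txt": "text", ".json": "json", ".pkl": "pickle"}


def detect_type(filepath):
    """Return a type name for a known extension, or None if unrecognised."""
    return EXTENSION_TO_TYPE.get(Path(filepath).suffix.lower())


def load_local(filepath, file_type=None):
    """Load a file using an explicit file_type, else the detected one."""
    filepath = Path(filepath)
    kind = file_type or detect_type(filepath)
    if kind == "text":
        return filepath.read_text(encoding="utf-8")
    if kind == "json":
        with filepath.open("r", encoding="utf-8") as fh:
            return json.load(fh)
    if kind == "pickle":
        # Only unpickle files from sources you trust.
        with filepath.open("rb") as fh:
            return pickle.load(fh)
    raise ValueError(f"Unsupported or undetected file type for {filepath}")
```

Passing `file_type` explicitly is useful when the extension is missing or misleading, as with the `data.bin` case in the `load` examples.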
- canns.data.datasets.quick_setup()
Quick setup function to get datasets ready.
- Returns:
True if successful, False otherwise.
- Return type:
bool