carate.loader package

Submodules

carate.loader.load_data module

Data loading for the standard datasets implemented in the pytorch_geometric library. The dataset loader is implemented as a base class; subclasses provide loaders for standardized benchmarks as well as custom datasets.

author:

Julian M. Kleber

class carate.loader.load_data.DatasetObject(dataset_name: str, dataset_save_path: str, test_ratio: int, batch_size: int, shuffle: bool, custom_size: int | None)

Bases: ABC, DefaultObject, Dataset

Interface for DataLoading objects

abstract classmethod load_data(dataset_name: str, dataset_save_path: str, test_ratio: int, batch_size: int, shuffle: bool) -> None

class carate.loader.load_data.StandardDatasetMoleculeNet(dataset_save_path: str, dataset_name: str, test_ratio: int, batch_size: int, shuffle: bool = True, custom_size: int | None = None)

Bases: StandardPytorchGeometricDataset

Implementation of the Dataset interface focused on the datasets that pytorch_geometric provides from the MoleculeNet collection.

DataSet

alias of MoleculeNet

class carate.loader.load_data.StandardDatasetTUDataset(dataset_save_path: str, dataset_name: str, test_ratio: int, batch_size: int, shuffle: bool = True)

Bases: StandardPytorchGeometricDataset

Class for loading standard datasets from the TU Dataset collection implemented by PyTorch Geometric.

author: Julian M. Kleber

DataSet

alias of TUDataset
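The two subclasses above differ mainly in their DataSet alias, which suggests that a shared base class instantiates whatever collection class the subclass points to. The following is a minimal, library-free sketch of that pattern, not carate's actual code: FakeMoleculeCollection, FakeTUCollection, LoaderInterface, GenericLoader and the two loader subclasses are hypothetical stand-ins, and the constructor arguments root and name are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any


class FakeMoleculeCollection:
    """Hypothetical stand-in for a dataset collection class."""

    def __init__(self, root: str, name: str) -> None:
        self.root = root
        self.name = name


class FakeTUCollection:
    """Second hypothetical stand-in with the same constructor."""

    def __init__(self, root: str, name: str) -> None:
        self.root = root
        self.name = name


class LoaderInterface(ABC):
    """Abstract interface: every loader must implement load_data."""

    @abstractmethod
    def load_data(self, dataset_name: str, dataset_save_path: str) -> Any:
        ...


class GenericLoader(LoaderInterface):
    """Shared logic; subclasses only choose the collection class."""

    DataSet: type  # assumption: overridden as a class attribute in subclasses

    def load_data(self, dataset_name: str, dataset_save_path: str) -> Any:
        # Instantiate whichever collection class the subclass aliased.
        return self.DataSet(root=dataset_save_path, name=dataset_name)


class MoleculeLoader(GenericLoader):
    DataSet = FakeMoleculeCollection


class TULoader(GenericLoader):
    DataSet = FakeTUCollection


loaded = TULoader().load_data("PROTEINS", "./data")
```

With this layout, supporting another collection only requires a new subclass that sets DataSet; the loading logic itself stays in one place.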

class carate.loader.load_data.StandardPytorchGeometricDataset(dataset_name: str, dataset_save_path: str, test_ratio: int, batch_size: int, shuffle: bool, custom_size: int | None)

Bases: DatasetObject

DataSet: Dataset
load_data(dataset_name: str, test_ratio: int, dataset_save_path: str, batch_size: int = 64, shuffle: bool = True, custom_size: int | None = None) -> List[DataLoader | Dataset]

The load_data method loads a standard dataset, splits it into a training set and a test set, and returns a dataloader for each. The test_ratio parameter specifies what percentage of the original dataset is used as the test set. The batch_size parameter specifies how many samples are in each batch.

Parameters:
  • dataset_save_path:str – Path where the dataset is located.

  • dataset_name:str – Name of the dataset to load.

  • test_ratio:int – Ratio used to divide the dataset into a training set and a test set.

  • batch_size:int – Batch size for training.

Returns:

A train_loader and a test_loader.

Doc-author:

Julian M. Kleber
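The split-and-batch behavior described for load_data can be sketched without any dependencies. This is an illustration under stated assumptions, not the library's implementation: it reads test_ratio as a whole-number percentage, as the description above states, and the helpers split_by_percentage and make_batches are hypothetical names that do not exist in carate. The real method returns PyTorch Geometric dataloaders rather than plain lists.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_by_percentage(
    samples: Sequence[T], test_ratio: int, shuffle: bool = True, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Hypothetical helper: return (train, test), with test_ratio percent in test."""
    if not 0 <= test_ratio <= 100:
        raise ValueError("test_ratio must be between 0 and 100")
    items = list(samples)
    if shuffle:
        # Seeded shuffle so the split is reproducible.
        random.Random(seed).shuffle(items)
    n_test = len(items) * test_ratio // 100
    return items[n_test:], items[:n_test]


def make_batches(samples: Sequence[T], batch_size: int) -> List[List[T]]:
    """Hypothetical helper: chunk samples into lists of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [
        list(samples[start : start + batch_size])
        for start in range(0, len(samples), batch_size)
    ]


# 100 samples, 20 percent held out for testing, batches of up to 64 samples.
train, test = split_by_percentage(range(100), test_ratio=20)
train_batches = make_batches(train, batch_size=64)
test_batches = make_batches(test, batch_size=64)
```

Under the percentage assumption, 100 samples with test_ratio=20 give 80 training samples in two batches (64 and 16) and 20 test samples in a single batch.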

make_split(dataset: Dataset, test_ratio: int, batch_size: int, custom_size: int | None = None) -> List[DataLoader | Dataset]

Module contents