carate.evaluation package

Submodules

carate.evaluation.base module

This is the heart of the application: it trains and tests an algorithm on a given dataset. The idea is to parametrize as much as possible.

author:

Julian M. Kleber

class carate.evaluation.base.Evaluation(dataset_name: str, dataset_save_path: str, result_save_dir: str, model_net: Model, optimizer: Optimizer, data_set: DatasetObject, device: device, logger: Any, resume: bool, test_ratio: int, num_epoch: int = 150, num_cv: int = 5, num_classes: int = 2, out_dir: str = './out', batch_size: int = 64, shuffle: bool = True, model_save_freq: int = 100, override: bool = True, normalize: bool = False, custom_size: int | None = None)[source]

Bases: DefaultObject

The Evaluation class evaluates a given model written in PyTorch or PyTorch Geometric.

cv(num_cv: int, num_epoch: int, num_classes: int, dataset_name: str, dataset_save_path: str, logger: Any, test_ratio: int, resume: bool, data_set: DatasetObject, shuffle: bool, batch_size: int, model_net: Model, optimizer: Optimizer, device: device, result_save_dir: str, model_save_freq: int, override: bool = True, normalize: bool = False, custom_size: int | None = None) Dict[str, Any][source]

The function is the core of the evaluation. The results are saved on disk during the run and returned as json at the end of the run.

Parameters:
  • self – The instance of the class.

  • num_cv:int – The number of cross-validation folds in a (stratified) k-fold split.

  • num_epoch:int – The number of epochs to train for.

  • num_classes:int – The number of classes in the dataset.

  • dataset_name:str – The name of the dataset to be used.

  • data_set:DatasetObject – Used to load the data.

Returns:

A list of dictionaries.

Doc-author:

Trelent
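
The flow described for cv (train fold by fold, write results to disk during the run, return everything at the end) can be sketched with the standard library alone. This is an illustration under stated assumptions, not carate's implementation: run_cv, dummy_score, the cv_<fold>.json file names, and the layout of the returned dictionary are all hypothetical, and the "model" is a placeholder scoring function.

```python
import json
import random
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict, List


def run_cv(
    num_samples: int,
    num_cv: int,
    num_epoch: int,
    result_save_dir: str,
    train_and_score: Callable[[List[int], List[int], int], float],
    shuffle: bool = True,
    seed: int = 0,
) -> Dict[str, Any]:
    """Hypothetical k-fold loop: save results per fold, return them all."""
    indices = list(range(num_samples))
    if shuffle:
        random.Random(seed).shuffle(indices)
    # Split the indices into num_cv folds of nearly equal size.
    folds = [indices[i::num_cv] for i in range(num_cv)]
    out_dir = Path(result_save_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    results: Dict[str, Any] = {}
    for fold, test_idx in enumerate(folds):
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        scores = [
            train_and_score(train_idx, test_idx, epoch)
            for epoch in range(1, num_epoch + 1)
        ]
        results[f"cv_{fold}"] = {"test_score": scores}
        # Write each fold to disk so an interrupted run keeps its results.
        fold_file = out_dir / f"cv_{fold}.json"
        fold_file.write_text(json.dumps(results[f"cv_{fold}"]))
    return results


def dummy_score(train_idx: List[int], test_idx: List[int], epoch: int) -> float:
    # Placeholder for real training: the "score" simply grows with the epoch.
    return round(1.0 - 1.0 / (epoch + 1), 3)


with tempfile.TemporaryDirectory() as tmp:
    cv_results = run_cv(10, num_cv=5, num_epoch=3,
                        result_save_dir=tmp, train_and_score=dummy_score)
    saved_files = sorted(p.name for p in Path(tmp).iterdir())
```

Each fold is held out exactly once, and the per-fold JSON files mirror what the function returns.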

load_model_checkpoint(checkpoint_path: str, model_net: ~carate.models.base_model.Model, optimizer=<class 'torch.optim.optimizer.Optimizer'>) Model[source]

The load_model_checkpoint function loads a model checkpoint from the specified path and sets it as the model of this evaluation object. The function also returns the loaded model.

Parameters:
  • self – The object itself.

  • checkpoint_path:str – The path to the checkpoint file.

  • model_net:Model – The model that is being loaded.

  • optimizer=torch.optim.Optimizer – Used to load the optimizer state from a checkpoint.

Returns:

The model.

Doc-author:

Julian M. Kleber

name = 'Default evaluation'

save_model_checkpoint(result_save_dir: str, dataset_name: str, num_cv: int, num_epoch: int, model_net: Type[Module], optimizer: Type[Optimizer], loss: float, override: bool = True) None[source]

The save_model_checkpoint function saves the model to a file. The filename is based on the dataset name, the cross-validation fold, and the epoch number. The file is saved in the result_save_dir directory with the extension .pt (for PyTorch). If this directory does not exist, it is created before the file is saved.

Parameters:
  • result_save_dir:str – Used to specify the directory where the model will be saved.

  • dataset_name:str – Used to save the model with a name that includes the dataset it was trained on.

  • num_cv:int – Used to specify which cross validation fold the model is being saved for.

  • num_epoch:int – Used to save the model at a certain epoch.

  • model_net:Type[torch.nn.Module] – Used to save the model.

Returns:

None.

Doc-author:

Julian M. Kleber
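
The naming and directory handling described for save_model_checkpoint can be sketched as follows. This is a minimal sketch, not the library's code: the helper names, the "checkpoints" subdirectory, the exact filename pattern, and the reading of override as "overwrite an existing file" are assumptions made for the example.

```python
import tempfile
from pathlib import Path


def checkpoint_path(result_save_dir: str, dataset_name: str,
                    num_cv: int, num_epoch: int) -> Path:
    """Build a .pt path from dataset name, fold, and epoch (assumed layout)."""
    directory = Path(result_save_dir) / "checkpoints"
    # Create the directory (and any missing parents) before saving.
    directory.mkdir(parents=True, exist_ok=True)
    return directory / f"{dataset_name}_CV-{num_cv}_Epoch-{num_epoch}.pt"


def should_write(path: Path, override: bool) -> bool:
    """Assumed semantics: override=False leaves an existing file untouched."""
    return override or not path.exists()


with tempfile.TemporaryDirectory() as tmp:
    path = checkpoint_path(tmp, "PROTEINS", num_cv=2, num_epoch=100)
    file_name = path.name
    dir_created = path.parent.is_dir()
    path.write_bytes(b"")  # stand-in for the serialized model
    overwrite_allowed = should_write(path, override=True)
    overwrite_blocked = not should_write(path, override=False)
```

In a real run the placeholder write would be replaced by the framework's own serialization call.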

save_result(result_save_dir: str, dataset_name: str, num_cv: int, num_epoch: int, data: dict) None[source]

The save_result function saves the results of a cross-validation run to a .json file. The goal is to provide a json interface of cv results for later analysis of the training runs.

Parameters:
  • self – Used to represent the instance of the class.

  • result_save_dir:str – Used to specify the directory where the results will be saved.

  • dataset_name:str – Used to identify the dataset.

  • num_cv:int – Used to specify the number of cross validation runs.

  • num_epoch:int – The epoch in which the run was saved.

  • data:dict – Used to store the results of each cross validation run.

Returns:

None.

Doc-author:

Julian M. Kleber
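
A JSON interface of this kind could look like the sketch below. It is an illustration only: write_cv_result, the "data" subdirectory, the filename pattern, and the keys in the example dictionary are hypothetical and not taken from carate.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def write_cv_result(result_save_dir: str, dataset_name: str, num_cv: int,
                    num_epoch: int, data: Dict[str, Any]) -> Path:
    """Write the results of one cross-validation run to a .json file."""
    out_dir = Path(result_save_dir) / "data"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / f"{dataset_name}_CV-{num_cv}_Epoch-{num_epoch}.json"
    with out_file.open("w", encoding="utf-8") as handle:
        json.dump(data, handle, indent=2)
    return out_file


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical metrics for two epochs of fold 0.
    data = {"Acc": [0.61, 0.68], "Loss": [0.9, 0.7]}
    written = write_cv_result(tmp, "PROTEINS", num_cv=0, num_epoch=2, data=data)
    # Later analysis only needs the standard json module to read it back.
    reloaded = json.loads(written.read_text(encoding="utf-8"))
```

Storing plain lists and numbers keeps the files readable without PyTorch installed.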

save_whole_checkpoint(result_save_dir: str, dataset_name: str, num_cv: int, num_epoch: int, model_net: Type[Module], data: dict, optimizer: Type[Optimizer], loss: float, override: bool = True) None[source]

The save_whole_checkpoint function saves the model checkpoint and results for a given epoch. It is called by the train function in order to save checkpoints at regular intervals during training, as well as after each cross-validation fold has been trained on. The saved files are used to resume training if it is interrupted, or to evaluate the performance of different models on test data without having to retrain them from scratch.

Parameters:
  • self – The instance of the class.

  • result_save_dir:str – The directory where the checkpoint will be saved.

  • dataset_name:str – The name of the dataset.

  • num_cv:int – The cross-validation number.

  • num_epoch:int – The number of epochs that have been completed.

  • model_net:Type[torch.nn.Module] – The model to save.

  • data:dict – The data to save, which is a dictionary containing the training and validation data.

  • optimizer:Type[torch.optim.Optimizer] – Used to save the optimizer state.

  • loss:float – The loss value to save.

  • override:bool=True – Used to override the previous checkpoint.

Returns:

None.

Doc-author:

Julian M. Kleber
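
The checkpointing schedule described above (regular intervals plus the end of each fold) and the resume step can be sketched as below. This is an assumption-laden illustration: checkpoint_epochs and resume_point are hypothetical helpers, and the rule "every model_save_freq epochs and at the final epoch" is one plausible reading of the description, not confirmed behavior.

```python
from typing import List, Tuple


def checkpoint_epochs(num_epoch: int, model_save_freq: int) -> List[int]:
    """Epochs at which a checkpoint would be written for one fold."""
    epochs = [e for e in range(1, num_epoch + 1) if e % model_save_freq == 0]
    # Always checkpoint once the fold has finished training.
    if num_epoch not in epochs:
        epochs.append(num_epoch)
    return epochs


def resume_point(saved: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Return the latest (fold, epoch) pair among the checkpoints on disk."""
    # Tuples compare element by element, so the highest fold wins,
    # then the highest epoch within that fold.
    return max(saved)


schedule = checkpoint_epochs(num_epoch=150, model_save_freq=100)
latest = resume_point([(0, 100), (0, 150), (1, 100)])
```

With the class defaults shown in the signature (num_epoch=150, model_save_freq=100), this rule yields two checkpoints per fold.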

test(test_loader: DataLoader, model_net: Model, device: device, **kwargs: Any) Any[source]

The test function is used to test the model on a dataset. It returns the accuracy of the model on that dataset, calculated as the average of the atomic accuracy for each batch in the dataset.

Parameters:
  • test_loader – Used to pass the test data loader.

  • epoch – Used to keep track of the current epoch.

  • model_net – Used to pass the model to the test function.

  • device – Used to tell torch which device to use.

  • test=False – Used to distinguish between training and testing.

Returns:

The accuracy of the model on the test data.

Doc-author:

Julian M. Kleber
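
Averaging a per-batch accuracy over all batches can be illustrated without PyTorch. The sketch below assumes that "atomic accuracy" means the fraction of correct predictions within one batch; the function names and the toy labels are hypothetical.

```python
from typing import List, Sequence, Tuple

# One batch: (predicted labels, true labels).
Batch = Tuple[Sequence[int], Sequence[int]]


def batch_accuracy(predicted: Sequence[int], target: Sequence[int]) -> float:
    """Fraction of correct predictions within a single batch."""
    correct = sum(int(p == t) for p, t in zip(predicted, target))
    return correct / len(target)


def mean_batch_accuracy(batches: List[Batch]) -> float:
    """Average of the per-batch accuracies over all batches."""
    return sum(batch_accuracy(p, t) for p, t in batches) / len(batches)


batches = [
    ([1, 0, 1, 1], [1, 0, 0, 1]),  # 3 of 4 correct
    ([0, 0], [0, 1]),              # 1 of 2 correct
]
accuracy = mean_batch_accuracy(batches)

# Pooling all samples weights large batches more heavily.
total_correct = sum(int(p == t) for ps, ts in batches for p, t in zip(ps, ts))
pooled_accuracy = total_correct / sum(len(ts) for _, ts in batches)
```

The two numbers differ whenever batch sizes differ, which matters when the last batch of a loader is smaller than the rest.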

train(epoch: int, model_net: Model, device: device, train_loader: Type[DataLoader], optimizer: Optimizer, num_classes: int)[source]

The train function is used to train the model. The function takes in a number of epochs and a model, and returns the accuracy on the training set.

Parameters:
  • epoch – Used to determine when to stop training.

  • model_net – The model to train.

  • device – Used to tell the model which device to use.

  • train_loader – Used to load the training data.

  • test_loader – Used to evaluate the model on the test data.

  • optimizer – The optimizer that will be used in training.

  • num_classes=2 – The number of classes in the data.

  • shrikage=51 – Used to make sure that the model is trained for at least 51 epochs.

Returns:

The accuracy of the model on the training set.

Doc-author:

Trelent

carate.evaluation.classification module

Evaluation object for classification

class carate.evaluation.classification.ClassificationEvaluation(dataset_name: str, dataset_save_path: str, result_save_dir: str, model_net: Model, optimizer: Optimizer, data_set: DatasetObject, device: device, logger: Any, resume: bool, test_ratio: int, num_epoch: int = 150, num_cv: int = 5, num_classes: int = 2, out_dir: str = './out', batch_size: int = 64, shuffle: bool = True, model_save_freq: int = 100, override: bool = True, normalize: bool = False, custom_size: int | None = None)[source]

Bases: Evaluation

carate.evaluation.classification_whole_dataset module

carate.evaluation.regression module

Evaluation object for regression

class carate.evaluation.regression.RegressionEvaluation(dataset_name: str, dataset_save_path: str, result_save_dir: str, model_net: Model, optimizer: Optimizer, data_set: DatasetObject, device: device, logger: Any, resume: bool, test_ratio: int, num_epoch: int = 150, num_cv: int = 5, num_classes: int = 2, out_dir: str = './out', batch_size: int = 64, shuffle: bool = True, model_save_freq: int = 100, override: bool = True, normalize: bool = False, custom_size: int | None = None)[source]

Bases: Evaluation

Module that implements the Regression evaluation

cv(num_cv: int, num_epoch: int, num_classes: int, dataset_name: str, dataset_save_path: str, test_ratio: int, resume: bool, data_set: DatasetObject, logger: Any, shuffle: bool, batch_size: int, model_net: Model, optimizer: Optimizer, device: device, result_save_dir: str, model_save_freq: int, override: bool = True, normalize: bool = True, custom_size: int | None = None) Dict[str, Any][source]

The function is the core of the evaluation. The results are saved on disk during the run and returned as json at the end of the run.

Parameters:
  • self – The instance of the class.

  • num_cv:int – The number of cross-validation folds in a (stratified) k-fold split.

  • num_epoch:int – The number of epochs to train for.

  • num_classes:int – The number of classes in the dataset.

  • dataset_name:str – The name of the dataset to be used.

  • data_set:DatasetObject – Used to load the data.

Returns:

A list of dictionaries.

Doc-author:

Trelent

test(test_loader: Type[DataLoader], epoch: int, model_net: Model, device: device, **kwargs: Any) Tuple[float, float][source]

The test function is used to test the model on a dataset. It returns the accuracy of the model on that dataset, calculated as the average of the atomic accuracy for each batch in the dataset.

Parameters:
  • test_loader – Used to pass the test data loader.

  • epoch – Used to keep track of the current epoch.

  • model_net – Used to pass the model to the test function.

  • device – Used to tell torch which device to use.

  • test=False – Used to distinguish between training and testing.

Returns:

The accuracy of the model on the test data.

Doc-author:

Julian M. Kleber

train(epoch: int, model_net: Model, device: device, train_loader: DataLoader, optimizer: Optimizer, num_classes: int, **kwargs: Any) float[source]

The train function is used to train the model. The function takes in a number of epochs and a model, and returns the accuracy on the training set.

Parameters:
  • epoch – Used to determine when to stop training.

  • model_net – The model to train.

  • device – Used to tell the model which device to use.

  • train_loader – Used to load the training data.

  • test_loader – Used to evaluate the model on the test data.

  • optimizer – The optimizer that will be used in training.

  • num_classes=2 – The number of classes in the data.

  • shrikage=51 – Used to make sure that the model is trained for at least 51 epochs.

Returns:

The accuracy of the model on the training set.

Doc-author:

Trelent

Module contents