carate.evaluation package
Submodules
carate.evaluation.base module
This is the heart of the application: it trains and tests an algorithm on a given dataset. The idea is to parametrize as much as possible.
- author:
Julian M. Kleber
- class carate.evaluation.base.Evaluation(dataset_name: str, dataset_save_path: str, result_save_dir: str, model_net: Model, optimizer: Optimizer, data_set: DatasetObject, device: device, logger: Any, resume: bool, test_ratio: int, num_epoch: int = 150, num_cv: int = 5, num_classes: int = 2, out_dir: str = './out', batch_size: int = 64, shuffle: bool = True, model_save_freq: int = 100, override: bool = True, normalize: bool = False, custom_size: int | None = None)[source]
Bases:
DefaultObject
The Evaluation class evaluates a given model written in PyTorch or PyTorch Geometric.
- cv(num_cv: int, num_epoch: int, num_classes: int, dataset_name: str, dataset_save_path: str, logger: Any, test_ratio: int, resume: bool, data_set: DatasetObject, shuffle: bool, batch_size: int, model_net: Model, optimizer: Optimizer, device: device, result_save_dir: str, model_save_freq: int, override: bool = True, normalize: bool = False, custom_size: int | None = None) Dict[str, Any][source]
This function is the core of the evaluation. The results are saved to disk during the run and returned as JSON-compatible data at the end of the run.
- Parameters:
self – Used to represent the instance of the class.
num_cv:int – Used to specify the number of folds in the (stratified) k-fold cross-validation.
num_epoch:int – Used to specify the number of epochs to train for.
num_classes:int – Used to determine the number of classes in the dataset.
dataset_name:str – Used to specify the name of the dataset to be used.
data_set:DatasetObject – Used to load the data.
- Returns:
A list of dictionaries (annotated as Dict[str, Any] in the signature).
- Doc-author:
Trelent
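As a rough illustration of the loop that cv describes, the following sketch splits sample indices into num_cv folds, runs a placeholder train and test step for num_epoch epochs per fold, and collects the per-fold metrics in a dictionary. It assumes nothing about carate internals: k_fold_indices, run_cv, and the train_step / test_step callables are hypothetical names introduced here, and the actual method may split the data differently (for example by using test_ratio).

```python
import random
from typing import Any, Callable, Dict, List, Tuple


def k_fold_indices(
    num_samples: int, num_cv: int, shuffle: bool = True, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Split range(num_samples) into num_cv (train, test) index pairs."""
    indices = list(range(num_samples))
    if shuffle:
        random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        num_samples // num_cv + (1 if i < num_samples % num_cv else 0)
        for i in range(num_cv)
    ]
    folds, start = [], 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds


def run_cv(
    num_samples: int,
    num_cv: int,
    num_epoch: int,
    train_step: Callable[[List[int], int], float],
    test_step: Callable[[List[int], int], float],
) -> Dict[str, Any]:
    """Run a placeholder train/test loop per fold and collect the metrics."""
    results: Dict[str, Any] = {}
    folds = k_fold_indices(num_samples, num_cv)
    for fold, (train_idx, test_idx) in enumerate(folds):
        history: Dict[str, List[float]] = {"train": [], "test": []}
        for epoch in range(1, num_epoch + 1):
            history["train"].append(train_step(train_idx, epoch))
            history["test"].append(test_step(test_idx, epoch))
        results[f"cv_{fold}"] = history
    return results


if __name__ == "__main__":
    # Dummy metrics stand in for real training and testing.
    out = run_cv(
        10, num_cv=5, num_epoch=3,
        train_step=lambda idx, epoch: len(idx) / 10,
        test_step=lambda idx, epoch: len(idx) / 10,
    )
    print(sorted(out))
```

The dictionary keyed by fold is one possible layout for the returned results; it was chosen here only because it serializes directly to JSON.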
- load_model_checkpoint(checkpoint_path: str, model_net: ~carate.models.base_model.Model, optimizer=<class 'torch.optim.optimizer.Optimizer'>) Model[source]
The load_model_checkpoint function loads a model checkpoint from the specified path and sets it as the model of this evaluation object. The function also returns the loaded model.
- Parameters:
self – Used to refer to the object itself.
checkpoint_path:str – Used to specify the path to the checkpoint file.
model_net:Model – Used to specify the model that is being loaded.
optimizer=torch.optim.Optimizer – Used to load the optimizer state from a checkpoint.
- Returns:
The model.
- Doc-author:
Julian M. Kleber
- name = 'Default evaluation'
- save_model_checkpoint(result_save_dir: str, dataset_name: str, num_cv: int, num_epoch: int, model_net: Type[Module], optimizer: Type[Optimizer], loss: float, override: bool = True) None[source]
The save_model_checkpoint function saves the model to a file. The filename is based on the dataset name, the cross-validation fold, and the epoch number. The file is saved in the result_save_dir directory with the extension .pt (for PyTorch). If this directory does not exist, it is created before the file is saved.
- Parameters:
result_save_dir:str – Used to specify the directory where the model will be saved.
dataset_name:str – Used to save the model with a name that includes the dataset it was trained on.
num_cv:int – Used to specify which cross validation fold the model is being saved for.
num_epoch:int – Used to save the model at a certain epoch.
model_net:Type[torch.nn.Module] – Used to save the model.
- Returns:
None.
- Doc-author:
Julian M. Kleber
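The path handling described for save_model_checkpoint can be sketched as follows. This is a minimal illustration using only the standard library, not the carate implementation: the helper names checkpoint_path and prepare_checkpoint_path, the exact file-name pattern, and the reading of override=False as "keep an existing file" are all assumptions made for this example.

```python
import tempfile
from pathlib import Path


def checkpoint_path(result_save_dir: str, dataset_name: str,
                    num_cv: int, num_epoch: int) -> Path:
    """Build a .pt file path; this naming scheme is an assumption."""
    file_name = f"{dataset_name}_cv-{num_cv}_epoch-{num_epoch}.pt"
    return Path(result_save_dir) / file_name


def prepare_checkpoint_path(result_save_dir: str, dataset_name: str,
                            num_cv: int, num_epoch: int,
                            override: bool = True) -> Path:
    """Create the target directory if needed and return the path to write to."""
    path = checkpoint_path(result_save_dir, dataset_name, num_cv, num_epoch)
    # Create result_save_dir (and any missing parents) before saving.
    path.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: override=False means an existing checkpoint must be kept.
    if path.exists() and not override:
        raise FileExistsError(f"{path} exists and override is False")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = prepare_checkpoint_path(f"{tmp}/results", "my_dataset", 0, 100)
        print(target.name)
```

With PyTorch, the returned path would typically be passed to torch.save together with a dictionary holding model.state_dict() and optimizer.state_dict(); whether carate stores exactly these entries is not stated in this reference.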
- save_result(result_save_dir: str, dataset_name: str, num_cv: int, num_epoch: int, data: dict) None[source]
The save_result function saves the results of a cross-validation run to a .json file. The goal is to provide a JSON interface to the cross-validation results for later analysis of the training runs.
- Parameters:
self – Used to represent the instance of the class.
result_save_dir:str – Used to specify the directory where the results will be saved.
dataset_name:str – Used to identify the dataset.
num_cv:int – Used to specify the number of cross validation runs.
num_epoch:int – Epoch in which the run was saved.
data:dict – Used to store the results of each cross validation run.
- Returns:
None.
- Doc-author:
Julian M. Kleber
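A JSON interface of this kind can be sketched in a few lines. The example below is only an illustration under stated assumptions: save_result_json and load_result_json are hypothetical helpers, and the file-name pattern is invented for this sketch rather than taken from carate.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_result_json(result_save_dir: str, dataset_name: str,
                     num_cv: int, num_epoch: int,
                     data: Dict[str, Any]) -> Path:
    """Write the results of one run to a .json file and return its path."""
    out_dir = Path(result_save_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Assumed naming scheme: dataset name, fold, and epoch in the file name.
    path = out_dir / f"{dataset_name}_cv-{num_cv}_epoch-{num_epoch}.json"
    with path.open("w", encoding="utf-8") as handle:
        json.dump(data, handle, indent=2)
    return path


def load_result_json(path: Path) -> Dict[str, Any]:
    """Read a saved result file back for later analysis."""
    with path.open("r", encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        saved = save_result_json(tmp, "my_dataset", 0, 10, {"acc": [0.5, 0.75]})
        print(load_result_json(saved))
```

Only JSON-serializable values (numbers, strings, lists, dictionaries) survive this round trip, so tensors would need to be converted to plain Python values first.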
- save_whole_checkpoint(result_save_dir: str, dataset_name: str, num_cv: int, num_epoch: int, model_net: Type[Module], data: dict, optimizer: Type[Optimizer], loss: float, override: bool = True) None[source]
The save_whole_checkpoint function saves the model checkpoint and results for a given epoch. It is called by the train function to save checkpoints at regular intervals during training, as well as after each cross-validation fold has been trained. The saved files are used to resume training if it is interrupted, or to evaluate the performance of different models on test data without having to retrain them from scratch.
- Parameters:
self – Used to represent the instance of the class.
result_save_dir:str – Used to specify the directory where the checkpoint will be saved.
dataset_name:str – Used to name the dataset.
num_cv:int – Used to specify the cross-validation number.
num_epoch:int – Used to specify the number of epochs that have been completed.
model_net:Type[torch.nn.Module] – Used to save the model.
data:dict – Used to save the data, which is a dictionary containing the training and validation data.
optimizer:Type[torch.optim.Optimizer] – Used to save the optimizer state.
loss:float – Used to save the loss value.
override:bool=True – Used to override the previous checkpoint.
- Returns:
None.
- Doc-author:
Julian M. Kleber
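The checkpointing schedule described above (save at regular intervals, save again once a fold is finished, resume from the last saved epoch) can be sketched as below. The helper names should_save and epochs_to_run are hypothetical, and the exact condition carate uses is not documented here; the values 150 and 100 are the num_epoch and model_save_freq defaults from the class signature.

```python
def should_save(epoch: int, num_epoch: int, model_save_freq: int) -> bool:
    """Return True at every model_save_freq-th epoch and at the final epoch."""
    return epoch % model_save_freq == 0 or epoch == num_epoch


def epochs_to_run(num_epoch: int, last_saved_epoch: int = 0) -> range:
    """Epochs still to be trained when resuming after last_saved_epoch."""
    return range(last_saved_epoch + 1, num_epoch + 1)


if __name__ == "__main__":
    num_epoch, model_save_freq = 150, 100
    saved_at = [
        epoch for epoch in epochs_to_run(num_epoch)
        if should_save(epoch, num_epoch, model_save_freq)
    ]
    print(saved_at)
    # Resuming after an interruption continues from the epoch after the
    # most recent checkpoint instead of starting again at epoch 1.
    print(len(epochs_to_run(num_epoch, last_saved_epoch=100)))
```

Counting epochs from 1 keeps the modulo test aligned with "every model_save_freq epochs"; a zero-based loop would need epoch + 1 in the condition.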
- test(test_loader: DataLoader, model_net: Model, device: device, **kwargs: Any) Any[source]
The test function is used to test the model on a dataset. It returns the accuracy of the model on that dataset, calculated as the average of the atomic accuracy over the batches in the dataset.
- Parameters:
test_loader – Used to pass the test data loader.
epoch – Used to keep track of the current epoch.
model_net – Used to pass the model to the test function.
device – Used to tell torch which device to use.
test=False – Used to distinguish between training and testing.
- Returns:
The accuracy of the model on the test data.
- Doc-author:
Julian M. Kleber
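To make the averaging concrete, the sketch below computes an accuracy per batch and then takes the mean over batches. It reads "atomic accuracy" as the accuracy of a single batch, which is an assumption, and it uses plain Python lists of labels in place of a DataLoader and model; batch_accuracy and mean_batch_accuracy are hypothetical names.

```python
from typing import Iterable, Sequence, Tuple

# Each batch is a pair (predicted labels, true labels).
Batch = Tuple[Sequence[int], Sequence[int]]


def batch_accuracy(predicted: Sequence[int], target: Sequence[int]) -> float:
    """Fraction of correct predictions within one batch."""
    correct = sum(int(p == t) for p, t in zip(predicted, target))
    return correct / len(target)


def mean_batch_accuracy(batches: Iterable[Batch]) -> float:
    """Average of the per-batch accuracies over all batches."""
    accuracies = [batch_accuracy(p, t) for p, t in batches]
    return sum(accuracies) / len(accuracies)


if __name__ == "__main__":
    batches = [
        ([1, 1, 0, 0], [1, 1, 1, 1]),  # 2 of 4 correct
        ([1], [1]),                    # 1 of 1 correct
    ]
    print(mean_batch_accuracy(batches))
```

Averaging per batch weights every batch equally, so the result can differ from the pooled accuracy when the last batch is smaller than the others: in the example the pooled value would be 3 correct out of 5 samples, while the batch average is the mean of 0.5 and 1.0.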
- train(epoch: int, model_net: Model, device: device, train_loader: Type[DataLoader], optimizer: Optimizer, num_classes: int)[source]
The train function is used to train the model. The function takes in a number of epochs and a model, and returns the accuracy on the training set.
- Parameters:
epoch – Used to determine when to stop training.
model_net – Used to pass the model to the function.
device – Used to tell the model which device to use.
train_loader – Used to load the training data.
test_loader – Used to evaluate the model on the test data.
optimizer – Used to specify the optimizer that will be used in training.
num_classes=2 – Used to specify the number of classes in the data.
shrikage=51 – Used to make sure that the model is trained for at least 51 epochs.
- Returns:
The accuracy of the model on the training set.
- Doc-author:
Trelent
carate.evaluation.classification module
Evaluation object for classification
- class carate.evaluation.classification.ClassificationEvaluation(dataset_name: str, dataset_save_path: str, result_save_dir: str, model_net: Model, optimizer: Optimizer, data_set: DatasetObject, device: device, logger: Any, resume: bool, test_ratio: int, num_epoch: int = 150, num_cv: int = 5, num_classes: int = 2, out_dir: str = './out', batch_size: int = 64, shuffle: bool = True, model_save_freq: int = 100, override: bool = True, normalize: bool = False, custom_size: int | None = None)[source]
Bases:
Evaluation
carate.evaluation.classification_whole_dataset module
carate.evaluation.regression module
Evaluation object for regression
- class carate.evaluation.regression.RegressionEvaluation(dataset_name: str, dataset_save_path: str, result_save_dir: str, model_net: Model, optimizer: Optimizer, data_set: DatasetObject, device: device, logger: Any, resume: bool, test_ratio: int, num_epoch: int = 150, num_cv: int = 5, num_classes: int = 2, out_dir: str = './out', batch_size: int = 64, shuffle: bool = True, model_save_freq: int = 100, override: bool = True, normalize: bool = False, custom_size: int | None = None)[source]
Bases:
Evaluation
Module that implements the regression evaluation.
- cv(num_cv: int, num_epoch: int, num_classes: int, dataset_name: str, dataset_save_path: str, test_ratio: int, resume: bool, data_set: DatasetObject, logger: Any, shuffle: bool, batch_size: int, model_net: Model, optimizer: Optimizer, device: device, result_save_dir: str, model_save_freq: int, override: bool = True, normalize: bool = True, custom_size: int | None = None) Dict[str, Any][source]
This function is the core of the evaluation. The results are saved to disk during the run and returned as JSON-compatible data at the end of the run.
- Parameters:
self – Used to represent the instance of the class.
num_cv:int – Used to specify the number of folds in the (stratified) k-fold cross-validation.
num_epoch:int – Used to specify the number of epochs to train for.
num_classes:int – Used to determine the number of classes in the dataset.
dataset_name:str – Used to specify the name of the dataset to be used.
data_set:DatasetObject – Used to load the data.
- Returns:
A list of dictionaries (annotated as Dict[str, Any] in the signature).
- Doc-author:
Trelent
- test(test_loader: Type[DataLoader], epoch: int, model_net: Model, device: device, **kwargs: Any) Tuple[float, float][source]
The test function is used to test the model on a dataset. It returns the accuracy of the model on that dataset, calculated as the average of the atomic accuracy over the batches in the dataset.
- Parameters:
test_loader – Used to pass the test data loader.
epoch – Used to keep track of the current epoch.
model_net – Used to pass the model to the test function.
device – Used to tell torch which device to use.
test=False – Used to distinguish between training and testing.
- Returns:
The accuracy of the model on the test data.
- Doc-author:
Julian M. Kleber
- train(epoch: int, model_net: Model, device: device, train_loader: DataLoader, optimizer: Optimizer, num_classes: int, **kwargs: Any) float[source]
The train function is used to train the model. The function takes in a number of epochs and a model, and returns the accuracy on the training set.
- Parameters:
epoch – Used to determine when to stop training.
model_net – Used to pass the model to the function.
device – Used to tell the model which device to use.
train_loader – Used to load the training data.
test_loader – Used to evaluate the model on the test data.
optimizer – Used to specify the optimizer that will be used in training.
num_classes=2 – Used to specify the number of classes in the data.
shrikage=51 – Used to make sure that the model is trained for at least 51 epochs.
- Returns:
The accuracy of the model on the training set.
- Doc-author:
Trelent