Minimal Configuration

This section introduces the configuration files. The configuration format is deliberately rigid: it is meant to make you think carefully about what you are doing before you start a run, because otherwise the run may end up doing nothing at all.

Don’t be like the average human:

Tip

An average human looks without seeing, listens without hearing, touches without feeling, eats without tasting, moves without physical awareness, inhales without awareness of odour or fragrance, and talks without thinking. - Leonardo da Vinci

In our modern times we study without learning, and experiment without doing research. The technology is not useful if you can’t build it yourself.

Configuration example for Classification

Configurations can be stored as a .py file or passed as JSON. For an example of a JSON configuration, please refer to the notebook tutorials.

dataset_name = "MCF-7"
num_classes = 2
num_features = 46
model = "cgc_classification"
evaluation = "classification"
optimizer = "adams"
net_dimension = 364
learning_rate = 0.0005
dataset_save_path = "./"
test_ratio = 20
batch_size = 64
shuffle = True
num_epoch = 500
num_cv = 5
result_save_dir = "./"
data_set = "StandardTUD"
model_save_freq = 50
override = True
device = "cpu"
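
Because a .py configuration is plain Python, its values can be inspected before a run is started. The following is a minimal sketch and not part of CARATE: it uses the standard-library runpy module to execute a config file and collect its top-level variables. The helper name load_py_config and the file name mcf_example.py are hypothetical, and the config is shortened to a few of the keys shown above.

```python
import runpy
import tempfile
from pathlib import Path


def load_py_config(path):
    """Execute a .py config file and return its public top-level variables.

    Hypothetical helper for illustration; CARATE ships its own loader.
    """
    namespace = runpy.run_path(str(path))
    # Drop module internals such as __name__ and __builtins__.
    return {key: value for key, value in namespace.items() if not key.startswith("_")}


# A shortened version of the classification config above.
config_text = """\
dataset_name = "MCF-7"
num_classes = 2
num_features = 46
test_ratio = 20
shuffle = True
"""

with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = Path(tmp_dir) / "mcf_example.py"  # hypothetical file name
    config_path.write_text(config_text)
    config = load_py_config(config_path)

print(config)
```

Printing the loaded values this way is a cheap check that the file parses and that every name is spelled as intended.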

Configuration example for Regression

For an example of a regression, see the configuration file for ALCHEMY:

dataset_save_path = "/media/dev/Data/carate_paper/"
result_save_dir = "/media/dev/Data/carate_paper/ALCHEMY_20"
dataset_name = "alchemy_full"
num_classes = 12
num_features = 6
model = "cgc_regression"
evaluation = "regression"
optimizer = "adams"  # defaults to adams optimizer
net_dimension = 364
learning_rate = 0.0005
test_ratio = 20
batch_size = 64
shuffle = True
num_epoch = 150
num_cv = 5
data_set = "StandardTUD"
model_save_freq = 15
override = True
device = "cpu"
normalize = True

Configuration with JSON

It can be handy to start a run from JSON. An example JSON configuration is shown below. JSON does not allow comments, so note here that the optimizer defaults to "adams".

{
    "dataset_name" : "PROTEINS",
    "num_classes" : 2,
    "num_features" : 3,
    "model" : "cgc_classification",
    "evaluation" : "classification",
    "optimizer" : "adams",
    "net_dimension" : 364,
    "learning_rate" : 0.0005,
    "dataset_save_path" : "./data",
    "test_ratio" : 20,
    "batch_size" : 64,
    "shuffle" : true,
    "num_epoch" : 10,
    "num_cv" : 1,
    "result_save_dir" : "./PROTEINS_20",
    "data_set" : "StandardTUD",
    "model_save_freq" : 30,
    "device": "cpu",
    "override": true
}
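
When a configuration is stored in a .json file, it is parsed strictly: booleans are written in lowercase (true), and comments and trailing commas are rejected. The sketch below uses only the standard-library json module and a shortened config string, and makes no assumption about how CARATE itself reads the file. It shows a valid document being parsed into a Python dict and a Python-style variant failing.

```python
import json

# Shortened config; the keys are taken from the example above.
valid_text = """
{
    "dataset_name": "PROTEINS",
    "num_classes": 2,
    "shuffle": true,
    "override": true
}
"""

parameters = json.loads(valid_text)
print(parameters["shuffle"])  # JSON true is parsed as Python True

# Python-style literals and trailing commas are not valid JSON.
invalid_text = '{"shuffle": True,}'
try:
    json.loads(invalid_text)
    parsed_invalid = True
except json.JSONDecodeError as error:
    parsed_invalid = False
    print(f"Invalid JSON: {error}")
```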

Starting a run from a Jupyter Notebook

I recommend using the JSON mode from a Jupyter notebook. For example, you can run:

from carate.runner.run import RunInitializer

# Set Parameters
parameters = {
    "dataset_name" : "PROTEINS",
    "num_classes" : 2,
    "num_features" : 3,
    "model" : "cgc_classification",
    "evaluation" : "classification",
    "optimizer" : "adams",  # defaults to adams optimizer
    "net_dimension" : 364,
    "learning_rate" : 0.0005,
    "dataset_save_path" : "./data",
    "test_ratio" : 20,
    "batch_size" : 64,
    "shuffle" : True,
    "num_epoch" : 10,
    "num_cv" : 1,
    "result_save_dir" : "./PROTEINS_20",
    "data_set" : "StandardTUD",
    "model_save_freq" : 5,
    "device": "cpu",
    "override": True,
}


# Initialize a Run object

runner = RunInitializer.from_json(json_object=parameters)
runner.run()
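
A parameter dict defined in a notebook is easily lost once the notebook changes. As a minimal sketch, assuming you want to keep the parameters next to the results, the dict can be written to a JSON file with the standard-library json module and read back later. The file name proteins_config.json is hypothetical, the dict is shortened, and no CARATE function is called here.

```python
import json
import tempfile
from pathlib import Path

# Shortened parameter dict; the keys follow the notebook example above.
parameters = {
    "dataset_name": "PROTEINS",
    "num_classes": 2,
    "learning_rate": 0.0005,
    "shuffle": True,
    "result_save_dir": "./PROTEINS_20",
}

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = Path(tmp_dir) / "proteins_config.json"  # hypothetical file name
    # json.dump writes Python True as JSON true.
    with open(json_path, "w") as handle:
        json.dump(parameters, handle, indent=4)
    file_text = json_path.read_text()
    # Reading the file back yields a dict equal to the original.
    with open(json_path) as handle:
        reloaded = json.load(handle)

print(file_text)
```

In a real run you would write to a path of your choice instead of a temporary directory, so that the file survives the session.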

Starting a run from a config file

You can also start a run from a config file:

dataset_name = "MCF-7"
num_classes = 2
num_features = 46
model = "cgc_classification"
evaluation = "classification"
optimizer = "adams"  # defaults to adams optimizer
net_dimension = 364
learning_rate = 0.0005
dataset_save_path = "./data"
test_ratio = 20
batch_size = 64
shuffle = True
num_epoch = 300
num_cv = 5
result_save_dir = "./MCF-7"
data_loader = "StandardTUD"
model_save_freq = 30
override = True
heads = 3
device = "cpu"

if __name__ == "__main__":

    from carate.run import RunInitializer

    config_filepath = "./mcf.py"
    runner = RunInitializer.from_file(config_filepath=config_filepath)
    runner.run()