codes.surrogates.AbstractSurrogate package#

Submodules#

codes.surrogates.AbstractSurrogate.abstract_config module#

class codes.surrogates.AbstractSurrogate.abstract_config.AbstractSurrogateBaseConfig(learning_rate=0.0003, regularization_factor=0.0, optimizer='adamw', momentum=0.0, scheduler='cosine', poly_power=0.9, eta_min=0.1, activation=ReLU(), loss_function=MSELoss(), beta=0.0)#

Bases: object

Base configuration for the AbstractSurrogate model.

This class defines shared attributes and methods for surrogate models.

learning_rate#

Learning rate for the optimizer.

Type:

float

regularization_factor#

Regularization coefficient, applied as weight decay.

Type:

float

optimizer#

Type of optimizer to use. Supported options: adamw, sgd.

Type:

str

momentum#

Momentum factor for the optimizer (used only if optimizer == “sgd”).

Type:

float

scheduler#

Type of learning rate scheduler to use.
  • “schedulefree”: Use the schedulefree optimizer.

  • “cosine”: Use the cosine annealing scheduler.

  • “poly”: Use the polynomial decay scheduler.

Type:

str

poly_power#

Power for polynomial decay scheduler (used only if scheduler == “poly”).

Type:

float

eta_min#

Multiplier applied to the base learning rate to obtain the minimum learning rate of the cosine annealing scheduler (used only if scheduler == “cosine”).

Type:

float

activation#

Activation function used in the model.

Type:

nn.Module

loss_function#

Loss function used for training.

Type:

nn.Module

beta#

Beta value passed to the loss function (used only if loss_function is nn.SmoothL1Loss).

Type:

float

activation: Module = ReLU()#
beta: float = 0.0#
eta_min: float = 0.1#
learning_rate: float = 0.0003#
property loss: Module#

Returns the loss function to be used for training.

If the loss function is nn.SmoothL1Loss, it returns an instance with the specified beta. Otherwise, it returns the loss function as is.

loss_function: Module = MSELoss()#
momentum: float = 0.0#
optimizer: str = 'adamw'#
poly_power: float = 0.9#
regularization_factor: float = 0.0#
scheduler: str = 'cosine'#
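A minimal usage sketch (assuming the base config can be instantiated directly; in practice concrete surrogates typically define their own config subclass):

import torch
from torch import nn

from codes.surrogates.AbstractSurrogate.abstract_config import AbstractSurrogateBaseConfig

config = AbstractSurrogateBaseConfig(
    learning_rate=1e-3,
    optimizer="sgd",
    momentum=0.9,                  # only used because optimizer == "sgd"
    scheduler="poly",
    poly_power=0.9,                # only used because scheduler == "poly"
    loss_function=nn.SmoothL1Loss(),
    beta=0.5,                      # only used because the loss is SmoothL1Loss
)

loss_fn = config.loss              # the loss property returns nn.SmoothL1Loss(beta=0.5)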

codes.surrogates.AbstractSurrogate.abstract_surrogate module#

class codes.surrogates.AbstractSurrogate.abstract_surrogate.AbstractSurrogateModel(device=None, n_quantities=29, n_timesteps=100, n_parameters=0, training_id=None, config=None)#

Bases: ABC, Module

Abstract base class for surrogate models. This class implements the basic structure of a surrogate model and defines the methods that need to be implemented by the subclasses for it to be compatible with the benchmarking framework. For more information, see https://codes-docs.web.app/documentation.html#add_model.

Parameters:
  • device (str, optional) – The device to run the model on. Defaults to None.

  • n_quantities (int, optional) – The number of quantities. Defaults to 29.

  • n_timesteps (int, optional) – The number of timesteps. Defaults to 100.

  • n_parameters (int, optional) – The number of parameters. Defaults to 0.

  • training_id (str, optional) – The training identifier. Defaults to None.

  • config (dict, optional) – The configuration dictionary. Defaults to None.

train_loss#

The training loss.

Type:

float

test_loss#

The test loss.

Type:

float

MAE#

The mean absolute error.

Type:

float

normalisation#

The normalisation parameters.

Type:

dict

train_duration#

The training duration.

Type:

float

device#

The device to run the model on.

Type:

str

n_quantities#

The number of quantities.

Type:

int

n_timesteps#

The number of timesteps.

Type:

int

L1#

The L1 loss function.

Type:

nn.L1Loss

config#

The configuration dictionary.

Type:

dict

forward(inputs: Any) -> tuple[Tensor, Tensor]: Forward pass of the model.

prepare_data(dataset_train: np.ndarray, dataset_test: np.ndarray | None, dataset_val: np.ndarray | None, timesteps: np.ndarray, batch_size: int, shuffle: bool) -> tuple[DataLoader, DataLoader, DataLoader]: Gets the data loaders for training, testing, and validation.

fit(train_loader: DataLoader, test_loader: DataLoader, epochs: int | None, position: int, description: str) -> None: Trains the model on the training data. Sets the train_loss and test_loss attributes.

predict(data_loader: DataLoader) -> tuple[Tensor, Tensor]: Evaluates the model on the given data loader.

save(model_name: str, subfolder: str, training_id: str, data_info: dict) -> None: Saves the model to disk.

load(training_id: str, surr_name: str, model_identifier: str) -> None: Loads a trained surrogate model.

setup_progress_bar(epochs: int, position: int, description: str) -> tqdm: Helper function to set up a progress bar for training.

denormalize(data: Tensor) -> Tensor: Denormalizes the data back to the original scale.
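A minimal, hypothetical subclass sketch (the class MySurrogate, its single linear layer, and the identity-target data pipeline are illustrative only, not part of the package):

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

from codes.surrogates.AbstractSurrogate.abstract_surrogate import AbstractSurrogateModel


class MySurrogate(AbstractSurrogateModel):
    """Toy surrogate: one linear layer mapping quantities to quantities."""

    def __init__(self, device=None, n_quantities=29, n_timesteps=100,
                 n_parameters=0, training_id=None, config=None):
        super().__init__(device=device, n_quantities=n_quantities,
                         n_timesteps=n_timesteps, n_parameters=n_parameters,
                         training_id=training_id, config=config)
        self.net = nn.Linear(n_quantities, n_quantities)

    def forward(self, inputs):
        # Unpack a batch as produced by prepare_data below and return
        # (predictions, targets), as the framework expects.
        x, targets = inputs
        return self.net(x), targets

    def prepare_data(self, dataset_train, dataset_test, dataset_val,
                     timesteps, batch_size, shuffle, dummy_timesteps=True):
        def make_loader(data, do_shuffle):
            if data is None:
                return None
            t = torch.as_tensor(data, dtype=torch.float32)
            return DataLoader(TensorDataset(t, t), batch_size=batch_size,
                              shuffle=do_shuffle)

        return (make_loader(dataset_train, shuffle),
                make_loader(dataset_test, False),
                make_loader(dataset_val, False))

    def fit(self, train_loader, test_loader, epochs, position, description,
            multi_objective):
        # Device transfer and multi_objective handling are omitted for brevity.
        optimizer, scheduler = self.setup_optimizer_and_scheduler(epochs)
        criterion = nn.MSELoss()
        progress_bar = self.setup_progress_bar(epochs, position, description)
        for epoch in range(epochs):
            for batch in train_loader:
                optimizer.zero_grad()
                preds, targets = self.forward(batch)
                loss = criterion(preds, targets)
                loss.backward()
                optimizer.step()
            scheduler.step()
            progress_bar.update(1)
        progress_bar.close()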

checkpoint(test_loss, epoch)#

If save_best is True and test_loss < self.best_test_loss, overwrite the single-file checkpoint on disk and update best_test_loss/epoch.

Return type:

None

denormalize(data, leave_log=False, leave_norm=False)#

Denormalize the data.

Parameters:
  • data (Tensor | np.ndarray) – The data to denormalize.

  • leave_log (bool) – If True, do not exponentiate the data even if log10_transform is True.

  • leave_norm (bool) – If True, do not denormalize the data even if normalisation is applied.

Returns:

The denormalized data.

Return type:

Tensor | np.ndarray
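A brief usage sketch of the flag semantics (x is assumed to be a normalized, log10-transformed tensor):

x_orig = model.denormalize(x)                 # undo normalisation, then undo the log10 transform
x_dex = model.denormalize(x, leave_log=True)  # undo normalisation but stay in log10-space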

denormalize_old(data)#

Denormalize the data.

Parameters:

data (np.ndarray) – The data to denormalize.

Returns:

The denormalized data.

Return type:

np.ndarray

abstract fit(train_loader, test_loader, epochs, position, description, multi_objective)#

Perform the training of the model. Sets the train_loss and test_loss attributes.

Parameters:
  • train_loader (DataLoader) – The DataLoader object containing the training data.

  • test_loader (DataLoader) – The DataLoader object containing the testing data.

  • epochs (int) – The number of epochs to train the model for.

  • position (int) – The position of the progress bar.

  • description (str) – The description of the progress bar.

  • multi_objective (bool) – Whether the training is multi-objective.

Return type:

None

abstract forward(inputs)#

Forward pass of the model.

Parameters:

inputs (Any) – The input data as received from the dataloader.

Returns:

The model predictions and the targets.

Return type:

tuple[Tensor, Tensor]

get_checkpoint(test_loader, criterion)#

After training, compare the current model’s test loss to the best recorded loss. If the final model is better, keep it; otherwise load the saved best checkpoint.

Parameters:
  • test_loader (DataLoader) – DataLoader for computing final test loss.

  • criterion (nn.Module) – Loss function used for evaluation.

Return type:

None
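Together with setup_checkpoint() (documented below), these methods implement a single best-model checkpoint. A hedged sketch of the lifecycle inside a training loop (the epoch loop and the compute_test_loss helper are illustrative, not framework API):

def compute_test_loss(model, loader, criterion):
    # Illustrative helper built on the documented predict() method.
    preds, targets = model.predict(loader, leave_log=True)
    return criterion(preds, targets).item()

model.setup_checkpoint()                       # must precede any checkpoint(...) call
for epoch in range(epochs):
    ...                                        # one training epoch
    test_loss = compute_test_loss(model, test_loader, criterion)
    model.checkpoint(test_loss, epoch)         # overwrites the file only on improvement
model.get_checkpoint(test_loader, criterion)   # keep final model or restore saved best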

classmethod get_registered_classes()#

Returns the list of registered surrogate model classes.

Return type:

list[type[AbstractSurrogateModel]]

load(training_id, surr_name, model_identifier, model_dir=None)#

Load a trained surrogate model.

Parameters:
  • training_id (str) – The training identifier.

  • surr_name (str) – The name of the surrogate model.

  • model_identifier (str) – The identifier of the model (e.g., ‘main’).

Return type:

None

Returns:

None. The model is loaded in place.

predict(data_loader, leave_log=False, leave_norm=False)#

Evaluate the model on the given dataloader.

Parameters:
  • data_loader (DataLoader) – The DataLoader object containing the data the model is evaluated on.

  • leave_log (bool) – If True, do not exponentiate the data even if log10_transform is True.

  • leave_norm (bool) – If True, do not denormalize the data even if normalisation is applied.

Returns:

The predictions and targets.

Return type:

tuple[Tensor, Tensor]

abstract prepare_data(dataset_train, dataset_test, dataset_val, timesteps, batch_size, shuffle, dummy_timesteps=True)#

Prepare the data for training, testing, and validation. This method should return the DataLoader objects for the training, testing, and validation data.

Parameters:
  • dataset_train (np.ndarray) – The training dataset.

  • dataset_test (np.ndarray) – The testing dataset.

  • dataset_val (np.ndarray) – The validation dataset.

  • timesteps (np.ndarray) – The timesteps.

  • batch_size (int) – The batch size.

  • shuffle (bool) – Whether to shuffle the data.

  • dummy_timesteps (bool) – Whether to use dummy timesteps. Defaults to True.

Returns:

The DataLoader objects for the training, testing, and validation data.

Return type:

tuple[DataLoader, DataLoader, DataLoader]

classmethod register(surrogate)#

Registers a surrogate model class into the registry.
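A usage sketch (MySurrogate stands in for any concrete subclass, e.g. the hypothetical sketch above):

AbstractSurrogateModel.register(MySurrogate)

# Registered classes can then be retrieved via:
surrogate_classes = AbstractSurrogateModel.get_registered_classes()
assert MySurrogate in surrogate_classes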

save(model_name, base_dir, training_id)#

Save the model to disk.

Parameters:
  • model_name (str) – The name of the model.

  • base_dir (str) – The base directory to save the model in.

  • training_id (str) – The training identifier.

Return type:

None

setup_checkpoint()#

Prepare everything needed to save the single ‘best’ checkpoint. Must be called before any call to self.checkpoint(…).

Return type:

None

setup_optimizer_and_scheduler(epochs)#

Set up the optimizer and scheduler based on self.config.optimizer and self.config.scheduler. Supports the “adamw” and “sgd” optimizers as well as the “schedulefree”, “cosine”, and “poly” schedulers.
  • Patches standard optimizers so that .train() and .eval() exist as no-ops.

  • Patches ScheduleFree optimizers to have a no-op scheduler.step().

  • For ScheduleFree optimizers, uses lr warmup for the first 1% of epochs.

  • For the “poly” scheduler, uses a power decay based on self.config.poly_power.

  • For the “cosine” scheduler, uses a minimum learning rate defined by self.config.eta_min.

Parameters:

epochs (int) – The number of epochs the training will run for.

Returns:

The optimizer and scheduler instances.

Return type:

tuple[torch.optim.Optimizer, torch.optim.lr_scheduler._LRScheduler]

Raises:

ValueError – If an unknown optimizer or scheduler is specified in the config.
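A sketch of the documented selection logic for the standard cases (the ScheduleFree patching is omitted; the function wrapper is illustrative, not the actual method body):

import torch

def build_optimizer_and_scheduler(model, config, epochs):
    if config.optimizer == "adamw":
        optimizer = torch.optim.AdamW(
            model.parameters(), lr=config.learning_rate,
            weight_decay=config.regularization_factor)
    elif config.optimizer == "sgd":
        optimizer = torch.optim.SGD(
            model.parameters(), lr=config.learning_rate,
            momentum=config.momentum,
            weight_decay=config.regularization_factor)
    else:
        raise ValueError(f"Unknown optimizer: {config.optimizer}")

    if config.scheduler == "cosine":
        # eta_min is documented as a multiplier on the base learning rate.
        scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
            optimizer, T_max=epochs,
            eta_min=config.eta_min * config.learning_rate)
    elif config.scheduler == "poly":
        scheduler = torch.optim.lr_scheduler.PolynomialLR(
            optimizer, total_iters=epochs, power=config.poly_power)
    else:
        raise ValueError(f"Unknown scheduler: {config.scheduler}")

    return optimizer, scheduler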

setup_progress_bar(epochs, position, description)#

Helper function to set up a progress bar for training.

Parameters:
  • epochs (int) – The number of epochs.

  • position (int) – The position of the progress bar.

  • description (str) – The description of the progress bar.

Returns:

The progress bar.

Return type:

tqdm

time_pruning(current_epoch, total_epochs)#

Determine whether a trial should be pruned based on projected runtime, but only after a warmup period (10% of the total epochs).

Warmup: Do not prune if current_epoch is less than warmup_epochs. After warmup, compute the average epoch time, extrapolate the total runtime, and retrieve the threshold (runtime_threshold) from the study’s user attributes. If the projected runtime exceeds the threshold, raise an optuna.TrialPruned exception.

Parameters:
  • current_epoch (int) – The current epoch count.

  • total_epochs (int) – The planned total number of epochs.

Raises:

optuna.TrialPruned – If the projected runtime exceeds the threshold.

Return type:

None
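A hedged sketch of the documented logic (the function wrapper and the epoch_times bookkeeping are illustrative; only runtime_threshold is named in the description above):

import optuna

def time_pruning_sketch(study, current_epoch, total_epochs, epoch_times):
    warmup_epochs = 0.1 * total_epochs
    if current_epoch < warmup_epochs:
        return                                            # warmup: never prune
    avg_epoch_time = sum(epoch_times) / len(epoch_times)  # average epoch duration so far
    projected_runtime = avg_epoch_time * total_epochs     # extrapolate total runtime
    threshold = study.user_attrs["runtime_threshold"]     # from the study's user attributes
    if projected_runtime > threshold:
        raise optuna.TrialPruned()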

validate(epoch, train_loader, test_loader, optimizer, progress_bar, total_epochs, multi_objective)#

Shared “validation + checkpoint” logic, to be called once per epoch in each fit().

Return type:

None

Relies on:
  • self.update_epochs (int)

  • self.train_loss (np.ndarray)

  • self.test_loss (np.ndarray)

  • self.MAE (np.ndarray)

  • self.optuna_trial

  • self.L1 (nn.L1Loss)

  • self.predict(…)

  • self.checkpoint(test_loss, epoch)

Only runs if (epoch % self.update_epochs) == 0. Main reporting metric is MAE in log10-space (i.e., Δdex). Additionally, MAE in linear space is computed.
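A sketch of the two reported metrics, assuming log10-space predictions are obtained via predict(..., leave_log=True):

import torch

preds_log, targets_log = model.predict(test_loader, leave_log=True)
mae_dex = torch.mean(torch.abs(preds_log - targets_log))                  # MAE in log10-space (Δdex)
mae_linear = torch.mean(torch.abs(10 ** preds_log - 10 ** targets_log))  # MAE in linear space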

Module contents#

The package re-exports AbstractSurrogateBaseConfig (from codes.surrogates.AbstractSurrogate.abstract_config) and AbstractSurrogateModel (from codes.surrogates.AbstractSurrogate.abstract_surrogate) at the package level. Their full documentation is given in the submodule sections above.