rna_code.models package

Submodules

rna_code.models.autoencoder module

Base module for autoencoders.

class rna_code.models.autoencoder.Autoencoder(shape: int, latent_dim: int = 64, dropout: float = 0.1, slope: float = 0.05, num_layers: int = 3, variational: bool | str = None, num_embeddings: int = 512, embedding_dim: int = 512, commitment_cost: int = 1)

Bases: LightningModule, ABC

Base class for autoencoders.

Parameters:
  • shape (int) – Number of input variables

  • latent_dim (int, optional) – Size of the latent vector, by default 64

  • dropout (float, optional) – Dropout rate, by default 0.1

  • slope (float, optional) – Leaky ReLU slope, by default 0.05

  • num_layers (int, optional) – Number of encoder/decoder layers, by default 3

  • variational (bool | str, optional) – Which variational scheme to use: VAE for a variational autoencoder, VQ-VAE for a vector-quantized variational autoencoder, or None for no variational logic, by default None (all three modes are shown in the construction sketch after this parameter list)

  • num_embeddings (int, optional) – VQ only, number of embeddings, by default 512

  • embedding_dim (int, optional) – VQ only, size of the embedding dimension for the encoding vectors, by default 512

  • commitment_cost (int, optional) – VQ only, commitment cost, by default 1
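As referenced above, a minimal construction sketch covering the three variational modes. It uses the MLPAutoencoder subclass documented below and assumes its keyword arguments are forwarded to this base class; shape=1000 is an arbitrary example size, and the exact string values accepted by variational are assumed to match the names in the description above.

from rna_code.models.mlp_ae import MLPAutoencoder

# Plain autoencoder (no variational logic).
plain_ae = MLPAutoencoder(shape=1000, latent_dim=64)

# Variational autoencoder.
vae = MLPAutoencoder(shape=1000, latent_dim=64, variational="VAE")

# Vector-quantized variational autoencoder (the VQ-only parameters apply).
vq_vae = MLPAutoencoder(
    shape=1000,
    latent_dim=64,
    variational="VQ-VAE",
    num_embeddings=512,
    embedding_dim=512,
    commitment_cost=1,
)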

abstract build_decoder()

Build the decoder.

abstract build_encoder()

Build the encoder.

configure_optimizers()

Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple. Optimization with multiple optimizers only works in the manual optimization mode.

Returns:

Any of these 6 options.

  • Single optimizer.

  • List or Tuple of optimizers.

  • Two lists - The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_scheduler_config).

  • Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_scheduler_config.

  • None - Fit will run without any optimizer.

The lr_scheduler_config is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.

lr_scheduler_config = {
    # REQUIRED: The scheduler instance
    "scheduler": lr_scheduler,
    # The unit of the scheduler's step size, could also be 'step'.
    # 'epoch' updates the scheduler on epoch end whereas 'step'
    # updates it after an optimizer update.
    "interval": "epoch",
    # How many epochs/steps should pass between calls to
    # `scheduler.step()`. 1 corresponds to updating the learning
    # rate after every epoch/step.
    "frequency": 1,
    # Metric to monitor for schedulers like `ReduceLROnPlateau`
    "monitor": "val_loss",
    # If set to `True`, will enforce that the value specified in 'monitor'
    # is available when the scheduler is updated, thus stopping
    # training if not found. If set to `False`, it will only produce a warning.
    "strict": True,
    # If using the `LearningRateMonitor` callback to monitor the
    # learning rate progress, this keyword can be used to specify
    # a custom logged name
    "name": None,
}

When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.

Metrics can be made available to monitor by simply logging them using self.log('metric_to_track', metric_val) in your LightningModule.

Note

Some things to know:

  • Lightning calls .backward() and .step() automatically in case of automatic optimization.

  • If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default “epoch”) in the scheduler configuration, Lightning will call the scheduler’s .step() method automatically in case of automatic optimization.

  • If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizer.

  • If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.

  • If you use multiple optimizers, you will have to switch to ‘manual optimization’ mode and step them yourself.

  • If you need to control how often the optimizer steps, override the optimizer_step() hook.
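Tying the options above together, a minimal sketch of an override that pairs a single optimizer with a ReduceLROnPlateau scheduler. The optimizer choice, learning rate, and monitored metric are illustrative assumptions, not necessarily what this class uses.

import torch

def configure_optimizers(self):
    # Hypothetical choices; place this method inside a LightningModule subclass.
    optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=5)
    return {
        "optimizer": optimizer,
        "lr_scheduler": {
            "scheduler": scheduler,   # REQUIRED when a scheduler is used
            "interval": "epoch",      # step the scheduler once per epoch
            "frequency": 1,
            "monitor": "val_loss",    # must be logged via self.log("val_loss", ...)
        },
    }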

encode(x: Tensor) Tensor

Encoding logic to produce the latent vector.

Parameters:

x (torch.Tensor) – Input vector

Returns:

Latent vector

Return type:

torch.Tensor

forward(x: Tensor) Tensor

Forward pass that reconstructs the input through the latent space.

Parameters:

x (torch.Tensor) – Input vector

Returns:

Reconstruction of the input vector

Return type:

torch.Tensor
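For orientation, a hedged usage sketch of encode() and forward() with a concrete subclass (MLPAutoencoder, documented below); the exact input and latent shapes are assumptions based on the shape and latent_dim arguments.

import torch
from rna_code.models.mlp_ae import MLPAutoencoder

model = MLPAutoencoder(shape=1000, latent_dim=64)  # example sizes

x = torch.randn(16, 1000)   # a batch of 16 input vectors
z = model.encode(x)         # latent representation, assumed shape (16, 64)
x_hat = model(x)            # reconstruction of x through the latent space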

reparameterize(mu: Tensor, log_var: Tensor) Tensor

Reparameterization trick for the VAE.

Parameters:
  • mu (torch.Tensor) – Mean tensor for VAE sampling

  • log_var (torch.Tensor) – Log variance for VAE sampling

Returns:

Sampled latent variable z.

Return type:

torch.Tensor
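The trick itself is standard: sample eps from a unit Gaussian and shift/scale it by mu and exp(0.5 * log_var), so that sampling stays differentiable with respect to both inputs. A minimal sketch of the textbook formulation (not necessarily the exact code in this class):

import torch

def reparameterize(mu: torch.Tensor, log_var: torch.Tensor) -> torch.Tensor:
    # z = mu + sigma * eps, with sigma = exp(0.5 * log_var) and eps ~ N(0, I)
    std = torch.exp(0.5 * log_var)
    eps = torch.randn_like(std)
    return mu + eps * std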

training_step(batch, batch_idx)

Here you compute and return the training loss and some additional metrics, e.g. for the progress bar or logger.

Parameters:
  • batch – The output of your data iterable, normally a DataLoader.

  • batch_idx – The index of this batch.

  • dataloader_idx – The index of the dataloader that produced this batch. (only if multiple dataloaders used)

Returns:

  • Tensor - The loss tensor

  • dict - A dictionary which can include any keys, but must include the key 'loss' in the case of automatic optimization.

  • None - In automatic optimization, this will skip to the next batch (but is not supported for multi-GPU, TPU, or DeepSpeed). For manual optimization, this has no special meaning, as returning the loss is not required.

In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.

Example:

def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss

To use multiple optimizers, you can switch to ‘manual optimization’ and control their stepping:

def __init__(self):
    super().__init__()
    self.automatic_optimization = False


# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx):
    opt1, opt2 = self.optimizers()

    # do training_step with encoder
    ...
    opt1.step()
    # do training_step with decoder
    ...
    opt2.step()

Note

When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.

rna_code.models.cnn_ae module

Convolutional implementation of the Autoencoder.

class rna_code.models.cnn_ae.CNNAutoencoder(kernel_size: int | None = None, padding: int | None = None, **kwargs)

Bases: Autoencoder

Convolutional autoencoder.

Parameters:
  • kernel_size (int | None, optional) – Convolution kernel size, by default None

  • padding (int | None, optional) – Padding for convolution, by default None

build_decoder()

Build the convolutional layers of the decoder.

build_encoder()

Build the convolutional layers of the encoder.
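A hedged construction sketch; the kernel_size and padding values are placeholders, and the remaining keyword arguments are assumed to be forwarded to the Autoencoder base class.

from rna_code.models.cnn_ae import CNNAutoencoder

cnn_ae = CNNAutoencoder(
    shape=1000,      # forwarded to Autoencoder (example size)
    latent_dim=64,
    kernel_size=3,   # placeholder convolution settings
    padding=1,
)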

rna_code.models.mlp_ae module

Multi-layer perceptron implementation of the Autoencoder.

class rna_code.models.mlp_ae.MLPAutoencoder(**kwargs)

Bases: Autoencoder

Multi-layer perceptron based autoencoder.

build_decoder()

Build decoder.

build_encoder()

Build encoder.

rna_code.models.model_builder module

Module to build model objects.

Returns:

Built autoencoder model

Return type:

Autoencoder

Raises:

NotImplementedError – If trying to build a model that isn’t of type ‘CNN’ or ‘MLP’.

class rna_code.models.model_builder.ModelBuilder(shape: int, model_params: dict)

Bases: object

Builder for models.

Parameters:
  • shape (int) – Input size

  • model_params (dict) – Parameters used for model building.

generate_model() Autoencoder

Generate a model according to the model_params.

Returns:

Built autoencoder

Return type:

Autoencoder

Raises:

NotImplementedError – If trying to build a model that isn’t of type ‘CNN’ or ‘MLP’.
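A hedged usage sketch; the key that selects the model type ('CNN' or 'MLP') and the other entries of model_params are assumptions based on the description above, not a documented schema.

from rna_code.models.model_builder import ModelBuilder

# Hypothetical parameter dictionary; the real key names may differ.
model_params = {
    "model_type": "MLP",
    "latent_dim": 64,
    "dropout": 0.1,
}

builder = ModelBuilder(shape=1000, model_params=model_params)
model = builder.generate_model()  # returns a built Autoencoder subclass
# Any model type other than 'CNN' or 'MLP' raises NotImplementedError.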

rna_code.models.residual_stack module

class rna_code.models.residual_stack.ResidualStack(encoder_dim)

Bases: Module

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

rna_code.models.vector_quantizer module

class rna_code.models.vector_quantizer.VectorQuantizer(num_embeddings, embedding_dim, commitment_cost)

Bases: Module

forward(inputs)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
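For background, the quantization step of a VQ-VAE maps each latent vector to its nearest codebook embedding and adds a codebook loss plus a commitment loss, with a straight-through estimator so gradients reach the encoder. The sketch below illustrates that generic computation; it is not necessarily how this class implements forward() or what it returns.

import torch
import torch.nn.functional as F

latents = torch.randn(8, 512)     # hypothetical batch of latent vectors
codebook = torch.randn(512, 512)  # (num_embeddings, embedding_dim)
commitment_cost = 1.0

# Nearest codebook entry for each latent vector (squared Euclidean distance).
distances = torch.cdist(latents, codebook) ** 2
indices = distances.argmin(dim=1)
quantized = codebook[indices]

# Codebook loss pulls embeddings toward the encoder outputs; the commitment
# loss keeps encoder outputs close to their chosen embeddings.
codebook_loss = F.mse_loss(quantized, latents.detach())
commitment_loss = F.mse_loss(quantized.detach(), latents)
vq_loss = codebook_loss + commitment_cost * commitment_loss

# Straight-through estimator: the decoder sees the quantized vectors, while
# gradients flow back to the encoder as if quantization were the identity.
quantized = latents + (quantized - latents).detach()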

rna_code.models.vq_conversion module

class rna_code.models.vq_conversion.vq_conversion(in_feature, out_features)

Bases: Module

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

rna_code.models.vq_pre_residual_stack_decoder module

class rna_code.models.vq_pre_residual_stack_decoder.vq_pre_residual_stack_decoder(num_embeddings, encoder_dim, dropout)

Bases: Module

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Module contents