rna_code.models package

Submodules

rna_code.models.autoencoder module

Base module for autoencoders.

class rna_code.models.autoencoder.Autoencoder(shape: int, latent_dim: int = 64, dropout: float = 0.1, slope: float = 0.05, num_layers: int = 3, variational: bool | str = None, num_embeddings: int = 512, embedding_dim: int = 512, commitment_cost: int = 1)

Bases: LightningModule, ABC

Base class for autoencoders.

Parameters:
  • shape (int) – Number of input variables

  • latent_dim (int, optional) – Size of the latent vector, by default 64

  • dropout (float, optional) – Dropout rate, by default 0.1

  • slope (float, optional) – Leaky ReLU slope, by default 0.05

  • num_layers (int, optional) – Number of encoder/decoder layers, by default 3

  • variational (bool | str, optional) – Which variational scheme to use: VAE for a variational autoencoder, VQ-VAE for a vector-quantized variational autoencoder, or None for no variational logic, by default None (all three modes are shown in the construction sketch after this parameter list)

  • num_embeddings (int, optional) – VQ only, number of embeddings, by default 512

  • embedding_dim (int, optional) – VQ only, size of the embedding dimension for the encoding vectors, by default 512

  • commitment_cost (int, optional) – VQ only, commitment cost, by default 1
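As referenced above, a minimal construction sketch covering the three variational modes. It uses the MLPAutoencoder subclass documented below and assumes its keyword arguments are forwarded to this base class; shape=1000 is an arbitrary example size, and the exact string values accepted by variational are assumed to match the names in the description above.

from rna_code.models.mlp_ae import MLPAutoencoder

# Plain autoencoder (no variational logic).
plain_ae = MLPAutoencoder(shape=1000, latent_dim=64)

# Variational autoencoder.
vae = MLPAutoencoder(shape=1000, latent_dim=64, variational="VAE")

# Vector-quantized variational autoencoder (the VQ-only parameters apply).
vq_vae = MLPAutoencoder(
    shape=1000,
    latent_dim=64,
    variational="VQ-VAE",
    num_embeddings=512,
    embedding_dim=512,
    commitment_cost=1,
)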

abstract build_decoder()

Build the decoder.

abstract build_encoder()

Build the encoder.

configure_optimizers()

Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple. Optimization with multiple optimizers only works in the manual optimization mode.

Returns:

Any of these 6 options.

  • Single optimizer.

  • List or Tuple of optimizers.

  • Two lists - The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_scheduler_config).

  • Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_scheduler_config.

  • None - Fit will run without any optimizer.

The lr_scheduler_config is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.

lr_scheduler_config = {
    # REQUIRED: The scheduler instance
    "scheduler": lr_scheduler,
    # The unit of the scheduler's step size, could also be 'step'.
    # 'epoch' updates the scheduler on epoch end whereas 'step'
    # updates it after an optimizer update.
    "interval": "epoch",
    # How many epochs/steps should pass between calls to
    # `scheduler.step()`. 1 corresponds to updating the learning
    # rate after every epoch/step.
    "frequency": 1,
    # Metric to monitor for schedulers like `ReduceLROnPlateau`
    "monitor": "val_loss",
    # If set to `True`, will enforce that the value specified in 'monitor'
    # is available when the scheduler is updated, thus stopping
    # training if not found. If set to `False`, it will only produce a warning.
    "strict": True,
    # If using the `LearningRateMonitor` callback to monitor the
    # learning rate progress, this keyword can be used to specify
    # a custom logged name
    "name": None,
}

When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.

Metrics can be made available to monitor by simply logging them using self.log('metric_to_track', metric_val) in your LightningModule.

Note

Some things to know:

  • Lightning calls .backward() and .step() automatically in case of automatic optimization.

  • If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default “epoch”) in the scheduler configuration, Lightning will call the scheduler’s .step() method automatically in case of automatic optimization.

  • If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizer.

  • If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.

  • If you use multiple optimizers, you will have to switch to ‘manual optimization’ mode and step them yourself.

  • If you need to control how often the optimizer steps, override the optimizer_step() hook.
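Tying the options above together, a minimal sketch of an override that pairs a single optimizer with a ReduceLROnPlateau scheduler. The optimizer choice, learning rate, and monitored metric are illustrative assumptions, not necessarily what this class uses.

import torch

def configure_optimizers(self):
    # Hypothetical choices; place this method inside a LightningModule subclass.
    optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=5)
    return {
        "optimizer": optimizer,
        "lr_scheduler": {
            "scheduler": scheduler,   # REQUIRED when a scheduler is used
            "interval": "epoch",      # step the scheduler once per epoch
            "frequency": 1,
            "monitor": "val_loss",    # must be logged via self.log("val_loss", ...)
        },
    }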

encode(x: Tensor) Tensor

Encoding logic to produce the latent vector.

Parameters:

x (torch.Tensor) – Input vector

Returns:

Latent vector

Return type:

torch.Tensor

forward(x: Tensor) Tensor

Forward pass that reconstructs the input through the latent space.

Parameters:

x (torch.Tensor) – Input vector

Returns:

Reconstruction of the input vector

Return type:

torch.Tensor
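For orientation, a hedged usage sketch of encode() and forward() with a concrete subclass (MLPAutoencoder, documented below); the exact input and latent shapes are assumptions based on the shape and latent_dim arguments.

import torch
from rna_code.models.mlp_ae import MLPAutoencoder

model = MLPAutoencoder(shape=1000, latent_dim=64)  # example sizes

x = torch.randn(16, 1000)   # a batch of 16 input vectors
z = model.encode(x)         # latent representation, assumed shape (16, 64)
x_hat = model(x)            # reconstruction of x through the latent space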

reparameterize(mu: Tensor, log_var: Tensor) Tensor

Reparameterization trick for the VAE.

Parameters:
  • mu (torch.Tensor) – Mean tensor for VAE sampling

  • log_var (torch.Tensor) – Log variance for VAE sampling

Returns:

Sampled latent variable z.

Return type:

torch.Tensor
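The trick itself is standard: sample eps from a unit Gaussian and shift/scale it by mu and exp(0.5 * log_var), so that sampling stays differentiable with respect to both inputs. A minimal sketch of the textbook formulation (not necessarily the exact code in this class):

import torch

def reparameterize(mu: torch.Tensor, log_var: torch.Tensor) -> torch.Tensor:
    # z = mu + sigma * eps, with sigma = exp(0.5 * log_var) and eps ~ N(0, I)
    std = torch.exp(0.5 * log_var)
    eps = torch.randn_like(std)
    return mu + eps * std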

training_step(batch, batch_idx)

Here you compute and return the training loss and some additional metrics, e.g. for the progress bar or logger.

Parameters:
  • batch – The output of your data iterable, normally a DataLoader.

  • batch_idx – The index of this batch.

  • dataloader_idx – The index of the dataloader that produced this batch. (only if multiple dataloaders used)

Returns:

  • Tensor - The loss tensor

  • dict - A dictionary which can include any keys, but must include the key 'loss' in the case of automatic optimization.

  • None - In automatic optimization, this will skip to the next batch (but is not supported for multi-GPU, TPU, or DeepSpeed). For manual optimization, this has no special meaning, as returning the loss is not required.

In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.

Example:

def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss

To use multiple optimizers, you can switch to ‘manual optimization’ and control their stepping:

def __init__(self):
    super().__init__()
    self.automatic_optimization = False


# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx):
    opt1, opt2 = self.optimizers()

    # do training_step with encoder
    ...
    opt1.step()
    # do training_step with decoder
    ...
    opt2.step()

Note

When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.

rna_code.models.cnn_ae module

Convolutional implementation of the Autoencoder.

class rna_code.models.cnn_ae.CNNAutoencoder(kernel_size: int | None = None, padding: int | None = None, **kwargs)

Bases: Autoencoder

Convolutional autoencoder.

Parameters:
  • kernel_size (int | None, optional) – Convolution kernel size, by default None

  • padding (int | None, optional) – Padding for convolution, by default None

build_decoder()

Build the convolutional layers of the decoder.

build_encoder()

Build the convolutional layers of the encoder.
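A hedged construction sketch; the kernel_size and padding values are placeholders, and the remaining keyword arguments are assumed to be forwarded to the Autoencoder base class.

from rna_code.models.cnn_ae import CNNAutoencoder

cnn_ae = CNNAutoencoder(
    shape=1000,      # forwarded to Autoencoder (example size)
    latent_dim=64,
    kernel_size=3,   # placeholder convolution settings
    padding=1,
)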

rna_code.models.mlp_ae module

Multi-layer perceptron implementation of the Autoencoder.

class rna_code.models.mlp_ae.MLPAutoencoder(**kwargs)

Bases: Autoencoder

Multi-layer perceptron based autoencoder.

build_decoder()

Build decoder.

build_encoder()

Build encoder.

rna_code.models.model_builder module

Module to build model objects.

Returns:

Built autoencoder model

Return type:

Autoencoder

Raises:

NotImplementedError – If trying to build a model that isn’t of type ‘CNN’ or ‘MLP’.

class rna_code.models.model_builder.ModelBuilder(shape: int, model_params: dict)

Bases: object

Builder for models.

Parameters:
  • shape (int) – Input size

  • model_params (dict) – Parameters used for model building.

generate_model() Autoencoder

Generate a model according to the model_params.

Returns:

Built autoencoder

Return type:

Autoencoder

Raises:

NotImplementedError – If trying to build a model that isn’t of type ‘CNN’ or ‘MLP’.
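A hedged usage sketch; the key that selects the model type ('CNN' or 'MLP') and the other entries of model_params are assumptions based on the description above, not a documented schema.

from rna_code.models.model_builder import ModelBuilder

# Hypothetical parameter dictionary; the real key names may differ.
model_params = {
    "model_type": "MLP",
    "latent_dim": 64,
    "dropout": 0.1,
}

builder = ModelBuilder(shape=1000, model_params=model_params)
model = builder.generate_model()  # returns a built Autoencoder subclass
# Any model type other than 'CNN' or 'MLP' raises NotImplementedError.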

rna_code.models.residual_stack module

class rna_code.models.residual_stack.ResidualStack(encoder_dim)

Bases: Module

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

rna_code.models.vector_quantizer module

class rna_code.models.vector_quantizer.VectorQuantizer(num_embeddings, embedding_dim, commitment_cost)

Bases: Module

forward(inputs)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
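For background, the quantization step of a VQ-VAE maps each latent vector to its nearest codebook embedding and adds a codebook loss plus a commitment loss, with a straight-through estimator so gradients reach the encoder. The sketch below illustrates that generic computation; it is not necessarily how this class implements forward() or what it returns.

import torch
import torch.nn.functional as F

latents = torch.randn(8, 512)     # hypothetical batch of latent vectors
codebook = torch.randn(512, 512)  # (num_embeddings, embedding_dim)
commitment_cost = 1.0

# Nearest codebook entry for each latent vector (squared Euclidean distance).
distances = torch.cdist(latents, codebook) ** 2
indices = distances.argmin(dim=1)
quantized = codebook[indices]

# Codebook loss pulls embeddings toward the encoder outputs; the commitment
# loss keeps encoder outputs close to their chosen embeddings.
codebook_loss = F.mse_loss(quantized, latents.detach())
commitment_loss = F.mse_loss(quantized.detach(), latents)
vq_loss = codebook_loss + commitment_cost * commitment_loss

# Straight-through estimator: the decoder sees the quantized vectors, while
# gradients flow back to the encoder as if quantization were the identity.
quantized = latents + (quantized - latents).detach()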

rna_code.models.vq_conversion module

class rna_code.models.vq_conversion.vq_conversion(in_feature, out_features)

Bases: Module

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

rna_code.models.vq_pre_residual_stack_decoder module

class rna_code.models.vq_pre_residual_stack_decoder.vq_pre_residual_stack_decoder(num_embeddings, encoder_dim, dropout)

Bases: Module

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Module contents