Core Models & Data

Attention Map

Base Module

class sdofmv2.core.basemodule.BaseModule(*args: Any, **kwargs: Any)[source]

Bases: LightningModule

A foundational PyTorch Lightning module for standardized training.

This base class handles the boilerplate configuration for optimizers and learning rate schedulers. Other models in the pipeline should inherit from this class and implement their specific training_step and validation_step logic.

Parameters:

optimizer_dict (dict) – Configuration dictionary for the optimizer. Expected keys include “use” (e.g., “adamw”, “sgd”, “adam”), “learning_rate”, and “weight_decay”.
scheduler_dict (dict) – Configuration dictionary for the learning rate scheduler. Expected keys include “use” (e.g., “cosine”, “cosine_warmup”, “plateau”, “exp”), “monitor” (metric to track), and any scheduler-specific hyperparameters.
hyperparam_ignore (list[str], optional) – List of parameter names to exclude from Lightning’s automatic hyperparameter saving. Defaults to [].
*args – Variable length argument list passed to pl.LightningModule.
**kwargs – Arbitrary keyword arguments passed to pl.LightningModule.
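
The two configuration dictionaries described above might look like the following minimal sketch. Only the key names "use", "learning_rate", "weight_decay", and "monitor" and the listed option names come from the parameter descriptions; the values, the "val_loss" metric name, and the check_config helper are hypothetical, and the library may accept options beyond those listed.

```python
# Hypothetical values; only the key names come from the parameter docs.
optimizer_dict = {
    "use": "adamw",
    "learning_rate": 1e-4,
    "weight_decay": 0.05,
}
scheduler_dict = {
    "use": "cosine_warmup",
    "monitor": "val_loss",  # assumed metric name
}

# Option names given as examples in the parameter descriptions.
KNOWN_OPTIMIZERS = {"adamw", "sgd", "adam"}
KNOWN_SCHEDULERS = {"cosine", "cosine_warmup", "plateau", "exp"}


def check_config(optimizer_cfg: dict, scheduler_cfg: dict) -> None:
    """Hypothetical helper: fail early on missing keys or unknown options."""
    for key in ("use", "learning_rate", "weight_decay"):
        if key not in optimizer_cfg:
            raise KeyError(f"optimizer_dict is missing '{key}'")
    if optimizer_cfg["use"] not in KNOWN_OPTIMIZERS:
        raise ValueError(f"unknown optimizer: {optimizer_cfg['use']}")
    for key in ("use", "monitor"):
        if key not in scheduler_cfg:
            raise KeyError(f"scheduler_dict is missing '{key}'")
    if scheduler_cfg["use"] not in KNOWN_SCHEDULERS:
        raise ValueError(f"unknown scheduler: {scheduler_cfg['use']}")


check_config(optimizer_dict, scheduler_dict)  # passes silently
```

Validating the dictionaries before constructing the module surfaces typos in the "use" field before any training resources are allocated.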

Methods

`configure_optimizers`()	Configure optimizers and learning rate schedulers.
`training_step`(batch, batch_idx)	Perform a single training step.
`validation_step`(batch, batch_idx)	Perform a single validation step.

configure_optimizers()[source]

Configure optimizers and learning rate schedulers.

Returns:

Either a single optimizer, or a dict containing the optimizer and lr_scheduler configuration.

Return type:

Union[torch.optim.Optimizer, Dict]

training_step(batch, batch_idx)[source]

Perform a single training step.

Parameters:

batch – The training batch data.
batch_idx – The index of the current batch.

Raises:

NotImplementedError – Subclasses must implement this method.

validation_step(batch, batch_idx)[source]

Perform a single validation step.

Parameters:

batch – The validation batch data.
batch_idx – The index of the current batch.

Raises:

NotImplementedError – Subclasses must implement this method.
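
The subclassing contract described for training_step and validation_step can be illustrated without Lightning installed. In this sketch, BaseModuleStandIn is a stand-in written for illustration (it is not the real BaseModule), and ToyRegressor and its loss are hypothetical.

```python
class BaseModuleStandIn:
    """Stand-in for BaseModule (NOT the real class): steps raise by default."""

    def training_step(self, batch, batch_idx):
        raise NotImplementedError("Subclasses must implement training_step")

    def validation_step(self, batch, batch_idx):
        raise NotImplementedError("Subclasses must implement validation_step")


class ToyRegressor(BaseModuleStandIn):
    """Hypothetical subclass that implements both steps."""

    def __init__(self, weight: float = 2.0):
        self.weight = weight

    def _loss(self, batch):
        # Mean absolute error of a one-parameter linear model.
        inputs, targets = batch
        errors = [abs(self.weight * x - y) for x, y in zip(inputs, targets)]
        return sum(errors) / len(errors)

    def training_step(self, batch, batch_idx):
        return self._loss(batch)

    def validation_step(self, batch, batch_idx):
        return self._loss(batch)


batch = ([1.0, 2.0], [2.0, 5.0])
loss = ToyRegressor().training_step(batch, batch_idx=0)
```

A real subclass would inherit from BaseModule, return a torch.Tensor loss, and rely on the inherited configure_optimizers.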

Data Module

class sdofmv2.core.datamodule.SDOMLDataset(*args: Any, **kwargs: Any)[source]

Bases: Dataset

A PyTorch Dataset for Solar Dynamics Observatory (SDO) Machine Learning data.

This dataset aligns and loads multimodal solar observations from the AIA, HMI, and EVE instruments. It supports temporal sequencing, masking, and on-the-fly normalization for training deep learning models on solar data.

Parameters:

aligndata (pd.DataFrame) – Aligned temporal indexes used for matching inputs and outputs across different instruments.
hmi_data (zarr.hierarchy.Group) – Zarr dataset of HMI magnetogram observations.
aia_data (zarr.hierarchy.Group) – Zarr dataset of AIA EUV/UV image observations.
eve_data (zarr.hierarchy.Group) – Zarr dataset of EVE irradiance observations.
components (list[str]) – List of magnetic components to load for HMI (e.g., [‘Bx’, ‘By’, ‘Bz’]).
wavelengths (list[str] or list[int]) – List of channels to load for AIA (e.g., [94, 131, 171, 193, 211, 304, 335, 1600, 1700]).
ions (list[str]) – List of spectral lines/ions to load for EVE (e.g., from MEGS-A and MEGS-B).
freq (str) – The temporal cadence used for rounding and aligning the time series (e.g., ‘12min’).
months (list[int]) – List of valid months (1-12) to include in the dataset. Useful for creating train/validation/test splits by time.
normalization (dict) – The normalization strategy to apply during data loading (e.g., ‘zscore’, ‘minmax’). Defaults to None.
normalization_stat (dict) – Pre-computed statistics (like mean and standard deviation) required for the chosen normalization. Defaults to None.
mask (torch.Tensor) – HMI limb mask to apply to the AIA and HMI spatial data. Defaults to None.
num_frames (int, optional) – The number of consecutive temporal frames to load per sequence sample. Defaults to 1.
drop_frame_dim (bool, optional) – If True and num_frames is 1, drops the temporal dimension. Defaults to False.
min_date (str or datetime, optional) – The earliest date boundary to include in the dataset. Defaults to None.
max_date (str or datetime, optional) – The latest date boundary to include in the dataset. Defaults to None.
get_header (bool or list, optional) – Whether to retrieve and return header metadata alongside the image tensors. Defaults to False.
precision (str, optional) – The floating-point precision for the output tensors (e.g., “32” for float32, “16” for float16). Defaults to “32”.
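
To make the role of freq concrete, the sketch below rounds timestamps to a 12-minute cadence so that observations from different instruments land on the same time key. The round_to_cadence helper is hypothetical, and how the dataset performs the rounding internally is not shown here.

```python
from datetime import datetime, timedelta


def round_to_cadence(ts: datetime, minutes: int = 12) -> datetime:
    """Round a timestamp to the nearest multiple of `minutes` within its day."""
    day_start = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    step = timedelta(minutes=minutes)
    # Number of cadence steps since midnight, rounded half up.
    n_steps = int((ts - day_start) / step + 0.5)
    return day_start + n_steps * step


# Two observations taken about a minute apart map to the same aligned time.
aia_time = round_to_cadence(datetime(2011, 2, 15, 0, 11, 30))
hmi_time = round_to_cadence(datetime(2011, 2, 15, 0, 12, 45))
```

After rounding, both timestamps share the key 2011-02-15 00:12, which is the kind of match an aligned index such as aligndata would record.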

Methods

`get_aia_image`(idx)	Get AIA image for a given index.
`get_eve`(idx)	Get EVE data for a given index.
`get_hmi_image`(idx)	Get HMI image for a given index.

get_aia_image(idx)[source]: Get AIA image for a given index. Returns a numpy array of shape (num_wavelengths, num_frames, height, width).

get_eve(idx)[source]: Get EVE data for a given index. Returns a numpy array of shape (num_ions, num_frames, …).

get_hmi_image(idx)[source]: Get HMI image for a given index. Returns a numpy array of shape (num_channels, num_frames, height, width).

loading_data_retry(data, year, wavelength, id_of_img, num_try: int = 10, sleep_time: float = 0.5)[source]: Tries to load an image from the dataset multiple times to handle transient

class sdofmv2.core.datamodule.SDOMLDataModule(*args: Any, **kwargs: Any)[source]

Bases: LightningDataModule

A PyTorch Lightning DataModule for paired SDO machine learning data.

This module orchestrates the downloading, setup, splitting, and batching of paired AIA EUV images, HMI magnetograms, and EVE irradiance measures. It handles train/val/test splits based on specified months to prevent temporal data leakage.

Note

Input data across the different instruments needs to be temporally aligned and paired.

Parameters:

hmi_path (str) – Path to the HMI Zarr data file.
aia_path (str) – Path to the AIA Zarr data file.
eve_path (str) – Path to the EVE Zarr data file.
components (list[str]) – List of magnetic field components to load from HMI.
wavelengths (list[int] or list[str]) – List of AIA wavelengths to load.
ions (list[str]) – List of EVE ions or spectral lines to load.
batch_size (int, optional) – Number of samples per batch. Defaults to 32.
num_workers (int, optional) – Number of subprocesses to use for data loading. Defaults to None.
pin_memory (bool, optional) – If True, the data loader will copy Tensors into CUDA pinned memory before returning them. Defaults to False.
persistent_workers (bool, optional) – If True, the data loader will not shutdown worker processes after a dataset has been consumed once. Defaults to False.
normalization (dict) – Normalization strategy to use. Defaults to False.
hmi_mask (str, optional) – Filename for the HMI mask. Defaults to “hmi_mask_512x512.npy”.
apply_mask (bool, optional) – Whether to apply the solar limb mask to the spatial data. Defaults to True.
num_frames (int, optional) – The number of consecutive temporal frames to load per sequence sample. Defaults to 1.
drop_frame_dim (bool, optional) – If True and num_frames is 1, drops the temporal dimension. Defaults to False.
min_date (str, optional) – The earliest date boundary to include in the splits (e.g., ‘2010-05-01’). Defaults to None.
max_date (str, optional) – The latest date boundary to include in the splits. Defaults to None.
precision (str, optional) – The floating-point precision for the output tensors (e.g., “32”, “16”). Defaults to “32”.
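
The month-based splitting mentioned in the class description can be sketched as follows. The month assignment in SPLIT_MONTHS and the split_by_month helper are hypothetical; the actual assignment is whatever the configuration passes to the module.

```python
from datetime import datetime

# Hypothetical month assignment; disjoint lists keep the splits separated in time.
SPLIT_MONTHS = {
    "train": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "val": [11],
    "test": [12],
}


def split_by_month(timestamps, split_months):
    """Assign each timestamp to the split whose month list contains its month."""
    splits = {name: [] for name in split_months}
    for ts in timestamps:
        for name, months in split_months.items():
            if ts.month in months:
                splits[name].append(ts)
                break
    return splits


timestamps = [datetime(2011, month, 1) for month in range(1, 13)]
splits = split_by_month(timestamps, SPLIT_MONTHS)
```

Because consecutive solar images are highly correlated, splitting by whole months rather than by random sampling keeps near-duplicate frames out of the validation and test sets.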

Methods

`setup`([stage])
`test_dataloader`()
`train_dataloader`()
`val_dataloader`()

Losses

sdofmv2.core.losses.mae_loss(pred, target) → Tensor[source]

Calculates the mean absolute error between predictions and targets.

Parameters:

pred (torch.Tensor) – Predicted values from the model.
target (torch.Tensor) – Ground truth values to compare against.

Returns:

The calculated mean absolute error as a scalar tensor.

Return type:

torch.Tensor

sdofmv2.core.losses.patch_weight_loss(pred, target, loss_dict, mask_hidden, mask_off_limb)[source]

Calculates a three-tier weighted reconstruction loss for solar data.

This function separates patches into three categories (masked inner disk, visible inner disk, and off-limb space) and applies independent weights to each group’s mean loss. This prevents the large population of space pixels or masked patches from disproportionately biasing the gradients.

Parameters:

pred (torch.Tensor) – Predicted patch values [B, L, D].
target (torch.Tensor) – Ground truth (potentially normalized) patches [B, L, D].
loss_dict (dict or object) – Config object containing:
- base_loss (dict): Must have ‘type’ (‘mse’, ‘mae’, or ‘huber’) and ‘delta’ (for huber).
- weight_on_patches (list[float]): A three-element list: [weight_masked_inner, weight_visible_inner, weight_off_limb]. Example: [0.7, 0.2, 0.1].
mask_hidden (torch.Tensor) – Binary/bool mask from encoder [B, L]. 1 (True) indicates a masked/hidden patch.
mask_off_limb (torch.Tensor) – Binary/bool spatial mask [B, L]. 1 (True) indicates a patch outside the solar disk.

Returns:

Scalar weighted mean loss.

Return type:

torch.Tensor

Raises:

ValueError – If an unsupported loss type is provided.
IndexError – If weight_on_patches does not contain exactly three elements.
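
The three-tier weighting described above can be sketched with plain nested lists standing in for tensors. This is an illustration under stated assumptions, not the library's implementation: it fixes the base loss to MSE, treats an off-limb patch as off-limb even when it is also hidden, and lets an empty group contribute nothing.

```python
def patch_weight_loss_sketch(pred, target, weights, mask_hidden, mask_off_limb):
    """Weighted sum of per-group mean MSE over patches.

    pred, target: nested lists shaped [B][L][D]
    weights: [weight_masked_inner, weight_visible_inner, weight_off_limb]
    mask_hidden, mask_off_limb: nested lists shaped [B][L] holding 0/1
    """
    if len(weights) != 3:
        raise IndexError("weights must contain exactly three elements")
    groups = {"masked_inner": [], "visible_inner": [], "off_limb": []}
    for b in range(len(pred)):
        for i in range(len(pred[b])):
            # Per-patch MSE, averaged over the D values of the patch.
            diffs = [(p - t) ** 2 for p, t in zip(pred[b][i], target[b][i])]
            patch_loss = sum(diffs) / len(diffs)
            if mask_off_limb[b][i]:  # assumption: off-limb takes precedence
                groups["off_limb"].append(patch_loss)
            elif mask_hidden[b][i]:
                groups["masked_inner"].append(patch_loss)
            else:
                groups["visible_inner"].append(patch_loss)
    total = 0.0
    for weight, name in zip(weights, groups):
        losses = groups[name]
        if losses:  # assumption: an empty group contributes nothing
            total += weight * sum(losses) / len(losses)
    return total


pred = [[[1.0, 1.0], [2.0, 2.0], [0.0, 0.0]]]
target = [[[0.0, 0.0], [2.0, 4.0], [3.0, 3.0]]]
mask_hidden = [[1, 0, 1]]
mask_off_limb = [[0, 0, 1]]
loss = patch_weight_loss_sketch(pred, target, [0.7, 0.2, 0.1], mask_hidden, mask_off_limb)
```

Here the per-patch losses are 1.0 (masked inner), 2.0 (visible inner), and 9.0 (off-limb), so the result is approximately 0.7 * 1.0 + 0.2 * 2.0 + 0.1 * 9.0 = 2.0. Averaging within each group before weighting is what keeps a large group from dominating through sheer patch count.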

sdofmv2.core.losses.pixel_weight_loss(pred, target_norm, target, base_loss, threshold, ar_weight_ratio: float)[source]

Parameters:

pred (4D tensor) – Output from the model.
target_norm (4D tensor) – Target re-normalized by norm_pix_loss.
target (4D tensor) – Normalized target.
base_loss (str) – Baseline loss function.
threshold (float) – Threshold for identifying pixels with a strong magnetic field.
ar_weight_ratio (float) – Weight applied to pixels greater than the threshold.

Returns:

The weighted loss value.

Return type:

torch.float
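
The idea behind threshold and ar_weight_ratio, giving strong-field pixels a larger share of the loss, can be sketched as below. This is a simplified illustration, not the library's code: it uses a single target instead of the target_norm/target pair, assumes "strong field" means the absolute target value exceeds the threshold, fixes the base loss to squared error, and normalizes by the sum of weights.

```python
def pixel_weight_loss_sketch(pred, target, threshold, ar_weight_ratio):
    """Weighted mean of squared errors over flat lists of pixel values."""
    weighted_sum = 0.0
    weight_total = 0.0
    for p, t in zip(pred, target):
        # Assumption: a pixel counts as strong-field when |target| > threshold.
        weight = ar_weight_ratio if abs(t) > threshold else 1.0
        weighted_sum += weight * (p - t) ** 2
        weight_total += weight
    return weighted_sum / weight_total


pred = [0.0, 0.0, 0.0, 0.0]
target = [0.1, -0.1, 2.0, -3.0]  # two quiet pixels, two strong-field pixels
weighted = pixel_weight_loss_sketch(pred, target, threshold=1.0, ar_weight_ratio=3.0)
unweighted = pixel_weight_loss_sketch(pred, target, threshold=1.0, ar_weight_ratio=1.0)
```

With a ratio above 1, errors on the two strong-field pixels count more, so the weighted loss exceeds the unweighted mean for this input.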

sdofmv2.core.losses.sparse_dense_loss(pred: Tensor, target: Tensor, alpha: float = 1.0, beta: float = 1.0, base_type: Literal['mse', 'mae', 'huber'] = 'mse', huber_delta: float = 1.0, imgs: Tensor | None = None, patch_size: int = 16, corner_size: int = 4, corner_ratio: float = 0.25) → Tensor[source]

sdofmv2.core.losses.split_pixel_loss(pred: Tensor, target: Tensor, alpha: float = 1.0, beta: float = 1.0, base_type: Literal['mse', 'mae', 'huber'] = 'mse', huber_delta: float = 1.0, imgs: Tensor | None = None, patch_size: int = 16, corner_size: int = 4, corner_ratio: float = 0.25) → Tensor[source]

sdofmv2.core.losses.vector_aware_loss(pred, target, base_loss) → Tensor[source]

Calculates a loss that combines magnitude and orientation for vector fields.

This method computes a base loss (MSE or MAE) and adds a weighted cosine similarity term. The cosine similarity component enforces directional alignment between the predicted and target vectors.

Parameters:

pred (torch.Tensor) – Predicted vector field of shape (B, 3, F, H, W).
target (torch.Tensor) – Ground truth vector field of shape (B, 3, F, H, W).
base_loss (str) – The type of base loss to compute, either “mse” or “mae”.

Returns:

A scalar tensor representing the combined loss.

Return type:

torch.Tensor

Raises:

ValueError – If the provided base_loss type isn’t supported.
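
The combination of a magnitude term and a directional term can be sketched with lists of 3-component vectors standing in for the (B, 3, F, H, W) tensors. The form of the directional term (one minus the mean cosine similarity), the cos_weight value, and the eps guard are assumptions made for illustration; the source only states that a weighted cosine similarity term is added.

```python
import math


def vector_aware_loss_sketch(pred, target, base_loss="mse", cos_weight=0.1, eps=1e-8):
    """pred, target: lists of 3-component vectors, e.g. [(bx, by, bz), ...]."""
    if base_loss not in ("mse", "mae"):
        raise ValueError(f"unsupported base_loss: {base_loss}")
    base_total, cos_total = 0.0, 0.0
    for p, t in zip(pred, target):
        diffs = [pc - tc for pc, tc in zip(p, t)]
        if base_loss == "mse":
            base_total += sum(d * d for d in diffs) / 3
        else:
            base_total += sum(abs(d) for d in diffs) / 3
        # Cosine similarity between the predicted and target vectors.
        dot = sum(pc * tc for pc, tc in zip(p, t))
        norm_p = math.sqrt(sum(pc * pc for pc in p))
        norm_t = math.sqrt(sum(tc * tc for tc in t))
        cos_total += dot / (norm_p * norm_t + eps)
    n = len(pred)
    # Assumed form: base loss plus a penalty that vanishes for aligned vectors.
    return base_total / n + cos_weight * (1.0 - cos_total / n)


aligned = vector_aware_loss_sketch([(1.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)])
opposed = vector_aware_loss_sketch([(1.0, 0.0, 0.0)], [(-1.0, 0.0, 0.0)])
```

Identical vectors give a loss close to zero, while anti-parallel vectors are penalized by both terms.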

Masked Autoencoder (MAE)

class sdofmv2.core.mae3d.MaskedAutoencoderViT3D(img_size=224, patch_size=16, num_frames=3, tubelet_size=1, in_chans=3, embed_dim=1024, depth=24, num_heads=16, decoder_embed_dim=512, decoder_depth=8, decoder_num_heads=16, mlp_ratio=4.0, norm_layer='LayerNorm', limb_mask=None, ids_limb_mask=None, loss_dict={})[source]

Masked Autoencoder with a 3D Vision Transformer backbone.

This model encodes a fraction of visible image patches and predicts the missing (masked) patches. It incorporates specialized masking logic to handle solar limb boundaries, ensuring the model focuses on the solar disk.

Parameters:

img_size (int) – Spatial dimensions of the input images. Defaults to 224.
patch_size (int) – Spatial dimensions of each patch. Defaults to 16.
num_frames (int) – Number of frames in the input temporal sequence. Defaults to 3.
tubelet_size (int) – Temporal dimension of each patch (tubelet). Defaults to 1.
in_chans (int) – Number of input channels (e.g., wavelengths). Defaults to 3.
embed_dim (int) – Dimensionality of the token embeddings in the encoder. Defaults to 1024.
depth (int) – Number of Transformer blocks in the encoder. Defaults to 24.
num_heads (int) – Number of attention heads in the encoder’s Transformer blocks. Defaults to 16.
decoder_embed_dim (int) – Dimensionality of the token embeddings in the decoder. Defaults to 512.
decoder_depth (int) – Number of Transformer blocks in the decoder. Defaults to 8.
decoder_num_heads (int) – Number of attention heads in the decoder’s Transformer blocks. Defaults to 16.
mlp_ratio (float) – Ratio of the hidden dimension to the embedding dimension in the MLP of the Transformer blocks. Defaults to 4.0.
norm_layer (str) – The type of normalization layer to use. Currently supports “LayerNorm”. Defaults to “LayerNorm”.
limb_mask (torch.Tensor) – A 2D spatial mask tensor outlining the solar limb, used for pixel-level loss computation. Defaults to None.
ids_limb_mask (torch.Tensor or list) – 1D array of patch indices that fall outside the solar disk. These are always masked out during training. Defaults to None.
loss_dict (dict) – Configuration dictionary specifying the loss space (e.g., “patch” or “pixel”), loss type (e.g., “mse”, “mae”), and other loss-specific hyperparameters. Defaults to an empty dict.
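
The size parameters above determine the token sequence length. The sketch below works through the arithmetic, assuming non-overlapping tubelets as in standard ViT patch embedding; the num_tokens and patch_dim helpers are hypothetical and not part of the library.

```python
def num_tokens(img_size=224, patch_size=16, num_frames=3, tubelet_size=1):
    """Number of tubelet tokens, assuming non-overlapping patches."""
    if img_size % patch_size or num_frames % tubelet_size:
        raise ValueError("sizes must divide evenly")
    grid = img_size // patch_size  # patches per image side
    return (num_frames // tubelet_size) * grid * grid


def patch_dim(patch_size=16, tubelet_size=1, in_chans=3):
    """Values per tubelet: the assumed last dimension D of a [B, L, D] target."""
    return tubelet_size * patch_size * patch_size * in_chans


default_tokens = num_tokens()                     # 3 frames of a 14 x 14 grid
full_disk_tokens = num_tokens(512, 16, 1, 1)      # one 512 x 512 frame
default_patch_dim = patch_dim()
```

With the default arguments this gives 588 tokens of 768 values each; a single 512 x 512 frame with 16-pixel patches gives 1024 tokens.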

Methods

`forward`(imgs[, mask_ratio])	Define the computation performed at every call.
`get_intermediate_layers`(x, n[, mask_ratio, ...])	Extracts features from specified intermediate encoder layers.
`random_masking`(x, mask_ratio)	Perform per-sample random masking by per-sample shuffling.

forward(imgs, mask_ratio=0.5)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

get_intermediate_layers(x: Tensor, n: list[int], mask_ratio: float = 0.0, reshape: bool = True, norm: bool = False)[source]

Extracts features from specified intermediate encoder layers.

This method passes the input through the encoder and captures the output of the Transformer blocks requested in the list n. It is highly useful for downstream tasks (like classification or segmentation) that require hierarchical feature representations. Modified from timm.VisionTransformer.get_intermediate_layers.

Parameters:

x (torch.Tensor) – Input batch of image sequences. Expected shape is (B, C, T, H, W).
n (list[int]) – Indices of the layers to return features from. Index 0 corresponds to the initial patch embeddings (before blocks), while 1 through depth correspond to the Transformer blocks.
mask_ratio (float, optional) – The fraction of patches to mask out during extraction. Defaults to 0.0 (no masking).
reshape (bool, optional) – If True, reshapes the flat sequence of patches back into a spatial grid layout (B, D, H, W) based on the patch grid size. Defaults to True.
norm (bool, optional) – If True, applies the model’s layer normalization to the extracted features before returning them. Defaults to False.

Returns:

A list of feature tensors extracted from the layers specified in n.

Return type:

list[torch.Tensor]

random_masking(x, mask_ratio)[source]

Perform per-sample random masking by per-sample shuffling. Per-sample shuffling is done by argsorting random noise.

Parameters:

x (torch.Tensor) – Input sequence of embedded patches. Expected shape is [N, L, D], where N is batch size, L is sequence length (number of patches), and D is embedding dimension.
mask_ratio (float) – The fraction of patches to mask out (e.g., 0.75). If a limb mask is provided, this fraction only applies to patches inside the solar disk.

Returns:

A tuple containing:

x_masked (torch.Tensor): The sequence of kept, visible patches.
mask (torch.Tensor): Binary mask indicating kept (0) vs. removed (1) patches.
ids_restore (torch.Tensor): Indices needed to unshuffle the patches back to their original spatial order.

Return type:

tuple
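
The argsort-based shuffling can be sketched for a single sample using only the standard library. This illustration omits the limb-mask handling and returns the kept patch indices rather than gathering embeddings; the function name and the explicit rng argument are hypothetical.

```python
import random


def random_masking_sketch(num_patches, mask_ratio, rng):
    """Return (ids_keep, mask, ids_restore) for one sample."""
    len_keep = int(num_patches * (1 - mask_ratio))
    noise = [rng.random() for _ in range(num_patches)]
    # Argsort of the noise: patches with the smallest noise are kept.
    ids_shuffle = sorted(range(num_patches), key=noise.__getitem__)
    # Argsort of ids_shuffle gives the inverse permutation.
    ids_restore = sorted(range(num_patches), key=ids_shuffle.__getitem__)
    ids_keep = ids_shuffle[:len_keep]
    # Build the mask in shuffled order (0 = kept, 1 = removed), then unshuffle.
    mask_shuffled = [0] * len_keep + [1] * (num_patches - len_keep)
    mask = [mask_shuffled[i] for i in ids_restore]
    return ids_keep, mask, ids_restore


rng = random.Random(0)
ids_keep, mask, ids_restore = random_masking_sketch(16, 0.75, rng)
```

With 16 patches and a ratio of 0.75, four patches are kept and the mask marks the other twelve as removed, in original patch order.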

MAE Module

class sdofmv2.core.mae_module.MAE(*args: Any, **kwargs: Any)[source]

Bases: BaseModule

Masked Autoencoder (MAE) for 3D/Spatiotemporal data reconstruction.

This module implements a Vision Transformer-based autoencoder that learns representations by reconstructing masked patches of volumetric data. It supports custom ROI masking (limb masking) and automated metric tracking across training, validation, and testing phases.

Parameters:

img_size – Side length of the input image (assumed square).
chan_types – List of channel names/wavelengths for logging.
patch_size – Spatial size of the 2D patches.
num_frames – Total number of frames (temporal depth) in the input sequence.
tubelet_size – Temporal size of the 3D tubelets.
in_chans – Number of input data channels.
embed_dim – Embedding dimension for the encoder.
depth – Number of transformer layers in the encoder.
num_heads – Number of attention heads in the encoder.
decoder_embed_dim – Embedding dimension for the decoder.
decoder_depth – Number of transformer layers in the decoder.
decoder_num_heads – Number of attention heads in the decoder.
mlp_ratio – Expansion ratio for the MLP hidden dimension.
norm_layer – Type of normalization layer to use (e.g., “LayerNorm”).
masking_ratio – Fraction of patches to mask (0.0 to 1.0).
limb_mask – An optional binary ROI mask.
loss_dict – Configuration for reconstruction losses.
optimizer_dict – Configuration for the optimizer.
scheduler_dict – Configuration for the learning rate scheduler.
*args – Variable length argument list passed to BaseModule.
**kwargs – Arbitrary keyword arguments passed to BaseModule.

img_size

Spatial resolution of the input images (Height and Width).

Type: int

patch_size

The side length of the square patches extracted from each frame.

Type: int

tubelet_size

The temporal depth of each 3D patch (number of frames).

Type: int

masking_ratio

The fraction of patches to be masked out during the forward pass (typically 0.75).

Type: float

chan_types

A list of identifiers for each input channel (e.g., specific wavelengths), used for per-channel metric logging.

Type: list[str]

limb_mask

A binary spatial mask of shape (H, W) used to restrict the model’s focus to specific ROIs.

Type: Optional[torch.Tensor]

loss_dict

Configuration parameters and weights for the reconstruction loss functions.

Type: dict

validation_metrics

A transient buffer that accumulates metric dictionaries from each validation_step to be processed at the epoch end.

Type: list[dict]

test_results

A transient buffer that accumulates metric dictionaries from each test_step.

Type: list[dict]

autoencoder

The core transformer architecture consisting of the encoder and decoder blocks.

Type: MaskedAutoencoderViT3D

Methods

`forward`(x[, mask_ratio])	Perform a forward pass through the MAE.
`forward_encoder`(x, mask_ratio)	Perform a forward pass through the encoder only.
`on_test_epoch_end`()	Called at the end of the test epoch.
`on_validation_epoch_end`()	Called at the end of the validation epoch.
`test_step`(batch, batch_idx)	Perform a single test step.
`training_step`(batch, batch_idx)	Perform a single training step.
`validation_step`(batch, batch_idx)	Perform a single validation step.

forward(x, mask_ratio=None)[source]

Perform a forward pass through the MAE.

Parameters:

x (torch.Tensor) – Input images of shape (B, C, H, W).
mask_ratio (float, optional) – Fraction of patches to mask. If None, uses the default masking_ratio. Defaults to None.

Returns:

A tuple containing:

x_hat: Reconstructed images.
mask: The applied mask tensor.

Return type:

Tuple[torch.Tensor, torch.Tensor]

forward_encoder(x, mask_ratio)[source]

Perform a forward pass through the encoder only.

Parameters:

x (torch.Tensor) – Input images.
mask_ratio (float) – Fraction of patches to mask.

Returns:

Encoded features from the encoder.

Return type:

torch.Tensor

on_test_epoch_end()[source]

Called at the end of the test epoch.

Aggregates test metrics, saves them to a CSV file, logs to the logger, and clears the results buffer.

on_validation_epoch_end()[source]

Called at the end of the validation epoch.

Aggregates validation metrics, logs them to the logger (WandB or default), and clears the metrics buffer.
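
The buffer-then-aggregate pattern behind validation_metrics can be sketched as follows. Averaging is an assumption about how the per-step dictionaries are combined, and the metric names are hypothetical.

```python
def aggregate_metrics(buffer):
    """Average each metric across the buffered per-step dictionaries."""
    totals, counts = {}, {}
    for step_metrics in buffer:
        for name, value in step_metrics.items():
            totals[name] = totals.get(name, 0.0) + value
            counts[name] = counts.get(name, 0) + 1
    return {name: totals[name] / counts[name] for name in totals}


# Each validation_step would append one dictionary like these.
validation_metrics = [
    {"val_loss": 0.5, "val_mae": 0.25},
    {"val_loss": 0.25, "val_mae": 0.75},
]
epoch_summary = aggregate_metrics(validation_metrics)
validation_metrics.clear()  # reset the buffer for the next epoch
```

Clearing the buffer at the end of each epoch keeps metrics from one epoch from leaking into the next.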

test_step(batch, batch_idx)[source]

Perform a single test step.

Parameters:

batch – A tuple containing (images, timestamps).
batch_idx – The index of the current batch.

training_step(batch, batch_idx)[source]

Perform a single training step.

Parameters:

batch – A tuple containing (images, timestamps).
batch_idx – The index of the current batch.

Returns:

The training loss value.

Return type:

torch.Tensor

validation_step(batch, batch_idx)[source]

Perform a single validation step.

Parameters:

batch – A tuple containing (images, timestamps).
batch_idx – The index of the current batch.

Principal Component Analysis

sdofmv2.core.pca_analysis.mapping_dense_to_rgb(feature_map, visible_patch_ids, n_components, img_size, grid_size, patch_size, pretrained)[source]

Parameters:

feature_map (torch.Tensor) – Feature map of shape [n_patches, dim].
visible_patch_ids – Ids of the tokens that are passed to the encoder network after masking.
n_components – Output dimension after PCA.
img_size – Input dimension.
grid_size – Size of the grid after patchifying.
patch_size – Size of each patch.

Returns:

RGB image of shape (H, W, 3) with values in [0, 1].