Transformations

Overview

The piblin_jax.transform module provides a comprehensive framework for data processing and manipulation in piblin_jax. Built on functional programming principles, the transform system enables composable, reusable, and efficient data pipelines with JAX optimization support.

The transform architecture is designed around several key principles:

Hierarchy-Aware: Transforms operate at different levels of the data hierarchy (Dataset, Measurement, MeasurementSet, etc.). Each transform level knows how to process its corresponding data structure while preserving metadata and relationships.
Composability: Transforms can be chained together using the Pipeline pattern, allowing complex operations to be built from simple, well-tested components. Pipelines themselves are transforms, enabling recursive composition.
JAX Optimization: The framework supports lazy evaluation and JIT compilation for high-performance numerical processing. Transform pipelines can be compiled once and executed efficiently on GPU/TPU devices.
Extensibility: Custom transforms can be created by subclassing base transform classes or by wrapping arbitrary functions using LambdaTransform. The framework also supports dynamic transforms that adapt parameters based on input data.

The module includes built-in transforms for common operations like smoothing, interpolation, baseline correction, normalization, and calculus operations, as well as collection-level transforms for filtering, splitting, and merging datasets.

Quick Examples

Basic Transform Pipeline

Chain multiple transforms together for sequential processing:

from piblin_jax.transform import Pipeline
from piblin_jax.transform.dataset import SmoothingTransform, NormalizeTransform

# Create a pipeline
pipeline = Pipeline([
    SmoothingTransform(window_length=5, polyorder=2),
    NormalizeTransform(method="minmax")
])

# Apply to dataset
processed_dataset = pipeline.apply(dataset)

Lazy Evaluation for Performance

Use lazy pipelines for JIT compilation and deferred execution:

from piblin_jax.transform import LazyPipeline
from piblin_jax.transform.dataset import InterpolateTransform, BaselineTransform

# Create lazy pipeline
lazy_pipeline = LazyPipeline([
    InterpolateTransform(num_points=1000),
    BaselineTransform(method="polynomial", degree=2)
])

# Apply lazily (can be JIT compiled)
result = lazy_pipeline.apply(dataset)
processed = result.compute()  # Trigger computation

Custom Lambda Transforms

Create custom transforms from functions:

from piblin_jax.transform import LambdaTransform
import numpy as np

def custom_processing(dataset):
    # Custom logic
    new_y = np.log10(dataset.y)
    return dataset.replace(y=new_y)

# Wrap as transform
log_transform = LambdaTransform(custom_processing)

# Use in pipeline
pipeline = Pipeline([
    log_transform,
    NormalizeTransform()
])

API Reference

Module Contents

Transform system for piblin-jax.

This module provides the transform framework for data processing: - Base transform classes for each hierarchy level - Pipeline composition for sequential transforms - Lazy evaluation for JAX optimization - JIT compilation support - Region-based transforms for selective processing - Lambda transforms for user-defined functions - Dynamic transforms for data-driven parameters - Core dataset transforms (interpolation, smoothing, normalization, etc.) - Collection-level transforms (filtering, splitting, merging)

Hierarchy: - Transform: Abstract base class - DatasetTransform: Operates on Dataset objects - MeasurementTransform: Operates on Measurement objects - MeasurementSetTransform: Operates on MeasurementSet objects - ExperimentTransform: Operates on Experiment objects - ExperimentSetTransform: Operates on ExperimentSet objects

Pipeline: - Pipeline: Sequential composition of transforms - LazyPipeline: Pipeline with lazy evaluation

Region-Based: - RegionTransform: Base class for region-based transforms - RegionMultiplyTransform: Example region-based transform

Lambda and Dynamic: - LambdaTransform: Wrap arbitrary functions as transforms - DynamicTransform: Base class for data-driven transforms - AutoScaleTransform: Automatic data scaling - AutoBaselineTransform: Automatic baseline correction

Dataset Transforms:

dataset: Module containing core dataset-level transforms (Interpolation, smoothing, baseline correction, normalization, calculus)

Measurement Transforms:

measurement: Module containing collection-level transforms (FilterDatasets, FilterMeasurements, SplitByRegion, MergeReplicates)

class piblin_jax.transform.AutoBaselineTransform(n_points=10, method='first')[source]

Bases: DynamicTransform

Automatically subtract baseline computed from data.

Computes baseline from first/last N points or minimum value, then subtracts it from all data. Useful for removing offsets and drift in experimental measurements.

Parameters:

n_points (int, default 10) – Number of points to use for baseline computation (only used for ‘first’ and ‘last’ methods)
method (str, default 'first') – Method for computing baseline: - ‘first’: Mean of first n_points - ‘last’: Mean of last n_points - ‘min’: Minimum value in data

Examples

>>> from piblin_jax.transform import AutoBaselineTransform
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> import numpy as np
>>>
>>> # Subtract baseline from first 10 points
>>> transform = AutoBaselineTransform(n_points=10, method='first')
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=np.arange(100),
...     dependent_variable_data=np.random.randn(100) + 5.0  # offset by 5
... )
>>> result = transform.apply_to(dataset)
>>>
>>> # Subtract minimum value
>>> transform = AutoBaselineTransform(method='min')

Notes

‘first’ method: Good for time series where initial values are baseline
‘last’ method: Good for measurements that return to baseline
‘min’ method: Good for ensuring all values are non-negative
Computed parameter: ‘baseline’ (value to subtract)

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(n_points=10, method='first')[source]

Initialize auto-baseline transform.

Parameters:

n_points (int) – Number of points for baseline
method (str) – Baseline computation method (‘first’, ‘last’, ‘min’)

class piblin_jax.transform.AutoScaleTransform(target_min=0.0, target_max=1.0)[source]

Bases: DynamicTransform

Automatically scale data to specified range.

Computes min/max from data and scales to target range. Useful for normalizing data to standard ranges like [0, 1] or [-1, 1].

Parameters:

target_min (float, default 0.0) – Target minimum value after scaling
target_max (float, default 1.0) – Target maximum value after scaling

Examples

>>> from piblin_jax.transform import AutoScaleTransform
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> import numpy as np
>>>
>>> # Scale to [0, 1]
>>> transform = AutoScaleTransform()
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=np.array([1, 2, 3]),
...     dependent_variable_data=np.array([10, 20, 30])
... )
>>> result = transform.apply_to(dataset)
>>> # result.dependent_variable_data is now [0, 0.5, 1]
>>>
>>> # Scale to [-1, 1]
>>> transform = AutoScaleTransform(target_min=-1.0, target_max=1.0)

Notes

Handles constant data (where min == max) by setting all values to target_min
Preserves data ordering and relative differences
Computed parameters: ‘scale’ (multiplicative factor) and ‘offset’ (additive shift)

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(target_min=0.0, target_max=1.0)[source]

Initialize auto-scale transform.

Parameters:

target_min (float) – Target minimum value
target_max (float) – Target maximum value

class piblin_jax.transform.DatasetTransform[source]

Bases: Transform[Dataset]

Transform that operates on Dataset objects.

This is the lowest level transform in the hierarchy, operating on individual datasets (1D, 2D, 3D, etc.).

Examples

>>> from piblin_jax.transform.base import DatasetTransform
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> import numpy as np
>>>
>>> class SmoothTransform(DatasetTransform):
...     def _apply(self, dataset):
...         # Smooth y-values with moving average
...         dataset.y_data = np.convolve(
...             dataset.y_data,
...             np.ones(3)/3,
...             mode='same'
...         )
...         return dataset
>>>
>>> dataset = OneDimensionalDataset(
...     x_data=np.array([1, 2, 3, 4, 5]),
...     y_data=np.array([1, 5, 2, 8, 3])
... )
>>> transform = SmoothTransform()
>>> smoothed = transform.apply_to(dataset)

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

apply_to(target, make_copy=True, propagate_uncertainty=False)[source]

Apply transform to Dataset.

Parameters:

target (Dataset) – Dataset to transform
make_copy (bool, default True) – If True, create copy before transforming
propagate_uncertainty (bool, default False) – If True, propagate uncertainty through transform

Returns:

Transformed dataset

Return type:

Dataset

Raises:

TypeError – If target is not a Dataset instance

class piblin_jax.transform.DynamicTransform[source]

Bases: DatasetTransform

Base class for transforms with data-driven parameters.

Dynamic transforms compute parameters from the data itself, then apply transformations using those parameters. This enables adaptive processing where the transformation depends on data characteristics.

Subclasses must implement: - _compute_parameters(dataset): Extract parameters from data - _apply_with_parameters(dataset, params): Apply transformation

Parameters:: None

Examples

>>> from piblin_jax.transform.lambda_transform import DynamicTransform
>>> from piblin_jax.backend import jnp
>>>
>>> class CustomDynamicTransform(DynamicTransform):
...     def _compute_parameters(self, dataset):
...         y_data = dataset._dependent_variable_data
...         return {'mean': jnp.mean(y_data)}
...
...     def _apply_with_parameters(self, dataset, params):
...         dataset._dependent_variable_data -= params['mean']
...         return dataset

Notes

Parameters are computed fresh each time the transform is applied
Caching can be implemented in subclasses if needed
Only works with OneDimensionalDataset

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__()[source]

Initialize dynamic transform.

class piblin_jax.transform.ExperimentSetTransform[source]

Bases: Transform[ExperimentSet]

Transform that operates on ExperimentSet objects.

ExperimentSets are the top-level container for multiple experiments. This transform level can operate across all experiments in a set.

Examples

>>> from piblin_jax.transform.base import ExperimentSetTransform
>>>
>>> class CrossExperimentNormalizeTransform(ExperimentSetTransform):
...     def _apply(self, experiment_set):
...         # Normalize across all experiments
...         global_max = 0
...         for exp in experiment_set.experiments.values():
...             for mset in exp.measurement_sets.values():
...                 for meas in mset.measurements.values():
...                     for ds in meas.datasets.values():
...                         if hasattr(ds, 'y_data'):
...                             global_max = max(global_max, ds.y_data.max())
...
...         for exp in experiment_set.experiments.values():
...             for mset in exp.measurement_sets.values():
...                 for meas in mset.measurements.values():
...                     for ds in meas.datasets.values():
...                         if hasattr(ds, 'y_data'):
...                             ds.y_data = ds.y_data / global_max
...         return experiment_set

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to ExperimentSet.

apply_to(target, make_copy=True, propagate_uncertainty=False)[source]

Apply transform to ExperimentSet.

Parameters:

target (ExperimentSet) – ExperimentSet to transform
make_copy (bool, default True) – If True, create copy before transforming
propagate_uncertainty (bool, default False) – If True, propagate uncertainty through transform

Returns:

Transformed experiment set

Return type:

ExperimentSet

Raises:

TypeError – If target is not an ExperimentSet instance

class piblin_jax.transform.ExperimentTransform[source]

Bases: Transform[Experiment]

Transform that operates on Experiment objects.

Experiments contain multiple measurement sets. This transform level can operate across measurement sets within an experiment.

Examples

>>> from piblin_jax.transform.base import ExperimentTransform
>>>
>>> class TemperatureCorrectionTransform(ExperimentTransform):
...     def _apply(self, experiment):
...         # Apply temperature correction to all measurement sets
...         temp = experiment.metadata.get('temperature', 300)
...         correction = 1.0 + (temp - 300) * 0.001
...
...         for mset in experiment.measurement_sets.values():
...             for meas in mset.measurements.values():
...                 for ds in meas.datasets.values():
...                     if hasattr(ds, 'y_data'):
...                         ds.y_data = ds.y_data * correction
...         return experiment

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Experiment.

apply_to(target, make_copy=True, propagate_uncertainty=False)[source]

Apply transform to Experiment.

Parameters:

target (Experiment) – Experiment to transform
make_copy (bool, default True) – If True, create copy before transforming
propagate_uncertainty (bool, default False) – If True, propagate uncertainty through transform

Returns:

Transformed experiment

Return type:

Experiment

Raises:

TypeError – If target is not an Experiment instance

class piblin_jax.transform.LambdaTransform(func=None, use_x=False, jit_compile=True, lambda_func=None)[source]

Bases: DatasetTransform

Transform that wraps an arbitrary function.

Allows users to create custom transforms from simple functions without defining new classes. Supports both simple functions that operate on the dependent variable only, and functions that use both independent and dependent variables.

Parameters:

func (Callable) – Function to apply to dependent variable. Signature: func(y_data: ndarray) -> ndarray Or: func(x_data: ndarray, y_data: ndarray) -> ndarray
use_x (bool, default False) – If True, pass both x and y to func. If False, pass only y to func.
jit_compile (bool, default True) – If True, attempt JIT compilation of the function. Falls back to regular function if compilation fails.

Examples

>>> from piblin_jax.transform import LambdaTransform
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> import numpy as np
>>>
>>> # Simple function
>>> transform = LambdaTransform(lambda y: y * 2.0)
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=np.array([1, 2, 3]),
...     dependent_variable_data=np.array([2, 4, 6])
... )
>>> result = transform.apply_to(dataset)
>>>
>>> # Function using x and y
>>> transform = LambdaTransform(
...     lambda x, y: y / x.max(),
...     use_x=True
... )
>>>
>>> # JAX-compatible function for JIT
>>> import jax.numpy as jnp
>>> transform = LambdaTransform(
...     lambda y: jnp.exp(y) * jnp.sin(y),
...     jit_compile=True
... )

Notes

Functions should use jnp instead of np for JIT compilation
JIT compilation improves performance but requires JAX-compatible code
If compilation fails, the transform falls back to the uncompiled function
Only works with OneDimensionalDataset

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(func=None, use_x=False, jit_compile=True, lambda_func=None)[source]

Initialize lambda transform.

Parameters:

func (Callable, optional) – Function to wrap as transform (works on arrays)
use_x (bool) – If True, pass both x and y to func
jit_compile (bool) – If True, attempt JIT compilation
lambda_func (Callable, optional) – Alias for func (for piblin compatibility)

Raises:

TypeError – If func is not callable
ValueError – If neither func nor lambda_func is provided

Notes

The function should operate on arrays, not dataset objects. For example: lambda y: y * 2.0 not lambda ds: ds.dependent_variable_data * 2.0

class piblin_jax.transform.LazyPipeline(transforms=None)[source]

Bases: Pipeline[T]

Pipeline with lazy evaluation support.

Unlike the standard Pipeline, LazyPipeline defers computation until the results are actually accessed. This allows JAX to optimize the entire computation graph as a single operation.

Lazy evaluation is triggered on: - Property access (e.g., result.y_data) - Method calls (e.g., result.visualize()) - Export operations (e.g., result.export())

Parameters:: transforms (list[Transform], optional) – Initial list of transforms to include in pipeline

Examples

>>> from piblin_jax.transform import LazyPipeline
>>>
>>> # Create lazy pipeline
>>> pipeline = LazyPipeline([
...     MultiplyTransform(2.0),
...     MultiplyTransform(3.0),
... ])
>>>
>>> # Apply to dataset (computation deferred)
>>> lazy_result = pipeline.apply_to(dataset, make_copy=True)
>>>
>>> # Access property (triggers computation)
>>> y_values = lazy_result.y_data  # Computation happens here

Notes

Lazy evaluation allows JAX to optimize the entire pipeline
First property access triggers computation and caches result
Subsequent accesses use cached result
More efficient than eager evaluation for complex pipelines

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`append`(transform)	Add transform to end of pipeline.
`apply_to`(target[, make_copy, ...])	Apply lazy pipeline to target.
`clear`()
`count`(value)
`extend`(values)	S.extend(iterable) -- extend sequence by appending elements from the iterable
`index`(value, [start, [stop]])	Raises ValueError if the value is not present.
`insert`(index, value)	Insert transform at index.
`invalidate_cache`()	Invalidate cached results.
`pop`([index])	Raise IndexError if list is empty or index is out of range.
`remove`(value)	S.remove(value) -- remove first occurrence of value.
`reverse`()	S.reverse() -- reverse IN PLACE

__init__(transforms=None)[source]

Initialize lazy pipeline.

Parameters:: transforms (list[Transform], optional) – Initial transforms to include in pipeline

apply_to(target, make_copy=True, propagate_uncertainty=False)[source]

Apply lazy pipeline to target.

Computation is deferred until results are accessed. Returns a LazyResult wrapper that triggers computation on property/method access.

Parameters:

target (T) – Data structure to transform
make_copy (bool, default True) – If True, create copy before transforming
propagate_uncertainty (bool, default False) – If True, propagate uncertainty through all transforms

Returns:

Wrapper that triggers computation on access

Return type:

LazyResult

Notes

The actual transformation is not performed until the result is accessed. This allows JAX to optimize the entire computation graph.

invalidate_cache()[source]

Invalidate cached results.

Forces recomputation on next access. Useful if transforms have been modified or parameters changed.

Examples

>>> pipeline = LazyPipeline([transform1, transform2])
>>> result = pipeline.apply_to(dataset)
>>> _ = result.y_data  # Triggers computation
>>>
>>> # Modify pipeline
>>> pipeline.append(transform3)
>>> pipeline.invalidate_cache()  # Force recomputation

class piblin_jax.transform.LazyResult(pipeline)[source]

Bases: object

Wrapper that triggers lazy computation on property access.

This class wraps the actual result and defers computation until properties or methods are accessed.

Parameters:: pipeline (LazyPipeline) – The lazy pipeline that will compute the result

Examples

>>> lazy_result = LazyResult(pipeline)
>>> # No computation yet
>>> y = lazy_result.y_data  # Triggers computation here
>>> # Subsequent accesses use cached result
>>> x = lazy_result.x_data  # No recomputation

Notes

This class is transparent to the user - it behaves like the actual result object, but triggers computation on first access.

__getattr__(name)[source]

Get attribute from computed result.

Triggers computation on first access.

Parameters:: name (str) – Attribute name
Returns:: Attribute value from computed result
Return type:: Any

__init__(pipeline)[source]

Initialize lazy result wrapper.

Parameters:: pipeline (LazyPipeline) – Pipeline that will compute the result

__repr__()[source]

String representation.

__setattr__(name, value)[source]

Set attribute on computed result.

Triggers computation if not already done.

Parameters:

name (str) – Attribute name
value (Any) – Attribute value

class piblin_jax.transform.MeasurementSetTransform[source]

Bases: Transform[MeasurementSet]

Transform that operates on MeasurementSet objects.

MeasurementSets contain multiple measurements. This transform level can operate across measurements (e.g., normalization relative to the entire set).

Examples

>>> from piblin_jax.transform.base import MeasurementSetTransform
>>>
>>> class GlobalNormalizeTransform(MeasurementSetTransform):
...     def _apply(self, measurement_set):
...         # Find global max across all measurements
...         global_max = 0
...         for meas in measurement_set.measurements.values():
...             for ds in meas.datasets.values():
...                 if hasattr(ds, 'y_data'):
...                     global_max = max(global_max, ds.y_data.max())
...
...         # Normalize all datasets by global max
...         for meas in measurement_set.measurements.values():
...             for ds in meas.datasets.values():
...                 if hasattr(ds, 'y_data'):
...                     ds.y_data = ds.y_data / global_max
...         return measurement_set

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to MeasurementSet.

apply_to(target, make_copy=True, propagate_uncertainty=False)[source]

Apply transform to MeasurementSet.

Parameters:

target (MeasurementSet) – MeasurementSet to transform
make_copy (bool, default True) – If True, create copy before transforming
propagate_uncertainty (bool, default False) – If True, propagate uncertainty through transform

Returns:

Transformed measurement set

Return type:

MeasurementSet

Raises:

TypeError – If target is not a MeasurementSet instance

class piblin_jax.transform.MeasurementTransform[source]

Bases: Transform[Measurement]

Transform that operates on Measurement objects.

Measurements contain multiple datasets with associated metadata. This transform level can operate on all datasets within a measurement.

Examples

>>> from piblin_jax.transform.base import MeasurementTransform
>>>
>>> class NormalizeTransform(MeasurementTransform):
...     def _apply(self, measurement):
...         # Normalize all datasets in measurement
...         for dataset in measurement.datasets.values():
...             if hasattr(dataset, 'y_data'):
...                 max_val = dataset.y_data.max()
...                 dataset.y_data = dataset.y_data / max_val
...         return measurement

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Measurement.

apply_to(target, make_copy=True, propagate_uncertainty=False)[source]

Apply transform to Measurement.

Parameters:

target (Measurement) – Measurement to transform
make_copy (bool, default True) – If True, create copy before transforming
propagate_uncertainty (bool, default False) – If True, propagate uncertainty through transform

Returns:

Transformed measurement

Return type:

Measurement

Raises:

TypeError – If target is not a Measurement instance

class piblin_jax.transform.Pipeline(transforms=None)[source]

Bases: Transform[T], MutableSequence[Transform[T]]

Pipeline for composing multiple transforms sequentially.

A pipeline applies a sequence of transforms to data in order. It implements the MutableSequence interface, so it can be used like a list of transforms.

The pipeline is memory-efficient: when make_copy=True, it creates a single copy at entry, then applies all transforms in-place.

Parameters:: transforms (list[Transform], optional) – Initial list of transforms to include in pipeline

Examples

>>> from piblin_jax.transform import Pipeline, DatasetTransform
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> import numpy as np
>>>
>>> # Create transforms
>>> class MultiplyTransform(DatasetTransform):
...     def __init__(self, factor):
...         super().__init__()
...         self.factor = factor
...
...     def _apply(self, dataset):
...         dataset.y_data = dataset.y_data * self.factor
...         return dataset
>>>
>>> # Create pipeline
>>> pipeline = Pipeline([
...     MultiplyTransform(2.0),
...     MultiplyTransform(3.0),  # Net effect: 6x
... ])
>>>
>>> # Apply to dataset
>>> dataset = OneDimensionalDataset(
...     x_data=np.array([1, 2, 3]),
...     y_data=np.array([2, 4, 6])
... )
>>> result = pipeline.apply_to(dataset, make_copy=True)
>>> # result.y_data is now [12, 24, 36]

Notes

Pipelines can be nested: a pipeline can contain other pipelines
Only one copy is made at entry, then all transforms apply in-place
This is much more memory efficient than copying at each step
Use lazy evaluation for even better performance with JAX

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`append`(transform)	Add transform to end of pipeline.
`apply_to`(target[, make_copy, ...])	Apply pipeline to target.
`clear`()
`count`(value)
`extend`(values)	S.extend(iterable) -- extend sequence by appending elements from the iterable
`index`(value, [start, [stop]])	Raises ValueError if the value is not present.
`insert`(index, value)	Insert transform at index.
`pop`([index])	Raise IndexError if list is empty or index is out of range.
`remove`(value)	S.remove(value) -- remove first occurrence of value.
`reverse`()	S.reverse() -- reverse IN PLACE

__delitem__(index)[source]

Delete transform(s) at index.

Parameters:: index (int or slice) – Index or slice to delete

Examples

>>> pipeline = Pipeline([t1, t2, t3])
>>> del pipeline[0]  # Remove first transform
>>> del pipeline[1:]  # Remove all but first transform

__getitem__(index)[source]

Get transform(s) at index.

Parameters:: index (int or slice) – Index or slice to retrieve
Returns:: Transform at index, or list of transforms for slice
Return type:: Transform or list[Transform]

Examples

>>> pipeline = Pipeline([t1, t2, t3])
>>> pipeline[0]  # Get first transform
>>> pipeline[1:3]  # Get slice of transforms

__init__(transforms=None)[source]

Initialize pipeline.

Parameters:: transforms (list[Transform], optional) – Initial transforms to include in pipeline

__len__()[source]

Get number of transforms in pipeline.

Returns:: Number of transforms
Return type:: int

Examples

>>> pipeline = Pipeline([t1, t2, t3])
>>> len(pipeline)
3

__repr__()[source]

String representation of pipeline.

Returns:: String representation showing number of transforms
Return type:: str

Examples

>>> pipeline = Pipeline([t1, t2, t3])
>>> repr(pipeline)
'Pipeline(3 transforms)'

__setitem__(index, value)[source]

Set transform(s) at index.

Parameters:

index (int or slice) – Index or slice to set
value (Transform or list[Transform]) – Transform(s) to set at index

Raises:

TypeError – If value is not a Transform instance

Examples

>>> pipeline = Pipeline([t1, t2, t3])
>>> pipeline[0] = new_transform  # Replace first transform

__str__()[source]

Human-readable string representation.

Returns:: String showing all transforms in pipeline
Return type:: str

append(transform)[source]

Add transform to end of pipeline.

Parameters:: transform (Transform) – Transform to append
Raises:: TypeError – If transform is not a Transform instance

Examples

>>> pipeline = Pipeline([t1, t2])
>>> pipeline.append(t3)  # Add t3 to end

apply_to(target, make_copy=True, propagate_uncertainty=False)[source]

Apply pipeline to target.

Only makes copy once at entry, then applies all transforms in-place for memory efficiency.

Parameters:

target (T) – Data structure to transform
make_copy (bool, default True) – If True, create one copy at entry before applying transforms
propagate_uncertainty (bool, default False) – If True and target has uncertainty, propagate through all transforms

Returns:

Transformed data structure

Return type:

T

Notes

This is much more efficient than copying at each transform step. The single copy at entry ensures immutability while minimizing memory overhead.

When propagate_uncertainty=True, uncertainty is efficiently propagated through the entire pipeline in a single pass.

insert(index, value)[source]

Insert transform at index.

Parameters:

index (int) – Index at which to insert transform
value (Transform) – Transform to insert

Raises:

TypeError – If value is not a Transform instance

Examples

>>> pipeline = Pipeline([t1, t3])
>>> pipeline.insert(1, t2)  # Insert t2 between t1 and t3

class piblin_jax.transform.RegionMultiplyTransform(region, factor)[source]

Bases: RegionTransform

Example transform: Multiply region by a factor.

This is a concrete implementation of RegionTransform that multiplies the dependent variable within the specified region(s) by a constant factor.

Parameters:

region (LinearRegion | CompoundRegion) – Region(s) to transform
factor (float) – Multiplication factor

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.data.roi import LinearRegion, CompoundRegion
>>> from piblin_jax.transform.region import RegionMultiplyTransform
>>> # Single region example
>>> x_data = np.linspace(0, 10, 11)
>>> y_data = np.ones(11)
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x_data,
...     dependent_variable_data=y_data
... )
>>> region = LinearRegion(x_min=3.0, x_max=7.0)
>>> transform = RegionMultiplyTransform(region, factor=2.0)
>>> result = transform.apply_to(dataset, make_copy=True)
>>> # Points in [3, 7] are multiplied by 2.0, others unchanged
>>> # Multiple disjoint regions example
>>> region1 = LinearRegion(x_min=1.0, x_max=2.0)
>>> region2 = LinearRegion(x_min=8.0, x_max=9.0)
>>> compound = CompoundRegion([region1, region2])
>>> transform = RegionMultiplyTransform(compound, factor=0.5)
>>> result = transform.apply_to(dataset, make_copy=True)
>>> # Points in [1, 2] OR [8, 9] are multiplied by 0.5

Notes

This is a simple example transform for demonstration and testing. More complex transforms can be implemented following the same pattern.

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(region, factor)[source]

Initialize RegionMultiplyTransform.

Parameters:

region (LinearRegion | CompoundRegion) – Region(s) to transform
factor (float) – Multiplication factor

class piblin_jax.transform.RegionTransform(region)[source]

Bases: DatasetTransform

Base class for transforms that operate on specific regions.

RegionTransform applies a transformation only within specified region(s), preserving data outside the regions. This enables selective processing of data based on independent variable ranges.

Subclasses should implement the _apply_to_region() method to define the specific transformation to apply within the region(s).

Parameters:: region (LinearRegion | CompoundRegion) – Region(s) to transform

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.data.roi import LinearRegion
>>> from piblin_jax.transform.region import RegionMultiplyTransform
>>> # Create dataset
>>> x_data = np.array([0, 1, 2, 3, 4, 5])
>>> y_data = np.array([1, 1, 1, 1, 1, 1])
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x_data,
...     dependent_variable_data=y_data
... )
>>> # Define region and transform
>>> region = LinearRegion(x_min=2.0, x_max=4.0)
>>> transform = RegionMultiplyTransform(region, factor=2.0)
>>> # Apply transform (only region [2, 4] is multiplied)
>>> result = transform.apply_to(dataset, make_copy=True)
>>> result.dependent_variable_data
array([1., 1., 2., 2., 2., 1.])

Notes

Currently optimized for OneDimensionalDataset
Data outside regions is preserved exactly
Transformations use NumPy arrays internally for compatibility
Region masks are generated from the independent variable

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(region)[source]

Initialize RegionTransform.

Parameters:: region (LinearRegion | CompoundRegion) – Region(s) to transform

class piblin_jax.transform.Transform[source]

Bases: ABC, Generic

Abstract base class for all transforms.

Transforms operate on data structures at various hierarchy levels: - Dataset level: individual measurements - Measurement level: collections of datasets - MeasurementSet level: collections of measurements - Experiment level: collections of measurement sets - ExperimentSet level: collections of experiments

Transforms support: - Lazy evaluation: computation deferred until results accessed - JIT compilation: automatic compilation with JAX backend - Immutability: optional copying for functional programming style - Pipeline composition: chaining multiple transforms

Examples

>>> from piblin_jax.transform.base import DatasetTransform
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> import numpy as np
>>>
>>> class MultiplyTransform(DatasetTransform):
...     def __init__(self, factor):
...         super().__init__()
...         self.factor = factor
...
...     def _apply(self, dataset):
...         dataset.y_data = dataset.y_data * self.factor
...         return dataset
>>>
>>> dataset = OneDimensionalDataset(
...     x_data=np.array([1, 2, 3]),
...     y_data=np.array([2, 4, 6])
... )
>>> transform = MultiplyTransform(2.0)
>>> result = transform.apply_to(dataset, make_copy=True)

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to target data structure.

__call__(target, make_copy=True, propagate_uncertainty=False)[source]

Shorthand for apply_to.

Allows using transform objects as callables: >>> result = transform(data)

instead of: >>> result = transform.apply_to(data)

Parameters:

target (T) – Data structure to transform
make_copy (bool, default True) – If True, create copy before transforming
propagate_uncertainty (bool, default False) – If True, propagate uncertainty through transform

Returns:

Transformed data structure

Return type:

T

__init__()[source]

Initialize transform with default settings.

apply_to(target, make_copy=True, propagate_uncertainty=False)[source]

Apply transform to target data structure.

This is the main public interface for applying transforms. It handles copying (if requested) and delegates to the subclass-specific _apply method.

Parameters:

target (T) – Data structure to transform (Dataset, Measurement, etc.)
make_copy (bool, default True) – If True, create a deep copy before transforming (default). If False, transform in-place (more memory efficient but modifies the original object).
propagate_uncertainty (bool, default False) – If True and target has uncertainty samples, propagate uncertainty through the transform using Monte Carlo sampling.

Returns:

Transformed data structure

Return type:

T

Examples

>>> transform = MyTransform()
>>> # Create copy and transform
>>> result = transform.apply_to(data, make_copy=True)
>>> # Transform in-place (memory efficient)
>>> result = transform.apply_to(data, make_copy=False)
>>> # With uncertainty propagation
>>> result = transform.apply_to(data_with_unc, propagate_uncertainty=True)

Notes

make_copy=True ensures functional programming style (immutability)
make_copy=False is more memory efficient for large datasets
In pipelines, only the first copy is made at entry
Uncertainty propagation applies the transform to each uncertainty sample

piblin_jax.transform.jit_transform(func)[source]

Decorator to enable JIT compilation for transform _apply methods.

This decorator automatically compiles transform methods using JAX’s JIT compiler when the JAX backend is available. For NumPy backend, it gracefully falls back to the uncompiled function.

Parameters:: func (Callable) – The _apply method to compile
Returns:: JIT-compiled function (JAX) or original function (NumPy)
Return type:: Callable

Examples

>>> class MyTransform(DatasetTransform):
...     @jit_transform
...     def _apply(self, dataset):
...         # This will be JIT compiled with JAX
...         dataset.y_data = dataset.y_data * 2.0
...         return dataset

Notes

JIT compilation can significantly improve performance
Only works with JAX backend (graceful fallback for NumPy)
First call may be slow (compilation), subsequent calls are fast
Static arguments should be marked appropriately

Base Transform Classes

Transform base classes for piblin-jax.

This module provides the abstract base class and hierarchy for all transforms: - Transform: Abstract base class for all transforms - DatasetTransform: Operates on Dataset objects - MeasurementTransform: Operates on Measurement objects - MeasurementSetTransform: Operates on MeasurementSet objects - ExperimentTransform: Operates on Experiment objects - ExperimentSetTransform: Operates on ExperimentSet objects

Transforms support: - Lazy evaluation (computation deferred until needed) - JIT compilation (via JAX backend) - Immutability (via make_copy parameter) - Pipeline composition (via Pipeline class)

class piblin_jax.transform.base.DatasetTransform[source]

Bases: Transform[Dataset]

Transform that operates on Dataset objects.

This is the lowest level transform in the hierarchy, operating on individual datasets (1D, 2D, 3D, etc.).

Examples

>>> from piblin_jax.transform.base import DatasetTransform
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> import numpy as np
>>>
>>> class SmoothTransform(DatasetTransform):
...     def _apply(self, dataset):
...         # Smooth y-values with moving average
...         dataset.y_data = np.convolve(
...             dataset.y_data,
...             np.ones(3)/3,
...             mode='same'
...         )
...         return dataset
>>>
>>> dataset = OneDimensionalDataset(
...     x_data=np.array([1, 2, 3, 4, 5]),
...     y_data=np.array([1, 5, 2, 8, 3])
... )
>>> transform = SmoothTransform()
>>> smoothed = transform.apply_to(dataset)

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

apply_to(target, make_copy=True, propagate_uncertainty=False)[source]

Apply transform to Dataset.

Parameters:

target (Dataset) – Dataset to transform
make_copy (bool, default True) – If True, create copy before transforming
propagate_uncertainty (bool, default False) – If True, propagate uncertainty through transform

Returns:

Transformed dataset

Return type:

Dataset

Raises:

TypeError – If target is not a Dataset instance

class piblin_jax.transform.base.ExperimentSetTransform[source]

Bases: Transform[ExperimentSet]

Transform that operates on ExperimentSet objects.

ExperimentSets are the top-level container for multiple experiments. This transform level can operate across all experiments in a set.

Examples

>>> from piblin_jax.transform.base import ExperimentSetTransform
>>>
>>> class CrossExperimentNormalizeTransform(ExperimentSetTransform):
...     def _apply(self, experiment_set):
...         # Normalize across all experiments
...         global_max = 0
...         for exp in experiment_set.experiments.values():
...             for mset in exp.measurement_sets.values():
...                 for meas in mset.measurements.values():
...                     for ds in meas.datasets.values():
...                         if hasattr(ds, 'y_data'):
...                             global_max = max(global_max, ds.y_data.max())
...
...         for exp in experiment_set.experiments.values():
...             for mset in exp.measurement_sets.values():
...                 for meas in mset.measurements.values():
...                     for ds in meas.datasets.values():
...                         if hasattr(ds, 'y_data'):
...                             ds.y_data = ds.y_data / global_max
...         return experiment_set

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to ExperimentSet.

apply_to(target, make_copy=True, propagate_uncertainty=False)[source]

Apply transform to ExperimentSet.

Parameters:

target (ExperimentSet) – ExperimentSet to transform
make_copy (bool, default True) – If True, create copy before transforming
propagate_uncertainty (bool, default False) – If True, propagate uncertainty through transform

Returns:

Transformed experiment set

Return type:

ExperimentSet

Raises:

TypeError – If target is not an ExperimentSet instance

class piblin_jax.transform.base.ExperimentTransform[source]

Bases: Transform[Experiment]

Transform that operates on Experiment objects.

Experiments contain multiple measurement sets. This transform level can operate across measurement sets within an experiment.

Examples

>>> from piblin_jax.transform.base import ExperimentTransform
>>>
>>> class TemperatureCorrectionTransform(ExperimentTransform):
...     def _apply(self, experiment):
...         # Apply temperature correction to all measurement sets
...         temp = experiment.metadata.get('temperature', 300)
...         correction = 1.0 + (temp - 300) * 0.001
...
...         for mset in experiment.measurement_sets.values():
...             for meas in mset.measurements.values():
...                 for ds in meas.datasets.values():
...                     if hasattr(ds, 'y_data'):
...                         ds.y_data = ds.y_data * correction
...         return experiment

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Experiment.

apply_to(target, make_copy=True, propagate_uncertainty=False)[source]

Apply transform to Experiment.

Parameters:

target (Experiment) – Experiment to transform
make_copy (bool, default True) – If True, create copy before transforming
propagate_uncertainty (bool, default False) – If True, propagate uncertainty through transform

Returns:

Transformed experiment

Return type:

Experiment

Raises:

TypeError – If target is not an Experiment instance

class piblin_jax.transform.base.MeasurementSetTransform[source]

Bases: Transform[MeasurementSet]

Transform that operates on MeasurementSet objects.

MeasurementSets contain multiple measurements. This transform level can operate across measurements (e.g., normalization relative to the entire set).

Examples

>>> from piblin_jax.transform.base import MeasurementSetTransform
>>>
>>> class GlobalNormalizeTransform(MeasurementSetTransform):
...     def _apply(self, measurement_set):
...         # Find global max across all measurements
...         global_max = 0
...         for meas in measurement_set.measurements.values():
...             for ds in meas.datasets.values():
...                 if hasattr(ds, 'y_data'):
...                     global_max = max(global_max, ds.y_data.max())
...
...         # Normalize all datasets by global max
...         for meas in measurement_set.measurements.values():
...             for ds in meas.datasets.values():
...                 if hasattr(ds, 'y_data'):
...                     ds.y_data = ds.y_data / global_max
...         return measurement_set

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to MeasurementSet.

apply_to(target, make_copy=True, propagate_uncertainty=False)[source]

Apply transform to MeasurementSet.

Parameters:

target (MeasurementSet) – MeasurementSet to transform
make_copy (bool, default True) – If True, create copy before transforming
propagate_uncertainty (bool, default False) – If True, propagate uncertainty through transform

Returns:

Transformed measurement set

Return type:

MeasurementSet

Raises:

TypeError – If target is not a MeasurementSet instance

class piblin_jax.transform.base.MeasurementTransform[source]

Bases: Transform[Measurement]

Transform that operates on Measurement objects.

Measurements contain multiple datasets with associated metadata. This transform level can operate on all datasets within a measurement.

Examples

>>> from piblin_jax.transform.base import MeasurementTransform
>>>
>>> class NormalizeTransform(MeasurementTransform):
...     def _apply(self, measurement):
...         # Normalize all datasets in measurement
...         for dataset in measurement.datasets.values():
...             if hasattr(dataset, 'y_data'):
...                 max_val = dataset.y_data.max()
...                 dataset.y_data = dataset.y_data / max_val
...         return measurement

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Measurement.

apply_to(target, make_copy=True, propagate_uncertainty=False)[source]

Apply transform to Measurement.

Parameters:

target (Measurement) – Measurement to transform
make_copy (bool, default True) – If True, create copy before transforming
propagate_uncertainty (bool, default False) – If True, propagate uncertainty through transform

Returns:

Transformed measurement

Return type:

Measurement

Raises:

TypeError – If target is not a Measurement instance

class piblin_jax.transform.base.Transform[source]

Bases: ABC, Generic

Abstract base class for all transforms.

Transforms operate on data structures at various hierarchy levels: - Dataset level: individual measurements - Measurement level: collections of datasets - MeasurementSet level: collections of measurements - Experiment level: collections of measurement sets - ExperimentSet level: collections of experiments

Transforms support: - Lazy evaluation: computation deferred until results accessed - JIT compilation: automatic compilation with JAX backend - Immutability: optional copying for functional programming style - Pipeline composition: chaining multiple transforms

Examples

>>> from piblin_jax.transform.base import DatasetTransform
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> import numpy as np
>>>
>>> class MultiplyTransform(DatasetTransform):
...     def __init__(self, factor):
...         super().__init__()
...         self.factor = factor
...
...     def _apply(self, dataset):
...         dataset.y_data = dataset.y_data * self.factor
...         return dataset
>>>
>>> dataset = OneDimensionalDataset(
...     x_data=np.array([1, 2, 3]),
...     y_data=np.array([2, 4, 6])
... )
>>> transform = MultiplyTransform(2.0)
>>> result = transform.apply_to(dataset, make_copy=True)

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to target data structure.

__init__()[source]

Initialize transform with default settings.

apply_to(target, make_copy=True, propagate_uncertainty=False)[source]

Apply transform to target data structure.

This is the main public interface for applying transforms. It handles copying (if requested) and delegates to the subclass-specific _apply method.

Parameters:

target (T) – Data structure to transform (Dataset, Measurement, etc.)
make_copy (bool, default True) – If True, create a deep copy before transforming (default). If False, transform in-place (more memory efficient but modifies the original object).
propagate_uncertainty (bool, default False) – If True and target has uncertainty samples, propagate uncertainty through the transform using Monte Carlo sampling.

Returns:

Transformed data structure

Return type:

T

Examples

>>> transform = MyTransform()
>>> # Create copy and transform
>>> result = transform.apply_to(data, make_copy=True)
>>> # Transform in-place (memory efficient)
>>> result = transform.apply_to(data, make_copy=False)
>>> # With uncertainty propagation
>>> result = transform.apply_to(data_with_unc, propagate_uncertainty=True)

Notes

make_copy=True ensures functional programming style (immutability)
make_copy=False is more memory efficient for large datasets
In pipelines, only the first copy is made at entry
Uncertainty propagation applies the transform to each uncertainty sample

__call__(target, make_copy=True, propagate_uncertainty=False)[source]

Shorthand for apply_to.

Allows using transform objects as callables: >>> result = transform(data)

instead of: >>> result = transform.apply_to(data)

Parameters:

target (T) – Data structure to transform
make_copy (bool, default True) – If True, create copy before transforming
propagate_uncertainty (bool, default False) – If True, propagate uncertainty through transform

Returns:

Transformed data structure

Return type:

T

piblin_jax.transform.base.jit_transform(func)[source]

Decorator to enable JIT compilation for transform _apply methods.

This decorator automatically compiles transform methods using JAX’s JIT compiler when the JAX backend is available. For NumPy backend, it gracefully falls back to the uncompiled function.

Parameters:: func (Callable) – The _apply method to compile
Returns:: JIT-compiled function (JAX) or original function (NumPy)
Return type:: Callable

Examples

>>> class MyTransform(DatasetTransform):
...     @jit_transform
...     def _apply(self, dataset):
...         # This will be JIT compiled with JAX
...         dataset.y_data = dataset.y_data * 2.0
...         return dataset

Notes

JIT compilation can significantly improve performance
Only works with JAX backend (graceful fallback for NumPy)
First call may be slow (compilation), subsequent calls are fast
Static arguments should be marked appropriately

Pipeline

Pipeline composition for transforms.

This module provides pipeline functionality for composing multiple transforms: - Pipeline: Sequential composition of transforms - LazyPipeline: Pipeline with lazy evaluation support

Pipelines support: - MutableSequence interface (list-like operations) - Sequential transform application - Single copy at entry (memory efficient) - Lazy evaluation (computation deferred) - JIT compilation (entire pipeline)

class piblin_jax.transform.pipeline.LazyPipeline(transforms=None)[source]

Bases: Pipeline[T]

Pipeline with lazy evaluation support.

Unlike the standard Pipeline, LazyPipeline defers computation until the results are actually accessed. This allows JAX to optimize the entire computation graph as a single operation.

Lazy evaluation is triggered on: - Property access (e.g., result.y_data) - Method calls (e.g., result.visualize()) - Export operations (e.g., result.export())

Parameters:: transforms (list[Transform], optional) – Initial list of transforms to include in pipeline

Examples

>>> from piblin_jax.transform import LazyPipeline
>>>
>>> # Create lazy pipeline
>>> pipeline = LazyPipeline([
...     MultiplyTransform(2.0),
...     MultiplyTransform(3.0),
... ])
>>>
>>> # Apply to dataset (computation deferred)
>>> lazy_result = pipeline.apply_to(dataset, make_copy=True)
>>>
>>> # Access property (triggers computation)
>>> y_values = lazy_result.y_data  # Computation happens here

Notes

Lazy evaluation allows JAX to optimize the entire pipeline
First property access triggers computation and caches result
Subsequent accesses use cached result
More efficient than eager evaluation for complex pipelines

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`append`(transform)	Add transform to end of pipeline.
`apply_to`(target[, make_copy, ...])	Apply lazy pipeline to target.
`clear`()
`count`(value)
`extend`(values)	S.extend(iterable) -- extend sequence by appending elements from the iterable
`index`(value, [start, [stop]])	Raises ValueError if the value is not present.
`insert`(index, value)	Insert transform at index.
`invalidate_cache`()	Invalidate cached results.
`pop`([index])	Raise IndexError if list is empty or index is out of range.
`remove`(value)	S.remove(value) -- remove first occurrence of value.
`reverse`()	S.reverse() -- reverse IN PLACE

__init__(transforms=None)[source]

Initialize lazy pipeline.

Parameters:: transforms (list[Transform], optional) – Initial transforms to include in pipeline

apply_to(target, make_copy=True, propagate_uncertainty=False)[source]

Apply lazy pipeline to target.

Computation is deferred until results are accessed. Returns a LazyResult wrapper that triggers computation on property/method access.

Parameters:

target (T) – Data structure to transform
make_copy (bool, default True) – If True, create copy before transforming
propagate_uncertainty (bool, default False) – If True, propagate uncertainty through all transforms

Returns:

Wrapper that triggers computation on access

Return type:

LazyResult

Notes

The actual transformation is not performed until the result is accessed. This allows JAX to optimize the entire computation graph.

invalidate_cache()[source]

Invalidate cached results.

Forces recomputation on next access. Useful if transforms have been modified or parameters changed.

Examples

>>> pipeline = LazyPipeline([transform1, transform2])
>>> result = pipeline.apply_to(dataset)
>>> _ = result.y_data  # Triggers computation
>>>
>>> # Modify pipeline
>>> pipeline.append(transform3)
>>> pipeline.invalidate_cache()  # Force recomputation

class piblin_jax.transform.pipeline.LazyResult(pipeline)[source]

Bases: object

Wrapper that triggers lazy computation on property access.

This class wraps the actual result and defers computation until properties or methods are accessed.

Parameters:: pipeline (LazyPipeline) – The lazy pipeline that will compute the result

Examples

>>> lazy_result = LazyResult(pipeline)
>>> # No computation yet
>>> y = lazy_result.y_data  # Triggers computation here
>>> # Subsequent accesses use cached result
>>> x = lazy_result.x_data  # No recomputation

Notes

This class is transparent to the user - it behaves like the actual result object, but triggers computation on first access.

__init__(pipeline)[source]

Initialize lazy result wrapper.

Parameters:: pipeline (LazyPipeline) – Pipeline that will compute the result

__getattr__(name)[source]

Get attribute from computed result.

Triggers computation on first access.

Parameters:: name (str) – Attribute name
Returns:: Attribute value from computed result
Return type:: Any

__setattr__(name, value)[source]

Set attribute on computed result.

Triggers computation if not already done.

Parameters:

name (str) – Attribute name
value (Any) – Attribute value

__repr__()[source]

String representation.

class piblin_jax.transform.pipeline.Pipeline(transforms=None)[source]

Bases: Transform[T], MutableSequence[Transform[T]]

Pipeline for composing multiple transforms sequentially.

A pipeline applies a sequence of transforms to data in order. It implements the MutableSequence interface, so it can be used like a list of transforms.

The pipeline is memory-efficient: when make_copy=True, it creates a single copy at entry, then applies all transforms in-place.

Parameters:: transforms (list[Transform], optional) – Initial list of transforms to include in pipeline

Examples

>>> from piblin_jax.transform import Pipeline, DatasetTransform
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> import numpy as np
>>>
>>> # Create transforms
>>> class MultiplyTransform(DatasetTransform):
...     def __init__(self, factor):
...         super().__init__()
...         self.factor = factor
...
...     def _apply(self, dataset):
...         dataset.y_data = dataset.y_data * self.factor
...         return dataset
>>>
>>> # Create pipeline
>>> pipeline = Pipeline([
...     MultiplyTransform(2.0),
...     MultiplyTransform(3.0),  # Net effect: 6x
... ])
>>>
>>> # Apply to dataset
>>> dataset = OneDimensionalDataset(
...     x_data=np.array([1, 2, 3]),
...     y_data=np.array([2, 4, 6])
... )
>>> result = pipeline.apply_to(dataset, make_copy=True)
>>> # result.y_data is now [12, 24, 36]

Notes

Pipelines can be nested: a pipeline can contain other pipelines
Only one copy is made at entry, then all transforms apply in-place
This is much more memory efficient than copying at each step
Use lazy evaluation for even better performance with JAX

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`append`(transform)	Add transform to end of pipeline.
`apply_to`(target[, make_copy, ...])	Apply pipeline to target.
`clear`()
`count`(value)
`extend`(values)	S.extend(iterable) -- extend sequence by appending elements from the iterable
`index`(value, [start, [stop]])	Raises ValueError if the value is not present.
`insert`(index, value)	Insert transform at index.
`pop`([index])	Raise IndexError if list is empty or index is out of range.
`remove`(value)	S.remove(value) -- remove first occurrence of value.
`reverse`()	S.reverse() -- reverse IN PLACE

__init__(transforms=None)[source]

Initialize pipeline.

Parameters:: transforms (list[Transform], optional) – Initial transforms to include in pipeline

apply_to(target, make_copy=True, propagate_uncertainty=False)[source]

Apply pipeline to target.

Only makes copy once at entry, then applies all transforms in-place for memory efficiency.

Parameters:

target (T) – Data structure to transform
make_copy (bool, default True) – If True, create one copy at entry before applying transforms
propagate_uncertainty (bool, default False) – If True and target has uncertainty, propagate through all transforms

Returns:

Transformed data structure

Return type:

T

Notes

This is much more efficient than copying at each transform step. The single copy at entry ensures immutability while minimizing memory overhead.

When propagate_uncertainty=True, uncertainty is efficiently propagated through the entire pipeline in a single pass.

__getitem__(index: int) → Transform[T][source]

__getitem__(index: slice) → list[Transform[T]]

Get transform(s) at index.

Parameters:: index (int or slice) – Index or slice to retrieve
Returns:: Transform at index, or list of transforms for slice
Return type:: Transform or list[Transform]

Examples

>>> pipeline = Pipeline([t1, t2, t3])
>>> pipeline[0]  # Get first transform
>>> pipeline[1:3]  # Get slice of transforms

__setitem__(index: int, value: Transform[T]) → None[source]

__setitem__(index: slice, value: Iterable[Transform[T]]) → None

Set transform(s) at index.

Parameters:

index (int or slice) – Index or slice to set
value (Transform or list[Transform]) – Transform(s) to set at index

Raises:

TypeError – If value is not a Transform instance

Examples

>>> pipeline = Pipeline([t1, t2, t3])
>>> pipeline[0] = new_transform  # Replace first transform

__delitem__(index)[source]

Delete transform(s) at index.

Parameters:: index (int or slice) – Index or slice to delete

Examples

>>> pipeline = Pipeline([t1, t2, t3])
>>> del pipeline[0]  # Remove first transform
>>> del pipeline[1:]  # Remove all but first transform

__len__()[source]

Get number of transforms in pipeline.

Returns:: Number of transforms
Return type:: int

Examples

>>> pipeline = Pipeline([t1, t2, t3])
>>> len(pipeline)
3

insert(index, value)[source]

Insert transform at index.

Parameters:

index (int) – Index at which to insert transform
value (Transform) – Transform to insert

Raises:

TypeError – If value is not a Transform instance

Examples

>>> pipeline = Pipeline([t1, t3])
>>> pipeline.insert(1, t2)  # Insert t2 between t1 and t3

append(transform)[source]

Add transform to end of pipeline.

Parameters:: transform (Transform) – Transform to append
Raises:: TypeError – If transform is not a Transform instance

Examples

>>> pipeline = Pipeline([t1, t2])
>>> pipeline.append(t3)  # Add t3 to end

__repr__()[source]

String representation of pipeline.

Returns:: String representation showing number of transforms
Return type:: str

Examples

>>> pipeline = Pipeline([t1, t2, t3])
>>> repr(pipeline)
'Pipeline(3 transforms)'

__str__()[source]

Human-readable string representation.

Returns:: String showing all transforms in pipeline
Return type:: str

Region-Based Transforms

Region-based transforms for piblin-jax.

This module provides transforms that operate on specific regions of data: - RegionTransform: Base class for region-based transforms - RegionMultiplyTransform: Example concrete implementation

Region-based transforms apply transformations only within specified regions, preserving data outside those regions. This is useful for selective processing such as background subtraction in specific spectral ranges or local smoothing.

class piblin_jax.transform.region.RegionMultiplyTransform(region, factor)[source]

Bases: RegionTransform

Example transform: Multiply region by a factor.

This is a concrete implementation of RegionTransform that multiplies the dependent variable within the specified region(s) by a constant factor.

Parameters:

region (LinearRegion | CompoundRegion) – Region(s) to transform
factor (float) – Multiplication factor

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.data.roi import LinearRegion, CompoundRegion
>>> from piblin_jax.transform.region import RegionMultiplyTransform
>>> # Single region example
>>> x_data = np.linspace(0, 10, 11)
>>> y_data = np.ones(11)
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x_data,
...     dependent_variable_data=y_data
... )
>>> region = LinearRegion(x_min=3.0, x_max=7.0)
>>> transform = RegionMultiplyTransform(region, factor=2.0)
>>> result = transform.apply_to(dataset, make_copy=True)
>>> # Points in [3, 7] are multiplied by 2.0, others unchanged
>>> # Multiple disjoint regions example
>>> region1 = LinearRegion(x_min=1.0, x_max=2.0)
>>> region2 = LinearRegion(x_min=8.0, x_max=9.0)
>>> compound = CompoundRegion([region1, region2])
>>> transform = RegionMultiplyTransform(compound, factor=0.5)
>>> result = transform.apply_to(dataset, make_copy=True)
>>> # Points in [1, 2] OR [8, 9] are multiplied by 0.5

Notes

This is a simple example transform for demonstration and testing. More complex transforms can be implemented following the same pattern.

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(region, factor)[source]

Initialize RegionMultiplyTransform.

Parameters:

region (LinearRegion | CompoundRegion) – Region(s) to transform
factor (float) – Multiplication factor

class piblin_jax.transform.region.RegionTransform(region)[source]

Bases: DatasetTransform

Base class for transforms that operate on specific regions.

RegionTransform applies a transformation only within specified region(s), preserving data outside the regions. This enables selective processing of data based on independent variable ranges.

Subclasses should implement the _apply_to_region() method to define the specific transformation to apply within the region(s).

Parameters:: region (LinearRegion | CompoundRegion) – Region(s) to transform

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.data.roi import LinearRegion
>>> from piblin_jax.transform.region import RegionMultiplyTransform
>>> # Create dataset
>>> x_data = np.array([0, 1, 2, 3, 4, 5])
>>> y_data = np.array([1, 1, 1, 1, 1, 1])
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x_data,
...     dependent_variable_data=y_data
... )
>>> # Define region and transform
>>> region = LinearRegion(x_min=2.0, x_max=4.0)
>>> transform = RegionMultiplyTransform(region, factor=2.0)
>>> # Apply transform (only region [2, 4] is multiplied)
>>> result = transform.apply_to(dataset, make_copy=True)
>>> result.dependent_variable_data
array([1., 1., 2., 2., 2., 1.])

Notes

Currently optimized for OneDimensionalDataset
Data outside regions is preserved exactly
Transformations use NumPy arrays internally for compatibility
Region masks are generated from the independent variable

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(region)[source]

Initialize RegionTransform.

Parameters:: region (LinearRegion | CompoundRegion) – Region(s) to transform

Lambda and Dynamic Transforms

Lambda transforms and dynamic parameter transforms for piblin-jax.

This module provides: - LambdaTransform: Wraps arbitrary functions as transforms - DynamicTransform: Base class for data-driven parameter computation - AutoScaleTransform: Automatic data scaling to target range - AutoBaselineTransform: Automatic baseline correction from data

Lambda transforms allow users to create custom transforms from simple functions without defining new classes. Dynamic transforms compute parameters from the data itself, enabling adaptive processing.

class piblin_jax.transform.lambda_transform.AutoBaselineTransform(n_points=10, method='first')[source]

Bases: DynamicTransform

Automatically subtract baseline computed from data.

Computes baseline from first/last N points or minimum value, then subtracts it from all data. Useful for removing offsets and drift in experimental measurements.

Parameters:

n_points (int, default 10) – Number of points to use for baseline computation (only used for ‘first’ and ‘last’ methods)
method (str, default 'first') – Method for computing baseline: - ‘first’: Mean of first n_points - ‘last’: Mean of last n_points - ‘min’: Minimum value in data

Examples

>>> from piblin_jax.transform import AutoBaselineTransform
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> import numpy as np
>>>
>>> # Subtract baseline from first 10 points
>>> transform = AutoBaselineTransform(n_points=10, method='first')
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=np.arange(100),
...     dependent_variable_data=np.random.randn(100) + 5.0  # offset by 5
... )
>>> result = transform.apply_to(dataset)
>>>
>>> # Subtract minimum value
>>> transform = AutoBaselineTransform(method='min')

Notes

‘first’ method: Good for time series where initial values are baseline
‘last’ method: Good for measurements that return to baseline
‘min’ method: Good for ensuring all values are non-negative
Computed parameter: ‘baseline’ (value to subtract)

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(n_points=10, method='first')[source]

Initialize auto-baseline transform.

Parameters:

n_points (int) – Number of points for baseline
method (str) – Baseline computation method (‘first’, ‘last’, ‘min’)

class piblin_jax.transform.lambda_transform.AutoScaleTransform(target_min=0.0, target_max=1.0)[source]

Bases: DynamicTransform

Automatically scale data to specified range.

Computes min/max from data and scales to target range. Useful for normalizing data to standard ranges like [0, 1] or [-1, 1].

Parameters:

target_min (float, default 0.0) – Target minimum value after scaling
target_max (float, default 1.0) – Target maximum value after scaling

Examples

>>> from piblin_jax.transform import AutoScaleTransform
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> import numpy as np
>>>
>>> # Scale to [0, 1]
>>> transform = AutoScaleTransform()
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=np.array([1, 2, 3]),
...     dependent_variable_data=np.array([10, 20, 30])
... )
>>> result = transform.apply_to(dataset)
>>> # result.dependent_variable_data is now [0, 0.5, 1]
>>>
>>> # Scale to [-1, 1]
>>> transform = AutoScaleTransform(target_min=-1.0, target_max=1.0)

Notes

Handles constant data (where min == max) by setting all values to target_min
Preserves data ordering and relative differences
Computed parameters: ‘scale’ (multiplicative factor) and ‘offset’ (additive shift)

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(target_min=0.0, target_max=1.0)[source]

Initialize auto-scale transform.

Parameters:

target_min (float) – Target minimum value
target_max (float) – Target maximum value

class piblin_jax.transform.lambda_transform.DynamicTransform[source]

Bases: DatasetTransform

Base class for transforms with data-driven parameters.

Dynamic transforms compute parameters from the data itself, then apply transformations using those parameters. This enables adaptive processing where the transformation depends on data characteristics.

Subclasses must implement: - _compute_parameters(dataset): Extract parameters from data - _apply_with_parameters(dataset, params): Apply transformation

Parameters:: None

Examples

>>> from piblin_jax.transform.lambda_transform import DynamicTransform
>>> from piblin_jax.backend import jnp
>>>
>>> class CustomDynamicTransform(DynamicTransform):
...     def _compute_parameters(self, dataset):
...         y_data = dataset._dependent_variable_data
...         return {'mean': jnp.mean(y_data)}
...
...     def _apply_with_parameters(self, dataset, params):
...         dataset._dependent_variable_data -= params['mean']
...         return dataset

Notes

Parameters are computed fresh each time the transform is applied
Caching can be implemented in subclasses if needed
Only works with OneDimensionalDataset

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__()[source]

Initialize dynamic transform.

class piblin_jax.transform.lambda_transform.LambdaTransform(func=None, use_x=False, jit_compile=True, lambda_func=None)[source]

Bases: DatasetTransform

Transform that wraps an arbitrary function.

Allows users to create custom transforms from simple functions without defining new classes. Supports both simple functions that operate on the dependent variable only, and functions that use both independent and dependent variables.

Parameters:

func (Callable) – Function to apply to dependent variable. Signature: func(y_data: ndarray) -> ndarray Or: func(x_data: ndarray, y_data: ndarray) -> ndarray
use_x (bool, default False) – If True, pass both x and y to func. If False, pass only y to func.
jit_compile (bool, default True) – If True, attempt JIT compilation of the function. Falls back to regular function if compilation fails.

Examples

>>> from piblin_jax.transform import LambdaTransform
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> import numpy as np
>>>
>>> # Simple function
>>> transform = LambdaTransform(lambda y: y * 2.0)
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=np.array([1, 2, 3]),
...     dependent_variable_data=np.array([2, 4, 6])
... )
>>> result = transform.apply_to(dataset)
>>>
>>> # Function using x and y
>>> transform = LambdaTransform(
...     lambda x, y: y / x.max(),
...     use_x=True
... )
>>>
>>> # JAX-compatible function for JIT
>>> import jax.numpy as jnp
>>> transform = LambdaTransform(
...     lambda y: jnp.exp(y) * jnp.sin(y),
...     jit_compile=True
... )

Notes

Functions should use jnp instead of np for JIT compilation
JIT compilation improves performance but requires JAX-compatible code
If compilation fails, the transform falls back to the uncompiled function
Only works with OneDimensionalDataset

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(func=None, use_x=False, jit_compile=True, lambda_func=None)[source]

Initialize lambda transform.

Parameters:

func (Callable, optional) – Function to wrap as transform (works on arrays)
use_x (bool) – If True, pass both x and y to func
jit_compile (bool) – If True, attempt JIT compilation
lambda_func (Callable, optional) – Alias for func (for piblin compatibility)

Raises:

TypeError – If func is not callable
ValueError – If neither func nor lambda_func is provided

Notes

The function should operate on arrays, not dataset objects. For example: lambda y: y * 2.0 not lambda ds: ds.dependent_variable_data * 2.0

Dataset Transforms

Overview

Core dataset-level transforms for piblin-jax.

This module provides fundamental transforms for processing 1D datasets: - Interpolation: Resample to new x-values - Smoothing: Reduce noise (moving average, Gaussian) - Baseline correction: Remove systematic offsets and drifts - Normalization: Scale data to standard ranges - Calculus: Derivatives and integration

All transforms are JAX-compatible with JIT compilation support and graceful fallback to NumPy when JAX is unavailable.

class piblin_jax.transform.dataset.AsymmetricLeastSquaresBaseline(lambda_=100000.0, p=0.01, max_iter=10)[source]

Bases: DatasetTransform

Fit and subtract baseline using Asymmetric Least Squares (ALS) method.

This advanced baseline correction method is particularly effective for data with positive peaks on a varying baseline (e.g., spectra, chromatograms). The ALS method penalizes positive residuals more than negative ones, causing the baseline to fit below peaks.

Parameters:

lambda (float, default 1e5) – Smoothness parameter. Larger values = smoother baseline.
p (float, default 0.01) – Asymmetry parameter (0 < p < 1). Smaller values = baseline hugs peaks more closely from below.
max_iter (int, default 10) – Maximum number of iterations for convergence.

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import AsymmetricLeastSquaresBaseline
>>>
>>> # Create spectrum with peaks on curved baseline
>>> x = np.linspace(0, 100, 1000)
>>> baseline = 10 + 0.1 * x + 0.001 * x**2
>>> peaks = 50 * np.exp(-((x - 30)**2) / 20)
>>> peaks += 30 * np.exp(-((x - 70)**2) / 15)
>>> y = baseline + peaks
>>> dataset = OneDimensionalDataset(x, y)
>>>
>>> # Remove baseline using ALS
>>> transform = AsymmetricLeastSquaresBaseline(lambda_=1e6, p=0.01)
>>> result = transform.apply_to(dataset)

Notes

ALS is iterative and may be slower than polynomial baseline
Very effective for spectroscopy and chromatography data
lambda controls smoothness (typical: 1e4 to 1e7)
p controls asymmetry (typical: 0.001 to 0.1)
Based on Eilers & Boelens (2005) paper

References

P. H. C. Eilers and H. F. M. Boelens, “Baseline Correction with Asymmetric Least Squares Smoothing”, Leiden University Medical Centre Report, 2005.

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(lambda_=100000.0, p=0.01, max_iter=10)[source]

Initialize ALS baseline transform.

Parameters:

lambda (float, default 1e5) – Smoothness parameter.
p (float, default 0.01) – Asymmetry parameter.
max_iter (int, default 10) – Maximum iterations.

class piblin_jax.transform.dataset.CumulativeIntegral(method='trapezoid')[source]

Bases: DatasetTransform

Compute cumulative integral of 1D dataset.

This transform computes the cumulative integral (running sum) of the dependent variable with respect to the independent variable using the trapezoidal rule.

Parameters:: method (str, default 'trapezoid') – Integration method: - ‘trapezoid’: Trapezoidal rule (2nd order accurate) - ‘simpson’: Simpson’s rule (4th order accurate, requires odd number of points)

method

Integration method.

Type:: str

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import CumulativeIntegral
>>>
>>> # Create constant function (integral should be linear)
>>> x = np.linspace(0, 10, 100)
>>> y = np.ones_like(x)  # Integral of 1 is x
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x,
...     dependent_variable_data=y
... )
>>>
>>> # Compute cumulative integral
>>> integral = CumulativeIntegral()
>>> result = integral.apply_to(dataset)
>>> # Result should be approximately linear (x)
>>> result.dependent_variable_data[-1]  # Should be ~10
10.0

Notes

Trapezoidal rule: I[i] = sum((y[i] + y[i-1]) / 2 * dx[i])
First value is always 0 (integral from x[0] to x[0])
JIT-compiled with JAX backend
Handles non-uniform spacing
For smoother results on noisy data, consider smoothing first

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(method='trapezoid')[source]

Initialize cumulative integral transform.

Parameters:: method (str, default 'trapezoid') – Integration method.

class piblin_jax.transform.dataset.DefiniteIntegral(x_min=None, x_max=None, method='trapezoid')[source]

Bases: DatasetTransform

Compute definite integral over specified region.

This transform computes the definite integral (total area) under the curve between specified x-values.

Parameters:

x_min (float, optional) – Lower integration bound (default: use dataset minimum).
x_max (float, optional) – Upper integration bound (default: use dataset maximum).
method (str, default 'trapezoid') – Integration method.

x_min

Lower bound.

Type:: float or None

x_max

Upper bound.

Type:: float or None

method

Integration method.

Type:: str

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import DefiniteIntegral
>>>
>>> # Create data
>>> x = np.linspace(0, np.pi, 100)
>>> y = np.sin(x)  # Integral from 0 to pi is 2
>>> dataset = OneDimensionalDataset(x, y)
>>>
>>> # Compute definite integral
>>> integral = DefiniteIntegral()
>>> result = integral.apply_to(dataset)
>>> # Result stores integral value in metadata

Notes

Returns dataset with integral value stored in details
Original data is preserved
For cumulative integral, use CumulativeIntegral instead

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(x_min=None, x_max=None, method='trapezoid')[source]

Initialize definite integral transform.

Parameters:

x_min (float, optional) – Lower integration bound.
x_max (float, optional) – Upper integration bound.
method (str, default 'trapezoid') – Integration method.

class piblin_jax.transform.dataset.Derivative(order=1, method='gradient')[source]

Bases: DatasetTransform

Compute numerical derivative of 1D dataset.

This transform computes numerical derivatives using finite differences. Supports first and second derivatives with various accuracy schemes.

Parameters:

order (int, default 1) – Derivative order (1 or 2). - 1: First derivative (dy/dx) - 2: Second derivative (d²y/dx²)
method (str, default 'gradient') – Method for computing derivative: - ‘gradient’: Central differences (2nd order accurate) - ‘forward’: Forward differences (1st order accurate) - ‘backward’: Backward differences (1st order accurate)

order

Derivative order.

Type:: int

method

Differentiation method.

Type:: str

Raises:: ValueError – If order is not 1 or 2.

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import Derivative
>>>
>>> # Create data with known derivative
>>> x = np.linspace(0, 10, 100)
>>> y = x**2  # dy/dx = 2x
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x,
...     dependent_variable_data=y
... )
>>>
>>> # Compute first derivative
>>> deriv = Derivative(order=1)
>>> result = deriv.apply_to(dataset)
>>> # Result should be approximately 2*x
>>>
>>> # Compute second derivative
>>> deriv2 = Derivative(order=2)
>>> result2 = deriv2.apply_to(dataset)
>>> # Result should be approximately 2 (constant)

Notes

Uses jnp.gradient for central differences
Gradient method provides 2nd order accuracy
Handles non-uniform spacing in x
JIT-compiled with JAX backend
Edge effects present at boundaries
For noisy data, consider smoothing before differentiation

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(order=1, method='gradient')[source]

Initialize derivative transform.

Parameters:

order (int, default 1) – Derivative order (1 or 2).
method (str, default 'gradient') – Differentiation method.

Raises:

ValueError – If order is not 1 or 2.

class piblin_jax.transform.dataset.GaussianSmooth(sigma=1.0, truncate=3.0)[source]

Bases: DatasetTransform

Smooth data using Gaussian filter.

This transform applies a Gaussian filter for smoothing, which provides better frequency response than simple moving average.

Parameters:

sigma (float, default 1.0) – Standard deviation of Gaussian kernel in units of data points.
truncate (float, default 3.0) – Truncate filter at this many standard deviations.

Examples

>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import GaussianSmooth
>>> import numpy as np
>>>
>>> x = np.linspace(0, 10, 100)
>>> y = np.sin(x) + 0.3 * np.random.randn(100)
>>> dataset = OneDimensionalDataset(x, y)
>>>
>>> smooth = GaussianSmooth(sigma=2.0)
>>> result = smooth.apply_to(dataset)

Notes

Gaussian smoothing preserves features better than moving average
sigma controls the amount of smoothing
Larger sigma = more smoothing

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(sigma=1.0, truncate=3.0)[source]

Initialize Gaussian smoothing transform.

Parameters:

sigma (float, default 1.0) – Standard deviation of Gaussian kernel.
truncate (float, default 3.0) – Truncate at this many standard deviations.

class piblin_jax.transform.dataset.Interpolate1D(new_x, method='linear')[source]

Bases: DatasetTransform

Interpolate 1D dataset to new x-values.

This transform resamples a 1D dataset to a new set of independent variable values using linear interpolation. It supports both JAX and NumPy backends with automatic fallback.

Parameters:

new_x (array-like) – New independent variable values for interpolation.
method (str, default 'linear') – Interpolation method. Currently only ‘linear’ is fully supported. Future versions may support ‘cubic’, ‘spline’, etc.

new_x

Target x-values for interpolation.

Type:: ndarray

method

Interpolation method name.

Type:: str

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import Interpolate1D
>>>
>>> # Create original dataset
>>> x = np.array([0, 1, 2, 3, 4])
>>> y = np.array([0, 1, 4, 9, 16])  # y = x^2
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x,
...     dependent_variable_data=y
... )
>>>
>>> # Interpolate to finer grid
>>> new_x = np.linspace(0, 4, 17)  # 17 points from 0 to 4
>>> interp = Interpolate1D(new_x, method='linear')
>>> result = interp.apply_to(dataset)
>>>
>>> # Result has new x-values with interpolated y-values
>>> result.independent_variable_data.shape
(17,)

Notes

Linear interpolation is used for both JAX and NumPy backends
For JAX backend, uses jnp.interp (compiled with JIT)
For NumPy backend, uses np.interp
Extrapolation uses constant values (edge values)
Metadata (conditions, details) is preserved from original dataset

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(new_x, method='linear')[source]

Initialize interpolation transform.

Parameters:

new_x (array-like) – New independent variable values.
method (str, default 'linear') – Interpolation method (‘linear’ supported).

class piblin_jax.transform.dataset.MaxNormalize[source]

Bases: DatasetTransform

Normalize data by dividing by maximum absolute value.

This simple normalization scales data so that the maximum absolute value is 1. Preserves zero and sign of data.

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import MaxNormalize
>>>
>>> x = np.linspace(0, 10, 100)
>>> y = np.linspace(-50, 100, 100)
>>> dataset = OneDimensionalDataset(x, y)
>>>
>>> transform = MaxNormalize()
>>> result = transform.apply_to(dataset)
>>> np.max(np.abs(result.dependent_variable_data))
1.0

Notes

Formula: y_norm = y / max(abs(y))
Preserves zero and sign
Maximum absolute value becomes 1
Simple and fast

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

class piblin_jax.transform.dataset.MinMaxNormalize(feature_range=(0, 1))[source]

Bases: DatasetTransform

Min-max normalization to scale data to a specific range.

This transform scales the dependent variable to a target range, typically [0, 1]. This is useful for comparing datasets with different scales or preparing data for machine learning.

Parameters:: feature_range (tuple of float, default (0, 1)) – Target range for normalization as (min, max).

feature_range

Target range for scaled data.

Type:: tuple

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import MinMaxNormalize
>>>
>>> # Create data with arbitrary range
>>> x = np.linspace(0, 10, 100)
>>> y = np.linspace(5, 25, 100)  # Range: 5 to 25
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x,
...     dependent_variable_data=y
... )
>>>
>>> # Normalize to [0, 1]
>>> transform = MinMaxNormalize()
>>> result = transform.apply_to(dataset)
>>> np.min(result.dependent_variable_data)  # Should be ~0
0.0
>>> np.max(result.dependent_variable_data)  # Should be ~1
1.0
>>>
>>> # Normalize to custom range
>>> transform = MinMaxNormalize(feature_range=(-1, 1))
>>> result = transform.apply_to(dataset)

Notes

Formula: y_scaled = (y - y_min) / (y_max - y_min) * (max - min) + min
Small epsilon added to denominator to avoid division by zero
JIT-compiled with JAX backend for efficiency
Independent variable is preserved
Metadata is preserved

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(feature_range=(0, 1))[source]

Initialize min-max normalization transform.

Parameters:: feature_range (tuple of float, default (0, 1)) – Target range as (min, max).

class piblin_jax.transform.dataset.MovingAverageSmooth(window_size=5)[source]

Bases: DatasetTransform

Smooth data using moving average filter.

This transform applies a simple moving average (box filter) to smooth noisy data. It uses convolution for efficient computation.

Parameters:: window_size (int, default 5) – Size of the moving average window. Must be odd to ensure symmetry.

window_size

Window size for moving average.

Type:: int

Raises:: ValueError – If window_size is not odd.

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import MovingAverageSmooth
>>>
>>> # Create noisy data
>>> x = np.linspace(0, 10, 100)
>>> y = np.sin(x) + 0.5 * np.random.randn(100)
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x,
...     dependent_variable_data=y
... )
>>>
>>> # Apply smoothing
>>> smooth = MovingAverageSmooth(window_size=5)
>>> result = smooth.apply_to(dataset)
>>>
>>> # Result has smoothed y-values (x unchanged)
>>> result.independent_variable_data  # Same as original
>>> result.dependent_variable_data    # Smoothed version

Notes

Uses ‘same’ mode for convolution (output same size as input)
Edge effects present at boundaries (first/last few points)
For JAX backend, convolution is JIT-compiled for efficiency
Window must be odd to ensure symmetric smoothing
Larger windows = more smoothing but more distortion

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(window_size=5)[source]

Initialize moving average smoothing transform.

Parameters:: window_size (int, default 5) – Size of moving average window (must be odd).
Raises:: ValueError – If window_size is even.

class piblin_jax.transform.dataset.PolynomialBaseline(degree=1)[source]

Bases: DatasetTransform

Fit and subtract polynomial baseline from data.

This transform fits a polynomial to the data and subtracts it, removing systematic trends and offsets. Commonly used in spectroscopy to remove background signals.

Parameters:: degree (int, default 1) – Degree of polynomial to fit. - 0: Constant offset - 1: Linear drift - 2: Quadratic curvature - Higher: More complex baselines

degree

Polynomial degree for baseline fitting.

Type:: int

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import PolynomialBaseline
>>>
>>> # Create data with linear drift
>>> x = np.linspace(0, 10, 100)
>>> signal = np.sin(x)
>>> baseline = 2.0 * x + 5.0  # Linear drift
>>> y = signal + baseline
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x,
...     dependent_variable_data=y
... )
>>>
>>> # Remove linear baseline
>>> transform = PolynomialBaseline(degree=1)
>>> result = transform.apply_to(dataset)
>>>
>>> # Result should be close to original signal
>>> np.allclose(result.dependent_variable_data, signal, atol=0.1)
True

Notes

Uses least-squares polynomial fitting (np.polyfit)
Works for both JAX and NumPy backends (uses NumPy for fitting)
Higher degree polynomials may overfit to noise
For complex baselines, consider spline-based methods
Independent variable (x) is preserved

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(degree=1)[source]

Initialize polynomial baseline transform.

Parameters:: degree (int, default 1) – Polynomial degree (0 = constant, 1 = linear, etc.).

class piblin_jax.transform.dataset.RobustNormalize[source]

Bases: DatasetTransform

Robust normalization using median and IQR.

This transform normalizes data using median and interquartile range (IQR) instead of mean and standard deviation. More robust to outliers than z-score normalization.

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import RobustNormalize
>>>
>>> # Create data with outliers
>>> x = np.linspace(0, 10, 100)
>>> y = np.random.randn(100)
>>> y[0] = 100  # Outlier
>>> dataset = OneDimensionalDataset(x, y)
>>>
>>> # Robust normalization (less affected by outlier)
>>> transform = RobustNormalize()
>>> result = transform.apply_to(dataset)

Notes

Formula: y_robust = (y - median(y)) / IQR(y)
IQR = Q3 - Q1 (interquartile range)
More robust to outliers than z-score
JIT-compiled with JAX backend

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

class piblin_jax.transform.dataset.ZScoreNormalize[source]

Bases: DatasetTransform

Z-score normalization (standardization).

This transform standardizes data to have zero mean and unit variance. Also known as standardization or z-score transformation. Useful for comparing datasets with different units or scales.

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import ZScoreNormalize
>>>
>>> # Create data with arbitrary mean and std
>>> x = np.linspace(0, 10, 100)
>>> y = 5.0 * np.random.randn(100) + 10.0  # mean=10, std=5
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x,
...     dependent_variable_data=y
... )
>>>
>>> # Standardize to mean=0, std=1
>>> transform = ZScoreNormalize()
>>> result = transform.apply_to(dataset)
>>> np.mean(result.dependent_variable_data)  # Should be ~0
0.0
>>> np.std(result.dependent_variable_data)   # Should be ~1
1.0

Notes

Formula: y_zscore = (y - mean(y)) / std(y)
Small epsilon added to denominator to avoid division by zero
Results have mean ≈ 0 and standard deviation ≈ 1
JIT-compiled with JAX backend
Preserves shape of distribution
Sensitive to outliers (unlike robust scaling)

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

Smoothing Transforms

Smoothing transforms for 1D datasets.

This module provides various smoothing/filtering transforms to reduce noise in time series and spectral data.

class piblin_jax.transform.dataset.smoothing.GaussianSmooth(sigma=1.0, truncate=3.0)[source]

Bases: DatasetTransform

Smooth data using Gaussian filter.

This transform applies a Gaussian filter for smoothing, which provides better frequency response than simple moving average.

Parameters:

sigma (float, default 1.0) – Standard deviation of Gaussian kernel in units of data points.
truncate (float, default 3.0) – Truncate filter at this many standard deviations.

Examples

>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import GaussianSmooth
>>> import numpy as np
>>>
>>> x = np.linspace(0, 10, 100)
>>> y = np.sin(x) + 0.3 * np.random.randn(100)
>>> dataset = OneDimensionalDataset(x, y)
>>>
>>> smooth = GaussianSmooth(sigma=2.0)
>>> result = smooth.apply_to(dataset)

Notes

Gaussian smoothing preserves features better than moving average
sigma controls the amount of smoothing
Larger sigma = more smoothing

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(sigma=1.0, truncate=3.0)[source]

Initialize Gaussian smoothing transform.

Parameters:

sigma (float, default 1.0) – Standard deviation of Gaussian kernel.
truncate (float, default 3.0) – Truncate at this many standard deviations.

class piblin_jax.transform.dataset.smoothing.MovingAverageSmooth(window_size=5)[source]

Bases: DatasetTransform

Smooth data using moving average filter.

This transform applies a simple moving average (box filter) to smooth noisy data. It uses convolution for efficient computation.

Parameters:: window_size (int, default 5) – Size of the moving average window. Must be odd to ensure symmetry.

window_size

Window size for moving average.

Type:: int

Raises:: ValueError – If window_size is not odd.

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import MovingAverageSmooth
>>>
>>> # Create noisy data
>>> x = np.linspace(0, 10, 100)
>>> y = np.sin(x) + 0.5 * np.random.randn(100)
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x,
...     dependent_variable_data=y
... )
>>>
>>> # Apply smoothing
>>> smooth = MovingAverageSmooth(window_size=5)
>>> result = smooth.apply_to(dataset)
>>>
>>> # Result has smoothed y-values (x unchanged)
>>> result.independent_variable_data  # Same as original
>>> result.dependent_variable_data    # Smoothed version

Notes

Uses ‘same’ mode for convolution (output same size as input)
Edge effects present at boundaries (first/last few points)
For JAX backend, convolution is JIT-compiled for efficiency
Window must be odd to ensure symmetric smoothing
Larger windows = more smoothing but more distortion

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(window_size=5)[source]

Initialize moving average smoothing transform.

Parameters:: window_size (int, default 5) – Size of moving average window (must be odd).
Raises:: ValueError – If window_size is even.

Interpolation Transforms

Interpolation transforms for 1D datasets.

This module provides interpolation transforms that resample datasets to new x-values using various interpolation methods.

class piblin_jax.transform.dataset.interpolate.Interpolate1D(new_x, method='linear')[source]

Bases: DatasetTransform

Interpolate 1D dataset to new x-values.

This transform resamples a 1D dataset to a new set of independent variable values using linear interpolation. It supports both JAX and NumPy backends with automatic fallback.

Parameters:

new_x (array-like) – New independent variable values for interpolation.
method (str, default 'linear') – Interpolation method. Currently only ‘linear’ is fully supported. Future versions may support ‘cubic’, ‘spline’, etc.

new_x

Target x-values for interpolation.

Type:: ndarray

method

Interpolation method name.

Type:: str

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import Interpolate1D
>>>
>>> # Create original dataset
>>> x = np.array([0, 1, 2, 3, 4])
>>> y = np.array([0, 1, 4, 9, 16])  # y = x^2
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x,
...     dependent_variable_data=y
... )
>>>
>>> # Interpolate to finer grid
>>> new_x = np.linspace(0, 4, 17)  # 17 points from 0 to 4
>>> interp = Interpolate1D(new_x, method='linear')
>>> result = interp.apply_to(dataset)
>>>
>>> # Result has new x-values with interpolated y-values
>>> result.independent_variable_data.shape
(17,)

Notes

Linear interpolation is used for both JAX and NumPy backends
For JAX backend, uses jnp.interp (compiled with JIT)
For NumPy backend, uses np.interp
Extrapolation uses constant values (edge values)
Metadata (conditions, details) is preserved from original dataset

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(new_x, method='linear')[source]

Initialize interpolation transform.

Parameters:

new_x (array-like) – New independent variable values.
method (str, default 'linear') – Interpolation method (‘linear’ supported).

Normalization Transforms

Normalization transforms for 1D datasets.

This module provides various normalization and scaling transforms to standardize data for comparison and analysis.

class piblin_jax.transform.dataset.normalization.MaxNormalize[source]

Bases: DatasetTransform

Normalize data by dividing by maximum absolute value.

This simple normalization scales data so that the maximum absolute value is 1. Preserves zero and sign of data.

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import MaxNormalize
>>>
>>> x = np.linspace(0, 10, 100)
>>> y = np.linspace(-50, 100, 100)
>>> dataset = OneDimensionalDataset(x, y)
>>>
>>> transform = MaxNormalize()
>>> result = transform.apply_to(dataset)
>>> np.max(np.abs(result.dependent_variable_data))
1.0

Notes

Formula: y_norm = y / max(abs(y))
Preserves zero and sign
Maximum absolute value becomes 1
Simple and fast

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

class piblin_jax.transform.dataset.normalization.MinMaxNormalize(feature_range=(0, 1))[source]

Bases: DatasetTransform

Min-max normalization to scale data to a specific range.

This transform scales the dependent variable to a target range, typically [0, 1]. This is useful for comparing datasets with different scales or preparing data for machine learning.

Parameters:: feature_range (tuple of float, default (0, 1)) – Target range for normalization as (min, max).

feature_range

Target range for scaled data.

Type:: tuple

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import MinMaxNormalize
>>>
>>> # Create data with arbitrary range
>>> x = np.linspace(0, 10, 100)
>>> y = np.linspace(5, 25, 100)  # Range: 5 to 25
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x,
...     dependent_variable_data=y
... )
>>>
>>> # Normalize to [0, 1]
>>> transform = MinMaxNormalize()
>>> result = transform.apply_to(dataset)
>>> np.min(result.dependent_variable_data)  # Should be ~0
0.0
>>> np.max(result.dependent_variable_data)  # Should be ~1
1.0
>>>
>>> # Normalize to custom range
>>> transform = MinMaxNormalize(feature_range=(-1, 1))
>>> result = transform.apply_to(dataset)

Notes

Formula: y_scaled = (y - y_min) / (y_max - y_min) * (max - min) + min
Small epsilon added to denominator to avoid division by zero
JIT-compiled with JAX backend for efficiency
Independent variable is preserved
Metadata is preserved

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(feature_range=(0, 1))[source]

Initialize min-max normalization transform.

Parameters:: feature_range (tuple of float, default (0, 1)) – Target range as (min, max).

class piblin_jax.transform.dataset.normalization.RobustNormalize[source]

Bases: DatasetTransform

Robust normalization using median and IQR.

This transform normalizes data using median and interquartile range (IQR) instead of mean and standard deviation. More robust to outliers than z-score normalization.

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import RobustNormalize
>>>
>>> # Create data with outliers
>>> x = np.linspace(0, 10, 100)
>>> y = np.random.randn(100)
>>> y[0] = 100  # Outlier
>>> dataset = OneDimensionalDataset(x, y)
>>>
>>> # Robust normalization (less affected by outlier)
>>> transform = RobustNormalize()
>>> result = transform.apply_to(dataset)

Notes

Formula: y_robust = (y - median(y)) / IQR(y)
IQR = Q3 - Q1 (interquartile range)
More robust to outliers than z-score
JIT-compiled with JAX backend

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

class piblin_jax.transform.dataset.normalization.ZScoreNormalize[source]

Bases: DatasetTransform

Z-score normalization (standardization).

This transform standardizes data to have zero mean and unit variance. Also known as standardization or z-score transformation. Useful for comparing datasets with different units or scales.

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import ZScoreNormalize
>>>
>>> # Create data with arbitrary mean and std
>>> x = np.linspace(0, 10, 100)
>>> y = 5.0 * np.random.randn(100) + 10.0  # mean=10, std=5
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x,
...     dependent_variable_data=y
... )
>>>
>>> # Standardize to mean=0, std=1
>>> transform = ZScoreNormalize()
>>> result = transform.apply_to(dataset)
>>> np.mean(result.dependent_variable_data)  # Should be ~0
0.0
>>> np.std(result.dependent_variable_data)   # Should be ~1
1.0

Notes

Formula: y_zscore = (y - mean(y)) / std(y)
Small epsilon added to denominator to avoid division by zero
Results have mean ≈ 0 and standard deviation ≈ 1
JIT-compiled with JAX backend
Preserves shape of distribution
Sensitive to outliers (unlike robust scaling)

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

Baseline Correction Transforms

Baseline subtraction transforms for 1D datasets.

This module provides baseline correction transforms to remove systematic offsets and drifts from spectroscopic and chromatographic data.

class piblin_jax.transform.dataset.baseline.AsymmetricLeastSquaresBaseline(lambda_=100000.0, p=0.01, max_iter=10)[source]

Bases: DatasetTransform

Fit and subtract baseline using Asymmetric Least Squares (ALS) method.

This advanced baseline correction method is particularly effective for data with positive peaks on a varying baseline (e.g., spectra, chromatograms). The ALS method penalizes positive residuals more than negative ones, causing the baseline to fit below peaks.

Parameters:

lambda (float, default 1e5) – Smoothness parameter. Larger values = smoother baseline.
p (float, default 0.01) – Asymmetry parameter (0 < p < 1). Smaller values = baseline hugs peaks more closely from below.
max_iter (int, default 10) – Maximum number of iterations for convergence.

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import AsymmetricLeastSquaresBaseline
>>>
>>> # Create spectrum with peaks on curved baseline
>>> x = np.linspace(0, 100, 1000)
>>> baseline = 10 + 0.1 * x + 0.001 * x**2
>>> peaks = 50 * np.exp(-((x - 30)**2) / 20)
>>> peaks += 30 * np.exp(-((x - 70)**2) / 15)
>>> y = baseline + peaks
>>> dataset = OneDimensionalDataset(x, y)
>>>
>>> # Remove baseline using ALS
>>> transform = AsymmetricLeastSquaresBaseline(lambda_=1e6, p=0.01)
>>> result = transform.apply_to(dataset)

Notes

ALS is iterative and may be slower than polynomial baseline
Very effective for spectroscopy and chromatography data
lambda controls smoothness (typical: 1e4 to 1e7)
p controls asymmetry (typical: 0.001 to 0.1)
Based on Eilers & Boelens (2005) paper

References

P. H. C. Eilers and H. F. M. Boelens, “Baseline Correction with Asymmetric Least Squares Smoothing”, Leiden University Medical Centre Report, 2005.

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(lambda_=100000.0, p=0.01, max_iter=10)[source]

Initialize ALS baseline transform.

Parameters:

lambda (float, default 1e5) – Smoothness parameter.
p (float, default 0.01) – Asymmetry parameter.
max_iter (int, default 10) – Maximum iterations.

class piblin_jax.transform.dataset.baseline.PolynomialBaseline(degree=1)[source]

Bases: DatasetTransform

Fit and subtract polynomial baseline from data.

This transform fits a polynomial to the data and subtracts it, removing systematic trends and offsets. Commonly used in spectroscopy to remove background signals.

Parameters:: degree (int, default 1) – Degree of polynomial to fit. - 0: Constant offset - 1: Linear drift - 2: Quadratic curvature - Higher: More complex baselines

degree

Polynomial degree for baseline fitting.

Type:: int

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import PolynomialBaseline
>>>
>>> # Create data with linear drift
>>> x = np.linspace(0, 10, 100)
>>> signal = np.sin(x)
>>> baseline = 2.0 * x + 5.0  # Linear drift
>>> y = signal + baseline
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x,
...     dependent_variable_data=y
... )
>>>
>>> # Remove linear baseline
>>> transform = PolynomialBaseline(degree=1)
>>> result = transform.apply_to(dataset)
>>>
>>> # Result should be close to original signal
>>> np.allclose(result.dependent_variable_data, signal, atol=0.1)
True

Notes

Uses least-squares polynomial fitting (np.polyfit)
Works for both JAX and NumPy backends (uses NumPy for fitting)
Higher degree polynomials may overfit to noise
For complex baselines, consider spline-based methods
Independent variable (x) is preserved

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(degree=1)[source]

Initialize polynomial baseline transform.

Parameters:: degree (int, default 1) – Polynomial degree (0 = constant, 1 = linear, etc.).

Calculus Transforms

Calculus-based transforms for 1D datasets.

This module provides derivatives and integration transforms for numerical differentiation and integration of experimental data.

class piblin_jax.transform.dataset.calculus.CumulativeIntegral(method='trapezoid')[source]

Bases: DatasetTransform

Compute cumulative integral of 1D dataset.

This transform computes the cumulative integral (running sum) of the dependent variable with respect to the independent variable using the trapezoidal rule.

Parameters:: method (str, default 'trapezoid') – Integration method: - ‘trapezoid’: Trapezoidal rule (2nd order accurate) - ‘simpson’: Simpson’s rule (4th order accurate, requires odd number of points)

method

Integration method.

Type:: str

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import CumulativeIntegral
>>>
>>> # Create constant function (integral should be linear)
>>> x = np.linspace(0, 10, 100)
>>> y = np.ones_like(x)  # Integral of 1 is x
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x,
...     dependent_variable_data=y
... )
>>>
>>> # Compute cumulative integral
>>> integral = CumulativeIntegral()
>>> result = integral.apply_to(dataset)
>>> # Result should be approximately linear (x)
>>> result.dependent_variable_data[-1]  # Should be ~10
10.0

Notes

Trapezoidal rule: I[i] = sum((y[i] + y[i-1]) / 2 * dx[i])
First value is always 0 (integral from x[0] to x[0])
JIT-compiled with JAX backend
Handles non-uniform spacing
For smoother results on noisy data, consider smoothing first

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(method='trapezoid')[source]

Initialize cumulative integral transform.

Parameters:: method (str, default 'trapezoid') – Integration method.

class piblin_jax.transform.dataset.calculus.DefiniteIntegral(x_min=None, x_max=None, method='trapezoid')[source]

Bases: DatasetTransform

Compute definite integral over specified region.

This transform computes the definite integral (total area) under the curve between specified x-values.

Parameters:

x_min (float, optional) – Lower integration bound (default: use dataset minimum).
x_max (float, optional) – Upper integration bound (default: use dataset maximum).
method (str, default 'trapezoid') – Integration method.

x_min

Lower bound.

Type:: float or None

x_max

Upper bound.

Type:: float or None

method

Integration method.

Type:: str

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import DefiniteIntegral
>>>
>>> # Create data
>>> x = np.linspace(0, np.pi, 100)
>>> y = np.sin(x)  # Integral from 0 to pi is 2
>>> dataset = OneDimensionalDataset(x, y)
>>>
>>> # Compute definite integral
>>> integral = DefiniteIntegral()
>>> result = integral.apply_to(dataset)
>>> # Result stores integral value in metadata

Notes

Returns dataset with integral value stored in details
Original data is preserved
For cumulative integral, use CumulativeIntegral instead

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(x_min=None, x_max=None, method='trapezoid')[source]

Initialize definite integral transform.

Parameters:

x_min (float, optional) – Lower integration bound.
x_max (float, optional) – Upper integration bound.
method (str, default 'trapezoid') – Integration method.

class piblin_jax.transform.dataset.calculus.Derivative(order=1, method='gradient')[source]

Bases: DatasetTransform

Compute numerical derivative of 1D dataset.

This transform computes numerical derivatives using finite differences. Supports first and second derivatives with various accuracy schemes.

Parameters:

order (int, default 1) – Derivative order (1 or 2). - 1: First derivative (dy/dx) - 2: Second derivative (d²y/dx²)
method (str, default 'gradient') – Method for computing derivative: - ‘gradient’: Central differences (2nd order accurate) - ‘forward’: Forward differences (1st order accurate) - ‘backward’: Backward differences (1st order accurate)

order

Derivative order.

Type:: int

method

Differentiation method.

Type:: str

Raises:: ValueError – If order is not 1 or 2.

Examples

>>> import numpy as np
>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.dataset import Derivative
>>>
>>> # Create data with known derivative
>>> x = np.linspace(0, 10, 100)
>>> y = x**2  # dy/dx = 2x
>>> dataset = OneDimensionalDataset(
...     independent_variable_data=x,
...     dependent_variable_data=y
... )
>>>
>>> # Compute first derivative
>>> deriv = Derivative(order=1)
>>> result = deriv.apply_to(dataset)
>>> # Result should be approximately 2*x
>>>
>>> # Compute second derivative
>>> deriv2 = Derivative(order=2)
>>> result2 = deriv2.apply_to(dataset)
>>> # Result should be approximately 2 (constant)

Notes

Uses jnp.gradient for central differences
Gradient method provides 2nd order accuracy
Handles non-uniform spacing in x
JIT-compiled with JAX backend
Edge effects present at boundaries
For noisy data, consider smoothing before differentiation

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Dataset.

__init__(order=1, method='gradient')[source]

Initialize derivative transform.

Parameters:

order (int, default 1) – Derivative order (1 or 2).
method (str, default 'gradient') – Differentiation method.

Raises:

ValueError – If order is not 1 or 2.

Measurement Transforms

Overview

Measurement-level transforms for piblin-jax.

This module provides transforms that operate on Measurement and MeasurementSet objects: - FilterDatasets: Filter datasets within a Measurement - FilterMeasurements: Filter measurements within a MeasurementSet - SplitByRegion: Split datasets by regions - MergeReplicates: Merge measurements with identical conditions

class piblin_jax.transform.measurement.FilterDatasets(predicate=None, dataset_type=None)[source]

Bases: MeasurementTransform

Filter datasets within a Measurement.

This transform filters datasets based on either their type or a custom predicate function. Returns a new Measurement containing only the datasets that match the filter criteria.

Parameters:

predicate (Callable[[Dataset], bool] | None) – Function that returns True for datasets to keep. Cannot be used with dataset_type.
dataset_type (type | None) – Filter by dataset type (alternative to predicate). Cannot be used with predicate.

Raises:

ValueError – If neither predicate nor dataset_type is provided. If both predicate and dataset_type are provided.

Examples

>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.measurement import FilterDatasets
>>>
>>> # Filter by type
>>> transform = FilterDatasets(dataset_type=OneDimensionalDataset)
>>> result = transform.apply_to(measurement)
>>>
>>> # Filter by predicate
>>> transform = FilterDatasets(
...     predicate=lambda ds: ds.conditions.get('temp', 0) > 25
... )
>>> result = transform.apply_to(measurement)

Notes

Returns a new Measurement with filtered datasets
Preserves measurement-level conditions and details
Empty dataset list is allowed if no datasets match

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Measurement.

__init__(predicate=None, dataset_type=None)[source]

Initialize FilterDatasets transform.

Parameters:

predicate (Callable[[Dataset], bool] | None) – Function that returns True for datasets to keep
dataset_type (type | None) – Filter by dataset type

class piblin_jax.transform.measurement.FilterMeasurements(predicate)[source]

Bases: MeasurementSetTransform

Filter measurements within a MeasurementSet.

This transform filters measurements based on a predicate function. Returns a new MeasurementSet containing only the measurements that match the filter criteria.

Parameters:: predicate (Callable[[Measurement], bool]) – Function that returns True for measurements to keep
Raises:: TypeError – If predicate is not callable

Examples

>>> from piblin_jax.transform.measurement import FilterMeasurements
>>>
>>> # Filter by condition
>>> transform = FilterMeasurements(
...     predicate=lambda m: m.conditions.get('temp', 0) > 25
... )
>>> result = transform.apply_to(measurement_set)
>>>
>>> # Filter by replicate number
>>> transform = FilterMeasurements(
...     predicate=lambda m: m.conditions.get('replicate', 0) <= 3
... )
>>> result = transform.apply_to(measurement_set)

Notes

Returns a new MeasurementSet with filtered measurements
Preserves measurement-set-level conditions and details
Empty measurement list is allowed if no measurements match

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to MeasurementSet.

__init__(predicate)[source]

Initialize FilterMeasurements transform.

Parameters:: predicate (Callable[[Measurement], bool]) – Function that returns True for measurements to keep

class piblin_jax.transform.measurement.MergeReplicates(strategy='average')[source]

Bases: MeasurementSetTransform

Merge measurements with identical conditions.

This transform groups measurements by their conditions and merges replicates (measurements with identical conditions) using either averaging or concatenation.

Parameters:: strategy (str, default 'average') – Merge strategy: ‘average’ or ‘concatenate’
Raises:: ValueError – If strategy is not ‘average’ or ‘concatenate’

Examples

>>> from piblin_jax.transform.measurement import MergeReplicates
>>>
>>> # Average replicate measurements
>>> transform = MergeReplicates(strategy='average')
>>> result = transform.apply_to(measurement_set)

Notes

Groups measurements by conditions (all key-value pairs must match)
For ‘average’ strategy: - Averages dependent variable data across replicates - Assumes all replicates have same independent variable data - Assumes all replicates have same dataset structure
For ‘concatenate’ strategy: - Currently returns first measurement (not yet implemented)
Preserves measurement-set-level conditions and details
Single measurements (no replicates) are returned unchanged

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to MeasurementSet.

__init__(strategy='average')[source]

Initialize MergeReplicates transform.

Parameters:: strategy (str, default 'average') – Merge strategy: ‘average’ or ‘concatenate’

class piblin_jax.transform.measurement.SplitByRegion(regions)[source]

Bases: MeasurementTransform

Split datasets by multiple regions, creating new Measurement.

This transform splits each OneDimensionalDataset in a Measurement into multiple datasets based on the specified regions. Each region creates a new dataset containing only the data points within that region.

Parameters:: regions (list[LinearRegion]) – List of regions to split by
Raises:: ValueError – If regions list is empty

Examples

>>> from piblin_jax.data.roi import LinearRegion
>>> from piblin_jax.transform.measurement import SplitByRegion
>>>
>>> regions = [
...     LinearRegion(0, 5),
...     LinearRegion(5, 10)
... ]
>>> transform = SplitByRegion(regions)
>>> result = transform.apply_to(measurement)

Notes

Only processes OneDimensionalDataset objects
Non-1D datasets are silently skipped
Empty regions (no data points) are included as empty datasets
Preserves dataset-level conditions and details for each split
Preserves measurement-level conditions and details

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Measurement.

__init__(regions)[source]

Initialize SplitByRegion transform.

Parameters:: regions (list[LinearRegion]) – List of regions to split by

Filtering Transforms

Collection-level transforms for Measurements and MeasurementSets.

This module provides transforms that operate on collections: - FilterDatasets: Filter datasets within a Measurement - FilterMeasurements: Filter measurements within a MeasurementSet - SplitByRegion: Split datasets by regions - MergeReplicates: Merge measurements with identical conditions

class piblin_jax.transform.measurement.filter.FilterDatasets(predicate=None, dataset_type=None)[source]

Bases: MeasurementTransform

Filter datasets within a Measurement.

This transform filters datasets based on either their type or a custom predicate function. Returns a new Measurement containing only the datasets that match the filter criteria.

Parameters:

predicate (Callable[[Dataset], bool] | None) – Function that returns True for datasets to keep. Cannot be used with dataset_type.
dataset_type (type | None) – Filter by dataset type (alternative to predicate). Cannot be used with predicate.

Raises:

ValueError – If neither predicate nor dataset_type is provided. If both predicate and dataset_type are provided.

Examples

>>> from piblin_jax.data.datasets import OneDimensionalDataset
>>> from piblin_jax.transform.measurement import FilterDatasets
>>>
>>> # Filter by type
>>> transform = FilterDatasets(dataset_type=OneDimensionalDataset)
>>> result = transform.apply_to(measurement)
>>>
>>> # Filter by predicate
>>> transform = FilterDatasets(
...     predicate=lambda ds: ds.conditions.get('temp', 0) > 25
... )
>>> result = transform.apply_to(measurement)

Notes

Returns a new Measurement with filtered datasets
Preserves measurement-level conditions and details
Empty dataset list is allowed if no datasets match

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Measurement.

__init__(predicate=None, dataset_type=None)[source]

Initialize FilterDatasets transform.

Parameters:

predicate (Callable[[Dataset], bool] | None) – Function that returns True for datasets to keep
dataset_type (type | None) – Filter by dataset type

class piblin_jax.transform.measurement.filter.FilterMeasurements(predicate)[source]

Bases: MeasurementSetTransform

Filter measurements within a MeasurementSet.

This transform filters measurements based on a predicate function. Returns a new MeasurementSet containing only the measurements that match the filter criteria.

Parameters:: predicate (Callable[[Measurement], bool]) – Function that returns True for measurements to keep
Raises:: TypeError – If predicate is not callable

Examples

>>> from piblin_jax.transform.measurement import FilterMeasurements
>>>
>>> # Filter by condition
>>> transform = FilterMeasurements(
...     predicate=lambda m: m.conditions.get('temp', 0) > 25
... )
>>> result = transform.apply_to(measurement_set)
>>>
>>> # Filter by replicate number
>>> transform = FilterMeasurements(
...     predicate=lambda m: m.conditions.get('replicate', 0) <= 3
... )
>>> result = transform.apply_to(measurement_set)

Notes

Returns a new MeasurementSet with filtered measurements
Preserves measurement-set-level conditions and details
Empty measurement list is allowed if no measurements match

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to MeasurementSet.

__init__(predicate)[source]

Initialize FilterMeasurements transform.

Parameters:: predicate (Callable[[Measurement], bool]) – Function that returns True for measurements to keep

class piblin_jax.transform.measurement.filter.MergeReplicates(strategy='average')[source]

Bases: MeasurementSetTransform

Merge measurements with identical conditions.

This transform groups measurements by their conditions and merges replicates (measurements with identical conditions) using either averaging or concatenation.

Parameters:: strategy (str, default 'average') – Merge strategy: ‘average’ or ‘concatenate’
Raises:: ValueError – If strategy is not ‘average’ or ‘concatenate’

Examples

>>> from piblin_jax.transform.measurement import MergeReplicates
>>>
>>> # Average replicate measurements
>>> transform = MergeReplicates(strategy='average')
>>> result = transform.apply_to(measurement_set)

Notes

Groups measurements by conditions (all key-value pairs must match)
For ‘average’ strategy: - Averages dependent variable data across replicates - Assumes all replicates have same independent variable data - Assumes all replicates have same dataset structure
For ‘concatenate’ strategy: - Currently returns first measurement (not yet implemented)
Preserves measurement-set-level conditions and details
Single measurements (no replicates) are returned unchanged

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to MeasurementSet.

__init__(strategy='average')[source]

Initialize MergeReplicates transform.

Parameters:: strategy (str, default 'average') – Merge strategy: ‘average’ or ‘concatenate’

class piblin_jax.transform.measurement.filter.SplitByRegion(regions)[source]

Bases: MeasurementTransform

Split datasets by multiple regions, creating new Measurement.

This transform splits each OneDimensionalDataset in a Measurement into multiple datasets based on the specified regions. Each region creates a new dataset containing only the data points within that region.

Parameters:: regions (list[LinearRegion]) – List of regions to split by
Raises:: ValueError – If regions list is empty

Examples

>>> from piblin_jax.data.roi import LinearRegion
>>> from piblin_jax.transform.measurement import SplitByRegion
>>>
>>> regions = [
...     LinearRegion(0, 5),
...     LinearRegion(5, 10)
... ]
>>> transform = SplitByRegion(regions)
>>> result = transform.apply_to(measurement)

Notes

Only processes OneDimensionalDataset objects
Non-1D datasets are silently skipped
Empty regions (no data points) are included as empty datasets
Preserves dataset-level conditions and details for each split
Preserves measurement-level conditions and details

Methods

`__call__`(target[, make_copy, ...])	Shorthand for apply_to.
`apply_to`(target[, make_copy, ...])	Apply transform to Measurement.

__init__(regions)[source]

Initialize SplitByRegion transform.

Parameters:: regions (list[LinearRegion]) – List of regions to split by

Transformations

Overview

Quick Examples

Basic Transform Pipeline

Lazy Evaluation for Performance

Custom Lambda Transforms

See Also

API Reference

Module Contents

Base Transform Classes

Pipeline

Region-Based Transforms

Lambda and Dynamic Transforms

Dataset Transforms

Overview

Smoothing Transforms

Interpolation Transforms

Normalization Transforms

Baseline Correction Transforms

Calculus Transforms

Measurement Transforms

Overview

Filtering Transforms