Core Concepts
=============

This guide explains the fundamental concepts and architecture of piblin-jax.

Architecture Overview
---------------------

piblin-jax is built on a layered architecture designed for performance, composability, and ease of use.

Layered Architecture
^^^^^^^^^^^^^^^^^^^^

.. code-block:: text

    ┌───────────────────────────────────────────────────────────────────┐
    │                      Application Layer                            │
    │  User Scripts, Notebooks, Custom Analysis, Rheology Applications  │
    └───────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
    ┌───────────────────────────────────────────────────────────────────┐
    │                        User API Layer                             │
    ├───────────────────────────────────────────────────────────────────┤
    │  ┌──────────────┐  ┌─────────────┐  ┌──────────────┐              │
    │  │   Data       │  │ Collections │  │   File I/O   │              │
    │  │ Structures   │  │ (Measurement│  │  (Readers/   │              │
    │  │ (Datasets)   │  │      Set)   │  │   Writers)   │              │
    │  └──────────────┘  └─────────────┘  └──────────────┘              │
    └───────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
    ┌───────────────────────────────────────────────────────────────────┐
    │                      Core Processing Layer                        │
    ├───────────────────────────────────────────────────────────────────┤
    │  ┌──────────────┐  ┌─────────────┐  ┌──────────────┐              │
    │  │  Transform   │  │   Fitting   │  │   Bayesian   │              │
    │  │   Pipeline   │  │   (NLSQ)    │  │  (NumPyro)   │              │
    │  │   System     │  │             │  │   Models     │              │
    │  └──────────────┘  └─────────────┘  └──────────────┘              │
    └───────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
    ┌───────────────────────────────────────────────────────────────────┐
    │                    Backend Abstraction Layer                      │
    ├───────────────────────────────────────────────────────────────────┤
    │  Array Operations  │  Math Functions   │  Backend Detection       │
    │  (JAX/NumPy)       │  (exp, sin, ...)  │  (jax/numpy)             │
    └───────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
    ┌───────────────────────────────────────────────────────────────────┐
    │                       Foundation Layer                            │
    ├───────────────────────────────────────────────────────────────────┤
    │  ┌──────────────┐  ┌─────────────┐  ┌──────────────┐              │
    │  │     JAX      │  │   NumPyro   │  │    NumPy     │              │
    │  │ (JIT, GPU,   │  │   (MCMC,    │  │ (Fallback)   │              │
    │  │  Auto-diff)  │  │  Bayesian)  │  │              │              │
    │  └──────────────┘  └─────────────┘  └──────────────┘              │
    └───────────────────────────────────────────────────────────────────┘

Module Organization
^^^^^^^^^^^^^^^^^^^

.. code-block:: text

    piblin_jax/
    ├── data/                    # Data structures
    │   ├── datasets/            # Dataset types (0D, 1D, 2D, 3D)
    │   │   ├── base.py          # BaseDataset abstract class
    │   │   ├── zero_dimensional.py
    │   │   ├── one_dimensional.py
    │   │   ├── two_dimensional.py
    │   │   └── three_dimensional.py
    │   ├── collections/         # Hierarchical organization
    │   │   ├── measurement.py   # Single measurement
    │   │   ├── measurement_set.py
    │   │   └── experiment.py
    │   └── metadata/            # Metadata handling
    │       ├── core.py          # Metadata validation
    │       └── roi.py           # Region of Interest
    │
    ├── transform/               # Data transformation
    │   ├── pipeline.py          # Pipeline composition
    │   ├── base.py              # Transform abstract class
    │   ├── dataset/             # Dataset transforms
    │   │   ├── smoothing.py     # Gaussian smoothing
    │   │   ├── interpolation.py # 1D interpolation
    │   │   └── normalization.py # Data normalization
    │   ├── region/              # Region transforms
    │   │   └── selection.py     # ROI selection
    │   └── lambda_transform.py  # Custom transforms
    │
    ├── bayesian/                # Bayesian inference
    │   ├── base.py              # BayesianModel class
    │   └── rheology/            # Rheological models
    │       ├── power_law.py     # Power law model
    │       ├── arrhenius.py     # Arrhenius model
    │       ├── cross.py         # Cross model
    │       └── carreau_yasuda.py
    │
    ├── fitting/                 # NLSQ curve fitting
    │   ├── curve_fit.py         # Main fitting interface
    │   └── parameter_estimation.py
    │
    ├── dataio/                  # File I/O
    │   ├── readers/             # Data readers
    │   │   ├── csv_reader.py
    │   │   └── registry.py      # Reader registry
    │   └── writers/             # Data writers
    │       ├── csv_writer.py
    │       └── registry.py
    │
    └── backend/                 # Backend abstraction
        ├── detection.py         # Detect JAX/NumPy
        ├── array.py             # Array interface
        └── math.py              # Math functions

Data Flow
^^^^^^^^^

Typical workflow through the layers:

.. code-block:: text

    [CSV File] ──────────────► dataio.readers
                                     │
                                     ▼
                              OneDimensionalDataset
                                     │
                                     ▼
                              Transform Pipeline
                              ┌─────────────────┐
                              │  Smoothing      │
                              │       ↓         │
                              │  Interpolation  │
                              │       ↓         │
                              │  Normalization  │
                              └─────────────────┘
                                     │
                                     ▼
                              Processed Dataset ──► Fitting/Bayesian
                                                         │
                                                         ▼
                                                    Parameters +
                                                   Uncertainties
                                                         │
                                                         ▼
                                              [Results/Visualization]

Each transform in the pipeline:

1. **Receives**: An immutable dataset
2. **Applies**: JAX-optimized operations via the backend layer
3. **Returns**: A new immutable dataset (the original is unchanged)
4. **Records**: The applied transformation in the metadata history, automatically
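
The receive/apply/return/record cycle above can be illustrated with a minimal
sketch in plain Python. ``SketchDataset`` and ``scale_y`` are hypothetical
stand-ins invented for this illustration; they are not the piblin_jax classes,
and the real history tracking may be stored differently.

.. code-block:: python

    from dataclasses import dataclass, replace
    from typing import Tuple


    @dataclass(frozen=True)
    class SketchDataset:
        """Hypothetical immutable 1D dataset (a stand-in, not the piblin_jax class)."""

        x: Tuple[float, ...]
        y: Tuple[float, ...]
        history: Tuple[str, ...] = ()


    def scale_y(dataset: SketchDataset, factor: float) -> SketchDataset:
        """Return a new dataset with y scaled by factor; the input is untouched."""
        new_y = tuple(factor * value for value in dataset.y)
        entry = f"scale_y(factor={factor})"
        # dataclasses.replace builds a fresh instance instead of mutating in place.
        return replace(dataset, y=new_y, history=dataset.history + (entry,))


    original = SketchDataset(x=(0.0, 1.0, 2.0), y=(1.0, 2.0, 3.0))
    scaled = scale_y(original, 2.0)

After the call, ``original`` still holds its initial values, while ``scaled``
carries both the new ``y`` and a one-entry history.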

Key Design Principles
---------------------

1. **Immutability**: Data structures are immutable by default
2. **Composability**: Transforms can be composed into pipelines
3. **Type Safety**: Comprehensive type hints throughout
4. **Performance**: JIT compilation and optional GPU execution through JAX
5. **Compatibility**: 100% backward compatible with piblin

Data Structures
---------------

Datasets
^^^^^^^^

Datasets are the core data containers in piblin-jax. They are immutable, and each
dataset class corresponds to the dimensionality of the data it holds:

**Zero-Dimensional Dataset**
  Scalar values with uncertainty::

    from piblin_jax.data import ZeroDimensionalDataset
    temperature = ZeroDimensionalDataset(value=25.0, uncertainty=0.5)

**One-Dimensional Dataset**
  Arrays of (x, y) data::

    from piblin_jax.data import OneDimensionalDataset
    dataset = OneDimensionalDataset(x=x_values, y=y_values)

**Two-Dimensional Dataset**
  Gridded data (x, y, z)::

    from piblin_jax.data import TwoDimensionalDataset
    surface = TwoDimensionalDataset(x=x, y=y, z=z_grid)

**Three-Dimensional Dataset**
  Volumetric data::

    from piblin_jax.data import ThreeDimensionalDataset
    volume = ThreeDimensionalDataset(x=x, y=y, z=z, data=data_3d)

**Composite Datasets**
  Multiple related datasets::

    from piblin_jax.data import CompositeDataset
    composite = CompositeDataset(datasets={'temp': temp_ds, 'pressure': pressure_ds})

Collections
^^^^^^^^^^^

Collections organize multiple datasets hierarchically:

**Measurement**
  Related datasets from a single experimental run::

    from piblin_jax.data.collections import Measurement
    measurement = Measurement(name='Trial 1')
    measurement.add_dataset('temperature', temp_dataset)

**MeasurementSet**
  Multiple related measurements::

    from piblin_jax.data.collections import MeasurementSet
    measurement_set = MeasurementSet(name='Daily Experiments')

**Experiment**
  Hierarchical organization of measurements::

    from piblin_jax.data.collections import Experiment
    experiment = Experiment(name='Rheology Study')
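
To make the nesting concrete, the following sketch models the hierarchy with
plain dataclasses. The ``Sketch*`` classes, their fields, and the assumption
that an experiment holds measurement sets (which in turn hold measurements) are
illustrative only; the actual piblin_jax collections may be organized
differently.

.. code-block:: python

    from dataclasses import dataclass, field
    from typing import Dict, Iterator, List, Tuple


    @dataclass
    class SketchMeasurement:
        """Hypothetical measurement: named datasets from one experimental run."""

        name: str
        datasets: Dict[str, List[float]] = field(default_factory=dict)


    @dataclass
    class SketchMeasurementSet:
        """Hypothetical group of related measurements."""

        name: str
        measurements: List[SketchMeasurement] = field(default_factory=list)


    @dataclass
    class SketchExperiment:
        """Hypothetical top level: a list of measurement sets."""

        name: str
        measurement_sets: List[SketchMeasurementSet] = field(default_factory=list)

        def iter_datasets(self) -> Iterator[Tuple[str, str, str, List[float]]]:
            """Yield (set name, measurement name, dataset key, data) for each dataset."""
            for mset in self.measurement_sets:
                for measurement in mset.measurements:
                    for key, data in measurement.datasets.items():
                        yield mset.name, measurement.name, key, data


    trial = SketchMeasurement("Trial 1", {"temperature": [25.0, 25.1]})
    daily = SketchMeasurementSet("Daily Experiments", [trial])
    study = SketchExperiment("Rheology Study", [daily])
    paths = list(study.iter_datasets())

Walking the hierarchy this way gives every dataset a full path, which is what
makes batch processing over a collection straightforward.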

Metadata System
^^^^^^^^^^^^^^^

All data structures support rich metadata::

    dataset = OneDimensionalDataset(
        x=x, y=y,
        metadata={
            'sample_id': 'ABC123',
            'temperature': 25.0,
            'operator': 'Alice',
            'timestamp': '2025-10-19T12:00:00'
        }
    )
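
One common use of such metadata is selecting datasets by criteria. The sketch
below works on plain dictionaries rather than piblin_jax objects; the
``matches`` helper and the extra sample records are made up for illustration.

.. code-block:: python

    from typing import Any, Dict, List


    def matches(metadata: Dict[str, Any], **criteria: Any) -> bool:
        """Return True when every criterion equals the corresponding metadata value."""
        return all(metadata.get(key) == value for key, value in criteria.items())


    # Illustrative metadata records (values invented for this example).
    records: List[Dict[str, Any]] = [
        {"sample_id": "ABC123", "temperature": 25.0, "operator": "Alice"},
        {"sample_id": "ABC124", "temperature": 40.0, "operator": "Bob"},
        {"sample_id": "ABC125", "temperature": 25.0, "operator": "Bob"},
    ]

    # Keep only the samples measured at 25 degrees.
    at_25 = [record["sample_id"] for record in records if matches(record, temperature=25.0)]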

Transforms
----------

Transform Types
^^^^^^^^^^^^^^^

**Dataset Transforms**
  Operate on individual datasets:

  - ``GaussianSmoothing``: Smooth noisy data
  - ``Interpolate1D``: Interpolate to new x-values
  - ``Normalization``: Normalize data
  - ``Derivative``: Numerical differentiation
  - ``Integral``: Numerical integration
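
As a rough picture of what numerical differentiation and integration involve,
here is a pure-Python sketch using central differences and the trapezoidal
rule. The function names are hypothetical, and the schemes used by
``Derivative`` and ``Integral`` in piblin_jax are not specified here and may
differ.

.. code-block:: python

    from typing import List, Sequence


    def central_derivative(x: Sequence[float], y: Sequence[float]) -> List[float]:
        """Finite-difference dy/dx: central inside, one-sided at both ends."""
        n = len(x)
        if n != len(y) or n < 2:
            raise ValueError("x and y must have the same length, at least 2")
        slopes = [(y[1] - y[0]) / (x[1] - x[0])]
        for i in range(1, n - 1):
            slopes.append((y[i + 1] - y[i - 1]) / (x[i + 1] - x[i - 1]))
        slopes.append((y[-1] - y[-2]) / (x[-1] - x[-2]))
        return slopes


    def trapezoid_integral(x: Sequence[float], y: Sequence[float]) -> float:
        """Integrate y over x with the trapezoidal rule."""
        return sum(
            0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1)
        )


    x = [0.0, 1.0, 2.0, 3.0]
    y = [2.0 * xi + 1.0 for xi in x]  # straight line with slope 2
    slopes = central_derivative(x, y)
    area = trapezoid_integral(x, y)

For a straight line both schemes are exact, which makes this a convenient
sanity check before applying them to noisy data.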

**Region Transforms**
  Select or modify regions:

  - ``SelectRegion``: Extract data within bounds
  - ``RemoveRegion``: Remove data within bounds
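
The two region operations are complements of each other, as the following
plain-Python sketch shows. The helper names and the choice of inclusive bounds
are assumptions made for this illustration, not the documented behavior of
``SelectRegion`` or ``RemoveRegion``.

.. code-block:: python

    from typing import List, Sequence, Tuple

    Points = Tuple[List[float], List[float]]


    def select_region(x: Sequence[float], y: Sequence[float],
                      lower: float, upper: float) -> Points:
        """Keep only points with lower <= x <= upper (inclusive bounds assumed)."""
        kept = [(xi, yi) for xi, yi in zip(x, y) if lower <= xi <= upper]
        return [p[0] for p in kept], [p[1] for p in kept]


    def remove_region(x: Sequence[float], y: Sequence[float],
                      lower: float, upper: float) -> Points:
        """Drop points with lower <= x <= upper and keep everything else."""
        kept = [(xi, yi) for xi, yi in zip(x, y) if not (lower <= xi <= upper)]
        return [p[0] for p in kept], [p[1] for p in kept]


    x = [0.0, 1.0, 2.0, 3.0, 4.0]
    y = [10.0, 11.0, 12.0, 13.0, 14.0]
    inside = select_region(x, y, 1.0, 3.0)
    outside = remove_region(x, y, 1.0, 3.0)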

**Measurement Transforms**
  Operate on measurements:

  - ``Filter``: Filter measurements by criteria

**Lambda Transforms**
  Custom transformations::

    from piblin_jax.transform import LambdaTransform

    # ``modify`` is any user-defined function that takes a dataset
    # and returns a new one
    custom = LambdaTransform(func=modify)

Transform Pipeline
^^^^^^^^^^^^^^^^^^

Compose transforms into reusable pipelines::

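    # Interpolate1D is assumed to be exported from piblin_jax.transform
    # alongside the other transforms
    from piblin_jax.transform import (
        GaussianSmoothing,
        Interpolate1D,
        Normalization,
        Pipeline,
    )

    # new_points holds the x-values to interpolate onto (defined elsewhere)
    pipeline = Pipeline([
        GaussianSmoothing(sigma=2.0),
        Interpolate1D(new_x=new_points),
        Normalization(method='minmax')
    ])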

    result = pipeline.apply_to(dataset)

Pipelines are:

- **Reusable**: Apply to multiple datasets
- **Composable**: Nest pipelines within pipelines
- **Serializable**: Save and load pipeline configurations
- **Optimized**: JAX automatically optimizes execution
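
Reuse and nesting both follow from treating a pipeline as just another callable
step. The sketch below shows that idea on plain lists of floats;
``SketchPipeline`` and the step functions are hypothetical and only mirror the
``apply_to`` name used above, not the internals of ``Pipeline``.

.. code-block:: python

    from typing import Callable, List, Sequence

    Values = List[float]
    Step = Callable[[Values], Values]


    class SketchPipeline:
        """Hypothetical pipeline: applies its steps in order to a list of values."""

        def __init__(self, steps: Sequence[Step]):
            self.steps = list(steps)

        def apply_to(self, values: Values) -> Values:
            result = list(values)  # copy, so the caller's list is never mutated
            for step in self.steps:
                result = step(result)
            return result

        # Making the pipeline callable lets it serve as a step in another pipeline.
        __call__ = apply_to


    def shift_to_zero(values: Values) -> Values:
        low = min(values)
        return [v - low for v in values]


    def scale_to_unit(values: Values) -> Values:
        high = max(values)
        return [v / high for v in values] if high else list(values)


    def double(values: Values) -> Values:
        return [2.0 * v for v in values]


    minmax = SketchPipeline([shift_to_zero, scale_to_unit])
    full = SketchPipeline([double, minmax])  # a pipeline nested in a pipeline

    first = full.apply_to([1.0, 2.0, 3.0])
    second = full.apply_to([10.0, 20.0])  # same pipeline, different data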

Backend Abstraction
-------------------

piblin-jax abstracts numerical operations through a backend layer:

.. code-block:: python

    from piblin_jax.backend import get_backend, array, exp, sin

    # Get current backend
    backend = get_backend()  # 'jax' or 'numpy'

    # Backend-agnostic operations
    x = array([1.0, 2.0, 3.0])
    y = exp(sin(x))

This allows:

- **Transparent GPU acceleration** when JAX is available
- **Fallback to NumPy** for compatibility
- **Consistent API** regardless of backend
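
The detection step behind this fallback can be sketched with the standard
library alone. ``detect_backend`` and ``load_array_module`` are hypothetical
names for this illustration; how ``piblin_jax.backend.detection`` actually
decides is not described in this guide.

.. code-block:: python

    import importlib.util


    def detect_backend() -> str:
        """Return 'jax' when JAX can be found on the import path, else 'numpy'."""
        return "jax" if importlib.util.find_spec("jax") is not None else "numpy"


    def load_array_module(backend: str):
        """Import and return the array namespace for the given backend name."""
        if backend == "jax":
            import jax.numpy as xp
        elif backend == "numpy":
            import numpy as xp
        else:
            raise ValueError(f"unknown backend: {backend!r}")
        return xp


    backend = detect_backend()

Code written against the returned namespace (for example
``xp = load_array_module(backend)`` followed by ``xp.exp(xp.sin(x))``) then
runs unchanged on either backend, because ``jax.numpy`` mirrors much of the
NumPy API.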

Bayesian Inference
------------------

piblin-jax integrates NumPyro for Bayesian parameter estimation:

Model Structure
^^^^^^^^^^^^^^^

All Bayesian models inherit from ``BayesianModel``::

    import numpyro
    import numpyro.distributions as dist

    from piblin_jax.bayesian import BayesianModel

    class MyModel(BayesianModel):
        def model(self, x, y=None):
            # Prior on the slope parameter
            param1 = numpyro.sample('param1', dist.Normal(0, 1))

            # Likelihood: linear prediction with fixed observation noise (sd = 0.1)
            y_pred = param1 * x
            numpyro.sample('obs', dist.Normal(y_pred, 0.1), obs=y)

Built-in Models
^^^^^^^^^^^^^^^

- **PowerLawModel**: :math:`\eta = K \dot{\gamma}^{n-1}`
- **ArrheniusModel**: :math:`\eta = A \exp\left(\frac{E_a}{R T}\right)`
- **CrossModel**: Flow curves with plateaus
- **CarreauYasudaModel**: Complex rheological behavior

See :doc:`uncertainty` for details.
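
The two closed-form expressions listed above can be evaluated directly, which
is useful for checking fitted parameters by hand. The functions below are a
standalone sketch, not the piblin_jax model classes, and the parameter values
are arbitrary illustrations.

.. code-block:: python

    import math

    R_GAS = 8.314462618  # molar gas constant in J/(mol K)


    def power_law_viscosity(shear_rate: float, K: float, n: float) -> float:
        """Power law: eta = K * shear_rate**(n - 1)."""
        return K * shear_rate ** (n - 1.0)


    def arrhenius_viscosity(temperature_K: float, A: float, E_a: float) -> float:
        """Arrhenius: eta = A * exp(E_a / (R * T)), with T in kelvin."""
        return A * math.exp(E_a / (R_GAS * temperature_K))


    # n < 1 gives shear thinning; n = 1 recovers a constant (Newtonian) viscosity.
    eta_thin = power_law_viscosity(shear_rate=100.0, K=10.0, n=0.5)
    eta_newtonian = power_law_viscosity(shear_rate=100.0, K=10.0, n=1.0)

    # With a positive activation energy, viscosity falls as temperature rises.
    eta_cold = arrhenius_viscosity(300.0, A=1e-3, E_a=20_000.0)
    eta_hot = arrhenius_viscosity(350.0, A=1e-3, E_a=20_000.0)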

Uncertainty Propagation
^^^^^^^^^^^^^^^^^^^^^^^

Uncertainties propagate through transforms automatically when using
Bayesian models.
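
One generic way to think about this propagation is sample-based: each
posterior draw of the parameters is pushed through the model, and the spread of
the results gives the uncertainty of the prediction. The sketch below uses
made-up Gaussian draws in place of real posterior samples and standard-library
tools only; it is an illustration of the idea, not the mechanism piblin_jax
uses internally.

.. code-block:: python

    import random
    import statistics

    rng = random.Random(0)  # fixed seed so the sketch is reproducible

    # Stand-in "posterior" draws for power-law parameters (invented, not fitted).
    K_samples = [rng.gauss(10.0, 0.5) for _ in range(2000)]
    n_samples = [rng.gauss(0.5, 0.02) for _ in range(2000)]


    def predict(shear_rate: float, K: float, n: float) -> float:
        """Power-law viscosity for one parameter draw."""
        return K * shear_rate ** (n - 1.0)


    # Push every draw through the model, then summarize the resulting spread.
    predictions = sorted(predict(100.0, K, n) for K, n in zip(K_samples, n_samples))
    mean_eta = statistics.fmean(predictions)
    lower = predictions[int(0.025 * len(predictions))]  # approx. 2.5th percentile
    upper = predictions[int(0.975 * len(predictions))]  # approx. 97.5th percentile

The interval ``(lower, upper)`` is an approximate 95% credible band for the
predicted viscosity at a shear rate of 100, under the assumed draws.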

piblin Compatibility
--------------------

piblin-jax maintains 100% API compatibility with piblin:

Compatibility Layer
^^^^^^^^^^^^^^^^^^^

::

    import piblin_jax as piblin

    # All piblin code works unchanged
    data = piblin.read_file('data.csv')
    # ... existing piblin workflow ...

This allows gradual migration and A/B testing of performance.

Performance Optimization
------------------------

JAX Integration
^^^^^^^^^^^^^^^

piblin-jax leverages JAX for:

- **JIT Compilation**: Functions are traced once and compiled with XLA
- **Vectorization**: Batched operations without explicit Python loops
- **GPU Acceleration**: The same code runs on a GPU when one is available
- **Auto-differentiation**: Gradients for Bayesian inference
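
These features correspond to standard JAX function transformations. The sketch
below applies ``jax.jit``, ``jax.vmap``, and ``jax.grad`` to a standalone
power-law function; it requires JAX to be installed, and the ``power_law``
function is defined here for illustration rather than taken from piblin_jax.

.. code-block:: python

    import jax
    import jax.numpy as jnp


    def power_law(shear_rate, K, n):
        """Power-law viscosity: K * shear_rate**(n - 1)."""
        return K * shear_rate ** (n - 1.0)


    # JIT: trace the function once and reuse the compiled version afterwards.
    fast_power_law = jax.jit(power_law)

    # vmap: evaluate for a batch of K values without writing a Python loop.
    batched_over_K = jax.vmap(power_law, in_axes=(None, 0, None))

    # grad: derivative of the (scalar) viscosity with respect to n.
    d_eta_dn = jax.grad(power_law, argnums=2)

    rates = jnp.array([1.0, 10.0, 100.0])
    eta = fast_power_law(rates, 10.0, 0.5)
    eta_batch = batched_over_K(rates, jnp.array([5.0, 10.0]), 0.5)
    sensitivity = d_eta_dn(10.0, 10.0, 0.5)

Here ``eta_batch`` has one row per ``K`` value and one column per shear rate,
and ``sensitivity`` matches the analytic derivative
:math:`K \dot{\gamma}^{n-1} \ln \dot{\gamma}`.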

Lazy Evaluation
^^^^^^^^^^^^^^^

Operations are lazy when possible, deferring computation until needed.
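
The general pattern of deferring work until a result is requested, then caching
it, can be sketched as follows. ``Lazy`` is a hypothetical helper written for
this illustration; this guide does not describe how piblin_jax implements
laziness internally.

.. code-block:: python

    from typing import Callable, Generic, List, Optional, TypeVar

    T = TypeVar("T")


    class Lazy(Generic[T]):
        """Hold a computation and run it at most once, on first access."""

        def __init__(self, compute: Callable[[], T]):
            self._compute = compute
            self._value: Optional[T] = None
            self._done = False

        def get(self) -> T:
            if not self._done:
                self._value = self._compute()  # first access triggers the work
                self._done = True
            return self._value  # later accesses reuse the cached result


    calls: List[str] = []


    def expensive_sum() -> int:
        calls.append("ran")  # record each real execution
        return sum(range(1000))


    lazy_total = Lazy(expensive_sum)
    calls_before = len(calls)  # nothing has been computed yet
    total = lazy_total.get()
    total_again = lazy_total.get()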

Batching
^^^^^^^^

Process multiple datasets efficiently using collections.

Type System
-----------

piblin-jax is fully typed with comprehensive type hints::

    from piblin_jax.data import OneDimensionalDataset

    def process_data(
        dataset: OneDimensionalDataset,
        sigma: float = 1.0,
        normalize: bool = True
    ) -> OneDimensionalDataset:
        ...

This enables:

- **IDE autocomplete**
- **Static type checking** with mypy
- **Better documentation**
- **Fewer runtime errors**

Best Practices
--------------

1. **Use Pipelines**: Compose transforms for reusability
2. **Leverage Collections**: Organize related datasets
3. **Add Metadata**: Document your data
4. **Type Annotations**: Use type hints in custom code
5. **Immutability**: Don't modify data in-place
6. **Use the GPU Selectively**: Reserve GPU execution for large datasets (>10k points)

Example: Complete Workflow
---------------------------

::

    import piblin_jax
    from piblin_jax.bayesian import PowerLawModel
    from piblin_jax.data.collections import MeasurementSet
    from piblin_jax.transform import Pipeline, GaussianSmoothing, Normalization

    # Load data
    datasets = [piblin_jax.read_file(f'sample_{i}.csv') for i in range(10)]

    # Create measurement set
    ms = MeasurementSet.from_datasets(datasets)

    # Define pipeline
    pipeline = Pipeline([
        GaussianSmoothing(sigma=2.0),
        Normalization(method='minmax')
    ])

    # Process all datasets
    processed = ms.apply_transform(pipeline)

    # Bayesian analysis: fit a power-law model to each flow curve
    model = PowerLawModel()

    for measurement in processed.measurements:
        ds = measurement.get_dataset('flow_curve')
        model.fit(ds.x, ds.y)
        print(model.summary())

Next Steps
----------

- **Hands-on Tutorial**: :doc:`../tutorials/basic_workflow`
- **Uncertainty Quantification**: :doc:`uncertainty`
- **Performance Tips**: :doc:`performance`
- **API Reference**: :doc:`../api/index`