Uncertainty Quantification Tutorial
====================================

This tutorial demonstrates how to perform Bayesian uncertainty quantification
in piblin-jax for rheological and scientific data analysis. You'll learn how to:

- Fit Bayesian models to experimental data
- Obtain parameter estimates with credible intervals
- Generate predictions with uncertainty bands
- Propagate uncertainty through transforms
- Interpret posterior distributions

Why Uncertainty Quantification?
--------------------------------

Traditional least-squares fitting gives point estimates of parameters but
doesn't quantify uncertainty in those estimates. Bayesian methods provide:

**Full posterior distributions**
    Not just a single value, but the entire probability distribution over
    parameter values given your data.

**Credible intervals**
    Rigorous uncertainty ranges for parameters (e.g., "95% probability that
    K is between 4.5 and 5.5").

**Prediction intervals**
    Uncertainty bands for model predictions that account for both parameter
    uncertainty and measurement noise.

**Principled model comparison**
    Bayesian evidence for comparing competing models.

Basic Example: Power-Law Model
-------------------------------

Let's start with a simple shear-thinning fluid following the power-law model:

.. math::

   \eta = K \dot{\gamma}^{n-1}

where :math:`\eta` is viscosity, :math:`\dot{\gamma}` is shear rate,
:math:`K` is consistency index, and :math:`n` is flow behavior index.

Generate Synthetic Data
~~~~~~~~~~~~~~~~~~~~~~~

First, create some synthetic data with noise::

    import numpy as np
    from piblin_jax.bayesian.models import PowerLawModel

    # True parameters
    K_true = 5.0
    n_true = 0.6

    # Generate data
    np.random.seed(42)
    shear_rate = np.logspace(-1, 2, 20)
    viscosity_true = K_true * shear_rate ** (n_true - 1)
    viscosity = viscosity_true + np.random.normal(0, 0.5, size=len(shear_rate))

Fit Bayesian Model
~~~~~~~~~~~~~~~~~~

Now fit the Bayesian power-law model::

    # Create and fit model
    model = PowerLawModel(n_samples=2000, n_warmup=1000)
    model.fit(shear_rate, viscosity)

    # Check if sampling succeeded
    if model.is_fitted:
        print("Model fitting successful!")
    else:
        print("Warning: Check sampling diagnostics")

The model uses MCMC (Markov Chain Monte Carlo) to sample from the posterior
distribution. ``n_samples=2000`` specifies the number of posterior samples,
and ``n_warmup=1000`` is the burn-in period.

Examine Results
~~~~~~~~~~~~~~~

View a summary of the posterior distribution::

    summary = model.summary()
    print(summary)

Output::

    Parameter Posterior Summary:
    ----------------------------
    K: mean=5.02, std=0.15, 95% CI=[4.73, 5.31]
    n: mean=0.598, std=0.012, 95% CI=[0.575, 0.621]
    sigma: mean=0.51, std=0.09, 95% CI=[0.38, 0.71]

The summary shows:

- **mean**: Posterior mean (point estimate)
- **std**: Posterior standard deviation (uncertainty)
- **95% CI**: 95% credible interval

Extract Specific Parameters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Access posterior samples directly::

    # Get posterior samples
    samples = model.samples
    K_samples = samples['K']
    n_samples = samples['n']

    # Calculate custom statistics
    K_median = np.median(K_samples)
    K_90_interval = np.percentile(K_samples, [5, 95])

    print(f"K median: {K_median:.3f}")
    print(f"K 90% interval: [{K_90_interval[0]:.3f}, {K_90_interval[1]:.3f}]")

Visualize Results
~~~~~~~~~~~~~~~~~

Generate diagnostic plots::

    import matplotlib.pyplot as plt

    # Plot fit with uncertainty
    fig, axes = model.plot_fit(
        shear_rate,
        viscosity,
        show_uncertainty=True,
        uncertainty_level=0.95
    )
    plt.savefig('power_law_fit.png', dpi=300)
    plt.show()

The plot shows:

- Data points (observed viscosity)
- Mean prediction (posterior mean fit)
- 95% credible interval (shaded band)

Make Predictions
~~~~~~~~~~~~~~~~

Predict at new shear rates with uncertainty::

    # New prediction points
    new_shear_rate = np.logspace(-2, 3, 100)

    # Predict with uncertainty
    predictions = model.predict(new_shear_rate, return_uncertainty=True)
    mean_pred = predictions['mean']
    lower_pred = predictions['lower']
    upper_pred = predictions['upper']

    # Plot predictions
    plt.figure(figsize=(10, 6))
    plt.plot(new_shear_rate, mean_pred, 'r-', label='Mean prediction')
    plt.fill_between(new_shear_rate, lower_pred, upper_pred,
                     alpha=0.3, label='95% prediction interval')
    plt.scatter(shear_rate, viscosity, c='k', label='Data')
    plt.xscale('log')
    plt.yscale('log')
    plt.xlabel('Shear Rate (1/s)')
    plt.ylabel('Viscosity (Pa·s)')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.show()

Working with Datasets
---------------------

Quantiq integrates Bayesian uncertainty with the Dataset API::

    from piblin_jax.data.datasets import OneDimensionalDataset

    # Create dataset
    dataset = OneDimensionalDataset(
        independent_variable_data=shear_rate,
        dependent_variable_data=viscosity,
        conditions={'temperature': 25.0, 'sample': 'A1'}
    )

    # Fit model to dataset
    model = PowerLawModel(n_samples=2000)
    model.fit(dataset.independent_variable_data,
              dataset.dependent_variable_data)

    # Add uncertainty to dataset
    dataset_with_unc = dataset.with_uncertainty(
        model=model,
        n_samples=1000,
        keep_samples=True
    )

    # Check uncertainty status
    print(f"Has uncertainty: {dataset_with_unc.has_uncertainty}")

    # Get credible intervals
    lower, upper = dataset_with_unc.get_credible_intervals(level=0.95)

Propagating Uncertainty Through Transforms
-------------------------------------------

Uncertainty can be propagated through data transformations::

    from piblin_jax.transform.dataset import GaussianSmoothing

    # Create dataset with uncertainty
    dataset_with_unc = dataset.with_uncertainty(
        model=model,
        n_samples=1000,
        keep_samples=True
    )

    # Apply transform with uncertainty propagation
    smoother = GaussianSmoothing(sigma=2.0)
    smoothed = smoother.apply_to(
        dataset_with_unc,
        propagate_uncertainty=True
    )

    # Uncertainty is now propagated through the transform
    print(f"Smoothed has uncertainty: {smoothed.has_uncertainty}")

Advanced Example: Arrhenius Model
----------------------------------

Temperature-dependent viscosity following Arrhenius equation:

.. math::

   \eta = A \exp\left(\frac{E_a}{RT}\right)

where :math:`A` is pre-exponential factor, :math:`E_a` is activation energy,
:math:`R` is gas constant, and :math:`T` is temperature.

Generate Temperature-Dependent Data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

    from piblin_jax.bayesian.models import ArrheniusModel

    # True parameters
    A_true = 1e-3  # Pa·s
    Ea_true = 50000  # J/mol
    R = 8.314  # J/(mol·K)

    # Temperature range (K)
    temperature = np.linspace(273, 373, 15)

    # Generate data with noise
    np.random.seed(42)
    viscosity_true = A_true * np.exp(Ea_true / (R * temperature))
    viscosity = viscosity_true * np.random.lognormal(0, 0.1, size=len(temperature))

Fit Arrhenius Model
~~~~~~~~~~~~~~~~~~~

::

    # Fit model
    model = ArrheniusModel(n_samples=2000)
    model.fit(temperature, viscosity)

    # View results
    print(model.summary())

    # Plot fit
    fig, axes = model.plot_fit(temperature, viscosity, show_uncertainty=True)
    plt.xlabel('Temperature (K)')
    plt.ylabel('Viscosity (Pa·s)')
    plt.show()

Interpreting Results
~~~~~~~~~~~~~~~~~~~~

The posterior samples reveal parameter correlations::

    import matplotlib.pyplot as plt

    samples = model.samples
    A_samples = samples['A']
    Ea_samples = samples['Ea']

    # Plot joint distribution
    plt.figure(figsize=(8, 6))
    plt.scatter(A_samples, Ea_samples / 1000, alpha=0.3, s=1)
    plt.xlabel('A (Pa·s)')
    plt.ylabel('Ea (kJ/mol)')
    plt.title('Joint Posterior Distribution')
    plt.grid(True, alpha=0.3)
    plt.show()

This scatter plot reveals correlation between :math:`A` and :math:`E_a`:
if :math:`A` is higher, :math:`E_a` tends to be higher too.

Model Comparison
----------------

Compare different rheological models using Bayesian evidence::

    from piblin_jax.bayesian.models import PowerLawModel, CrossModel

    # Fit competing models
    power_law = PowerLawModel(n_samples=2000)
    power_law.fit(shear_rate, viscosity)

    cross = CrossModel(n_samples=2000)
    cross.fit(shear_rate, viscosity)

    # Compare using information criteria
    power_law_aic = power_law.aic()
    cross_aic = cross.aic()

    print(f"Power-law AIC: {power_law_aic:.1f}")
    print(f"Cross AIC: {cross_aic:.1f}")

    if cross_aic < power_law_aic:
        print("Cross model is preferred (lower AIC)")
    else:
        print("Power-law model is preferred (lower AIC)")

Lower AIC indicates better model fit penalized for complexity.

Tips and Best Practices
------------------------

**Number of samples**
    Use at least 1000-2000 samples for reliable uncertainty estimates.
    More samples give smoother distributions but take longer.

**Convergence diagnostics**
    Always check ``model.is_fitted`` and examine trace plots to ensure
    MCMC chains have converged::

        if not model.is_fitted:
            print("Warning: Sampling may not have converged")
            # Increase n_samples or n_warmup

**Prior sensitivity**
    Bayesian results depend on priors. piblin-jax uses weakly informative
    priors by default. For custom priors, see the API documentation.

**Computational cost**
    Bayesian fitting is ~10-100x slower than NLSQ. Use NLSQ first for
    initial exploration, then Bayesian for final analysis with uncertainty.

**Uncertainty vs confidence**
    Credible intervals (Bayesian) have a different interpretation than
    confidence intervals (frequentist). A 95% credible interval means
    "95% probability the parameter is in this range given the data."

Next Steps
----------

- See :doc:`rheological_models` for detailed model descriptions
- See :doc:`custom_transforms` to create uncertainty-aware transforms
- See :doc:`../user_guide/uncertainty` for complete API reference
- See ``examples/bayesian_rheological_models.py`` for full code examples

References
----------

- Gelman, A., et al. (2013). Bayesian Data Analysis, 3rd Edition.
  Chapman and Hall/CRC.
- McElreath, R. (2020). Statistical Rethinking, 2nd Edition.
  CRC Press.