Basic Workflow Tutorial
=======================

This tutorial demonstrates a complete data analysis workflow using piblin-jax,
from data loading through transformation to visualization and analysis.

Overview
--------

We'll walk through a typical rheology experiment workflow:

1. Load experimental data
2. Inspect and visualize raw data
3. Apply smoothing to reduce noise
4. Interpolate onto a regular grid
5. Combine transforms into a reusable pipeline
6. Extract regions of interest
7. Compute derived quantities (shear stress)
8. Perform statistical analysis and fit a power law
9. Generate publication-quality plots
10. Compare multiple samples

This tutorial assumes you have piblin-jax installed. See
:doc:`../user_guide/installation` if you need to install it first.

Step 1: Loading Data
---------------------

Let's start by loading some experimental rheology data. We'll create synthetic
data for this tutorial, but in practice you'd load from a file.

Creating Sample Data
^^^^^^^^^^^^^^^^^^^^

::

    import numpy as np
    import matplotlib.pyplot as plt
    from piblin_jax.data import OneDimensionalDataset

    # Generate synthetic flow curve data
    # (shear rate vs viscosity for a shear-thinning fluid)
    np.random.seed(42)

    # Shear rate from 0.1 to 100 s^-1
    shear_rate = np.logspace(-1, 2, 50)

    # Power-law fluid: eta = K * gamma_dot^(n-1)
    K = 5.0  # Consistency index
    n = 0.6  # Flow behavior index (< 1 = shear-thinning)

    # True viscosity with added noise
    viscosity_true = K * shear_rate**(n - 1)
    noise = 0.05 * viscosity_true * np.random.randn(len(shear_rate))
    viscosity = viscosity_true + noise

    # Create dataset
    dataset = OneDimensionalDataset(
        x=shear_rate,
        y=viscosity,
        x_label='Shear Rate (1/s)',
        y_label='Viscosity (Pa.s)',
        name='Flow Curve'
    )

    print(f"Dataset: {dataset.name}")
    print(f"Points: {len(dataset.x)}")
    print(f"X range: [{dataset.x.min():.2f}, {dataset.x.max():.2f}]")
    print(f"Y range: [{dataset.y.min():.2f}, {dataset.y.max():.2f}]")

Loading from File
^^^^^^^^^^^^^^^^^

In real applications, you'd load data from files::

    import piblin_jax

    # Load CSV file
    dataset = piblin_jax.read_file('flow_curve.csv')

    # Or use a specific reader
    from piblin_jax.dataio import CSVReader
    reader = CSVReader(x_column=0, y_column=1)
    dataset = reader.read('flow_curve.csv')

Step 2: Initial Visualization
------------------------------

Always visualize your raw data first to understand its characteristics::

    fig, ax = plt.subplots(figsize=(8, 6))

    # Plot on log-log scale (common for rheology)
    ax.loglog(dataset.x, dataset.y, 'o', alpha=0.6, label='Raw Data')
    ax.set_xlabel(dataset.x_label)
    ax.set_ylabel(dataset.y_label)
    ax.set_title(f'{dataset.name} - Raw Data')
    ax.grid(True, alpha=0.3)
    ax.legend()

    plt.tight_layout()
    plt.show()

Key observations from the plot:

- Data shows power-law behavior (linear on a log-log plot)
- Some scatter due to measurement noise
- No obvious outliers
- Good coverage of the shear rate range

Step 3: Data Smoothing
----------------------

Apply Gaussian smoothing to reduce noise while preserving trends::

    from piblin_jax.transform import GaussianSmoothing

    # Create smoothing transform
    # sigma controls smoothness (higher = smoother)
    smoother = GaussianSmoothing(sigma=1.5)

    # Apply to dataset
    smoothed = smoother.apply_to(dataset)

    print(f"Original dataset: {len(dataset.x)} points")
    print(f"Smoothed dataset: {len(smoothed.x)} points")

Compare raw and smoothed data::

    fig, ax = plt.subplots(figsize=(8, 6))

    ax.loglog(dataset.x, dataset.y, 'o', alpha=0.4, label='Raw Data')
    ax.loglog(smoothed.x, smoothed.y, '-', linewidth=2, label='Smoothed')
    ax.set_xlabel(dataset.x_label)
    ax.set_ylabel(dataset.y_label)
    ax.set_title('Smoothing Effect')
    ax.grid(True, alpha=0.3)
    ax.legend()

    plt.tight_layout()
    plt.show()

Step 4: Interpolation
---------------------

Interpolate to a regular grid for easier analysis::

    from piblin_jax.transform import Interpolate1D

    # Create regular grid on log scale
    new_shear_rate = np.logspace(-1, 2, 100)

    # Interpolate
    interpolator = Interpolate1D(
        new_x=new_shear_rate,
        kind='cubic'  # Use cubic interpolation
    )
    interpolated = interpolator.apply_to(smoothed)

    print(f"Interpolated to {len(interpolated.x)} points")

Step 5: Building a Pipeline
----------------------------

Combine multiple transforms into a reusable pipeline::

    from piblin_jax.transform import Pipeline

    # Create pipeline: smooth -> interpolate
    pipeline = Pipeline([
        GaussianSmoothing(sigma=1.5),
        Interpolate1D(new_x=new_shear_rate, kind='cubic')
    ])

    # Apply pipeline
    processed = pipeline.apply_to(dataset)

    # Visualize result
    fig, ax = plt.subplots(figsize=(8, 6))

    ax.loglog(dataset.x, dataset.y, 'o', alpha=0.4, label='Raw')
    ax.loglog(processed.x, processed.y, '-', linewidth=2, label='Processed')
    ax.set_xlabel(dataset.x_label)
    ax.set_ylabel(dataset.y_label)
    ax.set_title('Pipeline Result')
    ax.grid(True, alpha=0.3)
    ax.legend()

    plt.tight_layout()
    plt.show()

Pipelines are reusable, so the same one can be applied to multiple datasets::

    dataset1 = piblin_jax.read_file('sample1.csv')
    dataset2 = piblin_jax.read_file('sample2.csv')

    result1 = pipeline.apply_to(dataset1)
    result2 = pipeline.apply_to(dataset2)

Step 6: Region of Interest
---------------------------

Extract and analyze specific regions::

    from piblin_jax.transform import SelectRegion

    # Extract low shear rate region (gamma_dot < 10 s^-1)
    low_shear_selector = SelectRegion(x_min=0.1, x_max=10.0)
    low_shear = low_shear_selector.apply_to(processed)

    # Extract high shear rate region (gamma_dot > 10 s^-1)
    high_shear_selector = SelectRegion(x_min=10.0, x_max=100.0)
    high_shear = high_shear_selector.apply_to(processed)

    # Visualize regions
    fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(15, 4))

    # Full range
    ax1.loglog(processed.x, processed.y, '-', linewidth=2)
    ax1.set_xlabel(dataset.x_label)
    ax1.set_ylabel(dataset.y_label)
    ax1.set_title('Full Range')
    ax1.grid(True, alpha=0.3)

    # Low shear
    ax2.loglog(low_shear.x, low_shear.y, '-', linewidth=2, color='orange')
    ax2.set_xlabel(dataset.x_label)
    ax2.set_ylabel(dataset.y_label)
    ax2.set_title('Low Shear Rate')
    ax2.grid(True, alpha=0.3)

    # High shear
    ax3.loglog(high_shear.x, high_shear.y, '-', linewidth=2, color='green')
    ax3.set_xlabel(dataset.x_label)
    ax3.set_ylabel(dataset.y_label)
    ax3.set_title('High Shear Rate')
    ax3.grid(True, alpha=0.3)

    plt.tight_layout()
    plt.show()

Step 7: Derived Quantities
---------------------------

Compute shear stress from viscosity and shear rate::

    # Shear stress: tau = eta * gamma_dot
    # In log-log space this is addition:
    # log(tau) = log(eta) + log(gamma_dot)
    shear_stress = processed.y * processed.x

    # Create shear stress dataset
    stress_dataset = OneDimensionalDataset(
        x=processed.x,
        y=shear_stress,
        x_label='Shear Rate (1/s)',
        y_label='Shear Stress (Pa)',
        name='Shear Stress Curve'
    )

    # Visualize
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

    ax1.loglog(processed.x, processed.y, '-', linewidth=2)
    ax1.set_xlabel('Shear Rate (1/s)')
    ax1.set_ylabel('Viscosity (Pa.s)')
    ax1.set_title('Flow Curve')
    ax1.grid(True, alpha=0.3)

    ax2.loglog(stress_dataset.x, stress_dataset.y, '-', linewidth=2, color='red')
    ax2.set_xlabel('Shear Rate (1/s)')
    ax2.set_ylabel('Shear Stress (Pa)')
    ax2.set_title('Stress Curve')
    ax2.grid(True, alpha=0.3)

    plt.tight_layout()
    plt.show()

Step 8: Statistical Analysis
-----------------------------

Perform statistical analysis on the processed data::

    # Calculate statistics
    mean_viscosity = np.mean(processed.y)
    std_viscosity = np.std(processed.y)
    min_viscosity = np.min(processed.y)
    max_viscosity = np.max(processed.y)

    print("\nViscosity Statistics:")
    print(f"  Mean: {mean_viscosity:.2f} Pa.s")
    print(f"  Std Dev: {std_viscosity:.2f} Pa.s")
    print(f"  Range: [{min_viscosity:.2f}, {max_viscosity:.2f}] Pa.s")

    # Power-law parameters from log-log slope
    log_x = np.log10(processed.x)
    log_y = np.log10(processed.y)

    # Linear fit in log-log space:
    # log(eta) = log(K) + (n - 1) * log(gamma_dot)
    coeffs = np.polyfit(log_x, log_y, 1)
    slope = coeffs[0]
    intercept = coeffs[1]

    n_fitted = slope + 1       # Power-law index
    K_fitted = 10**intercept   # Consistency

    print("\nPower-Law Fit (eta = K*gamma_dot^(n-1)):")
    print(f"  K (consistency): {K_fitted:.2f} Pa.s^n")
    print(f"  n (flow index): {n_fitted:.2f}")
    print(f"  True values: K={K:.2f}, n={n:.2f}")

Step 9: Publication-Quality Plot
---------------------------------

Create a polished figure for publication::

    fig = plt.figure(figsize=(10, 8))
    gs = fig.add_gridspec(2, 2, hspace=0.3, wspace=0.3)

    # Main plot: Flow curve
    ax_main = fig.add_subplot(gs[0, :])
    ax_main.loglog(dataset.x, dataset.y, 'o', alpha=0.3, markersize=6,
                   label='Raw Data')
    ax_main.loglog(processed.x, processed.y, '-', linewidth=2.5,
                   color='darkblue', label='Smoothed & Interpolated')

    # Add power-law fit
    y_fit = K_fitted * processed.x**(n_fitted - 1)
    ax_main.loglog(processed.x, y_fit, '--', linewidth=2, color='red',
                   alpha=0.7, label=f'Power-Law Fit (n={n_fitted:.2f})')

    ax_main.set_xlabel(r'Shear Rate, $\dot{\gamma}$ (s$^{-1}$)', fontsize=12)
    ax_main.set_ylabel(r'Viscosity, $\eta$ (Pa.s)', fontsize=12)
    ax_main.set_title('Rheological Flow Curve', fontsize=14, fontweight='bold')
    ax_main.grid(True, alpha=0.3, which='both')
    ax_main.legend(fontsize=10, framealpha=0.9)

    # Bottom left: Residuals
    ax_resid = fig.add_subplot(gs[1, 0])
    residuals = (processed.y - y_fit) / y_fit * 100  # Percent error
    ax_resid.semilogx(processed.x, residuals, 'o-', markersize=4, alpha=0.7)
    ax_resid.axhline(0, color='black', linestyle='--', alpha=0.5)
    ax_resid.set_xlabel('Shear Rate (s$^{-1}$)', fontsize=10)
    ax_resid.set_ylabel('Residual (%)', fontsize=10)
    ax_resid.set_title('Fit Residuals', fontsize=11)
    ax_resid.grid(True, alpha=0.3)

    # Bottom right: Statistics
    ax_stats = fig.add_subplot(gs[1, 1])
    ax_stats.axis('off')

    stats_text = f"""
    Dataset Statistics

    Data Points: {len(processed.x)}
    Shear Rate Range: {processed.x.min():.2f} - {processed.x.max():.2f} s^-1
    Viscosity Range: {processed.y.min():.2f} - {processed.y.max():.2f} Pa.s

    Power-Law Parameters:
      K = {K_fitted:.2f} Pa.s^n
      n = {n_fitted:.2f}

    Shear-Thinning Index: {((1-n_fitted)*100):.0f}%
      (n < 1)
    """

    ax_stats.text(0.1, 0.9, stats_text, transform=ax_stats.transAxes,
                  fontsize=9, verticalalignment='top', fontfamily='monospace',
                  bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.3))

    plt.suptitle('Rheological Analysis with piblin-jax',
                 fontsize=15, fontweight='bold', y=0.98)

    # Save figure
    plt.savefig('rheology_analysis.png', dpi=300, bbox_inches='tight')
    plt.show()

    print("\nFigure saved as 'rheology_analysis.png'")

Step 10: Working with Multiple Samples
---------------------------------------

Analyze multiple samples by applying the same pipeline to each one. For
simplicity, this example keeps the datasets in a plain dictionary rather than
a ``MeasurementSet``::

    # Imported for reference; the example below uses a plain dict
    from piblin_jax.data.collections import MeasurementSet

    # Create multiple datasets (e.g., different temperatures)
    temperatures = [20, 40, 60]  # degC
    datasets = {}

    for temp in temperatures:
        # Generate data with temperature-dependent viscosity.
        # Viscosity drops as temperature rises; this simple exponential
        # is a rough stand-in for Arrhenius behavior.
        viscosity_temp = viscosity_true * np.exp(-0.02 * (temp - 20))
        noise_temp = 0.05 * viscosity_temp * np.random.randn(len(shear_rate))

        datasets[temp] = OneDimensionalDataset(
            x=shear_rate,
            y=viscosity_temp + noise_temp,
            x_label='Shear Rate (1/s)',
            y_label='Viscosity (Pa.s)',
            name=f'Flow Curve @ {temp} degC'
        )

    # Apply same pipeline to all datasets
    processed_datasets = {}
    for temp, ds in datasets.items():
        processed_datasets[temp] = pipeline.apply_to(ds)

    # Visualize all temperatures
    fig, ax = plt.subplots(figsize=(10, 7))

    colors = plt.cm.coolwarm(np.linspace(0, 1, len(temperatures)))

    for i, (temp, ds) in enumerate(processed_datasets.items()):
        ax.loglog(ds.x, ds.y, '-', linewidth=2, color=colors[i],
                  label=f'{temp} degC')

    ax.set_xlabel('Shear Rate (s$^{-1}$)', fontsize=12)
    ax.set_ylabel('Viscosity (Pa.s)', fontsize=12)
    ax.set_title('Temperature-Dependent Flow Curves', fontsize=14,
                 fontweight='bold')
    ax.grid(True, alpha=0.3, which='both')
    ax.legend(fontsize=10, title='Temperature')

    plt.tight_layout()
    plt.show()

Summary
-------

In this tutorial, we've covered a complete workflow:

1. Data loading (synthetic and from files)
2. Initial visualization and inspection
3. Data smoothing and noise reduction
4. Interpolation to regular grids
5. Building reusable transform pipelines
6. Region selection and analysis
7. Derived quantities (shear stress)
8. Statistical analysis and model fitting
9. Publication-quality visualization
10. Multi-sample analysis

Next Steps
----------

- **Bayesian Analysis**: See :doc:`uncertainty_quantification` for advanced
  parameter estimation with uncertainty
- **Custom Transforms**: Learn to create your own transforms in
  :doc:`custom_transforms`
- **Rheological Models**: Explore built-in models in :doc:`rheological_models`
- **Performance**: Optimize for large datasets in
  :doc:`../user_guide/performance`

Complete Code
-------------

The complete code for this tutorial is available in the ``examples/``
directory as ``basic_workflow.py``. To run it::

    python examples/basic_workflow.py
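
If piblin-jax is not yet installed, the power-law fit from Step 8 can still be checked with NumPy alone. The following is a minimal sketch under stated assumptions: it builds a synthetic flow curve with the same parameters as Step 1 (K = 5.0, n = 0.6, 5% noise) and recovers both from a straight-line fit in log-log space. The helper name ``fit_power_law`` is hypothetical and introduced here for illustration; it is not part of piblin-jax.

```python
import numpy as np


def fit_power_law(shear_rate, viscosity):
    """Fit eta = K * gamma_dot**(n - 1) by least squares in log-log space.

    Hypothetical helper for illustration; not part of piblin-jax.
    Returns the tuple (K, n).
    """
    log_x = np.log10(shear_rate)
    log_y = np.log10(viscosity)
    # log10(eta) = log10(K) + (n - 1) * log10(gamma_dot)
    slope, intercept = np.polyfit(log_x, log_y, 1)
    return 10.0**intercept, slope + 1.0


# Synthetic flow curve with the Step 1 parameters (assumed values)
rng = np.random.default_rng(42)
shear_rate = np.logspace(-1, 2, 50)
viscosity_true = 5.0 * shear_rate ** (0.6 - 1)
viscosity = viscosity_true * (1 + 0.05 * rng.standard_normal(shear_rate.size))

K_fit, n_fit = fit_power_law(shear_rate, viscosity)
print(f"K = {K_fit:.2f}, n = {n_fit:.2f}")
```

Fitting in log-log space turns the power law into a straight line, so an ordinary linear least-squares fit is enough. The fit weights relative rather than absolute errors, which suits data with noise proportional to the signal, as assumed here.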