EasyHybrid Example: Synthetic Data Analysis

This example demonstrates how to use EasyHybrid to train a hybrid model on synthetic data for respiration modeling with Q10 temperature sensitivity.

julia

using EasyHybrid
using Metal
# using CUDA
using Lux

Data Loading and Preprocessing

Load the synthetic dataset from GitHub into a DataFrame.

julia

df = load_timeseries_netcdf("https://github.com/bask0/q10hybrid/raw/master/data/Synthetic4BookChap.nc");

Select a subset of the data for faster execution.

julia

df = df[1:5000, :];
first(df, 5)

5×6 DataFrame

 Row  time                 sw_pot    dsw_pot   ta        reco      rb
      DateTime             Float64?  Float64?  Float64?  Float64?  Float64?
   1  2003-01-01T00:15:00  109.817   115.595   2.1       0.844741  1.42522
   2  2003-01-01T00:45:00  109.817   115.595   1.98      0.840641  1.42522
   3  2003-01-01T01:15:00  109.817   115.595   1.89      0.837579  1.42522
   4  2003-01-01T01:45:00  109.817   115.595   2.06      0.843372  1.42522
   5  2003-01-01T02:15:00  109.817   115.595   2.09      0.844399  1.42522

Define the Physical Model

RbQ10 model: respiration with Q10 temperature sensitivity, reco = rb · Q10^((ta − tref) / 10)

Arguments:

ta: air temperature [°C]
Q10: temperature sensitivity factor [-]
rb: basal respiration rate [μmol/m²/s]
tref: reference temperature [°C] (default: 15.0)

julia

function RbQ10(; ta, Q10, rb, tref = 15.0f0)
    reco = rb .* Q10 .^ (0.1f0 .* (ta .- tref))
    return (; reco, Q10, rb)
end

RbQ10 (generic function with 1 method)
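
As a quick sanity check of the formula, the sketch below re-declares the same function and evaluates it at the reference temperature and at 10 °C below and above it. With the default `tref`, the model reduces to `reco = rb` at 15 °C, and a 10 °C step scales the rate by a factor of `Q10`. The input values are made up for illustration and are not taken from the dataset.

```julia
# Same definition as above, repeated so this snippet runs on its own.
function RbQ10(; ta, Q10, rb, tref = 15.0f0)
    reco = rb .* Q10 .^ (0.1f0 .* (ta .- tref))
    return (; reco, Q10, rb)
end

# Illustrative inputs (not from the dataset): three temperatures, constant rb.
ta = Float32[5, 15, 25]
rb = Float32[3, 3, 3]
Q10 = 2.0f0

out = RbQ10(; ta, Q10, rb)

# At tref the exponent is 0, so reco equals rb.
# 10 °C colder divides the rate by Q10; 10 °C warmer multiplies it by Q10.
println(out.reco)
```

Because the function returns `Q10` and `rb` alongside `reco`, the parameter values used for a prediction can be inspected from the same named tuple.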

Define Model Parameters

Parameter specification: (default, lower_bound, upper_bound)

julia

parameters = (
    # Parameter name | Default | Lower | Upper      | Description
    rb = (3.0f0, 0.0f0, 13.0f0),  # Basal respiration [μmol/m²/s]
    Q10 = (2.0f0, 1.0f0, 4.0f0),   # Temperature sensitivity factor [-]
)

(rb = (3.0f0, 0.0f0, 13.0f0), Q10 = (2.0f0, 1.0f0, 4.0f0))

Configure Hybrid Model Components

Define input variables

julia

forcing = [:ta]                    # Forcing variables (temperature)

1-element Vector{Symbol}:
 :ta

Target variable

julia

target = [:reco]                   # Target variable (respiration)

1-element Vector{Symbol}:
 :reco

Parameter classification

julia

global_param_names = [:Q10]        # Global parameters (same for all samples)
neural_param_names = [:rb]         # Neural network predicted parameters

1-element Vector{Symbol}:
 :rb
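
Since `rb` is predicted by a neural network but also carries bounds in `parameters`, the raw network output has to be mapped into that range in some way. How EasyHybrid does this internally is not shown on this page. The sketch below illustrates one common approach, a logistic squashing into the interval between the lower and upper bound. The helper names `logistic` and `scale_to_bounds` are hypothetical and are not part of the EasyHybrid API.

```julia
# Hypothetical helpers, NOT EasyHybrid functions: squash an unbounded value x
# into the open interval (lower, upper) with a logistic function.
logistic(x) = 1 / (1 + exp(-x))
scale_to_bounds(x, lower, upper) = lower + (upper - lower) * logistic(x)

# Bounds for rb from the `parameters` tuple above: (default, lower, upper).
rb_default, rb_lower, rb_upper = (3.0f0, 0.0f0, 13.0f0)

raw_outputs = Float32[-5, 0, 5]   # made-up raw network outputs
rb_scaled = scale_to_bounds.(raw_outputs, rb_lower, rb_upper)

# Every scaled value lies strictly between the lower and upper bound,
# and a raw output of 0 lands on the midpoint of the interval.
println(rb_scaled)
```

The benefit of this kind of mapping is that the optimizer can move freely in an unconstrained space while the process model only ever sees physically plausible values.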

Single NN Hybrid Model Training

julia

predictors_single_nn = [:sw_pot, :dsw_pot]   # Predictor variables (solar radiation and its derivative)

small_nn_hybrid_model = constructHybridModel(
    predictors_single_nn,      # Input features
    forcing,                   # Forcing variables
    target,                    # Target variables
    RbQ10,                     # Process-based model function
    parameters,                # Parameter definitions
    neural_param_names,        # NN-predicted parameters
    global_param_names,        # Global parameters
    hidden_layers = [16, 16],  # Neural network architecture
    activation = sigmoid,      # Activation function
    scale_nn_outputs = true,   # Scale neural network outputs
    input_batchnorm = true     # Apply batch normalization to inputs
)

large_nn_hybrid_model = constructHybridModel(
    predictors_single_nn,      # Input features
    forcing,                   # Forcing variables
    target,                    # Target variables
    RbQ10,                     # Process-based model function
    parameters,                # Parameter definitions
    neural_param_names,        # NN-predicted parameters
    global_param_names,        # Global parameters
    hidden_layers = [1024, 512, 256, 128, 64],  # Neural network architecture
    activation = sigmoid,      # Activation function
    scale_nn_outputs = true,   # Scale neural network outputs
    input_batchnorm = true     # Apply batch normalization to inputs
)

Hybrid Model (Single NN)
Neural Network: 
  Chain(
      layer_1 = InputBatchNorm(
          layer = BatchNorm(2, affine=false, track_stats=true),
      ),
      layer_2 = Dense(2 => 1024, σ),                # 3_072 parameters
      layer_3 = Dense(1024 => 512, σ),              # 524_800 parameters
      layer_4 = Dense(512 => 256, σ),               # 131_328 parameters
      layer_5 = Dense(256 => 128, σ),               # 32_896 parameters
      layer_6 = Dense(128 => 64, σ),                # 8_256 parameters
      layer_7 = Dense(64 => 1),                     # 65 parameters
  )         # Total: 700_417 parameters,
            #        plus 5 states.
Configuration:
  predictors = [:sw_pot, :dsw_pot]
  forcing = [:ta]
  targets = [:reco]
  mechanistic_model = RbQ10
  neural_param_names = [:rb]
  global_param_names = [:Q10]
  fixed_param_names = Symbol[]
  scale_nn_outputs = true
  start_from_default = true
  config = (; hidden_layers = [1024, 512, 256, 128, 64], activation = σ, scale_nn_outputs = true, input_batchnorm = true, start_from_default = true,)

Parameters:
  ┌─────┬─────────┬───────┬───────┐
  │     │ default │ lower │ upper │
  ├─────┼─────────┼───────┼───────┤
  │  rb │     3.0 │   0.0 │  13.0 │
  │ Q10 │     2.0 │   1.0 │   4.0 │
  └─────┴─────────┴───────┴───────┘
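
The per-layer parameter counts in the printout above follow from the usual rule for a fully connected layer: one weight per input-output pair plus one bias per output. The sketch below reproduces the total for the large network and computes the corresponding number for the small one. The helper names `dense_params` and `mlp_params` are introduced here for illustration only. The batch normalization layer is left out because it is printed with `affine=false` and lists no trainable parameters.

```julia
# Parameters of a fully connected layer: one weight per input-output pair
# plus one bias per output.
dense_params(n_in, n_out) = n_in * n_out + n_out

# Total over a chain of Dense layers with the given layer widths.
function mlp_params(n_inputs, hidden_layers, n_outputs)
    widths = [n_inputs; hidden_layers; n_outputs]
    return sum(dense_params(widths[i], widths[i + 1]) for i in 1:(length(widths) - 1))
end

# Two predictors in, one NN-predicted parameter (rb) out.
large = mlp_params(2, [1024, 512, 256, 128, 64], 1)
small = mlp_params(2, [16, 16], 1)

println("large network: ", large, " parameters")
println("small network: ", small, " parameters")
```

The large network comes out at 700417 parameters, matching the printed total, while the small one has only a few hundred. That gap of three orders of magnitude is what the CPU and GPU benchmarks further down are probing.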

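Train on DataFrame

Configure training, then wrap each model and device combination in its own function so that the runs can be benchmarked separately.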

julia

cfg = EasyHybrid.TrainConfig(
    nepochs = 20,
    batchsize = 64,
    opt = RMSProp(0.01),
    loss_types = [:mse, :nse],
    show_progress = false,
    keep_history = false,  # Set to true to keep per-epoch history (losses, predictions, etc.)
    save_training = false, # Set to true to save training history and checkpoints
)

using BenchmarkTools

gpu_small_nn() = tune(
    small_nn_hybrid_model, df, cfg;
    gdev = gpu_device(), model_name = "small_nn_gpu"
)

cpu_small_nn() = tune(
    small_nn_hybrid_model, df, cfg;
    gdev = cpu_device(), model_name = "small_nn_cpu"
)

gpu_large_nn() = tune(
    large_nn_hybrid_model, df, cfg;
    gdev = gpu_device(), model_name = "large_nn_gpu"
)

cpu_large_nn() = tune(
    large_nn_hybrid_model, df, cfg;
    gdev = cpu_device(), model_name = "large_nn_cpu"
)

cpu_large_nn (generic function with 1 method)
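
The configuration above requests two loss types, `:mse` and `:nse`. For readers unfamiliar with them, the sketch below writes out the textbook definitions of mean squared error and Nash-Sutcliffe efficiency in plain Julia. This is an assumption about what the symbols stand for; whether EasyHybrid minimizes the efficiency directly or a transformed version of it is not stated on this page. The observation and prediction vectors are made up.

```julia
using Statistics: mean

# Mean squared error between observations and predictions.
mse(obs, pred) = mean(abs2, obs .- pred)

# Nash-Sutcliffe efficiency: 1 is a perfect fit,
# 0 is as good as always predicting the mean of the observations.
nse(obs, pred) = 1 - sum(abs2, obs .- pred) / sum(abs2, obs .- mean(obs))

obs  = Float32[0.84, 0.90, 1.10, 1.30]   # made-up observations
pred = Float32[0.80, 0.95, 1.05, 1.35]   # made-up predictions

println("MSE = ", mse(obs, pred))
println("NSE = ", nse(obs, pred))
```

MSE depends on the units and variance of the target, whereas NSE is normalized by the variance of the observations, which makes it easier to compare across targets.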

Warm up: run each function once so that compilation time is paid before benchmarking.

julia

gpu_small_nn();
cpu_small_nn();
gpu_large_nn();
cpu_large_nn();

┌ Warning: Plotting enabled but keep_history is false. Plots will not be generated.
└ @ EasyHybrid ~/work/EasyHybrid.jl/EasyHybrid.jl/src/training/initialization.jl:14
[ Info: Training outputs will be saved to: /Users/runner/work/EasyHybrid.jl/EasyHybrid.jl/docs/build
┌ Warning: Could not determine number of GPU cores; some algorithms may not run optimally.
└ @ Metal ~/.julia/packages/Metal/TF981/src/utilities.jl:116
[ Info: Returning best model from epoch 6 with validation loss: 0.0088579105
┌ Warning: Plotting enabled but keep_history is false. Plots will not be generated.
└ @ EasyHybrid ~/work/EasyHybrid.jl/EasyHybrid.jl/src/training/initialization.jl:14
[ Info: Training outputs will be saved to: /Users/runner/work/EasyHybrid.jl/EasyHybrid.jl/docs/build
[ Info: Returning best model from epoch 3 with validation loss: 0.017481212
┌ Warning: Plotting enabled but keep_history is false. Plots will not be generated.
└ @ EasyHybrid ~/work/EasyHybrid.jl/EasyHybrid.jl/src/training/initialization.jl:14
[ Info: Training outputs will be saved to: /Users/runner/work/EasyHybrid.jl/EasyHybrid.jl/docs/build
[ Info: Returning best model from epoch 13 with validation loss: 0.0577121
┌ Warning: Plotting enabled but keep_history is false. Plots will not be generated.
└ @ EasyHybrid ~/work/EasyHybrid.jl/EasyHybrid.jl/src/training/initialization.jl:14
[ Info: Training outputs will be saved to: /Users/runner/work/EasyHybrid.jl/EasyHybrid.jl/docs/build
[ Info: Returning best model from epoch 2 with validation loss: 0.057252932

With Large NN CPU is slower than GPU

Large NN on GPU

julia

@benchmark gpu_large_nn() samples = 4 evals = 1

BenchmarkTools.Trial: 1 sample with 1 evaluation per sample.
 Single result which took 52.925 s (1.72% GC) to evaluate,
 with a memory estimate of 1.62 GiB, over 18209774 allocations.
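
The trial above contains a single sample even though `samples = 4` was requested. BenchmarkTools also enforces a time budget through the `seconds` parameter, which defaults to 5. Once the budget is used up, no further samples are collected, so a run that takes tens of seconds yields exactly one. The sketch below illustrates the effect with a short stand-in function; `slow_task` is hypothetical and simply replaces the long training call.

```julia
using BenchmarkTools

# Hypothetical stand-in for a long-running call such as a training run.
slow_task() = sleep(0.2)

# Budget shorter than one run: sampling stops after the first sample.
short_budget = @benchmark slow_task() samples = 4 evals = 1 seconds = 0.1

# Budget comfortably longer than four runs: all requested samples are taken.
long_budget = @benchmark slow_task() samples = 4 evals = 1 seconds = 10

println(length(short_budget.times), " sample(s) with seconds = 0.1")
println(length(long_budget.times), " sample(s) with seconds = 10")
```

For the training benchmarks on this page, raising `seconds` well above four times the single-run duration would give all four samples a chance to run, at the cost of a much longer build.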

Large NN on CPU

julia

@benchmark cpu_large_nn() samples = 4 evals = 1

BenchmarkTools.Trial: 1 sample with 1 evaluation per sample.
 Single result which took 44.367 s (51.19% GC) to evaluate,
 with a memory estimate of 62.83 GiB, over 5666435 allocations.

With Small NN CPU and GPU are on par

julia

using BenchmarkTools

Small NN on GPU

julia

@benchmark gpu_small_nn() samples = 4 evals = 1

BenchmarkTools.Trial: 1 sample with 1 evaluation per sample.
 Single result which took 30.030 s (0.00% GC) to evaluate,
 with a memory estimate of 561.82 MiB, over 13437352 allocations.

Small NN on CPU

julia

@benchmark cpu_small_nn() samples = 4 evals = 1

BenchmarkTools.Trial: 4 samples with 1 evaluation per sample.
 Range (min … max):  1.160 s …    1.705 s  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.351 s               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.392 s ± 263.515 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █ █                                  █                   █  
  █▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  1.16 s         Histogram: frequency by time         1.71 s <

 Memory estimate: 342.01 MiB, allocs estimate: 5000409.
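
To compare the four runs at a glance, the sketch below computes GPU-to-CPU time ratios from the numbers printed above. These values summarize this one documentation build only. The log paths point to a CI runner (`/Users/runner/work/EasyHybrid.jl`), the Metal warning reports that the number of GPU cores could not be determined, and three of the four results are single samples, so the ratios may not carry over to other machines.

```julia
# Timings in seconds, copied from the benchmark output on this page.
# Three are single-sample results; the small-NN CPU value is the median of 4.
timings = (
    large_gpu = 52.925,
    large_cpu = 44.367,
    small_gpu = 30.030,
    small_cpu = 1.351,
)

# A ratio above 1 means the GPU run took longer than the CPU run.
large_ratio = timings.large_gpu / timings.large_cpu
small_ratio = timings.small_gpu / timings.small_cpu

println("large NN, GPU/CPU time: ", round(large_ratio; digits = 2))
println("small NN, GPU/CPU time: ", round(small_ratio; digits = 2))
```

The memory estimates are worth reading next to the timings: the large network on the CPU reports 62.83 GiB of allocations with about half of the run spent in garbage collection, while the GPU run reports 1.62 GiB.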
