CC BY-SA 4.0

EasyHybrid Example: Synthetic Data Analysis

This example demonstrates how to use EasyHybrid to train a hybrid model on synthetic data for respiration modeling with Q10 temperature sensitivity.

julia
using EasyHybrid
using Metal    # GPU backend for Apple silicon
# using CUDA   # use this instead on NVIDIA GPUs
using Lux

Data Loading and Preprocessing

Load the synthetic dataset from GitHub into a DataFrame

julia
df = load_timeseries_netcdf("https://github.com/bask0/q10hybrid/raw/master/data/Synthetic4BookChap.nc");

Select the first 5000 rows for faster execution

julia
df = df[1:5000, :];
first(df, 5)
5×6 DataFrame
 Row │ time                 sw_pot    dsw_pot   ta        reco      rb
     │ DateTime             Float64?  Float64?  Float64?  Float64?  Float64?
─────┼────────────────────────────────────────────────────────────────────────
   1 │ 2003-01-01T00:15:00   109.817   115.595      2.1   0.844741   1.42522
   2 │ 2003-01-01T00:45:00   109.817   115.595      1.98  0.840641   1.42522
   3 │ 2003-01-01T01:15:00   109.817   115.595      1.89  0.837579   1.42522
   4 │ 2003-01-01T01:45:00   109.817   115.595      2.06  0.843372   1.42522
   5 │ 2003-01-01T02:15:00   109.817   115.595      2.09  0.844399   1.42522

Define the Physical Model

RbQ10 model: a respiration model with Q10 temperature sensitivity, reco = rb * Q10^((ta - tref) / 10)

Arguments:

  • ta: air temperature [°C]

  • Q10: temperature sensitivity factor [-]

  • rb: basal respiration rate [μmol/m²/s]

  • tref: reference temperature [°C] (default: 15.0)

julia
function RbQ10(; ta, Q10, rb, tref = 15.0f0)
    reco = rb .* Q10 .^ (0.1f0 .* (ta .- tref))
    return (; reco, Q10, rb)
end
RbQ10 (generic function with 1 method)
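
As a quick sanity check of the formula, the sketch below evaluates the function at the reference temperature and 10 °C on either side of it. The input values are illustrative and not taken from the dataset. At ta = tref the exponent is zero, so reco equals rb; each 10 °C step multiplies or divides reco by Q10.

```julia
# Same definition as above, repeated so the snippet runs on its own
function RbQ10(; ta, Q10, rb, tref = 15.0f0)
    reco = rb .* Q10 .^ (0.1f0 .* (ta .- tref))
    return (; reco, Q10, rb)
end

# Illustrative temperatures: 10 °C below, at, and 10 °C above tref
ta = Float32[5, 15, 25]
out = RbQ10(; ta, Q10 = 2.0f0, rb = 3.0f0)

# reco is rb / Q10, rb, and rb * Q10 for the three temperatures
println(out.reco)
```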

Define Model Parameters

Parameter specification: (default, lower_bound, upper_bound)

julia
parameters = (
    # Parameter name | Default | Lower | Upper      | Description
    rb = (3.0f0, 0.0f0, 13.0f0),  # Basal respiration [μmol/m²/s]
    Q10 = (2.0f0, 1.0f0, 4.0f0),   # Temperature sensitivity factor [-]
)
(rb = (3.0f0, 0.0f0, 13.0f0), Q10 = (2.0f0, 1.0f0, 4.0f0))
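
The lower and upper bounds are what make it possible to keep a learned parameter in a physically plausible range. One common way to do this is to squash an unbounded value with a sigmoid and rescale it to the interval. The sketch below illustrates that idea only; whether EasyHybrid uses exactly this mapping is an assumption, and `sigmoid_sketch` and `scale_to_bounds` are hypothetical names.

```julia
# Hypothetical helpers, not part of EasyHybrid
sigmoid_sketch(x) = 1 / (1 + exp(-x))
scale_to_bounds(x, lower, upper) = lower + (upper - lower) * sigmoid_sketch(x)

parameters = (
    rb = (3.0f0, 0.0f0, 13.0f0),
    Q10 = (2.0f0, 1.0f0, 4.0f0),
)

# Unpack (default, lower, upper) for rb
_, rb_lower, rb_upper = parameters.rb

# Unbounded raw values, e.g. the output of a final linear layer
raw = Float32[-10, 0, 10]
rb_scaled = scale_to_bounds.(raw, rb_lower, rb_upper)

# Every scaled value lies inside [rb_lower, rb_upper]
println(rb_scaled)
```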

Configure Hybrid Model Components

Define input variables

julia
forcing = [:ta]                    # Forcing variables (temperature)
1-element Vector{Symbol}:
 :ta

Target variable

julia
target = [:reco]                   # Target variable (respiration)
1-element Vector{Symbol}:
 :reco

Parameter classification

julia
global_param_names = [:Q10]        # Global parameters (same for all samples)
neural_param_names = [:rb]         # Neural network predicted parameters
1-element Vector{Symbol}:
 :rb
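
To make the split between neural and global parameters concrete, the following is a conceptual sketch of one forward pass: a network maps the predictors to one rb value per sample, Q10 is a single scalar shared by all samples, and both go into the process model together with the forcing. This is not EasyHybrid's internal code; `toy_nn` and its weights are hypothetical stand-ins for a trained network.

```julia
# Process model as defined earlier, repeated so the snippet runs on its own
function RbQ10(; ta, Q10, rb, tref = 15.0f0)
    reco = rb .* Q10 .^ (0.1f0 .* (ta .- tref))
    return (; reco, Q10, rb)
end

# Hypothetical one-layer "network"; softplus keeps the predicted rb positive
toy_nn(x, W, b) = log1p.(exp.(W * x .+ b))

# Predictors as features × samples (sw_pot and dsw_pot for two samples)
predictors = Float32[109.817 109.817; 115.595 115.595]
ta = Float32[2.1, 1.98]            # forcing, one value per sample

W = Float32[0.001 0.001]           # illustrative weights (1 × 2)
b = Float32[0.0]                   # illustrative bias

rb = vec(toy_nn(predictors, W, b)) # neural parameter: one value per sample
Q10 = 2.0f0                        # global parameter: one value for all samples

out = RbQ10(; ta, Q10, rb)
println(out.reco)
```

During training, the network weights and the scalar Q10 would both be adjusted to reduce the loss on reco.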

Single NN Hybrid Model Training

julia
predictors_single_nn = [:sw_pot, :dsw_pot]   # Predictor variables (potential shortwave radiation and its derivative)

small_nn_hybrid_model = constructHybridModel(
    predictors_single_nn,              # Input features
    forcing,                 # Forcing variables
    target,                  # Target variables
    RbQ10,                  # Process-based model function
    parameters,              # Parameter definitions
    neural_param_names,      # NN-predicted parameters
    global_param_names,      # Global parameters
    hidden_layers = [16, 16], # Neural network architecture
    activation = sigmoid,      # Activation function
    scale_nn_outputs = true, # Scale neural network outputs
    input_batchnorm = true   # Apply batch normalization to inputs
)

large_nn_hybrid_model = constructHybridModel(
    predictors_single_nn,              # Input features
    forcing,                 # Forcing variables
    target,                  # Target variables
    RbQ10,                  # Process-based model function
    parameters,              # Parameter definitions
    neural_param_names,      # NN-predicted parameters
    global_param_names,      # Global parameters
    hidden_layers = [1024, 512, 256, 128, 64], # Neural network architecture
    activation = sigmoid,      # Activation function
    scale_nn_outputs = true, # Scale neural network outputs
    input_batchnorm = true   # Apply batch normalization to inputs
)
Hybrid Model (Single NN)
Neural Network: 
  Chain(
      layer_1 = InputBatchNorm(
          layer = BatchNorm(2, affine=false, track_stats=true),
      ),
      layer_2 = Dense(2 => 1024, σ),                # 3_072 parameters
      layer_3 = Dense(1024 => 512, σ),              # 524_800 parameters
      layer_4 = Dense(512 => 256, σ),               # 131_328 parameters
      layer_5 = Dense(256 => 128, σ),               # 32_896 parameters
      layer_6 = Dense(128 => 64, σ),                # 8_256 parameters
      layer_7 = Dense(64 => 1),                     # 65 parameters
  )         # Total: 700_417 parameters,
            #        plus 5 states.
Configuration:
  predictors = [:sw_pot, :dsw_pot]
  forcing = [:ta]
  targets = [:reco]
  mechanistic_model = RbQ10
  neural_param_names = [:rb]
  global_param_names = [:Q10]
  fixed_param_names = Symbol[]
  scale_nn_outputs = true
  start_from_default = true
  config = (; hidden_layers = [1024, 512, 256, 128, 64], activation = σ, scale_nn_outputs = true, input_batchnorm = true, start_from_default = true,)

Parameters:
  Hybrid Parameters
    ┌─────┬─────────┬───────┬───────┐
    │     │ default │ lower │ upper │
    ├─────┼─────────┼───────┼───────┤
    │  rb │     3.0 │   0.0 │  13.0 │
    │ Q10 │     2.0 │   1.0 │   4.0 │
    └─────┴─────────┴───────┴───────┘
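
The parameter counts in the printout can be reproduced by hand: a dense layer with n_in inputs and n_out outputs has n_in * n_out weights plus n_out biases. The sketch below checks the printed total for the large model. The count for the small model assumes it has the same layout (two inputs, one output, no trainable batch-norm parameters), which the printout above does not show. `dense_params` and `mlp_params` are hypothetical helper names.

```julia
# Weights plus biases of one dense layer
dense_params(n_in, n_out) = n_in * n_out + n_out

# Total over consecutive layer sizes, e.g. [2, 16, 16, 1]
function mlp_params(sizes)
    return sum(dense_params(sizes[i], sizes[i + 1]) for i in 1:(length(sizes) - 1))
end

large = mlp_params([2, 1024, 512, 256, 128, 64, 1])  # 700417, matching the printed total
small = mlp_params([2, 16, 16, 1])                   # 337, under the layout assumption above

println("large: ", large, ", small: ", small)
```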

Train on DataFrame

Define the training configuration and wrap each model/device combination in a function so it can be benchmarked

julia
cfg = EasyHybrid.TrainConfig(
    nepochs = 20,
    batchsize = 64,
    opt = RMSProp(0.01),
    loss_types = [:mse, :nse],
    show_progress = false,
    keep_history = false,  # set to true to keep per-epoch history (losses, predictions, etc.)
    save_training = false, # set to true to save training history and checkpoints
)

using BenchmarkTools

gpu_small_nn() = tune(
    small_nn_hybrid_model, df, cfg;
    gdev = gpu_device(), model_name = "small_nn_gpu"
)

cpu_small_nn() = tune(
    small_nn_hybrid_model, df, cfg;
    gdev = cpu_device(), model_name = "small_nn_cpu"
)

gpu_large_nn() = tune(
    large_nn_hybrid_model, df, cfg;
    gdev = gpu_device(), model_name = "large_nn_gpu"
)

cpu_large_nn() = tune(
    large_nn_hybrid_model, df, cfg;
    gdev = cpu_device(), model_name = "large_nn_cpu"
)
cpu_large_nn (generic function with 1 method)
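
The `loss_types = [:mse, :nse]` entry names two metrics. For orientation, the textbook definitions of mean squared error and Nash-Sutcliffe efficiency are sketched below. How EasyHybrid converts NSE into a quantity to minimise is not shown here and is left as an assumption; `mse_sketch` and `nse_sketch` are hypothetical names, and the numbers are illustrative.

```julia
using Statistics: mean

# Mean squared error: 0 for a perfect fit
mse_sketch(pred, obs) = mean(abs2, pred .- obs)

# Nash-Sutcliffe efficiency: 1 for a perfect fit, 0 when predictions
# are no better than the mean of the observations
nse_sketch(pred, obs) = 1 - sum(abs2, pred .- obs) / sum(abs2, obs .- mean(obs))

obs = Float32[0.84, 0.85, 0.90, 1.10]          # illustrative observations
pred_perfect = copy(obs)                       # perfect prediction
pred_mean = fill(mean(obs), length(obs))       # constant mean prediction

println("MSE perfect: ", mse_sketch(pred_perfect, obs))
println("NSE perfect: ", nse_sketch(pred_perfect, obs))
println("NSE mean:    ", nse_sketch(pred_mean, obs))
```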

Warm-up run: call each function once so that compilation time is paid before benchmarking

julia
gpu_small_nn();
cpu_small_nn();
gpu_large_nn();
cpu_large_nn();
[ Info: Training outputs will be saved to: /Users/runner/work/EasyHybrid.jl/EasyHybrid.jl/docs/build
[ Info: Returning best model from epoch 3 with validation loss: 0.017481212
[ Info: Training outputs will be saved to: /Users/runner/work/EasyHybrid.jl/EasyHybrid.jl/docs/build
[ Info: Returning best model from epoch 3 with validation loss: 0.017481212
[ Info: Training outputs will be saved to: /Users/runner/work/EasyHybrid.jl/EasyHybrid.jl/docs/build
[ Info: Returning best model from epoch 2 with validation loss: 0.057252932
[ Info: Training outputs will be saved to: /Users/runner/work/EasyHybrid.jl/EasyHybrid.jl/docs/build
[ Info: Returning best model from epoch 2 with validation loss: 0.057252932
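
The reason for the warm-up is that Julia compiles a method the first time it is called with a given set of argument types, so the first call includes compilation time. The effect can be seen with a small stand-in function, independent of EasyHybrid; `sumsq` is a hypothetical example function.

```julia
# Hypothetical stand-in for an expensive function
sumsq(x) = sum(abs2, x)

x = rand(Float32, 10_000)

t_first = @elapsed sumsq(x)   # includes compilation for Vector{Float32}
t_second = @elapsed sumsq(x)  # reuses the compiled code

println("first call: ", t_first, " s, second call: ", t_second, " s")
```

The first call is typically much slower than the second, although the exact timings depend on the machine and Julia version.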

With the large NN, the CPU is slightly slower than the GPU (about 48.1 s versus 46.4 s, one sample each)

Large NN on GPU

julia
@benchmark gpu_large_nn() samples = 4 evals = 1
BenchmarkTools.Trial: 1 sample with 1 evaluation per sample.
 Single result which took 46.381 s (52.46% GC) to evaluate,
 with a memory estimate of 62.86 GiB, over 6443682 allocations.

Large NN on CPU

julia
@benchmark cpu_large_nn() samples = 4 evals = 1
BenchmarkTools.Trial: 1 sample with 1 evaluation per sample.
 Single result which took 48.061 s (49.69% GC) to evaluate,
 with a memory estimate of 62.86 GiB, over 6443682 allocations.
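
Both large-NN trials report a single sample even though `samples = 4` was requested. BenchmarkTools also enforces a time budget (the `seconds` parameter, 5 s by default) and stops sampling once it is used up, so a run of roughly 46 s exhausts the budget after the first sample. The sketch below reproduces the effect with a cheap stand-in; `slow_task` is hypothetical and the budgets are chosen for illustration only.

```julia
using BenchmarkTools

# Hypothetical stand-in for an expensive training call
slow_task() = (sleep(0.05); nothing)

# Tight budget: used up by the first sample, so sampling stops early
trial_short = @benchmark slow_task() samples = 4 evals = 1 seconds = 0.01

# Generous budget: there is time to collect all requested samples
trial_long = @benchmark slow_task() samples = 4 evals = 1 seconds = 10

println(length(trial_short.times), " vs ", length(trial_long.times), " samples")
```

For the training benchmarks, a larger `seconds` value would allow all four samples to be collected, at the cost of a much longer run.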

With the small NN, CPU and GPU are on par (median of about 2.2 s versus 2.1 s)

julia
using BenchmarkTools

Small NN on GPU

julia
@benchmark gpu_small_nn() samples = 4 evals = 1
BenchmarkTools.Trial: 3 samples with 1 evaluation per sample.
 Range (min … max):  1.793 s …  2.500 s  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.076 s             ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.123 s ± 355.932 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

  1.79 s         Histogram: frequency by time          2.5 s <

 Memory estimate: 376.08 MiB, allocs estimate: 5747969.

Small NN on CPU

julia
@benchmark cpu_small_nn() samples = 4 evals = 1
BenchmarkTools.Trial: 3 samples with 1 evaluation per sample.
 Range (min … max):  1.972 s …  2.285 s  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.223 s             ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.160 s ± 165.903 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

  1.97 s         Histogram: frequency by time         2.29 s <

 Memory estimate: 376.08 MiB, allocs estimate: 5747968.
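
To put the four results side by side, the short calculation below computes the CPU/GPU time ratio for each network size. The timings are copied from the benchmark output above (single result for the large NN, medians for the small NN); the variable names are illustrative.

```julia
# Timings in seconds, copied from the benchmark output above
large_gpu, large_cpu = 46.381, 48.061                # single result each
small_gpu_median, small_cpu_median = 2.076, 2.223    # medians of 3 samples

# Ratio > 1 means the CPU run took longer than the GPU run
ratio_large = round(large_cpu / large_gpu; digits = 2)
ratio_small = round(small_cpu_median / small_gpu_median; digits = 2)

println("large NN CPU/GPU: ", ratio_large)   # 1.04
println("small NN CPU/GPU: ", ratio_small)   # 1.07
```

Both ratios are within a few percent of 1. For the small NN, the difference between the medians (about 0.15 s) is smaller than the reported standard deviation of the GPU runs (about 0.36 s), so with so few samples these numbers do not establish a clear winner.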