EasyHybrid.jl

Documentation for EasyHybrid.jl.

EasyHybrid.EasyHybrid Module
julia
EasyHybrid

EasyHybrid is a Julia package for hybrid machine learning models, combining neural networks and traditional statistical methods. It provides tools for data preprocessing, model training, and evaluation, making it easier to build and deploy hybrid models.

source
EasyHybrid.BulkDensitySOC Type
julia
BulkDensitySOC(NN, predictors, targets, oBD)

A hybrid model with a neural network NN, predictors, targets, and one global parameter oBD.

source
EasyHybrid.BulkDensitySOC Method
julia
BulkDensitySOC(NN, predictors, oBD)(ds_k)

Hybrid model for bulk density based on Federer (1993), http://dx.doi.org/10.1139/x93-131, extended with SOC concentrations, density, and coarse fraction.

source
EasyHybrid.FluxPartModelQ10Lux Type
julia
FluxPartModelQ10Lux(RUE_NN, Rb_NN, RUE_predictors, Rb_predictors, forcing, targets, Q10)

A flux partitioning model with separate neural networks for RUE (Radiation Use Efficiency) and Rb (basal respiration), using Q10 temperature sensitivity for respiration calculations.

source
EasyHybrid.FluxPartModelQ10Lux Method
julia
FluxPartModelQ10Lux(RUE_NN, Rb_NN, RUE_predictors, Rb_predictors, forcing, targets, Q10)(ds_k, ps, st)

Model definition

  • GPP = SW_IN * RUE(αᵢ(t)) / 12.011 # µmol/m²/s = J/s/m² * g/MJ / g/mol

  • Reco = Rb(βᵢ(t)) * Q10^((T(t) - T_ref)/10)

  • NEE = Reco - GPP

where:

  • RUE(αᵢ(t)) is the radiation use efficiency predicted by the neural network

  • Rb(βᵢ(t)) is the basal respiration predicted by the neural network

  • SW_IN is the incoming shortwave radiation

  • T(t) is the air temperature

  • T_ref is the reference temperature (15°C)

  • Q10 is the temperature sensitivity factor

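As a plain-Julia illustration of these equations (the numeric values below are assumptions for demonstration, not package defaults):

julia
# Illustrative evaluation of the model equations with made-up values
SW_IN = 600.0f0                       # incoming shortwave radiation
RUE, Rb = 0.5f0, 2.0f0                # hypothetical neural-network outputs
T, T_ref, Q10 = 25.0f0, 15.0f0, 1.5f0

GPP  = SW_IN * RUE / 12.011f0         # see the unit comment above
Reco = Rb * Q10^((T - T_ref) / 10)
NEE  = Reco - GPP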
source
EasyHybrid.FluxPartModel_NEE_ET2 Method

FluxPartModel_NEE_ET2(RUE_predictors::AbstractArray{Symbol}, Rb_predictors::AbstractArray{Symbol},
                      WUE_predictors::AbstractArray{Symbol}, Ecoeff_predictors::AbstractArray{Symbol};
                      forcing=[:SW_IN, :TA], Q10=[1.5f0], neurons=15)

source
EasyHybrid.FluxPartModel_NEE_ET2 Method

(m::FluxPartModel_NEE_ET2)(dk, infer::Symbol)

source
EasyHybrid.FluxPartModel_Q10 Method

FluxPartModel_Q10(RUE_predictors, Rb_predictors; Q10=[1.5f0])

source
EasyHybrid.FluxPartModel_Q10 Method

(m::FluxPartModel_Q10)(dk, infer::Symbol)

source
EasyHybrid.HybridParams Type
julia
HybridParams{M<:Function}

A little parametric stub for “the params of function M.” All of your function-based models become HybridParams{typeof(f)}.

source
EasyHybrid.LinearHM Type
julia
LinearHM(NN, predictors, forcing, targets, β)

A linear hybrid model with a neural network NN, predictors, forcing, and targets.

source
EasyHybrid.LinearHM Method
julia
LinearHM(NN, predictors, forcing, β)(ds_k)

Model definition ŷ = α x + β

Apply the linear hybrid model to the input data ds_k (a KeyedArray with proper dimensions). The model uses the neural network NN to compute new α's from the predictors and then computes ŷ using the forcing term x.

Returns:

A tuple containing the predicted values and a named tuple with the computed values of α and the state st.

Example:

julia
using Lux
using EasyHybrid
using Random
using AxisKeys
using LuxCore

ds_k = KeyedArray(rand(Float32, 3, 4); data=[:a, :b, :c], sample=1:4)
m = Lux.Chain(Dense(2, 5), Dense(5, 1))
# Instantiate the model
# Note: The model is initialized with a neural network and the predictors and forcing terms
lh_model = LinearHM(m, (:a, :b), (:c,), 1.5f0)
rng = Random.default_rng()
Random.seed!(rng, 0)
ps, st = LuxCore.setup(rng, lh_model)
# Apply the model to the data
ŷ, αst = LuxCore.apply(lh_model, ds_k, ps, st)
source
EasyHybrid.LinearHybridModel Method

LinearHybridModel(predictors::AbstractArray{Symbol}, forcing::AbstractArray{Symbol}, out_dim::Int, neurons::Int; b=[1.5f0])

source
EasyHybrid.LinearHybridModel Method

(lhm::LinearHybridModel)(df, infer::Symbol)

source
EasyHybrid.LinearHybridModel_2outputs Method

LinearHybridModel_2outputs(predictors::AbstractArray{Symbol}, forcing::AbstractArray{Symbol}, out_dim::Int, neurons::Int, nn_chain; a=[1.0f0], b=[1.5f0])

  • nn_chain::DenseNN: a Dense neural network
source
EasyHybrid.LinearHybridModel_2outputs Method

LinearHybridModel_2outputs(df, infer::Symbol)

source
EasyHybrid.LoggingLoss Type
julia
LoggingLoss

A structure to define a logging loss function for hybrid models. It allows for multiple loss types and an aggregation function to be specified.

Arguments:

  • loss_types::Vector{Symbol}: A vector of loss types to compute, e.g., [:mse, :mae].

  • training_loss::Symbol: The loss type to use during training, e.g., :mse.

  • agg::Function: A function to aggregate the losses, e.g., sum or mean.

  • train_mode::Bool: A flag indicating whether the model is in training mode. If true, it uses training_loss; otherwise, it uses loss_types.

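A construction sketch, assuming LoggingLoss exposes the keyword arguments listed above (check the source link for the exact constructor):

julia
using Statistics: mean

# Keyword names follow the argument list above (assumed constructor)
logging = LoggingLoss(loss_types = [:mse, :mae],
                      training_loss = :mse,
                      agg = mean,
                      train_mode = true)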
source
EasyHybrid.RbQ10_2p Type
julia
RbQ10_2p(forcing, targets, Q10)
source
EasyHybrid.RbQ10_2p Method
julia
RbQ10_2p(NN, predictors, forcing, targets, Q10)(ds_k)

Model definition ŷ = Rb(αᵢ(t)) * Q10^((T(t) - T_ref)/10)

ŷ (the respiration rate) is computed from the neural network output Rb(αᵢ(t)) and the temperature T(t), adjusted relative to the reference temperature T_ref (default 15°C) via the Q10 temperature sensitivity factor.

source
EasyHybrid.RespirationRbQ10 Type
julia
RespirationRbQ10(NN, predictors, forcing, targets, Q10)

A hybrid model with a neural network NN, predictors, targets, and forcing terms.

source
EasyHybrid.RespirationRbQ10 Method
julia
RespirationRbQ10(NN, predictors, forcing, targets, Q10)(ds_k)

Model definition ŷ = Rb(αᵢ(t)) * Q10^((T(t) - T_ref)/10)

ŷ (the respiration rate) is computed from the neural network output Rb(αᵢ(t)) and the temperature T(t), adjusted relative to the reference temperature T_ref (default 15°C) via the Q10 temperature sensitivity factor.

source
EasyHybrid.Rs_components Type
julia
Rs_components(NN, predictors, forcing, targets, Q10_het, Q10_root, Q10_myc)

A hybrid model with a neural network NN, predictors, targets, and forcing terms.

source
EasyHybrid.Rs_components Method
julia
Rs_components(NN, predictors, forcing, targets, Q10)(ds_k)

Model definition ŷ = Rb(αᵢ(t)) * Q10^((T(t) - T_ref)/10)

ŷ (the respiration rate) is computed from the neural network output Rb(αᵢ(t)) and the temperature T(t), adjusted relative to the reference temperature T_ref (default 15°C) via the Q10 temperature sensitivity factor.

source
EasyHybrid.SinusHybridModel Method

SinusHybridModel(predictors, forcing, out_dim; neurons=15, b=[1.5f0])

source
EasyHybrid.SinusHybridModel Method

(lhm::SinusHybridModel)(dk, infer::Symbol)

source
EasyHybrid.WrappedTuples Type
julia
WrappedTuples(vec::Vector{<:NamedTuple})

Wraps a vector of named tuples to allow dot-access to each field as a vector.

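A short usage sketch of the dot-access behavior described above (assuming each field is returned as a plain vector):

julia
wt = WrappedTuples([(a = 1, b = 2.0), (a = 3, b = 4.0)])
wt.a  # [1, 3]
wt.b  # [2.0, 4.0]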
source
Base.Multimedia.display Method
julia
Base.display(io::IO, parameter_container::ParameterContainer)

Display a ParameterContainer containing parameter bounds in a formatted table.

This function creates a nicely formatted table showing parameter names as row labels and bound types (default, lower, upper) as column headers.

Arguments

  • io::IO: Output stream

  • parameter_container::ParameterContainer: A ParameterContainer with parameter bounds (typically created by build_parameter_matrix)

Returns

  • Displays a formatted table using PrettyTables.jl

Example

julia
# Create parameter defaults and bounds
parameter_defaults_and_bounds = (
    θ_s = (0.464f0, 0.302f0, 0.700f0),
    α   = (log(0.103f0), log(0.01f0), log(7.874f0)),
    n   = (log(3.163f0 - 1), log(1.100f0 - 1), log(20.000f0 - 1)),
)

# Build ParameterContainer and display
parameter_container = ParameterContainer(parameter_defaults_and_bounds)
display(parameter_container)  # or just parameter_container

Notes

  • Requires PrettyTables.jl to be loaded

  • The table shows parameter names as row labels and bound types as column headers

  • Default alignment is right-aligned for all columns

source
EasyHybrid.DenseNN Method

DenseNN(in_dim, out_dim, neurons)

source
EasyHybrid.Dense_RUE_Rb Method

Dense_RUE_Rb(in_dim; neurons=15, out_dim=1, affine=true)

source
EasyHybrid.GRU_NN Method

GRU_NN(in_dim,out_dim, neurons)

source
EasyHybrid.build_parameter_matrix Method
julia
build_parameter_matrix(parameter_defaults_and_bounds::NamedTuple)

Build a ComponentArray matrix from a NamedTuple containing parameter defaults and bounds.

This function converts a NamedTuple where each value is a tuple of (default, lower, upper) bounds into a ComponentArray with named axes for easy parameter management in hybrid models.

Arguments

  • parameter_defaults_and_bounds::NamedTuple: A NamedTuple where each key is a parameter name and each value is a tuple of (default, lower, upper) for that parameter.

Returns

  • ComponentArray: A 2D ComponentArray with:
    • Row axis: Parameter names (from the NamedTuple keys)

    • Column axis: Bound types (:default, :lower, :upper)

    • Data: The parameter values organized in a matrix format

Example

julia
# Define parameter defaults and bounds
parameter_defaults_and_bounds = (
    θ_s = (0.464f0, 0.302f0, 0.700f0),     # Saturated water content [cm³/cm³]
    h_r = (1500.0f0, 1500.0f0, 1500.0f0),  # Pressure head at residual water content [cm]
    α   = (log(0.103f0), log(0.01f0), log(7.874f0)),  # Shape parameter [cm⁻¹]
    n   = (log(3.163f0 - 1), log(1.100f0 - 1), log(20.000f0 - 1)),  # Shape parameter [-]
)

# Build the ComponentArray
parameter_matrix = build_parameter_matrix(parameter_defaults_and_bounds)

# Access specific parameter bounds
parameter_matrix.θ_s.default  # Get default value for θ_s
parameter_matrix[:, :lower]   # Get all lower bounds
parameter_matrix[:, :upper]   # Get all upper bounds

Notes

  • The function expects each value in the NamedTuple to be a tuple with exactly 3 elements

  • The order of bounds is always (default, lower, upper)

  • The resulting ComponentArray can be used for parameter optimization and constraint handling

source
EasyHybrid.build_parameters Method
julia
build_parameters(parameters::NamedTuple, f::DataType) -> AbstractHybridModel

Constructs a parameter container from a named tuple of parameter bounds and wraps it in a user-defined subtype of AbstractHybridModel.

Arguments

  • parameters::NamedTuple: Named tuple where each entry is a tuple of (default, lower, upper) bounds for a parameter.

  • f::DataType: A constructor for a subtype of AbstractHybridModel that takes a ParameterContainer as its argument.

Returns

  • An instance of the user-defined AbstractHybridModel subtype containing the parameter container.
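
A minimal sketch; MyParams is a hypothetical wrapper type and the field name is illustrative — the only requirement stated above is that its constructor accepts the ParameterContainer:

julia
# Hypothetical subtype of AbstractHybridModel (assumed to be exported)
struct MyParams <: AbstractHybridModel
    container   # receives the ParameterContainer
end

bounds = (θ_s = (0.464f0, 0.302f0, 0.700f0),)  # (default, lower, upper)
hm = build_parameters(bounds, MyParams)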
source
EasyHybrid.compute_bulk_density Method
julia
compute_bulk_density(SOCconc, oBD, mBD)

Model for bulk density based on Federer (1993), http://dx.doi.org/10.1139/x93-131, extended with SOC concentrations, density, and coarse fraction.

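A usage sketch with illustrative magnitudes (the values below are assumptions, not package defaults):

julia
SOCconc = 0.05f0   # SOC concentration (mass fraction), illustrative
oBD     = 0.12f0   # organic-matter bulk density, illustrative
mBD     = 1.60f0   # mineral bulk density, illustrative

BD = compute_bulk_density(SOCconc, oBD, mBD)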
source
EasyHybrid.compute_loss Function
julia
compute_loss(ŷ, y, y_nan, targets, training_loss::Symbol, agg::Function)
compute_loss(ŷ, y, y_nan, targets, loss_types::Vector{Symbol}, agg::Function)

Compute the loss for the given predictions and targets using the specified training loss type (or vector of loss types) and aggregation function.

Arguments:

  • ŷ: Predicted values.

  • y: Target values.

  • y_nan: Mask for NaN values.

  • targets: The targets for which the loss is computed.

  • training_loss::Symbol: The loss type to use during training, e.g., :mse.

  • loss_types::Vector{Symbol}: A vector of loss types to compute, e.g., [:mse, :mae].

  • agg::Function: The aggregation function to apply to the computed losses, e.g., sum or mean.

Returns a single loss value if training_loss is provided, or a NamedTuple of losses for each type in loss_types.

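Hedged call sketches for the two methods; the container shapes below are assumptions (in practice ŷ, y, y_nan, and targets come from get_predictions_targets):

julia
using Statistics: mean

ŷ, y    = rand(Float32, 1, 10), rand(Float32, 1, 10)  # assumed shapes, one target row
y_nan   = trues(1, 10)                                # mask: all entries valid
targets = [:NEE]

l  = compute_loss(ŷ, y, y_nan, targets, :mse, sum)           # single scalar
ls = compute_loss(ŷ, y, y_nan, targets, [:mse, :mae], mean)  # NamedTuple of losses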
source
EasyHybrid.constructNNModel Method
julia
constructNNModel(predictors, targets; hidden_layers, activation, scale_nn_outputs)

Main constructor. hidden_layers can be either:

  • a Vector{Int} of layer sizes, or

  • a Chain of hidden-layer Dense blocks.

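A construction sketch, assuming the keyword arguments shown in the signature (default values are not documented here):

julia
nn = constructNNModel([:SW_IN, :TA], [:NEE];
                      hidden_layers = [32, 16],
                      activation = tanh,
                      scale_nn_outputs = true)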
source
EasyHybrid.display_parameter_bounds Method
julia
display_parameter_bounds(parameter_container::ParameterContainer; alignment=:r)

Legacy function for displaying parameter bounds with custom alignment.

Arguments

  • parameter_container::ParameterContainer: A ParameterContainer with parameter bounds

  • alignment: Alignment for table columns (default: right-aligned for all columns)

Returns

  • Displays a formatted table using PrettyTables.jl
source
EasyHybrid.evec Method

evec(nt::NamedTuple)

source
EasyHybrid.get_predictions_targets Method
julia
get_predictions_targets(HM, x, (y_t, y_nan), ps, st, targets)

Get predictions and targets from the hybrid model and return them along with the NaN mask.

source
EasyHybrid.initialize_plotting_observables Method
julia
initialize_plotting_observables(init_ŷ_train, init_ŷ_val, y_train, y_val, l_init_train, l_init_val, training_loss, agg, monitor_names, target_names)

Initialize plotting observables for training visualization if the Makie extension is loaded.

source
EasyHybrid.load_timeseries_netcdf Method
julia
load_timeseries_netcdf(path::AbstractString; timedim::AbstractString = "time") -> DataFrame

Reads a NetCDF file where all data variables are 1D over the specified timedim and returns a tidy DataFrame with one row per time step.

  • Only includes variables whose sole dimension is timedim.

  • Does not attempt to parse or convert time units; all columns are read as-is.

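Usage sketch (the path is hypothetical):

julia
df = load_timeseries_netcdf("data/site_fluxes.nc"; timedim = "time")
names(df)  # columns: all 1-D variables along the time dimension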
source
EasyHybrid.loss_fn Function
julia
loss_fn(ŷ, y, y_nan, ::Val{:rmse})
loss_fn(ŷ, y, y_nan, ::Val{:mse})
loss_fn(ŷ, y, y_nan, ::Val{:mae})
loss_fn(ŷ, y, y_nan, ::Val{:pearson})
loss_fn(ŷ, y, y_nan, ::Val{:r2})

Compute the loss for the given predictions and targets using the specified loss type.

Arguments:

  • ŷ: Predicted values.

  • y: Target values.

  • y_nan: Mask for NaN values.

  • ::Val{:rmse}: Root Mean Square Error

  • ::Val{:mse}: Mean Square Error

  • ::Val{:mae}: Mean Absolute Error

  • ::Val{:pearson}: Pearson correlation coefficient

  • ::Val{:r2}: R-squared

You can define additional loss functions as needed by adding more methods to this function.

Example:

In your working script just do the following:

julia
import EasyHybrid: loss_fn
using Statistics: mean

function EasyHybrid.loss_fn(ŷ, y, y_nan, ::Val{:nse})
    # Nash–Sutcliffe efficiency over the non-NaN entries
    return 1 - sum((ŷ[y_nan] .- y[y_nan]).^2) / sum((y[y_nan] .- mean(y[y_nan])).^2)
end
source
EasyHybrid.lossfn Method
julia
lossfn(HM::LuxCore.AbstractLuxContainerLayer, x, (y_t, y_nan), ps, st, logging::LoggingLoss)

Arguments:

  • HM::LuxCore.AbstractLuxContainerLayer: The hybrid model to compute the loss for.

  • x: Input data for the model.

  • (y_t, y_nan): Tuple containing the target values and a mask for NaN values.

  • ps: Parameters of the model.

  • st: State of the model.

  • logging::LoggingLoss: Logging configuration for the loss function.

source
EasyHybrid.prepare_data Method
julia
prepare_data(hm, data)

Utility function that checks whether the data is already in the expected format or whether further filtering and re-packing is needed.

Arguments:

  • hm: The Hybrid Model

  • data: either a Tuple of KeyedArrays or a single KeyedArray.

Returns a tuple of KeyedArrays

source
EasyHybrid.prepare_hidden_chain Method
julia
prepare_hidden_chain(hidden_layers, in_dim, out_dim; activation, input_batchnorm=false)

Construct a neural network Chain for use in NN models.

Arguments

  • hidden_layers::Union{Vector{Int}, Chain}:

    • If a Vector{Int}, specifies the sizes of each hidden layer. For example, [32, 16] creates two hidden layers with 32 and 16 units, respectively.

    • If a Chain, the user provides a pre-built chain of hidden layers (excluding input/output layers).

  • in_dim::Int: Number of input features (input dimension).

  • out_dim::Int: Number of output features (output dimension).

  • activation: Activation function to use in hidden layers (default: tanh).

  • input_batchnorm::Bool: If true, applies a BatchNorm layer to the input (default: false).

Returns

  • A Chain object representing the full neural network, with the following structure:
    • Optional input batch normalization (if input_batchnorm=true)

    • Input layer: Dense(in_dim, h₁, activation) where h₁ is the first hidden size

    • Hidden layers: either user-supplied Chain or constructed from hidden_layers

    • Output layer: Dense(hₖ, out_dim) where hₖ is the last hidden size

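A construction sketch matching the argument description above:

julia
using Lux

# Sizes [32, 16] → Dense(4 → 32, tanh), Dense(32 → 16, tanh), Dense(16 → 1)
chain = prepare_hidden_chain([32, 16], 4, 1; activation = tanh)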
source
EasyHybrid.scale_single_param Method
julia
scale_single_param(name, raw_val, parameters)

Scale a single parameter using the sigmoid scaling function.

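The exact formula is in the source; as a plausible sketch only, a standard sigmoid bounding maps an unconstrained raw value into (lower, upper). scale_to_bounds below is a hypothetical helper, not part of EasyHybrid:

julia
sigmoid(x) = 1 / (1 + exp(-x))

# Hypothetical helper illustrating sigmoid scaling into (lower, upper)
scale_to_bounds(raw, lower, upper) = lower + (upper - lower) * sigmoid(raw)

scale_to_bounds(0.0f0, 0.302f0, 0.700f0)  # ≈ 0.501, the interval midpoint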
source
EasyHybrid.scale_single_param_minmax Method
julia
scale_single_param_minmax(name, hm::AbstractHybridModel)

Scale a single parameter using the minmax scaling function.

source
EasyHybrid.select_cols Method

select_cols(df, predictors, x)

source
EasyHybrid.select_cols Method

select_cols(df::KeyedArray, predictors, x)

source
EasyHybrid.select_predictors Method

select_predictors(df, predictors)

source
EasyHybrid.select_predictors Method

select_predictors(dk::KeyedArray, predictors)

source
EasyHybrid.select_variable Method

select_variable(df::KeyedArray, x)

source
EasyHybrid.split_data Method
julia
split_data(data, split_by_id; shuffleobs=false, split_ratio=0.8)

Split data into training and validation sets, either randomly or by grouping by ID.

Arguments:

  • data: The data to split, typically a tuple of (x, y) KeyedArrays

  • split_by_id: Either nothing for random splitting, a Symbol for column-based splitting, or an AbstractVector for custom ID-based splitting

  • shuffleobs: Whether to shuffle observations during splitting (default: false)

  • split_ratio: Ratio of data to use for training (default: 0.8)

Returns:

  • (x_train, y_train): Training data tuple

  • (x_val, y_val): Validation data tuple

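Call sketches for the two splitting modes (x and y are KeyedArrays prepared as described above; :site_id is an assumed ID column name):

julia
# Random split of a tuple of KeyedArrays
(x_train, y_train), (x_val, y_val) = split_data((x, y), nothing; shuffleobs = true, split_ratio = 0.8)

# Group-aware split keeping all samples of an ID together
(x_train, y_train), (x_val, y_val) = split_data((x, y), :site_id)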
source
EasyHybrid.split_data Method

split_data(df::DataFrame, target, xvars, seqID; f=0.8, batchsize=32, shuffle=true, partial=true)

source
EasyHybrid.split_data Method

split_data(df::DataFrame, target, xvars; f=0.8, batchsize=32, shuffle=true, partial=true)

source
EasyHybrid.to_keyedArray Function

to_keyedArray(dfg::Union{Vector,GroupedDataFrame{DataFrame}}, vars=All())

source
EasyHybrid.to_keyedArray Method

to_keyedArray(df::DataFrame)

source
EasyHybrid.train Method
julia
train(hybridModel, data, save_ps; nepochs=200, batchsize=10, opt=Adam(0.01), patience=typemax(Int),
      file_name=nothing, loss_types=[:mse, :r2], training_loss=:mse, agg=sum, train_from=nothing,
      random_seed=161803, shuffleobs=false, yscale=log10, monitor_names=[], return_model=:best, 
      split_by_id=nothing, split_data_at=0.8, plotting=true, show_progress=true, hybrid_name=randstring(10))

Train a hybrid model using the provided data and save the training process to a file in JLD2 format. The default output file is trained_model.jld2, written to the output_tmp directory under the current working directory.

Arguments:

  • hybridModel: The hybrid model to be trained.

  • data: The training data, either a single DataFrame, a single KeyedArray, or a tuple of KeyedArrays.

  • save_ps: A tuple of physical parameters to save during training.

Core Training Parameters:

  • nepochs: Number of training epochs (default: 200).

  • batchsize: Size of the training batches (default: 10).

  • opt: The optimizer to use for training (default: Adam(0.01)).

  • patience: The number of epochs to wait before early stopping (default: typemax(Int) -> no early stopping).

Loss and Evaluation:

  • training_loss: The loss type to use during training (default: :mse).

  • loss_types: A vector of loss types to compute during training (default: [:mse, :r2]).

  • agg: The aggregation function to apply to the computed losses (default: sum).

Data Handling:

  • shuffleobs: Whether to shuffle the training data (default: false).

  • split_by_id: Column name or function to split data by ID (default: nothing -> no ID-based splitting).

  • split_data_at: Fraction of data to use for training when splitting (default: 0.8).

Training State and Reproducibility:

  • train_from: A tuple of physical parameters and state to start training from or an output of train (default: nothing -> new training).

  • random_seed: The random seed to use for training (default: 161803).

Output and Monitoring:

  • file_name: The name of the file to save the training process (default: nothing -> "trained_model.jld2").

  • hybrid_name: Name identifier for the hybrid model (default: randomly generated 10-character string).

  • return_model: The model to return: :best for the best model, :final for the final model (default: :best).

  • monitor_names: A vector of monitor names to track during training (default: []).

Visualization and UI:

  • plotting: Whether to generate plots during training (default: true).

  • show_progress: Whether to show progress bars during training (default: true).

  • yscale: The scale to apply to the y-axis for plotting (default: log10).

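A minimal call sketch; hybridModel and df are placeholders for a constructed hybrid model and prepared data, and (:Q10,) names a physical parameter to track:

julia
out = train(hybridModel, df, (:Q10,);
            nepochs = 100,
            batchsize = 32,
            opt = Adam(0.01),
            training_loss = :mse,
            loss_types = [:mse, :r2],
            patience = 30)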
source
EasyHybrid.unpack_keyedarray Method

unpack_keyedarray(ka::KeyedArray, variable::Symbol)

Extract a single variable from a KeyedArray and return it as a vector.

Arguments:

  • ka: The KeyedArray to unpack

  • variable: Symbol representing the variable to extract

Returns:

  • Vector containing the variable data

Example:

julia
# Extract just SW_IN from a KeyedArray
sw_in = unpack_keyedarray(ds_keyed, :SW_IN)
source
EasyHybrid.unpack_keyedarray Method

unpack_keyedarray(ka::KeyedArray, variables::Vector{Symbol})

Extract specified variables from a KeyedArray and return them as a NamedTuple of vectors.

Arguments:

  • ka: The KeyedArray to unpack

  • variables: Vector of symbols representing the variables to extract

Returns:

  • NamedTuple with variable names as keys and vectors as values

Example:

julia
# Extract SW_IN and TA from a KeyedArray
data = unpack_keyedarray(ds_keyed, [:SW_IN, :TA])
sw_in = data.SW_IN
ta = data.TA
source
EasyHybrid.unpack_keyedarray Method

unpack_keyedarray(ka::KeyedArray)

Extract all variables from a KeyedArray and return them as a NamedTuple of vectors.

Arguments:

  • ka: The KeyedArray to unpack

Returns:

  • NamedTuple with all variable names as keys and vectors as values

Example:

julia
# Extract all variables from a KeyedArray
data = unpack_keyedarray(ds_keyed)
# Access individual variables
sw_in = data.SW_IN
ta = data.TA
nee = data.NEE
source
EasyHybrid.@hybrid Macro
julia
@hybrid ModelName α β γ

Macro to define hybrid model structs with arbitrary numbers of physical parameters.

This defines a struct with:

  • Default fields: NN (neural network), predictors, forcing, targets.

  • Additional physical parameters, e.g., α, β, γ.

Examples

julia
@hybrid MyModel α β γ
@hybrid FluidModel (:viscosity, :density)
@hybrid SimpleModel :a :b
source