Fit model to data

Data from real membrane filtration systems could be obtained, and the goal now is to fit these kind of data to a simulation model. Such data corresponds to the volume $v$ and/or energy $e$ during time, but the mass $m$ attached to it could not be observed.

using Filtration
using Plots
using LinearAlgebra:norm

Synthetic data

On this tutorial, data are generated from the following model with the following parameters $p$.

a = 1; b = 1; e = 1;
p = [a, b, e]

f₁(m,p) = p[2] / (p[3] + m)
f₂(m,p) = -p[1] * m
g(m,p)  = 1 / (p[3] + m)
g₁(m,p) = g(m,p)
g₂(m,p) = -g(m,p)

model = Model(g₁, g₂, f₁, f₂, p, :max, initial_guess = 2.);
nothing; #hide

$n$ data are generated with a normal noise $\varepsilon$ with a choosen control.

n = 100
ε = 0
u₁ = 0.6; u₂ = 0.8
t0 = 0.; t1 = 25.; tf = 50.
u = Control([t -> u₁ , t -> u₂], [t0, t1, tf], 2, 1)
x0 = [0., 4., 0.]
sol = simulate(model, u, x0)
t = collect(range(u.time[1], u.time[end], n))
data = [x[1] for x ∈ sol.state.(t)] + ε*(randn(n) .- 0.5)
plt = plot(sol, state_ylabels = ["v", "m", "t"])
scatter!(plt, t, data, subplot = 1, label = "data", mc = :red, ma = 0.5, ms = 2)

Known initial guess

Let assume that the initial mass $m(t_0) = m_0$ is known. The parameters can be easily fitted to the data

initial_p = [0.5, 0.5, 0.5]
fitted_p = fit(model, u, t, data, initial_p, x0, method = :known)
println("Error norm : ", norm(fitted_p - p))
sol_init = simulate(model, u, x0; p = initial_p)
sol_sim = simulate(model, u, x0; p = fitted_p)
plt = plot(sol, state_ylabels = ["v", "m", "t"], label = "Real")
plot!(plt, sol_init, c = :red, label = "Initial", ls = :dash)
scatter!(plt, t, data, subplot = 1, label = "data", mc = :red, ma = 0.5, ms = 2)
plot!(sol_sim, label = "Fitted", c = :green, ls = :dash)

Unknown initial guess

Let assume now that the initial mass $m(t_0)$ is not known. One proposes to apply a constant control $u = u_1$ during a time interval $[t_0, t_1]$ until the steady state $m_1$ is reach, and to change the control after to $u = u_2$ during $[t_1, t_f]$. Thanks to the first arc, some informations could be obtained.

First, at steady state, we have

\[ \dot m(t) = 0 \quad \Longrightarrow \quad f_0(m_1,p) + u_1 f_1(m_1,p) = 0,\]

where $f_0$ and $f_1$ corresponds to the functions of the $m$ dynamic. Such equation leads to

\[ m(t_1) \approx m_1 = \psi(u_1, p)\]

where $\psi$ is the inverse of $m \mapsto \frac{-f_0(m,p)}{f_1(m,p)}$.

Second, at steady state, we have

\[ \dot v(t) = g_0(m_1,p) + u_1 g_1(m_1,p) = c\]

where $c$ is constant.

Both constraints on mass $m(t_1)$ and $\dot v$ are used to fit parameters with unknown initial guess on the second singular arc.

To do this, one first need to compute $c$ on the first arc as the slope of $v$ at steady state.

# Affine fit function
function affine_fit(x, y)
    x = [x ones(length(x))]
    y = [y ones(length(y))]
    coeffs = x \ y
    return coeffs[1], coeffs[2]
end

# R2 test
function R2(x, y, a, b)
    y_hat = a*x .+ b
    mean_y = sum(y) / length(y)
    ss_res = sum((y .- y_hat).^2)
    ss_tot = sum((y .- mean_y).^2)
    return 1 - (ss_res / ss_tot)
end

# First arc data
t_ = t[t.<t1]
data_ = data[t.<t1]

# Compute R2 for n removed points
all_n = collect(1:length(t_)-5)
all_t = []
all_r2 = []
for n ∈ all_n
    a, b = affine_fit(t_[n:end], data_[n:end])
    push!(all_r2, R2(t_[n:end], data_[n:end], a, b))
    push!(all_t, t_[n])
end

# Get n wrt R2
ε = 1e-5
ind = findfirst(1 .- all_r2 .< ε )

# Print & Plot
slope, intercept = affine_fit(t_[all_n[ind]:end], data_[all_n[ind]:end])
println("Slope of v̇ at steady state : ", slope)
plt = plot(all_t, all_r2, xlabel = "Number of unused points", ylabel = "R²", label = "")
scatter!(plt, [all_t[ind]], [all_r2[ind]], color = :black, label = "Selected number")

By providing this slope $c$ and the function $\phi$ (as the get_m function), we can esily fit parameters $p$ of the model on the data of the second arc

# Inverse of -f₋/f₊
function get_m(u, p)
    Δ = p[3]^2 - 4*(p[2]* (u+1))/(p[1]* (u-1))
    m = (-p[3] + sqrt(Δ))/2
    return m
end

t_ = t[t.>t1]
t_ = t_ .- t_[1]
data_ = data[t.>t1]
data_ = data_ .- data_[1]
u_ = Control([t -> u₂], [0, tf - t1], 1, 1)
x0_ = [0., 3., 0.]

fitted_p = fit(model, u_, t_, data_, initial_p, x0_; method = :unknown, get_m = get_m, u = u₁, slope = slope)
println("Error norm : ", norm(fitted_p - p))
sol_sim = simulate(model, u, x0; p = fitted_p)
plt = plot(sol, state_ylabels = ["v", "m", "t"], label = "Real")
scatter!(plt, t, data, subplot = 1, label = "data", mc = :red, ma = 0.5, ms = 2)
plot!(sol_sim, label = "Fitted", c = :green, ls = :dash)