Module deconvolution

This module contains numerous routines for fitting peak shapes and deconvolution of 2D spectra.

Automatic deconvolution

The deconvolution process is inspired by article by Shimadzu 10.1016/j.chroma.2016.09.037.

The entire procedure is as follows:

  1. Get initial guess of spectra
    • if enough maxima are present, spectra at maxima are used

    • otherwise, the timepoints are clustered based on cosine similarity and averaged spectra of clusters are used

  2. Refine spectra by fitting peak shapes to some PeakModel

  3. Optionally, unrestricted refinement of concentrations based on spectra

deconvolve_adaptive()

mocca2.deconvolve_adaptive(data: ndarray[Any, dtype[_ScalarType_co]], model: PeakModel | Literal['BiGaussian', 'BiGaussianTailing', 'FraserSuzuki', 'Bemg'], max_mse: float, relaxe_concs: bool, min_comps: int, max_comps: int) Tuple[ndarray[Any, dtype[_ScalarType_co]], ndarray[Any, dtype[_ScalarType_co]], float]

Deconvolves data with increasingly more components until MSE limit is reached

Parameters

data: NDArray

2D data [wavelength, data]

model: PeakModel | Literal[‘BiGaussian’, ‘BiGaussianTailing’, ‘FraserSuzuki’]

mathematical model used for fitting shapes of components of peaks

max_mse: float

Maximum allowed MSE for termination

relaxe_concs: bool

If False, the fitted peak model functions are returned

Otherwise, the concentrations are refined with restricted least squares

min_comps: int

Minimum number of components that can be fitted

max_comps: int

Maximum number of components that can be fitted

Returns

NDArray

concentrations [compound, time]

NDArray

spectra [compound, wavelength], normalized such that mean = 1

float

MSE

deconvolve_fixed()

mocca2.deconvolve_fixed(data: ndarray[Any, dtype[_ScalarType_co]], n_comps: int, model: PeakModel, relaxe_concs: bool) Tuple[ndarray[Any, dtype[_ScalarType_co]], ndarray[Any, dtype[_ScalarType_co]], float]

Deconvolves data with given number of components. Returns concentration, spectra and MSE

Parameters

data: NDArray

2D data [wavelength, data]

n_comps: int

how many components should be used for deconvolution

model: PeakModel

mathematical model used for fitting shapes of components of peaks

relaxe_concs: bool

If False, the fitted bigaussian functions are returned

Otherwise, the concentrations are refined with restricted least squares

Returns

NDArray

concentrations [compound, time]

NDArray

spectra [compound, wavelength], normalized such that mean = 1

float

MSE

Peak Models

Mathematical models that describe peak shapes. A nice summary of peak shapes is in 10.1016/S0922-3487(98)80024-5.

You can define your own peak shape model, but it must inherit from PeakModel and implement all relevant methods.

PeakModel

Abstract class that defines all methods that peak model must have.

class mocca2.deconvolution.peak_models.PeakModel

Abstract class for functions that model peak shape

val(t: float | ndarray[Any, dtype[_ScalarType_co]], *params: float) float | ndarray[Any, dtype[_ScalarType_co]]

Returns height of peak specified by params at given time t.

grad(t: float | ndarray[Any, dtype[_ScalarType_co]], *params: float) ndarray[Any, dtype[_ScalarType_co]]

Returns gradient w.r.t params. If t is array, the return shape is [parameter, t]

init_guess(height: float, maximum: float, width_left: float, width_right: float) ndarray[Any, dtype[_ScalarType_co]]

Creates initial guess of the model parameters from peak descriptors (height, location of maximum, left and right widths of peak)

get_bounds(max_t: float) List[Tuple[float, float]]

Returns bounds for individual parameters in the shape [(p1 min, p1 max), (p2 min, p2 max), …].

max_t is the maximum t that will be passed into the model, and also the number of time points in the peak region.

n_params() int

Returns number of parameters of the model

FraserSuzuki

Fraser-Suzuki peak model works very well on high-quality experimental data. It is described by the following equation.

\[f(t, h, \mu, \sigma, a) = h \cdot \exp \left( - \frac{1}{a^2} \ln^2 \left( 1 + \frac{a(t-\mu)}{\sigma} \right) \right)\]
class mocca2.deconvolution.peak_models.FraserSuzuki

Fraser-Suzuki peak model

BiGaussian

BiGaussian peak model is simple yet efficient for describing peaks. It is described by the following equation.

\[\begin{split}f(t, h, \mu, \sigma_1, \sigma_2) = \begin{cases} h \cdot \exp \left( -\frac{(t-\mu)^2}{2{\sigma_1}^2}\right) &\text{if } t < \mu \\ h \cdot \exp \left( -\frac{(t-\mu)^2}{2{\sigma_2}^2}\right) &\text{if } t \ge \mu \end{cases}\end{split}\]
class mocca2.deconvolution.peak_models.BiGaussian

BiGaussian peak model

BiGaussianTailing

BiGaussian with exponential tailing. This is suitable if the peaks have slowly decaying tails. It is described by the following equation.

\[\begin{split}f(t, h, \mu, \sigma_1, \sigma_2, t_h, t_w) = \begin{cases} h \cdot \exp \left( -\frac{(t-\mu)^2}{2{\sigma_1}^2}\right) &\text{if } t < \mu \\ h \cdot (1-t_h)\cdot \exp \left( -\frac{(t-\mu)^2}{2{\sigma_2}^2}\right) + h \cdot t_h \cdot \exp \left(-\frac{t-\mu}{t_w}\right) &\text{if } t \ge \mu \end{cases}\end{split}\]
class mocca2.deconvolution.peak_models.BiGaussianTailing

BiGaussian with exponential tailing peak model

BEMG

Bi-Exponentially Modified Gaussian function as described in 10.1016/j.chroma.2016.09.037. Very versatile model that can describe tailing and fronting.

class mocca2.deconvolution.peak_models.Bemg

Bi-Exponentially Modified Gaussian peak model

Fitting Peak Models

The peak shape, and optionally also the component spectra, can be fitted using fit_peak_model() to best explain the 2D chromatogram data.

Initial guess for spectra of individual components can be obtained from guess_spectra().

fit_peak_model_and_spectra()

mocca2.deconvolution.fit_peak_model.fit_peak_model(data: ndarray[Any, dtype[_ScalarType_co]], model: PeakModel, spectra: ndarray[Any, dtype[_ScalarType_co]] | None = None, n_compounds: int | None = None, adjust_spectra: bool = True, initial_params: ndarray[Any, dtype[_ScalarType_co]] | None = None, initial_concs: ndarray[Any, dtype[_ScalarType_co]] | None = None) Tuple[ndarray[Any, dtype[_ScalarType_co]], float, ndarray[Any, dtype[_ScalarType_co]]]

Fits peak shapes to data. Spectra can be fixed or fitted too.

Parameters

data: NDArray

2D chromatogram data, [wavelength, time]

model: PeakModel

mathematical model of the peak shape

spectra: NDArray | None

absorption spectra of individual components, [components, wavelength]. If not provided, guess_spectra() is used.

n_compounds: int | None = None

number of peaks to deconvolve, must be provided if spectra is None. Ignored if spectra are provided

adjust_spectra: bool

whether the spectra should be also fitted. When False, the spectra are kept constant

initial_params: None | NDArray

initial parameters for the models, flattened

initial_concs: None | NDArray

initial guess of the concentrations

Returns

NDArray:

Concentrations [component, time]

float

Mean Squared Error

NDArray:

The optimized parameters, flattened

guess_spectra()

mocca2.deconvolution.guess_spectra.guess_spectra(data: ndarray[Any, dtype[_ScalarType_co]], n_compounds: int) ndarray[Any, dtype[_ScalarType_co]]

Clusters data in the peak to try to isolate dominant spectra.

Parameters

data: NDArray

2D peak data [wavelength, time]

n_compounds: int

how many spectra should be generated

Returns

NDArray

The generated spectra [compound, wavelength]. Spectra are not normalized

Constrained Linear Problems

This submodule provides methods for estimating the non-negative concentrations or spectra that best fit the 2D chromatogram data, assuming the other information (concentrations or spectra) is known.

concentrations_from_spectra()

mocca2.deconvolution.nonnegative_lstsq.concentrations_from_spectra(data: ndarray[Any, dtype[_ScalarType_co]], spectra: ndarray[Any, dtype[_ScalarType_co]]) Tuple[ndarray[Any, dtype[_ScalarType_co]], float]

Calculates positive concentrations such that the mean squared error is minimized. Returns concentrations and MSE

Parameters

data: NDArray

Absorbance data [wavelength, time]

spectra: NDArray

Spectra of the individual compounds [compound, wavelength]

Returns

NDArray

Positive concentrations that minimize the MSE [compound, time]

float

MSE

spectra_from_concentrations()

mocca2.deconvolution.nonnegative_lstsq.spectra_from_concentrations(data: ndarray[Any, dtype[_ScalarType_co]], concentrations: ndarray[Any, dtype[_ScalarType_co]]) Tuple[ndarray[Any, dtype[_ScalarType_co]], float]

Calculates positive spectra such that the mean squared error is minimized. Returns spectra and MSE

Parameters

data: NDArray

Absorbance data [wavelength, time]

concentrations: NDArray

Concentrations of the individual compounds [compound, time]

Returns

NDArray

Positive spectra that minimize the MSE [compound, wavelength]

float

MSE