Module deconvolution
This module contains numerous routines for fitting peak shapes and deconvolution of 2D spectra.
Automatic deconvolution
The deconvolution process is inspired by article by Shimadzu 10.1016/j.chroma.2016.09.037.
The entire procedure is as follows:
- Get initial guess of spectra
if enough maxima are present, spectra at maxima are used
otherwise, the timepoints are clustered based on cosine similarity and averaged spectra of clusters are used
Refine spectra by fitting peak shapes to some
PeakModel
Optionally, unrestricted refinement of concentrations based on spectra
deconvolve_adaptive()
- mocca2.deconvolve_adaptive(data: ndarray[Any, dtype[_ScalarType_co]], model: PeakModel | Literal['BiGaussian', 'BiGaussianTailing', 'FraserSuzuki', 'Bemg'], max_mse: float, relaxe_concs: bool, min_comps: int, max_comps: int) Tuple[ndarray[Any, dtype[_ScalarType_co]], ndarray[Any, dtype[_ScalarType_co]], float]
Deconvolves data with increasingly more components until MSE limit is reached
Parameters
- data: NDArray
2D data [wavelength, data]
- model: PeakModel | Literal[‘BiGaussian’, ‘BiGaussianTailing’, ‘FraserSuzuki’]
mathematical model used for fitting shapes of components of peaks
- max_mse: float
Maximum allowed MSE for termination
- relaxe_concs: bool
If False, the fitted peak model functions are returned
Otherwise, the concentrations are refined with restricted least squares
- min_comps: int
Minimum number of components that can be fitted
- max_comps: int
Maximum number of components that can be fitted
Returns
- NDArray
concentrations [compound, time]
- NDArray
spectra [compound, wavelength], normalized such that mean = 1
- float
MSE
deconvolve_fixed()
- mocca2.deconvolve_fixed(data: ndarray[Any, dtype[_ScalarType_co]], n_comps: int, model: PeakModel, relaxe_concs: bool) Tuple[ndarray[Any, dtype[_ScalarType_co]], ndarray[Any, dtype[_ScalarType_co]], float]
Deconvolves data with given number of components. Returns concentration, spectra and MSE
Parameters
- data: NDArray
2D data [wavelength, data]
- n_comps: int
how many components should be used for deconvolution
- model: PeakModel
mathematical model used for fitting shapes of components of peaks
- relaxe_concs: bool
If False, the fitted bigaussian functions are returned
Otherwise, the concentrations are refined with restricted least squares
Returns
- NDArray
concentrations [compound, time]
- NDArray
spectra [compound, wavelength], normalized such that mean = 1
- float
MSE
Peak Models
Mathematical models that describe peak shapes. A nice summary of peak shapes is in 10.1016/S0922-3487(98)80024-5.
You can define your own peak shape model, but it must inherit from PeakModel
and implement all relevant methods.
PeakModel
Abstract class that defines all methods that peak model must have.
- class mocca2.deconvolution.peak_models.PeakModel
Abstract class for functions that model peak shape
- val(t: float | ndarray[Any, dtype[_ScalarType_co]], *params: float) float | ndarray[Any, dtype[_ScalarType_co]]
Returns height of peak specified by params at given time t.
- grad(t: float | ndarray[Any, dtype[_ScalarType_co]], *params: float) ndarray[Any, dtype[_ScalarType_co]]
Returns gradient w.r.t params. If t is array, the return shape is [parameter, t]
- init_guess(height: float, maximum: float, width_left: float, width_right: float) ndarray[Any, dtype[_ScalarType_co]]
Creates initial guess of the model parameters from peak descriptors (height, location of maximum, left and right widths of peak)
- get_bounds(max_t: float) List[Tuple[float, float]]
Returns bounds for individual parameters in the shape [(p1 min, p1 max), (p2 min, p2 max), …].
max_t is the maximum t that will be passed into the model, and also the number of time points in the peak region.
- n_params() int
Returns number of parameters of the model
FraserSuzuki
Fraser-Suzuki peak model works very well on high-quality experimental data. It is described by the following equation.
- class mocca2.deconvolution.peak_models.FraserSuzuki
Fraser-Suzuki peak model
BiGaussian
BiGaussian peak model is simple yet efficient for describing peaks. It is described by the following equation.
- class mocca2.deconvolution.peak_models.BiGaussian
BiGaussian peak model
BiGaussianTailing
BiGaussian with exponential tailing. This is suitable if the peaks have slowly decaying tails. It is described by the following equation.
- class mocca2.deconvolution.peak_models.BiGaussianTailing
BiGaussian with exponential tailing peak model
BEMG
Bi-Exponentially Modified Gaussian function as described in 10.1016/j.chroma.2016.09.037. Very versatile model that can describe tailing and fronting.
- class mocca2.deconvolution.peak_models.Bemg
Bi-Exponentially Modified Gaussian peak model
Fitting Peak Models
The peak shape, and optionally also the component spectra, can be fitted using fit_peak_model()
to best explain the 2D chromatogram data.
Initial guess for spectra of individual components can be obtained from guess_spectra()
.
fit_peak_model_and_spectra()
- mocca2.deconvolution.fit_peak_model.fit_peak_model(data: ndarray[Any, dtype[_ScalarType_co]], model: PeakModel, spectra: ndarray[Any, dtype[_ScalarType_co]] | None = None, n_compounds: int | None = None, adjust_spectra: bool = True, initial_params: ndarray[Any, dtype[_ScalarType_co]] | None = None, initial_concs: ndarray[Any, dtype[_ScalarType_co]] | None = None) Tuple[ndarray[Any, dtype[_ScalarType_co]], float, ndarray[Any, dtype[_ScalarType_co]]]
Fits peak shapes to data. Spectra can be fixed or fitted too.
Parameters
- data: NDArray
2D chromatogram data, [wavelength, time]
- model: PeakModel
mathematical model of the peak shape
- spectra: NDArray | None
absorption spectra of individual components, [components, wavelength]. If not provided, guess_spectra() is used.
- n_compounds: int | None = None
number of peaks to deconvolve, must be provided if spectra is None. Ignored if spectra are provided
- adjust_spectra: bool
whether the spectra should be also fitted. When False, the spectra are kept constant
- initial_params: None | NDArray
initial parameters for the models, flattened
- initial_concs: None | NDArray
initial guess of the concentrations
Returns
- NDArray:
Concentrations [component, time]
- float
Mean Squared Error
- NDArray:
The optimized parameters, flattened
guess_spectra()
- mocca2.deconvolution.guess_spectra.guess_spectra(data: ndarray[Any, dtype[_ScalarType_co]], n_compounds: int) ndarray[Any, dtype[_ScalarType_co]]
Clusters data in the peak to try to isolate dominant spectra.
Parameters
- data: NDArray
2D peak data [wavelength, time]
- n_compounds: int
how many spectra should be generated
Returns
- NDArray
The generated spectra [compound, wavelength]. Spectra are not normalized
Constrained Linear Problems
This submodule provides methods for estimating the non-negative concentrations or spectra that best fit the 2D chromatogram data, assuming the other information (concentrations or spectra) is known.
concentrations_from_spectra()
- mocca2.deconvolution.nonnegative_lstsq.concentrations_from_spectra(data: ndarray[Any, dtype[_ScalarType_co]], spectra: ndarray[Any, dtype[_ScalarType_co]]) Tuple[ndarray[Any, dtype[_ScalarType_co]], float]
Calculates positive concentrations such that the mean squared error is minimized. Returns concentrations and MSE
Parameters
- data: NDArray
Absorbance data [wavelength, time]
- spectra: NDArray
Spectra of the individual compounds [compound, wavelength]
Returns
- NDArray
Positive concentrations that minimize the MSE [compound, time]
- float
MSE
spectra_from_concentrations()
- mocca2.deconvolution.nonnegative_lstsq.spectra_from_concentrations(data: ndarray[Any, dtype[_ScalarType_co]], concentrations: ndarray[Any, dtype[_ScalarType_co]]) Tuple[ndarray[Any, dtype[_ScalarType_co]], float]
Calculates positive spectra such that the mean squared error is minimized. Returns spectra and MSE
Parameters
- data: NDArray
Absorbance data [wavelength, time]
- concentrations: NDArray
Concentrations of the individual compounds [compound, time]
Returns
- NDArray
Positive spectra that minimize the MSE [compound, wavelength]
- float
MSE