Module baseline
The baseline can be estimated using estimate_baseline().
Currently available algorithms are AsLS (asymmetric least squares with smoothness penalty), arPLS (asymmetrically reweighted penalized least squares) and FlatFit. The AsLS and arPLS algorithms were adapted from 10.1039/C4AN01061B.
As described in the paper, AsLS is biased towards estimating the baseline lower than it actually is. On the other hand, arPLS can estimate the baseline too high, which can cut off some of the smallest peaks.
In general, I would recommend using FlatFit. If your data is very noisy, arPLS should be a good choice.
estimate_baseline()
This is a wrapper for the baseline estimation algorithms asls(), arpls() and flatfit().
- mocca2.estimate_baseline(data: ndarray[Any, dtype[_ScalarType_co]] | Data2D, method: Literal['asls', 'arpls', 'flatfit'] = 'arpls', smoothness: float = 1.0, p: float | None = None, tol: float = 1e-07, max_iter: int | None = None, smooth_wl: int | None = None) ndarray[Any, dtype[_ScalarType_co]]
Estimates baseline using AsLS, arPLS or FlatFit algorithm
Parameters
- data: NDArray | Data2D
Data with shape [N] or [sample, N]
- method: Literal['asls', 'arpls', 'flatfit']
Possible baseline estimation methods are AsLS, arPLS and FlatFit. FlatFit and AsLS work well with smooth data, arPLS works better with noisy data.
- smoothness: float
size of smoothness penalty
- p: float | None
Asymmetry factor, different for AsLS and arPLS
- tol: float
maximum relative change of w for convergence
- max_iter: int | None
maximum number of iterations. If not specified, guessed automatically
- smooth_wl: int | None
if specified, applies a Savitzky-Golay filter (order 2) across the wavelength axis with the given window size
Returns
- NDArray
values that minimize the asymmetric squared error with smoothness penalty, same shape as data
See details in the individual routines or at [StackOverflow](https://stackoverflow.com/a/50160920) and [10.1039/C4AN01061B](https://doi.org/10.1039/C4AN01061B).
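All three methods fit the baseline by solving a weighted least squares problem with a second-derivative smoothness penalty (the objective given in the sections below). As a standalone illustration of that core solve in plain numpy/scipy (not the mocca2 implementation, which may differ in details):

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def penalized_wls(y, w, smoothness):
    """Solve (W + smoothness * D.T @ D) z = W @ y for the baseline z,
    where D is the second-order finite-difference operator and
    W = diag(w) holds the per-point weights."""
    n = len(y)
    # Second-derivative finite differences: (n-2) x n banded matrix
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    A = sparse.diags(w) + smoothness * (D.T @ D)
    return spsolve(A.tocsc(), w * y)

# With uniform weights this reduces to a plain Whittaker smoother
rng = np.random.default_rng(0)
y = np.sin(np.linspace(0, 3, 200)) + 0.1 * rng.standard_normal(200)
z = penalized_wls(y, np.ones_like(y), smoothness=1e3)
```

The individual algorithms differ only in how the weights w are chosen and iteratively updated.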
AsLS()
Asymmetric Least Squares with smoothness penalty.
- mocca2.baseline.asls.asls(data: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], smoothness: float, p: float, tol: float = 1e-07, max_iter: int | None = None, baseline_guess: ndarray[Any, dtype[_ScalarType_co]] | None = None) ndarray[Any, dtype[_ScalarType_co]]
AsLS: Asymmetric Least Squares with smoothness penalty
Parameters
- data: ArrayLike
1D data
- smoothness: float
size of smoothness penalty
- p: float
asymmetry factor, w = p if y_fit < data else (1-p)
- tol: float
maximum relative change of w for convergence
- max_iter: int | None
maximum number of iterations
- baseline_guess: ArrayLike | None
initial guess for baseline
Returns
- NDArray
values that minimize the asymmetric squared error with smoothness penalty
Description
This routine finds the vector z that minimizes:
(y-z).T @ W @ (y-z) + smoothness * z.T @ D.T @ D @ z
where w = p if y > z else (1-p) and D is the finite difference operator for the second derivative.
See details at [StackOverflow](https://stackoverflow.com/a/50160920) and [10.1039/C4AN01061B](https://doi.org/10.1039/C4AN01061B).
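A rough standalone sketch of this iteration, assuming the usual AsLS scheme from the cited references (the parameter defaults here are illustrative, not mocca2's):

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def asls_sketch(y, smoothness=1e5, p=0.01, max_iter=20, tol=1e-7):
    """Minimal AsLS: repeat the weighted Whittaker solve, reassigning
    w = p above the current fit (peaks) and 1 - p below it (baseline)."""
    n = len(y)
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    penalty = smoothness * (D.T @ D)
    w = np.ones(n)
    z = y
    for _ in range(max_iter):
        z = spsolve((sparse.diags(w) + penalty).tocsc(), w * y)
        w_new = np.where(y > z, p, 1.0 - p)
        if np.max(np.abs(w_new - w)) < tol:
            break
        w = w_new
    return z

# Linear baseline plus one Gaussian peak: AsLS recovers the line
x = np.linspace(0, 1, 300)
y = 0.5 * x + np.exp(-((x - 0.5) ** 2) / 0.001)
baseline = asls_sketch(y)
```

Because a straight line has zero second-derivative penalty, the linear part of the baseline is recovered essentially exactly away from the peak.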
arPLS()
Asymmetrically reweighted Penalized Least Squares with smoothness penalty.
- mocca2.baseline.arpls.arpls(data: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], smoothness: float, p: float = 2.0, tol: float = 1e-07, max_iter: int | None = None, baseline_guess: ndarray[Any, dtype[_ScalarType_co]] | None = None) ndarray[Any, dtype[_ScalarType_co]]
arPLS: Asymmetrically Reweighted Penalized Least Squares
Parameters
- data: ArrayLike
1D data
- smoothness: float
size of smoothness penalty
- p: float
lower values shift the baseline lower
- tol: float
maximum relative change of w for convergence
- max_iter: int | None
maximum number of iterations
- baseline_guess: ArrayLike | None
initial guess for baseline
Returns
- NDArray
values that minimize the asymmetric squared error with smoothness penalty
Description
This routine finds the vector z that minimizes:
(y-z).T @ W @ (y-z) + smoothness * z.T @ D.T @ D @ z
where w is a nonlinear weighting function and D is the finite difference operator for the second derivative.
See details at [10.1039/C4AN01061B](https://doi.org/10.1039/C4AN01061B).
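The nonlinear weighting can be sketched as follows, following the logistic reweighting from the cited paper (10.1039/C4AN01061B). Treating p as the logistic steepness factor (the paper uses 2, matching the default above) is my assumption and may not match mocca2's exact use of the parameter:

```python
import numpy as np

def arpls_weights(y, z, p=2.0):
    """arPLS reweighting sketch: residuals d = y - z below the baseline
    estimate the noise (mean m, std s of the negative part); weights fall
    off logistically for points rising above that noise band."""
    d = y - z
    dn = d[d < 0]
    if dn.size == 0:
        return np.ones_like(d)
    m, s = dn.mean(), dn.std()
    if s == 0:
        return (d <= 0).astype(float)
    # Clip the exponent to avoid overflow for very large residuals
    return 1.0 / (1.0 + np.exp(np.clip(p * (d - (2.0 * s - m)) / s, -500, 500)))

# Noise floor around zero plus one tall peak: the peak gets near-zero weight
y = np.array([-0.1, -0.2, -0.15, -0.05, 5.0, -0.1, -0.2])
w = arpls_weights(y, np.zeros_like(y))
```

Points within the noise band keep weights near 1, so the baseline follows the noise mean instead of being dragged down, which is why arPLS handles noisy data better than AsLS.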
FlatFit()
FlatFit algorithm with smoothness penalty. The details will be published soon in the MOCCA2 paper.
- mocca2.baseline.flatfit.flatfit(data: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], smoothness: float, p: float) ndarray[Any, dtype[_ScalarType_co]]
FlatFit: least squares weighted by inverse scale of 1st and 2nd derivatives with smoothness penalty
Parameters
- data: ArrayLike
1D data
- smoothness: float
size of smoothness penalty
- p: float
relative size of Savitzky-Golay filter
Returns
- NDArray
values that minimize the asymmetric squared error with smoothness penalty
Description
This routine finds the vector z that minimizes:
(y-z).T @ W @ (y-z) + smoothness * z.T @ D.T @ D @ z
where W is determined by the slope and curvature at each point.
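Since the FlatFit details are not yet published, the following is only a hypothetical illustration of weighting by the inverse scale of the 1st and 2nd derivatives, using scipy's Savitzky-Golay derivatives; it is not the actual FlatFit algorithm:

```python
import numpy as np
from scipy.signal import savgol_filter

def derivative_weights(y, window=15):
    """Illustrative only (not the published FlatFit): weight each point by
    the inverse magnitude of its smoothed 1st and 2nd derivatives, so flat
    regions (baseline) dominate the penalized least squares fit."""
    d1 = savgol_filter(y, window, polyorder=2, deriv=1)
    d2 = savgol_filter(y, window, polyorder=2, deriv=2)
    scale = (np.abs(d1) / (np.abs(d1).max() + 1e-12)
             + np.abs(d2) / (np.abs(d2).max() + 1e-12))
    return 1.0 / (1.0 + scale**2)

# Flat signal with one Gaussian peak: weights drop where the signal bends
x = np.linspace(0, 1, 201)
y = np.exp(-((x - 0.5) ** 2) / 0.002)
w = derivative_weights(y)
```

Because no residual-based reweighting loop is needed, a scheme of this shape can estimate the baseline in a single solve.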