Module baseline

The baseline can be estimated using estimate_baseline().

Currently available algorithms are AsLS (asymmetric least squares with smoothness penalty), arPLS (asymmetrically reweighted penalized least squares), and FlatFit. The AsLS and arPLS algorithms were adapted from 10.1039/C4AN01061B.

As described in the paper, AsLS tends to estimate the baseline lower than it actually is. arPLS, on the other hand, can estimate the baseline too high, which can cut off the smallest peaks.

In general, I would recommend using FlatFit. If your data is very noisy, arPLS should be a good choice.


mocca2.estimate_baseline(data: ndarray[Any, dtype[_ScalarType_co]] | Data2D, method: Literal['asls', 'arpls', 'flatfit'] = 'arpls', smoothness: float = 1.0, p: float | None = None, tol: float = 1e-07, max_iter: int | None = None, smooth_wl: int | None = None) -> ndarray[Any, dtype[_ScalarType_co]]

data: NDArray | Data2D

Data with shape [N] or [sample, N]

method: Literal['asls', 'arpls', 'flatfit']

Possible baseline estimation methods are AsLS, arPLS, and FlatFit. FlatFit and AsLS work well with smooth data; arPLS works better with noisy data.

smoothness: float

size of smoothness penalty

p: float | None

Asymmetry factor, different for AsLS and arPLS

tol: float

maximum relative change of w for convergence

max_iter: int | None

maximum number of iterations. If not specified, guessed automatically

smooth_wl: int | None

if specified, applies a Savitzky-Golay filter (order 2) across the wavelength axis with the given window size



Returns: values that minimize the asymmetric squared error with smoothness penalty, same shape as data

See details in the individual routines, on StackOverflow, or in 10.1039/C4AN01061B.


Asymmetric Least Squares with smoothness penalty.

mocca2.baseline.asls.asls(data: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], smoothness: float, p: float, tol: float = 1e-07, max_iter: int | None = None, baseline_guess: ndarray[Any, dtype[_ScalarType_co]] | None = None) -> ndarray[Any, dtype[_ScalarType_co]]

data: ArrayLike

1D data

smoothness: float

size of smoothness penalty

p: float

asymmetry factor, w = p if y_fit < data else (1-p)

tol: float

maximum relative change of w for convergence

max_iter: int | None

maximum number of iterations

baseline_guess: ArrayLike | None

initial guess for baseline



Returns: values that minimize the asymmetric squared error with smoothness penalty


This routine finds the vector z that minimizes:

(y-z).T @ W @ (y-z) + smoothness * z.T @ D.T @ D @ z

where W = diag(w) with w = p if y > z else (1-p), consistent with the description of p above, and D is the finite-difference matrix for the second derivative.

See details on StackOverflow and in 10.1039/C4AN01061B.
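The iteration described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not mocca2's implementation: the helper name `asls_sketch`, the dense matrices, the default `max_iter=50`, and the norm-based stopping test are all choices made for this example. It uses the weight rule from the description of p (w = p where the data lie above the fit, 1-p otherwise).

```python
import numpy as np


def asls_sketch(y, smoothness, p, tol=1e-7, max_iter=50):
    """Illustrative AsLS loop (hypothetical helper, not mocca2's code)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # D: (n-2) x n second-order finite-difference matrix
    D = np.diff(np.eye(n), n=2, axis=0)
    penalty = smoothness * (D.T @ D)
    w = np.ones(n)
    z = y.copy()
    for _ in range(max_iter):
        # For fixed weights, the minimizer solves (W + smoothness * D.T @ D) z = W y
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        # Points above the fit get weight p, all others 1 - p
        w_new = np.where(y > z, p, 1.0 - p)
        # Stop once the relative change of w falls below tol
        if np.linalg.norm(w_new - w) <= tol * np.linalg.norm(w):
            break
        w = w_new
    return z


# Synthetic data (assumed for illustration): linear baseline plus one Gaussian peak
x = np.linspace(0.0, 1.0, 200)
true_baseline = 0.5 + 0.3 * x
y = true_baseline + 2.0 * np.exp(-0.5 * ((x - 0.5) / 0.02) ** 2)
z = asls_sketch(y, smoothness=1e5, p=0.01)
```

A small p makes points above the fit nearly irrelevant, so the estimate passes under the peak. Dense matrices keep the sketch short; for long signals a sparse or banded solver would be the more practical choice.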


Asymmetrically reweighted Penalized Least Squares with smoothness penalty.

mocca2.baseline.arpls.arpls(data: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], smoothness: float, p: float = 2.0, tol: float = 1e-07, max_iter: int | None = None, baseline_guess: ndarray[Any, dtype[_ScalarType_co]] | None = None) -> ndarray[Any, dtype[_ScalarType_co]]

data: ArrayLike

1D data

smoothness: float

size of smoothness penalty

p: float

lower values shift the baseline lower

tol: float

maximum relative change of w for convergence

max_iter: int | None

maximum number of iterations

baseline_guess: ArrayLike | None

initial guess for baseline



Returns: values that minimize the asymmetric squared error with smoothness penalty, and w


This routine finds the vector z that minimizes:

(y-z).T @ W @ (y-z) + smoothness * z.T @ D.T @ D @ z

where W = diag(w), w is given by a nonlinear weighting function of the residuals, and D is the finite-difference matrix for the second derivative.

See details in 10.1039/C4AN01061B.
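To make the "nonlinear weighting function" concrete, here is a hedged sketch of an arPLS-style loop. It uses a logistic weight built from the mean and standard deviation of the negative residuals, which is the general form associated with arPLS; the exact constants, the helper name `arpls_sketch`, the returned `(z, w)` pair, and the omission of mocca2's `p` parameter (whose precise role is not documented here) are assumptions of this example, not a description of mocca2's code.

```python
import numpy as np


def arpls_sketch(y, smoothness, tol=1e-7, max_iter=50):
    """Illustrative arPLS-style loop (hypothetical helper, not mocca2's code)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # D: (n-2) x n second-order finite-difference matrix
    D = np.diff(np.eye(n), n=2, axis=0)
    penalty = smoothness * (D.T @ D)
    w = np.ones(n)
    z = y.copy()
    for _ in range(max_iter):
        # Weighted penalized least-squares fit for the current weights
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        d = y - z
        d_neg = d[d < 0]
        if d_neg.size < 2:
            break
        m, s = d_neg.mean(), d_neg.std()
        if s == 0.0:
            break
        # Logistic weight: close to 1 near or below the fit, close to 0 far above it
        exponent = np.clip(2.0 * (d - (2.0 * s - m)) / s, -500.0, 500.0)
        w_new = 1.0 / (1.0 + np.exp(exponent))
        # Stop once the relative change of w falls below tol
        if np.linalg.norm(w_new - w) <= tol * np.linalg.norm(w):
            break
        w = w_new
    return z, w


# Synthetic noisy data (assumed for illustration)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
true_baseline = 0.5 + 0.3 * x
peak = 2.0 * np.exp(-0.5 * ((x - 0.5) / 0.02) ** 2)
y = true_baseline + peak + rng.normal(0.0, 0.02, x.size)
z, w = arpls_sketch(y, smoothness=1e4)
```

Because the weights depend on the noise statistics of the residuals below the fit, points within the noise band keep a weight near 1 while peak points are effectively ignored. On noise-free data the standard deviation can collapse toward zero, which is why the example adds noise.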


FlatFit algorithm with smoothness penalty. The details will be published soon in the MOCCA2 paper.

mocca2.baseline.flatfit.flatfit(data: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], smoothness: float, p: float) -> ndarray[Any, dtype[_ScalarType_co]]

data: ArrayLike

1D data

smoothness: float

size of smoothness penalty

p: float

relative size of Savitzky-Golay filter



Returns: values that minimize the asymmetric squared error with smoothness penalty


This routine finds the vector z that minimizes:

(y-z).T @ W @ (y-z) + smoothness * z.T @ D.T @ D @ z

where W is determined by the slope and curvature at each point.
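For any fixed diagonal weight matrix W, the objective shared by AsLS, arPLS, and FlatFit is quadratic in z, so its minimizer comes from a single linear system. The short derivation below writes λ for smoothness, a symbol introduced here for brevity; how mocca2 solves the system internally is not stated in this documentation.

```latex
\begin{aligned}
J(z) &= (y - z)^{\mathsf T} W (y - z) + \lambda\, z^{\mathsf T} D^{\mathsf T} D\, z, \\
\nabla_z J &= -2\, W (y - z) + 2 \lambda\, D^{\mathsf T} D\, z = 0, \\
(W + \lambda\, D^{\mathsf T} D)\, z &= W y .
\end{aligned}
```

The three routines differ only in how W is chosen: AsLS and arPLS update it iteratively from the residuals y - z, while FlatFit derives it from the slope and curvature of the data.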