solver
eempy.solver.core
Solver routines for excitation–emission matrix (EEM) decomposition.
This module provides numerical solvers used to decompose an EEM stack into a small number of non-negative components using matrix (NMF) and tensor (CP/PARAFAC) factorization models.
Implemented features include: - HALS/ALS-style NNLS updates for NMF and CP factors, - quadratic (Tikhonov) priors with NaN-masked entries, - elastic-net regularization (L1/L2 mix), - optional paired-sample ratio constraints on sample-mode scores, - elementwise masking to ignore sparse invalid tensor entries.
hals_nnls
hals_nnls(UtM, UtU, prior_dict=None, gamma=None, alpha=0, l1_ratio=0, V=None, max_iter=500, tol=1e-08, eps=1e-08, fixed_components=None)
HALS-style non-negative least squares update with optional priors and elastic-net.
This routine updates a factor matrix V (with shape (rank, n_features)) in a least-squares subproblem in NMF and
PARAFAC:
min_{V >= 0} 1/2 || UtM - UtU @ V ||_F^2
+ sum_k (gamma_k/2) ||mask_k * (V_k - p_k)||_2^2
+ alpha * [ l1_ratio * ||V||_1 + (1 - l1_ratio)/2 * ||V||_F^2 ]
where:
- UtM corresponds to UᵀM (or a mode-wise MTTKRP in PARAFAC),
- UtU corresponds to UᵀU (Gram matrix),
- prior_dict[k] = p_k is an optional per-component prior vector for row k, with NaNs allowed to indicate "no
prior" at specific entries,
- gamma can be a scalar applied to all prior components or a dict {k: gamma_k} to specify gamma values for
different priors.
The update is performed row-by-row (FastHALS residual update), and can keep selected components fixed (useful when seeding or freezing known spectra).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
UtM
|
np.ndarray, shape (rank, n)
|
Cross-product term (e.g., UᵀM or MTTKRPᵀ), with one row per component. |
required |
UtU
|
np.ndarray, shape (rank, rank)
|
Gram matrix term (e.g., UᵀU). |
required |
prior_dict
|
dict[int, array-like]
|
Mapping from component index |
None
|
gamma
|
float or dict[int, float]
|
Prior strength. If a float is provided, it is applied to all keys in |
None
|
alpha
|
float
|
Overall elastic-net strength (0 disables elastic-net). |
0
|
l1_ratio
|
float
|
Elastic-net mixing parameter in [0, 1]. 0 = pure L2, 1 = pure L1. |
0
|
V
|
np.ndarray, shape (rank, n)
|
Initial value for |
None
|
max_iter
|
int
|
Maximum number of HALS iterations. |
500
|
tol
|
float
|
Relative convergence tolerance based on HALS update magnitude. |
1e-08
|
eps
|
float
|
Small positive constant used for numerical stability and positivity floor. |
1e-08
|
fixed_components
|
list[int]
|
Indices of components (rows of |
None
|
Returns:
| Name | Type | Description |
|---|---|---|
V |
(ndarray, shape(rank, n))
|
Updated non-negative factor matrix. |
nmf_with_prior_hals
nmf_with_prior_hals(X, rank, prior_dict_H=None, prior_dict_W=None, prior_dict_A=None, prior_dict_B=None, prior_dict_C=None, ref_components=None, gamma_W=0, gamma_H=0, gamma_A=0, gamma_B=0, gamma_C=0, alpha_W=0, alpha_H=0, alpha_A=0, alpha_B=0, alpha_C=0, l1_ratio=0, idx_top=None, idx_bot=None, lam=0, fit_rank_one=False, component_shape=None, max_iter_als=500, max_iter_nnls=500, tol=1e-06, eps=1e-08, init='random', custom_init=None, fixed_components=None, random_state=None)
Non-negative matrix factorization (NMF) using HALS with optional priors and regularizations.
This solver decomposes a non-negative matrix X into:
X ≈ W @ H
where W (m × rank) contains sample scores / concentrations and H (rank × n) contains component
spectra/features. It is designed for EEM decomposition where X can be an unfolded EEM stack and where
components may be constrained by priors or by a paired-sample ratio constraint.
Supported options include:
- Quadratic priors on W and/or H (NaNs skip entries).
- Elastic-net regularization on W and/or H.
- A paired-row ratio penalty on W: W[idx_top] ≈ beta * W[idx_bot] with weight lam.
- A hybrid model where selected components are forced to be rank-1 outer products
in EEM space (CP-like): H[k] reshaped to (n_ex, n_em) is approximated by
outer(B[:,k], C[:,k]). This is controlled via fit_rank_one and requires
component_shape.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X
|
np.ndarray, shape (m, n)
|
Non-negative data matrix (e.g., unfolded EEM stack). |
required |
rank
|
int
|
Number of components. |
required |
prior_dict_H
|
dict[int, array-like]
|
Priors for rows of |
None
|
prior_dict_W
|
dict[int, array-like]
|
Priors for columns of |
None
|
prior_dict_A
|
dict[int, array-like]
|
Priors for CP sample-mode factor |
None
|
prior_dict_B
|
dict[int, array-like]
|
Priors for CP excitation factor |
None
|
prior_dict_C
|
dict[int, array-like]
|
Priors for CP emission factor |
None
|
ref_components
|
dict[int, array-like]
|
Reference components used to permute the solution for consistent ordering. |
None
|
gamma_W
|
float or dict[int, float]
|
Prior weight(s) for |
0
|
gamma_H
|
float or dict[int, float]
|
Prior weight(s) for |
0
|
gamma_A
|
float or dict[int, float]
|
Prior weight(s) for CP sample-mode factor |
0
|
gamma_B
|
float or dict[int, float]
|
Prior weight(s) for CP excitation factor |
0
|
gamma_C
|
float or dict[int, float]
|
Prior weight(s) for CP emission factor |
0
|
alpha_W
|
float
|
Elastic-net strength for |
0
|
alpha_H
|
float
|
Elastic-net strength for |
0
|
alpha_A
|
float
|
Elastic-net strength for CP sample-mode factor |
0
|
alpha_B
|
float
|
Elastic-net strength for CP excitation factor |
0
|
alpha_C
|
float
|
Elastic-net strength for CP emission factor |
0
|
l1_ratio
|
float
|
Elastic-net mixing parameter in [0, 1]. |
0
|
idx_top
|
sequence[int]
|
Row indices in |
None
|
idx_bot
|
sequence[int]
|
Row indices in |
None
|
lam
|
float or dict[int, float]
|
Ratio-penalty weight(s). Enable ratio constraint when provided together
with |
0
|
fit_rank_one
|
dict[int, bool] or False
|
If a dict, |
False
|
component_shape
|
tuple[int, int]
|
|
None
|
max_iter_als
|
int
|
Maximum number of outer ALS iterations. |
500
|
max_iter_nnls
|
int
|
Maximum number of HALS iterations per NNLS subproblem. |
500
|
tol
|
float
|
Relative convergence tolerance on the reconstruction error. |
1e-06
|
eps
|
float
|
Small positive constant used as a positivity floor. |
1e-08
|
init
|
str, {"random", "svd", "nndsvd", "nndsvda", "nndsvdar", "ordinary_nmf", "ordinary_cp", "custom"}
|
Initialization method. |
'random'
|
custom_init
|
tuple(W, H)
|
Custom initialization, only used when |
None
|
fixed_components
|
list[int]
|
Indices of components to keep fixed during updates. |
None
|
random_state
|
int or None
|
Random seed. |
None
|
Returns:
| Name | Type | Description |
|---|---|---|
W |
(ndarray, shape(m, rank))
|
Non-negative sample scores / concentrations. |
H |
(ndarray, shape(rank, n))
|
Non-negative component matrix (flattened EEM components if applicable). |
beta |
(ndarray or None, shape(rank))
|
Per-component ratio estimates when the ratio penalty is enabled; otherwise None. |
parafac_with_prior_hals
parafac_with_prior_hals(tensor, rank, prior_dict_A=None, prior_dict_B=None, prior_dict_C=None, gamma_A=0, gamma_B=0, gamma_C=0, ref_components=None, alpha_A=0, alpha_B=0, alpha_C=0, l1_ratio=0, lam=0, idx_top=None, idx_bot=None, max_iter_als=200, max_iter_nnls=500, tol=1e-09, eps=1e-08, init='svd', custom_init=None, random_state=None, fixed_components=None, mask=None)
Non-negative PARAFAC decomposition for EEM stacks using HALS with optional priors and masking.
This solver factorizes a 3-way tensor X into non-negative PARAFAC factors:
X ≈ [[A, B, C]]
where:
- A (n_samples × rank) contains sample scores / concentrations,
- B (n_ex × rank) contains excitation loadings,
- C (n_em × rank) contains emission loadings.
Optional features supported by this implementation:
- Sparse elementwise masking.
- Quadratic priors on any factor (A/B/C), with NaNs allowed to skip entries.
- Elastic-net regularization on any factor (L1/L2 mix).
- A ratio constraint on paired rows of A: A[idx_top] ≈ beta * A[idx_bot], controlled by lam and estimated per
component (beta).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
tensor
|
array-like, shape (I, J, K)
|
Input tensor (EEM stack). Non-negative values are expected. |
required |
rank
|
int
|
Number of CP components. |
required |
prior_dict_A
|
dict[int, array-like]
|
Priors for columns of |
None
|
prior_dict_B
|
dict[int, array-like]
|
Priors for columns of |
None
|
prior_dict_C
|
dict[int, array-like]
|
Priors for columns of |
None
|
gamma_A
|
float or dict[int, float]
|
Prior weight(s) for |
0
|
gamma_B
|
float or dict[int, float]
|
Prior weight(s) for |
0
|
gamma_C
|
float or dict[int, float]
|
Prior weight(s) for |
0
|
ref_components
|
dict[int, array-like]
|
Reference components used to permute factors for consistent component order. Each value should be a flattened (J*K,) reference spectrum (e.g., outer(B,C)). |
None
|
alpha_A
|
float
|
Elastic-net strength for |
0
|
alpha_B
|
float
|
Elastic-net strength for |
0
|
alpha_C
|
float
|
Elastic-net strength for |
0
|
l1_ratio
|
float
|
Elastic-net mixing parameter in [0, 1]. |
0
|
lam
|
float or dict[int, float]
|
Ratio-penalty weight(s) for the sample-mode factor |
0
|
idx_top
|
sequence[int]
|
Row indices in |
None
|
idx_bot
|
sequence[int]
|
Row indices in |
None
|
max_iter_als
|
int
|
Maximum number of outer ALS iterations. |
200
|
max_iter_nnls
|
int
|
Maximum number of inner HALS iterations for each NNLS subproblem. |
500
|
tol
|
float
|
Relative convergence tolerance on masked reconstruction error. |
1e-09
|
eps
|
float
|
Small positive constant for numerical stability and positivity floor. |
1e-08
|
init
|
str, {"random", "svd", "nndsvd", "nndsvda", "nndsvdar", "ordinary_cp", "custom"}
|
Initialization method for factors. |
'svd'
|
custom_init
|
tuple(A, B, C)
|
Custom initialization, only used when |
None
|
random_state
|
int or None
|
Random seed for reproducible initialization. |
None
|
fixed_components
|
list[int]
|
Component indices to keep fixed during updates (useful when seeding known spectra). |
None
|
mask
|
array-like, shape (I, J, K)
|
Elementwise mask (1 = keep / valid entry, 0 = ignore entry). If None, finite
entries of |
None
|
Returns:
| Name | Type | Description |
|---|---|---|
A |
(ndarray, shape(I, rank))
|
Sample-mode factor (scores / concentrations). |
B |
(ndarray, shape(J, rank))
|
Excitation factor loadings. |
C |
(ndarray, shape(K, rank))
|
Emission factor loadings. |
beta |
(ndarray or None, shape(rank))
|
Estimated per-component ratio between paired sample rows when the ratio penalty is enabled; otherwise None. |
replace_factor_with_prior
replace_factor_with_prior(factors, prior, replaced_mode, replaced_rank='best-fit', frozen_rank=None, project_prior=True, X=None, show_replaced_rank=False)
Replace a single component in a two-factor decomposition with a prior vector.
This helper is primarily used for NMF/PARAFAC initialization, where a loading is replaced by a prior vector. This is particularly useful when the user has prior knowledge of some of the factors while keeping other unknown factors initialized via standard methods. It can (i) select which component to replace by maximizing correlation with the prior ("best-fit"), (ii) directly replace a specified component, and (iii) optionally project the prior to the other factor to keep the reconstruction consistent. E.g., when replacing a loading in one mode, the corresponding loading in the other mode is updated via a non-negative least-squares projection.
Notes
- Replacement is performed in-place on
factors. - When
project_prior=True, the method computes a residual excluding the replaced component and projects the prior onto the other mode using a non-negative least-squares-style clipping.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
factors
|
list[np.ndarray]
|
Factor matrices for a rank- |
required |
prior
|
array-like, shape (m,) or (n,)
|
Prior vector to insert into the selected component of |
required |
replaced_mode
|
int
|
Which factor to modify: 0 or 1. |
required |
replaced_rank
|
int or {"best-fit"}
|
Component index to replace. If "best-fit", the component with the highest Pearson correlation to |
'best-fit'
|
frozen_rank
|
int or None
|
Component index that must not be replaced (useful when one component is fixed). |
None
|
project_prior
|
bool
|
If True, also update the complementary factor column by projecting the prior onto the residual (requires |
True
|
X
|
np.ndarray
|
Data matrix being factorized, shape (m, n). Required if |
None
|
show_replaced_rank
|
bool
|
If True, also return the selected |
False
|
Returns:
| Name | Type | Description |
|---|---|---|
factors |
list[ndarray]
|
Updated factor matrices. |
replaced_rank |
(int, optional)
|
Only returned when |
solve_W
solve_W(X1, H, X2=None, beta=None, reg=0.0, non_negativity=True)
Solve for the sample-score matrix W in a (possibly constrained) linear regression subproblem.
This helper solves for W in one of the following least-squares problems:
1) Standard regression (single dataset): minimize_W || X1 - W @ H ||_F^2
2) Paired / ratio-augmented regression (two datasets): minimize_W || X1 - W @ H ||_F^2 + || X2 - W @ diag(beta) @ H ||_F^2
Optionally, non-negativity can be enforced via row-wise NNLS.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X1
|
np.ndarray, shape (m, n)
|
Primary data matrix. |
required |
H
|
np.ndarray, shape (r, n)
|
Component matrix (rows are components). |
required |
X2
|
np.ndarray, shape (m, n)
|
Secondary data matrix used when |
None
|
beta
|
np.ndarray, shape (r,)
|
Per-component scaling applied to the secondary term. If None, the second term is omitted. |
None
|
reg
|
float
|
Ridge (L2) regularization strength added to the normal equations (0 disables). |
0.0
|
non_negativity
|
bool
|
If True, enforce |
True
|
Returns:
| Name | Type | Description |
|---|---|---|
W |
(ndarray, shape(m, r))
|
Estimated sample-score / concentration matrix. |
unfolded_eem_stack_initialization
unfolded_eem_stack_initialization(M, rank, method='nndsvd')
Initialize non-negative matrix factors from an unfolded EEM stack.
Given a non-negative matrix M (typically an unfolding of a 3D EEM stack), this routine produces an
initialization (U, V) suitable for NMF / HALS updates.
Supported initializations include:
- ordinary_nmf : U, V generated through scikit-learn NMF with NNDSVD init.
- svd : absolute-value SVD-based factors (non-negative by clipping).
- nndsvd, nndsvda, nndsvdar : NNDSVD variants (Boutsidis & Gallopoulos), where zeros are kept (nndsvd),
filled with the mean (nndsvda), or filled with small random values (nndsvdar).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
M
|
np.ndarray, shape (m, n)
|
Non-negative matrix to factorize (e.g., unfolded tensor). |
required |
rank
|
int
|
Number of components (latent factors). |
required |
method
|
str, {"ordinary_nmf", "svd", "nndsvd", "nndsvda", "nndsvdar"}
|
Initialization method. Default is "nndsvd". |
'nndsvd'
|
Returns:
| Name | Type | Description |
|---|---|---|
U |
(ndarray, shape(m, rank))
|
Left factor initialization. |
V |
(ndarray, shape(rank, n))
|
Right factor initialization. |
References
[1] Boutsidis, Christos, and Efstratios Gallopoulos. "SVD based initialization: A head start for nonnegative matrix factorization." Pattern recognition 41.4 (2008): 1350-1362.
update_beta_in_hals
update_beta_in_hals(W: ndarray, idx_top, idx_bot, eps: float = 0, boundaries: tuple = (0.95, 2)) -> np.ndarray
Estimate per-component ratios between paired sample rows.
Given a factor matrix W and paired row indices (idx_top, idx_bot), this function estimates a ratio
beta[j] for each component j such that:
W[idx_top, j] ≈ beta[j] * W[idx_bot, j]
The least-squares solution is computed independently per component and then clamped to a user-provided interval.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
W
|
np.ndarray, shape (m, r)
|
Score / concentration matrix with |
required |
idx_top
|
sequence[int]
|
Row indices corresponding to the "top/original" samples. |
required |
idx_bot
|
sequence[int]
|
Row indices corresponding to the "bottom/perturbed" samples.
Must have the same length as |
required |
eps
|
float
|
Small constant added to the denominator to avoid division by zero. |
0
|
boundaries
|
tuple[float, float]
|
(min_beta, max_beta) bounds used to clamp each estimated ratio. |
(0.95, 2)
|
Returns:
| Name | Type | Description |
|---|---|---|
beta |
(ndarray, shape(r))
|
Estimated ratio for each component, clamped to |
update_column_in_hals
update_column_in_hals(Rk, hk_norm2, beta_k, lam, k, prior_dict=None, gamma=0.0, alpha=0.0, l1_ratio=0.0, idx_top=None, idx_bot=None, eps=1e-08)
HALS-style update for one column with a paired-row ratio penalty.
This update is designed for sample-mode factors (e.g., W in NMF or A in PARAFAC) when the dataset contains
paired samples (top/bottom) that are expected to follow a component-wise ratio:
w[top_i] ≈ beta_k * w[bot_i]
The subproblem solved for a single component column w_k is:
1/2 || Rk - d * w_k ||_2^2
- (lam/2) * sum_i (w[top_i] - beta_k * w[bot_i])^2
- (gamma/2) * ||mask * (w_k - p_k)||_2^2
- alpha * [ l1_ratio * ||w_k||_1 + (1 - l1_ratio)/2 * ||w_k||_2^2 ]
where NaNs in the prior are ignored (masked out).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
Rk
|
np.ndarray, shape (m,)
|
Current HALS residual for component |
required |
hk_norm2
|
float
|
Squared norm of the corresponding component in the other factor(s) (i.e., the diagonal element |
required |
beta_k
|
float
|
Current ratio estimate for component |
required |
lam
|
float
|
Ratio penalty weight for component |
required |
k
|
int
|
Component index (used to look up |
required |
prior_dict
|
dict[int, array-like]
|
Mapping from component index to a prior vector of length |
None
|
gamma
|
float
|
Prior weight. |
0.0
|
alpha
|
float
|
Elastic-net strength. |
0.0
|
l1_ratio
|
float
|
Elastic-net mixing parameter in [0, 1]. |
0.0
|
idx_top
|
sequence[int]
|
Indices of "top/original" rows in the paired sample design. |
None
|
idx_bot
|
sequence[int]
|
Indices of "bottom/perturbed" rows paired with |
None
|
eps
|
float
|
Small constant used as a positivity floor and for numerical stability. |
1e-08
|
Returns:
| Name | Type | Description |
|---|---|---|
w_new |
(ndarray, shape(m))
|
Updated non-negative column for component |