High-level API

Transfer-entropy estimation and utilities

locaTE.estimate_TE — Function

estimate_TE(
    X::AbstractMatrix,
    regulators,
    targets,
    P::AbstractMatrix,
    QT::AbstractMatrix,
    R::AbstractMatrix;
    clusters = nothing,
    discretizer_alg::DiscretizationAlgorithm = DiscretizeBayesianBlocks(),
    showprogress::Bool = true,
    wclr::Bool = false,
)

High-level function for estimating local transfer entropy (TE) from a cell-by-gene expression matrix X, given a forward transition kernel P, a backward transition kernel QT (e.g. computed from P with to_backward_kernel), and a neighbourhood kernel R. To restrict the computation to a subset of genes, pass index vectors regulators and targets. To use metacells, pass clusters, a (sparse) Boolean matrix of dimensions cells × metacells encoding the cell-metacell memberships. A custom discretization algorithm can be supplied via discretizer_alg (see the Discretizers.jl documentation for further details). A progress bar is shown when showprogress is true, which is the default. If wclr is true, a matrix of filtered TE scores is returned in place of the raw TE scores; this is disabled by default.
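As an illustration of the cells × metacells membership matrix described above, the following sketch builds a sparse Boolean matrix from a vector of metacell labels. The label vector and variable names are hypothetical; only the shape and Boolean type of clusters come from the description.

```julia
using SparseArrays

# Hypothetical metacell assignment: cell i belongs to metacell labels[i].
labels = [1, 1, 2, 3, 3, 3]
n_cells, n_meta = length(labels), maximum(labels)

# clusters[i, j] is true exactly when cell i is a member of metacell j.
clusters = sparse(collect(1:n_cells), labels, fill(true, n_cells), n_cells, n_meta)

# The matrix would then be passed through the documented keyword, e.g.
# estimate_TE(X, regulators, targets, P, QT, R; clusters = clusters)
```

Each row has a single true entry here because every cell is assigned to exactly one metacell; whether overlapping memberships are supported is not stated above.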

locaTE.estimate_TE_cu — Function

estimate_TE_cu(
    X::AbstractMatrix,
    regulators,
    targets,
    P::AbstractMatrix,
    QT::AbstractMatrix,
    R::AbstractMatrix;
    clusters = nothing,
    discretizer_alg::DiscretizationAlgorithm = DiscretizeBayesianBlocks(),
    showprogress::Bool = true,
    wclr::Bool = false,
    N_blocks::Int = 1,
    mode = :dense 
)

High-level function for estimating local transfer entropy with GPU acceleration. Usage is identical to estimate_TE, except that the TE estimation step runs in a CUDA kernel. The number of CUDA blocks is set by N_blocks, which defaults to 1. Two modes are available: mode = :dense, which uses a dense representation of a submatrix of the coupling, and mode = :sparse, which uses a truly sparse representation of the coupling.

locaTE.to_backward_kernel — Function

to_backward_kernel(P::AbstractArray)

Compute backward kernel QT from a forward transition kernel P using the transpose method.
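The docstring does not spell out what the transpose method is. One plausible reading, shown here only as a sketch, is the usual Bayes time reversal: weight the rows of P by a reference distribution, transpose, and renormalize the rows. The function name backward_kernel_sketch, the weights keyword, the uniform default, and the orientation of the output are all assumptions and may differ from what to_backward_kernel does.

```julia
# ASSUMPTION: time reversal with respect to reference weights (uniform by default).
function backward_kernel_sketch(P::AbstractMatrix; weights = fill(1 / size(P, 1), size(P, 1)))
    # Joint mass weights[i] * P[i, j], transposed so rows index the later state j.
    joint_T = (weights .* P)'
    # Marginal mass of each later state j, i.e. (weights' * P)[j].
    marginal = vec(sum(joint_T, dims = 2))
    # Row-normalize; a state with zero incoming mass would give NaN entries.
    return joint_T ./ marginal
end

P = [0.9 0.1 0.0;
     0.2 0.7 0.1;
     0.0 0.3 0.7]
Q = backward_kernel_sketch(P)
```

Under this reading, Q[j, i] is the probability that the previous state was i given that the current state is j, so each row of Q sums to one.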

locaTE.construct_normalized_laplacian — Function

construct_normalized_laplacian(X_rep, k)

Construct a k-NN graph and the corresponding normalized, symmetric Laplacian matrix from a dimensionality-reduced representation X_rep.
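For intuition, a k-NN graph and its symmetric normalized Laplacian can be built in a few lines. This is a brute-force sketch rather than the package's implementation: it assumes X_rep stores one cell per row, uses Euclidean distances, symmetrizes the graph by keeping an edge if either endpoint selected it, and requires k to be smaller than the number of cells. The function name is hypothetical.

```julia
using LinearAlgebra

function knn_normalized_laplacian_sketch(X_rep::AbstractMatrix, k::Integer)
    n = size(X_rep, 1)
    # Pairwise squared distances between rows; Inf on the diagonal excludes self-matches.
    D2 = [i == j ? Inf : sum(abs2, X_rep[i, :] .- X_rep[j, :]) for i in 1:n, j in 1:n]
    A = zeros(n, n)
    for i in 1:n
        nbrs = sortperm(D2[i, :])[1:k]   # indices of the k nearest rows
        A[i, nbrs] .= 1.0
    end
    A = max.(A, A')                      # undirected graph: union of both directions
    d = vec(sum(A, dims = 2))            # degrees, each at least k
    Dinvsqrt = Diagonal(1 ./ sqrt.(d))
    # Symmetric normalized Laplacian: I - D^(-1/2) A D^(-1/2).
    return Symmetric(I - Dinvsqrt * A * Dinvsqrt)
end

X_rep = [0.0 0.0; 0.1 0.0; 1.0 1.0; 1.1 1.0; 5.0 5.0]
L = knn_normalized_laplacian_sketch(X_rep, 2)
```

The resulting matrix is positive semidefinite with eigenvalues in [0, 2], which is the property the regularizers in the denoising functions rely on.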

Denoising and factor analysis

locaTE.fitsp — Function

fitsp(G::AbstractMatrix, L::AbstractMatrix, α; ρ = 0.05, λ1 = 25.0, λ2 = 0.075, maxiter = 2500)

Denoise TE scores by solving the weighted L1-L2 regularized regression problem

\[ \min_{X} \frac{1}{2} \sum_{i = 1}^{N} \alpha_{i} \| X_i - G_i \|_2^2 + \frac{λ_1}{2} \operatorname{tr}(X^\top L X) + λ_2 \sum_{i = 1}^N \alpha_i \| X_i \|_1.\]

fitsp(G::AbstractMatrix, L::AbstractMatrix; ρ = 0.05, λ1 = 25.0, λ2 = 0.075, maxiter = 2500)

Denoise TE scores by solving the L1-L2 regularized regression problem

\[ \min_{X} \frac{1}{2} \sum_{i = 1}^{N} \| X_i - G_i \|_2^2 + \frac{λ_1}{2} \operatorname{tr}(X^\top L X) + λ_2 \sum_{i = 1}^N \| X_i \|_1.\]
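To make the two objectives concrete, the sketch below evaluates them for a candidate X. It is not the solver, only the objective function, and it assumes that X_i and G_i denote the i-th rows of X and G, which is what the trace term requires when L is N × N. The function name is hypothetical; passing a vector of ones for α recovers the unweighted problem.

```julia
using LinearAlgebra

function fitsp_objective_sketch(X, G, L, α; λ1 = 25.0, λ2 = 0.075)
    rows = axes(X, 1)
    # Weighted data-fidelity term: (1/2) Σ_i α_i ‖X_i - G_i‖².
    fit = 0.5 * sum(α[i] * sum(abs2, X[i, :] .- G[i, :]) for i in rows)
    # Laplacian smoothness term: (λ1/2) tr(Xᵀ L X).
    smooth = 0.5 * λ1 * tr(X' * L * X)
    # Weighted sparsity term: λ2 Σ_i α_i ‖X_i‖₁.
    sparsity = λ2 * sum(α[i] * sum(abs, X[i, :]) for i in rows)
    return fit + smooth + sparsity
end

G = [1.0 0.0; 0.0 2.0]
L = [1.0 -1.0; -1.0 1.0]
obj_at_G = fitsp_objective_sketch(G, G, L, ones(2))
obj_at_zero = fitsp_objective_sketch(zeros(2, 2), G, L, ones(2))
```

Comparing the objective at the raw scores G with its value at the returned X is a simple sanity check that the denoising step has made progress.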

locaTE.fitnmf — Function

fitnmf(G, L_all, L, H, k; α = 0, β = 0, λ = [0, 0], μ = [0, 0], iter = 500, print_iter = 50, initialize = :nndsvd, δ = 1e-5, dictionary = false, η = 1.0, U_init = nothing, V_init = nothing)

Regularized non-negative matrix factorization by solving the problem

\[ \min_{U, V} \frac{1}{2} \| UV^\top - G \|_2^2 + \frac{α}{2} \operatorname{tr}(VU^\top L UV^\top) - \beta \langle H, UV^\top \rangle + \frac{λ_1}{2} \operatorname{tr}(U^\top K_1 U) + μ_1 \| U \|_1 + \frac{λ_2}{2} \operatorname{tr}(V^\top K_2 V) + μ_2 \| V \|_1.\]

L_all contains positive semidefinite (potentially sparse) matrices corresponding to $[K_1, K_2]$ that act on the factor matrices, while L is a positive semidefinite matrix acting on the low rank reconstruction.

A number of initializations are possible by setting the value of initialize: random (:rand), nonnegative double singular value decomposition (:nndsvd, using the implementation here), 2 iterations of NMF (:nmf, using this function), or manual initialization U_init, V_init (:manual).

Returns U, V, and trace, where trace contains the objective values.

locaTE.fitntf — Function

fitntf(G, L, L_g, H, λ, μ, α, β, k; iter = 250, print_iter = 50, dictionary = false, δ = 1e-5, η = 1.0)

Regularized non-negative tensor factorization by solving the problem

\[ \min_{S, \{ A^{(i)} \}_{i = 1}^3} \frac{1}{2} \| X - G\|_2^2 + \frac{\alpha}{2} \operatorname{tr}(X_{(1)}^\top L X_{(1)}) - \beta \langle H, X\rangle + \sum_{i = 1}^3 \frac{\lambda_i}{2} \operatorname{tr}((A^{(i)})^\top L^{(i)} A^{(i)}) + \sum_{i = 1}^3 \mu_i \| A^{(i)} \|_1, \]

where for brevity $X = S \times_{i = 1}^3 A^{(i)}$.

L contains positive semidefinite (potentially sparse) matrices corresponding to $L^{(i)}$ in the above formula that act on the factor matrices, and L_g corresponds to L, acting on the low rank reconstruction.

The decomposition is currently initialised using the Tensorly library, with 1 iteration of non_negative_parafac with init = "svd".

Currently, only the factor matrices are optimised while S is kept fixed (i.e. the method seeks a CP decomposition).
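To illustrate what the reconstruction X looks like in the CP case, the sketch below forms a 3-way tensor from three factor matrices and takes its mode-1 unfolding X_(1). It assumes the fixed core S is the superdiagonal identity, which is the standard CP convention but is not stated explicitly above. Both function names are hypothetical.

```julia
# ASSUMPTION: S is the superdiagonal identity core, so
# X[i, j, k] = Σ_c A1[i, c] * A2[j, c] * A3[k, c].
function cp_reconstruct_sketch(A1, A2, A3)
    r = size(A1, 2)
    X = zeros(size(A1, 1), size(A2, 1), size(A3, 1))
    for i in axes(A1, 1), j in axes(A2, 1), k in axes(A3, 1)
        X[i, j, k] = sum(A1[i, c] * A2[j, c] * A3[k, c] for c in 1:r)
    end
    return X
end

# Mode-1 unfolding: rows index the first mode, columns enumerate the other two.
unfold1(X) = reshape(X, size(X, 1), :)

A1 = [1.0 0.0; 0.0 1.0]
A2 = [1.0 2.0; 3.0 4.0]
A3 = [1.0 1.0; 0.0 2.0]
X = cp_reconstruct_sketch(A1, A2, A3)
X1 = unfold1(X)
```

The column ordering produced by reshape is one of several conventions, but the trace term in the objective is invariant to permutations of the columns of X_(1), so the choice does not affect its value.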

Evaluation

locaTE.aupr — Function

aupr(p::AbstractVector, r::AbstractVector)

Compute the area under the precision-recall curve (AUPR) from vectors of precision rates p and recall rates r evaluated at different thresholds.
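One common way to turn such vectors into an area is trapezoidal integration of precision over recall. The sketch below does this with a hypothetical function name; the integration rule actually used by aupr is not specified above, so values may differ slightly.

```julia
function aupr_trapz_sketch(p::AbstractVector, r::AbstractVector)
    # Sort the points by recall so that the curve is traversed left to right.
    order = sortperm(r)
    ps, rs = p[order], r[order]
    area = 0.0
    for i in 1:length(rs)-1
        # Trapezoid between consecutive recall values.
        area += (rs[i+1] - rs[i]) * (ps[i+1] + ps[i]) / 2
    end
    return area
end

p = [1.0, 0.5, 0.5]
r = [0.0, 0.5, 1.0]
area = aupr_trapz_sketch(p, r)
```

The same routine applied to true positive rates against false positive rates gives a trapezoidal estimate of AUROC.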

locaTE.prec_rec_rate — Function

prec_rec_rate(J::AbstractMatrix, Z::AbstractMatrix, q::Real; J_thresh = 0.5)

Compute precision and recall rates for a ground truth matrix J, score matrix Z, threshold q ∈ [0, 1]. Entries of J such that abs.(J) .> J_thresh are treated as true edges.

prec_rec_rate(J::AbstractMatrix, Z::AbstractMatrix, Nq::Integer; kwargs...)

Compute vectors of precision and recall rates for Nq uniformly spaced discrimination thresholds.

locaTE.ep — Function

ep(p::AbstractVector, r::AbstractVector; f = 0.1)

Compute early precision (EP) from vectors of precision rates p and recall rates r at different thresholds, restricted to r ≤ f.

locaTE.auroc — Function

auroc(tp::AbstractVector, fp::AbstractVector)

Compute the area under the ROC curve (AUROC) from vectors of true positive rates tp and false positive rates fp evaluated at different thresholds.

locaTE.tp_fp_rate — Function

tp_fp_rate(J::AbstractMatrix, Z::AbstractMatrix, q::Real; J_thresh = 0.5)

Compute true positive and false positive rates for a ground truth matrix J, score matrix Z, threshold q ∈ [0, 1]. Entries of J such that abs.(J) .> J_thresh are treated as true edges.

tp_fp_rate(J::AbstractMatrix, Z::AbstractMatrix, Nq::Integer; kwargs...)

Compute vectors of true positive and false positive rates for Nq uniformly spaced discrimination thresholds.
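The two methods can be sketched as follows. How the threshold q ∈ [0, 1] is mapped onto the scores in Z is not stated above; this sketch assumes nonnegative scores and calls an edge predicted when its score is at least q times the largest score, which is only one possible convention. The function name is hypothetical.

```julia
function tp_fp_rate_sketch(J::AbstractMatrix, Z::AbstractMatrix, q::Real; J_thresh = 0.5)
    truth = abs.(J) .> J_thresh          # true edges, as described above
    pred = Z .>= q * maximum(Z)          # ASSUMPTION: threshold relative to the maximum score
    tpr = sum(pred .& truth) / sum(truth)
    fpr = sum(pred .& (.!truth)) / sum(.!truth)
    return tpr, fpr
end

# Integer second argument: sweep Nq uniformly spaced thresholds in [0, 1].
function tp_fp_rate_sketch(J::AbstractMatrix, Z::AbstractMatrix, Nq::Integer; kwargs...)
    rates = [tp_fp_rate_sketch(J, Z, q; kwargs...) for q in range(0, 1; length = Nq)]
    return first.(rates), last.(rates)
end

J = [0.0 1.0; -1.0 0.0]
Z = [0.1 0.9; 0.2 0.4]
tpr, fpr = tp_fp_rate_sketch(J, Z, 0.5)
tp, fp = tp_fp_rate_sketch(J, Z, 3)
```

Because Integer is a subtype of Real, dispatch selects the sweep method whenever the third argument is an integer, so a single threshold should be passed as a floating-point number.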