scphylo.tl.phiscsi_bulk

scphylo.tl.phiscsi_bulk(df_input, alpha, beta, kmax=0, vaf_info=None, delta=0.2, time_limit=86400, n_threads=1)

Solve using PhISCS-I (single-cell mode with bulk data and mutation elimination).

PhISCS is a combinatorial approach for subperfect tumor phylogeny reconstruction via integrative use of single-cell and bulk sequencing data [PhISCS].

Parameters
  • df_input (pandas.DataFrame) – Input genotype matrix in which rows are cells and columns are mutations. Values in this matrix indicate presence (1), absence (0), and missing entries (3).

  • alpha (float) – False positive error rate.

  • beta (float) – False negative error rate.

  • kmax (int, optional) – Maximum number of mutations that may be eliminated, by default 0.

  • vaf_info (pandas.DataFrame, optional) – Variant allele frequency (VAF) information from bulk data, of size n_SNVs x n_samples, by default None.

  • delta (float, optional) – Delta parameter accounting for VAF variance, by default 0.2.

  • time_limit (int, optional) – Time limit for the Gurobi solver in seconds, by default 86400 (one day).

  • n_threads (int, optional) – Number of threads for the Gurobi solver, by default 1.
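The input conventions above can be sketched with a small hand-built example. The cell names, mutation names, values, and the bulk sample name are made up for illustration; only the 0/1/3 encoding and the n_SNVs x n_samples shape come from the parameter descriptions. Indexing vaf_info by the mutation names of df_input is an assumption, chosen to mirror the use of adata.var in the Examples section.

```python
import pandas as pd

# Hypothetical genotype matrix: rows are cells, columns are mutations.
# 1 = present, 0 = absent, 3 = missing entry.
df_input = pd.DataFrame(
    [[1, 0, 3],
     [1, 1, 0],
     [0, 3, 1]],
    index=["cell1", "cell2", "cell3"],
    columns=["mut1", "mut2", "mut3"],
)

# Hypothetical bulk VAF table: one row per mutation (SNV),
# one column per bulk sample (here a single sample).
vaf_info = pd.DataFrame(
    {"sample1": [0.45, 0.20, 0.10]},
    index=df_input.columns,  # assumption: rows follow the mutation order
)

# Sanity checks that can be run before calling the solver.
allowed = {0, 1, 3}
values_ok = set(df_input.to_numpy().ravel().tolist()) <= allowed
rows_match = list(vaf_info.index) == list(df_input.columns)
```

Both values_ok and rows_match should be True for inputs that follow the documented conventions.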

Returns

A conflict-free matrix in which rows are cells and columns are mutations. Values in this matrix indicate presence (1) and absence (0).

Return type

pandas.DataFrame
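As an illustration of what a conflict-free output looks like, the sketch below checks a binary matrix against the three-gametes condition: no pair of columns may show the row patterns (1, 1), (1, 0), and (0, 1) all together. It assumes that "conflict-free" here refers to this standard perfect-phylogeny condition. The helper is_conflict_free is hypothetical and not part of scphylo.

```python
from itertools import combinations


def is_conflict_free(matrix):
    """Return True if no pair of columns contains (1, 1), (1, 0) and (0, 1)."""
    n_cols = len(matrix[0]) if matrix else 0
    for p, q in combinations(range(n_cols), 2):
        # Collect the distinct value pairs seen across all rows for columns p, q.
        patterns = {(row[p], row[q]) for row in matrix}
        if {(1, 1), (1, 0), (0, 1)} <= patterns:
            return False
    return True


# Made-up matrices (rows are cells, columns are mutations).
conflict_free = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
conflicting = [
    [1, 1],
    [1, 0],
    [0, 1],
]

ok = is_conflict_free(conflict_free)    # no column pair violates the condition
bad = is_conflict_free(conflicting)     # the single column pair shows all three patterns
```

For a returned DataFrame, the same check can be applied to df_out.values.tolist().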

Examples

>>> import scphylo as scp
>>> adata = scp.datasets.acute_lymphocytic_leukemia2()
>>> # VAF per mutation: twice the mutant read fraction.
>>> adata.var["VAF"] = (
...     2
...     * adata.var["MutantCount"]
...     / (adata.var["MutantCount"] + adata.var["ReferenceCount"])
... )
>>> df_out = scp.tl.phiscsi_bulk(
...     adata.to_df(),
...     alpha=0.001,
...     beta=0.181749,
...     delta=0.2,
...     kmax=3,
...     vaf_info=adata.var[["VAF"]],
... )
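The VAF column in the example above is twice the mutant read fraction. A minimal worked sketch with made-up read counts follows; the helper name doubled_vaf is hypothetical and not part of scphylo. One common reading of the factor of 2, which the source does not state, is that for a heterozygous mutation in a diploid region it converts the read fraction into an estimate of the fraction of cells carrying the mutation.

```python
def doubled_vaf(mutant_count, reference_count):
    """Twice the fraction of reads that support the mutant allele."""
    return 2 * mutant_count / (mutant_count + reference_count)


# Made-up counts: 30 mutant reads and 70 reference reads.
example = doubled_vaf(30, 70)  # 2 * 30 / (30 + 70)
```

Here example evaluates to 0.6. Note that the value exceeds 1 whenever mutant reads outnumber reference reads.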