API

Import scphylo as:

import scphylo as scp

The functions below are intended for use after mutation calling, once the input data has been built via our suggested mutation calling pipeline.

Read/Write (io)

This module offers functions for reading and writing data.

io.read(filepath)

Read genotype matrix and read-count matrix.

io.write(obj, filepath)

Write genotype matrix or read-count matrix into a file.

Preprocessing (pp)

This module offers functions for filtering and preprocessing data.

pp.remove_mut_by_list(adata, alist)

Remove a list of mutations from the data.

pp.remove_cell_by_list(adata, alist)

Remove a list of cells from the data.

pp.filter_mut_reference_must_present_in_at_least(adata)

Remove mutations based on the wild-type status.

pp.filter_mut_mutant_must_present_in_at_least(adata)

Remove mutations based on the mutant status.
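To illustrate the idea behind this kind of filter, the sketch below keeps only the mutations whose mutant call appears in at least a minimum number of cells. It is a plain-Python illustration, not scphylo's implementation: the helper name `keep_muts_with_min_mutant_cells`, the `min_cells` parameter, and the 0/1/3 encoding (wild type / mutant / missing) are assumptions made for the example.

```python
def keep_muts_with_min_mutant_cells(matrix, mut_names, min_cells=1):
    """Keep mutation columns that are mutant (1) in at least `min_cells` cells.

    Hypothetical helper. `matrix` is a list of rows (one per cell); entries
    are assumed to be 0 (wild type), 1 (mutant) or 3 (missing).
    """
    n_muts = len(mut_names)
    # Count mutant calls per column; wild-type and missing entries are ignored.
    counts = [sum(1 for row in matrix if row[j] == 1) for j in range(n_muts)]
    kept = [j for j in range(n_muts) if counts[j] >= min_cells]
    filtered = [[row[j] for j in kept] for row in matrix]
    return filtered, [mut_names[j] for j in kept]


matrix = [
    [1, 0, 3],
    [1, 0, 1],
    [0, 0, 3],
]
filtered, names = keep_muts_with_min_mutant_cells(
    matrix, ["m1", "m2", "m3"], min_cells=2
)
```

Here only `m1` is mutant in two cells, so it is the only column that survives. A filter on the wild-type status would follow the same pattern, counting entries equal to 0 instead.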

pp.bifiltering(df, cellr, mutr[, time_limit])

Bi-filtering to find the maximally informed submatrix.

pp.consensus_combine(df)

Combine cells in genotype matrix.

Tools (tl)

This module offers a high-level API for computing the conflict-free solution and calculating the probability of mutations seeding particular cells.

Solving the noisy input genotype matrix (scphylo-Boost)

tl.booster(df_input, alpha, beta[, solver, ...])

Trisicell-Boost solver.

tl.scite(df_input, alpha, beta[, n_iters, ...])

Solving using SCITE.

tl.phiscsb(df_input, alpha, beta[, experiment])

Solving using PhISCS-B (only in single-cell mode, no bulk).

tl.scistree(df_input, alpha, beta[, ...])

Solving using ScisTree.

tl.onconem(df_input, alpha, beta)

Solving using OncoNEM.

tl.huntress(df_input, alpha, beta[, n_threads])

Solving using HUNTRESS.

tl.siclonefit(df_input, alpha, beta[, ...])

Solving using SiCloneFit.

tl.sphyr(df_input, alpha, beta[, ...])

Solving using SPhyR.

tl.gpps(df_input, alpha, beta[, k_dollo, ...])

Solving using gpps.

tl.phiscsi_bulk(df_input, alpha, beta[, ...])

Solving using PhISCS-I (in single-cell mode with bulk and mutation elimination).

tl.scelestial(df_input)

Solving using Scelestial.

Partition function calculation (scphylo-PartF)

tl.partition_function(df_input, alpha, beta, ...)

Calculate the probability of a mutation seeding particular cells.

Consensus tree building (scphylo-Cons)

tl.consensus(sc1, sc2)

Build the consensus tree between two tumor progression trees.

For comparing two phylogenetic trees

tl.ad(df_grnd, df_sol)

Ancestor-descendent accuracy.
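As a rough illustration of what an ancestor-descendant measure captures, the sketch below derives ancestor pairs from two conflict-free 0/1 matrices (rows are cells, columns are mutations) and reports the share of ground-truth pairs that the solution preserves. This is an assumed reading of the measure, not scphylo's code; the helper names `ancestor_pairs` and `ad_accuracy` are hypothetical.

```python
def ancestor_pairs(matrix, mut_names):
    """Return ordered pairs (a, b) where mutation a is an ancestor of b.

    Mutation a is treated as an ancestor of b when the cells carrying b
    form a non-empty strict subset of the cells carrying a.
    """
    cells = {
        name: {i for i, row in enumerate(matrix) if row[j] == 1}
        for j, name in enumerate(mut_names)
    }
    return {
        (a, b)
        for a in mut_names
        for b in mut_names
        if a != b and cells[b] and cells[b] < cells[a]
    }


def ad_accuracy(ground, solution, mut_names):
    """Share of ground-truth ancestor pairs also found in the solution.

    Returns None when the ground truth has no ancestor pairs at all.
    """
    truth = ancestor_pairs(ground, mut_names)
    if not truth:
        return None
    found = ancestor_pairs(solution, mut_names)
    return len(truth & found) / len(truth)


muts = ["m1", "m2", "m3"]
ground = [[1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 0, 0]]
solution = [[1, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 0]]
score = ad_accuracy(ground, solution, muts)
```

In the ground truth, `m1` is an ancestor of both `m2` and `m3`; the solution keeps only the first relation, so half of the pairs are preserved.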

tl.dl(df_grnd, df_sol)

Different-lineage accuracy.

tl.mltd(df_grnd, df_sol)

Multi-labeled tree dissimilarity measure (MLTD).

tl.tpted(df_grnd, df_sol)

Tumor phylogeny tree edit distance measure (TPTED).

tl.caset(df_grnd, df_sol)

Common Ancestor Set (CASet) score.

tl.disc(df_grnd, df_sol)

Distinctly Inherited Sets score.

tl.mp3(df_grnd, df_sol)

Triplet-based similarity score.

tl.rf(df_grnd, df_sol)

Robinson-Foulds score.
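The idea behind a Robinson-Foulds comparison can be sketched by treating each tree as a set of clades and counting the clades found in only one of the two trees. The example below reads clades from conflict-free 0/1 matrices and computes the classic unnormalized distance; whether `tl.rf` normalizes this value or converts it into a similarity is not stated here, and the helper names `clades` and `robinson_foulds` are hypothetical.

```python
def clades(matrix):
    """Collect the non-trivial clades implied by a conflict-free 0/1 matrix.

    Each mutation column defines a clade: the set of cells carrying it.
    Single-cell clades and the clade of all cells are skipped as trivial.
    """
    n_cells = len(matrix)
    n_muts = len(matrix[0]) if matrix else 0
    result = set()
    for j in range(n_muts):
        members = frozenset(i for i, row in enumerate(matrix) if row[j] == 1)
        if 1 < len(members) < n_cells:
            result.add(members)
    return result


def robinson_foulds(matrix_a, matrix_b):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(matrix_a) ^ clades(matrix_b))


tree_a = [[1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 0, 1], [0, 0, 0]]
tree_b = [[1, 1, 0], [1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 0, 1]]
distance = robinson_foulds(tree_a, tree_b)
```

The two example trees share the clade of the first two cells and differ in two clades each, giving a distance of 4.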

tl.gs(df_grnd, df_sol)

Genotype-similarity accuracy.
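One simple way to think about genotype similarity is as the fraction of matrix entries on which the ground truth and the solution agree. The sketch below assumes that reading; it is not taken from scphylo's source, and the helper name `genotype_similarity` is hypothetical.

```python
def genotype_similarity(ground, solution):
    """Fraction of entries that are identical in two equally shaped matrices."""
    same_shape = len(ground) == len(solution) and all(
        len(g_row) == len(s_row) for g_row, s_row in zip(ground, solution)
    )
    if not same_shape:
        raise ValueError("matrices must have the same shape")
    total = sum(len(row) for row in ground)
    if total == 0:
        raise ValueError("matrices must not be empty")
    # Compare the matrices entry by entry and count the agreements.
    same = sum(
        g == s
        for g_row, s_row in zip(ground, solution)
        for g, s in zip(g_row, s_row)
    )
    return same / total


ground = [[1, 0, 1], [0, 1, 1]]
solution = [[1, 0, 0], [0, 1, 1]]
similarity = genotype_similarity(ground, solution)
```

The two example matrices differ in one of six entries, so the similarity is 5/6.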

Plotting (pl)

This module offers functions for plotting the tree in clonal or dendrogram format.

pl.clonal_tree(tree[, muts_as_number, ...])

Draw the tree in clonal format.

pl.dendro_tree(tree[, width, height, dpi, ...])

Draw the tree in dendrogram format.

Utils (ul)

This module offers utility functions.

ul.to_tree(df)

Convert a conflict-free matrix to a tree object.
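The relationship between a conflict-free matrix and a tree can be sketched without any tree library: each mutation's parent is the mutation with the smallest cell set that strictly contains its own. The example below builds such a parent map. It illustrates the principle only and says nothing about the tree object that `ul.to_tree` returns; the helper name `mutation_parents` and the `"root"` label are assumptions, and every mutation is assumed to be present in at least one cell.

```python
def mutation_parents(matrix, mut_names):
    """Map each mutation to its parent mutation (or "root").

    `matrix` is a conflict-free 0/1 matrix with rows as cells and columns
    as mutations. Mutations with identical cell sets share the same parent.
    """
    cells = {
        name: frozenset(i for i, row in enumerate(matrix) if row[j] == 1)
        for j, name in enumerate(mut_names)
    }
    parents = {}
    for b in mut_names:
        # Ancestors carry all of b's cells plus at least one more cell.
        ancestors = [a for a in mut_names if cells[b] < cells[a]]
        # In a conflict-free matrix the ancestors are nested, so the one
        # with the fewest cells is the direct parent.
        if ancestors:
            parents[b] = min(ancestors, key=lambda a: len(cells[a]))
        else:
            parents[b] = "root"
    return parents


matrix = [
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
]
parents = mutation_parents(matrix, ["m1", "m2", "m3", "m4"])
```

In this example `m1` sits directly under the root, `m2` and `m3` branch from `m1`, and `m4` is a child of `m2`.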

ul.to_cfmatrix(tree)

Convert phylogenetic tree to conflict-free matrix.

ul.to_mtree(tree)

Convert the phylogenetic tree to mutation tree.

ul.hclustering(df[, metric, method, return_dist])

Hierarchical clustering.

ul.is_conflict_free(df_in)

Check conflict-free criteria via Gusfield algorithm.

ul.is_conflict_free_gusfield(df_in)

Check conflict-free criteria via Gusfield algorithm.
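For intuition about the conflict-free criterion, a binary matrix (rows are cells, columns are mutations) is conflict-free when no pair of columns contains all three row patterns (1, 1), (1, 0) and (0, 1). The sketch below checks this pair by pair. It is a simple quadratic-in-columns check, not Gusfield's algorithm, and it assumes the matrix contains only 0 and 1 (no missing entries); the helper name `is_conflict_free_pairwise` is hypothetical.

```python
from itertools import combinations


def is_conflict_free_pairwise(matrix):
    """Return True if no two columns show (1, 1), (1, 0) and (0, 1) together."""
    n_muts = len(matrix[0]) if matrix else 0
    conflict = {(1, 1), (1, 0), (0, 1)}
    for p, q in combinations(range(n_muts), 2):
        # Collect the distinct row patterns seen in this pair of columns.
        seen = {(row[p], row[q]) for row in matrix}
        if conflict <= seen:
            return False
    return True


nested = [[1, 1, 0], [1, 0, 0], [0, 0, 1]]
conflicting = [[1, 1], [1, 0], [0, 1]]
nested_ok = is_conflict_free_pairwise(nested)
conflicting_ok = is_conflict_free_pairwise(conflicting)
```

The first matrix has nested or disjoint columns and passes the check; the second contains all three patterns in its only column pair and fails.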

Datasets (datasets)

This module offers functions for simulating data and for loading example and published datasets.

datasets.example([is_expression])

Return an example dataset for sanity checking and experimenting with scphylo.

datasets.simulate([n_cells, n_muts, ...])

Simulate single-cell noisy genotype matrix.

datasets.add_noise(df_in, alpha, beta, missing)

Add noise to the input genotype matrix.
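The noise model can be sketched in plain Python. The example below assumes that `alpha` is the false-positive rate (a 0 read as 1), `beta` is the false-negative rate (a 1 read as 0), `missing` is the rate of missing entries, and missing entries are encoded as 3. These interpretations, the helper name `add_noise_sketch`, and the `seed` parameter are assumptions for illustration, not a description of how `datasets.add_noise` is implemented.

```python
import random


def add_noise_sketch(matrix, alpha, beta, missing, seed=0):
    """Return a noisy copy of a 0/1 genotype matrix (list of rows).

    Assumed model: each entry becomes missing (3) with probability
    `missing`; otherwise a 0 flips to 1 with probability `alpha` and
    a 1 flips to 0 with probability `beta`.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    noisy = []
    for row in matrix:
        new_row = []
        for value in row:
            if rng.random() < missing:
                new_row.append(3)
            elif value == 0 and rng.random() < alpha:
                new_row.append(1)
            elif value == 1 and rng.random() < beta:
                new_row.append(0)
            else:
                new_row.append(value)
        noisy.append(new_row)
    return noisy


clean = [[0, 1], [1, 0]]
unchanged = add_noise_sketch(clean, alpha=0.0, beta=0.0, missing=0.0)
flipped = add_noise_sketch(clean, alpha=1.0, beta=1.0, missing=0.0)
all_missing = add_noise_sketch(clean, alpha=0.0, beta=0.0, missing=1.0)
```

With all rates at zero the matrix is returned unchanged, with both error rates at one every entry flips, and with the missing rate at one every entry becomes 3. The input matrix itself is never modified.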

datasets.add_readcount(df_in[, ...])

Add read counts to the input genotype matrix.

datasets.melanoma20()

Mouse Melanoma dataset with 20 sublines.

datasets.colorectal1()

Human Colorectal Cancer (Patient 1).

datasets.colorectal2([readcount])

Human Colorectal Cancer (Patient 2).

datasets.colorectal3()

Human Colorectal Cancer.

datasets.acute_myeloid_leukemia1()

Human Acute Myeloid Leukemia dataset (Patient 1).

datasets.acute_myeloid_leukemia2()

Human Acute Myeloid Leukemia dataset (Patient 2).

datasets.acute_lymphocytic_leukemia1()

Human Acute Lymphocytic Leukemia dataset (Patient 1).

datasets.acute_lymphocytic_leukemia2()

Human Acute Lymphocytic Leukemia dataset (Patient 2).

datasets.acute_lymphocytic_leukemia3()

Human Acute Lymphocytic Leukemia dataset (Patient 3).

datasets.acute_lymphocytic_leukemia4()

Human Acute Lymphocytic Leukemia dataset (Patient 4).

datasets.acute_lymphocytic_leukemia5()

Human Acute Lymphocytic Leukemia dataset (Patient 5).

datasets.acute_lymphocytic_leukemia6()

Human Acute Lymphocytic Leukemia dataset (Patient 6).

datasets.high_grade_serous_ovarian_cancer1()

High Grade Serous Ovarian Cancer (Patient 2).

datasets.high_grade_serous_ovarian_cancer2()

High Grade Serous Ovarian Cancer (Patient 3).

datasets.high_grade_serous_ovarian_cancer3()

High Grade Serous Ovarian Cancer (Patient 9).

datasets.high_grade_serous_ovarian_cancer_3celllines()

High Grade Serous Ovarian Cancer (3 cell lines).

datasets.myeloproliferative_neoplasms18()

JAK2-Negative Myeloproliferative Neoplasm.

datasets.myeloproliferative_neoplasms78()

JAK2-Negative Myeloproliferative Neoplasm.

datasets.myeloproliferative_neoplasms712()

JAK2-Negative Myeloproliferative Neoplasm.

datasets.renal_cell_carcinoma()

Clear-cell Renal-cell Carcinoma.

datasets.muscle_invasive_bladder()

Muscle Invasive Bladder Cancer.

datasets.erbc()

Oestrogen-receptor-positive (ER+) Breast Cancer.

datasets.tnbc()

Triple-negative Breast Cancer.