Title: | Graph-Based Permutation Tests for Microbiome Data |
---|---|
Description: | Provides functions for graph-based multiple-sample testing and visualization of microbiome data, in particular data stored in 'phyloseq' objects. The tests are based on those described in Friedman and Rafsky (1979) <http://www.jstor.org/stable/2958919>, and the tests are described in more detail in Callahan et al. (2016) <doi:10.12688/f1000research.8986.1>. |
Authors: | Julia Fukuyama [aut, cre] |
Maintainer: | Julia Fukuyama <[email protected]> |
License: | CC0 |
Version: | 0.1.1 |
Built: | 2024-11-01 04:40:45 UTC |
Source: | https://github.com/jfukuyama/phyloseqgraphtest |
This package lets you test for differences between groups of samples with a graph-based permutation test. The main function in the package is graph_perm_test, which takes a phyloseq object. The graph used in the test can be visualized using plot_test_network. The permutation distribution and the test statistic can be visualized with plot_permutations.
Performs graph-based tests for one-way designs.
graph_perm_test( physeq, sampletype, grouping = 1:nsamples(physeq), distance = "jaccard", type = c("mst", "knn", "threshold.value", "threshold.nedges"), max.dist = 0.4, knn = 1, nedges = nsamples(physeq), keep.isolates = TRUE, nperm = 499 )
physeq |
A phyloseq object. |
sampletype |
A string giving the name of the sample data column to be tested. The column should be a factor with two or more levels. |
grouping |
Either a string with the name of a sample data column or a factor of length equal to the number of samples in physeq. These are the groups of samples whose labels should be permuted and are used for repeated measures designs. Default is no grouping (each group is of size 1). |
distance |
A distance, see |
type |
One of "mst", "knn", "threshold.value", or "threshold.nedges". If "mst", forms the minimum spanning tree of the sample points. If "knn", forms a directed graph with links from each node to its k nearest neighbors. If "threshold.value", forms a graph with edges between every pair of samples within distance max.dist of each other. If "threshold.nedges", forms a graph with the number of edges given by nedges. |
max.dist |
For type "threshold.value", the maximum distance between two samples such that we put an edge between them. |
knn |
For type "knn", the number of nearest neighbors. |
nedges |
If using "threshold.nedges", the number of edges to use. |
keep.isolates |
In the returned network, keep the unconnected points? |
nperm |
The number of permutations to perform. |
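To make the graph types above concrete, here is a minimal base R sketch of how "knn"-style and "threshold.value"-style edge lists could be derived from a distance matrix. It is an illustration only, not the package's implementation: the helper names `knn_edges` and `threshold_edges` are hypothetical, the toy data are made up, and treating "within" as "less than or equal to" is an assumption.

```r
# Toy data (made up): 6 samples on a line, so distances are easy to check by hand.
x <- c(0, 1, 2, 10, 11, 12)
d <- as.matrix(dist(x))

# "knn"-style graph (hypothetical helper): a directed edge from each sample
# to its k nearest neighbors.
knn_edges <- function(d, k = 1) {
  do.call(rbind, lapply(seq_len(nrow(d)), function(i) {
    others <- setdiff(seq_len(nrow(d)), i)
    nearest <- others[order(d[i, others])][seq_len(k)]
    cbind(from = i, to = nearest)
  }))
}

# "threshold.value"-style graph (hypothetical helper): an undirected edge
# between every pair of samples whose distance is at most max_dist.
# Using <= rather than < is an assumption of this sketch.
threshold_edges <- function(d, max_dist) {
  idx <- which(d <= max_dist & upper.tri(d), arr.ind = TRUE)
  cbind(from = idx[, "row"], to = idx[, "col"])
}

knn1 <- knn_edges(d, k = 1)             # one outgoing edge per sample
thr <- threshold_edges(d, max_dist = 1) # edges 1-2, 2-3, 4-5, 5-6
```

In the toy data the two clusters are far apart, so neither graph contains an edge between them.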
A list with the observed number of pure edges, the vector containing the number of pure edges in each permutation, the permutation p-value, the graph used for testing, and a vector with the sample types used for the test.
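The quantities in the returned list can be illustrated with a small base R sketch: count the pure edges, permute the labels at the group level, and compare the observed count to the permutation counts. This is a sketch under stated assumptions, not the package's code. The helper names `count_pure` and `permute_labels`, the toy edge list, and the p-value convention (counting the observed labelling as one permutation) are all assumptions made for illustration.

```r
set.seed(1)

# Hypothetical inputs: an edge list (one row per edge) and one label per sample.
edges <- cbind(from = c(1, 2, 4, 5), to = c(2, 3, 5, 6))
labels <- factor(c("a", "a", "a", "b", "b", "b"))

# A pure edge joins two samples that carry the same label.
count_pure <- function(edges, labels) {
  sum(labels[edges[, "from"]] == labels[edges[, "to"]])
}

# Permute labels at the group level: every sample in a group keeps sharing
# one label, and those labels are shuffled across groups.
permute_labels <- function(labels, grouping) {
  group_label <- vapply(split(as.character(labels), grouping),
                        function(l) l[1], character(1))
  shuffled <- setNames(sample(unname(group_label)), names(group_label))
  unname(shuffled[as.character(grouping)])
}

grouping <- seq_along(labels)  # no grouping: each sample is its own group
nperm <- 499

observed <- count_pure(edges, labels)
perm <- replicate(nperm, count_pure(edges, permute_labels(labels, grouping)))

# Assumed convention: the observed labelling counts as one of the permutations.
pval <- (sum(perm >= observed) + 1) / (nperm + 1)
```

Passing a grouping vector with repeated values, such as `c(1, 1, 2, 2, 3, 3)`, keeps the labels constant within each group, which is the behavior a repeated measures design needs.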
library(phyloseq)
data(enterotype)
gt = graph_perm_test(enterotype, sampletype = "SeqTech", type = "mst")
gt
igraph
This is copied with very slight modification from https://github.com/briatte/ggnetwork/blob/master/R/fortify-igraph.R, as that version is not on CRAN yet.
new_fortify.igraph( model, data = NULL, layout = igraph::nicely(), arrow.gap = ifelse(igraph::is.directed(model), 0.025, 0), by = NULL, scale = TRUE, stringsAsFactors = getOption("stringsAsFactors", FALSE), ... )
model |
an object of class |
data |
not used by this method. |
layout |
a function call to an |
arrow.gap |
a parameter that will shorten the network edges in order to avoid overplotting edge arrows and nodes; defaults to |
by |
a character vector that matches an edge attribute, which will be used to generate a data frame that can be plotted with |
scale |
whether to (re)scale the layout coordinates. Defaults to |
stringsAsFactors |
whether vertex and edge attributes should be converted to factors if they are of class |
... |
additional parameters for the |
A data.frame object.
Plots a histogram of the permutation distribution of the number of pure edges and a mark showing the observed number of pure edges.
plot_permutations(graphtest, bins = 30)
graphtest |
The output from graph_perm_test. |
bins |
The number of bins to use for the histogram. |
A ggplot object.
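The kind of plot described above can be approximated in base R graphics. The following is a rough sketch only: plot_permutations itself returns a ggplot object, whereas this sketch uses `hist` and `abline`, and the permutation counts and observed value are made-up numbers rather than output from graph_perm_test.

```r
set.seed(1)

# Made-up stand-ins for a permutation distribution and an observed statistic.
perm <- rbinom(499, size = 20, prob = 0.5)
observed <- 17

# Histogram of the permutation counts; breaks = 30 is only a suggestion to hist().
h <- hist(perm, breaks = 30,
          xlim = range(c(perm, observed)),
          main = "Permutation distribution",
          xlab = "Number of pure edges")

# Mark the observed number of pure edges.
abline(v = observed, col = "red", lwd = 2)
```

An observed value far in the right tail of the histogram corresponds to a small permutation p-value.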
library(phyloseq)
data(enterotype)
gt = graph_perm_test(enterotype, sampletype = "SeqTech")
plot_permutations(gt)
Plots the graph that graph_perm_test builds for testing, with nodes colored by sample type and edges marked as pure or mixed.
plot_test_network(graphtest)
graphtest |
The output from graph_perm_test. |
A ggplot object created by ggnetwork.
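The pure/mixed edge marking can be sketched in a few lines of base R. The edge list, the sample types, and the column name `edgetype` below are hypothetical and chosen only for illustration; they are not taken from the package's internals.

```r
# Hypothetical edge list and one sample type per node.
edges <- data.frame(from = c(1, 2, 3, 4), to = c(2, 3, 4, 5))
sampletype <- c("a", "a", "b", "b", "a")

# An edge is "pure" when both endpoints share a sample type, "mixed" otherwise.
edges$edgetype <- ifelse(sampletype[edges$from] == sampletype[edges$to],
                         "pure", "mixed")
edges
```

Here edges 1-2 and 3-4 are pure, while edges 2-3 and 4-5 are mixed.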
library(phyloseq)
data(enterotype)
gt = graph_perm_test(enterotype, sampletype = "SeqTech")
plot_test_network(gt)
Print psgraphtest objects
## S3 method for class 'psgraphtest'
print(x, ...)
x |
|
... |
Not used |
Copied from https://github.com/briatte/ggnetwork/blob/f3b8b84d28a65620a94f7aecd769c0ea939466e3/R/utilities.R to work around a problem with the CRAN version of ggnetwork.
scale_safely(x, scale = diff(range(x)))
x |
a vector to rescale |
scale |
the scale on which to rescale the vector |
The rescaled vector, coerced to a vector if necessary. If the original vector was constant, all of its values are replaced by 0.5.
Kipp Johnson
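The rescaling behavior described for scale_safely can be illustrated with a small stand-in function. This is a sketch of the documented behavior, not the copied source: the name `rescale_sketch` is hypothetical, and shifting by the minimum before dividing by the scale is an assumption about how the rescaling is done.

```r
# Hypothetical stand-in that mimics the documented behavior of scale_safely.
rescale_sketch <- function(x, scale = diff(range(x))) {
  if (scale == 0) {
    # Constant input: there is no spread to rescale, so every value becomes 0.5.
    rep(0.5, length(x))
  } else {
    # Shift so the minimum is 0, then divide by the scale (assumed approach).
    as.vector((x - min(x)) / scale)
  }
}

rescale_sketch(c(2, 4, 6))  # 0.0 0.5 1.0
rescale_sketch(c(3, 3, 3))  # 0.5 0.5 0.5
```

Handling the constant case separately avoids a division by zero when every value is the same.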