Package 'phyloseqGraphTest'

Title: Graph-Based Permutation Tests for Microbiome Data
Description: Provides functions for graph-based multiple-sample testing and visualization of microbiome data, in particular data stored in 'phyloseq' objects. The tests are based on those described in Friedman and Rafsky (1979) <http://www.jstor.org/stable/2958919>, and the tests are described in more detail in Callahan et al. (2016) <doi:10.12688/f1000research.8986.1>.
Authors: Julia Fukuyama [aut, cre]
Maintainer: Julia Fukuyama <[email protected]>
License: CC0
Version: 0.1.1
Built: 2024-11-01 04:40:45 UTC
Source: https://github.com/jfukuyama/phyloseqgraphtest

phyloseqGraphTest: Non-parametric graph-based testing for microbiome data.

Description

This package lets you test for differences between groups of samples with a graph-based permutation test.

Details

The main function in the package is graph_perm_test, which takes a phyloseq object and the name of the sample data variable to test.

The graph used in the test can be visualized using plot_test_network. The permutation distribution and the test statistic can be visualized with plot_permutations.


Performs graph-based permutation tests

Description

Performs graph-based tests for one-way designs.

Usage

graph_perm_test(
  physeq,
  sampletype,
  grouping = 1:nsamples(physeq),
  distance = "jaccard",
  type = c("mst", "knn", "threshold.value", "threshold.nedges"),
  max.dist = 0.4,
  knn = 1,
  nedges = nsamples(physeq),
  keep.isolates = TRUE,
  nperm = 499
)

Arguments

physeq

A phyloseq object.

sampletype

A string giving the name of the sample data column to be tested. The column should be a factor with two or more levels.

grouping

Either a string with the name of a sample data column or a factor of length equal to the number of samples in physeq. These are the groups of samples whose labels should be permuted and are used for repeated measures designs. Default is no grouping (each group is of size 1).

distance

The name of a distance method. See the distance function in phyloseq for a list of the possible methods.

type

One of "mst", "knn", "threshold.value", or "threshold.nedges". If "mst", forms the minimum spanning tree of the sample points. If "knn", forms a directed graph with links from each node to its k nearest neighbors. If "threshold.value", forms a graph with edges between every pair of samples whose distance is at most max.dist. If "threshold.nedges", forms a graph with the number of edges given by nedges.

max.dist

For type "threshold.value", the maximum distance between two samples such that we put an edge between them.

knn

For type "knn", the number of nearest neighbors.

nedges

If using "threshold.nedges", the number of edges to use.

keep.isolates

Logical. Whether to keep unconnected points (isolates) in the returned network.

nperm

The number of permutations to perform.
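
As an illustration of the graph types described above, the following base R sketch builds a "threshold.value"-style edge list and a "knn"-style edge list from a small distance matrix. The points, the Euclidean distance, and all variable names are made up for the example, and the code is not taken from the package, which may construct its graphs differently.

```r
# Hypothetical 2-D points standing in for five samples.
pts <- rbind(c(0, 0), c(0, 1), c(5, 5), c(5, 6), c(10, 0))
d <- as.matrix(dist(pts))  # Euclidean distances, used only for illustration

# "threshold.value"-style graph: an undirected edge between every pair of
# samples whose distance is at most max.dist.
max.dist <- 1.5
threshold_edges <- unname(which(d <= max.dist & upper.tri(d), arr.ind = TRUE))

# "knn"-style graph: a directed edge from each sample to its k nearest
# neighbors (here k = 1).
knn <- 1
knn_edges <- do.call(rbind, lapply(seq_len(nrow(d)), function(i) {
  others <- setdiff(seq_len(nrow(d)), i)
  nearest <- others[order(d[i, others])][seq_len(knn)]
  cbind(from = i, to = nearest)
}))

threshold_edges
knn_edges
```

With these points, the threshold graph leaves the fifth sample isolated, while the nearest-neighbor graph still gives it an outgoing edge. This is the kind of situation in which keep.isolates matters.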

Value

A list with the observed number of pure edges, the vector containing the number of pure edges in each permutation, the permutation p-value, the graph used for testing, and a vector with the sample types used for the test.
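
The quantities in the returned list can be illustrated with a minimal base R sketch on a toy graph. The edge list, labels, helper name count_pure_edges, and the p-value convention below are assumptions made for the example. They are not the package's internals.

```r
# Toy path graph on six samples; each row is an edge between two sample indices.
edges <- rbind(c(1, 2), c(2, 3), c(3, 4), c(4, 5), c(5, 6))
labels <- factor(c("a", "a", "a", "b", "b", "b"))

# A pure edge joins two samples that have the same label.
count_pure_edges <- function(edges, labels) {
  sum(labels[edges[, 1]] == labels[edges[, 2]])
}

# Observed number of pure edges.
observed <- count_pure_edges(edges, labels)

# Number of pure edges after each random relabeling of the samples.
set.seed(1)
nperm <- 499
perm <- replicate(nperm, count_pure_edges(edges, sample(labels)))

# One common convention for a permutation p-value (assumed here): the share of
# relabelings with at least as many pure edges as observed, counting the
# observed labeling itself.
pval <- (sum(perm >= observed) + 1) / (nperm + 1)

observed
pval
```

A large number of pure edges relative to the permutation distribution suggests that samples of the same type sit closer together in the graph than chance would predict.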

Examples

library(phyloseq)
data(enterotype)
gt = graph_perm_test(enterotype, sampletype = "SeqTech", type = "mst")
gt
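
The grouping argument is meant for repeated measures designs, where labels are permuted for groups of samples rather than for individual samples. The base R sketch below shows one way such a group-level permutation could look, assuming every sample in a group shares the same sample type. The data and variable names are hypothetical, and the package's own permutation procedure may differ.

```r
# Hypothetical design: three subjects with two samples each.
grouping <- factor(c("s1", "s1", "s2", "s2", "s3", "s3"))
labels <- c("a", "a", "b", "b", "a", "a")

# One label per group (valid because labels are constant within each group).
group_labels <- sapply(split(labels, grouping), `[`, 1)

# Shuffle the labels across groups, then expand back to one label per sample.
set.seed(1)
permuted_group_labels <- setNames(sample(unname(group_labels)), names(group_labels))
permuted <- unname(permuted_group_labels[as.character(grouping)])

permuted
```

Samples from the same subject always receive the same permuted label, so the dependence within a subject is preserved under the null distribution.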

Fortify method for networks of class igraph

Description

This is copied with very slight modification from https://github.com/briatte/ggnetwork/blob/master/R/fortify-igraph.R, as that version is not on CRAN yet.

Usage

new_fortify.igraph(
  model,
  data = NULL,
  layout = igraph::nicely(),
  arrow.gap = ifelse(igraph::is.directed(model), 0.025, 0),
  by = NULL,
  scale = TRUE,
  stringsAsFactors = getOption("stringsAsFactors", FALSE),
  ...
)

Arguments

model

an object of class igraph.

data

not used by this method.

layout

a function call to an igraph layout function, such as layout_nicely (the default), or a 2 column matrix giving the x and y coordinates for the vertices. See layout_ for details.

arrow.gap

a parameter that will shorten the network edges in order to avoid overplotting edge arrows and nodes; defaults to 0 when the network is undirected (no edge shortening), or to 0.025 when the network is directed. Small values near 0.025 will generally achieve good results when the size of the nodes is reasonably small.

by

a character vector that matches an edge attribute, which will be used to generate a data frame that can be plotted with facet_wrap or facet_grid. The nodes of the network will appear in all facets, at the same coordinates. Defaults to NULL (no faceting).

scale

whether to (re)scale the layout coordinates. Defaults to TRUE, but should be set to FALSE if layout contains meaningful spatial coordinates, such as latitude and longitude.

stringsAsFactors

whether vertex and edge attributes should be converted to factors if they are of class character. Defaults to the value of getOption("stringsAsFactors"), which is FALSE by default: see data.frame.

...

additional parameters for the layout_ function

Value

a data.frame object.
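
To give a rough idea of the shape of such a data frame, the base R sketch below flattens a small network into one table. It assumes the ggnetwork convention of one row per vertex and one row per edge, with edge end points stored in xend and yend. The layout, edge list, and column choices are hypothetical, and the actual output of new_fortify.igraph may contain other columns.

```r
# Hypothetical vertex layout and edge list for a three-sample network.
layout <- data.frame(name = c("s1", "s2", "s3"),
                     x = c(0, 1, 0.5),
                     y = c(0, 0, 1))
edges <- data.frame(from = c("s1", "s2"), to = c("s2", "s3"))

# Look up the coordinates of both end points of every edge.
from_xy <- layout[match(edges$from, layout$name), ]
to_xy <- layout[match(edges$to, layout$name), ]

# Edge rows run from (x, y) to (xend, yend).
edge_rows <- data.frame(x = from_xy$x, y = from_xy$y, name = from_xy$name,
                        xend = to_xy$x, yend = to_xy$y)

# Vertex rows start and end at the same point.
node_rows <- data.frame(x = layout$x, y = layout$y, name = layout$name,
                        xend = layout$x, yend = layout$y)

fortified <- rbind(node_rows, edge_rows)
fortified
```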


Plots the permutation distribution

Description

Plots a histogram of the permutation distribution of the number of pure edges and a mark showing the observed number of pure edges.

Usage

plot_permutations(graphtest, bins = 30)

Arguments

graphtest

The output from graph_perm_test.

bins

The number of bins to use for the histogram.

Value

A ggplot object.

Examples

library(phyloseq)
data(enterotype)
gt = graph_perm_test(enterotype, sampletype = "SeqTech")
plot_permutations(gt)

Plots the graph used for testing

Description

When using the graph_perm_test function, a graph is created. This function will plot the graph used for testing with nodes colored by sample type and edges marked as pure or mixed.

Usage

plot_test_network(graphtest)

Arguments

graphtest

The output from graph_perm_test.

Value

A ggplot object created by ggnetwork.

Examples

library(phyloseq)
data(enterotype)
gt = graph_perm_test(enterotype, sampletype = "SeqTech")
plot_test_network(gt)

Print psgraphtest objects

Description

Print method for psgraphtest objects.

Usage

## S3 method for class 'psgraphtest'
print(x, ...)

Arguments

x

A psgraphtest object.

...

Not used


Rescale x to (0, 1), except if x is constant

Description

Copied from https://github.com/briatte/ggnetwork/blob/f3b8b84d28a65620a94f7aecd769c0ea939466e3/R/utilities.R to fix a problem with the CRAN version of ggnetwork.

Usage

scale_safely(x, scale = diff(range(x)))

Arguments

x

a vector to rescale

scale

the scale on which to rescale the vector

Value

The rescaled vector, coerced to a vector if necessary. If the original vector was constant, all of its values are replaced by 0.5.

Author(s)

Kipp Johnson
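
The behavior described under Value can be sketched in base R as follows. The function rescale_sketch is a hypothetical re-implementation written for illustration, using the default scale of diff(range(x)). It is not the code of scale_safely itself.

```r
# Hypothetical illustration of the rescaling rule, not the package's code.
rescale_sketch <- function(x) {
  span <- diff(range(x))
  if (span == 0) {
    # A constant vector has no spread, so every value becomes 0.5.
    return(rep(0.5, length(x)))
  }
  # Otherwise map the minimum to 0 and the maximum to 1.
  (x - min(x)) / span
}

rescale_sketch(c(2, 4, 6))  # 0.0 0.5 1.0
rescale_sketch(c(3, 3, 3))  # 0.5 0.5 0.5
```

Handling the constant case separately avoids a division by zero, which would otherwise produce NaN coordinates.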