phyloseqGraphTest: Non-parametric graph-based testing for microbiome data.
Description
This package lets you test for differences between groups of samples with a graph-based permutation test.
Details
The main function in the package is graph_perm_test,
which takes a phyloseq object.
The graph used in the test can be visualized using
plot_test_network. The permutation distribution and
the test statistic can be visualized with
plot_permutations.
format_fortify
Description
A unified function to format a network or
igraph object. Copied with
very slight modification from
https://github.com/briatte/ggnetwork/blob/master/R/utilities.R to
fix the same CRAN problem as new_fortify.igraph.
Usage
format_fortify(
model,
nodes = NULL,
weights = NULL,
arrow.gap = 0,
by = NULL,
scale = TRUE,
stringsAsFactors = getOption("stringsAsFactors", FALSE),
.list_vertex_attributes_fun = NULL,
.get_vertex_attributes_fun = NULL,
.list_edges_attributes_fun = NULL,
.get_edges_attributes_fun = NULL,
.as_edges_list_fun = NULL
)
Arguments
model
an object of class network or igraph.
nodes
a nodes object from a call to fortify.
weights
the name of an edge attribute to use as edge weights when
computing the network layout, if the layout supports such weights (see
'Details').
Defaults to NULL (no edge weights).
arrow.gap
a parameter that will shorten the network edges in order to
avoid overplotting edge arrows and nodes; defaults to 0 when the
network is undirected (no edge shortening), or to 0.025 when the
network is directed. Small values near 0.025 will generally achieve
good results when the size of the nodes is reasonably small.
by
a character vector that matches an edge attribute, which will be
used to generate a data frame that can be plotted with
facet_wrap or facet_grid. The
nodes of the network will appear in all facets, at the same coordinates.
Defaults to NULL (no faceting).
scale
whether to (re)scale the layout coordinates. Defaults to
TRUE, but should be set to FALSE if layout contains
meaningful spatial coordinates, such as latitude and longitude.
stringsAsFactors
whether vertex and edge attributes should be
converted to factors if they are of class character. Defaults to
the value of getOption("stringsAsFactors"), which is FALSE
by default: see data.frame.
.list_vertex_attributes_fun
a "list vertex attributes" function.
.get_vertex_attributes_fun
a "get vertex attributes" function.
.list_edges_attributes_fun
a "list edges attributes" function.
.get_edges_attributes_fun
a "get edges attributes" function.
.as_edges_list_fun
an "as edges list" function.
Value
a data.frame object.
Performs graph-based permutation tests
Description
Performs graph-based tests for one-way designs.
Usage
graph_perm_test(
physeq,
sampletype,
grouping = 1:nsamples(physeq),
distance = "jaccard",
type = c("mst", "knn", "threshold.value", "threshold.nedges"),
max.dist = 0.4,
knn = 1,
nedges = nsamples(physeq),
keep.isolates = TRUE,
nperm = 499
)
Arguments
physeq
A phyloseq object.
sampletype
A string giving the name of the sample data column to be tested. This column should be a factor with two or more levels.
grouping
Either a string with the name of a sample data column or a factor of length equal to the number of samples in physeq. These are the groups of samples whose labels should be permuted and are used for repeated measures designs. Default is no grouping (each group is of size 1).
distance
A distance method; see the phyloseq distance function for a
list of the possible methods.
type
One of "mst", "knn", "threshold.value", or "threshold.nedges". If "mst", forms the minimum spanning tree of the sample points. If "knn", forms a directed graph with links from each node to its k nearest neighbors. If "threshold.value", forms a graph with edges between every pair of samples within distance max.dist. If "threshold.nedges", forms a threshold graph with the number of edges given by nedges.
max.dist
For type "threshold.value", the maximum distance between two samples at which an edge is placed between them.
knn
For type "knn", the number of nearest neighbors.
nedges
If using "threshold.nedges", the number of edges to use.
keep.isolates
Whether to keep unconnected points in the returned network.
nperm
The number of permutations to perform.
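As a rough illustration of the graph types described for the type argument, the sketch below builds a threshold graph and a k-nearest-neighbor graph from a small distance matrix using base R only. The helper names threshold_edges_sketch and knn_edges_sketch and the toy distances are hypothetical; this is not the package's internal code, which may construct its graphs differently.

```r
# Toy symmetric distance matrix for four samples (illustrative values only).
d <- matrix(c(0.0, 0.2, 0.7, 0.9,
              0.2, 0.0, 0.3, 0.8,
              0.7, 0.3, 0.0, 0.5,
              0.9, 0.8, 0.5, 0.0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("s", 1:4), paste0("s", 1:4)))

# Threshold graph: one undirected edge for every pair with distance <= max.dist.
# (Hypothetical helper, assumed for illustration.)
threshold_edges_sketch <- function(d, max.dist) {
  idx <- which(upper.tri(d) & d <= max.dist, arr.ind = TRUE)
  data.frame(from = rownames(d)[idx[, "row"]],
             to = colnames(d)[idx[, "col"]],
             stringsAsFactors = FALSE)
}

# k-nearest-neighbor graph: a directed edge from each sample to its k closest
# other samples. (Hypothetical helper, assumed for illustration.)
knn_edges_sketch <- function(d, k) {
  do.call(rbind, lapply(seq_len(nrow(d)), function(i) {
    others <- setdiff(seq_len(ncol(d)), i)
    nearest <- others[order(d[i, others])][seq_len(k)]
    data.frame(from = rownames(d)[i], to = colnames(d)[nearest],
               stringsAsFactors = FALSE)
  }))
}

thr <- threshold_edges_sketch(d, max.dist = 0.4)
knn1 <- knn_edges_sketch(d, k = 1)
```

With these toy distances, the threshold graph at 0.4 contains the edges s1-s2 and s2-s3, while the 1-nearest-neighbor graph has one outgoing edge per sample.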
Value
A list with the observed number of pure edges, the vector containing the number of pure edges in each permutation, the permutation p-value, the graph used for testing, and a vector with the sample types used for the test.
Examples
library(phyloseq)
data(enterotype)
gt = graph_perm_test(enterotype, sampletype = "SeqTech", type = "mst")
gt
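To make the idea of a pure-edge permutation test concrete, here is a minimal base R sketch on a toy graph. It assumes that a pure edge joins two samples of the same type and that the p-value uses the common "+1" correction; the helper count_pure_sketch is hypothetical, and the package's own computation may differ in its details.

```r
set.seed(1)

# Toy graph on six samples, given as an edge list of sample indices (a path).
edges <- cbind(from = 1:5, to = 2:6)
# Sample types: two groups that sit at opposite ends of the path.
labels <- factor(c("a", "a", "a", "b", "b", "b"))

# A "pure" edge joins two samples of the same type (assumed definition).
count_pure_sketch <- function(edges, labels) {
  sum(labels[edges[, "from"]] == labels[edges[, "to"]])
}

observed <- count_pure_sketch(edges, labels)

# Permutation distribution: shuffle the labels and recount the pure edges.
nperm <- 499
perm <- replicate(nperm, count_pure_sketch(edges, sample(labels)))

# One-sided p-value with the +1 correction (an assumed convention).
pval <- (sum(perm >= observed) + 1) / (nperm + 1)
```

Many pure edges relative to the permutation distribution suggest that samples of the same type cluster together in the graph, which is why the upper tail is used here.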
Fortify method for networks of class igraph
Description
This is copied with very slight modification from https://github.com/briatte/ggnetwork/blob/master/R/fortify-igraph.R, as that version is not on CRAN yet.
Usage
new_fortify.igraph(
model,
data = NULL,
layout = igraph::nicely(),
arrow.gap = ifelse(igraph::is.directed(model), 0.025, 0),
by = NULL,
scale = TRUE,
stringsAsFactors = getOption("stringsAsFactors", FALSE),
...
)
Arguments
model
an object of class igraph.
data
not used by this method.
layout
a function call to an
igraph layout function, such as
layout_nicely (the default), or a 2 column matrix
giving the x and y coordinates for the vertices.
See layout_ for details.
arrow.gap
a parameter that will shorten the network edges in order to
avoid overplotting edge arrows and nodes; defaults to 0 when the
network is undirected (no edge shortening), or to 0.025 when the
network is directed. Small values near 0.025 will generally achieve
good results when the size of the nodes is reasonably small.
by
a character vector that matches an edge attribute, which will be
used to generate a data frame that can be plotted with
facet_wrap or facet_grid. The
nodes of the network will appear in all facets, at the same coordinates.
Defaults to NULL (no faceting).
scale
whether to (re)scale the layout coordinates. Defaults to
TRUE, but should be set to FALSE if layout contains
meaningful spatial coordinates, such as latitude and longitude.
stringsAsFactors
whether vertex and edge attributes should be
converted to factors if they are of class character. Defaults to
the value of getOption("stringsAsFactors"), which is FALSE
by default: see data.frame.
...
additional parameters for the layout_ function.
Value
a data.frame object.
Permute labels
Description
Permutes sample labels, respecting repeated measures.
Usage
permute(sampledata, grouping, sampletype)
Arguments
sampledata
Data frame describing the samples.
grouping
Grouping for repeated measures.
sampletype
The sampletype used for testing (a column of sampledata).
Value
A permuted set of labels where the permutations are done over the levels of grouping.
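One way to permute labels over the levels of grouping is sketched below in base R. The function permute_sketch is a hypothetical stand-in, not the package's permute(); it assumes that grouping and sampletype are given as column names of sampledata and that sampletype is constant within each group.

```r
# Hypothetical helper, not the package's permute(): shuffle sample-type labels
# across groups so that every sample in a group receives the same label.
permute_sketch <- function(sampledata, grouping, sampletype) {
  groups <- as.character(sampledata[[grouping]])
  types <- as.character(sampledata[[sampletype]])
  # One label per group (assumes sampletype is constant within each group).
  group_type <- tapply(types, groups, function(v) v[1])
  # Shuffle the labels over the groups, keeping the group names in place.
  shuffled <- setNames(sample(as.vector(group_type)), names(group_type))
  # Map the shuffled group labels back onto the individual samples.
  unname(shuffled[groups])
}

set.seed(42)
sampledata <- data.frame(
  subject = c("p1", "p1", "p2", "p2", "p3", "p3", "p4", "p4"),
  treatment = c("ctrl", "ctrl", "drug", "drug", "ctrl", "ctrl", "drug", "drug")
)
permuted <- permute_sketch(sampledata, "subject", "treatment")
```

Because whole groups are relabeled together, repeated measurements from the same subject always share a label in the permuted data.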
Plots the permutation distribution
Description
Plots a histogram of the permutation distribution of the number of pure edges and a mark showing the observed number of pure edges.
Usage
plot_permutations(graphtest, bins = 30)
Arguments
graphtest
The output from graph_perm_test.
bins
The number of bins to use for the histogram.
Value
A ggplot object.
Examples
library(phyloseq)
data(enterotype)
gt = graph_perm_test(enterotype, sampletype = "SeqTech")
plot_permutations(gt)
Plots the graph used for testing
Description
When using the graph_perm_test function, a graph is created. This function will plot the graph used for testing with nodes colored by sample type and edges marked as pure or mixed.
Usage
plot_test_network(graphtest)
Arguments
graphtest
The output from graph_perm_test.
Value
A ggplot object created by ggnetwork.
Examples
library(phyloseq)
data(enterotype)
gt = graph_perm_test(enterotype, sampletype = "SeqTech")
plot_test_network(gt)
Print psgraphtest objects
Description
Print psgraphtest objects
Usage
## S3 method for class 'psgraphtest'
print(x, ...)
Arguments
x
psgraphtest object.
...
Not used
Rescale x to (0, 1), except if x is constant
Description
Copied from https://github.com/briatte/ggnetwork/blob/f3b8b84d28a65620a94f7aecd769c0ea939466e3/R/utilities.R so as to fix a problem with the CRAN version of ggnetwork.
Usage
scale_safely(x, scale = diff(range(x)))
Arguments
x
a vector to rescale
scale
the scale on which to rescale the vector
Value
The rescaled vector, coerced to a vector if necessary. If the original vector was constant, all of its values are replaced by 0.5.
Author(s)
Kipp Johnson
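The behavior described under Value can be sketched as follows. The function rescale_sketch is a hypothetical re-implementation written for illustration, assuming the default scale is the range of x; it is not the package source.

```r
# Hypothetical re-implementation for illustration; not the package source.
rescale_sketch <- function(x, scale = diff(range(x))) {
  if (scale == 0) {
    # Constant input: there is no spread to rescale, so use 0.5 everywhere.
    rep(0.5, length(x))
  } else {
    # Shift so the minimum is 0, then divide by the scale.
    (x - min(x)) / scale
  }
}

rescale_sketch(c(2, 4, 6))  # 0.0 0.5 1.0
rescale_sketch(c(3, 3, 3))  # 0.5 0.5 0.5
```

Handling the constant case separately avoids a division by zero, which would otherwise produce NaN coordinates.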
Check for valid grouping
Description
Grouping should describe a repeated measures design, so this function tests whether all samples within each level of grouping have the same value of sampletype.
Usage
validGrouping(sd, sampletype, grouping)
Arguments
sd
Data frame describing the samples.
sampletype
The sampletype used for testing.
grouping
Grouping for repeated measures.
Value
TRUE or FALSE for valid or invalid grouping.
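The check described above can be sketched in base R as follows. The function valid_grouping_sketch is hypothetical and assumes that sampletype and grouping are column names of sd; the package's validGrouping may accept other input forms.

```r
# Hypothetical helper, not the package's validGrouping().
valid_grouping_sketch <- function(sd, sampletype, grouping) {
  # Number of distinct sample types observed within each group.
  n_types <- tapply(as.character(sd[[sampletype]]), sd[[grouping]],
                    function(v) length(unique(v)))
  # Valid only if every group carries exactly one sample type.
  all(n_types == 1)
}

sd_ok <- data.frame(subject = c("p1", "p1", "p2", "p2"),
                    treatment = c("ctrl", "ctrl", "drug", "drug"))
sd_bad <- data.frame(subject = c("p1", "p1", "p2", "p2"),
                     treatment = c("ctrl", "drug", "drug", "drug"))

valid_grouping_sketch(sd_ok, "treatment", "subject")   # TRUE
valid_grouping_sketch(sd_bad, "treatment", "subject")  # FALSE
```

In the second data frame, subject p1 has samples of both types, so labels could not be permuted at the group level without ambiguity.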