Model and analyze MS/MS-based protein melting data.
Description
mstherm is a package for modeling and analysis of
MS/MS-based thermal proteome profiling (TPP) experiments.
Author(s)
Jeremy Volkening jdv@base2bio.com
Create a new MSThermExperiment.
Description
MSThermExperiment creates a new experiment object from
a set of filenames or data frames.
Usage
MSThermExperiment(control, annotations)
Arguments
control
data frame or filename of tab-delimited table describing the experimental setup and locations of data on disk (see Details)
annotations
data frame or filename of tab-delimited table containing protein names and annotations (usually functional descriptions but can be any text)
Details
Both parameters can take either a data frame or a tab-delimited filename on disk (which will be read into a data frame). "control" should contain columns with the following headers (in any order):
- "name"
Unique identifier of a single replicate
- "sample"
Sample name that a replicate belongs to
- "data_file"
Path to file on disk containing the quantification data
- "meta_file"
Path to file on disk containing the labeling metadata
The "meta_file" should be tab-delimited text and contain two columns labeled "channel" and "temp". The "data_file" should be tab-delimited text and contain, at a minimum, the following columns:
- "peptide"
Sequence of the matched peptide in single-letter IUPAC
- "protein"
Protein or protein group to which the peptide belongs
- "..."
One column per isobaric channel, containing absolute quantification values. Column names must match those in the "channel" column of the meta file, with the exception that R will automatically convert any name not compatible with its syntax rules. To be safe, use only letters, digits, underscores, and periods in channel names and never start with a digit (e.g. use "TMT.126" rather than "126")
The following columns can also be utilized for filtering if included (all others will simply be ignored):
- "coelute_inf"
Calculated precursor co-isolation interference (0.0-1.0)
- "score"
Score assigned by the processing software to the PSM
"annotations" should contain two columns with the headers "name" and "annotation". "name" should match the protein names in the data files, and "annotation" can contain any text (generally a functional description)
Value
An MSThermExperiment object
Examples
control <- system.file("extdata", "demo_project/control.tsv", package="mstherm")
annots <- system.file("extdata", "demo_project/annots.tsv", package="mstherm")
expt <- MSThermExperiment(control, annotations=annots)
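As a further sketch of the input layout described in Details, the following base R code builds a control table and a matching meta file and writes a tiny data file to a temporary directory. All file names, sample names, peptides, and temperatures are made up for illustration; only the column headers come from the description above. The last lines show how R's name conversion changes channel headers that start with a digit.

```r
# Hypothetical setup; all paths point into a temporary directory.
tmp <- tempdir()
meta_path <- file.path(tmp, "meta_example.tsv")

# Meta file: one row per isobaric channel with its temperature.
meta <- data.frame(
  channel = c("TMT.126", "TMT.127", "TMT.128"),
  temp    = c(37, 45, 53)
)
write.table(meta, meta_path, sep = "\t", quote = FALSE, row.names = FALSE)

# Control table: one row per replicate, with the four required headers.
control <- data.frame(
  name      = c("C1", "C2"),
  sample    = c("Control", "Control"),
  data_file = file.path(tmp, c("C1_quant.tsv", "C2_quant.tsv")),
  meta_file = meta_path,
  stringsAsFactors = FALSE
)

# A data file whose channel headers start with a digit is renamed on read.
data_path <- file.path(tmp, "C1_quant.tsv")
writeLines(c("peptide\tprotein\t126\t127\t128",
             "PEPTIDEK\tP00001\t100\t80\t20"), data_path)
psm <- read.delim(data_path)
names(psm)  # digit-led headers come back with an "X" prefix
```

Because the converted headers no longer match a "channel" column containing "126", naming channels "TMT.126" and so on in both files avoids the mismatch.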
Convert absolute quantitation to relative ratios.
Description
abs_to_ratio takes a vector of absolute values and
returns a vector of ratios relative to some starting point.
Usage
abs_to_ratio(x, method = "first")
Arguments
x
vector of numeric absolute quantitation values
method
method to use to determine starting value (denominator)
Details
The denominator used to calculate relative protein concentrations can affect the ability to model noisy data. In the theoretically ideal scenario, everything would be relative to the lowest temperature point. However, other methods can be used to help alleviate problems related to noise. Available methods include:
- "first"
Use the first value (lowest temperature point) (default)
- "max"
Use the maximum value
- "top3"
Use the mean of the three highest values
- "near"
Use the median of all values greater than 80% of the first value
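The four denominator choices can be illustrated with a small base R sketch. This is not the package's implementation: the function name ratio_sketch and the input vector are hypothetical, and the "near" rule is read as "median of all values greater than 80% of the first value".

```r
# Illustrative re-implementation of the denominator rules (an assumption,
# not the code used inside abs_to_ratio).
ratio_sketch <- function(x, method = c("first", "max", "top3", "near")) {
  method <- match.arg(method)
  denom <- switch(method,
    first = x[1],                                        # lowest temperature point
    max   = max(x),                                      # largest value
    top3  = mean(head(sort(x, decreasing = TRUE), 3)),   # mean of three highest
    near  = median(x[x > 0.8 * x[1]])                    # median of values near the first
  )
  x / denom
}

# Made-up absolute values, ordered from lowest to highest temperature.
x <- c(100, 120, 90, 85, 20, 10)
sapply(c("first", "max", "top3", "near"), function(m) ratio_sketch(x, m))
```

With this input the four methods pick four different denominators, which shows how a noisy first point changes the resulting ratios.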
Value
A numeric vector of the same length as input
MSThermResultSet to data frame.
Description
Populates a data frame with information from an MSThermResultSet, one row per protein/group
Usage
## S3 method for class 'MSThermResultSet'
as.data.frame(x, ...)
Arguments
x
an MSThermResultSet object
...
additional arguments passed to or from other functions
Value
A data frame populated with relevant information per result
Examples
control <- system.file("extdata", "demo_project/control.tsv", package="mstherm")
annots <- system.file("extdata", "demo_project/annots.tsv", package="mstherm")
expt <- MSThermExperiment(control, annotations=annots)
expt <- normalize_to_std(expt, "cRAP_ALBU_BOVIN", plot=FALSE)
res <- model_experiment(expt, bootstrap=FALSE, np=2)
df <- as.data.frame(res)
Generate protein ratio profile from spectrum quantification matrix.
Description
gen_profile takes a matrix of spectrum channel
quantification values belonging to a protein and "rolls them up" into a
vector of protein-level relative quantification values.
Usage
gen_profile(x, method = "sum", method.denom = "first")
Arguments
x
matrix of spectrum quantification values, one row per spectrum and one column per channel
method
method to use to "roll up" spectrum values to protein level
method.denom
method used to determine ratio denominator, passed as
the "method" argument to abs_to_ratio
Details
The following methods for spectrum-to-protein conversion are supported:
- "sum"
use the sum of the spectrum values for each channel
- "median"
use the median of the spectrum values for each channel
- "ratio.median"
Like "median", but values for each spectrum are first converted to ratios using the denominator method given by "method.denom"
- "ratio.mean"
Like "ratio.median" but using mean of ratios
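The difference between the roll-up methods can be sketched in base R. This is a simplified illustration rather than the package's code: rollup_sketch is a hypothetical name, the matrix is made up, and the denominator is fixed to the first channel (the "first" method) for every case.

```r
# Illustrative roll-up of a spectrum-by-channel matrix to one protein profile.
# Assumption: ratios are always taken relative to the first channel.
rollup_sketch <- function(x, method = c("sum", "median",
                                        "ratio.median", "ratio.mean")) {
  method <- match.arg(method)
  # Per-spectrum ratios: divide every row by its own first channel.
  ratios <- sweep(x, 1, x[, 1], "/")
  switch(method,
    sum          = colSums(x) / sum(x[, 1]),
    median       = { m <- apply(x, 2, median); m / m[1] },
    ratio.median = apply(ratios, 2, median),
    ratio.mean   = colMeans(ratios)
  )
}

# Two made-up spectra (rows) measured in three channels (columns).
x <- rbind(c(100, 80, 20),
           c(10,   9,  1))
rollup_sketch(x, "sum")         # dominated by the high-intensity spectrum
rollup_sketch(x, "ratio.mean")  # each spectrum contributes equally
```

Summing weights spectra by intensity, while the ratio-based methods give each spectrum equal weight regardless of its absolute signal.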
Value
A numeric vector of the same length as the number of matrix columns
Model MSThermExperiment.
Description
Model multiple proteins from an MSThermExperiment object.
Usage
model_experiment(expt, proteins, np, ...)
Arguments
expt
An MSThermExperiment object
proteins
A vector of protein IDs to model (default is all proteins).
np
Number of parallel jobs to start (default = number of available processors)
...
Parameters passed to model_protein()
Value
MSThermResultSet object
Examples
control <- system.file("extdata", "demo_project/control.tsv", package="mstherm")
annots <- system.file("extdata", "demo_project/annots.tsv", package="mstherm")
expt <- MSThermExperiment(control, annotations=annots)
expt <- normalize_to_std(expt, "cRAP_ALBU_BOVIN", plot=FALSE)
res <- model_experiment(expt, bootstrap=FALSE, np=2)
summary(res)
Model single protein.
Description
Model a single protein from an MSThermExperiment object.
Usage
model_protein(expt, protein, min_rep_psm = 0, min_smp_psm = 0,
min_tot_psm = 0, max_inf = 1, min_score, max_score, smooth = 0,
method = "sum", method.denom = "near", trim = 0, bootstrap = 0,
min_bs_psms = 8, annot_sep = "|", max_slope = 0, min_r2 = 0,
min_reps = 0, only_modeled = 0, check_missing = 0,
missing_cutoff = 0.3)
Arguments
expt
An MSThermExperiment object
protein
ID of the protein to model
min_rep_psm
Minimum number of spectral matches required for each replicate to model protein
min_smp_psm
Minimum number of spectral matches required for each sample to model protein
min_tot_psm
Minimum number of spectral matches required across all replicates to model protein
max_inf
Maximum co-isolation interference level allowed to include a spectrum in protein-level quantification
min_score
minimum score allowed to include a spectrum in protein-level quantification
max_score
maximum score allowed to include a spectrum in protein-level quantification
smooth
(t/F) Perform loess smoothing on the data prior to modeling
method
Protein quantification method to use (see Details)
method.denom
Method used to calculate denominator of abundance (see Details)
trim
(t/F) Trim all lower data points less than the abundance maximum
bootstrap
(T/F) Perform bootstrap analysis to determine confidence intervals (slow)
min_bs_psms
Minimum number of spectral matches required to perform bootstrapping
annot_sep
Symbol used to separate protein group IDs (used for retrieval of annotations) (default: '|')
max_slope
Maximum slope to consider model (implies "only_modeled")
min_r2
Minimum R2 value to consider model (implies "only_modeled")
min_reps
Minimum number of modeled replicates for each sample to return protein
only_modeled
(t/F) Only consider modeled proteins
check_missing
(t/F) Run simple test to filter out PSMs with missing quantification channels where values are expected
missing_cutoff
Minimum fraction relative to surrounding data points used in the check for missing channels
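As a rough illustration of how the spectrum-level filters interact, the base R sketch below applies interference and score thresholds to a made-up PSM table and then checks a total-count requirement. The data, the inclusive comparisons, and the order of operations are assumptions for illustration; the package's internal filtering may differ.

```r
# Made-up PSM table using the optional "coelute_inf" and "score" columns.
psm <- data.frame(
  peptide     = c("AAK", "CCR", "DDK", "EER"),
  coelute_inf = c(0.05, 0.40, 0.10, 0.90),
  score       = c(35, 50, 12, 60)
)

max_inf     <- 0.5   # analogous to the max_inf argument
min_score   <- 20    # analogous to the min_score argument
min_tot_psm <- 2     # analogous to the min_tot_psm argument

# Keep spectra that pass both thresholds (inclusive bounds assumed).
keep     <- psm$coelute_inf <= max_inf & psm$score >= min_score
filtered <- psm[keep, ]

# The protein would only be modeled if enough spectra survive filtering.
enough <- nrow(filtered) >= min_tot_psm
filtered$peptide
enough
```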
Details
Valid quantification methods include:
- "sum"
use the sum of the spectrum values for each channel
- "median"
use the median of the spectrum values for each channel
- "ratio.median"
Like "median", but values for each spectrum are first converted to ratios using the denominator method given by "method.denom"
- "ratio.mean"
Like "ratio.median" but using mean of ratios
Valid denominator methods include:
- "first"
Use the first value (lowest temperature point)
- "max"
Use the maximum value
- "top3"
Use the mean of the three highest values
- "near"
Use the median of all values greater than 80% of the first value (default)
Value
MSThermResult object
Examples
control <- system.file("extdata", "demo_project/control.tsv", package="mstherm")
annots <- system.file("extdata", "demo_project/annots.tsv", package="mstherm")
expt <- MSThermExperiment(control, annotations=annots)
expt <- normalize_to_std(expt, "cRAP_ALBU_BOVIN", plot=FALSE)
model <- model_protein(expt, "P38707", smooth=TRUE, bootstrap=FALSE)
summary(model)
Normalize to a profile.
Description
Normalizes an MSThermReplicate based on a pre-determined vector of relative abundances
Usage
normalize_to_profile(replicate, profile, model = T, plot = T)
Arguments
replicate
an MSThermReplicate object
profile
a vector of relative values
model
whether to fit scale factors to model
plot
(T/f) whether to display a summary plot
Value
An MSThermReplicate object with normalized data slots
Examples
control <- system.file("extdata", "demo_project/control.tsv", package="mstherm")
annots <- system.file("extdata", "demo_project/annots.tsv", package="mstherm")
expt <- MSThermExperiment(control, annotations=annots)
profile <- c(50.0, 50.5, 47.5, 42.0, 37.0, 25.0, 16.0, 11.5, 10.5, 10.0)
expt$samples$Control$replicates$C1 <- normalize_to_profile(
expt$samples$Control$replicates$C1, profile, plot=FALSE
)
Normalize to a spike-in standard.
Description
Normalizes each replicate of an experiment based on a given spike-in protein standard (assumed to be present in equimolar amounts in each channel).
Usage
normalize_to_std(expt, protein, model = T, plot = T)
Arguments
expt
an MSThermExperiment object
protein
ID of a protein to normalize against
model
whether to fit scale factors to model
plot
(T/f) whether to display a summary plot
Value
An MSThermExperiment object with normalized data slots
Examples
control <- system.file("extdata", "demo_project/control.tsv", package="mstherm")
annots <- system.file("extdata", "demo_project/annots.tsv", package="mstherm")
expt <- MSThermExperiment(control, annotations=annots)
expt <- normalize_to_std(expt, "cRAP_ALBU_BOVIN", plot=FALSE)
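The idea behind spike-in normalization can be sketched without the package. The base R example below is an assumption about the general approach, not the code used by normalize_to_std: it computes one scale factor per channel so that the summed standard signal becomes flat, then applies those factors to every spectrum. The matrix values and row names are made up.

```r
# Made-up quantification matrix: rows are spectra, columns are channels.
quant <- rbind(
  std_1  = c(100, 90, 110),
  std_2  = c(50,  45,  55),
  prot_1 = c(200, 90,  22)
)
colnames(quant) <- c("TMT.126", "TMT.127", "TMT.128")
is_std <- c(TRUE, TRUE, FALSE)   # which rows belong to the spike-in standard

# An equimolar standard should give the same total signal in every channel,
# so scale each channel toward the mean of the observed standard totals.
std_total <- colSums(quant[is_std, , drop = FALSE])
factors   <- mean(std_total) / std_total

# Apply the per-channel factors to all spectra.
normalized <- sweep(quant, 2, factors, "*")
colSums(normalized[is_std, , drop = FALSE])  # flat after scaling
normalized["prot_1", ]
```

The "model" argument of normalize_to_std, which fits the scale factors to a model, is not covered by this sketch.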
Re-normalize based on Tm.
Description
Normalizes each replicate of an experiment based on linear regression of calculated Tm (corrects for remaining systematic error).
Usage
normalize_to_tm(expt, res)
Arguments
expt
An MSThermExperiment object
res
An MSThermResultSet object
Details
An assumption can be made in most TPP experiments using a single organism that the Tm of most proteins should not be changing. However, global shifts have been observed between replicates, presumably due to technical variance, which complicate data interpretation. This method attempts to remove this source of error by doing a bootstrap renormalization of the quantification values based on pairwise linear regression between calculated Tms of each replicate. A reference set of Tms is calculated based on all replicates, and each replicate is normalized to this based on the calculated slope and intercept of the input data.
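The regression step described above can be illustrated on Tm values alone. The base R sketch below is an assumption-laden simplification: it uses made-up Tms for three hypothetical replicates, takes the row means as the reference set, and removes each replicate's fitted slope and intercept. The package applies its correction to the quantification values, and that step is not reproduced here.

```r
# Made-up Tm values (one row per protein) for three hypothetical replicates,
# each carrying a different systematic shift.
base <- c(45, 50, 55, 60, 65)
tm <- cbind(
  R1 = base,
  R2 = 2 + 1.05 * base,
  R3 = -1 + 0.98 * base
)

# Reference Tm set calculated from all replicates.
ref <- rowMeans(tm)

# Regress each replicate on the reference and undo the fitted line.
corrected <- apply(tm, 2, function(y) {
  b <- coef(lm(y ~ ref))   # b[[1]] = intercept, b[[2]] = slope
  (y - b[[1]]) / b[[2]]
})

round(corrected, 3)  # all replicates now agree with the reference
```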
Value
An MSThermExperiment object with re-normalized data slots
Examples
control <- system.file("extdata", "demo_project/control.tsv", package="mstherm")
annots <- system.file("extdata", "demo_project/annots.tsv", package="mstherm")
expt <- MSThermExperiment(control, annotations=annots)
expt <- normalize_to_std(expt, "cRAP_ALBU_BOVIN", plot=FALSE)
res <- model_experiment(expt, smooth=TRUE, bootstrap=FALSE, np=2)
expt <- normalize_to_tm(expt, res)
Plot MSThermResult object.
Description
Generate a denaturation plot for a modeled protein/group.
Usage
## S3 method for class 'MSThermResult'
plot(x, table = T, col, CI.points = T, CI.Tm = T,
...)
Arguments
x
An MSThermResult object
table
(T/f) include table of per-replicate parameters
col
array of colors used to plot samples
CI.points
(T/F) plot temperature point confidence intervals
CI.Tm
(T/F) plot Tm confidence intervals
...
other parameters passed through to plot()
Value
Nothing
Examples
control <- system.file("extdata", "demo_project/control.tsv", package="mstherm")
annots <- system.file("extdata", "demo_project/annots.tsv", package="mstherm")
expt <- MSThermExperiment(control, annotations=annots)
expt <- normalize_to_std(expt, "cRAP_ALBU_BOVIN", plot=FALSE)
res <- model_experiment(expt, bootstrap=FALSE, np=2)
# plot single MSThermResult
plot(res$P38707)
# plot all proteins (e.g. to pdf device, one-per-page)
plot(res)
Plot MSThermResultSet object.
Description
Generate a series of denaturation plots for all results in an MSThermResultSet.
Usage
## S3 method for class 'MSThermResultSet'
plot(x, ...)
Arguments
x
an MSThermResultSet object
...
other parameters are passed through to plot.MSThermResult
Details
Since this function makes multiple sequential calls to
plot.MSThermResult, it is usually used in conjunction with a multipage
graphics device such as "pdf()". Otherwise each subsequent call
will only overwrite the previous output.
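The multipage pattern can be sketched with base graphics alone. In the example below the three plot() calls are stand-ins for the sequential per-protein plots; with a real MSThermResultSet, a single plot(res) call would take the place of the loop. The output file is a temporary file chosen for illustration.

```r
# Open a multipage PDF device; each high-level plot call starts a new page.
out <- tempfile(fileext = ".pdf")
pdf(out)

# Stand-in for the sequential plots produced for each protein.
for (i in 1:3) {
  plot(seq_len(10), seq_len(10)^i, main = paste("page", i))
}

# Close the device so the file is flushed to disk.
dev.off()
file.exists(out)
```

Without the pdf() and dev.off() calls, each plot in an on-screen device would replace the previous one.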
Value
Nothing
Examples
# see plot.MSThermResult for an example
Summarize MSThermResult object.
Description
Print a summary of an MSThermResult, including samples and parameters.
Usage
## S3 method for class 'MSThermResult'
summary(object, ...)
Arguments
object
an MSThermResult object
...
additional arguments passed to or from other functions
Value
Nothing
Examples
# see model_protein() for an example
Summarize MSThermResultSet object.
Description
Print a summary of an MSThermResultSet, including samples and parameters.
Usage
## S3 method for class 'MSThermResultSet'
summary(object, ...)
Arguments
object
an MSThermResultSet object
...
additional arguments passed to or from other functions
Value
Nothing
Examples
# see model_experiment() for an example
Export MSThermResultSet to an SQLite database.
Description
Exports an MSThermResultSet object to a new SQLite database file. Each model (specific to a given replicate and protein) is exported as an individual record. The schema used for the 'data' table can be seen in the code below.
Usage
write.sqlite(res, file)
Arguments
res
An MSThermResultSet object
file
Path to the output sqlite database to be created
Value
Nothing
Examples
control <- system.file("extdata", "demo_project/control.tsv", package="mstherm")
annots <- system.file("extdata", "demo_project/annots.tsv", package="mstherm")
expt <- MSThermExperiment(control, annotations=annots)
expt <- normalize_to_std(expt, "cRAP_ALBU_BOVIN", plot=FALSE)
res <- model_experiment(expt, bootstrap=FALSE, np=2)
fn <- tempfile(fileext = ".sqlite")
write.sqlite(res, fn)
unlink(fn) # for example only