Title: | Read and Analyze Mass Spectrometry Alignment Files |
---|---|
Description: | A few functions for analyzing MS-DIAL alignments in R. Includes functions for feature normalization, subtraction of blanks, and mass library (msp) search. |
Authors: | Ethan Bass [aut, cre] |
Maintainer: | Ethan Bass <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.4.2 |
Built: | 2024-11-05 18:20:09 UTC |
Source: | https://github.com/ethanbass/mzinspectr |
The function can take multiple response variables on the left-hand side of the formula (separated by +). In this case, a separate boxplot will be produced for each response variable.
## S3 method for class 'ms_alignment'
boxplot(x, formula, ...)
x |
A peak_table object |
formula |
A formula object |
... |
Additional arguments to |
Attaches experimental metadata to 'ms_alignment' object. One of the columns in the supplied metadata must match exactly the row names of the peak table.
ms_attach_metadata(x, metadata, col)
x |
An ms_alignment object. |
metadata |
A 'data.frame' containing the sample metadata. |
col |
The name of the column containing the sample names. |
An ms_alignment object with attached metadata in the $sample_meta slot.
Ethan Bass
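The matching requirement described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy objects `tab` and `meta` and the helper `attach_meta_sketch()` are hypothetical and only show how a metadata column can be lined up with the row names of a peak table.

```r
# Hypothetical toy peak table: samples in rows, features in columns.
tab <- data.frame(peak1 = c(10, 20, 30), peak2 = c(5, 0, 15),
                  row.names = c("s1", "s2", "s3"))

# Hypothetical metadata, deliberately in a different row order.
meta <- data.frame(sample = c("s3", "s1", "s2"),
                   treatment = c("B", "A", "A"))

attach_meta_sketch <- function(tab, metadata, col) {
  # Look up each peak-table row name in the chosen metadata column.
  idx <- match(rownames(tab), metadata[[col]])
  if (anyNA(idx)) {
    stop("Every row name of the peak table must appear in column '", col, "'.")
  }
  # Reorder the metadata so that its rows line up with the peak table.
  out <- metadata[idx, , drop = FALSE]
  rownames(out) <- rownames(tab)
  out
}

sample_meta <- attach_meta_sketch(tab, meta, col = "sample")
```

Because the lookup uses exact matching, a sample name that differs only in case or whitespace would trigger the error rather than being silently dropped.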
Convert retention times to retention indices in alignment object.
ms_calculate_RIs(x, Ris)
x |
An ms_alignment object. |
Ris |
A matrix or data.frame containing retention times in column one and retention indices in column two. |
Call the MS-DIAL console app. For help configuring the MS-DIAL console app on macOS or Linux, please see the instructions helpfully compiled by Jiung-Wen Chen.
ms_call_msdial(
  system,
  path_in,
  path_out,
  method,
  settings,
  p = FALSE,
  mce = FALSE
)
system |
Either |
path_in |
Path to files. |
path_out |
Path to output directory |
method |
A method file. |
settings |
Settings in lieu of a method file. |
p |
Logical. |
mce |
Logical |
Returns MSDIAL alignment.
Filter alignment by provided indices.
ms_filter_alignment(x, idx, what = c("rows", "cols"), inverse = FALSE)
x |
An ms_alignment object. |
idx |
Indices to be retained or excluded according to the value of inverse. |
what |
Which dimension to filter on. Either "rows" or "cols". |
inverse |
Whether to retain (default) or remove the specified columns. |
Ethan Bass
Find peak based on retention time and/or mass
ms_find_peak(x, rt, mz, rt.tol = 0.01, mz.tol = 0.05, plot_it = TRUE)
x |
An ms_alignment object. |
rt |
Retention time |
mz |
Quant.mass |
rt.tol |
Tolerance for matching retention time |
mz.tol |
Tolerance for matching Quant.mass |
plot_it |
Logical. Whether to plot the spectra. |
Returns EI spectrum as a data.frame.
Ethan Bass
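As a rough illustration of tolerance-based peak lookup, the following base R sketch selects features whose retention time and/or quant mass fall within the given tolerances. The `peak_meta` table, its column names, and the helper `find_peak_sketch()` are hypothetical, and the use of inclusive tolerances is an assumption rather than documented behavior of ms_find_peak.

```r
# Hypothetical peak metadata: one row per aligned feature.
peak_meta <- data.frame(rt = c(5.20, 5.30, 8.75),
                        mz = c(93.07, 136.12, 93.05))

find_peak_sketch <- function(peak_meta, rt = NULL, mz = NULL,
                             rt.tol = 0.01, mz.tol = 0.05) {
  keep <- rep(TRUE, nrow(peak_meta))
  # Apply each criterion only when a target value was supplied.
  if (!is.null(rt)) keep <- keep & abs(peak_meta$rt - rt) <= rt.tol
  if (!is.null(mz)) keep <- keep & abs(peak_meta$mz - mz) <= mz.tol
  # Return the indices of the matching features.
  which(keep)
}

by_both <- find_peak_sketch(peak_meta, rt = 5.20, mz = 93.07)
by_mass <- find_peak_sketch(peak_meta, mz = 93.07)
```

Searching by mass alone returns every feature with a similar quant mass, regardless of retention time, which is why combining both criteria usually narrows the result to a single peak.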
Get spectrum from MSDIAL alignment object
ms_get_spectrum(x, col)
x |
An ms_alignment object. |
col |
Index of the feature (column). |
Returns spectrum as a data.frame with two columns: "mz" and "intensity".
Ethan Bass
Mirror plot function
Plot two spectra as a mirror plot.
ms_mirror_plot(x, ...)

## S3 method for class 'data.frame'
ms_mirror_plot(
  x,
  y,
  plot_labels = TRUE,
  type = c("plotly", "base"),
  scale = TRUE,
  lab_int = 0.2,
  digits = 1,
  bar_width = 1,
  match_score = TRUE,
  ...
)

## S3 method for class 'ms_alignment'
ms_mirror_plot(
  x,
  cols,
  ref,
  type = c("plotly", "base"),
  scale = TRUE,
  plot_labels = TRUE,
  lab_int = 0.2,
  digits = 1,
  bar_width = 1,
  match_score = TRUE,
  ...
)
x |
A |
... |
Additional arguments |
y |
Mass spectrum as data.frame with m/z values in column one and ionization intensity in column two. |
plot_labels |
Logical. Whether to label m/z values on plot. |
type |
What kind of plot to produce. Either base R ("base") or plotly ("plotly"). |
scale |
Logical. Whether to scale mass spectrum. Defaults to TRUE. |
lab_int |
Labels will be plotted above the specified proportion of the largest ion. |
digits |
How many figures to include on m/z labels. |
bar_width |
Width of bars. |
match_score |
Logical. Whether to plot match score or not. |
cols |
One or more columns in the peak table |
ref |
A row in the matches slot corresponding to the provided column. |
Normalize by internal standard.
ms_normalize_itsd(x, idx, plot_it = FALSE)
x |
An ms_alignment object. |
idx |
Column index of internal standard. |
plot_it |
Logical. Whether to plot ITSD against total peak area. |
A normalized ms_alignment object or matrix, according to the input.
Ethan Bass
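Internal standard normalization can be sketched in base R as dividing every sample (row) by the intensity of the internal standard in that sample. The toy matrix and the name `itsd_sketch()` are hypothetical, and the sketch assumes samples in rows and features in columns; it is not the package's implementation.

```r
# Hypothetical peak table: two samples, two analytes and one internal standard.
tab <- matrix(c(100, 200, 50, 80, 10, 20), nrow = 2,
              dimnames = list(c("s1", "s2"), c("p1", "p2", "itsd")))

itsd_sketch <- function(tab, idx) {
  # Divide each row by the internal standard intensity of the same sample.
  sweep(tab, 1, tab[, idx], "/")
}

tab_norm <- itsd_sketch(tab, idx = 3)
```

After normalization the internal standard column equals 1 in every sample, and the remaining columns are expressed relative to it.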
Performs Probabilistic Quotient Normalization on peak table.
ms_normalize_pqn(x, ref = c("median", "mean"), QC = NULL)
x |
A |
ref |
Reference for normalization: either "median" (default) or "mean". |
QC |
Vector of indices specifying the samples whose average is used as the reference (e.g. QC samples). |
A normalized ms_alignment object or matrix, according to the input.
Adapted from the Rcpm package by Rico Derks (licensed under GPL3).
E. Nevedomskaya
Rico Derks
Ethan Bass
Dieterle, F., Ross, A., Schlotterbeck, G. & Senn, H. Probabilistic Quotient Normalization as Robust Method to Account for Dilution of Complex Biological Mixtures. Application in 1H NMR Metabonomics. Anal. Chem. 78, 4281-4290 (2006).
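The core idea of probabilistic quotient normalization can be sketched in a few lines of base R. This is a simplified illustration, not the code used by the package: `pqn_sketch()` is a hypothetical name, samples are assumed to be in rows, and the initial integral normalization step described by Dieterle et al. is omitted. Features with a reference value of zero would need special handling that is not shown here.

```r
pqn_sketch <- function(X, ref = c("median", "mean"), QC = NULL) {
  ref <- match.arg(ref)
  # Samples used to build the reference profile (all samples if QC is NULL).
  ref_rows <- if (is.null(QC)) seq_len(nrow(X)) else QC
  center <- if (ref == "median") stats::median else mean
  reference <- apply(X[ref_rows, , drop = FALSE], 2, center)
  # Quotient of every feature to the reference profile, sample by sample.
  quotients <- sweep(X, 2, reference, "/")
  # The most probable dilution factor is the median quotient of each sample.
  dilution <- apply(quotients, 1, stats::median, na.rm = TRUE)
  # Divide each sample by its estimated dilution factor.
  sweep(X, 1, dilution, "/")
}

# Three hypothetical samples that differ only by dilution.
X <- rbind(s1 = c(10, 20, 30), s2 = c(20, 40, 60), s3 = c(30, 60, 90))
X_norm <- pqn_sketch(X)
```

In this toy case the three samples collapse onto the same profile after normalization, since their only difference was an overall scaling factor.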
Divides each row by the sum of the features in that row.
ms_normalize_tsn(x)
x |
An ms_alignment object. |
A normalized ms_alignment object or matrix, according to the input.
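Total sum normalization as described above (each row divided by the sum of its features) corresponds to a single `sweep()` call in base R. The toy matrix below is hypothetical and assumes samples in rows.

```r
# Hypothetical peak table: two samples, three features.
X <- rbind(s1 = c(1, 1, 2), s2 = c(5, 10, 5))

# Divide each row by the sum of the features in that row.
X_tsn <- sweep(X, 1, rowSums(X), "/")
```

Every row of the result sums to 1, so the values can be read as each feature's share of the total signal in a sample.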
Plot mass spectrum of peak given by col.
ms_plot_spectrum(
  x,
  col,
  plot_labels = TRUE,
  lab_int = 0.2,
  title = TRUE,
  type = c("plotly", "base"),
  scale = FALSE,
  bar_width = 1,
  digits = 1,
  ...
)
x |
An alignment object. |
col |
Spectrum to plot. |
plot_labels |
Logical. Whether to plot labels or not. |
lab_int |
Labels will be plotted above the specified proportion of the largest ion. |
title |
Logical. Whether to plot title. Defaults to TRUE. |
type |
What kind of plot to produce. Either base R ("base") or plotly ("plotly"). |
scale |
Logical. Whether to scale mass spectrum. Defaults to FALSE. |
bar_width |
Width of bars. |
digits |
How many figures to include on m/z labels. |
... |
Additional arguments. |
If export is TRUE, returns spectrum as data.frame. Otherwise, no return value.
Ethan Bass
Read MSDIAL alignment file
ms_read_alignment(path, format = c("msdial"))
path |
Path to mass spectrometry alignment file. |
format |
The format of the provided alignment file. Currently, only MS-DIAL '.txt' files are supported ("msdial"). |
Returns an ms_alignment object: a list of 3 data.frames containing peak data (tab), peak metadata (peak_meta) and sample metadata (sample_meta).
Ethan Bass
Convert peak table to tidy format for plotting.
ms_reshape_peaktable(
  x,
  peaks,
  metadata,
  treatments = NULL,
  fixed_levels = TRUE
)
x |
An MS-DIAL alignment object. |
peaks |
A character vector specifying the peaks to include in tidy output. If the character vector is named, the names of the vector elements will be used in place of the original peak names. |
metadata |
A character vector specifying the metadata to include in the tidy output. |
treatments |
This argument is deprecated as of version 0.3.2. It is synonymous with the new metadata argument which should be used instead. |
fixed_levels |
Logical. Whether to fix factor levels of features in the order provided. Defaults to TRUE. |
If export is TRUE, returns spectrum as data.frame. Otherwise, no return value.
Ethan Bass
Convert retention times to retention indices.
ms_rt_to_ri(rts, RIs)
rts |
A vector of retention times. |
RIs |
A matrix or data.frame containing retention times in column one and retention indices in column two. |
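One common way to convert retention times to retention indices is linear interpolation between bracketing standards. The sketch below assumes that approach; the manual does not state which interpolation the package actually uses, and the helper `rt_to_ri_sketch()` and the alkane ladder values are hypothetical.

```r
# Hypothetical standards: retention times (column 1), retention indices (column 2).
RIs <- data.frame(rt = c(5, 10, 15), ri = c(1000, 1100, 1200))

rt_to_ri_sketch <- function(rts, RIs) {
  # Linear interpolation between the bracketing standards.
  # Retention times outside the range of the standards yield NA.
  stats::approx(x = RIs[, 1], y = RIs[, 2], xout = rts)$y
}

ri <- rt_to_ri_sketch(c(5, 7.5, 12), RIs)
```

A retention time of 7.5 lies halfway between the standards at 5 and 10, so it maps to an index halfway between 1000 and 1100.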
Launch MS search gadget for interactive viewing of spectral matches.
ms_search_gadget(data)
data |
An ms_alignment object. |
This function can be used to identify peaks in a peak table by matching them to a spectral database (db). It takes several arguments that can be used to customize the matching algorithm, including ri_thresh, spectral_weight, and n_results. The retention index threshold (ri_thresh) is used to subset the provided database, which greatly improves the search speed. Only database entries with a retention index falling within the specified threshold will be considered. The spectral weight (spectral_weight) sets the weight given to spectral similarity (versus retention index similarity) when calculating the total similarity score, which is used to rank matches.
ms_search_spectra(
  x,
  db,
  cols,
  ...,
  ri_thresh = 100,
  spectral_weight = 0.6,
  n_results = 10,
  parallel,
  mc.cores = 2,
  print = FALSE,
  progress_bar = TRUE
)
x |
An ms_alignment object. |
db |
MSP database. The provided object should be a nested list, where the
sublists contain the following elements: retention indices in an element named
|
cols |
Index or indices of feature(s) to be identified. |
... |
Additional arguments to |
ri_thresh |
Maximum difference between retention indices for a match to be considered. Defaults to 100. Use |
spectral_weight |
A number between 0 and 1 specifying the weight given to spectral similarity versus retention index similarity. Defaults to 0.6. |
n_results |
How many results to return. Defaults to 10. |
parallel |
Logical. Whether to use parallel processing. (This feature does not work on Windows). |
mc.cores |
How many cores to use for parallel processing? Defaults to 2. |
print |
Logical. Whether to print the results after each search. Defaults to FALSE. |
progress_bar |
Logical. Whether to display progress bar or not. |
Returns a modified ms_alignment object with database matches in the matches slot as a list of data frames. Each data.frame will contain the database matches as rows and columns corresponding to the elements of the database entry (e.g. "Name", "InChIKey", etc.) as well as match scores for spectral similarity (spectral_match), retention index similarity (ri_match) and the total similarity score (total_score).
See mspcompiler for help compiling an msp database.
Ethan Bass
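The interplay of ri_thresh, spectral_weight and n_results described for ms_search_spectra can be sketched as follows. This is an illustration under stated assumptions, not the package's scoring code: the helper `score_matches_sketch()` is hypothetical, the total score is assumed to be a weighted average of the two similarities, and the retention index similarity is assumed to fall linearly from 1 to 0 across the threshold.

```r
score_matches_sketch <- function(ri_query, db_ri, spectral_match,
                                 ri_thresh = 100, spectral_weight = 0.6,
                                 n_results = 10) {
  ri_diff <- abs(db_ri - ri_query)
  # Keep only database entries whose retention index lies within the threshold.
  keep <- which(ri_diff <= ri_thresh)
  # Assumed RI similarity: 1 for identical indices, 0 at the threshold.
  ri_match <- 1 - ri_diff[keep] / ri_thresh
  # Assumed total score: weighted average of spectral and RI similarity.
  total <- spectral_weight * spectral_match[keep] +
    (1 - spectral_weight) * ri_match
  out <- data.frame(entry = keep,
                    spectral_match = spectral_match[keep],
                    ri_match = ri_match,
                    total_score = total)
  # Rank matches by total score and return the best n_results.
  out <- out[order(out$total_score, decreasing = TRUE), , drop = FALSE]
  utils::head(out, n_results)
}

# Hypothetical query (RI 1200) against three database entries.
res <- score_matches_sketch(ri_query = 1200,
                            db_ri = c(1190, 1250, 1500),
                            spectral_match = c(0.80, 0.90, 0.99))
```

In this toy example the third entry has the best spectral match but is never scored, because its retention index is 300 units away from the query. The first entry outranks the second despite a lower spectral match because its retention index is much closer.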
Subtract blanks
ms_subtract_blanks(
  x,
  blanks.idx,
  blanks.pattern,
  what = c("mean", "median"),
  drop = TRUE
)
x |
A |
blanks.idx |
Indices of blank samples |
blanks.pattern |
A string that uniquely identifies blank samples by name |
what |
Whether to subtract the mean or median value |
drop |
Logical. Whether to drop columns containing only zeros. Defaults to TRUE. |
An ms_alignment object with the mean or median of the blanks subtracted from each peak.
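Blank subtraction can be sketched in base R as follows. The helper `subtract_blanks_sketch()` and the toy table are hypothetical. Two details are assumptions that the manual does not confirm: negative values after subtraction are set to zero, and the blank samples themselves are removed from the result.

```r
subtract_blanks_sketch <- function(tab, blanks.idx,
                                   what = c("mean", "median"), drop = TRUE) {
  what <- match.arg(what)
  center <- if (what == "mean") mean else stats::median
  # Per-feature background estimated from the blank samples.
  background <- apply(tab[blanks.idx, , drop = FALSE], 2, center)
  # Subtract the background from every non-blank sample.
  out <- sweep(tab[-blanks.idx, , drop = FALSE], 2, background, "-")
  # Assumption: negative values after subtraction are set to zero.
  out[out < 0] <- 0
  # Optionally drop features that are zero in every remaining sample.
  if (drop) out <- out[, colSums(out != 0) > 0, drop = FALSE]
  out
}

# Hypothetical peak table with two blanks and two samples.
tab <- rbind(blank1 = c(2, 10, 0), blank2 = c(4, 10, 0),
             s1 = c(13, 8, 5), s2 = c(23, 10, 7))
colnames(tab) <- c("p1", "p2", "p3")
res <- subtract_blanks_sketch(tab, blanks.idx = 1:2)
```

Here the second feature is no more abundant in the samples than in the blanks, so it becomes all zeros and is dropped when drop = TRUE.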
Converts peak table to tidy format for plotting. This function is deprecated as of version 0.3.3. Please use ms_reshape_peaktable instead.
ms_tidy_msdial(x, peaks, metadata, treatments = NULL)
x |
An MS-DIAL alignment object. |
peaks |
A character vector specifying the peaks to include in tidy output. If the character vector is named, the names of the vector elements will be used in place of the original peak names. |
metadata |
A character vector specifying the metadata to include in the tidy output. |
treatments |
This argument is deprecated as of version 0.3.2. It is synonymous with the new metadata argument which should be used instead. |
If export is TRUE, returns spectrum as data.frame. Otherwise, no return value.
Ethan Bass
Plots the trace and/or spectrum for a given peak in peak table.
## S3 method for class 'ms_alignment'
plot(
  x,
  col,
  plot_spectrum = TRUE,
  box_plot = FALSE,
  vars = NULL,
  spectrum_labels = TRUE,
  engine = c("base", "plotly"),
  ...
)
x |
An ms_alignment object. |
col |
A vector specifying the peak(s) that you wish to plot. |
plot_spectrum |
Logical. If TRUE, plots the mass spectrum of the chosen peak. Defaults to TRUE. |
box_plot |
Logical. If TRUE, plots box plot using factors defined by vars. |
vars |
Independent variables for boxplot. Right-hand side of formula. |
spectrum_labels |
Logical. If TRUE, plots labels on maxima in spectral plot. Defaults to TRUE. |
engine |
Which plotting engine to use: either "base" or "plotly". |
... |
Additional arguments to |
Can be used to confirm the identity of a peak or check that a particular column in the peak table represents a single compound. Can also be used to create simple box-plots to examine the distribution of a peak with respect to variables defined in sample metadata.
No return value.
If plot_spectrum is TRUE, plots the spectrum for the specified chromatogram at the specified retention time. The spectrum is a single row from the chromatographic matrix.
If box_plot is TRUE, produces a boxplot from the specified peak with groups provided by vars.
Ethan Bass
This function is slightly adapted from the SpectrumSimilarity function in [OrgMassSpecR](https://orgmassspec.github.io/) where it is licensed under BSD-2 (© 2011-2017, Nathan Dodder). The function was re-factored here for increased speed.
spectral_similarity(
  spec.top,
  spec.bottom,
  tol = 0.25,
  b = 10,
  xlim = c(50, 1200),
  x.threshold = 0
)
spec.top |
data frame containing the experimental spectrum's peak list with the m/z values in the first column and corresponding intensities in the second. |
spec.bottom |
data frame containing the reference spectrum's peak list with the m/z values in the first column and corresponding intensities in the second. |
tol |
numeric value specifying the tolerance used to align the m/z values of the two spectra. |
b |
numeric value specifying the baseline threshold for peak identification. Expressed as a percent of the maximum intensity. |
xlim |
numeric vector of length 2, defining the beginning and ending values of the x-axis. |
x.threshold |
numeric value of length 1 specifying the m/z threshold used for the similarity score calculation. Only peaks with m/z values above the threshold are used in the calculation. This can be used to exclude noise and/or non-specific ions at the low end of the spectrum. By default all ions are used. |
Nathan G. Dodder
Ethan Bass
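To make the roles of tol and b concrete, here is a simplified base R sketch of a cosine (normalized dot product) similarity between two spectra. It assumes a cosine score and a greedy nearest-ion alignment, and it ignores xlim and x.threshold. The helper `cosine_similarity_sketch()` is hypothetical and is not the code used by spectral_similarity or OrgMassSpecR.

```r
cosine_similarity_sketch <- function(spec.top, spec.bottom, tol = 0.25, b = 10) {
  prep <- function(s) {
    # Scale intensities to percent of the base peak.
    s <- data.frame(mz = s[[1]], int = 100 * s[[2]] / max(s[[2]]))
    # Discard ions below the baseline threshold b (percent of maximum).
    s[s$int >= b, , drop = FALSE]
  }
  top <- prep(spec.top)
  bottom <- prep(spec.bottom)
  u <- top$int
  v <- numeric(nrow(top))
  used <- logical(nrow(bottom))
  for (i in seq_len(nrow(top))) {
    # Pair each experimental ion with the nearest unused reference ion.
    d <- abs(bottom$mz - top$mz[i])
    d[used] <- Inf
    j <- which.min(d)
    if (length(j) == 1 && d[j] <= tol) {
      v[i] <- bottom$int[j]
      used[j] <- TRUE
    }
  }
  # Unmatched reference ions enter with zero experimental intensity.
  u <- c(u, numeric(sum(!used)))
  v <- c(v, bottom$int[!used])
  sum(u * v) / sqrt(sum(u^2) * sum(v^2))
}

# Hypothetical spectra: spec_b is spec_a shifted by 0.1 m/z, spec_c shares no ions.
spec_a <- data.frame(mz = c(50, 77, 105), intensity = c(20, 100, 60))
spec_b <- data.frame(mz = c(50.1, 77.1, 105.1), intensity = c(20, 100, 60))
spec_c <- data.frame(mz = c(200, 300), intensity = c(100, 50))

score_same <- cosine_similarity_sketch(spec_a, spec_b)
score_diff <- cosine_similarity_sketch(spec_a, spec_c)
```

A shift of 0.1 m/z is within the default tolerance of 0.25, so the first pair is treated as identical, while spectra with no ions in common score zero.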