Package 'chromatographR' reference manual

Title:	Chromatographic Data Analysis Toolset
Description:	Tools for high-throughput analysis of HPLC-DAD/UV chromatograms (or similar data). Includes functions for preprocessing, alignment, peak-finding and fitting, peak-table construction, data-visualization, etc. Preprocessing and peak-table construction follow the rough formula laid out in alsace (Wehrens, R., Bloemberg, T.G., and Eilers P.H.C., 2015. <doi:10.1093/bioinformatics/btv299>. Alignment of chromatograms is available using parametric time warping (ptw) (Wehrens, R., Bloemberg, T.G., and Eilers P.H.C. 2015. <doi:10.1093/bioinformatics/btv299>) or variable penalty dynamic time warping (VPdtw) (Clifford, D., & Stone, G. 2012. <doi:10.18637/jss.v047.i08>). Peak-finding uses the algorithm by Tom O'Haver <https://terpconnect.umd.edu/~toh/spectrum/PeakFindingandMeasurement.htm>. Peaks are then fitted to a gaussian or exponential-gaussian hybrid peak shape using non-linear least squares (Lan, K. & Jorgenson, J. W. 2001. <doi:10.1016/S0021-9673(01)00594-5>). See the vignette for more details and suggested workflow.
Authors:	Ethan Bass [aut, cre] , Hans W Borchers [ctb, cph] (Author of savgol and pinv functions bundled from pracma)
Maintainer:	Ethan Bass <[email protected]>
License:	GPL (>= 2)
Version:	0.7.3
Built:	2025-02-10 06:31:23 UTC
Source:	https://github.com/ethanbass/chromatographR

chromatographR

Description

Chromatographic Data Analysis Toolset

Details

Tools for high-throughput analysis of HPLC-DAD/UV chromatograms (or similar data). Includes functions for preprocessing, alignment, peak-finding and fitting, peak-table construction, data-visualization, etc. Preprocessing and peak-table construction follow the rough formula laid out in alsace (Wehrens, R., Bloemberg, T.G., and Eilers P.H.C., 2015. doi:10.1093/bioinformatics/btv299). Alignment of chromatograms is available using parametric time warping (ptw) (Wehrens, R., Bloemberg, T.G., and Eilers P.H.C. 2015. doi:10.1093/bioinformatics/btv299) or variable penalty dynamic time warping (VPdtw) (Clifford, D., & Stone, G. 2012. doi:10.18637/jss.v047.i08). Peak-finding relies on the algorithm suggested by Tom O'Haver in his Pragmatic Introduction to Signal Processing. Peaks are then fitted to a gaussian or exponential-gaussian hybrid peak shape using non-linear least squares (Lan, K. & Jorgenson, J. W. 2001. doi:10.1016/S0021-9673(01)00594-5). More details on package usage and a suggested workflow can be found in the vignette.

Author(s)

Ethan Bass

Attach experimental metadata

Description

Attaches sample metadata to 'peak_table' object. Metadata should be provided as a data.frame object. One of the columns in the supplied metadata must match exactly the row names of the peak table.

Usage

attach_metadata(peak_table, metadata, column)
attach_metadata(peak_table, metadata, column)

Arguments

`peak_table`	A 'peak_table' object.
`metadata`	A 'data.frame' containing the sample metadata.
`column`	The name of the column in your `metadata` object containing the sample names. Sample names must match the row names of `peak_table$tab`.

Value

A peak_table object with attached metadata in the $sample_meta slot.

Author(s)

Ethan Bass

Examples

data(pk_tab)
path <- system.file("extdata", "Sa_metadata.csv", package = "chromatographR")
meta <- read.csv(path)
pk_tab <- attach_metadata(peak_table = pk_tab, metadata = meta, column="vial")
data(pk_tab)
path <- system.file("extdata", "Sa_metadata.csv", package = "chromatographR")
meta <- read.csv(path)
pk_tab <- attach_metadata(peak_table = pk_tab, metadata = meta, column="vial")

Attach reference spectra

Description

Gathers reference spectra and attaches them to peak_table object. Reference spectra are defined either as the spectrum with the highest intensity ( max.int) or as the spectrum with the highest average correlation to the other spectra in the peak_table (max.cor).

Usage

attach_ref_spectra(peak_table, chrom_list, ref = c("max.cor", "max.int"))
attach_ref_spectra(peak_table, chrom_list, ref = c("max.cor", "max.int"))

Arguments

`peak_table`	Peak table from `get_peaktable`.
`chrom_list`	A list of chromatograms in matrix format (timepoints x wavelengths). If no argument is provided here, the function will try to find the `chrom_list` object used to create the provided `peak_table`.
`ref`	What criterion to use to select reference spectra. Current options are maximum correlation (`max.cor`) or maximum signal intensity (`max.int`).

Value

A peak_table object with reference spectra attached in the $ref_spectra slot.

Author(s)

Ethan Bass

Examples

data(pk_tab)
pk_tab <- attach_ref_spectra(pk_tab, ref="max.int")
pk_tab <- attach_ref_spectra(pk_tab, ref = "max.cor")
data(pk_tab)
pk_tab <- attach_ref_spectra(pk_tab, ref="max.int")
pk_tab <- attach_ref_spectra(pk_tab, ref = "max.cor")

Make boxplot from peak table.

Description

The function can take multiple response variables on the left hand side of the formula (separated by +). In this case, a separate boxplot will be produced for each response variable.

Usage

## S3 method for class 'peak_table'
boxplot(x, formula, ...)
## S3 method for class 'peak_table'
boxplot(x, formula, ...)

Arguments

`x`	A peak_table object
`formula`	A formula object
`...`	Additional arguments to `boxplot`

Examples

data(pk_tab)
path <- system.file("extdata", "Sa_metadata.csv", package = "chromatographR")
meta <- read.csv(path)
pk_tab <- attach_metadata(peak_table = pk_tab, metadata = meta, column="vial")
boxplot(pk_tab, formula=V11 ~ trt)
data(pk_tab)
path <- system.file("extdata", "Sa_metadata.csv", package = "chromatographR")
meta <- read.csv(path)
pk_tab <- attach_metadata(peak_table = pk_tab, metadata = meta, column="vial")
boxplot(pk_tab, formula=V11 ~ trt)

Cluster spectra

Description

Cluster peaks by spectral similarity.

Usage

cluster_spectra(
  peak_table,
  peak_no = NULL,
  alpha = 0.05,
  min_size = 5,
  max_size = NULL,
  nboot = 1000,
  plot_dend = TRUE,
  plot_spectra = TRUE,
  verbose = getOption("verbose"),
  save = FALSE,
  parallel = TRUE,
  max.only = FALSE,
  output = c("pvclust", "clusters"),
  ...
)
cluster_spectra(
  peak_table,
  peak_no = NULL,
  alpha = 0.05,
  min_size = 5,
  max_size = NULL,
  nboot = 1000,
  plot_dend = TRUE,
  plot_spectra = TRUE,
  verbose = getOption("verbose"),
  save = FALSE,
  parallel = TRUE,
  max.only = FALSE,
  output = c("pvclust", "clusters"),
  ...
)

Arguments

`peak_table`	Peak table from `get_peaktable`.
`peak_no`	Minimum and maximum thresholds for the number of peaks a cluster may have. This argument is deprecated in favor of `min_size` and `max_size`.
`alpha`	Confidence threshold for inclusion of cluster.
`min_size`	Minimum number of peaks a cluster may have.
`max_size`	Maximum number of peaks a cluster may have.
`nboot`	Number of bootstrap replicates for `pvclust`.
`plot_dend`	Logical. If TRUE, plots dendrogram with bootstrap values.
`plot_spectra`	Logical. If TRUE, plots overlapping spectra for each cluster.
`verbose`	Logical. If TRUE, prints progress report to console.
`save`	Logical. If TRUE, saves pvclust object to current directory.
`parallel`	Logical. If TRUE, use parallel processing for `pvclust`.
`max.only`	Logical. If TRUE, returns only highest level for nested dendrograms.
`output`	What to return. Either `clusters` to return list of clusters, `pvclust` to return pvclust object, or `both` to return both items.
`...`	Additional arguments to `pvclust`.

Details

Function to cluster peaks by spectral similarity. Before using this function, reference spectra must be attached to the peak_table using the attach_ref_spectra function. These reference spectra are then used to construct a distance matrix based on spectral similarity (pearson correlation) between peaks. Hierarchical clustering with bootstrap resampling is performed on the resulting correlation matrix to classify peaks by spectral similarity, as implemented in pvclust. Finally, bootstrap values can be used to select clusters that exceed a certain confidence threshold as defined by alpha.

Clusters can be filtered by the minimum and maximum size of the cluster using the min_size and max_size arguments respectively. If max_only is TRUE, only the largest cluster in a nested tree of clusters meeting the specified confidence threshold will be returned.

Value

Returns clusters and/or pvclust object according to the value of the output argument.

If output = clusters, returns a list of S4 cluster objects.
If output = pvclust, returns a pvclust object.
If output = both, returns a nested list containing [[1]] the pvclust object, and [[2]] the list of S4 cluster objects.

The cluster objects consist of the following components:

peaks: a character vector containing the names of all peaks contained in the given cluster.
pval: a numeric vector of length 1 containing the bootstrap p-value (au) for the given cluster.

Note

Users should be aware that the clustering algorithm will often return nested clusters. Thus, an individual peak could appear in more than one cluster.
It is highly suggested to use more than 100 bootstraps if you run the clustering algorithm on real data even though we use nboot = 100 in the example to reduce runtime. The authors of pvclust suggest nboot = 10000.

Author(s)

Ethan Bass

References

R. Suzuki & H. Shimodaira. 2006. Pvclust: an R package for assessing the uncertainty in hierarchical clustering. Bioinformatics, 22(12):1540-1542. doi:10.1093/bioinformatics/btl117.

Examples


data(pk_tab)
data(Sa_warp)
pk_tab <- attach_ref_spectra(pk_tab, Sa_warp, ref = "max.int")
cl <- cluster_spectra(pk_tab, nboot = 100, max.only = FALSE, 
save = FALSE, alpha = 0.03)

data(pk_tab)
data(Sa_warp)
pk_tab <- attach_ref_spectra(pk_tab, Sa_warp, ref = "max.int")
cl <- cluster_spectra(pk_tab, nboot = 100, max.only = FALSE, 
save = FALSE, alpha = 0.03)

Combine peaks

Description

Utility function to combine duplicate peaks in peak table, i.e. peaks that were integrated at more than one wavelength or component. Specify tolerance (tol) for retention time matching and minimum spectral correlation (min.cor) for a match.

Usage

combine_peaks(
  peak_table,
  tol = 0.01,
  min.cor = 0.9,
  choose = "max",
  verbose = getOption("verbose")
)
combine_peaks(
  peak_table,
  tol = 0.01,
  min.cor = 0.9,
  choose = "max",
  verbose = getOption("verbose")
)

Arguments

`peak_table`	Peak table from `get_peaktable`.
`tol`	Tolerance for matching retention times (maximum retention time difference). Defaults to `.01`.
`min.cor`	Minimum spectral correlation to confirm a match. Defaults to `0.9`.
`choose`	If `max` will retain peak with highest intensity. Otherwise, the first column in the data.frame will be retained.
`verbose`	Logical. Whether to print status to the console.

Value

A peak table similar to the input peak table, but with duplicate columns combined according to the specified criteria.

Author(s)

Ethan Bass

Examples

data(pk_tab)
data(Sa_warp)
pk_tab <- attach_ref_spectra(pk_tab)
combine_peaks(pk_tab, tol = .02, min.cor = .9)
data(pk_tab)
data(Sa_warp)
pk_tab <- attach_ref_spectra(pk_tab)
combine_peaks(pk_tab, tol = .02, min.cor = .9)

Correct peak positions according to a PTW warping model

Description

Corrects retention time differences using parametric time warping as implemented in ptw.

Usage

correct_peaks(peak_list, mod_list, chrom_list, match_names = TRUE)
correct_peaks(peak_list, mod_list, chrom_list, match_names = TRUE)

Arguments

`peak_list`	A 'peak_list' object created by `get_peaks`, containing a nested list of peak tables where the first level is the sample, and the second level is the spectral wavelength. Every component is described by a matrix where every row is one peak, and the columns contain information on retention time, peak width (FWHM), peak width, height, and area.
`mod_list`	A list of ptw models.
`chrom_list`	List of chromatograms supplied to create ptw models.
`match_names`	Logical. Whether to actively match the names of the `peak_list` to the list of models (`mod_list`). Defaults to `TRUE`.

Details

Once an appropriate warping model has been established, corrected retention times can be predicted for each peak. These are stored in a separate column in the list of peak tables.

Value

The input list of peak tables is returned with extra columns containing the corrected retention time.

Note

This function is adapted from getPeakTable function in the alsace package by Ron Wehrens.

Author(s)

Ron Wehrens, Ethan Bass

Correct retention time

Description

Aligns chromatograms using one of two algorithms, according to the value of alg: either parametric time warping, as implemented in ptw, or variable penalty dynamic time warping, as implemented in VPdtw. The init.coef and n.traces arguments apply only to ptw warping, while penalty and maxshift apply only to vpdtw warping.

Usage

correct_rt(
  chrom_list,
  lambdas,
  models = NULL,
  reference = "best",
  alg = c("ptw", "vpdtw"),
  what = c("corrected.values", "models"),
  init.coef = c(0, 1, 0),
  n.traces = NULL,
  n.zeros = 0,
  scale = FALSE,
  trwdth = 200,
  plot_it = FALSE,
  penalty = 5,
  maxshift = 50,
  verbose = getOption("verbose"),
  show_progress = NULL,
  cl = 2,
  ...
)
correct_rt(
  chrom_list,
  lambdas,
  models = NULL,
  reference = "best",
  alg = c("ptw", "vpdtw"),
  what = c("corrected.values", "models"),
  init.coef = c(0, 1, 0),
  n.traces = NULL,
  n.zeros = 0,
  scale = FALSE,
  trwdth = 200,
  plot_it = FALSE,
  penalty = 5,
  maxshift = 50,
  verbose = getOption("verbose"),
  show_progress = NULL,
  cl = 2,
  ...
)

Arguments

`chrom_list`	List of chromatograms in matrix format.
`lambdas`	Select wavelengths to use by name.
`models`	List of models to warp by. The models provided here (if any) must match the algorithm selected in `alg`.
`reference`	Index of the sample that is to be considered the reference sample.
`alg`	algorithm to use: parametric time warping (`ptw`) or variable penalty dynamic time warping (`vpdtw`).
`what`	What to return: either the 'corrected.values' (useful for visual inspection) or the warping 'models' (for further programmatic use).
`init.coef`	Starting values for the optimization.
`n.traces`	Number of traces to use.
`n.zeros`	Number of zeros to add.
`scale`	Logical. If true, scale chromatograms before warping.
`trwdth`	width of the triangle in the WCC criterion.
`plot_it`	Logical. Whether to plot alignment.
`penalty`	The divisor used to calculate the penalty for `VPdtw`. The warping penalty is calculated by dividing the `dilation` by this number. Thus, a higher number will produce a lower penalty and be more permissive, while a lower number will produce a higher penalty and allow less warping. Defaults to 5.
`maxshift`	Integer. Maximum allowable shift for `VPdtw`. Defaults to 50.
`verbose`	Whether to print verbose output.
`show_progress`	Logical. Whether to show progress bar. Defaults to `TRUE` if `pbapply` is installed. Currently works only for `ptw` alignments.
`cl`	Argument to `pblapply` or `mclapply`. Either an integer specifying the number of clusters to use for parallel processing or a cluster object created by `makeCluster`. Defaults to 2. On Windows integer values will be ignored.
`...`	Optional arguments for the `ptw` function. The only argument that cannot be changed is `warp.type`: this is always equal to `"global"`.

Value

A list of warping models or a list of warped absorbance profiles, depending on the value of the what argument.

Note

Adapted from correctRT function in the alsace package by Ron Wehrens.

Author(s)

Ethan Bass

References

Clifford, D., Stone, G., Montoliu, I., Rezzi, S., Martin, F. P., Guy, P., Bruce, S., & Kochhar, S. 2009. Alignment using variable penalty dynamic time warping. Analytical chemistry, 81(3):1000-1007. doi:10.1021/ac802041e.
Clifford, D., & Stone, G. 2012. Variable Penalty Dynamic Time Warping Code for Aligning Mass Spectrometry Chromatograms in R. Journal of Statistical Software, 47(8):1-17. doi:10.18637/jss.v047.i08.
Eilers, P.H.C. 2004. Parametric Time Warping. Anal. Chem., 76:404-411. doi:10.1021/ac034800e.
Wehrens, R., Bloemberg, T.G., and Eilers P.H.C. 2015. Fast parametric time warping of peak lists. Bioinformatics, 31:3063-3065. doi:10.1093/bioinformatics/btv299.
Wehrens, R., Carvalho, E., Fraser, P.D. 2015. Metabolite profiling in LC–DAD using multivariate curve resolution: the alsace package for R. Metabolomics, 11:143-154. doi:10.1007/s11306-014-0683-5.

Examples


data(Sa_pr)
warping.models <- correct_rt(Sa_pr, what = "models", lambdas=c(210))
warp <- correct_rt(chrom_list = Sa_pr, models = warping.models)

data(Sa_pr)
warping.models <- correct_rt(Sa_pr, what = "models", lambdas=c(210))
warp <- correct_rt(chrom_list = Sa_pr, models = warping.models)

Filter peak lists

Description

Utility function to remove peaks from a peak list (e.g., because their intensity is too low). Currently one can filter on peak height, peak area, standard deviation, and/or retention time.

Usage

filter_peaks(peak_list, min_height, min_area, min_sd, max_sd, min_rt, max_rt)
filter_peaks(peak_list, min_height, min_area, min_sd, max_sd, min_rt, max_rt)

Arguments

`peak_list`	A peak_list object, consisting of a nested list of peak tables, where the first level is the sample, and the second level is the spectral component. Every component is described by a matrix where every row is one peak, and the columns contain information on retention time, full width at half maximum (FWHM), peak width, height, and area.
`min_height`	Minimum peak height.
`min_area`	Minimum peak area.
`min_sd`	Minimal standard deviation.
`max_sd`	Maximum standard deviation.
`min_rt`	Minimum retention time.
`max_rt`	Maximum retention time.

Value

A peak list similar to the input, with all rows removed from that do not satisfy the specified criteria.

Author(s)

Ron Wehrens, Ethan Bass

Filter peak table

Description

Utility function to remove peaks from peak table, e.g., because their intensity is too low. Currently one can filter on mean or median peak intensity, or retention time.

Usage

filter_peaktable(
  peak_table,
  rts,
  min_rt,
  max_rt,
  min_value,
  lambda,
  what = c("median", "mean", "max"),
  tol = 0
)
filter_peaktable(
  peak_table,
  rts,
  min_rt,
  max_rt,
  min_value,
  lambda,
  what = c("median", "mean", "max"),
  tol = 0
)

Arguments

`peak_table`	A peak_table object from `get_peaktable`.
`rts`	Vector of retention times to include in the peak table.
`min_rt`	Minimum retention time to include in the peak table.
`max_rt`	Maximum retention time to include in the peak table.
`min_value`	Minimal cutoff for summarized peak intensity.
`lambda`	Component(s) to include in peak table (e.g. wavelengths if you are using HPLC-DAD/UV).
`what`	Whether to summarize intensities using `mean`, `median`, or `max`.
`tol`	Tolerance for matching of retention times to `rts`.

Value

A peak table similar to the input, with all columns removed from the peak table that do not satisfy the specified criteria.

Author(s)

Ethan Bass

Examples

data(pk_tab)
pk_tab <- filter_peaktable(pk_tab, min_rt = 10, max_rt = 16)
data(pk_tab)
pk_tab <- filter_peaktable(pk_tab, min_rt = 10, max_rt = 16)

Find peaks

Description

Find peaks in chromatographic profile.

Usage

find_peaks(
  y,
  smooth_type = c("gaussian", "box", "savgol", "mva", "tmva", "none"),
  smooth_window = 0.001,
  slope_thresh = 0,
  amp_thresh = 0,
  bounds = TRUE
)
find_peaks(
  y,
  smooth_type = c("gaussian", "box", "savgol", "mva", "tmva", "none"),
  smooth_window = 0.001,
  slope_thresh = 0,
  amp_thresh = 0,
  bounds = TRUE
)

Arguments

`y`	Signal (as a numerical vector).
`smooth_type`	Type of smoothing. Either gaussian kernel (`"gaussian"`), box kernel (`"box"`), savitzky-golay smoothing (`"savgol"`), moving average (`"mva"`), triangular moving average (`"tmva"`), or no smoothing (`"none"`).
`smooth_window`	Smoothing window. Larger values of this parameter will exclude sharp, narrow features. If the supplied value is between 0 and 1, window will be interpreted as a proportion of points to include. Otherwise, the window will be the absolute number of points to include in the window. (Defaults to `.001`).
`slope_thresh`	Minimum threshold for slope of the smoothed first derivative. This parameter filters on the basis of peak width, such that larger values will exclude broad peaks from the peak list. (Defaults to `0`).
`amp_thresh`	Minimum threshold for peak amplitude. This parameter filters on the basis of peak height, such that larger values will exclude small peaks from the peak list. (Defaults to `0`).
`bounds`	Logical. If TRUE, includes peak boundaries in data.frame. (Defaults to `TRUE`).

Details

Find peaks by looking for zero-crossings in the smoothed first derivative of the signal (y) that exceed the specified slope threshold (slope_thresh). Additionally, peaks can be filtered by supplying a minimal amplitude threshold (amp_thresh), filtering out peaks below the specified height. Smoothing is intended to prevent the algorithm from getting caught up on local minima and maxima that do not represent true features. Several smoothing options are available, including "gaussian", box kernel ("box"), savitzky-golay smoothing ("savgol"), moving average ("mva"), triangular moving average ("tmva"), or no smoothing ("none").

It is recommended to do pre-processing using the preprocess function before peak detection. Overly high chromatographic resolution can sometimes cause peaks to be split into multiple segments. In this case, it is recommended to increase the smooth_window or reduce the resolution along the time axis by adjusting the dim1 argument during preprocessing.

Value

If bounds == TRUE, returns a data.frame containing the center, start, and end of each identified peak. Otherwise, returns a numeric vector of peak centers. All locations are expressed as indices.

Note

The find_peaks function is adapted from MATLAB code included in Prof. Tom O'Haver's Pragmatic Introduction to Signal Processing.

Author(s)

Ethan Bass

References

O'Haver, Tom. Pragmatic Introduction to Signal Processing: Applications in scientific measurement. https://terpconnect.umd.edu/~toh/spectrum/ (Accessed January, 2022).

Examples

data(Sa_pr)
find_peaks(Sa_pr[[1]][,"220"])
data(Sa_pr)
find_peaks(Sa_pr[[1]][,"220"])

Fit chromatographic peaks to an exponential-gaussian hybrid or gaussian profile

Description

Fit peak parameters using exponential-gaussian hybrid or gaussian function.

Usage

fit_peaks(
  x,
  lambda,
  pos = NULL,
  sd.max = 50,
  fit = c("egh", "gaussian", "raw"),
  max.iter = 1000,
  estimate_purity = TRUE,
  noise_threshold = 0.001,
  ...
)
fit_peaks(
  x,
  lambda,
  pos = NULL,
  sd.max = 50,
  fit = c("egh", "gaussian", "raw"),
  max.iter = 1000,
  estimate_purity = TRUE,
  noise_threshold = 0.001,
  ...
)

Arguments

`x`	A chromatogram in matrix format.
`lambda`	Wavelength to fit peaks at.
`pos`	Locations of peaks in vector y. If NULL, `find_peaks` will run automatically to find peak positions.
`sd.max`	Maximum width (standard deviation) for peaks. Defaults to 50.
`fit`	Function for peak fitting. (Currently exponential-gaussian hybrid `egh`, `gaussian` and `raw` settings are supported). If `raw` is selected, trapezoidal integration will be performed on raw data without fitting a peak shape. Defaults to `egh`.)
`max.iter`	Maximum number of iterations to use in nonlinear least squares peak-fitting. (Defaults to 1000).
`estimate_purity`	Logical. Whether to estimate purity or not. Defaults to TRUE.
`noise_threshold`	Noise threshold. Input to `get_purity`.
`...`	Additional arguments to `find_peaks`.

Details

Peak parameters are calculated by fitting the data to a gaussian or exponential-gaussian hybrid curve using non-linear least squares estimation as implemented in nlsLM. Area under the fitted curve is then estimated using trapezoidal approximation.

Value

The fit_peaks function returns a matrix, whose columns contain the following information about each peak:

`rt`	Location of the peak maximum.
`start`	Start of peak (only included in table if `bounds = TRUE`).
`end`	End of peak (only included in table if `bounds = TRUE`).
`sd`	The standard deviation of the peak.
`tau`	$\tau$ parameter (only included in table if `fit = "egh"`).
`FWHM`	The full width at half maximum.
`height`	Peak height.
`area`	Peak area.
`r.squared`	The R-squared value for linear fit of the model to the data.
`purity`	The spectral purity of peak as assessed by `get_purity`.

Again, the first five elements (rt, start, end, sd and FWHM) are expressed as indices, so not in terms of the real retention times. The transformation to "real" time is done in function get_peaks.

Note

The fit_peaks function is adapted from Dr. Robert Morrison's DuffyTools package as well as code published in Ron Wehrens' alsace package.

Author(s)

Ethan Bass

References

Lan, K. & Jorgenson, J. W. 2001. A hybrid of exponential and gaussian functions as a simple model of asymmetric chromatographic peaks. Journal of Chromatography A 915:1-13. doi:10.1016/S0021-9673(01)00594-5.
Naish, P. J. & Hartwell, S. 1988. Exponentially Modified Gaussian functions - A good model for chromatographic peaks in isocratic HPLC? Chromatographia, 26: 285-296. doi:10.1007/BF02268168.

Examples

data(Sa_pr)
fit_peaks(Sa_pr[[1]], lambda = 220)
data(Sa_pr)
fit_peaks(Sa_pr[[1]], lambda = 220)

Get lambdas

Description

Get wavelengths from a list of chromatograms or a peak_table object.

Usage

get_lambdas(x)
get_lambdas(x)

Arguments

`x`	List of chromatograms or `peak_table` object.

Value

Numeric vector of wavelengths.

Get peak list.

Description

Finds and fits peaks and extracts peak parameters from a list of chromatograms at the specified wavelengths.

Usage

get_peaks(
  chrom_list,
  lambdas,
  fit = c("egh", "gaussian", "raw"),
  sd.max = 50,
  max.iter = 100,
  time.units = c("min", "s", "ms"),
  estimate_purity = FALSE,
  noise_threshold = 0.001,
  show_progress = NULL,
  cl = 2,
  collapse = FALSE,
  ...
)
get_peaks(
  chrom_list,
  lambdas,
  fit = c("egh", "gaussian", "raw"),
  sd.max = 50,
  max.iter = 100,
  time.units = c("min", "s", "ms"),
  estimate_purity = FALSE,
  noise_threshold = 0.001,
  show_progress = NULL,
  cl = 2,
  collapse = FALSE,
  ...
)

Arguments

`chrom_list`	A list of profile matrices, each of the same dimensions (timepoints × wavelengths).
`lambdas`	Character vector of wavelengths to find peaks at.
`fit`	What type of fit to use. Current options are exponential-gaussian hybrid (`egh`), gaussian or raw. The `raw` setting performs trapezoidal integration directly on the raw data without fitting a peak shape.
`sd.max`	Maximum width (standard deviation) for peaks. Defaults to 50.
`max.iter`	Maximum number of iterations for non-linear least squares in `fit_peaks`.
`time.units`	Units of `sd`, `FWHM`, `area`, and `tau` (if applicable). Options are minutes (`"min"`), seconds (`"s"`), or milliseconds (`"ms"`).
`estimate_purity`	Logical. Whether to estimate purity or not. Defaults to `FALSE`. (If `TRUE`, this will slow down the function significantly).
`noise_threshold`	Noise threshold. Argument to `get_purity`.
`show_progress`	Logical. Whether to show progress bar. Defaults to `TRUE` if `pbapply` is installed.
`cl`	Argument to `pblapply` or `mclapply`. Either an integer specifying the number of clusters to use for parallel processing or a cluster object created by `makeCluster`. Defaults to 2. On Windows integer values will be ignored.
`collapse`	Logical. Whether to collapse multiple peak lists per sample into a single list when multiple wavelengths (`lambdas`) are provided.
`...`	Additional arguments to `find_peaks`. Arguments provided to `find_peaks` can be used to fine-tune the peak-finding algorithm. Most importantly, the `smooth_window` should be increased if features are being split into multiple bins. Other arguments that can be used here include `smooth_type`, `slope_thresh`, and `amp_thresh`.

Details

Peaks are located by finding zero-crossings in the smoothed first derivative of the specified chromatographic traces (function find_peaks). At the given positions, an exponential-gaussian hybrid (or regular gaussian) function is fit to the signal using fit_peaks according to the value of fit. Finally, the area is calculated using trapezoidal approximation.

Additional arguments can be provided to find_peaks to fine-tune the peak-finding algorithm. For example, the smooth_window can be increased to prevent peaks from being split into multiple features. Overly aggressive smoothing may cause small peaks to be overlooked.

The standard deviation (sd), full-width at half maximum (FWHM), tau tau, and area are returned in units determined by time.units. By default, the units are in minutes. To compare directly with 'ChemStation' integration results, the time units should be in seconds.

Value

The result is an S3 object of class peak_list, containing a nested list of data.frames containing information about the peaks fitted for each chromatogram at each of wavelengths specified by the lamdas argument. Each row in these data.frames is a peak and the columns contain information about various peak parameters:

rt: The retention time of the peak maximum.
start: The retention time where the peak is estimated to begin.
end: The retention time where the peak is estimated to end.
sd: The standard deviation of the fitted peak shape.
tau The value of parameter $\tau$ . This parameter determines peak asymmetry for peaks fit with an exponential-gaussian hybrid function. (This column will only appear if fit = egh.
FWHM: The full-width at half maximum.
height: The height of the peak.
area: The area of the peak as determined by trapezoidal approximation.
r.squared The coefficient of determination ( $R^2$ ) of the fitted model to the raw data. (Note: this value is calculated by fitting a linear model of the fitted peak values to the raw data. This approach is statistically questionable, since the models are fit using non-linear least squares. Nevertheless, it can still be useful as a rough metric for "goodness-of-fit").
purity The peak purity.

Note

The bones of this function are adapted from the getAllPeaks function authored by Ron Wehrens (though the underlying algorithms for peak identification and peak-fitting are not the same).

Author(s)

Ethan Bass

References

Lan, K. & Jorgenson, J. W. 2001. A hybrid of exponential and gaussian functions as a simple model of asymmetric chromatographic peaks. Journal of Chromatography A 915:1-13. doi:10.1016/S0021-9673(01)00594-5.
Naish, P. J. & Hartwell, S. 1988. Exponentially Modified Gaussian functions - A good model for chromatographic peaks in isocratic HPLC? Chromatographia, 26: 285-296. doi:10.1007/BF02268168.
O'Haver, Tom. Pragmatic Introduction to Signal Processing: Applications in scientific measurement. https://terpconnect.umd.edu/~toh/spectrum/ (Accessed January, 2022).
Wehrens, R., Carvalho, E., Fraser, P.D. 2015. Metabolite profiling in LC–DAD using multivariate curve resolution: the alsace package for R. Metabolomics 11:143-154. doi:10.1007/s11306-014-0683-5.

Examples


data(Sa_pr)
pks <- get_peaks(Sa_pr, lambdas = c('210'), sd.max=50, fit="egh")

data(Sa_pr)
pks <- get_peaks(Sa_pr, lambdas = c('210'), sd.max=50, fit="egh")

Convert peak list into an ordered peak table.

Description

Returns a peak_table object. The first slot contains a matrix of intensities, where rows correspond to samples and columns correspond to aligned features. The rest of the slots contain various meta-data about peaks, samples, and experimental settings.

Usage

get_peaktable(
  peak_list,
  chrom_list,
  response = c("area", "height"),
  use.cor = NULL,
  hmax = 0.2,
  plot_it = FALSE,
  ask = plot_it,
  clust = c("rt", "sp.rt"),
  sigma.t = NULL,
  sigma.r = 0.5,
  deepSplit = FALSE,
  verbose = FALSE,
  out = c("data.frame", "matrix")
)
get_peaktable(
  peak_list,
  chrom_list,
  response = c("area", "height"),
  use.cor = NULL,
  hmax = 0.2,
  plot_it = FALSE,
  ask = plot_it,
  clust = c("rt", "sp.rt"),
  sigma.t = NULL,
  sigma.r = 0.5,
  deepSplit = FALSE,
  verbose = FALSE,
  out = c("data.frame", "matrix")
)

Arguments

`peak_list`	A `peak_list` object created by `get_peaks`, containing a nested list of peak tables: the first level is the sample, and the second level is the spectral wavelength. Every component is described by a `data.frame` with a row for each peak and columns containing information on various peak parameters.
`chrom_list`	A list of chromatographic matrices.
`response`	Indicates whether peak area or peak height is to be used as intensity measure. Defaults to `area` setting.
`use.cor`	Logical. Indicates whether to use corrected retention times (`rt.cor` column) or raw retention times (`rt` column). Unless otherwise specified, the `rt.cor` column will be used by default if it exists in the provided `peak_list`.
`hmax`	Height at which the complete linkage dendrogram will be cut. Can be interpreted as the maximal intercluster retention time difference.
`plot_it`	Logical. If `TRUE`, for every component a strip plot will be shown indicating the clustering.
`ask`	Logical. Ask before showing new plot? Defaults to `TRUE`.
`clust`	Specify whether to perform hierarchical clustering based on spectral similarity and retention time (`sp.rt`) or retention time alone (`rt`). Defaults to `rt`. The `sp.rt` option is experimental and should be used with caution.
`sigma.t`	Width of gaussian in retention time distance function. Controls weight given to retention time if `sp.rt` is selected.
`sigma.r`	Width of gaussian in spectral similarity function. Controls weight given to spectral correlation if `sp.rt` is selected.
`deepSplit`	Logical. Controls sensitivity to cluster splitting. If `TRUE`, function will return more smaller clusters. See documentation for `cutreeDynamic` for additional information.
`verbose`	Logical. Whether to print warning when combining peaks into single time window. Defaults to `FALSE`.
`out`	Specify `data.frame` or `matrix` as output. Defaults to `data.frame`.

Details

The function performs a complete linkage clustering of retention times across all samples, and cuts at a height given by the user (which can be understood as the maximal inter-cluster retention time difference) in the simple case based on retention times. Clustering can also incorporate information about spectral similarity using a distance function adapted from Broeckling et al., 2014:

$e^{-\frac{(1-c_{ij})^2}{2\sigma_r^2}} \cdot e^{-\frac{(1-(t_i-t_j)^2)}{2\sigma_t^2}}$

If two peaks from the same sample are assigned to the same cluster, a warning message is printed to the console. These warnings can usually be ignored, but one could also consider reducing the hmax variable. However, this may lead to splitting of peaks across multiple clusters. Another option is to filter the peaks by intensity to remove small features.

Value

The function returns an S3 peak_table object, containing the following elements:

tab: the peak table itself – a data-frame of intensities in a sample x peak configuration.
pk_meta: A data.frame containing peak meta-data (e.g., the spectral component, peak number, and average retention time).
sample_meta: A data.frame of sample meta-data. Must be added using attach_metadata.
ref_spectra: A data.frame of reference spectra (in a wavelength x peak configuration). Must be added using attach_ref_spectra.
args: A vector of arguments given to get_peaktable to generate the peak table.

Note

This function is adapted from getPeakTable function in the alsace package by Ron Wehrens.

Author(s)

Ethan Bass

References

Broeckling, C. D., Afsar F.A., Neumann S., Ben-Hur A., and Prenni J.E. 2014. RAMClust: A Novel Feature Clustering Method Enables Spectral-Matching-Based Annotation for Metabolomics Data. Anal. Chem. 86:6812-6817. doi:10.1021/ac501530d.
Wehrens, R., Carvalho, E., Fraser, P.D. 2015. Metabolite profiling in LC–DAD using multivariate curve resolution: the alsace package for R. Metabolomics 11:143-154. doi:10.1007/s11306-014-0683-5.

Examples


data(Sa_pr)
pks <- get_peaks(Sa_pr, lambdas = c('210'))
get_peaktable(pks, response = "area")

data(Sa_pr)
pks <- get_peaks(Sa_pr, lambdas = c('210'))
get_peaktable(pks, response = "area")

Get retention times

Description

Get retention times from a list of chromatograms or a peak_table object.

Usage

get_times(x, idx = 1)
get_times(x, idx = 1)

Arguments

`x`	List of chromatograms or `peak_table` object.
`idx`	Index of chromatogram from which to extract times

Value

Numeric vector of retention times from the chromatogram specified by idx.

Merge split peaks

Description

Utility function to combine split peaks into a single column of the peak table.

Usage

merge_peaks(peak_table, peaks, method = c("max", "sum"))
merge_peaks(peak_table, peaks, method = c("max", "sum"))

Arguments

`peak_table`	Peak table from `get_peaktable`.
`peaks`	A vector specifying the names or indices of peaks to be merged.
`method`	Method to merge peaks. Either `max` to select the largest peak from each sample or `sum` to sum the peaks together.

Details

Merges the specified peaks in peak table, by selecting the largest value from each column if method is "max". If method is "sum", merges peak by summing their values.

Value

A peak table similar to the input peak table, but where the specified columns are combined.

Author(s)

Ethan Bass

Examples

data(pk_tab)
pk_tab <- merge_peaks(peak_table = pk_tab, peaks=c("V10","V11"))
data(pk_tab)
pk_tab <- merge_peaks(peak_table = pk_tab, peaks=c("V10","V11"))

Make mirror plot from peak table.

Description

Plots chromatograms as a mirror plot.

Usage

mirror_plot(
  x,
  chrom_list,
  lambdas = NULL,
  var,
  subset = NULL,
  print_legend = TRUE,
  legend_txt = NULL,
  legend_pos = "topright",
  legend_size = 1,
  mirror = TRUE,
  xlim = NULL,
  ylim = NULL,
  ...
)
mirror_plot(
  x,
  chrom_list,
  lambdas = NULL,
  var,
  subset = NULL,
  print_legend = TRUE,
  legend_txt = NULL,
  legend_pos = "topright",
  legend_size = 1,
  mirror = TRUE,
  xlim = NULL,
  ylim = NULL,
  ...
)

Arguments

`x`	The peak table (output from `get_peaktable` function).
`chrom_list`	A list of chromatograms in matrix format (timepoints x wavelengths). If no argument is provided here, the function will try to find the `chrom_list` object used to create the `peak_table`.
`lambdas`	The wavelength you wish to plot the traces at.
`var`	Variable to index chromatograms.
`subset`	Character vector specifying levels to use (if more than 2 levels are present in `var`).
`print_legend`	Logical. Whether to print legend. Defaults to `TRUE`.
`legend_txt`	Character vector containing labels for legend.
`legend_pos`	Legend position.
`legend_size`	Legend size (`cex` argument). Default is 1.
`mirror`	Logical. Whether to plot as mirror or stacked plots. Defaults to `TRUE`.
`xlim`	Numerical vector specifying limits for x axis.
`ylim`	Numerical vector specifying limits for y axis.
`...`	Additional arguments to `matplot` function.

Details

Can be used to confirm the identity of a peak or check that a particular column in the peak table represents a single compound. Can also be used to create simple box-plots to examine the distribution of a peak with respect to variables defined in sample metadata.

Value

No return value, called for side effects.

Side effects

If mirror_plot is TRUE, plots a mirror plot comparing two treatments defined by var and subset (if more than two factors are present in var). Otherwise, if mirror_plot is FALSE, the treatments are plotted in two separate panes.

Author(s)

Ethan Bass

Examples

data(Sa_warp)
data(pk_tab)
path <- system.file("extdata", "Sa_metadata.csv", package = "chromatographR")
meta <- read.csv(path)
pk_tab <- attach_metadata(peak_table = pk_tab, metadata = meta, column="vial")
mirror_plot(pk_tab,lambdas = c("210","260"), var = "trt", mirror = TRUE, 
  col = c("green","blue"))
data(Sa_warp)
data(pk_tab)
path <- system.file("extdata", "Sa_metadata.csv", package = "chromatographR")
meta <- read.csv(path)
pk_tab <- attach_metadata(peak_table = pk_tab, metadata = meta, column="vial")
mirror_plot(pk_tab,lambdas = c("210","260"), var = "trt", mirror = TRUE, 
  col = c("green","blue"))

Normalize peak table or chromatograms

Description

Normalizes peak table or list of chromatograms by specified column in sample metadata. Metadata must first be attached to peak_table using attach_metadata.

Usage

normalize_data(
  peak_table,
  column,
  chrom_list,
  what = c("peak_table", "chrom_list"),
  by = c("meta", "peak")
)
normalize_data(
  peak_table,
  column,
  chrom_list,
  what = c("peak_table", "chrom_list"),
  by = c("meta", "peak")
)

Arguments

`peak_table`	A 'peak_table' object
`column`	The name of the column containing the weights.
`chrom_list`	List of chromatograms for normalization. The samples must be in same order as the peak_table. If no argument is provided here, the function will try to find the `chrom_list` object used to create the provided `peak_table`.
`what`	'peak_table' or list of chromatograms ('chrom_list').
`by`	Whether to normalize by a column in sample metadata (`meta`) or by a column in the peak table itself (`peak`).

Value

A peak_table object where the peaks are normalized by the mass of each sample.

Author(s)

Ethan Bass

Examples

data(pk_tab)
path <- system.file("extdata", "Sa_metadata.csv", package = "chromatographR")
meta <- read.csv(path)
pk_tab <- attach_metadata(peak_table = pk_tab, metadata = meta, column="vial")
norm <- normalize_data(pk_tab, "mass", what = "peak_table")
data(pk_tab)
path <- system.file("extdata", "Sa_metadata.csv", package = "chromatographR")
meta <- read.csv(path)
pk_tab <- attach_metadata(peak_table = pk_tab, metadata = meta, column="vial")
norm <- normalize_data(pk_tab, "mass", what = "peak_table")

Goldenrod peak table

Description

Peak table generated from exemplary goldenrod root extracts.

Usage

data(pk_tab)
data(pk_tab)

Format

A peak_table object.

Plot all spectra for chosen peak.

Description

Plot multiple for a given peak in peak table. Wrapper for plot_spectrum.

Usage

plot_all_spectra(
  peak,
  peak_table,
  chrom_list,
  idx = "all",
  chrs = NULL,
  engine = c("base", "ggplot2", "plotly"),
  plot_spectrum = TRUE,
  export_spectrum = TRUE,
  scale_spectrum = TRUE,
  overlapping = TRUE,
  verbose = FALSE,
  what = c("peak", "rt", "idx"),
  ...
)
plot_all_spectra(
  peak,
  peak_table,
  chrom_list,
  idx = "all",
  chrs = NULL,
  engine = c("base", "ggplot2", "plotly"),
  plot_spectrum = TRUE,
  export_spectrum = TRUE,
  scale_spectrum = TRUE,
  overlapping = TRUE,
  verbose = FALSE,
  what = c("peak", "rt", "idx"),
  ...
)

Arguments

`peak`	The name of a peak to plot (in character format)
`peak_table`	The peak table (output from `get_peaktable` function)
`chrom_list`	A list of chromatograms in matrix format (timepoints x components). If no argument is provided here, the function will try to find the `chrom_list` object used to create the provided `peak_table`.
`idx`	Vector of chromatograms to plot.
`chrs`	Deprecated. Please use `idx` instead.
`engine`	Which plotting engine to use: `base`, `ggplot2`, or `plotly`.
`plot_spectrum`	Logical. If TRUE, plots the spectrum of the chosen peak.
`export_spectrum`	Logical. If TRUE, exports spectrum to console. Defaults to FALSE.
`scale_spectrum`	Logical. If TRUE, scales spectrum to unit height.
`overlapping`	Logical. If TRUE, plot spectra in single plot.
`verbose`	Logical. If TRUE, prints verbose output to console.
`what`	What to look for. Either `peak` to extract spectral information for a certain peak, `rt` to scan by retention time, or `idx` to scan by numeric index. Defaults to "peak" mode.
`...`	Additional arguments to plot_spectrum.

Value

If export_spectrum is TRUE, returns the spectra as a data.frame with wavelengths as rows and one column for each sample in the chrom_list encoding the absorbance (or normalized absorbance, if scale_spectrum is TRUE) at each wavelength. Otherwise, there is no return value.

Side effects

If plot_spectrum is TRUE, plots the spectra for the specified chromatogram (idx) of the given peak. The spectrum is a single row from the chromatographic matrix.

Author(s)

Ethan Bass

Examples


data(Sa_warp)
pks <- get_peaks(Sa_warp, lambda="220")
pk_tab <- get_peaktable(pks)
plot_all_spectra(peak="V13", peak_table = pk_tab, overlapping=TRUE)

data(Sa_warp)
pks <- get_peaks(Sa_warp, lambda="220")
pk_tab <- get_peaktable(pks)
plot_all_spectra(peak="V13", peak_table = pk_tab, overlapping=TRUE)

Plot traces from list of chromatograms.

Description

Plots the specified traces from a list of chromatograms.

Usage

plot_chroms(
  x,
  lambdas,
  idx,
  xlim,
  ylim,
  xlab = "",
  ylab = "Absorbance",
  engine = c("base", "ggplot", "plotly"),
  linewidth = 1,
  show_legend = TRUE,
  legend_position = "topright",
  ...
)
plot_chroms(
  x,
  lambdas,
  idx,
  xlim,
  ylim,
  xlab = "",
  ylab = "Absorbance",
  engine = c("base", "ggplot", "plotly"),
  linewidth = 1,
  show_legend = TRUE,
  legend_position = "topright",
  ...
)

Arguments

`x`	A list of chromatograms in matrix format (timepoints x wavelengths).
`lambdas`	The wavelength(s) you wish to plot the trace at.
`idx`	A vector representing the names or numerical indices of the chromatograms to plot.
`xlim`	Range of x axis.
`ylim`	Range of y axis.
`xlab`	X label.
`ylab`	Y label. Defaults to "Absorbance".
`engine`	Plotting engine. Either `base` (`matplot`), `plotly`, or `ggplot2-package`.
`linewidth`	Line width.
`show_legend`	Logical. Whether to display legend or not. Defaults to TRUE.
`legend_position`	Position of legend.
`...`	Additional arguments to plotting function specified by `engine`.

Value

No return value, called for side effects.

Side effects

Plots the traces of the specified chromatograms idx at the specified wavelengths lambdas. Plots can be produced using base graphics, ggplot2, or plotly, according to the value of engine.

Author(s)

Ethan Bass

Examples

data(Sa_pr)
plot_chroms(Sa_pr, idx = c(1:2), lambdas = c(210))
data(Sa_pr)
plot_chroms(Sa_pr, idx = c(1:2), lambdas = c(210))

Plot spectrum from peak table

Description

Plots the trace and/or spectrum for a given peak in peak.table object, or plots the spectrum a particular retention time for a given chromatogram.

Usage

plot_spectrum(
  loc = NULL,
  peak_table,
  chrom_list,
  idx = "max",
  lambda = "max",
  plot_spectrum = TRUE,
  plot_trace = TRUE,
  spectrum_labels = TRUE,
  scale_spectrum = FALSE,
  export_spectrum = FALSE,
  verbose = TRUE,
  what = c("peak", "rt", "idx", "click"),
  engine = c("base", "plotly", "ggplot2"),
  chr = NULL,
  ...
)
plot_spectrum(
  loc = NULL,
  peak_table,
  chrom_list,
  idx = "max",
  lambda = "max",
  plot_spectrum = TRUE,
  plot_trace = TRUE,
  spectrum_labels = TRUE,
  scale_spectrum = FALSE,
  export_spectrum = FALSE,
  verbose = TRUE,
  what = c("peak", "rt", "idx", "click"),
  engine = c("base", "plotly", "ggplot2"),
  chr = NULL,
  ...
)

Arguments

`loc`	The name of the peak or retention time for which you wish to extract spectral data.
`peak_table`	The peak table (output from `get_peaktable`).
`chrom_list`	A list of chromatograms in matrix format (timepoints x wavelengths). If no argument is provided here, the function will try to find the `chrom_list` object used to create the provided `peak_table`.
`idx`	Numerical index of chromatogram you wish to plot, or "max" to automatically plot the chromatogram with the largest signal.
`lambda`	The wavelength you wish to plot the trace at if plot_trace == TRUE and/or the wavelength to be used for the determination of signal abundance.
`plot_spectrum`	Logical. If `TRUE`, plots the spectrum of the chosen peak. Defaults to `TRUE`.
`plot_trace`	Logical. If `TRUE`, plots the trace of the chosen peak at lambda. Defaults to `TRUE`.
`spectrum_labels`	Logical. If `TRUE`, plots labels on maxima in spectral plot. Defaults to `TRUE`.
`scale_spectrum`	Logical. If `TRUE`, scales spectrum to unit height. Defaults to `FALSE`.
`export_spectrum`	Logical. If `TRUE`, exports spectrum to console. Defaults to `FALSE`.
`verbose`	Logical. If `TRUE`, prints verbose output to console. Defaults to `TRUE`.
`what`	What to look for. Either `peak` to extract spectral information for a certain peak, `rt` to scan by retention time, `idx` to scan by numeric index, or `click` to manually select retention time by clicking on the chromatogram. Defaults to "peak" mode.
`engine`	Which plotting engine to use: `base`, `ggplot2`, or `plotly`.
`chr`	Deprecated. Please use `idx` instead.
`...`	Additional arguments.

Details

Can be used to confirm the identity of a peak or check that a particular column in the peak table represents a single compound. Retention times can also be selected by clicking on the plotted trace if what == 'click'.

Value

If export_spectrum is TRUE, returns the spectrum as a data.frame with wavelengths as rows and a single column encoding the absorbance (or normalized absorbance, if scale_spectrum is TRUE) at each wavelength. If export_spectrum is FALSE, the output depends on the plotting engine. If engine == "plotly", returns a plotly object containing the specified plots. Otherwise, if engine == "base", there is no return value.

Side effects

If plot_trace is TRUE, plots the chromatographic trace of the specified chromatogram (idx), at the specified wavelength (lambda) with a dotted red line to indicate the retention time given by loc. The trace is a single column from the chromatographic matrix.
If plot_spectrum is TRUE, plots the spectrum for the specified chromatogram at the specified retention time. The spectrum is a single row from the chromatographic matrix.

Author(s)

Ethan Bass

Examples


data(Sa)
pks <- get_peaks(Sa, lambda = "220.00000")
pk_tab <- get_peaktable(pks)
oldpar <- par(no.readonly = TRUE)
par(mfrow = c(2, 1))
plot_spectrum(loc = "V10", peak_table = pk_tab, what = "peak")
par(oldpar)

data(Sa)
pks <- get_peaks(Sa, lambda = "220.00000")
pk_tab <- get_peaktable(pks)
oldpar <- par(no.readonly = TRUE)
par(mfrow = c(2, 1))
plot_spectrum(loc = "V10", peak_table = pk_tab, what = "peak")
par(oldpar)

Plot fitted peak shapes.

Description

Visually assess integration accuracy by plotting fitted peaks over trace.

Usage

## S3 method for class 'peak_list'
plot(
  x,
  ...,
  chrom_list,
  idx = 1,
  lambda = NULL,
  points = FALSE,
  ticks = FALSE,
  a = 0.5,
  color = NULL,
  cex.points = 0.5,
  numbers = FALSE,
  cex.font = 0.5,
  y.offset = 25,
  plot_purity = FALSE,
  res,
  index = NULL
)
## S3 method for class 'peak_list'
plot(
  x,
  ...,
  chrom_list,
  idx = 1,
  lambda = NULL,
  points = FALSE,
  ticks = FALSE,
  a = 0.5,
  color = NULL,
  cex.points = 0.5,
  numbers = FALSE,
  cex.font = 0.5,
  y.offset = 25,
  plot_purity = FALSE,
  res,
  index = NULL
)

Arguments

`x`	A `peak_list` object. Output from the `get_peaks` function.
`...`	Additional arguments to main plot function.
`chrom_list`	List of chromatograms (retention time x wavelength matrices)
`idx`	Index or name of chromatogram to be plotted.
`lambda`	Wavelength for plotting.
`points`	Logical. If TRUE, plot peak maxima. Defaults to FALSE.
`ticks`	Logical. If TRUE, mark beginning and end of each peak. Defaults to FALSE.
`a`	Alpha parameter controlling the transparency of fitted shapes.
`color`	The color of the fitted shapes.
`cex.points`	Size of points. Defaults to 0.5
`numbers`	Whether to number peaks. Defaults to FALSE.
`cex.font`	Font size if peaks are numbered. Defaults to 0.5.
`y.offset`	Y offset for peak numbers. Defaults to 25.
`plot_purity`	Whether to add visualization of peak purity.
`res`	time resolution for peak fitting
`index`	This argument is deprecated. Please use `idx` instead.

Value

No return value, called for side effects.

Side effects

Plots a chromatographic trace from the specified chromatogram (chr) at the specified wavelength (lambda) with fitted peak shapes from the provided peak_list drawn underneath the curve.

Author(s)

Ethan Bass

Plot spectrum from peak table

Description

Plots the trace and/or spectrum for a given peak in peak table.

Usage

## S3 method for class 'peak_table'
plot(
  x,
  loc,
  chrom_list,
  what = "peak",
  idx = "max",
  lambda = "max",
  plot_spectrum = TRUE,
  plot_trace = TRUE,
  box_plot = FALSE,
  vars = NULL,
  spectrum_labels = TRUE,
  scale_spectrum = FALSE,
  export_spectrum = FALSE,
  verbose = TRUE,
  engine = c("base", "plotly", "ggplot"),
  chr = NULL,
  ...
)
## S3 method for class 'peak_table'
plot(
  x,
  loc,
  chrom_list,
  what = "peak",
  idx = "max",
  lambda = "max",
  plot_spectrum = TRUE,
  plot_trace = TRUE,
  box_plot = FALSE,
  vars = NULL,
  spectrum_labels = TRUE,
  scale_spectrum = FALSE,
  export_spectrum = FALSE,
  verbose = TRUE,
  engine = c("base", "plotly", "ggplot"),
  chr = NULL,
  ...
)

Arguments

`x`	The peak table (output from `get_peaktable` function).
`loc`	A vector specifying the peak(s) or retention time(s) that you wish to plot.
`chrom_list`	A list of chromatograms in matrix format (timepoints x wavelengths). If no argument is provided here, the function will try to find the `chrom_list` object used to create the `peak_table`.
`what`	What to look for. Either `peak` to extract spectral information for a certain peak, `rt` to scan by retention time, or `click` to manually select retention time by clicking on the chromatogram. Defaults to `peak`.
`idx`	Numerical index of chromatogram you wish to plot; "max" to plot the chromatogram with the largest signal; or "all" to plot spectra for all chromatograms.
`lambda`	The wavelength you wish to plot the trace at (if `plot_chrom` is TRUE and/or the wavelength to be used for the determination of signal abundance.
`plot_spectrum`	Logical. If TRUE, plots the spectrum of the chosen peak. Defaults to TRUE.
`plot_trace`	Logical. If TRUE, plots the trace of the chosen peak at lambda. Defaults to TRUE.
`box_plot`	Logical. If TRUE, plots box plot using categories defined by `vars`.
`vars`	Independent variables for boxplot. Righthand side of formula.
`spectrum_labels`	Logical. If TRUE, plots labels on maxima in spectral plot. Defaults to TRUE.
`scale_spectrum`	Logical. If TRUE, scales spectrum to unit height. Defaults to FALSE.
`export_spectrum`	Logical. If TRUE, exports spectrum to console. Defaults to FALSE.
`verbose`	Logical. If TRUE, prints verbose output to console. Defaults to TRUE.
`engine`	Which plotting engine to use: either `base` or `plotly`.
`chr`	Deprecated. Please use `idx` instead.
`...`	Additional arguments to `boxplot`.

Details

Value

If export_spectrum is TRUE, returns the spectrum as a data.frame with wavelengths as rows and columns encoding the absorbance (or normalized absorbance, if scale_spectrum is TRUE) for the specified sample(s). Otherwise, there is no return value.

Side effects

If plot_trace is TRUE, plots the chromatographic trace of the specified chromatogram (idx), at the specified wavelength (lambda) with a dotted red line to indicate the retention time given by loc. The trace is a single column from the chromatographic matrix.

If plot_spectrum is TRUE, plots the spectrum for the specified chromatogram at the specified retention time. The spectrum is a single row from the chromatographic matrix.

If box_plot is TRUE, produces a boxplot from the specified peak with groups provided by vars.

Author(s)

Ethan Bass

Plot PTW alignments

Description

Plot PTW alignments

Usage

## S3 method for class 'ptw_list'
plot(x, lambdas, legend = TRUE, ...)
## S3 method for class 'ptw_list'
plot(x, lambdas, legend = TRUE, ...)

Arguments

`x`	A `ptw_list` object created by `correct_rt`.
`lambdas`	Which lambdas to plot.
`legend`	Logical. Whether to label the plots.
`...`	Additional arguments to `matplot`.

Author(s)

Ethan Bass

Preprocess time/wavelength data

Description

Standard pre-processing of response matrices, consisting of a time axis and a spectral axis (e.g. HPLC-DAD/UV data). For smooth data, like UV-VIS data, the size of the matrix can be reduced by interpolation. By default, the data are baseline-corrected in the time direction (baseline.corr) and smoothed in the spectral dimension using cubic smoothing splines (smooth.spline.

Usage

preprocess(
  X,
  dim1,
  dim2,
  remove.time.baseline = TRUE,
  spec.smooth = TRUE,
  maxI = NULL,
  parallel = NULL,
  interpolate_rows = TRUE,
  interpolate_cols = TRUE,
  mc.cores,
  cl = 2,
  show_progress = NULL,
  ...
)
preprocess(
  X,
  dim1,
  dim2,
  remove.time.baseline = TRUE,
  spec.smooth = TRUE,
  maxI = NULL,
  parallel = NULL,
  interpolate_rows = TRUE,
  interpolate_cols = TRUE,
  mc.cores,
  cl = 2,
  show_progress = NULL,
  ...
)

Arguments

`X`	A numerical data matrix, or list of data matrices. Missing values are not allowed. If rownames or colnames attributes are used, they should be numerical and signify time points and wavelengths, respectively.
`dim1`	A new, usually shorter, set of time points (numerical). The range of these should not exceed the range of the original time points.
`dim2`	A new, usually shorter, set of wavelengths (numerical). The range of these should not exceed the range of the original wavelengths.
`remove.time.baseline`	Logical, indicating whether baseline correction should be done in the time direction, according to `baseline.corr`. Default is TRUE.
`spec.smooth`	Logical, indicating whether smoothing should be done in the spectral direction, according to `smooth.spline`. Default is TRUE.
`maxI`	if given, the maximum intensity in the matrix is set to this value.
`parallel`	Logical, indicating whether to use parallel processing. Defaults to TRUE (unless you're on Windows).
`interpolate_rows`	Logical. Whether to interpolate along the time axis (`dim1`). Defaults to TRUE.
`interpolate_cols`	Logical. Whether to interpolate along the spectral axis (`dim2`). Defaults to TRUE.
`mc.cores`	How many cores to use for parallel processing. Defaults to 2. This argument has been deprecated and replaces with `cl`.
`cl`	Argument to `pblapply` or `mclapply`. Either an integer specifying the number of clusters to use for parallel processing or a cluster object created by `makeCluster`. Defaults to 2. On Windows integer values will be ignored.
`show_progress`	Logical. Whether to show progress bar. Defaults to `TRUE` if `pbapply` is installed.
`...`	Further optional arguments to `baseline.corr`.

Value

The function returns the preprocessed data matrix (or list of matrices), with row names and column names indicating the time points and wavelengths, respectively.

Note

Adapted from the preprocess function in the alsace package by Ron Wehrens.

Author(s)

Ethan Bass

References

Wehrens, R., Bloemberg, T.G., and Eilers P.H.C. 2015. Fast parametric time warping of peak lists. Bioinformatics 31:3063-3065. doi:10.1093/bioinformatics/btv299.
Wehrens, R., Carvalho, E., Fraser, P.D. 2015. Metabolite profiling in LC–DAD using multivariate curve resolution: the alsace package for R. Metabolomics 11:1:143-154. doi:10.1007/s11306-014-0683-5.

Examples


data(Sa)
new.ts <- seq(10,18.66,by=.01) # choose time-points
new.lambdas <- seq(200, 318, by = 2) # choose wavelengths
Sa_pr <- preprocess(Sa[[1]], dim1 = new.ts, dim2 = new.lambdas)

data(Sa)
new.ts <- seq(10,18.66,by=.01) # choose time-points
new.lambdas <- seq(200, 318, by = 2) # choose wavelengths
Sa_pr <- preprocess(Sa[[1]], dim1 = new.ts, dim2 = new.lambdas)

Reshape chromatograms Reshapes a list of chromatograms from wide to long format.

Description

Reshape chromatograms Reshapes a list of chromatograms from wide to long format.

Usage

reshape_chroms(x, idx, sample_var = "sample", lambdas = NULL, rts = NULL)
reshape_chroms(x, idx, sample_var = "sample", lambdas = NULL, rts = NULL)

Arguments

`x`	A list of chromatographic matrices in wide format.
`idx`	Indices of chromatograms to convert
`sample_var`	String with name of new column containing sample IDs.
`lambdas`	Vector specifying wavelength(s) to include.
`rts`	Vector specifying retention times to include.

Value

A list of chromatographic matrices in long format.

Author(s)

Ethan Bass

Reshapes peak table from wide to long format

Description

Reshapes peak table from wide to long format

Usage

reshape_peaktable(x, peaks, metadata, fixed_levels = TRUE)
reshape_peaktable(x, peaks, metadata, fixed_levels = TRUE)

Arguments

`x`	A `peak_table` object.
`peaks`	A character vector specifying the peaks to include. If the character vector is named, the names of the vector elements will be used in place of the original peak names.
`metadata`	A character vector specifying the metadata fields to include.
`fixed_levels`	Logical. Whether to fix factor levels of features in the order provided. Defaults to `TRUE`.

Value

A data.frame containing the information for the specified peaks in long format.

Author(s)

Ethan Bass

Raw goldenrod root chromatograms

Description

A list of four HPLC-DAD data matrices of Solidago altissima roots extracted in 90% methanol. Retention times are stored in rows and wavelengths are stored in columns.

Usage

data(Sa)
data(Sa)

Format

A list of four matrices (1301 times x 60 wavelengths).

Preprocessed goldenrod root chromatograms

Description

A list of four pre-processed HPLC-DAD chromatograms derived from the raw data stored in Sa. Retention times are stored in rows and wavelengths are stored in columns. The time axis is compressed to save space and processing time so the data are a little choppy.

Usage

data(Sa_pr)
data(Sa_pr)

Format

A list of four pre-processed matrices (434 retention times x 60 wavelengths).

Warped goldenrod root chromatograms.

Description

A list of four pre-processed and warped goldenrod root chromatograms derived from the raw data stored in Sa.

Usage

data(Sa_warp)
data(Sa_warp)

Format

A list of four pre-processed and warped matrices (434 times x 60 wavelengths).

Plot spectra by clicking on the chromatogram

Description

Plot spectra by clicking on the chromatogram

Usage

scan_chrom(
  chrom_list,
  idx,
  lambda,
  plot_spectrum = TRUE,
  peak_table = NULL,
  scale_spectrum = FALSE,
  spectrum_labels = TRUE,
  export_spectrum = FALSE,
  chr = NULL,
  ...
)
scan_chrom(
  chrom_list,
  idx,
  lambda,
  plot_spectrum = TRUE,
  peak_table = NULL,
  scale_spectrum = FALSE,
  spectrum_labels = TRUE,
  export_spectrum = FALSE,
  chr = NULL,
  ...
)

Arguments

`chrom_list`	A list of chromatograms in matrix format (timepoints x wavelengths). If no argument is provided here, the function will try to find the `chrom_list` object used to create the provided `peak_table`.
`idx`	Numerical index of chromatogram you wish to plot.
`lambda`	The wavelength to plot the trace at.
`plot_spectrum`	Logical. Whether to plot the spectrum or not.
`peak_table`	The peak table (output from `get_peaktable` function).
`scale_spectrum`	Logical. If TRUE, scales spectrum to unit height. Defaults to FALSE.
`spectrum_labels`	Logical. If TRUE, plots labels on maxima in spectral plot. Defaults to TRUE.
`export_spectrum`	Logical. If TRUE, exports spectrum to console. Defaults to FALSE.
`chr`	Deprecated. Please use `idx` instead.
`...`	Additional arguments.

Value

Side effects

Plots a chromatographic trace from the specified chromatogram (idx), at the specified wavelength (lambda) with a dotted red line to indicate the user-selected retention time. The trace is a single column from the chromatographic matrix.

If plot_spectrum is TRUE, plots the spectrum for the specified chromatogram at the user-specified retention time. The spectrum is a single row from the chromatographic matrix.

Author(s)

Ethan Bass

Examples


data(Sa_pr)
scan_chrom(Sa_pr, lambda = "210", idx = 2, export_spectrum = TRUE)

data(Sa_pr)
scan_chrom(Sa_pr, lambda = "210", idx = 2, export_spectrum = TRUE)

Subset peak table Return subset of `peak_table` object.

Description

Subset peak table Return subset of peak_table object.

Usage

## S3 method for class 'peak_table'
subset(x, subset, select, drop = FALSE)
## S3 method for class 'peak_table'
subset(x, subset, select, drop = FALSE)

Arguments

`x`	A `peak_table` object
`subset`	Logical expression indicating rows (samples) to keep from `peak_table`; missing values are taken as false.
`select`	Logical expression indicating columns (peaks) to select from `peak_table`.
`drop`	Logical. Passed to indexing operator.

Value

A peak_table object with samples specified by subset and peaks specified by select.

Author(s)

Ethan Bass

Export peak table

Description

Export peak table in csv or xlsx format according to the value of format.

Usage

write_peaktable(
  peak_table,
  path,
  filename = "peak_table",
  format = c("csv", "xlsx"),
  what = c("tab", "pk_meta", "sample_meta", "ref_spectra", "args")
)
write_peaktable(
  peak_table,
  path,
  filename = "peak_table",
  format = c("csv", "xlsx"),
  what = c("tab", "pk_meta", "sample_meta", "ref_spectra", "args")
)

Arguments

`peak_table`	Peak table object from `get_peaktable`.
`path`	Path to write file.
`filename`	File name. Defaults to "peak_table".
`format`	File format to export. Either `csv` or `xlsx`.
`what`	Which elements of the `peak_table` to export.

Value

No return value. The function is called for its side effects.

Side effects

Exports peak_table object as .csv or .xlsx file according to the value of format.

Examples


data(pk_tab)
path_out = tempdir()
write_peaktable(pk_tab, path = path_out, what = c("tab"))

data(pk_tab)
path_out = tempdir()
write_peaktable(pk_tab, path = path_out, what = c("tab"))

Package 'chromatographR'

Help Index

chromatographR

Description

Details

Author(s)

See Also

Attach experimental metadata

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

Attach reference spectra

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

Make boxplot from peak table.

Description

Usage

Arguments

Examples

Cluster spectra

Description

Usage

Arguments

Details

Value

Note

Author(s)

References

Examples

Combine peaks

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

Correct peak positions according to a PTW warping model

Description

Usage

Arguments

Details

Value

Note

Author(s)

See Also

Correct retention time

Description

Usage

Arguments

Value

Note

Author(s)

References

See Also

Examples

Filter peak lists

Description

Usage

Arguments

Value

Author(s)

See Also

Filter peak table

Description

Usage

Arguments

Value

Author(s)

See Also

Examples