Package 'chromatographR'

Title: Chromatographic Data Analysis Toolset
Description: Tools for high-throughput analysis of HPLC-DAD/UV chromatograms (or similar data). Includes functions for preprocessing, alignment, peak-finding and fitting, peak-table construction, data-visualization, etc. Preprocessing and peak-table construction follow the rough formula laid out in 'alsace' (Wehrens, R., Bloemberg, T.G., and Eilers P.H.C., 2015. <doi:10.1093/bioinformatics/btv299>). Alignment of chromatograms is available using parametric time warping (as implemented in the 'ptw' package) (Wehrens, R., Bloemberg, T.G., and Eilers P.H.C. 2015. <doi:10.1093/bioinformatics/btv299>) or variable penalty dynamic time warping (as implemented in 'VPdtw') (Clifford, D., & Stone, G. 2012. <doi:10.18637/jss.v047.i08>). Peak-finding uses the algorithm by Tom O'Haver <https://terpconnect.umd.edu/~toh/spectrum/PeakFindingandMeasurement.htm>. Peaks are then fitted to a gaussian or exponential-gaussian hybrid peak shape using non-linear least squares (Lan, K. & Jorgenson, J. W. 2001. <doi:10.1016/S0021-9673(01)00594-5>). See the vignette for more details and suggested workflow.
Authors: Ethan Bass [aut, cre] (ORCID: <https://orcid.org/0000-0002-6175-6739>), Hans W Borchers [ctb, cph] (Author of savgol and pinv functions bundled from pracma)
Maintainer: Ethan Bass <[email protected]>
License: GPL (>= 2)
Version: 0.7.5.9000
Built: 2026-05-31 18:11:42 UTC
Source: https://github.com/ethanbass/chromatographR

Help Index


chromatographR

Description

Chromatographic Data Analysis Toolset

Details

Tools for high-throughput analysis of HPLC-DAD/UV chromatograms (or similar data). Includes functions for preprocessing, alignment, peak-finding and fitting, peak-table construction, data-visualization, etc. Preprocessing and peak-table construction follow the rough formula laid out in alsace (Wehrens, R., Bloemberg, T.G., and Eilers P.H.C., 2015. doi:10.1093/bioinformatics/btv299). Alignment of chromatograms is available using parametric time warping (PTW) (Wehrens, R., Bloemberg, T.G., and Eilers P.H.C. 2015. doi:10.1093/bioinformatics/btv299) or variable penalty dynamic time warping (VPdtw) (Clifford, D., & Stone, G. 2012. doi:10.18637/jss.v047.i08). Peak-finding relies on the algorithm suggested by Tom O'Haver in his Pragmatic Introduction to Signal Processing. Peaks are then fitted to a gaussian or exponential-gaussian hybrid peak shape using non-linear least squares (Lan, K. & Jorgenson, J. W. 2001. doi:10.1016/S0021-9673(01)00594-5). More details on package usage and a suggested workflow can be found in the vignette.

Analysis functions

Visualization functions

Utility functions

  • combine_peaks(): Combine duplicate peaks in peak table based on retention time and spectral similarity.

  • merge_peaks(): Merge split peaks into a single column of a peak table.

  • get_times(): Return retention times from a peak table or a list of chromatograms.

  • get_lambdas(): Return wavelengths from a peak table or a list of chromatograms.

  • reshape_chroms(): Reshape a list of chromatograms to long format.

  • reshape_peaktable(): Reshape a peak_table object to long format.

  • write_peaktable(): Export peak table in csv or xlsx format.

Example data

  • Sa: A list of four goldenrod root chromatograms.

  • Sa_pr: Preprocessed goldenrod root chromatograms.

  • Sa_warp: Preprocessed and aligned goldenrod root chromatograms.

  • pk_tab: Peak table from aligned goldenrod root chromatograms.

Author(s)

Ethan Bass


Attach experimental metadata

Description

Attaches sample metadata to a peak_table object. Metadata should be provided as a data.frame object. One of the columns in the supplied metadata must match exactly the row names of the peak table.

Usage

attach_metadata(peak_table, metadata, column)

Arguments

peak_table

A peak_table object.

metadata

A data.frame containing the sample metadata.

column

The name of the column in your metadata object containing the sample names. Sample names must match the row names of peak_table$tab.

Value

A peak_table object with attached metadata in the $sample_meta slot.

Author(s)

Ethan Bass

See Also

get_peaktable, normalize_data

Examples

data(pk_tab)
path <- system.file("extdata", "Sa_metadata.csv", package = "chromatographR")
meta <- read.csv(path)
pk_tab <- attach_metadata(peak_table = pk_tab, metadata = meta, column="vial")
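
The matching requirement can be checked before calling attach_metadata. The following is a minimal base-R sketch, not the package's implementation; the objects tab and meta and the sample names are made-up stand-ins for pk_tab$tab and a metadata file.

```r
# Hypothetical stand-ins for pk_tab$tab and a metadata table.
tab <- data.frame(V1 = c(1.2, 0.8, 1.5), V2 = c(0.1, 0.4, 0.3),
                  row.names = c("s119", "s121", "s122"))
meta <- data.frame(vial = c("s122", "s119", "s121"),
                   trt = c("+", "-", "+"))

# Every row name of the peak table should appear exactly once in the chosen column.
idx <- match(rownames(tab), meta$vial)
all_matched <- !anyNA(idx) && anyDuplicated(meta$vial) == 0

# Reorder the metadata so that its rows line up with the samples in the table.
meta_ordered <- meta[idx, , drop = FALSE]
```

If all_matched is FALSE, the sample names in the metadata column most likely need cleaning before the metadata can be attached.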

Attach reference spectra

Description

Gathers reference spectra and attaches them to a peak_table object. Reference spectra are defined either as the spectrum with the highest intensity ("max.int") or as the spectrum with the highest average correlation to the other spectra in the peak table ("max.cor").

Usage

attach_ref_spectra(peak_table, chrom_list, ref = c("max.cor", "max.int"))

Arguments

peak_table

Peak table from get_peaktable.

chrom_list

A list of chromatograms in matrix format (timepoints x wavelengths). If no argument is provided here, the function will try to find the chrom_list object used to create the provided peak_table.

ref

What criterion to use to select reference spectra. Current options are maximum correlation ("max.cor") or maximum signal intensity ("max.int").

Value

A peak_table object with reference spectra attached in the $ref_spectra slot.

Author(s)

Ethan Bass

See Also

get_peaks, get_peaktable

Examples

data(pk_tab)
pk_tab <- attach_ref_spectra(pk_tab, ref = "max.int")
pk_tab <- attach_ref_spectra(pk_tab, ref = "max.cor")
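
The two selection criteria can be illustrated on a small matrix of spectra for a single peak. This is only a sketch in base R under stated assumptions: the spectra are synthetic, and "highest intensity" is taken here to mean the largest maximum absorbance, which may differ from how the package measures it.

```r
# Hypothetical spectra for one peak: rows are wavelengths, columns are samples.
lambdas <- seq(200, 400, by = 10)
base <- exp(-((lambdas - 260) / 30)^2)            # shared spectral shape
spectra <- cbind(a = 1.0 * base, b = 2.0 * base, c = 0.5 * rev(base))
rownames(spectra) <- lambdas

# "max.int": take the spectrum from the sample with the highest intensity.
ref_max_int <- names(which.max(apply(spectra, 2, max)))

# "max.cor": take the spectrum with the highest mean correlation to the others.
cors <- cor(spectra)
diag(cors) <- NA                                   # ignore self-correlation
ref_max_cor <- names(which.max(rowMeans(cors, na.rm = TRUE)))
```

Sample c has a different spectral shape, so it is never chosen by the correlation criterion, even though it would be a valid candidate by intensity if it were the strongest signal.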

Make boxplot from peak table.

Description

The function can take multiple response variables on the left hand side of the formula (separated by +). In this case, a separate boxplot will be produced for each response variable.

Usage

## S3 method for class 'peak_table'
boxplot(x, formula, ...)

Arguments

x

A peak_table object.

formula

A formula object.

...

Additional arguments to boxplot.

Value

No return value, called for side effects.

Side effects

Creates a boxplot according to the provided formula, using data from the supplied peak_table object.

Author(s)

Ethan Bass

See Also

Other visualization functions: mirror_plot(), plot.peak_list(), plot.peak_table(), plot_all_spectra(), plot_chroms(), plot_chroms_heatmap(), plot_spectrum(), scan_chrom()

Examples

data(pk_tab)
path <- system.file("extdata", "Sa_metadata.csv", package = "chromatographR")
meta <- read.csv(path)
pk_tab <- attach_metadata(peak_table = pk_tab, metadata = meta, column="vial")
boxplot(pk_tab, formula=V11 ~ trt)
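
The handling of several response variables can be pictured as splitting the left-hand side of the formula and drawing one boxplot per term. The base-R sketch below illustrates that idea on a made-up data frame; it is not the method's actual code, and the column names are hypothetical.

```r
# Hypothetical data standing in for a peak table joined to its sample metadata.
dat <- data.frame(V11 = c(1.0, 1.4, 2.1, 2.6),
                  V12 = c(0.3, 0.2, 0.9, 1.1),
                  trt = c("-", "-", "+", "+"))

f <- V11 + V12 ~ trt
responses <- all.vars(f[[2]])    # variables on the left-hand side
predictors <- all.vars(f[[3]])   # variables on the right-hand side

# One boxplot per response; plot = FALSE returns the statistics without drawing.
stats <- lapply(responses, function(resp) {
  boxplot(reformulate(predictors, response = resp), data = dat, plot = FALSE)
})
names(stats) <- responses
```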

Cluster spectra

Description

Cluster peaks by spectral similarity.

Usage

cluster_spectra(
  peak_table,
  peak_no = NULL,
  alpha = 0.05,
  min_size = 5,
  max_size = NULL,
  nboot = 1000,
  plot_dend = TRUE,
  plot_spectra = TRUE,
  verbose = getOption("verbose"),
  save = FALSE,
  parallel = TRUE,
  max.only = FALSE,
  output = c("pvclust", "clusters"),
  ...
)

Arguments

peak_table

Peak table from get_peaktable.

peak_no

Minimum and maximum thresholds for the number of peaks a cluster may have. This argument is deprecated in favor of min_size and max_size.

alpha

Confidence threshold for inclusion of cluster.

min_size

Minimum number of peaks a cluster may have.

max_size

Maximum number of peaks a cluster may have.

nboot

Number of bootstrap replicates for pvclust.

plot_dend

Logical. If TRUE, plots dendrogram with bootstrap values.

plot_spectra

Logical. If TRUE, plots overlapping spectra for each cluster.

verbose

Logical. If TRUE, prints progress report to console.

save

Logical. If TRUE, saves pvclust object to current directory.

parallel

Logical. If TRUE, use parallel processing for pvclust.

max.only

Logical. If TRUE, returns only highest level for nested dendrograms.

output

What to return. Either clusters to return list of clusters, pvclust to return pvclust object, or both to return both items.

...

Additional arguments to pvclust.

Details

Before using this function, reference spectra must be attached to the peak_table using the attach_ref_spectra function. These reference spectra are then used to construct a distance matrix based on spectral similarity (Pearson correlation) between peaks. Hierarchical clustering with bootstrap resampling is performed on the resulting correlation matrix to classify peaks by spectral similarity, as implemented in pvclust. Finally, bootstrap values can be used to select clusters that exceed a certain confidence threshold as defined by alpha.

Clusters can be filtered by the minimum and maximum size of the cluster using the min_size and max_size arguments respectively. Users should be aware that the clustering algorithm will often return nested clusters. Thus, an individual peak could appear in more than one cluster. If max.only is TRUE, only the largest cluster in a nested tree of clusters meeting the specified confidence threshold will be returned.

It is highly suggested to use more than 100 bootstraps if you run the clustering algorithm on real data even though we use nboot = 100 in the example to reduce runtime. The authors of pvclust suggest nboot = 10000.
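
The clustering step without the bootstrap can be sketched in base R. The spectra below are synthetic, and the linkage method and number of groups are arbitrary choices for illustration; pvclust adds bootstrap support values on top of a hierarchy like this one.

```r
# Hypothetical reference spectra: rows are wavelengths, columns are peaks.
lambdas <- seq(200, 400, by = 5)
band <- function(center, width) exp(-((lambdas - center) / width)^2)
ref_spectra <- cbind(V1 = band(250, 20), V2 = 1.1 * band(252, 20),
                     V3 = band(330, 25), V4 = 0.8 * band(333, 25))

# Pearson correlation between peaks, converted to a distance.
d <- as.dist(1 - cor(ref_spectra, method = "pearson"))

# Plain hierarchical clustering (no bootstrap resampling in this sketch).
hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)
```

Peaks V1 and V2 share one spectral shape and V3 and V4 another, so they fall into two groups regardless of their different intensities.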

Value

Returns clusters and/or pvclust object according to the value of the output argument.

  • If output = clusters, returns a list of S4 cluster objects.

  • If output = pvclust, returns a pvclust object.

  • If output = both, returns a nested list containing [[1]] the pvclust object, and [[2]] the list of S4 cluster objects.

The cluster objects consist of the following components:

  • peaks: a character vector containing the names of all peaks contained in the given cluster.

  • pval: a numeric vector of length 1 containing the bootstrap p-value (au) for the given cluster.

Author(s)

Ethan Bass

References

R. Suzuki & H. Shimodaira. 2006. Pvclust: an R package for assessing the uncertainty in hierarchical clustering. Bioinformatics, 22(12):1540-1542. doi:10.1093/bioinformatics/btl117.

Examples

data(pk_tab)
data(Sa_warp)
pk_tab <- attach_ref_spectra(pk_tab, Sa_warp, ref = "max.int")
cl <- cluster_spectra(pk_tab, nboot = 100, max.only = FALSE, 
save = FALSE, alpha = 0.03)

Combine peaks

Description

Utility function to combine duplicate peaks in a peak table (i.e., peaks that were integrated at more than one wavelength or component) based on retention time and/or spectral similarity. Specify the tolerance (tol) for retention time matching and the minimum spectral correlation (min.cor) for a match.

Usage

combine_peaks(
  peak_table,
  tol = 0.01,
  min.cor = 0.9,
  choose = "max",
  verbose = getOption("verbose")
)

Arguments

peak_table

Peak table from get_peaktable.

tol

Tolerance for matching retention times (maximum retention time difference). Defaults to 0.01.

min.cor

Minimum spectral correlation to confirm a match. Defaults to 0.9.

choose

If "max", the peak with the highest intensity is retained. Otherwise, the first column in the data.frame is retained.

verbose

Logical. Whether to print status to the console.

Value

A peak table derived from the original, but with columns corresponding to duplicate peaks combined according to the specified criteria.

Author(s)

Ethan Bass

See Also

get_peaks

Other utility functions: get_lambdas(), get_times(), merge_peaks(), reshape_chroms(), reshape_peaktable()

Examples

data(pk_tab)
data(Sa_warp)
pk_tab <- attach_ref_spectra(pk_tab)
combine_peaks(pk_tab, tol = 0.02, min.cor = 0.9)
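
The two matching criteria can be illustrated with a pairwise comparison. This is a base-R sketch on made-up retention times and spectra, not the function's actual code; whether the package treats the thresholds as inclusive is an assumption here.

```r
# Hypothetical peak retention times and reference spectra (wavelengths x peaks).
rt <- c(V1 = 5.00, V2 = 5.01, V3 = 5.015, V4 = 7.30)
lambdas <- seq(200, 400, by = 10)
band <- function(center) exp(-((lambdas - center) / 30)^2)
ref_spectra <- cbind(V1 = band(260), V2 = band(262), V3 = band(340), V4 = band(260))

tol <- 0.02
min_cor <- 0.9

# Two peaks are candidate duplicates when they elute within `tol` of each other
# and their spectra correlate at or above `min_cor`.
close_rt <- abs(outer(rt, rt, "-")) <= tol
similar <- cor(ref_spectra) >= min_cor
dup <- close_rt & similar & upper.tri(close_rt)   # count each pair once
pairs <- which(dup, arr.ind = TRUE)
```

Only V1 and V2 satisfy both conditions: V3 elutes close by but has a different spectrum, and V4 has a matching spectrum but a distant retention time.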

Correct peak positions according to a PTW warping model

Description

Corrects retention time differences in peak_list using parametric time warping as implemented in the ptw package.

Usage

correct_peaks(peak_list, mod_list, chrom_list, match_names = TRUE)

Arguments

peak_list

A 'peak_list' object created by get_peaks, containing a nested list of peak tables where the first level is the sample, and the second level is the spectral wavelength. Every wavelength is described by a matrix where each row corresponds to a feature, and the columns contain information on that feature (e.g., retention time, peak width (FWHM), height, area, etc.)

mod_list

A list of ptw models.

chrom_list

List of chromatograms from which the ptw models are derived.

match_names

Logical. Whether to actively match the names of the peak_list to the list of models (mod_list). Defaults to TRUE.

Details

Once an appropriate warping model has been established, corrected retention times can be predicted for each peak. These are stored in a separate column in the list of peak tables.
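
As a rough illustration of predicting corrected times from a warping model, the sketch below applies a quadratic warping polynomial to a small peak table in base R. The coefficients are invented, and ptw has its own conventions (for example, the units and direction of the warp) that are not reproduced here.

```r
# Hypothetical quadratic warping coefficients (a0, a1, a2); c(0, 1, 0) is the identity.
warp <- function(t, coef) coef[1] + coef[2] * t + coef[3] * t^2

peaks <- data.frame(rt = c(2.5, 6.1, 9.8), height = c(120, 45, 300))

# Store the warped times next to the originals instead of overwriting them.
peaks$rt.cor <- warp(peaks$rt, coef = c(0.05, 0.99, 0.001))
identity_rt <- warp(peaks$rt, coef = c(0, 1, 0))
```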

Value

The input list of peak tables is returned with extra columns containing the corrected retention times.

Note

This function is adapted from the correctPeaks function in the alsace package by Ron Wehrens.

Author(s)

Ron Wehrens, Ethan Bass

See Also

correct_rt


Correct retention time

Description

Aligns chromatograms using one of two algorithms, according to the value of alg: either parametric time warping, as implemented in ptw, or variable penalty dynamic time warping, as implemented in VPdtw. The init.coef and n.traces arguments apply only to ptw warping, while penalty and maxshift apply only to vpdtw warping.

Usage

correct_rt(
  chrom_list,
  lambdas,
  models = NULL,
  reference = "best",
  alg = c("ptw", "vpdtw"),
  what = c("corrected.values", "models"),
  init.coef = c(0, 1, 0),
  n.traces = NULL,
  n.zeros = 0,
  scale = FALSE,
  trwdth = 200,
  plot_it = FALSE,
  penalty = 5,
  maxshift = 50,
  verbose = getOption("verbose"),
  show_progress = NULL,
  cl = 2,
  ...
)

Arguments

chrom_list

List of chromatograms in matrix format.

lambdas

A character or numeric vector specifying the wavelengths to use for alignment. Only one wavelength should be specified for VPdtw. For one-dimensional chromatograms, this argument can be ignored.

models

List of models to warp by. The models provided here (if any) must match the algorithm selected in alg.

reference

Index of the sample that is to be considered the reference sample.

alg

Algorithm to use: parametric time warping (ptw) or variable penalty dynamic time warping (vpdtw).

what

What to return: either the 'corrected.values' (useful for visual inspection) or the warping 'models' (for further programmatic use).

init.coef

Starting values for the optimization.

n.traces

Number of traces to use.

n.zeros

Number of zeros to add.

scale

Logical. If TRUE, scale chromatograms before warping.

trwdth

Width of the triangle in the WCC criterion.

plot_it

Logical. Whether to plot alignment.

penalty

The divisor used to calculate the penalty for VPdtw. The warping penalty is calculated by dividing the dilation by this number. Thus, a higher number will produce a lower penalty and be more permissive, while a lower number will produce a higher penalty and allow less warping. Defaults to 5.

maxshift

Integer. Maximum allowable shift for VPdtw. Defaults to 50.

verbose

Whether to print verbose output.

show_progress

Logical. Whether to show progress bar. Defaults to TRUE if pbapply is installed. Currently works only for ptw alignments.

cl

Argument to pblapply or mclapply. Either an integer specifying the number of clusters to use for parallel processing or a cluster object created by makeCluster. Defaults to 2. On Windows integer values will be ignored.

...

Optional arguments for the ptw function. The only argument that cannot be changed is warp.type: this is always equal to "global".

Value

A list of warping models or a list of warped absorbance profiles, according to the value of the what argument.

Note

Adapted from the correctRT function in the alsace package by Ron Wehrens.

Author(s)

Ethan Bass

References

  • Clifford, D., Stone, G., Montoliu, I., Rezzi, S., Martin, F. P., Guy, P., Bruce, S., & Kochhar, S. 2009. Alignment using variable penalty dynamic time warping. Analytical chemistry, 81(3):1000-1007. doi:10.1021/ac802041e.

  • Clifford, D., & Stone, G. 2012. Variable Penalty Dynamic Time Warping Code for Aligning Mass Spectrometry Chromatograms in R. Journal of Statistical Software, 47(8):1-17. doi:10.18637/jss.v047.i08.

  • Eilers, P.H.C. 2004. Parametric Time Warping. Anal. Chem., 76:404-411. doi:10.1021/ac034800e.

  • Wehrens, R., Bloemberg, T.G., and Eilers P.H.C. 2015. Fast parametric time warping of peak lists. Bioinformatics, 31:3063-3065. doi:10.1093/bioinformatics/btv299.

  • Wehrens, R., Carvalho, E., Fraser, P.D. 2015. Metabolite profiling in LC–DAD using multivariate curve resolution: the alsace package for R. Metabolomics, 11:143-154. doi:10.1007/s11306-014-0683-5.

See Also

ptw, correct_peaks, VPdtw

Examples

data(Sa_pr)
warp <- correct_rt(chrom_list = Sa_pr, lambdas=210)
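
The effect of the penalty divisor for VPdtw can be sketched in base R. The example assumes that the dilation is a running maximum of the reference trace; the helper dilate, the window width, and the synthetic trace are hypothetical and do not come from the package.

```r
# Running maximum over a centred window: a simple morphological dilation.
dilate <- function(y, span) {
  half <- span %/% 2
  n <- length(y)
  vapply(seq_len(n), function(i) max(y[max(1, i - half):min(n, i + half)]), numeric(1))
}

# Hypothetical reference trace with a single peak.
reference <- dnorm(seq(0, 10, length.out = 201), mean = 5, sd = 0.3)
dil <- dilate(reference, span = 21)   # window width chosen arbitrarily

penalty_strict <- dil / 2     # small divisor: larger penalty, less warping allowed
penalty_loose <- dil / 10     # large divisor: smaller penalty, more warping allowed
```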

Filter peak lists

Description

Utility function to remove peaks from a peak list (e.g., because their intensity is too low). Currently one can filter on peak height, peak area, standard deviation, and/or retention time.

Usage

filter_peaks(peak_list, min_height, min_area, min_sd, max_sd, min_rt, max_rt)

Arguments

peak_list

A peak_list object, consisting of a nested list of peak tables, where the first level is the sample, and the second level is the spectral component. Every component is described by a matrix where every row is one peak, and the columns contain information on retention time, full width at half maximum (FWHM), peak width, height, and area.

min_height

Minimum peak height.

min_area

Minimum peak area.

min_sd

Minimum standard deviation.

max_sd

Maximum standard deviation.

min_rt

Minimum retention time.

max_rt

Maximum retention time.

Value

A peak list similar to the input, with all rows removed that do not satisfy the specified criteria.
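
The row-wise filtering can be pictured on a single peak table. This is a base-R sketch with invented values; whether the package's comparisons are inclusive is an assumption.

```r
# Hypothetical peak table for one sample at one wavelength.
peaks <- data.frame(rt = c(1.2, 4.8, 9.5, 14.1),
                    sd = c(0.02, 0.05, 0.60, 0.04),
                    height = c(3, 250, 40, 90),
                    area = c(0.2, 30.5, 55.0, 8.1))

min_height <- 10; max_sd <- 0.5; min_rt <- 2; max_rt <- 15

# A peak is kept only if it passes every criterion that was supplied.
keep <- peaks$height >= min_height & peaks$sd <= max_sd &
  peaks$rt >= min_rt & peaks$rt <= max_rt
filtered <- peaks[keep, , drop = FALSE]
```

Here the first peak is dropped for being too small and too early, and the third for being too broad.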

Author(s)

Ron Wehrens, Ethan Bass

See Also

get_peaks, filter_peaktable


Filter peak table

Description

Utility function to remove peaks from peak table, e.g., because their intensity is too low. Currently one can filter on mean, median, or maximum ("max") peak intensity or retention time.

Usage

filter_peaktable(
  peak_table,
  rts,
  min_rt,
  max_rt,
  min_value,
  lambda,
  what = c("median", "mean", "max"),
  tol = 0
)

Arguments

peak_table

A peak_table object from get_peaktable.

rts

Vector of retention times to include in the peak table.

min_rt

Minimum retention time to include in the peak table.

max_rt

Maximum retention time to include in the peak table.

min_value

Minimal cutoff for summarized peak intensity.

lambda

Component(s) to include in peak table (e.g. wavelengths if you are using HPLC-DAD/UV).

what

Whether to summarize intensities using mean, median, or max. Defaults to median.

tol

Tolerance for matching of retention times to rts.

Value

A peak table similar to the input, with all columns removed from the peak table that do not satisfy the specified criteria.

Author(s)

Ethan Bass

See Also

get_peaktable, filter_peaks

Examples

data(pk_tab)
pk_tab <- filter_peaktable(pk_tab, min_rt = 10, max_rt = 16)
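
Column-wise filtering on a summarized intensity can be sketched in base R as follows. The table, retention times, and thresholds are made up, and the sketch is not the function's actual code.

```r
# Hypothetical peak table (samples x peaks) and retention time of each peak.
tab <- data.frame(V1 = c(0.1, 0.2, 0.1), V2 = c(5, 7, 30), V3 = c(12, 0, 15),
                  row.names = c("s1", "s2", "s3"))
rt <- c(V1 = 3.2, V2 = 11.4, V3 = 15.9)

min_value <- 1
summary_fun <- median             # could also be mean or max

# Keep columns whose summarized intensity and retention time pass the cutoffs.
keep <- vapply(tab, summary_fun, numeric(1)) >= min_value &
  rt >= 10 & rt <= 16
tab_filtered <- tab[, keep, drop = FALSE]
```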

Get lambdas

Description

Get wavelengths from a list of chromatograms or a peak_table object.

Usage

get_lambdas(x)

Arguments

x

A list of chromatograms or peak_table object.

Value

A numeric vector of wavelengths.
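
For a chromatogram stored as a matrix (timepoints x wavelengths), the wavelengths and retention times can be recovered from the dimension names. The sketch below assumes that the column names hold wavelengths and the row names hold retention times; the matrix itself is random data.

```r
# Hypothetical chromatogram: rows are time points, columns are wavelengths.
times <- seq(0, 2, by = 0.5)
lambdas <- c(210, 260, 318)
chrom <- matrix(runif(length(times) * length(lambdas)),
                nrow = length(times),
                dimnames = list(as.character(times), as.character(lambdas)))

# Dimension names are stored as character, so convert back to numeric.
wavelengths <- as.numeric(colnames(chrom))
ret_times <- as.numeric(rownames(chrom))
```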

See Also

Other utility functions: combine_peaks(), get_times(), merge_peaks(), reshape_chroms(), reshape_peaktable()


Get peak list.

Description

Finds and fits peaks and extracts peak parameters from a list of chromatograms at the specified wavelengths.

Usage

get_peaks(
  chrom_list,
  lambdas,
  fit = c("egh", "gaussian", "raw"),
  sd_max = 50,
  max_iter = 100,
  time_unit = c("min", "s", "ms"),
  estimate_purity = FALSE,
  noise_threshold = 0.001,
  show_progress = NULL,
  cl = 2,
  collapse = FALSE,
  time.units = NULL,
  sd.max = NULL,
  max.iter = NULL,
  ...
)

Arguments

chrom_list

A list of profile matrices, each of the same dimensions (timepoints × wavelengths).

lambdas

A character or numeric vector specifying the wavelengths to find peaks at. For one-dimensional chromatograms, this argument can be ignored.

fit

What type of fit to use. Current options are exponential-gaussian hybrid (egh), gaussian or raw. The raw setting performs trapezoidal integration directly on the raw data without fitting a peak shape.

sd_max

Maximum width (standard deviation) for peaks. Defaults to 50.

max_iter

Maximum number of iterations for non-linear least squares in fit_peaks.

time_unit

Units of sd, FWHM, area, and tau (if applicable). Options are minutes ("min"), seconds ("s"), or milliseconds ("ms").

estimate_purity

Logical. Whether to estimate purity or not. Defaults to FALSE. (If TRUE, this will slow down the function significantly).

noise_threshold

Noise threshold. Argument to get_purity.

show_progress

Logical. Whether to show progress bar. Defaults to TRUE if pbapply is installed.

cl

Argument to pblapply or mclapply. Either an integer specifying the number of clusters to use for parallel processing or a cluster object created by makeCluster. Defaults to 2. On Windows integer values will be ignored.

collapse

Logical. Whether to collapse multiple peak lists per sample into a single list when multiple wavelengths (lambdas) are provided.

time.units

The time.units argument is deprecated. Please use time_unit instead.

sd.max

The sd.max argument is deprecated. Please use sd_max instead.

max.iter

The max.iter argument is deprecated. Please use max_iter instead.

...

Additional arguments to find_peaks. Arguments provided to find_peaks can be used to fine-tune the peak-finding algorithm. Most importantly, the smooth_window should be increased if features are being split into multiple bins. Other arguments that can be used here include smooth_type, slope_thresh, and amp_thresh.

Details

Peaks are located by finding zero-crossings in the smoothed first derivative of the specified chromatographic traces (function find_peaks). At the given positions, an exponential-gaussian hybrid (or regular gaussian) function is fit to the signal using fit_peaks according to the value of fit. Finally, the area is calculated using trapezoidal approximation.
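
The zero-crossing idea can be sketched in base R on a synthetic trace. The moving-average smoother, window width, and amplitude threshold below are arbitrary stand-ins and are not the settings or code used by find_peaks.

```r
# Synthetic trace with two gaussian peaks (hypothetical data).
x <- seq(0, 10, by = 0.01)
y <- 100 * dnorm(x, 3, 0.15) + 60 * dnorm(x, 7, 0.25)

# Smooth the first derivative with a moving average of arbitrary width.
dy <- diff(y) / diff(x)
win <- 11
dy_smooth <- stats::filter(dy, rep(1 / win, win), sides = 2)

# A maximum is a positive-to-negative zero-crossing of the derivative.
# An amplitude threshold screens out crossings in flat baseline regions.
i <- seq_len(length(dy_smooth) - 1)
crossing <- which(dy_smooth[i] > 0 & dy_smooth[i + 1] <= 0)
crossing <- crossing[y[crossing + 1] > 1]
peak_times <- x[crossing + 1]
```

A wider smoothing window suppresses spurious crossings caused by noise, at the cost of possibly missing narrow or small peaks.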

Additional arguments can be provided to find_peaks to fine-tune the peak-finding algorithm. For example, the smooth_window can be increased to prevent peaks from being split into multiple features. Overly aggressive smoothing may cause small peaks to be overlooked.

The standard deviation (sd), full-width at half maximum (FWHM), tau (τ), and area are returned in units determined by time_unit. By default, the units are in minutes. To compare directly with 'ChemStation' integration results, the time units should be changed to seconds.
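
The effect of the time unit on the reported parameters can be illustrated with a simple conversion. In this base-R sketch the peak values are invented and the helper to_seconds is hypothetical; tau would be scaled in the same way when present.

```r
# Hypothetical peak parameters reported in minutes.
pk_min <- data.frame(rt = 5.2, sd = 0.04, FWHM = 0.094, height = 120, area = 12.0)

# Width-like quantities and the area scale linearly with the time axis; height does not.
to_seconds <- function(pk) {
  pk[c("sd", "FWHM", "area")] <- pk[c("sd", "FWHM", "area")] * 60
  pk
}
pk_s <- to_seconds(pk_min)
```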

Value

The result is an S3 object of class peak_list, containing a nested list of data.frames with information about the peaks fitted for each chromatogram at each of the wavelengths specified by the lambdas argument. Each row in these data.frames is a peak and the columns contain information about various peak parameters:

  • rt: The retention time of the peak maximum.

  • start: The retention time where the peak is estimated to begin.

  • end: The retention time where the peak is estimated to end.

  • sd: The standard deviation of the fitted peak shape.

  • tau: The value of parameter τ. This parameter determines peak asymmetry for peaks fit with an exponential-gaussian hybrid function. (This column will only appear if fit = egh.)

  • FWHM: The full-width at half maximum.

  • height: The height of the peak.

  • area: The area of the peak as determined by trapezoidal approximation.

  • r.squared: The coefficient of determination (R^2) of the fitted model to the raw data. (Note: this value is calculated by fitting a linear model of the fitted peak values to the raw data. This approach is statistically questionable, since the models are fit using non-linear least squares. Nevertheless, it can still be useful as a rough metric for "goodness-of-fit").

  • purity: The peak purity as estimated by get_purity.
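
To make the roles of sd, tau, height, and area concrete, the sketch below evaluates the exponential-gaussian hybrid shape described by Lan & Jorgenson (2001) and integrates it by the trapezoidal rule. It is written in base R with made-up parameter values; the function names egh and trapz are local to this example and are not the package's fitting code.

```r
# Exponential-gaussian hybrid (EGH) peak shape, following Lan & Jorgenson (2001).
# The function is zero wherever the denominator is not positive.
egh <- function(t, height, center, sigma, tau) {
  denom <- 2 * sigma^2 + tau * (t - center)
  ifelse(denom > 0, height * exp(-(t - center)^2 / denom), 0)
}

x <- seq(0, 10, by = 0.01)
y_sym <- egh(x, height = 50, center = 5, sigma = 0.2, tau = 0)     # reduces to a gaussian
y_tail <- egh(x, height = 50, center = 5, sigma = 0.2, tau = 0.3)  # tailing peak

# Trapezoidal approximation of the peak area.
trapz <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
area_sym <- trapz(x, y_sym)
area_tail <- trapz(x, y_tail)
```

With tau = 0 the area agrees with the closed-form gaussian area, height × sd × sqrt(2π); a positive tau stretches the trailing edge of the peak.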

Note

The bones of this function are adapted from the getAllPeaks function authored by Ron Wehrens (though the underlying algorithms for peak identification and peak-fitting are not the same).

Author(s)

Ethan Bass

References

  • Lan, K. & Jorgenson, J. W. 2001. A hybrid of exponential and gaussian functions as a simple model of asymmetric chromatographic peaks. Journal of Chromatography A 915:1-13. doi:10.1016/S0021-9673(01)00594-5.

  • Naish, P. J. & Hartwell, S. 1988. Exponentially Modified Gaussian functions - A good model for chromatographic peaks in isocratic HPLC? Chromatographia, 26: 285-296. doi:10.1007/BF02268168.

  • O'Haver, Tom. Pragmatic Introduction to Signal Processing: Applications in scientific measurement. https://terpconnect.umd.edu/~toh/spectrum/ (Accessed January, 2022).

  • Wehrens, R., Carvalho, E., Fraser, P.D. 2015. Metabolite profiling in LC–DAD using multivariate curve resolution: the alsace package for R. Metabolomics 11:143-154. doi:10.1007/s11306-014-0683-5.

See Also

find_peaks, fit_peaks

Examples

data(Sa_pr)
pks <- get_peaks(Sa_pr, lambdas = c('210'), sd_max = 50, fit = "egh")

Converts peak list into an ordered peak table.

Description

In the simple case (clustering based on retention times alone), the function performs complete-linkage clustering of retention times across all samples and cuts the dendrogram at a height given by the user, which can be understood as the maximal inter-cluster retention time difference. Clustering can also incorporate information about spectral similarity using a distance function adapted from Broeckling et al., 2014:

e^{-\frac{(1-c_{ij})^2}{2\sigma_r^2}} \cdot e^{-\frac{(1-(t_i-t_j)^2)}{2\sigma_t^2}}

If two peaks from the same sample are assigned to the same cluster, a warning message is printed to the console. These warnings can usually be ignored, but one could also consider reducing the hmax variable. However, this may lead to splitting of peaks across multiple clusters. Another option is to filter the peaks by intensity to remove small features.

Usage

get_peaktable(
  peak_list,
  chrom_list,
  response = c("area", "height"),
  use.cor = NULL,
  hmax = 0.2,
  plot_it = FALSE,
  ask = plot_it,
  clust = c("rt", "sp.rt"),
  sigma.t = NULL,
  sigma.r = 0.5,
  deepSplit = FALSE,
  verbose = FALSE,
  out = c("data.frame", "matrix")
)

Arguments

peak_list

A peak_list object created by get_peaks, containing a nested list of peak tables: the first level is the sample, and the second level is the spectral wavelength. Every component is described by a data.frame with a row for each peak and columns containing information on various peak parameters.

chrom_list

A list of chromatographic matrices.

response

Indicates whether peak area or peak height is to be used as the intensity measure. Defaults to area.

use.cor

Logical. Indicates whether to use corrected retention times (rt.cor column) or raw retention times (rt column). Unless otherwise specified, the rt.cor column will be used by default if it exists in the provided peak_list.

hmax

Height at which the complete linkage dendrogram will be cut. Can be interpreted as the maximal intercluster retention time difference.

plot_it

Logical. If TRUE, for every component a strip plot will be shown indicating the clustering.

ask

Logical. Ask before showing new plot? Defaults to the value of plot_it.

clust

Specify whether to perform hierarchical clustering based on spectral similarity and retention time (sp.rt) or retention time alone (rt). Defaults to rt. The sp.rt option is experimental and should be used with caution.

sigma.t

Width of gaussian in retention time distance function. Controls weight given to retention time if sp.rt is selected.

sigma.r

Width of gaussian in spectral similarity function. Controls weight given to spectral correlation if sp.rt is selected.

deepSplit

Logical. Controls sensitivity to cluster splitting. If TRUE, the function will return more, smaller clusters. See documentation for cutreeDynamic for additional information.

verbose

Logical. Whether to print warning when combining peaks into single time window. Defaults to FALSE.

out

Specify data.frame or matrix as output. Defaults to data.frame.

Value

The function returns an S3 peak_table object, containing the following elements:

  • tab: The peak table itself – a data.frame of intensities in a sample x peak configuration.

  • pk_meta: A data.frame containing peak meta-data (e.g., the spectral component, peak number, and average retention time).

  • sample_meta: A data.frame of sample meta-data. Must be added using attach_metadata.

  • ref_spectra: A data.frame of reference spectra (in a wavelength × peak configuration). Must be added using attach_ref_spectra.

  • args: A vector of arguments given to get_peaktable to generate the peak table.

Note

This function is adapted from the getPeakTable function in the alsace package by Ron Wehrens.

Author(s)

Ethan Bass

References

  • Broeckling, C. D., Afsar F.A., Neumann S., Ben-Hur A., and Prenni J.E. 2014. RAMClust: A Novel Feature Clustering Method Enables Spectral-Matching-Based Annotation for Metabolomics Data. Anal. Chem. 86:6812-6817. doi:10.1021/ac501530d.

  • Wehrens, R., Carvalho, E., Fraser, P.D. 2015. Metabolite profiling in LC–DAD using multivariate curve resolution: the alsace package for R. Metabolomics 11:143-154. doi:10.1007/s11306-014-0683-5.

See Also

attach_ref_spectra, attach_metadata

Examples

data(Sa_pr)
pks <- get_peaks(Sa_pr, lambdas = c('210'))
get_peaktable(pks, response = "area")
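
The retention-time clustering can be sketched with base R's hclust and cutree. The retention times and sample labels are invented, and the sketch leaves out everything else the function does (response extraction, ordering, metadata).

```r
# Hypothetical retention times of peaks pooled across three samples.
rt <- c(5.01, 5.03, 4.98, 7.40, 7.45, 7.38, 9.90, 10.35)
sample <- c("a", "b", "c", "a", "b", "c", "a", "b")

# Complete linkage: the merge height is the largest retention time
# difference between any two members of the merged clusters.
hc <- hclust(dist(rt), method = "complete")
cluster <- cutree(hc, h = 0.2)    # plays the role of hmax

# A sample appearing twice in one cluster would indicate two peaks merged together.
clashes <- tapply(sample, cluster, function(s) anyDuplicated(s) > 0)
```

The last two peaks differ by 0.45 and therefore stay in separate clusters; raising the cut height above that value would merge them.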

Get retention times

Description

Get retention times from a list of chromatograms or a peak_table object.

Usage

get_times(x, idx = 1)

Arguments

x

A list of chromatograms or peak_table object.

idx

Index of chromatogram from which to extract times.

Value

Numeric vector of retention times from the chromatogram specified by idx.

See Also

Other utility functions: combine_peaks(), get_lambdas(), merge_peaks(), reshape_chroms(), reshape_peaktable()


Merge split peaks

Description

Utility function to combine split peaks into a single column of the peak table.

Usage

merge_peaks(peak_table, peaks, method = c("max", "sum"))

Arguments

peak_table

Peak table from get_peaktable.

peaks

A vector specifying the names or indices of peaks to be merged.

method

Method to merge peaks. Either max to select the largest peak from each sample or sum to sum the peaks together.

Details

Merges the specified peaks in the peak table. If method is "max", the largest value among the specified peaks is retained for each sample. If method is "sum", the peaks are merged by summing their values.

Value

A peak table similar to the input peak table, but where the specified columns are combined.

Author(s)

Ethan Bass

See Also

Other utility functions: combine_peaks(), get_lambdas(), get_times(), reshape_chroms(), reshape_peaktable()

Examples

data(pk_tab)
pk_tab <- merge_peaks(peak_table = pk_tab, peaks=c("V10","V11"))
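
The difference between the two methods can be seen on a toy table. This base-R sketch uses invented values and only shows the arithmetic, not how the merged column is written back into the peak table.

```r
# Hypothetical peak table (samples x peaks).
tab <- data.frame(V9 = c(4, 6, 5), V10 = c(10, 0, 3), V11 = c(1, 12, 3),
                  row.names = c("s1", "s2", "s3"))
peaks <- c("V10", "V11")

merged_max <- apply(tab[peaks], 1, max)   # largest value per sample
merged_sum <- rowSums(tab[peaks])         # sum per sample
```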

Make mirror plot from peak table.

Description

Plots chromatograms as a mirror plot.

Usage

mirror_plot(
  x,
  chrom_list,
  lambdas = NULL,
  var,
  subset = NULL,
  print_legend = TRUE,
  legend_txt = NULL,
  legend_pos = "topright",
  legend_size = 1,
  mirror = TRUE,
  xlim = NULL,
  ylim = NULL,
  ...
)

Arguments

x

The peak table (output from get_peaktable function).

chrom_list

A list of chromatograms in matrix format (timepoints x wavelengths). If no argument is provided here, the function will try to find the chrom_list object used to create the peak_table.

lambdas

The wavelength you wish to plot the traces at.

var

Variable to index chromatograms.

subset

Character vector specifying levels to use (if more than 2 levels are present in var).

print_legend

Logical. Whether to print legend. Defaults to TRUE.

legend_txt

Character vector containing labels for legend.

legend_pos

Legend position.

legend_size

Legend size (cex argument). Default is 1.

mirror

Logical. Whether to plot as mirror or stacked plots. Defaults to TRUE.

xlim

Numerical vector specifying limits for x axis.

ylim

Numerical vector specifying limits for y axis.

...

Additional arguments to matplot function.

Details

Can be used to confirm the identity of a peak or check that a particular column in the peak table represents a single compound. Can also be used to create simple box-plots to examine the distribution of a peak with respect to variables defined in sample metadata.

Value

No return value, called for side effects.

Side effects

If mirror is TRUE, plots a mirror plot comparing two treatments defined by var and subset (if more than two factors are present in var). Otherwise, if mirror is FALSE, the treatments are plotted in two separate panes.

Author(s)

Ethan Bass

See Also

Other visualization functions: boxplot.peak_table(), plot.peak_list(), plot.peak_table(), plot_all_spectra(), plot_chroms(), plot_chroms_heatmap(), plot_spectrum(), scan_chrom()

Examples

data(Sa_warp)
data(pk_tab)
path <- system.file("extdata", "Sa_metadata.csv", package = "chromatographR")
meta <- read.csv(path)
pk_tab <- attach_metadata(peak_table = pk_tab, metadata = meta, column="vial")
mirror_plot(pk_tab,lambdas = c("210","260"), var = "trt", mirror = TRUE, 
  col = c("green","blue"))
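
The idea of a mirror plot can be sketched with base graphics on made-up traces. This is a rough illustration only and is not how mirror_plot is implemented; the vectors ctrl and trt and the matrix mirrored are hypothetical.

```r
# Toy traces for two treatment groups at one wavelength (values are made up)
times <- seq(0, 10, by = 0.1)
ctrl <- dnorm(times, mean = 4, sd = 0.3)
trt  <- 0.5 * dnorm(times, mean = 4, sd = 0.3) + dnorm(times, mean = 7, sd = 0.3)

# Mirror plot: one group above the axis, the other reflected below it
mirrored <- cbind(ctrl = ctrl, trt = -trt)
matplot(times, mirrored, type = "l", lty = 1, col = c("green", "blue"),
        xlab = "Retention time", ylab = "Absorbance")
abline(h = 0)
legend("topright", legend = colnames(mirrored), col = c("green", "blue"), lty = 1)
```

Reflecting one group about the axis makes peaks that are present in only one treatment stand out, since they have no counterpart on the opposite side.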

Normalize peak table or chromatograms

Description

Normalizes peak table or list of chromatograms by specified column in sample metadata. Metadata must first be attached to the peak_table using attach_metadata.

Usage

normalize_data(
  peak_table,
  column,
  chrom_list,
  what = c("peak_table", "chrom_list"),
  by = c("meta", "peak")
)

Arguments

peak_table

A peak_table object.

column

The name of the column containing the weights.

chrom_list

List of chromatograms for normalization. The samples must be in same order as the peak_table. If no argument is provided here, the function will try to find the chrom_list object used to create the provided peak_table.

what

'peak_table' or list of chromatograms ('chrom_list').

by

Whether to normalize by a column in sample metadata (meta) or by a column in the peak table itself (peak).

Value

A peak_table object where the peaks are normalized by the mass of each sample.

Author(s)

Ethan Bass

See Also

get_peaktable, attach_metadata

Examples

data(pk_tab)
path <- system.file("extdata", "Sa_metadata.csv", package = "chromatographR")
meta <- read.csv(path)
pk_tab <- attach_metadata(peak_table = pk_tab, metadata = meta, column="vial")
norm <- normalize_data(pk_tab, "mass", what = "peak_table")
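
Normalizing by a metadata column amounts to dividing each sample's peaks by that sample's weight. The following base R sketch shows the arithmetic on toy data; it is an assumption about the general approach rather than the package's code, and tab, sample_meta, and tab_norm are hypothetical names.

```r
# Toy peak table (samples x peaks) and per-sample masses; all values are made up.
tab <- matrix(c(10, 20, 30, 40), nrow = 2,
              dimnames = list(c("s1", "s2"), c("V1", "V2")))
sample_meta <- data.frame(vial = c("s1", "s2"), mass = c(2, 4))

# Divide every peak in a sample (row) by that sample's mass
tab_norm <- sweep(tab, MARGIN = 1, STATS = sample_meta$mass, FUN = "/")
tab_norm
#    V1 V2
# s1  5 15
# s2  5 10
```

The rows of the metadata must be in the same order as the rows of the peak table for this division to pair each sample with the right weight.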

Peak Table Object

Description

S3 objects of class peak_table store chromatographic peak data along with associated metadata and reference spectra.

Components

A peak_table object contains the following elements:

  • tab: The peak table itself – a data.frame of intensities in a sample x peak configuration.

  • pk_meta: A data.frame containing peak meta-data (e.g., the spectral component, peak number, and average retention time).

  • sample_meta: A data.frame of sample meta-data. Must be added using attach_metadata.

  • ref_spectra: A data.frame of reference spectra (in a wavelength x peak configuration). Must be added using attach_ref_spectra.

  • args: A vector of arguments given to get_peaktable to generate the peak table.

Methods

The following methods are available:

See Also

get_peaktable() for creating peak_table objects

Examples

# See get_peaktable() for creation examples
# To access peak_table components:
data(pk_tab)
pk_tab$tab          # peak table
pk_tab$pk_meta      # peak metadata  
pk_tab$sample_meta  # sample metadata

Goldenrod peak table

Description

A peak table generated from the exemplary goldenrod root extracts stored in Sa.

Usage

data(pk_tab)

Format

A peak_table object.

See Also

Other data objects: Sa, Sa_pr, Sa_warp


Plot all spectra for chosen peak.

Description

Plots multiple spectra for a given peak in the peak table. Wrapper for plot_spectrum.

Usage

plot_all_spectra(
  peak,
  peak_table,
  chrom_list,
  idx = "all",
  chrs = NULL,
  engine = c("base", "ggplot2", "plotly"),
  plot_spectrum = TRUE,
  export_spectrum = TRUE,
  scale_spectrum = TRUE,
  overlapping = TRUE,
  verbose = FALSE,
  what = c("peak", "rt", "idx"),
  ...
)

Arguments

peak

The name of a peak to plot (in character format).

peak_table

The peak table (output from get_peaktable function).

chrom_list

A list of chromatograms in matrix format (timepoints x components). If no argument is provided here, the function will try to find the chrom_list object used to create the provided peak_table.

idx

Vector of chromatograms to plot.

chrs

Deprecated. Please use idx instead.

engine

Which plotting engine to use: base, ggplot2, or plotly.

plot_spectrum

Logical. If TRUE, plots the spectrum of the chosen peak.

export_spectrum

Logical. If TRUE, exports spectrum to console. Defaults to TRUE.

scale_spectrum

Logical. If TRUE, scales spectrum to unit height.

overlapping

Logical. If TRUE, plot spectra in single plot.

verbose

Logical. If TRUE, prints verbose output to console.

what

What to look for. Either peak to extract spectral information for a certain peak, rt to scan by retention time, or idx to scan by numeric index. Defaults to "peak" mode.

...

Additional arguments to plot_spectrum.

Value

If export_spectrum is TRUE, returns the spectra as a data.frame with wavelengths as rows and one column for each sample in the chrom_list encoding the absorbance (or normalized absorbance, if scale_spectrum is TRUE) at each wavelength. Otherwise, there is no return value.

Side effects

If plot_spectrum is TRUE, plots the spectra for the specified chromatogram (idx) of the given peak. The spectrum is a single row from the chromatographic matrix.

Author(s)

Ethan Bass

See Also

Other visualization functions: boxplot.peak_table(), mirror_plot(), plot.peak_list(), plot.peak_table(), plot_chroms(), plot_chroms_heatmap(), plot_spectrum(), scan_chrom()

Examples

data(Sa_warp)
pks <- get_peaks(Sa_warp, lambda = "220")
pk_tab <- get_peaktable(pks)
plot_all_spectra(peak = "V13", peak_table = pk_tab, overlapping = TRUE)

Plot traces from list of chromatograms.

Description

Plots the specified traces from a list of chromatograms.

Usage

plot_chroms(
  x,
  lambdas,
  idx,
  time_resolution = 0.01,
  time_unit = NULL,
  xlim = NULL,
  ylim = NULL,
  xlab = "",
  ylab = "Absorbance",
  engine = c("base", "ggplot", "plotly"),
  linewidth = 1,
  show_legend = FALSE,
  legend_position = "topright",
  title = "",
  ...
)

Arguments

x

A list of chromatograms in matrix format (timepoints x wavelengths).

lambdas

A character or numeric vector specifying the wavelengths to plot. For one-dimensional chromatograms, this argument can be ignored.

idx

A vector representing the names or numerical indices of the chromatograms to plot.

time_resolution

Time resolution for the plot, in minutes. Defaults to 0.01. Thinning the time axis can dramatically improve speed when plotting many chromatograms.

time_unit

Time units of the provided chromatograms. Units will be detected automatically from chromatogram metadata if possible. If the time_unit attribute is not present, the time units will default to min.

xlim

Range of x axis.

ylim

Range of y axis.

xlab

X label.

ylab

Y label. Defaults to "Absorbance".

engine

Plotting engine. Either base (matplot), plotly, or ggplot.

linewidth

Line width.

show_legend

Logical. Whether to display legend or not. Defaults to FALSE.

legend_position

Position of legend.

title

Title for plot.

...

Additional arguments to plotting function specified by engine.

Value

No return value, called for side effects.

Side effects

Plots the traces of the specified chromatograms idx at the specified wavelengths lambdas. Plots can be produced using base graphics, ggplot2, or plotly, according to the value of engine.

Author(s)

Ethan Bass

See Also

Other visualization functions: boxplot.peak_table(), mirror_plot(), plot.peak_list(), plot.peak_table(), plot_all_spectra(), plot_chroms_heatmap(), plot_spectrum(), scan_chrom()

Examples

data(Sa_warp)
plot_chroms(Sa_warp, lambdas = 210)
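
To see why a coarser time_resolution speeds up plotting, the sketch below thins a toy trace in base R. It is a minimal illustration under stated assumptions (a regularly sampled trace with made-up values) and does not reproduce the package's internal thinning; the names step, keep, and thinned are hypothetical.

```r
# Toy trace sampled every 0.002 min (values are made up)
times <- seq(0, 1, by = 0.002)
signal <- exp(-((times - 0.5)^2) / 0.005)

# Keep roughly one point per 0.01 min, mimicking a coarser time resolution
step <- round(0.01 / 0.002)
keep <- seq(1, length(times), by = step)
thinned <- data.frame(rt = times[keep], signal = signal[keep])

length(times)  # 501 points before thinning
nrow(thinned)  # 101 points after thinning
```

With many chromatograms, a fivefold reduction in points per trace reduces the number of line segments drawn by the same factor.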

Plot chromatograms as heatmap

Description

Plots the specified traces from a list of chromatograms as a heatmap.

Usage

plot_chroms_heatmap(
  chrom_list,
  idx = NULL,
  lambdas,
  engine = c("base", "ggplot", "plotly"),
  show_legend = TRUE,
  xlim = NULL,
  legend_position = "topright",
  title = "",
  show_ylabs = FALSE
)

Arguments

chrom_list

List of chromatograms to plot.

idx

A vector representing the names or numerical indices of the chromatograms to plot.

lambdas

A character or numeric vector specifying the wavelengths to plot.

engine

Plotting engine. Either base (matplot), plotly, or ggplot.

show_legend

Logical. Whether to display legend or not. Defaults to TRUE.

xlim

Range of x axis.

legend_position

Position of legend.

title

Title for plot.

show_ylabs

Logical. Whether to show y labels. Defaults to FALSE.

Details

Adapted from plot.VPdtw.

Value

No return value, called for side effects.

Side effects

Plots the traces of the specified chromatograms idx at the specified wavelengths lambdas as a heatmap. Plots can be produced using base graphics engine, ggplot2, or plotly, according to the value of engine.

Author(s)

Ethan Bass

See Also

Other visualization functions: boxplot.peak_table(), mirror_plot(), plot.peak_list(), plot.peak_table(), plot_all_spectra(), plot_chroms(), plot_spectrum(), scan_chrom()

Examples

data(Sa_warp)
plot_chroms_heatmap(Sa_warp, lambdas = 210)

Plot spectrum from peak table

Description

Plots the trace and/or spectrum for a given peak or retention time in a peak_table object or a list of chromatograms.

Usage

plot_spectrum(
  loc = NULL,
  peak_table,
  chrom_list,
  idx = "max",
  lambda = "max",
  plot_spectrum = TRUE,
  plot_trace = TRUE,
  spectrum_labels = TRUE,
  scale_spectrum = FALSE,
  export_spectrum = FALSE,
  verbose = TRUE,
  what = c("peak", "rt", "idx", "click"),
  engine = c("base", "plotly", "ggplot2"),
  chr = NULL,
  ...
)

Arguments

loc

The name of the peak or retention time for which you wish to extract spectral data.

peak_table

The peak table (output from get_peaktable).

chrom_list

A list of chromatograms in matrix format (timepoints x wavelengths). If no argument is provided here, the function will try to find the chrom_list object used to create the provided peak_table.

idx

Numerical index of chromatogram you wish to plot, or "max" to automatically plot the chromatogram with the largest signal at the given peak or retention time.

lambda

The wavelength you wish to plot the trace at if plot_trace == TRUE and/or the wavelength to be used for the determination of signal abundance.

plot_spectrum

Logical. If TRUE, plots the spectrum of the chosen peak. Defaults to TRUE.

plot_trace

Logical. If TRUE, plots the trace of the chosen peak at lambda. Defaults to TRUE.

spectrum_labels

Logical. If TRUE, plots labels on maxima in spectral plot. Defaults to TRUE.

scale_spectrum

Logical. If TRUE, scales spectrum to unit height. Defaults to FALSE.

export_spectrum

Logical. If TRUE, exports spectrum to console. Defaults to FALSE.

verbose

Logical. If TRUE, prints verbose output to console. Defaults to TRUE.

what

What to look for. Either peak to extract spectral information for a certain peak, rt to scan by retention time, idx to scan by numeric index, or click to manually select retention time by clicking on the chromatogram. Defaults to "peak" mode.

engine

Which plotting engine to use: base, ggplot2, or plotly.

chr

Deprecated. Please use idx instead.

...

Additional arguments.

Details

Can be used to confirm the identity of a peak or check that a particular column in the peak table represents a single compound. Retention times can also be selected by clicking on the plotted trace if what == 'click'. Plots can be produced using either base R graphics, ggplot2, or plotly, according to the value of the engine argument.

Value

If export_spectrum is TRUE, returns the spectrum as a data.frame with wavelengths as rows and a single column encoding the absorbance (or normalized absorbance, if scale_spectrum is TRUE) at each wavelength. If export_spectrum is FALSE, the output depends on the plotting engine. If engine == "plotly", returns a plotly object containing the specified plots. Otherwise, if engine == "base", there is no return value.

Side effects

  • If plot_trace is TRUE, plots the chromatographic trace of the specified chromatogram (idx), at the specified wavelength (lambda) with a dotted red line to indicate the retention time given by loc. The trace is a single column from the chromatographic matrix.

  • If plot_spectrum is TRUE, plots the spectrum for the specified chromatogram at the specified retention time. The spectrum is a single row from the chromatographic matrix.

Author(s)

Ethan Bass

See Also

Other visualization functions: boxplot.peak_table(), mirror_plot(), plot.peak_list(), plot.peak_table(), plot_all_spectra(), plot_chroms(), plot_chroms_heatmap(), scan_chrom()

Examples

data(Sa)
pks <- get_peaks(Sa, lambda = "220.00000")
pk_tab <- get_peaktable(pks)
oldpar <- par(no.readonly = TRUE)
par(mfrow = c(2, 1))
plot_spectrum(loc = "V10", peak_table = pk_tab, what = "peak")
par(oldpar)
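
The relationship between a trace (one column) and a spectrum (one row) can be shown on a toy matrix in base R. This sketch only illustrates the data layout described in this section; the matrix chrom and its values are made up, and the scaling step is merely similar in spirit to scale_spectrum = TRUE.

```r
# Toy chromatogram: 5 time points (rows) x 3 wavelengths (columns); values are made up.
times <- c(1.0, 1.1, 1.2, 1.3, 1.4)
lambdas <- c(210, 260, 280)
chrom <- matrix(c(0, 2, 8, 3, 0,    # 210 nm
                  0, 1, 4, 1, 0,    # 260 nm
                  0, 0, 1, 0, 0),   # 280 nm
                nrow = 5,
                dimnames = list(as.character(times), as.character(lambdas)))

trace <- chrom[, "210"]        # one column: signal over time at 210 nm
rt <- names(which.max(trace))  # retention time with the largest signal
spectrum <- chrom[rt, ]        # one row: signal across wavelengths at that time
scaled <- spectrum / max(spectrum)  # scaled to unit height
scaled
```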

Plot fitted peak shapes.

Description

Visually assess integration accuracy by plotting fitted peaks over trace.

Usage

## S3 method for class 'peak_list'
plot(
  x,
  ...,
  chrom_list,
  idx = 1,
  lambda = NULL,
  points = FALSE,
  ticks = FALSE,
  a = 0.5,
  color = NULL,
  cex.points = 0.5,
  numbers = FALSE,
  cex.font = 0.5,
  y.offset = 25,
  plot_purity = FALSE,
  res,
  index = NULL
)

Arguments

x

A peak_list object. Output from the get_peaks function.

...

Additional arguments to main plot function.

chrom_list

List of chromatograms (retention time x wavelength matrices).

idx

Index or name of chromatogram to be plotted.

lambda

Wavelength for plotting.

points

Logical. If TRUE, plot peak maxima. Defaults to FALSE.

ticks

Logical. If TRUE, mark beginning and end of each peak. Defaults to FALSE.

a

Alpha parameter controlling the transparency of fitted shapes.

color

The color of the fitted shapes.

cex.points

Size of points. Defaults to 0.5

numbers

Whether to number peaks. Defaults to FALSE.

cex.font

Font size if peaks are numbered. Defaults to 0.5.

y.offset

Y offset for peak numbers. Defaults to 25.

plot_purity

Whether to add visualization of peak purity.

res

Time resolution for peak fitting.

index

This argument is deprecated. Please use idx instead.

Value

No return value, called for side effects.

Side effects

Plots a chromatographic trace from the specified chromatogram (idx) at the specified wavelength (lambda) with fitted peak shapes from the provided peak_list drawn underneath the curve.

Author(s)

Ethan Bass

See Also

get_peaks

Other visualization functions: boxplot.peak_table(), mirror_plot(), plot.peak_table(), plot_all_spectra(), plot_chroms(), plot_chroms_heatmap(), plot_spectrum(), scan_chrom()

Examples

data(Sa_warp)
pks <- get_peaks(chrom_list = Sa_warp[1], lambdas = 210)
plot(pks, points = TRUE, ticks = TRUE)

Plot spectrum from peak table

Description

Plots the trace and/or spectrum for a given peak in peak table.

Usage

## S3 method for class 'peak_table'
plot(
  x,
  loc,
  chrom_list,
  what = "peak",
  idx = "max",
  lambda = "max",
  plot_spectrum = TRUE,
  plot_trace = TRUE,
  box_plot = FALSE,
  vars = NULL,
  spectrum_labels = TRUE,
  scale_spectrum = FALSE,
  export_spectrum = FALSE,
  verbose = TRUE,
  engine = c("base", "plotly", "ggplot"),
  chr = NULL,
  ...
)

Arguments

x

The peak table (output from get_peaktable function).

loc

A vector specifying the peak(s) or retention time(s) that you wish to plot.

chrom_list

A list of chromatograms in matrix format (timepoints x wavelengths). If no argument is provided here, the function will try to find the chrom_list object used to create the peak_table.

what

What to look for. Either peak to extract spectral information for a certain peak, rt to scan by retention time, or click to manually select retention time by clicking on the chromatogram. Defaults to peak.

idx

Numerical index of chromatogram you wish to plot; "max" to plot the chromatogram with the largest signal; or "all" to plot spectra for all chromatograms.

lambda

The wavelength you wish to plot the trace at (if plot_trace is TRUE) and/or the wavelength to be used for the determination of signal abundance.

plot_spectrum

Logical. If TRUE, plots the spectrum of the chosen peak. Defaults to TRUE.

plot_trace

Logical. If TRUE, plots the trace of the chosen peak at lambda. Defaults to TRUE.

box_plot

Logical. If TRUE, plots box plot using categories defined by vars.

vars

Independent variables for boxplot. Righthand side of formula.

spectrum_labels

Logical. If TRUE, plots labels on maxima in spectral plot. Defaults to TRUE.

scale_spectrum

Logical. If TRUE, scales spectrum to unit height. Defaults to FALSE.

export_spectrum

Logical. If TRUE, exports spectrum to console. Defaults to FALSE.

verbose

Logical. If TRUE, prints verbose output to console. Defaults to TRUE.

engine

Which plotting engine to use: base, plotly, or ggplot.

chr

Deprecated. Please use idx instead.

...

Additional arguments to boxplot.

Details

Can be used to confirm the identity of a peak or check that a particular column in the peak table represents a single compound. Can also be used to create simple box-plots to examine the distribution of a peak with respect to variables defined in sample metadata.

Value

If export_spectrum is TRUE, returns the spectrum as a data.frame with wavelengths as rows and columns encoding the absorbance (or normalized absorbance, if scale_spectrum is TRUE) for the specified sample(s). Otherwise, there is no return value.

Side effects

If plot_trace is TRUE, plots the chromatographic trace of the specified chromatogram (idx), at the specified wavelength (lambda) with a dotted red line to indicate the retention time given by loc. The trace is a single column from the chromatographic matrix.

If plot_spectrum is TRUE, plots the spectrum for the specified chromatogram at the specified retention time. The spectrum is a single row from the chromatographic matrix.

If box_plot is TRUE, produces a boxplot from the specified peak with groups provided by vars.

Author(s)

Ethan Bass

See Also

Other visualization functions: boxplot.peak_table(), mirror_plot(), plot.peak_list(), plot_all_spectra(), plot_chroms(), plot_chroms_heatmap(), plot_spectrum(), scan_chrom()


Plot PTW alignments

Description

Plots ptw alignments.

Usage

## S3 method for class 'ptw_list'
plot(
  x,
  what = c("traces", "heatmap"),
  engine = c("base", "ggplot", "plotly"),
  lambdas,
  show_legend = TRUE,
  ...
)

Arguments

x

A ptw_list object created by correct_rt.

what

What type of plot to return. Either traces or heatmap.

engine

What plotting engine to use. Either base, ggplot, or plotly.

lambdas

Which lambdas to plot.

show_legend

Logical. Whether to include sample legend.

...

Additional arguments (placeholder).

Value

No return value, called for side effects.

Side effects

Plots PTW alignments at the specified wavelengths (lambdas) either as individual traces or as a heatmap, according to the value of what. The plot can be produced using either base R graphics, ggplot2, or plotly, according to the value of engine.

Author(s)

Ethan Bass

Examples

data(Sa_pr)
warp <- correct_rt(chrom_list = Sa_pr, what = "models", lambdas = 210)
plot(warp)

Preprocess time/wavelength data

Description

Standard pre-processing of response matrices, consisting of a time axis and a spectral axis (e.g. HPLC-DAD/UV data). For smooth data, like UV-VIS data, the size of the matrix can be reduced by interpolation. By default, the data are baseline-corrected in the time direction (baseline.corr) and smoothed in the spectral dimension using cubic smoothing splines (smooth.spline).

Usage

preprocess(
  X,
  dim1,
  dim2,
  remove.time.baseline = TRUE,
  spec.smooth = TRUE,
  maxI = NULL,
  interpolate_rows = TRUE,
  interpolate_cols = TRUE,
  cl = 2,
  show_progress = NULL,
  outlier_cutoff = 5/60,
  ...
)

Arguments

X

A numerical data matrix, or list of data matrices. Missing values are not allowed. If rownames or colnames attributes are used, they should be numerical and signify time points and wavelengths, respectively.

dim1

A new, usually shorter, set of time points (numerical). The range of these should not exceed the range of the original time points.

dim2

A new, usually shorter, set of wavelengths (numerical). The range of these should not exceed the range of the original wavelengths.

remove.time.baseline

Logical, indicating whether baseline correction should be done in the time direction, according to baseline.corr. Default is TRUE.

spec.smooth

Logical, indicating whether smoothing should be done in the spectral direction, according to smooth.spline. Default is TRUE.

maxI

If given, the maximum intensity in the matrix is set to this value.

interpolate_rows

Logical. Whether to interpolate along the time axis (dim1). Defaults to TRUE.

interpolate_cols

Logical. Whether to interpolate along the spectral axis (dim2). Defaults to TRUE.

cl

Argument to pblapply or mclapply. Either an integer specifying the number of clusters to use for parallel processing or a cluster object created by makeCluster. Defaults to 2. On Windows, integer values are ignored.

show_progress

Logical. Whether to show progress bar. Defaults to TRUE if pbapply is installed.

outlier_cutoff

Threshold for excluding chromatograms that end prematurely. Chromatograms ending more than this value before the median end time are excluded. The default, 5/60, corresponds to 5 seconds when retention times are expressed in minutes. Only applies when dim1 is not specified.

...

Further optional arguments to baseline.corr.

Value

The function returns the preprocessed data matrix (or list of matrices), with row names and column names indicating the time points and wavelengths, respectively.

Note

Adapted from the preprocess function in the alsace package by Ron Wehrens.

Author(s)

Ethan Bass

References

  • Wehrens, R., Bloemberg, T.G., and Eilers P.H.C. 2015. Fast parametric time warping of peak lists. Bioinformatics 31:3063-3065. doi:10.1093/bioinformatics/btv299.

  • Wehrens, R., Carvalho, E., Fraser, P.D. 2015. Metabolite profiling in LC–DAD using multivariate curve resolution: the alsace package for R. Metabolomics 11:1:143-154. doi:10.1007/s11306-014-0683-5.

Examples

data(Sa)
new.ts <- seq(10,18.66,by=.01) # choose time-points
new.lambdas <- seq(200, 318, by = 2) # choose wavelengths
Sa_pr <- preprocess(Sa[[1]], dim1 = new.ts, dim2 = new.lambdas)
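
Two of the steps described above, interpolation onto a shorter time axis and spline smoothing along the spectral axis, can be sketched in base R. This is a simplified illustration on made-up data, not the package's implementation, and it omits baseline correction entirely. The names X_new and X_smooth are hypothetical.

```r
# Toy response matrix: 11 time points x 11 wavelengths (values are made up)
times <- seq(0, 1, by = 0.1)
lambdas <- seq(200, 300, by = 10)
X <- 100 * outer(dnorm(times, mean = 0.5, sd = 0.15),
                 dnorm(lambdas, mean = 250, sd = 20))
dimnames(X) <- list(times, lambdas)

# Interpolate each wavelength column onto a coarser time axis
# (the new time points stay within the original range)
new_times <- seq(0, 1, by = 0.25)
X_new <- apply(X, 2, function(col) approx(x = times, y = col, xout = new_times)$y)

# Smooth each spectrum (row) across wavelengths with a cubic smoothing spline
X_smooth <- t(apply(X_new, 1, function(row) smooth.spline(lambdas, row)$y))
dimnames(X_smooth) <- list(new_times, lambdas)
dim(X_smooth)  # 5 time points x 11 wavelengths
```

Keeping the new axes inside the range of the original ones, as the dim1 and dim2 descriptions advise, avoids extrapolation, which approx would otherwise report as missing values.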

Reshape chromatograms

Description

Reshapes a list of chromatograms from wide to long format.

Usage

reshape_chroms(
  x,
  idx,
  time_resolution = NULL,
  sample_var = "sample",
  lambdas = NULL,
  rts = NULL,
  transfer_metadata = FALSE
)

Arguments

x

A list of chromatographic matrices in wide format.

idx

Indices of chromatograms to convert.

time_resolution

Time resolution for plot. This argument can be used to thin the time axis while reshaping. By default, the time resolution is not altered.

sample_var

String with name of new column containing sample IDs.

lambdas

Vector specifying wavelength(s) to include.

rts

Vector specifying retention times to include.

transfer_metadata

Logical. Whether to transfer metadata attributes or not. Defaults to FALSE.

Value

A list of chromatographic matrices in long format.

Author(s)

Ethan Bass

See Also

Other utility functions: combine_peaks(), get_lambdas(), get_times(), merge_peaks(), reshape_peaktable()
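
What wide-to-long reshaping of chromatograms means can be sketched in base R on toy matrices. This is an illustration only: the helper functions make_chrom and to_long and the output column names (sample, rt, lambda, absorbance) are hypothetical and may differ from what reshape_chroms returns.

```r
# Toy list of two chromatograms (time points x wavelengths); values are made up
make_chrom <- function(offset) {
  matrix(offset + 1:6, nrow = 3,
         dimnames = list(c("1", "2", "3"), c("210", "260")))
}
chroms <- list(a = make_chrom(0), b = make_chrom(10))

# Stack a matrix into one row per (sample, time, wavelength) combination
to_long <- function(m, sample) {
  data.frame(sample = sample,
             rt = as.numeric(rownames(m)[row(m)]),
             lambda = as.numeric(colnames(m)[col(m)]),
             absorbance = as.vector(m))
}
long <- do.call(rbind, Map(to_long, chroms, names(chroms)))
head(long, 3)
nrow(long)  # 2 samples x 3 time points x 2 wavelengths = 12 rows
```

The long format is the layout generally expected by grouped plotting and modelling tools, where each observation occupies its own row.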


Reshape peaktable

Description

Reshapes peak table from wide to long format.

Usage

reshape_peaktable(x, peaks, metadata, fixed_levels = TRUE)

Arguments

x

A peak_table object.

peaks

A character vector specifying the peaks to include. If the character vector is named, the names of the vector elements will be used in place of the original peak names.

metadata

A character vector specifying the metadata fields to include.

fixed_levels

Logical. Whether to fix factor levels of features in the order provided. Defaults to TRUE.

Value

A data.frame containing the information for the specified peaks in long format.

Author(s)

Ethan Bass

See Also

Other utility functions: combine_peaks(), get_lambdas(), get_times(), merge_peaks(), reshape_chroms()
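
The effect of a named peaks vector and fixed factor levels can be sketched in base R on a toy table. This is a hedged illustration, not the package's implementation; the objects tab, sample_meta, and long, and the output column names, are hypothetical.

```r
# Toy peak table and sample metadata; values are made up
tab <- data.frame(V1 = c(5, 7), V2 = c(1, 3), row.names = c("s1", "s2"))
sample_meta <- data.frame(trt = c("ctrl", "trt"), row.names = c("s1", "s2"))

# Names of the vector replace the original peak names in the output
peaks <- c(compound_A = "V1", compound_B = "V2")

long <- data.frame(
  sample = rep(rownames(tab), times = length(peaks)),
  trt    = rep(sample_meta$trt, times = length(peaks)),
  # Factor levels fixed in the order the peaks were provided
  peak   = factor(rep(names(peaks), each = nrow(tab)), levels = names(peaks)),
  value  = unlist(tab[, unname(peaks)], use.names = FALSE)
)
long
```

Fixing the factor levels in the order provided keeps the peaks in that order in downstream plots instead of sorting them alphabetically.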


Raw goldenrod root chromatograms

Description

A list of four HPLC-DAD data matrices of Solidago altissima roots extracted in 90% methanol. Retention times are stored in rows and wavelengths are stored in columns. Data were collected on an Agilent 1100 HPLC.

Usage

data(Sa)

Format

A list of four matrices (1301 times x 60 wavelengths).

See Also

Other data objects: Sa_pr, Sa_warp, pk_tab


Preprocessed goldenrod root chromatograms

Description

A list of four pre-processed HPLC-DAD chromatograms derived from the raw data stored in Sa. Retention times are stored in rows and wavelengths are stored in columns. The time axis is compressed to save space and processing time so the data are a little choppy.

Usage

data(Sa_pr)

Format

A list of four pre-processed matrices (434 retention times x 60 wavelengths).

See Also

Other data objects: Sa, Sa_warp, pk_tab


Warped goldenrod root chromatograms.

Description

A list of four pre-processed and warped goldenrod root chromatograms derived from the raw data stored in Sa.

Usage

data(Sa_warp)

Format

A list of four pre-processed and warped matrices (434 times x 60 wavelengths).

See Also

Other data objects: Sa, Sa_pr, pk_tab


Plot spectra by clicking on the chromatogram.

Description

Plot spectra by clicking on the chromatogram.

Usage

scan_chrom(
  chrom_list,
  idx,
  lambda,
  plot_spectrum = TRUE,
  peak_table = NULL,
  scale_spectrum = FALSE,
  spectrum_labels = TRUE,
  export_spectrum = FALSE,
  chr = NULL,
  ...
)

Arguments

chrom_list

A list of chromatograms in matrix format (timepoints x wavelengths). If no argument is provided here, the function will try to find the chrom_list object used to create the provided peak_table.

idx

Numerical index of chromatogram you wish to plot.

lambda

The wavelength to plot the trace at.

plot_spectrum

Logical. Whether to plot the spectrum or not.

peak_table

The peak table (output from get_peaktable function).

scale_spectrum

Logical. If TRUE, scales spectrum to unit height. Defaults to FALSE.

spectrum_labels

Logical. If TRUE, plots labels on maxima in spectral plot. Defaults to TRUE.

export_spectrum

Logical. If TRUE, exports spectrum to console. Defaults to FALSE.

chr

Deprecated. Please use idx instead.

...

Additional arguments.

Value

If export_spectrum is TRUE, returns the spectrum as a data.frame with wavelengths as rows and a single column encoding the absorbance (or normalized absorbance, if scale_spectrum is TRUE) at each wavelength. Otherwise, there is no return value.

Side effects

Plots a chromatographic trace from the specified chromatogram (idx), at the specified wavelength (lambda) with a dotted red line to indicate the user-selected retention time. The trace is a single column from the chromatographic matrix.

If plot_spectrum is TRUE, plots the spectrum for the specified chromatogram at the user-specified retention time. The spectrum is a single row from the chromatographic matrix.

Author(s)

Ethan Bass

See Also

Other visualization functions: boxplot.peak_table(), mirror_plot(), plot.peak_list(), plot.peak_table(), plot_all_spectra(), plot_chroms(), plot_chroms_heatmap(), plot_spectrum()

Examples

data(Sa_pr)
scan_chrom(Sa_pr, lambda = "210", idx = 2, export_spectrum = TRUE)

Export peak table

Description

Exports peak table in csv or xlsx format according to the value of format.

Usage

write_peaktable(
  peak_table,
  path,
  filename = "peak_table",
  format = c("csv", "xlsx"),
  what = c("tab", "pk_meta", "sample_meta", "ref_spectra", "args")
)

Arguments

peak_table

Peak table object from get_peaktable.

path

Path to write file.

filename

File name. Defaults to "peak_table".

format

File format to export. Either csv or xlsx.

what

Which elements of the peak_table to export.

Value

No return value. The function is called for its side effects.

Side effects

Exports peak_table object as .csv or .xlsx file according to the value of format.

Examples

data(pk_tab)
path_out <- tempdir()
write_peaktable(pk_tab, path = path_out, what = c("tab"))
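
For comparison, a single table can be written to a temporary directory and read back using only base R. This sketch does not call write_peaktable and makes no claim about the files it produces; the table tab and the file name peak_table_sketch.csv are hypothetical.

```r
# Toy peak table (samples x peaks); values are made up
tab <- data.frame(V1 = c(5, 7), V2 = c(1, 3), row.names = c("s1", "s2"))

# Write to a temporary location so nothing is left in the working directory
out_file <- file.path(tempdir(), "peak_table_sketch.csv")
write.csv(tab, out_file)

# Read the file back, restoring sample names from the first column
read_back <- read.csv(out_file, row.names = 1)
read_back
```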