Hardklör

Documentation Summary

Description: Find isotope distributions and compute monoisotopic mass and charge state.
Usage: hardklor <MS1 file> <output file> [parameters]
hardklor -conf <config file>
Input:
  • <MS1 file> - File containing spectral data. The file format will be documented in the future.
  • <config file> - Configuration file containing user-defined parameters and batch instructions.
Output: The output file contains a list of peptide/protein isotope distributions, their monoisotopic masses, charge states, and other helpful information. The output is in text format.
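The usage line above can be assembled programmatically when many files need to be processed. The following is a minimal Python sketch, not part of Hardklör itself: the helper name build_hardklor_command, the file names sample.ms1 and sample.hk, and the assumption that the executable is named hardklor and is on the PATH are all illustrative.

```python
import shlex


def build_hardklor_command(ms1_file, output_file, params=None, executable="hardklor"):
    """Build the argument list for: hardklor <MS1 file> <output file> [parameters].

    `params` maps a flag to None (flag takes no value), a single value,
    or a tuple/list of values (e.g. -res takes a resolution and a code).
    """
    args = [executable, str(ms1_file), str(output_file)]
    for flag, values in (params or {}).items():
        args.append(flag)
        if values is None:
            continue  # switch-style flag such as -u or -nb
        if isinstance(values, (list, tuple)):
            args.extend(str(v) for v in values)
        else:
            args.append(str(values))
    return args


# Hypothetical file names; the flags are taken from the parameter summary.
cmd = build_hardklor_command(
    "sample.ms1",
    "sample.hk",
    {"-cdm": "Q", "-chMax": 5, "-res": (100000, "FTICR"), "-u": None},
)
print(shlex.join(cmd))
# To actually run the analysis: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file paths contain spaces.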

Parameter Summary

Because of the number of configurable parameters, it is recommended that users learn to set up and use configuration files. Detailed explanations of configuration files can be found on this page.

  • -a <algorithm>
  • Chooses the algorithm for analyzing combinations of multiple peptide/protein isotope distributions. There are five algorithms to choose from:
    Basic - Computes all combinatorial possibilities and returns the combination with the highest score.
    FewestPeptides - Computes increasing depths of combinations until the score threshold is exceeded. The smallest combination exceeding the threshold is returned, preventing "over-fitting" of the data.
    FastFewestPeptides - Same as the FewestPeptides algorithm, but trades memory usage for speed. Use this method if there is sufficient memory on the system.
    FewestPeptidesChoice - Same as the FewestPeptides algorithm, but adds a heuristic to evaluate if further combinatorial analysis would produce a better score. This method can dramatically improve speed, but may not be as accurate.
    FastFewestPeptidesChoice - Same as the FewestPeptidesChoice algorithm, but trades memory usage for speed. Use this method if there is sufficient memory on the system.
    The default setting is Basic.
  • -cdm <char>
  • Chooses the charge state determination method. There are five methods to choose from:
    B - Basic method, assume all charge states are possible.
    F - Fast Fourier Transform.
    P - Patterson algorithm.
    Q - QuickCharge method, uses inverse peak distances.
    S - Senko method, or combined Fast Fourier Transform and Patterson algorithm.
    The default setting is B.
  • -chMin <int>
  • Sets the minimum charge state to look for when analyzing a spectrum. The default value is 1.
  • -chMax <int>
  • Sets the maximum charge state to look for when analyzing a spectrum. The default value is 3.
  • -corr <float>
  • Sets the correlation threshold to accept a predicted isotope distribution. Valid values are any decimal value between 0.0 and 1.0, inclusive. The default value is 0.90.
  • -d <int>
  • Sets the depth of combinatorial analysis. This is the maximum number of protein/peptide distributions that can be combined to estimate the observed data at any given spectrum segment. The default value is 3.
  • -hdat <Hardklör DAT file>
  • Gives the full path and file name of the Hardklör data file (typically Hardklor.dat) to be used in the analysis. This flag must be set globally at the top of a config file if the data file to be used is not in the working directory.
  • -i
  • Sets peak detection to intersection mode. Spectra are analyzed for peaks in overlapping segments. When intersection mode is set, peaks are only accepted if they appear in two overlapping segments. Intersection is turned on by default.
  • -m <modification>
  • Includes alternative averagine models in the analysis that incorporate additional atoms and/or isotopic enrichments. Modifications are represented as text strings. Additional atoms are included in the model by entering an atomic formula, such as PO2 or Cl. Isotopic enrichment is included in the model by specifying the enrichment as a decimal fraction, followed by the atom being enriched and an index of the isotope. For example, 0.75H1 specifies 75% enrichment of the first heavy isotope of hydrogen; in other words, 75% deuterium enrichment. Two or more modifications can be combined into the same model by separating them with spaces, for example: B2 0.5B1. This parameter can also be given more than once to include multiple alternative averagine models in a single analysis.
  • -mdat <Mercury DAT file>
  • Gives the full path and file name of the Mercury data file (typically ISOTOPE.DAT) to be used in the analysis. This flag must be set globally at the top of a config file if the data file to be used is not in the working directory.
  • -mF <Filter Code>
  • Sets a filter for mzXML files. If you want to analyze only the MS2 scans in your mzXML file, specify -mF MS2. Valid values are MS1, MS2, MS3.
  • -nb
  • Specifies "no base" averagine. Only modified averagine models will be used in the analysis.
  • -p <int>
  • Sets the maximum number of peptides or proteins that are estimated from the peaks found in a spectrum segment. The default value is 10.
  • -res <double> <MS>
  • Sets the resolution of the observed spectra at m/z 400. The user must specify a resolution followed by a mass spectrometer code. Valid codes are FTICR, OrbiTrap, TOF, and QIT. The default settings are 100000 FTICR.
  • -s <int>
  • Applies polynomial Savitzky-Golay smoothing to the mass spectra prior to analysis. The integer supplied with the flag sets the width of the smoothing window. A larger width produces smoother peaks but alters peak intensities more. By default there is no smoothing.
  • -sc <int> <int>
  • Performs analysis on a specific spectrum or set of spectra in the input file. The user specifies the spectra by scan number. The user may specify a single spectrum, e.g. -sc 523, or a range of spectra, e.g. -sc 300 500.
  • -sl <int>
  • Sets the sensitivity level. There are four levels, 0 (low), 1 (moderate), 2 (high), and 3 (max). Increasing the sensitivity may increase computation time. The default value is 1.
  • -sn <float>
  • Sets the signal-over-noise threshold. Any integer or decimal value greater than or equal to 0.0 is valid. The default value is 3.0.
  • -snWin <float>
  • Sets the signal-over-noise window length (in m/z). Because noise may be non-uniform across a spectrum, this value adjusts the segment size considered when calculating a signal-over-noise ratio. The default value is 50.0.
  • -u
  • Sets peak detection to union mode. Spectra are analyzed for peaks in overlapping segments. When union mode is set, peaks are accepted regardless of whether they appear in one segment or two overlapping segments. Union is turned off by default.
  • -w <double> <double>
  • Narrows analysis to only a small window in each segment (in m/z). The user must specify the starting and ending m/z values between which the analysis will be performed. By default the whole spectrum is analyzed.
  • -win <float>
  • Sets the maximum width of any set of peaks in a spectrum when computing the results (in m/z). Thus, if the value is 5.0, sets of peaks spanning more than 5 m/z are divided into smaller sets prior to analysis. The default value is 5.0.
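
The modification strings accepted by -m follow a small grammar: an atomic formula (PO2, Cl) or an enrichment written as a decimal fraction, an element, and an isotope index (0.75H1). The Python sketch below shows one way to read such strings, for example to validate them before launching a run. It is an illustration based only on the description above, not Hardklör's own parser; the function names and the returned dictionary layout are assumptions.

```python
import re

# Enrichment token: decimal fraction, element symbol, isotope index (e.g. 0.75H1).
ENRICHMENT = re.compile(r"^(\d*\.?\d+)([A-Z][a-z]?)(\d+)$")
# One element of an atomic formula with an optional count (e.g. P, O2, Cl).
FORMULA_PART = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_modification(token):
    """Parse a single modification token into a dictionary (hypothetical layout)."""
    match = ENRICHMENT.match(token)
    if match:
        fraction = float(match.group(1))
        if not 0.0 <= fraction <= 1.0:
            raise ValueError(f"enrichment must be between 0 and 1: {token!r}")
        return {
            "kind": "enrichment",
            "fraction": fraction,
            "element": match.group(2),
            "isotope": int(match.group(3)),
        }
    parts = FORMULA_PART.findall(token)
    # Reject tokens that are not made up entirely of element/count pairs.
    if not parts or "".join(elem + count for elem, count in parts) != token:
        raise ValueError(f"not a valid modification: {token!r}")
    atoms = {}
    for elem, count in parts:
        atoms[elem] = atoms.get(elem, 0) + int(count or 1)
    return {"kind": "formula", "atoms": atoms}


def parse_model(text):
    """Parse a space-separated model such as 'B2 0.5B1' into a list of tokens."""
    return [parse_modification(token) for token in text.split()]


for example in ("PO2", "Cl", "0.75H1", "B2 0.5B1"):
    print(example, "->", parse_model(example))
```

A token that begins with a digit is treated as an enrichment and anything else as a formula, which matches the examples given for -m but may not cover every string Hardklör accepts.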


    Hardklör is Copyright ©2007 University of Washington. All rights reserved. Written by Michael R. Hoopmann and Michael J. MacCoss in the Department of Genome Sciences at the University of Washington.