Customization: how to adjust hermie parameters
Hermie is quite flexible and can be adjusted to meet your needs. This section covers how to change the inputs to each step of the pipeline including the organism being searched, the fasta file used by the database search, the parameter files used, and the DTASelect criteria. All parameters can be given on the command line. Sometimes this leads to very long, cumbersome commands that you might not want to type more than once. For this reason, all parameters are also controlled by what is called the search mode. Creating and using custom search modes is described below.
Custom Search Modes and Organisms
A search mode is a text file containing a list of the desired options and their arguments. All of the available options can be found with the command
$ hermie -help
-Smn
7). See the documentation
for more details on the function of each option.
Syntax. When creating a custom mode file, you might want to refer to one of the default modes, which are found in /net/maccoss/vol2/software/pipeline/modes/. The basic syntax is to write an option name followed by its argument, if any. Options may be separated by any whitespace (space, tab, newline). Comments may be added to the file by beginning the line with a pound sign (#).
Multiple options. Most of the options should be
specified only once. If they are included twice, the value of the
last occurrence will be used. If they are specified in both the mode
file and on the command line, the command-line value will be
used. There is an exception. Six options may be given multiple times.
These are --dta-sequest, --dta-library, --cz, --lib,
--perc-option, --crux-option. With these, you may specify a set
of options to be passed to programs run within hermie. Yes, this is slightly confusing; you are
using a hermie option to specify, say, a DTASelect option. The general form is
--program option=argument
program indicates which program this option is for,
option is the option as it would be entered on the
command line, and argument is the option's argument, if
there is one. For instance, for hermie to run
$ DTASelect -Smn 7 -o -a false
you would include the following in the mode file
--dta-sequest -Smn=7 --dta-sequest -o= --dta-sequest -a=false
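The translation from a program's own command line to the pass-through form can be sketched as a small bash function. The function name to_passthrough is hypothetical and is not part of hermie. It assumes that any token that does not start with a dash is the argument of the option before it, so it would misread a negative number given as an argument.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of hermie): rewrite a program's options in
# hermie pass-through form, e.g. "-Smn 7" becomes "--dta-sequest -Smn=7".
# Assumption: a token without a leading "-" is the argument of the option
# that precedes it.
to_passthrough() {
    local prefix=$1; shift
    local out=() opt arg
    while [ $# -gt 0 ]; do
        opt=$1; shift
        arg=""
        # Take the next token as the argument only if it has no leading dash
        if [ $# -gt 0 ] && [ "${1#-}" = "$1" ]; then
            arg=$1; shift
        fi
        out+=("$prefix" "$opt=$arg")
    done
    printf '%s\n' "${out[*]}"
}

# The DTASelect example from the text
to_passthrough --dta-sequest -Smn 7 -o -a false
```

Options without an argument keep a trailing equal sign (-o=), matching the mode-file line shown above.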
Custom organisms. Many model organisms are already established in the defaults. To get a list of available organisms run
$ hermie -list-organisms
To search an organism that is not on the list, give the organism as other and include the -fasta, -decoy (for percolator), and -library options in the search mode. These options can only be used with the organism other; with any other organism they will be overridden by the default sources.
Using your custom mode. Once you have written your custom mode file, make sure it is located in a directory in your $MODEPATH (see Understanding $MODEPATH). To use your new mode named, for example, platypus.mode use the command
$ hermie other platypus.mode my.ms2
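As a rough sketch, a custom mode file for the organism other could be written from the shell as shown here. Every path and file name is a hypothetical placeholder, and the assumption that -fasta, -decoy, and -library each take a file path should be checked against the output of hermie -help. The file is written to the current directory, which only works if . is listed in your $MODEPATH.

```shell
#!/usr/bin/env bash
# Sketch: write a custom mode file into the current directory.
# Assumptions: "." is in $MODEPATH; all /path/to/... values are placeholders;
# -fasta, -decoy and -library are assumed to take a file path each.
cat > platypus.mode <<'EOF'
# platypus.mode: custom search mode for use with the organism "other"
-fasta /path/to/platypus.fasta
-decoy /path/to/platypus-decoy.fasta
-library /path/to/platypus.library
# options passed through to DTASelect
--dta-sequest -Smn=7
--dta-sequest -o=
--dta-sequest -a=false
EOF

echo "wrote platypus.mode"
```

Lines beginning with # are comments, and the three --dta-sequest lines reproduce the DTASelect example from the Multiple options paragraph.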
Custom sequest.params
The default sequest.params file used for the SEQUEST search will meet most people's needs. However, if you want to include modifications, change the number of matches reported, or make other adjustments to the SEQUEST search, you will need to provide a new sequest.params file and make sure hermie knows where to look for it. Follow these steps.
- Write the sequest.params file with the options you want. If you are bold enough to do this, I'll assume you know what to change. The file can have any name (SEQUEST requires that it is called "sequest.params", but hermie will make a copy of it and correct the name, if necessary.)
- Move the file to a convenient location. "Convenient" can mean whatever you like. If you expect to do this sort of SEQUEST search only once, you might want to put the file in the directory where you will start the hermie run. Or if you plan on doing all your searches this way, you might want to put it in your home directory or some other appropriate location (e.g. ~/othermodes).
- Add the option -seq-params and the name of your file to your hermie command or to your custom mode file.
- Check your configuration. Move to the directory where you plan on running hermie and check the setup. The command might look like this
$ hermie -check -seq-params ~/param_files/phospho.params yeast standard
Now look in pipeline/sequest/ and read the sequest.params file to make sure that it looks right.
If you do not give -seq-params, the first file named sequest.params in your $MODEPATH will be used.
Dynamic Modifications
For SEQUEST, dynamic modifications are specified in the sequest.params file (see Custom sequest.params). They are given as a mass shift and a list of residues that could be modified. With percolator version 2, modifications must also be specified by their Unimod number. Hermie requires that you provide both of these values.
- Define your modifications in a sequest.params file. To search for phosphorylation, the line might look like this
diff_search_options = 79.966 STY 0.0 X 0.0 X 0.0 X 0.0 X 0.0 X
- Find the Unimod number for the modification. Look at the summary table to find your mod. The table includes the ID number, name, mass shift (monoisotopic and average masses), and a list of residues that might be affected. Try searching for the name or for the mass shift value. Phospho is ID 21.
- Include with your hermie command the -mods option followed by the Unimod ID. For our example
$ hermie -mods 21 -seq-params myseq.params yeast standard-perc
NOTE: If you forget the -mods option, percolator will exit with an error. Rerun hermie, skipping all steps before percolator, and add the -mods option then.
- If you specify more than one modification, they must be given in the same order in the sequest.params as they are with the -mods option. For example, this line in a sequest.params
diff_search_options = 79.966 STY 15.99 M 188.032956 K 0.0 X 0.0 X 0.0 X
would be accompanied by this option
-mods 21,35,42
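Because percolator fails late in the run when the two settings disagree, a quick check before starting can save time. The sketch below is a hypothetical helper, not part of hermie. It only compares how many non-zero mass shifts appear in a diff_search_options line with how many Unimod IDs are given to -mods. It cannot verify that the IDs are in the matching order.

```shell
#!/usr/bin/env bash
# Hypothetical sanity check (not part of hermie).

# Count the non-zero mass shifts in a diff_search_options line.
# The fields after "=" alternate between a mass shift and a residue list.
count_diff_mods() {
    printf '%s\n' "$1" | awk -F'=' '{
        n = split($2, f, " ")
        count = 0
        for (i = 1; i <= n; i += 2) if (f[i] + 0 != 0) count++
        print count
    }'
}

# Count the comma-separated Unimod IDs as they would be given to -mods.
count_unimod_ids() {
    printf '%s\n' "$1" | awk -F',' '{ print NF }'
}

line='diff_search_options = 79.966 STY 15.99 M 188.032956 K 0.0 X 0.0 X 0.0 X'
mods='21,35,42'

if [ "$(count_diff_mods "$line")" -eq "$(count_unimod_ids "$mods")" ]; then
    echo "OK: -mods $mods has one ID per modification in sequest.params"
else
    echo "Mismatch between -mods and diff_search_options" >&2
fi
```

In practice the line would be read from your own file, for example with grep '^diff_search_options' myseq.params.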
Understanding $MODEPATH
The environment variable $MODEPATH is a list of directories where hermie will look for the search mode and the sequest.params file. It can be changed for every run you do, or you can set it to always be the same. To find out your current value of $MODEPATH, do
$ echo $MODEPATH
.:~/modsearches:~/modes:/net/maccoss/vol2/software/pipeline/modes/
For whatever value you choose, remember that hermie will look in those locations in the order
they are listed and stop once it finds a file with the right name.
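To predict which copy of a file such a first-match search would pick, the lookup can be imitated in bash. The function find_in_modepath is a hypothetical helper that mimics the behavior described here. It is not hermie's own code, and it falls back to the current directory when $MODEPATH is unset, which is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: walk $MODEPATH in order and print the first match.
# This imitates the documented lookup; it is not taken from hermie.
find_in_modepath() {
    local name=$1 dir
    local IFS=':'                      # split the path list on colons
    for dir in ${MODEPATH:-.}; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0                   # stop at the first directory that has it
        fi
    done
    return 1
}

find_in_modepath sequest.params || echo "sequest.params not found in MODEPATH"
```

A file of the same name in a later directory is never reached, so a stray sequest.params in the current directory would shadow the one in your modes directory.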
Syntax for setting the value. To set any environment variable in a bash shell, use the command
$ export VARNAME=newvalue
There must be no spaces between VARNAME or newvalue and the equal sign (=). In our case VARNAME is MODEPATH. The newvalue is a path or a list of paths separated by colons such as .:~/my/new/path:/another. You can add new paths to the current value by including $MODEPATH in the list of paths. For instance, to add a new path to the end of the list, use the command
$ export MODEPATH=$MODEPATH:~/new/path
To add it to the beginning of the list instead, use
$ export MODEPATH=~/new/path:$MODEPATH
Set the value temporarily. You can set the value
for $MODEPATH (or any environment variable) temporarily for the shell
you are currently working in by using the export command
described above. Once you close the shell, the value disappears.
Set the value for all runs. You may set the value
of $MODEPATH for every bash shell you open by putting the
export command in the file ~/.bashrc. To do this, simply open the file in a
text editor (vim, Emacs, etc.), add the
command anywhere in the file (although you might want to put it near
the definitions of other environment variables), save the changes, and
close the file. You will have to open a new shell to see the changes.
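One caveat with a plain export line in ~/.bashrc is that every nested shell runs it again, so the same directory can pile up in $MODEPATH. A minimal sketch of a guard is shown here. The function name add_to_modepath and the directory ~/modes are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Sketch for ~/.bashrc: prepend a directory to MODEPATH only if it is not
# already listed. The function name and the directory are placeholders.
add_to_modepath() {
    case ":${MODEPATH:-}:" in
        *":$1:"*) ;;                                   # already present
        *) MODEPATH="$1${MODEPATH:+:$MODEPATH}" ;;     # prepend
    esac
    export MODEPATH
}

add_to_modepath "$HOME/modes"
echo "MODEPATH is now $MODEPATH"
```

Calling the function a second time with the same directory leaves the value unchanged.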
Running SEQUEST on other clusters
When you log on to proteome and run hermie from there, the processor queue available is specific to the MacCoss lab (nodes named m001-m008). Genome Sciences also hosts a cluster that is available to everyone in the department. You may run hermie from there with a few minor modifications. Follow these steps.
- Log on to sage.gs.washington.edu. This is the master node for the cluster.
- DO NOT run hermie from sage. Instead, start an interactive session on one of the cluster nodes with the qlogin command.
- Your environment will be different than on proteome. Set $PATH to include the location of hermie and all of its component parts.
$ export PATH=$PATH:/net/maccoss/vol2/software/bin64:/net/maccoss/vol2/software/bin
- Run the hermie command as usual, using nohup to ensure that it runs after you exit your interactive session.
- (Optional) If you have permissions to run on the quartz queue (owned by the UW Proteome Resource), you can add the option --queue quartz.q to run on those processors. Log on to tephra.gs to access that queue.
- Once everything is running as expected, log out.
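Put together, the commands typed inside the qlogin session might look roughly like this. The hermie arguments (yeast standard my.ms2) and the log file name hermie.log are placeholders chosen for illustration. The check with command -v is an added precaution and is not something hermie requires.

```shell
#!/usr/bin/env bash
# Sketch of the commands to run inside a qlogin session on a cluster node.
# Assumptions: "yeast standard my.ms2" and "hermie.log" are placeholders.
export PATH=$PATH:/net/maccoss/vol2/software/bin64:/net/maccoss/vol2/software/bin

if command -v hermie >/dev/null 2>&1; then
    # nohup keeps the run alive after the interactive session ends
    nohup hermie yeast standard my.ms2 > hermie.log 2>&1 &
    echo "hermie started in the background; output goes to hermie.log"
else
    echo "hermie not found; check that PATH is set as above" >&2
fi
```

To use the quartz queue, --queue quartz.q would be added to the hermie command line.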