Description: Creates a library of spectra with known peptide identifications. Typically, these identifications come from a database search such as SEQUEST or Mascot, sometimes followed by an evaluation step such as Percolator or Peptide Prophet. BlibBuild accepts files from a variety of database search programs. File formats are identified by file extension, as listed in the table below. In many cases, the peptide identification (peptide sequence, charge state, and optional score) is in a separate file from the spectrum information. Unless noted, both files are assumed to be in the same directory.
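The extension-based detection described above can be pictured as a suffix lookup. The following is a minimal sketch, not BlibBuild's actual implementation: the suffixes are the ones this page lists as recognized, while the function name guess_search_type, the label strings, and the choice of case-sensitive matching are assumptions made for illustration.

```python
# Suffixes copied from this page's list of recognized formats. The labels are
# illustrative only and are not identifiers used by BlibBuild itself.
PEPXML_LABEL = "pepXML (Peptide Prophet, Spectrum Mill, OMSSA)"
KNOWN_SUFFIXES = {
    ".perc.xml": "SEQUEST / Percolator",
    ".sqt": "SEQUEST / Percolator",
    ".pep.xml": PEPXML_LABEL,
    ".pep.XML": PEPXML_LABEL,
    ".pepXML": PEPXML_LABEL,
    ".xtan.xml": "X! Tandem",
    ".dat": "Mascot",
    ".group.xml": "Protein Pilot",
    ".idpXML": "ID Picker (Myrimatch)",
    ".mzid": "Scaffold",
    "final_fragment.csv": "MSe",  # no leading '.' is required
    ".ssl": "generic",
}


def guess_search_type(filename):
    """Return the label of the longest matching suffix, or None if none match.

    Matching is case-sensitive here (an assumption), which is why the
    three pepXML spellings are listed separately.
    """
    matches = [suffix for suffix in KNOWN_SUFFIXES if filename.endswith(suffix)]
    if not matches:
        return None
    return KNOWN_SUFFIXES[max(matches, key=len)]
```

A check like this could be run over a list of input files before invoking BlibBuild, so that a file with an unrecognized extension (for example a plain .xml file) is caught early.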
Database search | Peptide ID file extension | Spectrum file extension | Notes
SEQUEST / Percolator | .perc.xml (.sqt) | .cms2, .ms2 | Percolator v1.17 does not include sequence modification information therefore the .sqt file from the SEQUEST search must be present in the same directory.
Peptide Prophet | .pep.xml, .pep.XML, .pepXML | .mzXML | The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory.
Spectrum Mill | .pep.xml, .pep.XML, .pepXML | .mzXML, .mzML | The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory.
OMSSA | .pep.xml, .pep.XML, .pepXML | .mzXML, .mzML | The names of the .mzXML files are given in the .pep.xml file and may be in the parent or grandparent directory.
X! Tandem | .xtan.xml | | No separate spectrum file.
Mascot | .dat | | No separate spectrum file. Windows only.
Protein Pilot | .group.xml | | No separate spectrum file.
ID Picker (Myrimatch) | .idpXML | .mzXML, .mzML | The name(s) of the spectrum file(s) are given in the .idpXML file.
Scaffold | .mzid | .MGF |
MSe | final_fragment.csv | | There need not be a . before 'final_fragment'.
generic | .ssl | | This generic format is provided for peptide identifications made by other means. See the file formats page for a description.
Usage:
BlibBuild [options] <peptide id file>[+] <library name>
Input:
<peptide id file>
– A file containing peptide-spectrum matches to be included in the library. The associated spectrum files should be in the same directory as the peptide id file but should not be given on the command line. See the above table for recognized formats. Multiple files may be listed together.
<library name>
– The name of the library being created. An existing library may be overwritten or added to.
Output: A spectrum library in sqlite3 format.
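Because the output is an SQLite file, it can be opened with any SQLite client. The sketch below uses only Python's standard sqlite3 module to list the tables in a library. The function name list_tables is illustrative, and the code makes no assumption about which tables BlibBuild creates; it simply reads SQLite's own catalog. The file is opened read-only so that a mistyped path raises an error instead of silently creating an empty database.

```python
import sqlite3
import sys
from pathlib import Path


def list_tables(library_path):
    """Return the table names found in an SQLite file, sorted alphabetically."""
    # A file: URI with mode=ro opens the database read-only and fails if the
    # file does not exist.
    uri = Path(library_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__" and len(sys.argv) == 2:
    # Example: python list_tables.py new.blib
    for table_name in list_tables(sys.argv[1]):
        print(table_name)
```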
Options:
-o
Overwrite existing library. Default append.
-s
Read result file names from stdin. (e.g. ls *pep.xml | BlibBuild -s new.blib)
-q <max score>
Maximum FDR for accepting results from Percolator (.sqt or .perc.xml) files. Default 0.01.
-p <min score>
Minimum probability for accepting results from PeptideProphet (.pep.xml) files. Default 0.95.
-e <max score>
Maximum expectation value for accepting results from Mascot (.dat) files. Default 0.05.
-t <max score>
Maximum expectation value for accepting results from X! Tandem (.xtan.xml) files. Default 0.1.
-f <min score>
Minimum Scaffold Peptide Probability for accepting results from .mzid files. Default 0.95.
-w <min score>
Minimum Waters MSe peptide score for accepting results from final_fragment.csv files. Default 6.
-E <max score>
Maximum expectation value for accepting results from OMSSA .pep.xml files. Default 0.00001.
-v <level>
Level of output to stderr (silent, error, status, warn). Default status.
-L
Write status and warning messages to log file.
-m <size>
SQLite memory cache size in megabytes. Default 250M.
-l <level>
ZLib compression level (0-?). Default 3.
-i <library_id>
LSID library ID. Default uses file name.
-a <authority>
LSID authority. Default proteome.gs.washington.edu.
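The usage line and options above can be combined programmatically. The following sketch assembles an argument list in the documented order (options, then one or more peptide id files, then the library name). The function name build_command and its parameter names are hypothetical, only the -o, -q, and -p options are covered, and it is assumed that each option and its value are passed as separate arguments. Nothing is executed; the function only returns the list.

```python
def build_command(id_files, library, overwrite=False,
                  percolator_max_fdr=None, peptide_prophet_min_prob=None):
    """Build: BlibBuild [options] <peptide id file>[+] <library name>."""
    if not id_files:
        raise ValueError("at least one peptide id file is required")
    cmd = ["BlibBuild"]
    if overwrite:
        cmd.append("-o")  # overwrite instead of the default append
    if percolator_max_fdr is not None:
        # Assumption: the option and its value are separate arguments.
        cmd += ["-q", str(percolator_max_fdr)]
    if peptide_prophet_min_prob is not None:
        cmd += ["-p", str(peptide_prophet_min_prob)]
    cmd += list(id_files)   # spectrum files are not listed on the command line
    cmd.append(library)     # the library name comes last
    return cmd


example = build_command(
    ["a.pep.xml", "b.pep.xml"], "new.blib",
    overwrite=True, peptide_prophet_min_prob=0.99,
)
print(example)
# ['BlibBuild', '-o', '-p', '0.99', 'a.pep.xml', 'b.pep.xml', 'new.blib']
```

If BlibBuild is on the PATH, a list like this could be passed to subprocess.run(example, check=True). Returning a list rather than a single string avoids shell quoting problems with file names that contain spaces.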