Submission Parameters
The Submission Window
This is the submission window; each part of the submission process is explained in detail below. Note that you can set the job title here instead of waiting to set it in the job list. You can also save the current parameters as the default parameters at any time.
Define the Search Space
Database(s) : Select one or several databases to use for the search.
Taxonomy : Click on "Edit" to change the taxonomy (read more)
More filters : You can further restrict the search by clicking on "more filters..." (read more).
Enzyme : Specify the enzyme you used to generate your peptides (see the list and cleavage rules here).
Modification(s) : Click on "Edit..." to specify the chemical modifications (read more).
Annotated PTM : Allows you to generate theoretical peptides using the Swiss-Prot information on PTMs.
Missed cleavage : Select the number of missed cleavages allowed.
Resolution : Specify the isotopic resolution of the experimental masses. The theoretical masses of the peptides will be calculated accordingly.
Ion mode : Specify the charge state of the peptides:
- Protonated molecular ions, [M+H]+.
- Deprotonated molecular ions, [M-H]-.
- Molecular mass data, [M].
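To make the three ion modes concrete, the sketch below converts a mass value given in each mode to a neutral molecular mass [M]. The function name `neutral_mass` and the rounded proton mass constant are assumptions made for this illustration; how Aldente performs the conversion internally is not described here.

```python
# Mass of a proton in Daltons (rounded).
PROTON_MASS = 1.007276


def neutral_mass(value, ion_mode):
    """Convert a mass value to the neutral molecular mass [M].

    ion_mode is one of "[M+H]+", "[M-H]-" or "[M]".
    """
    if ion_mode == "[M+H]+":
        # Protonated ion: remove the added proton.
        return value - PROTON_MASS
    if ion_mode == "[M-H]-":
        # Deprotonated ion: add back the missing proton.
        return value + PROTON_MASS
    if ion_mode == "[M]":
        # Already a molecular mass.
        return value
    raise ValueError(f"unknown ion mode: {ion_mode!r}")


print(neutral_mass(1001.007276, "[M+H]+"))
```

Choosing the wrong ion mode moves every mass by about 1 Da (or 2 Da between the protonated and deprotonated modes), which is far larger than typical tolerance settings.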
The calibration of the spectrometer is modeled as an affine function (a straight line). You have to define thresholds for this calibration: the maximum shift and the maximum slope (see the principles here). You can set either Shift or Slope to zero. If you use both, they are combined with a logical OR.
Shift max : Defines the maximum difference that you allow between a theoretical peptide and an experimental peak in Daltons (absolute value).
Slope max : Defines the maximum difference that you allow between a theoretical peptide and an experimental peak in ppm (relative value "parts per million").
Internal error : The program will find the best set of aligned hits amongst all possible hits (a hit is an experimental peak matching a theoretical peptide). You must define the tolerance of the alignment, which is the error (in ppm) by which each hit is allowed to deviate from the alignment.
Number of hits : Hit = an experimental peak matching a theoretical peptide. If a protein contains fewer hits than this threshold, it will automatically be discarded.
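One possible reading of how Shift max and Slope max combine with a logical OR is sketched below: a peak is accepted as a candidate hit if it lies within the absolute threshold or within the relative one. The function name `within_tolerance` is hypothetical, and computing the ppm window relative to the theoretical mass is an assumption; the actual alignment procedure is more involved (see the principles page).

```python
def within_tolerance(theoretical, experimental, shift_max=0.0, slope_max=0.0):
    """Return True if the peak matches under Shift max OR Slope max.

    shift_max is in Daltons (absolute), slope_max in ppm (relative).
    """
    delta = abs(experimental - theoretical)
    # Absolute window in Daltons.
    within_shift = delta <= shift_max
    # Relative window: ppm of the theoretical mass (assumption).
    within_slope = delta <= theoretical * slope_max * 1e-6
    return within_shift or within_slope


print(within_tolerance(1000.0, 1000.15, shift_max=0.2))    # shift only
print(within_tolerance(2000.0, 2000.15, slope_max=100.0))  # slope only
```

Under this reading, setting one of the two thresholds to zero simply leaves the other one in charge, and the effective window at a given mass is the larger of the two.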
Statistics
To judge whether an identification is relevant, you can choose to compare results against random scores. Random sequences are used in order to compare the best random score with the score of the proteins found. The number of random sequences generated is the same as the number of proteins "in range" in the database. The significance of a result, based on the best random score, will differ with the size of the database searched: a given score can be significant when searching a very small database and insignificant when searching a larger one.
Output Display
Define the maximum number of proteins to be displayed.
Select Your Peak Lists
Click the "Add..." button to add your peak lists. (See Peak Lists.) For each peak list, you can then edit the name and, if known, the pI and Mw. The default name is the name of the file the peak list comes from. If the file contains several peak lists, the location index is added at the end of the name.
Edit Taxonomy
You can select a predefined taxon or define a taxId combination using the NCBI taxonomy (see Newt). Each predefined taxon or taxId represents a subtree in the taxonomy: for example, 40674 represents all the species inside the Mammalia hierarchy.
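The idea that a taxId stands for a whole subtree can be illustrated with a small sketch. The `PARENT` table below is a toy, heavily abbreviated excerpt of the NCBI taxonomy (intermediate ranks are skipped), and the function name `in_subtree` is hypothetical; it only shows the principle, not how the server resolves taxonomy restrictions.

```python
# Toy child -> parent links (abbreviated; real lineages have more levels).
PARENT = {
    9606: 40674,   # Homo sapiens -> Mammalia
    10090: 40674,  # Mus musculus -> Mammalia
    9031: 8782,    # Gallus gallus -> Aves
    40674: 7742,   # Mammalia -> Vertebrata
    8782: 7742,    # Aves -> Vertebrata
    7742: 1,       # Vertebrata -> root
}


def in_subtree(tax_id, root_id, parent=PARENT):
    """Return True if tax_id is root_id or one of its descendants."""
    current = tax_id
    while True:
        if current == root_id:
            return True
        if current not in parent:
            return False
        # Walk one level up towards the root.
        current = parent[current]


print(in_subtree(9606, 40674))  # human is inside Mammalia
print(in_subtree(9031, 40674))  # chicken is not
```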
Protein / More Filters
In addition to the restriction on the databases and the taxonomy, you can restrict the search by defining the Mw and pI range, specific AC(s) or Swiss-Prot keywords.
Peptide / Enzyme, List and Cleavage Rules
| Enzyme or Reagent | Cleaves Where? | Exceptions |
| --- | --- | --- |
| Trypsin | C-terminal side of K or R | if P is C-term to K or R |
| Trypsin (C-term to K/R, even before P) | C-terminal side of K or R | |
| Trypsin (higher specificity) | C-terminal side of K or R | if P is C-term to K or R; after K in CKY, DKD, CKH, CKD, KKR; after R in RRH, RRR, CRK, DRD, RRF, KRR |
| Lys C | C-terminal side of K | |
| CNBr | C-terminal side of M | |
| Arg C | C-terminal side of R | if P is C-term to R |
| Asp N | N-terminal side of D | |
| Asp N + N-terminal Glu | N-terminal side of D or E | |
| Glu C (bicarbonate) | C-terminal side of E | if P is C-term to E, or if E is C-term to E |
| Glu C (phosphate) | C-terminal side of D or E | if P is C-term to D or E, or if E is C-term to D or E |
| Chymotrypsin (C-term to F/Y/W/M/L, not before P, not after Y if P is C-term to Y) | C-terminal side of F, L, M, W, Y | if P is C-term to F, L, M, W, Y; if P is N-term to Y |
| Chymotrypsin (C-term to F/Y/W, not before P, not after Y if P is C-term to Y) | C-terminal side of F, Y, W | if P is C-term to F, Y, W; if P is N-term to Y |
| Trypsin/Chymotrypsin (C-term to K/R/F/Y/W, not before P, not after Y if P is C-term to Y) | C-terminal side of K, R, F, Y, W | if P is C-term to K, R, F, Y, W; if P is N-term to Y |
| Pepsin (pH 1.3) | C-terminal side of F, L | |
| Pepsin (pH > 2) | C-terminal side of F, L, W, Y, A, E, Q | |
| Proteinase K | C-terminal side of A, C, G, M, F, S, Y, W | |
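As a rough illustration of how a cleavage rule from the table combines with the Missed cleavage parameter, here is a minimal sketch for the plain Trypsin rule (cleave on the C-terminal side of K or R, except when P follows). The function names `cleavage_sites` and `digest` are hypothetical and this is not Aldente's own code; it does not implement the context-dependent exceptions of the higher-specificity rule.

```python
def cleavage_sites(seq, cleave_after="KR", blocked_by="P"):
    """Return the positions in seq where the enzyme cuts.

    A position i means the bond between seq[i-1] and seq[i] is cleaved.
    """
    sites = []
    for i in range(len(seq) - 1):
        if seq[i] in cleave_after and seq[i + 1] not in blocked_by:
            sites.append(i + 1)
    return sites


def digest(seq, missed_cleavages=0, cleave_after="KR", blocked_by="P"):
    """Return all peptides with up to missed_cleavages skipped sites."""
    bounds = [0] + cleavage_sites(seq, cleave_after, blocked_by) + [len(seq)]
    peptides = []
    for start in range(len(bounds) - 1):
        for skipped in range(missed_cleavages + 1):
            end = start + 1 + skipped
            if end >= len(bounds):
                break
            peptides.append(seq[bounds[start]:bounds[end]])
    return peptides


print(digest("AAKGGRPCCKDD"))
print(digest("AAKGGRPCCKDD", missed_cleavages=1))
```

In the example sequence the R is followed by P, so no cut is made there. Passing `blocked_by=""` would correspond to the "even before P" variant of the rule, and `cleave_after="K"` to Lys C.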
Peptide / Edit Modifications
Click the "Add..." button to add modifications from a predefined list or modify them by hand.
Name : The complete name of the modification.
Label : Define a label to identify the modification which will be displayed in the output result.
Locus : Define the locus where the modification should appear. Use the one-letter amino acid code and the special character "$" for the C- or N-terminal positions of the peptide.
Formula : Define the chemical formula to be added to the defined locus. Follow each atom by its count; the default count is one, and a negative value means "remove". Example: CH3O-1H-1 would replace an OH by a CH3. Note! You can also specify charges using parentheses, for example: SO4(2-) or H(+) or C4H8Cl.
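To show how such a formula can be read, the following sketch parses the atom/count notation into element counts and computes a monoisotopic mass shift. The names `parse_formula` and `mass_shift` are hypothetical, the mass table is deliberately partial, and charges in parentheses such as SO4(2-) are not handled; it is an illustration of the notation, not the parser used by the program.

```python
import re
from collections import Counter

# One element symbol followed by an optional (possibly negative) count.
TOKEN = re.compile(r"([A-Z][a-z]?)(-?\d+)?")
FULL = re.compile(r"(?:[A-Z][a-z]?(?:-?\d+)?)+")

# Partial table of monoisotopic masses in Daltons.
MONOISOTOPIC = {
    "H": 1.00782503,
    "C": 12.0,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207117,
}


def parse_formula(formula):
    """Return {element: net count}; negative counts mean 'remove'."""
    if not FULL.fullmatch(formula):
        raise ValueError(f"cannot parse formula: {formula!r}")
    counts = Counter()
    for element, number in TOKEN.findall(formula):
        counts[element] += int(number) if number else 1
    return dict(counts)


def mass_shift(formula):
    """Monoisotopic mass change caused by the modification."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


print(parse_formula("CH3O-1H-1"))
print(mass_shift("CH3O-1H-1"))
```

For CH3O-1H-1 the net effect is one C and two H added and one O removed, which matches the description "replace an OH by a CH3".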
Mode :
- FIXED : One expects that all the loci on the peptide should be modified. For example, you have chemically treated your sample.
- VARIABLE : Some of the loci on the peptide could be modified. For example, an artefactual reaction like oxidation.
Tolerance : Threshold that limits the number of combinations generated.
If the mode FIXED is selected:
- All loci will be modified in a first theoretical peptide.
- All loci except one will be modified in a second theoretical peptide.
- And so on, until the threshold is reached.
If the mode VARIABLE is selected:
- No locus will be modified in a first theoretical peptide.
- One locus will be modified in a second theoretical peptide.
- And so on, until the threshold is reached.
Scoring : For each modification, the peptide score will be multiplied by this factor for each unexpected locus:
- For VARIABLE modifications, the unexpected loci are the modified loci.
- For FIXED modifications, the unexpected loci are the unmodified loci.
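The interplay of Mode, Tolerance and Scoring can be sketched as follows. This assumes that Tolerance is the maximum number of unexpected loci (inclusive) and that the Scoring factor is applied once per unexpected locus; both are interpretations of the description above, and the function names are hypothetical.

```python
def modified_counts(n_loci, mode, tolerance):
    """Numbers of modified loci to generate, in the order described."""
    max_unexpected = min(tolerance, n_loci)
    if mode == "FIXED":
        # Start fully modified, then leave more and more loci unmodified.
        return [n_loci - u for u in range(max_unexpected + 1)]
    if mode == "VARIABLE":
        # Start unmodified, then modify more and more loci.
        return list(range(max_unexpected + 1))
    raise ValueError(f"unknown mode: {mode!r}")


def adjusted_score(score, factor, n_loci, n_modified, mode):
    """Multiply the score by factor once per unexpected locus."""
    if mode == "FIXED":
        unexpected = n_loci - n_modified  # unmodified loci are unexpected
    elif mode == "VARIABLE":
        unexpected = n_modified           # modified loci are unexpected
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return score * factor ** unexpected


print(modified_counts(3, "FIXED", 1))
print(modified_counts(3, "VARIABLE", 2))
print(adjusted_score(10.0, 0.5, 3, 2, "FIXED"))
```

With a factor below 1, each unexpected locus lowers the peptide score, so fully modified peptides are favoured for FIXED modifications and unmodified ones for VARIABLE modifications.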
Peak Lists
It is highly recommended to submit several peak lists in the same job in order to speed up the request and to keep spectra from the same origin together.
Select Several Files at a Time
Several files can be opened at a time. The file extension defines the format of each file. You can open files with several different extensions at a time.
Supported Formats
pkm
pkm format, produced by the Voyager software of PerSeptive Biosystems or the GRAMS software.
Example:
OP=0
Center X Peak Y Left X Right X Time X Mass Difference Name
STD.Misc Height Left Y Right Y %Height,Width,%Area,%Quan,H/A
833.319 2189 833.260 833.378 0.016 0 0
C 0.? 0 762 762
854.843 5078 854.769 854.917 0.001 0 0
C 0.? 0 3453 3453
863.419 5108 863.064 863.775 0.001 0 0
C 0.? 0 3567 3567
872.402 12519 872.347 872.456 0.002 0 0
C 0.? 0 11417 11417
874.395 6730 874.331 874.460 0.002 0 0
C 0.? 0 3559 3559
887.786 5903 887.540 888.031 0.003 0 0
C 0.? 0 4131 4131
898.475 3329 898.416 898.534 0.006 0 0
C 0.? 0 1377 1377
904.366 7432 904.199 904.533 0.001 0 0
C 0.? 0 5596 5596
955.300 2598 955.229 955.371 0.011 0 0
C 0.? 0 1089 1089
All lines up to and including the line ending with "H/A" are ignored. After that, only every other line is interpreted, starting with the first one: the first column is the mass value and the second column is the peak intensity.
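The reading rule for pkm files can be sketched in a few lines. The function name `parse_pkm` is hypothetical and the sample is a shortened copy of the example above; this is an illustration of the rule, not the program's own reader.

```python
def parse_pkm(text):
    """Return a list of (mass, intensity) tuples from pkm content."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Find the header line ending with "H/A".
    for idx, ln in enumerate(lines):
        if ln.rstrip().endswith("H/A"):
            break
    else:
        raise ValueError("no header line ending with 'H/A'")
    peaks = []
    # Keep every other line after the header, starting with the first.
    for ln in lines[idx + 1::2]:
        fields = ln.split()
        peaks.append((float(fields[0]), float(fields[1])))
    return peaks


SAMPLE = """OP=0
Center X Peak Y Left X Right X Time X Mass Difference Name
STD.Misc Height Left Y Right Y %Height,Width,%Area,%Quan,H/A
833.319 2189 833.260 833.378 0.016 0 0
C 0.? 0 762 762
854.843 5078 854.769 854.917 0.001 0 0
C 0.? 0 3453 3453
"""
print(parse_pkm(SAMPLE))
```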
dta
dta format, produced by Sequest.
Example:
this line is a comment line
899.546 1498
910.471 9718
920.45 2858
966.572 3228
1066.544 3342
1081.669 0
1130.681 0
1158.593 0
1166.623 0
1179.601 1204
1192.566 1660
1209.583 930
1213.544 1162
The first line is a comment line. For each subsequent line, the first column is interpreted as the mass value and the second column as the peak intensity. If the intensity is zero, the line is ignored.
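A minimal sketch of this reading rule, with the hypothetical function name `parse_dta`, could look like this:

```python
def parse_dta(text):
    """Return a list of (mass, intensity) tuples from dta content."""
    peaks = []
    # Skip the first line, which is treated as a comment.
    for ln in text.splitlines()[1:]:
        fields = ln.split()
        if len(fields) < 2:
            continue  # blank or incomplete line
        mass, intensity = float(fields[0]), float(fields[1])
        if intensity == 0:
            continue  # peaks with zero intensity are ignored
        peaks.append((mass, intensity))
    return peaks


SAMPLE = """this line is a comment line
899.546 1498
1081.669 0
1179.601 1204
"""
print(parse_dta(SAMPLE))
```

The same loop, without skipping the first line, would also read the plain mass and intensity txt format.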
pkt
pkt format, produced by the Data Explorer software of Applied Biosystems.
Example:
81 1480.70557 1480.53 1481.13 1 17676 100.00 78777.88 215.64 15651.54 0.00
64 1439.72058 1439.55 1440.14 1 15787 89.32 69269.72 190.02 14596.77 168439.80
80 1479.70349 1479.45 1480.16 1 15097 85.41 71367.88 183.53 14046.04 187889.00
6 927.40973 927.27 927.64 1 14592 82.55 60508.18 177.22 12293.43 111441.10
66 1440.72168 1440.53 1441.05 1 14270 80.73 63693.18 172.73 14845.78 0.00
119 1639.84241 1639.58 1640.30 1 13500 76.38 70965.65 163.87 14066.58 191023.41
120 1640.83862 1640.66 1641.33 1 13055 73.86 64899.15 158.87 14484.64 0.00
7 928.41229 928.27 928.66 1 9375 53.04 37568.11 112.70 12488.59 0.00
155 1881.81982 1881.43 1882.32 1 8170 46.22 38734.46 98.14 17109.73 0.00
153 1880.81995 1880.61 1881.11 1 8145 46.08 33871.89 95.28 15818.97 118650.67
83 1481.70532 1481.55 1481.92 1 8032 45.44 33069.30 97.74 15581.68 0.00
122 1641.84314 1641.69 1642.38 1 7889 44.63 34656.24 94.40 15879.55 0.00
68 1441.73120 1441.54 1442.17 1 5531 31.29 28062.78 65.77 13122.96 0.00
40 1305.63635 1305.43 1306.08 1 4977 28.16 22964.59 60.26 14204.21 39906.37
8 929.42041 929.21 929.79 1 4791 27.10 22765.49 58.35 12668.57 0.00
156 1882.83765 1882.62 1883.15 1 4535 25.66 26779.96 54.26 11810.26 0.00
199 2248.82861 2248.42 2249.23 1 4304 24.35 20417.10 52.13 16448.57 0.00
48 1419.60510 1419.35 1420.05 1 3773 21.35 20049.58 45.92 12723.64 38017.72
235 2613.02393 2612.76 2613.53 1 3476 19.66 21082.94 41.36 16568.03 0.00
The second column of each line is the mass value, the sixth column is the peak intensity.
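The column rule for pkt files can be sketched as follows. The function name `parse_pkt` is hypothetical; the meaning of the other columns is not described here, so they are simply skipped.

```python
def parse_pkt(text):
    """Return a list of (mass, intensity) tuples from pkt content."""
    peaks = []
    for ln in text.splitlines():
        fields = ln.split()
        if len(fields) < 6:
            continue  # blank or incomplete line
        # Second column is the mass, sixth column the intensity.
        peaks.append((float(fields[1]), float(fields[5])))
    return peaks


SAMPLE = """81 1480.70557 1480.53 1481.13 1 17676 100.00 78777.88 215.64 15651.54 0.00
64 1439.72058 1439.55 1440.14 1 15787 89.32 69269.72 190.02 14596.77 168439.80
"""
print(parse_pkt(SAMPLE))
```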
txt
Mass and intensity format.
Example:
899.546 1498
910.471 9718
920.45 2858
966.572 3228
1066.544 3342
1081.669 1144
1130.681 1232
1158.593 1424
1166.623 910
1179.601 1204
1192.566 1660
1209.583 930
1213.544 1162
1262.672 739
1277.699 1259
1314.772 810
The first column is the mass, the second the intensity.
mzData
This format can contain several spectra; Aldente allows you to select which of the spectra in the MS data you want to keep.
(See the official mzData webpage.)
Resubmit
Select one job in the job list, then right-click on it to open the job submenu, or click the Resubmit button in the toolbar.
Resubmit an Existing Job
Resubmit will open the Submission window with the same parameters and the specified spectra (all / identified / not identified).