Aldente : Submission

Submission Parameters
The Submission Window


This is the submission window. Each part of the submission process is explained in detail below. You can set the job title here rather than waiting to do it in the job list. You can also save the current parameters as the default parameters at any time.

Define the Search Space


Database(s) : Select one or several databases to use for the search.

Taxonomy : Click on "Edit" to change the taxonomy (read more).

More filters : You can further restrict the search by clicking on "more filters..." (read more).

Enzyme : Specify the enzyme you used to generate your peptides (see the list and cleavage rules here).

Modification(s) : Click on "Edit..." to specify the chemical modifications (read more).

Annotated PTM : Allows you to generate theoretical peptides using the Swiss-Prot information on PTMs.

Missed cleavage : Select the number of missed cleavages allowed.

Resolution : Specify the isotopic resolution of the experimental masses. The theoretical masses of the peptides will be calculated accordingly.

Ion mode : Specify the charge state of the peptides:

  1. Protonated molecular ions, [M+H]+.
  2. Deprotonated molecular ions, [M-H]-.
  3. Molecular mass data, [M].
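
As a rough illustration of the three ion modes, the Python sketch below converts a neutral monoisotopic peptide mass M into the value expected in each mode. The function name, the mode labels used as keys, and the restriction to singly charged ions are assumptions made for this example; it is not Aldente's own code.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton (constant assumed for this sketch)

def expected_mass(neutral_mass: float, ion_mode: str) -> float:
    """Return the mass expected for a singly charged ion in the given mode."""
    if ion_mode == "[M+H]+":
        # Protonated molecular ion: add one proton.
        return neutral_mass + PROTON_MASS
    if ion_mode == "[M-H]-":
        # Deprotonated molecular ion: remove one proton.
        return neutral_mass - PROTON_MASS
    if ion_mode == "[M]":
        # Molecular mass data: use the neutral mass unchanged.
        return neutral_mass
    raise ValueError(f"unknown ion mode: {ion_mode}")
```

For example, expected_mass(1000.0, "[M+H]+") adds one proton mass to the neutral mass.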

The spectrometer calibration is modeled as an affine function (a straight line). You must define thresholds for this calibration: the maximum shift and the maximum slope (see the principles here). Either Shift or Slope can be set to zero. If you use both, they are combined with a logical OR.

Shift max : Defines the maximum difference that you allow between a theoretical peptide and an experimental peak in Daltons (absolute value).

Slope max : Defines the maximum difference that you allow between a theoretical peptide and an experimental peak in ppm (relative value, parts per million).
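
One possible reading of how the two thresholds combine can be sketched in Python as follows. The function and parameter names are hypothetical, and the sketch simply assumes that a peak is accepted when it satisfies the absolute (Dalton) limit or the relative (ppm) limit; Aldente's actual calibration procedure may differ.

```python
def within_calibration(theoretical: float, experimental: float,
                       shift_max_da: float, slope_max_ppm: float) -> bool:
    """Accept a peak if it is within the shift limit OR within the slope limit."""
    delta_da = abs(experimental - theoretical)   # absolute difference in Daltons
    delta_ppm = delta_da / theoretical * 1e6     # relative difference in ppm
    return delta_da <= shift_max_da or delta_ppm <= slope_max_ppm
```

Under this reading, a peak at 2000.15 against a theoretical mass of 2000.0 fails a 0.1 Da shift limit but passes a 100 ppm slope limit (the difference is about 75 ppm), so it is accepted.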

Internal error : The program finds the best set of aligned hits among all possible hits (a hit is an experimental peak matching a theoretical peptide). You must define the alignment tolerance, which is the maximum error (in ppm) by which each hit may deviate from the alignment.

Number of hits : A hit is an experimental peak matching a theoretical peptide. A protein with fewer hits than this threshold is automatically discarded.

Statistics


To judge whether an identification is relevant, you can compare the results against random scores. Random sequences are generated, and the best random score is compared with the scores of the proteins found. The number of random sequences generated is the same as the number of proteins "in range" in the database. The significance of a result, based on the best random score, therefore depends on the size of the database searched: a given score can be significant when searching a very small database and insignificant when searching a larger one.

Output Display


Define the maximum number of proteins to be displayed.

Select Your Peak Lists


Click the "Add..." button to add your peak lists (see Peak Lists). You can then edit the name of each peak list and, if known, its pI and Mw. The default name is the name of the file the peak list comes from. If the file contains several peak lists, the location index is appended to the name.

Edit Taxonomy


You can select a predefined taxon or define a taxId combination using the NCBI taxonomy (see Newt). Each predefined taxon or taxId represents a subtree of the taxonomy: for example, 40674 represents all the species in the Mammalia hierarchy.

Protein / More Filters


In addition to the restriction on the databases and the taxonomy, you can restrict the search by defining the Mw and pI range, specific AC(s) or Swiss-Prot keywords.


Peptide / Enzyme, List and Cleavage Rules

Enzyme or Reagent | Cleaves Where? | Exceptions
Trypsin | C-terminal side of K or R | if P is C-term to K or R
Trypsin (C-term to K/R, even before P) | C-terminal side of K or R | none
Trypsin (higher specificity) | C-terminal side of K or R | if P is C-term to K or R; after K in CKY, DKD, CKH, CKD, KKR; after R in RRH, RRR, CRK, DRD, RRF, KRR
Lys C | C-terminal side of K | none
CNBr | C-terminal side of M | none
Arg C | C-terminal side of R | if P is C-term to R
Asp N | N-terminal side of D | none
Asp N + N-terminal Glu | N-terminal side of D or E | none
Glu C (bicarbonate) | C-terminal side of E | if P is C-term to E, or if E is C-term to E
Glu C (phosphate) | C-terminal side of D or E | if P is C-term to D or E, or if E is C-term to D or E
Chymotrypsin (C-term to F/Y/W/M/L, not before P, not after Y if P is C-term to Y) | C-terminal side of F, L, M, W, Y | if P is C-term to F, L, M, W, Y, if P is N-term to Y
Chymotrypsin (C-term to F/Y/W, not before P, not after Y if P is C-term to Y) | C-terminal side of F, Y, W | if P is C-term to F, Y, W, if P is N-term to Y
Trypsin/Chymotrypsin (C-term to K/R/F/Y/W, not before P, not after Y if P is C-term to Y) | C-terminal side of K, R, F, Y, W | if P is C-term to K, R, F, Y, W, if P is N-term to Y
Pepsin (pH 1.3) | C-terminal side of F, L | none
Pepsin (pH > 2) | C-terminal side of F, L, W, Y, A, E, Q | none
Proteinase K | C-terminal side of A, C, G, M, F, S, Y, W | none
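
To make the cleavage rules concrete, here is a minimal Python sketch of an in-silico digest using the basic Trypsin rule (cut on the C-terminal side of K or R, except when P follows) together with a missed-cleavage limit. The function names are hypothetical and the code only illustrates the rule as listed above; it does not reproduce Aldente's implementation or the other enzymes.

```python
def cleavage_sites(sequence: str) -> list[int]:
    """Positions where trypsin cuts: after K or R, unless the next residue is P."""
    sites = []
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    return sites

def digest(sequence: str, missed_cleavages: int = 0) -> list[str]:
    """Return all peptides containing at most `missed_cleavages` uncut sites."""
    bounds = [0] + cleavage_sites(sequence) + [len(sequence)]
    peptides = []
    for start in range(len(bounds) - 1):
        # Each extra boundary skipped corresponds to one missed cleavage.
        last = min(start + missed_cleavages + 2, len(bounds))
        for end in range(start + 1, last):
            peptides.append(sequence[bounds[start]:bounds[end]])
    return peptides
```

With the sequence "AKRPGKC", no cut is made between R and P, so a digest without missed cleavages yields AK, RPGK and C; allowing one missed cleavage adds AKRPGK and RPGKC.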


Peptide / Edit Modifications


Click the "Add..." button to add modifications from a predefined list or modify them by hand.


Name : The complete name of the modification.

Label : Define a label to identify the modification which will be displayed in the output result.

Locus : Define the locus where the modification should appear. Use the one letter amino acid code and the special character "$" for the C- or N-terminal positions of the peptide.

Formula : Define the chemical formula to be added at the defined locus. Follow each atom with its count; the default count is one, and a negative value means "remove". Example: CH3O-1H-1 would replace an OH with a CH3. Note: you can also specify charges using parentheses, for example: SO4(2-), H(+) or C4H8Cl.
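
The formula syntax can be illustrated with a small Python parser. This is only a sketch of the notation described above: the function names, the regular expression and the partial mass table are assumptions for the example, and charges are read but not used in the mass calculation.

```python
import re

# Partial table of monoisotopic masses in Da (values assumed for this sketch).
MONOISOTOPIC_MASS = {"H": 1.00782503, "C": 12.0, "N": 14.00307401, "O": 15.99491462}

# An element symbol with an optional signed count, or a charge such as (2-) or (+).
TOKEN = re.compile(r"([A-Z][a-z]?)(-?\d+)?|\((\d*)([+-])\)")

def parse_formula(formula: str) -> tuple[dict[str, int], int]:
    """Return (atom counts, net charge); negative counts mean 'remove'."""
    atoms: dict[str, int] = {}
    charge = 0
    pos = 0
    while pos < len(formula):
        match = TOKEN.match(formula, pos)
        if match is None:
            raise ValueError(f"cannot parse {formula!r} at position {pos}")
        element, count, magnitude, sign = match.groups()
        if element:
            # A missing count defaults to one.
            atoms[element] = atoms.get(element, 0) + (int(count) if count else 1)
        else:
            value = int(magnitude) if magnitude else 1
            charge += value if sign == "+" else -value
        pos = match.end()
    return atoms, charge

def mass_delta(formula: str) -> float:
    """Mass change in Da for elements present in the partial table above."""
    atoms, _charge = parse_formula(formula)
    return sum(MONOISOTOPIC_MASS[element] * count for element, count in atoms.items())
```

For CH3O-1H-1 the parser yields one C, two H and minus one O, which corresponds to a mass change of about -1.979 Da. Elements missing from the partial table raise a KeyError in mass_delta.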

Mode :

  • FIXED : One expects that all the loci on the peptide should be modified. For example, you have chemically treated your sample.
  • VARIABLE : Some of the loci on the peptide could be modified. For example, an artefactual reaction like oxidation.

Tolerance : Threshold that limits the number of combinations generated.

If the mode FIXED is selected:

  1. All loci will be modified in a first theoretical peptide.
  2. All loci except one will be modified in a second theoretical peptide.
  3. And so on, until the threshold is reached.

If the mode VARIABLE is selected:

  1. No locus will be modified in a first theoretical peptide.
  2. One locus will be modified in a second theoretical peptide.
  3. And so on, until the threshold is reached.
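
The two enumeration schemes can be sketched in Python as follows. The sketch assumes that the tolerance is the maximum number of "unexpected" loci (unmodified loci in FIXED mode, modified loci in VARIABLE mode); that interpretation, like the function name, is an assumption made for the example.

```python
from itertools import combinations

def modified_loci_sets(loci: list[int], mode: str, tolerance: int) -> list[tuple[int, ...]]:
    """List the sets of modified positions generated for one peptide."""
    if mode not in ("FIXED", "VARIABLE"):
        raise ValueError(f"unknown mode: {mode}")
    result = []
    for unexpected in range(min(tolerance, len(loci)) + 1):
        for chosen in combinations(loci, unexpected):
            if mode == "VARIABLE":
                # The chosen loci are the modified ones.
                result.append(chosen)
            else:
                # FIXED: the chosen loci are the ones left unmodified.
                result.append(tuple(p for p in loci if p not in chosen))
    return result
```

For a peptide with candidate loci at positions 2 and 5 and a tolerance of 1, VARIABLE mode gives the sets (), (2,) and (5,), while FIXED mode gives (2, 5), (5,) and (2,).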

Scoring : For each modification, the peptide score will be multiplied by this factor for each unexpected locus:

  1. For VARIABLE modifications, the unexpected loci are the modified loci.
  2. For FIXED modifications, the unexpected loci are the unmodified loci.
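
Read literally, the scoring rule multiplies the peptide score by each modification's factor once per unexpected locus. A minimal Python sketch of that arithmetic, with a hypothetical function name and input layout:

```python
def adjusted_score(peptide_score: float, modifications: list[tuple[float, int]]) -> float:
    """Apply each (scoring factor, number of unexpected loci) pair to the score."""
    score = peptide_score
    for factor, unexpected_loci in modifications:
        # One multiplication by the factor per unexpected locus.
        score *= factor ** unexpected_loci
    return score
```

For example, a peptide score of 10 with one modification that has a factor of 0.5 and two unexpected loci becomes 10 x 0.5 x 0.5 = 2.5.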

Peak Lists
It is highly recommended to submit several peak lists in the same job, both to speed up the request and to keep spectra from the same origin together.

Select Several Files at a Time


Several files can be opened at a time. The file extension defines the format of each file. You can open files with several different extensions at a time.

Supported Formats


pkm
pkm format, produced by the Voyager software of Perseptive Biosystems or the GRAMS software.

Example:


OP=0
Center X   Peak Y   Left X   Right X   Time X   Mass Difference  Name
STD.Misc   Height   Left Y   Right Y   %Height,Width,%Area,%Quan,H/A
833.319 2189  833.260  833.378  0.016  0  0
C 0.?  0  762  762
854.843 5078  854.769  854.917  0.001  0  0
C 0.?  0  3453  3453
863.419 5108  863.064  863.775  0.001  0  0
C 0.?  0  3567  3567
872.402 12519  872.347  872.456  0.002  0  0
C 0.?  0  11417  11417
874.395 6730  874.331  874.460  0.002  0  0
C 0.?  0  3559  3559
887.786 5903  887.540  888.031  0.003  0  0
C 0.?  0  4131  4131
898.475 3329  898.416  898.534  0.006  0  0
C 0.?  0  1377  1377
904.366 7432  904.199  904.533  0.001  0  0
C 0.?  0  5596  5596
955.300 2598  955.229  955.371  0.011  0  0
C 0.?  0  1089  1089

All lines before the line ending with "H/A" are ignored. After that, every other line is read: the first column is the mass value and the second column is the peak intensity.
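
The reading rule for pkm files can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the data lines follow the header without blank lines in between.

```python
def parse_pkm(text: str) -> list[tuple[float, float]]:
    """Return (mass, intensity) pairs from the contents of a pkm file."""
    lines = text.splitlines()
    header_index = None
    for i, line in enumerate(lines):
        if line.rstrip().endswith("H/A"):
            header_index = i
            break
    if header_index is None:
        raise ValueError("no header line ending with 'H/A' found")
    peaks = []
    # After the header, only every other line carries a peak.
    for line in lines[header_index + 1::2]:
        fields = line.split()
        if len(fields) >= 2:
            peaks.append((float(fields[0]), float(fields[1])))
    return peaks
```

Applied to the example above, the first peak returned would be (833.319, 2189.0); the lines starting with "C" are skipped.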


dta
dta format, produced by Sequest.

Example:

this line is a comment line
899.546 1498
910.471 9718
920.45 2858
966.572 3228
1066.544 3342
1081.669 0
1130.681 0
1158.593 0
1166.623 0
1179.601 1204
1192.566 1660
1209.583 930
1213.544 1162

The first line is a comment line. For each subsequent line, the first column is interpreted as the mass value and the second column as the peak intensity. If the intensity is zero, the line is ignored.
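
A minimal Python sketch of this rule is shown below. The function name is hypothetical, and the sketch assumes that the lines to ignore are those whose intensity equals zero, as in the example.

```python
def parse_dta(text: str) -> list[tuple[float, float]]:
    """Return (mass, intensity) pairs from the contents of a dta file."""
    peaks = []
    # The first line is a comment and is skipped.
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 2:
            continue
        mass, intensity = float(fields[0]), float(fields[1])
        if intensity == 0:
            continue  # peaks without intensity are ignored
        peaks.append((mass, intensity))
    return peaks
```

Applied to the example above, the four lines with an intensity of 0 (1081.669 to 1166.623) are dropped and nine peaks remain.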

pkt
pkt format, produced by the Data Explorer software of Applied Biosystems.

Example:

81   1480.70557  1480.53  1481.13   1  17676  100.00  78777.88  215.64  15651.54  0.00         
64   1439.72058  1439.55  1440.14   1  15787  89.32   69269.72  190.02  14596.77  168439.80    
80   1479.70349  1479.45  1480.16   1  15097  85.41   71367.88  183.53  14046.04  187889.00    
6    927.40973   927.27   927.64    1  14592  82.55   60508.18  177.22  12293.43  111441.10    
66   1440.72168  1440.53  1441.05   1  14270  80.73   63693.18  172.73  14845.78  0.00         
119  1639.84241  1639.58  1640.30   1  13500  76.38   70965.65  163.87  14066.58  191023.41    
120  1640.83862  1640.66  1641.33   1  13055  73.86   64899.15  158.87  14484.64  0.00         
7    928.41229   928.27   928.66    1  9375   53.04   37568.11  112.70  12488.59  0.00         
155  1881.81982  1881.43  1882.32   1  8170   46.22   38734.46  98.14   17109.73  0.00         
153  1880.81995  1880.61  1881.11   1  8145   46.08   33871.89  95.28   15818.97  118650.67    
83   1481.70532  1481.55  1481.92   1  8032   45.44   33069.30  97.74   15581.68  0.00         
122  1641.84314  1641.69  1642.38   1  7889   44.63   34656.24  94.40   15879.55  0.00         
68   1441.73120  1441.54  1442.17   1  5531   31.29   28062.78  65.77   13122.96  0.00         
40   1305.63635  1305.43  1306.08   1  4977   28.16   22964.59  60.26   14204.21  39906.37     
8    929.42041   929.21   929.79    1  4791   27.10   22765.49  58.35   12668.57  0.00         
156  1882.83765  1882.62  1883.15   1  4535   25.66   26779.96  54.26   11810.26  0.00         
199  2248.82861  2248.42  2249.23   1  4304   24.35   20417.10  52.13   16448.57  0.00         
48   1419.60510  1419.35  1420.05   1  3773   21.35   20049.58  45.92   12723.64  38017.72     
235  2613.02393  2612.76  2613.53   1  3476   19.66   21082.94  41.36   16568.03  0.00

The second column of each line is the mass value, the sixth column is the peak intensity.
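
In Python, the column selection for pkt files could look like the following sketch. The function name is hypothetical, and columns are assumed to be separated by whitespace as in the example.

```python
def parse_pkt(text: str) -> list[tuple[float, float]]:
    """Return (mass, intensity) pairs from the contents of a pkt file."""
    peaks = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 6:
            # Second column is the mass, sixth column is the intensity.
            peaks.append((float(fields[1]), float(fields[5])))
    return peaks
```

For the first line of the example above, this gives the pair (1480.70557, 17676.0).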

txt
Mass and intensity format.

Example:

899.546 1498
910.471 9718
920.45 2858
966.572 3228
1066.544 3342
1081.669 1144
1130.681 1232
1158.593 1424
1166.623 910
1179.601 1204
1192.566 1660
1209.583 930
1213.544 1162
1262.672 739
1277.699 1259
1314.772 810

The first column is the mass, the second the intensity.

mzData
This format can contain several spectra; Aldente lets you select which one you want to keep in the MS data.
(See the mzData official webpage.)

Resubmit
Select a job in the job list, then right-click it to open the job submenu, or click the Resubmit button in the toolbar.

Resubmit an Existing Job


Resubmit will open the Submission window with the same parameters and the specified spectra (all / identified / not identified).
