MassMatrix Search Form Help 2017-03-01T20:04:40+00:00

1. Data files:

This field specifies the MS/MS data files that you want to search.

1) Click “Browse…”

2) Select an MS/MS data file to search. The data file should be stored locally on your own computer.

Multiple data files may be selected and searched at once by repeating the above steps.

Currently, mzXML, mzML and MGF file formats are supported in MassMatrix. File formats can be mixed for a single search.

3. Protein database to use:
This field specifies the protein database that you want to use.
Only one database can be selected.
You can upload your own databases by going to “Massmatrix Settings”->“Protein databases”.

4. Decoy database:
This field specifies the type of decoy database that you want to use.
If you select “Reversed”, a database containing the reversed protein sequences of the target database chosen in the “Database” field will be appended to the target database during searching.
If you select “Randomized”, a database containing randomly reshuffled protein sequences of the target database chosen in the “Database” field will be appended to the target database during searching.
If you select “None”, no decoy database will be used.
Decoy databases can be used to estimate the false discovery rate.
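
The two decoy types and a decoy-based error estimate can be illustrated with a minimal Python sketch. The function names are hypothetical, and the estimate (decoy hits divided by target hits) is one common convention, assumed here for illustration; it is not necessarily the formula MassMatrix uses internally.

```python
import random


def reversed_decoy(sequence: str) -> str:
    """Return the reversed protein sequence ("Reversed" decoy)."""
    return sequence[::-1]


def randomized_decoy(sequence: str, seed: int = 0) -> str:
    """Return a reshuffled copy of the sequence ("Randomized" decoy).

    The seed only makes this illustration reproducible.
    """
    residues = list(sequence)
    random.Random(seed).shuffle(residues)
    return "".join(residues)


def estimate_fdr(n_target_hits: int, n_decoy_hits: int) -> float:
    """Estimate the false discovery rate as decoy hits / target hits (assumed convention)."""
    if n_target_hits == 0:
        return 0.0
    return n_decoy_hits / n_target_hits


target = "KETAAAKFER"
print(reversed_decoy(target))    # REFKAAATEK
print(randomized_decoy(target))  # same residues in a shuffled order
print(estimate_fdr(100, 5))      # 0.05
```

Both decoy types keep the amino acid composition of the target sequence, so decoy peptides have a mass distribution similar to that of target peptides.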

5. Enzyme to use:
This field specifies the enzymes used for digestion during sample preparation.
Multiple enzymes can be selected if you use a combination of enzymes during sample preparation.
In addition to the built-in enzymes, you can create your own enzymes by going to “Massmatrix Settings”->“Enzymes”.
Note: The special enzyme “Nonspecific/Non-restricted” specifies that non-restricted cleavage is used.

6. Missed Cleavages:
This field specifies the maximum number of missed cleavages allowed during proteolytic digestion.
Please specify 1 or 2 for your search to allow for incomplete digestion that may occur.
If you are confident that your digestion goes to completion, 0 can also be chosen to get optimal results.
A large number in this field will considerably increase the search space and search time and cause a high false discovery rate.
Therefore, a large number is not recommended unless you think it is necessary.
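
The effect of this setting on the search space can be seen in a minimal in-silico digestion sketch in Python. It assumes a simplified trypsin-like rule (cleavage after every K or R, with no exception for a following proline); the function name and parameters are hypothetical and do not reflect how MassMatrix is implemented.

```python
def digest(sequence: str, cleave_after: str = "KR", max_missed: int = 0) -> list:
    """Return all peptides with at most max_missed missed cleavages."""
    # Cut positions: start of the protein, after each cleavage residue, end of the protein.
    cuts = [0]
    cuts += [i + 1 for i, aa in enumerate(sequence[:-1]) if aa in cleave_after]
    cuts.append(len(sequence))

    peptides = []
    for i in range(len(cuts) - 1):
        # Spanning (j - i - 1) internal cut sites means that many missed cleavages.
        for j in range(i + 1, min(i + 2 + max_missed, len(cuts))):
            peptides.append(sequence[cuts[i]:cuts[j]])
    return peptides


protein = "KETAAAKFER"
for missed in (0, 1, 2):
    print(missed, digest(protein, max_missed=missed))
# 0 ['K', 'ETAAAK', 'FER']
# 1 ['K', 'KETAAAK', 'ETAAAK', 'ETAAAKFER', 'FER']
# 2 ['K', 'KETAAAK', 'KETAAAKFER', 'ETAAAK', 'ETAAAKFER', 'FER']
```

Each additional missed cleavage adds longer peptides to the candidate list, and every added peptide is then subject to all selected variable modifications.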

7. Variable modifications:
This field specifies the modifications that may or may not modify the occurrences of certain amino acid residues.
Variable modifications add complexity because there are a great number of permutations of variably modified peptides for each sequence.
This increases the search space. Therefore, please choose only the necessary modifications when searching a large protein database.
Note:
You can create your own modifications by going to “Massmatrix Settings”->“Modifications”.

8. Fixed modifications:
This field specifies the modifications that modify all occurrences of certain amino acid residues.
Fixed modifications do not add complexity to the search. However, peptides in which those amino acid residues are incompletely modified will not be searched. Therefore, please choose fixed modifications wisely and be sure that each modification modifies all occurrences of the specified amino acid residues.
Note:
You can create your own modifications by going to “Massmatrix Settings”->“Modifications”.

9. Mass spectrometer:
This field is only available in the basic search form. It specifies the mass spectrometer that you use.
All other advanced parameters will be set automatically according to your selected mass spectrometer and experiment type.

10. Experiment:
This field is only available in the basic search form. It specifies the type of your experiment: protein ID, protein characterization or disulfide search.
All other advanced parameters will be set automatically in accordance with your selected mass spectrometer and experiment type.

11. Precursor ion tolerance:
This field specifies the error tolerance for precursor peptide ion m/z values. The unit can be Da or ppm.
The error tolerance should be specified according to the mass spectrometer that you use.
Typical settings for some common mass spectrometers are as follows.
LTQ-Orbitrap: 5-20 ppm
LTQ-FT ICR: 5-20 ppm
LTQ: 1.5-3.0 Da
LCQ: 1.5-3.0 Da
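
The relationship between the two units can be sketched in Python as follows. The helper names are hypothetical; the conversion itself is the standard definition of parts per million relative to the theoretical m/z value.

```python
def ppm_to_da(mz: float, ppm: float) -> float:
    """Convert a relative tolerance in ppm to an absolute tolerance in Da at a given m/z."""
    return mz * ppm * 1e-6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tolerance: float, unit: str = "ppm") -> bool:
    """Check whether an observed m/z matches a theoretical m/z within the tolerance."""
    delta = abs(observed_mz - theoretical_mz)
    if unit == "ppm":
        return delta <= ppm_to_da(theoretical_mz, tolerance)
    if unit == "Da":
        return delta <= tolerance
    raise ValueError("unit must be 'ppm' or 'Da'")


# 10 ppm at m/z 1000 corresponds to about 0.01 Da.
print(ppm_to_da(1000.0, 10))
print(within_tolerance(1000.005, 1000.0, 10, "ppm"))  # True
print(within_tolerance(1000.020, 1000.0, 10, "ppm"))  # False
print(within_tolerance(1001.000, 1000.0, 1.5, "Da"))  # True
```

A ppm tolerance scales with m/z, whereas a Da tolerance is the same absolute window at every m/z.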

12. Product ion tolerance:
This field specifies the error tolerance for fragmented product ion m/z values. The unit is fixed to be Da.
The error tolerance should be specified according to the mass spectrometer that you use.
Typical settings for some common mass spectrometers are as follows.
LTQ-Orbitrap: 0.5-0.8 Da for normal mode, 0.01-0.02 Da for Orbitrap-Orbitrap mode
LTQ-FT ICR: 0.5-0.8 Da
LTQ: 0.5-0.8 Da
LCQ: 0.5-0.8 Da

13. Max # PTM per peptide:
This field specifies the maximum number of variable modifications allowed for each peptide sequence.
Because variable modifications can dramatically increase the search space, searches can become extremely slow and the number of false positives can be high. To limit the search space and get optimal results, only a limited number of variable modifications should be allowed for each peptide sequence. If you expect some peptides to carry a large number of variable modifications, choose a suitably large number, but be aware that the search may take a long time for a very large database with many variable modifications.
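
How quickly the number of candidate forms grows can be estimated with a short Python sketch. It assumes the simplest case: a single variable modification and a peptide with a given number of modifiable residues, each of which is either modified or not. The function name is hypothetical.

```python
from math import comb


def count_modified_forms(n_sites: int, max_ptm: int) -> int:
    """Count the forms of one peptide with 0..max_ptm of its n_sites modified."""
    return sum(comb(n_sites, k) for k in range(min(n_sites, max_ptm) + 1))


# A peptide with 10 modifiable residues:
for limit in (1, 2, 3, 10):
    print(limit, count_modified_forms(10, limit))
# 1 11
# 2 56
# 3 176
# 10 1024
```

With several different variable modifications selected, the counts multiply across modification types, which is why a small limit keeps searches tractable.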

14. Mass type:
This field specifies the type of mass for precursor and product ions used during searching.
The monoisotopic or average mass for an ion can be specified.
It is recommended that monoisotopic mass is used for all types of searches and all types of mass spectrometers. Choosing “average” for a high mass accuracy mass spectrometer will cause erroneous results.
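
The size of the difference between the two mass types can be illustrated with a small Python sketch that uses standard amino acid residue masses. The function name is hypothetical, and the code only computes neutral peptide masses; it is not a description of MassMatrix internals.

```python
# Standard residue masses in Da: (monoisotopic, average).
RESIDUE_MASS = {
    "G": (57.02146, 57.0519), "A": (71.03711, 71.0788), "S": (87.03203, 87.0782),
    "P": (97.05276, 97.1167), "V": (99.06841, 99.1326), "T": (101.04768, 101.1051),
    "C": (103.00919, 103.1388), "L": (113.08406, 113.1594), "I": (113.08406, 113.1594),
    "N": (114.04293, 114.1038), "D": (115.02694, 115.0886), "Q": (128.05858, 128.1307),
    "K": (128.09496, 128.1741), "E": (129.04259, 129.1155), "M": (131.04049, 131.1926),
    "H": (137.05891, 137.1411), "F": (147.06841, 147.1766), "R": (156.10111, 156.1875),
    "Y": (163.06333, 163.1760), "W": (186.07931, 186.2132),
}
WATER = (18.01056, 18.01528)  # (monoisotopic, average)


def peptide_mass(sequence: str, mass_type: str = "monoisotopic") -> float:
    """Return the neutral mass of an unmodified peptide."""
    if mass_type not in ("monoisotopic", "average"):
        raise ValueError("mass_type must be 'monoisotopic' or 'average'")
    idx = 0 if mass_type == "monoisotopic" else 1
    return sum(RESIDUE_MASS[aa][idx] for aa in sequence) + WATER[idx]


mono = peptide_mass("KETAAAKFER", "monoisotopic")
avg = peptide_mass("KETAAAKFER", "average")
print(f"monoisotopic: {mono:.4f} Da")
print(f"average:      {avg:.4f} Da")
print(f"difference:   {(avg - mono) / mono * 1e6:.0f} ppm")
```

For this 10-residue peptide the two mass types differ by roughly 0.7 Da, which is hundreds of ppm and far outside a 5-20 ppm precursor tolerance.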

15. Min/Max peptide length:
These two fields specify the length of peptides to be searched.
A minimum length < 6 may cause many false positive peptide matches with small length.
A minimum length > 8 may cause the loss of true peptide matches with length < 8.
Therefore, it is recommended that a number between 6 and 8 be used.
The maximum length of peptides should be limited when several variable modifications are selected in order to keep the search reasonably fast. This is because long peptides tend to have more permutations of modification sites than short peptides. However, a maximum length that is too small could cause the loss of long peptides. Typical settings of max peptide length for some common mass spectrometers are as follows.
LTQ-Orbitrap: 40-60
LTQ-FT ICR: 40-60
LTQ: 30-50
LCQ: 30-40

16. Min pp, pptag scores of peptides for output:
The quality of a peptide match is mainly evaluated by three statistical scores: pp, pp2, pptag.
These fields specify the thresholds for those three scores. The min pp score is the threshold for the pp and pp2 scores. The min pptag is the threshold for the pptag score. A threshold that is too low will leave many low-scoring, low-quality peptide matches in your final results. A threshold that is too high may cause the loss of peptide matches of good quality.
For normal protein identification, a setting of 4.0-6.0 can be used for min pp score and a setting of 1.0-2.3 can be used for min pptag.
Low settings for those two thresholds can be used when you want all possible peptide matches output in your results. For example, when you search for peptides and proteins with intact disulfide bonds or cross links against a limited protein database, thresholds as low as 0.1 for min pp score and 0.01 for min pptag may be specified to allow all possible peptide matches with cross links in your final results. This may be necessary when the pp, pp2, and pptag scores for big peptides with disulfide bonds or cross links are very low because the MS/MS spectra of those peptides have many product ions and many different peptides have similar MS/MS spectra.

17. Max # match/spec:
This field specifies the maximum number of candidate peptide matches for each spectrum output in the result.
Under some circumstances, a spectrum may have multiple candidate peptide matches with close statistical scores. MassMatrix will output up to “max # match/spec” number of those matches with top scores. A setting bigger than 1 will allow you to evaluate the other competing peptide matches besides the one with the best scores.

18. Max # comb/match:
This field specifies the maximum number of combinations of different modification sites output in the result for a peptide match with modifications.
In many cases, peptides with the same sequence and the same set of modifications but different specific modification sites will have very close statistical scores. MassMatrix will output up to “max # comb/match” of them. A setting greater than 1 is necessary in most cases when modification sites need to be determined.

19. Fragmentation method:
This field specifies the fragmentation method used during mass spectrometry to generate MS/MS spectra. CID, ETD, and ECD are supported.
Note:
Performance of MassMatrix on ECD data has not been tested.

20. C13 isotope ions:
This field specifies whether or not non-monoisotopic peptide ions are searched.
On high mass accuracy instruments, peptide ions containing C13 isotopes (non-monoisotopic ions) may be selected for fragmentation to produce MS/MS spectra. In that case, it is necessary to choose “yes” to get optimal results. A setting of “Auto” is always recommended; MassMatrix will then determine the best option for you.
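
Why this matters for precursor matching can be shown with a brief Python sketch. It assumes that a non-monoisotopic precursor is accounted for by subtracting multiples of the C13-C12 mass difference (about 1.003355 Da) divided by the charge; the function name and the max_isotopes parameter are hypothetical, and the actual handling inside MassMatrix may differ.

```python
C13_C12_DELTA = 1.003355  # mass difference between C13 and C12 in Da


def candidate_monoisotopic_mz(observed_mz: float, charge: int, max_isotopes: int = 1) -> list:
    """Return the m/z values the monoisotopic ion could have.

    Index k assumes the selected precursor contained k C13 atoms.
    """
    return [observed_mz - k * C13_C12_DELTA / charge for k in range(max_isotopes + 1)]


# A doubly charged precursor observed at m/z 500.0:
for k, mz in enumerate(candidate_monoisotopic_mz(500.0, charge=2, max_isotopes=1)):
    print(k, round(mz, 4))
```

For a doubly charged ion the shift is about 0.5 m/z units per C13 atom, which a ppm-level precursor tolerance would otherwise reject.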

21. Cross link:
This field specifies the intact cross links you want to search.
You can create your own cross links to search by going to “Massmatrix Settings”->“Cross Links”.
Note: 
In order to search peptides with disulfides or cross links, you also have to choose a proper search mode in the field of “Cross link mode”. By default, the search mode of disulfides or cross links is “disabled”, which means MassMatrix will not try to search any peptides with disulfides or cross links. Please refer to Cross link mode for more details.

22. Cross link mode:
This field specifies the search mode for peptides with disulfide or cross links.
“Disabled”:
No search of peptides with disulfide or cross links will be performed.
“Exploratory”: 
In the exploratory search mode, all possible cross link site residues in the protein sequences are considered to be variable cross link sites, i.e. any site residue may or may not form a cross link. During searching, MassMatrix will generate all possible cross links by assuming that any two site residues are capable of forming a cross link. For example, for a protein with n cysteine residues, an exploratory search for disulfide bonds will generate n(n-1)/2 possible single disulfide bonds.
“Confirmatory”: 
In the confirmatory search mode, only the cross links specified in the protein database will be considered and searched against experimental data. Cross links are specified in the sequences of a custom database that you upload. In the special .fasta protein databases or .bas MassMatrix databases, cross links are coded as “X($i)”, where X is the site residue (e.g. C for disulfide bonds) and i is the index number of the specified cross link. Each cross link has two related cross link site residues with the same label “($i)”.
For example, in the confirmatory search of disulfide bonds against a protein database containing the following sequence

>Ribonuclease A from bovine pancreas

KETAAAKFER      10

QHMDSSTSAA      20

SSSNYC($1)NQMM  30

KSRNLTKDRC($2)  40

KPVNTFVHES      50

LADVQAVC($3)SQ  60

KNVAC($4)KNGQT  70

NC($4)YQSYSTMS  80

ITDC($1)RETGSS  90

KYPNC($2)AYKTT  100

QANKHIIVAC($3)  110

EGNPYVPVHF      120

DASV            130

only the four native disulfide bonds in the protein, labeled “($1)”, “($2)”, “($3)”, and “($4)”, will be searched.

“Semi-exploratory”: 

In the semi-exploratory mode, an exploratory search will be performed. However, the search for cross links will be limited to those site residues with a label of “($)” or “($i)”, where i can be any number.

For example, in the semi-exploratory search of disulfide bonds against a protein database containing the following sequence

>Ribonuclease A from bovine pancreas

KETAAAKFER      10

QHMDSSTSAA      20

SSSNYC($1)NQMM  30

KSRNLTKDRC($2)  40

KPVNTFVHES      50

LADVQAVCSQ      60

KNVACKNGQT      70

NCYQSYSTMS      80

ITDC($1)RETGSS  90

KYPNC($2)AYKTT  100

QANKHIIVAC      110

EGNPYVPVHF      120

DASV            130

only disulfide bonds between the four Cys with a label of “($1)” or “($2)”, i.e. 4*(4-1)/2 = 6 disulfide bonds, will be considered.
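
The three search modes can be compared with a minimal Python sketch that parses the “($i)” labels from the two Ribonuclease A examples above and enumerates the candidate disulfide bonds. The parser and function names are hypothetical illustrations of the label convention and are not MassMatrix code.

```python
import re
from itertools import combinations

LABEL = re.compile(r"\(\$(\d*)\)")  # matches "($)" and "($1)", "($2)", ...


def parse_labeled_sequence(labeled: str):
    """Return (plain sequence, {1-based residue position: label})."""
    plain, labels, i = [], {}, 0
    while i < len(labeled):
        match = LABEL.match(labeled, i)
        if match:
            labels[len(plain)] = match.group(1)  # label belongs to the preceding residue
            i = match.end()
        else:
            plain.append(labeled[i])
            i += 1
    return "".join(plain), labels


def exploratory_pairs(plain: str, site: str = "C"):
    """All pairs of site residues, n(n-1)/2 in total."""
    positions = [i + 1 for i, aa in enumerate(plain) if aa == site]
    return list(combinations(positions, 2))


def confirmatory_pairs(labels: dict):
    """Only pairs of residues that share the same label."""
    by_label = {}
    for pos, label in sorted(labels.items()):
        by_label.setdefault(label, []).append(pos)
    return [tuple(p) for p in by_label.values() if len(p) == 2]


def semi_exploratory_pairs(labels: dict):
    """All pairs among labeled residues, regardless of label."""
    return list(combinations(sorted(labels), 2))


CONFIRMATORY = ("KETAAAKFERQHMDSSTSAASSSNYC($1)NQMMKSRNLTKDRC($2)KPVNTFVHES"
                "LADVQAVC($3)SQKNVAC($4)KNGQTNC($4)YQSYSTMSITDC($1)RETGSS"
                "KYPNC($2)AYKTTQANKHIIVAC($3)EGNPYVPVHFDASV")
SEMI = ("KETAAAKFERQHMDSSTSAASSSNYC($1)NQMMKSRNLTKDRC($2)KPVNTFVHES"
        "LADVQAVCSQKNVACKNGQTNCYQSYSTMSITDC($1)RETGSS"
        "KYPNC($2)AYKTTQANKHIIVACEGNPYVPVHFDASV")

plain, labels = parse_labeled_sequence(CONFIRMATORY)
print(len(exploratory_pairs(plain)))  # 28, i.e. 8*(8-1)/2
print(confirmatory_pairs(labels))     # [(26, 84), (40, 95), (58, 110), (65, 72)]

_, semi_labels = parse_labeled_sequence(SEMI)
print(len(semi_exploratory_pairs(semi_labels)))  # 6, i.e. 4*(4-1)/2
```

The counts show how the confirmatory and semi-exploratory modes shrink the search space relative to a fully exploratory search of the same protein.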

23. Cross link sites cleavability:
This field specifies whether or not the cross link sites are cleavable by the specified enzyme(s). The default setting is “Non applicable”, which means that the cross link sites are not among the cleavage sites of the specified enzyme(s). If the cross link sites are among the cleavage sites of the specified enzyme(s), you have to set this field. For example, suppose the cross link sites are lysine residues and the specified enzyme is trypsin. If you choose “Non-cleavable by enzyme”, lysine residues that are cross linked to another lysine will not be cleaved by the enzyme during searching. If you choose “Cleavable by enzyme”, lysine residues that are cross linked to another lysine will be cleaved by the enzyme like normal lysine residues during searching.

24. Max # cross links/peptide:
This field specifies the maximum number of cross links allowed for each peptide. Only 1 or 2 can be chosen. If 1 is chosen, peptides with up to 1 cross link will be searched. If 2 is chosen, peptides with up to 2 cross links will be searched.

25. How to search inter-protein cross-links:
In order to search inter-protein cross-links between different proteins, or inter-chain cross-links for a protein with multiple chains (such as insulin), all the protein sequences and the sequences of the different chains have to be included as one protein in the .FASTA or .BAS database. Each additional protein or chain has to begin on a new line that starts with “~”.
For example, in order to search “K-K” cross-links between two proteins, Cytochrome C and Lysozyme, a .FASTA protein database has to be constructed as follows:

>Cytochrome C and Lysozyme

MGDVEKGKKIFVQKCAQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGFTYTDANKNKGITWKEETLMEYLE

NPKKYIPGTKMIFAGIKKKTEREDLIAYLKKATNE

~MRSLLILVLCFLPLAALGKVFGRCELAAAMKRHGLDNYRGYSLGNWVCAAKFESNFNTQATNRNTDGSTD

YGILQINSRWWCNDGRTPGSRNLCNIPCSALLSSDITASVNCAKKIVSDGNGMNAWVAWRNRCKGTDVQA

WIRGCRL

In this way, MassMatrix will generate all peptides from both proteins, with and without cross-links, as well as those arising from inter-protein cross-links between Cytochrome C and Lysozyme.

Another example is Insulin containing two chains linked by disulfide bonds. In order to search inter-chain disulfide bonds for the protein, you have to construct a .FASTA or .BAS database as follows:

>Insulin

GIVEQC($1)C($2)ASVC($1)SLYQLENYC($3)N

~FVNQHLC($2)GSHLVEALYLVC($3)GERGFFYTPKA

A confirmatory disulfide search for insulin can also be performed, since all native disulfide bonds are specified in the above database.
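
The “~” convention can be illustrated with a short Python sketch that splits one database entry into its chains. It assumes that a line starting with “~” begins a new chain and that any other sequence line continues the current chain; the function name is hypothetical and this is not the parser MassMatrix uses.

```python
def split_chains(fasta_entry: str):
    """Return (header, list of chain sequences) for one multi-chain entry."""
    lines = [ln.strip() for ln in fasta_entry.strip().splitlines() if ln.strip()]
    header = lines[0].lstrip(">")
    chains = []
    for line in lines[1:]:
        if line.startswith("~") or not chains:
            chains.append(line.lstrip("~"))  # start a new chain
        else:
            chains[-1] += line               # continuation of the current chain
    return header, chains


insulin_entry = """
>Insulin

GIVEQC($1)C($2)ASVC($1)SLYQLENYC($3)N

~FVNQHLC($2)GSHLVEALYLVC($3)GERGFFYTPKA
"""

header, chains = split_chains(insulin_entry)
print(header)       # Insulin
print(len(chains))  # 2
for chain in chains:
    print(chain)
```

Because both chains live in one entry, the shared labels “($2)” and “($3)” can pair a cysteine in the first chain with a cysteine in the second.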

26. Comment:
This field allows you to give a title to your search so that you may recognize your search afterwards.

27. Expert Options:
This field is only used to enable unpublished functions in MassMatrix. Unpublished functions are either not validated or confidential, so you can always leave this field blank.

28. Search Profile:
A search profile allows you to store a set of search parameters and reuse them in future searches. You may save the parameters in the current search form as a profile or manage your profiles by going to “Massmatrix Settings”->“Search Profiles”. The default profile to be loaded in a search form is always the search parameters used in your most recent search.

For more information or support, please fill out the Support Question Form.