Help
Spectra Prediction
The Spectra Prediction utility predicts the spectra for a given input molecule.
Input:
InChI/SMILES: The molecule must be represented in either InChI format or SMILES format. InChI strings must start with "InChI=" and should carry no charge; an additional H+ will be added. InChI strings must contain at least the main layer, with its chemical formula and atom connections sublayers, for proper computation.
Examples: | CN1CCC[C@H]1c2cccnc2 |
InChI=1S/C10H14N2/c1-12-7-3-5-10(12)9-4-2-6-11-8-9/h2,4,6,8,10H,3,5,7H2,1H3 |
Spectra Type: The type of spectra, either ESI (Electrospray Ionization) or EI (Electron Ionization/Impact).
Ion Mode: Indicates whether the precursor ion has a positive or negative adduct.
Adduct: Indicates the specific adduct used.
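As a rough illustration of the InChI requirements above, the following sketch checks a string before submission. The function name is hypothetical and the check is only an approximation of the stated rules (prefix, formula sublayer, connections sublayer, no charge layers); it is not the validation the utility itself performs.

```python
def looks_like_usable_inchi(text: str) -> bool:
    """Approximate pre-submission check for an InChI string (hypothetical helper)."""
    text = text.strip()
    if not text.startswith("InChI="):
        return False
    layers = text.split("/")
    # layers[0] is the version prefix (e.g. "InChI=1S"); layers[1] is the chemical formula.
    if len(layers) < 3 or not layers[1]:
        return False
    rest = layers[2:]
    has_connections = any(layer.startswith("c") for layer in rest)
    # Charge (/q) and proton (/p) layers indicate a charged species, which is not expected.
    has_charge = any(layer.startswith(("q", "p")) for layer in rest)
    return has_connections and not has_charge


nicotine = "InChI=1S/C10H14N2/c1-12-7-3-5-10(12)9-4-2-6-11-8-9/h2,4,6,8,10H,3,5,7H2,1H3"
print(looks_like_usable_inchi(nicotine))
```

Strings that fail this check may still be valid SMILES input, so a caller would typically fall back to treating them as SMILES rather than rejecting them.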
Output:
Spectra are computed for low (10V), medium (20V) and high (40V) collision energy levels and are represented by a list of 'mass intensity' pairs, each corresponding to a peak in the spectra.
energy0 | low energy level | |
---|---|---|
132.0813243 | 2.334652703 | |
134.0969744 | 3.34923502 | mass intensity |
136.1126244 | 4.048843483 | |
146.0969744 | 2.149324997 | |
147.0922234 | 1.635296179 | |
163.1235235 | 86.48264762 | |
energy1 | medium energy level | |
84.08132432 | 6.006301708 | |
132.0813243 | 8.196052008 | |
134.0969744 | 9.194699278 | |
136.1126244 | 10.23621718 | |
146.0969744 | 6.107338911 | |
147.0922234 | 4.933320787 | |
163.1235235 | 55.32607013 | |
energy2 | high energy level | |
30.03437413 | 2.742873396 | |
41.03912516 | 6.65846756 | |
44.05002419 | 4.235872856 | |
46.06567426 | 1.836649163 | |
51.0234751 | 5.705668897 | |
53.03912516 | 3.026849879 | |
55.05477522 | 6.651243286 | |
56.05002419 | 1.699714643 | |
57.07042529 | 3.407180068 | |
65.03912516 | 2.771825374 | |
68.05002419 | 1.700204766 | |
80.05002419 | 6.265672776 | |
82.06567426 | 3.807914718 | |
84.08132432 | 8.114929747 | |
92.05002419 | 4.008757209 | |
94.06567426 | 6.926908744 | |
96.08132432 | 1.504021177 | |
104.0500242 | 2.293258876 | |
105.0704253 | 1.810507476 | |
106.0656743 | 2.507700064 | |
108.0813243 | 2.524395945 | |
118.0656743 | 2.464943213 | |
120.0813243 | 3.650643451 | |
121.0765733 | 3.021784682 | |
132.0813243 | 2.057084015 | |
134.0969744 | 1.892947239 | |
136.1126244 | 2.008486652 | |
137.1078734 | 1.694987566 | |
147.0922234 | 3.008506562 |
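A minimal sketch of reading such a peak list into Python, assuming the output is available as plain text with an energy header line followed by whitespace-separated 'mass intensity' pairs. The function name, and the choice to file peaks under energy0 when no header is present, are assumptions made for illustration.

```python
ENERGY_ALIASES = {"low": "energy0", "medium": "energy1", "high": "energy2"}
ENERGY_KEYS = ("energy0", "energy1", "energy2")


def parse_peak_list(text):
    """Return {energy_level: [(mass, intensity), ...]} from a peak-list string."""
    spectra = {}
    current = "energy0"  # assumption: peaks with no header belong to the first level
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key = ENERGY_ALIASES.get(line.lower(), line.lower())
        if key in ENERGY_KEYS:
            current = key
            spectra.setdefault(current, [])
            continue
        mass, intensity = line.split()[:2]
        spectra.setdefault(current, []).append((float(mass), float(intensity)))
    return spectra


example = """energy0
132.0813243 2.334652703
163.1235235 86.48264762
energy1
84.08132432 6.006301708
163.1235235 55.32607013
"""
spectra = parse_peak_list(example)
# The most intense peak at the low energy level.
base_peak = max(spectra["energy0"], key=lambda peak: peak[1])
print(base_peak)
```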
Peak Assignment
The Peak Assignment utility annotates the peaks in a provided set of spectra given a known molecule. The complete list of feasible fragments is computed, then the most likely fragments for each spectrum peak are determined using a pre-trained model.
Input:
InChI/SMILES: The molecule must be represented in either InChI format or SMILES format. InChI strings must start with "InChI=" and should carry no charge; an additional H+ will be added. InChI strings must contain at least the main layer, with its chemical formula and atom connections sublayers, for proper computation.
Examples: | Oc1ccc(CC(NC(=O)C(N)CO)C(=O)NC(CC(O)=O)C(O)=O)cc1 |
InChI=1S/C16H21N3O8/c17-10(7-20)14(24)18-11(5-8-1-3-9(21)4-2-8)15(25)19-12(16(26)27)6-13(22)23/h1-4,10-12,20-21H,5-7,17H2,(H,18,24)(H,19,25)(H,22,23)(H,26,27) |
Spectra: The spectra should be represented as a list of peaks, with one 'mass intensity' pair on each line. For ESI spectra, each energy level should begin with a header line: 'low', 'medium', and 'high', or 'energy0', 'energy1', and 'energy2' (in that order). Only one energy level is required; additional levels are optional. EI spectra need only one energy level. Spectra may also be supplied in .msp file format, in which case the energy level of each ESI spectrum should be specified in the "Comment: " field (EI spectra do not need a specified energy level). For .msp input, the ID of the spectrum to use must be selected, and every spectrum in the file must have "ID" and "Num peaks" attributes.
Example peak list format: | low | |
87.054687 | 7.567280 | |
105.069174 | 1.791050 | |
136.07616 | 13.081500 | |
160.076289 | 2.225420 | |
178.084616 | 5.319120 | |
223.106608 | 100.000000 | |
251.10173 | 40.722900 | |
297.107567 | 3.945980 | |
384.140384 | 11.216900 | |
medium | ||
60.044545 | 2.476820 | |
87.056965 | 9.632580 | |
119.046086 | 2.367850 | |
135.066335 | 1.865000 | |
136.077192 | 46.373600 | |
160.074417 | 6.652730 | |
178.08705 | 20.078100 | |
223.109344 | 100.000000 | |
251.108668 | 3.127750 | |
297.113687 | 1.892360 | |
high | ||
42.033909 | 3.047230 | |
60.043746 | 26.520300 | |
70.027268 | 3.162400 | |
87.056272 | 18.342000 | |
91.054494 | 23.516200 | |
119.04828 | 15.711000 | |
121.063402 | 7.273900 | |
133.06551 | 5.039960 | |
135.066238 | 3.626030 | |
136.074907 | 100.000000 | |
160.074409 | 26.458000 | |
178.085454 | 12.211700 |
Example .msp format: | |
Name: Diazirine | |
NISTNO: 305841 | |
ID: ID_3 | |
Num peaks: 12 | |
Comment: energy0 | |
12 | 108.00 |
13 | 228.99 |
14 | 999.00 |
15 | 21.98 |
26 | 17.98 |
27 | 58.05 |
28 | 178.04 |
29 | 22.98 |
40 | 17.98 |
41 | 108.00 |
42 | 431.01 |
43 | 7.99 |
Name: Methane, diazo- | |
NISTNO: 57 | |
ID: ID_4 | |
Num peaks: 12 | |
Comment: energy1 | |
12 | 110.10 |
13 | 220.30 |
14 | 999.00 |
15 | 25.18 |
26 | 12.59 |
27 | 58.25 |
28 | 179.34 |
29 | 20.48 |
40 | 21.98 |
41 | 110.10 |
42 | 424.82 |
43 | 10.99 |
Spectra Type: The type of spectra, either ESI (Electrospray Ionization) or EI (Electron Ionization/Impact).
Ion Mode: Indicates whether the precursor ion has a positive or negative adduct.
Mass Tolerance: The mass tolerance to use when matching peaks within the dot product comparison. The default value is 10.0 ppm.
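Before uploading an .msp file, it can help to confirm that every record carries the required "ID" and "Num peaks" attributes. The sketch below assumes the simple layout shown in the example above: 'Key: value' attribute lines followed by one whitespace-separated 'mass intensity' pair per line. The function names are hypothetical, and .msp variants that pack several peaks on one line are not handled.

```python
def parse_msp(text):
    """Split .msp text into records of the form {"attributes": {...}, "peaks": [...]}."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None  # a blank line ends the current record
            continue
        is_attribute = ":" in line and not line[0].isdigit()
        # An attribute line that follows peak lines starts a new record.
        if current is None or (is_attribute and current["peaks"]):
            current = {"attributes": {}, "peaks": []}
            records.append(current)
        if is_attribute:
            key, value = line.split(":", 1)
            current["attributes"][key.strip()] = value.strip()
        else:
            mass, intensity = line.split()[:2]
            current["peaks"].append((float(mass), float(intensity)))
    return records


def check_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    attrs = record["attributes"]
    for required in ("ID", "Num peaks"):
        if required not in attrs:
            problems.append("missing " + required)
    if "Num peaks" in attrs and int(attrs["Num peaks"]) != len(record["peaks"]):
        problems.append("Num peaks does not match the number of peak lines")
    return problems


msp_text = """Name: Diazirine
ID: ID_3
Num peaks: 3
Comment: energy0
12 108.00
13 228.99
14 999.00
Name: Methane, diazo-
ID: ID_4
Num peaks: 3
Comment: energy1
12 110.10
13 220.30
14 999.00
"""
records = parse_msp(msp_text)
for record in records:
    print(record["attributes"]["ID"], check_record(record))
```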
Output:
The results contain the original spectra, with each peak followed by the IDs of any fragments with a matching mass, listed in order from most likely to least likely. A list of fragments with their masses and SMILES is also provided, along with a list of transitions between pairs of fragments and their corresponding neutral losses. Fragment numbers are shown in red.
energy0 | low energy level | ||
---|---|---|---|
87.054687 | 4.071272337 | 16 15 | |
105.069174 | 0.9636028163 | 4 | |
136.07616 | 7.037977857 | 17 | |
160.076289 | 1.197298221 | 14 | |
178.084616 | 2.861739768 | ||
223.106608 | 53.80100032 | 7 21 24 | mass intensity corresponding_fragment(s) |
251.10173 | 21.90932756 | 8 19 18 20 | |
297.107567 | 2.122976713 | 5 | |
384.140384 | 6.034804405 | 0 | |
energy1 | medium energy level | ||
60.044545 | 1.273646775 | 22 | |
87.056965 | 4.953329049 | 16 15 | |
119.046086 | 1.217611501 | 25 | |
135.066335 | 0.9590326451 | ||
136.077192 | 23.84653956 | 17 | |
160.074417 | 3.421010857 | 14 | |
178.08705 | 10.32469349 | ||
223.109344 | 51.42266194 | 7 21 24 | |
251.108668 | 1.608372309 | 8 19 18 20 | |
297.113687 | 0.9731018854 | 5 | |
energy2 | high energy level | ||
42.033909 | 1.244230912 | 26 | |
60.043746 | 10.82864669 | 22 | |
70.027268 | 1.291256596 | ||
87.056272 | 7.489320919 | 16 15 | |
91.054494 | 9.60202642 | ||
119.04828 | 6.415043123 | 25 | |
121.063402 | 2.97004533 | 27 | |
133.06551 | 2.057893243 | ||
135.066238 | 1.480563861 | ||
136.074907 | 40.8315392 | 17 | |
160.074409 | 10.80320864 | 14 | |
178.085454 | 4.986225072 |
0 | 384.1406897 | NC(CO)C(=O)NC(CC1=CC=C(O)C=C1)C(=O)NC(CC(=O)O)C(=O) | |
1 | 278.0988249 | N=C(C=O)C(=O)[NH+]=CC(O)NC(CC(O)O)C(O)O | fragment_number fragment_mass fragment_SMILES |
2 | 276.0831748 | N=C(C=O)C(=O)[NH+]=C=C(O)NC(CC(O)O)C(O)O | |
3 | 274.0675247 | N=C(C=O)C(=O)[NH+]=C=C(O)N=C(CC(O)O)C(O)O | |
4 | 105.0664025 | NC(CO)C(=[NH2+])O | |
5 | 297.1086613 | [NH3+]C(=C=C1C=CC(=O)CC1)C(=O)N=C(CC(O)O)C(O)O | |
6 | 367.1141406 | O=C(NC(CC(O)O)C(O)O)C(=C=C1C=CC(=O)CC1)[NH+]=C(O)C#CO | |
7 | 223.1082673 | NC(CO)C(O)[NH+]=C=C=C1C=CC(=O)CC1 | |
8 | 251.103182 | N=C(CO)C(O)=[NH+]C(=C=C1C=CC(=O)CC1)CO | |
9 | 253.118832 | NC(CO)C(O)=[NH+]C(=C=C1C=CC(=O)CC1)CO | |
10 | 266.114081 | N=C(CO)C(O)=[NH+]C(=C=C1C=CC(=O)CC1)C(N)O | |
11 | 268.1297311 | NC(O)C(=C=C1C=CC(=O)CC1)[NH+]=C(O)C(N)CO | |
12 | 270.1453811 | NC(O)C(=C=C1C=CC(=O)CC1)[NH2+]C(O)C(N)CO | |
13 | 354.130125 | C#CC(=C=C([NH+]=C(O)C(=N)C=O)C(=O)NC(CC(O)O)C(O)O)CC | |
14 | 160.0722162 | N=C(CO)C(=O)[NH+]=CC(N)O | |
15 | 87.05583784 | [NH+]#CC(N)CO | |
16 | 87.05583784 | CC(=N)C(=[NH2+])O | |
17 | 136.0762389 | [NH3+]C=C=C1C=CC(=O)CC1 | |
18 | 251.103182 | CC(=NC(=O)C([NH3+])=C=C1C=CC(=O)CC1)C(O)O | |
19 | 251.103182 | [NH3+]C(=C=C1C=CC(=O)CC1)C(=O)N=CCC(O)O | |
20 | 251.103182 | NC(O)C(=C=C1C=CC(=O)CC1)[NH+]=C(O)C=CO | |
21 | 223.1082673 | C#CC(=C=C(CO)[NH+]=C(O)C(=N)CO)CC | |
22 | 60.04493881 | [NH2+]=CCO | |
23 | 354.130125 | N#CC(O)=[NH+]C(=C=C1C=CC(=O)CC1)C(O)NC(CC(O)O)C(O)O | |
24 | 223.1082673 | NCC(O)=[NH+]C(=C=C1C=CC(=O)CC1)CO | |
25 | 119.0496898 | C=C=C1C=CC(=[OH+])C=C1 | |
26 | 42.03437413 | CC#[NH+] | |
27 | 121.0653399 | C=C=C1C=CC(=[OH+])CC1 |
0 | 2 | C=C1C=CC(=O)CC1 | |
0 | 3 | CC1C=CC(=O)CC1 | fragment_number fragment_number transition_between_fragments |
0 | 4 | O=C1C=CC(=C=C=C(O)N=C(C=C(O)O)C(O)O)CC1 | |
0 | 5 | NC(=C=O)CO | |
0 | 6 | N | |
0 | 7 | O=C=NC(=CC(O)O)C(O)O | |
0 | 8 | N=C(C=C(O)O)C(O)O | |
0 | 9 | N=C(C=C(O)O)C(=O)O | |
0 | 10 | OC(O)C#CC(O)O | |
0 | 11 | O=C(O)C#CC(O)O | |
0 | 12 | O=C(O)C#CC(=O)O | |
0 | 13 | C=O | |
1 | 14 | OC(O)C#CC(O)O | |
2 | 14 | O=C(O)C#CC(O)O | |
3 | 14 | O=C(O)C#CC(=O)O | |
4 | 15 | O | |
4 | 16 | O | |
5 | 17 | O=C=NC(=CC(O)O)C(O)O | |
5 | 18 | O=CO | |
5 | 19 | O=CO | |
6 | 5 | O=C=C=CO | |
6 | 20 | O=C(O)C#CC(O)O | |
7 | 4 | C=C=C1C=CC(=O)C=C1 | |
7 | 17 | NC(=C=O)CO | |
8 | 4 | O=C=C=C=C1C=CC(=O)CC1 | |
9 | 7 | C=O | |
10 | 14 | C=C1C=CC(=O)C=C1 | |
10 | 7 | N=C=O | |
11 | 14 | C=C1C=CC(=O)CC1 | |
11 | 4 | NC(O)=C=C=C1C=CC(=O)CC1 | |
11 | 20 | N | |
11 | 7 | NC=O | |
11 | 8 | N | |
12 | 14 | CC1C=CC(=O)CC1 | |
13 | 21 | N=C(C=C(O)O)C(=O)O | |
0 | 22 | O=C=NC(=C=C1C=CC(=O)CC1)C(=O)NC(CC(O)O)C(O)O | |
0 | 23 | C=O | |
2 | 22 | O=CN=C=C(O)N=C(C=C(O)O)C(O)O | |
4 | 22 | N=CO | |
23 | 5 | NC=C=O | |
23 | 24 | N=C(C=C(O)O)C(=O)O | |
7 | 25 | N=C(O)C(N)CO | |
7 | 22 | O=C1C=CC(=C=C=NCO)CC1 | |
8 | 22 | O=C=NC(=C=C1C=CC(=O)CC1)CO | |
9 | 21 | C=O | |
11 | 22 | NC(O)C(=C=C1C=CC(=O)CC1)N=CO | |
13 | 22 | C#CC(=C=C(N=C=O)C(=O)N=C(CC(O)O)C(O)O)CC | |
1 | 22 | O=CN=C=C(O)N=C(CC(O)O)C(O)O | |
3 | 22 | O=CN=C=C(O)N=C(C=C(O)O)C(=O)O | |
22 | 26 | O | |
7 | 27 | N=C(O)C(=N)CO | |
9 | 22 | O=C1C=CC(=C=C(CO)N=CO)CC1 | |
10 | 22 | NC(O)C(=C=C1C=CC(=O)CC1)N=C=O | |
12 | 22 | NC(O)C(=C=C1C=CC(=O)CC1)NCO |
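To show how the three parts of the output relate, the sketch below looks up the most likely fragment for one annotated peak and reports its mass error. It assumes the result lines are whitespace-separated plain text ('mass intensity id ...' for peaks and 'id mass SMILES' for fragments); the function names are hypothetical.

```python
def parse_annotated_peak(line):
    """Parse 'mass intensity id id ...' into (mass, intensity, [fragment ids])."""
    fields = line.split()
    return float(fields[0]), float(fields[1]), [int(f) for f in fields[2:]]


def parse_fragment(line):
    """Parse 'id mass SMILES' into (fragment id, mass, SMILES)."""
    frag_id, mass, smiles = line.split(maxsplit=2)
    return int(frag_id), float(mass), smiles.strip()


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# Two lines taken from the example output above.
peak_mass, peak_intensity, fragment_ids = parse_annotated_peak("223.106608 53.80100032 7 21 24")
fragments = dict(
    (frag_id, (mass, smiles))
    for frag_id, mass, smiles in [parse_fragment("7 223.1082673 NC(CO)C(O)[NH+]=C=C=C1C=CC(=O)CC1")]
)

# Fragment ids are listed from most likely to least likely, so the first is the top assignment.
top_id = fragment_ids[0]
top_mass, top_smiles = fragments[top_id]
error = ppm_error(peak_mass, top_mass)
print(top_id, top_smiles, round(error, 2))
```

For this peak the error is within the default 10.0 ppm mass tolerance, which is consistent with fragment 7 being listed against it.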
Compound Identification
The Compound Identification utility determines the compounds that most closely match a given set of spectra. The spectra for each candidate compound are predicted using a pre-trained model and compared to the input spectra. The candidate compounds may be provided as a list by the user, or can be extracted from a database.
Input:
Spectra: The spectra should be represented as a list of peaks, with one 'mass intensity' pair on each line. For ESI spectra, each energy level should begin with a header line: 'low', 'medium', and 'high', or 'energy0', 'energy1', and 'energy2' (in that order). Only one energy level is required; additional levels are optional. EI spectra need only one energy level. Spectra may also be supplied in .msp file format, in which case the energy level of each ESI spectrum should be specified in the "Comment: " field (EI spectra do not need a specified energy level). For .msp input, the ID of the spectrum to use must be selected, and every spectrum in the file must have "ID" and "Num peaks" attributes.
Example: | low | |
87.054687 | 7.567280 | |
105.069174 | 1.791050 | |
136.07616 | 13.081500 | |
160.076289 | 2.225420 | |
178.084616 | 5.319120 | |
223.106608 | 100.000000 | |
251.10173 | 40.722900 | |
297.107567 | 3.945980 | |
384.140384 | 11.216900 | |
medium | ||
60.044545 | 2.476820 | |
87.056965 | 9.632580 | |
119.046086 | 2.367850 | |
135.066335 | 1.865000 | |
136.077192 | 46.373600 | |
160.074417 | 6.652730 | |
178.08705 | 20.078100 | |
223.109344 | 100.000000 | |
251.108668 | 3.127750 | |
297.113687 | 1.892360 | |
high | ||
42.033909 | 3.047230 | |
60.043746 | 26.520300 | |
70.027268 | 3.162400 | |
87.056272 | 18.342000 | |
91.054494 | 23.516200 | |
119.04828 | 15.711000 | |
121.063402 | 7.273900 | |
133.06551 | 5.039960 | |
135.066238 | 3.626030 | |
136.074907 | 100.000000 | |
160.074409 | 26.458000 | |
178.085454 | 12.211700 |
Example .msp format: | |
Name: Diazirine | |
NISTNO: 305841 | |
ID: ID_3 | |
Num peaks: 12 | |
Comment: energy0 | |
12 | 108.00 |
13 | 228.99 |
14 | 999.00 |
15 | 21.98 |
26 | 17.98 |
27 | 58.05 |
28 | 178.04 |
29 | 22.98 |
40 | 17.98 |
41 | 108.00 |
42 | 431.01 |
43 | 7.99 |
Name: Methane, diazo- | |
NISTNO: 57 | |
ID: ID_4 | |
Num peaks: 12 | |
Comment: energy1 | |
12 | 110.10 |
13 | 220.30 |
14 | 999.00 |
15 | 25.18 |
26 | 12.59 |
27 | 58.25 |
28 | 179.34 |
29 | 20.48 |
40 | 21.98 |
41 | 110.10 |
42 | 424.82 |
43 | 10.99 |
Search Candidates: The candidates should be represented as a list of compounds in the format 'ID SMILES_or_InChI' on each line. The list can have a maximum of 100 compounds. The compounds must be represented in proper InChI format or SMILES format. InChI strings must start with "InChI=" and should carry no charge; an additional H+ will be added. InChI strings must contain at least the main layer, with its chemical formula and atom connections sublayers, for proper computation.
Example: | 7156455 | CC(C)N1C(=O)C2C(CCN2S(C)(=O)=O)N(Cc2cccc(F)c2)C1=O | |
485776 | CSCC(=O)NCC1CN(c2ccc(N3CCOCC3)c(F)c2)C(=O)O1 | ||
485687 | CC(=O)NNCC1CN(c2ccc(C3CCS(=O)CC3)c(F)c2)C(=O)O1 | ||
45556239 | O=C(NC1CC1)N1CCC2(CC1)OCCN2S(=O)(=O)c1ccc(F)cc1 | ||
19459759 | Cc1cc(C(F)(F)Cl)n2nc(C(=O)NC3CC4CCC(C3)N4C)cc2n1 | ||
59444507 | Cc1cc(CN(CC(=O)O)CC(=O)O)nc(CN(CC(=O)O)CC(=O)O)c1 | ||
58984199 | C=CC(=O)OCCn1c(=O)n(CCOC)c(=O)n(CCOC(=O)C=C)c1=O | ||
58753253 | NC(CO)C(=O)NC(CC(=O)O)C(=O)NC(Cc1ccc(O)cc1)C(=O)O | ||
54199399 | NC(CN(CC(=O)O)CC(=O)O)(c1ccccc1)N(CC(=O)O)CC(=O)O | ||
45644415 | CNC(=O)NC(=O)COC(=O)C1C(C(=O)OC)=C(C)NC(C)=C1C(=O)OC | ||
44585322 | COc1cc(C(=O)NCC(=O)NCC(=O)NCC(=O)O)cc(OC)c1OC | ||
36010709 | COc1cc(C(=O)NCC(=O)OC(C)C(=O)NC(N)=O)cc(OC)c1OC | ||
21494927 | Nc1ccccc1C(C(=O)O)N(CCN(CC(=O)O)CC(=O)O)CC(=O)O | ||
21273011 | NC(C(=O)O)(c1ccccc1)N(CCN(CC(=O)O)CC(=O)O)CC(=O)O | ||
20147059 | Nc1ccc(C(C(=O)O)N(CCN(CC(=O)O)CC(=O)O)CC(=O)O)cc1 | ||
18232127 | NC(Cc1ccc(O)cc1)C(=O)NC(CO)C(=O)NC(CC(=O)O)C(=O)O | ||
18231916 | NC(Cc1ccc(O)cc1)C(=O)NC(CC(=O)O)C(=O)NC(CO)C(=O)O | ||
18224136 | NC(CO)C(=O)NC(Cc1ccc(O)cc1)C(=O)NC(CC(=O)O)C(=O)O | ||
18219720 | NC(CC(=O)O)C(=O)NC(Cc1ccc(O)cc1)C(=O)NC(CO)C(=O)O |
Database: Instead of providing a candidate list, one can be generated from a selected database. Additional input options for generating a compound list from a database are:
Parent Ion Mass: The parent ion mass of the compound used in the mass spectrometry.
Adduct Type: The adduct type used in the mass spectrometry.
Candidate Mass Tolerance: The mass tolerance to use when identifying candidate compounds in the database. The default value is 100.0 ppm.
Candidate Limit: The maximum number of candidates to return. The maximum and default value is 100.
Spectra Type: The type of spectra, either ESI (Electrospray Ionization) or EI (Electron Ionization/Impact).
Ion Mode: Indicates whether the precursor ion has a positive or negative adduct.
Number of Results: The number of results to return, with the default value being 10. If left blank, all results will be returned.
Mass Tolerance: The mass tolerance to use when matching peaks within the dot product comparison. The default value is 10.0 ppm.
Scoring Function: The type of scoring function to use when comparing spectra. The options are Jaccard and DotProduct.
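The exact formulas behind the Jaccard and DotProduct options are not given here, so the following is only a generic sketch of the two ideas: peaks from the measured and predicted spectra are paired when their masses agree within the ppm tolerance, then scored either by peak overlap (Jaccard) or by normalized intensity agreement (dot product, computed here as a cosine similarity). All function names and the greedy pairing strategy are assumptions for illustration.

```python
import math


def within_ppm(mass_a, mass_b, tol_ppm=10.0):
    """True if mass_a lies within tol_ppm parts per million of mass_b."""
    return abs(mass_a - mass_b) <= mass_b * tol_ppm * 1e-6


def match_peaks(measured, predicted, tol_ppm=10.0):
    """Greedily pair each measured peak with the closest unused predicted peak in tolerance."""
    pairs = []
    used = set()
    for m_mass, m_intensity in measured:
        best = None
        for j, (p_mass, _) in enumerate(predicted):
            if j in used or not within_ppm(m_mass, p_mass, tol_ppm):
                continue
            if best is None or abs(m_mass - p_mass) < abs(m_mass - predicted[best][0]):
                best = j
        if best is not None:
            used.add(best)
            pairs.append((m_intensity, predicted[best][1]))
    return pairs


def jaccard_score(measured, predicted, tol_ppm=10.0):
    """Matched peaks divided by the total number of distinct peaks."""
    matched = len(match_peaks(measured, predicted, tol_ppm))
    union = len(measured) + len(predicted) - matched
    return matched / union if union else 0.0


def dot_product_score(measured, predicted, tol_ppm=10.0):
    """Cosine similarity of intensities over matched peaks."""
    pairs = match_peaks(measured, predicted, tol_ppm)
    numerator = sum(a * b for a, b in pairs)
    norm = math.sqrt(sum(i * i for _, i in measured)) * math.sqrt(sum(i * i for _, i in predicted))
    return numerator / norm if norm else 0.0


measured = [(100.0, 3.0), (200.0, 4.0)]
predicted = [(100.0005, 3.0), (300.0, 4.0)]
print(jaccard_score(measured, predicted), dot_product_score(measured, predicted))
```

In this toy case only the peaks near mass 100 fall within 10 ppm of each other, so one of three distinct peaks is shared. A wider Mass Tolerance pairs more peaks and generally raises both scores.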
Output:
The top candidates are ranked according to how closely they match and returned in a list.
Rank | Score | ID | SMILES | |
---|---|---|---|---|
1 | 0.7829224 | 18224136 | NC(CO)C(=O)NC(Cc1ccc(O)cc1)C(=O)NC(CC(=O)O)C(=O)O | |
2 | 0.71888482 | 18232127 | NC(Cc1ccc(O)cc1)C(=O)NC(CO)C(=O)NC(CC(=O)O)C(=O)O | |
3 | 0.58806501 | 18231916 | NC(Cc1ccc(O)cc1)C(=O)NC(CC(=O)O)C(=O)NC(CO)C(=O)O | |
4 | 0.58717254 | 58753253 | NC(CO)C(=O)NC(CC(=O)O)C(=O)NC(Cc1ccc(O)cc1)C(=O)O | |
5 | 0.57845528 | 18219720 | NC(CC(=O)O)C(=O)NC(Cc1ccc(O)cc1)C(=O)NC(CO)C(=O)O | |
6 | 0.30917874 | 21273011 | NC(C(=O)O)(c1ccccc1)N(CCN(CC(=O)O)CC(=O)O)CC(=O)O | |
7 | 0.27142857 | 20147059 | Nc1ccc(C(C(=O)O)N(CCN(CC(=O)O)CC(=O)O)CC(=O)O)cc1 | |
8 | 0.26190476 | 21494927 | Nc1ccccc1C(C(=O)O)N(CCN(CC(=O)O)CC(=O)O)CC(=O)O | |
9 | 0.23333333 | 54199399 | NC(CN(CC(=O)O)CC(=O)O)(c1ccccc1)N(CC(=O)O)CC(=O)O | |
10 | 0.20098039 | 44585322 | COc1cc(C(=O)NCC(=O)NCC(=O)NCC(=O)O)cc(OC)c1OC |
If a list of search candidates is submitted, the predicted spectra for these candidates will be found in a separate file, which will be in .msp format.