c1(nn(c(c1Br)C)CC(Nc1ncccc1C1CCC[NH+]1C)=O)[NH](O)=O ==>
#1: c1(nn(c(c1Br)C)CC(Nc1ncccc1[C@H]1CCC[N@H+]1C)=O)[NH](O)=O
#2: c1(nn(c(c1Br)C)CC(Nc1ncccc1[C@H]1CCC[N@@H+]1C)=O)[NH](O)=O
#3: c1(nn(c(c1Br)C)CC(Nc1ncccc1[C@@H]1CCC[N@H+]1C)=O)[NH](O)=O
#4: c1(nn(c(c1Br)C)CC(Nc1ncccc1[C@@H]1CCC[N@@H+]1C)=O)[NH](O)=O
1. History
2. Features
3. Limitations
4. Usage
5. Examples, sample tests
6. Concepts
7. Validation
8. Availability (news since 2009 January)
9. Citations
History:
- 2005: early discussions at RPBS about developing a free 1D to 3D tool (M. Miteva, P. Tuffery, B. Villoutreix).
- 2005: early developments by D. Gomes. Proof of concept, graph expansion of compounds from smiles, geometry assembly prototype.
- 2006: analysis of ambiguous compounds, energy assessment, first steps towards multiple conformations (T. Bohme Leite). Intensive checks of isomer detection (chirality, Z/E isomerism, axial/equatorial). Check of 3D assembly (single conformation). Hydrogen positioning.
- late 2006: energy assessment of conformations (based on an implementation of Merck Molecular Force Field (MMFF)).
- February 2007: small bug fixes (Force Field parameters).
Features:
- Solve isomer ambiguities occurring in compounds expressed using the 1D SMILES [1] or 2D SDF [2] formats, which are used by most academic or commercial compound collections. Frog processes the input data, identifies chiral centers and produces a list of unambiguous smiles, each smiles corresponding to one unambiguous isomer. Frog also considers axial/equatorial conformations for cycles when relevant.
- Generate 3D coordinates for the compounds from SMILES or SDF input. Several conformations per isomer can be requested. Multiple conformations are often of great help for in silico compound screening.
1: Weininger, D. SMILES, a Chemical Language and Information System. 1. Introduction to Methodology and Encoding Rules. J. Chem. Inf. Comput. Sci. 1988, 28, 31-36
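The isomer enumeration described above can be illustrated with a minimal sketch. It assumes the stereocentres have already been located and marked by hand with named placeholders (`{carbon}`, `{nitrogen}`); detecting them automatically is the part Frog actually does and is not reproduced here. The function name and the template convention are illustrative choices, not part of Frog.

```python
from itertools import product


def enumerate_isomers(template, choices):
    """Fill every stereo placeholder in `template` with each allowed tag.

    `choices` maps a placeholder name to the list of atom strings that may
    replace it. One SMILES is produced per combination.
    """
    names = list(choices)
    isomers = []
    for combo in product(*(choices[name] for name in names)):
        isomers.append(template.format(**dict(zip(names, combo))))
    return isomers


# Ambiguous compound with its two stereocentres replaced by placeholders
# (placement done by hand for this illustration).
TEMPLATE = "c1(nn(c(c1Br)C)CC(Nc1ncccc1{carbon}1CCC{nitrogen}1C)=O)[NH](O)=O"
CHOICES = {
    "carbon": ["[C@H]", "[C@@H]"],
    "nitrogen": ["[N@H+]", "[N@@H+]"],
}

isomers = enumerate_isomers(TEMPLATE, CHOICES)
for number, smiles in enumerate(isomers, start=1):
    print(f"#{number}: {smiles}")
```

With two centres and two possible tags each, the sketch yields 2 x 2 = 4 smiles, one per unambiguous isomer.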
Limitations:
- Frog only builds compounds made of atom types commonly accepted as possible for medical drugs (i.e. mostly C, N, O, P, S, H). Ions present in salts can, however, be removed before processing using the facility accessible from the form.
- Frog builder is based on a library of cycles (over 10000 cycles presently). Compounds involving cycles not present in the library cannot be constructed by Frog.
- Frog v1.01 multiconformation generation still relies on a crude sampling algorithm. For compounds with a large number of degrees of freedom, the combinatorial search is biased to keep the computational time reasonable. Work is in progress on this point. (Remark: alternative stochastic approaches do not exhaustively search the full conformational space either.)
- Frog energy scoring is based on MMFF [3]. Although validation has been performed, some particular molecular arrangements may fall outside our present implementation.
3: Halgren T. A.; Merck Molecular Force Field: I. Basis, Form, Scope, Parameterization, and Performance of MMFF94 (490-519), II. MMFF94 van der Waals and Electrostatic Parameters for Intermolecular Interactions (520-552), III. Molecular Geometries and Vibrational Frequencies for MMFF94 (553-586), IV. Conformational Energies and Geometries for MMFF94 (587-615), V. Extension of MMFF94 Using Experimental Data, Additional Computational Data, and Empirical Rules (616-641). J. Comp. Chem., 1996, Vol.17, Nos. 5 & 6
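To give a feeling for why the combinatorial search has to be cut short for flexible compounds, here is a small sketch that counts the combinations of a regular dihedral grid and truncates the enumeration. The three angle values and the cap of 1000 combinations are arbitrary assumptions made for the illustration; the source does not describe the grid or the truncation rule that Frog uses.

```python
from itertools import islice, product

# Hypothetical grid: three staggered values per rotatable dihedral (degrees).
ANGLES = (60, 180, 300)


def count_combinations(n_dihedrals, n_angles):
    """Size of the full grid: one angle choice per dihedral."""
    return n_angles ** n_dihedrals


def sample_dihedrals(n_dihedrals, angles, max_combinations):
    """Return at most `max_combinations` angle tuples from the full grid.

    Taking the first tuples of the lexicographic enumeration is a biased
    truncation: the leading dihedrals barely vary.
    """
    grid = product(angles, repeat=n_dihedrals)
    return list(islice(grid, max_combinations))


for n_dihedrals in (2, 5, 10):
    print(n_dihedrals, count_combinations(n_dihedrals, len(ANGLES)))

truncated = sample_dihedrals(10, ANGLES, max_combinations=1000)
print(len(truncated), "combinations kept out of", count_combinations(10, len(ANGLES)))
```

Ten rotatable dihedrals already give 59049 grid points with only three values each, which is why any practical search keeps a subset and is therefore biased.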
Usage:
- Input: the smiles and sdf formats are accepted for the compounds. In order not to overload the server, requests are limited to 1000 compounds. It is possible to both upload a file and paste data. However, the two bodies of data MUST be in the same format (i.e. both are smiles or both are sdf).
- Processing: the user can choose among "Unambiguate", "Quick3D", "Single" and "Multi".
- "Unambiguate" will only produce a list of unambiguous smiles (i.e. one per isomer identified).
- "Quick3D" will only produce one 3D conformation, for one isomer only. It is intended to provide a quick glance at the compound(s).
- "Single" will generate one conformation per unambiguous isomer of each compound.
- "Multi" will generate several conformations per unambiguous isomer of each compound.
- #confs: Maximum number of conformations returned per isomer. The actual number depends on how many conformations pass the E Max threshold.
- E Max: Maximum energy difference to the lowest energy conformer for keeping 3D conformations. Conformations whose score exceeds the lowest energy by more than this value will be discarded.
- #mc Steps: number of Monte Carlo steps. For each conformation, some Monte Carlo steps are performed as an attempt to quickly improve the conformation and/or remove clashes.
- Output: 3D files can be returned in the PDB, SDF or mol2 formats. 1D information is returned in the SMILES format. Unambiguate results are always smiles. Identifiers of the returned conformations are of the form: inputIdentifier_#isomer_#conformation.
- The returned log gives information about the treatment of each compound: smiles, and the axial-equatorial conformer string if relevant (A for axial, E for equatorial; the string matches the heavy atoms of the smiles). Errors can occur (see limitations).
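The identifier convention inputIdentifier_#isomer_#conformation can be handled with a couple of helper functions when post-processing the output. The sketch below assumes that both numbers are plain integers; the function names are hypothetical. Splitting from the right matters because input identifiers may themselves contain underscores.

```python
def make_conformer_id(input_id, isomer, conformation):
    """Build an identifier of the form inputIdentifier_#isomer_#conformation."""
    return f"{input_id}_{isomer}_{conformation}"


def split_conformer_id(conformer_id):
    """Recover (input identifier, isomer number, conformation number).

    The split is done from the right so that underscores inside the input
    identifier are preserved. Assumes both numbers are integers.
    """
    input_id, isomer, conformation = conformer_id.rsplit("_", 2)
    return input_id, int(isomer), int(conformation)


example = make_conformer_id("TKinh5_penciclovir_1KI3pdb", 2, 7)
print(example)
print(split_conformer_id(example))
```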
Using "Quick3D" or "Single", the first conformation not presenting strong steric clashes will be returned. This is intended to rapidly provide a correct 3D geometry of the compound for one or all of its isomers.
Low-energy conformations are returned only when using "Multi". Be aware that this is currently being optimised in two directions: (i) relevance of the lowest energy conformations and (ii) computational speed. At present, very large compounds still require substantial computational time. This is under investigation.
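The interplay between the #confs and E Max parameters can be sketched as a simple filter. This is an illustration of the rule as stated on this page, not Frog's actual code: conformers are ranked by energy, those lying more than E Max above the lowest one are dropped, and at most #confs are kept. The function name and the (name, energy) tuple layout are assumptions.

```python
def select_conformers(conformers, e_max, max_confs):
    """Keep low-energy conformers.

    `conformers` is a list of (name, energy) pairs. A conformer is kept when
    its energy is within `e_max` of the lowest energy found, and at most
    `max_confs` conformers are returned, lowest energy first.
    """
    if not conformers:
        return []
    ranked = sorted(conformers, key=lambda conformer: conformer[1])
    lowest = ranked[0][1]
    kept = [c for c in ranked if c[1] - lowest <= e_max]
    return kept[:max_confs]


# Made-up energies for four conformers of one isomer.
candidates = [("c1", 12.0), ("c2", 3.5), ("c3", 150.0), ("c4", 40.2)]
selected = select_conformers(candidates, e_max=100.0, max_confs=10)
print(selected)
```

This also shows why fewer than #confs conformations may be returned: the energy threshold can reject candidates before the count limit is reached.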
Note on formats:
Smiles files should contain one smiles per line, followed by the compound identifier, such as:
O=C(CCCCCCCCCC[C@H]1[C@@H]2[C@@H](c3c(C1)cc(cc3)O)CC[C@@]1([C@@H](CC[C@@H]21)O)C)[N@@](C)CCCC compound_Identifier
O=C(CCCCCCCCCC[C@H]1[C@@H]2[C@@H](c3c(C1)cc(cc3)O)CC[C@@]1([C@@H](CC[C@@H]21)O)C)[N@](C)CCCC another_compound_identifier
O=C1[C@@H]2[C@H](N[C@H](N1)N)[N@](CCC(CO)CO)[CH]N2 TKinh5_penciclovir_1KI3pdb
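A file in this layout can be checked locally before submission with a few lines of code. The sketch below assumes whitespace-separated fields (smiles first, identifier second) and enforces the limit of 1000 compounds mentioned above; the fallback identifier used when a line carries no name is a hypothetical convention, not something the server is known to do.

```python
MAX_COMPOUNDS = 1000  # request limit stated in the Usage section


def parse_smiles_lines(text, max_compounds=MAX_COMPOUNDS):
    """Return a list of (smiles, identifier) pairs, one per non-empty line."""
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        smiles = fields[0]
        # Hypothetical fallback name when the line has no identifier.
        identifier = fields[1] if len(fields) > 1 else f"compound_{line_number}"
        records.append((smiles, identifier))
    if len(records) > max_compounds:
        raise ValueError(
            f"{len(records)} compounds given, limit is {max_compounds}"
        )
    return records


sample = "CC(=C(C)C(O)C)F\n\nO=C1CCCC1 cyclopentanone\n"
records = parse_smiles_lines(sample)
print(records)
```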
The sdf format should be on the form:
Examples:
Paste smiles:
CC(=C(C)C(O)C)F
Select Unambiguate.
Resulting smiles are:
CC(=C(/C)[C@H](O)C)/F
CC(=C(/C)[C@H](O)C)\F
CC(=C(/C)[C@@H](O)C)/F
A more complex sample test (19 smiles) can be accessed here.
The unambiguation results (38 smiles) can be accessed here.
The results of the quick3D generation (19 compounds) can be accessed here. (mol2 format)
The results of the single generation (33 compounds) can be accessed here. (mol2 format) (the difference between 38 and 33 is due to the axial/equatorial conformations possible for some cycles, and to the fact that Frog randomly selected a maximum of 8 isomers out of 16 for 1 compound).
The results of the multi generation (10 conformations at most per isomer) can be accessed here. (mol2 format, 275 conformations) (note: for some compounds, fewer than 10 conformations of low energy were identified).
Some 3D conformations generated using Frog (multiconformations), for which experimental data is available here, can be accessed here.
Random test on 992 compounds from Specs, Chembridge and Ambinter, using an energy threshold of 100.0, 100 Monte Carlo steps and a maximum of 10 conformations.
The input smiles are here.
The unambiguated smiles (1238) are here.
The mol2 output (12668 conformers) is here.
The log file is here.
Compounds not processed (might not be ADME/Tox compliant) here.
Note: the number of 12668 (i.e. more than 1238 x 10) is due to compounds for which axial/equatorial conformers have been considered. See for instance compound Chembridge-6439335: 2 smiles describe the isomers, but 4 conformers are considered due to axial/equatorial conformations.
Concepts:
- Graph approach to compound 3D generation. Node types are cycles, linkers and appendices (see image below).
- Cycles correspond to simple cycles or complex cycles made of several simple cycles connected together (sharing atoms).
- Linkers are compound fragments that interconnect cycles.
- Appendices are fragments that are bound to cycles, not linking several cycles.
- Cycle conformations are not computed by Frog. They are taken from a library of cycles extracted from publicly available collections of 3D compounds. Such a strategy has already been described [4]. Frog revisits it.
- Flexibility of compounds results from dihedrals of the linkers and the appendices. Covalent geometry flexibility is ignored.
- Multiple conformations of cycles are not presently considered, although this is conceivable using the different conformations extracted and stored in the library.
- Multi conformations are obtained by sampling the flexible dihedral angles of the compounds, and are sorted according to their MMFF energy. This has several limitations. The two major ones are: (i) MMFF does not reproduce the relative orientation of cycles well in some cases; (ii) for large compounds, the combinatorial space to explore is huge, and Frog presently truncates it. Both points are under investigation for improvement.
4: Sadowski, J.; Gasteiger, J. From Atoms and Bonds to Three-Dimensional Atomic Coordinates: Automatic Model Builders. Chem. Rev. 1993, 93, 2567-2581
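The three node types of the graph approach can be expressed as a small classification rule. The sketch assumes the compound has already been cut into cycles and non-cyclic fragments, and that for each fragment the set of cycles it is bonded to is known; how Frog performs that decomposition is not described here, and the fragment names are invented for the illustration. A fragment bonded to two or more cycles is a linker, a fragment bonded to exactly one cycle is an appendix. Fragments bonded to no cycle are not covered by the definitions above, so the sketch rejects them.

```python
def classify_fragments(cycles, fragments):
    """Label every node of the compound graph.

    `cycles` is a set of cycle names. `fragments` maps each non-cyclic
    fragment name to the set of cycles it is bonded to.
    """
    kinds = {cycle: "cycle" for cycle in cycles}
    for name, attached in fragments.items():
        n_cycles = len(attached & cycles)
        if n_cycles >= 2:
            kinds[name] = "linker"      # interconnects cycles
        elif n_cycles == 1:
            kinds[name] = "appendix"    # bound to one cycle only
        else:
            raise ValueError(f"fragment {name!r} is not bonded to any cycle")
    return kinds


# Toy compound: two rings joined by a chain, each carrying a substituent.
toy_cycles = {"ringA", "ringB"}
toy_fragments = {
    "amide_chain": {"ringA", "ringB"},
    "bromine": {"ringA"},
    "methyl": {"ringB"},
}
node_kinds = classify_fragments(toy_cycles, toy_fragments)
print(node_kinds)
```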
(image: compound ==> graph representation)
Validation tests:
- Chirality detection on the Asinex library: selection of 84,812 compounds for which some chirality information was present. The chirality information was removed from the compounds and unambiguation was requested, then it was checked whether the original chirality information was regenerated. Success for 84,792 (over 99%). After manual analysis, the 20 remaining compounds are false negatives!
- Isomer 3D assembly: random selection of compounds from Asinex, Ambinter, Specs and ChemBridge. 3D generation. Visual inspection of all the isomers: OK.
- Atom types assignment: performed using the MMFF94 validation suite (235 compounds).
- Some 3D conformations generated using Frog, for which experimental data is available here, can be accessed here.
- Some results obtained for the Astex diverse test set are summarized here.
Availability:
- January 2009: Frog v1.0 is freely available under the terms of the GNU GPL license. You can access the Frog source code here. The authors would appreciate an email from you, so that they can identify Frog users and send news about Frog's evolution.
Using Frog, please cite:
- Frog: a FRee Online druG 3D conformation generator. Leite TB, Gomes D, Miteva MA, Chomilier J, Villoutreix BO, Tufféry P. Nucleic Acids Res. 2007 Jul;35(Web Server issue):W568-72. Epub 2007 May 7.





