Alexandre G. de Brevern's papers
de Brevern A.G., Etchebest C. & Hazout S. (2000),
Bayesian probabilistic approach for predicting backbone structures in terms of protein blocks,
Proteins: Structure, Function, and Genetics, 41(3):271-287.
- Using an unsupervised cluster analyser, we have
identified a local structural alphabet composed of 16 folding patterns of five
consecutive C alpha ("protein blocks"). The dependence that exists between successive
blocks is explicitly taken into account. A Bayesian approach based on the relation
between protein blocks and amino acid propensities is used
for prediction and leads to a success rate close to 35%. Splitting the
sequence windows associated with certain blocks into "sequence families" improves
the prediction accuracy by 6%. The prediction accuracy exceeds 75%
when keeping the first four predicted protein blocks at each site of the protein.
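The Bayesian prediction step summarised in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a naive-Bayes form in which the positions of a sequence window are independent given the block, and the function names (rank_blocks, top_k_hit), the floor pseudo-probability, the block labels and the toy probability tables are all hypothetical. It ranks candidate blocks by posterior probability and checks whether the true block is among the k best, which is how a "first four predicted blocks" success rate could be counted.

```python
import math

def rank_blocks(window, priors, likelihoods, floor=1e-6):
    """Rank blocks by posterior P(block | window), most probable first.

    priors[b] is P(b); likelihoods[b][i][aa] is P(amino acid aa at window
    position i | b). Positions are treated as independent given the block,
    which is a simplifying assumption of this sketch.
    """
    log_scores = {}
    for block, prior in priors.items():
        score = math.log(prior)
        for i, aa in enumerate(window):
            # Unseen amino acids get a small floor instead of probability zero.
            score += math.log(likelihoods[block][i].get(aa, floor))
        log_scores[block] = score
    # Normalise in log space (log-sum-exp) to obtain posteriors.
    top = max(log_scores.values())
    total = sum(math.exp(s - top) for s in log_scores.values())
    posteriors = {b: math.exp(s - top) / total for b, s in log_scores.items()}
    return sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)

def top_k_hit(window, true_block, priors, likelihoods, k=4):
    """True if the observed block is among the k most probable predictions."""
    ranked = rank_blocks(window, priors, likelihoods)
    return true_block in [block for block, _ in ranked[:k]]

# Toy tables for two hypothetical blocks and a window of three residues.
TOY_PRIORS = {"pb1": 0.6, "pb2": 0.4}
TOY_LIKELIHOODS = {
    "pb1": [{"A": 0.5, "L": 0.4, "G": 0.1}] * 3,
    "pb2": [{"V": 0.5, "I": 0.4, "G": 0.1}] * 3,
}

if __name__ == "__main__":
    print(rank_blocks("ALA", TOY_PRIORS, TOY_LIKELIHOODS))
```

With real data, the tables would be estimated from amino acid occurrences observed in windows centred on each block, and the success rate would be the fraction of sites for which top_k_hit returns True.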
de Brevern A.G. & Hazout S. (2000),
Hybrid Protein Model (HPM): a method to compact protein 3D-structures information and
IEEE Computer Society, S1:49-54.
- The transformation of a protein 1D sequence into a protein
3D structure is one of the main difficulties of structural biology. A
structural alphabet was previously defined, with the dihedral angles describing
the protein backbone as structural information, by using an unsupervised cluster analyser.
The 16 Protein Blocks (PBs), the basic elements of the structural alphabet,
allow a correct approximation of 3D structures. Local prediction was assessed
with a Bayesian approach, which showed that sequence information strongly determines the
local fold, although the prediction remains coarse (prediction rate of 40.7% with one PB, 75.8% with
the four most probable PBs).
The Hybrid Protein Model presented in this study learns both sequence and structure of the proteins.
Analysis along the hybrid protein made it possible to assess more
precisely the spatial location of some types of amino acid residues in the secondary structures
and their flanking regions. This study leads to a fuzzy model
of the dependence between sequence and structure.
de Brevern A.G. & Hazout S. (2001),
Compacting local protein folds with a Hybrid Protein,
Theoretical Chemistry Accounts, 106(1/2):36-47.
- The "Hybrid Protein Model" (HPM) is a fuzzy model for compacting local
protein structures. It learns a non-redundant database encoded in a
previously defined structural alphabet composed of 16 protein blocks (PBs).
The hybrid protein is composed of a series of distributions of the
probability of observing the PBs. The training is an iterative
unsupervised process that for every fold to be learnt consists of looking
for the most similar pattern present in the hybrid protein and modifying
it slightly. Finally each position of the hybrid protein corresponds to a set
of similar local structures. Superimposing those local structures yields an
average root mean square of 3.14 Å. The significant amino acid
characteristics related to the local structures are determined. The use of
this model is illustrated by finding the most similar folds between two cytochromes P450.
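The training loop described in this abstract (for every fold, find the most similar pattern in the hybrid protein and modify it slightly) can be sketched as follows. This is a minimal illustration under stated assumptions, not the published procedure: the hybrid protein is taken to be a linear list of probability distributions over 16 PB letters, similarity is measured by log-likelihood, and the update is a small step toward the observed PB controlled by a hypothetical rate parameter. The a..p lettering of the alphabet and all function names are also assumptions of this sketch.

```python
import math
import random

ALPHABET = "abcdefghijklmnop"  # 16 PB labels; the lettering is assumed here

def init_hybrid(n_positions, rng):
    """One random, normalised distribution over the alphabet per position."""
    hybrid = []
    for _ in range(n_positions):
        weights = [rng.random() + 0.5 for _ in ALPHABET]  # strictly positive
        total = sum(weights)
        hybrid.append({pb: w / total for pb, w in zip(ALPHABET, weights)})
    return hybrid

def best_offset(hybrid, fragment):
    """Offset at which the fragment has the highest log-likelihood."""
    def score(offset):
        return sum(math.log(hybrid[offset + i][pb])
                   for i, pb in enumerate(fragment))
    return max(range(len(hybrid) - len(fragment) + 1), key=score)

def update(hybrid, fragment, offset, rate=0.05):
    """Move the matched distributions slightly toward the fragment's PBs."""
    for i, pb in enumerate(fragment):
        dist = hybrid[offset + i]
        for key in dist:
            target = 1.0 if key == pb else 0.0
            # Convex combination: each distribution still sums to 1.
            dist[key] = (1.0 - rate) * dist[key] + rate * target

def train(fragments, n_positions=20, epochs=10, rate=0.05, seed=0):
    """Iterative unsupervised training over a list of PB-encoded fragments."""
    rng = random.Random(seed)
    hybrid = init_hybrid(n_positions, rng)
    for _ in range(epochs):
        for fragment in fragments:
            update(hybrid, fragment, best_offset(hybrid, fragment), rate)
    return hybrid
```

After training, the fragments that map to the same offset form the set of similar local structures associated with that position. A small rate keeps each modification slight, so a position drifts toward the consensus of the fragments it attracts rather than jumping to the last one seen.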
Camproux A.C., de Brevern A.G., Hazout S. & Tuffery P. (2001),
Exploring the use of a structural alphabet for a structural prediction of protein loops,
Theoretical Chemistry Accounts, 106(1/2):28-35.
- The prediction of loop conformations is one of the
challenging problems of homology modeling, due to the large sequence variability
associated with these parts of protein structures. In the present study, we introduce a
search procedure that evolves in a structural alphabet space deduced from a hidden Markov
model to simplify the structural information. It uses a
Bayesian criterion to predict, from the amino acid sequence of a loop region,
its corresponding word in the structural alphabet space. Results show that our
approach ranks 30% of the target words with the best score and 50% within the 5
best scores. Interestingly, our approach can also decide whether or not to accept
a prediction. This makes it possible to rank 57% of the target words with the best
score and 67% within the 5 best scores, while accepting 16% of learned words and
rejecting 93% of unknown words.
de Brevern A.G. (2001),
Nouvelles stratégies d'analyses et de prédiction des structures tridimensionnelles des protéines [New strategies for the analysis and prediction of three-dimensional protein structures],
PhD thesis, Université Paris 7, speciality: Genome Analysis and Molecular Modelling, 208 p.
Defended on 6 February 2001 with the distinction "Très Honorable avec Félicitations" (highest honours).
- Summary: Characterizing the three-dimensional structure of proteins with the classical secondary structures alone is structurally rather poor.
We therefore developed a new methodology for designing sets of small average prototypes, called Protein Blocks (PBs),
which give a good approximation of protein structures. Analysis of the Protein Blocks showed their stability and
their specificity at the structural level.
The final choice of the number of PBs is tied to a correct local prediction.
This prediction relies on a Bayesian method that makes the importance of the amino acids easy to understand.
To improve this prediction, we relied on two concepts: (i) 1 local fold -> n sequences and
(ii) 1 sequence -> n folds. The first concept means that several types of sequences can be associated with the same structure,
and the second that one sequence can be associated with several types of folds. Both aspects are developed through
the search for a reliability index linked to the local prediction, in order to locate regions of high probability.
Some words, i.e. successions of Protein Blocks, occur more frequently than others. We therefore
characterized as precisely as possible the architecture of these successions and the links between the different words.
Because of this redundancy, which can appear in protein structures, we developed a compaction method that
groups structures that are similar at the local level. This approach, called the "hybrid protein",
is simple in design and sorts all the structures of the protein databank into "structurally dependent" classes.
Beyond compaction, this approach can be used for a different purpose: searching for structural homology
and characterizing the dependences between structures and sequences.
Abstract: Secondary structures approximate 3D protein structures poorly.
A new method has been developed to create a more complex and precise structural alphabet
(called Protein Blocks), which can be used in a local prediction method. The specificity
and stability of this alphabet have been analysed.
The alphabet is composed of 16 prototypes, each 5 C alpha long. The local prediction of PBs from the sequence gives correct results.
The prediction is based on Bayesian statistics, which make it easy to understand the meaning and influence of every amino acid.
To improve this prediction, we have used two concepts: (i) 1 local fold is associated with a set of sequences and (ii)
1 sequence can give different folds. The first point is addressed by splitting the occurrence matrices associated with the most common PBs.
The second point is based on a confidence index and allows well-predicted residues to be located. Some successions of Protein Blocks are over-represented,
so we have defined a network describing most of the protein topology.
Finally, we propose a method called the Hybrid Protein Model, which compacts successions of Protein Blocks in a fuzzy manner and
creates structurally dependent clusters. This approach has been extended to the search for structural homology.
de Brevern A.G., Camproux A.C., Hazout S., Etchebest C., and Tuffery P. (2001),
Protein structural alphabets: beyond the secondary structure description,
Recent Adv. In Prot. Eng., 1:319-331, Sangadai SG ed. Research signpost, Trivandrum, India.
- The considerable growth of protein structure databases makes it possible to move beyond
the classical secondary structure description of proteins. While still confronted with numerous
problems, defining structural alphabets is an emerging concept in the field of protein structure
analysis. It is an attempt to objectively classify the whole set of conformations occurring in
protein structures described by small overlapping fragments.
It is expected to lead to a better understanding of protein architecture and to open new
opportunities for protein structure prediction.
de Brevern A.G., Valadié H., Hazout S. & Etchebest C. (2002),
Extension of a local backbone description using a structural alphabet:
a new approach to the sequence-structure relationship,
Protein Science, 11(12):2871-2886.
- Protein Blocks (PBs) comprise a structural alphabet of 16 protein fragments, each 5 C alpha long.
They make it possible to approximate and correctly predict local protein three-dimensional (3D) structures (de Brevern et al., 2000).
We have selected the 72 most frequent sequences of 5 PBs, which we call Structural Words (SWs).
Analysis of 4 different protein databanks shows that SWs cover 92% of the amino acids in them and provide a good structural approximation for the corresponding
fragments, which are 9 C alpha long. We present most of them in a simple network that describes 90% of the overall residues and, interestingly, includes more than 80%
of the amino acids present in coils. Analysis of the network shows the specificity and quality of the 3D descriptions as well as a new type
of relation between local folds and amino acid distribution. The results show that the 3D structure of these protein databanks can be
easily described by a combination of subgraphs included in the network. Finally, a Bayesian probabilistic approach improved the prediction rate by 4%.
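The selection of Structural Words and their coverage, as described in this abstract, can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, proteins are assumed to be already encoded as strings of PB letters, and coverage is counted here over PB positions rather than over amino acids as in the paper. The last helper spells out why a word of 5 PBs spans 9 C alpha, assuming consecutive PBs are shifted by one residue: 5 + 5 - 1 = 9.

```python
from collections import Counter

def count_words(pb_sequences, word_len=5):
    """Count every run of word_len consecutive PBs in the encoded proteins."""
    counts = Counter()
    for seq in pb_sequences:
        for i in range(len(seq) - word_len + 1):
            counts[seq[i:i + word_len]] += 1
    return counts

def select_words(counts, n=72):
    """Keep the n most frequent words as the set of Structural Words."""
    return {word for word, _ in counts.most_common(n)}

def coverage(pb_sequences, words, word_len=5):
    """Fraction of PB positions lying inside at least one selected word."""
    covered = total = 0
    for seq in pb_sequences:
        mask = [False] * len(seq)
        for i in range(len(seq) - word_len + 1):
            if seq[i:i + word_len] in words:
                for j in range(i, i + word_len):
                    mask[j] = True
        covered += sum(mask)
        total += len(seq)
    return covered / total if total else 0.0

def span_in_calpha(word_len=5, block_len=5):
    """Length in C alpha of word_len overlapping blocks shifted by one residue."""
    return word_len + block_len - 1
```

On a real databank, select_words(count_words(sequences)) followed by coverage(sequences, words) would give a figure comparable in spirit to the 92% quoted above, although the exact value depends on how positions are mapped back to residues.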
de Brevern A.G. & Hazout S. (2003),
A "Hybrid Protein Model" for defining optimally a repertory of contiguous 3D protein structure fragments,
Motivation : Our aim is to define automatically
a repertory of contiguous 3D protein structure fragments.
protein structures in order to exploit the defined domains.
We present improvements to a methodology, the "Hybrid Protein Model"
(de Brevern and Hazout, 2001).
The hybrid protein aims at learning a non-redundant database encoded in a previously
defined structural alphabet composed of 16 Protein Blocks (PBs) (de Brevern et al., 2000).
The hybrid protein is composed of a series of probability distributions of observing the PBs.
Training consists of learning every local fold by looking for the most similar
pattern present in the hybrid protein and
modifying it slightly.
In the end, each position corresponds to a set of similar local structures.
Results: In this paper, we present the strategy for defining the hybrid protein
optimally.
The strategy relies on "baby training", which consists of introducing large
structure fragments and progressively reducing their sizes,
and on removing redundancy from the hybrid protein.
These two improvements are assessed through a description of the repertory.
Benros C., de Brevern A.G. & Hazout S. (2003),
Hybrid Protein Model (HPM): A Method For Building A Library Of Overlapping Local Structural Prototypes. Sensitivity Study And Improvements Of The Training,
IEEE NNSP, 1:53-72.
- Predicting protein structure from amino acid sequence is one of the main challenges of genomics. Various computational methods have been developed during the last decade to reach this goal, but structure prediction remains difficult. Before facing this complex problem, our goal is to focus on the accurate analysis of protein structures at a local level. In this study, we present an approach called the "Hybrid Protein Model" (HPM), which uses a training procedure similar to that of Self-Organizing Maps. It compresses a non-redundant protein structure databank into a library of overlapping 3D structural fragments. The "Hybrid Protein Model" carries out a multiple alignment of structural fragments. We present an improvement of this strategy that introduces gaps in the local structures, together with a sensitivity study of the training with respect to the control parameters. The library obtained is composed of a finite number of structural classes, each class including fragments that share similar local structures. These classes are representative of the structural motifs found in the protein structures of the databank. This library thus constitutes an efficient tool for determining structural similarities between proteins and especially for predicting local protein structure from the amino acid sequence.
For more information about these works, write to email@example.com.
De BREVERN Alexandre
Equipe de Bioinformatique Génomique & Moléculaire du professeur Serge Hazout (EBGM)
Unité INSERM E0346
Université Paris VII, case 7113
2, place Jussieu
75251 Paris Cedex 05
To send an e-mail:
Subject: Protein Blocks.