3D protein structures are classically described as a succession of secondary structures: the periodic α-helix and β-sheet, and the coils (everything else). However, this approach leaves 50% of the structures undescribed.
A structural alphabet (de Brevern et al., 2001) is a set (or library) of small prototypes that approximate every part of protein structures. These prototypes form a limited number of recurrent structural elements of proteins. The associations between these structural "letters" are governed by logical rules and form the words of protein structures. The applications are numerous, ranging from simplifying the protein backbone conformation with reasonable accuracy to more ambitious prediction approaches.
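To illustrate how a library of prototypes can approximate a structure, here is a minimal sketch in which each backbone fragment is replaced by the letter of its closest prototype. It assumes fragments are described by vectors of dihedral angles and compared with an angular root-mean-square deviation; the two prototypes, their letters `H` and `E`, and the function names are hypothetical toy values chosen for illustration, not the published Protein Blocks.

```python
import math

def angular_diff(a, b):
    """Smallest signed difference between two angles, in degrees."""
    return (a - b + 180.0) % 360.0 - 180.0

def rmsda(u, v):
    """Root-mean-square deviation between two dihedral-angle vectors."""
    return math.sqrt(sum(angular_diff(a, b) ** 2 for a, b in zip(u, v)) / len(u))

def assign_letter(fragment, prototypes):
    """Return the name of the prototype closest to the fragment."""
    return min(prototypes, key=lambda name: rmsda(fragment, prototypes[name]))

# Hypothetical two-letter alphabet (made-up angles, NOT the real 16 prototypes).
# Each vector is (phi, psi, phi, psi) for two consecutive residues.
TOY_PROTOTYPES = {
    "H": [-60.0, -45.0, -60.0, -45.0],    # helix-like
    "E": [-120.0, 130.0, -120.0, 130.0],  # strand-like
}

fragment = [-65.0, -40.0, -58.0, -50.0]   # made-up observed angles
letter = assign_letter(fragment, TOY_PROTOTYPES)
print(letter)
```

Sliding such a window along the backbone turns a 3D structure into a string of letters, which is the simplified representation the rest of this page relies on.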
Figure 1. Representation of the 16 Protein Blocks.
The most frequent successions of five Protein Blocks (PBs), called Structural Words (SWs), have been examined. The selection defines 72 SWs that give a good structural approximation of fragments nine Cα long. Combining most of the SWs into a protein network covers more than 90% of the residues of a non-redundant protein structural databank. Interestingly, more than 80% of the coils are included in the network. The structural stability of each part of the network was examined; locally, each part corresponds to a single type of fold. The amino acid composition was also analyzed, revealing a new type of relationship between local folds and amino acid distribution. The results show that the 3D structures of the protein databank can be easily described as a combination of sub-graphs included in the network (de Brevern et al., 2002).
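The word counting and coverage ideas described above can be sketched as follows. This is only an illustration under stated assumptions: structures are assumed to be already encoded as strings of PB letters (written here with the letters a to p), the two input strings are made-up toy data, and the function names `count_words` and `coverage` are hypothetical. It is not the selection procedure used to obtain the 72 SWs.

```python
from collections import Counter

def count_words(pb_sequences, word_len=5):
    """Count every overlapping window of word_len PB letters."""
    counts = Counter()
    for seq in pb_sequences:
        for i in range(len(seq) - word_len + 1):
            counts[seq[i:i + word_len]] += 1
    return counts

def coverage(pb_sequences, words, word_len=5):
    """Fraction of PB positions lying inside at least one selected word."""
    covered = 0
    total = 0
    for seq in pb_sequences:
        mask = [False] * len(seq)
        for i in range(len(seq) - word_len + 1):
            if seq[i:i + word_len] in words:
                for j in range(i, i + word_len):
                    mask[j] = True
        covered += sum(mask)
        total += len(seq)
    return covered / total if total else 0.0

# Toy PB strings (hypothetical, not taken from real structures).
toy = ["mmmmmmnopacd", "ddddfklmmmmmm"]

counts = count_words(toy)
selected = {"mmmmm"}                 # pretend this word was selected
print(counts.most_common(3))
print(coverage(toy, selected))
```

On a real databank the same counting would be run over all encoded chains, and the coverage function gives the kind of "percentage of residues included" figure quoted above.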
This structural alphabet was used to compact a structural databank with a new clustering approach called the Hybrid Protein Model (de Brevern & Hazout, 2001). It differs from classical clustering in that the clusters are not independent: they overlap and thus create continuity. The methodology has since been improved (de Brevern & Hazout, 2003; Benros et al.).
It has shown its potential for analyzing and describing the relationship between structure and sequence in globular proteins (de Brevern & Hazout, 2000). Another approach, based on a Hidden Markov Model, is being developed in our laboratory (Camproux et al., 2001).
The first screen of LocPred is a standard input window for entering the protein sequence. The following sections describe the different options and analyses.
The help files are divided into 5 distinct sections.