AG OR/ML - Dr. Josef Hochreiter


Recurrent Neural Networks in Bioinformatics


In bioinformatics and medical applications it is important to classify sequences, e.g. to distinguish healthy from ill subjects. However, sequence classification inherently requires the handling of long-term dependencies, such as important information which has to be memorized until the classification at the end of the sequence. I will present some applications of sequence classification in the field of bioinformatics.

First, I will show how recurrent nets can be used to predict the 3D structure of proteins given as amino acid sequences, e.g. by determining the secondary structure, contact maps, or turns. Then I will present a novel approach to functional protein classification. This approach also identifies new protein motifs, i.e. indicative, highly conserved patterns in the amino acid sequence. For 8 of 15 randomly chosen Prosite protein classes, we rediscovered the Prosite pattern without using biochemical knowledge (which the Prosite designers did use). For the remaining 7 classes we detected new patterns, which were tested on the SwissProt database. On average, the new patterns outperformed the corresponding Prosite patterns in terms of misclassification. In contrast to how the existing functional protein databases (Prosite, Pfam, Prints) were constructed, recurrent networks are able to generate protein databases automatically.



B.Hammer