The whole genome sequence of an organism is a "blueprint": the genome carries a set of instructions that specify the organism's biological characteristics. The central dogma describes how these instructions are unfolded, with DNA transcribed into RNA and RNA translated into protein. In eukaryotic cells, a process called "splicing" takes place just after transcription. The primary transcripts (pre-mRNAs) are processed into mature messenger RNAs, which pass through the nuclear envelope and reach the cytosol, where they are translated into proteins. During splicing of the primary mRNA transcript, the introns (the non-coding regions of the gene) are spliced out and the exons (the coding regions of the gene) are joined.

Genome annotation has two distinct phases: gene prediction and functional annotation. The prediction phase identifies the exact gene structure, delimiting the boundaries of exons and introns and locating genes on the genome. Functional annotation, on the other hand, characterizes the predicted genes by assigning them a biological function, identifying their metabolic role, or describing their structural features. In recent years, the growing number of sequencing projects and the availability of completely sequenced genomes have made it difficult to find gene sequences quickly and reliably. Bioinformatics plays a major role in this research area: to improve genome annotation, a number of bioinformatics tools and software packages have been developed that consider multiple, heterogeneous sources of evidence.

Recent research has shown that machine learning is a boon for many kinds of prediction, especially in bioinformatics. Machine learning methods have played a major role in improving the accuracy of prediction and classification of DNA (deoxyribonucleic acid) and protein sequences. In eukaryotes, however, splice-site identification and prediction is not straightforward because of numerous false positives. To address this problem, in this paper we present a deep learning model based on a bidirectional Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN), developed to identify and predict splice sites for the prediction of exons from eukaryotic DNA sequences. This bidirectional LSTM-RNN model uses intron features that start with the splice-site donor (GT) and end with the splice-site acceptor (AG), subject to length constraints. The model has been improved by increasing the number of training epochs. It achieved a maximum accuracy of 95.5% and is compatible with large sequential data such as a complete genome.
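To make the intron feature concrete, here is a minimal Python sketch of how candidate introns that start with the donor GT and end with the acceptor AG could be enumerated under a length constraint. This is not the paper's implementation. The function name `candidate_introns` and the `min_len` and `max_len` defaults are hypothetical, and the defaults are toy values chosen so the example fits on a short string. A scan like this also shows why false positives are so common: GT and AG dinucleotides occur by chance all over a sequence, so most candidate spans are not real introns.

```python
from typing import Iterator, Tuple

DONOR = "GT"     # canonical dinucleotide at the 5' end of an intron
ACCEPTOR = "AG"  # canonical dinucleotide at the 3' end of an intron


def candidate_introns(
    seq: str, min_len: int = 4, max_len: int = 20
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) half-open spans that begin with GT and end with AG.

    min_len and max_len are illustrative toy bounds (an assumption),
    not values taken from the paper.
    """
    seq = seq.upper()
    n = len(seq)
    min_len = max(min_len, 4)  # GT and AG must not overlap
    for start in range(n - 1):
        if seq[start:start + 2] != DONOR:
            continue
        # end is exclusive, so the span length is end - start
        last_end = min(start + max_len, n)
        for end in range(start + min_len, last_end + 1):
            if seq[end - 2:end] == ACCEPTOR:
                yield start, end


if __name__ == "__main__":
    dna = "ATGGTAAGTCAGGCC"
    for start, end in candidate_introns(dna):
        print(start, end, dna[start:end])
```

Each candidate span, or a fixed-size window around it, would then have to be scored by a classifier such as the bidirectional LSTM described above in order to separate true splice sites from chance GT...AG pairs.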