Sequence classification algorithms can be used to
- determine whether a DNA sequence belongs to a coding or a non-coding region,
- identify the introns and exons of a eukaryotic gene sequence,
- predict the secondary structure of a protein, or
- assign a protein sequence to a specific protein family.

Related datasets in the UCI Machine Learning Repository:
- promoter: https://archive.ics.uci.edu/ml/datasets/Molecular+Biology+%28Promoter+Gene+Sequences%29
- secondary-structure: https://archive.ics.uci.edu/ml/datasets/Molecular+Biology+%28Protein+Secondary+Structure%29
- splice: https://archive.ics.uci.edu/ml/datasets/Molecular+Biology+%28Splice-junction+Gene+Sequences%29
- https://archive.ics.uci.edu/ml/machine-learning-databases/ecoli-mld/
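
The tasks listed above all map a sequence to a label. As a rough illustration of that idea, here is a minimal sketch of one possible approach: represent each DNA sequence by its normalized k-mer frequencies and assign the label of the nearest class centroid. This is not a method taken from these notes. The function names, the choice of k = 2, the labels, and the toy sequences are all assumptions made for the example; real data from the datasets above would need its own parsing, which is not shown.

```python
from collections import Counter
from itertools import product
import math

ALPHABET = "ACGT"  # assumption: plain DNA alphabet, other symbols are ignored


def kmer_profile(seq, k=2):
    """Return normalized k-mer frequencies of seq as a fixed-order list."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Normalize over valid k-mers only; avoid division by zero on short input.
    total = sum(counts[km] for km in kmers) or 1
    return [counts[km] / total for km in kmers]


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def train(labeled_seqs, k=2):
    """Build one centroid per label from (sequence, label) pairs."""
    by_label = {}
    for seq, label in labeled_seqs:
        by_label.setdefault(label, []).append(kmer_profile(seq, k))
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def classify(model, seq, k=2):
    """Return the label whose centroid is closest in Euclidean distance."""
    vec = kmer_profile(seq, k)
    return min(model, key=lambda label: math.dist(vec, model[label]))


# Hypothetical toy data, invented for illustration (not from any dataset).
TOY_TRAINING = [
    ("ATATATTTAAATATTA", "at_rich"),
    ("TTTAAATATATAATTT", "at_rich"),
    ("GCGCGGCCGCGGGCCG", "gc_rich"),
    ("CCGGGCGCGCCGCGGC", "gc_rich"),
]

model = train(TOY_TRAINING)
print(classify(model, "ATTTATAATATTAAAT"))
print(classify(model, "GGCCGCGGGCGCCGCG"))
```

A nearest-centroid rule is used here only because it fits in a few lines of standard-library Python. Any classifier that accepts fixed-length feature vectors could take its place.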