Using AnABlast for intergenic sORF prediction in the Caenorhabditis elegans genome.

Title: Using AnABlast for intergenic sORF prediction in the Caenorhabditis elegans genome.
Publication Type: Journal Article
Year of Publication: 2020
Authors: Casimiro-Soriguer, CS, Rigual, MM, Brokate-Llanos, AM, Muñoz, MJ, Garzón, A, Pérez-Pulido, AJ, Jimenez, J
Journal: Bioinformatics
Volume: 36
Issue: 19
Pagination: 4827-4832
Date Published: 2020-12-08
ISSN: 1367-4811
Keywords: Animals; Caenorhabditis elegans; Computational Biology; Genome; Open Reading Frames; Software
Abstract

MOTIVATION: Short bioactive peptides encoded by small open reading frames (sORFs) play important roles in eukaryotes. Bioinformatic prediction of ORFs is an early step in genome sequence analysis, but sORFs encoding short peptides, often using non-AUG initiation codons, are not easily discriminated from false ORFs occurring by chance.

RESULTS: AnABlast is a computational tool designed to highlight putative protein-coding regions in genomic DNA sequences. This protein-coding finder is independent of ORF length and reading frame shifts, which makes AnABlast a potentially useful tool for predicting sORFs. Using this algorithm, we report the identification of 82 putative new intergenic sORFs in the Caenorhabditis elegans genome. Sequence similarity, motif presence, expression data and RNA interference experiments support the conclusion that the identified sORFs likely encode functional peptides, encouraging the use of AnABlast as a new approach for the accurate prediction of intergenic sORFs in annotated eukaryotic genomes.

AVAILABILITY AND IMPLEMENTATION: AnABlast is freely available at http://www.bioinfocabd.upo.es/ab/. The C. elegans genome browser with AnABlast results, annotated genes and all data used in this study is available at http://www.bioinfocabd.upo.es/celegans.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
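
ILLUSTRATION: The motivation above contrasts AnABlast with conventional ORF scanning. As a rough sketch of that conventional baseline only (this is not AnABlast, and it is not taken from the paper), the Python snippet below enumerates ATG-to-stop ORFs in all six reading frames. The function names `find_orfs` and `reverse_complement` and the `min_codons`/`max_codons` parameters are hypothetical, and the restriction to ATG starts is a simplifying assumption that ignores the non-AUG initiation codons mentioned in the abstract.

```python
# Naive six-frame ORF scan: a simplified baseline, NOT the AnABlast algorithm.
# Assumption: ORFs start at ATG and end at the first in-frame stop codon.

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=1, max_codons=100):
    """Yield (strand, frame, start, end, codons) for each ATG..stop ORF.

    start and end are 0-based offsets on the scanned strand; end is
    exclusive and includes the stop codon. codons counts the coding
    codons (ATG included, stop excluded).
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # offset of the currently open ATG, if any
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    n = (i - start) // 3
                    if min_codons <= n <= max_codons:
                        yield (strand, frame, start, i + 3, n)
                    start = None  # close the ORF and keep scanning


if __name__ == "__main__":
    for orf in find_orfs("ATGAAATAG"):
        print(orf)
```

With a small `max_codons`, a scan of this kind reports every short ATG-to-stop stretch, including those that arise by chance, which is the discrimination problem the abstract describes.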

DOI: 10.1093/bioinformatics/btaa608
Alternate Journal: Bioinformatics
PubMed ID: 32614398
PubMed Central ID: PMC7723330