TY - JOUR T1 - Babelomics 5.0: functional interpretation for new generations of genomic data. JF - Nucleic acids research Y1 - 2015 A1 - Alonso, Roberto A1 - Salavert, Francisco A1 - Garcia-Garcia, Francisco A1 - Carbonell-Caballero, José A1 - Bleda, Marta A1 - García-Alonso, Luz A1 - Sanchis-Juan, Alba A1 - Perez-Gil, Daniel A1 - Marin-Garcia, Pablo A1 - Sánchez, Rubén A1 - Cubuk, Cankut A1 - Hidalgo, Marta R A1 - Amadoz, Alicia A1 - Hernansaiz-Ballesteros, Rosa D A1 - Alemán, Alejandro A1 - Tárraga, Joaquín A1 - Montaner, David A1 - Medina, Ignacio A1 - Dopazo, Joaquin KW - babelomics KW - data integration KW - gene set analysis KW - interactome KW - network analysis KW - NGS KW - RNA-seq KW - Systems biology KW - transcriptomics AB - Babelomics has been running for more than a decade, offering a user-friendly interface for the functional analysis of gene expression and genomic data. Here we present its fifth release, which includes support for Next Generation Sequencing data including gene expression (RNA-seq), exome or genome resequencing. Babelomics has simplified its interface, which is now more intuitive. Improved visualization options, such as a genome viewer as well as an interactive network viewer, have been implemented. New technical enhancements on both the client and server sides make the user experience faster and more dynamic. Babelomics offers user-friendly access to a full range of methods that cover: (i) primary data analysis, (ii) a variety of tests for different experimental designs and (iii) different enrichment and network analysis algorithms for the interpretation of the results of such tests in the proper functional context. In addition to the public server, local copies of Babelomics can be downloaded and installed. Babelomics is freely available at: http://www.babelomics.org. VL - 43 UR - http://nar.oxfordjournals.org/content/43/W1/W117 ER - TY - JOUR T1 - Exome sequencing reveals a high genetic heterogeneity on familial Hirschsprung disease.
JF - Scientific reports Y1 - 2015 A1 - Luzón-Toro, Berta A1 - Gui, Hongsheng A1 - Ruiz-Ferrer, Macarena A1 - Sze-Man Tang, Clara A1 - Fernández, Raquel M A1 - Sham, Pak-Chung A1 - Torroglosa, Ana A1 - Kwong-Hang Tam, Paul A1 - Espino-Paisán, Laura A1 - Cherny, Stacey S A1 - Bleda, Marta A1 - Enguix-Riego, María Del Valle A1 - Joaquín Dopazo A1 - Antiñolo, Guillermo A1 - Garcia-Barceló, Maria-Mercè A1 - Borrego, Salud KW - babelomics KW - Hirschsprung KW - NGS KW - prioritization AB - Hirschsprung disease (HSCR; OMIM 142623) is a developmental disorder characterized by aganglionosis along variable lengths of the distal gastrointestinal tract, which results in intestinal obstruction. Interactions among known HSCR genes and/or unknown disease susceptibility loci lead to variable severity of phenotype. Neither linkage nor genome-wide association studies have efficiently contributed to completely dissecting the genetic pathways underlying this complex genetic disorder. We have performed whole exome sequencing of 16 HSCR patients from 8 unrelated families with the SOLiD platform. Variants shared by affected relatives were validated by Sanger sequencing. We searched for genes recurrently mutated across families. Only variations in the FAT3 gene were significantly enriched in five families. Within-family analysis identified compound heterozygotes for AHNAK and several genes (N = 23) with heterozygous variants that co-segregated with the phenotype. Network and pathway analyses facilitated the discovery of polygenic inheritance involving FAT3, HSCR known genes and their gene partners. Altogether, our approach has facilitated the detection of more than one damaging variant in biologically plausible genes that could jointly contribute to the phenotype. Our data may contribute to the understanding of the complex interactions that occur during enteric nervous system development and the etiopathology of familial HSCR.
VL - 5 UR - http://www.nature.com/articles/srep16473 ER - TY - JOUR T1 - Babelomics: an integrative platform for the analysis of transcriptomics, proteomics and genomic data with advanced functional profiling. JF - Nucleic Acids Research Y1 - 2010 A1 - Medina, Ignacio A1 - Carbonell, José A1 - Pulido, Luis A1 - Madeira, Sara C A1 - Goetz, Stefan A1 - Ana Conesa A1 - Tárraga, Joaquín A1 - Pascual-Montano, Alberto A1 - Nogales-Cadenas, Ruben A1 - Santoyo, Javier A1 - García, Francisco A1 - Marbà, Martina A1 - Montaner, David A1 - Joaquín Dopazo KW - babelomics KW - gene expression KW - genotyping KW - gepas KW - GSA KW - GWAS AB -

Babelomics is a response to the growing necessity of integrating and analyzing different types of genomic data in an environment that allows an easy functional interpretation of the results. Babelomics includes a complete suite of methods for the analysis of gene expression data, covering normalization (for most commercial platforms), pre-processing, differential gene expression (case-control, multiclass, survival or continuous values), predictors and clustering, as well as methods for large-scale genotyping assays (case-control tests and TDTs, with population stratification analysis and correction). All these genomic data analysis facilities are integrated and connected to multiple options for the functional interpretation of the experiments. Different methods of functional enrichment or gene set enrichment can be used to understand the functional basis of the experiment analyzed. Many sources of biological information, which include functional (GO, KEGG, Biocarta, Reactome, etc.), regulatory (Transfac, Jaspar, ORegAnno, miRNAs, etc.), text-mining or protein-protein interaction modules, can be used for this purpose. Finally, a tool for the de novo functional annotation of sequences has been included in the system. This provides support for the functional analysis of non-model species. Mirrors of Babelomics or command line execution of its individual components are now possible. Babelomics is available at http://www.babelomics.org.

VL - 38 UR - http://nar.oxfordjournals.org/content/38/suppl_2/W210.full ER - TY - JOUR T1 - Formulating and testing hypotheses in functional genomics JF - Artif Intell Med Y1 - 2009 A1 - Dopazo, J. KW - babelomics KW - gene set analysis AB -

OBJECTIVE: The ultimate goal of any genome-scale experiment is to provide a functional interpretation of the results, relating the available genomic information to the hypotheses that originated the experiment. METHODS AND RESULTS: Initially, this interpretation has been made on a pre-selection of relevant genes, based on the experimental values, followed by the study of the enrichment in some functional properties. Nevertheless, functional enrichment methods were shown to have a flaw: the first step of gene selection was too stringent, given that the cooperation among genes was ignored. The assumption that modules of genes related by relevant biological properties (functionality, co-regulation, chromosomal location, etc.) are the real actors of cell biology led to the development of new procedures, inspired by systems biology criteria, generically known as gene-set methods. These methods have been successfully used to analyze transcriptomic and large-scale genotyping experiments as well as to test other genome-scale hypotheses in other fields such as phylogenomics.

VL - 45 UR - http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18789659 N1 -

Dopazo, Joaquin Research Support, Non-U.S. Gov’t Netherlands Artificial intelligence in medicine Artif Intell Med. 2009 Feb-Mar;45(2-3):97-107. Epub 2008 Sep 11.

ER - TY - JOUR T1 - Gene set-based analysis of polymorphisms: finding pathways or biological processes associated to traits in genome-wide association studies JF - Nucl. Acids Res. Y1 - 2009 A1 - Medina, Ignacio A1 - Montaner, David A1 - Bonifaci, Núria A1 - Pujana, Miguel Angel A1 - Carbonell, José A1 - Tárraga, Joaquín A1 - Fatima Al-Shahrour A1 - Dopazo, Joaquin KW - babelomics KW - gene set KW - GESBAP KW - pathway-based analysis KW - SNP AB -

Genome-wide association studies have become a popular strategy to find associations of genes to traits of interest. Despite the high resolution available today for genotyping studies, their success in practice has been limited by the testing strategy used. As an alternative to brute force solutions involving the use of very large cohorts, we propose the use of Gene Set Analysis (GSA), a different analysis strategy based on testing the association of modules of functionally related genes. We show here how the Gene Set-based Analysis of Polymorphisms (GeSBAP), which is a simple implementation of the GSA strategy for the analysis of genome-wide association studies, provides a significant increase in testing power for this type of study. GeSBAP is freely available at http://bioinfo.cipf.es/gesbap/

VL - 37 UR - http://nar.oxfordjournals.org/cgi/content/abstract/37/suppl_2/W340 ER - TY - JOUR T1 - Babelomics: advanced functional profiling of transcriptomics, proteomics and genomics experiments JF - Nucleic Acids Res Y1 - 2008 A1 - Fatima Al-Shahrour A1 - Carbonell, J. A1 - Minguez, P. A1 - Goetz, S. A1 - A. Conesa A1 - Tarraga, J. A1 - Medina, Ignacio A1 - Alloza, E. A1 - Montaner, D. A1 - Dopazo, J. KW - babelomics KW - functional profiling AB -

We present a new version of Babelomics, a complete suite of web tools for the functional profiling of genome scale experiments, with new and improved methods as well as more types of functional definitions. Babelomics includes different flavours of conventional functional enrichment methods as well as more advanced gene set analysis methods that make it a unique tool among the similar resources available. In addition to the well-known functional definitions (GO, KEGG), Babelomics includes new ones such as Biocarta pathways or text mining-derived functional terms. Regulatory modules implemented include transcriptional control (Transfac, CisRed) and other levels of regulation such as miRNA-mediated interference. Moreover, Babelomics allows for sub-selection of terms in order to test more focused hypotheses. Gene annotation correspondence tables can also be imported, which allows testing with user-defined functional modules. Finally, a tool for the ’de novo’ functional annotation of sequences has been included in the system. This allows the use of as yet unannotated organisms in the program. Babelomics has been extensively re-engineered and now includes the use of web services and Web 2.0 technology features, a new user interface with persistent sessions and a new extended database of gene identifiers. Babelomics is available at http://www.babelomics.org.

VL - 36 UR - http://nar.oxfordjournals.org/content/36/suppl_2/W341.long N1 -

Al-Shahrour, Fatima Carbonell, Jose Minguez, Pablo Goetz, Stefan Conesa, Ana Tarraga, Joaquin Medina, Ignacio Alloza, Eva Montaner, David Dopazo, Joaquin Research Support, Non-U.S. Gov’t England Nucleic acids research Nucleic Acids Res. 2008 Jul 1;36(Web Server issue):W341-6. Epub 2008 May 31.

ER - TY - JOUR T1 - FatiGO +: a functional profiling tool for genomic data. Integration of functional annotation, regulatory motifs and interaction data with microarray experiments JF - Nucleic Acids Res Y1 - 2007 A1 - Fatima Al-Shahrour A1 - Minguez, P. A1 - Tarraga, J. A1 - Medina, Ignacio A1 - Alloza, E. A1 - Montaner, D. A1 - Dopazo, J. KW - babelomics KW - functional enrichment analysis AB -

The ultimate goal of any genome-scale experiment is to provide a functional interpretation of the data, relating the available information with the hypotheses that originated the experiment. Thus, functional profiling methods have become essential in diverse scenarios such as microarray experiments, proteomics, etc. We present FatiGO+, a web-based tool for the functional profiling of genome-scale experiments, especially oriented to the interpretation of microarray experiments. In addition to different functional annotations (gene ontology, KEGG pathways, Interpro motifs, Swissprot keywords and text-mining based bioentities related to diseases and chemical compounds), FatiGO+ includes, as a novelty, regulatory and structural information. The regulatory information used includes predictions of targets for distinct regulatory elements (obtained from the Transfac and CisRed databases). Additionally, FatiGO+ uses predictions of target motifs of miRNA to infer which of these can be activated or deactivated in the sample of genes studied. Finally, properties of gene products related to their relative location and connections in the interactome have also been used. Also, enrichment of any of these functional terms can be directly analysed on chromosomal coordinates. FatiGO+ can be found at: http://www.fatigoplus.org and within the Babelomics environment http://www.babelomics.org.

VL - 35 UR - http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17478504 N1 -

Al-Shahrour, Fatima Minguez, Pablo Tarraga, Joaquin Medina, Ignacio Alloza, Eva Montaner, David Dopazo, Joaquin Research Support, Non-U.S. Gov’t England Nucleic acids research Nucleic Acids Res. 2007 Jul;35(Web Server issue):W91-6. Epub 2007 May 3.

ER - TY - JOUR T1 - From genes to functional classes in the study of biological systems JF - BMC Bioinformatics Y1 - 2007 A1 - Fatima Al-Shahrour A1 - Arbiza, L. A1 - H. Dopazo A1 - Huerta-Cepas, J. A1 - Minguez, P. A1 - Montaner, D. A1 - Dopazo, J. KW - Algorithms KW - babelomics KW - Chromosome Mapping/*methods KW - Computer Simulation KW - Gene Expression Profiling/methods KW - *Models, Biological KW - Multigene Family/*physiology KW - Signal Transduction/*physiology KW - *Software KW - Systems Biology/*methods KW - *User-Computer Interface AB -

BACKGROUND: With the popularization of high-throughput techniques, the need for procedures that help in the biological interpretation of results has increased enormously. Recently, new procedures inspired by systems biology criteria have started to be developed. RESULTS: Here we present FatiScan, a web-based program which implements a threshold-independent test for the functional interpretation of large-scale experiments that does not depend on the pre-selection of genes based on the multiple application of independent tests to each gene. The test implemented aims to directly test the behaviour of blocks of functionally related genes, instead of focusing on single genes. In addition, the test does not depend on the type of the data used for obtaining significance values, and consequently different types of biologically informative terms (gene ontology, pathways, functional motifs, transcription factor binding sites or regulatory sites from CisRed) can be applied to different classes of genome-scale studies. We exemplify its application in microarray gene expression, evolution and interactomics. CONCLUSION: Methods for gene set enrichment which, in addition, are independent of the original data and experimental design constitute a promising alternative for the functional profiling of genome-scale experiments. A web server that performs the test described and other similar ones can be found at: http://www.babelomics.org.

VL - 8 UR - http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17407596 N1 -

Al-Shahrour, Fatima Arbiza, Leonardo Dopazo, Hernan Huerta-Cepas, Jaime Minguez, Pablo Montaner, David Dopazo, Joaquin Research Support, Non-U.S. Gov’t England BMC bioinformatics BMC Bioinformatics. 2007 Apr 3;8:114.

ER - TY - JOUR T1 - Functional profiling and gene expression analysis of chromosomal copy number alterations JF - Bioinformation Y1 - 2007 A1 - L. Conde A1 - Montaner, D. A1 - Burguet-Castell, J. A1 - Tarraga, J. A1 - Fatima Al-Shahrour A1 - Dopazo, J. KW - babelomics AB -

Contrary to the traditional view in which only one or a few key genes were supposed to be the causative factors of diseases, we discuss the importance of considering groups of functionally related genes in the study of pathologies characterised by chromosomal copy number alterations. Recent observations have reported the existence of regions in higher eukaryotic chromosomes (including humans) containing genes of related function that show a high degree of coregulation. Copy number alterations will consequently affect clusters of functionally related genes, which will, in many cases, be the final causative agents of the diseased phenotype. Therefore, we propose that the functional profiling of the regions affected by copy number alterations must be an important aspect to take into account in the understanding of this type of pathology. To illustrate this, we present an integrated study of DNA copy number variations and gene expression, along with the functional profiling of chromosomal regions, in a case of multiple myeloma.

VL - 1 UR - http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17597935 N1 -

Conde, Lucia Montaner, David Burguet-Castell, Jordi Tarraga, Joaquin Al-Shahrour, Fatima Dopazo, Joaquin Singapore Bioinformation Bioinformation. 2007 Apr 10;1(10):432-5.

ER - TY - JOUR T1 - Functional profiling of microarray experiments using text-mining derived bioentities JF - Bioinformatics Y1 - 2007 A1 - Minguez, P. A1 - Fatima Al-Shahrour A1 - Montaner, D. A1 - Dopazo, J. KW - Artificial Intelligence KW - babelomics KW - *Databases, Protein KW - Gene Expression Profiling/*methods KW - Information Storage and Retrieval/*methods KW - *Natural Language Processing KW - Proteins/*classification/*metabolism KW - Research/*methods KW - Systems Integration AB -

MOTIVATION: The increasing use of microarray technologies brought about a parallel demand for methods for the functional interpretation of the results. Beyond the conventional functional annotations for genes, such as gene ontology, pathways, etc., other sources of information are still to be exploited. Text-mining methods allow extracting informative terms (bioentities) with different functional, chemical, clinical, etc. meanings that can be associated with genes. We show how to use these associations within an appropriate statistical framework and how to apply them through easy-to-use, web-based environments to the functional interpretation of microarray experiments. Functional enrichment and gene set enrichment tests using bioentities are presented.

VL - 23 UR - http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17855415 N1 -

Minguez, Pablo Al-Shahrour, Fatima Montaner, David Dopazo, Joaquin Research Support, Non-U.S. Gov’t England Bioinformatics (Oxford, England) Bioinformatics. 2007 Nov 15;23(22):3098-9. Epub 2007 Sep 13.

ER - TY - CHAP T1 - Microarray Technology in Agricultural Research T2 - Microarray Technology Through Applications Y1 - 2007 A1 - A. Conesa A1 - J. Forment A1 - J. Gadea A1 - van Dijk, J. KW - babelomics JF - Microarray Technology Through Applications PB - F. Falciani. Publisher: Taylor and Francis Group ER - TY - JOUR T1 - Prophet, a web-based tool for class prediction using microarray data JF - Bioinformatics Y1 - 2007 A1 - Medina, Ignacio A1 - Montaner, D. A1 - Tarraga, J. A1 - Dopazo, J. KW - babelomics KW - gepas KW - predictors AB -

Sample classification and class prediction is the aim of many gene expression studies. We present a web-based application, Prophet, which builds prediction rules and allows using them for further sample classification. Prophet automatically chooses the best classifier, along with the optimal selection of genes, using a strategy that renders unbiased cross-validated errors. Prophet is linked to different microarray data analysis modules, and includes a unique feature: the possibility of performing the functional interpretation of the molecular signature found. Availability: Prophet can be found at the URL http://prophet.bioinfo.cipf.es/ or within the GEPAS package at http://www.gepas.org/ Supplementary information: http://gepas.bioinfo.cipf.es/tutorial/prophet.html.

VL - 23 UR - http://bioinformatics.oxfordjournals.org/cgi/content/full/23/3/390?view=long&pmid=17138587 N1 -

Medina, Ignacio Montaner, David Tarraga, Joaquin Dopazo, Joaquin Research Support, Non-U.S. Gov’t England Bioinformatics (Oxford, England) Bioinformatics. 2007 Feb 1;23(3):390-1. Epub 2006 Nov 30.

ER - TY - JOUR T1 - BABELOMICS: a systems biology perspective in the functional annotation of genome-scale experiments JF - Nucleic Acids Res Y1 - 2006 A1 - Fatima Al-Shahrour A1 - Minguez, P. A1 - Tarraga, J. A1 - Montaner, D. A1 - Alloza, E. A1 - Vaquerizas, J. M. A1 - L. Conde A1 - Blaschke, C. A1 - Vera, J. A1 - Dopazo, J. KW - babelomics KW - functional profiling AB -

We present a new version of Babelomics, a complete suite of web tools for functional analysis of genome-scale experiments, with new and improved tools. New functionally relevant terms have been included such as CisRed motifs or bioentities obtained by text-mining procedures. An improved indexing has considerably speeded up several of the modules. An improved version of the FatiScan method for studying the coordinate behaviour of groups of functionally related genes is presented, along with a similar tool, the Gene Set Enrichment Analysis. Babelomics is now more oriented to test systems biology inspired hypotheses. Babelomics can be found at http://www.babelomics.org.

VL - 34 UR - http://nar.oxfordjournals.org/content/34/suppl_2/W472.long N1 -

Al-Shahrour, Fatima Minguez, Pablo Tarraga, Joaquin Montaner, David Alloza, Eva Vaquerizas, Juan M Conde, Lucia Blaschke, Christian Vera, Javier Dopazo, Joaquin Research Support, Non-U.S. Gov’t England Nucleic acids research Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W472-6.

ER - TY - JOUR T1 - Blast2GO goes grid: developing a grid-enabled prototype for functional genomics analysis JF - Stud Health Technol Inform Y1 - 2006 A1 - Aparicio, G. A1 - Gotz, S. A1 - A. Conesa A1 - Segrelles, D. A1 - Blanquer, I. A1 - Garcia, J. M. A1 - Hernandez, V. A1 - Robles, M. A1 - Talon, M. KW - babelomics AB -

The vast amount and complexity of data generated in Genomic Research implies that new dedicated and powerful computational tools need to be developed to meet their analysis requirements. Blast2GO (B2G) is a bioinformatics tool for Gene Ontology-based DNA or protein sequence annotation and function-based data mining. The application has been developed with the aim of offering an easy-to-use tool for functional genomics research. Typical B2G users are middle size genomics labs carrying out sequencing, EST and microarray projects, handling datasets up to several thousand sequences. In the current version of B2G, the power and analytical potential of both annotation and function data-mining is somewhat restricted to the computational power behind each particular installation. In order to be able to offer the possibility of an enhanced computational capacity within this bioinformatics application, a Grid component is being developed. A prototype has been conceived for the particular problem of speeding up the Blast searches to obtain fast results for large datasets. Many efforts have been made in the literature concerning the speeding up of Blast searches, but few of them deal with the use of large heterogeneous production Grid Infrastructures. These are the infrastructures that could reach the largest number of resources and the best load balancing for data access. The Grid Service under development will analyse requests based on the number of sequences, splitting them according to the available resources. Lower-level computation will be performed through MPIBLAST. The software architecture is based on the WSRF standard.

VL - 120 UR - http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16823138 N1 -

Aparicio, G Gotz, S Conesa, A Segrelles, D Blanquer, I Garcia, J M Hernandez, V Robles, M Talon, M Netherlands Studies in health technology and informatics Stud Health Technol Inform. 2006;120:194-204.

ER - TY - JOUR T1 - Functional interpretation of microarray experiments JF - OMICS Y1 - 2006 A1 - Dopazo, J. KW - babelomics KW - Diabetes Mellitus KW - microarray data analysis AB -

Over the past few years, due to the popularisation of high-throughput methodologies such as DNA microarrays, the possibility of obtaining experimental data has increased significantly. Nevertheless, the interpretation of the results, which involves translating these data into useful biological knowledge, still remains a challenge. The methods and strategies used for this interpretation are in continuous evolution and new proposals are constantly arising. Initially, a two-step approach was used in which genes of interest were first selected, based on thresholds that consider only experimental values, and then, in a second, independent step, the enrichment of these genes in biologically relevant terms was analysed. For different reasons, these methods are relatively poor in terms of performance and a new generation of procedures, which draw inspiration from systems biology criteria, are currently under development. Such procedures aim to directly test the behaviour of blocks of functionally related genes, instead of focusing on single genes.

VL - 10 UR - http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=17069516 N1 -

Dopazo, Joaquin Research Support, Non-U.S. Gov’t United States Omics : a journal of integrative biology OMICS. 2006 Fall;10(3):398-410.

ER - TY - JOUR T1 - A function-centric approach to the biological interpretation of microarray time-series JF - Genome Inform Y1 - 2006 A1 - Minguez, P. A1 - Fatima Al-Shahrour A1 - Dopazo, J. KW - babelomics AB -

The interpretation of microarray experiments is commonly addressed by means of a two-step approach in which the relevant genes are first selected solely on the basis of their experimental values (ignoring their coordinate behaviors) and, in a second step, their functional properties are studied to hypothesize about the biological roles they are fulfilling in the cell. Recently, different methods (e.g. GSEA or FatiScan) have been proposed to study the coordinate behavior of blocks of functionally-related genes. These methods study the distribution of functional information across lists of genes ranked according to their different experimental values in a static situation, such as the comparison between two classes (e.g. healthy controls versus diseased cases). Nevertheless, there is no equivalent way of studying a dynamic situation from a functional point of view. We present a method for the functional analysis of microarray series in which the experiments display autocorrelation between successive points (e.g. time series, dose-response experiments, etc.). The method allows recovering the dynamics of the molecular roles fulfilled by the genes along the series, which provides a novel approach to functional interpretation of such experiments. The method finds blocks of functionally-related genes which are significantly and coordinately over-expressed at different points of the series. This method draws inspiration from systems biology given that the analysis does not focus on individual properties of genes but on collectively behaving blocks of functionally-related genes. The FatiScan algorithm used in the method proposed is available at: http://fatiscan.bioinfo.cipf.es, or within the Babelomics suite: http://www.babelomics.org. Additional material is available at: http://bioinfo.cipf.es/data/plasmodium.

VL - 17 N1 -

Minguez, Pablo Al-Shahrour, Fatima Dopazo, Joaquin Research Support, Non-U.S. Gov’t Japan Genome informatics. International Conference on Genome Informatics Genome Inform. 2006;17(2):57-66.

ER - TY - JOUR T1 - Ontology-driven approaches to analyzing data in functional genomics JF - Methods Mol Biol Y1 - 2006 A1 - F. Azuaje A1 - Fatima Al-Shahrour A1 - Dopazo, J. KW - babelomics KW - Cluster Analysis KW - Computational Biology/*methods KW - *Data Interpretation, Statistical KW - Gene Expression Profiling KW - *Genomics KW - Humans AB -

Ontologies are fundamental knowledge representations that provide not only standards for annotating and indexing biological information, but also the basis for implementing functional classification and interpretation models. This chapter discusses the application of gene ontology (GO) for predictive tasks in functional genomics. It focuses on the problem of analyzing functional patterns associated with gene products. This chapter is divided into two main parts. The first part overviews GO and its applications for the development of functional classification models. The second part presents two methods for the characterization of genomic information using GO. It discusses methods for measuring functional similarity of gene products, and a tool for supporting gene expression clustering analysis and validation.

VL - 316 UR - http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16671401 N1 -

Azuaje, Francisco Al-Shahrour, Fatima Dopazo, Joaquin Research Support, N.I.H., Extramural Research Support, Non-U.S. Gov’t Review United States Methods in molecular biology (Clifton, N.J.) Methods Mol Biol. 2006;316:67-86.

ER - TY - JOUR T1 - BABELOMICS: a suite of web tools for functional annotation and analysis of groups of genes in high-throughput experiments JF - Nucleic Acids Res Y1 - 2005 A1 - Fatima Al-Shahrour A1 - Minguez, P. A1 - Vaquerizas, J. M. A1 - L. Conde A1 - Dopazo, J. KW - babelomics KW - functional profiling AB -

We present Babelomics, a complete suite of web tools for the functional analysis of groups of genes in high-throughput experiments, which includes the use of information on Gene Ontology terms, InterPro motifs, KEGG pathways, Swiss-Prot keywords, analysis of predicted transcription factor binding sites, chromosomal positions and presence in tissues with determined histological characteristics, through five integrated modules: FatiGO (fast assignment and transference of information), FatiWise, transcription factor association test, GenomeGO and tissues mining tool, respectively. Additionally, another module, FatiScan, provides a new procedure that integrates biological information in combination with experimental results in order to find groups of genes with modest but coordinate significant differential behaviour. FatiScan is highly sensitive and is capable of finding significant asymmetries in the distribution of genes of common function across a list of ordered genes even if these asymmetries are not extreme. The strong multiple-testing nature of the contrasts made by the tools is taken into account. All the tools are integrated in the gene expression analysis package GEPAS. Babelomics is the natural evolution of our tool FatiGO (which analysed almost 22,000 experiments during the last year) to include more sources of information and new modes of using it. Babelomics can be found at http://www.babelomics.org.

VL - 33 UR - http://nar.oxfordjournals.org/content/33/suppl_2/W460.long N1 -

Al-Shahrour, Fatima Minguez, Pablo Vaquerizas, Juan M Conde, Lucia Dopazo, Joaquin Research Support, Non-U.S. Gov’t England Nucleic acids research Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W460-4.

ER - TY - JOUR T1 - Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research JF - Bioinformatics Y1 - 2005 A1 - A. Conesa A1 - Gotz, S. A1 - Garcia-Gomez, J. M. A1 - Terol, J. A1 - Talon, M. A1 - Robles, M. KW - babelomics AB -

SUMMARY: We present here Blast2GO (B2G), a research tool designed with the main purpose of enabling Gene Ontology (GO) based data mining on sequence data for which no GO annotation is yet available. B2G joins in one application GO annotation based on similarity searches with statistical analysis and highlighted visualization on directed acyclic graphs. This tool offers a suitable platform for functional genomics research in non-model species. B2G is an intuitive and interactive desktop application that allows monitoring and comprehension of the whole annotation and analysis process. AVAILABILITY: Blast2GO is freely available via Java Web Start at http://www.blast2go.de. SUPPLEMENTARY MATERIAL: http://www.blast2go.de -> Evaluation.

VL - 21 UR - http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16081474 N1 -

Conesa, Ana; Gotz, Stefan; Garcia-Gomez, Juan Miguel; Terol, Javier; Talon, Manuel; Robles, Montserrat. Research Support, Non-U.S. Gov’t. England. Bioinformatics (Oxford, England). Bioinformatics. 2005 Sep 15;21(18):3674-6. Epub 2005 Aug 4.

ER - TY - CHAP T1 - Data analysis and visualisation in genomics and proteomics Y1 - 2005 A1 - F. Azuaje A1 - Dopazo, J. KW - babelomics PB - Wiley, F. Azuaje and J. Dopazo ER - TY - JOUR T1 - Discovering molecular functions significantly related to phenotypes by combining gene expression data and biological information JF - Bioinformatics Y1 - 2005 A1 - Fatima Al-Shahrour A1 - Diaz-Uriarte, R. A1 - Dopazo, J. KW - babelomics KW - Breast Neoplasms/genetics/*metabolism KW - Computer Simulation KW - *Database Management Systems KW - *Databases, Protein KW - Documentation/methods KW - Gene Expression Profiling/*methods KW - Humans KW - *Models, Biological KW - Neoplasm Proteins/genetics/*metabolism KW - Phenotype KW - Software KW - Structure-Activity Relationship KW - Systems Integration KW - Tumor Markers, Biological/genetics/*metabolism AB -

MOTIVATION: The analysis of genome-scale data from different high throughput techniques can be used to obtain lists of genes ordered according to their different behaviours under distinct experimental conditions corresponding to different phenotypes (e.g. differential gene expression between diseased samples and controls, different response to a drug, etc.). The order in which the genes appear in the list is a consequence of the biological roles that the genes play within the cell, which account, at molecular scale, for the macroscopic differences observed between the phenotypes studied. Typically, two steps are followed for understanding the biological processes that differentiate phenotypes at molecular level: first, genes with significant differential expression are selected on the basis of their experimental values and subsequently, the functional properties of these genes are analysed. Instead, we present a simple procedure which combines experimental measurements with available biological information in a way that genes are simultaneously tested in groups related by common functional properties. The method proposed constitutes a very sensitive tool for selecting genes with significant differential behaviour in the experimental conditions tested. RESULTS: We propose the use of a method to scan ordered lists of genes. The method allows the understanding of the biological processes operating at molecular level behind the macroscopic experiment from which the list was generated. This procedure can be useful in situations where it is not possible to obtain statistically significant differences based on the experimental measurements (e.g. low prevalence diseases, etc.). Two examples demonstrate its application in two microarray experiments and the type of information that can be extracted.

VL - 21 UR - http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15840702 N1 -

Al-Shahrour, Fatima; Diaz-Uriarte, Ramon; Dopazo, Joaquin. Evaluation Studies. Research Support, Non-U.S. Gov’t. England. Bioinformatics (Oxford, England). Bioinformatics. 2005 Jul 1;21(13):2988-93. Epub 2005 Apr 19.

ER - TY - JOUR T1 - FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes JF - Bioinformatics Y1 - 2004 A1 - Fatima Al-Shahrour A1 - Diaz-Uriarte, R. A1 - Dopazo, J. KW - babelomics KW - *Algorithms KW - Artificial Intelligence KW - Databases, Genetic KW - Gene Expression Profiling/*methods KW - *Hypermedia KW - Information Storage and Retrieval/*methods KW - *Internet KW - *Phylogeny KW - Sequence Alignment/methods KW - Sequence Analysis, DNA/*methods KW - *Software AB -

We present a simple but powerful procedure to extract Gene Ontology (GO) terms that are significantly over- or under-represented in sets of genes within the context of a genome-scale experiment (DNA microarray, proteomics, etc.). Said procedure has been implemented as a web application, FatiGO, allowing for easy and interactive querying. FatiGO, which takes the multiple-testing nature of statistical contrast into account, currently includes GO associations for diverse organisms (human, mouse, fly, worm and yeast) and the TrEMBL/Swissprot GOAnnotations@EBI correspondences from the European Bioinformatics Institute.
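As an illustration only, the kind of test this abstract describes can be sketched in Python. The abstract does not name the statistic or the multiple-testing correction, so the two-sided Fisher's exact test and the Benjamini-Hochberg adjustment used here are assumptions, as are all function names, gene identifiers and the toy annotation table. This is not FatiGO's code.

```python
from math import comb


def hypergeom_pmf(k, total, annotated, drawn):
    """P(X = k): k annotated genes among `drawn` genes taken from `total`."""
    return comb(annotated, k) * comb(total - annotated, drawn - k) / comb(total, drawn)


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    a/b: genes in the list with/without the term;
    c/d: remaining genes with/without the term.
    """
    total = a + b + c + d
    annotated = a + c
    drawn = a + b
    p_obs = hypergeom_pmf(a, total, annotated, drawn)
    lowest = max(0, drawn - (total - annotated))
    highest = min(annotated, drawn)
    p = 0.0
    for k in range(lowest, highest + 1):
        pk = hypergeom_pmf(k, total, annotated, drawn)
        # Sum every table that is at most as probable as the observed one.
        if pk <= p_obs * (1 + 1e-9):
            p += pk
    return min(1.0, p)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def term_enrichment(gene_list, other_genes, annotations):
    """Test each term for over- or under-representation in gene_list."""
    raw = []
    for term, annotated in annotations.items():
        a = len(gene_list & annotated)
        b = len(gene_list) - a
        c = len(other_genes & annotated)
        d = len(other_genes) - c
        raw.append((term, fisher_two_sided(a, b, c, d)))
    adjusted = benjamini_hochberg([p for _, p in raw])
    return [(term, p, q) for (term, p), q in zip(raw, adjusted)]


# Hypothetical toy data: 20 genes, the first 5 form the list of interest.
genes = [f"g{i}" for i in range(1, 21)]
gene_list = set(genes[:5])
other_genes = set(genes[5:])
annotations = {
    "TERM_A": {"g1", "g2", "g3", "g4", "g6"},
    "TERM_B": {"g5", "g7", "g8", "g9"},
}

for term, p, q in term_enrichment(gene_list, other_genes, annotations):
    print(f"{term}: p={p:.4g} adjusted={q:.4g}")
```

The sketch compares one gene list against the remaining genes; comparing two arbitrary lists only changes which sets are passed as `gene_list` and `other_genes`.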

VL - 20 UR - http://bioinformatics.oxfordjournals.org/content/20/4/578.abstract N1 -

Al-Shahrour, Fatima; Diaz-Uriarte, Ramon; Dopazo, Joaquin. England. Bioinformatics (Oxford, England). Bioinformatics. 2004 Mar 1;20(4):578-80. Epub 2004 Jan 22.

ER - TY - CHAP T1 - Using Gene Ontology on genome-scale studies to find significant associations of biologically relevant terms to group of genes T2 - Neural Networks for Signal Processing XIII Y1 - 2003 A1 - Fatima Al-Shahrour A1 - Herrero, J. A1 - A. Mateos A1 - J. Santoyo A1 - Díaz-Uriarte, R A1 - Dopazo, J. KW - babelomics JF - Neural Networks for Signal Processing XIII PB - IEEE Press CY - New York, USA ER -