%0 Journal Article %J BMC bioinformatics %D 2017 %T A new parallel pipeline for DNA methylation analysis of long reads datasets. %A Olanda, Ricardo %A Pérez, Mariano %A Orduña, Juan M %A Tárraga, Joaquín %A Joaquín Dopazo %K Methyl-Seq %K NGS %X BACKGROUND: DNA methylation is an important mechanism of epigenetic regulation in development and disease. New generation sequencers allow genome-wide measurements of the methylation status by reading short stretches of the DNA sequence (Methyl-seq). Several software tools for methylation analysis have been proposed over recent years. However, the current trend is that new sequencers, and those expected in the near future, yield sequences of increasing length, making these software tools inefficient and obsolete. RESULTS: In this paper, we propose new software based on a strategy for methylation analysis of Methyl-seq sequencing data that requires much shorter execution times while yielding a better level of sensitivity, particularly for datasets composed of long reads. This strategy can be exported to other methylation, DNA and RNA analysis tools. CONCLUSIONS: The developed software tool achieves execution times one order of magnitude shorter than the existing tools, while yielding equal sensitivity for short reads and even better sensitivity for long reads. %B BMC bioinformatics %V 18 %P 161 %8 2017 Mar 09 %G eng %U http://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-017-1574-3 %R 10.1186/s12859-017-1574-3 %0 Journal Article %J Molecular biology and evolution %D 2016 %T 267 Spanish exomes reveal population-specific differences in disease-related genetic variation. 
%A Joaquín Dopazo %A Amadoz, Alicia %A Bleda, Marta %A García-Alonso, Luz %A Alemán, Alejandro %A Garcia-Garcia, Francisco %A Rodriguez, Juan A %A Daub, Josephine T %A Muntané, Gerard %A Antonio Rueda %A Vela-Boza, Alicia %A López-Domingo, Francisco J %A Florido, Javier P %A Arce, Pablo %A Ruiz-Ferrer, Macarena %A Méndez-Vidal, Cristina %A Arnold, Todd E %A Spleiss, Olivia %A Alvarez-Tejado, Miguel %A Navarro, Arcadi %A Bhattacharya, Shomi S %A Borrego, Salud %A Santoyo-López, Javier %A Antiñolo, Guillermo %K disease %K NGS %K polymorphisms %K Population genomics %K prioritization %K SNP %X Recent results from large-scale genomic projects suggest that allele frequencies, which are highly relevant for medical purposes, differ considerably across different populations. The need for a detailed catalogue of local variability motivated the whole exome sequencing of 267 unrelated individuals, representative of the healthy Spanish population. As in other studies, a considerable number of rare variants were found (almost one third of the described variants). There were also relevant differences in allelic frequencies in polymorphic variants, including about 10,000 polymorphisms private to the Spanish population. The allelic frequencies of variants conferring susceptibility to complex diseases (including cancer, schizophrenia, Alzheimer disease, type 2 diabetes and other pathologies) were overall similar to those of other populations. However, the trend is the opposite for variants linked to Mendelian and rare diseases (including several retinal degenerative dystrophies and cardiomyopathies) that show marked frequency differences between populations. Interestingly, a correspondence between differences in allelic frequencies and disease prevalence was found, highlighting the relevance of frequency differences in disease risk. 
These differences are also observed in variants that disrupt known drug binding sites, suggesting an important role for local variability in population-specific drug resistances or adverse effects. We have made the Spanish population variant server web page that contains population frequency information for the complete list of 170,888 variant positions we found publicly available (http://spv.babelomics.org/). We show that it is fundamental to determine population-specific variant frequencies in order to distinguish real disease associations from population-specific polymorphisms. %B Molecular biology and evolution %8 2016 Jan 13 %G eng %U https://mbe.oxfordjournals.org/content/early/2016/02/17/molbev.msw005.full %R 10.1093/molbev/msw005 %0 Journal Article %J Nucleic acids research %D 2016 %T Actionable pathways: interactive discovery of therapeutic targets using signaling pathway models. %A Salavert, Francisco %A Hidago, Marta R %A Amadoz, Alicia %A Cubuk, Cankut %A Medina, Ignacio %A Crespo, Daniel %A Carbonell-Caballero, José %A Joaquín Dopazo %K actionable genes %K Disease mechanism %K drug action mechanism %K Drug discovery %K pathway analysis %K personalized medicine %K signalling %K therapeutic targets %X The discovery of actionable targets is crucial for targeted therapies and is also a constituent part of the drug discovery process. The success of an intervention over a target depends critically on its contribution, within the complex network of gene interactions, to the cellular processes responsible for disease progression or therapeutic response. Here we present PathAct, a web server that predicts the effect that interventions over genes (inhibitions or activations that simulate knock-outs, drug treatments or over-expressions) can have over signal transmission within signaling pathways and, ultimately, over the cell functionalities triggered by them. 
PathAct implements an advanced graphical interface that provides a unique interactive working environment in which the suitability of potentially actionable genes, that could eventually become drug targets for personalized or individualized therapies, can be easily tested. The PathAct tool can be found at: http://pathact.babelomics.org. %B Nucleic acids research %8 2016 May 2 %G eng %U http://nar.oxfordjournals.org/content/early/2016/05/02/nar.gkw369.full %R 10.1093/nar/gkw369 %0 Journal Article %J The Journal of molecular diagnostics : JMD %D 2016 %T Assessment of Targeted Next-Generation Sequencing as a Tool for the Diagnosis of Charcot-Marie-Tooth Disease and Hereditary Motor Neuropathy. %A Lupo, Vincenzo %A Garcia-Garcia, Francisco %A Sancho, Paula %A Tello, Cristina %A García-Romero, Mar %A Villarreal, Liliana %A Alberti, Antonia %A Sivera, Rafael %A Joaquín Dopazo %A Pascual-Pascual, Samuel I %A Márquez-Infante, Celedonio %A Casasnovas, Carlos %A Sevilla, Teresa %A Espinós, Carmen %K Charcot-Marie-Tooth %K CMT %K Diagnostic %K NGS %K Panels %K rare diseases %K Targeted resequencing %X Charcot-Marie-Tooth disease is characterized by broad genetic heterogeneity with >50 known disease-associated genes. Mutations in some of these genes can cause a pure motor form of hereditary motor neuropathy, the genetics of which are poorly characterized. We designed a panel comprising 56 genes associated with Charcot-Marie-Tooth disease/hereditary motor neuropathy. We validated this diagnostic tool by first testing 11 patients with pathological mutations. A cohort of 33 affected subjects was selected for this study. The DNAJB2 c.352+1G>A mutation was detected in two cases; novel changes and/or variants with low frequency (<1%) were found in 12 cases. There were no candidate variants in 18 cases, and amplification failed for one sample. The DNAJB2 c.352+1G>A mutation was also detected in three additional families. 
On haplotype analysis, all of the patients from these five families shared the same haplotype; therefore, the DNAJB2 c.352+1G>A mutation may be a founder event. Our gene panel allowed us to perform a very rapid and cost-effective screening of genes involved in Charcot-Marie-Tooth disease/hereditary motor neuropathy. Our diagnostic strategy was robust in terms of both coverage and read depth for all of the genes and patient samples. These findings demonstrate the difficulty in achieving a definitive molecular diagnosis because of the complexity of interpreting new variants and the genetic heterogeneity that is associated with these neuropathies. %B The Journal of molecular diagnostics : JMD %8 2016 Jan 2 %G eng %U http://www.sciencedirect.com/science/article/pii/S1525157815002615 %R 10.1016/j.jmoldx.2015.10.005 %0 Journal Article %J Scientific reports %D 2016 %T The pan-cancer pathological regulatory landscape. %A Falco, Matias M %A Bleda, Marta %A Carbonell-Caballero, José %A Joaquín Dopazo %X Dysregulation of the normal gene expression program is the cause of a broad range of diseases, including cancer. Detecting the specific perturbed regulators that have an effect on the generation and the development of the disease is crucial for understanding the disease mechanism and for making decisions on efficient preventive and curative therapies. Moreover, detecting such perturbations at the patient level is even more important from the perspective of personalized medicine. We applied the Transcription Factor Target Enrichment Analysis, a method that detects the activity of transcription factors based on the quantification of the collective transcriptional activation of their targets, to a large collection of 5607 cancer samples covering eleven cancer types. We produced for the first time a comprehensive catalogue of altered transcription factor activities in cancer, a considerable number of them significantly associated with patient survival. 
Moreover, we described several interesting TFs whose activity does not change substantially in the cancer with respect to the normal tissue but ultimately play an important role in patient prognostic determination, which suggests they might be promising therapeutic targets. An additional advantage of this method is that it allows obtaining personalized TF activity estimations for individual patients. %B Scientific reports %V 6 %P 39709 %8 2016 Dec 21 %G eng %U http://www.nature.com/articles/srep39709 %R 10.1038/srep39709 %0 Journal Article %J Nucleic acids research %D 2015 %T Assessing the impact of mutations found in next generation sequencing data over human signaling pathways. %A Hernansaiz-Ballesteros, Rosa D %A Salavert, Francisco %A Sebastián-Leon, Patricia %A Alemán, Alejandro %A Medina, Ignacio %A Joaquín Dopazo %K NGS %K pathways %K signalling %K Systems biology %X Modern sequencing technologies produce increasingly detailed data on genomic variation. However, conventional methods for relating either individual variants or mutated genes to phenotypes present known limitations given the complex, multigenic nature of many diseases or traits. Here we present PATHiVar, a web-based tool that integrates genomic variation data with gene expression tissue information. PATHiVar constitutes a new generation of genomic data analysis methods that allow studying variants found in next generation sequencing experiments in the context of signaling pathways. Simple Boolean models of pathways provide detailed descriptions of the impact of mutations in cell functionality so that recurrences in functionality failures can easily be related to diseases, even if they are produced by mutations in different genes. Patterns of changes in signal transmission circuits, often unpredictable from individual genes mutated, correspond to patterns of affected functionalities that can be related to complex traits such as disease progression, drug response, etc. 
PATHiVar is available at: http://pathivar.babelomics.org. %B Nucleic acids research %V 43 %P W270-W275 %8 2015 Apr 16 %G eng %U http://nar.oxfordjournals.org/content/43/W1/W270 %R 10.1093/nar/gkv349 %0 Journal Article %J BMC cancer %D 2015 %T BRCA1 Alternative splicing landscape in breast tissue samples. %A Romero, Atocha %A Garcia-Garcia, Francisco %A López-Perolio, Irene %A Ruiz de Garibay, Gorka %A García-Sáenz, José A %A Garre, Pilar %A Ayllón, Patricia %A Benito, Esperanza %A Joaquín Dopazo %A Díaz-Rubio, Eduardo %A Caldés, Trinidad %A de la Hoya, Miguel %X BACKGROUND: BRCA1 is a key protein in cell network, involved in DNA repair pathways and cell cycle. Recently, the ENIGMA consortium has reported a high number of alternative splicing (AS) events at this locus in blood-derived samples. However, BRCA1 splicing pattern in breast tissue samples is unknown. Here, we provide an accurate description of BRCA1 splicing events distribution in breast tissue samples. METHODS: BRCA1 splicing events were scanned in 70 breast tumor samples, 4 breast samples from healthy individuals and in 72 blood-derived samples by capillary electrophoresis (capillary EP). Molecular subtype was identified in all tumor samples. Splicing events were considered predominant if their relative expression level was at least 10% of the full-length reference signal. RESULTS: 54 BRCA1 AS events were identified, 27 of them were annotated as predominant in at least one sample. Δ5q, Δ13, Δ9, Δ5 and ▼1aA were significantly more frequently annotated as predominant in breast tumor samples than in blood-derived samples. Predominant splicing events were, on average, more frequent in tumor samples than in normal breast tissue samples (P = 0.010). 
Similarly, likely inactivating splicing events (PTC-NMDs, Non-Coding, Δ5 and Δ18) were more frequently annotated as predominant in tumor than in normal breast samples (P = 0.020), whereas there were no significant differences for other splicing events (No-Fs) frequency distribution between tumor and normal breast samples (P = 0.689). CONCLUSIONS: Our results complement recent findings by the ENIGMA consortium, demonstrating that BRCA1 AS, despite its tremendous complexity, is similar in breast and blood samples, with no evidence for tissue-specific AS events. Furthermore, we conclude that somatic inactivation of BRCA1 through spliceogenic mutations is, at best, a rare mechanism in breast carcinogenesis, although our data detect an excess of likely inactivating AS events in breast tumor samples. %B BMC cancer %V 15 %P 219 %8 2015 %G eng %U http://www.biomedcentral.com/1471-2407/15/219 %R 10.1186/s12885-015-1145-9 %0 Journal Article %J Nature methods %D 2015 %T Combining tumor genome simulation with crowdsourcing to benchmark somatic single-nucleotide-variant detection. 
%A Ewing, Adam D %A Houlahan, Kathleen E %A Hu, Yin %A Ellrott, Kyle %A Caloian, Cristian %A Yamaguchi, Takafumi N %A Bare, J Christopher %A P’ng, Christine %A Waggott, Daryl %A Sabelnykova, Veronica Y %A Kellen, Michael R %A Norman, Thea C %A Haussler, David %A Friend, Stephen H %A Stolovitzky, Gustavo %A Margolin, Adam A %A Stuart, Joshua M %A Boutros, Paul C %E ICGC-TCGA DREAM Somatic Mutation Calling Challenge participants %E Liu Xi %E Ninad Dewal %E Yu Fan %E Wenyi Wang %E David Wheeler %E Andreas Wilm %E Grace Hui Ting %E Chenhao Li %E Denis Bertrand %E Niranjan Nagarajan %E Qing-Rong Chen %E Chih-Hao Hsu %E Ying Hu %E Chunhua Yan %E Warren Kibbe %E Daoud Meerzaman %E Kristian Cibulskis %E Mara Rosenberg %E Louis Bergelson %E Adam Kiezun %E Amie Radenbaugh %E Anne-Sophie Sertier %E Anthony Ferrari %E Laurie Tonton %E Kunal Bhutani %E Nancy F Hansen %E Difei Wang %E Lei Song %E Zhongwu Lai %E Liao, Yang %E Shi, Wei %E Carbonell-Caballero, José %E Joaquín Dopazo %E Cheryl C K Lau %E Justin Guinney %K cancer %K NGS %K variant calling %X The detection of somatic mutations from cancer genome sequences is key to understanding the genetic basis of disease progression, patient survival and response to therapy. Benchmarking is needed for tool assessment and improvement but is complicated by a lack of gold standards, by extensive resource requirements and by difficulties in sharing personal genomic information. To resolve these issues, we launched the ICGC-TCGA DREAM Somatic Mutation Calling Challenge, a crowdsourced benchmark of somatic mutation detection algorithms. Here we report the BAMSurgeon tool for simulating cancer genomes and the results of 248 analyses of three in silico tumors created with it. Different algorithms exhibit characteristic error profiles, and, intriguingly, false positives show a trinucleotide profile very similar to one found in human tumors. 
Although the three simulated tumors differ in sequence contamination (deviation from normal cell sequence) and in subclonality, an ensemble of pipelines outperforms the best individual pipeline in all cases. BAMSurgeon is available at https://github.com/adamewing/bamsurgeon/. %B Nature methods %8 2015 May 18 %G eng %U http://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3407.html %R 10.1038/nmeth.3407 %0 Journal Article %J Hearing research %D 2015 %T Comparative gene expression study of the vestibular organ of the Igf1 deficient mouse using whole-transcript arrays. %A Rodríguez-de la Rosa, Lourdes %A Sánchez-Calderón, Hortensia %A Contreras, Julio %A Murillo-Cuesta, Silvia %A Falagan, Sandra %A Avendaño, Carlos %A Joaquín Dopazo %A Varela-Nieto, Isabel %A Milo, Marta %X The auditory and vestibular organs form the inner ear and have a common developmental origin. Insulin like growth factor 1 (IGF-1) has a central role in the development of the cochlea and maintenance of hearing. Its deficiency causes sensorineural hearing loss in man and mice. During chicken early development, IGF-1 modulates neurogenesis of the cochleovestibular ganglion but no further studies have been conducted to explore the potential role of IGF-1 in the vestibular system. In this study we have compared the whole transcriptome of the vestibular organ from wild type and Igf1(-/-) mice at different developmental and postnatal times. RNA was prepared from E18.5, P15 and P90 vestibular organs of Igf1(-/-) and Igf1(+/+) mice and the transcriptome analysed in triplicate using Affymetrix® Mouse Gene 1.1 ST Array Plates. These plates are whole-transcript arrays that include probes to measure both messenger (mRNA) and long intergenic non-coding RNA transcripts (lincRNA), with a coverage of over 28 thousand coding transcripts and over 7 thousand non-coding transcripts. Given the complexity of the data, we used two different methods, VSN-RMA and mmBGX, to analyse and compare the data. 
This is to better evaluate the number of false positives and to quantify uncertainty of low signals. We identified a number of differentially expressed genes that we described using functional analysis and validated using RT-qPCR. The morphology of the vestibular organ did not show differences between genotypes and no evident alterations were observed in the vestibular sensory areas of the null mice. However, well-defined cellular alterations were found in the vestibular neurons with respect to their number and size. Although these mice did not show a dramatic vestibular phenotype, we conducted a functional analysis on differentially expressed genes between genotypes and across time. The aim was to identify new pathways that are involved in the development of the vestibular organ as well as pathways that may be affected by the lack of IGF-1 and be associated with the morphological changes of the vestibular neurons that we observed in the Igf1(-/-) mice. %B Hearing research %8 2015 Sep 1 %G eng %U http://www.sciencedirect.com/science/article/pii/S0378595515001835 %R 10.1016/j.heares.2015.08.016 %0 Journal Article %J Scientific reports %D 2015 %T Exome sequencing reveals a high genetic heterogeneity on familial Hirschsprung disease. %A Luzón-Toro, Berta %A Gui, Hongsheng %A Ruiz-Ferrer, Macarena %A Sze-Man Tang, Clara %A Fernández, Raquel M %A Sham, Pak-Chung %A Torroglosa, Ana %A Kwong-Hang Tam, Paul %A Espino-Paisán, Laura %A Cherny, Stacey S %A Bleda, Marta %A Enguix-Riego, María Del Valle %A Joaquín Dopazo %A Antiñolo, Guillermo %A Garcia-Barceló, Maria-Mercè %A Borrego, Salud %K babelomics %K Hirschsprung %K NGS %K prioritization %X Hirschsprung disease (HSCR; OMIM 142623) is a developmental disorder characterized by aganglionosis along variable lengths of the distal gastrointestinal tract, which results in intestinal obstruction. Interactions among known HSCR genes and/or unknown disease susceptibility loci lead to variable severity of phenotype. 
Neither linkage nor genome-wide association studies have contributed efficiently to completely dissecting the genetic pathways underlying this complex genetic disorder. We have performed whole exome sequencing of 16 HSCR patients from 8 unrelated families with the SOLID platform. Variants shared by affected relatives were validated by Sanger sequencing. We searched for genes recurrently mutated across families. Only variations in the FAT3 gene were significantly enriched in five families. Within-family analysis identified compound heterozygotes for AHNAK and several genes (N = 23) with heterozygous variants that co-segregated with the phenotype. Network and pathway analyses facilitated the discovery of polygenic inheritance involving FAT3, HSCR known genes and their gene partners. Altogether, our approach has facilitated the detection of more than one damaging variant in biologically plausible genes that could jointly contribute to the phenotype. Our data may contribute to the understanding of the complex interactions that occur during enteric nervous system development and the etiopathology of familial HSCR. %B Scientific reports %V 5 %P 16473 %8 2015 %G eng %U http://www.nature.com/articles/srep16473 %R 10.1038/srep16473 %0 Journal Article %J BMC genomics %D 2015 %T Involvement of a citrus meiotic recombination TTC-repeat motif in the formation of gross deletions generated by ionizing radiation and MULE activation. %A Terol, Javier %A Ibañez, Victoria %A Carbonell, José %A Alonso, Roberto %A Estornell, Leandro H %A Licciardello, Concetta %A Gut, Ivo G %A Joaquín Dopazo %A Talon, Manuel %X BACKGROUND: Transposable-element mediated chromosomal rearrangements require the involvement of two transposons and two double-strand breaks (DSB) located in close proximity. In radiobiology, DSB proximity is also a major factor contributing to rearrangements. However, the whole issue of DSB proximity remains virtually unexplored. 
RESULTS: Based on DNA sequencing analysis we show that the genomes of 2 derived mutations, Arrufatina (sport) and Nero (irradiation), share a similar 2 Mb deletion of chromosome 3. A 7 kb Mutator-like element found in Clemenules was present in Arrufatina in inverted orientation flanking the 5’ end of the deletion. The Arrufatina Mule displayed "dissimilar" 9-bp target site duplications separated by 2 Mb. Fine-scale single nucleotide variant analyses of the deleted fragments identified a TTC-repeat sequence motif located in the center of the deletion responsible for a meiotic crossover detected in the citrus reference genome. CONCLUSIONS: Taken together, this information is compatible with the proposal that in both mutants, the TTC-repeat motif formed a triplex DNA structure generating a loop that brought in close proximity the originally distinct reactive ends. In Arrufatina, the loop brought the Mule ends near the 2 distinct insertion target sites and the inverted insertion of the transposable element between these target sites provoked the release of the in-between fragment. This proposal requires the involvement of a unique transposon and sheds light on the unresolved question of how two distinct sites become located in close proximity. These observations confer a crucial role to the TTC-repeats in fundamental plant processes such as meiotic recombination and chromosomal rearrangements. %B BMC genomics %V 16 %P 69 %8 2015 Feb 13 %G eng %U http://www.biomedcentral.com/1471-2164/16/69 %R 10.1186/s12864-015-1280-3 %0 Journal Article %J Bioinformatics (Oxford, England) %D 2015 %T A Parallel and Sensitive Software Tool for Methylation Analysis on Multicore Platforms. 
%A Tárraga, Joaquín %A Pérez, Mariano %A Orduña, Juan M %A Duato, José %A Medina, Ignacio %A Joaquín Dopazo %K BS-seq %K HPC %K methylation %K NGS %X MOTIVATION: DNA methylation analysis suffers from very long processing time, since the advent of Next-Generation Sequencers (NGS) has shifted the bottleneck of genomic studies from the sequencers that obtain the DNA samples to the software that performs the analysis of these samples. The existing software for methylation analysis does not seem to scale efficiently either with the size of the dataset or with the length of the reads to be analyzed. Since it is expected that the sequencers will provide longer and longer reads in the near future, efficient and scalable methylation software should be developed. RESULTS: We present a new software tool, called HPG-Methyl, which efficiently maps bisulfite sequencing reads on DNA, analyzing DNA methylation. The strategy used by this software consists of leveraging the speed of the Burrows-Wheeler Transform to map a large number of DNA fragments (reads) rapidly, as well as the accuracy of the Smith-Waterman algorithm, which is exclusively employed to deal with the most ambiguous and shortest reads. Experimental results on platforms with Intel multicore processors show that HPG-Methyl significantly outperforms in both execution time and sensitivity state-of-the-art software such as Bismark, BS-Seeker or BSMAP, particularly for long bisulfite reads. AVAILABILITY: Software in the form of C libraries and functions, together with instructions to compile and execute this software. Available by sftp to anonymous@clariano.uv.es (password "anonymous"). CONTACT: Juan.Orduna@uv.es. 
%B Bioinformatics (Oxford, England) %V 31 %P 3130-3138 %8 2015 Jun 10 %G eng %U http://bioinformatics.oxfordjournals.org/content/31/19/3130.long %R 10.1093/bioinformatics/btv357 %0 Journal Article %J Molecular immunology %D 2015 %T Therapeutic targets for olive pollen allergy defined by gene markers modulated by Ole e 1-derived peptides. %A Calzada, David %A Aguerri, Miriam %A Baos, Selene %A Montaner, David %A Mata, Manuel %A Joaquín Dopazo %A Quiralte, Joaquín %A Florido, Fernando %A Lahoz, Carlos %A Cárdaba, Blanca %X Two regions of Ole e 1, the major olive-pollen allergen, have been characterized as T-cell epitopes, one as immunodominant region (aa91-130) and the other, as mainly recognized by non-allergic subjects (aa10-31). This report tries to characterize the specific relevance of these epitopes in the allergic response to olive pollen by analyzing the secreted cytokines and the gene expression profiles induced after specific stimulation of peripheral blood mononuclear cells (PBMCs). PBMCs from olive pollen-allergic and non-allergic control subjects were stimulated with olive-pollen extract and Ole e 1 dodecapeptides containing relevant T-cell epitopes. Levels of cytokines were measured in cellular supernatants and gene expression was determined by microarrays, on the RNAs extracted from PBMCs. One hundred eighty-nine differential genes (fold change >2 or <-2, P<0.05) were validated by qRT-PCR in a large population. It was not possible to define a pattern of response according to the overall cytokine results but interesting differences were observed, mainly in the regulatory cytokines. Principal component (PCA) gene-expression analysis defined clusters that correlated with the experimental conditions in the group of allergic subjects. Gene expression and functional analyses revealed differential genes and pathways among the experimental conditions. 
A set of 51 genes (many essential to T-cell tolerance and homeostasis) correlated with the response to aa10-31 of Ole e 1. In conclusion, two peptides derived from Ole e 1 could regulate the immune response in allergic patients, by gene-expression modification of several regulation-related genes. These results open new research ways to the regulation of allergy by Oleaceae family members. %B Molecular immunology %V 64 %P 252-61 %8 2015 Apr %G eng %U http://www.sciencedirect.com/science/article/pii/S0161589014003356 %R 10.1016/j.molimm.2014.12.002 %0 Journal Article %J Bioinformatics (Oxford, England) %D 2014 %T Acceleration of short and long DNA read mapping without loss of accuracy using suffix array. %A Tárraga, Joaquín %A Arnau, Vicente %A Martinez, Hector %A Moreno, Raul %A Cazorla, Diego %A Salavert-Torres, José %A Blanquer-Espert, Ignacio %A Joaquín Dopazo %A Medina, Ignacio %K NGS %K short read mapping %K HPC %K suffix arrays %X HPG Aligner applies suffix arrays for DNA read mapping. This implementation produces a highly sensitive and extremely fast mapping of DNA reads that scales up almost linearly with read length. The approach presented here is faster (over 20x for long reads) and more sensitive (over 98% in a wide range of read lengths) than the current, state-of-the-art mappers. HPG Aligner is not only an optimal alternative for current sequencers but also the only solution available to cope with longer reads and growing throughputs produced by forthcoming sequencing technologies. %B Bioinformatics (Oxford, England) %V 30 %P 3396-3398 %8 2014 Aug 20 %G eng %U http://bioinformatics.oxfordjournals.org/content/early/2014/08/19/bioinformatics.btu553.long %R 10.1093/bioinformatics/btu553 %0 Journal Article %J PloS one %D 2014 %T Combined genetic and high-throughput strategies for molecular diagnosis of inherited retinal dystrophies. 
%A de Castro-Miró, Marta %A Pomares, Esther %A Lorés-Motta, Laura %A Tonda, Raul %A Joaquín Dopazo %A Marfany, Gemma %A Gonzàlez-Duarte, Roser %X Most diagnostic laboratories are confronted with the increasing demand for molecular diagnosis from patients and families and the ever-increasing genetic heterogeneity of visual disorders. Concerning Retinal Dystrophies (RD), almost 200 causative genes have been reported to date, and most families carry private mutations. We aimed to approach RD genetic diagnosis using all the available genetic information to prioritize candidates for mutational screening, and then restrict the number of cases to be analyzed by massive sequencing. We constructed and optimized a comprehensive cosegregation RD-chip based on SNP genotyping and haplotype analysis. The RD-chip allows genotyping of 768 selected SNPs (closely linked to 100 RD causative genes) in a single cost- and time-effective step. Full diagnosis was attained in 17/36 Spanish pedigrees, yielding 12 new and 12 previously reported mutations in 9 RD genes. The most frequently mutated genes were USH2A and CRB1. Notably, RD3, up to now only associated with Leber Congenital Amaurosis, was identified as causative of Retinitis Pigmentosa. The main assets of the RD-chip are: i) the robustness of the genetic information that underscores the most probable candidates, ii) the invaluable clues in cases of shared haplotypes, which are indicative of a common founder effect, and iii) the detection of extended haplotypes over closely mapping genes, which substantiates cosegregation, although the assumptions on which the genetic analysis is based could exceptionally lead astray. The combination of the genetic approach with whole exome sequencing (WES) greatly increases the diagnosis efficiency, and revealed novel mutations in USH2A and GUCY2D. 
Overall, the RD-chip diagnosis efficiency ranges from 16% in dominant, to 80% in consanguineous recessive pedigrees, with an average of 47%, well within the upper range of massive sequencing approaches, highlighting the validity of this time- and cost-effective approach whilst high-throughput methodologies become amenable for routine diagnosis in medium sized labs. %B PloS one %V 9 %P e88410 %8 2014 %G eng %U http://dx.plos.org/10.1371/journal.pone.0088410 %R 10.1371/journal.pone.0088410 %0 Journal Article %J Molecular Genetics & Genomic Medicine %D 2014 %T Deciphering intrafamilial phenotypic variability by exome sequencing in a Bardet–Biedl family %A González-del Pozo, María %A Méndez-Vidal, Cristina %A Santoyo-López, Javier %A Vela-Boza, Alicia %A Nereida Bravo-Gil %A Antonio Rueda %A García-Alonso, Luz %A Vázquez-Marouschek, Carmen %A Joaquín Dopazo %A Borrego, Salud %A Antiñolo, Guillermo %X Bardet–Biedl syndrome (BBS) is a model ciliopathy characterized by a wide range of clinical variability. The heterogeneity of this condition is reflected in the number of underlying gene defects and the epistatic interactions between the proteins encoded. BBS is generally inherited as an autosomal recessive trait. However, in some families, mutations across different loci interact to modulate the expressivity of the phenotype. In order to investigate the magnitude of epistasis in one BBS family with remarkable intrafamilial phenotypic variability, we designed an exome sequencing–based approach using the SOLID 5500xl platform. This strategy allowed the reliable detection of the primary causal mutations in our family consisting of two novel compound heterozygous mutations in McKusick–Kaufman syndrome (MKKS) gene (p.D90G and p.V396F). Additionally, exome sequencing enabled the detection of one novel heterozygous NPHP4 variant which is predicted to activate a cryptic acceptor splice site and is only present in the most severely affected patient. 
Here, we provide an exome sequencing analysis of a BBS family and show the potential utility of this tool, in combination with network analysis, to detect disease-causing mutations and second-site modifiers. Our data demonstrate how next-generation sequencing (NGS) can facilitate the dissection of epistatic phenomena, and shed light on the genetic basis of phenotypic variability. %B Molecular Genetics & Genomic Medicine %V 2 %P 124-133 %G eng %U http://onlinelibrary.wiley.com/doi/10.1002/mgg3.50/full %R 10.1002/mgg3.50 %0 Journal Article %J Human mutation %D 2014 %T A New Overgrowth Syndrome is Due to Mutations in RNF125. %A Tenorio, Jair %A Mansilla, Alicia %A Valencia, María %A Martínez-Glez, Víctor %A Romanelli, Valeria %A Arias, Pedro %A Castrejón, Nerea %A Poletta, Fernando %A Guillén-Navarro, Encarna %A Gordo, Gema %A Mansilla, Elena %A García-Santiago, Fé %A González-Casado, Isabel %A Vallespín, Elena %A Palomares, María %A Mori, María A %A Santos-Simarro, Fernando %A García-Miñaur, Sixto %A Fernández, Luis %A Mena, Rocío %A Benito-Sanz, Sara %A Del Pozo, Angela %A Silla, Juan Carlos %A Ibañez, Kristina %A López-Granados, Eduardo %A Martín-Trujillo, Alex %A Montaner, David %A Heath, Karen E %A Campos-Barros, Angel %A Joaquín Dopazo %A Nevado, Julián %A Monk, David %A Ruiz-Pérez, Víctor L %A Lapunzina, Pablo %K NGS %K prioritization %K Rare Disease %X Overgrowth syndromes (OGS) are a group of disorders in which all parameters of growth and physical development are above the mean for age and sex. We evaluated a series of 270 families from the Spanish Overgrowth Syndrome Registry with no known overgrowth syndrome. We identified one de novo deletion and three missense mutations in RNF125 in six patients from 4 families with overgrowth, macrocephaly, intellectual disability, mild hydrocephaly, hypoglycaemia and inflammatory diseases resembling Sjögren syndrome. RNF125 encodes an E3 ubiquitin ligase and is a novel gene of OGS. 
Our studies of the RNF125 pathway point to upregulation of RIG-I-IPS1-MDA5 and/or disruption of the PI3K-AKT and interferon signaling pathways as the putative final effectors. This article is protected by copyright. All rights reserved. %B Human mutation %V 35 %P 1436–1441 %8 2014 Sep 5 %G eng %U http://onlinelibrary.wiley.com/doi/10.1002/humu.22689/abstract %R 10.1002/humu.22689 %0 Journal Article %J Human mutation %D 2014 %T Two Novel Mutations in the BCKDK Gene (Branched-Chain Keto-Acid Dehydrogenase Kinase) are Responsible of a Neurobehavioral Deficit in two Pediatric Unrelated Patients. %A García-Cazorla, Angels %A Oyarzabal, Alfonso %A Fort, Joana %A Robles, Concepción %A Castejón, Esperanza %A Ruiz-Sala, Pedro %A Bodoy, Susanna %A Merinero, Begoña %A Lopez-Sala, Anna %A Joaquín Dopazo %A Nunes, Virginia %A Ugarte, Magdalena %A Artuch, Rafael %A Palacín, Manuel %A Rodríguez-Pombo, Pilar %X Inactivating mutations in the BCKDK gene, which codes for the kinase responsible for the negative regulation of the branched-chain keto-acid dehydrogenase complex (BCKD), have recently been associated with a form of autism in three families. In this work, two novel exonic BCKDK mutations, c.520C>G/p.R174G and c.1166T>C/p.L389P, were identified at the homozygous state in two unrelated children with persistently reduced body fluid levels of branched-chain amino acids (BCAAs), developmental delay, microcephaly and neurobehavioral abnormalities. Functional analysis of the mutations confirmed the missense character of the c.1166T>C change and showed a splicing defect r.[520c>g;521_543del]/p.R174Gfs1*, for c.520C>G due to the presence of a new donor splice site. Mutation p.L389P showed total loss of kinase activity. 
Moreover, patient-derived fibroblasts showed undetectable (p.R174Gfs1*) or barely detectable (p.L389P) levels of BCKDK protein and its phosphorylated substrate (phospho-E1α), resulting in increased BCKD activity and the very rapid BCAA catabolism manifested by the patients’ clinical phenotype. Based on these results, a protein-rich diet plus oral BCAA supplementation was implemented in the patient homozygous for p.R174Gfs1*. This treatment normalized plasma BCAA levels and improved growth, developmental and behavioral variables. Our results demonstrate that BCKDK mutations can result in neurobehavioral deficits in humans and support the rationale for dietary intervention. This article is protected by copyright. All rights reserved. %B Human mutation %V 35 %P 470-7 %8 2014 Jan 21 %G eng %U http://onlinelibrary.wiley.com/doi/10.1002/humu.22513/abstract %R 10.1002/humu.22513 %0 Journal Article %J BMC systems biology %D 2014 %T Understanding disease mechanisms with models of signaling pathway activities. %A Sebastián-Leon, Patricia %A Vidal, Enrique %A Minguez, Pablo %A Ana Conesa %A Sonia Tarazona %A Amadoz, Alicia %A Armero, Carmen %A Salavert, Francisco %A Vidal-Puig, Antonio %A Montaner, David %A Joaquín Dopazo %K Disease mechanism %K pathway %K signalling %K Systems biology %X BACKGROUND: Understanding the aspects of the cell functionality that account for disease or drug action mechanisms is one of the main challenges in the analysis of genomic data and is at the basis of the future implementation of precision medicine. RESULTS: Here we propose a simple probabilistic model in which signaling pathways are separated into elementary sub-pathways or signal transmission circuits (which ultimately trigger cell functions) and gene expression measurements are then transformed into probabilities of activation of such signal transmission circuits. Using this model, differential activation of such circuits between biological conditions can be estimated. 
Thus, circuit activation statuses can be interpreted as biomarkers that discriminate among the compared conditions. This type of mechanism-based biomarker accounts for cell functional activities and can easily be associated with disease or drug action mechanisms. The accuracy of the proposed model is demonstrated with simulations and real datasets. CONCLUSIONS: The proposed model provides detailed information that enables the interpretation of disease mechanisms as a consequence of the complex combinations of altered gene expression values. Moreover, it offers a framework for suggesting possible ways of therapeutic intervention in a pathologically perturbed system. %B BMC systems biology %V 8 %P 121 %8 2014 Oct 25 %G eng %U http://www.biomedcentral.com/1752-0509/8/121/abstract %R 10.1186/s12918-014-0121-3 %0 Journal Article %J Nucleic acids research %D 2014 %T A web tool for the design and management of panels of genes for targeted enrichment and massive sequencing for clinical applications. %A Alemán, Alejandro %A Garcia-Garcia, Francisco %A Medina, Ignacio %A Joaquín Dopazo %K Diagnostic %K Targeted enrichment sequencing %K WES %X Disease targeted sequencing is gaining importance as a powerful and cost-effective application of high throughput sequencing technologies to diagnosis. However, the lack of proper tools to process the data hinders its extensive adoption. Here we present TEAM, an intuitive and easy-to-use web tool that fills the gap between the predicted mutations and the final diagnostic in targeted enrichment sequencing analysis. The tool searches for known diagnostic mutations, corresponding to a disease panel, among the predicted patient’s variants. Diagnostic variants for the disease are taken from four databases of disease-related variants (HGMD-public, HUMSAVAR, ClinVar and COSMIC). If no primary diagnostic variant is found, then a list of secondary findings that can help to establish a diagnostic is produced. 
TEAM also provides an interface for the definition and customization of panels, by means of which genes and mutations can be added or discarded to adjust panel definitions. TEAM is freely available at: http://team.babelomics.org. %B Nucleic acids research %V 42 %P W83-W87 %8 2014 May 26 %G eng %U http://nar.oxfordjournals.org/cgi/pmidlookup?view=long&pmid=24861626 %R 10.1093/nar/gku472 %0 Journal Article %J Nucleic acids research %D 2014 %T A web-based interactive framework to assist in the prioritization of disease candidate genes in whole-exome sequencing studies. %A Alemán, Alejandro %A Garcia-Garcia, Francisco %A Salavert, Francisco %A Medina, Ignacio %A Joaquín Dopazo %K NGS. prioritization %X Whole-exome sequencing has become a fundamental tool for the discovery of disease-related genes of familial diseases and the identification of somatic driver variants in cancer. However, finding the causal mutation among the enormous background of individual variability in a small number of samples is still a big challenge. Here we describe a web-based tool, BiERapp, which efficiently helps in the identification of causative variants in family and sporadic genetic diseases. The program reads lists of predicted variants (nucleotide substitutions and indels) in affected individuals or tumor samples and controls. In family studies, different modes of inheritance can easily be defined to filter out variants that do not segregate with the disease along the family. Moreover, BiERapp integrates additional information such as allelic frequencies in the general population and the most popular damaging scores to further narrow down the number of putative variants in successive filtering steps. BiERapp provides an interactive and user-friendly interface that implements the filtering strategy used in the context of a large-scale genomic project carried out by the Spanish Network for Research in Rare Diseases (CIBERER) in which more than 800 exomes have been analyzed. 
BiERapp is freely available at: http://bierapp.babelomics.org/ %B Nucleic acids research %V 42 %P W88-W93 %8 2014 May 6 %G eng %U http://nar.oxfordjournals.org/content/42/W1/W88 %R 10.1093/nar/gku407 %0 Journal Article %J Omics : a journal of integrative biology %D 2013 %T Assessing Differential Expression Measurements by Highly Parallel Pyrosequencing and DNA Microarrays: A Comparative Study. %A Ariño, Joaquín %A Casamayor, Antonio %A Pérez, Julián Perez %A Pedrola, Laia %A Alvarez-Tejado, Miguel %A Marbà, Martina %A Santoyo, Javier %A Joaquín Dopazo %X

To explore the feasibility of pyrosequencing for quantitative differential gene expression analysis we have performed a comparative study of the results of the sequencing experiments to those obtained by a conventional DNA microarray platform. A conclusion from our analysis is that, over a threshold of 35 normalized reads per gene, the measurements of gene expression display a good correlation with the references. The observed concordance between pyrosequencing and DNA microarray platforms beyond the threshold was 0.8, measured as a Pearson’s correlation coefficient. In differential gene expression the initial aim is the quantification of the differences among transcripts when comparing experimental conditions. Thus, even in a scenario of low coverage the concordance in the measurements is quite acceptable. On the other hand, the comparatively longer read size obtained by pyrosequencing allows the detection of unconventional splicing forms.

%B Omics : a journal of integrative biology %8 2011 Sep 15 %G eng %U http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3545353/ %R 10.1089/omi.2011.0065 %0 Journal Article %J Oncotarget %D 2013 %T Capturing the biological impact of CDKN2A and MC1R genes as an early predisposing event in melanoma and non melanoma skin cancer. %A Puig-Butille, Joan Anton %A Escamez, Maria José %A Garcia-Garcia, Francisco %A Tell-Marti, Gemma %A Fabra, Angels %A Martínez-Santamaría, Lucía %A Badenas, Celia %A Aguilera, Paula %A Pevida, Marta %A Joaquín Dopazo %A Del Rio, Marcela %A Puig, Susana %X Germline mutations in CDKN2A and/or red hair color variants in MC1R genes are associated with an increased susceptibility to develop cutaneous melanoma or non melanoma skin cancer. We studied the impact of the CDKN2A germinal mutation p.G101W and MC1R variants on gene expression and transcription profiles associated with skin cancer. To this end we set up primary skin cell co-cultures from siblings of melanoma-prone families that were later analyzed using the expression array approach. As a result, we found that 1535 transcripts were deregulated in CDKN2A mutated cells, with over-expression of immunity-related genes (HLA-DPB1, CLEC2B, IFI44, IFI44L, IFI27, IFIT1, IFIT2, SP110 and IFNK) and down-regulation of genes playing a role in the Notch signaling pathway. 3570 transcripts were deregulated in MC1R variant carriers. In particular, genes related to oxidative stress and DNA damage pathways were up-regulated as well as genes associated with neurodegenerative diseases such as Parkinson’s, Alzheimer’s and Huntington’s. Finally, we observed that the expression signatures identified in phenotypically normal cells carrying CDKN2A mutations or MC1R variants are maintained in skin cancer tumors (melanoma and squamous cell carcinoma). These results indicate that transcriptome deregulation represents an early event critical for skin cancer development. 
%B Oncotarget %8 2013 Dec 16 %G eng %U http://www.impactjournals.com/oncotarget/index.php?journal=oncotarget&page=article&op=view&path%5B%5D=1444&path%5B%5D=1824 %0 Journal Article %J Nucleic acids research %D 2013 %T Genome Maps, a new generation genome browser. %A Medina, Ignacio %A Salavert, Francisco %A Sánchez, Rubén %A De Maria, Alejandro %A Alonso, Roberto %A Escobar, Pablo %A Bleda, Marta %A Joaquín Dopazo %K BAM %K genome viewer %K HTML5 %K javascript %K Next Generation Sequencing %K NGS %K SVG %K VCF %X Genome browsers have gained importance as more genomes and related genomic information become available. However, the increase of information brought about by new generation sequencing technologies is, at the same time, causing a subtle but continuous decrease in the efficiency of conventional genome browsers. Here, we present Genome Maps, a genome browser that implements an innovative model of data transfer and management. The program uses highly efficient technologies from the new HTML5 standard, such as scalable vector graphics, that optimize workloads at both server and client sides and ensure future scalability. Thus, data management and representation are entirely carried out by the browser, without the need of any Java Applet, Flash or other plug-in technology installation. Relevant biological data on genes, transcripts, exons, regulatory features, single-nucleotide polymorphisms, karyotype and so forth, are imported from web services and are available as tracks. In addition, several DAS servers are already included in Genome Maps. As a novelty, this web-based genome browser allows the local upload of huge genomic data files (e.g. VCF or BAM) that can be dynamically visualized in real time at the client side, thus facilitating the management of medical data affected by privacy restrictions. Finally, Genome Maps can easily be integrated in any web application by including only a few lines of code. 
Genome Maps is an open source collaborative initiative available in the GitHub repository (https://github.com/compbio-bigdata-viz/genome-maps). Genome Maps is available at: http://www.genomemaps.org. %B Nucleic acids research %V 41 %P W41-W46 %8 2013 Jun 8 %G eng %U http://nar.oxfordjournals.org/content/41/W1/W41 %R 10.1093/nar/gkt530 %0 Journal Article %J Carcinogenesis %D 2013 %T Grape antioxidant dietary fiber (GADF) inhibits intestinal polyposis in ApcMin/+ mice: relation to cell cycle and immune response. %A Sánchez-Tena, Susana %A Lizarraga, Daneida %A Miranda, Anibal %A Vinardell, Maria Pilar %A Garcia-Garcia, Francisco %A Joaquín Dopazo %A Torres, Josep Lluís %A Saura-Calixto, Fulgencio %A Capellà, Gabriel %A Cascante, Marta %X Epidemiological and experimental studies suggest that fiber and phenolic compounds might have a protective effect on the development of colon cancer in humans. Accordingly, we assessed the chemopreventive efficacy and associated mechanisms of action of a lyophilized red grape pomace containing proanthocyanidin-rich dietary fiber (Grape Antioxidant Dietary Fiber, GADF) on spontaneous intestinal tumorigenesis in the Apc(Min/+) mouse model. Mice were fed a standard diet (control group) or a 1% (w/w) GADF-supplemented diet (GADF group) for 6 weeks. GADF supplementation greatly reduced intestinal tumorigenesis, significantly decreasing the total number of polyps by 76%. Moreover, size distribution analysis showed a considerable reduction in all polyp size categories [diameter <1 mm (65%), 1-2 mm (67%) and >2 mm (87%)]. In terms of polyp formation in the proximal, middle and distal portions of the small intestine a decrease of 76%, 81% and 73% was observed respectively. Putative molecular mechanisms underlying the inhibition of intestinal tumorigenesis were investigated by comparison of microarray expression profiles of GADF-treated and non-treated mice. 
We observed that the effects of GADF are mainly associated with the induction of a G1 cell cycle arrest and the downregulation of genes related to the immune response and inflammation. Our findings show for the first time the efficacy and associated mechanisms of action of GADF against intestinal tumorigenesis in Apc(Min/+) mice, suggesting its potential for the prevention of colorectal cancer. %B Carcinogenesis %8 2013 Apr 24 %G eng %U http://carcin.oxfordjournals.org/content/early/2013/04/23/carcin.bgt140.abstract %R 10.1093/carcin/bgt140 %0 Journal Article %J PloS one %D 2013 %T Maslinic Acid-Enriched Diet Decreases Intestinal Tumorigenesis in Apc(Min/+) Mice through Transcriptomic and Metabolomic Reprogramming. %A Sánchez-Tena, Susana %A Reyes-Zurita, Fernando J %A Díaz-Moralli, Santiago %A Vinardell, Maria Pilar %A Reed, Michelle %A Garcia-Garcia, Francisco %A Joaquín Dopazo %A Lupiáñez, José A %A Günther, Ulrich %A Cascante, Marta %X Chemoprevention is a pragmatic approach to reduce the risk of colorectal cancer, one of the leading causes of cancer-related death in western countries. In this regard, maslinic acid (MA), a pentacyclic triterpene extracted from wax-like coatings of olives, is known to inhibit proliferation and induce apoptosis in colon cancer cell lines without affecting normal intestinal cells. The present study evaluated the chemopreventive efficacy and associated mechanisms of maslinic acid treatment on spontaneous intestinal tumorigenesis in Apc(Min/+) mice. Twenty-two mice were randomized into 2 groups: control group and MA group, fed with a maslinic acid-supplemented diet for six weeks. MA treatment reduced total intestinal polyp formation by 45% (P<0.01). Putative molecular mechanisms associated with suppressing intestinal polyposis in Apc(Min/+) mice were investigated by comparing microarray expression profiles of MA-treated and control mice and by analyzing the serum metabolic profile using NMR techniques. 
The different expression phenotype induced by MA suggested that it exerts its chemopreventive action mainly by inhibiting cell-survival signaling and inflammation. These changes eventually induce G1-phase cell cycle arrest and apoptosis. Moreover, the metabolic changes induced by MA treatment were associated with a protective profile against intestinal tumorigenesis. These results show the efficacy and underlying mechanisms of MA against intestinal tumor development in the Apc(Min/+) mice model, suggesting its chemopreventive potential against colorectal cancer. %B PloS one %V 8 %P e59392 %8 2013 %G eng %R 10.1371/journal.pone.0059392 %0 Journal Article %J Clinica chimica acta; international journal of clinical chemistry %D 2013 %T Novel genes detected by transcriptional profiling from whole-blood cells in patients with early onset of acute coronary syndrome: Transcriptional profiling of acute coronary syndrome. %A Silbiger, Vivian N %A Luchessi, André D %A Hirata, Rosário D C %A Lima-Neto, Lídio G %A Cavichioli, Débora %A Carracedo, Ángel %A Brión, Maria %A Joaquín Dopazo %A Garcia-Garcia, Francisco %A Dos Santos, Elizabete S %A Ramos, Rui F %A Sampaio, Marcelo F %A Armaganijan, Dikran %A Sousa, Amanda G M R %A Hirata, Mario H %X BACKGROUND: Genome-wide expression analysis using microarrays has been used as a research strategy to discover new biomarkers and candidate genes for a number of diseases. We aim to find new biomarkers for the prediction of acute coronary syndrome (ACS) with a differentially expressed mRNA profiling approach using whole genomic expression analysis in a peripheral blood cell model from patients with early ACS. METHODS AND RESULTS: This study was carried out in two phases. 
On phase 1 a restricted clinical criteria (ACS-Ph1 %B Clinica chimica acta; international journal of clinical chemistry %8 2013 Mar 24 %G eng %R 10.1016/j.cca.2013.03.011 %0 Journal Article %J Orphanet journal of rare diseases %D 2013 %T Pathways systematically associated to Hirschsprung’s disease. %A Fernández, Raquel M %A Bleda, Marta %A Luzón-Toro, Berta %A García-Alonso, Luz %A Arnold, Stacey %A Sribudiani, Yunia %A Besmond, Claude %A Lantieri, Francesca %A Doan, Betty %A Ceccherini, Isabella %A Lyonnet, Stanislas %A Hofstra, Robert Mw %A Chakravarti, Aravinda %A Antiňolo, Guillermo %A Joaquín Dopazo %A Borrego, Salud %K GWAS %K Hirschprung %K network analysis %K Pathway Based Analysis %X Although it has been reported that several loci are involved in Hirschsprung’s disease, the molecular basis of the disease remains essentially unknown. The study of collective properties of modules of functionally-related genes provides an efficient and sensitive statistical framework that can overcome sample size limitations in the study of rare diseases. Here, we present the extension of a previous study of a Spanish series of HSCR trios to an international cohort of 162 HSCR trios to validate the generality of the underlying functional basis of the Hirschsprung’s disease mechanisms previously found. The Pathway-Based Analysis (PBA) confirms a strong association of gene ontology (GO) modules related to signal transduction and its regulation, enteric nervous system (ENS) formation and other processes related to the disease. In addition, network analysis recovers sub-networks significantly associated with the disease, which contain genes related to the same functionalities, thus providing an independent validation of these findings. The functional profiles of association obtained for patient populations from different countries were compared to each other. 
While gene associations were different at each series, the main functional associations were identical in all the five populations. These observations would also explain the reported low reproducibility of associations of individual disease genes across populations. %B Orphanet journal of rare diseases %V 8 %P 187 %8 2013 Dec 2 %G eng %U http://www.ojrd.com/content/8/1/187/abstract %R 10.1186/1750-1172-8-187 %0 Journal Article %J PloS one %D 2012 %T Development, Characterization and Experimental Validation of a Cultivated Sunflower (Helianthus annuus L.) Gene Expression Oligonucleotide Microarray. %A Fernandez, Paula %A Soria, Marcelo %A Blesa, David %A Dirienzo, Julio %A Moschen, Sebastián %A Rivarola, Máximo %A Clavijo, Bernardo Jose %A Gonzalez, Sergio %A Peluffo, Lucila %A Príncipi, Dario %A Dosio, Guillermo %A Aguirrezabal, Luis %A Garcia-Garcia, Francisco %A Ana Conesa %A Hopp, Esteban %A Joaquín Dopazo %A Heinz, Ruth Amelia %A Paniego, Norma %X Oligonucleotide-based microarrays with accurate gene coverage represent a key strategy for transcriptional studies in orphan species such as sunflower, H. annuus L., which lacks full genome sequences. The goal of this study was the development and functional annotation of a comprehensive sunflower unigene collection and the design and validation of a custom sunflower oligonucleotide-based microarray. A large scale EST (>130,000 ESTs) curation, assembly and sequence annotation was performed using Blast2GO (www.blast2go.de). The EST assembly comprises 41,013 putative transcripts (12,924 contigs and 28,089 singletons). The resulting Sunflower Unigen Resource (SUR version 1.0) was used to design an oligonucleotide-based Agilent microarray for cultivated sunflower. This microarray includes a total of 42,326 features: 1,417 Agilent controls, 74 control probes for sunflower replicated 10 times (740 controls) and 40,169 different non-control probes. 
Microarray performance was validated using a model experiment examining the induction of senescence by water deficit. Pre-processing and differential expression analysis of Agilent microarrays was performed using the Bioconductor limma package. The analyses based on p-values calculated by eBayes (p<0.01) allowed the detection of 558 differentially expressed genes between water stress and control conditions; from these, ten genes were further validated by qPCR. Over-represented ontologies were identified using FatiScan in the Babelomics suite. This work generated a curated and trustable sunflower unigene collection, and a custom, validated sunflower oligonucleotide-based microarray using Agilent technology. Both the curated unigene collection and the validated oligonucleotide microarray provide key resources for sunflower genome analysis, transcriptional studies, and molecular breeding for crop improvement. %B PloS one %V 7 %P e45899 %8 2012 %G eng %U http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0045899 %R 10.1371/journal.pone.0045899 %0 Journal Article %J Orphanet journal of rare diseases %D 2012 %T Four new loci associations discovered by pathway-based and network analyses of the genome-wide variability profile of Hirschsprung’s disease. %A Fernández, Raquel Ma %A Bleda, Marta %A Núñez-Torres, Rocío %A Medina, Ignacio %A Luzón-Toro, Berta %A García-Alonso, Luz %A Torroglosa, Ana %A Marbà, Martina %A Enguix-Riego, Ma Valle %A Montaner, David %A Antiňolo, Guillermo %A Joaquín Dopazo %A Borrego, Salud %X ABSTRACT: Finding gene associations in rare diseases is frequently hampered by the reduced numbers of patients accessible. Conventional gene-based association tests rely on the availability of large cohorts, which constitutes a serious limitation for its application in this scenario. 
To overcome this problem we have used here a combined strategy in which a pathway-based analysis (PBA) has been initially conducted to prioritize candidate genes in a Spanish cohort of 53 trios of short-segment Hirschsprung’s disease. Candidate genes have been further validated in an independent population of 106 trios. The study revealed a strong association of 11 gene ontology (GO) modules related to signal transduction and its regulation, enteric nervous system (ENS) formation and other HSCR-related processes. Among the preselected candidates, a total of 4 loci, RASGEF1A, IQGAP2, DLC1 and CHRNA7, related to signal transduction and migration processes, were found to be significantly associated with HSCR. Network analysis also confirms their involvement in the network of already known disease genes. This approach, based on the study of functionally-related gene sets, requires smaller sample sizes and opens new opportunities for the study of rare diseases. %B Orphanet journal of rare diseases %V 7 %P 103 %8 2012 Dec 28 %G eng %U http://www.ojrd.com/content/7/1/103/abstract %R 10.1186/1750-1172-7-103 %0 Journal Article %J Bioinformatics (Oxford, England) %D 2012 %T Qualimap: evaluating next-generation sequencing alignment data. %A García-Alcalde, Fernando %A Okonechnikov, Konstantin %A Carbonell, José %A Cruz, Luis M %A Götz, Stefan %A Sonia Tarazona %A Joaquín Dopazo %A Meyer, Thomas F %A Ana Conesa %K NGS %X MOTIVATION: The sequence alignment/map (SAM) and the binary alignment/map (BAM) formats have become the standard method of representation of nucleotide sequence alignments for next-generation sequencing data. SAM/BAM files usually contain information from tens to hundreds of millions of reads. Often, the sequencing technology, protocol and/or the selected mapping algorithm introduce some unwanted biases in these data. The systematic detection of such biases is a non-trivial task that is crucial to drive appropriate downstream analyses. 
RESULTS: We have developed Qualimap, a Java application that supports user-friendly quality control of mapping data, by considering sequence features and their genomic properties. Qualimap takes sequence alignment data and provides graphical and statistical analyses for the evaluation of data. Such quality-control data are vital for highlighting problems in the sequencing and/or mapping processes, which must be addressed prior to further analyses. AVAILABILITY: Qualimap is freely available from http://www.qualimap.org. CONTACT: aconesa@cipf.es SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. %B Bioinformatics (Oxford, England) %V 28 %P 2678-9 %8 2012 Oct 15 %G eng %U http://bioinformatics.oxfordjournals.org/content/28/20/2678.long %R 10.1093/bioinformatics/bts503 %0 Journal Article %J International journal of data mining and bioinformatics %D 2012 %T Select your SNPs (SYSNPs): a web tool for automatic and massive selection of SNPs. %A Lorente-Galdos, Belén %A Medina, Ignacio %A Morcillo-Suarez, Carlos %A Heredia, Txema %A Carreño-Torres, Angel %A Sangrós, Ricardo %A Alegre, Josep %A Pita, Guillermo %A Vellalta, Gemma %A Malats, Nuria %A Pisano, David G %A Joaquín Dopazo %A Navarro, Arcadi %X Association studies are the choice approach in the discovery of the genomic basis of complex traits. To carry out such analysis, researchers frequently need to (1) select optimally informative sets of Single Nucleotide Polymorphisms (SNPs) in candidate regions and (2) annotate the results of associations found by means of genome-wide SNP arrays. These are complex tasks, since many criteria have to be considered, including the SNPs’ functional properties, technological information and haplotype frequencies in given populations. SYSNPs implements algorithms that allow for efficient and simultaneous consideration of all the relevant criteria to obtain sets of SNPs that properly cover arbitrarily large lists of genes or genomic regions. 
Complementarily, SYSNPs allows for comprehensive functional annotation of SNPs linked to any given marker SNP. SYSNPs dramatically reduces the effort needed for SNP selection from days of searching various databases to a few minutes using a simple browser. %B International journal of data mining and bioinformatics %V 6 %P 324-34 %8 2012 %G eng %U http://inderscience.metapress.com/content/f76740x8071u513n/ %0 Journal Article %J Bioinformatics (Oxford, England) %D 2011 %T B2G-FAR, a species centered GO annotation repository. %A Götz, Stefan %A Arnold, Roland %A Sebastián-Leon, Patricia %A Martín-Rodríguez, Samuel %A Tischler, Patrick %A Jehl, Marc-André %A Joaquín Dopazo %A Rattei, Thomas %A Ana Conesa %X

MOTIVATION: Functional genomics research has expanded enormously in the last decade thanks to the cost-reduction in high-throughput technologies and the development of computational tools that generate, standardize and share information on gene and protein function such as the Gene Ontology (GO). Nevertheless many biologists, especially those working with non-model organisms, still suffer from non-existing or low coverage functional annotation, or simply struggle with retrieving, summarizing and querying these data. RESULTS: The Blast2GO Functional Annotation Repository (B2G-FAR) is a bioinformatics resource envisaged to provide functional information for otherwise uncharacterized sequence-data and offers data-mining tools to analyze a larger repertoire of species than currently available. This new annotation resource has been created by applying the Blast2GO functional annotation engine in a strongly high-throughput manner to the entire space of publicly available sequences. The resulting repository contains GO term predictions for over 13.2 million non-redundant protein sequences based on BLAST search alignments from the SIMAP database. We generated GO annotation for approximately 150,000 different taxa, making available the 2000 species with the highest coverage through B2G-FAR. A second section within B2G-FAR holds functional annotations for 17 non-model organism Affymetrix GeneChips. CONCLUSIONS: B2G-FAR provides easy access to exhaustive functional annotation for 2000 species offering a good balance between quality and quantity, thereby supporting functional genomics research especially in the case of non-model organisms. AVAILABILITY: The annotation resource is available at http://b2gfar.bioinfo.cipf.es. CONTACT: aconesa@cipf.es, sgoetz@cipf.es.

%B Bioinformatics (Oxford, England) %V 27 %P 919-924 %8 2011 Feb 18 %G eng %0 Journal Article %J Plant physiology %D 2011 %T Early transcriptional defence responses in Arabidopsis cell suspension culture under high light conditions. %A González-Pérez, Sergio %A Gutiérrez, Jorge %A Garcia-Garcia, Francisco %A Osuna, Daniel %A Joaquín Dopazo %A Lorenzo, Oscar %A Revuelta, José L %A Arellano, Juan B %X

The early transcriptional defence responses and ROS production in Arabidopsis cell suspension culture (ACSC), containing functional chloroplasts, were examined at high light (HL). The transcriptional analysis revealed that most of the ROS markers identified among the 449 transcripts with significant differential expression were transcripts specifically up-regulated by singlet oxygen (1O2). On the contrary, minimal correlation was established with transcripts specifically up-regulated by superoxide radical (O2•) or hydrogen peroxide (H2O2). The transcriptional analysis was supported by fluorescence microscopy experiments. The incubation of ACSC with the 1O2 sensor green reagent and 2’,7’-dichlorofluorescein diacetate showed that the 30-min-HL-treated cultures emitted fluorescence that corresponded with the production of 1O2, but not of H2O2. Furthermore, the in vivo photodamage of the D1 protein of photosystem II (PSII) indicated that the photogeneration of 1O2 took place within the PSII reaction centre. Functional enrichment analyses identified transcripts that are key components of the ROS signalling transduction pathway in plants as well as others encoding transcription factors that regulate both ROS scavenging and water deficit stress. A meta-analysis examining the transcriptional profiles of mutants and hormone treatments in Arabidopsis showed a high correlation between ACSC at HL and the flu mutant family of Arabidopsis, a producer of 1O2 in plastids. Intriguingly, a high correlation was also observed with aba1 and max4, two mutants with defects in the biosynthesis pathways of two key (apo)carotenoid-derived plant hormones (i.e. ABA and strigolactones, respectively). ACSC has proven to be a valuable system for studying early transcriptional responses to HL stress.

%B Plant physiology %V 156 %P 1439-56 %8 2011 Apr 29 %G eng %U http://www.plantphysiol.org/content/early/2011/04/29/pp.111.177766.short?keytype=ref&ijkey=ph5B6J2khjnqwzN %0 Journal Article %J Bioinformatics (Oxford, England) %D 2011 %T Paintomics: a web based tool for the joint visualization of transcriptomics and metabolomics data. %A García-Alcalde, Fernando %A García-López, Federico %A Joaquín Dopazo %A Ana Conesa %X

The development of omics technologies such as transcriptomics, proteomics and metabolomics has made possible systems biology studies in which biological systems are interrogated at different levels of biochemical activity (gene expression, protein activity and/or metabolite concentration). An effective approach to the analysis of these complex datasets is the joint visualization of the disparate biomolecular data within the framework of known biological pathways.

%B Bioinformatics (Oxford, England) %V 27 %P 137-9 %8 2011 Jan 1 %G eng %0 Journal Article %J BMC genomics %D 2011 %T Profiling the venom gland transcriptomes of Costa Rican snakes by 454 pyrosequencing. %A Durban, Jordi %A Juárez, Paula %A Angulo, Yamileth %A Lomonte, Bruno %A Flores-Diaz, Marietta %A Alape-Girón, Alberto %A Sasa, Mahmood %A Sanz, Libia %A Gutiérrez, José M %A Joaquín Dopazo %A Ana Conesa %A Calvete, Juan J %X

A long-term research goal of venomics, of applied importance for improving current antivenom therapy, but also for drug discovery, is to understand the pharmacological potential of venoms. Individually or combined, proteomic and transcriptomic studies have demonstrated their feasibility to explore in depth the molecular diversity of venoms. In the absence of a genome sequence, transcriptomes also represent valuable searchable databases for proteomic projects.

%B BMC genomics %V 12 %P 259 %8 2011 %G eng %0 Journal Article %J Nucleic Acids Research %D 2010 %T Babelomics: an integrative platform for the analysis of transcriptomics, proteomics and genomic data with advanced functional profiling. %A Medina, Ignacio %A Carbonell, José %A Pulido, Luis %A Madeira, Sara C %A Goetz, Stefan %A Ana Conesa %A Tárraga, Joaquín %A Pascual-Montano, Alberto %A Nogales-Cadenas, Ruben %A Santoyo, Javier %A García, Francisco %A Marbà, Martina %A Montaner, David %A Joaquín Dopazo %K babelomics %K gene expression %K genotyping %K gepas %K GSA %K GWAS %X

Babelomics is a response to the growing necessity of integrating and analyzing different types of genomic data in an environment that allows an easy functional interpretation of the results. Babelomics includes a complete suite of methods for the analysis of gene expression data that include normalization (covering most commercial platforms), pre-processing, differential gene expression (case-controls, multiclass, survival or continuous values), predictors, clustering; large-scale genotyping assays (case controls and TDTs, and allows population stratification analysis and correction). All these genomic data analysis facilities are integrated and connected to multiple options for the functional interpretation of the experiments. Different methods of functional enrichment or gene set enrichment can be used to understand the functional basis of the experiment analyzed. Many sources of biological information, which include functional (GO, KEGG, Biocarta, Reactome, etc.), regulatory (Transfac, Jaspar, ORegAnno, miRNAs, etc.), text-mining or protein-protein interaction modules can be used for this purpose. Finally a tool for the de novo functional annotation of sequences has been included in the system. This provides support for the functional analysis of non-model species. Mirrors of Babelomics or command line execution of their individual components are now possible. Babelomics is available at http://www.babelomics.org.

%B Nucleic Acids Research %V 38 %P W210-W213 %8 2010 May 16 %G eng %U http://nar.oxfordjournals.org/content/38/suppl_2/W210.full %& Featured in NAR %0 Journal Article %J The ISME journal %D 2010 %T Fine-scale evolution: genomic, phenotypic and ecological differentiation in two coexisting Salinibacter ruber strains. %A Peña, Arantxa %A Teeling, Hanno %A Huerta-Cepas, Jaime %A Santos, Fernando %A Yarza, Pablo %A Brito-Echeverría, Jocelyn %A Lucio, Marianna %A Schmitt-Kopplin, Philippe %A Meseguer, Inmaculada %A Schenowitz, Chantal %A Dossat, Carole %A Barbe, Valerie %A Joaquín Dopazo %A Rosselló-Mora, Ramon %A Schüler, Margarete %A Glöckner, Frank Oliver %A Amann, Rudolf %A Gabaldón, Toni %A Antón, Josefa %X

Genomic and metagenomic data indicate a high degree of genomic variation within microbial populations, although the ecological and evolutionary meaning of this microdiversity remains unknown. Microevolution analyses, including genomic and experimental approaches, are so far very scarce for non-pathogenic bacteria. In this study, we compare the genomes, metabolomes and selected ecological traits of the strains M8 and M31 of the hyperhalophilic bacterium Salinibacter ruber that contain ribosomal RNA (rRNA) gene and intergenic regions that are identical in sequence and were simultaneously isolated from a Mediterranean solar saltern. Comparative analyses indicate that S. ruber genomes present a mosaic structure with conserved and hypervariable regions (HVRs). The HVRs, or genomic islands, are enriched in transposases, genes related to surface properties, strain-specific genes and highly divergent orthologs. However, the many indels outside the HVRs indicate that genome plasticity extends beyond them. Overall, 10% of the genes encoded in the M8 genome are absent from M31 and could stem from recent acquisitions. S. ruber genomes also harbor 34 genes located outside HVRs that are transcribed during standard growth and probably derive from lateral gene transfers with Archaea preceding the M8/M31 divergence. Metabolomic analyses, phage susceptibility and competition experiments indicate that these genomic differences cannot be considered neutral from an ecological perspective. The results point to the avoidance of competition by micro-niche adaptation and response to viral predation as putative major forces that drive microevolution within these Salinibacter strains. In addition, this work highlights the extent of bacterial functional diversity and environmental adaptation, beyond the resolution of the 16S rRNA and internal transcribed spacer regions. The ISME Journal advance online publication, 18 February 2010; doi:10.1038/ismej.2010.6.

%B The ISME journal %8 2010 Feb 18 %G eng %0 Book Section %B Methods in molecular biology (Clifton, N.J.) %D 2010 %T Functional profiling methods in cancer. %A Joaquín Dopazo %E Grützmann, Robert %E Pilarsky, Christian %X

The introduction of new high-throughput methodologies such as DNA microarrays constitutes a major breakthrough in cancer research. The unprecedented amount of data produced by such technologies has opened new avenues for interrogating living systems although, at the same time, it has demanded the development of new data analytical methods as well as new strategies for testing hypotheses. A history of early successful applications in cancer boosted the use of microarrays and fostered further applications in other fields. Keeping pace with these technologies, bioinformatics offers new solutions for data analysis and, more importantly, permits the formulation of a new class of hypotheses inspired by systems biology, more oriented to pathways or, in general, to modules of functionally related genes. Although these analytical methodologies are new, some options are already available and are discussed in this chapter.

%B Methods in molecular biology (Clifton, N.J.) %V 576 %P 363-74 %8 2010 %G eng