%0 Journal Article %J Hum Mol Genet %D 2022 %T Novel genes and sex differences in COVID-19 severity. %A Cruz, Raquel %A Almeida, Silvia Diz-de %A Heredia, Miguel López %A Quintela, Inés %A Ceballos, Francisco C %A Pita, Guillermo %A Lorenzo-Salazar, José M %A González-Montelongo, Rafaela %A Gago-Domínguez, Manuela %A Porras, Marta Sevilla %A Castaño, Jair Antonio Tenorio %A Nevado, Julián %A Aguado, Jose María %A Aguilar, Carlos %A Aguilera-Albesa, Sergio %A Almadana, Virginia %A Almoguera, Berta %A Alvarez, Nuria %A Andreu-Bernabeu, Álvaro %A Arana-Arri, Eunate %A Arango, Celso %A Arranz, María J %A Artiga, Maria-Jesus %A Baptista-Rosas, Raúl C %A Barreda-Sánchez, María %A Belhassen-Garcia, Moncef %A Bezerra, Joao F %A Bezerra, Marcos A C %A Boix-Palop, Lucía %A Brión, Maria %A Brugada, Ramón %A Bustos, Matilde %A Calderón, Enrique J %A Carbonell, Cristina %A Castano, Luis %A Castelao, Jose E %A Conde-Vicente, Rosa %A Cordero-Lorenzana, M Lourdes %A Cortes-Sanchez, Jose L %A Corton, Marta %A Darnaude, M Teresa %A De Martino-Rodríguez, Alba %A Campo-Pérez, Victor %A Bustamante, Aranzazu Diaz %A Domínguez-Garrido, Elena %A Luchessi, André D %A Eirós, Rocío %A Sanabria, Gladys Mercedes Estigarribia %A Fariñas, María Carmen %A Fernández-Robelo, Uxía %A Fernández-Rodríguez, Amanda %A Fernández-Villa, Tania %A Gil-Fournier, Belén %A Gómez-Arrue, Javier %A Álvarez, Beatriz González %A Quirós, Fernan Gonzalez Bernaldo %A González-Peñas, Javier %A Gutiérrez-Bautista, Juan F %A Herrero, María José %A Herrero-Gonzalez, Antonio %A Jimenez-Sousa, María A %A Lattig, María Claudia %A Borja, Anabel Liger %A Lopez-Rodriguez, Rosario %A Mancebo, Esther %A Martín-López, Caridad %A Martín, Vicente %A Martinez-Nieto, Oscar %A Martinez-Lopez, Iciar %A Martinez-Resendez, Michel F %A Martinez-Perez, Ángel %A Mazzeu, Juliana A %A Macías, Eleuterio Merayo %A Minguez, Pablo %A Cuerda, Victor Moreno %A Silbiger, Vivian N %A Oliveira, Silviene F %A Ortega-Paino, Eva %A Parellada, Mara %A 
Paz-Artal, Estela %A Santos, Ney P C %A Pérez-Matute, Patricia %A Perez, Patricia %A Pérez-Tomás, M Elena %A Perucho, Teresa %A Pinsach-Abuin, Mel Lina %A Pompa-Mera, Ericka N %A Porras-Hurtado, Gloria L %A Pujol, Aurora %A León, Soraya Ramiro %A Resino, Salvador %A Fernandes, Marianne R %A Rodríguez-Ruiz, Emilio %A Rodriguez-Artalejo, Fernando %A Rodriguez-Garcia, José A %A Ruiz-Cabello, Francisco %A Ruiz-Hornillos, Javier %A Ryan, Pablo %A Soria, José Manuel %A Souto, Juan Carlos %A Tamayo, Eduardo %A Tamayo-Velasco, Alvaro %A Taracido-Fernandez, Juan Carlos %A Teper, Alejandro %A Torres-Tobar, Lilian %A Urioste, Miguel %A Valencia-Ramos, Juan %A Yáñez, Zuleima %A Zarate, Ruth %A Nakanishi, Tomoko %A Pigazzini, Sara %A Degenhardt, Frauke %A Butler-Laporte, Guillaume %A Maya-Miles, Douglas %A Bujanda, Luis %A Bouysran, Youssef %A Palom, Adriana %A Ellinghaus, David %A Martínez-Bueno, Manuel %A Rolker, Selina %A Amitrano, Sara %A Roade, Luisa %A Fava, Francesca %A Spinner, Christoph D %A Prati, Daniele %A Bernardo, David %A García, Federico %A Darcis, Gilles %A Fernández-Cadenas, Israel %A Holter, Jan Cato %A Banales, Jesus M %A Frithiof, Robert %A Duga, Stefano %A Asselta, Rosanna %A Pereira, Alexandre C %A Romero-Gómez, Manuel %A Nafría-Jiménez, Beatriz %A Hov, Johannes R %A Migeotte, Isabelle %A Renieri, Alessandra %A Planas, Anna M %A Ludwig, Kerstin U %A Buti, Maria %A Rahmouni, Souad %A Alarcón-Riquelme, Marta E %A Schulte, Eva C %A Franke, Andre %A Karlsen, Tom H %A Valenti, Luca %A Zeberg, Hugo %A Richards, Brent %A Ganna, Andrea %A Boada, Mercè %A Rojas, Itziar %A Ruiz, Agustín %A Sánchez, Pascual %A Real, Luis Miguel %A Guillén-Navarro, Encarna %A Ayuso, Carmen %A González-Neira, Anna %A Riancho, José A %A Rojas-Martinez, Augusto %A Flores, Carlos %A Lapunzina, Pablo %A Carracedo, Ángel %X

Here we describe the results of a genome-wide study conducted in 11 939 COVID-19 positive cases with extensive clinical information, recruited from 34 hospitals across Spain (SCOURGE consortium). In sex-disaggregated genome-wide association studies for COVID-19 hospitalization, genome-wide significance (p < 5 × 10^-8) was crossed for variants in 3p21.31 and 21q22.11 loci only among males (p = 1.3 × 10^-22 and p = 8.1 × 10^-12, respectively), and for variants in 9q21.32 near TLE1 only among females (p = 4.4 × 10^-8). In a second phase, results were combined with an independent Spanish cohort (1598 COVID-19 cases and 1068 population controls), revealing in the overall analysis two novel risk loci in 9p13.3 and 19q13.12, with fine-mapping prioritized variants functionally associated with AQP3 (p = 2.7 × 10^-8) and ARHGAP33 (p = 1.3 × 10^-8), respectively. The meta-analysis of both phases with four European studies stratified by sex from the Host Genetics Initiative (HGI) confirmed the association of the 3p21.31 and 21q22.11 loci predominantly in males and replicated a recently reported variant in 11p13 (ELF5, p = 4.1 × 10^-8). Six of the loci discovered by the COVID-19 HGI were replicated, and an HGI-based genetic risk score predicted the severity strata in SCOURGE. We also found more SNP-heritability and larger heritability differences by age (<60 or ≥60 years) among males than among females. Parallel genome-wide screening of inbreeding depression in SCOURGE also showed an effect of homozygosity on COVID-19 hospitalization and severity, and this effect was stronger among older males. In summary, new candidate genes for COVID-19 severity and evidence supporting genetic disparities among sexes are provided.

%B Hum Mol Genet %8 2022 Jun 16 %G eng %R 10.1093/hmg/ddac132 %0 Journal Article %J Nature %D 2020 %T Transparency and reproducibility in artificial intelligence. %A Haibe-Kains, Benjamin %A Adam, George Alexandru %A Hosny, Ahmed %A Khodakarami, Farnoosh %A Waldron, Levi %A Wang, Bo %A McIntosh, Chris %A Goldenberg, Anna %A Kundaje, Anshul %A Greene, Casey S %A Broderick, Tamara %A Hoffman, Michael M %A Leek, Jeffrey T %A Korthauer, Keegan %A Huber, Wolfgang %A Brazma, Alvis %A Pineau, Joelle %A Tibshirani, Robert %A Hastie, Trevor %A Ioannidis, John P A %A Quackenbush, John %A Aerts, Hugo J W L %K Algorithms %K Artificial Intelligence %K Reproducibility of Results %B Nature %V 586 %P E14-E16 %8 2020 10 %G eng %N 7829 %1 https://www.ncbi.nlm.nih.gov/pubmed/33057217?dopt=Abstract %R 10.1038/s41586-020-2766-y %0 Journal Article %J N Engl J Med %D 2017 %T GGPS1 Mutation and Atypical Femoral Fractures with Bisphosphonates. %A Roca-Ayats, Neus %A Balcells, Susana %A Garcia-Giralt, Natàlia %A Falcó-Mascaró, Maite %A Martínez-Gil, Núria %A Abril, Josep F %A Urreizti, Roser %A Dopazo, Joaquin %A Quesada-Gómez, José M %A Nogués, Xavier %A Mellibovsky, Leonardo %A Prieto-Alhambra, Daniel %A Dunford, James E %A Javaid, Muhammad K %A Russell, R Graham %A Grinberg, Daniel %A Díez-Pérez, Adolfo %K Aged %K Amino Acid Sequence %K Bone Density Conservation Agents %K Dimethylallyltranstransferase %K Diphosphonates %K Exome %K Farnesyltranstransferase %K Female %K Femoral Fractures %K Geranyltranstransferase %K Humans %K Middle Aged %K mutation %B N Engl J Med %V 376 %P 1794-1795 %8 2017 May 04 %G eng %U http://www.nejm.org/doi/full/10.1056/NEJMc1612804 %N 18 %1 https://www.ncbi.nlm.nih.gov/pubmed/28467865?dopt=Abstract %R 10.1056/NEJMc1612804 %0 Journal Article %J Hum Mutat %D 2017 %T Mutations in TRAPPC11 are associated with a congenital disorder of glycosylation. 
%A Matalonga, Leslie %A Bravo, Miren %A Serra-Peinado, Carla %A García-Pelegrí, Elisabeth %A Ugarteburu, Olatz %A Vidal, Silvia %A Llambrich, Maria %A Quintana, Ester %A Fuster-Jorge, Pedro %A Gonzalez-Bravo, Maria Nieves %A Beltran, Sergi %A Dopazo, Joaquin %A Garcia-Garcia, Francisco %A Foulquier, François %A Matthijs, Gert %A Mills, Philippa %A Ribes, Antonia %A Egea, Gustavo %A Briones, Paz %A Tort, Frederic %A Girós, Marisa %K Abnormalities, Multiple %K Alleles %K Amino Acid Substitution %K Brain %K Congenital Disorders of Glycosylation %K Genotype %K Humans %K Magnetic Resonance Imaging %K Male %K mutation %K Phenotype %K Vesicular Transport Proteins %K Whole Genome Sequencing %X

Congenital disorders of glycosylation (CDG) are a heterogeneous and rapidly growing group of diseases caused by abnormal glycosylation of proteins and/or lipids. Mutations in genes involved in the homeostasis of the endoplasmic reticulum (ER), the Golgi apparatus (GA), and the vesicular trafficking from the ER to the ER-Golgi intermediate compartment (ERGIC) have been found to be associated with CDG. Here, we report a patient with defects in both N- and O-glycosylation combined with a delayed vesicular transport in the GA due to mutations in TRAPPC11, a subunit of the TRAPPIII complex. TRAPPIII is implicated in the anterograde transport from the ER to the ERGIC as well as in the vesicle export from the GA. This report expands the spectrum of genetic alterations associated with CDG, providing new insights for the diagnosis and the understanding of the physiopathological mechanisms underlying glycosylation disorders.

%B Hum Mutat %V 38 %P 148-151 %8 2017 Feb %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/27862579?dopt=Abstract %R 10.1002/humu.23145 %0 Journal Article %J DNA Res %D 2016 %T Highly sensitive and ultrafast read mapping for RNA-seq analysis. %A Medina, I %A Tárraga, J %A Martínez, H %A Barrachina, S %A Castillo, M I %A Paschall, J %A Salavert-Torres, J %A Blanquer-Espert, I %A Hernández-García, V %A Quintana-Ortí, E S %A Dopazo, J %K Genomics %K High-Throughput Nucleotide Sequencing %K Humans %K Sensitivity and Specificity %K Sequence Analysis, RNA %K Transcriptome %X

As sequencing technologies progress, the amount of data produced grows exponentially, shifting the bottleneck of discovery towards the data analysis phase. In particular, currently available mapping solutions for RNA-seq leave room for improvement in terms of sensitivity and performance, hindering an efficient analysis of transcriptomes by massive sequencing. Here, we present an innovative approach that combines re-engineering, optimization and parallelization. This solution results in a significant increase in mapping sensitivity over a wide range of read lengths and substantially shorter runtimes when compared with currently available RNA-seq mapping methods.

%B DNA Res %V 23 %P 93-100 %8 2016 Apr %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/26740642?dopt=Abstract %R 10.1093/dnares/dsv039 %0 Journal Article %J IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM %D 2015 %T Concurrent and Accurate Short Read Mapping on Multicore Processors. %A Martinez, Hector %A Tárraga, Joaquín %A Medina, Ignacio %A Barrachina, Sergio %A Castillo, Maribel %A Dopazo, Joaquin %A Quintana-Orti, Enrique S %K HPC %K NGS %K short read mapping %X We introduce a parallel aligner with a work-flow organization for fast and accurate mapping of RNA sequences on servers equipped with multicore processors. Our software, [Formula: see text] ([Formula: see text] is an open-source application; the software is available at http://www.opencb.org), exploits a suffix array to rapidly map a large fraction of the RNA fragments (reads), and leverages the accuracy of the Smith-Waterman algorithm to deal with conflictive reads. The aligner is enhanced with a careful strategy to detect splice junctions based on an adaptive division of RNA reads into small segments (or seeds), which are then mapped onto a number of candidate alignment locations, providing crucial information for the successful alignment of the complete reads. The experimental results on a platform with Intel multicore technology report the parallel performance of [Formula: see text] on RNA reads of 100-400 nucleotides, which excels in execution time/sensitivity relative to state-of-the-art aligners such as TopHat 2+Bowtie 2, MapSplice, and STAR. %B IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM %V 12 %P 995-1007 %8 2015 Sep-Oct %G eng %U http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=7010005 %R 10.1109/TCBB.2015.2392077 %0 Journal Article %J Molecular immunology %D 2015 %T Therapeutic targets for olive pollen allergy defined by gene markers modulated by Ole e 1-derived peptides. 
%A Calzada, David %A Aguerri, Miriam %A Baos, Selene %A Montaner, David %A Mata, Manuel %A Dopazo, Joaquín %A Quiralte, Joaquín %A Florido, Fernando %A Lahoz, Carlos %A Cárdaba, Blanca %X Two regions of Ole e 1, the major olive-pollen allergen, have been characterized as T-cell epitopes, one as an immunodominant region (aa91-130) and the other as mainly recognized by non-allergic subjects (aa10-31). This report tries to characterize the specific relevance of these epitopes in the allergic response to olive pollen by analyzing the secreted cytokines and the gene expression profiles induced after specific stimulation of peripheral blood mononuclear cells (PBMCs). PBMCs from olive pollen-allergic and non-allergic control subjects were stimulated with olive-pollen extract and Ole e 1 dodecapeptides containing relevant T-cell epitopes. Levels of cytokines were measured in cellular supernatants and gene expression was determined by microarrays on the RNAs extracted from PBMCs. One hundred eighty-nine differential genes (fold change >2 or <-2, P<0.05) were validated by qRT-PCR in a large population. It was not possible to define a pattern of response according to the overall cytokine results, but interesting differences were observed, mainly in the regulatory cytokines. Principal component analysis (PCA) of gene expression defined clusters that correlated with the experimental conditions in the group of allergic subjects. Gene expression and functional analyses revealed differential genes and pathways among the experimental conditions. A set of 51 genes (many essential to T-cell tolerance and homeostasis) correlated with the response to aa10-31 of Ole e 1. In conclusion, two peptides derived from Ole e 1 could regulate the immune response in allergic patients by modifying the expression of several regulation-related genes. These results open new avenues of research into the regulation of allergy by Oleaceae family members. 
%B Molecular immunology %V 64 %P 252-61 %8 2015 Apr %G eng %U http://www.sciencedirect.com/science/article/pii/S0161589014003356 %R 10.1016/j.molimm.2014.12.002 %0 Journal Article %J J Biol Regul Homeost Agents %D 2013 %T Differential gene-expression analysis defines a molecular pattern related to olive pollen allergy. %A Aguerri, M %A Calzada, D %A Montaner, D %A Mata, M %A Florido, F %A Quiralte, J %A Dopazo, J %A Lahoz, C %A Cardaba, B %K Adult %K Female %K Gene Expression Profiling %K Humans %K Male %K Middle Aged %K Olea %K Principal Component Analysis %K Rhinitis, Allergic, Seasonal %X

Analysis of gene-expression profiles by microarrays is useful for characterizing candidate genes and key regulatory networks, and for defining phenotypes or molecular signatures that improve the diagnosis and/or classification of allergic processes. We have used this approach in the study of olive pollen response in order to find differential molecular markers among responders and non-responders to this allergenic source. Five clinical groups (non-allergic; asymptomatic; allergic but not to olive pollen; untreated olive-pollen-allergic patients; and olive-pollen-allergic patients under specific immunotherapy) were assessed during and outside pollen seasons. Whole-genome gene expression analysis was performed on RNAs extracted from PBMCs. After assessment of data quality and principal components analysis (PCA), differential gene-expression analysis with multiple-testing correction and functional analyses (KEGG for pathways and Gene Ontology for biological processes) were performed. Relevance was defined by fold change and corrected P values (less than 0.05). The most differential genes were validated by qRT-PCR in a larger set of individuals. Interestingly, gene-expression profiling obtained by PCA clearly showed five clusters of samples that correlated with the five clinical groups. Furthermore, differential gene expression and functional analyses revealed differential genes and pathways in the five clinical groups. The 93 most significant genes found were validated, and one set of 35 genes was able to discriminate profiles of olive pollen response. Our results, in addition to providing new information on allergic response, define a possible molecular signature for olive pollen allergy which could be useful for the diagnosis and treatment of this and other sensitizations.

%B J Biol Regul Homeost Agents %V 27 %P 337-50 %8 2013 Apr-Jun %G eng %N 2 %1 https://www.ncbi.nlm.nih.gov/pubmed/23830385?dopt=Abstract %0 Journal Article %J PLoS pathogens %D 2011 %T Discovery of an ebolavirus-like filovirus in Europe. %A Negredo, Ana %A Palacios, Gustavo %A Vázquez-Morón, Sonia %A González, Félix %A Dopazo, Hernán %A Molero, Francisca %A Juste, Javier %A Quetglas, Juan %A Savji, Nazir %A de la Cruz Martínez, Maria %A Herrera, Jesus Enrique %A Pizarro, Manuel %A Hutchison, Stephen K %A Echevarría, Juan E %A Lipkin, W Ian %A Tenorio, Antonio %X

Filoviruses, amongst the most lethal of primate pathogens, have only been reported as natural infections in sub-Saharan Africa and the Philippines. Infections of bats with the ebolaviruses and marburgviruses do not appear to be associated with disease. Here we report identification in dead insectivorous bats of a genetically distinct filovirus, provisionally named Lloviu virus, after the site of detection, Cueva del Lloviu, in Spain.

%B PLoS pathogens %V 7 %P e1002304 %8 2011 Oct %G eng %0 Journal Article %J The Journal of clinical endocrinology and metabolism %D 2011 %T Modeling human endometrial decidualization from the interaction between proteome and secretome. %A Garrido-Gomez, Tamara %A Dominguez, Francisco %A Lopez, Juan Antonio %A Camafeita, Emilio %A Quiñonero, Alicia %A Martinez-Conejero, Jose Antonio %A Pellicer, Antonio %A Conesa, Ana %A Simon, Carlos %X

Decidualization of the human endometrium, which involves morphological and biochemical modifications of the endometrial stromal cells (ESCs), is a prerequisite for adequate trophoblast invasion and placenta formation.

%B The Journal of clinical endocrinology and metabolism %V 96 %P 706-16 %8 2011 Mar %G eng %0 Journal Article %J Nature biotechnology %D 2010 %T The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models. %A Shi, Leming %A Campbell, Gregory %A Jones, Wendell D %A Campagne, Fabien %A Wen, Zhining %A Walker, Stephen J %A Su, Zhenqiang %A Chu, Tzu-Ming %A Goodsaid, Federico M %A Pusztai, Lajos %A Shaughnessy, John D %A Oberthuer, André %A Thomas, Russell S %A Paules, Richard S %A Fielden, Mark %A Barlogie, Bart %A Chen, Weijie %A Du, Pan %A Fischer, Matthias %A Furlanello, Cesare %A Gallas, Brandon D %A Ge, Xijin %A Megherbi, Dalila B %A Symmans, W Fraser %A Wang, May D %A Zhang, John %A Bitter, Hans %A Brors, Benedikt %A Bushel, Pierre R %A Bylesjo, Max %A Chen, Minjun %A Cheng, Jie %A Cheng, Jing %A Chou, Jeff %A Davison, Timothy S %A Delorenzi, Mauro %A Deng, Youping %A Devanarayan, Viswanath %A Dix, David J %A Dopazo, Joaquin %A Dorff, Kevin C %A Elloumi, Fathi %A Fan, Jianqing %A Fan, Shicai %A Fan, Xiaohui %A Fang, Hong %A Gonzaludo, Nina %A Hess, Kenneth R %A Hong, Huixiao %A Huan, Jun %A Irizarry, Rafael A %A Judson, Richard %A Juraeva, Dilafruz %A Lababidi, Samir %A Lambert, Christophe G %A Li, Li %A Li, Yanen %A Li, Zhen %A Lin, Simon M %A Liu, Guozhen %A Lobenhofer, Edward K %A Luo, Jun %A Luo, Wen %A McCall, Matthew N %A Nikolsky, Yuri %A Pennello, Gene A %A Perkins, Roger G %A Philip, Reena %A Popovici, Vlad %A Price, Nathan D %A Qian, Feng %A Scherer, Andreas %A Shi, Tieliu %A Shi, Weiwei %A Sung, Jaeyun %A Thierry-Mieg, Danielle %A Thierry-Mieg, Jean %A Thodima, Venkata %A Trygg, Johan %A Vishnuvajjala, Lakshmi %A Wang, Sue Jane %A Wu, Jianping %A Wu, Yichao %A Xie, Qian %A Yousef, Waleed A %A Zhang, Liang %A Zhang, Xuegong %A Zhong, Sheng %A Zhou, Yiming %A Zhu, Sheng %A Arasappan, Dhivya %A Bao, Wenjun %A Lucas, Anne Bergstrom %A Berthold, Frank %A Brennan, Richard J %A 
Buness, Andreas %A Catalano, Jennifer G %A Chang, Chang %A Chen, Rong %A Cheng, Yiyu %A Cui, Jian %A Czika, Wendy %A Demichelis, Francesca %A Deng, Xutao %A Dosymbekov, Damir %A Eils, Roland %A Feng, Yang %A Fostel, Jennifer %A Fulmer-Smentek, Stephanie %A Fuscoe, James C %A Gatto, Laurent %A Ge, Weigong %A Goldstein, Darlene R %A Guo, Li %A Halbert, Donald N %A Han, Jing %A Harris, Stephen C %A Hatzis, Christos %A Herman, Damir %A Huang, Jianping %A Jensen, Roderick V %A Jiang, Rui %A Johnson, Charles D %A Jurman, Giuseppe %A Kahlert, Yvonne %A Khuder, Sadik A %A Kohl, Matthias %A Li, Jianying %A Li, Li %A Li, Menglong %A Li, Quan-Zhen %A Li, Shao %A Li, Zhiguang %A Liu, Jie %A Liu, Ying %A Liu, Zhichao %A Meng, Lu %A Madera, Manuel %A Martinez-Murillo, Francisco %A Medina, Ignacio %A Meehan, Joseph %A Miclaus, Kelci %A Moffitt, Richard A %A Montaner, David %A Mukherjee, Piali %A Mulligan, George J %A Neville, Padraic %A Nikolskaya, Tatiana %A Ning, Baitang %A Page, Grier P %A Parker, Joel %A Parry, R Mitchell %A Peng, Xuejun %A Peterson, Ron L %A Phan, John H %A Quanz, Brian %A Ren, Yi %A Riccadonna, Samantha %A Roter, Alan H %A Samuelson, Frank W %A Schumacher, Martin M %A Shambaugh, Joseph D %A Shi, Qiang %A Shippy, Richard %A Si, Shengzhu %A Smalter, Aaron %A Sotiriou, Christos %A Soukup, Mat %A Staedtler, Frank %A Steiner, Guido %A Stokes, Todd H %A Sun, Qinglan %A Tan, Pei-Yi %A Tang, Rong %A Tezak, Zivana %A Thorn, Brett %A Tsyganova, Marina %A Turpaz, Yaron %A Vega, Silvia C %A Visintainer, Roberto %A von Frese, Juergen %A Wang, Charles %A Wang, Eric %A Wang, Junwei %A Wang, Wei %A Westermann, Frank %A Willey, James C %A Woods, Matthew %A Wu, Shujian %A Xiao, Nianqing %A Xu, Joshua %A Xu, Lei %A Yang, Lun %A Zeng, Xiao %A Zhang, Jialu %A Zhang, Li %A Zhang, Min %A Zhao, Chen %A Puri, Raj K %A Scherf, Uwe %A Tong, Weida %A Wolfinger, Russell D %X

Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.

%B Nature biotechnology %V 28 %P 827-38 %8 2010 Aug %G eng %U http://www.nature.com/nbt/journal/v28/n8/full/nbt.1665.html %0 Journal Article %J BMC Bioinformatics %D 2008 %T Prediction of enzyme function by combining sequence similarity and protein interactions %A Espadaler, J. %A Eswar, N. %A Querol, E. %A Aviles, F. X. %A Sali, A. %A Marti-Renom, M. A. %A Oliva, B. %K Amino Acid *Software Structure-Activity Relationship Substrate Specificity/genetics %K Amino Acid Sequence/physiology Databases %K Automated Predictive Value of Tests Protein Interaction Mapping Proteins/analysis/metabolism Sequence Alignment Sequence Analysis %K Protein *Sequence Homology %K Protein Enzymes/analysis/*metabolism Fuzzy Logic Pattern Recognition %X BACKGROUND: A number of studies have used protein interaction data alone for protein function prediction. Here, we introduce a computational approach for annotation of enzymes, based on the observation that similar protein sequences are more likely to perform the same function if they share similar interacting partners. RESULTS: The method has been tested against the PSI-BLAST program using a set of 3,890 protein sequences for which interaction data were available. For protein sequences that align with at least 40% sequence identity to a known enzyme, the specificity of our method in predicting the first three EC digits increased from 80% to 90% at 80% coverage when compared to PSI-BLAST. CONCLUSION: Our method can also be used in proteins for which homologous sequences with known interacting partners can be detected. Thus, our method could increase by 10% the specificity of genome-wide enzyme predictions based on sequence matching by PSI-BLAST alone. 
%B BMC Bioinformatics %V 9 %P 249 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=18505562 %0 Journal Article %J Proc Natl Acad Sci U S A %D 2005 %T Detecting remotely related proteins by their interactions and sequence similarity %A Espadaler, J. %A Aragues, R. %A Eswar, N. %A Marti-Renom, M. A. %A Querol, E. %A Aviles, F. X. %A Sali, A. %A Oliva, B. %K Amino Acid %K Computational Biology Databases %K Molecular Protein Conformation Protein Folding Proteins/*genetics/*metabolism Proteomics/*methods *Sequence Homology %K Protein *Evolution %X The function of an uncharacterized protein is usually inferred either from its homology to, or its interactions with, characterized proteins. Here, we use both sequence similarity and protein interactions to identify relationships between remotely related protein sequences. We rely on the fact that homologous sequences share similar interactions, and, therefore, the set of interacting partners of the partners of a given protein is enriched by its homologs. The approach was benchmarked by assigning the fold and functional family to test sequences of known structure. Specifically, we relied on 1,434 proteins with known folds, as defined in the Structural Classification of Proteins (SCOP) database, and with known interacting partners, as defined in the Database of Interacting Proteins (DIP). For this subset, the specificity of fold assignment was increased from 54% for position-specific iterative BLAST to 75% for our approach, with a concomitant increase in sensitivity of a few percentage points. Similarly, the specificity of family assignment at the e-value threshold of 10^-8 was increased from 70% to 87%. The proposed method would be a useful tool for large-scale automated discovery of remote relationships between protein sequences, given its unique reliance on sequence similarity and protein-protein interactions. 
%B Proc Natl Acad Sci U S A %V 102 %P 7151-6 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15883372 %0 Journal Article %J J Comput Aided Mol Des %D 2001 %T Classification of protein disulphide-bridge topologies %A Mas, J. M. %A Aloy, P. %A Marti-Renom, M. A. %A Oliva, B. %A de Llorens, R. %A Aviles, F. X. %A Querol, E. %K Algorithms Computer Simulation Databases as Topic Disulfides/*chemistry Models %K Molecular Protein Structure %K Secondary Protein Structure %K Tertiary Proteins/*chemistry/*classification Software %X The preferential occurrence of certain disulphide-bridge topologies in proteins has prompted us to design a method and a program, KNOT-MATCH, for their classification. The program has been applied to a database of proteins with less than 65% homology and more than two disulphide bridges. We have investigated whether there are topological preferences that can be used to group proteins and whether these can be applied to gain insight into the structural or functional relationships among them. The classification has been performed by Density Search and Hierarchical Clustering Techniques, yielding thirteen main protein classes from the superimposition and clustering process. It is noteworthy that besides the disulphide bridges, regular secondary structures and loops frequently become correctly aligned. Although the lack of significant sequence similarity among some clustered proteins precludes the easy establishment of evolutionary relationships, the program permits us to identify important structural or functional residues upon the superimposition of two apparently unrelated protein structures. The derived classification can be very useful for finding relationships among proteins which would escape detection by current sequence or topology-based analytical algorithms. 
%B J Comput Aided Mol Des %V 15 %P 477-87 %G eng %U http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11394740