Scientists use machine learning to link enhancer genetic variation across mammals to complex phenotypes
2023.05.11 Andreas R. Pfenning of Carnegie Mellon University (USA) and colleagues used machine learning to link enhancer genetic variation across mammals to complex phenotypes. The study was published in the international journal Science on April 28, 2023.
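The researchers developed the Tissue-Aware Conservation Inference Toolkit (TACIT), which uses predictions from machine learning models trained on specific tissues to associate candidate enhancers with species' phenotypes. Applying TACIT to motor cortex and parvalbumin-positive interneuron enhancers and to neurological phenotypes, they found dozens of enhancer-phenotype associations, including brain size-associated enhancers that interact with genes implicated in microcephaly or macrocephaly. TACIT provides a foundation for identifying enhancers associated with the evolution of any convergently evolved phenotype in any large group of species with aligned genomes.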
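As background, protein-coding differences between species often fail to explain phenotypic diversity, which suggests the involvement of genomic elements that regulate gene expression, such as enhancers. Identifying associations between enhancers and phenotypes is challenging because enhancer activity can be tissue-dependent, and because enhancers can be functionally conserved despite low sequence conservation.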
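To make the idea of an enhancer-phenotype association concrete, here is a minimal Python sketch. It is not the TACIT implementation: it assumes a model has already produced a predicted activity score for one candidate enhancer in each species, and it tests whether those scores track a phenotype using a plain Pearson correlation with a permutation p-value. The function names, the toy numbers, and the choice of statistic are all illustrative assumptions. The sketch also ignores phylogenetic relatedness between species, which a real cross-species analysis would need to account for.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def permutation_association(pred_activity, phenotype, n_perm=2000, seed=0):
    """Return (r, p) for one enhancer.

    r is the observed correlation between predicted enhancer activity and the
    phenotype across species; p is a two-sided permutation p-value obtained by
    shuffling phenotype values among species. Assumption: species are treated
    as independent, i.e. no phylogenetic correction is applied.
    """
    rng = random.Random(seed)
    observed = pearson(pred_activity, phenotype)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(pred_activity, shuffled)) >= abs(observed):
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p = (hits + 1) / (n_perm + 1)
    return observed, p


# Hypothetical toy data: one value per species, invented for illustration only.
predicted = [0.10, 0.20, 0.35, 0.50, 0.65, 0.80, 0.90]  # predicted enhancer activity
brain_size = [0.5, 0.9, 1.4, 2.1, 2.4, 3.2, 3.5]        # phenotype, arbitrary units

r, p = permutation_association(predicted, brain_size)
print(f"r = {r:.3f}, permutation p = {p:.4f}")
```

Working from predicted activity rather than raw sequence similarity reflects the point made above: an enhancer can keep its function in a tissue even when its sequence is poorly conserved.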
Appendix: English original
Title: Relating enhancer genetic variation across mammals to complex phenotypes using machine learning
Author: Irene M. Kaplow, Alyssa J. Lawler, Daniel E. Schäffer, Chaitanya Srinivasan, Heather H. Sestili, Morgan E. Wirthlin, BaDoi N. Phan, Kavya Prasad, Ashley R. Brown, Xiaomeng Zhang, Kathleen Foley, Diane P. Genereux, Zoonomia Consortium**, Elinor K. Karlsson, Kerstin Lindblad-Toh, Wynn K. Meyer, Andreas R. Pfenning, Gregory Andrews, Joel C. Armstrong, Matteo Bianchi, Bruce W. Birren, Kevin R. Bredemeyer, Ana M. Breit, Matthew J. Christmas, Hiram Clawson, Joana Damas, Federica Di Palma, Mark Diekhans, Michael X. Dong, Eduardo Eizirik, Kaili Fan, Cornelia Fanter, Nicole M. Foley, Karin Forsberg-Nilsson, Carlos J. Garcia, John Gatesy, Steven Gazal, Diane P. Genereux, Linda Goodman, Jenna Grimshaw, Michaela K. Halsey, Andrew J. Harris, Glenn Hickey, Michael Hiller, Allyson G. Hindle, Robert M. Hubley, Graham M. Hughes, Jeremy Johnson, David Juan, Irene M. Kaplow, Elinor K. Karlsson, Kathleen C. Keough, Bogdan Kirilenko, Klaus-Peter Koepfli, Jennifer M. Korstian, Amanda Kowalczyk, Sergey V. Kozyrev, Alyssa J. Lawler, Colleen Lawless, Thomas Lehmann, Danielle L. Levesque, Harris A. Lewin, Xue Li, Abigail Lind, Kerstin Lindblad-Toh, Ava Mackay-Smith, Voichita D. Marinescu, Tomas Marques-Bonet, Victor C. Mason, Jennifer R. S. Meadows, Wynn K. Meyer, Jill E. Moore, Lucas R. Moreira, Diana D. Moreno-Santillan, Kathleen M. Morrill, Gerard Muntané, William J. Murphy, Arcadi Navarro, Martin Nweeia, Sylvia Ortmann, Austin Osmanski, Benedict Paten, Nicole S. Paulat, Andreas R. Pfenning, BaDoi N. Phan, Katherine S. Pollard, Henry E. Pratt, David A. Ray, Steven K. Reilly, Jeb R. Rosen, Irina Ruf, Louise Ryan, Oliver A. Ryder, Pardis C. Sabeti, Daniel E. Schäffer, Aitor Serres, Beth Shapiro, Arian F. A. Smit, Mark Springer, Chaitanya Srinivasan, Cynthia Steiner, Jessica M. Storer, Kevin A. M. Sullivan, Patrick F. Sullivan, Elisabeth Sundström, Megan A. Supple, Ross Swofford, Joy-El Talbot, Emma Teeling, Jason Turner-Maier, Alejandro Valenzuela, Franziska Wagner, Ola Wallerman, Chao Wang, Juehan Wang, Zhiping Weng, Aryn P. Wilder, Morgan E. Wirthlin, James R. Xue, Xiaomeng Zhang
Issue&Volume: 2023-04-28
Abstract: Protein-coding differences between species often fail to explain phenotypic diversity, suggesting the involvement of genomic elements that regulate gene expression such as enhancers. Identifying associations between enhancers and phenotypes is challenging because enhancer activity can be tissue-dependent and functionally conserved despite low sequence conservation. We developed the Tissue-Aware Conservation Inference Toolkit (TACIT) to associate candidate enhancers with species’ phenotypes using predictions from machine learning models trained on specific tissues. Applying TACIT to associate motor cortex and parvalbumin-positive interneuron enhancers with neurological phenotypes revealed dozens of enhancer–phenotype associations, including brain size–associated enhancers that interact with genes implicated in microcephaly or macrocephaly. TACIT provides a foundation for identifying enhancers associated with the evolution of any convergently evolved phenotype in any large group of species with aligned genomes.
DOI: abm7993