Creatively integrating statistical human genetics and bioinformatics to elucidate disease mechanisms and facilitate new therapeutics

School of Public Health (Shenzhen), Sun Yat-sen University

Most human diseases are influenced by genomic variation. Elucidating the relationship between genomic sequence and biological traits and diseases, together with the interplay between genes and the environment, is therefore a powerful approach to understanding disease mechanisms and discovering new strategies for disease prevention and treatment. Advances in domestic and international biobanks and in sequencing technologies have provided unprecedented opportunities to identify disease-causing genes and reconstruct biological networks. Achieving these goals, however, still requires new breakthroughs in both bioinformatics methods and mechanistic research.

My group addresses a central scientific question: how do genomic sequences affect the onset and progression of disease? We conduct research in three major areas:

1. Systems epidemiological research based on medical big data

Integrating real-world medical big data, including electronic health records, genetic data, molecular tests, medical imaging, and multi-omics experimental data, we apply and develop bioinformatics, statistical, and machine learning algorithms to systematically analyze the genetic architecture and biological networks underlying complex diseases and traits. We also apply genetics-based causal inference algorithms to infer genotype-phenotype relationships and to predict new therapeutics. Our disease models include adult cerebrovascular disease, childhood developmental disorders, and COVID-19.

2. Genetic and evolutionary studies of human traits and diseases

Using large-scale genomic datasets from modern and ancient human populations, provided by international and national collaborating institutions, we apply computational methods such as genome-wide selection scans, archaic DNA introgression analysis, polygenic adaptation analysis, and genotype-phenotype association analysis to explore the spatiotemporal evolutionary patterns of disease-related genes and their interplay with historical environments.

3. Bioinformatics algorithm development

We focus on resolving complex structural variation in human genomes and on designing research frameworks that reduce the substantial cost of large-scale genome sequencing studies. To this end, we develop algorithms for detecting and genotyping variants from medium- to low-depth genome sequencing data.