CHUHSING KATE HSIAO (蕭朱杏)



Research Outline      2014-12-18 12:15:52

The primary focus of my research is the development of statistical methodology from the Bayesian perspective, particularly for problems encountered in the fields of biology and medicine. My recent projects include genetic association studies using genetic markers and omics data, and the development of bioinformatics tools for the analysis of large-scale data sets.

We have developed new methodology for inference on gene clustering, gene-disease association, and gene-gene (GG) and gene-environment (GE) interaction. For example, we considered a Bayesian mixture model for genome-wide association studies (GWAS), in which we first estimated the proportion of associated markers with a Bayesian model and then selected markers based on Bayes factors (Wei et al. 2010). The corresponding R code (Bmix) is freely available for download online. We developed a gene selection and classification procedure based on a Bayesian mixture of generalized singular g-priors (Chien and Hsiao, 2013). We also conducted family-based association studies with haplotypes. To deal with the complexity of family data structure, haplotype phase determination, and the large number of parameters, we adopted an evolutionary concept to cluster haplotypes, developed a test based on the likelihood ratio test (Lee et al. 2011), and constructed a coding matrix to incorporate various sources of uncertainty (Huang et al. 2011). The corresponding programs, LRT-C and BRUCM, are freely downloadable as well. For GG and GE interaction, we used a Bayesian spatial multi-marker genetic random-effects model with Markov chain Monte Carlo methods to detect GG interaction, and a Bayesian generalized linear mixed-effects model to detect contextual GE interaction between individual-level genetic risks and group-level area environmental factors (Wang et al. 2013; Wang et al. 2013). For continuous data such as expression levels, we developed a regularized least squares support vector regression model for gene selection (Chen et al. 2009), together with free software (RLS) for the analysis. We are currently working on marker-set clustering and association testing, and plan to provide a flexible and efficient computational tool for such analyses.

My other interdisciplinary projects include probabilistic surveillance of influenza-like illness (ILI) with a spatio-temporal Bayesian hierarchical model, a national survey of myopia among school children, the association between the quality of endodontic treatment and systemic diseases, biomarker identification for female non-smoking lung cancer patients, the risks of air pollutants and temperature for coronary heart disease, schizophrenia genetic studies, and the maximum number of live births per donor in artificial insemination. Collaborating with experts from different disciplines on these projects has been a pleasure.