Statistical Bioinformatics Group

Combined expertise in bioinformatics and interdisciplinary research

Prof. Jean Yang

Professor in Statistics, NHMRC CDF Fellow. School of Mathematics and Statistics, University of Sydney

Dr. John Ormerod

Senior Lecturer. School of Mathematics and Statistics, University of Sydney

Dr. Pengyi Yang

Lecturer and DECRA Fellow. School of Mathematics and Statistics, University of Sydney

Group Members

Dr. Emi Tanaka

Lecturer and Research Fellow. School of Mathematics and Statistics, University of Sydney

A/Prof. Samuel Mueller

Associate Professor. School of Mathematics and Statistics, University of Sydney

Dr. Dario Strbenac

Research Associate. School of Mathematics and Statistics, University of Sydney

Shila Ghazanfar

PhD Student and Postgraduate Teaching Fellow. University of Sydney

Kevin Wang

PhD Student and Postgraduate Teaching Fellow. University of Sydney

Sarah Romanes

PhD Student and Postgraduate Teaching Fellow. University of Sydney

Weichang Yu

PhD Student. University of Sydney

Other Members

Mathematics and Statistics
  • Lamiae Azizi, Lecturer
  • Mark Greenaway, PhD Student
  • Andy Wang, PhD Student
  • Yingxin Lin, Honours
  • Ellis Patrick, Postdoctoral Research Fellow

Judith and David Coffey Life Lab
  • Alistair Senior, Postdoctoral Research Fellow
  • Fabian Held, Postdoctoral Research Fellow
  • Fatemeh Vafaee, Postdoctoral Research Fellow
  • Lake-Ee Quek, Postdoctoral Research Fellow
  • Edward Hancock, Postdoctoral Research Fellow

Metabolic Cybernetics Lab
  • Rima Chaudhuri, Postdoctoral Research Fellow
  • Vinita Deshpande, PhD Student
  • Tom Geddes, PhD Student

Research Areas

    Methodology Development
  • Approximate Bayesian inference
  • Complex data modelling and selection
  • Analysis of multi-platform omics data
  • Statistical machine learning and predictive models
  • Network-based inference
  • Interactive visualisation via Shiny

    Application to omics
  • RNA-Seq data analysis
  • ChIP-Seq data analysis
  • miRNA profiling data analysis
  • Mass spectrometry (MS)-based data analysis
  • Post-translational modification analysis
  • Exome-Seq and Cage-Seq data analysis
  • Nanostring data analysis
  • Microbiome data analysis
  • fMRI data analysis
  • Flow cytometry data analysis
  • etc.

Current projects on offer

  • PhD project: Machine learning application for trans-omics
  • Description: A PhD position is available in developing and applying computational and machine learning models for multilayered trans-omics data sets generated by state-of-the-art mass spectrometers (MS) and next-generation sequencers (NGS). Our research project, funded by the Australian Research Council (ARC), aims to develop novel machine learning algorithms to analyse and integrate large-scale MS-based omics data with ultra-fast NGS-based omics data generated from complex biological systems. Characterising signalling cascades, transcriptional networks, and translational protein networks, together with their cross-talk, is critical for a comprehensive understanding of complex biological systems. Our large-scale multilayered trans-omics data, generated from state-of-the-art technological platforms, provide a unique opportunity to uncover the novel biology and molecular mechanisms that are critical for treating complex diseases and for personalised medicine. In this project, we aim to develop novel machine learning methods that are capable of extracting key patterns from each omic layer and integrating such information across multiple omic layers.
    Contact: Dr. Pengyi Yang to discuss and/or apply.

  • Honours project: Classification and statistical networks
  • Description: Classical approaches to classification are primarily based on single features that exhibit a difference in effect size between classes. In omics data, this is equivalent to finding differential expression of genes or proteins between different treatment classes. Recently, network-based approaches utilising interaction information between genes have emerged, and our recent work (Barter et al., 2014) further reveals that simple network-based methods are able to classify alternate subsets of patients compared to gene-based approaches. This suggests that next-generation methods of gene expression signature modelling may benefit from harnessing data from external networks. This project will further explore the strengths and weaknesses of utilising statistical networks as features in classification. The project will also extend Barter et al. (2014) by examining robust networks obtained from external databases or complementary datasets and evaluating their effect in a classification (prognostic) setting.
    Contact: Prof. Jean Yang to discuss and/or apply.

  • Honours project: Bayesian approaches to Differential Distribution
  • Description: The distribution of gene expression levels is potentially informative when trying to distinguish between healthy samples and diseased samples. Traditionally, this has been done via a hypothesis testing approach that tests for differences in the mean gene expression levels between healthy samples and diseased samples, which is called differential expression. In this project we will perform an analogous Bayesian test for differences across the whole distribution of gene expression levels between the two states. A multiple testing approach will be developed to take false discoveries into account. This work will be motivated by real gene expression data from melanoma patients, where it is hoped that this new approach will be able to uncover new biomarkers for the disease.
    Contact: Dr. John Ormerod to discuss and/or apply.

  • Honours project: Methods towards personalised medicine
  • Description: Over the past decade, new and more powerful genomic tools have been applied to the study of complex diseases such as cancer and have generated a myriad of complex data. However, our general ability to analyse these data lags far behind our ability to produce them. This project aims to develop statistical methods that deliver better prediction of response to drug therapy. In particular, this project investigates whether it is possible to establish a patient- or sample-specific network (matrix) by integrating public repository data with gene expression data.
    Contact: Prof. Jean Yang to discuss and/or apply.

  • Honours project: Dimension reduction in resting state fMRI data
  • Anatomical, functional and effective networks within the brain are currently being elucidated at fine temporal and spatial resolution using magnetic resonance imaging, including functional MRI (fMRI). The concepts behind local region clustering, such as superpixels, are becoming increasingly popular in computer vision applications, data visualisation and dimension reduction strategies. This project involves exploring ideas and models for segmenting fMRI data by borrowing information across multiple samples. Specific applications of this information sharing may be to improve the identification of biologically interesting features or to improve sample classification in large-p, small-n datasets.
    Contact: Prof. Jean Yang or Dr. John Ormerod to discuss and/or apply.

  • Honours project: Embryonic stem cell (ESC)-specific pathway identification and annotation using multilayered omics data and statistical learning
  • While all cells from a given organism have the same DNA sequence that codes for the same genes, different cell types of that organism only have a subset of genes “turned on”. Genes are commonly annotated into pathways for summarising their collective effect in the biological systems. One of the main drawbacks in current pathway annotation is that they are NOT cell type-specific. We propose to identify and curate cell type-specific pathways for embryonic stem cells (ESCs) using our multi-layered omics data. The key assumption is that genes within a pathway should have correlated expression profile changes when perturbed.
    We have collected ESC differentiation data profiled in a time course at both the proteome and transcriptome levels. Following the above assumption, we aim to (1) identify pathways that are regulated specifically in ESC differentiation; and (2) curate ESC-specific pathways using statistical learning. This project will expose the honours student to the development and application of cutting-edge statistical learning methods in state-of-the-art biomolecular applications. It sits at the heart of interdisciplinary research.
    Contact: Dr. Pengyi Yang to discuss and/or apply.

  • Honours project: Is integrative-omics the new currency for solutions to complex diseases? Exploring correlations in high-dimensional data
  • With the surge in large volumes of high-throughput biological data being generated, more researchers are looking to integrate data of different types to inform hypotheses. For example, in complex metabolic diseases such as T2D and obesity, it is crucial to interrogate multiple data types to gain a comprehensive picture of the system defects, which may eventually lead to the identification of T2D or obesity markers. In this project, we aim to apply multivariate statistical approaches to integrate the data and build better predictors from multiple data sources. We will explore ways of weighting the relatively sparse proteomics data with information borrowed from the transcriptomics data. This involves exploring or developing methods to correlate multiple high-dimensional datasets to identify common and differentiating patterns.
    Contact: Prof. Jean Yang to discuss and/or apply.



  • Prof. David James, CPC, USyd
  • Dr. Sean Humphrey, CPC, USyd
  • Dr. Ben Parker, CPC, USyd
  • Dr. James Burchfield, CPC, USyd
  • Dr. Daniel Fazakerley, CPC, USyd
  • Dr. Guang Yang, CPC, USyd
  • Prof. Graham Mann, Westmead
  • Dr. Sarah-Jane Schramm, Westmead
  • Dr. Serigne Lo, Westmead
  • Dr. Raja Jothi, NIH, USA
  • Dr. Andrew Oldfield, NIH, USA
  • Dr. Sethikumar Cinghu, NIH, USA
  • Dr. Amanda Conway, NIH, USA
  • Dr. Justin Kosak, NIH, USA
  • Dr. Joshua Ho, VCCRI
  • Prof. Zeguang Han, SJTU, China
  • Dr. Yi Shi, SJTU, China
  • Dr. Xianbin Su, SJTU, China
  • Dr. Xin Zou, SJTU, China

Contact

    Level 8, Carslaw Building
    Judith and David Coffey Life Lab, Charles Perkins Centre

    School of Mathematics & Statistics
    Faculty of Science
    The University of Sydney
    NSW, 2006