Model-Based Classification with High-Dimensional Data

Geoff McLachlan
Department of Mathematics and Institute for Molecular Bioscience
University of Queensland

Finite mixture models are increasingly used to model the distributions of a wide variety of random phenomena. In this talk, we consider the use of mixture distributions to provide a model-based approach to classification with high-dimensional data. The need to classify such data arises in many applications, ranging from biology to image processing. For the unsupervised classification problem (cluster analysis), we consider the use of mixtures of factor analyzers. This approach enables a normal mixture model to be fitted to data that have high dimension relative to the number of data points to be clustered. The number of free parameters is controlled through the dimension of the latent factor space. Working in this reduced space allows an interpolation in model complexity from isotropic to full covariance structures without any restrictions. We also consider the case of supervised classification (discriminant analysis). The methodology is illustrated by applications in the context of cancer diagnosis and treatment. One problem concerns classifying a relatively small number of tumour tissue samples, each containing expression data on a very large number (possibly thousands) of genes from microarray experiments.
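
As a rough illustration of how the dimension of the latent factor space controls the number of free parameters, the sketch below assumes the usual factor-analytic form for each component covariance matrix, Sigma_i = B_i B_i^T + D_i, with B_i a p x q loading matrix and D_i diagonal. The function names and the values p = 1000 and q = 5 are hypothetical and are not taken from the talk.

```python
def full_cov_params(p: int) -> int:
    """Free parameters in one unrestricted p x p covariance matrix."""
    return p * (p + 1) // 2


def factor_cov_params(p: int, q: int) -> int:
    """Free parameters in one covariance matrix of the assumed form B B^T + D.

    B contributes p*q loadings and the diagonal D contributes p variances;
    q*(q-1)/2 is subtracted because B is identified only up to rotation.
    """
    return p * q + p - q * (q - 1) // 2


if __name__ == "__main__":
    p, q = 1000, 5  # hypothetical: 1000 genes, 5 latent factors
    print(full_cov_params(p))       # 500500
    print(factor_cov_params(p, q))  # 5990
```

Under these assumptions, each component needs 5990 covariance parameters rather than 500500, which is what makes fitting feasible when the number of samples is small relative to the number of variables.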