# Summer Research Projects Mathematics 2020

## MATH1

Alexander Fish

Title: The polynomial method in additive combinatorics

Description: The project focuses on the new polynomial method in (additive) combinatorics, which has resolved long-standing problems such as the Kakeya problem over finite fields, the cap set problem and the Erdős distance problem. We will study the method as it was introduced in the celebrated paper by Dvir, and will go over its applications such as the cap set problem and (if time permits) the Erdős distance problem. The method provides a new way of bounding from above the cardinality of a set A inside a vector space over a finite field which does not contain a certain algebraic (geometric) structure. In the original paper by Dvir, the set A does not contain a whole line in any direction. The aim of the project is to study the method and to try to attack problems of a similar flavour. For instance, it is unknown whether a set A in F_p^2 which does not contain an equilateral triangle must have cardinality O(p^{\alpha}) for some \alpha < 2.
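To make the cap set problem mentioned above concrete, the following is a minimal brute-force sketch (not the polynomial method itself). It assumes the standard definition: a cap set in F_3^n is a subset containing no three distinct collinear points, and in F_3^n three distinct points are collinear exactly when they sum to zero coordinatewise. The function names `is_cap_set` and `max_cap_size` are illustrative choices.

```python
from itertools import combinations, product


def is_cap_set(points):
    """Return True if no three distinct points of `points` lie on a line in F_3^n.

    In F_3^n, three distinct points x, y, z are collinear
    exactly when x + y + z = 0 (mod 3) in every coordinate.
    """
    for x, y, z in combinations(points, 3):
        if all((a + b + c) % 3 == 0 for a, b, c in zip(x, y, z)):
            return False
    return True


def max_cap_size(n):
    """Brute-force the largest cap set in F_3^n (only feasible for very small n)."""
    space = list(product(range(3), repeat=n))
    for k in range(len(space), 0, -1):
        if any(is_cap_set(subset) for subset in combinations(space, k)):
            return k
    return 0


print(max_cap_size(1))  # 2
print(max_cap_size(2))  # 4
```

Exhaustive search already becomes impractical for n = 3 (2^27 subsets), which is one reason general upper bounds of the kind the polynomial method provides are of interest.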

Time: Flexible for me. I can do either 18 Jan - 19 Feb or start even earlier.

Max number of students: preferably 2 students, but 1 student is also possible.

Prerequisites: Linear Algebra, some abstract algebra (a weak requirement). This project is suitable for students with a minimal background (after year 1).

## MATH2

Alexander Fish

Title: Glasner property for (semi)-group actions

Description: Assume that a group G acts on a compact metric space X. We say that the action has the Glasner property if for any infinite set A in X and for any eps > 0 there exists g in G such that gA is eps-dense in X. Examples of actions which have the Glasner property include the action of N on R/Z by multiplication, the action of SL_n(Z) on (R/Z)^n, and others. It follows from a compactness argument that if an action has the Glasner property then for every eps > 0 there exists k(eps) such that for every set A in X with at least k(eps) elements there exists g in G such that gA is eps-dense. In the case of the action of N on R/Z, Alon and Peres proved that k(eps) < eps^{-2-delta} for any delta > 0. It is also known that k(eps) must grow at least quadratically in 1/eps. In the project we will try to provide a quantitative version of the statement that the action of SL_n(Z) on (R/Z)^n has the Glasner property.
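As a small illustration of the definition for the action of N on R/Z, the sketch below searches for a multiplier n such that nA is eps-dense for a finite set A. It assumes that "eps-dense" means every point of the circle lies within distance at most eps of the set, which for a finite set is equivalent to the largest gap between neighbouring points being at most 2 eps. Exact rational arithmetic is used to avoid rounding issues; all function names are illustrative.

```python
from fractions import Fraction


def max_circular_gap(points):
    """Largest gap between neighbouring points of a finite subset of R/Z."""
    pts = sorted({p % 1 for p in points})
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(pts[0] + 1 - pts[-1])  # wrap-around gap
    return max(gaps)


def is_eps_dense(points, eps):
    """True if every point of R/Z is within distance eps of some point of the set."""
    points = list(points)
    return bool(points) and max_circular_gap(points) <= 2 * eps


def smallest_dense_dilation(A, eps, max_n):
    """Smallest n in 1..max_n with nA eps-dense in R/Z, or None if there is none."""
    for n in range(1, max_n + 1):
        if is_eps_dense([n * a for a in A], eps):
            return n
    return None


# Ten points clustered in [0, 9/100]; multiplying by 10 spreads them evenly.
A = [Fraction(k, 100) for k in range(10)]
print(smallest_dense_dilation(A, Fraction(1, 20), 50))  # 10
```

The example set is far from eps-dense itself, but its dilation by n = 10 is the set of all multiples of 1/10, whose gaps of length 1/10 are exactly 2 eps for eps = 1/20.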

Time: Flexible for me. I can do either 18 Jan - 19 Feb or start even earlier.

Max number of students: preferably 2 students, but 1 student is also possible.

Prerequisites: Linear Algebra, Analysis, Metric spaces (preferable, but not essential). This is suitable for students after the second year of studies.

## MATH3

Stephan Tillmann

Title: Circle Packings on Surfaces

Abstract: Circle packings are arrangements of circles in the plane, the sphere, or on a more general surface that have tangencies in a prescribed pattern. This project will focus on circle packings on negatively curved surfaces and their associated symmetric infinite circle packings in the plane. The aim is to understand how these packings vary as the geometry of a surface transitions from the negatively curved hyperbolic geometry to more general real projective geometries.

Availability: Anytime.

Max number of students: 6

Prerequisites: Essential: MATH2922 (or MATH2022), Desirable: MATH2921 (or MATH2021).

## MATH4

Jonathan Spreer

Title: Graph encoded manifolds

Description: It is quite easy to visualise orientable surfaces such as the sphere or the torus (the surface of a donut) embedded in three-dimensional space. Surfaces are two-dimensional manifolds. It is a challenging task to visualise manifolds of higher dimensions. One general - and perhaps surprising - way of achieving this is by representing a manifold by a graph with coloured edges. Such graph encoded manifolds, or gems, can always be drawn on a sheet of paper while containing all the information about the surface or manifold. While some of this information is very hard (or impossible) to access, some information can be read off the graph quite easily and other bits and pieces can be recovered by simple combinatorial rules. This project is about using these simple combinatorial rules to deduce interesting facts about manifolds, to construct large families of such gems satisfying some given properties (which is interesting for all kinds of reasons), to design a method to randomly generate such gems in certain settings (which is important for even more kinds of reasons), or to do more theoretical work.

Availability: 6 weeks in 2021.

Max number of students: 5

## MATH5

Wanchuang Zhu and Sally Cripps

Title: Sequential inference for large Bayesian networks.

Description: A Bayesian network is a powerful framework for modelling the hierarchical structure of multiple variables whose independencies are represented by an underlying directed acyclic graph (DAG). Classical inference for a Bayesian network includes structure learning and parameter learning, of which structure learning is the more challenging. The challenge originates from the discontinuity and huge size of the parameter space. Many methods have been proposed to alleviate the challenge, including structure MCMC, order MCMC and partition MCMC. However, the performance of these approaches is unsatisfactory in two respects: (a) the approaches are not robust with respect to the starting DAG structure; (b) the approaches are unable to explore the full parameter space, especially for large Bayesian networks. The goals of this project are: (a) to measure and evaluate the robustness of existing methods; (b) to reduce the size of the parameter space by using sequential inference techniques.
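To give a sense of how quickly the space of DAG structures grows, the sketch below counts labelled DAGs on n nodes using Robinson's recurrence, a(n) = sum over k = 1..n of (-1)^(k+1) C(n, k) 2^(k(n-k)) a(n-k) with a(0) = 1. This is only an illustration of the size of the search space, not one of the MCMC methods named above; the function name `count_dags` is an illustrative choice.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labelled directed acyclic graphs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


print([count_dags(n) for n in range(6)])  # [1, 1, 3, 25, 543, 29281]
print(len(str(count_dags(20))))           # number of decimal digits for 20 nodes
```

The count grows super-exponentially in n, so exhaustive enumeration of structures is infeasible even for networks of modest size.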

Available times:

Max number of students: 1

Prerequisites: (1) background in statistics, mathematics or computer science; (2) knowledge of Bayesian inference; (3) experience with R.

## MATH6

Richard Scalzo (DARE)

Title: Emulator Models for Accelerating Inverse Problem Solutions

Description: Inverse problems in geology and geophysics -- inferring the history and structure of a region of the Earth's crust from observations -- occur in a range of applications including mining, groundwater, and natural hazard assessment. These problems are by their nature ill-posed and uncertain, and adequately exploring and characterizing the space of possibilities consistent with the data involves repeated evaluations of numerical models for different geophysical sensors and their derivatives with respect to parameters. These models can be computationally intensive and/or black boxes, making it difficult to scale their use to high-dimensional problems. This project will investigate emulators, or proxies for the numerical likelihood, that can be learned to create adaptive, scalable methods for quantifying uncertainty in high-dimensional inverse problems. Possible emulator methods might include Gaussian processes, Bayesian neural networks, and/or probabilistic graphical models.

Max number of students: 2

Prerequisites: Programming in Python, including standard scientific computing libraries (numpy, scipy, matplotlib). Familiarity with mathematical foundations for machine learning (multivariable calculus and linear algebra). Familiarity with contemporary machine learning frameworks such as scikit-learn, Google Jax, PyTorch or Keras is desirable.

## MATH7

Simon Luo and Lamiae Azizi

Title: Improving Anomaly Detection with Transfer Learning

Description: Anomaly detection is a challenging problem in machine learning because there are very few training examples of the anomaly with which to fit a model. Often there are related datasets which can be used to improve the learning process. However, there are currently no good ways of combining models that have been trained on different datasets. We are interested in developing new transfer learning techniques to combine anomaly detection models that have been trained on different datasets. The research project involves a literature review of the techniques currently used and implementing the state-of-the-art techniques in Python. For more details please contact s.luo@sydney.edu.au.

Max number of students: 2

Prerequisites: The candidate should have experience implementing statistical machine learning models in Python (not just using packages).

## MATH8

Daniel Hauer

Title: An eigenvalue problem for the 1-Laplacian

Description: The 1-Laplace operator is a nonlinear differential operator which is often employed in image processing for smoothing images. But to take advantage of this operator, one first needs to establish some or all of its properties. In this summer research project we therefore want to study the spectrum of this operator. You will learn about the theory of functions of bounded variation and their differentiability properties, and about approximation schemes which lead to the eigenvalue problem for the 1-Laplace operator equipped with homogeneous Dirichlet boundary conditions. If time permits, we will try to derive an isoperimetric inequality providing lower bounds on the first and second eigenvalues of the 1-Laplace operator.

## MATH9

Nalini Joshi

Title: Cellular Automata

Summary: Cellular automata are mathematical models based on very simple rules, which have an ability to reproduce very complicated phenomena. (If you have played the "Game of Life" on a computer, then you have already seen automata with complicated behaviours.) This project is concerned with the mathematical analysis of their solutions. In particular, we will consider a family of cellular automata called parity filter rules, for which initial data are given on an infinite set.
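For readers who have not seen a cellular automaton written down, here is a minimal sketch of one update step of Conway's "Game of Life", the example mentioned above. It is not an implementation of the parity filter rules studied in the project. The grid is unbounded and the state is stored as a set of live cell coordinates; the function name `life_step` is an illustrative choice.

```python
from collections import Counter


def life_step(live):
    """Apply one Game of Life step to a set of live (x, y) cells on an unbounded grid."""
    # Count, for every cell, how many of its eight neighbours are alive.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is alive next step if it has 3 live neighbours,
    # or if it is alive now and has exactly 2 live neighbours.
    return {
        cell
        for cell, count in neighbour_counts.items()
        if count == 3 or (count == 2 and cell in live)
    }


blinker = {(0, 1), (1, 1), (2, 1)}
print(sorted(life_step(blinker)))                  # [(1, 0), (1, 1), (1, 2)]
print(life_step(life_step(blinker)) == blinker)    # True
```

The "blinker" pattern alternates between a horizontal and a vertical bar, so it has period 2.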

Available times: 14 December 2020 -- 30 January 2021

## MATH10

Ellis Patrick

Title: Data-intensive science to understand the molecular aetiology of disease.

Description: Biotechnological advances have made it possible to monitor the expression levels of thousands of genes and proteins simultaneously, promising exciting, ground-breaking discoveries in complex diseases. This project will focus on the application and/or development of statistical and machine learning methodology to analyse a high-dimensional biomedical experiment. Our lab works on projects spanning multiple diseases including melanoma, ovarian cancer, acute myeloid leukemia, Alzheimer's disease, multiple sclerosis and HIV. We also work with various high-throughput technologies including single-cell RNA-Seq, SWATH-MS, flow cytometry, CyTOF, CODEX imaging and imaging mass cytometry.

Availability: I am flexible with dates. But it sounds like 4 Jan – 19 Feb makes the most sense?

Max number of students: 4

Prerequisites: DATA2X02

## MATH11

Pengyi Yang

Title: Interactive visualisation of trans-omic data

Summary: Mass spectrometry (MS) and next-generation sequencing (NGS) have become the methods of choice for high-throughput profiling of the global proteome, phosphoproteome, transcriptome, and epigenome of cell systems. Data visualisation and summarisation are critical for making sense of these large-scale multilayered omic (i.e. trans-omic) datasets. We hypothesise that these techniques are an essential step in understanding complex diseases and biological systems. The aim of this project is to develop an interactive data visualisation tool using R and Shiny. You will be working with state-of-the-art multi-omic datasets generated from various cellular systems with relevance to metabolic disease and development. Methods you will learn include essential omic data analytics, R programming, and Shiny application development, which are highly valued skills in omic sciences and data sciences. Furthermore, this project will provide a unique opportunity for developing computational methods for discovery and comprehensive understanding of cell systems, their decision-making process, and their malfunction in disease states.

## MATH12

Pengyi Yang

Title: Clustering analysis for differential combinatorial binding of transcription factors in embryonic stem cells

Summary: Transcription factors (TFs), chromatin remodellers (CRs), and transcription co-factors (TCs) are key regulators in governing cell identities and cell-fate decisions. We have previously developed PAD (http://pad2.maths.usyd.edu.au/) for integrative clustering analysis of a large collection of ChIP-seq data from a compendium of more than 100 TFs, CRs and TCs generated from embryonic stem cells (ESCs). The aim of this project is to explore the differential combinatorial binding of TFs, CRs and TCs at different functional genomic regions so as to identify cooperation of TFs, CRs, and TCs at different genomic regions in controlling transcription of genes in ESCs. You will learn the basics of clustering and interactive data exploration in this project.

## MATH13

Garth Tarr

Title: Modelling consumer data from the red meat industry

Summary: The beef industry in Australia is worth $13 billion annually and the sheep meat industry is worth another $4 billion. A key question concerning the red meat industry is the ability to predict the eating quality of cuts of meat. Doing this well has major financial implications for the industry. This project would focus on the statistical issues associated with analysing consumer trial data to predict meat eating quality. Examples of possible projects include: the analysis of consumer data, which often contains many outliers; determining the relative importance of eating quality factors such as flavour, tenderness and juiciness; looking at the importance of “link product” as a common starter across consumers; and evaluating new objective grading techniques.

Times: Jan-Feb

Max number of students: 2

Prerequisites: DATA2002 or DATA2902

## MATH14

Jean Yang

Title: Methods towards precision medicine

Summary: Over the past decade, new and more powerful -omic tools have been applied to the study of complex diseases such as cancer and have generated a myriad of complex data. However, our general ability to analyse this data lags far behind our ability to produce it. This project aims to develop computational methods that help identify disease pathways and deliver better prediction of outcomes. This project could also investigate whether it is possible to establish patient- or sample-specific accuracy by integrating public repositories of multi-omics data.

Times: Dec - Jan, Late Jan-Feb, Feb-Mar

Max number of students: 3

Prerequisites: DATA2002

## MATH15

Eduardo Altmann

Title: Time-series analysis of Twitter hashtags

Summary: Social media plays an increasingly important role in our society and economy. Data from social media can give insights into how the interests and opinions of users change over time. In this project we will look at how often individual hashtags were used on Twitter. We will then apply time-series analysis methods and simple mathematical models to describe the observations. We are particularly interested in hashtags related to the Australian bushfire season in 2019/2020 and how it impacted the overall discussion on climate change. Our goal is to describe how different hashtags interact with each other and how these interactions contribute to some hashtags showing a rapid increase in interest across the population. Models from complex systems and mathematical ecology will be considered. Coding is required (preferably in Python).

Times: Jan-Feb

Max number of students: 2

## MATH16

Robert Marangell

Title: Adding spatio-temporal convection to classical Snowball Earth models

Summary: This project will look at adding spatial diffusion to the classical ice-albedo feedback models of Budyko, Sellers and Jormungand. In particular we will look at a model by North, and another by Wadiasih, which examine the effects of incorporating heat convection via diffusion across latitudes on the location and temperature (and stability) of polar ice caps.
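For orientation, one common way of writing Budyko's energy balance model and a diffusive variant of the kind associated with North is sketched below. The notation is an assumption taken from the general literature rather than from this listing: T(y, t) is the annual mean surface temperature at y = sin(latitude), Q the mean insolation, s(y) its latitudinal distribution, α the albedo (depending on the ice line η), A + BT the outgoing longwave radiation, R a heat capacity, and C and D transport coefficients.

$$
\begin{aligned}
\text{Budyko:}\quad
R\,\frac{\partial T}{\partial t}
  &= Q\,s(y)\,\bigl(1-\alpha(y,\eta)\bigr) - (A + B\,T) - C\,\bigl(T-\overline{T}\bigr),
  \qquad \overline{T}(t)=\int_0^1 T(y,t)\,dy,\\[4pt]
\text{diffusive:}\quad
R\,\frac{\partial T}{\partial t}
  &= Q\,s(y)\,\bigl(1-\alpha(y,\eta)\bigr) - (A + B\,T)
     + D\,\frac{\partial}{\partial y}\!\left[(1-y^{2})\,\frac{\partial T}{\partial y}\right].
\end{aligned}
$$

In the first equation heat transport relaxes the temperature towards its global mean, while in the second it is modelled by diffusion across latitudes; the precise formulations studied in the project may differ.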

Times: flexible

Max number of students: 1

Prerequisites: Background study in the Budyko and Sellers ice-albedo feedback models.

## MATH17

Holger Dullin and Peter Tuthill

Title: Beyond the Dzhanibekov effect

Description: The Dzhanibekov effect (www.youtube.com/watch?v=1x5UiwEEvpQ) beautifully illustrates the instability of the rotation of a rigid body about its middle principal axis. What is not part of the instability theorem is the observation, made in space by the Russian cosmonaut Dzhanibekov, that the body flips orientation. This project is about the design of a gadget that would flip not by 180 degrees, but by possibly 120 or 90 or ... degrees. The design to be found is a rigid body with some rotors attached in a particular way. The project will involve deriving the equations of motion of this gadget and simulating it on the computer.
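The instability about the middle axis can be seen in a short simulation of Euler's equations for a free rigid body, I1 ω1' = (I2 - I3) ω2 ω3 and cyclic permutations. The sketch below covers only the plain rigid body without rotors, so it is a starting point rather than the gadget described above. The moments of inertia (1, 2, 3), the initial conditions, the step size and the hand-written Runge-Kutta integrator are all illustrative assumptions.

```python
def euler_rhs(w, I):
    """Right-hand side of Euler's equations for body-frame angular velocity w."""
    w1, w2, w3 = w
    I1, I2, I3 = I
    return (
        (I2 - I3) * w2 * w3 / I1,
        (I3 - I1) * w3 * w1 / I2,
        (I1 - I2) * w1 * w2 / I3,
    )


def rk4_step(w, I, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(wi + h * ki for wi, ki in zip(w, k))

    k1 = euler_rhs(w, I)
    k2 = euler_rhs(shifted(k1, dt / 2), I)
    k3 = euler_rhs(shifted(k2, dt / 2), I)
    k4 = euler_rhs(shifted(k3, dt), I)
    return tuple(
        wi + dt / 6 * (a + 2 * b + 2 * c + d)
        for wi, a, b, c, d in zip(w, k1, k2, k3, k4)
    )


def simulate(w0, I, dt=0.01, steps=5000):
    """Return the list of angular velocities at times 0, dt, ..., steps * dt."""
    trajectory = [tuple(w0)]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], I, dt))
    return trajectory


def kinetic_energy(w, I):
    return 0.5 * sum(Ii * wi * wi for Ii, wi in zip(I, w))


I = (1.0, 2.0, 3.0)                          # axis 2 is the middle principal axis
middle = simulate((0.01, 1.0, 0.01), I)      # spin mostly about the middle axis
largest = simulate((0.01, 0.01, 1.0), I)     # spin mostly about the axis of largest inertia

print(min(w[1] for w in middle))    # becomes negative: the spin component flips sign
print(min(w[2] for w in largest))   # stays close to 1: this rotation is stable
```

In the first run the component of angular velocity along the middle axis repeatedly reverses sign, which is the flip seen in the video, while in the second run the small perturbation stays small.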

Times: flexible

Max number of students: 2

Prerequisites: Dynamical Systems, Lagrangian and Hamiltonian Dynamics. Some experience in Matlab or Mathematica would be good.

## MATH18

Holger Dullin and Martijn de Sterke

Title: Lightsail Dynamics

Description: The Breakthrough Starshot project aims to build tiny lightsail-driven spacecraft that can be accelerated to 20% of the speed of light. The dynamics of such a lightsail can be approximated by the dynamics of a symmetric rigid body without a fixed point. The aim of this project is to investigate whether a rotation of the lightsail about its axis of symmetry can provide attitude stability.

Times: flexible

Max number of students: 3

Prerequisites: Dynamical Systems, Lagrangian and Hamiltonian Dynamics. Some experience in Matlab or Mathematica would be good.