Uri Keich

Associate Professor
School of Mathematics and Statistics
University of Sydney, NSW 2006, Australia

Research Interests

Statistical methods for computational biology, with emphasis on false discovery analysis in multiple hypothesis testing. My current focus is on tandem mass spectrometry (MS/MS) proteomics, which introduced me to competition-based approaches to multiple testing. This topic has attracted significant interest in the statistics and machine learning communities, particularly through the knockoff filter framework developed by Barber and Candès.

Other applications include genomics (sequence alignment, motif finding, DNA replication origins). I also work on developing computationally efficient algorithms for statistical significance estimation.

Leadership Roles

Faculty of Science Academic Lead for AI and Assessment (0.4 FTE)
University of Sydney
  • Working collaboratively with Associate Heads Education across eight schools, Honours coordinators, and faculty staff
  • Developing policies and guidelines for assessments of Honours projects in the age of generative AI
  • Devised a relational database approach to curriculum assessment mapping that maps unit assessments to learning outcomes for majors, programs, and streams
  • The approach will form the basis for program-level assessment redesign in Science and was highly commended by the Pro-Vice-Chancellor (Learning & Teaching), Prof. Adam Bridgeman
Deputy Head of School
School of Mathematics and Statistics, University of Sydney. Role concluded early upon appointment to the faculty-wide Academic Lead position.
  • Continued leading the school's engagement with generative AI through strategic presentations and by organizing meetings on the implications for assessment policy
  • Advised colleagues on adapting teaching approaches to the AI landscape
  • Drafted school strategic vision on AI integration into the teaching curriculum
Statistics Honours Coordinator
University of Sydney

Professional Experience

Associate Professor
School of Mathematics and Statistics, University of Sydney
Senior Lecturer
School of Mathematics and Statistics, University of Sydney
Assistant Professor
Department of Computer Science, Cornell University
Project Scientist
Department of Computer Science and Engineering, UC San Diego
Assistant Professor
Department of Mathematics, UC Riverside
Von Karman Instructor
Applied Mathematics Department, California Institute of Technology

Honors and Awards

2024 Faculty of Science Learning & Teaching Award for Teaching and Learning Excellence (Individual)
2024 Student-initiated Faculty of Science Teacher/Unit of Study Commendation: "Taught exceptionally challenging content in an intuitive and engaging manner. Learnt more under his instruction than I have in entire semesters in the past."
2024 Sydney University Postgraduate Representative Association Supervisor of the Year Award
2015 Best Paper Award, RECOMB 2015
2007–2009 NSF CAREER Award
1997 Wilhelm T. Magnus Memorial Prize for Significant Contributions to the Mathematical Sciences, Courant Institute, NYU
1994–1995 Alfred P. Sloan Doctoral Dissertation Fellowship
1990 Wolf Foundation Prize for M.Sc.

Publications

2025 Freestone J., Noble WS., Keich U. A semi-supervised framework for diverse multiple hypothesis testing scenarios. Submitted.
2025 Solivais AJ., Boekweg H., Smith LM., Noble WS., Shortreed MR., Payne SH., Keich U. Improved detection of differentially abundant proteins through FDR-control of peptide-identity-propagation. Journal of Proteome Research, 24(9), 4437–4449.
2025 Freestone J., Käll L., Noble WS., Keich U. How to train a post-processor for tandem mass spectrometry proteomics database search while maintaining control of the false discovery rate. Journal of Proteome Research, 24(5), 2266–2279.
2025 Wen B., Freestone J., Riffle M., MacCoss MJ., Noble WS., Keich U. Assessment of false discovery rate control in tandem mass spectrometry analysis using entrapment. Nature Methods, 22(7), 1454–1463.
2024 Lu Y., Noble WS., Keich U. A BLAST from the past: revisiting blastp's E-value. Bioinformatics, 40(12), btae729.
2024 Freestone J., Käll L., Noble WS., Keich U. Semi-supervised Learning While Controlling the FDR with an Application to Tandem Mass Spectrometry Analysis. LNCS (RECOMB 2024), 14758, 448–453.
2024 Freestone J., Noble WS., Keich U. Analysis of tandem mass spectrometry data with CONGA: Combining Open and Narrow searches with Group-wise Analysis. Journal of Proteome Research, 23(6), 1894–1906.
2024 Freestone J., Noble WS., Keich U. Re-investigating the correctness of decoy-based false discovery rate control in proteomics tandem mass spectrometry. Journal of Proteome Research, 23(6), 1907–1914.
2024 Lin A., See D., Fondrie WE., Keich U., Noble WS. Target-decoy false discovery rate estimation using Crema. Proteomics, 24(8).
2023 Ebadi A., Luo D., Freestone J., Noble WS., Keich U. Bounding the FDP in competition-based control of the FDR. arXiv, 2302.11837. [Preprint]
2023 Ebadi A., Freestone J., Noble WS., Keich U. Bridging the False Discovery Gap. Journal of Proteome Research, 22(7), 2172–2178.
2023 Rajchert A., Keich U. Controlling the False Discovery Rate via Competition: is the +1 needed? Statistics and Probability Letters, 197, 109819.
2023 Luo D., Ebadi A., Emery K., He Y., Noble WS., Keich U. Competition-based control of the false discovery proportion. Biometrics.
2023 Hasam S., Emery K., Noble WS., Keich U. A Pipeline for Peptide Detection Using Multiple Decoys. Methods in Molecular Biology, 2426: 25–34. (Invited Chapter)
2022 Freestone J., Short T., Noble WS., Keich U. Group-walk: a rigorous approach to group-wise false discovery rate analysis by target-decoy competition. Bioinformatics, 38(Supplement 2): ii82–ii88.
2022 Lin A., Short T., Noble WS., Keich U. Improving peptide-level mass spectrometry analysis via double competition. Journal of Proteome Research, 21(10): 2412–2420.
2022 Heil LR., Fondrie WE., McGann CD., Federation AJ., Noble WS., MacCoss MJ., Keich U. Building Spectral Libraries from Narrow-Window Data-Independent Acquisition Mass Spectrometry Data. Journal of Proteome Research, 21(6): 1382–1391.
2021 Lin A., Plubell DL., Keich U., Noble WS. Accurately Assigning Peptides to Spectra When Only a Subset of Peptides Are Relevant. Journal of Proteome Research, 20(8): 4153–4164.
2021 Peres N., Lee AR., Keich U. Exactly Computing the Tail of the Poisson-Binomial Distribution. ACM Transactions on Mathematical Software, 47(4): 1–19.
2020 Emery K., Hasam S., Noble WS., Keich U. Multiple competition-based FDR control and its application to peptide detection. RECOMB 2020, LNCS 12074: 54–71.
2019 Emery K., Keich U. Controlling the FDR in variable selection via multiple knockoffs. arXiv, 1911.09442v2. [Preprint]
2019 Keich U., Tamura K., Noble WS. Averaging Strategy To Reduce Variability in Target-Decoy Estimates of False Discovery Rate. Journal of Proteome Research, 18(2): 585–593.
2018 Keich U., Noble WS. Controlling the FDR in imperfect matches to an incomplete database. Journal of the American Statistical Association, 113(523): 973–982.
2017 Noble WS., Keich U. Response to "Mass spectrometrists should search for all peptides, but assess only the ones they care about". Nature Methods, 14(7): 644.
2017 Keich U., Noble WS. Progressive calibration and averaging for tandem mass spectrometry statistical confidence estimation: Why settle for a single decoy? LNCS (RECOMB 2017), 10229: 99–116.
2017 Wilson H., Keich U. Accurate small tail probabilities of sums of iid lattice-valued random variables via FFT. Journal of Computational and Graphical Statistics, 26(1): 223–229.
2016 Wilson H., Keich U. Accurate pairwise convolutions of non-negative vectors via FFT. Computational Statistics & Data Analysis, 101: 300–315.
2016 Manescu D., Keich U. A symmetric length-aware enrichment test. Journal of Computational Biology, 23(6): 508–525.
2015 Keich U., Kertesz-Farkas A., Noble WS. Improved False Discovery Rate Estimation Procedure for Shotgun Proteomics. Journal of Proteome Research, 14(8): 3148–3161.
2015 Manescu D., Keich U. A symmetric length-aware enrichment test. RECOMB 2015, LNBI 9029: 224–242. (Best Paper Award)
2015 Kertesz-Farkas A., Keich U., Noble WS. Tandem Mass Spectrum Identification via Cascaded Search. Journal of Proteome Research, 14(8): 3027–3038.
2015 Keich U., Noble WS. On the Importance of Well-Calibrated Scores for Identifying Shotgun Proteomics Spectra. Journal of Proteome Research, 14(2): 1147–1160.
2014 Tanaka E., Bailey TL., Keich U. Improving MEME via a two-tiered significance analysis. Bioinformatics, 30(14): 1965–1973.
2013 Liachko I., Youngblood RA., Keich U.*, Dunham MJ. High-resolution mapping, characterization, and optimization of autonomously replicating sequences in yeast. Genome Research, 23(4): 698–704. (*co-corresponding author)
2011 Liachko I., Tanaka E., Cox K., Chung SC., Yang L., Seher A., Hallas L., Cha E., Kang G., Pace H., Barrow J., Inada M., Tye BK., Keich U. Novel Features of ARS Selection in Budding Yeast Lachancea kluyveri. BMC Genomics, 12: 633.
2011 Tanaka E., Bailey TL., Grant CE., Noble WS., Keich U. Improved similarity scores for comparing motifs. Bioinformatics, 27(12): 1603–1609.
2011 Gupta N., Bandeira N., Keich U., Pevzner PA. Target-Decoy Approach and False Discovery Rate: When Things May Go Wrong. Journal of The American Society for Mass Spectrometry, 22(7): 1111–1120.
2011 Ng P., Keich U. Alignment Constrained Sampling. Journal of Computational Biology, 18(2).
2010 Bhaskar A., Keich U. Confidently estimating the number of DNA replication origins. Statistical Applications in Genetics and Molecular Biology, 9(1): Article 28.
2010 Liachko I., Bhaskar A., Li C., Chung SCC., Tye BK., Keich U. A Comprehensive Genome-Wide Map of Autonomously Replicating Sequences in a Naive Genome. PLoS Genetics, 6(5).
2009 Oliver HF., Orsi RH., Ponnala L., Keich U., Wang W., Sun Q., Cartinhour SW., Filiatrault MJ., Wiedmann M., Boor KJ. Deep RNA sequencing of L. monocytogenes reveals overlapping and extensive stationary phase and sigma B-dependent transcriptomes. BMC Genomics, 10: 641.
2009 Nagarajan N., Keich U. Reliability and efficiency of algorithms for computing the significance of the Mann-Whitney test. Computational Statistics, 24(4): 605–622.
2008 Ng P., Keich U. Factoring local sequence composition in motif significance analysis. Genome Informatics, 21: 15–26.
2008 Ng P., Keich U. GIMSAN: a Gibbs motif finder with significance analysis. Bioinformatics, 24(19): 2256–2257.
2008 Keich U., Gao H., Garretson JS., Bhaskar A., Liachko I., Donato J., Tye BK. Computational detection of significant variation in binding affinity across two sets of sequences with application to the analysis of replication origins in yeast. BMC Bioinformatics, 9: 372.
2008 Nagarajan N., Keich U. FAST: Fourier transform based Algorithms for Significance Testing of ungapped multiple alignments. Bioinformatics, 24(4): 577–578.
2007 Keich U., Ng P. A conservative parametric approach to motif significance analysis. Genome Informatics, 19: 61–72.
2007 Zhi D., Keich U., Pevzner P., Heber S., Tang H. Correcting base-assignment errors in repeat regions of shotgun assembly. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 4(1): 54–64.
2006 Nagarajan N., Ng P., Keich U. Refining motif finders with E-value calculations. RECOMB Satellite Workshop on Regulatory Genomics, 73–84.
2006 Ng P., Nagarajan N., Jones N., Keich U. Apples to apples: improving the performance of motif finders and their significance analysis in the Twilight Zone. Bioinformatics, 22(14): e393–e401.
2006 Keich U., Nagarajan N. A fast and numerically robust method for exact multinomial goodness-of-fit test. Journal of Computational and Graphical Statistics, 15(4): 779–802.
2005 Nagarajan N., Jones N., Keich U. Computing the p-value of the information content from an alignment of multiple sequences. Bioinformatics, 21(Suppl 1, ISMB): i311–i318.
2005 Buhler J., Keich U., Sun Y. Designing Seeds for Similarity Search in Genomic DNA. Journal of Computer and System Sciences, 70(3): 342–363.
2005 Keich U. sFFT: a faster accurate computation of the p-value of the entropy score. Journal of Computational Biology, 12(4): 416–430.
2004 Keich U., Nagarajan N. A faster reliable algorithm to estimate the p-value of the multinomial llr statistic. WABI 2004.
2004 Keich U., Li M., Ma B., Tromp J. On Spaced Seeds for Similarity Search. Discrete Applied Mathematics, 138(3): 253–263.
2003 Buhler J., Keich U., Sun Y. Designing Seeds for Similarity Search in Genomic DNA. RECOMB 2003.
2003 Eskin E., Keich U., Gelfand MS., Pevzner PA. Genome-Wide Analysis of Bacterial Promoter Regions. Pacific Symposium on Biocomputing.
2003 Keich U. Stationary Tangent – the Discrete and Non-smooth Cases. Journal of Time Series Analysis, 24(2): 173–192.
2002 Keich U., Pevzner PA. Finding motifs in the twilight zone. RECOMB 2002.
2002 Keich U., Pevzner PA. Subtle motifs: defining the limits of motif finding algorithms. Bioinformatics, 18(10): 1382–1390.
2002 Keich U., Pevzner PA. Finding motifs in the twilight zone. Bioinformatics, 18(10): 1374–1381.
2001 Cwikel M., Keich U. Optimal decompositions for the K-functional for a couple of Banach lattices. Arkiv för Matematik, 39(1): 27–64.
2000 Keich U. A Possible Definition of A Stationary Tangent. Stochastic Processes and Their Applications, 88(1): 1–36.
1999 Keich U. Krein's Strings, the Symmetric Moment Problem, and Extending a Real Positive Definite Function. Communications on Pure and Applied Mathematics, 52(10): 1315–1334.
1999 Keich U. On L^p Bounds for Kakeya Maximal Functions and the Minkowski Dimension in R^2. Bulletin of the London Mathematical Society, 31: 213–221.
1999 Keich U. Absolute Continuity Between the Wiener and Stationary Gaussian Measures. Pacific Journal of Mathematics, 188(1): 95–108.
1999 Keich U. The Entropy Distance Between the Wiener and Stationary Gaussian Measures. Pacific Journal of Mathematics, 188(1): 109–128.
1996 Aharoni R., Keich U. A Generalization of the Ahlswede Daykin Inequality. Discrete Mathematics, 152: 1–12.

Software Contributions

Tool for conducting entrapment experiments including database generation and FDP estimation methods
FDR control for peptide-identity-propagation (implemented in FlashLFQ)
Flexible procedures for controlling FDR with side-information in p-value and competition settings
MS/MS database search post-processor addressing FDR control issues (to be implemented in Percolator v4.0)
Open-source Python tool implementing TDC-based FDR estimation methods
Post-processor combining narrow window and open modification MS/MS searches with rigorous FDR control
R package providing upper prediction bands on FDP in competition-based multiple hypothesis testing
R package for FDR control with intrinsic group structure
Tool assigning studentized-Gumbel based p-values to blastp search results
R package implementing step-down procedure for controlling false discovery proportion
R package for FDR control using multiple competition
R package for FDR control in imperfect matches to incomplete databases
Accurate convolutions and small tail probabilities via FFT (R and Python)
R package for Poisson-Binomial distribution computation
Alignment constrained sampling
Gibbs motif finder with significance analysis
Motif comparison tool (contributed to score function redesign)
Tool for motif scanning and binding affinity variation detection
Fourier transform based algorithms for significance testing of ungapped multiple alignments
Computing the p-value of the information content (entropy score) of a sequence motif
Computing the exact p-value of the llr statistic for multinomial goodness-of-fit test

Teaching

University of Sydney

2009–2025 STAT2911 – Probability and Statistical Models (Advanced)
2019, 2021–2022 MATH1905 – Statistical Thinking with Data (Advanced)
2017–2019 MSH2 – Probability (Honours level)
2018 MATH1005 – Statistical Thinking about Data
2009–2017 MSH8 – Statistical Methods in Bioinformatics
2010–2011, 2013–2015 STAT3914/3014 – Applied Statistics (Advanced)
2010, 2022, 2024 MATH1907/1933/1972 – Mathematics (Special Studies Program)

Cornell University (2003–2008)

CS 426 – Introduction to Bioinformatics; CS 726 – Computational Molecular Biology; CS 628 – Biological Sequence Analysis; CS 280 – Discrete Structures

Selected Invited Talks

Professional Service

Editorial

Reviewing

Journals: Nature Methods (twice in 2025), PNAS, Bioinformatics, Journal of Proteome Research, PLoS Computational Biology, Journal of Computational and Graphical Statistics, and others

Conferences: RECOMB, ISMB, ICML, SODA, and others

Program Committees

RECOMB Satellite Workshop on Regulatory Genomics (2004–2006, 2008–2010, 2012–2014); Asia-Pacific Bioinformatics Conference (2007, 2009–2010); IEEE BIBM (2019); GIW (2008–2009)

Supervision

PhD Students

Jack Freestone
9 joint papers; now Lecturer at Macquarie University
Kristen Emery
3 joint papers; now in industry
Emi Tanaka
3 joint papers; now Deputy Director, Biological Data Science Institute, ANU
Patrick Ng
6 joint papers; now in finance
Niranjan Nagarajan
7 joint papers; now Associate Director, Genome Institute of Singapore

MSc & Honours Students

Arya Ebadi (MSc, 2 papers), Huon Wilson (MSc, 2 papers, now at Data61), Temana Short, Dong Luo, Yilun He, Kristen Emery, and others

Research Project Students

Supervised 25+ undergraduate research projects and lab rotations at University of Sydney and Cornell University; 11 joint papers with these students

Education

Ph.D. in Mathematics
Courant Institute, New York University
Thesis: Stationary Approximations to Non-Stationary Stochastic Processes
Advisor: Prof. H.P. McKean
M.Sc. in Mathematics
Technion – Israel Institute of Technology
B.Sc. in Computer Science & Mathematics
Hebrew University of Jerusalem, Summa Cum Laude