Research Director, CIFAR Canada AI Chair, and Founding Faculty, Vector Institute
Professor
Department of Statistical Sciences
Department of Computer Science (cross-appointment)
University of Toronto
Department of Computer and Mathematical Sciences
University of Toronto Scarborough
Editorial Boards
Action Editor, Journal of Machine Learning Research
Action Editor, Transactions on Machine Learning Research
JUL 2024.
My joint work with Idan Attias, Gintare Karolina Dziugaite, Mahdi Haghifam, and Roi Livni on "Information Complexity of Stochastic Convex Optimization" won a best paper award at ICML 2024 (awarded to 10 of ~10,000 submissions).
APR 2024.
I've been promoted to Full Professor.
SEP 2023.
I've been named Research Director of the Vector Institute.
My research is focused on the foundational principles underlying prediction, inference, and decision making under uncertainty. This focus on foundations ties together multiple lines of work, which span machine learning, statistics, mathematical logic, applied probability, and computer science.
My group has made significant contributions to learning theory, statistical network analysis, decision theory, probabilistic programming, and Bayesian nonparametric statistics. More recently, my group has come to focus on information-theoretic analyses of learning, online learning, and nonstandard analytic foundations for decision theory.
Want to hear more about our research? Watch a recent talk at ETH Zurich summarizing some of our work on "removing assumptions from statistics".
I am seeking students at all levels with strong quantitative backgrounds who are interested in foundational problems at the intersection of machine learning, statistics, and computer science. I am also seeking qualified postdoctoral researchers for two- and three-year positions. Click to read the instructions before contacting me, or your email is likely to be deleted. I value people who are attentive to detail.
STA3000 (Fall 2017)
Advanced Theory of Statistics, Part I
STA4516 (Fall 2017)
Nonstandard Analysis and Applications to Statistics and Probability
STAD68 (Fall 2017)
Advanced Machine Learning and Data Mining
STAD68 (Fall 2016)
Advanced Machine Learning and Data Mining
STA4516 (Fall 2015)
Topics in Probabilistic Programming
STAC63 (Fall 2015)
Probability Models
STAD68 (Winter 2015)
Advanced Machine Learning and Data Mining
STA4513 (Fall 2014)
Statistical models of networks, graphs, and other relational structures
Programs can be used to give compact representations of distributions: in order to represent a distribution, one simply gives a program that would generate an exact sample were the random number generator to produce realizations of independent and identically distributed random variables. This approach to representing distributions by probabilistic programs works not only for simple distributions on numbers like Poissons, Gaussians, etc., and combinations thereof, but also for more exotic distributions on, e.g., phrases in natural language, rendered 2D images of 3D scenes, and climate sensor measurements.
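To make this concrete, here is a minimal illustrative sketch in Python (my own toy example, not taken from any particular system): the short program below is a compact representation of a distribution on numbers, and each run returns one exact sample, assuming the random number generator supplies i.i.d. uniform variates.

```python
import math
import random

def sample_poisson(rate):
    # Knuth's algorithm: multiply uniforms until the running product
    # falls below exp(-rate); the number of multiplications is Poisson(rate).
    threshold = math.exp(-rate)
    count, product = 0, random.random()
    while product > threshold:
        count += 1
        product *= random.random()
    return count

def generative_program():
    # The program *is* the model: a random number of Gaussian components,
    # random component means, a random component choice, then a noisy draw.
    num_components = 1 + sample_poisson(2.0)
    means = [random.gauss(0.0, 10.0) for _ in range(num_components)]
    z = random.randrange(num_components)
    return random.gauss(means[z], 1.0)

print([round(generative_program(), 2) for _ in range(5)])
```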
Probabilistic programming systems support statistical inference on models defined by probabilistic programs. By constraining some variables of a program (e.g., simulated sensor readings in some climate model) and studying the conditional distribution of other variables (e.g., the parameters of the climate model), we can identify plausible variable settings that agree with the constraints. Conditional inferences like this would allow us to, e.g., build predictive text systems for mobile phones, guess the 3D shape of an object from only a photograph, or study the underlying mechanisms driving observed climate measurements.
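As a toy illustration of conditioning (again my own sketch, not the mechanism of any real system), conditional inference can be implemented correctly, if very inefficiently, by rejection: run the generative program repeatedly and keep only the executions that satisfy the constraint; the surviving runs are distributed according to the conditional distribution.

```python
import random

def flip(p):
    # Biased coin flip: True with probability p.
    return random.random() < p

def program():
    # A toy model relating latent causes to an observable.
    raining = flip(0.2)
    sprinkler = flip(0.4)
    grass_wet = flip(0.95) if (raining or sprinkler) else flip(0.05)
    return raining, grass_wet

def prob_raining_given_wet(num_kept=10_000):
    # Rejection sampling: discard runs where the constraint fails.
    kept = []
    while len(kept) < num_kept:
        raining, grass_wet = program()
        if grass_wet:
            kept.append(raining)
    return sum(kept) / len(kept)

print(prob_raining_given_wet())  # approximates P(raining | grass is wet)
```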
Probabilistic programming systems for machine learning and statistics are still in their infancy, and there are many interesting theoretical and applied problems yet to be tackled. My own work focuses on theoretical questions around representing stochastic processes and the computational complexity of sampling-based approaches to inference. I was involved in the definition of the probabilistic programming language Church, and its first implementation, MIT-Church, a Markov chain Monte Carlo algorithm operating on the space of execution histories of an interpreter. Some of my key theoretical work includes a study of the computability of conditional probability and de Finetti measures, both central notions in Bayesian statistics. Readers looking for an overview of these results are directed to the introduction of my doctoral dissertation. A less technical description of a probabilistic programming approach to artificial intelligence can be found in a recent book chapter on legacies of Alan Turing, co-authored with Freer and Tenenbaum.
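To convey the flavor of such sampling-based inference, here is a heavily simplified sketch (for illustration only; the data, names, and tuning constants are hypothetical): Metropolis-Hastings over the random choices of a tiny program with a single latent Gaussian mean. MIT-Church applies the same idea to the full execution history of an interpreter rather than to a single variable.

```python
import math
import random

data = [2.1, 1.9, 2.4, 2.2]  # hypothetical observations

def log_joint(mu):
    # log of the N(mu; 0, 10^2) prior times the N(x; mu, 1) likelihoods,
    # up to additive constants, which cancel in the acceptance ratio.
    log_prior = -0.5 * (mu / 10.0) ** 2
    log_lik = sum(-0.5 * (x - mu) ** 2 for x in data)
    return log_prior + log_lik

def metropolis_hastings(steps=5000, step_size=0.5):
    mu = 0.0  # the "trace" here is a single random choice
    samples = []
    for _ in range(steps):
        proposal = mu + random.gauss(0.0, step_size)  # symmetric proposal
        # Accept with probability min(1, joint(proposal) / joint(mu)).
        if math.log(random.random()) < log_joint(proposal) - log_joint(mu):
            mu = proposal
        samples.append(mu)
    return samples

chain = metropolis_hastings()
burned = chain[1000:]  # discard burn-in
print(sum(burned) / len(burned))  # estimate of the posterior mean
```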
More information on probabilistic programming can be found on probabilistic-programming.org, a wiki that I maintain. In particular, look at the list of research articles and papers on probabilistic programming and the tutorials.
I co-organized the first workshop on probabilistic programming for statistics and machine learning at NIPS*2008 (with Vikash Mansinghka, John Winn, David McAllester and Josh Tenenbaum). Four years later, I co-organized the second workshop on probabilistic programming at NIPS*2012 (with Vikash Mansinghka and Noah Goodman).
Hello, my name is Daniel M. Roy (or simply, Dan), and I am a Professor of Statistics at the University of Toronto, with cross-appointments in Computer Science and in Electrical and Computer Engineering.
I received all three of my degrees in (Electrical Engineering and) Computer Science from MIT. As a doctoral student, I was a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL), where I was advised by Leslie Kaelbling. I also collaborated with members of Josh Tenenbaum's Computational Cognitive Science group.
After my doctorate, I was a Newton International Fellow of the Royal Society and then a Research Fellow of Emmanuel College at the University of Cambridge. At Cambridge, I was a member of the Machine Learning Group, headed by Zoubin Ghahramani, which was part of the Computational and Biological Learning Lab in the Department of Engineering.
Nate Ackerman, Siddharth Ancha, Jeremy Avigad, Matej Balog, Marco Battiston, William Beebee, Blair Bilodeau, Keith Bonawitz, Cristian Cadar, Hal Daumé III, Brian Demsky, Haosui Duanmu, Daniel Dumitran, Gintare Karolina Dziugaite, Stefano Favaro, Cameron Freer, Zoubin Ghahramani, Noah Goodman, Creighton Heaukulani, Jonathan Huggins, Eric Jonas, Leslie Kaelbling, Charles Kemp, Balaji Lakshminarayanan, Tudor Leu, James Lloyd, Vikash Mansinghka, Jeffrey Negrea, Peter Orbanz, Mansheej Paul, Ali Ramezani-Kebrya, Ryan Rifkin, Martin Rinard, Jason Rute, Virginia Savova, Lauren Schmidt, Ekansh Sharma, David Sontag, Sam Staton, Yee Whye Teh, Josh Tenenbaum, Victor Veitch, David Wingate, Hongseok Yang
Family
Meloney Roy,
Kenny Roy.
Several template .tex files for producing slides for mathematical talks that more closely mimic the chalk-talk aesthetic.
de Finetti's theorem and the existence of regular conditional distributions and strong laws on exchangeable algebras
(with Peter Potaptchik and David Schrittesser)
arXiv:2312.16349
Statistical minimax theorems via nonstandard analysis
(with Haosui Duanmu and David Schrittesser)
arXiv:2212.13250
Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics
Jeffrey Negrea, Jun Yang, Haoyue Feng, Daniel M. Roy, Jonathan H. Huggins
arXiv:2207.12395
Admissibility is Bayes optimality with infinitesimals
(with Haosui Duanmu and David Schrittesser)
arXiv:2112.14257
Enforcing Interpretability and its Statistical Impacts: Trade-offs between Accuracy and Interpretability
Gintare Karolina Dziugaite, Shai Ben-David, Daniel M. Roy
arXiv:2010.13764
Approximations in Probabilistic Programs
Ekansh Sharma, Daniel M. Roy
arXiv:1912.06791
The Class of Random Graphs Arising from Exchangeable Random Measures
Victor Veitch and Daniel M. Roy
arXiv:1512.03099
The continuum-of-urns scheme,
generalized beta and Indian buffet processes,
and hierarchies thereof
arXiv:1501.00208
Note: Articles distinguished by "(with ...)" have alphabetical author lists, as is the convention in mathematics and theoretical computer science.
Computability, inference and modeling in probabilistic programming
Daniel M. Roy
Ph.D. thesis, Massachusetts Institute of Technology,
2011.
MIT/EECS George M. Sprowls Doctoral Dissertation Award
Sequential Probability Assignment with Contexts: Minimax Regret, Contextual Shtarkov Sums, and Contextual Normalized Maximum Likelihood
Ziyi Liu, Idan Attias, Daniel M. Roy
In Advances in Neural Information Processing Systems (NeurIPS), 2024.
arXiv:2410.03849
Simultaneous linear connectivity of neural networks modulo permutation
Ekansh Sharma, Devin Kwok, Tom Denton, Daniel M. Roy, David Rolnick, Gintare Karolina Dziugaite
In Proc. European Conference on Machine Learning (ECML), 2024.
arXiv:2404.06498
Causal Bandits: The Pareto Optimal Frontier of Adaptivity, a Reduction to Linear Bandits, and Limitations around Unknown Marginals
Ziyi Liu, Idan Attias, Daniel M. Roy
In Proc. International Conf. on Machine Learning (ICML), 2024.
arXiv:2407.00950
Information Complexity of Stochastic Convex Optimization:
Applications to Generalization and Memorization
(with Idan Attias, Gintare Karolina Dziugaite, Mahdi Haghifam, Roi Livni)
In Proc. International Conf. on Machine Learning (ICML), 2024. ICML 2024 Best Paper
arXiv:2402.09327
Probabilistic programming interfaces for random graphs: Markov categories, graphons, and nominal sets
Nathanael L. Ackerman, Cameron E. Freer, Younesse Kaddar, Jacek Karwowski, Sean K. Moss, Daniel M. Roy, Sam Staton, Hongseok Yang
Proc. ACM SIGPLAN Symposium on Principles of Programming Languages (POPL), 2024.
doi:10.1145/3632903
arXiv:2312.17127
The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit
Lorenzo Noci, Chuning Li, Mufan Bill Li, Bobby He, Thomas Hofmann, Chris Maddison, Daniel M. Roy
Advances in Neural Information Processing Systems (NeurIPS), 2023.
arXiv:2306.17759
Relaxing the I.I.D. Assumption: Adaptive Minimax Optimal Sequential Prediction with Expert Advice
Blair Bilodeau, Jeffrey Negrea, Daniel M. Roy
Annals of Statistics 51(4): 1850-1876 (2023).
doi:10.1214/23-AOS2315
doi:10.1214/23-AOS2315SUPP
arXiv:2007.06552
Minimax Rates for Conditional Density Estimation via Empirical Entropy
Blair Bilodeau, Dylan J. Foster, Daniel M. Roy
Annals of Statistics 51(2): 762-790 (2023).
doi:10.1214/23-AOS2270
doi:10.1214/23-AOS2270SUPP (supplement)
arXiv:2109.10461
Limitations of Information-Theoretic Generalization Bounds for Gradient Descent Methods in Stochastic Convex Optimization
Mahdi Haghifam, Borja Rodríguez-Gálvez, Ragnar Thobaben, Mikael Skoglund, Daniel M. Roy, Gintare Karolina Dziugaite
Proc. International Conference on Algorithmic Learning Theory (ALT), 2023.
arXiv:2212.13556
Understanding Generalization via Leave-One-Out Conditional Mutual Information
Mahdi Haghifam, Shay Moran, Daniel M. Roy, Gintare Karolina Dziugaite
Proc. IEEE Int. Symp. Information Theory (ISIT), 2022.
arXiv:2206.14800
Existence of matching priors on compact spaces
(with Haosui Duanmu and Aaron Smith)
Biometrika, 110(3): 763-776, 2023.
doi:10.1093/biomet/asac061
arXiv:2011.03655
The Neural Covariance SDE:
Shaped Infinite Depth-and-Width Networks at Initialization
Mufan Bill Li, Mihai Nica, Daniel M. Roy
Advances in Neural Information Processing Systems (NeurIPS), 2022. Oral
arXiv:2206.02768
Adaptively Exploiting d-Separators with Causal Bandits
Blair Bilodeau, Linbo Wang, Daniel M. Roy
Advances in Neural Information Processing Systems (NeurIPS), 2022. Oral
arXiv:2202.05100
Pruning's Effect on Generalization Through the Lens of Training and Regularization
Tian Jin, Daniel M. Roy, Michael Carbin, Jonathan Frankle, Gintare Karolina Dziugaite
Advances in Neural Information Processing Systems (NeurIPS), 2022.
arXiv:2210.13738
Towards a Unified Information-Theoretic Framework for Generalization
Mahdi Haghifam, Gintare Karolina Dziugaite, Shay Moran, Daniel M. Roy
Advances in Neural Information Processing Systems 34 (NeurIPS), 2021. Spotlight
arXiv:2111.05275
Minimax Optimal Quantile and Semi-Adversarial Regret via Root-Logarithmic Regularizers
Jeffrey Negrea, Blair Bilodeau, Nicolò Campolongo, Francesco Orabona, Daniel M. Roy
Advances in Neural Information Processing Systems 34 (NeurIPS), 2021.
arXiv:2110.14804
The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization
Mufan (Bill) Li, Mihai Nica, Daniel M. Roy
Advances in Neural Information Processing Systems 34 (NeurIPS), 2021.
arXiv:2106.04013
Information-Theoretic Generalization Bounds for Stochastic Gradient Descent
Gergely Neu, Gintare Karolina Dziugaite, Mahdi Haghifam, Daniel M. Roy
Conference on Learning Theory (COLT), 2021.
arXiv:2102.00931
NUQSGD: Improved Communication Efficiency for Data-parallel SGD via Nonuniform Quantization
Ali Ramezani-Kebrya, Fartash Faghri, Ilia Markov, Vitaly Aksenov, Dan Alistarh, Daniel M. Roy
Journal of Machine Learning Research, 22(114):1–43, 2021.
jmlr.org/papers/v22/20-255
arXiv:1908.06077
On Extended Admissible Procedures and their Nonstandard Bayes Risk
Haosui Duanmu and Daniel M. Roy
Annals of Statistics, 49(4): 2053-2078 (2021).
doi:10.1214/20-AOS2026
doi:10.1214/20-AOS2026SUPP (supplement)
On the role of data in PAC-Bayes bounds
Gintare Karolina Dziugaite, Kyle Hsu, Waseem Gharbieh, Gabriel Arpino, Daniel M. Roy
Artificial Intelligence and Statistics (AISTATS), 2021.
arXiv:2006.10929
Pruning Neural Networks at Initialization: Why are We Missing the Mark?
Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin
International Conference on Learning Representations (ICLR), 2021.
arXiv:2009.08576
An estimator for the tail-index of graphex processes
Zacharie Naulet, Ekansh Sharma, Victor Veitch, Daniel M. Roy
Electronic Journal of Statistics, Volume 15, Number 1 (2021), 282-325.
arXiv:1712.01745
doi:10.1214/20-EJS1789
Adaptive Gradient Quantization for Data-Parallel SGD
Fartash Faghri, Iman Tabrizian, Ilia Markov, Dan Alistarh, Daniel M. Roy, Ali Ramezani-Kebrya
Advances in Neural Information Processing Systems 33 (NeurIPS), 2020.
arXiv:2010.12460
In Search of Robust Measures of Generalization
Gintare Karolina Dziugaite, Alexandre Drouin, Brady Neal, Nitarshan Rajkumar, Ethan Caballero, Linbo Wang, Ioannis Mitliagkas, Daniel M. Roy
Advances in Neural Information Processing Systems 33 (NeurIPS), 2020.
arXiv:2010.11924
Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel
Stanislav Fort, Gintare Karolina Dziugaite, Mansheej Paul, Sepideh Kharaghani, Daniel M. Roy, Surya Ganguli
Advances in Neural Information Processing Systems 33 (NeurIPS), 2020.
arXiv:2010.15110
Sharpened Generalization Bounds based on Conditional Mutual Information and an Application to Noisy, Iterative Algorithms
Mahdi Haghifam, Jeffrey Negrea, Ashish Khisti, Daniel M. Roy, Gintare Karolina Dziugaite
Advances in Neural Information Processing Systems 33 (NeurIPS), 2020.
arXiv:2004.12983
Improved Bounds on Minimax Regret under Logarithmic Loss via Self-Concordance
Blair Bilodeau, Dylan Foster, Daniel M. Roy
In Proc. International Conf. on Machine Learning (ICML), 2020.
arXiv:2007.01160
Linear Mode Connectivity and the Lottery Ticket Hypothesis
Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin
In Proc. International Conf. on Machine Learning (ICML), 2020.
arXiv:1912.05671
In Defense of Uniform Convergence: Generalization via Derandomization with an Application to Interpolating Predictors
Jeffrey Negrea, Gintare Karolina Dziugaite, Daniel M. Roy
In Proc. International Conf. on Machine Learning (ICML), 2020.
arXiv:1912.04265
Fast-rate PAC-Bayes Generalization Bounds via Shifted Rademacher Processes
Jun Yang, Shengyang Sun, Daniel M. Roy
Advances in Neural Information Processing Systems 32 (NeurIPS), 2019.
arXiv:1908.07584
Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates
Jeffrey Negrea, Mahdi Haghifam, Gintare Karolina Dziugaite, Ashish Khisti, Daniel M. Roy
Advances in Neural Information Processing Systems 32 (NeurIPS), 2019.
arXiv:1911.02151
Gibbs-type Indian Buffet Processes
(with Creighton Heaukulani)
Bayesian Analysis.
arXiv:1512.02543
doi:10.1214/19-BA1166
Algorithmic barriers to representing conditional independence
(with Nathanael L. Ackerman, Jeremy Avigad, Cameron E. Freer, and Jason M. Rute)
Proc. Logic in Computer Science (LICS), 2018.
arXiv:1801.10387 (old)
doi:10.1109/LICS.2019.8785762
On the computability of conditional probability
(with Nate Ackerman and Cameron Freer)
Journal of the ACM.
arXiv:1005.3014
doi:10.1145/3321699
Sampling and Estimation for (Sparse) Exchangeable Graphs
Victor Veitch and Daniel M. Roy
Annals of Statistics, 47(6): 3274-3299 (2019).
doi:10.1214/18-AOS1778
doi:10.1214/18-AOS1778SUPP (supplement)
arXiv:1611.00843
Data-dependent PAC-Bayes priors via differential privacy
Gintare Karolina Dziugaite and Daniel M. Roy
In Advances in Neural Information Processing Systems 31 (NeurIPS), 2018.
arXiv:1802.09583
Entropy-SGD optimizes the prior of a PAC-Bayes bound: Generalization properties of Entropy-SGD and data-dependent priors
Gintare Karolina Dziugaite and Daniel M. Roy
In Proc. International Conf. on Machine Learning (ICML), 2018.
arXiv:1712.09376
The Beta-Bernoulli process and algebraic effects
Sam Staton, Dario Stein, Hongseok Yang, Nathanael Ackerman, Cameron Freer, Daniel M. Roy
In Proc. Int. Colloq. on Automata, Languages, and Programming (ICALP), 2018.
arXiv:1802.09598
Sequential Monte Carlo as Approximate Sampling: bounds, adaptive resampling via ∞-ESS, and an application to Particle Gibbs
(with Jonathan Huggins)
Bernoulli.
arXiv:1503.00966
doi:10.3150/17-BEJ999
A characterization of product-form exchangeable feature probability functions
Marco Battiston, Stefano Favaro, Daniel M. Roy, Yee Whye Teh
Annals of Applied Probability.
arXiv:1607.02066
doi:10.1214/17-AAP1333
Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data
Gintare Karolina Dziugaite and Daniel M. Roy
In Proc. Uncertainty in Artificial Intelligence (UAI), 2017.
arXiv:1703.11008
code
Measuring the reliability of MCMC inference with bidirectional Monte Carlo
Roger B. Grosse, Siddharth Ancha, and Daniel M. Roy
In Adv. Neural Information Processing Systems 29 (NIPS), 2016.
arXiv:1606.02275
nips
On computability and disintegration
(with Nate Ackerman and Cameron Freer)
In Mathematical Structures in Computer Science.
arXiv:1509.02992 (recommended)
doi:10.1017/S0960129516000098
The Mondrian Kernel
Matej Balog, Balaji Lakshminarayanan, Zoubin Ghahramani, Daniel M. Roy, Yee Whye Teh
In Proc. Uncertainty in Artificial Intelligence (UAI), 2016.
arXiv:1606.05241
Mondrian Forests for Large-Scale Regression when Uncertainty Matters
Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh
In Proc. Artificial Intelligence and Statistics (AISTATS), 2016.
arXiv:1506.03805
Training generative neural networks via Maximum Mean Discrepancy optimization
Gintare Karolina Dziugaite, Daniel M. Roy, and Zoubin Ghahramani
In Proc. Uncertainty in Artificial Intelligence (UAI), 2015.
arXiv:1505.03906
The combinatorial structure of beta negative binomial processes
(with Creighton Heaukulani)
Bernoulli, 2016, Vol. 22, No. 4, 2301–2324.
doi:10.3150/15-BEJ729
Particle Gibbs for Bayesian Additive Regression Trees
Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh
Proc. Artificial Intelligence and Statistics (AISTATS), 2015.
arXiv:1502.04622
Mondrian Forests: Efficient Online Random Forests
Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh
Adv. Neural Information Processing Systems 27 (NIPS), 2014.
code
Bayesian Models of Graphs, Arrays and Other Exchangeable Random Structures
(with Peter Orbanz)
IEEE Trans. Pattern Anal. Mach. Intelligence (PAMI), 2014.
doi:10.1109/TPAMI.2014.2334607
arXiv:1312.7857
slides for talk at CIMAT
Top-down particle filtering for Bayesian decision trees
Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh
Proc. Int. Conf. on Machine Learning (ICML), 2013.
code
Towards common-sense reasoning via conditional simulation:
Legacies of Turing in Artificial Intelligence
(with Cameron Freer and Josh Tenenbaum)
Turing's Legacy (ASL Lecture Notes in Logic), 2012.
doi:10.1017/CBO9781107338579.007
Random function priors for exchangeable arrays with applications to graphs and relational data
James Lloyd, Peter Orbanz, Zoubin Ghahramani, Daniel M. Roy
Adv. Neural Information Processing Systems 25 (NIPS), 2012.
Computable de Finetti measures
(with Cameron Freer)
Annals of Pure and Applied Logic, 2012.
doi:10.1016/j.apal.2011.06.011
Complexity of Inference in Latent Dirichlet Allocation
David Sontag and Daniel M. Roy
Adv. Neural Information Processing Systems 24 (NIPS), 2011.
Noncomputable conditional distributions
(with Nate Ackerman and Cameron Freer)
Proc. Logic in Computer Science (LICS), 2011.
Probabilistically Accurate Program Transformations
Sasa Misailovic, Daniel M. Roy, and Martin C. Rinard
Proc. Int. Static Analysis Symp. (SAS), 2011.
Bayesian Policy Search with Policy Priors
David Wingate, Noah D. Goodman, Daniel M. Roy, Leslie P. Kaelbling, and Joshua B. Tenenbaum
Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI), 2011.
Posterior distributions are computable from predictive distributions
(with Cameron Freer)
Proc. Artificial Intelligence and Statistics (AISTATS), 2010.
The Infinite Latent Events Model
David Wingate, Noah D. Goodman, Daniel M. Roy, and Joshua B. Tenenbaum
In Proc. Uncertainty in Artificial Intelligence (UAI), 2009.
Computable exchangeable sequences have computable de Finetti measures
(with Cameron Freer)
In Proc. Computability in Europe (CiE), 2009.
Exact and Approximate Sampling by Systematic Stochastic Search
Vikash Mansinghka, Daniel M. Roy, Eric Jonas, and Joshua Tenenbaum
In Proc. Artificial Intelligence and Statistics (AISTATS), 2009.
The Mondrian Process
(with Yee Whye Teh)
In Adv. Neural Information Processing Systems 21 (NIPS), 2009.
Video animation of the Mondrian process as one zooms into the origin (under a beta Lévy rate measure at time t=1.0). See also the time evolution of a Mondrian process on the plane as we zoom in at a rate proportional to time. In both cases, the colors are chosen at random from a palette. These animations were produced by Yee Whye in Matlab. For now, we reserve copyright, but please email me and we will most likely be happy to let you use them.
Chapter 5. Distributions on data structures: a case study
(Mondrian process theory)
Those seeking a more formal presentation of the Mondrian process than the NIPS paper should see Chapter 5 of my dissertation.
Church: a language for generative models
Noah Goodman, Vikash Mansinghka, Daniel M. Roy, Keith Bonawitz, and Joshua Tenenbaum
In Proc. Uncertainty in Artificial Intelligence (UAI), 2008.
Bayesian Agglomerative Clustering with Coalescents
Yee Whye Teh, Hal Daumé III, and Daniel M. Roy
Adv. Neural Information Processing Systems 20 (NIPS), 2008.
Discovering Syntactic Hierarchies
Virginia Savova, Daniel M. Roy, Lauren Schmidt, and Joshua B. Tenenbaum
Proc. Cognitive Science (COGSCI), 2007.
AClass: An online algorithm for generative classification
Vikash K. Mansinghka, Daniel M. Roy, Ryan Rifkin, and Joshua B. Tenenbaum
Proc. Artificial Intelligence and Statistics (AISTATS), 2007.
Efficient Bayesian Task-level Transfer Learning
Daniel M. Roy and Leslie P. Kaelbling
Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI), 2007.
Learning Annotated Hierarchies from Relational Data
Daniel M. Roy, Charles Kemp, Vikash Mansinghka, and Joshua B. Tenenbaum
Adv. Neural Information Processing Systems 19 (NIPS), 2007.
Clustered Naive Bayes
MEng thesis,
Massachusetts Institute of Technology, 2006.
Enhancing Server Availability and Security Through Failure-Oblivious Computing
Martin Rinard, Cristian Cadar, Daniel Dumitran, Daniel M. Roy, Tudor Leu, and William S. Beebee, Jr.
Proc. Operating Systems Design and Implementation (OSDI), 2004.
A Dynamic Technique for Eliminating Buffer Overflow Vulnerabilities (and Other Memory Errors)
Martin Rinard, Cristian Cadar, Daniel Dumitran, Daniel M. Roy, and Tudor Leu
Proc. Annual Computer Security Applications Conference (ACSAC), 2004.
On the Information Complexity of Proper Learners for VC Classes in the Realizable Case
Mahdi Haghifam, Gintare Karolina Dziugaite, Shay Moran, Daniel M. Roy
arXiv:2011.02970
Subsumed by "Towards a Unified Information-Theoretic Framework for Generalization".
Black-box constructions for exchangeable sequences of random multisets
(with Creighton Heaukulani)
arXiv:1908.06349
Stabilizing the Lottery Ticket Hypothesis
Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin
arXiv:1903.01611
Exchangeable modelling of relational data: checking sparsity, train-test splitting, and sparse exchangeable Poisson matrix factorization
Victor Veitch, Ekansh Sharma, Zacharie Naulet, Daniel M. Roy
arXiv:1712.02311
On computable representations of exchangeable data
Nathanael L. Ackerman, Jeremy Avigad, Cameron E. Freer, Daniel M. Roy, and Jason M. Rute
Exchangeable Random Processes and Data Abstraction
Sam Staton, Hongseok Yang, Nate Ackerman, Cameron Freer, Dan Roy
A study of the effect of JPG compression on adversarial images
Gintare Karolina Dziugaite, Zoubin Ghahramani, Daniel M. Roy
arXiv:1608.00853
Neural Network Matrix Factorization
Gintare Karolina Dziugaite and Daniel M. Roy
arXiv:1511.06443
Exchangeable databases and their functional representation
James Lloyd, Peter Orbanz, Zoubin Ghahramani, Daniel M. Roy
NIPS Workshop on Frontiers of Network Analysis: Methods, Models, and Applications, 2013.
On the computability and complexity of Bayesian reasoning
NIPS Philosophy and Machine Learning Workshop, 2011.
When are probabilistic programs probably computationally tractable?
(with Cameron Freer and Vikash Mansinghka)
NIPS Workshop on Monte Carlo Methods for Modern Applications, 2010.
Complexity of Inference in Topic Models
David Sontag and Daniel M. Roy
NIPS Workshop on Applications for Topic Models: Text and Beyond, 2009.
A stochastic programming perspective on nonparametric Bayes
Daniel M. Roy, Vikash Mansinghka, Noah Goodman, and Joshua Tenenbaum
ICML Workshop on Nonparametric Bayes, 2008.
Efficient Specification-Assisted Error Localization
Brian Demsky, Cristian Cadar, Daniel M. Roy, and Martin C. Rinard
Proc. Workshop on Dynamic Analysis (WODA), 2004.
Efficient Specification-Assisted Error Localization and Correction
Brian Demsky, Cristian Cadar, Daniel M. Roy, and Martin C. Rinard
MIT CSAIL Technical Report 927.
November, 2003.
Implementation of Constraint Systems for Useless Variable Elimination
(advised by Mitchell Wand)
Research Science Institute.
August, 1998.
I believe that errata, clarifications, missed citations, links to follow-on work, retractions, and other "marginalia" are important, but underappreciated, contributions to the scientific literature. Ultimately, the nature of a scientific document needs to be rethought, but until then I am slowly collecting marginalia in a simple wiki. I encourage everyone to host similar wikis, or contribute to this one.
Marginalia wiki.