Research Director, CIFAR Canada AI Chair, and Founding Faculty, Vector Institute
Professor
Department of Statistical Sciences
Department of Computer Science (cross-appointment)
University of Toronto
Department of Computer and Mathematical Sciences
University of Toronto Scarborough
Editorial Boards
Action Editor, Journal of Machine Learning Research
Action Editor, Transactions on Machine Learning Research
JUL 2024.
My joint work with Idan Attias, Gintare Karolina Dziugaite, Mahdi Haghifam, and Roi Livni on "Information Complexity of Stochastic Convex Optimization" won a best paper award at ICML 2024 (awarded to 10 of ~10,000 submissions).
APR 2024.
I've been promoted to Full Professor.
SEP 2023.
I've been named Research Director of the Vector Institute.
My research is focused on the foundational principles underlying prediction, inference, and decision making under uncertainty. This focus on foundations ties together multiple lines of work, which span machine learning, statistics, mathematical logic, applied probability, and computer science.
My group has made significant contributions to learning theory, statistical network analysis, decision theory, probabilistic programming, and Bayesian nonparametric statistics. More recently, my group has come to focus on information-theoretic analyses of learning, online learning, and nonstandard analytic foundations for decision theory.
Want to hear more about our research? Watch a recent talk at ETH Zurich summarizing some of our work on "removing assumptions from statistics".
I am seeking students at all levels with strong quantitative backgrounds who are interested in foundational problems at the intersection of machine learning, statistics, and computer science. I am also seeking qualified postdoctoral researchers for two- and three-year positions. Click to read the instructions before contacting me, or your email is likely to be deleted. I value people who are attentive to detail.
STA3000 (Fall 2017)
Advanced Theory of Statistics, Part I
STA4516 (Fall 2017)
Nonstandard Analysis and Applications to Statistics and Probability
STAD68 (Fall 2017)
Advanced Machine Learning and Data Mining
STAD68 (Fall 2016)
Advanced Machine Learning and Data Mining
STA4516 (Fall 2015)
Topics in Probabilistic Programming
STAC63 (Fall 2015)
Probability Models
STAD68 (Winter 2015)
Advanced Machine Learning and Data Mining
STA4513 (Fall 2014)
Statistical models of networks, graphs, and other relational structures
Programs can be used to give compact representations of distributions: in order to represent a distribution, one simply gives a program that would generate an exact sample were the random number generator to produce realizations of independent and identically distributed random variables. This approach to representing distributions by probabilistic programs works not only for simple distributions on numbers like Poissons, Gaussians, etc., and combinations thereof, but also for more exotic distributions on, e.g., phrases in natural language, rendered 2D images of 3D scenes, and climate sensor measurements.
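To make this concrete, here is a minimal illustrative sketch in Python (my own toy example, not taken from any particular system): the short program below is a compact representation of a distribution on numbers, and each run returns one exact sample, assuming the random number generator supplies i.i.d. uniform variates.

```python
import math
import random

def sample_poisson(rate):
    # Knuth's algorithm: multiply uniforms until the running product
    # falls below exp(-rate); the number of multiplications is Poisson(rate).
    threshold = math.exp(-rate)
    count, product = 0, random.random()
    while product > threshold:
        count += 1
        product *= random.random()
    return count

def generative_program():
    # The program *is* the model: a random number of Gaussian components,
    # random component means, a random component choice, then a noisy draw.
    num_components = 1 + sample_poisson(2.0)
    means = [random.gauss(0.0, 10.0) for _ in range(num_components)]
    z = random.randrange(num_components)
    return random.gauss(means[z], 1.0)

print([round(generative_program(), 2) for _ in range(5)])
```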
Probabilistic programming systems support statistical inference on models defined by probabilistic programs. By constraining some variables of a program (e.g., simulated sensor readings in some climate model) and studying the conditional distribution of other variables (e.g., the parameters of the climate model), we can identify plausible variable settings that agree with the constraints. Conditional inferences like this would allow us to, e.g., build predictive text systems for mobile phones, guess the 3D shape of an object from only a photograph, or study the underlying mechanisms driving observed climate measurements.
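As a toy illustration of conditioning (again my own sketch, not the mechanism of any real system), conditional inference can be implemented correctly, if very inefficiently, by rejection: run the generative program repeatedly and keep only the executions that satisfy the constraint; the surviving runs are distributed according to the conditional distribution.

```python
import random

def flip(p):
    # Biased coin flip: True with probability p.
    return random.random() < p

def program():
    # A toy model relating latent causes to an observable.
    raining = flip(0.2)
    sprinkler = flip(0.4)
    grass_wet = flip(0.95) if (raining or sprinkler) else flip(0.05)
    return raining, grass_wet

def prob_raining_given_wet(num_kept=10_000):
    # Rejection sampling: discard runs where the constraint fails.
    kept = []
    while len(kept) < num_kept:
        raining, grass_wet = program()
        if grass_wet:
            kept.append(raining)
    return sum(kept) / len(kept)

print(prob_raining_given_wet())  # approximates P(raining | grass is wet)
```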
Probabilistic programming systems for machine learning and statistics are still in their infancy, and there are many interesting theoretical and applied problems yet to be tackled. My own work focuses on theoretical questions around representing stochastic processes and the computational complexity of sampling-based approaches to inference. I was involved in the definition of the probabilistic programming language Church, and its first implementation, MIT-Church, a Markov chain Monte Carlo algorithm operating on the space of execution histories of an interpreter. Some of my key theoretical work includes a study of the computability of conditional probability and de Finetti measures, both central notions in Bayesian statistics. Readers looking for an overview of these results are directed to the introduction of my doctoral dissertation. A less technical description of a probabilistic programming approach to artificial intelligence can be found in a recent book chapter on legacies of Alan Turing, co-authored with Freer and Tenenbaum.
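To convey the flavor of such sampling-based inference, here is a heavily simplified sketch (for illustration only; the data, names, and tuning constants are hypothetical): Metropolis-Hastings over the random choices of a tiny program with a single latent Gaussian mean. MIT-Church applies the same idea to the full execution history of an interpreter rather than to a single variable.

```python
import math
import random

data = [2.1, 1.9, 2.4, 2.2]  # hypothetical observations

def log_joint(mu):
    # log of the N(mu; 0, 10^2) prior times the N(x; mu, 1) likelihoods,
    # up to additive constants, which cancel in the acceptance ratio.
    log_prior = -0.5 * (mu / 10.0) ** 2
    log_lik = sum(-0.5 * (x - mu) ** 2 for x in data)
    return log_prior + log_lik

def metropolis_hastings(steps=5000, step_size=0.5):
    mu = 0.0  # the "trace" here is a single random choice
    samples = []
    for _ in range(steps):
        proposal = mu + random.gauss(0.0, step_size)  # symmetric proposal
        # Accept with probability min(1, joint(proposal) / joint(mu)).
        if math.log(random.random()) < log_joint(proposal) - log_joint(mu):
            mu = proposal
        samples.append(mu)
    return samples

chain = metropolis_hastings()
burned = chain[1000:]  # discard burn-in
print(sum(burned) / len(burned))  # estimate of the posterior mean
```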
More information on probabilistic programming can be found on probabilistic-programming.org, a wiki that I maintain. In particular, look at the list of research articles and papers on probabilistic programming and the tutorials.
I co-organized the first workshop on probabilistic programming for statistics and machine learning at NIPS*2008 (with Vikash Mansinghka, John Winn, David McAllester and Josh Tenenbaum). Four years later, I co-organized the second workshop on probabilistic programming at NIPS*2012 (with Vikash Mansinghka and Noah Goodman).
Hello, my name is Daniel M. Roy (or simply, Dan), and I am a Professor of Statistics at the University of Toronto, with cross-appointments in Computer Science and in Electrical and Computer Engineering.
I received all three of my degrees in (Electrical Engineering and) Computer Science from MIT. As a doctoral student, I was a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL), where I was advised by Leslie Kaelbling. I also collaborated with members of Josh Tenenbaum's Computational Cognitive Science group.
After my doctorate, I was a Newton International Fellow of the Royal Society and then a Research Fellow of Emmanuel College at the University of Cambridge. At Cambridge, I was a member of the Machine Learning Group, headed by Zoubin Ghahramani, which was part of the Computational and Biological Learning Lab in the Department of Engineering.
Nate Ackerman, Siddharth Ancha, Jeremy Avigad, Matej Balog, Marco Battiston, William Beebee, Blair Bilodeau, Keith Bonawitz, Cristian Cadar, Hal Daumé III, Brian Demsky, Haosui Duanmu, Daniel Dumitran, Gintare Karolina Dziugaite, Stefano Favaro, Cameron Freer, Zoubin Ghahramani, Noah Goodman, Creighton Heaukulani, Jonathan Huggins, Eric Jonas, Leslie Kaelbling, Charles Kemp, Balaji Lakshminarayanan, Tudor Leu, James Lloyd, Vikash Mansinghka, Jeffrey Negrea, Peter Orbanz, Mansheej Paul, Ali Ramezani-Kebrya, Ryan Rifkin, Martin Rinard, Jason Rute, Virginia Savova, Lauren Schmidt, Ekansh Sharma, David Sontag, Sam Staton, Yee Whye Teh, Josh Tenenbaum, Victor Veitch, David Wingate, Hongseok Yang
Family
Meloney Roy,
Kenny Roy.
Several template .tex files for producing slides for mathematical talks that more closely mimic the chalk-talk aesthetic.
de Finetti's theorem and the existence of regular conditional distributions and strong laws on exchangeable algebras
(with Peter Potaptchik and David Schrittesser)
arXiv:2312.16349
Statistical minimax theorems via nonstandard analysis
(with Haosui Duanmu and David Schrittesser)
arXiv:2212.13250
Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics
Jeffrey Negrea, Jun Yang, Haoyue Feng, Daniel M. Roy, Jonathan H. Huggins
arXiv:2207.12395
Admissibility is Bayes optimality with infinitesimals
(with Haosui Duanmu and David Schrittesser)
arXiv:2112.14257
Enforcing Interpretability and its Statistical Impacts: Trade-offs between Accuracy and Interpretability
Gintare Karolina Dziugaite, Shai Ben-David, Daniel M. Roy
arXiv:2010.13764
Approximations in Probabilistic Programs
Ekansh Sharma, Daniel M. Roy
arXiv:1912.06791
The Class of Random Graphs Arising from Exchangeable Random Measures
Victor Veitch and Daniel M. Roy
arXiv:1512.03099
The continuum-of-urns scheme,
generalized beta and Indian buffet processes,
and hierarchies thereof
arXiv:1501.00208
Note: Articles distinguished by "(with ...)" have alphabetical author lists, as is the convention in mathematics and theoretical computer science.
Computability, inference and modeling in probabilistic programming
Daniel M. Roy
Ph.D. thesis, Massachusetts Institute of Technology,
2011.
MIT/EECS George M. Sprowls Doctoral Dissertation Award
Sequential Probability Assignment with Contexts: Minimax Regret, Contextual Shtarkov Sums, and Contextual Normalized Maximum Likelihood
Ziyi Liu, Idan Attias, Daniel M. Roy
In Advances in Neural Information Processing Systems (NeurIPS), 2024.
arXiv:2410.03849
Simultaneous linear connectivity of neural networks modulo permutation
Ekansh Sharma, Devin Kwok, Tom Denton, Daniel M. Roy, David Rolnick, Gintare Karolina Dziugaite
In Proc. European Conference on Machine Learning (ECML), 2024.
arXiv:2404.06498
Causal Bandits: The Pareto Optimal Frontier of Adaptivity, a Reduction to Linear Bandits, and Limitations around Unknown Marginals
Ziyi Liu, Idan Attias, Daniel M. Roy
In Proc. International Conf. on Machine Learning (ICML), 2024.
arXiv:2407.00950
Information Complexity of Stochastic Convex Optimization:
Applications to Generalization and Memorization
(with Idan Attias, Gintare Karolina Dziugaite, Mahdi Haghifam, Roi Livni)
In Proc. International Conf. on Machine Learning (ICML), 2024. ICML 2024 Best Paper
arXiv:2402.09327
Probabilistic programming interfaces for random graphs: Markov categories, graphons, and nominal sets
Nathanael L. Ackerman, Cameron E. Freer, Younesse Kaddar, Jacek Karwowski, Sean K. Moss, Daniel M. Roy, Sam Staton, Hongseok Yang
Proc. ACM SIGPLAN Symposium on Principles of Programming Languages (POPL), 2024.
doi:10.1145/3632903
arXiv:2312.17127
The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit
Lorenzo Noci, Chuning Li, Mufan Bill Li, Bobby He, Thomas Hofmann, Chris Maddison, Daniel M. Roy
Advances in Neural Information Processing Systems (NeurIPS), 2023.
arXiv:2306.17759
Relaxing the I.I.D. Assumption: Adaptive Minimax Optimal Sequential Prediction with Expert Advice
Blair Bilodeau, Jeffrey Negrea, Daniel M. Roy
Annals of Statistics 51(4): 1850-1876 (2023).
doi:10.1214/23-AOS2315
doi:10.1214/23-AOS2315SUPP
arXiv:2007.06552
Minimax Rates for Conditional Density Estimation via Empirical Entropy
Blair Bilodeau, Dylan J. Foster, Daniel M. Roy
Annals of Statistics 51(2): 762-790 (2023).
doi:10.1214/23-AOS2270
doi:10.1214/23-AOS2270SUPP (supplement)
arXiv:2109.10461
Limitations of Information-Theoretic Generalization Bounds for Gradient Descent Methods in Stochastic Convex Optimization
Mahdi Haghifam, Borja Rodríguez-Gálvez, Ragnar Thobaben, Mikael Skoglund, Daniel M. Roy, Gintare Karolina Dziugaite
Proc. International Conference on Algorithmic Learning Theory (ALT), 2023.
arXiv:2212.13556
Understanding Generalization via Leave-One-Out Conditional Mutual Information
Mahdi Haghifam, Shay Moran, Daniel M. Roy, Gintare Karolina Dziugaite
Proc. IEEE Int. Symp. Information Theory (ISIT), 2022.
arXiv:2206.14800
Existence of matching priors on compact spaces
(with Haosui Duanmu and Aaron Smith)
Biometrika, 110(3): 763-776, 2023.
doi:10.1093/biomet/asac061
arXiv:2011.03655
The Neural Covariance SDE:
Shaped Infinite Depth-and-Width Networks at Initialization
Mufan Bill Li, Mihai Nica, Daniel M. Roy
Advances in Neural Information Processing Systems (NeurIPS), 2022. Oral
arXiv:2206.02768
Adaptively Exploiting d-Separators with Causal Bandits
Blair Bilodeau, Linbo Wang, Daniel M. Roy
Advances in Neural Information Processing Systems (NeurIPS), 2022. Oral
arXiv:2202.05100
Pruning's Effect on Generalization Through the Lens of Training and Regularization
Tian Jin, Daniel M. Roy, Michael Carbin, Jonathan Frankle, Gintare Karolina Dziugaite
Advances in Neural Information Processing Systems (NeurIPS), 2022.
arXiv:2210.13738
Towards a Unified Information-Theoretic Framework for Generalization
Mahdi Haghifam, Gintare Karolina Dziugaite, Shay Moran, Daniel M. Roy
Advances in Neural Information Processing Systems 34 (NeurIPS), 2021. Spotlight
arXiv:2111.05275
Minimax Optimal Quantile and Semi-Adversarial Regret via Root-Logarithmic Regularizers
Jeffrey Negrea, Blair Bilodeau, Nicolò Campolongo, Francesco Orabona, Daniel M. Roy
Advances in Neural Information Processing Systems 34 (NeurIPS), 2021.
arXiv:2110.14804
The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization
Mufan (Bill) Li, Mihai Nica, Daniel M. Roy
Advances in Neural Information Processing Systems 34 (NeurIPS), 2021.
arXiv:2106.04013
Information-Theoretic Generalization Bounds for Stochastic Gradient Descent
Gergely Neu, Gintare Karolina Dziugaite, Mahdi Haghifam, Daniel M. Roy
Conference on Learning Theory (COLT), 2021.
arXiv:2102.00931
NUQSGD: Improved Communication Efficiency for Data-parallel SGD via Nonuniform Quantization
Ali Ramezani-Kebrya, Fartash Faghri, Ilia Markov, Vitaly Aksenov, Dan Alistarh, Daniel M. Roy
Journal of Machine Learning Research, 22(114):1–43, 2021.
jmlr.org/papers/v22/20-255
arXiv:1908.06077
On Extended Admissible Procedures and their Nonstandard Bayes Risk
Haosui Duanmu and Daniel M. Roy
Annals of Statistics, 49(4): 2053-2078 (2021).
doi:10.1214/20-AOS2026
doi:10.1214/20-AOS2026SUPP (supplement)
On the role of data in PAC-Bayes bounds
Gintare Karolina Dziugaite, Kyle Hsu, Waseem Gharbieh, Gabriel Arpino, Daniel M. Roy
Artificial Intelligence and Statistics (AISTATS), 2021.
arXiv:2006.10929
Pruning Neural Networks at Initialization: Why are We Missing the Mark?
Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin
International Conference on Learning Representations (ICLR), 2021.
arXiv:2009.08576
An estimator for the tail-index of graphex processes
Zacharie Naulet, Ekansh Sharma, Victor Veitch, Daniel M. Roy
Electronic Journal of Statistics, Volume 15, Number 1 (2021), 282-325.
arXiv:1712.01745
doi:10.1214/20-EJS1789
Adaptive Gradient Quantization for Data-Parallel SGD
Fartash Faghri, Iman Tabrizian, Ilia Markov, Dan Alistarh, Daniel M. Roy, Ali Ramezani-Kebrya
Advances in Neural Information Processing Systems 33 (NeurIPS), 2020.
arXiv:2010.12460
In Search of Robust Measures of Generalization
Gintare Karolina Dziugaite, Alexandre Drouin, Brady Neal, Nitarshan Rajkumar, Ethan Caballero, Linbo Wang, Ioannis Mitliagkas, Daniel M. Roy
Advances in Neural Information Processing Systems 33 (NeurIPS), 2020.
arXiv:2010.11924
Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel
Stanislav Fort, Gintare Karolina Dziugaite, Mansheej Paul, Sepideh Kharaghani, Daniel M. Roy, Surya Ganguli
Advances in Neural Information Processing Systems 33 (NeurIPS), 2020.
arXiv:2010.15110
Sharpened Generalization Bounds based on Conditional Mutual Information and an Application to Noisy, Iterative Algorithms
Mahdi Haghifam, Jeffrey Negrea, Ashish Khisti, Daniel M. Roy, Gintare Karolina Dziugaite
Advances in Neural Information Processing Systems 33 (NeurIPS), 2020.
arXiv:2004.12983
Improved Bounds on Minimax Regret under Logarithmic Loss via Self-Concordance
Blair Bilodeau, Dylan Foster, Daniel M. Roy
In Proc. International Conf. on Machine Learning (ICML), 2020.
arXiv:2007.01160
Linear Mode Connectivity and the Lottery Ticket Hypothesis
Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin
In Proc. International Conf. on Machine Learning (ICML), 2020.
arXiv:1912.05671
In Defense of Uniform Convergence: Generalization via Derandomization with an Application to Interpolating Predictors
Jeffrey Negrea, Gintare Karolina Dziugaite, Daniel M. Roy
In Proc. International Conf. on Machine Learning (ICML), 2020.
arXiv:1912.04265
Fast-rate PAC-Bayes Generalization Bounds via Shifted Rademacher Processes
Jun Yang, Shengyang Sun, Daniel M. Roy
Advances in Neural Information Processing Systems 32 (NeurIPS), 2019.
arXiv:1908.07584
Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates
Jeffrey Negrea, Mahdi Haghifam, Gintare Karolina Dziugaite, Ashish Khisti, Daniel M. Roy
Advances in Neural Information Processing Systems 32 (NeurIPS), 2019.
arXiv:1911.02151
Gibbs-type Indian Buffet Processes
(with Creighton Heaukulani)
Bayesian Analysis.
arXiv:1512.02543
doi:10.1214/19-BA1166
Algorithmic barriers to representing conditional independence
(with Nathanael L. Ackerman, Jeremy Avigad, Cameron E. Freer, and Jason M. Rute)
Proc. Logic in Computer Science (LICS), 2018.
arXiv:1801.10387 (old)
doi:10.1109/LICS.2019.8785762
On the computability of conditional probability
(with Nate Ackerman and Cameron Freer)
Journal of the ACM.
arXiv:1005.3014
doi:10.1145/3321699
Sampling and Estimation for (Sparse) Exchangeable Graphs
Victor Veitch and Daniel M. Roy
Annals of Statistics, 47(6): 3274-3299 (2019).
doi:10.1214/18-AOS1778
doi:10.1214/18-AOS1778SUPP (supplement)
arXiv:1611.00843
Data-dependent PAC-Bayes priors via differential privacy
Gintare Karolina Dziugaite and Daniel M. Roy
In Advances in Neural Information Processing Systems 31 (NeurIPS), 2018.
arXiv:1802.09583
Entropy-SGD optimizes the prior of a PAC-Bayes bound: Generalization properties of Entropy-SGD and data-dependent priors
Gintare Karolina Dziugaite and Daniel M. Roy
In Proc. International Conf. on Machine Learning (ICML), 2018.
arXiv:1712.09376
The Beta-Bernoulli process and algebraic effects
Sam Staton, Dario Stein, Hongseok Yang, Nathanael Ackerman, Cameron Freer, Daniel M. Roy
In Proc. Int. Colloq. on Automata, Languages, and Programming (ICALP), 2018.
arXiv:1802.09598
Sequential Monte Carlo as Approximate Sampling: bounds, adaptive resampling via ∞-ESS, and an application to Particle Gibbs
(with Jonathan Huggins)
Bernoulli.
arXiv:1503.00966
doi:10.3150/17-BEJ999
A characterization of product-form exchangeable feature probability functions
Marco Battiston, Stefano Favaro, Daniel M. Roy, Yee Whye Teh
Annals of Applied Probability.
arXiv:1607.02066
doi:10.1214/17-AAP1333
Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data
Gintare Karolina Dziugaite and Daniel M. Roy
In Proc. Uncertainty in Artificial Intelligence (UAI), 2017.
arXiv:1703.11008
code
Measuring the reliability of MCMC inference with bidirectional Monte Carlo
Roger B. Grosse, Siddharth Ancha, and Daniel M. Roy
In Adv. Neural Information Processing Systems 29 (NIPS), 2016.
arXiv:1606.02275
nips
On computability and disintegration
(with Nate Ackerman and Cameron Freer)
In Mathematical Structures in Computer Science.
arXiv:1509.02992 (recommended)
doi:10.1017/S0960129516000098
The Mondrian Kernel
Matej Balog, Balaji Lakshminarayanan, Zoubin Ghahramani, Daniel M. Roy, Yee Whye Teh
In Proc. Uncertainty in Artificial Intelligence (UAI), 2016.
arXiv:1606.05241
Mondrian Forests for Large-Scale Regression when Uncertainty Matters
Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh
In Proc. Artificial Intelligence and Statistics (AISTATS), 2016.
arXiv:1506.03805
Training generative neural networks via Maximum Mean Discrepancy optimization
Gintare Karolina Dziugaite, Daniel M. Roy, and Zoubin Ghahramani
In Proc. Uncertainty in Artificial Intelligence (UAI), 2015.
arXiv:1505.03906
The combinatorial structure of beta negative binomial processes
(with Creighton Heaukulani)
Bernoulli, 2016, Vol. 22, No. 4, 2301–2324.
doi:10.3150/15-BEJ729
Particle Gibbs for Bayesian Additive Regression Trees
Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh
Proc. Artificial Intelligence and Statistics (AISTATS), 2015.
arXiv:1502.04622
Mondrian Forests: Efficient Online Random Forests
Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh
Adv. Neural Information Processing Systems 27 (NIPS), 2014.
code
Bayesian Models of Graphs, Arrays and Other Exchangeable Random Structures
(with Peter Orbanz)
IEEE Trans. Pattern Anal. Mach. Intelligence (PAMI), 2014.
doi:10.1109/TPAMI.2014.2334607
arXiv:1312.7857
slides for talk at CIMAT
Top-down particle filtering for Bayesian decision trees
Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh
Proc. Int. Conf. on Machine Learning (ICML), 2013.
code
Towards common-sense reasoning via conditional simulation:
Legacies of Turing in Artificial Intelligence
(with Cameron Freer and Josh Tenenbaum)
Turing's Legacy (ASL Lecture Notes in Logic), 2012.
doi:10.1017/CBO9781107338579.007
Random function priors for exchangeable arrays with applications to graphs and relational data
James Lloyd, Peter Orbanz, Zoubin Ghahramani, Daniel M. Roy
Adv. Neural Information Processing Systems 25 (NIPS), 2012.
Computable de Finetti measures
(with Cameron Freer)
Annals of Pure and Applied Logic, 2012.
doi:10.1016/j.apal.2011.06.011
Complexity of Inference in Latent Dirichlet Allocation
David Sontag and Daniel M. Roy
Adv. Neural Information Processing Systems 24 (NIPS), 2011.
Noncomputable conditional distributions
(with Nate Ackerman and Cameron Freer)
Proc. Logic in Computer Science (LICS), 2011.
Probabilistically Accurate Program Transformations
Sasa Misailovic, Daniel M. Roy, and Martin C. Rinard
Proc. Int. Static Analysis Symp. (SAS), 2011.
Bayesian Policy Search with Policy Priors
David Wingate, Noah D. Goodman, Daniel M. Roy, Leslie P. Kaelbling, and Joshua B. Tenenbaum
Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI), 2011.
Posterior distributions are computable from predictive distributions
(with Cameron Freer)
Proc. Artificial Intelligence and Statistics (AISTATS), 2010.
The Infinite Latent Events Model
David Wingate, Noah D. Goodman, Daniel M. Roy, and Joshua B. Tenenbaum
In Proc. Uncertainty in Artificial Intelligence (UAI), 2009.
Computable exchangeable sequences have computable de Finetti measures
(with Cameron Freer)
In Proc. Computability in Europe (CiE), 2009.
Exact and Approximate Sampling by Systematic Stochastic Search
Vikash Mansinghka, Daniel M. Roy, Eric Jonas, and Joshua Tenenbaum
In Proc. Artificial Intelligence and Statistics (AISTATS), 2009.
The Mondrian Process
(with Yee Whye Teh)
In Adv. Neural Information Processing Systems 21 (NIPS), 2009.
Video animation of the Mondrian process as one zooms into the origin (under a beta Lévy rate measure at time t=1.0). See also the time evolution of a Mondrian process on the plane as we zoom in at a rate proportional to time. In both cases, the colors are chosen at random from a palette. These animations were produced by Yee Whye in Matlab. For now, we reserve copyright, but please email me and we will most likely be happy to let you use them.
Chapter 5. Distributions on data structures: a case study
(Mondrian process theory)
Those seeking a more formal presentation of the Mondrian process than the NIPS paper should see Chapter 5 of my dissertation.
Church: a language for generative models
Noah Goodman, Vikash Mansinghka, Daniel M. Roy, Keith Bonawitz, and Joshua Tenenbaum
In Proc. Uncertainty in Artificial Intelligence (UAI), 2008.
Bayesian Agglomerative Clustering with Coalescents
Yee Whye Teh, Hal Daumé III, and Daniel M. Roy
Adv. Neural Information Processing Systems 20 (NIPS), 2008.
Discovering Syntactic Hierarchies
Virginia Savova, Daniel M. Roy, Lauren Schmidt, and Joshua B. Tenenbaum
Proc. Cognitive Science (COGSCI), 2007.
AClass: An online algorithm for generative classification
Vikash K. Mansinghka, Daniel M. Roy, Ryan Rifkin, and Joshua B. Tenenbaum
Proc. Artificial Intelligence and Statistics (AISTATS), 2007.
Efficient Bayesian Task-level Transfer Learning
Daniel M. Roy and Leslie P. Kaelbling
Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI), 2007.
Learning Annotated Hierarchies from Relational Data
Daniel M. Roy, Charles Kemp, Vikash Mansinghka, and Joshua B. Tenenbaum
Adv. Neural Information Processing Systems 19 (NIPS), 2007.
Clustered Naive Bayes
MEng thesis,
Massachusetts Institute of Technology, 2006.
Enhancing Server Availability and Security Through Failure-Oblivious Computing
Martin Rinard, Cristian Cadar, Daniel Dumitran, Daniel M. Roy, Tudor Leu, and William S. Beebee, Jr.
Proc. Operating Systems Design and Implementation (OSDI), 2004.
A Dynamic Technique for Eliminating Buffer Overflow Vulnerabilities (and Other Memory Errors)
Martin Rinard, Cristian Cadar, Daniel Dumitran, Daniel M. Roy, and Tudor Leu
Proc. Annual Computer Security Applications Conference (ACSAC), 2004.
On the Information Complexity of Proper Learners for VC Classes in the Realizable Case
Mahdi Haghifam, Gintare Karolina Dziugaite, Shay Moran, Daniel M. Roy
arXiv:2011.02970
Subsumed by "Towards a Unified Information-Theoretic Framework for Generalization".
Black-box constructions for exchangeable sequences of random multisets
(with Creighton Heaukulani)
arXiv:1908.06349
Stabilizing the Lottery Ticket Hypothesis
Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin
arXiv:1903.01611
Exchangeable modelling of relational data: checking sparsity, train-test splitting, and sparse exchangeable Poisson matrix factorization
Victor Veitch, Ekansh Sharma, Zacharie Naulet, Daniel M. Roy
arXiv:1712.02311
On computable representations of exchangeable data
Nathanael L. Ackerman, Jeremy Avigad, Cameron E. Freer, Daniel M. Roy, and Jason M. Rute
Exchangeable Random Processes and Data Abstraction
Sam Staton, Hongseok Yang, Nate Ackerman, Cameron Freer, Dan Roy
A study of the effect of JPG compression on adversarial images
Gintare Karolina Dziugaite, Zoubin Ghahramani, Daniel M. Roy
arXiv:1608.00853
Neural Network Matrix Factorization
Gintare Karolina Dziugaite and Daniel M. Roy
arXiv:1511.06443
Exchangeable databases and their functional representation
James Lloyd, Peter Orbanz, Zoubin Ghahramani, Daniel M. Roy
NIPS Workshop on Frontiers of Network Analysis: Methods, Models, and Applications, 2013.
On the computability and complexity of Bayesian reasoning
NIPS Philosophy and Machine Learning Workshop, 2011.
When are probabilistic programs probably computationally tractable?
(with Cameron Freer and Vikash Mansinghka)
NIPS Workshop on Monte Carlo Methods for Modern Applications, 2010.
Complexity of Inference in Topic Models
David Sontag and Daniel M. Roy
NIPS Workshop on Applications for Topic Models: Text and Beyond, 2009.
A stochastic programming perspective on nonparametric Bayes
Daniel M. Roy, Vikash Mansinghka, Noah Goodman, and Joshua Tenenbaum
ICML Workshop on Nonparametric Bayes, 2008.
Efficient Specification-Assisted Error Localization
Brian Demsky, Cristian Cadar, Daniel M. Roy, and Martin C. Rinard
Proc. Workshop on Dynamic Analysis (WODA), 2004.
Efficient Specification-Assisted Error Localization and Correction
Brian Demsky, Cristian Cadar, Daniel M. Roy, and Martin C. Rinard
MIT CSAIL Technical Report 927.
November, 2003.
Implementation of Constraint Systems for Useless Variable Elimination
(advised by Mitchell Wand)
Research Science Institute.
August, 1998.
I believe that errata, clarifications, missed citations, links to follow-on work, retractions, and other "marginalia" are important, but underappreciated, contributions to the scientific literature. Ultimately, the nature of a scientific document needs to be rethought, but until then I am slowly collecting marginalia in a simple wiki. I encourage everyone to host similar wikis, or contribute to this one.
Marginalia wiki.