# STA4513

Statistical Models of Networks, Graphs, and other Relational Structures

## Overview

This course is a survey of topics on the statistical analysis of network, graph, and relational data, with an emphasis on network modeling and random graphs. In particular, the role of probabilistic symmetries---including exchangeability and stationarity---will be our primary concern, at least in the second half.

Our understanding of graph- and network-valued data has undergone a dramatic shift in the past decade. We now understand there to be fundamentally different regimes that relate to the prevalence of edges. The best understood is the dense regime, where, informally speaking, we expect to see edges among vertices chosen uniformly at random from a large graph. The mathematical foundations of this area can be traced back to work by Aldous and Hoover in the early 1980s, but work in graph theory over the past decade has enriched our understanding considerably. Most existing statistical methods, especially Bayesian ones, work implicitly in the dense regime. Real-world networks, however, are not dense. A growing community is now focused on the structure of large sparse graphs. The sparse regime, however, is not well understood: key mathematical notions continue to be identified. We will work through key papers in probability, statistics, and graph theory in order to gain the broader perspective necessary to identify opportunities to contribute to our understanding of statistical methods on graphs and networks.

### Pre-requisites

Mathematical maturity and some background in linear algebra, analysis, measure theory, and probability theory recommended.

## Time, Location, Etc.

**Date/Time:** Fall 2014, Wednesdays, 2--5pm from Sept. 10 through Oct. 15

**Room:** WE 69 (Wetmore Hall, New College, across Huron St from Sidney Smith)

Thinking about attending? Please fill out this form.

## Syllabus

Link to syllabus: contains course outline, reading, grading policy, etc.

## Course poster

Feel free to use this poster to let others know about the class.

## Scribed lecture notes

- Overview, Latent Variable Models [source]
- Exchangeable Random Graph Models [source]
- Exchangeability of Sequences and Graphs [source]
- Graphons [source]
- Graphings and unimodular random networks [source]
- Graphings and unimodular random networks, continued [source]

If you find errors/omissions or would otherwise like to improve the slides, please send updated versions. I may eventually migrate this to a github repo.

### Scribing

Every student taking this class for credit is required to scribe at least one lecture. Those auditing the course may be asked to scribe as well. Scribe notes should present the material covered in class in a clear, concise, and comprehensive manner, which will likely require that one go beyond merely replicating what is written on the chalk board. (This MIT course website has a section on Scribing that does a good job describing "good" scribe notes.)

Scribe notes are to be produced using this Latex template. Scribe notes (.tex file and all additional files) should be emailed to me no later than Friday of the same week. I will review the scribe notes to check the technical content and quality of writing, and will give feedback and ask for revisions if necessary. Please make sure to proofread the notes carefully before submitting. Completed scribe notes will be posted above.

Sign up sheet for scribing. (Email me to add your name.)

## Student Presentations

List of upcoming presentations and presentors. See the syllabus for information on presentations and expectations.

- Inference Phase Transitions in the Detection of Modules in Sparse Networks

Presented by Alberto Camacho - Modeling homophily and stochastic equivalence in symmetric relational data

Presented by Jinlong Fu - New Specifications for Exponential Random Graph Models

Presented by Namdar Homayounfar - Bayesian Models of Graphs, Arrays and Other Exchangeable Random Structures

Presented by Jinlong Fu - Estimating and understanding exponential random graph models

Presented by Alberto Camacho

See also these geogebra example 1 and 2. - Bayesian nonparametric models of sparse and exchangeable random graphs

Presented by Victor Veitch - Pseudo-likelihood methods for community detection in large sparse networks

Presented by Namdar Homayounfar