EECE 571F (2022 Winter Term 1): Deep Learning with Structures


Structures are pervasive in science and engineering. Some structures are directly observable, e.g., 3D point clouds, molecules, phylogenetic trees, and social networks, whereas others are latent or hard to measure, e.g., parse trees for languages/images, causal graphs, and latent interactions among actors in multi-agent systems. Advanced deep learning techniques have recently emerged to process data effectively in these scenarios.

This course will teach cutting-edge deep learning models and methods with structures from probabilistic and geometric perspectives. In particular, for observable structures, we will introduce popular models, e.g., Transformers and Graph Neural Networks, with an emphasis on motivating applications, design principles, practical and/or theoretical limitations, and future directions. For latent structures, we will introduce motivating applications, latent variable models (e.g., variational auto-encoders), inference methods (e.g., amortization and search), and learning methods (e.g., REINFORCE and relaxation).

Previous Version: 2021 Winter Term 2


Course Information

Where and When

Instructor: Renjie Liao
TA: Qi Yan
Section 1: 1:30pm to 3:00pm, Monday, MacLeod 3002
Section 2: 1:30pm to 3:00pm, Wednesday, Forest Sciences Centre 1221
Office Hour: 3:00pm to 4:00pm, Tuesday, Fred Kaiser 3065 (Maxwell)


Course Structure

The instructor will lecture every week except the last two weeks, when students will present their projects.
Students should ask all course-related questions on Piazza. We will use Canvas to handle submission and evaluation of all reports and project-related files.


Students can work on projects individually or in groups of up to four (groups should be formed as early as possible). Students are strongly encouraged to form groups, e.g., by discussing on Piazza. However, a larger group is expected to accomplish more than a smaller group or an individual. All students in a group will receive the same grade. Students are allowed to undertake a research project that is related to their thesis or other external projects, but the work done for this course must represent substantial additional work and cannot be submitted for grades in another course.

The grade will depend on the quality of your research ideas, how well you present them in the report, how clearly you position your work relative to prior literature, how illuminating and/or convincing your experiments are, and how well supported your conclusions are. Full marks will require a novel contribution.

Each group of students will write a short (>=2 pages) research project proposal, which ideally will be structured similarly to a standard paper. You don’t have to do exactly what your proposal claims; the point of the proposal is mainly to give you a plan and to allow me to give you feedback. Students will give a short presentation (roughly 5 minutes for an individual, 10 to 15 minutes for a larger group) on their projects towards the end of the course. At the end of the course, every group needs to submit a project report (6 to 8 pages).

Evaluation Policy

Grades will be based on:

Good Project Reports from Last Semester

Important Notes

  1. All reports (i.e., paper reading report, proposal, peer-review report, and final project report) must be written in NeurIPS conference format and must be submitted as PDF.

  2. Late work will be automatically subject to a 20% penalty and can be submitted up to 3 days after the deadline.

  3. UBC values academic integrity. Therefore, all students must understand the meaning and consequences of cheating, plagiarism and other academic offences under the Code of Student Conduct and Discipline.

  4. It is the responsibility of each student to understand the policy for each piece of coursework, ask the instructor if anything is unclear, and carefully acknowledge all sources (papers, code, books, websites, individual communications) using an appropriate referencing style when submitting work.


This is a tentative schedule, which will likely change as the course goes on.

| # | Dates | Lecture Topic | Lecture Slides | Suggested Readings |
|---|---|---|---|---|
| 1 | Sep. 7 | Introduction to Deep Learning | slides | Chapter 13, 14 of PML book & DL book |
| 2 | Sep. 12, Sep. 14, Sep. 21 | Geometric Deep Learning: Invariance, Equivariance, and Deep Learning Models for Sets & Sequences | slides | DeepSets & Transformers & PreNorm & VisionTransformers & SwinTransformers & Chapter 15 of PML book |
| 3 | Sep. 26, Sep. 28 | Geometric Deep Learning: Graph Neural Networks: Message Passing Models | slides | Part II of GRL book & Chapter 23 of PML book & Chapter 4 of GNN book & GNNs & GGNNs & GAT & Graphormer & GPS |
| 4 | Oct. 3, Oct. 5 | Geometric Deep Learning: Graph Neural Networks: Graph Convolution Models | slides | Part II of GRL book & Chapter 23 of PML book & Chapter 4 of GNN book & GCNs & ChebyNet & LanczosNet |
| 5 | Oct. 12, Oct. 17 | Geometric Deep Learning: Expressiveness & Generalization of Graph Neural Networks | slides | GIN & PAC-Bayes Bounds |
| 6 | Oct. 19, Oct. 24 | Geometric Deep Learning: Unsupervised/Self-supervised Graph Representation Learning | slides | DeepWalk & DeepGraphInfomax & SimCLR & SimCLRv2 |
| 7 | Oct. 26, Oct. 31 | Probabilistic Deep Learning: Deep Generative Models of Graphs: Auto-Regressive Models | slides | Chapter 11 of GNN book & DGMG & GraphRNN & GRAN |
| 8 | Nov. 2, Nov. 7 | Probabilistic Deep Learning: Deep Generative Models of Graphs: VAEs and GANs | slides | VGAE & GraphVAE & JunctionTreeVAEs & MolGANs |
| 9 | Nov. 16, Nov. 21 | Probabilistic Deep Learning: Discrete Latent Variable Models & Contrastive Divergence & Amortized Inference & REINFORCE & Variance Reduction & Reparameterization & Wake-Sleep Algorithm | slides | RBMs & CD & NVIL & VAE & Wake-Sleep & Oops I Took A Gradient |
| 10 | Nov. 23 | Probabilistic Deep Learning: Score-based and Denoising Diffusion-based Generative Models | Guest Lecture by Dr. Yang Song | Score-based Models & ScoreSDE & DDPM |
| 11 | Nov. 28 | Probabilistic Deep Learning: Stochastic Gradient Estimation | slides | Straight-through Estimator & Gumbel-Softmax & Gumbel-TopK & RELAX |
| 12 | Nov. 30 | Probabilistic Deep Learning: Learning Latent Graph Structures | slides | NRI & Learning Discrete Structures for GNNs |
| 13 | Dec. 5, Dec. 7 | Project Presentation | | |


Can I audit or sit in?

Auditing guests are welcome if you are a member of the UBC community (registered student, staff, and/or faculty). I would appreciate it if you emailed me first. If the in-person class is full and running out of space, I would ask that you please give priority to registered students.

Is there a textbook for this course?

While there is no required textbook, I recommend the following closely relevant ones for further reading:

I also recommend that self-motivated students take a look at similar courses taught at other universities:

Paper List

Supervised Deep Learning with Observable Structures

  1. Deep Sets
  2. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
  3. Attention is all you need
  4. An image is worth 16x16 words: Transformers for image recognition at scale
  5. Learning transferable visual models from natural language supervision
  6. Sequence to sequence learning with neural networks
  7. MLP-Mixer: An all-MLP Architecture for Vision
  8. Semi-Supervised Classification with Graph Convolutional Networks
  9. Gated Graph Sequence Neural Networks
  10. How Powerful are Graph Neural Networks?
  11. Spectral Networks and Locally Connected Networks on Graphs
  12. NerveNet: Learning Structured Policy with Graph Neural Networks
  13. The graph neural network model (the original Graph Neural Networks paper)
  14. Neural Message Passing for Quantum Chemistry
  15. Graph Attention Networks
  16. LanczosNet: Multi-Scale Deep Graph Convolutional Networks
  17. Graph Signal Processing: Overview, Challenges, and Applications
  18. Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
  19. 3D Graph Neural Networks for RGBD Semantic Segmentation
  20. Few-Shot Learning with Graph Neural Networks
  21. Convolutional Networks on Graphs for Learning Molecular Fingerprints
  22. node2vec: Scalable Feature Learning for Networks
  23. Inductive Representation Learning on Large Graphs
  24. Learning Lane Graph Representations for Motion Forecasting
  25. Representation Learning on Graphs: Methods and Applications
  26. Modeling Relational Data with Graph Convolutional Networks
  27. Hierarchical Graph Representation Learning with Differentiable Pooling
  28. Inference in Probabilistic Graphical Models by Graph Neural Networks
  29. Do Transformers Really Perform Bad for Graph Representation?
  30. Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks
  31. SpAGNN: Spatially-Aware Graph Neural Networks for Relational Behavior Forecasting from Sensor Data
  32. Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting
  33. Geometric Deep Learning: Going beyond Euclidean data
  34. Geometric Deep Learning on Graphs and Manifolds Using Mixture Model CNNs
  35. Dynamic Graph CNN for Learning on Point Clouds
  36. Weisfeiler and Lehman Go Cellular: CW Networks
  37. Provably Powerful Graph Networks
  38. Invariant and Equivariant Graph Networks
  39. On Learning Sets of Symmetric Elements
  40. Relational inductive biases, deep learning, and graph networks
  41. Graph Matching Networks for Learning the Similarity of Graph Structured Objects
  42. Deep Parametric Continuous Convolutional Neural Networks
  43. Neural Execution of Graph Algorithms
  44. Neural Execution Engines: Learning to Execute Subroutines
  45. Learning to Represent Programs with Graphs
  46. Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks
  47. Pointer Graph Networks
  48. Learning to Solve NP-Complete Problems - A Graph Neural Network for Decision TSP
  49. Premise Selection for Theorem Proving by Deep Graph Embedding
  50. Graph Representations for Higher-Order Logic and Theorem Proving
  51. What Can Neural Networks Reason About?
  52. Discriminative Embeddings of Latent Variable Models for Structured Data
  53. Learning Combinatorial Optimization Algorithms over Graphs
  54. On Layer Normalization in the Transformer Architecture
  55. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
  56. Recipe for a General, Powerful, Scalable Graph Transformer

Unsupervised Deep Learning with Observable Structures

  1. Variational Graph Auto-Encoders
  2. Deep Graph Infomax
  3. GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models
  4. Efficient Graph Generation with Graph Recurrent Attention Networks
  5. MolGAN: An implicit generative model for small molecular graphs
  6. GraphVAE: Towards Generation of Small Graphs Using Variational Autoencoders
  7. Learning Deep Generative Models of Graphs
  8. Permutation Invariant Graph Generation via Score-Based Generative Modeling
  9. Graph Normalizing Flows
  10. Constrained Graph Variational Autoencoders for Molecule Design
  11. Generative Code Modeling with Graphs
  12. Structured Denoising Diffusion Models in Discrete State-Spaces
  13. Structured Generative Models of Natural Source Code
  14. A Model to Search for Synthesizable Molecules
  15. Grammar Variational Autoencoder
  16. Scalable Deep Generative Modeling for Sparse Graphs
  17. Energy-Based Processes for Exchangeable Data
  18. Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration
  19. Hierarchical Generation of Molecular Graphs using Structural Motifs
  20. Junction Tree Variational Autoencoder for Molecular Graph Generation

Deep Learning with Latent Structures

  1. Simple statistical gradient-following algorithms for connectionist reinforcement learning (the original REINFORCE paper)
  2. Neural Discrete Representation Learning
  3. Categorical Reparameterization with Gumbel-Softmax
  4. Neural Relational Inference for Interacting Systems
  5. Contrastive Learning of Structured World Models
  6. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables
  7. Learning Graph Structure With A Finite-State Automaton Layer
  8. Neural Turing Machines
  9. Oops I Took A Gradient: Scalable Sampling for Discrete Distributions
  10. Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces
  11. Gradient Estimation with Stochastic Softmax Tricks
  12. Differentiation of Blackbox Combinatorial Solvers
  13. REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models
  14. Monte Carlo Gradient Estimation in Machine Learning
  15. Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
  16. Thinking Fast and Slow with Deep Learning and Tree Search
  17. Mastering the Game of Go without Human Knowledge
  18. Memory-Augmented Monte Carlo Tree Search
  19. M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search
  20. DSDNet: Deep Structured Self-driving Network
  21. Learning to Search with MCTSnets
  22. Direct Loss Minimization for Structured Prediction
  23. Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement
  24. Direct Optimization through argmax for Discrete Variational Auto-Encoder
  25. Learning Compositional Neural Programs with Recursive Tree Search and Planning
  26. Reinforcement Learning Neural Turing Machines - Revised
  27. The Generalized Reparameterization Gradient
  28. Gradient Estimation Using Stochastic Computation Graphs
  29. Learning to Search Better than Your Teacher
  30. Learning to Search in Branch-and-Bound Algorithms
  31. Model-Based Planning with Discrete and Continuous Actions
  32. Learning Transferable Graph Exploration
  33. Retro*: Learning Retrosynthetic Planning with Neural Guided A* Search