Course Syllabus: Deep Learning for Robot Perception
Table of contents
- About
- Topics and Course Structure
- Prerequisites
- Textbook
- Lectures
- Discussion Sections
- Programming Projects
- Final Project
- Quizzes
- Grading Policy
- Collaboration Policy
- Discussion Forum
Note: This course website and syllabus is still under development and is subject to change.
About
Robots need to see and understand their world to be able to interact with objects and perform useful tasks autonomously. Perception is the essential first step in the process for endowing robots to perform autonomously. Autonomous robots need to make sense of their sensory observations to represent the world around them – and enable their reasoning and action to a goal. Visual perception with cameras as sensors has matured due to the recent advancements in neural networks – which is especially true for performing visual recognition tasks such as object classification, detection, pose estimation, grasp pose detection, etc.
This course aims to cover the necessary background of neural-network-based deep learning for robot perception – building on advancements in computer vision and enabling – for enabling robots to dexterously manipulate physical objects. During the first part of this course, students will learn to implement, train and debug their own neural networks. During the second part of this course, students will explore recent emerging topics in deep learning for robot perception and manipulation. This exploration will include analysis of research publications in the area, building up to reproducing and implementing state-of-the-art deep learning approaches as a final course project.
This course builds on and is indebted to these existing courses (as a “star” and a “fork” in the open source sense):
- University of Michigan - ROB 498-011 / 599-011: Deep Learning for Robot Perception instructed by Xiaoxiao Du, Anthony Opipari, and Chad Jenkins
- University of Michigan - ROB 498-002 / 599-009: Deep Learning for Robot Perception instructed by Anthony Opipari, Chad Jenkins, and Karthik Desingh
- University of Michigan - EECS 498-007 / 598-005: Deep Learning for Computer Vision instructed by Justin Johnson
- Stanford - CS231n: Deep Learning for Computer Vision instructed by Fei-Fei Li and Andrej Karpathy
Topics and Course Structure
The first half of the course will cover deep learning fundamentals in computer vision catered to robot perception problems.
- Linear classifiers
- Stochastic gradient descent
- Fully-connected networks
- Convolutional networks
The second half of the course will switch to seminar style covering following advanced topics in robot perception and manipulation via discussing publications.
- 3D vision in robotics
- Pose estimation
- Object perception for robot manipulation
- Neural radiance fields for perception
- Robot grasp pose detection
Prerequisites
- Strongly encouraged prerequisites:
- Programming: ROB 320, EECS 281, or equivalent
- Linear Algebra: ROB 101, MATH 214, MATH 217, or equivalent
- Recommended prerequisites:
- Prior experience with the Python programming language is recommended.
- Familiarity with gradients and how to calculate them from vector calculus.
- Familiarity with random variables and probability distributions from probability theory.
- Familiarity with concepts from machine learning (e.g. EECS 445) will be helpful.
Textbook
There is no required textbook for this course, however optional readings will be suggested from the textbook, “Deep Learning” by Ian Goodfellow and Yoshua Bengio and Aaron Courville.
For additional references, consider the following textbooks:
- “Introduction to Robotics and Perception” by Frank Dellaert and Seth Hutchinson
- “Robotics, Vision and Control” by Peter Corke
- “Computer Vision: Algorithms and Applications” by Richard Szeliski
- “Foundations of Computer Vision” by Antonio Torralba, Phillip Isola, and William T. Freeman
Lectures
Lectures will take place in-person.
In-person lectures will be held on Mondays and Wednesdays from 12:00-1:30 PM EST in room CSRB 2246. Remote access will be available through Zoom.
Discussion Sections
Discussions will take place in-person.
In-person discussions will be held on Tuesdays from 3:30-5:30 PM EST in room CSRB 2246. Remote access will be available through Zoom.
Programming Projects
You will complete 5 programming projects over the course of the semester. All projects will be implemented using Python, Pytorch and Google Colab.
Final Project
Instead of a final exam at the end of the semester, you will complete a final project working in groups of 1 to 3 students.
The final project will entail five core deliverables: (1) a written paper review, (2) an in-class paper presentation, (3) reproducing the published results of an existing deep learning paper, (4) extending the chosen paper’s methods and (5) documenting your reproduction and extension in a written report.
The objective of the final project is for you to gain experience with state of the art approaches in deep learning and a sense of how research in the area is conducted.
Quizzes
Throughout the semester, there will be a total of 16 quizzes administered through gradescope. These quizzes will be posted before lecture sections throughout the semester and be available to take until the beginning of lecture that same day. Quizzes will be released at 7:00AM EST and must be submitted by 5:00PM EST. Each quiz will have a 15 minute time limit. Each quiz will consist of 1 or 2 short questions within the scope of previously covered lectures and graded projects. Use of lecture, project and other course materials is permitted while taking the quizzes. Use of external sources (i.e. from the internet) is not permitted during quizzes.
Grading Policy
Course grades will be determined according to the following criteria:
- Project 0: 6%
- Project 1: 12%
- Project 2: 12%
- Project 3: 12%
- Project 4: 12%
- Final Project:
- Paper Review: 5%
- Paper Presentation: 10%
- Paper Reproduction: 5%
- Algorithmic Extension: 5%
- Written Report: 5%
- 16 Pre-Lecture Quizzes: 16% (1% each)
Collaboration Policy
The free flow of discussion and ideas is encouraged. However, all work submitted must be your own.
All code submitted must comply with the College of Engineering Honor Code.
No code can be communicated, including verbally. Explicit use of external sources must be clearly cited. Experimentation with and use of generative AI as an educational tool is encouraged, however any use of AI for course work must abide by the College of Engineering Honor Code and must be clearly cited.
Discussion Forum
The Piazza discussion forum is available for discussion of course materials including lectures and projects. Students are not required to participate, use or join the Piazza forum.
Any discussion of quizzes and verbatim code on the Piazza forum must be posted privately.