- Topics and Course Structure
- Discussion Sections
- Programming Projects
- Final Project
- Grading Policy
- Collaboration Policy
- Discussion Forum
Robots need to see and understand their world in order to interact with objects and perform useful tasks autonomously. Perception is the essential first step toward this autonomy: robots must make sense of their sensory observations to represent the world around them and to ground their reasoning and actions toward a goal. Visual perception with cameras as sensors has matured thanks to recent advances in neural networks, especially for visual recognition tasks such as object classification, detection, pose estimation, and grasp pose detection.
This course covers the background in neural-network-based deep learning needed for robot perception – building on advances in computer vision – to enable robots to dexterously manipulate physical objects. During the first part of the course, students will learn to implement, train, and debug their own neural networks. During the second part, students will explore recent emerging topics in deep learning for robot perception and manipulation. This exploration will include analysis of research publications in the area, building up to reproducing one of these publications as a final course project.
This course builds on and is indebted to these existing courses (as a “star” and a “fork” in the open source sense):
- University of Michigan - ROB 498-002 / 599-009: Deep Learning for Robot Perception instructed by Anthony Opipari, Chad Jenkins, and Karthik Desingh
- University of Michigan - EECS 498-007 / 598-005: Deep Learning for Computer Vision instructed by Justin Johnson
- Stanford - CS231n: Deep Learning for Computer Vision instructed by Fei-Fei Li and Andrej Karpathy
The first half of the course will cover deep learning fundamentals in computer vision, tailored to robot perception problems.
- Linear classifiers
- Stochastic gradient descent
- Fully-connected networks
- Convolutional networks
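To give a flavor of how these fundamentals fit together, here is a minimal sketch in PyTorch (the toolchain used for the programming projects). All dimensions and hyperparameters below are hypothetical, not taken from the course: a linear classifier trained with one step of stochastic gradient descent, the building block that the later fully-connected and convolutional networks generalize.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A linear classifier: one affine map from flattened images to class scores.
# Hypothetical dimensions: 32x32 RGB inputs, 10 classes.
model = nn.Linear(32 * 32 * 3, 10)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

# One step of stochastic gradient descent on a random mini-batch.
x = torch.randn(64, 32 * 32 * 3)   # batch of 64 flattened "images"
y = torch.randint(0, 10, (64,))    # ground-truth class labels
optimizer.zero_grad()
loss = loss_fn(model(x), y)        # softmax cross-entropy loss
loss.backward()                    # backpropagate gradients
optimizer.step()                   # SGD parameter update
```

Swapping `nn.Linear` for a stack of layers (`nn.Sequential` of linear and convolutional modules) yields the fully-connected and convolutional networks covered later in this half, with the same training loop.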
The second half of the course will switch to a seminar style, covering the following advanced topics in robot perception and manipulation through discussion of publications.
- 3D vision in robotics
- Pose estimation
- Object perception for robot manipulation
- Neural radiance fields for perception
- Robot grasp pose detection
- Strongly encouraged prerequisites:
- Programming: ROB 320, EECS 281, or equivalent
- Linear Algebra: ROB 101, MATH 214, MATH 217, or equivalent
- Recommended prerequisites:
- Prior experience with the Python programming language.
- Familiarity with gradients and how to calculate them from vector calculus.
- Familiarity with random variables and probability distributions from probability theory.
- Familiarity with concepts from machine learning (e.g. EECS 445) will be helpful.
There is no required textbook for this course; however, optional readings will be suggested from the textbook “Deep Learning” by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.
Lectures will take place in person, on Tuesdays and Thursdays from 3:00–4:30 PM EST in room 906 COOLG.
Discussions will take place in person, on Wednesdays from 3:30–5:30 PM EST in room 1311 EECS.
You will complete 5 programming projects over the course of the semester. All projects will be implemented using Python, PyTorch, and Google Colab.
Instead of a final exam at the end of the semester, you will complete a final project working in groups of 1 to 3 students.
The final project will entail five core deliverables: (1) a written paper review, (2) an in-class paper presentation, (3) reproducing the published results of an existing deep learning paper, (4) extending the chosen paper’s methods and (5) documenting your reproduction and extension in a written report.
The objective of the final project is for you to gain experience with state-of-the-art approaches in deep learning and a sense of how research in the area is conducted.
Throughout the semester, there will be a total of 16 quizzes administered through Gradescope. Each quiz will be posted before a lecture and will be available to take until the beginning of that day's lecture: quizzes will be released at 7:00 AM EST and must be submitted by 3:00 PM EST, with a 15-minute time limit once started. Each quiz will consist of 1 or 2 short questions within the scope of previously covered lectures and graded projects. Use of lecture, project, and other course materials is permitted while taking the quizzes. Use of external sources (e.g., from the internet) is not permitted.
Course grades will be determined according to the following criteria:
- Project 0: 12%
- Project 1: 12%
- Project 2: 12%
- Project 3: 12%
- Project 4: 12%
- Final Project:
- Paper Review: 3%
- Paper Presentation: 3%
- Paper Reproduction: 6%
- Algorithmic Extension: 6%
- Written Report: 6%
- 16 Pre-Lecture Quizzes: 16% (1% each)
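As a quick sanity check on the breakdown above, the components sum to 100%:

```python
# Grade components from the breakdown above, in percent.
projects = [12] * 5              # Projects 0 through 4
final_project = [3, 3, 6, 6, 6]  # review, presentation, reproduction, extension, report
quizzes = [1] * 16               # 16 pre-lecture quizzes

total = sum(projects) + sum(final_project) + sum(quizzes)
print(total)  # → 100
```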
The free flow of discussion and ideas is encouraged. However, all work you submit must be your own.
All submitted code must comply with the College of Engineering Honor Code.
Code may not be shared or communicated in any form, including verbally. Any explicit use of external sources must be clearly cited.
The Ed Stem discussion forum is available for discussion of course materials, including lectures and projects. This forum will be shared across course offerings at the University of Michigan and the University of Minnesota. Participation in the Ed Stem forum is optional; students may opt in using this Google Form.
Any discussion of quizzes and verbatim code on the Ed Stem forum must be posted privately.