EE 6485 Computer Vision 計算機視覺

Fall 2023, Mon. 15:30 to 18:20, Location EECS Building (資電) 206
Instructor: Min Sun

TAs: Suraj Dengale (蘇拉傑) surajdengale@gapp.nthu.edu.tw
Frank Jhang (張晉承) frank890725@gapp.nthu.edu.tw
Tin-Ying Lin (林廷穎) tintin890327@gmail.com
Wendy Hsieh (謝雅竹) wendyhsieh0506@gapp.nthu.edu.tw
Jordan Hsieh (謝侑呈) sphinx5912@gapp.nthu.edu.tw


Computer Vision, art by hinesedora.com

Course Description


Can computers understand the visual world as we do? This course treats vision as a process of inference from noisy and uncertain data and emphasizes probabilistic, statistical, data-driven approaches. Topics include image processing; segmentation, grouping, and boundary detection; recognition and detection; motion estimation and structure from motion. The class will also discuss applications of state-of-the-art techniques in recognition, detection, and video analysis.

The course consists of four programming projects and one final group project (at most 5 members per team). Please find information about the final project in the syllabus.

Prerequisites

This course requires programming experience (mainly Python) as well as linear algebra, basic calculus, and basic probability. Previous knowledge of visual computing will be helpful.

Textbook

Readings will be assigned in "Computer Vision: Algorithms and Applications" by Richard Szeliski. The book is available for free online or available for purchase.

Resource

Awesome computer vision github link
Awesome deep learning github link

Grading

Your final grade will be made up from

You will lose 10% each day for late projects. However, you have three "late days" for the whole course: the first 24 hours after the due date and time count as one late day, up to 48 hours as two, and up to 72 hours as three. Late days will not be reflected in the initial grade reports for your assignments, but they will be factored in and distributed at the end of the semester so that you get the most points possible.
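As a rough illustration of the late policy above, the following Python sketch counts late days and searches for the allocation of the three free late days that gives the highest total. It makes several assumptions that the policy text does not state: "10% each day" is read as 10% of the earned score per late day, the penalty never drops a score below zero, and the names `days_late`, `penalized`, and `best_allocation` are hypothetical. It is not the course's actual grading script.

```python
import math
from itertools import product

PENALTY_PER_DAY = 0.10  # from the policy: 10% per late day
FREE_LATE_DAYS = 3      # from the policy: three late days for the whole course


def days_late(hours_after_deadline: float) -> int:
    """Round lateness up to whole days: (0, 24] h -> 1, (24, 48] h -> 2, ..."""
    if hours_after_deadline <= 0:
        return 0
    return math.ceil(hours_after_deadline / 24)


def penalized(score: float, late_days: int) -> float:
    """Assumption: each penalized day removes 10% of the earned score."""
    return score * max(0.0, 1.0 - PENALTY_PER_DAY * late_days)


def best_allocation(scores, lateness):
    """Brute-force the use of free late days that maximizes the total score.

    scores[i] is the raw score of project i, lateness[i] its whole days late.
    Returns (best_total, free_days_used_per_project).
    """
    best_total, best_alloc = -1.0, None
    # A project can absorb at most as many free days as it was late.
    choices = (range(min(d, FREE_LATE_DAYS) + 1) for d in lateness)
    for alloc in product(*choices):
        if sum(alloc) > FREE_LATE_DAYS:
            continue
        total = sum(
            penalized(s, d - a) for s, d, a in zip(scores, lateness, alloc)
        )
        if total > best_total:
            best_total, best_alloc = total, alloc
    return best_total, best_alloc


if __name__ == "__main__":
    # Two projects, each submitted two days late (made-up scores).
    total, alloc = best_allocation([90, 60], [2, 2])
    print(f"free late days used per project: {alloc}, total: {total:.1f}")
```

Under these assumptions, free late days are worth more on higher-scoring projects, which is why the search spends them there first.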

Contact Info and Office Hours

You can contact the professor with any of the following: Office Hours

Tentative Syllabus

Week | Class Dates | Topic | Slides | Recording | Extra Info (e.g., Homework/Exam)
1 M, Sept. 11 Introduction to CV pdf Policy Form Out
Group Form Out
Homework 1 Out
Camera Model, Light and color pdf
Python Tutorial Link
2 M, Sept. 18 Image filtering pdf pdf Policy Form Due
Camera Geometry and calibration pdf
3 M, Sept. 25 Single View Geometry pdf Group Form Due
Homework 1 Due
Epipolar Geometry pdf
Stereo System pdf
4 M, Oct. 2 Colab and PyTorch Tutorial Tutorial Link ICCV Trip
Homework 2 Out
5 M, Oct. 9 Holiday (National Day)  
6 M, Oct. 16 Stereo System pdf Project Proposal Due
Multi View Geometry pdf
7 M, Oct. 23 [No Physical Class - Video Recording Only] Active Stereo pdf
Fitting and Matching pdf
8 M, Oct. 30 Intro. to machine learning pdf Homework 2 Due
Homework 3 Out
Project Pitch (3-5 minutes, 22 teams, 110 minutes) Peer review
9 M, Nov. 6 Intro. to CNN-1 pdf  
Intro. to CNN-2 pdf
10 M, Nov. 13 Training NN pdf
Object Detection and Beyond pdf
11 M, Nov. 20 Handle domain shift pdf Homework 3 Due
Homework 4 Out
Midterm Project Report Due
Scaling-up Depth Estimation & Feature Tracking pdf
12 M, Nov. 27 Scaling-up Flow pdf  
Neural Radiance Field (NeRF) pdf1, pdf2
13 M, Dec. 4 Guest Lecture by Kaggle Grandmaster Kun-Hao Yeh Homework 4 Due
Vision and language pdf
Transformers in CV-1 pdf
14 M, Dec. 11 Transformers in CV-2 pdf Please Watch Video Lecture. No Physical Class due to Travel.
How to keep up with the advancements in CV? pdf
Variational autoencoders and Diffusion Models External Video
15 M, Dec. 18 Final presentation  
16 M, Dec. 25 Final presentation  
17 M, Jan. 1 Holiday (New Year's Day)  
18 M, Jan. 8 Final exam week Final Project Report Due

Project Proposal Format:

- Max 4 pages;
- 3 sections:
- Final format: pdf, please!

Project Progress (mid-term) Report Format:

- Max 4 pages;
- 3 sections:
- CVPR final format: pdf, please!

Project Final Report Format:

- Max 10 pages;
- Title and authors
- Abstract: short summary of the project with main results
- 6 sections:
- CVPR final format: pdf, please!
You can look at one of the recent publications (such as this) as an example.

Project Report Evaluation:

- Your project report will be evaluated based on the quality of the writing, the clarity of your technical explanation and, overall, how well you get your message across. If you follow the structure above, you'll have a good chance of doing a good job. :)

Project Source Code:

There is no need to attach a printout of the source code to the manuscript. The final source code of your working program must be shared with the TAs and the instructor through eeclass; it is due on the project submission due date.

Project Pitch in Class:

- The presentation must be at most 5 minutes long. Please impress your audience with imagined (not yet achieved) results that illustrate your idea.

Project Presentation in Class:

- The presentation must be at most 15 minutes long. Please see below for detailed presentation guidelines.

Presentation Format:

Your slides should consist of a title slide, followed by slides that discuss the following aspects of your project:

Evaluation:

- Your team will be evaluated based on the clarity of the presentation, quality of the slides, how well you get your message across, and how well you handle the questions at the end. Note that the presentation can still contain ongoing/preliminary results; final results may be included in the final report.
- We will use a peer-review system.

Acknowledgements:

The materials from this class rely significantly on slides prepared by other instructors, especially James Hays, Fei-Fei Li, and Silvio Savarese. Each slide set and assignment contains acknowledgements.