ECE-5973: Computer Vision

Vision is the most important sense for humans and many other animals. We rely on our visual systems to explore our surroundings, recognize objects, and perform complex tasks such as driving cars and manipulating tools. The field of computer vision has advanced dramatically over the last decade. Many vision tasks, such as face recognition, that were very difficult a couple of decades ago have become routine components even in our cellphone applications. However, other vision tasks, such as scene understanding and image captioning, are still rather difficult, and existing computer vision algorithms perform poorly compared to humans. Therefore, computer vision is still a very active research area.

The goals of this course are threefold: to give students the core foundation needed to understand the existing CV algorithms behind those systems, to give them hands-on experience implementing existing algorithms so that they can build basic CV systems on their own, and to prepare them with sufficient technical depth so that they can contribute to CV research after the course.

Prerequisites

Calculus (MATH 1914 or equivalent), linear algebra (MATH 3333 or equivalent), basic probability (MATH 4733 or equivalent), and intermediate programming skills (MATLAB or Python/NumPy preferred).

Piazza

Please sign up for Piazza here

Office Hours and TA

There are no “regular” office hours. You are welcome to catch me anytime or contact me by email.

There is no dedicated TA for this course, but a MATLAB TA is available from the department. The MATLAB TA this semester is Wanghao Fei (whfei@ou.edu).

Graduate Credit

If you are enrolled in the graduate section, I will have somewhat higher expectations for your final project. Moreover, you are expected to give a short in-class presentation on a current topic. The presentation will contribute 20% of your final grade, and the other components (as shown below) will be scaled down accordingly.
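As a rough illustration of what "scaled down accordingly" could mean, the sketch below assumes the remaining components are scaled proportionally so that everything still sums to 100%. That proportional rule is an assumption, not something stated on this page, and the function name and dictionary keys are hypothetical.

```python
def graduate_weights(base_weights, presentation_pct=20):
    """Scale base weights (in percent) to make room for the presentation.

    Assumes proportional scaling, which is an illustrative guess at
    how the other components are adjusted.
    """
    remaining = 100 - presentation_pct
    scaled = {name: w * remaining / 100 for name, w in base_weights.items()}
    scaled["presentation"] = presentation_pct
    return scaled


# Undergraduate weights taken from the Grading section.
undergrad = {"assignments": 70, "final_project": 30}
grad = graduate_weights(undergrad)
print(grad)
```

Under this assumption, assignments would count for 56% and the final project for 24% of a graduate student's grade.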

Textbook

Computer Vision: Algorithms and Applications by Richard Szeliski. The book is available for free online or for purchase. It is not required but is a very good reference.
Computer Vision: Models, Learning, and Inference by Simon J.D. Prince. The book is also available for free here
Fundamentals of Computer Vision by Mubarak Shah

Python-numpy tutorial by Justin Johnson
OpenCV python tutorial
Projective geometry for machine vision by Mundy and Zisserman
A curated list of awesome computer vision resources
A curated list of deep learning resources for computer vision
CVF open access
This course is mostly based on the following computer vision course materials available online
- Course at Brown University
- Course at Stanford
- Course at Cornell
- Course at UCF

Some nice talks

3D computer vision: past, present, and future by Steve Seitz

Course Syllabus (Tentative)

Overview of computer vision
Low-level processing
- Filtering techniques
- Edge detectors
- Image resampling and image pyramids
- Interest point and corner detectors
Object recognition and tracking
- Optical flow
- Local and global motion models
- Lucas-Kanade Tracker (KLT)
- Mean shift algorithm
- Face detection
Camera model
- Perspective geometry
- Camera calibration
- Stereo vision
- Structure from motion
Misc techniques
- Bag of words
- Hough transform
- Deformable part models

Projects and presentation ideas

For presentations, please pick only more recent work (2010 and beyond). For projects, it is okay to work on older topics, but you are expected to implement and test the methods yourself. For the presentation, it is sufficient that you understand the technique completely. You may also borrow ideas from the list below.

James Tompkin's page
Snavely's page
Lazebnik's page
Li Zhang's page
CVPR 2017 Open Access
Arxiv-sanity: Good starting point for searching papers

Grading

Programming/written assignments/other activities: 70%
Final Project: 30%
- Presentation: (10 out of 30)
  - clarity, structure, references
  - background literature survey, good understanding of the problem
  - good insights and discussions of methodology, analysis, results, etc.
- Technical: (10 out of 30)
  - correctness
  - depth
  - innovation
- Evaluation and results: (10 out of 30)
  - sound evaluation metric
  - thoroughness in analysis and experimentation
  - results and performance
Presentation: graduate students only (20% of final grade, other components adjusted accordingly)
Final grade:
- A: above 85%
- B: above 70% but not more than 85%
- C: above 55% but not more than 70%
- D: above 40% but not more than 55%
- F: not more than 40%
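The grade boundaries above can be read as a simple threshold lookup. The following is only a sketch of that reading; the function name letter_grade is hypothetical, and it assumes the final percentage is compared without any rounding.

```python
def letter_grade(percent):
    """Map a final percentage to a letter using the boundaries listed above.

    Each boundary is exclusive on the low end, so exactly 85% is a B
    and exactly 40% is an F.
    """
    if percent > 85:
        return "A"
    if percent > 70:
        return "B"
    if percent > 55:
        return "C"
    if percent > 40:
        return "D"
    return "F"


for p in (92, 85, 70.5, 55, 40):
    print(p, letter_grade(p))
```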

Late Policy

There will be a 5% deduction per late day for all submissions.
The deduction saturates after 10 days, so you will still get half of your marks even if you are super late.
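A small worked example of the late policy follows. It is a sketch under two assumptions that are not stated above: late days are counted as whole days, and the deduction is taken as a percentage of the marks the submission earned. The function name late_adjusted_score is hypothetical.

```python
def late_adjusted_score(earned, days_late):
    """Apply a 5% per-day deduction that stops growing after 10 days.

    Assumes whole late days and a deduction proportional to the
    earned marks; both are illustrative assumptions.
    """
    capped_days = min(max(days_late, 0), 10)
    deduction_pct = 5 * capped_days  # never exceeds 50
    return earned * (100 - deduction_pct) / 100


print(late_adjusted_score(80, 0))   # on time
print(late_adjusted_score(80, 3))   # three days late
print(late_adjusted_score(80, 30))  # very late, deduction capped at 50%
```

With these assumptions, a submission that earned 80 marks keeps 68 after three late days and 40 after ten or more.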

Reasonable Accommodation Policy

Any student in this course who has a disability that may prevent the full demonstration of his or her abilities should contact me personally as soon as possible so we can discuss accommodations necessary to ensure full participation and facilitate your educational opportunities.

Should you need modifications or adjustments to your course requirements because of documented pregnancy-related or childbirth-related issues, please contact me as soon as possible to discuss. Generally, modifications will be made where medically necessary and similar in scope to accommodations based on temporary disability. Please see this for commonly asked questions.

Calendar

Date	Topics	Materials
1/16	Introduction to Computer Vision	overview
1/18	Pixels, color, white balancing	color
1/23	Filter	filter
1/25	Edge detection	edge-detection
1/30	Frequency in images	frequency, LiveFT
2/1	Image resampling and image pyramids	resampling, pyramids
2/6	Harris corner detection	Harris corner detector
2/8	SIFT, Feature matching	Local feature extraction
2/13	Image alignment, RANSAC	Image alignment
2/15	Segmentation, k-means, agglomerative clustering, mean-shift	segmentation
2/20	Class cancelled due to weather conditions
2/22	Class cancelled due to weather conditions
2/27	LIFT: Learned Invariant Feature Transform (slides), introduction to optical flow	Optical flow
3/1	Modeling the World from Internet Photo Collections (slides), Computer Vision at the Dawn of Transportation Autonomy (slides)	Photo tourism
3/6	Optical flow
3/8	Introduction of Artificial Neural Networks and Applications, Feature Identification and Tracking in Weather Radar Applications
3/13	Object Tracking, KLT	tracking, KLT
3/15	HW5 explanation
3/20	spring break
3/22	spring break
3/27	Pedestrian tracking using Generalized Maximum Multi Clique Problem (GMMCP) (slides), Stable Multi-Target Tracking in Real-Time Surveillance Video (slides)
3/29	Point Cloud Library (slides), Hough transform (screencast)
4/3	Object recognition, bag of features	Object recognition introduction
4/5	Dalal-Triggs detection algorithm, deformable part models (screencast)	Dalal-Triggs and DPM
4/10	Viola-Jones face detection, Eigenfaces, and Fisherfaces	Viola Jones and Eigenface
4/12	Introduction to deep learning	deep learning
4/17	Backpropagation algorithm
4/19	Convolutional neural networks	Convnet
4/24	Camera model and calibration (video)	camera model
4/26	Introduction to Epipolar geometry (video)	Epipolar geometry