ECE-5973: Computer Vision

Vision is arguably the most important sense for humans and many other animals. We rely on our visual systems to explore our surroundings, recognize objects, and perform complex tasks such as driving cars and manipulating tools. The field of computer vision has advanced dramatically over the last decade. Many vision tasks that were very difficult a couple of decades ago, such as face recognition, have become routine components even in cellphone applications. However, other vision tasks, such as scene understanding and image captioning, are still rather difficult, and existing computer vision algorithms perform poorly compared to humans. Computer vision therefore remains a very active research area.

This course has several goals: to give students the core foundation they need to understand the existing CV algorithms behind those systems, to give them hands-on experience implementing existing algorithms so that they can build basic CV systems on their own, and to prepare them with sufficient technical depth to contribute to CV research after the course.

Prerequisites

Calculus (MATH 1914 or equivalent), linear algebra (MATH 3333 or equivalent), basic probability (MATH 4733 or equivalent), and intermediate programming skills (MATLAB or Python/NumPy is required).

Discussion forum

Please sign up for Discord through the link on Canvas. Please raise any questions, comments, or concerns there. You may contact me privately through Discord as well.

Office Hours and TA

There are no “regular” office hours. Please contact me through Discord. I typically respond within 24 hours.

The TA is Zhihao Zhao (zhihao.zhao@ou.edu).

Graduate Credit

If you are enrolled in the graduate section, I will have somewhat higher expectations for your final project. Moreover, you are expected to give a short in-class presentation on a current topic. The presentation will contribute 20% of your final grade, and the other components (as shown below) will be scaled down accordingly.

Textbook

  • Computer Vision: Algorithms and Applications by Richard Szeliski. The book is available for free online or for purchase. It is not required but is a very good reference.

  • Computer Vision: Models, Learning, and Inference by Simon J.D. Prince. The book is also available for free online.

  • Fundamentals of Computer Vision by Mubarak Shah

Read more

Some nice talks

Course Syllabus (Tentative)

  • Overview of computer vision

  • Low-level processing

    • Filtering techniques

    • Edge detectors

    • Image resampling and image pyramids

    • Interest point and corner detectors

  • Object recognition and tracking

    • Optical flow

    • Local and global motion models

    • Kanade-Lucas-Tomasi (KLT) tracker

    • Mean shift algorithm

    • Face detection

  • Camera model

    • Perspective geometry

    • Fundamental matrix

    • Camera calibration

    • Stereo vision

  • Misc techniques

    • Bag of words

    • Hough transform

    • Deformable part models

Projects

The goal of the project is to get your hands dirty. A broad range of projects is valid. At one extreme, you may do a research-oriented project that improves one narrow area of computer vision. For example, you may have an idea for enhancing edge detection and would like to try it out. You do not need to build a complete application for this kind of project, but you must include sufficient comparisons with prior approaches to give a fair evaluation. And of course, you have to conduct an adequate literature review so that the most representative methods are included in the comparison. For example, if you propose a new edge detection method, you should at a minimum compare it with the Canny detector.

At the other extreme, you can do an application-oriented project. In that case, you will be less concerned with the individual techniques and will instead try to build an interesting application that solves a problem. For example, you may build a prototype app to take student attendance. You may set the camera at a fixed location so that when a student moves towards the camera, they are automatically recognized and have their attendance recorded. In this kind of project, you will focus more on which technologies and methods your application needs (in the attendance example, you will definitely need face recognition). Then you may want to search for an open-source implementation of such a module that can be used along with OpenCV (there should be many). Your main job will be piecing everything together.

Of course, your project can fall anywhere between these two extremes.

The bottom line is that your final report should convince readers that you have put sufficient effort into pursuing what you planned to do. So it is crucial to keep a detailed log of what you have been doing. In case things do not go well, you can at least document your unsuccessful attempts in the final report.

Graduate presentation

For presentations, please pick only more recent works (2010 and beyond). For projects, it is okay to work on older topics, and you are expected to implement and test the methods you use. For the presentation, it is sufficient that you understand the technique completely. You may also borrow ideas from the list below.

Grading

Undergrad

  • Programming/written assignments/other activities: 60%

  • Final Project: 40%

    • Project proposal: (5 out of 40)

    • Progress report: (10 out of 40)

    • Final report: (25 out of 40)

  • Presentation: graduate students only (20% of final grade, other components adjusted accordingly)

  • Participation and quizzes bonus: maximum 20%

Graduate

  • Programming/written assignments/other activities: 40%

  • Final Project: 40%

    • Project proposal: (5 out of 40)

    • Progress report: (10 out of 40)

    • Final report: (25 out of 40)

  • Presentation: graduate students only (20% of final grade, other components adjusted accordingly)

  • Participation and quizzes bonus: maximum 10%
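The weighting above can be sketched as a small calculation. This is only an illustration, not the official grading script: the function name `final_percent` and the `WEIGHTS` table are hypothetical, each component score is assumed to be a percentage from 0 to 100, and the bonus is assumed to be added as percentage points on top of the weighted total, capped at the stated maximum. The syllabus does not spell out how the bonus is applied, so treat that part as a guess.

```python
# Component weights (in percent) copied from the grading lists above.
# "bonus_cap" is the stated maximum bonus for each section.
WEIGHTS = {
    "undergrad": {"assignments": 60, "project": 40, "presentation": 0, "bonus_cap": 20},
    "graduate": {"assignments": 40, "project": 40, "presentation": 20, "bonus_cap": 10},
}


def final_percent(section, assignments, project, presentation=0.0, bonus=0.0):
    """Return a weighted final percentage.

    assignments, project, presentation: component scores in [0, 100].
    bonus: bonus percentage points (ASSUMPTION: added on top, then capped).
    """
    w = WEIGHTS[section]
    base = (
        w["assignments"] * assignments
        + w["project"] * project
        + w["presentation"] * presentation
    ) / 100
    return base + min(bonus, w["bonus_cap"])


if __name__ == "__main__":
    # Undergraduate: 60% assignments + 40% project.
    print(final_percent("undergrad", assignments=80, project=90))
    # Graduate: 40% assignments + 40% project + 20% presentation.
    print(final_percent("graduate", assignments=80, project=90, presentation=100))
```

Under these assumptions, the project score passed in would itself be the sum of the proposal, progress report, and final report points (out of 40), rescaled to a percentage.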

Final grade:

  • A: above 90%

  • B: above 80% but not more than 90%

  • C: above 70% but not more than 80%

  • D: above 60% but not more than 70%

  • F: not more than 60%
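The cutoffs above put each boundary value in the lower grade (exactly 90% is a B, exactly 60% is an F). A minimal sketch of that mapping, assuming the final score is given as a percentage and no rounding is applied beforehand; the function name `letter_grade` is hypothetical:

```python
def letter_grade(percent: float) -> str:
    """Map a final percentage to a letter grade using the cutoffs listed above.

    Each cutoff is exclusive on the lower side: a score must be strictly
    above 90 for an A, strictly above 80 for a B, and so on.
    """
    if percent > 90:
        return "A"
    if percent > 80:
        return "B"
    if percent > 70:
        return "C"
    if percent > 60:
        return "D"
    return "F"


if __name__ == "__main__":
    for score in (95, 90, 80.5, 70, 60):
        print(score, letter_grade(score))
```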

Late Policy

  • There will be a 5% deduction for each late day on all submissions

  • The deduction saturates after 10 days, so a submission can still earn up to half of its marks no matter how late it is
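As a rough illustration of the late policy, the sketch below computes the deduction and the resulting score. It makes two assumptions that the policy does not state: lateness is counted in whole days, and the deduction is taken as a percentage of the marks the submission would otherwise have earned. The function names are hypothetical.

```python
def late_deduction_percent(days_late: int) -> int:
    """Return the deduction in percent: 5% per late day, capped at 10 days."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    return 5 * min(days_late, 10)


def apply_late_policy(score: float, days_late: int) -> float:
    """Scale an earned score by the late deduction.

    ASSUMPTION: the deduction is a percentage of the earned score,
    so the score is multiplied by (100 - deduction) / 100.
    """
    return score * (100 - late_deduction_percent(days_late)) / 100


if __name__ == "__main__":
    print(apply_late_policy(80, 3))   # three days late: 15% deduction
    print(apply_late_policy(90, 25))  # deduction is capped at 50%
```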

Reasonable Accommodation Policy

Any student in this course who has a disability that may prevent the full demonstration of his or her abilities should contact me personally as soon as possible so we can discuss accommodations necessary to ensure full participation and facilitate your educational opportunities.

Should you need modifications or adjustments to your course requirements because of documented pregnancy-related or childbirth-related issues, please contact me as soon as possible to discuss. Generally, modifications will be made where medically necessary and similar in scope to accommodations based on temporary disability. Please see this for commonly asked questions.

Title IX Resources

For any concerns regarding gender-based discrimination, sexual harassment, sexual misconduct, stalking, or intimate partner violence, the University offers a variety of resources, including advocates on call 24/7, counseling services, mutual no-contact orders, scheduling adjustments, and disciplinary sanctions against the perpetrator. Please contact the Sexual Misconduct Office at 405-325-2215 (8-5, M-F) or OU Advocates at 405-615-0013 (24/7) to learn more or to report an incident.

Calendar

Date | Topics | Materials
1/26 | Introduction to Computer Vision (video) | overview, better and bitter lessons of AI
1/28 | Introduction to Linux and Python (video) | OpenCV getting started, OpenCV capture video
2/02 | More introduction to Python, color (video) | color
2/04 | Introduction to OpenCV (video) | OpenCV color examples, filter
2/09 | Decomposable filters (video) | A short introduction to signals and systems
2/11 | Review of signals and systems (video) |
2/16 | Cancelled due to weather conditions |
2/18 | Fourier Transform (video) | frequency, OpenCV filtering examples
2/23 | Image resampling and image pyramids (video) | resampling, pyramids
2/25 | Edge detection (video) | edge-detection, OpenCV edge detection
3/02 | Image warping (video) | Image warping
3/04 | cvui, Harris corner detection (video) | Harris corner detector, OpenCV corner detector demo
3/09 | Local feature extractor, SIFT (video) | Local feature extraction
3/11 | LBP, feature matching, image alignment (video) | Image alignment
3/16 | Presentations |
3/18 | Presentations |
3/23 | Presentations |
3/25 | Presentations, Hough transform (video) | OpenCV Hough transform examples
3/30 | Optical flow, Lucas-Kanade, Horn-Schunck and Gunnar Farneback Algorithms (video) | Optical flow
4/01 | Optical flow demo (video) |
4/06 | Instructional Holiday |
4/08 | Object Tracking, KLT (video) | Object tracking, KLT tracker, camera model, point cloud demo, OpenCV calibration demo
4/13 | Camera models (video) | Epipolar geometry, OpenCV epipolar demo
4/15 | Camera calibration (video) |
4/20 | Introduction to Epipolar geometry (video) | epipolar geometry and fundamental matrix
4/22 | Introduction to deep learning (video) | deep learning, backprop example and weight initialization
4/27 | SVM, multi-layer perceptron, backpropagation algorithm (video) |
4/29 | Convolutional neural networks (video) | CNN, CNN applications