Computer Science - The City College of New York
CSC 471 - Fall 2018 3D Computer Vision
Assignment 1 (Deadline: Sunday, Sept 16, before midnight)
======================================================================
Note: All written work for this assignment must be submitted as a
"soft" copy (a single PDF file), sent to Prof. Zhu
<cv.zhu.ccny@gmail.com> as an email attachment. You are
responsible for the loss of your submission if you don't
include "CSc 471 Computer Vision Assignment 1" (exactly) in the
subject of your email and write your name in the email body. For the
programming part, in addition to the written report, please also send
your source code in its original format; please don't convert it to
PDF or Word. Please don't send your image files or executables. You
may want to include images in your report, as they show the results
of your work. Do write your name and ID (last four digits) in both
the report and the code files.
1. Writing Assignments (10x4 = 40 points)
(1). How does an image change (e.g., objects' sizes in the
image, field of view, etc.) if the focal length of a pinhole
camera is varied?
(2). Give an intuitive explanation of
the reason why a pinhole camera has an infinite depth of field.
(3). In the thin lens model, 1/o + 1/i = 1/f, there are three
variables: the focal length f, the object distance o and the image
distance i (please refer to Slide # 19 of the Image Formation
lecture). If we define Z = o-f and z = i-f, please write a few words
to describe the physical meanings of Z and z, and then prove that
Z*z = f*f given 1/o + 1/i = 1/f.
(4). Prove that, in the pinhole camera model, three
collinear points (i.e., they lie on a line) in 3D space are
imaged into three collinear points on the image plane. You may
either use geometric reasoning (with line drawings) or algebraic
deduction (using equations).
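Before attempting the proof in (3), the identity Z*z = f*f can be
sanity-checked numerically. The following is a minimal sketch in plain
Python, not part of the assignment itself (which prefers Matlab). The
function name image_distance and the sample values of f and o are
assumptions chosen only for illustration, and a numerical check is of
course not a substitute for the proof.

```python
import math

def image_distance(f, o):
    """Solve the thin lens equation 1/o + 1/i = 1/f for i (requires o != f)."""
    return 1.0 / (1.0 / f - 1.0 / o)

f = 50.0  # hypothetical focal length, in mm
for o in (75.0, 100.0, 1000.0):  # hypothetical object distances, in mm
    i = image_distance(f, o)
    Z = o - f
    z = i - f
    # Z*z should agree with f*f up to floating-point rounding
    assert math.isclose(Z * z, f * f, rel_tol=1e-9)
    print(f"o={o:7.1f}  i={i:8.3f}  Z*z={Z * z:10.3f}  f*f={f * f:10.3f}")
```

Trying a few more values of o (both close to f and far from it) may
also help with the "physical meaning" part of the question.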
2. Programming Assignments (15x4 = 60 points)
(Matlab preferred - here is a quick Matlab tutorial. You may use C++
or Java if you like, but you may need to bring your own machine to my
office hours to run your programs when I ask you. If you don't have a
Matlab license, CUNY has recently made several software packages
(including MathWorks MATLAB) available through the CUNY Virtual
Desktop.)
Image formation. In this small project, you are going to use
Matlab to read, manipulate and write image data. The purpose of the
project is to familiarize you with the basics of digital image
formation. Your program should do the following things:
- Read in a color image C1(x,y) = (R(x,y), G(x,y), B(x,y)) in
Windows BMP format, and display it.
- Display the images of the three color components, R(x,y),
G(x,y) and B(x,y), separately. You should display three
black-white-like images.
- Generate an intensity image I(x,y) and display it. You should
use the equation I = 0.299R + 0.587G + 0.114B (the NTSC standard
for luminance) and tell us how the resulting intensity image differs
from one computed as a simple average of the R, G and B components.
- The original intensity image should have 256 gray levels. Please
uniformly quantize this image into K levels (with K=4, 16, 32, 64).
As an example, when K=2, pixels whose values are below 128 are set to
0, and all others to 255. Display the four quantized images for the
four different K values and tell us how closely they still resemble
the original.
- Quantize the original three-band color image C1(x,y) into K-level
color images CK(x,y) = (R'(x,y), G'(x,y), B'(x,y)) (with uniform
intervals), and display them. You may choose K=2 and 4 (for each
band). Do they have any advantages in viewing and/or in computer
processing (e.g., segmentation)?
- Quantize the original three-band color image C1(x,y) into a color
image CL(x,y) = (R'(x,y), G'(x,y), B'(x,y)) using a logarithmic
function, and display it. You may choose the function I' = C ln(I+1)
for each band, where I is the original value (0~255), I' is the
quantized value, C is a constant that scales I' into the range 0~255,
and ln is the natural logarithm. Please find the best C value so that
for an input in the range 0~255, the output range is still 0~255.
Note that when I = 0, I' = 0 too.
For each of the above, please provide your analysis, observations
and conclusions, rather than just showing the experimental results
in images and/or charts.
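The per-pixel arithmetic behind the intensity, uniform quantization and
logarithmic steps can be sketched on single values. This is a hedged
illustration in plain Python rather than the Matlab starting code, and
all three function names are hypothetical. The uniform quantizer is one
possible generalization of the K=2 example above (levels spread evenly
from 0 to 255); other choices of representative level are equally
valid. The constant c in log_map is deliberately left as a parameter,
since finding a suitable value is part of the exercise.

```python
import math

def luminance(r, g, b):
    """NTSC luminance I = 0.299R + 0.587G + 0.114B, rounded to an integer."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def quantize_uniform(v, k):
    """Map v in 0..255 to one of k >= 2 levels spread evenly over 0..255."""
    q = v * k // 256                 # index of the interval, 0 .. k-1
    return round(q * 255 / (k - 1))  # representative gray level

def log_map(v, c):
    """I' = c * ln(I + 1); the caller supplies the scaling constant c."""
    return c * math.log(v + 1)

# A pure green pixel: weighted luminance versus simple average
print(luminance(0, 255, 0), round((0 + 255 + 0) / 3))

# K=2 reproduces the example in the text; K=4 gives four gray levels
print([quantize_uniform(v, 2) for v in (0, 127, 128, 255)])
print([quantize_uniform(v, 4) for v in (0, 63, 64, 200, 255)])

# Whatever c is chosen, an input of 0 maps to 0
print(log_map(0, 10.0))
```

In Matlab the same arithmetic would normally be applied to whole image
arrays at once rather than pixel by pixel.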
I have provided a piece of starting code for you to use. Questions a
and b (the first two items in the list above) have already been done;
you only need to work on c to f (15x4 = 60 points). You may use Prof.
Zhu's old ID picture to test your algorithm.