Professor Zhigang Zhu
Department of Computer Science
The City College of New York and Graduate Center
The City University of New York (CUNY)
Time: Tuesday 6:30 - 8:30 pm
Room: 3209
Credits: 3.0
Office Hours: Tuesday 4:30 - 6:00 pm, Room 4439
Data mining has become one of the most exciting and fastest-growing fields in computer science. It refers to various techniques that can be used to uncover hidden information in a database. The data to be mined may be complex multimedia data, including text, graphics, video, audio, and bioinformatics data. Data mining has evolved from several areas, including databases, artificial intelligence, machine learning, pattern recognition, and multimedia information retrieval, and it can be applied to the exploration of hidden information in web, text, image, audio, video, and bioinformatics data.
Multimedia data mining is also related to multimedia data compression. Data compression is the technique of reducing redundancy in data representation in order to decrease storage requirements and, hence, the communication load when the data are transmitted over a network. If the compressed data are properly indexed, compression may also improve the performance of mining large compressed databases. This is particularly useful when the data mining system is interactive.
This course is designed to provide graduate students with an introduction to multimedia data compression and data mining concepts and tools and, to some extent, the connections between them. In addition, students in the class will explore the literature on state-of-the-art research and development in some advanced topics such as web mining, image/video mining, and bioinformatics.
The course will consist of lectures by the instructor and presentations by students on their readings and project assignments.
1. Introduction, Related Topics & Course Organization (pdf)
2. Overview of Multimedia Data Compression (pdf)
3. Overview of Multimedia Data Mining (pdf)
2. Clustering: hierarchical, partitional, clustering in large databases
3. Association Rules: basic and advanced algorithms
1. Information theory concepts
2. Data compression issues: models, measures and algorithms
3. Text compression: LZ77, LZ78, LZW algorithms
4. Image compression: principles, JPEG, JPEG2000
5. Video compression: MPEG-2, MPEG-4/7
1. Text Mining: keyword-based, text retrieval, similarity-based, etc.
2. Web Mining: contents, structure and usage
2. Image/Video Mining: CBIR, video event detection
3. Bioinformatics: biology preliminaries, information aspects, microarray data clustering
Students who take the course for credit will be required
(1) to complete 2 assignments consisting mainly of written work (20%);
(2) to attend class lectures, presentations, and discussions (20%);
(3) to give two presentations (40 minutes and 10 minutes, respectively) to the class on their reading assignments (30%); and
(4) to submit a report presenting a synthetic proposal based on their readings and/or designs, and to give a presentation (1 hour) in class (30%).
Copyright © Zhigang Zhu (zhu at cs.ccny.cuny.edu), 2007.