Readings in Media Processing:

Multimedia Data Compression and Data Mining 

CSc 80000, Spring 2007

Professor Zhigang Zhu
Department of Computer Science
The City College of New York and Graduate Center
The City University of New York (CUNY)

Assignment 1

Please write a brief description of the following: (1) Your background and experience related to data mining, such as courses, readings, research projects, work experience and/or hobbies. (2) Topics in data mining that interest you. Please refer to the reading topics below. You may choose more than one topic, from within or outside the listed reading topics. Your description should be at least two pages long. Based on your description, I will discuss your interests with you and then assign one reading topic to each of you.

Assignment 2

On the selected reading topic, each of you will give 2-3 PPT presentations and submit a final report on your synthetic project. Assignment 2 is designed to help you prepare your readings, presentations and reports. For this assignment, please submit the following: (1) The sub-topics you are going to cover in your reading presentations. (2) A more focused sub-topic you are going to use for your synthetic project. (3) The references you have selected (with full citations: authors, titles, sources, volumes, numbers, years, pages and publishers). I will then help you refine your reading topics and references.

Reading/Synthetic Topics (under editing)

1. Classification in Data Mining

- An overview of Bayesian, KNN, ID3, ANN, rule-based classifiers, etc. (Ch. 5)
- A focused subtopic, e.g. Support Vector Machines (Ch. 5.5)
- Synthetic project, e.g. multimodal video or image classification using SVM

W.-H. Lin and A. Hauptmann. News video classification using SVM-based multimodal classifiers and combination strategies. In ACM Multimedia, Juan-les-Pins, France, 2002. http://citeseer.ist.psu.edu/lin02news.html

Christopher J. C. Burges. "A Tutorial on Support Vector Machines for Pattern Recognition". Data Mining and Knowledge Discovery 2:121 - 167, 1998

Wikipedia SVM
Support Vector Machine Links

2. Clustering in Data Mining

- An overview of hierarchical clustering, partitional clustering, and clustering in large databases (Ch. 6)
- A focused subtopic, e.g. model-based clustering (EM) (Ch. 6.3.1)
- Synthetic project, e.g. motion segmentation and object tracking using EM

Hai Tao, Harpreet S. Sawhney, Rakesh Kumar, "Object tracking with Bayesian estimation of dynamic layer representations," IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), vol. 24, no. 1, pp. 75-89, 2002.

Wikipedia EM

3. Association Rules in Data Mining

- An overview of the basic rules (Ch. 7)
- Some advanced algorithms with image and video mining/segmentation
- Synthetic project, e.g. association rules in image mining (Ch. 7.11)

Carlos Ordonez and Edward Omiecinski. Discovering association rules based on image content. In IEEE ADL Conference, 1999. http://citeseer.ist.psu.edu/ordonez99discovering.html

Jelena Tešić, Shawn Newsam, and B.S. Manjunath, "Mining Image Datasets using Perceptual Association Rules," Proceedings of SIAM Sixth Workshop on Mining Scientific and Engineering Datasets in conjunction with the Third SIAM International Conference (SDM), San Francisco, California, May 2003.

4. Text Compression/Mining

- An overview of keyword-based and similarity-based text retrieval, etc. (Ch. 9.2)
- Text compression (Ch. 3.12), string matching and compressed pattern matching (Ch. 4.5)
- Synthetic project on latent semantic analysis (LSA) (Ch. 9.25)

Latent semantic analysis - Wikipedia, the free encyclopedia
LSA @ CU Boulder

5. Web Mining

- An overview of content, structure and usage mining (Ch. 9.5; see also Dunham's book)
- Multimedia data compression and mining on the Web
- Synthetic project, e.g. on multimedia data compression and mining on the Internet

Mining the Web for Object Recognition

6. Image Compression & Mining

- An overview of content-based image retrieval (Ch. 9.3)
- Image compression with DCT, wavelet and PCA (Ch. 3.8 - 3.11)
- Synthetic project, e.g. on appearance-based image matching using PCA (Ch. 9.3.6)

Appearance-Based Robotics
Abstract: Image Mining by Matching Exemplars

7. Audio Compression & Mining

- An overview of phonetic audio mining, audio searching, speech analytics
- A focused subtopic, e.g. on audio coding and mining compressed audio?
- Synthetic project, e.g. on searching spoken words in audio/video files?

http://jmdl.com/howard/audio-mining.html

8. Video Compression & Mining

- Overview of video mining (Ch. 9.4.2)
- Video coding and compression - MPEG 2, 4, 7 (Ch. 9.4.1 and online)
- Synthetic project, e.g. on content-based video coding and event detection

DIMACS Workshop on Video Mining, November 4-6, 2002
MERL – Video Mining
Video Representation With Three-dimensional Entities
Pedro M. Q. Aguiar and José M. F. Moura, "Video Representation via 3D Shaped Mosaics," ICIP '98, IEEE Proceedings of International Conference on Image Processing, Chicago, Illinois, October 1998.

9. Data Mining in Bioinformatics

- Biology preliminaries (Ch. 10.2) and information aspects (Ch. 10.3)
- Approximate string matching (Ch. 4.4)
- Synthetic project, e.g. on microarray data clustering (Ch. 10.4) or LSA in bioinformatics

Application of latent semantic analysis to protein remote homology ...
Gene clustering by latent semantic indexing of MEDLINE abstracts

10. 3D Shape Representation and Graphic Mining

- Overview of 3D model representation
- 3D shape-based retrieval and analysis
- Graphic mining?

3D Shape-Based Retrieval and Analysis at Princeton University


Textbook:

Data Mining: Multimedia, Soft Computing, and Bioinformatics, Sushmita Mitra, Tinku Acharya, ISBN: 0-471-46054-0, Hardcover, 424 pages, September 2003


Copyright © Zhigang Zhu (zhu at cs.ccny.cuny.edu), 2007.