A laser Doppler vibrometer (LDV) is a non-contact, remote, high-resolution voice detector: the vibration that a voice induces in nearby objects carries the voice signal itself. After enhancement with Gaussian bandpass filtering and adaptive volume scaling, LDV voice signals were mostly intelligible from targets without retro-reflective finishes at short or medium distances (< 100 m). With retro-reflective tape, the distance could be as far as 300 meters. Infrared (IR) imaging for target selection and localization was also discussed for LDV listening. A system has been set up with three types of sensors (IR cameras, PTZ color cameras and LDVs) to integrate multimedia sensors for human signature detection. The basic idea is to provide an advanced augmented interface that gives users the best cognitive understanding of the environment, the sensors and the events.
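
The two enhancement steps named above, Gaussian bandpass filtering and adaptive volume scaling, can be illustrated with a minimal sketch. It is not the project's implementation. The function names, the frequency-domain formulation, the window length, the target level and the gain cap are all assumptions chosen for illustration, and the code assumes NumPy is available.

```python
import numpy as np

def gaussian_bandpass(signal, fs, center_hz, sigma_hz):
    """Weight the spectrum with a Gaussian centered on center_hz.

    center_hz and sigma_hz are illustrative parameters (assumptions),
    not values taken from the project.
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    weights = np.exp(-0.5 * ((freqs - center_hz) / sigma_hz) ** 2)
    return np.fft.irfft(spectrum * weights, n=len(signal))

def adaptive_volume_scale(signal, fs, window_s=0.05, target_rms=0.1, max_gain=20.0):
    """Scale each short window toward target_rms, with a capped gain.

    The cap keeps near-silent windows from being amplified into pure noise.
    All default values are assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    win = max(1, int(window_s * fs))
    out = np.empty_like(signal)
    for start in range(0, len(signal), win):
        chunk = signal[start:start + win]
        rms = np.sqrt(np.mean(chunk ** 2))
        gain = min(max_gain, target_rms / rms) if rms > 0 else 1.0
        out[start:start + win] = chunk * gain
    return out
```

A typical use would chain the two, for example `adaptive_volume_scale(gaussian_bandpass(x, fs, 1000.0, 300.0), fs)`. Per-window gains can introduce audible steps at window boundaries, so a practical version would likely smooth the gain over time.
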
The research challenge is that, without retro-reflective tape treatment, LDV voice signals were still very noisy from targets at medium and large distances. Therefore, even with state-of-the-art sensor technology, more advanced signal enhancement techniques are needed, and further sensor improvement is also necessary. In addition, automatic targeting and intelligent refocusing are technical issues that deserve research attention for long-range LDV listening.
To improve the performance and efficiency of Laser Doppler Vibrometers (LDVs) for long-range hearing, we design an active multimodal sensing platform that integrates a Pan-Tilt-Zoom (PTZ) camera, a mirror and a Pan-Tilt-Unit (PTU) with the LDV. With the assistance of the vision and active control components, the LDV can automatically select the best reflective surfaces, point the laser beam at the selected surfaces, and quickly focus the laser beam. To accomplish these functions, distance measurement and sensor calibration methods are proposed that use triangulation between the PTZ camera and the mirrored LDV laser beam. Based on both the measured distances and the return signal levels of the LDV, a fast and automatic LDV focusing algorithm is designed. Furthermore, surface selection and laser pointing strategies are designed so that the platform can automatically point the laser beam at the designated surfaces.
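
As a rough illustration of how triangulation and return signal level could work together, the sketch below computes a laser-to-spot distance from a planar triangle and then refines a focus setting by scanning for the strongest return signal. It is a simplified sketch under stated assumptions, not the platform's algorithm. The planar geometry, the function names, the integer focus-step units, and the `read_signal_level` callback are all hypothetical.

```python
import math

def laser_range_by_triangulation(baseline_m, camera_angle_rad, laser_angle_rad):
    """Distance from the laser origin to the laser spot (law of sines).

    Assumes a planar triangle formed by the camera, the laser origin
    (e.g. the mirror), and the laser spot. Both angles are interior
    angles measured from the camera-laser baseline.
    """
    apex = math.pi - camera_angle_rad - laser_angle_rad
    if apex <= 0:
        raise ValueError("camera ray and laser beam do not converge")
    return baseline_m * math.sin(camera_angle_rad) / math.sin(apex)

def refine_focus(initial_step, read_signal_level, search_radius=50, stride=5):
    """Scan focus steps around an initial guess; keep the strongest return.

    initial_step would come from a distance-to-focus mapping (not shown,
    since the source does not specify one); read_signal_level is a
    hypothetical callback returning the LDV return signal level.
    """
    best_step = initial_step
    best_level = read_signal_level(initial_step)
    for step in range(initial_step - search_radius,
                      initial_step + search_radius + 1, stride):
        level = read_signal_level(step)
        if level > best_level:
            best_step, best_level = step, level
    return best_step
```

The design intent in this sketch is that the measured distance narrows the search to a small window of focus settings, so the signal-level scan only has to cover that window rather than the full focus range.
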
Z. Zhu, W. Li, E. Molina and G. Wolberg, LDV Sensing and Processing for Remote Hearing in a Multimodal Surveillance System, Chapter 4 in Multimodal Surveillance: Sensors, Algorithms and Systems, Z. Zhu and T. S. Huang (eds), ISBN-10: 1596931841, Artech House Publisher, July 2007, pp 59-90.
W. Li, Z. Zhu and G. Wolberg, Remote Voice Acquisition in Multimodal Surveillance, accepted to IEEE International Conference on Multimedia & Expo (ICME), Toronto, Canada, July 9-12 2006, oral presentation, acceptance rate 22%
Professor Thomas Huang, University of Illinois at Urbana-Champaign (UIUC)
Professor Ning Xiang, Rensselaer Polytechnic Institute (RPI)
Professor George Wolberg, Department of Computer Science, The City College of New York
Research Associate and Assistants at CUNY:
Yufu Qu, Tao Wang, Edgardo Molina, Rui Li, Wai L. Khoo, Weihong Li
Related Grants
NCIIA E-Team Award, “Automating Long-range Vibrometry through Vision and Web Technologies” (#6629-09), PI: Z. Zhu, 09/01/2009-01/31/2011
AFRL/HECB, Award No. F33615-03-1-63-83, Integration of Laser Vibrometry with Infrared Video for Multimedia Surveillance Displays, PI: Z. Zhu, 08/24/03-10/24/04
CUNY Research Equipment Grant Award, Integration of Laser Vibrometry, Infrared and Video for Multimodal Human Detection, Co-PIs: Z. Zhu and G. Wolberg, 02/19/04-02/18/05