Remote Hearing: a Novel Use of the LDV Sensor
A laser Doppler vibrometer (LDV) is a non-contact, remote, high-resolution vibration sensor that can serve as a voice detector. Objects near a speaker vibrate in response to the voice, and these vibrations carry the voice signal itself. After enhancement with Gaussian bandpass filtering, Wiener filtering, and adaptive volume scaling, the LDV voice signals were mostly intelligible from targets without retro-reflective finishes at short or medium distances (< 100 m). With retro-reflective tape on the target, the distance could be as far as 300 meters.
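As a rough illustration of the Gaussian bandpass step, the sketch below applies a Gaussian-shaped gain in the frequency domain. The function name `gaussian_bandpass` and the default `center_hz` and `sigma_hz` values are assumptions chosen for illustration; the page does not state the filter parameters that were actually used.

```python
import numpy as np

def gaussian_bandpass(signal, fs, center_hz=1500.0, sigma_hz=500.0):
    """Weight the spectrum with a Gaussian centered on the speech band.

    center_hz and sigma_hz are illustrative guesses, not values from the
    experiments described on this page.
    """
    x = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    # Gain is 1 at center_hz and falls off smoothly on both sides, which
    # suppresses low-frequency vibration and high-frequency speckle noise.
    gain = np.exp(-0.5 * ((freqs - center_hz) / sigma_hz) ** 2)
    return np.fft.irfft(spectrum * gain, n=x.size)
```

A Gaussian gain has no sharp cutoff, so it avoids the ringing that an ideal rectangular passband would introduce, at the cost of some attenuation inside the speech band.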
Here are a few audio clips (in MP3 or WAV format) captured by the LDV, before and after processing.
Experiment 1.
The waveforms of the original signal and of the results of fixed scaling and adaptive scaling, after suitable filtering. The short audio clip reads “I am whispering…(noise)… OK … Hello (noise)”; it was captured by the LDV OFV-505 from a metal cake box carried by a person at a distance of about 30 meters from the LDV. The surface of the target was covered with a piece of retro-tape.
(a) Original LDV signal
(b) x1 after band-pass filtering
(c) x8 after band-pass filtering
(d) Adaptive scaling after band-pass filtering
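The page does not spell out how the adaptive scaling is computed. One simple reading, sketched here under that caveat, is a frame-by-frame gain that brings each frame's peak up to a target level, compared with a single fixed gain such as the x8 case above. The names `fixed_scale` and `adaptive_scale` and the parameters `frame_len`, `target_peak`, and `max_gain` are hypothetical.

```python
import numpy as np

def fixed_scale(x, gain=8.0, limit=1.0):
    """Multiply by one constant gain and clip to the output range."""
    return np.clip(np.asarray(x, dtype=float) * gain, -limit, limit)

def adaptive_scale(x, frame_len=400, target_peak=0.9, max_gain=8.0):
    """Scale each frame so its peak approaches target_peak (assumed scheme)."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    for start in range(0, x.size, frame_len):
        frame = x[start:start + frame_len]
        peak = np.max(np.abs(frame))
        # Cap the gain so that near-silent frames do not blow up the noise.
        gain = max_gain if peak == 0 else min(max_gain, target_peak / peak)
        out[start:start + frame_len] = frame * gain
    return out
```

With a fixed gain, loud passages clip while quiet ones may still be too faint; a per-frame gain keeps both within range. A practical version would probably smooth the gain between neighboring frames to avoid audible jumps.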
Experiment 2. Long-range LDV listening experiment. A metal cake box (left) is used, with a piece of 3M traffic retro-tape pasted on it. The laser spot can be clearly seen. The signal return of the LDV is insensitive to the incident angle of the laser beam, thanks to the retro-tape finish. Both normal speech volumes and whispers have been successfully detected. The size of the laser spot grew from less than 1 mm to about 5-10 mm as the range increased from 30 to 300 meters. The noise level also increased from 2 mV to 10 mV, out of the 20 V total range of the analog LDV signal. The 260-meter measurement was obtained when the target was behind trees and bushes. At longer ranges, the laser is more difficult to localize and focus, and the signal return becomes weaker, so the noise level rises.
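To put the noise figures in perspective, the short calculation below expresses each noise level relative to the 20 V signal range in decibels. Treating the 20 V range as the full-scale reference is an assumption made for this illustration, and the result is a measure of headroom above the noise floor, not a speech SNR. The helper name `headroom_db` is hypothetical.

```python
import math

def headroom_db(full_scale_v, noise_v):
    """Ratio of the full signal range to the noise level, in dB."""
    return 20.0 * math.log10(full_scale_v / noise_v)

near = headroom_db(20.0, 0.002)  # 2 mV noise at the 30 m end of the range
far = headroom_db(20.0, 0.010)   # 10 mV noise at the 300 m end of the range
loss = near - far                # headroom lost as the range increases
```

Under this assumption the headroom is about 80 dB at the short range and about 66 dB at the long range, a loss of roughly 14 dB.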
Within 120 meters, the LDV voice is clearly intelligible; at a distance of 260 meters, many parts of the speech could still be identified, though with some difficulty. At all distances, signal processing plays a significant role in making the speech intelligible. Without processing, the audio signal is buried in low-frequency, large-amplitude vibration and high-frequency speckle noise.
Table 1. Long-range LDV listening via retro-tape on a cake box and Gaussian band-pass filtering
Experiment 3. LDV voice enhancement comparison (please click the spectrograms to hear the corresponding audio clips). The LDV audio signal was captured 100 feet away by aiming the laser beam at a metal cake box (without retro-reflective finish), and the clean signal was captured using a wireless microphone connected to a laptop placed next to the target (i.e., the metal box).
Spectrograms of (a) the original LDV signal, (b) the Gaussian bandpass filtered signal, (c) the Wiener filtered signal, (d) the Wiener filtered + Gaussian bandpass filtered signal, (e) the Wiener filtered + Hann bandpass filtered signal, and (f) the clean signal. All correspond to the speech "Hello, Hello".
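For readers unfamiliar with Wiener filtering, here is a minimal frequency-domain sketch. It is not necessarily the variant used for the clips above: it assumes the first few frames of the recording contain noise only, estimates the noise power spectrum from them, and applies the gain S/(S+N) to every frame. The function name `wiener_denoise` and the parameters `frame_len` and `noise_frames` are hypothetical.

```python
import numpy as np

def wiener_denoise(x, frame_len=256, noise_frames=6):
    """Short-time Wiener filter; assumes the first frames are noise only."""
    x = np.asarray(x, dtype=float)
    hop = frame_len // 2
    # Periodic Hann window: copies shifted by half a frame sum to one,
    # so overlap-add reconstructs the interior of the signal.
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = 1 + (x.size - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spec = np.fft.rfft(frames, axis=1)
    power = np.abs(spec) ** 2
    noise_psd = power[:noise_frames].mean(axis=0)
    # Wiener gain S / (S + N), with the speech power S estimated as
    # max(P - N, 0) and S + N approximated by the observed power P.
    gain = np.maximum(power - noise_psd, 0.0) / np.maximum(power, 1e-12)
    cleaned = np.fft.irfft(spec * gain, n=frame_len, axis=1)
    out = np.zeros(x.size)
    for i in range(n_frames):
        out[i * hop:i * hop + frame_len] += cleaned[i]
    return out
```

The input must be at least one frame long, and the first and last half frames are tapered by the window. Because the gain is computed per frequency bin, this step removes broadband noise inside the passband that a bandpass filter alone cannot reach, which is one plausible reason for combining the two.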
Experiment 4. Comparison of the SNR values of LDV audio signals enhanced by various methods, namely Gaussian bandpass only, Wiener filter only, and the combined approach. Two possible combination strategies, i.e., bandpass filter followed by Wiener filter (BW) and Wiener filter followed by bandpass filter (WB), were tested, and the results are shown in the last column of Table 2. Three different types of reflecting surfaces were tested: the small empty metal cake box (with retro-tape), the metal box surface itself (without retro-tape), and the wood hose box surface. All were 100 feet away from the sensor head.
Table 2: The segmental SNRs (dB) - click on numbers to hear the audio clips
*BW: Gaussian bandpass followed by Wiener filter; WB: Wiener filter followed by Gaussian bandpass
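For reference, a segmental SNR is commonly computed as the average of per-frame SNRs against a clean reference, with each frame's value clamped to a fixed range. The sketch below follows that common definition; the frame length and the clamping limits of -10 dB and 35 dB are conventional choices assumed here, not values reported on this page. It also assumes the processed LDV signal has already been time-aligned and level-matched to the clean microphone signal. The function name `segmental_snr` is hypothetical.

```python
import numpy as np

def segmental_snr(clean, processed, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo_db, hi_db]."""
    clean = np.asarray(clean, dtype=float)
    processed = np.asarray(processed, dtype=float)
    # Use only whole frames that both signals cover.
    n = min(clean.size, processed.size) // frame_len * frame_len
    s = clean[:n].reshape(-1, frame_len)
    e = s - processed[:n].reshape(-1, frame_len)
    eps = 1e-12  # guards against division by zero and log of zero
    snr = 10.0 * np.log10((np.sum(s ** 2, axis=1) + eps)
                          / (np.sum(e ** 2, axis=1) + eps))
    return float(np.mean(np.clip(snr, lo_db, hi_db)))
```

Averaging per frame rather than over the whole recording keeps a few loud frames from dominating the score, which suits speech with its alternating loud and quiet segments.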