Spectrogram Audio Movement Detection

LabRat on 13 Aug 2022
Commented: William Rose on 14 Aug 2022
I am looking for a way to automatically detect audio movement or occlusion. I plan to have an object constantly play a sound close to its microphone. I expect that moving the object toward walls, into boxes, etc. will change the volume of certain frequencies in the spectrogram, whereas when the object is away from walls and not moving, the spectrogram should stay roughly constant.
Is there an easy way to automatically detect:
-when the object is close to a wall, in a box, etc.? (The spectrogram of the recorded tone will differ from the spectrogram of the original audio on the computer [transfer function].)
-when the object is moving and being touched? (Changes in the spectrogram will occur in real time.)
I am considering using the Deep Learning Toolbox and the Audio Toolbox.
  3 Comments
LabRat on 13 Aug 2022
Eventually, I'd like to be able to do both. But it might be better to start simpler and try to just detect any significant change in volume of certain frequency ranges (object being touched or held). Do you have any suggestions for that?
William Rose on 14 Aug 2022
@LabRat, Yes, I have a suggestion. Decide on frequency ranges. Suppose you choose nine octaves from 25 Hz to 12.8 kHz: 25-50, 50-100, ..., 6400-12800 Hz.
Decide on a time resolution, for example 0.5 second. In other words, you will compare the spectrogram in 0.5-second-long chunks.
Then collect data for, say, 1 minute. The first 10 seconds should be with nothing moving and nothing being touched. For this 10-second period you will have 20 spectrograms, each 0.5 seconds long. Find the mean and SD of the log amplitude for each octave (i.e. do the analysis on the intensity measured in dB). Save that info. Then analyze the remaining 50 seconds = 100 spectrograms. See if any of the frequency bands are more than 3 SDs louder or softer than the baseline condition. That at least gets you started.
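The baseline comparison described above can be sketched roughly as follows. This is only an illustration under assumptions: it is written in Python rather than MATLAB, it takes the per-band levels in dB as already computed from the spectrogram, and the function name `flag_band_changes`, the synthetic data, and the 12 dB step are all made up for the example.

```python
import random
import statistics

def flag_band_changes(levels_db, n_baseline, threshold_sd=3.0):
    """Flag (chunk, band, z) where a band level leaves the baseline range.

    levels_db[t][b] is the level in dB of band b in time chunk t.
    The first n_baseline chunks define the per-band mean and SD.
    """
    n_bands = len(levels_db[0])
    baseline = levels_db[:n_baseline]
    means = [statistics.mean([row[b] for row in baseline]) for b in range(n_bands)]
    sds = [statistics.stdev([row[b] for row in baseline]) for b in range(n_bands)]
    hits = []
    for t in range(n_baseline, len(levels_db)):
        for b in range(n_bands):
            z = (levels_db[t][b] - means[b]) / sds[b]
            if abs(z) > threshold_sd:
                hits.append((t, b, z))
    return hits

# Octave band edges from the text: 25, 50, ..., 12800 Hz (nine bands).
band_edges_hz = [25 * 2 ** k for k in range(10)]

# Synthetic stand-in for measured data (assumption): 120 chunks of 0.5 s,
# each band at about -40 dB with 1 dB of random fluctuation.
random.seed(1)
n_chunks, n_bands = 120, len(band_edges_hz) - 1
levels_db = [[random.gauss(-40.0, 1.0) for _ in range(n_bands)]
             for _ in range(n_chunks)]
levels_db[60][4] += 12.0  # pretend the 400-800 Hz band got louder at t = 30 s

# First 20 chunks (10 s) are the quiet baseline.
hits = flag_band_changes(levels_db, n_baseline=20)
for t, b, z in hits:
    print(f"t = {t * 0.5:.1f} s, band {band_edges_hz[b]}-{band_edges_hz[b + 1]} Hz, z = {z:+.1f}")
```

With real recordings, `levels_db` would come from summing spectrogram power within each octave and converting to dB. A band whose baseline SD is zero would need special handling, which this sketch does not include.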
In the above analysis, you will have done a statistical test on 9 bands x 100 times = 900 tests. If the data are normally distributed and you set your threshold at 3 standard deviations, you expect a hit in about 0.27% of cases due to chance alone, even if nothing really changed. So if you do 900 tests, you expect about 2.4 false alarms. You can reduce false alarms by only triggering when the value is more than +-4 SDs away from the mean, but you may miss some real changes. This is the standard tradeoff with event detection. You have to decide on a level that you are comfortable with.
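The false-alarm arithmetic can be checked with a few lines. This is a small sketch in Python (not MATLAB), assuming normally distributed data as stated above; the helper name `two_sided_tail` is made up for the example.

```python
import math

def two_sided_tail(k):
    """P(|Z| > k) for a standard normal variable Z."""
    return math.erfc(k / math.sqrt(2))

n_tests = 9 * 100  # 9 bands x 100 time chunks

for k in (3, 4):
    p = two_sided_tail(k)
    print(f"{k} SD threshold: p = {p:.4%}, expected false alarms = {n_tests * p:.2f}")
```

At 3 SD this gives about 0.27% per test and about 2.4 false alarms in 900 tests. At 4 SD the per-test rate drops to roughly 0.006%, so false alarms become rare, at the cost of missing smaller real changes.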


Answers (1)

William Rose on 13 Aug 2022
You asked: "Is there an easy way to automatically detect..."
No. There is not an easy way. As you know, a spectrogram has many frequencies, and analyzing the spectrogram from one moment to the next, to decide whether it has changed or whether the observed change is "just noise", is not easy. It requires experimentation and an understanding of the particular system being tested.
Consider the system in simple terms, and then gradually add complications to make it more like your real situation.
  1. Starting point: sound source and microphone in an open area. There are no echo effects in this case. Movement will affect the spectrogram because of the inverse square law relating intensity to distance.
  2. Add a partially reflective wall to the picture. Assume for the moment that it reflects all frequencies equally well. Now you can detect motion based on the combination of the inverse square law plus the echo effect. The echo effect adds a time-delayed copy (at lower amplitude) of the primary wave. The presence of a time-delayed copy in the measured signal can best be detected with autocorrelation analysis. Then you might reasonably ask: at what time scale must I measure autocorrelation? If the source is 1 m away from the reflecting wall, and the microphone is in the opposite direction from the wall, then the echo must go 2 m farther to reach the microphone. That takes about 6 milliseconds, since the speed of sound in air at sea level and normal temperature is about 340 m/s. Therefore autocorrelation analysis out to 20 msec will be sufficient to detect audio paths that are up to about 7 m longer than the direct path from source to microphone. There is a large literature on this subject. One example is here. Or anything by Leo Beranek (multiple books and articles), who pioneered the analysis of concert hall acoustics. Beranek also co-founded Bolt Beranek and Newman (BBN), the company that "started the internet".
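The echo timing and the autocorrelation idea in step 2 can be illustrated with a small simulation. This is a sketch in Python (not MATLAB) under stated assumptions: the 8 kHz sample rate, the echo gain of 0.5, the 0.5 s of synthetic white noise, and the helper names `echo_delay_s` and `autocorr` are all chosen for the example and are not from the discussion above.

```python
import random

SPEED_OF_SOUND = 340.0  # m/s, the value used in the text

def echo_delay_s(extra_path_m, c=SPEED_OF_SOUND):
    """Time delay of an echo that travels extra_path_m farther than the direct sound."""
    return extra_path_m / c

def autocorr(x, max_lag):
    """Normalized autocorrelation r[k] for k = 0..max_lag, with r[0] == 1."""
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]
    r0 = sum(v * v for v in xc)
    return [sum(xc[i] * xc[i + k] for i in range(n - k)) / r0
            for k in range(max_lag + 1)]

fs = 8000          # sample rate in Hz (assumption)
n = 4000           # 0.5 s of signal
echo_gain = 0.5    # echo amplitude relative to the direct sound (assumption)

# Direct sound: white noise. Echo: the same noise, delayed by the time
# needed to travel 2 m farther (source 1 m from the wall).
random.seed(0)
direct = [random.gauss(0.0, 1.0) for _ in range(n)]
delay_samples = round(echo_delay_s(2.0) * fs)
mic = [direct[i] + (echo_gain * direct[i - delay_samples] if i >= delay_samples else 0.0)
       for i in range(n)]

# Autocorrelation out to 20 ms lag, then look for the largest peak away from lag 0.
max_lag = round(0.020 * fs)
r = autocorr(mic, max_lag)
peak_lag = max(range(5, max_lag + 1), key=lambda k: r[k])

print(f"echo delay: {1000 * echo_delay_s(2.0):.2f} ms = {delay_samples} samples")
print(f"autocorrelation peak at lag {peak_lag} samples = {1000 * peak_lag / fs:.2f} ms")
print(f"extra path covered by 20 ms: {0.020 * SPEED_OF_SOUND:.1f} m")
```

With a white noise source the autocorrelation is near zero everywhere except at lag 0 and at the echo delay, which is why a wide band source makes the echo easy to see. A tonal source would produce periodic autocorrelation peaks that can mask the echo.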
Collect some data from a stationary source (preferably a wide band source; I recommend white noise). Then move the source around. Explore the spectrogram. Measure the autocorrelation out to 20 msec lag, every 100 msec. See whether the spectrogram or the autocorrelation shows obvious or subtle changes. Then think about moving on to deep learning. That's my recommendation.

