c++ - How to find silent parts in audio track


I have the following code that stores the raw audio data of a WAV file in a byte buffer:

byte header[74];
fread(&header, sizeof(byte), 74, inputfile);
byte * sound_buffer;
dword data_size;

fread(&data_size, sizeof(dword), 1, inputfile);
sound_buffer = (byte *)malloc(sizeof(byte) * data_size);
fread(sound_buffer, sizeof(byte), data_size, inputfile);

Is there an algorithm to determine when the audio track is silent (literally no sound) and when there is some sound level?

Well, your "sound" will be an array of values, whether integer or real - that depends on your format.
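To make the "array of values" concrete, here is a minimal sketch that turns a raw byte buffer into samples. It assumes 16-bit little-endian PCM with a single channel and a buffer that contains only sample data; `bytes_to_pcm16` is a made-up name, and other formats (8-bit, 24-bit, float) would need different decoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: decodes little-endian 16-bit PCM bytes into signed samples.
// Assumes the buffer holds only sample data (no header) and a single channel.
std::vector<std::int16_t> bytes_to_pcm16(const unsigned char * bytes, std::size_t byte_count) {
  assert(bytes != nullptr || byte_count == 0);
  std::vector<std::int16_t> samples;
  samples.reserve(byte_count / 2);
  for (std::size_t i = 0; i + 1 < byte_count; i += 2) {
    // Low byte first, then high byte; a trailing odd byte is ignored.
    const std::uint16_t raw = static_cast<std::uint16_t>(bytes[i] | (bytes[i + 1] << 8));
    // Reinterpret as signed (two's complement on all common platforms).
    samples.push_back(static_cast<std::int16_t>(raw));
  }
  return samples;
}
```

With the buffer from the question, the call would look roughly like `bytes_to_pcm16(sound_buffer, data_size)`, provided the file really is 16-bit PCM.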

For the file to be silent or "have no sound", the values in that array will have to be zero, or very close to zero, or, in the worst case scenario, if the audio has a bias, the value will stay the same instead of fluctuating around to produce sound waves.

You can write a simple function that returns the delta for a range, in other words the difference between the largest and the smallest value; the lower the delta, the lower the sound volume.
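A minimal sketch of such a delta function could look like this. The name `peak_to_peak` is made up for illustration, and returning a `double` is my own choice so that narrow integer sample types cannot overflow when the two extremes are subtracted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: difference between the largest and smallest of `count` samples.
// The result is a double so that e.g. 16-bit samples cannot overflow on subtraction.
template <typename Sample>
double peak_to_peak(const Sample * data, std::size_t count) {
  assert(data != nullptr && count > 0);
  // minmax_element returns a pair of pointers: (smallest, largest).
  const auto mm = std::minmax_element(data, data + count);
  return static_cast<double>(*mm.second) - static_cast<double>(*mm.first);
}
```

Note that a constant, biased signal gives a delta of zero here, which matches the "value stays the same" case above.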

Or alternatively, you can write a function that returns the ranges in which the delta is lower than a given threshold.

For the sake of toying, I wrote a nifty class:

#include <limits>
#include <utility>
#include <vector>

using uint = unsigned int;

template<typename t>
class silencefinder {
public:
  silencefinder(t * data, uint size, uint samples)
    : d(data), sbegin(0), s(size), samp(samples), status(Undefined) {}

  // Returns the silent regions as (begin, end) pairs in whole seconds.
  std::vector<std::pair<uint, uint>> find(const t threshold, const uint window) {
    auto r = findsilence(d, s, threshold, window);
    regionstotime(r);
    return r;
  }

private:
  enum Status {
    Silent, Loud, Undefined
  };

  // Opens a region when silence starts and closes it when sound returns.
  void togglesilence(Status st, uint pos, std::vector<std::pair<uint, uint>> & res) {
    if (st == Silent) {
      if (status != Silent) sbegin = pos;
      status = Silent;
    }
    else {
      if (status == Silent) res.push_back(std::pair<uint, uint>(sbegin, pos));
      status = Loud;
    }
  }

  // Closes a silent region that is still open when the data runs out.
  void end(uint pos, std::vector<std::pair<uint, uint>> & res) {
    if (status == Silent) res.push_back(std::pair<uint, uint>(sbegin, pos));
  }

  // Difference between the largest and smallest sample in the window.
  // Note: for narrow signed integer types, max - min may not fit in t.
  static t delta(t * data, const uint window) {
    // lowest(), not min(): for floating-point types min() is the smallest positive value.
    t min = std::numeric_limits<t>::max(), max = std::numeric_limits<t>::lowest();
    for (uint i = 0; i < window; ++i) {
      t c = data[i];
      if (c < min) min = c;
      if (c > max) max = c;
    }
    return max - min;
  }

  std::vector<std::pair<uint, uint>> findsilence(t * data, const uint size, const t threshold, const uint win) {
    std::vector<std::pair<uint, uint>> regions;
    uint window = win;
    uint pos = 0;
    Status s = Undefined;
    while ((pos + window) <= size) {
      if (delta(data + pos, window) < threshold) s = Silent;
      else s = Loud;
      togglesilence(s, pos, regions);
      pos += window;
    }
    // Scan the leftover samples that do not fill a whole window, if any.
    if (pos < size) {
      if (delta(data + pos, size - pos) < threshold) s = Silent;
      else s = Loud;
      togglesilence(s, pos, regions);
    }
    end(size, regions);
    return regions;
  }

  // Converts sample positions to seconds (integer division truncates).
  void regionstotime(std::vector<std::pair<uint, uint>> & regions) {
    for (auto & r : regions) {
      r.first /= samp;
      r.second /= samp;
    }
  }

  t * d;
  uint sbegin, s, samp;
  Status status;
};

I haven't tested it, but it looks like it should work. However, it assumes a single audio channel, so you will have to extend it in order to work with and across multichannel audio. Here is how you use it:

silencefinder<audiodatatype> finder(audiodataptr, sizeofdata, samplerate);
auto res = finder.find(threshold, scanwindow);
// and output the silent regions
for (auto r : res) std::cout << r.first << " " << r.second << std::endl;
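As for the single-channel assumption mentioned above, one possible way around it (a sketch only, assuming the samples are interleaved frame by frame, which is the usual WAV layout) is to pull each channel out into its own array and scan it separately. `extract_channel` is a hypothetical helper, not part of the class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies one channel out of interleaved audio
// (frame 0: ch0, ch1, ...; frame 1: ch0, ch1, ...).
template <typename Sample>
std::vector<Sample> extract_channel(const Sample * interleaved, std::size_t frame_count,
                                    std::size_t channel_count, std::size_t channel) {
  assert(channel_count > 0 && channel < channel_count);
  std::vector<Sample> out;
  out.reserve(frame_count);
  for (std::size_t frame = 0; frame < frame_count; ++frame) {
    out.push_back(interleaved[frame * channel_count + channel]);
  }
  return out;
}
```

Each extracted channel could then get its own finder, and you would probably treat a region as silent only where every channel reports silence.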

Also notice that the way it is implemented right now, the "cut" to silent regions will be very abrupt; such "noise gate" type of filters usually come with attack and release parameters, which smooth out the result. For example, there might be 5 seconds of silence with just a tiny pop in the middle. Without attack and release parameters, you will get the 5 seconds split in two and the pop will actually remain, but using those you can implement varying sensitivity to when to cut it off.
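A real attack/release envelope is more involved, but as a crude stand-in you could post-process the detected regions and join the ones separated by a very short loud gap, so a tiny pop no longer splits a long silence. This is only a sketch of that simpler idea: `merge_close_regions` and `max_gap` are made-up names, and it assumes the regions are sorted, non-overlapping, and expressed in sample positions (that is, before any conversion to seconds).

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Region = std::pair<unsigned, unsigned>;

// Hypothetical post-processing step: joins neighbouring silent regions when the
// loud gap between them is no longer than max_gap (same units as the regions).
// Assumes the regions are sorted by start and do not overlap.
std::vector<Region> merge_close_regions(const std::vector<Region> & regions, unsigned max_gap) {
  std::vector<Region> merged;
  for (const Region & r : regions) {
    assert(r.first <= r.second);
    if (!merged.empty() && r.first - merged.back().second <= max_gap) {
      merged.back().second = r.second;  // swallow the short pop
    } else {
      merged.push_back(r);
    }
  }
  return merged;
}
```

A larger `max_gap` makes the gate less sensitive to short bursts, which is roughly the knob a release time gives you.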

