How to record the microphone until there is no sound?

Keywords: sound, record microphone. Last updated: 2023-10-16

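I've created two functions: one that records the microphone, and one that plays back the recorded sound.

The recording function captures the microphone for 3 seconds.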

#include <iostream>
#include <Windows.h>
#include <vector>
using namespace std;
#pragma comment(lib, "winmm.lib")

short int waveIn[44100 * 3];

void PlayRecord();

void StartRecord()
{
    const int NUMPTS = 44100 * 3;   // 3 seconds
    int sampleRate = 44100;
    // 'short int' is a 16-bit type; I request 16-bit samples below
    // for 8-bit capture, you'd use 'unsigned char' or 'BYTE' 8-bit types
    HWAVEIN      hWaveIn;
    MMRESULT result;
    WAVEFORMATEX pFormat;
    pFormat.wFormatTag=WAVE_FORMAT_PCM;     // simple, uncompressed format
    pFormat.nChannels=1;                    //  1=mono, 2=stereo
    pFormat.nSamplesPerSec=sampleRate;      // 44100
    pFormat.nAvgBytesPerSec=sampleRate*2;   // = nSamplesPerSec * nChannels * wBitsPerSample/8
    pFormat.nBlockAlign=2;                  // = nChannels * wBitsPerSample/8
    pFormat.wBitsPerSample=16;              //  16 for high quality, 8 for telephone-grade
    pFormat.cbSize=0;
    // Specify recording parameters
    result = waveInOpen(&hWaveIn, WAVE_MAPPER,&pFormat,
                        0L, 0L, WAVE_FORMAT_DIRECT);
    WAVEHDR      WaveInHdr;
    // Set up and prepare header for input
    WaveInHdr.lpData = (LPSTR)waveIn;
    WaveInHdr.dwBufferLength = NUMPTS*2;
    WaveInHdr.dwBytesRecorded=0;
    WaveInHdr.dwUser = 0L;
    WaveInHdr.dwFlags = 0L;
    WaveInHdr.dwLoops = 0L;
    waveInPrepareHeader(hWaveIn, &WaveInHdr, sizeof(WAVEHDR));
    // Insert a wave input buffer
    result = waveInAddBuffer(hWaveIn, &WaveInHdr, sizeof(WAVEHDR));

    // Commence sampling input
    result = waveInStart(hWaveIn);

    cout << "recording..." << endl;
    Sleep(3 * 1000);
    // Wait until finished recording
    waveInClose(hWaveIn);
    PlayRecord();
}

void PlayRecord()
{
    const int NUMPTS = 44100 * 3;   // 3 seconds
    int sampleRate = 44100;
    // 'short int' is a 16-bit type; I request 16-bit samples below
    // for 8-bit capture, you'd use 'unsigned char' or 'BYTE' 8-bit types
    HWAVEIN  hWaveIn;
    WAVEFORMATEX pFormat;
    pFormat.wFormatTag=WAVE_FORMAT_PCM;     // simple, uncompressed format
    pFormat.nChannels=1;                    //  1=mono, 2=stereo
    pFormat.nSamplesPerSec=sampleRate;      // 44100
    pFormat.nAvgBytesPerSec=sampleRate*2;   // = nSamplesPerSec * nChannels * wBitsPerSample/8
    pFormat.nBlockAlign=2;                  // = nChannels * wBitsPerSample/8
    pFormat.wBitsPerSample=16;              //  16 for high quality, 8 for telephone-grade
    pFormat.cbSize=0;
    // Specify recording parameters
    waveInOpen(&hWaveIn, WAVE_MAPPER,&pFormat, 0L, 0L, WAVE_FORMAT_DIRECT);
    WAVEHDR      WaveInHdr;
    // Set up and prepare header for input
    WaveInHdr.lpData = (LPSTR)waveIn;
    WaveInHdr.dwBufferLength = NUMPTS*2;
    WaveInHdr.dwBytesRecorded=0;
    WaveInHdr.dwUser = 0L;
    WaveInHdr.dwFlags = 0L;
    WaveInHdr.dwLoops = 0L;
    waveInPrepareHeader(hWaveIn, &WaveInHdr, sizeof(WAVEHDR));
    HWAVEOUT hWaveOut;
    cout << "playing..." << endl;
    waveOutOpen(&hWaveOut, WAVE_MAPPER, &pFormat, 0, 0, WAVE_FORMAT_DIRECT);
    waveOutWrite(hWaveOut, &WaveInHdr, sizeof(WaveInHdr)); // Playing the data
    Sleep(3 * 1000); // Sleep for as long as there was recorded
    waveInClose(hWaveIn);
    waveOutClose(hWaveOut);
}

int main()
{
    StartRecord();
    return 0;
}

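How can I change my StartRecord function (and I guess my PlayRecord function as well) so that it records until there is no input from the microphone?

(So far, both functions work perfectly: they record the microphone for 3 seconds and then play the recording back.)

Thanks!

Edit: by "no sound" I mean that the sound level is too low or something like that (meaning the person is probably not speaking).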

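Because sound is a wave, it oscillates between high and low pressure. The waveform is usually recorded as positive and negative numbers, with zero representing neutral pressure. If you take the absolute value of the signal and keep a running average of it, that should be sufficient.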
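To make the absolute-value point concrete, here is a minimal sketch (not from the original answer) that computes the mean absolute amplitude of one block of signed 16-bit samples, the same sample format as the waveIn buffer in the question. The function name meanAbsAmplitude is made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Mean absolute amplitude of a block of signed 16-bit PCM samples.
// Returns 0.0 for an empty block. (Illustrative helper, hypothetical name.)
double meanAbsAmplitude(const std::int16_t* samples, std::size_t count)
{
    if (count == 0) return 0.0;
    double sum = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        // Widen to int before abs() so that -32768 does not overflow.
        sum += std::abs(static_cast<int>(samples[i]));
    }
    return sum / static_cast<double>(count);
}
```

For a block that alternates between +1000 and -1000, the plain average is 0 while the mean absolute amplitude is 1000, which is why the sign has to be dropped before averaging.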

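The average should be taken over a long enough period that it accounts for an appropriate amount of silence. A very cheap way to keep an estimate of a running average looks like this: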

// Needs #include <cmath> for std::abs(double)
const double threshold = 50;    // Whatever threshold you need
const int max_samples = 10000;  // The representative running average size
double average = 0;             // The running average
int sample_count = 0;           // When we are building the average
while( sample_count < max_samples || average > threshold ) {
    // New sample arrives, stored in 'sample'
    // Adjust the running absolute average
    if( sample_count < max_samples ) sample_count++;
    average *= double(sample_count-1) / sample_count;
    // Cast first so the division is floating-point even if 'sample' is an integer type
    average += std::abs(double(sample)) / sample_count;
}

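The larger max_samples is, the more slowly average responds to the signal. After the sound stops, it decays slowly. However, it is also slow to rise again. That is fine for reasonably continuous sound.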
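As a hedged sketch, the same update rule can be wrapped in a small class so that its behaviour is easy to try out. The class name RunningAbsAverage and its methods are hypothetical, not part of the answer. Once the sample count has reached the window size N, every update multiplies the old average by (N-1)/N, so after the sound stops the average decays geometrically instead of dropping to zero at once.

```cpp
#include <cmath>

// Running average of |sample| using the update rule described above.
// Illustrative only: the class and method names are assumptions.
class RunningAbsAverage {
public:
    // maxSamples is the window size; values below 1 are clamped to 1.
    explicit RunningAbsAverage(int maxSamples)
        : maxSamples_(maxSamples < 1 ? 1 : maxSamples) {}

    // Feed one sample and return the updated average.
    double add(double sample)
    {
        if (count_ < maxSamples_) ++count_;
        average_ *= static_cast<double>(count_ - 1) / count_;
        average_ += std::fabs(sample) / count_;
        return average_;
    }

    double value() const { return average_; }

    // True once a full window of samples has been seen.
    bool warmedUp() const { return count_ >= maxSamples_; }

private:
    int maxSamples_;
    int count_ = 0;
    double average_ = 0.0;
};
```

With a window of 4 and a steady level of 100, the first two silent samples bring the average down to 75 and then 56.25. With a window of 10000 the same decay is spread over many thousands of samples, which is the sluggishness described above.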

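For something like speech, which can have short or long pauses, you might want to use an impulse-based approach. You define the number of "silent" samples you are willing to accept, and reset the count whenever you receive an impulse that exceeds the threshold. Using the running average above with a much shorter window size gives you a simple way of detecting impulses. Then you just need to count...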

const int max_samples = 100;             // Smaller window size for impulse
const int max_silence_samples = 10000;   // Maximum samples below threshold
int silence = 0;                         // Number of samples below threshold
while( silence < max_silence_samples ) {
    // Compute running average as before
    //...
    // Check for silence.  If there's a signal, reset the counter.
    if( average > threshold ) silence = 0;
    else ++silence;
}

Tweaking threshold and max_samples controls the sensitivity to pops and clicks, while max_silence_samples controls how much silence is allowed before recording stops.
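Since audio usually arrives in buffers rather than one sample at a time, here is one possible way to package the short-window average and the silence counter so that the state survives from one buffer to the next. This is only a sketch under assumptions: SilenceDetector, feed and secondsToSamples are names invented here, and the samples are assumed to be 16-bit mono PCM as in the question.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Short-window running average plus a silence counter (illustrative names).
class SilenceDetector {
public:
    SilenceDetector(double threshold, int windowSamples, int maxSilenceSamples)
        : threshold_(threshold),
          window_(windowSamples < 1 ? 1 : windowSamples),
          maxSilence_(maxSilenceSamples) {}

    // Feed one chunk of samples. Returns true once the running average has
    // stayed at or below the threshold for maxSilenceSamples in a row.
    bool feed(const std::int16_t* samples, std::size_t count)
    {
        for (std::size_t i = 0; i < count; ++i) {
            // Running absolute average, same update rule as before.
            if (seen_ < window_) ++seen_;
            average_ *= static_cast<double>(seen_ - 1) / seen_;
            average_ += std::fabs(static_cast<double>(samples[i])) / seen_;

            // A signal resets the counter; silence increments it.
            if (average_ > threshold_) silence_ = 0;
            else ++silence_;

            if (silence_ >= maxSilence_) return true;
        }
        return false;
    }

private:
    double threshold_;
    int window_;
    int maxSilence_;
    int seen_ = 0;
    int silence_ = 0;
    double average_ = 0.0;
};

// Convert a duration to a sample count for a mono stream.
inline int secondsToSamples(double seconds, int sampleRate)
{
    return static_cast<int>(seconds * sampleRate);
}
```

At the question's 44100 Hz sample rate, secondsToSamples(0.5, 44100) gives 22050, so half a second of allowed silence corresponds to a max_silence_samples of 22050. Note that, like the loop above, this sketch also counts silence before anyone has started speaking.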

There are undoubtedly more technical ways to achieve your goal, but it's always good to try the simple approach first. See how you go.

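I suggest you do this with DirectShow. You should create instances of the microphone, a SampleGrabber, an audio encoder, and a file writer. Your graph should look like this:

Microphone -> SampleGrabber -> Audio Encoder -> File Writer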

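Every sample passes through the SampleGrabber, so you can read all the raw samples and check whether recording should continue. This is the best way to both record the audio and inspect its contents.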
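The check itself does not depend on DirectShow. As a rough sketch, the helper below inspects a raw byte buffer of the kind a sample-grabber callback such as ISampleGrabberCB::BufferCB receives (a byte pointer plus a length). It assumes the bytes are little-endian signed 16-bit mono PCM, matching the format in the question, and it uses a simple peak check instead of the running average from the other answer. The names peakLevel16 and bufferHasSound are made up here.

```cpp
#include <cstdlib>

// Peak absolute level of a raw buffer of little-endian signed 16-bit PCM.
// A trailing odd byte, if any, is ignored. (Hypothetical helper.)
int peakLevel16(const unsigned char* buffer, long byteCount)
{
    int peak = 0;
    for (long i = 0; i + 1 < byteCount; i += 2) {
        // Assemble the 16-bit value from two bytes, low byte first.
        const int raw = buffer[i] | (buffer[i + 1] << 8);
        // Map 0x8000..0xFFFF onto the negative range.
        const int sample = (raw >= 0x8000) ? raw - 0x10000 : raw;
        const int level = std::abs(sample);
        if (level > peak) peak = level;
    }
    return peak;
}

// True if any sample in the buffer is louder than the threshold.
bool bufferHasSound(const unsigned char* buffer, long byteCount, int threshold)
{
    return peakLevel16(buffer, byteCount) > threshold;
}
```

A callback could call bufferHasSound on each buffer and keep a count of consecutive quiet buffers, stopping the graph once that count exceeds whatever pause length you want to allow.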