[DigiKey "Make Everything Smart, Fun Never Stops" Creative Contest] Part 3: Debugging the Speech Recognition Model
Speech recognition is done with the speech_recognition library. First, create a virtual environment and install the required packages:
```
python3 -m venv .venv
source .venv/bin/activate
pip install SpeechRecognition
# Note: sr.Microphone also depends on PyAudio; install it separately if it is not already present
sudo apt install ffmpeg flac
```
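Next, we need to patch the library's source code. The library captures 16-bit samples by default, but I2S devices usually deliver 24-bit samples packed into a 32-bit slot. So we change line 94 of site-packages/speech_recognition/__init__.py (the exact line number may differ between library versions), inside class Microphone(AudioSource):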
```
# self.format = self.pyaudio_module.paInt16 # 16-bit int sampling
self.format = self.pyaudio_module.paInt32 # 32-bit int sampling
```
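To make the "24 bits of data in a 32-bit slot" layout more concrete, here is a minimal sketch of how such samples relate to the 16-bit samples the library expects by default. It assumes the 24-bit value is left-justified in a little-endian 32-bit word, which is common for I2S microphones but should be checked against your device. The helper name `int32_to_int16` is made up for this illustration and is not part of speech_recognition.

```python
import struct


def int32_to_int16(raw: bytes) -> bytes:
    """Convert little-endian signed 32-bit PCM to signed 16-bit PCM.

    Assumption: the meaningful (24-bit) data sits in the upper bits of
    each 32-bit word, so keeping the top 16 bits preserves the signal.
    """
    count = len(raw) // 4  # number of complete 32-bit samples
    samples = struct.unpack("<%di" % count, raw[: count * 4])
    # Arithmetic right shift by 16 keeps the sign and the top 16 bits.
    return struct.pack("<%dh" % count, *(s >> 16 for s in samples))


# A 24-bit sample 0x123456, left-justified in a 32-bit slot, is 0x12345600.
frame = struct.pack("<i", 0x12345600)
print(int32_to_int16(frame).hex())
```

This is only meant to show why the capture format has to match the hardware: if the stream is opened as 16-bit while the device sends 32-bit words, every word is misread as two unrelated samples.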
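Now, back in our own code, write the following test script. Note that we mainly use it to recognize Chinese, and the language string Google expects for Chinese is not a standard country code. It has to be written exactly as in my code for recognition to succeed.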
```
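import speech_recognition as sr

r = sr.Recognizer()


def obtain():
    # device_index=1 is the I2S microphone on my setup; adjust for yours
    with sr.Microphone(device_index=1) as _source:
        print("Width: ", _source.SAMPLE_WIDTH)
        # Use a fixed threshold instead of automatic adjustment
        r.dynamic_energy_threshold = False
        # Very high value, presumably because 32-bit samples yield much larger energy readings
        r.energy_threshold = 10000000
        # Seconds of silence that mark the end of a phrase
        r.pause_threshold = 1.2
        print("> Say something:")
        audio = r.listen(_source)

    print("Processing...")
    try:
        # "cmn-Hans-CN" selects Mandarin (Simplified Chinese, China)
        text_input = r.recognize_google(audio, language="cmn-Hans-CN")
        print("You said: " + text_input)
    except sr.UnknownValueError as _error:
        print("Google could not understand audio")
        print(_error)
        text_input = None
    except sr.RequestError as _error:
        print("Could not request results from Google")
        print(_error)
        text_input = None
    return text_input


obtain()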
```
Run the code above. If everything works, the recognized text should appear in the terminal.
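The test script hard-codes `device_index=1`, which will not necessarily match other machines. As a rough sketch, the index can be looked up by name instead. `sr.Microphone.list_microphone_names()` is the library's own call; the helpers `find_device_index` and `pick_microphone`, and the device names in the comments, are hypothetical and only for illustration.

```python
def find_device_index(names, keyword):
    """Return the index of the first device whose name contains keyword.

    The match is case-insensitive; returns None if nothing matches.
    Entries that are None (devices without a name) are skipped.
    """
    for index, name in enumerate(names):
        if name and keyword.lower() in name.lower():
            return index
    return None


def pick_microphone(keyword):
    """Print all input device names and return the index matching keyword.

    Requires speech_recognition and PyAudio, so the import is kept local.
    """
    import speech_recognition as sr

    names = sr.Microphone.list_microphone_names()
    for index, name in enumerate(names):
        print(index, name)
    return find_device_index(names, keyword)
```

Calling something like `pick_microphone("i2s")` on the target board prints the available devices, and the returned index can then be passed to `sr.Microphone(device_index=...)`. The keyword depends on how your sound card is named.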