[DigiKey "Smart Making of Everything, Non-Stop Fun" Creative Contest] Part 7: Implementing the TTS Feature
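The TTS feature is built on the edge-tts library, which uses a web-scraping style approach: it calls the online text-to-speech service behind Microsoft Edge instead of synthesizing speech locally, so the board needs network access.
First, install the library: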
```
pip3 install edge-tts
```
The following command lists all supported voices:
```
edge-tts --list-voices
```
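The same list can also be fetched from Python, which is handy for narrowing it down to one language. The following is a minimal sketch, assuming the `edge_tts.list_voices()` coroutine and the `ShortName` / `Locale` keys of the voice entries; `filter_voices`, `chinese_voices` and `main` are hypothetical helper names, not part of the library.

```python
import asyncio


def filter_voices(voices, locale_prefix):
    """Return the sorted ShortName of every voice whose Locale starts with locale_prefix."""
    return sorted(v["ShortName"] for v in voices if v["Locale"].startswith(locale_prefix))


async def chinese_voices():
    # Third-party import kept local so filter_voices() works without edge-tts installed.
    import edge_tts

    voices = await edge_tts.list_voices()  # requires network access
    return filter_voices(voices, "zh-")


def main():
    # Print one voice short name per line, e.g. for picking the `actor` argument.
    for name in asyncio.run(chinese_voices()):
        print(name)
```

Calling `main()` on a machine with edge-tts installed and a network connection should print the Chinese voice short names, one of which can then be passed as the voice name.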
The test code is as follows:
```
#!/usr/bin/env python3
import io

import edge_tts
import pydub


async def tts(text, actor="zh-CN-XiaoyiNeural", fmt="mp3"):
    """Synthesize text with edge-tts and return the audio as MP3 or WAV bytes."""
    # Look up the voice by its short name, then pass its full name to Communicate.
    _voices = await edge_tts.VoicesManager.create()
    _voices = _voices.find(ShortName=actor)
    _communicate = edge_tts.Communicate(text, _voices[0]["Name"])
    _out = bytes()
    # Collect the audio chunks of the stream; other chunk types are ignored.
    async for _chunk in _communicate.stream():
        if _chunk["type"] == "audio":
            # print(_chunk["data"])
            _out += _chunk["data"]
        elif _chunk["type"] == "WordBoundary":
            # print(f"WordBoundary: {_chunk}")
            pass
    if fmt == "mp3":
        return _out
    if fmt == "wav":
        # Decode the MP3 data and resample to 16 kHz before exporting as WAV.
        _raw = pydub.AudioSegment.from_file(io.BytesIO(_out))
        _raw = _raw.set_frame_rate(16000)
        _wav = io.BytesIO()
        _raw.export(_wav, format="wav")
        # for i in range(len(_wav.getvalue())-1,-1,-1):
        #     if _wav.getvalue()[i] != 0x00:
        #         break
        return _wav.getvalue()#[:i+1]
    raise ValueError(f"unsupported format: {fmt}")


if __name__ == "__main__":
    import asyncio
    import pydub.playback

    while True:
        text_in = input(">Say something: ")
        raw_wav = asyncio.run(tts(text_in, actor="zh-CN-XiaoyiNeural", fmt="wav"))
        wav = pydub.AudioSegment.from_file(io.BytesIO(raw_wav))
        pydub.playback._play_with_pyaudio(wav)
```
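When debugging playback it can help to confirm that the WAV bytes returned with `fmt="wav"` really are 16 kHz. Below is a small sketch using only the standard-library `wave` module; `describe_wav` is a hypothetical helper name and is not part of the original test code.

```python
import io
import wave


def describe_wav(wav_bytes):
    """Return (channels, sample_width_bytes, frame_rate, duration_seconds) of in-memory WAV data."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return wf.getnchannels(), wf.getsampwidth(), rate, frames / rate
```

For example, `describe_wav(raw_wav)` on the bytes produced in the loop above would be expected to report a frame rate of 16000 if the resampling step worked.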
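Playback is forced through the pyaudio backend here so that the output device can be specified, because the audio has to go to the I2S HAT rather than the default device. First run `aplay -l` to look up the device number, then modify the library source: in site-packages/pydub/playback.py, add `output_device_index=1,` at line 26. The complete function then looks like this: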
```
def _play_with_pyaudio(seg):
    import pyaudio
    p = pyaudio.PyAudio()
    stream = p.open(format=p.get_format_from_width(seg.sample_width),
                    channels=seg.channels,
                    rate=seg.frame_rate,
                    output_device_index=1,
                    output=True)
    # Just in case there were any exceptions/interrupts, we release the resource
    # So as not to raise OSError: Device Unavailable should play() be used again
    try:
        # break audio into half-second chunks (to allows keyboard interrupts)
        for chunk in make_chunks(seg, 500):
            stream.write(chunk._data)
    finally:
        stream.stop_stream()
        stream.close()
        p.terminate()
```
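As an alternative to editing the installed library, the same logic can live in the project's own script, with the device index looked up by name. This is only a sketch under several assumptions: `find_output_device`, `play_on_device` and `main` are hypothetical helpers, the `"i2s"` keyword and the `test.wav` file name are placeholders, and the index PyAudio reports may not match the card number shown by `aplay -l`, so it is worth printing the device list once to check.

```python
def find_output_device(devices, keyword):
    """Return the index of the first output-capable device whose name contains keyword."""
    for info in devices:
        if info["maxOutputChannels"] > 0 and keyword.lower() in info["name"].lower():
            return info["index"]
    return None


def play_on_device(pa, seg, device_index, chunk_ms=500):
    """Play a pydub-style segment on a chosen PyAudio output device.

    pa is a pyaudio.PyAudio instance; seg must expose sample_width, channels,
    frame_rate and raw_data, as pydub.AudioSegment does.
    """
    stream = pa.open(format=pa.get_format_from_width(seg.sample_width),
                     channels=seg.channels,
                     rate=seg.frame_rate,
                     output_device_index=device_index,
                     output=True)
    # Chunk size in bytes, aligned to whole frames so no sample is split.
    frame_bytes = seg.sample_width * seg.channels
    step = max(1, seg.frame_rate * chunk_ms // 1000) * frame_bytes
    try:
        # Writing in short chunks keeps the loop responsive to Ctrl+C.
        for start in range(0, len(seg.raw_data), step):
            stream.write(seg.raw_data[start:start + step])
    finally:
        stream.stop_stream()
        stream.close()


def main():
    # Third-party imports kept local; they are only needed for real playback.
    import pyaudio
    import pydub

    pa = pyaudio.PyAudio()
    try:
        devices = [pa.get_device_info_by_index(i) for i in range(pa.get_device_count())]
        index = find_output_device(devices, "i2s")  # placeholder keyword for the HAT
        if index is None:
            raise SystemExit("no matching output device found")
        play_on_device(pa, pydub.AudioSegment.from_file("test.wav"), index)
    finally:
        pa.terminate()
```

Calling `main()` on the board plays the placeholder file through the matched device. The advantage of this layout is that a later `pip` upgrade of pydub cannot silently undo the change.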