[DigiKey "Make Everything Smart, Fun Never Stops" Creative Contest] 7: Implementing the TTS Feature

顺竿爬 · posted 2023-12-21 11:02

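The TTS feature is built on the edge-tts library. The library takes a scraper-style approach: it calls Microsoft Edge's online text-to-speech service the way the browser does, so it needs network access at run time.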

First, install the library:

```
pip3 install edge-tts
```

The following command lists all supported voices:

```
edge-tts --list-voices
```
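The same list can also be filtered from Python, for example to show only the Chinese voices. The following is a minimal sketch: `filter_voices` and `list_short_names` are hypothetical helper names, not part of edge-tts, and the code assumes each voice entry is a dict with `ShortName` and `Locale` keys, as returned by `edge_tts.list_voices()`.

```python
def filter_voices(voices, locale_prefix):
    """Return the sorted ShortName of every voice whose Locale starts with locale_prefix."""
    return sorted(
        v["ShortName"] for v in voices if v["Locale"].startswith(locale_prefix)
    )


async def list_short_names(locale_prefix="zh-CN"):
    """Fetch the voice list from the service and filter it by locale."""
    # Imported lazily so filter_voices can be used without edge-tts installed.
    import edge_tts

    voices = await edge_tts.list_voices()  # needs network access
    return filter_voices(voices, locale_prefix)
```

Calling `asyncio.run(list_short_names("zh-CN"))` should then return short names such as `zh-CN-XiaoyiNeural`, which is the kind of value passed as `actor` in the test code.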

The test code is as follows:

```
#!/usr/bin/env python3
import io

import edge_tts
import pydub


async def tts(text, actor="zh-CN-XiaoyiNeural", fmt="mp3"):
    """Synthesize text with edge-tts and return the audio as bytes (mp3 or wav)."""
    if fmt not in ("mp3", "wav"):
        raise ValueError(f"unsupported format: {fmt!r}")

    # Look up the full voice name from its short name.
    _voices = await edge_tts.VoicesManager.create()
    _voices = _voices.find(ShortName=actor)
    if not _voices:
        raise ValueError(f"voice not found: {actor!r}")
    _communicate = edge_tts.Communicate(text, _voices[0]["Name"])

    # Collect the audio chunks from the stream.
    _out = bytes()
    async for _chunk in _communicate.stream():
        if _chunk["type"] == "audio":
            # print(_chunk["data"])
            _out += _chunk["data"]
        elif _chunk["type"] == "WordBoundary":
            # print(f"WordBoundary: {_chunk}")
            pass

    if fmt == "mp3":
        return _out

    # fmt == "wav": decode the received audio and resample it to 16 kHz.
    _raw = pydub.AudioSegment.from_file(io.BytesIO(_out))
    _raw = _raw.set_frame_rate(16000)
    _wav = io.BytesIO()
    _raw.export(_wav, format="wav")
    # for i in range(len(_wav.getvalue())-1,-1,-1):
    #     if _wav.getvalue()[i] != 0x00:
    #         break
    return _wav.getvalue()#[:i+1]


if __name__ == "__main__":
    import asyncio
    import pydub.playback

    while True:
        text_in = input("> Say something: ")
        raw_wav = asyncio.run(tts(text_in, actor="zh-CN-XiaoyiNeural", fmt="wav"))
        wav = pydub.AudioSegment.from_file(io.BytesIO(raw_wav))
        pydub.playback._play_with_pyaudio(wav)
```
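To confirm that the wav branch really yields 16 kHz audio before it goes to the I2S HAT, the returned bytes can be inspected with the standard-library `wave` module. This is only a small sketch: `wav_info` and `make_silence` are hypothetical helper names, and the code assumes the bytes are an uncompressed PCM WAV file, which is what pydub's `export(format="wav")` normally writes.

```python
import io
import wave


def wav_info(wav_bytes):
    """Return (channels, sample_width_bytes, frame_rate, duration_seconds) of a PCM WAV."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
        return w.getnchannels(), w.getsampwidth(), rate, frames / rate


def make_silence(seconds=0.5, rate=16000):
    """Build a mono 16-bit silent WAV in memory, used here only as stand-in input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()
```

With the test code above, `wav_info(asyncio.run(tts("你好", fmt="wav")))` should report a frame rate of 16000 if the resampling step took effect.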

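Playback is forced through the pyaudio backend here so that the output device can be specified: the audio should go out through the I2S HAT rather than the default device. First look up the device number with aplay -l, then modify the library source: in site-packages/pydub/playback.py, add output_device_index=1, at line 26. The complete function is as follows: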

```
def _play_with_pyaudio(seg):
    import pyaudio

    p = pyaudio.PyAudio()
    stream = p.open(format=p.get_format_from_width(seg.sample_width),
                    channels=seg.channels,
                    rate=seg.frame_rate,
                    output_device_index=1,
                    output=True)

    # Just in case there were any exceptions/interrupts, we release the resource
    # So as not to raise OSError: Device Unavailable should play() be used again
    try:
        # break audio into half-second chunks (to allows keyboard interrupts)
        for chunk in make_chunks(seg, 500):
            stream.write(chunk._data)
    finally:
        stream.stop_stream()
        stream.close()
        p.terminate()
```
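Editing site-packages works, but the change is lost whenever pydub is reinstalled or upgraded. One alternative, sketched here under assumptions, is to keep a copy of the playback routine in the project and pass the device index as a parameter. `pick_device`, `list_output_devices`, and `play_on_device` are hypothetical names for this sketch; the PyAudio calls used (`get_device_count`, `get_device_info_by_index`, `open`) are PyAudio's own. PyAudio uses its own device indices, which may not match the card numbers printed by `aplay -l`, so it can help to print the list once.

```python
def pick_device(devices, name_fragment):
    """Return the index of the first output-capable device whose name contains name_fragment.

    devices is a list of dicts shaped like PyAudio's device info
    (keys used: "index", "name", "maxOutputChannels").
    """
    for dev in devices:
        if dev["maxOutputChannels"] > 0 and name_fragment.lower() in dev["name"].lower():
            return dev["index"]
    return None


def list_output_devices():
    """Ask PyAudio for the info dict of every device (requires pyaudio)."""
    import pyaudio

    p = pyaudio.PyAudio()
    try:
        return [p.get_device_info_by_index(i) for i in range(p.get_device_count())]
    finally:
        p.terminate()


def play_on_device(seg, device_index, chunk_ms=500):
    """Play a pydub AudioSegment on the given PyAudio output device."""
    import pyaudio
    from pydub.utils import make_chunks

    p = pyaudio.PyAudio()
    try:
        stream = p.open(format=p.get_format_from_width(seg.sample_width),
                        channels=seg.channels,
                        rate=seg.frame_rate,
                        output_device_index=device_index,
                        output=True)
        try:
            # Write in short chunks so a keyboard interrupt is handled promptly.
            for chunk in make_chunks(seg, chunk_ms):
                stream.write(chunk._data)
        finally:
            stream.stop_stream()
            stream.close()
    finally:
        p.terminate()
```

In the test script, the last line could then become `play_on_device(wav, pick_device(list_output_devices(), "i2s"))`. The fragment `"i2s"` is only a guess at how the HAT appears in the device list and would need to be adjusted to the actual name.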

genvex · posted 2023-12-21 20:53
