Phonetic characters to speech

Keywords: characters, speech. Last updated: 2023-10-16

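My goal is to have my application talk in less widely used languages (for example Hokkien, Malay, and so on). My current approach is to use recorded MP3 files.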

I would like to know whether a "phonetic characters to speech" engine exists for .NET or any other platform.

The phonetic characters I mean are like the pronunciation entries in a paper dictionary. Any ideas?

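What you need is a large-vocabulary TTS engine. Microsoft has a Speech SDK that speaks whatever you type, and there is also the Windows SAPI (Speech API; I am not sure whether the SDK and the API are the same thing). I know they have male and female voices for English, but probably not for other languages such as Malay (there may not be much of a market there yet). You may want to look at CMU's Festival project. They generally have voices for many different languages, although some of the lesser-known ones may not be as well developed as the English voices.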

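Further update:
Take a look at the MBROLA website. It is an open-source project for developing large-vocabulary TTS engines in multiple languages, and it has a Malay extension too. I don't know how good it is. I tried the Hindi voice and felt it still needs a lot of work.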

Also, check out the BabelFish website. They link to many free TTS engines, which should have some support for Malay.

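Update 3: I don't know whether this fits your purpose, but if the amount of text your application has to speak is small, you could also try limited-vocabulary concatenative speech synthesis instead of a large-vocabulary engine. Record sentence fragments in Malay (or any other language) and pass your program's output to your own limited-vocabulary TTS engine, which assembles the spoken output from those fragments. An example (in English): "Player X was the most valuable player". Here, "was the most valuable player" becomes one fragment, while "Player X" can be swapped as needed. If this suits your purpose, it should work quite well.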
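The fragment idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the clip inventory, the file paths, and the function names `plan_utterance` and `concatenate_wavs` are hypothetical, the recordings are assumed to be WAV files (not MP3) that share one audio format, and Python's standard `wave` module does the joining.

```python
import wave

# Hypothetical clip inventory: each key names a recorded fragment or a slot
# value, each value is the path to its pre-recorded WAV file.
CLIPS = {
    "was the most valuable player": "clips/was_the_most_valuable_player.wav",
    "player:alice": "clips/alice.wav",
    "player:bob": "clips/bob.wav",
}


def plan_utterance(player, clips=CLIPS):
    """Return the ordered clip paths for '<player> was the most valuable player'."""
    keys = [f"player:{player.lower()}", "was the most valuable player"]
    missing = [key for key in keys if key not in clips]
    if missing:
        raise KeyError(f"no recording for: {missing}")
    return [clips[key] for key in keys]


def concatenate_wavs(paths, out_path):
    """Join WAV clips that share channel count, sample width, and sample rate."""
    if not paths:
        raise ValueError("at least one clip is required")
    audio_format = None
    chunks = []
    for path in paths:
        with wave.open(path, "rb") as clip:
            fmt = (clip.getnchannels(), clip.getsampwidth(), clip.getframerate())
            if audio_format is None:
                audio_format = fmt
            elif fmt != audio_format:
                raise ValueError(f"{path} has a different audio format: {fmt}")
            chunks.append(clip.readframes(clip.getnframes()))
    with wave.open(out_path, "wb") as out:
        out.setnchannels(audio_format[0])
        out.setsampwidth(audio_format[1])
        out.setframerate(audio_format[2])
        for chunk in chunks:
            out.writeframes(chunk)
```

Calling `concatenate_wavs(plan_utterance("Alice"), "out.wav")` would then produce one file to play. Joining clips end to end like this does nothing to smooth the boundaries between them, so recording each fragment with consistent pace and pitch matters.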

Have you looked at the System.Speech namespace?

In particular, the System.Speech.Synthesis and System.Speech.Synthesis.TtsEngine namespaces.

Here is the VB.NET code:

' Create the builder that will hold your phonetic "characters".
Dim builder As New System.Speech.Synthesis.PromptBuilder()
' The first argument is only the written text; the second is the
' IPA pronunciation that is actually spoken.
builder.AppendTextWithPronunciation("test", "riːdɪŋ")
' Create a synthesizer to speak the prompt.
Dim synthesizer As New System.Speech.Synthesis.SpeechSynthesizer()
' Speak it. You will hear "reading", not "test".
synthesizer.Speak(builder)

Here is the same code converted to C#:

// Create the builder that will hold your phonetic "characters".
var builder = new System.Speech.Synthesis.PromptBuilder();
// The first argument is only the written text; the second is the
// IPA pronunciation that is actually spoken.
builder.AppendTextWithPronunciation("test", "riːdɪŋ");
// Create a synthesizer to speak the prompt.
var synthesizer = new System.Speech.Synthesis.SpeechSynthesizer();
// Speak it. You will hear "reading", not "test".
synthesizer.Speak(builder);

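The .NET System.Speech.Synthesis.PromptBuilder class will create audio from an SSML string. You can use it to build up sounds from raw phonemes and sampled audio. The audio is language-independent.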
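To make the SSML route concrete, here is a minimal sketch of what such a string can look like, assuming the SSML 1.0 `phoneme` element with `alphabet="ipa"`. The helper name `phoneme_ssml` and its parameters are made up for illustration. On the .NET side, a complete document like this is the kind of input `SpeechSynthesizer.SpeakSsml` accepts, but whether a given installed voice actually honors the phoneme markup is not guaranteed.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme_ssml(fallback_text, ipa, lang="en-US"):
    """Build a small SSML document that asks for an explicit IPA pronunciation.

    fallback_text is the written form an engine may fall back to if it
    ignores the phoneme element; ipa is the pronunciation to speak.
    """
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>'
        f"{escape(fallback_text)}"
        "</phoneme>"
        "</speak>"
    )


if __name__ == "__main__":
    # Same pronunciation as the "riːdɪŋ" example in this thread.
    print(phoneme_ssml("reading", "riːdɪŋ"))
```

Building the string with `quoteattr` and `escape` keeps the markup well-formed even if the text or the transcription contains quotes, `<`, or `&`.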

Maybe this? System.Speech.Recognition.SrgsGrammar.SrgsPhoneticAlphabet

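I have tried System.Speech.Synthesis.PromptBuilder. I have to say that the current implementation of phonetic characters is very rudimentary and inaccurate. For example, PromptBuilder has no intonation and puts no stress on syllables within a word. It can only produce a monotone, robotic voice, which is annoying.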

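My advice is to stay with your current approach. Given how long it takes to transcribe phonetic characters perfectly, delivering the message with MP3 recordings sounds more natural and is more cost-effective.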