MBROLA
MBROLA is speech synthesis software developed as a worldwide collaborative project. The MBROLA project web page provides diphone databases for many[1] spoken languages.
Original author(s) | Thierry Dutoit |
---|---|
Developer(s) | Vincent Pagel |
Initial release | 1995 |
Stable release | 3.3 / 17 December 2019 |
Repository | github |
Written in | C |
Operating system | Linux, Windows, FreeBSD |
Type | Speech synthesizer |
License | GNU Affero General Public License |
Website | github |
The MBROLA software is not a complete speech synthesis system for all those languages; the text must first be transformed into phoneme and prosodic information in MBROLA's format, and separate software (e.g. eSpeakNG) is necessary.
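As an illustration of that intermediate step, the following minimal sketch writes phoneme and prosody data as text. It assumes the commonly documented `.pho` layout: one phoneme per line, followed by its duration in milliseconds and optional pairs of position (percent of the duration) and pitch (Hz), with `;` starting a comment. The function name `to_pho` and the phoneme symbols are hypothetical; real symbols depend on the voice database used.

```python
from typing import Iterable, List, Optional, Sequence, Tuple

# One segment: (phoneme symbol, duration in ms, pitch targets).
# Each pitch target is (position in % of the segment duration, pitch in Hz).
Segment = Tuple[str, int, Sequence[Tuple[int, int]]]


def to_pho(segments: Iterable[Segment], comment: Optional[str] = None) -> str:
    """Serialize segments to .pho-style text (assumed layout, see lead-in)."""
    lines: List[str] = []
    if comment:
        lines.append(f"; {comment}")
    for phoneme, duration_ms, targets in segments:
        fields = [phoneme, str(duration_ms)]
        for position_pct, pitch_hz in targets:
            fields += [str(position_pct), str(pitch_hz)]
        lines.append(" ".join(fields))
    return "\n".join(lines) + "\n"


# Illustrative input only: "_" is assumed to denote silence, and the other
# symbols are placeholders that a front end such as eSpeakNG would supply.
example: List[Segment] = [
    ("_", 100, []),
    ("h", 70, [(0, 120)]),
    ("@", 80, []),
    ("l", 60, []),
    ("@U", 200, [(20, 130), (90, 100)]),
    ("_", 100, []),
]

if __name__ == "__main__":
    print(to_pho(example, comment="hypothetical example"), end="")
```

A file produced this way would typically be passed to the synthesizer together with a voice database and an output file name, for example `mbrola en1 input.pho output.wav`, where the database name `en1` is only an example.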
History
The MBROLA project started in 1995 at the TCTS Lab of the Faculté polytechnique de Mons (Belgium) as a scientific project to obtain a set of speech synthesizers for as many languages as possible. The first release of the MBROLA software was in 1996 and was provided as freeware for non-commercial, non-military applications. Licenses for the voice databases differ, but most also restrict them to non-commercial and non-military use.
Because it was free only for non-commercial applications, MBROLA served private and home users as an alternative to eSpeakNG, the de facto speech synthesis engine on Linux workstations, but was mostly not used in commercial solutions (e.g. speaking clocks or boarding notifications at ports and terminals). After the initial development of the voice databases, updates and support for the MBROLA software ceased, and the closed-source binaries gradually fell behind the development of recent hardware and operating systems.[2] To deal with this, the MBROLA development team decided to release MBROLA as open-source software, and on October 24, 2018 the source code was released on GitHub under the GNU Affero General Public License. On January 23, 2019 a tool called MBROLATOR, which creates MBROLA databases from WAV files, was released under the same license.
Used technology
The MBROLA software uses the MBROLA (Multi-Band Resynthesis OverLap Add)[3] algorithm for speech generation. Although it is diphone-based, the quality of MBROLA's synthesis is considered to be higher than that of most diphone synthesisers, because it preprocesses the diphones to impose constant pitch and harmonic phases, which improves their concatenation while only slightly degrading their segmental quality.
MBROLA is a time-domain algorithm similar to PSOLA, which implies a very low computational load at synthesis time. Unlike PSOLA, however, MBROLA does not require a preliminary marking of pitch periods. This feature has made it possible to develop the MBROLA project around the MBROLA algorithm, through which many speech research labs, companies, and individuals around the world have provided diphone databases for many languages and voices. There are still some notable omissions, such as Chinese.
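The time-domain overlap-add idea can be sketched generically as follows. This is not the MBROLA implementation. It is a simplified illustration that assumes the stored signal already has a constant, known pitch period, so frames can be cut at fixed intervals without detecting pitch marks. The names `hann` and `ola_repitch` are hypothetical. For brevity the sketch keeps the number of frames unchanged, so duration scales together with the period; a real synthesizer would also repeat or drop frames to control duration independently.

```python
import math
from typing import List


def hann(n: int) -> List[float]:
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def ola_repitch(signal: List[float], src_period: int, dst_period: int) -> List[float]:
    """Cut two-period windowed frames every src_period samples and
    overlap-add them every dst_period samples."""
    frame_len = 2 * src_period
    if len(signal) < frame_len:
        return []
    window = hann(frame_len)
    n_frames = (len(signal) - frame_len) // src_period + 1
    out = [0.0] * ((n_frames - 1) * dst_period + frame_len)
    for k in range(n_frames):
        src = k * src_period  # fixed analysis position, no pitch marking
        dst = k * dst_period  # synthesis position sets the new period
        for i in range(frame_len):
            out[dst + i] += signal[src + i] * window[i]
    return out


if __name__ == "__main__":
    period = 80
    tone = [math.sin(2.0 * math.pi * n / period) for n in range(800)]
    higher = ola_repitch(tone, period, 64)   # shorter period, higher pitch
    lower = ola_repitch(tone, period, 100)   # longer period, lower pitch
    print(len(tone), len(higher), len(lower))
```

Each output sample costs only a few multiply-adds, which is the sense in which time-domain methods are computationally cheap at synthesis time. When `dst_period` equals `src_period`, the overlapping Hann windows sum to one and the interior of the signal is reproduced unchanged.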
References
- List of MBROLA voices
- Mbrola-64 crashes immediately with a SEGFAULT
- Dutoit, T; Leich, H (Dec 1993). "MBR-PSOLA: Text-To-Speech synthesis based on an MBE re-synthesis of the segments database". Speech Communication. 13 (3–4): 435–440. doi:10.1016/0167-6393(93)90042-J.