Julius (software)

Julius is a speech recognition engine: a high-performance, two-pass large vocabulary continuous speech recognition (LVCSR) decoder for speech researchers and developers. It can decode in near real time on most current personal computers (PCs) in a 60k-word dictation task using a word trigram (3-gram) language model and context-dependent hidden Markov models (HMMs). Major search methods are fully incorporated.

Julius
Original author(s): Lee Akinobu
Developer(s): Kawahara Lab., Kyoto University; Julius project team, Nagoya Institute of Technology
Initial release: 1991
Stable release: 4.5 / 2 January 2019
Repository: github.com/julius-speech
Written in: C
Operating system: Unix (GNU/Linux, BSD, etc.), Windows (via Cygwin)
Platform: IA-32, x86-64
Available in: Japanese, English
Type: Speech recognition
License: Free, BSD style[1][2]
Website: julius.osdn.jp/en_index.php

Julius is carefully modularized to be independent of model structures, and it supports various HMM types, such as shared-state triphones and tied-mixture models, with any number of mixtures, states, or phones. It adopts standard formats for compatibility with other free modeling toolkits. The main platform is Linux and other Unix workstations, and it also runs on Windows. Julius is free and open-source software, released under a revised BSD-style license.

Julius has been developed as part of a free software toolkit for Japanese LVCSR research since 1997, and the work was continued at the Continuous Speech Recognition Consortium (CSRC), Japan, from 2000 to 2003.

Since rev. 3.4, a grammar-based recognition parser named Julian has been integrated into Julius. Julian is a modified version of Julius that uses a hand-designed finite-state machine (FSM), termed a deterministic finite automaton (DFA) grammar, as its language model. It can be used to build small-vocabulary voice command systems or various spoken dialog system tasks.
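To illustrate how a DFA grammar can act as a language model for a small voice command task, here is a minimal sketch in Python. It does not use Julius or its grammar file format; the states, the vocabulary, and the function names `accepts` and `allowed_next_words` are all hypothetical and chosen only for this example.

```python
# Toy deterministic finite automaton (DFA) over words.
# The states and vocabulary are invented for illustration and are
# not taken from any Julius grammar.
TRANSITIONS = {
    (0, "turn"): 1,
    (1, "on"): 2,
    (1, "off"): 2,
    (2, "the"): 3,
    (3, "light"): 4,
    (3, "radio"): 4,
}
START_STATE = 0
ACCEPT_STATES = {4}


def accepts(words):
    """Return True if the word sequence is a complete sentence of the grammar."""
    state = START_STATE
    for word in words:
        key = (state, word)
        if key not in TRANSITIONS:
            return False  # no arc for this word, so the sequence is rejected
        state = TRANSITIONS[key]
    return state in ACCEPT_STATES


def allowed_next_words(words):
    """Return the set of words the grammar permits after the given prefix."""
    state = START_STATE
    for word in words:
        state = TRANSITIONS.get((state, word))
        if state is None:
            return set()
    return {w for (s, w) in TRANSITIONS if s == state}


if __name__ == "__main__":
    print(accepts("turn on the light".split()))
    print(allowed_next_words(["turn"]))
```

The second function shows why such a grammar suits small-vocabulary tasks: at each point in the utterance, a recognizer only has to consider the few words that have an outgoing arc from the current state.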

About models

To run, the Julius recognizer needs a language model and an acoustic model for each language.

Julius adopts acoustic models in Hidden Markov Model Toolkit (HTK) ASCII format, a pronunciation dictionary in an HTK-like format, and word 3-gram language models in the ARPA standard format: a forward 2-gram and a reverse 3-gram, the latter trained on a speech corpus with reversed word order.
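The difference between a forward 2-gram and a reverse 3-gram is easiest to see from the counts behind them. The following Python sketch only counts n-grams on a toy corpus. It is not how Julius or any language modeling toolkit builds its models: real ARPA files hold smoothed log probabilities and back-off weights. The sentence markers `<s>` and `</s>` and the function names are assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token tuples in the token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_forward_and_reverse(sentences):
    """Count forward 2-grams and 3-grams over the reversed word order."""
    forward_2grams = Counter()
    reverse_3grams = Counter()
    for sentence in sentences:
        # Sentence boundary markers are an assumption of this sketch.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        forward_2grams.update(ngrams(tokens, 2))
        # Reversing the tokens gives the "reversed word order" corpus.
        reverse_3grams.update(ngrams(tokens[::-1], 3))
    return forward_2grams, reverse_3grams


if __name__ == "__main__":
    fwd, rev = count_forward_and_reverse(["the cat sat", "the cat ran"])
    print(fwd[("the", "cat")])
    print(rev[("cat", "the", "<s>")])
```

In the reversed counts, each 3-gram conditions a word on the two words that follow it in the original sentence, which is the kind of context a right-to-left search can use.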

Although Julius is distributed only with Japanese models, the VoxForge project is working to create English acoustic models for use with the Julius speech recognition engine.

In April 2018, thanks to the efforts of the Mozilla Foundation, a 350-hour audio corpus of spoken English was made available. The new English ENVR-v5.4 open-source speech model was released alongside the Polish PLPL-v7.1 models; both are available from SourceForge.[3]

References

  1. Callaway, Tom (spot) (2012-08-13). "Licensing/Julius". Fedora Wiki. Red Hat. Retrieved 2019-03-24.
  2. "Large Vocabulary Continuous Speech Recognition Engine Julius". Julius development team. Nagoya Institute of Technology. 2014. Retrieved 2019-03-24.
  3. "juliusmodels" project files. SourceForge. https://sourceforge.net/projects/juliusmodels/files/
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.