Mongolian Text to Speech Conversion Tool
Project Proposal
1. Project Background and Justification
Information and communication technology (ICT) is rapidly evolving as an effective tool for making information widespread and available online to many communities. Industrial society is turning into an information society. The increased use of information technology is enabling people across the world to participate in the knowledge network; however, visually impaired people in developing countries like Mongolia are being deprived of the benefits of ICT and computer systems. One of the main reasons is the lack of suitable human-computer interfaces and of software designed and developed to meet local needs. Designing and developing a computer interface for a person who cannot see what the computer displays is a most challenging task for many software developers. Most developed countries, such as Japan, have many public projects and commercial software companies addressing this issue. Many software companies in Mongolia are developing commercial software such as content management systems and financial software; however, driven by current market demand, they do not recognize the need for a text to speech (TTS) converter. There is a great need to develop a text to speech converter tool with a simple human-computer interface in the local language, both to meet the needs of visually impaired people and to lay a foundation for related applications. The TTS conversion tool can effectively address the ICT needs of visually impaired people in Mongolia. On the other hand, computer displays, TVs and video games are among the leading causes of sight loss.
The TTS software can also be used for reading e-books, newsletters, online newspapers and similar material aloud, so that users need not concentrate on a computer display, which can potentially lead to various eye diseases.
As an initiative to fight sight problems in Mongolia, we propose to develop a text to speech converter tool.
2. Objectives
2.1. General Objective
To make the use of PCs more user friendly by developing a text to speech synthesizer, and to meet the needs of visually impaired people in the Mongolian language.
2.2 Specific Objective:
3. Project Beneficiaries
- Visually impaired people
- Potential users of the Computer System
- Students and researchers
- Mongolian IT companies
- The wider community, since the Mongolian text to speech synthesizer will be available online under the GNU/GPL license
- General public
4. Project Sustainability
The software team at Infocon will take on this project as one of its activities to further the research and development of the text to speech (TTS) synthesizer. The outcome of the project will be hosted online by Infocon, freely accessible to all under the GNU/GPL license. Releasing the system under the GNU/GPL license will facilitate further research and development by other organizations. This project is expected to initiate further research and development activities in text to speech conversion in Mongolia. The Mongolian Association of the Blind NGO will market the TTS software to blind people free of charge.
5. Project Methodology
5.1 The Development Methodology
The TTS system as a whole comprises two subsystems: the interface part and the text to speech conversion engine. The interface part will have two different interfaces: one for visually impaired people and the other for sighted users. The text to speech conversion engine can take input files in both text and image formats. Dozens of TTS conversion research materials, tools and software packages released on the internet under the GNU/GPL license will be investigated and put to best use to avoid duplicating effort. One such initiative is the MBROLA project. The general architecture of the proposed TTS conversion engine is shown below.
Figure 5.1: General Architecture
The system shall support widely used file types such as *.doc, *.txt, *.rtf, *.jpg and *.gif. If the system receives an image file as input, the Mongolian character recognition module will first convert the image file into a text file, which will then be processed to produce the desired speech.
The interface for visually impaired people will be driven by sound and mouse clicks, and the system may read aloud the name of the file being input for confirmation.
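The routing of input files described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names (classify_input, convert_to_speech) and the three callables passed in (a text reader, a character recognizer and a synthesizer) are hypothetical placeholders for modules the proposal only names, and the extension sets simply mirror the file types listed above.

```python
from pathlib import Path
from typing import Callable

# File types named in the proposal; the grouping into two sets is an assumption.
TEXT_TYPES = {".txt", ".doc", ".rtf"}
IMAGE_TYPES = {".jpg", ".gif"}


def classify_input(filename: str) -> str:
    """Return "text" or "image" based on the file extension (case-insensitive)."""
    suffix = Path(filename).suffix.lower()
    if suffix in TEXT_TYPES:
        return "text"
    if suffix in IMAGE_TYPES:
        return "image"
    raise ValueError(f"unsupported file type: {filename}")


def convert_to_speech(
    filename: str,
    read_text: Callable[[str], str],             # hypothetical text extractor
    recognize_characters: Callable[[str], str],  # hypothetical Mongolian OCR module
    synthesize: Callable[[str], bytes],          # hypothetical TTS engine
) -> bytes:
    """Route an input file through OCR if it is an image, then synthesize speech."""
    if classify_input(filename) == "image":
        # Image input: the character recognition module produces the text first.
        text = recognize_characters(filename)
    else:
        text = read_text(filename)
    return synthesize(text)
```

Passing the modules in as callables keeps the routing logic testable before any real recognition or synthesis component exists.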
5.2 Working Methodology
Most of the development work will be carried out at Infocon, while linguistics professors of the National University of Mongolia will provide the linguistic and other expertise required. For developing and testing the interface for visually impaired people, we have verbal agreements with the Mongolian Association of the Blind.
Figure 5.2: Project Timeline
7. Project Outputs
The output of the project will be:
- Mongolian text to speech converter
- Mongolian character recognition tool
- Text to speech converter software package for visually impaired people
- User manual in Mongolian
Dissemination:
- Project outputs in the form of software shall be disseminated to the public under the GNU/GPL license.
- 100 CDs with the text to speech converter software package for visually impaired people will be distributed free of charge.
8. Project Monitoring
The TTS conversion tool will be tested for naturalness and accuracy with various types of text, and examined by linguistic experts to achieve more correct pronunciation of Mongolian words. The outcomes of these examinations shall be incorporated into the TTS.
What needs to be tested?
Providing the "correct" pronunciation of a particular word is not easy in the Mongolian language. Learning to read (that is, to pronounce a word from its written form) is simpler in Mongolian than in English; however, numerous exceptions to the pronunciation rules exist.
These exceptions include number handling, foreign words, acronyms, abbreviations, names, addresses, homographs, punctuation and incorrect writing style.
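Two of these exception classes, abbreviations and number handling, are typically dealt with in a text normalization step before pronunciation rules are applied. The Python sketch below is an illustration under stated assumptions, not the project's design: the function names are hypothetical, the abbreviation table holds a single sample entry, and numbers are read digit by digit, which is a deliberate simplification (a real system would need proper Mongolian number grammar, reviewed by the linguistic experts).

```python
import re

# Mongolian words for the digits 0-9 (digit-by-digit reading is a simplification).
DIGIT_WORDS = {
    "0": "тэг", "1": "нэг", "2": "хоёр", "3": "гурав", "4": "дөрөв",
    "5": "тав", "6": "зургаа", "7": "долоо", "8": "найм", "9": "ес",
}

# Illustrative abbreviation table with one sample entry; a real table would be
# compiled and checked by linguists.
ABBREVIATIONS = {
    "МУИС": "Монгол Улсын Их Сургууль",  # National University of Mongolia
}


def expand_abbreviations(text: str) -> str:
    """Replace each whole word found in ABBREVIATIONS with its expansion."""
    return re.sub(r"\w+", lambda m: ABBREVIATIONS.get(m.group(0), m.group(0)), text)


def spell_out_digits(text: str) -> str:
    """Replace each run of ASCII digits with the digit words, space-separated."""
    return re.sub(
        r"[0-9]+",
        lambda m: " ".join(DIGIT_WORDS[d] for d in m.group(0)),
        text,
    )


def normalize(text: str) -> str:
    """Expand abbreviations first, then spell out digits."""
    return spell_out_digits(expand_abbreviations(text))
```

For example, normalize("МУИС 25") returns "Монгол Улсын Их Сургууль хоёр тав". Test sentences covering each exception class could be run through such a step and the results checked by the linguistic experts.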
Last modified 2005-06-13 10:46 AM




