Beyond Words: Exploring the Boundaries of Text-to-Speech Technology

Suchi Pandey, Suraj Singh, Raja Jai Singh, Vaibhav Shukla, Anukalp

Abstract


This work presents a novel, effective, and economical method for converting text in images into audio in real time. The method enables visually impaired individuals to interact with computers efficiently through a voice interface. Extracting text from color photographs is a difficult computer vision task. The text-to-speech conversion uses OCR technology to convert English letters and numerals in images into speech. The main goal of real-time speech-to-text translation is the instantaneous conversion of spoken words into written language. This technology is invaluable for including people with hearing impairments in oral communication settings such as conferences or counseling sessions. In essence, it allows hard-of-hearing or deaf individuals to engage fully in discussions by giving them a written transcript of the discourse as it takes place, which may greatly improve their capacity to interact, comprehend, and participate in a variety of social and professional contexts. However, converting speech into written language in real time demands particular strategies, because the output must be both fast and accurate to be intelligible. Speech processing is widely used in modern communication devices and services, including security systems, household appliances, mobile phones, ATMs, computers, and hotels. Our method aims to translate speech signals into text output for disabled students in school and industry. A text-to-speech converter is a software technology that enables even visually impaired people to read and comprehend various materials. Blind people are unable to read printed documents, so this software can serve as an assistant by reading those materials aloud for them. The application was developed specifically for people who are unable to speak or see. The user can simply type what he or she wishes to say, and the software will speak it for them. This program is therefore not just a step toward future development, but also a great help for individuals who cannot speak or see.
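As a rough illustration of the image-to-speech flow described in the abstract (OCR first, then speech synthesis), the sketch below wires the two stages together. It is not the authors' implementation: the function names (`clean_ocr_text`, `split_sentences`, `image_to_speech`, `default_ocr`, `default_speak`) are hypothetical, and the use of `pytesseract`, Pillow, and `pyttsx3` as the OCR and speech back ends is an assumption. The OCR and speech steps are passed in as callables, so the text clean-up logic can be tried without either library installed.

```python
import re
from typing import Callable, List


def clean_ocr_text(raw: str) -> str:
    """Normalize raw OCR output before it is sent to a speech engine."""
    # Rejoin words that were hyphenated across a line break ("conver-\nsion").
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw)
    # Collapse remaining newlines and runs of whitespace into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def split_sentences(text: str) -> List[str]:
    """Split after '.', '!' or '?' so each sentence can be spoken separately."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p for p in parts if p]


def image_to_speech(
    image_path: str,
    ocr: Callable[[str], str],
    speak: Callable[[str], None],
) -> List[str]:
    """Run OCR on an image, clean the text, and speak it sentence by sentence."""
    sentences = split_sentences(clean_ocr_text(ocr(image_path)))
    for sentence in sentences:
        speak(sentence)
    return sentences


def default_ocr(image_path: str) -> str:
    """Assumed OCR back end: pytesseract plus Pillow (both must be installed)."""
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(image_path))


def default_speak(sentence: str) -> None:
    """Assumed speech back end: pyttsx3 (must be installed)."""
    import pyttsx3

    engine = pyttsx3.init()
    engine.say(sentence)
    engine.runAndWait()
```

With the assumed libraries installed, a call such as `image_to_speech("page.png", default_ocr, default_speak)` would read the recognized text aloud. The file name is only a placeholder. Speaking one sentence at a time is a design choice for this sketch, so that playback can start before a long page has been fully processed.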
