
const AboutData = [
    {
        title: `What is SYSPIN?`,
        desc: [
            {
                isPadding: true,
                isLink: false,
                params: `AI-based voice recognition has great potential to make technology more inclusive and to enable millions of people to access services they cannot yet use, be it in agriculture, education, health or other sectors. The diversity of spoken languages in India, together with the lack of technological support for them, makes universal access to information and services an ongoing challenge. Artificial intelligence (AI) and machine learning (ML) offer novel and efficient ways to tackle this challenge.`
            },
            {
                isPadding: true,
                isLink: false,
                params: `Many low-resource languages are official languages of different territories and are spoken and written by very large populations. This increases the need for corpora and systems on which speech technology solutions and user-friendly voice-based applications can be built. Providing open voice datasets frees the innovative potential of this technology, which has so far remained largely untapped. This is the starting point of this proposal for developing a large corpus and models for text-to-speech (TTS) systems in multiple Indian languages. It lowers the main barriers to voice-based technologies and creates a potential market for tech innovators and social entrepreneurs. The development of TTS corpora and AI models in nine Indian languages, as summarized in the objectives below, potentially benefits 602.1 million speakers of these languages. Overall, the collection of open voice data strengthens the local AI ecosystem for the development of voice technologies. This lays the foundation for more open and inclusive voice-based information services across sectors including sustainable agriculture, climate risk insurance, e-mobility, finance, healthcare and education.`,
            }
        ]
    },
    {
        title: `What is a text-to-speech (TTS) synthesis system?`,
        desc: [
            {
                isPadding: true,
                isLink: false,
                params: `A text-to-speech (TTS) synthesis system converts text into the corresponding spoken voice. A computer can thus learn to read out any sentence, emulating human speech. Text in any natural language can be given as input to such a system. While techniques for building high-quality TTS systems are available, developing such a system requires a large amount of good-quality recordings from a target speaker in the chosen language. It is still a challenge to develop TTS systems for multi-speaker scenarios that are robust to variation in recording conditions, where small amounts of recordings from multiple people in different recording environments are used for speech synthesis. Thus, a large, high-quality, speaker-specific speech corpus is essential for the development of a state-of-the-art TTS system.`,
            },
            {
                isPadding: true,
                isLink: false,
                params: `A TTS system takes natural language text as input and generates a speech waveform as output. Natural language processing (NLP) and digital signal processing (DSP) are its two key components. As input text may contain symbols, numbers, abbreviations, acronyms, etc., it must first be converted into an intermediate form to produce the desired output speech. Hence, text normalization is an important phase, in which the input text is normalized into its actual pronunciation depending on the context of its use. Following this, grapheme-to-phoneme conversion is required, where a grapheme is the smallest semantically distinguishing unit in a written language and a phoneme is the smallest pronounceable unit in a spoken language. Selecting appropriate methods or rules for this conversion in different languages is a challenge in TTS systems. The selection of prosodic features (pitch, duration, stress) contributes to the naturalness of the synthesized speech. With advances in deep learning, the naturalness of synthesized speech has improved.`
            },
            {
                isPadding: false,
                isLink: true,
                link: `https://syspin.iisc.ac.in/evolution-of-tools`,
                params: `Evolution of Text-to-speech (TTS) - `
            },
        ]
    },
    {
        title: `TTS research in India`,
        desc: [
            {
                isPadding: false,
                isLink: false,
                params: `Despite the challenges of developing TTS systems for Indian languages, speech researchers in India have undertaken a number of initiatives, including the following:`
            },
            {
                isPadding: false,
                isLink: true,
                link: `https://www.cse.iitb.ac.in/~vani/`,
                params: `IIT Bombay - Vani Framework`
            },
            {
                isPadding: false,
                isLink: true,
                link: `https://www.cdac.in/index.aspx?id=mc_st_TTS_Bangla`,
                params: `C-DAC Kolkata - Bengali TTS`
            },
            {
                isPadding: false,
                isLink: false,
                params: `HP Labs - Hindi TTS`,
            },
            {
                isPadding: false,
                isLink: false,
                params: `Simputer Trust - Dhvani TTS - dhvani.sourceforge.net`,
            },
            {
                isPadding: false,
                isLink: true,
                link: `https://www.cdac.in/index.aspx?id=mc_st_shruti_drishti`,
                params: `IIT Kharagpur - Shruti TTS`
            },
            {
                isPadding: false,
                isLink: true,
                link: `http://cvit.iiit.ac.in/research/projects/cvit-projects/text-to-speech-dataset-for-indianlanguages`,
                params: `IIIT Hyderabad - Telugu TTS`
            },
            {
                isPadding: true,
                isLink: true,
                link: `https://www.cdac.in/index.aspx?id=mc_st_speech_technology`,
                params: `C-DAC Thiruvananthapuram - Malayalam TTS (SUBHASHINI)`
            },
            {
                isPadding: false,
                isLink: false,
                params: `These TTS engines were developed at different times, using various technologies depending on the data and resources available. The publicly available TTS engines that currently cater to Indian languages are the following:`,
            },
            {
                isPadding: false,
                isLink: true,
                link: `http://tdil.meity.gov.in/`,
                params: `TDIL TTS`,
            },
            {
                isPadding: false,
                isLink: true,
                link: `https://indiantts.com/`,
                params: `IndianTTS`,
            },
            {
                isPadding: false,
                isLink: false,
                params: `eSpeak - espeak.sourceforge.net`,
            },
            {
                isPadding: false,
                isLink: false,
                params: `Festvox - festvox.org`,
            },
            {
                isPadding: false,
                isLink: true,
                link: `https://www.iitm.ac.in/donlab/hts/`,
                params: `IITM TTS`,
            },
            {
                isPadding: false,
                isLink: true,
                link: `https://www.readspeaker.com/languages-voices/`,
                params: `ReadSpeaker`,
            },
            {
                isPadding: true,
                isLink: true,
                link: `http://dhvani.sourceforge.net/`,
                params: `Dhvani`,
            },
            {
                isTable: true,
                isPadding: true,
                title: `The Indian languages covered by these TTS systems are summarized below:`,
                data: [
                    {
                        lang: `Hindi`,
                        info: `eSpeak, Festvox, Dhvani, IndianTTS, IITM TTS, ReadSpeaker`,
                    },
                    {
                        lang: `Tamil`,
                        info: `eSpeak, Festvox, Dhvani, IndianTTS, IITM TTS`,
                    },
                    {
                        lang: `Telugu`,
                        info: `Festvox, Dhvani, IndianTTS, IITM TTS`,
                    },
                    {
                        lang: `Kannada`,
                        info: `eSpeak, Dhvani, IndianTTS, IITM TTS`,
                    },
                    {
                        lang: `Malayalam`,
                        info: `eSpeak, Dhvani, IndianTTS, IITM TTS`,
                    },
                    {
                        lang: `Punjabi`,
                        info: `eSpeak, Dhvani, IndianTTS`,
                    },
                    {
                        lang: `Marathi`,
                        info: `Festvox, Dhvani, IndianTTS, IITM TTS`,
                    },
                    {
                        lang: `Odia`,
                        info: `Dhvani, IndianTTS, IITM TTS`,
                    },
                    {
                        lang: `Bengali`,
                        info: `Dhvani, IndianTTS, IITM TTS`,
                    },
                    {
                        lang: `Gujarati`,
                        info: `Dhvani, IndianTTS, IITM TTS`,
                    },
                    {
                        lang: `Pashto`,
                        info: `Dhvani`,
                    },
                    {
                        lang: `Sanskrit`,
                        info: `JNU`,
                    },
                    {
                        lang: `Assamese`,
                        info: `IndianTTS, IITM TTS`,
                    },
                    {
                        lang: `Manipuri`,
                        info: `TDIL TTS, IITM TTS`,
                    },
                    {
                        lang: `Boro`,
                        info: `TDIL TTS, IITM TTS`,
                    },
                    {
                        lang: `Rajasthani`,
                        info: `IITM TTS`,
                    },
                ]
            }
        ]
    },
    {
        title: `Why?`,
        desc: [
            {
                isPadding: true,
                isLink: false,
                params: `Providing people with information in their own language is a key driver of economic empowerment and political participation. Open voice data and AI models in Indian languages offer innovative and efficient ways to meet this need. These include automatic dialogue systems, in which both speech recognition and TTS are key components. Such systems are in great demand in sustainable agriculture, climate adaptation, finance and e-commerce: voice-enabled queries and responses on weather, fertilizers, crop prices, etc. are of great use to farmers, especially those who cannot read. Additionally, the TTS corpus will be a unique resource for developing assistive technologies for people with speech and visual disabilities. Used in screen readers, TTS technology may help these people listen to written documents in different Indian languages. TTS systems may be coupled with computer-aided learning systems to provide helpful tools for learning new languages by listening to the pronunciation of words. TTS systems would also be useful for designing human–computer interactive systems such as kiosks or automated tellers. In short, AI-based voice technologies have great potential to make technology more inclusive and to enable millions of people to access services they cannot yet use. Overall, such technologies would benefit 602 million speakers of the languages included in this proposal.`,
            },
            {
                isPadding: true,
                isLink: false,
                params: `TTS systems are useful for a variety of applications that require human–computer interaction in different domains. For people with speech disabilities, TTS technology may be used as a synthesizer that produces speech artificially from utterances typed in their native languages. According to census data provided by the Government of India (https://censusindia.gov.in/census_and_you/disabled_population.aspx), more than 1.6 million people in the total population have speech-related disabilities. TTS systems in regional languages may help improve their quality of life. TTS systems may also be used as screen readers for people with visual impairment or reading disabilities. In India, more than 10.6 million people have visual disabilities. Used in screen readers, TTS technology may help these people listen to written documents in different Indian languages. TTS systems may be coupled with computer-aided learning systems to provide helpful tools for learning new languages by listening to the pronunciation of words. Speech synthesis, combined with speech recognition, is useful for telecommunication and multimedia applications. Synthetic speech may be used in games or talking toys. TTS systems may also be used in designing human–computer interactive systems such as kiosks or automated tellers. TTS technologies are now available as browser plugins, SMS or email readers, PDF readers, pronunciation dictionaries, etc.`,
            },
        ]
    },
    {
        title: `How?`,
        desc: [
            {
                isPadding: true,
                isLink: false,
                params: `Process / flow of work:`,
            },
            {
                isPadding: false,
                isLink: false,
                params: `Design of text in the domain of agriculture and finance`,
            },
            {
                isPadding: false,
                isLink: false,
                params: `Selection of subjects/speakers`,
            },
            {
                isPadding: false,
                isLink: false,
                params: `Data collection`,
            },
            {
                isPadding: false,
                isLink: false,
                params: `Data validation`,
            },
            {
                isPadding: false,
                isLink: false,
                params: `Implementation`,
            },
            {
                isPadding: true,
                isLink: false,
                params: `Open sourcing`,
            },
            {
                isPadding: true,
                isLink: false,
                params: `Development of 40 hours of TTS corpus for each of a male and a female speaker in each of nine Indian languages, namely Bhojpuri, Maithili, Magadhi, Hindi, Chhattisgarhi, Bengali, Kannada, Telugu and Marathi (720 hours in total). The corpus will be several times larger than any existing TTS corpus in Indian languages.`,
            },
            {
                isPadding: true,
                isLink: false,
                params: `Open-sourcing baseline TTS engines in each of the nine languages so that anyone, including academic and industry researchers and start-ups, can download them and build application-specific models on top of them.`,
            },
            {
                isPadding: true,
                isLink: false,
                params: `Evangelizing the TTS corpus and engines through workshops so that researchers across the globe get to know about the open datasets. Developing application-specific TTS models through challenges in various sectors (e.g. agriculture, finance, assistance for people with disabilities), and hosting the corpus and making it publicly available.`,
            }
        ]
    },
    {
        title: `Then!`,
        desc: [
            {
                isPadding: true,
                isLink: false,
                params: `Once the proposed 720 hours of TTS data are made openly available, they will open up opportunities for academic researchers, students, small and large-scale industries and research labs to innovate and to develop algorithms and text-to-speech synthesizers in all nine Indian languages included in this proposal. The data will also foster competition among research groups in coming up with ideas for high-quality synthesized speech that improves voice-based services. This will create synergies with other initiatives currently under way to develop large-scale corpora in these languages for other speech technologies, including automatic speech recognition. Open voice data is the foundation on which local AI innovators can build applications geared towards the specific capabilities and requirements of users in India.`,
            },
            {
                isPadding: true,
                isLink: false,
                params: `Such a large-scale corpus would provide an ideal set of resources for developing human-like voice chatbots, which have wide applications in e-commerce, travel, agriculture and finance. The unavailability of data has been a roadblock for speech technology in many low-resource Indian languages, and it hampers the growth of a speech start-up ecosystem in which research and developmental innovation can take place. Making the TTS corpus from this project available as a public good will help foster many more start-ups in the speech technology domain, which in turn will create more jobs, open new markets, and form a basis for the social and economic development of the country. More importantly, it will highlight the need for such corpora in other Indian languages and draw the attention of funding agencies in the government and private sectors. Overall, the open voice tech ecosystem in India and beyond will benefit immensely from the output of the SYSPIN project.`
            }
        ]
    },
]

export default AboutData;