Speech recognition systems have been developed at the Institute of Control Systems

21.02.2021 / Conferences, assemblies

Systems based on artificial intelligence (AI) technologies and methods make it possible to implement various functions of human intelligence and to solve practical problems that are difficult to formalize in computer systems. This yields significant economic benefits, and the scope of application is expanding daily. These technologies are widely used in areas such as image identification, text understanding, speech recognition, economics, medicine, forensics and journalism.

Natural language processing is a branch of artificial intelligence that studies the analysis and synthesis of natural language by computer. Speech is the most natural form of human communication, and it is playing an increasingly important role in human-machine interaction. Speech recognition is the analysis of human speech and its automatic conversion into text by computer programs. The field has been developing since the 1960s, and significant results have been achieved in recent years.

Azerbaijani science also contributes to this process. As a result of research in speech recognition (the conversion of human speech into text and its processing), one of the key elements of artificial intelligence, scientists from the Institute of Control Systems (IMS) of the Azerbaijan National Academy of Sciences have created competitive applied systems. The research and the commercialization of its results were carried out at the institute in cooperation with a private company.

The research was led by Associate Professor Abulfat Fatullayev, with Academician Ali Abbasov as consultant. Based on its results, neural network models and algorithms for speech recognition systems (SRS) were developed using machine learning and deep learning methods for a number of European languages (German, Turkish, English, Spanish, Italian, French) and Azerbaijani. Big Data and distributed computing technologies were widely used to create and process the acoustic, linguistic and other information bases that SRS need in order to operate properly. The information bases were created in a fully automated way, by automatically selecting terabytes of data from Internet resources.

Although the resulting neural network models contain a large number of parameters, the scientists were able to run them at fairly high speed by using parallel algorithms, matrix decomposition and other methods. English, German, Turkish and Azerbaijani versions of the system have been created and made available online.
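
The article does not say which matrix decomposition was used. As a purely illustrative sketch, assume a low-rank factorization: a weight matrix W of size m x n is stored as the product of two smaller matrices U (m x r) and V (r x n), so that W·x can be computed as U·(V·x) with fewer multiplications. All names and sizes below are hypothetical and are not taken from the institute's system.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def matmul(left, right):
    """Multiply two matrices given as lists of rows."""
    columns = list(zip(*right))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns]
            for row in left]


def multiplications(rows, cols):
    """Scalar multiplications needed for one rows x cols matrix-vector product."""
    return rows * cols


# Hypothetical layer sizes, chosen only for illustration.
m, n, r = 8, 8, 2

# Small integer factors, so the comparison below is exact (no rounding).
U = [[(i + k) % 3 - 1 for k in range(r)] for i in range(m)]      # m x r
V = [[(k * j + 1) % 4 - 2 for j in range(n)] for k in range(r)]  # r x n
W = matmul(U, V)                                                 # m x n, rank <= r

x = [1, -2, 3, 0, 2, -1, 4, 1]  # input vector of length n

dense_output = matvec(W, x)                 # one large product
factored_output = matvec(U, matvec(V, x))   # two small products

dense_cost = multiplications(m, n)
factored_cost = multiplications(r, n) + multiplications(m, r)

print("same result:", dense_output == factored_output)
print("multiplications, dense vs factored:", dense_cost, "vs", factored_cost)
```

In this toy case the factored form needs r·(m + n) multiplications instead of m·n, which pays off whenever r is much smaller than m and n. In a real model the weight matrix is rarely exactly low-rank, so a factorization of this kind would be an approximation whose effect on accuracy has to be measured.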

Today, SRS support development in many areas. They are used in programming and in solving a number of problems in the humanities, including linguistic research such as automatic text analysis, machine translation and the compilation of dictionaries. SRS can be used in business management, to automate communication with customers in call centers and reduce the load on operators and telephone lines, and for automatic analysis of calls, market research, data input and other tasks. In addition, SRS can be widely used to create voice interfaces ("smart home", household appliances, car navigation, etc.). Applied on electronic government portals, SRS can provide voice navigation, voice search, voice data input and biometric user authentication.

In widely used distance learning systems, SRS can make it possible to confirm the identity of students, especially for authentication during exams. These technologies can also be used to create special services that meet the needs of people with disabilities, so that they can get an education, acquire necessary information and professional knowledge and, as a result, take an active part in social and working life.

It should be noted that, according to forecasts for the development of information technology in the coming years, the annual volume of the world market for speech recognition technology and its applications (speech analysis, media monitoring, virtual assistants, speaker identification, etc.) is estimated at $25 billion.

The expansion and comprehensive application of scientific research in artificial intelligence, one of the most modern and prestigious areas of science, and perhaps the foremost among them, will have a positive impact on the innovative economic development of our country.