Saugat Bhattarai
It recently came to our attention that, in the telecom industry, Artificial Intelligence struggles with the domain's terminology and concepts.[1] As stated in the linked article:
“Our industry operates with a unique lexicon – a dense landscape of acronyms, technical standards and operational jargon. Standard AI models, typically trained on broad internet data, lack the fluency needed to navigate this complexity accurately.”
This issue in the telecom industry has led us to ask: is Nepali a comparable case? There is certainly less data on the internet in the Nepali language. Will working and delivering in Nepali lead to errors, which then translate into a loss rather than a gain in efficiency? Who will create a “domain-specific language model” for Nepal? Why is telecom jargon considered to possess a “unique lexicon” while Nepali is not? To begin with, it seems that the task of finding out whether, and to what extent, this problem exists falls to NGOs rather than private developers.
We can look at this problem from the perspective of development, and ask how the most marginalized might be affected when AI is not accurate in the Nepali language. It might mean, for instance, that education curricula remain too traditional because they cannot incorporate AI-based support in Nepali. It might mean that any new government service takes longer to deliver because of the need to ‘fix bugs’ resulting from AI’s difficulty with the Nepali language.
More broadly, the gap between English and Nepali will widen further. There might be other issues, but the research into them and the drive for a solution that seem to exist for the telecom industry are absent in the case of Nepal and Nepali.
It might be right to advocate for the design of AI that addresses its problems with Nepali terminology and concepts. This would be a ‘domain-specific AI.’ But the issue is also broader, and must include Nepali experiences and realities, over and above the language. We, as part of ICT4D, must question exactly how AI is being tailored to the Nepali context. We can learn lessons from other parts of the world, but when it comes to the Nepali language, we need to be specific and speak for, and from, our own place.
[1] The article bringing attention to this issue is found here: https://www.gsma.com/newsroom/article/closing-the-ai-accuracy-gap-in-telecoms-the-critical-role-of-domain-specific-language-models/
