Cypriot Greek speakers may soon be understood by voice-activated systems thanks to a speech-to-text AI model developed by a small team. The model is aimed at speakers of the island’s unique dialect, who have long been poorly served by speech technology.
Igor Akimov, an AI product manager, collaborated with interns Hussein Khadra and Nikita Markov from the University of Nicosia and UCLan, respectively, to create an automatic speech recognition system specifically designed for Cypriot Greek. The system converts spoken language into written text, making it applicable for various uses, including AI voice agents, translation services, and automated customer support.
This technology is not just a boon for everyday users; it holds significant promise across multiple sectors. In healthcare, for instance, it can transcribe patient speech directly into medical systems, facilitating smoother interactions, especially for older adults. In business, it paves the way for automated voice agents to communicate naturally with Cypriot customers. Furthermore, it can play a crucial role in education by aiding the preservation of the Cypriot dialect and digitising local audio archives.
One of the key goals of the project was to develop a methodology for working with languages and dialects that lack sufficient data. Akimov remarked on the complexity of the task, saying, “It was not easy. I think we all underestimated just how complex it would be. There were definitely ups and downs along the way.”
Initially, the team struggled to source high-quality data. They reached out to various researchers but were often met with lost data, prohibitive fees, or outright refusals of access. With existing resources limited, they turned to a variety of media, gathering Cypriot audio from TV shows, radio stations, podcasts, and books. This led to the creation of the largest Cypriot Greek speech collection ever assembled.
Training the AI was a multi-phase endeavour. In the first stage, the system was exposed to casual Cypriot Greek speech to capture its distinctive sounds and rhythms. Clearer professional speech from news broadcasts and radio shows was then introduced to refine the AI’s understanding and minimise errors. A language model built with KenLM, an open-source toolkit for statistical language modelling, was also integrated to improve recognition accuracy by favouring the most probable word sequences.
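The idea of using a tool such as KenLM to favour the most probable words can be illustrated with a small sketch. The Python below is not the team's code and does not use KenLM itself: it trains a toy bigram model on a few placeholder English sentences and uses it to choose between two candidate transcripts. The function names, the sample sentences, the acoustic scores, and the `lm_weight` value are all assumptions made purely for illustration.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Count unigrams and bigrams over whitespace-tokenised sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def lm_log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a sentence."""
    vocab_size = len(unigrams)
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Unseen words and pairs count as zero; smoothing keeps the ratio positive.
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
    return total


def rescore(candidates, unigrams, bigrams, lm_weight=0.5):
    """Return the candidate text with the best combined score.

    Each candidate is a (text, acoustic_log_score) pair. The acoustic scores
    here are made-up stand-ins for what a speech recogniser would output.
    """
    def combined(candidate):
        text, acoustic_score = candidate
        return acoustic_score + lm_weight * lm_log_prob(text, unigrams, bigrams)

    return max(candidates, key=combined)[0]


# Placeholder training text; a real system would use a large dialect corpus.
corpus = [
    "the weather is nice today",
    "the weather is cold",
    "is the weather nice",
]
unigrams, bigrams = train_bigram_lm(corpus)

# The recogniser slightly prefers the misheard version on sound alone.
candidates = [
    ("the whether is nice today", -4.0),
    ("the weather is nice today", -4.2),
]
print(rescore(candidates, unigrams, bigrams))
```

With the language model switched off (`lm_weight=0`), the misheard candidate wins on its acoustic score alone; with the model's vote included, the word sequence it has seen before is preferred. A production system would rely on a far larger model trained on real dialect text rather than hand-counted bigrams.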
As the project progressed, the team focused on continuous improvement. They developed a platform where native speakers could correct the AI’s transcripts, feeding these corrections back into the training process. This iterative approach aims to increase the system’s accuracy and fidelity to the Cypriot dialect over time.
The entire project was completed on a modest budget of just $150, relying on accessible cloud technology. Akimov emphasised, however, that the model remains a work in progress: “With only a few hours of high-quality transcribed audio, we couldn’t create the world’s best model yet – but it’s absolutely achievable.”
Currently, the team has amassed around 300 hours of Cypriot speech and is actively seeking volunteers to contribute. Those interested can assist by spending just 15 minutes validating transcriptions on their project website, voiceofcyprus.org. This small effort could significantly enhance the quality of the AI model for Cypriot speech recognition and potentially lead to a text-to-speech system that authentically represents the dialect.
Akimov stressed the importance of the initiative for the Cypriot community: “This will help us – and Cyprus – tremendously. Even just 10-15 minutes makes a difference. We want every Cypriot to be able to speak in their own dialect and still be understood by technology.”
