Sunday, November 17, 2024

Google adds first Indigenous language in Canada to its translation service

One of the most widely spoken Indigenous languages in this country is now available through Google’s translation service, the first time the tech giant has included a First Nations, Métis or Inuit language spoken in Canada on its platform.

Inuktut, a broad term encompassing different dialects spoken by Inuit in Canada, Greenland and Alaska, has been added to Google Translate, which translates text, documents and websites from one language into another.

The latest addition is part of a Google initiative to develop a single artificial intelligence language model to support 1,000 of the most spoken languages in the world.

There are roughly 40,000 Inuktut speakers in Canada, data from Statistics Canada suggests.

The number of speakers alone is not enough to determine whether a language can be included in Google Translate, said Isaac Caswell, a senior software engineer with the platform.

There also has to be enough online text data to pull from to create a language model.

Other Indigenous languages in Canada have “had simply too little data to have any usable machine translation model,” said Caswell.

For example, engineers looked at adding Cree, which is spoken by more than 86,000 people in Canada, but there were fewer websites in the language to pull from.

“We don’t want to put anything on the product which just produces broken text or nonsense,” said Caswell.

“Inuktut really stands out in that it has a lot of clean and a lot of well written data, because, I think, the community is increasing online.”

When adding a language to Google Translate, the tech company looks at two main things: whether there’s a desire or need from the community and how technically feasible it is.

After Google determined its model could recognize Inuktut, it began to consult with language speakers and organizations.

The company reached out to Inuit Tapiriit Kanatami, the national organization that represents about 70,000 Inuit in Canada, to ensure development of the model was true to the Inuktut language, including the ability to translate both of the language’s writing systems.

Inuktut is written in two systems: qaniujaaqpait, or syllabics, and qaliujaaqpait, which uses the Roman alphabet.

Inuit Tapiriit Kanatami has developed its own data set of common characters that can be used to write in any dialect of Inuktut to help ease written communication among the different Inuit regions.

“If we hadn’t had their help, we would have just been able to launch in syllabics, which undermines some of their current work,” said Caswell.

The organization welcomed Google’s work to include Inuktut, citing the need to revitalize, protect and promote Inuit languages.

“The addition of Inuktut on such a widely used platform empowers Inuit to interact more fully in the digital world,” Natan Obed, president of Inuit Tapiriit Kanatami, said in a statement.

With the introduction of Inuktut, Google aims to be more representative of a group of people often overlooked by the tech sector.

“I hope, maybe if anything, it will make them feel a little bit more seen by a big tech (company). Because, in general, Indigenous communities have had a lot of experiences being overlooked by technology,” said Caswell.

Users will have the ability to translate written Inuktut to English and vice versa through Google Translate. Other options, including the verbal translation tool, may come at a later time, said Caswell.

The use of AI in promoting Indigenous languages is not without its limitations, said Caswell, but he suspects this will change as more and more languages are unlocked with improved technology.

This report by The Canadian Press was first published Oct. 17, 2024.

Brittany Hobson, The Canadian Press
