Reo Moana AI Development Has Begun!

We are proud to announce that we have begun development of Reo Moana AI! We are collecting, cleaning, collating, and organizing data and hope to begin our first round of training soon. We are not yet prepared to share details on this system, such as which architectures it is built upon or the specific foci of Reo Moana AI, but we will do so soon. We are also spending considerable time and effort on establishing guidelines for ethical use of the data we have collected and will collect.

Most commercial LLMs have limited knowledge of Hawaiian, Māori, Tahitian, and other Polynesian languages. Their training data is dominated by English and other major languages. The texts in Pacific languages that are accessible to these systems are not optimally organized for training, and they often have inconsistent use of diacritics and other issues that lead to inaccurate results for these languages. By meticulously training on these languages, using high-quality, culturally relevant texts, we believe there is potential to produce superior results. We intend for Reo Moana to have a genuine understanding of the languages’ structures, nuance, and cultural context, enabling accurate and meaningful communication.

We acknowledge we are not the first to point our wa‘a/waka/vaka (canoe) in this direction. We do not consider this a race, but a collaborative ‘imi loa (long search) to ensure the perpetuation and growth of these languages. Our gratitude to our colleagues across the Pacific who have inspired us to join in this journey:

We aspire to meet the lofty expectations of those whose ancestors spoke, preserved, and transmitted the languages to today’s language and cultural communities, and are aligning our work to meet their standards for ethical use of their languages.