No 1 About Project

MTE2O (Machine Translation English to Odia) is an open-source project for collecting data and building a translation engine from English to Odia, a low-resource Indic language.

The project is run by volunteers and consists of the following parts:

  • Data ingestion architecture
  • Data collection: Collect as many English-Odia parallel pairs as possible. We have collected more than 10,00,000 (1,000,000) parallel pairs so far.
  • Data cleaning: Clean the parallel pairs
  • Data alignment: Align each English sentence with its corresponding Odia sentence.
  • SMT: Statistical Machine Translation based on the IBM statistical models.
  • NMT: Neural Machine Translation built with the PyTorch library.
  • Hosting: Serve the live translation module from the cloud.
source: OdiaNLP
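
To make the SMT step more concrete, here is a minimal sketch of IBM Model 1, the simplest of the IBM models, trained with expectation-maximization on word-tokenized parallel pairs. It is an illustration only, not the project's actual code: the function names, the toy sentence pairs, and the omission of the usual NULL source token are all assumptions made to keep the example short.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate t[(f, e)] = P(target word f | source word e) with EM.

    pairs: list of (english_tokens, odia_tokens) tuples.
    Simplification (assumption): no NULL source token is used.
    """
    target_vocab = {f for _, fs in pairs for f in fs}
    # Start from a uniform translation table.
    t = defaultdict(lambda: 1.0 / len(target_vocab))

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) links
        total = defaultdict(float)  # expected count of e being linked

        # E-step: spread each target word's mass over the source words.
        for es, fs in pairs:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c

        # M-step: renormalize the expected counts per source word.
        t = defaultdict(
            float, {(f, e): c / total[e] for (f, e), c in count.items()}
        )
    return t


def best_translation(t, source_word):
    """Return the target word with the highest P(f | source_word), or None."""
    candidates = {f: p for (f, e), p in t.items() if e == source_word}
    return max(candidates, key=candidates.get) if candidates else None


# Toy, hand-made pairs for illustration only (not project data).
TOY_PAIRS = [
    ("big house".split(), "ବଡ଼ ଘର".split()),
    ("big book".split(), "ବଡ଼ ବହି".split()),
    ("a book".split(), "ଗୋଟିଏ ବହି".split()),
]

if __name__ == "__main__":
    table = train_ibm_model1(TOY_PAIRS)
    for word in ("house", "book"):
        print(word, "->", best_translation(table, word))
```

Even on three sentence pairs, the co-occurrence pattern is enough for the model to prefer one Odia word per English word. This is why collecting and cleaning a large number of well-aligned pairs matters so much for the statistical approach.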

Photo by JACQUELINE BRANDWAYN on Unsplash

No 2 My responsibilities

Asynchronous Wikipedia data extraction

  • Developed an application to extract monolingual data from Wikipedia.
  • Developed another application to extract an English-to-Odia dictionary database.
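
As a rough idea of what asynchronous extraction can look like, the sketch below fetches plain-text article extracts concurrently with `asyncio`. It is not the project's actual extractor. The choice of the Odia Wikipedia endpoint, the use of the MediaWiki `extracts` query, the `User-Agent` string, and the names `fetch_extract` and `extract_many` are all assumptions made for illustration.

```python
import asyncio
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the MediaWiki API of the Odia Wikipedia.
API_URL = "https://or.wikipedia.org/w/api.php"


def _http_get_extract(title):
    """Blocking request for the plain-text extract of one article."""
    params = urllib.parse.urlencode({
        "action": "query",
        "prop": "extracts",
        "explaintext": 1,
        "format": "json",
        "titles": title,
    })
    request = urllib.request.Request(
        f"{API_URL}?{params}",
        headers={"User-Agent": "mte2o-sketch/0.1"},  # hypothetical agent name
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        data = json.load(response)
    pages = data["query"]["pages"]
    return next(iter(pages.values())).get("extract", "")


async def fetch_extract(title):
    """Run the blocking request in a worker thread (Python 3.9+)."""
    return await asyncio.to_thread(_http_get_extract, title)


async def extract_many(titles, fetch=fetch_extract, max_concurrency=5):
    """Fetch many articles concurrently, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(title):
        async with semaphore:
            return title, await fetch(title)

    results = await asyncio.gather(*(worker(t) for t in titles))
    return dict(results)


if __name__ == "__main__":
    # Requires network access; the article title is only an example.
    texts = asyncio.run(extract_many(["ଓଡ଼ିଶା"]))
    for title, text in texts.items():
        print(title, len(text), "characters")
```

The `fetch` parameter is injectable so the concurrency logic can be exercised offline with a stub, and the semaphore keeps the number of simultaneous requests bounded.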

Crowdsourcing & Volunteers

  • Developed and hosted a cloud application that monitors the number of En-Od pairs posted by volunteers on Twitter.
  • Inspired the volunteers to work on the project.

Content writer

  • Documented the project's progress on an open-source GitHub page.
  • Wrote up the key concepts and technologies that newly joined and future volunteers need to understand.

No 3 Tech stack

  • Python
  • Golang
  • MkDocs
  • Markdown
  • PyCharm
  • Heroku
  • GitHub Pages
  • Docker
  • Git

No 6 Website

No 7 More Projects

  • AIOps-Platform Tech lead
  • Cognitive search Tech lead