India Voter Registration Roll

The India Voter Registration Roll project centered on extracting and transforming data gathered from many different types of files. Intsurfing was asked to consolidate all of the hard-to-process data about Indian voters into a single file.

What Challenges We Faced

Within the scope of this project, our team had to deal with:

  • Non-standardized data - the information arrived in different file types (PDFs, images, photos of handwritten samples), and each file had its own structure.
  • Data volume - our team had to process 1 billion records and transform them into a readable format.
  • Different languages - the processing engine had to recognize text in all 22 official Indian languages.

Here Is What We Delivered

By the end of the project, the Client received a single file containing the consolidated roll of Indian voters. Drawing on our expertise in big data, we extracted information about the Indian population from open sources, transformed it, and delivered it in a readable format. In particular:

  • The adaptive engine we developed scanned various files (PDF files, screenshots, images with hand-written notes, etc.) and converted the captured information into an easy-to-process format.
  • The engine also recognized text written in different Indian languages.
  • Intsurfing brought all the pieces of data together to create a single database of voters with 1 billion records.
  • Our team provided ongoing support and updated the file once a year, removing information that was no longer relevant and adding new data.
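
As a rough illustration of the approach described in the list above, the Python sketch below routes each input file to an extractor based on its extension and wraps every extracted line in one common record type. It is not the project's actual engine, which was built on the stack listed in this case study. The `VoterRecord` fields, the `build_router` and `consolidate` helpers, and the extension mapping are all hypothetical, and the extractors are passed in as plain functions so that any OCR or PDF library could stand behind them.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Tuple

# An extractor takes a file path and yields raw text lines found in that file.
Extractor = Callable[[Path], Iterable[str]]


@dataclass
class VoterRecord:
    # Hypothetical unified schema; the real field set is not described in the source.
    source_file: str
    language: str
    raw_text: str


def build_router(pdf_extractor: Extractor, image_extractor: Extractor) -> Dict[str, Extractor]:
    # Assumed mapping from file extension to extractor (illustrative only).
    return {
        ".pdf": pdf_extractor,
        ".png": image_extractor,
        ".jpg": image_extractor,
        ".jpeg": image_extractor,
    }


def consolidate(
    paths: Iterable[Path],
    router: Dict[str, Extractor],
    language: str = "und",  # "und" = undetermined; a real engine would detect or be told this
) -> Tuple[List[VoterRecord], List[Path]]:
    records: List[VoterRecord] = []
    skipped: List[Path] = []
    for path in paths:
        extractor = router.get(path.suffix.lower())
        if extractor is None:
            # No extractor for this file type: report it instead of dropping it silently.
            skipped.append(path)
            continue
        for line in extractor(path):
            text = " ".join(line.split())  # collapse runs of whitespace
            if text:  # ignore lines that are empty after cleanup
                records.append(VoterRecord(path.name, language, text))
    return records, skipped
```

Calling `consolidate(paths, build_router(read_pdf, run_ocr))` with your own `read_pdf` and `run_ocr` functions (both hypothetical names) returns the unified records together with the list of files that no extractor could handle. Keeping the extractors injectable means a new file type only needs a new entry in the routing table.
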
Tech Stack
  • .NET Core 2.0
  • Amazon Web Services (AWS)
  • Tesseract OCR
  • iText PDF