USA Voters and NCOA

This big data project covered data standardization, cleansing, and indexing with SOLR. We set out to build a robust search backend wrapped in a user-friendly interface.

Challenges We Faced

The Intsurfing team was asked to extract, transform, and load data from open sources about USA voters and individuals who had changed their addresses. Our big data specialists had to deal with the following:

  • Large data volumes - the project covered 4 billion records spread across different databases, such as Infutor, Movers, Thrive, NCOA, and Spoke.
  • Non-standardized data - because the records came from multiple sources, they rarely followed a single format or structure.
  • Risk of data duplication - several of the source databases could hold information about the same people, so we had to make sure the final dataset contained no duplicates.
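
To make the standardization and duplication challenges concrete, here is a minimal Python sketch of one common approach: normalize each record, derive a key from the normalized fields, and keep only the first record per key. It is an illustration under stated assumptions, not the project's actual pipeline. The field names (first_name, last_name, address), the abbreviation table, and all function names are hypothetical.

```python
import hashlib
import re

# Hypothetical abbreviation table; a real pipeline would follow a full
# postal addressing standard rather than these four entries.
STREET_ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd", "drive": "dr"}


def normalize_text(value):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", value.lower())
    return " ".join(cleaned.split())


def normalize_address(address):
    """Normalize an address and abbreviate known street-type words."""
    tokens = normalize_text(address).split()
    return " ".join(STREET_ABBREVIATIONS.get(token, token) for token in tokens)


def record_key(record):
    """Build a stable key from the normalized name and address fields."""
    raw = "|".join([
        normalize_text(record["first_name"]),
        normalize_text(record["last_name"]),
        normalize_address(record["address"]),
    ])
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


def deduplicate(records):
    """Yield the first record seen for each normalized key."""
    seen = set()
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            yield record
```

With this sketch, "John Smith, 12 Main Street" and "JOHN smith, 12 Main St." produce the same key and count as one person. An in-memory set would not hold billions of keys, so at the scale described here the same idea would more likely run as a distributed group-by-key step.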
Here Is What We Delivered

By the end of the project, we had provided our Client with a fully standardized database of American voters and individuals who had changed their place of residence.

  • The assembled database included 4 billion records.
  • The data, including names and addresses, was standardized.
  • We removed duplicate and redundant information so that the data is clean and easy to use.
  • Our team loaded the records into the database and provided a user-friendly interface for easy data manipulation.
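
Loading billions of records into a search index is normally done in batches rather than one document at a time. The Python sketch below shows one way to split cleaned records into fixed-size batches and serialize each batch as a JSON array, a format that Solr's JSON update handler accepts. The case study does not describe how indexing was actually performed, so the function names, the batch size, and the "id" field are assumptions made for illustration only.

```python
import json
from itertools import islice


def batched(records, batch_size):
    """Yield consecutive lists of at most batch_size records."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def solr_update_payloads(records, batch_size=1000):
    """Serialize each batch as a JSON array of documents.

    Each payload could then be POSTed to a Solr update endpoint;
    sending the request is deliberately left out of this sketch.
    """
    for batch in batched(records, batch_size):
        yield json.dumps(batch)
```

Because both functions are generators, records can be streamed from the cleansing step without holding the whole dataset in memory.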
Tech Stack
  • .NET 4.5
  • WPF
  • WCF
  • SQL Server 2012
  • Amazon AWS
  • Hadoop
  • SOLR