Python, Software Development, UX and Product Design - Blog - STX Next

Data Validation in a Big Data Environment: An Example with Apache Spark and Great Expectations

Written by Lidia Kurasińska | Mar 1, 2022 2:14:02 AM

Every minute in 2020, Facebook users shared 150,000 messages, LinkedIn users applied for 69,444 job offers, and Instagram users posted 347,222 stories, according to stats from Domo.

With such huge quantities of data generated and used every day, ensuring data quality is becoming increasingly important.

Since data is said to be “the new oil,” validating it efficiently can determine whether your business idea succeeds.

By reading this article, you will learn the answers to the following questions:

  • What is data validation?
  • What is Great Expectations and why should you care about it?
  • What are some basic Great Expectations concepts?
  • What does a sample use of Great Expectations look like?
  • How do you write custom Expectations using real-life examples?