How to Build a Spark Cluster with Docker, JupyterLab, and Apache Livy—a REST API for Apache Spark

Written by Lidia Kurasińska | Aug 3, 2021

Have you ever wondered how you can leverage Apache Livy in your project to take your work with Apache Spark clusters to the next level? I put together a step-by-step guide that’ll help you do exactly that.

To run the sample project and make the most of this guide, you’ll need to install Docker first. If you’re not familiar with containers, you’ll find more details in the Docker documentation.

By reading this article, you’ll learn how to build a Spark cluster with a Livy server and JupyterLab, all running in Docker containers.

You’ll also find out how to prepare your business logic in JupyterLab, and discover how I used a sample project to run PySpark code via the Livy service.
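To give you a taste of what that last part looks like, here’s a minimal sketch of talking to Livy’s REST API with the requests library: it creates an interactive PySpark session, submits a statement, and reads back Spark’s output. The URL assumes a Livy server listening on localhost at its default port 8998; the sample project in this guide wires the same pieces up inside Docker, so the exact host and code will differ there.

```python
import json
import time

import requests

LIVY_URL = "http://localhost:8998"  # assumes Livy's default port; adjust to your setup
HEADERS = {"Content-Type": "application/json"}

# Create an interactive PySpark session on the cluster.
resp = requests.post(f"{LIVY_URL}/sessions",
                     data=json.dumps({"kind": "pyspark"}),
                     headers=HEADERS)
session_url = f"{LIVY_URL}/sessions/{resp.json()['id']}"

# Wait until the session is ready to accept statements.
while requests.get(session_url, headers=HEADERS).json()["state"] != "idle":
    time.sleep(1)

# Submit a snippet of PySpark code as a statement.
code = "print(spark.range(100).count())"
resp = requests.post(f"{session_url}/statements",
                     data=json.dumps({"code": code}),
                     headers=HEADERS)
statement_url = f"{LIVY_URL}{resp.headers['Location']}"

# Poll for the result, then print the statement's output.
while (result := requests.get(statement_url, headers=HEADERS).json())["state"] != "available":
    time.sleep(1)
print(result["output"]["data"]["text/plain"])
```

This is the core idea behind Livy: instead of shipping jobs to the cluster with spark-submit, your client only needs HTTP, and the Livy server manages the Spark sessions for you.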