Big Data Platform - Spark Cluster

Teaching Big Data technology is important. My background is more that of a Data Engineer than a Data Scientist, and I love deploying infrastructure.

Current Cluster:

Current Hardware

Storage component for Spark Cluster

Datasets are stored on-prem in MinIO, an S3-compatible object store.

On-prem S3-compatible Object Storage
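
As a rough illustration of how a Spark job might be pointed at an S3-compatible store such as MinIO, the sketch below builds the Hadoop S3A properties Spark would need and renders them as `spark-submit` flags. The endpoint URL, credentials, and helper names (`s3a_spark_conf`, `as_submit_flags`) are placeholders made up for this example, not the cluster's actual settings.

```python
# Minimal sketch: Spark properties for reading from an S3-compatible
# object store through the Hadoop S3A connector.
# The endpoint and credentials used below are hypothetical placeholders.


def s3a_spark_conf(endpoint, access_key, secret_key, use_ssl=False):
    """Return Spark properties that point S3A at a custom endpoint."""
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Self-hosted stores are commonly addressed as host/bucket
        # rather than bucket.host, so path-style access is enabled.
        "spark.hadoop.fs.s3a.path.style.access": "true",
        "spark.hadoop.fs.s3a.connection.ssl.enabled": str(use_ssl).lower(),
    }


def as_submit_flags(conf):
    """Render the properties as a list of spark-submit --conf arguments."""
    flags = []
    for key in sorted(conf):
        flags += ["--conf", f"{key}={conf[key]}"]
    return flags


if __name__ == "__main__":
    conf = s3a_spark_conf(
        "http://minio.example.internal:9000",  # assumed endpoint
        "EXAMPLE_ACCESS_KEY",                  # assumed credential
        "EXAMPLE_SECRET_KEY",                  # assumed credential
    )
    print(" ".join(as_submit_flags(conf)))
```

The same key/value pairs could instead be set on a PySpark session builder, after which datasets would be addressed with `s3a://` paths. This assumes the `hadoop-aws` connector is available on the Spark classpath.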

Software Support

Supported software:

  • MapReduce
  • Spark
  • SparkR
  • Spark SQL

Actual Image Located at the Wheaton Rice Campus - Room 242