This repository demonstrates big data processing, visualization, and machine learning using tools such as Hadoop, Spark, Kafka, and Python.
Updated Jan 15, 2026 - Jupyter Notebook
This project examines the relationship between parental education and student academic success using the Student Performance (PIP) dataset. It investigates whether students with more highly educated parents perform better academically or are less likely to drop out, using t-tests, bootstrap resampling, permutation tests, and chi-square tests.
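As a rough illustration of the permutation approach mentioned above, the following is a minimal sketch of a two-sided permutation test for a difference in group means. It is not the project's own code: the function name `permutation_test`, the two score lists, and the group labels are hypothetical, and the numbers are made up for demonstration rather than taken from the dataset.

```python
import random


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the observed mean difference (a minus b) and an estimated p-value.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical exam scores, invented for illustration only.
scores_higher_parental_ed = [78, 85, 90, 72, 88, 95, 81, 79]
scores_lower_parental_ed = [65, 70, 58, 74, 62, 69, 71, 60]

observed_diff, p_value = permutation_test(
    scores_higher_parental_ed, scores_lower_parental_ed
)
print(f"observed difference in means: {observed_diff:.3f}")
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value here would suggest that a difference this large is unlikely under random relabeling of the groups. The same shuffling idea can be adapted to other statistics, such as a difference in dropout rates.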