Student Performance Factors Analysis
Introduction
This project explores the “Student Performance Factors” dataset, publicly available on Kaggle. This dataset collects individual-level student information, integrating academic, family, socioeconomic, and behavioral variables. The goal is to analyze factors associated with academic performance as measured by the score obtained in final evaluations.
The dataset includes variables related to study habits, attendance, access to educational resources, motivation level, family environment characteristics, school type, and other relevant determinants of school performance. This diversity of variables allows for a multivariate approach to academic performance, recognizing that educational outcomes depend not only on individual factors but also on structural and contextual conditions.
From an applied perspective, this dataset is particularly relevant for educational studies: it helps identify patterns associated with academic performance and segment students according to performance profiles. In educational policy and school support programs, this type of analysis is essential for guiding targeted interventions, optimizing resource allocation, and strengthening strategies for improving learning.
Summary of Findings
- Data Quality: The dataset is high quality with minimal cleaning required.
- Best Model: Linear Regression performed best (R² ≈ 0.825), indicating that factors like attendance and study hours have a strong linear relationship with performance.
- Segmentation: K-Means identified distinct student profiles, which can help in designing targeted educational interventions.
- Reflection: Machine learning provides powerful tools for early intervention in education, shifting the focus from reacting to failure to preventing it.
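To make the R² figure above concrete, here is a minimal sketch of how a linear regression can be fitted and scored. It is not the project's actual pipeline: the data are synthetic stand-ins generated in the code, the two features (hours studied and attendance) and their coefficients are assumptions chosen for illustration, and the resulting R² is unrelated to the value reported above.

```python
import random


def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    X is a list of feature rows; an intercept column is added here.
    Returns [intercept, coef_1, ..., coef_k].
    """
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(beta, row):
    """Intercept plus the weighted sum of the features."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))


def r_squared(y_true, y_pred):
    """Share of the variance in y_true explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (NOT the Kaggle dataset): hours studied and
# attendance percentage, with an assumed linear effect plus Gaussian noise.
rng = random.Random(0)
X = [[rng.uniform(1, 40), rng.uniform(60, 100)] for _ in range(500)]
y = [40 + 0.3 * hours + 0.2 * att + rng.gauss(0, 1.5) for hours, att in X]

beta = fit_linear_regression(X, y)
r2 = r_squared(y, [predict(beta, row) for row in X])
print(f"intercept={beta[0]:.2f}, hours={beta[1]:.3f}, attendance={beta[2]:.3f}")
print(f"in-sample R^2 = {r2:.3f}")
```

This sketch scores the model on the same rows it was fitted on. A held-out test split gives a more honest estimate of how well the model generalizes.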
Score Prediction Feature
Use the sliders to estimate a student’s performance based on their habits and environment.
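As a rough illustration of how slider values could be turned into a predicted score, the sketch below clamps each input to its slider range and applies a linear formula. Everything specific in it is hypothetical: the slider names and ranges, the intercept, and the coefficients are placeholders, not the fitted values used by this project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slider:
    """One input control with an allowed range (hypothetical)."""
    name: str
    minimum: float
    maximum: float

    def clamp(self, value):
        # Keep out-of-range inputs inside the slider's bounds.
        return max(self.minimum, min(self.maximum, value))


# Placeholder sliders and model parameters, for illustration only.
SLIDERS = [
    Slider("hours_studied", 0, 40),
    Slider("attendance", 60, 100),
]
INTERCEPT = 40.0
COEFFICIENTS = {"hours_studied": 0.3, "attendance": 0.2}


def predict_score(inputs):
    """Return intercept + sum(coefficient * clamped slider value)."""
    score = INTERCEPT
    for slider in SLIDERS:
        score += COEFFICIENTS[slider.name] * slider.clamp(inputs[slider.name])
    return round(score, 1)


print(predict_score({"hours_studied": 20, "attendance": 80}))
```

Clamping keeps predictions within the range of inputs the sliders allow, so the linear model is not asked to extrapolate beyond them.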
Project Sections
- Data Preparation: Data loading, cleaning, and initial exploration.
- Exploratory Data Analysis: Univariate and bivariate analysis of the variables.
- Modeling (Supervised & Unsupervised): Prediction of scores and student segmentation.
- Conclusions: Summary and final reflections.