Project 5: Machine Learning Classification for Educational Outcomes

This project aimed to predict educational outcomes, specifically whether students would drop out, remain enrolled, or graduate, based on various features. The dataset combined several feature types, including categorical application modes, continuous numerical features, and binary indicators.

Methodology: The project evaluated a range of machine learning algorithms to find the most suitable one for the classification task. Models included K-Nearest Neighbors, Gradient Boosting, Decision Trees, Random Forests, Support Vector Machines, Gaussian Naive Bayes, Neural Networks, Linear Discriminant Analysis, and Quadratic Discriminant Analysis.
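
A minimal comparison sketch of this model lineup with scikit-learn might look like the following. The synthetic data from make_classification stands in for the actual student-outcomes dataset, so the scores printed here are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)

# Placeholder data; the real project uses the student-outcomes dataset
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Gradient Boosting": GradientBoostingClassifier(),
    "Decision Tree": DecisionTreeClassifier(),
    "Random Forest": RandomForestClassifier(),
    "SVM": SVC(),
    "Gaussian Naive Bayes": GaussianNB(),
    "Neural Network": MLPClassifier(max_iter=1000),
    "LDA": LinearDiscriminantAnalysis(),
    "QDA": QuadraticDiscriminantAnalysis(),
}

# Fit each model and report its test-set accuracy
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: {model.score(X_test, y_test):.3f}")
```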

Exploratory Data Analysis (EDA): Explored the distributions of application modes, the numerical features, and the target variable. Used visualizations such as pie charts, histograms, and correlation matrices to draw insights.
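
The EDA visuals could be produced along these lines with pandas, seaborn, and matplotlib. The file name students.csv and the Target column name are placeholders, not the project's actual identifiers.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv("students.csv")  # hypothetical file name

# Pie chart of the target variable's class distribution
df["Target"].value_counts().plot.pie(autopct="%1.1f%%")
plt.title("Target class distribution")
plt.show()

# Histograms of the numerical features
df.select_dtypes(include="number").hist(figsize=(12, 10), bins=20)
plt.tight_layout()
plt.show()

# Correlation matrix heatmap of the numerical features
corr = df.select_dtypes(include="number").corr()
sns.heatmap(corr, cmap="coolwarm", center=0)
plt.title("Feature correlation matrix")
plt.show()
```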

Data Preprocessing: Transformed the three-class target variable into binary classes to simplify the task. Split the dataset into training and testing sets. Standardized numerical features using Z-score scaling.
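
A sketch of that preprocessing pipeline, continuing from the df loaded above. How the "Enrolled" class is folded into the binary label is an assumption here, not necessarily the mapping the project used; fitting the scaler on the training split only is standard practice to avoid leakage.

```python
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Collapse the three outcomes into a binary label. Grouping "Enrolled"
# with "Dropout" is an illustrative choice, not necessarily the project's.
df["label"] = (df["Target"] == "Graduate").astype(int)

# Keep only numeric columns for scaling in this sketch
X = df.drop(columns=["Target", "label"]).select_dtypes(include="number")
y = df["label"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Z-score scaling: fit on the training set only, then apply to both splits
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```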

Modeling: Conducted a grid search for the optimal K value in K-Nearest Neighbors. Evaluated performance metrics for each model, including accuracy, confusion matrices, and classification reports.
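
Continuing from the preprocessed splits above, the grid search and the evaluation metrics could look like this in scikit-learn; the range of candidate K values is illustrative.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    classification_report,
)

# Cross-validated grid search over K (candidate range is illustrative)
param_grid = {"n_neighbors": range(1, 31)}
grid = GridSearchCV(
    KNeighborsClassifier(), param_grid, cv=5, scoring="accuracy"
)
grid.fit(X_train_scaled, y_train)
print("Best K:", grid.best_params_["n_neighbors"])

# Evaluate the tuned model with the metrics used in the project
y_pred = grid.predict(X_test_scaled)
print("Accuracy:", accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
```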

Results: The Random Forest Classifier emerged as the most effective model, achieving an accuracy of 84%. Other notable performers included the Support Vector Machine Classifier (83%) and the Decision Tree Classifier (80%).

Challenges and Future Work: While the project yielded promising results, it also surfaced challenges such as class imbalance and suboptimal neural network performance. Future work could involve further hyperparameter tuning, feature engineering, and ensemble techniques (sketched below) for improved predictive accuracy.
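
As one example of the ensemble direction, a soft-voting classifier could combine the strongest individual models; the membership and settings below are illustrative, not tuned, and continue from the splits prepared earlier.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Soft voting averages predicted class probabilities across members,
# so SVC needs probability=True to expose predict_proba
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(random_state=42)),
        ("svm", SVC(probability=True, random_state=42)),
        ("dt", DecisionTreeClassifier(random_state=42)),
    ],
    voting="soft",
)
ensemble.fit(X_train_scaled, y_train)
print("Ensemble accuracy:", ensemble.score(X_test_scaled, y_test))
```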

Technologies Used: Python, scikit-learn, TensorFlow, seaborn, matplotlib 

The complete Python code for the project can be accessed at: GitHub Repository

This project reflects my proficiency in machine learning, data analysis, and model evaluation. It highlights my ability to tackle real-world challenges, optimize model performance, and communicate findings effectively. You can also check out my Instagram, YouTube, and Twitter pages.
