Find Actionable Insights using Machine Learning and XGBoost
- Knowledge of Python and the basics of modeling
- Ability to run a Jupyter Notebook and install appropriate Python libraries
Applied data science is about everything that goes before and after your model. Extracting actionable insights is probably the most important aspect of any modeling project! if you want to step up your data science game then this is a great area to study. Let's do it hands-on, applied a science project together and walk through a student retention model to extract actionable insights and help out struggling students.
Explore student data
Model student behavior using XGBoost
Predict struggling/at-risk students
Identify what makes a struggling student different than successful students
Build a report of actionable insights
And help teachers help students
In the case of a student retention model, looking at the full picture means doing a lot of work before doing any modeling. For example, talking to teachers. We need to better understand the business domain. In this case, finding out what are the problems they face. What are the uncertainties they'd like help with? It is critical to also leverage all their knowledge, like how and when do they determine that a student is at-risk. What data points and triggers do they use to identify someone that could be failing a class and/or their studies. How early can they identify this? Obviously the earlier the better, you don't want to wait till have too many bad grades and can't dig themselves out of the hole.
After you've distilled all that information in the model, we dig down into the observation level. This is an important point to understand. A model may return feature importance, coefficients, or weights depending on what type of model you use and how it learns. So, imagine a model that predicts heart attacks and finds that older age is the most important feature for the model, and if your patient is young, that's not going to tell them anything, worse, may lead them to misdiagnose.
Instead, we let the model give us a prediction of the likelihood of something happening, then we dig down to the observation level (i.e. each specific patient or student level) where each case is different and unique and analyze what makes this particular patient/student different from the rest. This may yield some useful information that may allow the professional to better assist - that is actionable insight.
- Those interested in stepping up their practical machine learning and analytics knowledge
- Those interested in getting more out of their machine learning projects
- Exploratory Data Analysis - Student Performance Data Set
- Data Preparation & Feature Engineering
- Modeling with XGBoost
- Building Our Actionable Report
- Better Reporting with Seaborn Charts
- Conclusion & Bonus
Data scientist with over 20-years experience in the tech industry, MAs in Predictive Analytics and International Administration, author of Monetizing Machine Learning and The Little Book of Fundamental Indicators, founder of FastML, reached top 1% on Kaggle and awarded "Competitions Expert" title, taught over 20,000 students on Udemy and VP of Data Science at SpringML.
From consulting in machine learning, healthcare modeling, 6 years on Wall Street in the financial industry, and 4 years at Microsoft, I feel like I’ve seen it all. And this has opened my eyes to the huge gap in educational material on applied data science. Like I say:
"It just ain’t real 'til it reaches your customer’s plate"
I am a startup advisor and available for speaking engagements with companies and schools on topics around building and motivating data science teams, and all things applied to machine learning.
Reach me at firstname.lastname@example.org