
Make sure you have a Google account and can access http://colab.research.google.com. I've provided links to the starter and final notebooks. Let's get after it!
https://colab.research.google.com/drive/15cKoyNt9U89yuU9rONnxSOy9p2WLENki?usp=sharing
Start here! Happy Birthday
Learning to wrangle text is a key skill - let's get started!
Let's get our feet wet with regular expressions, too.
https://colab.research.google.com/drive/1_7PJTyY6S-bFwmPEOPrxTAhkwOv_KcOJ?usp=sharing
Sometimes regular expressions can get tricky, especially if we're building them dynamically. Let's use AI to help us solve our problem.
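To make the "building them dynamically" part concrete, here is a minimal sketch of one way to assemble a pattern from a list of keywords. The helper name `build_keyword_pattern`, the keyword list, and the sample sentence are all made up for illustration and are not taken from the notebook.

```python
import re

def build_keyword_pattern(keywords):
    """Compile one case-insensitive pattern that matches any keyword as a whole word."""
    # re.escape stops characters such as '+' or '.' from being read as regex syntax.
    # Longer keywords go first so that 'data science' wins over plain 'data'.
    escaped = [re.escape(k) for k in sorted(keywords, key=len, reverse=True)]
    # The lookarounds act as word boundaries that also work for keywords ending in '+'.
    return re.compile(r"(?<!\w)(?:" + "|".join(escaped) + r")(?!\w)", re.IGNORECASE)

# Hypothetical keywords and text, purely for illustration.
pattern = build_keyword_pattern(["C++", "data", "data science"])
matches = pattern.findall("Data science teams often use C++ alongside Python for data work.")
print(matches)
```

Escaping each piece before joining is the step that is easiest to forget when a pattern is built from user input or from data.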
Let's think about performance analysis and think about how we might compare each sentence to all other sentences.
https://colab.research.google.com/drive/1_7PJTyY6S-bFwmPEOPrxTAhkwOv_KcOJ?usp=sharing
Rather than write 2 for loops, let's use itertools.combinations to help us iterate over all pairs of sentences and then compute the length of token intersections while storing them and creating a dataframe later on.
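A rough sketch of that idea, assuming simple whitespace tokenization and three made-up sentences (the notebook may tokenize differently and uses its own data):

```python
from itertools import combinations

# Hypothetical sentences, purely for illustration.
sentences = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "a bird flew over the house",
]

# Tokenize once up front so each sentence is split only one time.
token_sets = [set(s.lower().split()) for s in sentences]

rows = []
# combinations(..., 2) yields every unordered pair exactly once,
# which replaces the nested for loops and skips self-comparisons.
for (i, a), (j, b) in combinations(enumerate(token_sets), 2):
    rows.append({"left": i, "right": j, "shared_tokens": len(a & b)})

for row in rows:
    print(row)
```

Passing `rows` to `pandas.DataFrame` turns the list of dicts into a dataframe with one row per pair. Keep in mind that n sentences produce n(n-1)/2 pairs, so the work grows quadratically with the number of sentences.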
https://colab.research.google.com/drive/1_7PJTyY6S-bFwmPEOPrxTAhkwOv_KcOJ?usp=sharing
Alright, we did it! As a bonus, try to go from mere token intersection length to Jaccard similarity! https://en.wikipedia.org/wiki/Jaccard_index#Overview
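If you want to check your bonus answer, here is one possible sketch. Jaccard similarity is the size of the intersection divided by the size of the union. The function name `jaccard` and the choice to return 0.0 for two empty inputs are my assumptions, not something fixed by the definition or the notebook.

```python
def jaccard(a, b):
    """Jaccard similarity of two token collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Both inputs are empty; returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return len(a & b) / len(union)

# Hypothetical sentences, purely for illustration.
score = jaccard("the cat sat on the mat".split(), "the dog sat on the log".split())
print(round(score, 3))
```

Unlike the raw intersection length, this score is normalized to the range 0 to 1, so long sentences do not look similar just because they contain more tokens.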
Final notebook https://colab.research.google.com/drive/1ZcO3JzfI59o1pY98ALqND3C5ifEyBs4y?usp=sharing - GREAT WORK
https://colab.research.google.com/drive/1dZmr6C9aNCsyj2JSGhc-YNRbYvjIpauH?usp=sharing
The article on XGBoost's automated categorical handling that I mentioned: https://developer.nvidia.com/blog/categorical-features-in-xgboost-without-manual-encoding/
https://docs.google.com/spreadsheets/d/1RaoRSl-OH7N8gs6htBRXR5aqqP9TQ5RlOYAo7GMm4D8/edit#gid=1218296526 is the document I'm using to describe business decisions and how they link to true positives, precision, recall, etc.
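As a quick refresher on how those metrics come out of the counts, here is a small sketch. The counts below are invented for illustration and do not come from the spreadsheet, and the helper name `precision_recall` is my own.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true positive, false positive, and false negative counts."""
    # Precision: of everything we flagged, how much was actually positive?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything actually positive, how much did we flag?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical counts, purely for illustration.
precision, recall = precision_recall(tp=40, fp=10, fn=20)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Which of the two matters more depends on the business decision: false positives cost you one way, false negatives another.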
Unlock the Power of Python for Real-World Data Science and Analytics
Are you ready to take your Python skills to the next level and tackle real-world data science and analytics challenges? Look no further than "Applied Python for Data Science and Analytics," a comprehensive Udemy course designed to bridge the gap between memorization and practical problem-solving.
In this course, you'll learn from Jeff James, a senior machine learning engineering manager with 15 years of applied data analytics and coding experience, who has also taught at the University of Denver. He will guide you through the complexities of the Python standard library, pandas, SciPy, and powerful machine learning libraries like scikit-learn, empowering you to solve open-ended problems with confidence.
Throughout the course, you'll dive deep into real-world scenarios, learning how to approach and solve challenges that go beyond the typical "table of contents" style video courses. You'll gain hands-on experience working with diverse datasets, applying advanced analytical techniques, and leveraging the full potential of Python's data science ecosystem.
Whether you're a data analyst, aspiring data scientist, or a developer looking to expand your skill set, this course will equip you with the tools and knowledge you need to excel in the field. You'll learn how to:
- Effectively utilize the Python standard library for data manipulation and analysis
- Harness the power of pandas for efficient data wrangling and exploration
- Apply statistical techniques using SciPy to gain deeper insights from your data
- Implement machine learning algorithms using scikit-learn to solve real-world problems
- Develop a problem-solving mindset to tackle open-ended challenges in data science and analytics
By the end of this course, you'll have a robust portfolio of projects showcasing your ability to apply Python to real-world data science and analytics problems. You'll be ready to take on complex challenges, drive data-driven decision-making, and make a tangible impact in your organization.
Don't miss this opportunity to learn from an experienced industry professional and elevate your Python skills to new heights. Enroll now in "Applied Python for Data Science and Analytics" and unlock your full potential in the world of data science and analytics!