Over the years, almost every organization has come to understand the importance of analyzing data.
In fact, it would not be an overstatement to say that “No organization will be able to survive today’s cut-throat competition if it does not analyze data.”
Data analysis, as we know it, is the process of taking source data, refining it into useful information, and then making predictions from it.
In this Learning Path, we will learn how to analyze data using the powerful toolset provided by Python.
Packt’s Video Learning Paths are a series of individual video products put together in a logical and stepwise manner such that each video builds on the skills learned in the video before it.
Python features numerous numerical and mathematical toolkits such as NumPy, SciPy, scikit-learn, and other SciKits, all used for data analysis and machine learning. With the aid of these, Python has become the language of choice for many data scientists working on data analysis, visualization, and machine learning.
We will start with a general look at data analysis and then discuss web scraping tools and techniques in detail. We will present a rich collection of recipes that come in handy when you scrape a website using Python. These recipes address both usual and unusual scraping problems by diving deep into the capabilities of Python’s web scraping tools such as Selenium, BeautifulSoup, and urllib2.
We will then discuss visualization best practices. Effective visualization helps you get better insights from your data and make better-informed business decisions.
After completing this Learning Path, you will be well-equipped to extract data even from dynamic and complex websites using Python web scraping tools, and you will have a better understanding of data visualization concepts. You will also learn how to apply these concepts and work through the challenges that arise while implementing them.
To ensure that you get the best learning experience, this Learning Path combines the works of some of the leading authors in the business.
About the authors
Benjamin Hoff spent 3 years working as a software engineer and team leader doing graphics processing, desktop application development, and scientific facility simulation using a mixture of C++ and Python. This sparked a passion for software development and developmental programming and led him to explore state-of-the-art projects in natural language processing, facial detection/recognition, and machine learning.
Charles Clayton is the sole proprietor of crclayton technologies co, and an independent web developer. He is an experienced developer who specializes in Python web scraping solutions and tools such as Selenium, BeautifulSoup, and urllib2. He has also worked as a Reliability Engineer with West frazweer.
Dimitry Foures is a data scientist with a background in applied mathematics and theoretical physics. After completing his undergraduate studies in physics at ENS Lyon (France), he studied fluid mechanics at École Polytechnique in Paris, where he obtained a first-class Master’s degree. He holds a PhD in applied mathematics from the University of Cambridge. He currently works as a data scientist for a smart energy startup in Cambridge, in close collaboration with the university.
Giuseppe Vettigli is a data scientist who has worked in the research industry and academia for many years. His work focuses on developing machine learning models and applications that use information from structured and unstructured data. He also writes about scientific computing and data visualization in Python in his blogs.
is an experienced developer, with a strong background in Linux systems and a software engineering education. He is skilled in building scalable, data-driven, distributed, software-rich systems.