
This lecture aims to introduce the course to students
This lecture aims to explain some features of the platform used to watch the course
This lecture aims to introduce the concept of machine learning and how it works
This lecture aims to introduce the main types of variables used in statistics and data science
This lecture aims to explain the concepts of label encoding and one-hot encoding
This lecture aims to introduce the concept of data scaling and its importance
This lecture aims to introduce the concepts of supervised and unsupervised learning
This lecture aims to introduce the concept of training and how a dataset can be divided. It also explains the concepts of overfitting and underfitting
This lecture aims to introduce the main types of graphs used to represent scientific data
This lecture aims to introduce the main types of graphs used to represent scientific data (part 2)
This lecture aims to introduce a dimensionality reduction technique called PCA (principal component analysis)
This lecture aims to introduce the concept of classification and its main algorithms
This lecture aims to explain the evaluation of classification algorithms
This lecture aims to explain the Naive Bayes algorithm
This lecture aims to explain the Laplace correction
This lecture aims to use a scientific example as a case study for Naive Bayes
This lecture aims to explain the decision tree algorithm
This lecture aims to explain the concepts of entropy and information gain
This lecture aims to explain random forests, an enhancement over decision trees
This lecture aims to present a scientific case study for random forests
This lecture aims to explain the KNN (k-nearest neighbors) algorithm
This lecture aims to explain distance calculation in KNN
This lecture aims to present scientific examples of how the KNN algorithm could be used
This lecture aims to explain the algorithm SVM (Support Vector Machines)
This lecture aims to explain the effect of the margin in SVM
This lecture aims to present a few scientific examples to illustrate SVM
This lecture aims to introduce the concept of neural networks, their usefulness, and features
This lecture aims to explain more details about neural networks
This lecture aims to explain more details about neural networks, like gradient descent and activation functions
This lecture aims to introduce the concept of convolutional neural networks and how they are inspired by the visual cortex
This lecture aims to present the final remarks of the course, showing the main learning expectations and the possible next steps
The course "Principles of Data Science and Machine Learning for Natural Sciences" is designed to connect traditional scientific disciplines with the rapidly growing fields of Data Science (DS) and Machine Learning (ML). As research increasingly depends on large datasets and advanced computational methods, it’s becoming essential for scientists to know how to leverage DS and ML techniques to improve their work.
This course offers a solid introduction to the key concepts of Data Science and Machine Learning, specifically aimed at scientists and researchers in areas like biology, chemistry, physics, and environmental science. Participants will learn the basics of data analysis, including data collection, cleaning, and visualization, before moving on to machine learning algorithms that can help identify patterns and make predictions from data.
The course doesn’t require any programming skills and focuses on fundamental theoretical concepts. It's structured into six main sections:
1. Introduction
We'll start by introducing the course, covering its main features, content, and how to follow along.
2. Core DS/ML Concepts
We’ll go over basic concepts like variables, data scaling, training, datasets, and data visualization.
3. Classification
In this section, we’ll discuss key classification algorithms such as decision trees, random forests, Naive Bayes, and KNN, with examples of how they can be applied in scientific research.
4. Regression
We’ll briefly cover linear and multiple linear regression, discussing the main ideas and providing examples relevant to science.
5. Clustering
This section will focus on standard and hierarchical clustering methods, along with practical examples for scientific applications.
6. Neural Networks
Finally, we’ll introduce neural networks, discussing their biological inspiration and common architectures like Feedforward Neural Networks (FNN), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and Hopfield Networks.