
Course introduction, lecture overview, course objectives
What Databricks is: a preliminary introduction to Databricks, the company.
How to find the Databricks service on the Azure cloud
Understand how to create a workspace in Azure Databricks through hands-on practice
Understand how to create a Spark cluster in a workspace through hands-on practice
Understand how to create a notebook in a workspace through hands-on practice
Understand how to create a table in the notebook through hands-on practice
Understand how to delete a Spark cluster in a workspace through hands-on practice
Delete resources promptly to avoid running up your cloud costs
Understand why workspaces are used in Azure Databricks
Understand the importance of using resource groups in the Azure cloud
Understand how to choose the right Databricks Runtime
Understand the types of clusters
The history of Hadoop: its origin and development
Understand the advantages of Apache Spark and the scenarios in which Spark is used.
You will learn how to install and use virtual machines, paving the way for setting up Apache Spark locally
You will learn what PuTTY is, laying the groundwork for the exercises that follow
You will learn how to download and use WinSCP
You'll learn what HDP is and how to download it.
You will learn how to install HDP
You'll learn how to use SSH to connect to HDP and execute shell commands
You will learn how to operate HDP easily through WinSCP
You'll learn what a web shell is and how to use it.
You'll learn how to start up and manage the Hadoop ecosystem
Develop a Spark program and gain a preliminary understanding of Apache Spark
Spark's Architecture and Basic Concepts
You'll learn about the components of the Spark ecosystem
Developer tools help you develop Azure Databricks applications
Learn how to download and install Python and use the Databricks CLI
Installation of the Databricks CLI
Learn how to use Databricks CLI commands to operate on the workspace and DBFS
Learn how to use DBUtils to operate on DBFS and how to use magic commands
Understand how to download and install the JDK
Understand how to download and install the IDEA development tool
You'll learn how to develop a custom library in IDEA and install it into Databricks
You will learn how to use Azure Data Factory and how to integrate Azure Data Factory with Azure Databricks.
You'll learn how to analyze and debug notebook code in Azure Data Factory
Deployment and use of ETL use cases
How to verify and debug ETL
You will learn about the types and functions of notebooks
You'll learn about the powerful visualization features of notebooks in Databricks
On the first day of Spark + AI Summit 2019, Databricks announced a new open source project called Delta Lake to bring reliability to data lakes; it can be thought of as Data Lake 2.0. This lecture demonstrates how to use Delta Lake in Apache Spark.
You'll learn about Delta Lake on Databricks
You'll learn what Postman is and how to install it in two ways
Generate a token in Databricks for REST API authentication
Spark clusters can be created either manually or through the REST API
Understand how to delete a Spark cluster with the REST API
How to delete a Spark cluster completely and release its resources.
You'll learn about the history of AI, how neural networks came into being, and the trend of AI.
You will learn what machine learning is and key concepts in machine learning through animated stories.
How to use single-machine scikit-learn in a Spark cluster for machine learning
You'll learn how to tune machine learning models with distributed Spark
You will learn how to train a neural network model using TensorFlow on GPUs
The origin and architecture of distributed deep learning, and application trends of GPUs
Frameworks, methods, and architecture for the distributed deployment of TensorFlow
Introducing HorovodRunner for Distributed Deep Learning Training
You'll learn how to manage the end-to-end machine learning lifecycle
What is transfer learning, and how did it come into being?
Use GPU machines to practice transfer learning on Databricks
You'll learn what a virtual network is and how to create a virtual network for HDInsight
How to create a Kafka cluster using the HDInsight service in Azure
You will learn how to use your local computer to connect to the Kafka cluster in the Azure cloud
To enable communication between Kafka and Databricks, you need to modify Kafka's configuration
Kafka and Databricks are on different networks; you'll learn how to connect the two networks
You will learn how to use commands to create topics for Kafka in HDInsight
You will learn how to send messages to the Kafka cluster from Databricks
You'll learn how to receive Kafka messages in Databricks
You will understand the story of the birth of graph theory, the application scenarios of graph analysis, and the development of graph computing.
A Spark cluster does not include the GraphFrames library by default; you will learn how to install dependency libraries in a Spark cluster.
Understand how to create a graph in the Spark cluster through hands-on practice; you will also learn how to query vertices, edges, and degrees.
Microsoft Azure is one of the fastest-growing cloud platforms in the world. No prior Azure experience is required.
Azure Databricks is a unique collaboration between Microsoft and Databricks, forged to deliver Databricks’ Apache Spark-based analytics offering to the Microsoft Azure cloud. With Azure Databricks, you can be developing your first solution within minutes. Azure Databricks is a fast, easy and collaborative Apache Spark–based analytics service.
Databricks builds on top of Spark and adds:
Highly reliable and performant data pipelines
Productive data science at scale
In this course, you'll gain a strong understanding of Azure Databricks and learn how to use Spark SQL, machine learning, graph computing, and Structured Streaming in Azure Databricks.
Why Azure Databricks?
Productive: Launch your new Apache Spark environment in minutes.
Scalable: Globally scale your analytics and machine learning projects.
Trusted: Help protect your data and business with Azure AD integration, role-based controls and enterprise-grade SLAs.
Flexible: Build machine learning and AI solutions with your choice of language and deep learning frameworks.
This course contains both theory lectures (slides are attached for download and reference) and a significant number of hands-on demos that help you gain practical experience. It helps you lay a strong foundation in Microsoft Azure Cloud and Databricks.
In this course, you will not only learn Azure Databricks, but also learn and practice machine learning, streaming computing, graph analysis, and the installation and deployment of open source Apache Spark.