Getting Started with Apache Flink
- Architecture of Apache Flink
- Features of Apache Flink
- Job Manager
- Task Manager
- Job Client
- Set Path in Environment Variables
- Installation on Ubuntu
- Multiple Java Installation
Apache Flink is an open-source platform for distributed stream and batch data processing. It runs on Windows, macOS, and Linux. In this blog post, let's discuss how to set up a Flink cluster locally. Flink is similar to Spark in many ways: like Apache Spark, it provides APIs for graph processing and machine learning. But Apache Flink and Apache Spark are not the same.
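Before diving into the architecture, here is a minimal sketch of the local cluster setup mentioned above. The version number and download URL below are assumptions for illustration; check the Apache Flink downloads page for the current release before running these commands.

```shell
# Download a Flink binary release.
# The version below is an assumption; pick the current release
# from the Apache Flink downloads page.
wget https://archive.apache.org/dist/flink/flink-1.17.2/flink-1.17.2-bin-scala_2.12.tgz
tar -xzf flink-1.17.2-bin-scala_2.12.tgz
cd flink-1.17.2

# Start a local cluster (one JobManager and one TaskManager).
./bin/start-cluster.sh

# The web dashboard is then available at http://localhost:8081.
# Stop the cluster when finished:
./bin/stop-cluster.sh
```

This is a setup fragment rather than a program; the same scripts are what a multi-node deployment builds on.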
Flink is an alternative to MapReduce and can process data far faster; benchmarks have reported speedups of up to 100x on some workloads. Flink is independent of Hadoop, but it can use HDFS to read, write, store, and process data. Flink does not provide its own data storage system; it takes data from distributed storage. Development of Flink started in 2009 at the Technical University of Berlin as a research project called Stratosphere. It entered the Apache Incubator in April 2014 and became a top-level project in December 2014. "Flink" is a German word meaning swift or agile. The logo of Flink is a squirrel, in harmony with the Hadoop ecosystem.
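Flink jobs are written as transformation pipelines over data: split, group, aggregate. To keep this sketch runnable without a Flink installation, the classic word-count dataflow is shown here with plain `java.util.stream`; Flink's DataStream API expresses the same shape with `flatMap`, `keyBy`, and `sum`, but the class and method names below are otherwise illustrative, not Flink API.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

public class WordCountSketch {
    // Count word occurrences the way a Flink flatMap -> keyBy -> sum
    // pipeline would. Plain java.util.stream is used here only so the
    // example runs without any Flink dependency.
    public static Map<String, Long> wordCount(String... lines) {
        return Arrays.stream(lines)
                // Split each line into lowercase words (flatMap step).
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\W+")))
                .filter(word -> !word.isEmpty())
                // Group by word and count (keyBy + sum step).
                .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> counts = wordCount("to be or not to be");
        System.out.println(counts);
    }
}
```

The key difference in real Flink is that the pipeline runs continuously over unbounded streams across a cluster, rather than once over an in-memory collection.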
- Big Data developers who want to analyse and process their data using Apache Flink
- Spark developers who want to upgrade their skills with Apache Flink