This course is part of the “Big Data Internship Program,” which is aligned with the stages of a typical Big Data project life cycle.
This course focuses on the ingestion stage of a Big Data project.
Our course is divided into two parts: 1) technical knowledge with examples and 2) work on a project.
In this video, we explain what data ingestion is, how data is processed, the challenges of data ingestion, and the key functions of data ingestion.
This Part 1 course focuses on the foundations of Big Data. It covers technical topics such as data ingestion concepts and tools, Hadoop file formats, Sqoop, and Flume.
Part 1 is available free here.
In this video, we explain what data ingestion is and the tools available on the market.
In this video, we explain data ingestion tools such as Kafka, Chukwa, and Storm.
This video shows the different types of file formats supported in Hadoop.
CSV/text files are quite common and are often used for exchanging data between Hadoop and external systems.
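For instance, exchanging a CSV file between a local system and HDFS is a single shell command each way (the file and paths below are hypothetical):

    # Copy a local CSV file into HDFS, then back out again
    hdfs dfs -put sales.csv /data/landing/sales.csv
    hdfs dfs -get /data/landing/sales.csv ./sales_copy.csv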
This video shows that sequence files store data in a binary format with a structure similar to CSV. Like CSV, sequence files do not store metadata with the data, so the only schema evolution option is appending new fields.
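As a quick illustration (path hypothetical), the HDFS shell can decode a sequence file for inspection, because the -text option understands the SequenceFile binary format:

    # Print a binary sequence file as text; plain -cat would show raw bytes
    hdfs dfs -text /data/landing/events.seq | head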
Avro files are quickly becoming the best multi-purpose storage format within Hadoop. Avro files store metadata with the data and also allow specification of an independent schema for reading the file. Here we show you all about this file format.
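For example, because the schema travels inside the file, the Avro command-line tools can recover it directly. A sketch, assuming the avro-tools jar is available and the path is hypothetical:

    # Extract the embedded schema, then dump the records as JSON
    hadoop jar avro-tools.jar getschema /data/landing/users.avro
    hadoop jar avro-tools.jar tojson /data/landing/users.avro | head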
RC Files, or Record Columnar Files, were the first columnar file format adopted in Hadoop. Like columnar databases, the RC file enjoys significant compression and query performance benefits. ORC Files, or Optimized RC Files, were invented to optimize performance in Hive and are primarily backed by Hortonworks. This video covers these two file formats.
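In Hive, choosing ORC is just a storage clause on the table definition. A minimal sketch with a hypothetical table:

    -- Hive DDL: the STORED AS clause selects the columnar ORC format
    CREATE TABLE page_views (user_id INT, url STRING, ts TIMESTAMP)
    STORED AS ORC;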
Parquet files are yet another columnar file format, which grew out of Hadoop creator Doug Cutting’s Trevni project. Like RC and ORC, Parquet enjoys compression and query performance benefits, though it is generally slower to write than non-columnar file formats. In this video you can learn more about this file format.
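To inspect a Parquet file, the parquet-tools utility can print its schema and metadata. A sketch, assuming parquet-tools is installed and the file path is hypothetical:

    # Show the column schema and the file-level metadata of a Parquet file
    parquet-tools schema /data/orders/part-m-00000.parquet
    parquet-tools meta /data/orders/part-m-00000.parquet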
In this video, we explain what Sqoop is, what Flume is, the Sqoop workflow, and the Sqoop architecture.
In this video, we explain what the import command is and how a Sqoop import command is executed.
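A minimal sketch of the command (the connection string, credentials, and table name are hypothetical):

    # Import one RDBMS table into HDFS as delimited text files
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/shop \
      --username student -P \
      --table customers \
      --target-dir /user/student/customers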
In this video, we explain how to execute commands in the terminal, how to get the list of databases, how to get the list of tables, and how to import data into HDFS.
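For example (server and credentials hypothetical), the list commands take the same connection arguments as import:

    # Show the databases visible to this user on the MySQL server
    sqoop list-databases --connect jdbc:mysql://dbhost:3306 --username student -P

    # Show the tables inside one database
    sqoop list-tables --connect jdbc:mysql://dbhost:3306/shop --username student -P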
In this video, we explain how to run Sqoop commands, the structure of a Sqoop command, and the parameters used when executing Sqoop commands.
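As an illustration of the general structure (sqoop TOOL connection-arguments tool-arguments), here is a more tuned import with a few common parameters, all values hypothetical:

    # 4 parallel map tasks, a row filter, and a custom field delimiter
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/shop \
      --username student -P \
      --table orders \
      --where "order_date >= '2018-01-01'" \
      --fields-terminated-by '\t' \
      --num-mappers 4 \
      --target-dir /user/student/orders_2018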
In this video, we explain what Sqoop export is and how it is used.
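A minimal sketch (table and directory hypothetical); export assumes the target table already exists in the database:

    # Push HDFS files back into an existing RDBMS table
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/shop \
      --username student -P \
      --table order_reports \
      --export-dir /user/student/reports \
      --input-fields-terminated-by '\t'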
In this video, we explain what Sqoop jobs are, how and when they are used, how to create jobs, and how to list the available Sqoop jobs.
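A sketch with a hypothetical job name:

    # Save an import definition under a name (note the space after the bare --)
    sqoop job --create customers_job -- import \
      --connect jdbc:mysql://dbhost:3306/shop \
      --username student -P \
      --table customers \
      --target-dir /user/student/customers

    # List the saved jobs, then run one by name
    sqoop job --list
    sqoop job --exec customers_job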
In this video, we explain what incremental Sqoop import is, how it works, and what the incremental import parameters are.
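A sketch of append mode (column names and values hypothetical):

    # Import only rows whose id is greater than the saved --last-value
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/shop \
      --username student -P \
      --table customers \
      --incremental append \
      --check-column id \
      --last-value 1000 \
      --target-dir /user/student/customers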
In this video, we explain how incremental import works and how to append new data to an already imported table.
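Besides append mode, Sqoop offers lastmodified mode for tables whose rows are updated in place. A sketch, with hypothetical column names:

    # Pick up rows whose timestamp column changed since the saved value;
    # --merge-key folds updated rows into the existing files by primary key
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/shop \
      --username student -P \
      --table customers \
      --incremental lastmodified \
      --check-column updated_at \
      --last-value "2018-06-01 00:00:00" \
      --merge-key id \
      --target-dir /user/student/customers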
In this video, we explain what Flume is, where it is used, and the difference between Flume and Sqoop.
In this video, we explain how Flume works, what a Flume agent is, what the components of a Flume agent are, and how data flows between the various components of Flume.
In this video, we explain the components of Flume and how they are configured, i.e., how a Flume agent is configured.
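A minimal agent definition in the Flume properties format, close in spirit to the standard example from the Flume documentation (agent and component names hypothetical):

    # One source, one channel, one sink, wired together
    a1.sources = r1
    a1.channels = c1
    a1.sinks = k1

    # Source: listen for lines of text on a local TCP port
    a1.sources.r1.type = netcat
    a1.sources.r1.bind = localhost
    a1.sources.r1.port = 44444
    a1.sources.r1.channels = c1

    # Channel: buffer events in memory
    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 1000

    # Sink: write events to the agent's log
    a1.sinks.k1.type = logger
    a1.sinks.k1.channel = c1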
In this video, we explain how to run a Flume agent and inspect the result.
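Assuming the configuration above is saved as example.conf, the agent can be started like this:

    # Start the agent named a1 and log events to the console
    flume-ng agent --conf conf --conf-file example.conf \
      --name a1 -Dflume.root.logger=INFO,console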
In this video, we explain what a multi-agent Flume setup is and how Flume consolidation works.
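A sketch of the two-hop wiring (host names and ports hypothetical): the first agent forwards events over Avro RPC, and the collector agent receives them with a matching Avro source.

    # On the first-hop agent: an Avro sink pointing at the collector
    a1.sinks.k1.type = avro
    a1.sinks.k1.hostname = collector-host
    a1.sinks.k1.port = 4545
    a1.sinks.k1.channel = c1

    # On the collector agent: an Avro source listening on the same port
    a2.sources = r1
    a2.sources.r1.type = avro
    a2.sources.r1.bind = 0.0.0.0
    a2.sources.r1.port = 4545
    a2.sources.r1.channels = c1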
In this video, we explain what multiplexing is, the uses of multiplexing, channel selectors, etc.
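A sketch of a multiplexing channel selector (header name and values hypothetical): events are routed to a channel according to the value of one event header.

    # Route events to c1 or c2 based on the "region" header
    a1.sources.r1.channels = c1 c2
    a1.sources.r1.selector.type = multiplexing
    a1.sources.r1.selector.header = region
    a1.sources.r1.selector.mapping.US = c1
    a1.sources.r1.selector.mapping.EU = c2
    a1.sources.r1.selector.default = c1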
In this video, we explain what an interceptor is, why it is used, how it is configured, how it runs, and what the types of interceptors are.
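As a small example, two of Flume's built-in interceptors can be chained onto a source (component names hypothetical):

    # Stamp each event with an ingest timestamp and the agent's hostname
    a1.sources.r1.interceptors = i1 i2
    a1.sources.r1.interceptors.i1.type = timestamp
    a1.sources.r1.interceptors.i2.type = host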
In this video, we explain what recommendation is, using book recommendation concepts as the working example.
In this video, we show you how to load data into MySQL and then how to import that data into HDFS through Sqoop commands.
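A sketch of the two steps (database, table, and file names hypothetical); note that LOAD DATA LOCAL INFILE may require the client to be started with local-infile enabled:

    # Load the ratings file into MySQL ...
    mysql --local-infile=1 -u student -p books_db \
      -e "LOAD DATA LOCAL INFILE 'ratings.csv' INTO TABLE ratings \
          FIELDS TERMINATED BY ',' IGNORE 1 LINES;"

    # ... then pull the table into HDFS with Sqoop
    sqoop import \
      --connect jdbc:mysql://localhost:3306/books_db \
      --username student -P \
      --table ratings \
      --target-dir /user/student/ratings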
In this video, we explain what a script is and how we can execute our job using a shell script.
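A minimal sketch of such a wrapper script (the job name is hypothetical, reusing the saved Sqoop job from earlier):

    #!/bin/bash
    # run_ingest.sh: run the saved Sqoop job and report the outcome
    set -e
    sqoop job --exec customers_job
    echo "Import finished at $(date)"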
In this video, we show how the book recommendation works and how the ratings are generated in HDFS through Flume.
Big Data Trunk is a leading Big Data focused consulting and training firm founded by industry veterans in the data domain. It helps its customers gain a competitive advantage from open source, big data, cloud, and advanced analytics technologies. It provides services such as strategy consulting, advisory consulting, and high-quality classroom training for individuals and corporations.