
This video provides a detailed overview of the course and the topics it covers. It also includes a brief note on the trainer's profile and experience.
This video introduces Big Data in detail, covering its challenges and sources along with real-time example scenarios.
This video introduces Hadoop, the different roles in Hadoop, and the Hadoop distributors. A real-time example is discussed to demonstrate the roles in Hadoop.
This video gives an overview of the Hadoop ecosystem. It introduces each ecosystem component and its importance in the big data stack.
This video explains distributed architecture, how the big data storage problem is addressed using the HDFS architecture, and the daemon services of the Hadoop 1 architecture.
Continuing from the previous video, this video explains the HDFS architecture in detail.
Continuing from the previous video, this video explains in detail the concepts of edge nodes and cluster nodes, and the responsibilities of the JobTracker and the NameNode.
This video explains the disadvantages of the Hadoop 1 architecture and introduces the YARN architecture and its daemon services. It also explains in detail how the Hadoop 2 architecture overcomes the limitations of Hadoop 1.
This video provides a detailed explanation of how Hadoop handles NameNode failure.
This video explains the process of setting up Hadoop in pseudo-distributed mode. It also covers the software you need to download for the Hadoop quickstart virtual environment setup.
This video covers the Linux commands used to interact with HDFS. Using these basic commands, a user can store big data in HDFS and implement business logic on that data.
This video shows how to connect remotely to a cluster node or edge node from a Windows desktop using putty.exe, and how to transfer files from a Windows machine to a DataNode using WinSCP.
This video demonstrates how to set up a Hadoop environment in the Azure cloud.
This video demonstrates how to log in to Ambari.
This video explains the second approach to ingesting data from a remote machine to an edge node or cluster node, using the SFTP protocol on Linux and WinSCP on Windows.
This video introduces Apache Sqoop and the common and control arguments used in Sqoop commands.
This video provides hands-on experience of data ingestion from a MySQL RDBMS database to HDFS.
This video provides a practical demo of the incremental append scenario in Sqoop.
This video provides a practical demonstration of Sqoop commands for tasks such as querying, importing specific columns of records, and importing all the tables from a database.
This video covers Apache Flume, including its components, its architecture, and the properties used.
This video provides a practical demo of ingesting streaming data from an external source folder into HDFS using the spoolDir source property in Flume.
This video introduces Hive and shows how to create managed tables in Hive and load data into them.
This video provides a hands-on demonstration of creating external tables in Hive.
This video provides a diagrammatic explanation of the Hive architecture and its components. It also shows how to execute Hive queries in GUI mode using Hue.
This video provides a hands-on demo of partitioning data in Hive and the advantages of partitioning. It also discusses the types of partitioning available in Hive.
This video provides a detailed hands-on demo of dividing data into buckets and the properties to set for loading data into bucketed tables.
Various properties have to be set to enable certain Hive features that are disabled by default. For example, dynamic partitioning and loading data into bucketed tables require extra properties. This video deals with them.
The input dataset can also consist of XML documents. This demo shows how to process XML documents in Hive.
This video teaches JSON file processing in Hive with a detailed example.
Apart from the Hive CLI, which is deprecated, there is the Beeline shell for connecting to the Hive server. This video gives a hands-on demonstration of connecting with the Beeline shell and using it.
Hive supports various file formats, which differ in the compression techniques used and the resulting size. This video explains all of these file formats in detail.
This video demonstrates the various file formats in Hive with an example.
This video provides a hands-on demonstration of complex data types in Hive such as structs, unions, arrays, and maps.
This video explains the properties to set to enable update and delete operations in Hive.
Hive converts joins over multiple tables into a single map/reduce job if the same column is used in the join clauses for every table. This video demonstrates map-side joins in Hive.
This video demonstrates how to execute Hive scripts.
This video helps you understand the importance of Pig for data cleansing operations in Hadoop and the basic commands to start with.
This video explains the usage of the GROUP and COGROUP commands in Apache Pig.
This video provides a detailed explanation of the FILTER, JOIN, RANK, FLATTEN, ORDER BY, and DISTINCT commands in Pig.
In this video you will learn how to create and execute a Pig script.
This video demonstrates how to un-nest the tuples and bags in the input dataset.
This video demonstrates how to process a JSON file using Pig functions.
This video helps you understand the basic building blocks of programming and Java concepts.
The basic building blocks of core Java programming are explained in detail in this video, along with usage of the Eclipse IDE environment.
This video explains the inheritance, polymorphism, abstraction, and encapsulation properties in Java with a sample hands-on demo. Interfaces are also explained with an example program.
Data Analytics is the practice of using data to drive business strategy and performance. It includes a range of approaches and solutions, from looking backward to evaluate what happened in the past to looking forward to do scenario planning and predictive modelling. Data Analytics spans all of the functional businesses to address a continuum of opportunities in Information Management, Performance Optimisation and Analytic Insights. Organizations now realize the inherent value of transforming this big data into actionable insights. Data science is the highest form of big data analytics, producing the most accurate actionable insights by identifying what will happen next and what to do about it.
Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. It provides massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs. Hadoop is not just an effective distributed storage system for large amounts of data, but also, importantly, a distributed computing environment that can execute analyses where the data is.
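As a rough illustration of the distributed computing idea described above, the following sketch imitates the map, shuffle, and reduce steps of a word count in plain Python. It does not use Hadoop or any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample blocks are hypothetical, and each string simply stands in for a block of data that a real cluster would process on the node where it is stored.

```python
from collections import defaultdict


def map_phase(block):
    """Emit a (word, 1) pair for every word in one block of text."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(pairs):
    """Group the emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the grouped values to get one count per word."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(blocks):
    """Run map on each block separately, then shuffle and reduce the results."""
    mapped = []
    for block in blocks:  # on a real cluster, each block is mapped on its own node
        mapped.extend(map_phase(block))
    return reduce_phase(shuffle(mapped))


# Hypothetical sample data: two "blocks" of a larger file.
blocks = ["big data needs storage", "hadoop stores big data"]
counts = word_count(blocks)
print(counts)
```

In this sketch everything runs in one process, so it only shows the shape of the computation. The point of the real framework is that the map step runs in parallel next to each stored block, and only the much smaller intermediate pairs travel over the network.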
This course provides a detailed explanation of the Hadoop framework and its ecosystem. All the concepts are explained in detail with examples and business use cases as case studies. It also covers recent technologies in the big data area such as Apache Spark, Apache Kafka, and MongoDB. In addition, interview questions for each ecosystem component and resume preparation tips are included.