
Explore big data testing with Hadoop, Hive, Cassandra, and HBase, and learn how to start and stop services and use the file system to manage data workflows.
Discover what big data is and why it matters, and explore how testing big data systems with Hadoop, Hive, Cassandra, and HBase supports reliable analysis.
Explore what big data is and why we need big data technologies, focusing on storage and analysis and the challenges with databases.
Get an overview of Cassandra and its background in the context of big data testing, covering its role alongside Hadoop, Hive, and HBase.
Explore Cassandra's architecture and fault tolerance, highlighting no single point of failure, data replicated across multiple nodes, how its peer-to-peer design contrasts with master-slave roles, and load distribution.
Load data from text or CSV files into a Cassandra table by creating the table with columns and a primary key, then copying the data in from the file.
Explore the Cassandra set type by creating and manipulating sets of values, such as names and phone numbers, and inserting, updating, and querying them within a table.
Explore Cassandra map type collections, modeling a table with id, name, and a map column, and learn inserting, updating, and querying map values such as home and office.
Cassandra collections overview: use the list type to store multiple values in one column, with emails as examples, and insert list values alongside a primary key.
Learn how to drop and truncate tables in Cassandra by walking through dropping tables, inserting records, and querying data to understand schema and data state.
Explore starting the HBase service, using the HBase shell, and verifying status to run queries and view values with timestamps.
Learn how to create an HBase table, define column families, and insert employee data for big data testing.
Learn how to enable and disable an HBase table, and how to use describe to view table metadata, with practical steps from creating a table to handling common errors.
Investigate how HBase filters control which data to display from a table, using column families, prefixes, and value-based criteria to scan and reveal employee records.
Explore how ACID properties apply to HBase and examine the CAP theorem in distributed systems. Understand how isolation, consistency, and availability shape data guarantees in practice.
Understand Hadoop's running modes, starting with standalone mode that uses the local file system for input and output, and note that this mode is now obsolete.
This demo shows how to use HDFS commands in a Cloudera environment to list directories and files, view data, and create directories and files within the Hadoop cluster.
Practice hands-on with HDFS commands on Cloudera to read files, inspect directories, and verify file sizes and availability, gaining practical insight into HDFS data management.
Learn to copy files from your local system to a Hadoop cluster using HDFS commands, verify transfer with directory listings, and manage files and directories across local and cluster storage.
Master HDFS commands to copy files between local and HDFS using put and get, verify with ls, and manage data across a Cloudera cluster.
Learn to run HDFS commands on Ubuntu and start the Hadoop cluster by launching five daemons, including the NameNode, DataNodes, ResourceManager, and NodeManager.
Explore the data types that big data systems support and how relational databases interact with these formats alongside tools such as Hive, Cassandra, and HBase.
Explore Hive and its features, noting that it is designed for batch processing rather than online processing, in the context of big data testing.
Discover how to install Hive on an Ubuntu machine by downloading the Hive package, extracting it with tar, creating an installation folder, using sudo, and starting Hive after setup.
Learn to create databases and tables in Hive, insert data, and run Hive queries to explore schemas and content, essential for big data testing with the Hadoop ecosystem.
Learn how to create a managed Hive table and load data from the local file system, mapping columns such as name, country, and company, then query the table.
Create an external table in Hive, specify its location, and load data from external sources. Verify data at the specified location through the external table.
Explore Hive static and dynamic partitions to organize data by column values. Create static partitions in advance for a table like employee; use dynamic partitions when values are unknown.
Explore how Hive bucketing works by creating buckets, configuring enforced bucketing, and analyzing regions and bucket counts to manage dynamic data efficiently.
Explore Hive index implementation by creating a named index on a table and column, and learn how it affects query performance.
Discover techniques to compare two Hive tables by extracting distinct values, handling nulls, and applying unions and joins to identify differences and harmonize data.
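As a rough illustration of the table-comparison topic above, the idea can be sketched in plain Python rather than HiveQL. The sample rows and the `compare_tables` helper are hypothetical and are not taken from the course. The sketch treats each table as a bag of rows and reports the rows that appear on only one side.

```python
from collections import Counter


def compare_tables(left_rows, right_rows):
    """Return (rows only in left, rows only in right) as Counters.

    Rows are tuples. Counters keep duplicate counts, so a row that
    appears twice on one side and once on the other is still reported.
    """
    left = Counter(left_rows)
    right = Counter(right_rows)
    # Counter subtraction keeps only positive counts.
    return left - right, right - left


# Hypothetical sample data: (id, name, country); None stands in for NULL.
source = [(1, "Asha", "IN"), (2, "Ben", None), (3, "Chen", "CN")]
target = [(1, "Asha", "IN"), (2, "Ben", None), (3, "Chen", "SG")]

only_in_source, only_in_target = compare_tables(source, target)
print("Only in source:", dict(only_in_source))
print("Only in target:", dict(only_in_target))
```

Python treats `None == None` as true, so the row containing a missing country matches on both sides here. A plain SQL equality comparison does not match NULL with NULL, which is why null handling needs explicit attention when the same check is written as a join.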
This course is for testing professionals who want to build their careers in big data testing. I designed it so they can start learning big data technologies. Anyone who works in or is looking for a QA role, or who wants to move into the big data testing domain, should take this course and work through the complete tutorials.
I have included the material needed for a big data testing role. It covers all the necessary content, including practical examples selected according to common questions and their practical relevance.
It gives detailed information on topics such as big data, Hadoop, Hive, HBase, Cassandra, Unix, shell, and Pig, along with Agile, which a tester needs in order to move under the bigger umbrella of big data testing.
This course is well structured, presenting the elements of the different technologies in a practical manner and organized by topic. Students who want to move into big data testing to advance their careers should take this course.