Apache Hadoop is a framework that allows for the distributed processing of massive data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Hadoop has established itself as a widely used platform for deploying cloud-based applications and services. The Hadoop ecosystem is large and includes such popular products as HDFS, MapReduce, HBase, ZooKeeper, Oozie, Pig, and Hive. However, with such versatility comes complexity, along with the difficulty of deciding on appropriate use cases.
The Hadoop Administration training course presents the building blocks of the Hadoop stack, with thorough coverage of each component. We begin by looking at Hadoop’s architecture and its underlying parts, with top-down identification of component interactions within the Hadoop ecosystem. The course then provides in-depth coverage of the Hadoop Distributed File System (HDFS), HBase, MapReduce, Oozie, Pig, and Hive. To reinforce concepts, each section is followed by a set of hands-on exercises.
By attending the Hadoop Administration workshop, participants will learn:
- Hadoop, HDFS, and the surrounding ecosystem.
- Data loading techniques using Sqoop and Flume.
- How to plan, implement, manage, monitor, and secure a Hadoop cluster.
- How to configure backup options, and how to diagnose and recover from node failures in a Hadoop cluster.
- How the ZooKeeper service works.
- How to secure a deployment and handle backup and recovery.
- HBase, Oozie, Hive, and Hue.
The Hadoop Administration class has the following prerequisites:
- Good knowledge of Linux is required.
- Fundamental Linux system administration skills are preferable: scripting (Perl/Bash), good troubleshooting skills, and an understanding of system capacity, bottlenecks, and the basics of memory, CPU, OS, storage, and networks.
- No prior knowledge of Apache Hadoop or Hadoop clusters is required.
This course is intended for production support database administrators, development database administrators, system administrators, software architects, data warehouse professionals, IT managers, software developers, and anyone interested in learning Hadoop cluster administration.