EncartaLabs

Cloudera Administrator

(Duration: 5 Days)

This Cloudera Administrator training course for Apache Hadoop provides a comprehensive understanding of all the steps necessary to operate and maintain a Hadoop cluster using Cloudera Manager. The course also covers installation, configuration, load balancing, and tuning.

By attending the Cloudera Administrator workshop, delegates will learn:

  • Cloudera Manager features that make managing your clusters easier, such as aggregated logging, configuration management, resource management, reports, alerts, and service management.
  • The internals of YARN, MapReduce, Spark, and HDFS
  • Determining the correct hardware and infrastructure for your cluster
  • Proper cluster configuration and deployment to integrate with the data center
  • How to load data into the cluster from dynamically generated files using Flume and from an RDBMS using Sqoop
  • Configuring the FairScheduler to provide service-level agreements for multiple users of a cluster
  • Best practices for preparing and maintaining Apache Hadoop in production
  • Troubleshooting, diagnosing, tuning, and solving Hadoop issues

Prerequisites:

  • Basic Linux experience. Prior knowledge of Apache Hadoop is not required.

The Cloudera Administrator class is ideal for:

  • Systems administrators and IT managers who have basic Linux experience.

COURSE AGENDA

1

The Hadoop Distributed File System (HDFS)

  • HDFS Features
  • Writing and Reading Files
  • NameNode Memory Considerations
  • Overview of HDFS Security
  • Web UIs for HDFS
  • Using the Hadoop File Shell
2

MapReduce and Spark on YARN

  • The Role of Computational Frameworks
  • YARN: The Cluster Resource Manager
  • MapReduce Concepts
  • Apache Spark Concepts
  • Running Computational Frameworks on YARN
  • Exploring YARN Applications Through the Web UIs and the Shell
  • YARN Application Logs
3

Hadoop Configuration and Daemon Logs

  • Cloudera Manager Constructs for Managing Configurations
  • Locating Configurations and Applying Configuration Changes
  • Managing Role Instances and Adding Services
  • Configuring the HDFS Service
  • Configuring Hadoop Daemon Logs
  • Configuring the YARN Service
4

Getting Data Into HDFS

  • Ingesting Data From External Sources With Flume
  • Ingesting Data From Relational Databases With Sqoop
  • REST Interfaces
  • Best Practices for Importing Data
5

Planning Your Hadoop Cluster

  • General Planning Considerations
  • Choosing the Right Hardware
  • Virtualization Options
  • Network Considerations
  • Configuring Nodes
6

Installing and Configuring Hive, Impala, and Pig

  • Hive
  • Impala
  • Pig
7

Hadoop Clients Including Hue

  • What Are Hadoop Clients?
  • Installing and Configuring Hadoop Clients
  • Installing and Configuring Hue
  • Hue Authentication and Authorization
8

Advanced Cluster Configuration

  • Advanced Configuration Parameters
  • Configuring Hadoop Ports
  • Configuring HDFS for Rack Awareness
  • Configuring HDFS High Availability
9

Hadoop Security

  • Why Hadoop Security Is Important
  • Hadoop’s Security System Concepts
  • What Kerberos Is and How It Works
  • Securing a Hadoop Cluster With Kerberos
  • Other Security Concepts
10

Managing Resources

  • Configuring cgroups with Static Service Pools
  • The Fair Scheduler
  • Configuring Dynamic Resource Pools
  • YARN Memory and CPU Settings
  • Impala Query Scheduling
11

Cluster Maintenance

  • Checking HDFS Status
  • Copying Data Between Clusters
  • Adding and Removing Cluster Nodes
  • Rebalancing the Cluster
  • Directory Snapshots
  • Cluster Upgrading
12

Cluster Monitoring and Troubleshooting

  • Cloudera Manager Monitoring Features
  • Monitoring Hadoop Clusters
  • Troubleshooting Hadoop Clusters
  • Common Misconfigurations

Encarta Labs Advantage

  • One-stop corporate training solution provider offering over 4,000 modules on a variety of subjects
  • All courses are delivered by Industry Veterans
  • Get jumpstarted from newbie to production ready in a matter of a few days
  • Trained more than 50,000 Corporate executives across the Globe
  • All our training is conducted in workshop mode, with a strong focus on hands-on sessions

View our other course offerings by visiting http://encartalabs.com/course-catalogue-all.php

Contact us to have this course delivered as a public/open-house workshop or as online training for a group of 10+ candidates.
