EncartaLabs

Hortonworks - Administration

( Duration: 3 Days )

COURSE AGENDA

1

HDFS

  • Understand HDFS architecture
  • Understand how the NameNode maintains the file-system metadata
  • Understand how data is stored in HDFS
  • Understand the relationship between NameNodes and DataNodes
  • Understand the relationship between NameNodes and namespaces in Hadoop 2.0
  • Understand the WebHDFS commands
  • Understand the various “hadoop fs” commands
2

Install and Configure HDP

  • Understand the minimum hardware and software requirements
  • Understand how to set up a local repository for HDP installation
  • Understand how to install HDP using Apache Ambari
  • Understand differences between master and slave services
  • Understand complete deployment layout
  • Understand how to configure and manage different services
  • Understand different configuration parameters
3

Ensure Data Integrity

  • Understand the block-scanning report
  • Run file-system check
  • Understand replication factor, under-replication, and over-replication
  • Set up NFS Gateway to access HDFS data
4

YARN Architecture and MapReduce

  • Understand the architecture of YARN
  • Understand the components of the YARN ResourceManager
  • Demonstrate the relationship between NodeManagers and ApplicationMasters
  • Demonstrate the relationship between ResourceManagers and ApplicationMasters
  • Explain the relationship between Containers and ApplicationMasters
  • Explain how Container failure is handled for a YARN MapReduce job
  • Understand the architecture of MapReduce
  • Understand the various phases of a MapReduce job
5

Job Schedulers and Enterprise Data Movement

  • Understand the concept of job scheduling
  • Configure the capacity scheduler
  • Understand the difference between the capacity and fair schedulers
  • Understand various data ingestion mechanisms for Hadoop
  • Explain the difference between traditional and Hadoop-based ETL platforms
  • Use the distcp command to move data between two clusters
  • Understand Hive architecture
  • Move data between a traditional database and Hadoop using Apache Sqoop
  • Explain Hive/MR vs. Hive/Tez
  • Stream data using Apache Flume
  • Configure workflows and deployment using Apache Oozie
6

Monitor and Administer Clusters

  • Monitor using the Ambari UI, Ganglia, and Nagios
  • Commission and decommission nodes
  • Back up and recover Hadoop data
  • Use Hadoop snapshots
  • Understand rack awareness and topology
  • Understand NameNode high availability
  • Use the “hdfs haadmin” commands
7

Secure HDP

  • Understand security concepts
  • Configure Kerberos
  • Configure HDP authorization and authentication

Encarta Labs Advantage

  • One-stop corporate training solution provider for over 4,000 modules on a variety of subjects
  • All courses are delivered by Industry Veterans
  • Get jump-started from newbie to production-ready in a matter of a few days
  • Trained more than 50,000 corporate executives across the globe
  • All our trainings are conducted in workshop mode with more focus on hands-on sessions

View our other course offerings by visiting http://encartalabs.com/course-catalogue-all.php

Contact us to have this course delivered as a public/open-house workshop or online training for a group of 10+ candidates.
