EncartaLabs

HPE Data Fabric Cluster Administration

( Duration: 3 Days )

The HPE Data Fabric Cluster Administration training course provides the knowledge and skills required to plan, install, maintain, and manage a secure Data Fabric cluster, then manage YARN jobs. You learn how to design and install a cluster and perform pre- and post-installation testing. You configure users and groups, and work with key features of a Data Fabric cluster such as volumes, snapshots, and mirrors, including how to use remote mirrors for disaster recovery. The course also covers monitoring and maintaining disks and nodes, and troubleshooting basic cluster problems. Finally, you install YARN services and practice configuring logging and job schedulers.

By attending the HPE Data Fabric Cluster Administration workshop, delegates will learn to:

  • Audit and prepare cluster hardware prior to installation
  • Run pre-installation tests to verify performance
  • Plan a service layout according to cluster configuration and business needs
  • Describe the primary architectural components of an HPE Data Fabric installation (nodes, storage pools, volumes, containers, chunks, blocks)
  • Use the UI installer to install the HPE Data Fabric distribution
  • Define and implement an appropriate node topology
  • Define and implement an appropriate volume topology
  • Set permissions and quotas for users and groups
  • Set up email and alerts
  • Set up log aggregation for YARN
  • Locate and manipulate configuration files used by the cluster
  • Start and stop services
  • Use Hadoop commands to perform basic functions
  • Use maprcli commands to perform basic functions
  • Use the MCS
  • Assist with data ingestion
  • Configure, monitor, and respond to alerts
  • Detect and replace failed disks
  • Detect and replace failed nodes
  • Create and delete snapshots using both maprcli and the MCS
  • Create and delete mirrors using both maprcli and the MCS
  • Use mirrors and snapshots for data protection
  • Create and implement a disaster recovery plan
  • Add, remove, and upgrade ecosystem components
  • Monitor and tune job performance
  • Configure appropriate job scheduling (FIFO, Fair Scheduler, Capacity Scheduler, Label-based scheduling, Express Lane)
  • Set up NFS access to the cluster

PREREQUISITES

  • Basic Hadoop knowledge and intermediate Linux knowledge
  • Experience using a Linux text editor such as vi
  • Familiarity with Linux command-line tools such as mv, cp, ssh, grep, and useradd

  • This HPE Data Fabric Cluster Administration class is ideal for System Administrators who will be creating and maintaining a Hadoop cluster environment.

COURSE AGENDA

1

Introduction to the HPE Data Fabric

  • Key components of HDFS
  • Key components of Data Fabric File System
  • Data Fabric File System versus HDFS
2

Prepare for Installation

  • Security modes
  • Planning the service layout
  • Preparing cluster hardware
  • Testing nodes
3

Install the Data Fabric

  • The Installer
  • Performing a manual installation
  • Licensing the cluster
4

Verify and Test the Cluster

  • Verifying cluster status
  • Post-installation benchmark tests
  • Cluster structure
5

Work with Volumes

  • About volumes
  • Volume placement (topology)
  • Attributes for standard volumes
  • Designing a volume plan
  • Creating and configuring volumes
6

Work with Snapshots

  • How snapshots work
  • Working with snapshots
  • Using and maintaining snapshots
7

Work with Mirrors

  • How mirrors work
  • Working with local mirrors
  • Working with remote mirrors
  • Remote mirrors and disaster recovery
8

Configure Users and Cluster Parameters

  • Managing users and groups
  • Access Control Expressions (ACEs)
  • User and group quotas
  • Configuring topology and email
9

Configure Cluster Access

  • Accessing cluster data
  • Virtual IP addresses
  • Client access
10

Monitor and Manage the Cluster

  • Using the MCS and CLI
  • Monitoring
  • Responding to alarms
11

Disk and Node Maintenance

  • Adding disks
  • Replacing failed disks
  • Node maintenance
  • Adding nodes
12

Troubleshoot Cluster Problems

  • Basic troubleshooting
  • Tools and utilities
13

Install and Configure YARN

  • YARN services
  • YARN job execution flow
  • Configuring YARN
  • Configuring YARN logging
14

Job Schedulers

  • Overview of job schedulers
  • Configuring the capacity scheduler
  • Configuring the fair scheduler
  • Label-based scheduling

Encarta Labs Advantage

  • One Stop Corporate Training Solution Providers for over 4,000 Modules on a variety of subjects
  • All courses are delivered by Industry Veterans
  • Get jumpstarted from newbie to production-ready in a matter of a few days
  • Trained more than 50,000 Corporate executives across the Globe
  • All our trainings are conducted in workshop mode with more focus on hands-on sessions

View our other course offerings by visiting http://encartalabs.com/course-catalogue-all.php

Contact us for delivering this course as a public/open-house workshop/online training for a group of 10+ candidates.
