EncartaLabs

Cloudera Data Platform Private Cloud Base

( Duration: 4 Days )

This Cloudera Data Platform Private Cloud Base training course provides a comprehensive understanding of the steps necessary to operate and maintain on-premises clusters using Cloudera Manager. The course also covers installation, configuration, load balancing, and tuning.

By attending the Cloudera Data Platform Private Cloud Base workshop, delegates will learn to:

  • Install Cloudera Manager
  • Use Cloudera Manager to install a CDP Private Cloud Base cluster
  • Configure and monitor the cluster using Cloudera Manager
  • Understand, evaluate, and select the most appropriate data storage option
  • Optimize cluster performance
  • Perform routine cluster maintenance tasks
  • Detect, troubleshoot, and repair problems with the cluster

The Cloudera Data Platform Private Cloud Base class is ideal for:

  • Systems administrators who have at least basic Linux experience. Prior knowledge of CDP, or of earlier platforms such as Cloudera’s CDH or Hortonworks HDP, is not required.

COURSE AGENDA

1. Cloudera Data Platform

  • Industry Trends for Big Data
  • The Challenge to Become Data-Driven
  • The Enterprise Data Cloud
  • CDP Overview
  • CDP Form Factors

2. CDP Private Cloud Base Installation

  • Installation Overview
  • Cloudera Manager Installation
  • CDP Runtime Overview
  • Cloudera Manager Introduction

3. Cluster Configuration

  • Overview
  • Configuration Settings
  • Modifying Service Configurations
  • Configuration Files
  • Managing Role Instances
  • Adding New Services
  • Adding and Removing Hosts

4. Data Storage

  • Overview
  • HDFS Topology and Roles
  • HDFS Performance and Fault Tolerance
  • HDFS and Hadoop Security Overview
  • Working with HDFS
  • HBase Overview
  • Kudu Overview
  • Cloud Storage Overview

5. Data Ingest

  • Data Ingest Overview
  • File Formats
  • Ingesting Data using File Transfer or REST Interfaces
  • Importing Data from Relational Databases with Apache Sqoop
  • Ingesting Data Using NiFi
  • Best Practices for Importing Data

6. Data Flow

  • Overview of Cloudera Flow Management and NiFi
  • NiFi Architecture
  • Cloudera Edge Flow Management and MiNiFi
  • Controller Services
  • Apache Kafka Overview
  • Apache Kafka Cluster Architecture
  • Apache Kafka Command Line Tools

7. Data Access and Discovery

  • Apache Hive
  • Apache Impala
  • Apache Impala Tuning
  • Search Overview
  • Hue Overview
  • Managing and Configuring Hue
  • Hue Authentication and Authorization
  • CDSW Overview

8. Data Compute

  • YARN Overview
  • Running Applications on YARN
  • Viewing YARN Applications
  • YARN Application Logs
  • MapReduce Applications
  • YARN Memory and CPU Settings
  • Tez Overview
  • Hive on Tez
  • ACID for Hive
  • Spark Overview
  • How Spark Applications Run on YARN
  • Monitoring Spark Applications
  • Phoenix Overview

9. Managing Resources

  • Configuring cgroups with CPU Scheduling
  • The Capacity Scheduler
  • Managing Queues
  • Impala Query Scheduling

10. Planning Your Cluster

  • General Planning Considerations
  • Choosing the Right Hardware
  • Network Considerations
  • CDP Private Cloud Considerations
  • Configuring Nodes

11. Advanced Cluster Configuration

  • Configuring Service Ports
  • Tuning HDFS and MapReduce
  • Managing Cluster Growth
  • Erasure Coding
  • Enabling HDFS High Availability

12. Cluster Maintenance

  • Checking HDFS Status
  • Copying Data Between Clusters
  • Rebalancing Data in HDFS
  • HDFS Directory Snapshots
  • Host Maintenance
  • Upgrading a Cluster

13. Cluster Monitoring

  • Cloudera Manager Monitoring Features
  • Health Tests
  • Events and Alerts
  • Charts and Reports
  • Monitoring Recommendations

14. Cluster Troubleshooting

  • Overview
  • Troubleshooting Tools
  • Misconfiguration Examples

15. Security

  • Data Governance with SDX
  • Hadoop Security Concepts
  • Hadoop Authentication Using Kerberos
  • Hadoop Authorization
  • Hadoop Encryption
  • Securing a Hadoop Cluster
  • Apache Ranger
  • Apache Atlas
  • Backup and Recovery

16. Private Cloud / Public Cloud

  • CDP Overview
  • Private Cloud Capabilities
  • Public Cloud Capabilities
  • What is Kubernetes?
  • WXM Overview
  • Auto-scaling

Encarta Labs Advantage

  • One-stop corporate training solution provider with over 4,000 modules on a variety of subjects
  • All courses are delivered by industry veterans
  • Go from newbie to production-ready in a matter of a few days
  • More than 50,000 corporate executives trained across the globe
  • All our training is conducted in workshop mode, with an emphasis on hands-on sessions

View our other course offerings by visiting http://encartalabs.com/course-catalogue-all.php

Contact us to have this course delivered as a public or open-house workshop, or as online training, for a group of 10 or more candidates.
