EncartaLabs

MapReduce

( Duration: 5 Days )

This MapReduce training course introduces distributed data processing and the use of MapReduce to process large amounts of data. With a focus on practical, hands-on exercises, the workshop teaches you to write complex MapReduce programs and to develop applications on YARN. Understanding the advanced features of MapReduce will help you use it to derive insights that benefit the business. The emergence of Big Data has created a need for large-scale parallel processing of data. MapReduce is a software framework that helps perform this processing in a scalable and fault-tolerant manner.

By attending this MapReduce workshop, attendees will learn:

  • About Hadoop and its use in parallel data processing
  • To write MapReduce programs that analyse Big Data and bring business benefits to the organization
  • The advanced features of MapReduce and YARN

Prerequisites

  • Hands-on experience in Core Java and good analytical skills

Who should attend

  • Software professionals, Java developers, analytics professionals, ETL developers, project managers, testers and other professionals who are keen to pursue a career in Big Data analytics

COURSE AGENDA

1. Revisiting Hadoop

  • Hadoop vs RDBMS
  • Core components of Hadoop
  • Hadoop Distributed File System
  • HDFS Architecture and MapReduce

2. Introduction to MapReduce

  • MapReduce in Hadoop
  • History of MapReduce
  • MapReduce applications
  • Data Flow in MapReduce
  • Map and Reduce operations
  • Job submission flow of MapReduce
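
As a rough illustration of the map and reduce operations and the data flow listed above, the sketch below counts words in plain Java. It does not use the Hadoop API: the class name `WordCountSketch` and its methods are hypothetical, and the in-memory `shuffle` step only imitates the grouping that the framework would perform between the map and reduce phases on a real cluster.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class for illustration only; plain Java, no Hadoop classes.
class WordCountSketch {

    // Map step: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle step: group all emitted values by key, with keys kept sorted.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce step: sum the values collected for a single key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs map, shuffle and reduce in sequence over all input lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(emitted).entrySet()) {
            counts.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {big=2, clusters=1, data=2, flows=1, needs=1}
        System.out.println(run(Arrays.asList("big data needs big clusters", "data flows")));
    }
}
```

In an actual Hadoop job the map and reduce steps would run as separate tasks across the cluster, and the grouping by key would be handled by the framework rather than by application code.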

3. Understanding MapReduce

  • Data Flow in MapReduce
  • MapReduce example
  • MapReduce Daemons
  • Job Tracker
  • Task Tracker
  • Other phases in MapReduce
  • Data Flow in single, multiple and no reduce task

4. MapReduce with YARN

  • Hadoop 1.x architecture
  • Problems with Hadoop 1.x
  • Hadoop 2.x features
  • YARN MR Application Execution Flow
  • YARN Workflow
  • Anatomy of MapReduce Program

5. Advanced MapReduce

  • Input Splits in MapReduce
  • Combiner
  • Partitioner
  • Demos on MapReduce
  • Counters
  • Distributed Cache
  • MRUnit
  • Reduce Join
  • Custom Input Format
  • Sequence File Input Format

Encarta Labs Advantage

  • One-stop corporate training solution provider offering over 4,000 modules on a variety of subjects
  • All courses are delivered by Industry Veterans
  • Go from newbie to production-ready in a matter of a few days
  • Trained more than 50,000 corporate executives across the globe
  • All our training is conducted in workshop mode, with a focus on hands-on sessions

View our other course offerings by visiting http://encartalabs.com/course-catalogue-all.php

Contact us to have this course delivered as a public/open-house workshop or as online training for a group of 10+ candidates.
