EncartaLabs

Hadoop Hive and Pig

( Duration: 2 Days )

In this Hadoop, Hive and Pig training course, participants will receive an overview of Hadoop, Hive and Pig and discover how these technologies can help meet their business goals. They will learn the different Apache Hadoop technologies, including MapReduce, the Hadoop Distributed File System (HDFS), Hive, Pig, Sqoop and Flume, and how these fit into their existing technology environment.

By attending this Apache Hadoop, Hive and Pig Developer Workshop, participants will learn:

  • The core technologies of Hadoop, Hive and Pig
  • How HDFS and MapReduce work
  • How to develop MapReduce applications
  • How to unit test MapReduce applications
  • How to use MapReduce combiners, partitioners and the distributed cache
  • Best practices for developing and debugging MapReduce applications
  • How to implement data input and output in MapReduce applications
  • Algorithms for common MapReduce tasks
  • How to join data sets in MapReduce
  • How Hadoop, Hive and Pig integrate into the data center
  • How to use Mahout’s machine learning algorithms
  • How Hive and Pig can be used for rapid application development
  • How to create large workflows using Oozie

This Hadoop, Hive and Pig workshop is appropriate for:

  • Developers who will be writing, maintaining and/or optimizing Hadoop, Hive and Pig jobs.
  • Software developers with experience in the Java programming language.
  • Architects and developers who wish to write, build and maintain Apache Hadoop, Hive and Pig jobs.

Participants are expected to meet the following prerequisites:

  • Programming experience; knowledge of Java is highly recommended.
  • An understanding of common computer science concepts is a plus.

COURSE AGENDA

1

What is Big Data & Why Hadoop, Hive and Pig?

  • Big Data characteristics, challenges with traditional systems
2

Hadoop Overview & Its Ecosystem

  • Anatomy of a Hadoop Cluster, Installing and Configuring Hadoop
  • Hands-on Exercise - Build a pseudo-distributed cluster
3

HDFS - Hadoop Distributed File System and MapReduce Anatomy

  • HDFS Architecture, NameNodes, DataNodes and Secondary NameNode
  • How MapReduce Works
  • Hands-on Exercise - Basic HDFS operations and running MapReduce programs
4

Monitoring & Management of Hadoop

  • Managing HDFS with tools like fsck and dfsadmin
  • Using the HDFS & JobTracker Web UIs
5

Hive Basics

  • Hive Architecture, Hive Variables, Creating Internal & External Tables, Partitioning Data, Configuring a Shared Metastore
  • Loading Data into Hive, Storing Query Output
  • Writing queries - joining tables, union, filtering, grouping, sorting, etc., and advanced queries
6

Advanced Hive

  • Sampling, Buckets and Clusters
  • TRANSFORM, Creating User Defined Functions and SerDes
  • Debugging & Troubleshooting Hive Queries
  • Hive Best Practices
  • Hands-on Exercise - Configuring Hive and a shared metastore, creating tables and partitions, structured data analysis
  • Hands-on Exercise - Writing UDFs and SerDes
7

Sqoop & Flume

  • Importing and exporting data from RDBMS and log files
  • Hands-on Exercise - Import and export data between MySQL and Hive using Sqoop
  • Hands-on Exercise - Importing logs from applications using Flume
8

Pig Basics

  • Pig Basics, Loading data files, dumping and storing results
  • Writing queries - Filter, Group, Join, Sort, FOREACH, SPLIT, SAMPLE, etc.
9

Advanced Pig

  • UDFs and Macros
  • Diagnostic operators, debugging and collecting statistics
  • Performance Optimizations, Multi Query Execution
  • Hands-on Exercise - Semi-structured data analysis (tweets and log analysis)
  • Hands-on Exercise - Writing UDFs
10

Hadoop, Hive and Pig Best Practices

Encarta Labs Advantage

  • One-stop corporate training solution provider, with over 3,500 modules on a variety of subjects
  • All courses are delivered by industry veterans
  • Go from newbie to production-ready in a matter of a few days
  • Trained more than 20,000 corporate candidates across India and abroad
  • All our training is conducted in workshop mode, with a strong focus on hands-on practice

View our other course offerings by visiting www.encartalabs.com/course-catalogue

Contact us to have this course delivered as a public/open-house workshop for a group of 10+ candidates at our venue.
