Hadoop, Hive and Pig

(Duration: 2 Days)

In this Hadoop, Hive and Pig training course, participants will receive an overview of Hadoop, Hive and Pig and discover how these tools can help meet their business goals. They will learn about the technologies in the Apache Hadoop ecosystem, including MapReduce, the Hadoop Distributed File System (HDFS), Hive, Pig, Sqoop and Flume, and how these fit into their existing technology environment.

By attending this Apache Hadoop, Hive and Pig developer workshop, participants will learn:

  • The core technologies of Hadoop, Hive and Pig
  • How HDFS and MapReduce work
  • How to develop MapReduce applications
  • How to unit test MapReduce applications
  • How to use MapReduce combiners, partitioners and the distributed cache
  • Best practices for developing and debugging MapReduce applications
  • How to implement data input and output in MapReduce applications
  • Algorithms for common MapReduce tasks
  • How to join data sets in MapReduce
  • How Hadoop, Hive and Pig integrate into the data center
  • How to use Mahout’s machine learning algorithms
  • How Hive and Pig can be used for rapid application development
  • How to create large workflows using Oozie

This Hadoop, Hive and Pig workshop is appropriate for:

  • Developers who will be writing, maintaining and/or optimizing Hadoop, Hive and Pig jobs.
  • Software developers with experience in the Java programming language.
  • Architects and developers who wish to write, build and maintain Apache Hadoop, Hive and Pig jobs.

Participants should have programming experience; knowledge of Java is highly recommended. An understanding of common computer science concepts is a plus.



What is Big Data & Why Hadoop, Hive and Pig?

  • Big Data characteristics, challenges with traditional systems

Hadoop Overview & Its Ecosystem

  • Anatomy of a Hadoop cluster, installing and configuring Hadoop
  • Hands-on Exercise - Build a pseudo-distributed cluster

HDFS - Hadoop Distributed File System and MapReduce Anatomy

  • HDFS architecture, NameNode, DataNodes and Secondary NameNode
  • How MapReduce works
  • Hands-on Exercise - Basic HDFS operations and running MapReduce programs
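
As a rough illustration of the MapReduce flow covered in this module, the sketch below simulates the map, shuffle and reduce phases of a word count in plain Python on a single machine. It is not Hadoop code and is not taken from the course material; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    sample = ["hive and pig", "pig and hadoop"]  # hypothetical input lines
    print(word_count(sample))
    # prints {'and': 2, 'hadoop': 1, 'hive': 1, 'pig': 2}
```

In a real cluster the mapper and reducer run as separate tasks on different nodes, and the framework performs the grouping step across the network; here everything runs in one process to keep the three phases visible.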

Monitoring & Management of Hadoop

  • Managing HDFS with tools like fsck and dfsadmin
  • Using the HDFS & JobTracker Web UIs

Hive Basics

  • Hive architecture, Hive variables, creating internal & external tables, partitioning data, configuring a shared metastore
  • Loading data into Hive, storing query output
  • Writing queries - joining tables, union, filtering, grouping, sorting etc. and advanced queries

Advanced Hive

  • Sampling, Buckets and Clusters
  • TRANSFORM, Creating User Defined Functions and SerDes
  • Debugging & Troubleshooting Hive Queries
  • Hive Best Practices
  • Hands-on Exercise - Configuring Hive and a shared metastore, creating tables and partitions, structured data analysis
  • Hands-on Exercise - Writing UDFs and SerDes

Sqoop & Flume

  • Importing and exporting data from RDBMS and log files
  • Hands-on Exercise - Import and export data between MySQL and Hive using Sqoop
  • Hands-on Exercise - Importing logs from applications using Flume

Pig Basics

  • Pig Basics, Loading data files, dumping and storing results
  • Writing queries - Filter, Group, Join and Sort, FOREACH, SPLIT, SAMPLE etc.

Pig Advanced

  • UDFs and Macros
  • Diagnostic operators, debugging and collecting statistics
  • Performance Optimizations, Multi Query Execution
  • Hands-on Exercise - Semi-structured data analysis (tweets and log analysis)
  • Hands-on Exercise - Writing UDFs

Hadoop, Hive and Pig Best Practices

Encarta Labs Advantage

  • One-stop corporate training solution provider with over 4,000 modules on a variety of subjects
  • All courses are delivered by industry veterans
  • Go from newbie to production-ready in a matter of days
  • Trained more than 50,000 corporate executives across the globe
  • All our training is conducted in workshop mode, with more focus on hands-on sessions

Contact us to have this course delivered as a public/open-house workshop or as online training for a group of 10+ candidates.