EncartaLabs

Apache Avro

( Duration: 2 Days )

This Apache Avro training course teaches the skills needed to perform data serialization and data exchange for Apache Hadoop. Avro facilitates the exchange of big data between programs written in different languages.

Prerequisites:

  • A general familiarity with distributed computing.

The Apache Avro workshop is ideal for:

  • Developers

COURSE AGENDA

1

Introduction

2

Principles of Distributed Computing

  • Apache Spark
  • Hadoop
3

Principles of Data Serialization

  • How a data object is passed over the network
  • Serialization of objects
  • Serialization approaches
    • Thrift
    • Protocol Buffers
    • Apache Avro
      • data structure
      • size, speed, format characteristics
      • persistent data storage
      • integration with dynamic languages
      • dynamic typing
      • schemas
      • untagged data
      • change management
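
As a rough illustration of the "schemas" and "untagged data" topics listed above, the following minimal sketch hand-encodes one record the way Avro's binary encoding lays it out: field values are written in schema order with no field names or tags, longs use zig-zag variable-length coding, and strings are a length followed by UTF-8 bytes. The `User` schema, its field names, and the helper functions are assumptions made up for this example. It is not the Avro library API and handles only the `string` and `long` types.

```python
import json

# Hypothetical schema for illustration only; the record and field names are assumptions.
SCHEMA = json.loads("""
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age",  "type": "long"}
  ]
}
""")


def encode_long(n):
    """Zig-zag encode a signed 64-bit integer, then emit it as a base-128 varint."""
    z = (n << 1) ^ (n >> 63)          # zig-zag: small magnitudes become small unsigned values
    out = bytearray()
    while z & ~0x7F:                  # more than 7 bits left: set the continuation bit
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf, pos):
    """Read one zig-zag varint starting at pos; return (value, next position)."""
    shift = 0
    z = 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:              # continuation bit clear: last byte of this number
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def encode_record(schema, record):
    """Write field values in schema order; no field names or type tags are emitted."""
    out = bytearray()
    for field in schema["fields"]:
        value = record[field["name"]]
        if field["type"] == "string":
            data = value.encode("utf-8")
            out += encode_long(len(data)) + data
        elif field["type"] == "long":
            out += encode_long(value)
        else:
            raise ValueError("type not covered by this sketch: %r" % field["type"])
    return bytes(out)


def decode_record(schema, buf):
    """Decode using the schema; without it the bytes cannot be interpreted."""
    pos = 0
    record = {}
    for field in schema["fields"]:
        if field["type"] == "string":
            length, pos = decode_long(buf, pos)
            record[field["name"]] = buf[pos:pos + length].decode("utf-8")
            pos += length
        elif field["type"] == "long":
            record[field["name"]], pos = decode_long(buf, pos)
        else:
            raise ValueError("type not covered by this sketch: %r" % field["type"])
    return record


encoded = encode_record(SCHEMA, {"name": "Ada", "age": 36})
print(encoded)                        # 5 bytes: length, "Ada", age
print(decode_record(SCHEMA, encoded))
```

Because nothing in the payload identifies the fields, a reader needs the writer's schema to decode it, which is why schemas and change management appear together in this part of the agenda.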
4

Data Serialization and Distributed Computing

  • Avro as a subproject of Hadoop
  • Java serialization
  • Hadoop serialization
  • Avro serialization
5

Using Avro with

  • Hive (AvroSerDe)
  • Pig (AvroStorage)
6

Porting Existing RPC Frameworks

Encarta Labs Advantage

  • One-stop corporate training solution provider with over 4,000 modules on a variety of subjects
  • All courses are delivered by industry veterans
  • Go from newbie to production-ready in a matter of a few days
  • Trained more than 50,000 corporate executives across the globe
  • All our training is conducted in workshop mode, with a strong focus on hands-on sessions

View our other course offerings by visiting http://encartalabs.com/course-catalogue-all.php

Contact us to have this course delivered as a public or open-house workshop, or as online training, for a group of 10+ candidates.
