EncartaLabs

AWS Glue

( Duration: 4 Days )

AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics. You can create and run an ETL job with a few clicks in the AWS Management Console. You simply point AWS Glue to your data stored on AWS, and AWS Glue discovers your data and stores the associated metadata (e.g. table definition and schema) in the AWS Glue Data Catalog. Once cataloged, your data is immediately searchable, queryable, and available for ETL. This AWS Glue training course will take you through the fundamentals of AWS Glue to get you started with this service.

By attending the AWS Glue workshop, delegates will learn:

  • Serverless ETL
  • The architecture of a typical ETL project
  • How to set up the AWS components required to use AWS Glue for ETL
  • How to use AWS Glue to perform serverless ETL
  • How to edit ETL processes created from AWS Glue

Delegates should have prior knowledge of:

  • One or more of the data storage destinations offered by AWS
  • Data warehousing principles
  • Serverless computing
  • Object-oriented programming (Python)

The AWS Glue class is ideal for:

  • Data warehouse engineers who are looking to learn more about serverless ETL and AWS Glue
  • Developers who want to learn more about ETL work using AWS Glue
  • Developer Leads who want to learn more about the serverless ETL process
  • Project Managers and Owners who want to learn about data preparation

COURSE AGENDA

1. Overview of cloud and AWS

  • Overview of Amazon Web Services and the Free Tier account
  • Creating an AWS account
  • Exploring the AWS Management Console
  • Overview of the AWS CLI, SDKs, and APIs
  • Overview of EC2 instances

2. Data Storage on AWS

  • Overview of AWS storage services
  • Overview of S3 and Glacier
  • Creating an S3 bucket
  • Properties of an S3 bucket
  • Working with RDS databases
  • Overview of NoSQL databases (DynamoDB)

3. Introduction to Glue

  • Glue Basics
  • Features of Glue
  • Glue Components
  • Securing with IAM
  • Setting up the environment
  • Pointing to specific data stores and endpoints

4. Glue Data Catalog

  • The what and why of the Data Catalog
  • Crawlers
  • Connecting to your data store
  • Using crawlers to populate Data Catalog tables

5. Working with Glue Jobs

  • Overview of Glue jobs and how they work
  • Adding new jobs in Glue
  • Editing scripts in Glue
  • Triggering and scheduling jobs

6. Administering Glue

  • Scheduling Jobs
  • Scheduling Crawlers
  • Logging and monitoring Glue

7. ETL scripts and Glue APIs

  • ETL scripts in Python
  • Various Glue APIs
  • Common Data Types and Exceptions
  • Use cases and benefits

8. Troubleshooting on Glue

  • Where to look?
  • Connection issues
  • Most common issues
  • Best Practices for Glue

Encarta Labs Advantage

  • One-stop corporate training solution provider with over 4,000 modules on a variety of subjects
  • All courses are delivered by industry veterans
  • Get jumpstarted from newbie to production-ready in a matter of a few days
  • Trained more than 50,000 corporate executives across the globe
  • All our trainings are conducted in workshop mode, with more focus on hands-on sessions

View our other course offerings by visiting http://encartalabs.com/course-catalogue-all.php

Contact us to have this course delivered as a public/open-house workshop or online training for a group of 10+ candidates.
