Call : (+91) 99 8080 3767
Mail : info@EncartaLabs.com
EncartaLabs

StreamSets Transformer

(Duration: 2 Days)

This StreamSets Transformer training course provides comprehensive coverage of StreamSets Transformer along with an overview of the entire StreamSets ecosystem. Today’s heterogeneous IT landscape requires businesses to interact seamlessly with a variety of environments, such as traditional databases, Hadoop, Databricks, Snowflake, AWS, and Azure. You will learn to configure and use Transformer to access these environments, transfer and transform data, use the Pipeline Repository, configure and run jobs, and monitor the performance of pipelines across all instances of StreamSets products running in the organization.

Prerequisites:

  • Experience with StreamSets Data Collector is required.

The StreamSets Transformer workshop is ideal for:

  • Those who will be building, managing, monitoring, and administering data flow pipelines.

COURSE AGENDA

1. Overview of the StreamSets Data Operations Platform

  • DataOps Platform Overview
  • StreamSets DataOps Architecture and Use Cases

2. Transformer UI Overview

  • Pipelines
  • Controls & Views
  • Package Management
  • Origins, Operators, Destinations

3. Spark Overview

  • Spark Overview
  • RDDs
  • DataFrames
  • Datasets

4. Transformer Deep Dive

  • Transformer Execution
  • Pipeline Processing on Spark
  • Transformer Batch Mode
  • Transformer Streaming Mode
  • Data Origin & Data Sources
  • Spark Partitioning & Caching
  • Ludicrous Mode

5. Batch Processing

  • Spark Batch Processing
  • Transformer Batch Processors
  • Spark SQL

6. Streaming & Windowing

  • Spark Streaming
  • Common Streaming Pipelines
  • Window Processor

7. Logs & Monitoring

  • Log Management & Log Files
  • Monitoring Pipelines
  • Spark UI & Execution

8. Framework Connectors

  • Hadoop Distributed Architecture
  • Hadoop, Hive, Kafka, Spark, Databricks, Snowflake, AWS, and Azure Operators
  • Hive Tables

9. Using PySpark ML Functions

  • PySpark Operator
  • PySpark Inputs
  • Machine Learning with PySpark ML Example

10. Spark Tuning in Transformer

  • Spark Tuning Properties
  • Partition, Shuffle, Repartition
  • Network Considerations
  • Java Serialization & Garbage Collection

11. StreamSets Control Hub (SCH) & Transformer Security

  • Web UI Security
  • Authentication
  • Access Control
  • Limiting Deployment of Stage Libraries
  • Source/Destination Security
  • Credential Security

Encarta Labs Advantage

  • One-stop corporate training solution provider offering over 6,000 courses on a variety of subjects
  • All courses are delivered by industry veterans
  • Go from newbie to production-ready in a matter of a few days
  • More than 50,000 corporate executives trained across the globe
  • All our training is conducted in workshop mode, with a focus on hands-on sessions

View our other course offerings by visiting http://encartalabs.com/course-catalogue-all.php

Contact us to have this course delivered as a public or open-house workshop, or as online training, for a group of 10+ candidates.
