EncartaLabs

Apache Cassandra

Apache Cassandra is a second-generation distributed database originally open-sourced by Facebook. Its write-optimized, shared-nothing architecture results in excellent performance and scalability.

Cassandra moves away from the master-slave model and instead uses a peer-to-peer model. There is no single master; any node can accept reads and writes. This makes reads and writes highly scalable and lets nodes keep functioning during network partitions.

The large volume and variety of data that today's businesses process call for a highly available, low-latency database. Cassandra meets this need by supporting high-speed reads and writes across a replicated, distributed system.

In the Apache Cassandra - Administrator training course, delegates will learn to:
  • Benchmark read and write operations
  • Recognize different types of failure
  • Fix a failed or partially failed cluster
  • Identify potential performance bottlenecks
  • Perform live schema updates
  • Perform move operations
  • Plan and perform cluster-wide operations
  • Monitor compaction, repair, and hinted handoff
In the Apache Cassandra - Developer training course, delegates will learn to:
  • Architect and engineer Cassandra databases for competitive advantage
  • Model data in Cassandra based on query patterns
  • Access Cassandra databases using CQL and Java
  • Create a balance between read/write speed and data consistency
  • Integrate Cassandra with Hadoop, Pig and Hive
  • Implement commonly used Cassandra design patterns

The Apache Cassandra - Administrator workshop is designed for administrators with a basic knowledge of databases.

Database Administrators, Data Analytics professionals, Data architects, Managers

COURSE AGENDA

Apache Cassandra - Administrator
(Duration: 2 Days)

1. The Write Path

  • Log Structured Storage
  • Memtables
  • Flushing

2. The Read Path

  • SSTables
  • Row Merging
  • Cache (Key, Row)
  • Compaction
  • Distributed Deletes
  • Memory Mapped Files
  • Evolving Applications

3. Introduction to CAP

4. Partition Tolerance

  • Data partitioners
  • Replication strategies
  • Snitches
  • Hinted Handoff

5. Availability

  • How Cassandra handles failure of one or more nodes
  • What to do in the face of failure

6. Aspects of consistency

  • Consistency
  • Coordinators
  • Read Repair
  • Phi Accrual
  • Hinted Handoff
  • Anti-entropy Service

7. Cassandra and the JVM

8. Monitoring Cassandra

9. How Cassandra works with the physical hardware

  • CPU
  • Disk
  • Network
  • Goals for Sizing

10. The different storage strategies (disk configurations), including

  • Specific concerns for cloud hosting
  • Logical and Physical disk configuration
  • Local disks vs. network mounted/shared drives

11. Logging

  • Cassandra
  • System
  • GC

12. Backup and Recovery

  • Backup
  • Recovery

13. Security

  • Authentication
  • Authorization
  • Physical Security

Apache Cassandra - Developer
(Duration: 3 Days)

1. NoSQL Overview

  • Justifying non-relational data stores
  • Listing the categories of NoSQL Data Stores

2. Exploring Cassandra

  • Defining column family data stores
  • Surveying Cassandra
  • Dissecting the basic Cassandra architecture

3. Querying Cassandra

  • Defining Cassandra Query Language (CQL)
  • Enumerating CQL data types
  • Manipulating data from the cqlsh interface

4. Leveraging Cassandra structures and types

  • Drawing comparisons with the relational model
  • Organizing data with keyspaces, tables and columns
  • Creating collections and counters

5. Modeling data based on queries

  • Designing tables around access patterns
  • Clustering with compound primary keys
  • Improving data distribution with composite partition keys

6. Detailing tunable consistency

  • Identifying consistency levels
  • Selecting appropriate read and write consistency levels
  • Distinguishing consistency repair features

7. Balancing consistency and performance

  • Relating replication factor and consistency
  • Trading consistency for availability

8. Working with Cassandra collection types

  • Grouping elements in sets
  • Ordering elements in lists
  • Expressing relationships with maps
  • Nesting collections

9. Storing data for easy retrieval

  • Mapping data to tuples and user defined types
  • Investigating the frozen keyword
  • Applying the Valueless Columns Pattern
  • Implementing clustering columns strategically

10. Controlling data life span

  • Expiring temporal data with time-to-live
  • Reviewing how tombstones achieve distributed deletes
  • Executing DELETEs and UPDATEs in the future

11. Constructing materialized views and time series

  • Modeling time series data
  • Enhancing queries with materialized views
  • Maintaining materialized views in the application
  • Driving analytics from materialized views

12. Managing triggers

  • Creating triggers by implementing ITrigger
  • Attaching triggers to tables
  • Supporting materialized views with triggers

13. Querying Cassandra data with the DataStax Java Driver

  • Connecting to a Cassandra cluster
  • Running CQL through the Java Driver
  • Batching prepared statements
  • Paginating large queries

14. Persisting Java Objects with Kundera

  • Defining the Java Persistence API (JPA)
  • Configuring Kundera to work with Cassandra
  • Generating schemas automatically
  • Managing JPA transactions in Kundera

15. Leveraging built-in Cassandra connectors

  • Loading data into Hadoop MapReduce with the Cassandra InputFormat
  • Utilizing the Cassandra Loader to create Pig relations
  • Converting a Cassandra table to a Hive table with the Cassandra serializer/deserializer (SerDe)

Encarta Labs Advantage

  • One-stop corporate training solution provider for over 3,500 modules on a variety of subjects
  • All courses are delivered by industry veterans
  • Get jumpstarted from newbie to production-ready in a matter of a few days
  • Trained more than 20,000 corporate candidates across India and abroad
  • All our trainings are conducted in workshop mode, with more focus on hands-on practice

View our other course offerings by visiting www.encartalabs.com/course-catalogue

Contact us to have this course delivered as a public/open-house workshop for a group of 10+ candidates at our venue
