EncartaLabs

MarkLogic Data Curation

(Duration: 1 Day)

In the MarkLogic Data Curation training course, you will learn to build a MarkLogic Data Hub, powered by the MarkLogic database, that helps accelerate data integration projects and deliver faster time to value to your customers.

By attending the MarkLogic Data Curation workshop, delegates will learn to:

  • Develop, test, debug and deploy custom code using a local IDE (Visual Studio Code)
  • Use custom code during ingest, mapping and mastering
  • Implement an entity model that includes nesting and relationships
  • Load data from a variety of sources
  • Load data using a variety of methods and describe the use cases and best practices for each method
  • Use custom code during data ingest
  • Implement mapping configurations for a more complex data model
  • Implement smart mastering configurations with more complexity and customization

Prerequisites:

  • Knowledge of Hub Central
  • Knowledge of MarkLogic Security

The MarkLogic Data Curation class is ideal for:

  • Data Architects, MarkLogic Developers, Data Engineers

COURSE AGENDA

1. Data Services First

  • Understand the high-level approach to data integration projects using the MarkLogic Data Hub
  • Understand the customer and business requirements for the course hands-on project
  • Understand the user stories and technical requirements for the course hands-on project
  • Understand the data sources available for the course hands-on project
2. The MarkLogic Data Hub

  • Understand what it is
  • Understand what it does
  • Initialize and install a new MarkLogic Data Hub project
3. Implement Security

  • Create users and roles for both business users and members of the technical project team
  • Understand how to use Data Hub specific roles
  • Implement role hierarchies
  • Assign execute privileges necessary to meet project requirements
  • Deploy security configuration using QuickStart and ml-gradle
4. Create an Entity

  • Create a new entity
  • Define properties
  • Configure indexes
  • Protect access to PII (personally identifiable information)
5. Ingest Data

  • Create flow pipelines
  • Configure ingestion steps
  • Understand the purpose and use of the staging and final databases in a MarkLogic Data Hub
  • Implement key data modeling concepts including document URIs, collections, document permissions, property naming best practices, geospatial data modeling patterns, denormalization, and the use of the envelope pattern
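As a taste of the envelope pattern named above, here is a minimal Python sketch rather than Data Hub code. It wraps a raw record in an envelope with headers, triples and instance sections, and derives a document URI and collections for it. The function names, the URI scheme, the header layout and the sample record are all assumptions made for illustration.

```python
import json


def wrap_in_envelope(record, source_name):
    """Wrap a raw source record in an envelope.

    The raw data goes under "instance"; metadata about where it came
    from goes under "headers"; "triples" starts out empty.
    The header layout used here is an assumption for illustration.
    """
    return {
        "envelope": {
            "headers": {"sources": [{"name": source_name}]},
            "triples": [],
            "instance": record,
        }
    }


def build_uri(entity_type, record_id):
    """Build a predictable document URI (hypothetical naming scheme)."""
    return f"/{entity_type}/{record_id}.json"


# Made-up sample record, not course data.
customer = {"id": "c-101", "name": "Ada Lovelace"}

doc = wrap_in_envelope(customer, "crm-export")
uri = build_uri("customer", customer["id"])
collections = ["Customer", "crm-export"]  # entity name plus source name

print(uri)
print(collections)
print(json.dumps(doc, indent=2))
```

Keeping the source record untouched under "instance" means the original data is preserved while metadata accumulates around it.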
6. Curate Data

  • Configure mapping steps
  • Use pre-built mapping functions
  • Program, deploy and use a custom mapping function
  • Test and debug mapping steps
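The idea behind a mapping step with a custom function can be sketched in a few lines of Python. This is a conceptual illustration only and does not use the Data Hub mapping syntax. The property names, the source field names and the normalize_phone helper are hypothetical.

```python
def normalize_phone(raw):
    """Hypothetical custom mapping function: keep only the digits."""
    return "".join(ch for ch in raw if ch.isdigit())


# Each target entity property is paired with an expression over the source record.
MAPPING = {
    "customerId": lambda src: src["cust_id"],
    "fullName": lambda src: f'{src["first"]} {src["last"]}'.strip(),
    "phone": lambda src: normalize_phone(src["phone_no"]),
}


def apply_mapping(source, mapping):
    """Evaluate every mapping expression against one source record."""
    return {prop: expr(source) for prop, expr in mapping.items()}


# Made-up source row for illustration.
source_row = {
    "cust_id": "c-101",
    "first": "Ada",
    "last": "Lovelace",
    "phone_no": "(555) 010-0199",
}

mapped = apply_mapping(source_row, MAPPING)
print(mapped)
```

Because the mapping is plain data plus small functions, each expression can be tested on its own before the whole step is run.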
7. Use Semantics

  • Understand key semantic data modeling concepts including triples, IRIs, ontology triples, managed and unmanaged triples
  • Load triples to a MarkLogic Data Hub
  • Program, deploy and use a custom harmonization step to add triples to the envelope of a document
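Adding triples to the envelope of a document can be pictured with the following Python sketch. It is a simplified model, not a Data Hub custom step. The example.org IRIs, the worksFor predicate and the simplified triple layout are assumptions made for illustration.

```python
BASE = "http://example.org/"  # hypothetical IRI namespace


def make_triple(subject_iri, predicate_iri, obj):
    """Represent one subject-predicate-object statement (simplified layout)."""
    return {
        "triple": {
            "subject": subject_iri,
            "predicate": predicate_iri,
            "object": obj,
        }
    }


def add_triples(envelope_doc, triples):
    """Return a copy of the document with triples appended to envelope.triples."""
    env = dict(envelope_doc["envelope"])
    env["triples"] = list(env.get("triples", [])) + list(triples)
    return {"envelope": env}


# Made-up enveloped document: a customer who works for organisation o-7.
doc = {
    "envelope": {
        "headers": {},
        "triples": [],
        "instance": {"id": "c-101", "employer": "o-7"},
    }
}

instance = doc["envelope"]["instance"]
works_for = make_triple(
    BASE + "customer/" + instance["id"],
    BASE + "ontology/worksFor",
    BASE + "org/" + instance["employer"],
)

enriched = add_triples(doc, [works_for])
print(enriched["envelope"]["triples"])
```

Triples stored inside the document this way travel with it, which is the general idea behind managed triples as opposed to triples loaded separately.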
8. Access Data

  • Explore the use of JavaScript APIs
  • Explore the use of SPARQL
  • Validate that the curated data from the hub can be used to meet the business and technical requirements for the hands-on project
9. Adapt to Change: Perform Another Iteration of Ingest | Curate | Access

  • Ingest a new data source
  • Curate the new data so that it can be consumed in the same way as existing data
10. Use Smart Mastering

  • Configure a matching step
  • Configure a merging step
  • Test Smart Mastering
  • Explore mastered data
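
The general idea of matching and merging can be sketched in Python as weighted property comparison followed by a simple survivorship rule. This is an illustration of the concept only, not MarkLogic's Smart Mastering implementation. The weights, the threshold, the property names and the sample records are all assumptions.

```python
# (property, weight) pairs: an exact match on the property adds its weight.
MATCH_RULES = [("email", 10), ("last_name", 5), ("postcode", 3)]
THRESHOLD = 12  # hypothetical score at or above which two records are merged


def match_score(a, b, rules):
    """Sum the weights of all properties on which both records agree."""
    return sum(
        weight
        for prop, weight in rules
        if a.get(prop) is not None and a.get(prop) == b.get(prop)
    )


def merge(preferred, other):
    """Merge two records, keeping the preferred record's non-empty values."""
    merged = dict(other)
    merged.update({k: v for k, v in preferred.items() if v is not None})
    return merged


# Made-up records describing the same person in two source systems.
crm = {"email": "ada@example.org", "last_name": "Lovelace",
       "postcode": "N1", "phone": None}
billing = {"email": "ada@example.org", "last_name": "Lovelace",
           "postcode": "W2", "phone": "5550100199"}

score = match_score(crm, billing, MATCH_RULES)
if score >= THRESHOLD:
    mastered = merge(crm, billing)
    print(score, mastered)
```

Here the two records agree on email and last name, so they clear the threshold, and the merged record keeps the CRM postcode while filling the missing phone number from billing.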

Encarta Labs Advantage

  • One-stop corporate training solution provider for over 4,000 modules on a variety of subjects
  • All courses are delivered by Industry Veterans
  • Get jump-started from newbie to production-ready in a matter of a few days
  • Trained more than 50,000 Corporate executives across the Globe
  • All our training is conducted in workshop mode, with a strong focus on hands-on sessions

View our other course offerings by visiting http://encartalabs.com/course-catalogue-all.php

Contact us to have this course delivered as a public/open-house workshop or as online training for a group of 10+ candidates.
