EncartaLabs

Text Mining with R

( Duration: 3 Days )

It is estimated that over 70% of potentially usable business information is unstructured, often in the form of text data. Text mining provides a collection of techniques that allow us to derive actionable insights from these data.

The Text Mining with R training course introduces the main tools and techniques for mining and analyzing text data to discover interesting patterns, extract useful knowledge, and support decision making, with an emphasis on statistical approaches to making sense of unstructured data. You will work through a live example of extracting data from the web and apply every facet of text mining using R.

By attending the Text Mining with R workshop, delegates will learn:

  • Sentiment analysis
  • Word cloud
  • N-grams
  • Topic modeling
  • LDA
  • Extracting text from social media

Prerequisites:

  • Basic knowledge of R

Who should attend:

  • Data Scientists
  • Data Analysts
  • Finance Analysts
  • Marketers

COURSE AGENDA

1

Introduction

  • What is text mining
  • Applications of text mining

2

Basic Text Functions

  • Text manipulation functions
  • Working with strings
  • Working with gsub
  • Advanced methods
  • Convert to corpus

3

Importing Data

  • Converting DOCX into a corpus
  • Converting PDF into a corpus
  • Converting HTML into a corpus
  • Web scraping

4

Tidytext Package

  • Tidying text objects
  • Tidying document-term matrix objects
  • Tidying document-feature matrix objects
  • Tidying corpus objects
  • Mining literary works

5

Word Frequencies & Relationships

  • Pre-processing text
  • Wordcloud
  • Frequency analysis
  • N-grams & bigrams
  • Bigrams for sentiment analysis
  • Visualizing bigram networks

6

Sentiment Analysis

  • Sentiment libraries
  • Analyzing positive & negative words
  • Comparing 3 sentiment libraries
  • Common positive & negative words

7

Topic Modelling

  • Latent Semantic Indexing (LSI)
  • Latent Dirichlet Allocation (LDA)
  • Word-topic probabilities
  • Document-topic probabilities
  • Chapter probabilities
  • Per-document classification

8

Document Similarity & Classifier

  • Text alignment & pairwise comparison
  • Minhash and locality-sensitive hashing
  • Extracting keywords
  • Classifying by location, language, and topic

9

Working with the internet and social media (Optional)

  • Extracting data from Amazon
  • Extracting data from Twitter
  • Extracting YouTube comments
  • Extracting Facebook comments

Encarta Labs Advantage

  • One-stop corporate training solution provider with over 4,000 modules on a variety of subjects
  • All courses are delivered by Industry Veterans
  • Get jumpstarted from newbie to production-ready in a matter of a few days
  • Trained more than 50,000 corporate executives across the globe
  • All our trainings are conducted in workshop mode with more focus on hands-on sessions

View our other course offerings by visiting http://encartalabs.com/course-catalogue-all.php

Contact us for delivering this course as a public/open-house workshop/online training for a group of 10+ candidates.
