EncartaLabs

Python for Data Scientists

(Duration: 5 Days)

The Python for Data Scientists training course is designed to introduce you to the Python programming language, which continues to gain popularity not only among developers, but also among data scientists due to its rich ecosystem for data manipulation, data analytics, and machine learning.

This course begins by covering the fundamentals of Python, including data structures, loops, and list comprehensions, then transitions into ways in which data scientists can use the expressive nature of the language in their daily work. Next, delegates will use the Python ecosystem to import and manipulate data, create summaries and exploratory visualizations, and perform standard hypothesis tests. They will then learn to fit regression models to data sets, evaluate those models, and create data visualizations. Finally, the course will cover basic machine learning methods.

By attending the Python for Data Scientists workshop, delegates will:

  • Learn how to use Python to explore and analyze data, run basic regression models, visualize data, and apply some basic machine learning models to data.
  • Use Python to read, manipulate, and clean your data.
  • Analyze and visualize your data using Python.
  • Perform predictive analysis using basic machine learning models.

Who should attend:

  • Business Analyst
  • Data Engineer
  • Data Scientist

COURSE AGENDA

1. Introduction to the Anaconda Python Distribution

  • Understanding and using Jupyter notebooks
  • Understanding the data science and machine learning libraries that are included
2. Introduction to Python

  • Dynamic typing
  • Primitive datatypes
  • Looping/list comprehensions
  • Modules and packages
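As a small illustration of the dynamic typing and looping/list comprehension topics listed above, here is a minimal sketch. The variable names and sample values are invented for this example and are not taken from the course material.

```python
# Hypothetical sample data: daily temperature readings in Celsius.
readings_c = [18.5, 21.0, 19.2, 25.4, 23.1]

# Explicit loop: convert each reading to Fahrenheit.
readings_f_loop = []
for c in readings_c:
    readings_f_loop.append(c * 9 / 5 + 32)

# The same conversion written as a list comprehension.
readings_f = [c * 9 / 5 + 32 for c in readings_c]

# A comprehension can also filter: keep only readings above 20 degrees.
warm_days = [c for c in readings_c if c > 20]

# Dynamic typing: the same name can be rebound to a value of another type.
value = 42           # value refers to an int
value = "forty-two"  # value now refers to a str

print(readings_f == readings_f_loop)
print(warm_days)
print(type(value).__name__)
```

The loop and the comprehension build the same list; the comprehension is usually preferred when the body is a single expression.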
3. Introduction to Pandas

  • Datatypes
  • Importing data
    • CSV
    • Excel
    • SQL
  • Creating numerical summaries
  • Exploring data
  • Descriptive statistics
  • Basic probability distributions (Gaussian/normal, Poisson, chi-squared, binomial, exponential), including generating random numbers and finding critical values
  • Standard hypothesis testing, e.g., t-tests, z-tests, ANOVA, and chi-squared tests, as well as basic non-parametric tests such as the Wilcoxon signed-rank and rank-sum tests
  • Dummy variables
  • Linear regression
  • Logistic regression
  • Evaluating regression models
  • Simulating data from probability distributions
  • Permutation tests and the bootstrap
  • Creating publication-quality graphics
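To give a flavor of the bootstrap topic listed above, here is a minimal sketch of a percentile bootstrap confidence interval for a mean. It uses only the Python standard library so that it runs anywhere; the function name bootstrap_mean_ci, its parameters, and the sample values are hypothetical and chosen for illustration, and the course itself may use different tools.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean (illustrative helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Resample with replacement, record the mean of each resample, then sort.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled means.
    lo_idx = round((alpha / 2) * n_resamples)
    hi_idx = round((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Made-up measurements, for illustration only.
sample = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.9, 6.2, 5.0]
low, high = bootstrap_mean_ci(sample)
print(f"sample mean: {statistics.fmean(sample):.2f}")
print(f"approximate 95% interval: ({low:.2f}, {high:.2f})")
```

The percentile method is only one of several bootstrap interval constructions; it is used here because it is the shortest to write down.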
4. Introduction to scikit-learn

  • Supervised vs. Unsupervised learning
  • Classification vs. Regression
  • Linear Regression
  • Decision Trees
  • Support Vector Machines
  • Ensemble Models
  • Evaluating Models
  • Fine-Tuning Your Models

Encarta Labs Advantage

  • One-stop corporate training solution provider with over 4,000 modules on a variety of subjects
  • All courses are delivered by industry veterans
  • Go from newbie to production-ready in a matter of a few days
  • More than 50,000 corporate executives trained across the globe
  • All our training is conducted in workshop mode, with a strong focus on hands-on sessions

View our other course offerings by visiting http://encartalabs.com/course-catalogue-all.php

Contact us to have this course delivered as a public/open-house workshop or as online training for a group of 10+ candidates.
