Data Science Training in Hyderabad

Data science, also known as data-driven science, is an interdisciplinary field that uses scientific methods, processes, and systems to extract knowledge or insights from data in various forms, either structured or unstructured, similar to data mining.

The following is a comprehensive list of Data Science courses and resources that explain or teach skills within Data Science, such as machine learning, data mining, analytics, cleaning, visualization, scraping, using APIs to make data products, artificial intelligence, and much more.

Data Science: Dealing with both unstructured and structured data, Data Science is a field that comprises everything related to data cleansing, preparation, and analysis. ... A buzzword used to describe immense volumes of data, both unstructured and structured, Big Data inundates a business on a day-to-day basis.

DATA SCIENCE CONTENT:

DESCRIPTIVE STATISTICS AND PROBABILITY DISTRIBUTIONS:

Introduction to Statistics

Different Types of Variables

Measures of Central Tendency with examples

Measures of Dispersion

Probability & Distributions

Probability Basics

Binomial Distribution and its properties

Poisson distribution and its properties

Normal distribution and its properties
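
As a small illustration of the measures of central tendency and dispersion listed above, here is a minimal sketch using Python's standard `statistics` module. Python is an assumption made for illustration (the course itself mentions R), and the `sales` figures are made up.

```python
import statistics

# Hypothetical sample: monthly sales figures, invented for illustration
sales = [12, 15, 15, 18, 20, 22, 38]

# Measures of central tendency
mean = statistics.mean(sales)      # arithmetic average
median = statistics.median(sales)  # middle value of the sorted data
mode = statistics.mode(sales)      # most frequent value

# Measures of dispersion
value_range = max(sales) - min(sales)
variance = statistics.pvariance(sales)  # population variance
std_dev = statistics.pstdev(sales)      # population standard deviation

print("mean:", mean, "median:", median, "mode:", mode)
print("range:", value_range, "variance:", variance, "std dev:", std_dev)
```

The single large value (38) pulls the mean above the median, which is one reason both measures are usually reported together.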

INFERENTIAL STATISTICS AND TESTING OF HYPOTHESIS

Sampling methods

Different methods of estimation

Testing of Hypothesis & Tests

Analysis of Variance

COVARIANCE & CORRELATION
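
To make the difference between covariance and correlation concrete, the following is a minimal sketch in plain Python. The function names and the `ad_spend` and `revenue` data are hypothetical, and the sample (n - 1) form of covariance is assumed.

```python
import math

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data, invented for illustration
ad_spend = [1, 2, 3, 4, 5]
revenue = [2, 4, 6, 8, 10]

print("covariance:", covariance(ad_spend, revenue))
print("correlation:", correlation(ad_spend, revenue))
```

Covariance depends on the units of both variables, while correlation is unit-free and always lies between -1 and 1.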

PREDICTIVE MODELING STEPS AND METHODOLOGY WITH LIVE EXAMPLE:

Data Preparation

Exploratory Data Analysis

Model Development

Model Validation

Model Implementation

SUPERVISED TECHNIQUES:

MULTIPLE LINEAR REGRESSION

Linear Regression - Introduction - Applications

Assumptions of Linear Regression

Building Linear Regression Model

Understanding standard metrics (Variable significance, R-square/Adjusted R-Square, Global hypothesis etc.)

Validation of Linear Regression Models (Re running Vs. Scoring)

Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers etc)

Interpretation of Results - Business Validation - Implementation on new data

Real-time case study from the Manufacturing and Telecom industries to estimate future revenue using the models

LOGISTIC REGRESSION - INTRODUCTION - APPLICATIONS

Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models

Building Logistic Regression Model

Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow Test, Gini, KS, Misclassification etc.)

Validation of Logistic Regression Models (Re running Vs. Scoring)

Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, drivers etc.)

Interpretation of Results - Business Validation - Implementation on new data

Real-time case study to predict customer churn in the Banking and Retail industries

PARTIAL LEAST SQUARES REGRESSION

Partial Least Squares Regression - Introduction - Applications

Difference between Linear Regression and Partial Least Squares Regression

Building PLS Model

Understanding standard metrics (Variable significance, R-square/Adjusted R-Square, Global hypothesis etc.)

Interpretation of Results - Business Validation - Implementation on new data

Real-time example to identify the key factors that drive revenue

VARIABLE REDUCTION TECHNIQUES

FACTOR ANALYSIS

PRINCIPAL COMPONENT ANALYSIS

Assumptions of PCA

Working Mechanism of PCA

Types of Rotations

Standardization

Positives and Negatives of PCA

SUPERVISED TECHNIQUES CLASSIFICATION:

CHAID

CART

DIFFERENCE BETWEEN CHAID AND CART

RANDOM FOREST

Decision tree vs. Random Forest

Data Preparation

Missing data imputation

Outlier detection

Handling imbalanced data

Random Record selection

Random Forest R parameters

Random Variable selection

Optimal number of variables selection

Calculating Out Of Bag (OOB) error rate

Calculating Out of Bag Predictions

A COUPLE OF REAL-TIME USE CASES FROM THE TELECOM AND RETAIL INDUSTRIES: IDENTIFICATION OF CHURN.

UNSUPERVISED TECHNIQUES:

SEGMENTATION FOR MARKETING ANALYSIS

Need for segmentation

Criteria for segmentation

Types of distances

Clustering algorithms

Hierarchical clustering

K-means clustering

Deciding number of clusters

Case study

BUSINESS RULES CRITERIA

REAL TIME USE CASE TO IDENTIFY THE MOST VALUABLE REVENUE GENERATING CUSTOMERS.

TIME SERIES ANALYSIS:

TIME SERIES COMPONENTS (TREND, SEASONALITY, CYCLICITY AND LEVEL) AND DECOMPOSITION

BASIC TECHNIQUES

Averages

Smoothing etc.

ADVANCED TECHNIQUES

AR Models

ARIMA

UCM

Hybrid Model

UNDERSTANDING FORECASTING ACCURACY - MAPE, MAD, MSE ETC
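
The three accuracy measures named above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own material: MAD is taken here to mean the mean absolute deviation between actual and forecast values, MAPE is assumed to be reported in percent, and the `actual` and `forecast` numbers are made up.

```python
def mad(actual, forecast):
    """Mean Absolute Deviation: average size of the forecast errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mse(actual, forecast):
    """Mean Squared Error: average squared error, penalizes large misses."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean Absolute Percentage Error in percent.

    Assumes no actual value is zero, since each error is divided by it.
    """
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical sales figures, invented for illustration
actual = [100, 200, 400]
forecast = [110, 180, 400]

print("MAD:", mad(actual, forecast))
print("MSE:", mse(actual, forecast))
print("MAPE (%):", mape(actual, forecast))
```

MAD and MSE are in the units of the data (or its square), while MAPE is scale-free, which makes it easier to compare across products with very different sales volumes.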

A COUPLE OF USE CASES TO FORECAST FUTURE SALES OF PRODUCTS

TEXT ANALYTICS:

GATHERING TEXT DATA FROM WEB AND OTHER SOURCES

PROCESSING RAW WEB DATA

COLLECTING TWITTER DATA WITH TWITTER API

NAIVE BAYES ALGORITHM

Assumptions of Naïve Bayes

Processing of Text data

Handling Standard and Text data

Building Naïve Bayes Model

Understanding standard model metrics

Validation of the Models (Re running Vs. Scoring)

SENTIMENT ANALYSIS

Goal Setting

Text Preprocessing

Parsing the content

Text refinement

Analysis and Scoring
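
The sentiment analysis steps above (preprocessing, parsing, refinement, scoring) can be sketched with a simple lexicon-based scorer. This is only one possible approach, shown under stated assumptions: the word lists, the stopword list, the function names, and the example texts are all hypothetical and far smaller than anything usable in practice.

```python
import re

# Hypothetical mini-lexicons; a real project would use much larger ones
POSITIVE = {"good", "great", "caring", "clean", "helpful"}
NEGATIVE = {"bad", "rude", "dirty", "slow", "long"}
STOPWORDS = {"the", "a", "an", "was", "were", "is", "and", "but", "very"}

def preprocess(text):
    """Lowercase, strip URLs and @mentions, tokenize, and drop stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    tokens = re.findall(r"[a-z']+", text)
    return [t for t in tokens if t not in STOPWORDS]

def score(text):
    """Count of positive words minus count of negative words."""
    tokens = preprocess(text)
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)

def label(text):
    """Map the numeric score to a sentiment label."""
    s = score(text)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"

# Hypothetical example texts
for text in [
    "The staff were very caring and helpful @CityHospital",
    "Rude reception and a long wait http://t.co/x",
    "Visited today",
]:
    print(label(text), "<-", text)
```

A word-count scorer like this ignores negation and context ("not good" scores as positive), which is one reason a trained classifier is often preferred for real data.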

Thursday 7 September 2017

USE CASE FROM THE HEALTH CARE INDUSTRY: IDENTIFYING PATIENT SENTIMENT ABOUT A SPECIFIED HOSPITAL BY EXTRACTING DATA FROM TWITTER.

LIVE CONNECTIVITY FROM R TO TABLEAU

GENERATING REPORTS AND CHARTS