# Data science online training

4.9/5


Etlhive has been recognized as the best institute for Data Science by multiple companies across India. There are many reasons to choose Etlhive for Data Science online training.

##### Course packed with latest modules

### SciPy and scikit-learn (sklearn)

### Data manipulation with Pandas

### Numerical processing with NumPy

### TensorFlow for deep learning

### Keras, PyTorch for neural networks

### Matplotlib, Seaborn for data visualization

# Frequently asked questions

Two types: Descriptive (uses past data to understand past and present trends) and Predictive (uses past data to predict unknown or future outcomes).
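The difference between the two types can be illustrated with a minimal sketch. The sales figures and the names `sales`, `months` and `fit_line` are hypothetical, chosen only for illustration; a real project would normally use a library such as scikit-learn rather than a hand-written least-squares fit.

```python
from statistics import mean

# Hypothetical monthly sales figures (illustrative data only).
months = [1, 2, 3, 4]
sales = [100.0, 110.0, 120.0, 130.0]

# Descriptive analytics: summarize what has already happened.
average_sales = mean(sales)


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Predictive analytics: use the past data to estimate an unseen month.
slope, intercept = fit_line(months, sales)
forecast_month_5 = slope * 5 + intercept

print("Average of past sales:", average_sales)
print("Forecast for month 5:", forecast_month_5)
```

The first result only describes the existing data, while the second extrapolates beyond it, which is the essence of the descriptive versus predictive split.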

Data analysts generally create reports based on a company's data-driven KPIs (mostly descriptive analytics), whereas data scientists combine business and domain understanding with technical skills to anticipate what will happen in the future (both descriptive and predictive analytics).

Machine Learning algorithms are mathematical in nature, so you first need to understand the underlying maths (statistics, probability theory, and linear algebra).

Once you know this part, implementing these algorithms on a real-life data set requires a language with modules that simplify ML development (such as MATLAB, Python, or R).

So, to sum it up:

- Maths
- Programming
- Data manipulation/preprocessing
- Machine Learning algorithms
- Lots of scenarios
- End-to-end projects
- Deployment of models on various cloud platforms
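As a rough illustration of how the maths, programming, preprocessing and algorithm steps fit together, here is a minimal sketch in plain Python. The toy data, the labels and the function names `standardize` and `predict` are assumptions made for this example, and the 1-nearest-neighbour classifier is just one simple algorithm picked for brevity; it is not taken from the course material.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical toy data: [height_cm, weight_kg] with a label (illustrative only).
features = [[150.0, 50.0], [160.0, 60.0], [180.0, 80.0], [190.0, 90.0]]
labels = ["small", "small", "large", "large"]

# Maths + preprocessing: standardize each column to zero mean and unit variance.
columns = list(zip(*features))
means = [mean(col) for col in columns]
stds = [pstdev(col) for col in columns]


def standardize(row):
    """Scale one row using the column means and standard deviations."""
    return [(value - m) / s for value, m, s in zip(row, means, stds)]


scaled = [standardize(row) for row in features]


def predict(row):
    """Return the label of the closest training point (Euclidean distance)."""
    query = standardize(row)
    distances = [
        sqrt(sum((a - b) ** 2 for a, b in zip(query, point)))
        for point in scaled
    ]
    return labels[distances.index(min(distances))]


print(predict([155.0, 52.0]))
print(predict([185.0, 88.0]))
```

In practice, libraries such as Pandas and scikit-learn take over the scaling and the classifier, but the flow stays the same: prepare the data, apply the algorithm, then check the predictions.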

Everyone who has done an online course or gone through a tutorial uploads a CV for a job, but not everyone gets one.

To get a job, your skills have to be detailed; your knowledge cannot be limited to generics. If you feel that some sort of crash course will get you anywhere, then I wish you luck.

By the time you have completed the course, you should be able to handle complex scenarios efficiently and with a measured approach (without any help, of course).

So, a simple piece of advice: join a detailed course.

The answer depends on your previous experience and technology background. Please speak with our technical counsellor about this.

The term Data Science is used interchangeably with Datalogy.

Data Science draws its theories and techniques from physics, mathematics, nanotechnology, and other disciplines, 23 fields in all.

Data Science is considered to be a part of many academic and research areas.

Data Science has been employed in fraud monitoring and security.

Data Analytics is now increasingly used in multiple sectors, and these sectors owe their success to Data Science and Data Analytics.

Various companies offer certifications for this kind of program:

- AWS Certified Machine Learning – Specialty certification
- Professional Data Engineer Certification (Google)
- Google Data Analytics Professional Certificate
- Microsoft Certified: Azure Data Scientist Associate
- Data Science Professional Certificate (IBM)

and the list goes on.

Graduates in any stream, whether freshers or working professionals, who wish to start their career as a Data Scientist or to switch from their previous profile into mainstream analytics.

# Syllabus

- Defining Python
- History of Python and its Growing Popularity
- Features of Python and its Wide Functionality
- Python 2 vs Python 3
- Setting Up Python
- Environment for Development
- What and How of Python Installation?
- IDEs: IDLE, PyCharm, and Jupyter
- Writing First Python Program
- Python Scripts on UNIX and Windows
- Installation on Ubuntu-based Machines
- Programming on Interactive Shell
- Python Identifiers and Keywords
- Indentation in Python
- Comments and Writing to the Screen
- Command Line Arguments and Flow Control
- User Input
- Python Core Objects
- Defining Built-in Functions
- Objectives
- Variables and their types
- Variables – String Variables
- Variables – Numeric Types
- Variables – Boolean Variables
- Boolean Object and None Object
- Tuple Object and Operations
- Dictionary Object and Operations
- Types of Variables – Dictionary
- Comparison of Variables
- Dictionary Methods and Manipulations
- Operators and Logical Operators
- Data Structures and Data Processing
- Arithmetic Operations on Numeric Values
- Operators and Keywords for Sequences
- Understanding Conditional Statements
- Break Statements and Continue Statements
- Using Indentations for defining if & else block
- Loops in Python
- While, Nested, Demo-Create
- How to Control Loops?
- Sequence and Iterable Objects

- Objectives of Functions
- Types of Functions
- Creating UDF Functions
- Function Parameters
- Unnamed and Named Parameters
- Creating and Calling Functions
- Python user Defined Functions
- Python packages Functions
- Anonymous Lambda Function
- Understanding String Object Functions
- List and Tuple Object Functions
- Studying Dictionary Object Functions
- Defining Python Inbuilt Modules
- Studying Types of Modules
- os, sys, time, random, datetime, zip modules
- How to Create Python User Defined Modules?
- Understanding Pythonpath
- Creating Python Packages
- `__init__.py` File and Package Initialization
- What and How of File Handling with Python?
- How to Process Text Files using Python?
- Read/Write and Append File Object
- Test Operations: os.path
- Overview of Object Oriented Programming
- Defining Classes, Objects, and Initializers
- Attributes – Built-In Class
- Destroying Objects
- Methods – Instance, Class, Static, Private methods, and Inheritance
- Data Hiding
- Module Aliases and reloading modules
- Python Exceptions Handling
- Standard Exception Hierarchy
- try…except…else
- try…finally Clause
- Creating Self-Exception Class
- User-defined Exceptions
- Debugging Errors – Unit Tests
- Project Skeleton
- Creating and Using the Skeleton
- How to use pdb debugger?
- Using PyCharm Debugger
- Asserting Statement for Debugging
- Using UnitTest Framework for Testing
- Understanding Regular Expressions
- Match Function, Search Function, and the Comparison
- Compile and Match, Match and Search
- Search and Replace
- What and How of Extended Regular Expressions?
- Wildcard Characters

- Data Visualization with Matplotlib and Seaborn
- Python Libraries
- Features of Matplotlib
- Line Properties Plot with (x, y)
- Set Axis, Labels, and Legend Properties
- Alpha and Annotation
- Univariate plots
- Bivariate plots
- Multivariate plots
- Interpretations

• Data Manipulation and Machine Learning with Python

• Data Manipulation with Python – Pandas

• Understanding Pandas

• Defining Data Structures

• Data Operations (filtering, sorting, grouping, aggregation, merging) and Data Standardization

• Pandas: File Read and Write Support

• SQL Operations (pandasql)

• Exploring and Understanding Data

• Exploring Numeric Variables

• Understanding Types of Data

• Qualitative and Quantitative Analysis

• Studying Descriptive Statistics

• Measuring the Central Tendency – Mean and Median

• Measuring Spread – Variance and Standard Deviation

• Visualising Numeric Variables – Boxplots and Histograms

• Understanding Numeric Data – Uniform and Normal Distributions

• Measuring the Central Tendency – The Mode

• Exploring Relationships between Variables

• Visualizing Relationships – Scatterplots

• Nominal and Ordinal Measurement

• Interval and Ratio Measurement

• Statistical Investigation

• Inferential Statistics

• Probability and Central Limit Theorem

• Exploratory Data Analysis

• Normal Distribution

• Distance Measures

• Euclidean & Manhattan Distance

• Minkowski & Mahalanobis

• Cosine

• Correlation

• PPMC (Pearson Product Moment Correlation)

• Importance of Hypothesis Testing in Business

• Null and Alternate Hypothesis

• Understanding Types of Errors

• Contingency Table and Decision Making

• Confidence Coefficient

• Upper Tail Test

• Understanding Parametric Tests

• Z-Test and Z-Test in R

• Chi-Square Test

• Degree of Freedom

• One-Way ANOVA Test

• F-Distribution, F-Ratio Test

Regression Methods for Forecasting Numeric Data

• Understanding Neural Networks

• From Biological to Artificial Neurons

• Activation Functions

Deep Learning – Neural Networks and Support Vector Machines

• What is Regression?

• Model Selection

• Generalized Regression

• Simple Linear Regression

• Multiple Linear Regression

• Correlations

• Correlation between X and Y

• Ridge and Regularized Regression

• LASSO

• Time Series

• Prediction: Time Dependent/Variant Data

• Ordinary Least Square Regression Model

• Dummy Variable Regression Model

• Interaction Regression Model

• Non-Linear Regression Model

• Perform Regression Analysis with Multiple Variables

• Network Topology

• Recurrent and Gaussian Neural Network

• The Number of Layers

• The Direction of Information Travel

• The Number of Nodes in Each Layer

• Training Neural Networks with Backpropagation

• Support Vector Machines

• Classification with Hyperplanes

• Finding the Maximum Margin

• The Case of Linearly Separable Data

• The Case of Non-Linearly Separable Data

• Retrieve Data using SQL Statements

• Using Kernels for Non-Linear Spaces

Classification

• K-NN, Naïve Bayes, Support Vector Machines

• Defining Classification

• Understanding Classification and Prediction

• Decision Tree Classifier

• How to Build Decision Trees?

• Basic Algorithm for a Decision Tree

• Decision Trees and Data Mining

• Random Forest Classifier

• Features of Random Forests

• Out-of-Bag Error Estimate and Variable Importance

• Naïve Bayes Classifier Model

• Bayesian Theorem

• Advantages and Disadvantages of Naïve Bayes Classifier Model

• Understanding Support Vector Machines

• Understanding Linear SVMs

• Logistic Regression

• Bagging and Boosting (AdaBoost)

• Understanding K-means Clustering

• K-means and Pseudo Code

• K-means Clustering using R

• TF-IDF and Cosine Similarity

• Application to Vector Space Model

• What is Hierarchical Clustering?

• Hierarchical Clustering Algorithm

• Understanding Agglomerative Clustering Process

• DBSCAN Clustering

• What is Association Rule Mining?

• Association Rule Strength Measures

• Checking Apriori Algorithms

• Ordering Items

• Understanding Candidate Generation

• Performing Visualisation on Associated Rules

• Dimensionality reduction

## Features

##### Course features

### Collaboration projects

### Deployment on local and cloud platforms

### Video recordings for missed sessions

### Deployment on SageMaker

### Production scenarios

### Online and classroom options
