Course Description

The Data Science with Python course introduces the learner to the basics of the Python programming environment, including fundamental Python programming techniques such as lambdas, reading and manipulating CSV files, and the NumPy library.

The course also introduces the learner to information visualization basics, with a focus on reporting and charting using the Matplotlib library. It starts from a design and information literacy perspective, touching on what distinguishes a good visualization from a bad one and how statistical measures translate into visualizations.

Finally, the course introduces the learner to applied machine learning, focusing more on the techniques and methods than on the statistics behind them. Supervised approaches for creating predictive models will be described, and learners will be able to apply the scikit-learn predictive modelling methods while understanding process issues related to data generalizability (e.g., cross-validation, overfitting). The course ends with a look at more advanced techniques, such as building ensembles, and at the practical limitations of predictive models. By the end of this course, students will be able to identify the difference between a supervised (classification) and an unsupervised (clustering) technique, identify which technique they need to apply for a particular dataset and need, engineer features to meet that need, and write Python code to carry out an analysis.


Course Content

Module 01 - Introduction to Data Science using Python

1.1 What is Data Science, what does a data scientist do
1.2 Various examples of Data Science in the industries
1.3 How Python is deployed for Data Science applications
1.4 Steps in the Data Science process, such as data wrangling, data exploration and model selection
1.5 Introduction to Python programming language
1.6 Important Python features, how is Python different from other programming languages
1.7 Python installation, Anaconda Python distribution for Windows, Linux and Mac
1.8 How to run a sample Python script, Python IDE working mechanism
1.9 Running some Python basic commands
1.10 Python variables, data types and keywords.

Module 02 - Python basic constructs

2.1 Introduction to a basic construct in Python
2.2 Understanding indentation like tabs and spaces
2.3 Python built-in data types
2.4 Basic operators in Python
2.5 Loop and control statements like break, if, for, continue, else, range() and more.

Module 03 - Mathematics for Data Science - Statistics & Probability

3.1 Central Tendency
3.2 Variability
3.3 Hypothesis Testing
3.4 ANOVA
3.5 Correlation
3.6 Regression
3.7 Probability Definitions and Notation
3.8 Joint Probabilities
3.9 The Sum Rule, Conditional Probability, and the Product Rule
3.10 Bayes’ Theorem
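
As an illustration of topics 3.9 and 3.10, here is a minimal sketch of Bayes' Theorem applied to a hypothetical diagnostic test. The function name `posterior` and the numbers (1% prior, 95% true-positive rate, 5% false-positive rate) are assumptions chosen for illustration and are not taken from the course material.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) using Bayes' Theorem.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    """
    # Sum rule + product rule: P(E) = P(E|H) P(H) + P(E|not H) P(not H)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    # Bayes' Theorem: P(H|E) = P(E|H) P(H) / P(E)
    return likelihood * prior / evidence


# Hypothetical numbers: a rare condition and a fairly accurate test.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 3))  # prints 0.161
```

Even with a 95% true-positive rate, the posterior stays low because the prior is small, which is the usual point of this exercise.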

Module 04 - OOPs in Python 

4.1 Understanding the OOP paradigm like encapsulation, inheritance, polymorphism and abstraction
4.2 What are access modifiers, instances, class members
4.3 Classes and objects
4.4 Function parameters and return types
4.5 Lambda expressions.
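
The ideas listed in this module can be sketched in a few lines. The class names `Account` and `SavingsAccount` and all values below are hypothetical, chosen only to show encapsulation, inheritance, and a lambda expression together.

```python
class Account:
    """Encapsulates a balance behind methods."""

    def __init__(self, owner, balance=0.0):
        self.owner = owner
        self._balance = balance  # leading underscore: private by convention

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount
        return self._balance

    @property
    def balance(self):
        # Read-only access to the encapsulated value
        return self._balance


class SavingsAccount(Account):
    """Inherits from Account and adds interest."""

    def __init__(self, owner, balance=0.0, rate=0.02):
        super().__init__(owner, balance)
        self.rate = rate

    def add_interest(self):
        return self.deposit(self._balance * self.rate)


accounts = [Account("a", 50.0), SavingsAccount("b", 100.0, rate=0.10)]
accounts[1].add_interest()

# A lambda expression used as the sort key
richest = max(accounts, key=lambda acct: acct.balance)
print(richest.owner)  # prints b
```

Python has no enforced access modifiers; the underscore prefix is a naming convention, and the `property` gives callers a read-only view.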

Module 05 - NumPy for mathematical computing

5.1 Introduction to mathematical computing in Python
5.2 What are arrays and matrices, array indexing, array math, inspecting a NumPy array, NumPy array manipulation
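
A short sketch of the operations named in 5.2, assuming NumPy is installed. The array contents and variable names are arbitrary examples.

```python
import numpy as np

# Build a 3x4 array holding 0..11
a = np.arange(12).reshape(3, 4)

# Inspecting an array
print(a.shape, a.ndim, a.size)  # prints (3, 4) 2 12

# Indexing and slicing
row = a[1]            # second row
col = a[:, 2]         # third column
block = a[:2, 1:3]    # first two rows, columns 1 and 2
evens = a[a % 2 == 0] # boolean mask selects even values

# Array math is element-wise; @ is matrix multiplication
doubled = a * 2
col_sums = a.sum(axis=0)
product = a @ a.T     # (3x4) @ (4x3) gives a 3x3 matrix

# Manipulation: flatten back to one dimension
flat = a.ravel()
```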

Module 06 - SciPy for scientific computing

6.1 Introduction to SciPy, building on top of NumPy
6.2 What are the characteristics of SciPy
6.3 Various SciPy subpackages like signal, integrate, fftpack, cluster, optimize, stats and more; Bayes' Theorem with SciPy

Module 07 - Data manipulation

7.1 What is data manipulation; using the Pandas library
7.2 Numpy dependency of Pandas library
7.3 Series object in pandas
7.4 Dataframe in Pandas
7.5 Loading and handling data with Pandas
7.6 How to merge data objects
7.7 Concatenation and various types of joins on data objects, exploring datasets

Module 08 - Data visualization with Matplotlib

8.1 Introduction to Matplotlib
8.2 Using Matplotlib for plotting graphs and charts like Scatter, Bar, Pie, Line, Histogram and more
8.3 Matplotlib API

Module 09 - Machine Learning using Python

9.1 Revision of topics in Python (Pandas, Matplotlib, NumPy, scikit-learn)
9.2 Introduction to machine learning
9.3 The need for machine learning
9.4 Types of machine learning and workflow of machine learning
9.5 Use cases in machine learning and its various algorithms
9.6 What is supervised learning
9.7 What is Unsupervised Learning

Module 10 - Supervised learning

10.1 What is linear regression
10.2 Step by step calculation of Linear Regression
10.3 Linear regression in Python
10.4 Logistic Regression
10.5 What is classification
10.6 Decision Tree, Confusion Matrix, Random Forest, Naïve Bayes classifier (self-paced), Support Vector Machine
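
The step-by-step calculation in 10.2 can be sketched in plain Python using the ordinary least squares formulas for one feature. The function name `fit_line` and the five data points are made up for illustration; the course itself may use different data or a library such as scikit-learn.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


xs = [1, 2, 3, 4, 5]          # hypothetical feature values
ys = [2, 4, 5, 4, 5]          # hypothetical targets
slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
print(round(slope, 2), round(intercept, 2), round(r2, 2))  # prints 0.6 2.2 0.6
```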

Module 11 - Unsupervised Learning

11.1 Introduction to unsupervised learning
11.2 Use cases of unsupervised learning
11.3 What is clustering
11.4 Types of clustering (self-paced): exclusive clustering, overlapping clustering, hierarchical clustering (self-paced)
11.5 What is K-means clustering
11.6 Step by step calculation of k-means algorithm
11.7 Association rule mining (self-paced), market basket analysis (self-paced), measures in association rule mining: support, confidence, lift
11.8 Apriori Algorithm
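
The step-by-step k-means calculation in 11.6 can be sketched for one-dimensional data in plain Python. The function name `kmeans_1d`, the six points, and the deliberately poor starting centroids are assumptions for illustration; real use would typically rely on a library implementation and multi-dimensional distances.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Lloyd's algorithm on 1-D data with given starting centroids."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: put each point in the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]  # keep empty clusters in place
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    return centroids, clusters


points = [1, 2, 3, 10, 11, 12]
centroids, clusters = kmeans_1d(points, centroids=[1, 2])
print(centroids)  # prints [2.0, 11.0]
print(clusters)   # prints [[1, 2, 3], [10, 11, 12]]
```

With these starting values the centroids move from (1, 2) to (1, 7.6) to (2, 11), and the next pass changes nothing, so the loop stops.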

Module 12 - Python integration with Spark

12.1 Introduction to PySpark
12.2 Who uses PySpark, the need for Spark with Python
12.3 PySpark installation
12.4 PySpark fundamentals
12.5 Advantages of PySpark over MapReduce
12.6 PySpark use cases and demo

Module 13 - Dimensionality Reduction

13.1 Introduction to Dimensionality
13.2 Why Dimensionality Reduction
13.3 PCA
13.4 Factor Analysis
13.5 LDA

Module 14 - Time Series Forecasting

14.1 White Noise
14.2 AR model
14.3 MA model
14.4 ARMA model
14.5 ARIMA model
14.6 Stationarity
14.7 ACF & PACF
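
As a small illustration of 14.7, the sample autocorrelation function (ACF) can be computed in plain Python. The function name `autocorrelation` and the toy series are assumptions; this is the common estimator that divides by the total sum of squared deviations, and library implementations may differ in details.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    # Denominator: total sum of squared deviations (the lag-0 term)
    denom = sum((x - mean) ** 2 for x in series)
    # Numerator: products of deviations that are `lag` steps apart
    num = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return num / denom


series = [1, 2, 3, 4, 5]  # toy trending series
acf = [autocorrelation(series, k) for k in range(3)]
print([round(v, 2) for v in acf])  # prints [1.0, 0.4, -0.1]
```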

Student feedback

10 Reviews

Course Rating (review counts per rating level, in the order listed on the page): 10, 0, 0, 0, 0


Dhruv Nautiyal

Great Learning

I enjoyed the learning. It may be a hard course if you are just looking to understand Data Science superficially. But I feel it is really worth it: if you give it your proper focus, the learning is fabulous. Must join SparkAcademy!


Sunny Gada

Excellent tutorials

An excellent course for a person who wants to explore the world of Data Science and master it. The content and pattern of this course are the best part and the reason why I joined here.


Ritesh Jha

Amazing Content

Amazing, very rich content, it allows you to go deep in understanding of Data Science and build your concepts really well. Go for it guys!!


Karan Grover

Best Course

Best course ever. It was very helpful and helped me to clear various concepts of Data Science using Python. All lectures were really great. Amazing training.


Sachin Bohra

Useful Course

Very useful course. It built my Data Science basis from scratch. First course gives me the enthusiasm to start online learning. A good place for studying. Appreciable.


Sanjana Rajput

Best training

I am giving it 5 stars because I really enjoyed the course and I learned a lot. SparkAcademy provides such great training, they are the best.


Kajal Devgn

Great Course

Very well taught, easy to understand. One must pay attention to the content and understand the concepts. Amazing course from my point of view!


Akash Dagar

Amazing training

The instructor explained the concepts clearly and in an engaging way, which is what I really appreciate about the tutorials. What I liked most is the design of the assignments.


Manvi Ujjainiya

Worthy Course

It's definitely worth it. I faced problems solving the assignments, but since the difficulty helps with gaining a better understanding, that's a positive part of the course.


Kritika Goel

Excellent course

Excellent course. I'm looking forward to continuing in this specialization. Highly recommended!! This Python Data Science course was worthwhile.



    Course Features

    • Probability & Statistics
    • Machine Learning
    • Programming
    • Data Manipulation
    • Data Visualization with Matplotlib
    • OOPS in Python
    • Dimensionality Reduction
    • Time Series Forecasting
    • Python integration with Spark
    • Pandas, NumPy & scikit-learn