Introduction to Data Science (3 Hours)
In this module you will learn about data science and how it is used across the industry. You will also understand what types of analytics are performed and what exactly an analytics job profile looks like. What does a data scientist do on a day-to-day basis? You will also learn the difference between AI and ML and when these algorithms are used.
Data Analytics across Domains:
We will be discussing data analytics in Insurance, Automobile, Retail, Banking, Marketing, Aviation, Defense, Social Services and other domains. This will help you understand how analytics is applied in various domains.
What is Analytics:
We have to understand how these industries we just discussed above use analytics. Sometimes, it is not a detailed analysis that needs to be done. So we need to understand when and what is required for analytics. This helps us set the expectations of work that needs to be done and deliver better results.
Types of Analytics:
We will discuss the types of analytics from a data science perspective, especially Exploratory Data Analysis, Descriptive Analytics and Predictive Analytics. We also need to understand when to use which type of analytics to deliver results.
AI vs ML:
Not every project is a high-rigor project; some can be done using simple steps. Hence, we need to understand well what Artificial Intelligence, Deep Learning and Machine Learning are. This helps us understand which algorithm to apply, when to apply it and how to use it.
Structured Query Language (9 Hours)
SQL is an important tool used for capturing, saving and maintaining data. From the data science perspective, we do not need to learn every database feature that a Database Administrator would. Instead, you will learn the features and concepts used by data engineers and data scientists. Knowing how to use queries to gather data from multiple systems for analytics is a very important skill for any data professional.
Basic SQL:
We will cover the basics of databases and queries here. You will learn about the different types of queries and get a good understanding of tables. You will also learn how to fetch and handle data from different sources. It is important to know how to fetch the correct data from the systems in order to do correct analysis.
Advanced SQL:
During analysis, data does not always come from a single source or a single table. We may need to collate data from multiple sources in order to form a single dataset for analysis. This will at times require joins as well as subqueries. You will learn here about collating data from multiple sources and preparing a final dataset.
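As a small illustration of joins and subqueries, here is a minimal sketch that runs SQL from Python using the built-in sqlite3 module. The customers and orders tables and their rows are hypothetical, invented only for this example; the database used in the course may differ.

```python
import sqlite3

# Hypothetical tables and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Meera');
    INSERT INTO orders VALUES (1, 1, 250.0), (2, 1, 100.0), (3, 2, 75.0);
""")

# JOIN: collate two tables into one dataset.
# LEFT JOIN keeps customers that have no orders at all.
join_rows = conn.execute("""
    SELECT c.name, COUNT(o.id) AS n_orders, COALESCE(SUM(o.amount), 0) AS total
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

# SUBQUERY: keep only the orders above the average order amount.
above_avg = conn.execute("""
    SELECT id, amount
    FROM orders
    WHERE amount > (SELECT AVG(amount) FROM orders)
    ORDER BY id
""").fetchall()
```

Here join_rows holds one row per customer, including the customer with no orders, and above_avg holds the orders that exceed the average amount.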
Python Programming (24 Hours)
Analytics is about analyzing data. In SQL we learnt how to collate the data. We will now learn how to use Python to do analytics. Python is an interpreted language and is easy to learn. It is in high demand in the industry these days due to its versatility and ease of implementation.
Introduction to Python:
You will start programming from scratch here and understand Python and the IDE used for Python programming. We will learn about the basics of Python, its data types and indexing of data, along with a few basic commands.
Python Objects:
Here we will learn about the basic objects of Python, namely lists, tuples and dictionaries, which are often used in data science tasks. We will also learn about list comprehension, which many programmers use at advanced levels of Python programming.
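To give a feel for these objects, here is a minimal sketch. The variable names and values are made up for illustration.

```python
# A list is ordered and mutable: items can be added or changed.
scores = [72, 85, 90, 64]
scores.append(78)

# A tuple is ordered but immutable: it cannot be changed after creation.
point = (3, 4)

# A dictionary maps keys to values.
ages = {"Asha": 31, "Ravi": 27}
ages["Meera"] = 35

# List comprehension: build a new list from an existing sequence in one expression.
passed = [s for s in scores if s >= 75]   # keep only scores of 75 or more
squares = [x * x for x in point]          # square every item of the tuple
```

The comprehension for passed does the same job as a for loop with an if statement and an append, in a single line.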
Numpy and Pandas:
The most common libraries used in data analysis are NumPy and Pandas. We will learn both of these libraries in detail: how they are used and what can be done with them. You will also learn how to import data files into data frames, how to handle data frames and how to clean your data for analysis.
Data Visualization:
Data visualization is a very important concept for analysis. The saying that a picture is worth a million words proves true as we dive deeper into the matplotlib and seaborn libraries, which are used for plotting graphs for analysis. We will start from the basic scatter plot, line plot and bar graph, move on to advanced strip plots, swarm plots, etc., and also learn how to infer results from such graphs.
Exploratory Data Analysis:
Exploratory Data Analysis is a step carried out by every data analyst or data scientist in every project. It gives a good understanding of how the data is spread and what the data has to say. Understanding the data well, alongside the business understanding of the problem statement, helps in getting a better picture of the data and drawing better insights from it.
Statistics and Probability (6 Hours)
Statistics and Probability are very important for understanding Data Science, since the algorithms are built on them. We will learn the basics of probability first, then move into statistics and understand its concepts and how they are used for inferential statistics.
Probability and Basic Statistics:
We will slow down a little here and grasp every concept of statistics and probability. We are not here to learn the whole of statistics, only the basic concepts that are used within the purview of data science. You will learn about the basics of probability, conditional probability and total probability, and how they are used. Further, you will learn about the spread of data, the normal distribution, skewed data and tests of normality.
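Total probability and conditional probability can be worked through with a small example. The sketch below assumes a hypothetical factory with two machines; all the percentages are invented for illustration. Exact fractions are used so that the arithmetic is easy to follow.

```python
from fractions import Fraction

# Hypothetical numbers, invented for illustration only.
# Machine A makes 60% of all items, machine B makes 40%.
p_machine = {"A": Fraction(60, 100), "B": Fraction(40, 100)}
# Probability that an item is defective, given the machine that made it.
p_defect_given = {"A": Fraction(2, 100), "B": Fraction(5, 100)}

# Total probability: P(defect) = sum over machines of P(machine) * P(defect | machine)
p_defect = sum(p_machine[m] * p_defect_given[m] for m in p_machine)

# Conditional probability via Bayes' rule:
# P(A | defect) = P(A) * P(defect | A) / P(defect)
p_a_given_defect = p_machine["A"] * p_defect_given["A"] / p_defect
```

With these made-up numbers, 3.2% of all items are defective (4/125), and a defective item has a 3/8 chance of having come from machine A.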
Data Distributions:
Data can follow multiple distribution types. Here you will learn about the Uniform, Bernoulli, Binomial and Normal distributions. You will also learn the differences between these distributions and what a standard normal distribution is. We will also discuss the Central Limit Theorem (CLT) and why it is the backbone of data science. We will further discuss point estimates and confidence intervals.
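The CLT and confidence intervals can be tried out with the Python standard library alone. This is a minimal sketch: the seed, the sample sizes and the measurements list are arbitrary choices made for illustration, and the interval uses a normal approximation (for a small sample like this one, a t-based interval would usually be preferred).

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

# CLT sketch: draw many samples from a uniform (non-normal) distribution
# and look at how the sample means behave.
rng = random.Random(42)  # fixed seed so the run is repeatable
sample_means = [mean([rng.random() for _ in range(30)]) for _ in range(2000)]
# The means cluster around 0.5, the mean of the uniform(0, 1) distribution,
# with a spread close to (1 / sqrt(12)) / sqrt(30).

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    point_estimate = mean(sample)
    standard_error = stdev(sample) / sqrt(len(sample))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return point_estimate - z * standard_error, point_estimate + z * standard_error

measurements = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]  # hypothetical data
low, high = mean_confidence_interval(measurements)
```

The point estimate here is the sample mean, and the interval (low, high) is centred on it. Asking for a higher confidence level widens the interval.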
Hypothesis Testing and Statistical Tests:
We are talking about inferential statistics here, which means that based on the sample we are going to infer about the population. Hence there has to be some assumption to test, backed up with statistical evidence. This can be done using statistical tests and hypothesis testing. We will go through concepts like the t-test, z-test, F-test, ANOVA, chi-square test, etc.
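As one example of such a test, here is a minimal sketch of a two-sided one-sample z-test using only the standard library. The sample values, the hypothesised mean of 50 and the known population standard deviation of 2 are all assumptions made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

def one_sample_z_test(sample, mu0, sigma):
    """Two-sided z-test of H0: population mean == mu0, with sigma assumed known."""
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical sample, invented for illustration.
weights = [51, 52, 50, 53, 51, 52, 54, 53, 52]
z, p_value = one_sample_z_test(weights, mu0=50, sigma=2)

# Reject H0 at the 5% significance level when the p-value is below 0.05.
reject_h0 = p_value < 0.05
```

For this made-up sample the z statistic is about 3, so the p-value is well below 0.05 and the null hypothesis is rejected.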
Supervised Learning - Regression (6 Hours)
You will learn what supervised learning is and what a regression problem is in supervised learning. Supervised learning is classified into two parts: regression and classification. We will discuss regression problem statements here and see how the algorithm fits them.
Regression Problems:
You will learn here how to identify regression problems and how to solve them. Which algorithm needs to be applied for regression problems? What is the best fit line? How do we measure how good the model we built is?
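The best fit line and one common measure of fit can be sketched in a few lines of plain Python. This is a minimal illustration using ordinary least squares and R-squared on made-up data points; it is one way to answer the questions above, not the only one.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
```

An R-squared close to 1 means the line explains most of the variation in y, which is the case for these nearly linear points.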
Errors in the Model:
Not everything in this world is perfect. We are all prone to errors and so are our models. We will learn how to measure the errors in the models and how to use them. We will further learn what overfitting and underfitting of models are. Concepts of regularization will be shared as well, which will help us better understand cost functions and how they are used to control the models. This will help us understand Ridge and Lasso Regression as well.
Supervised Learning - Classification (12 Hours)
Here we will be talking in detail about the second type of supervised learning problem statement which is classification. We will understand how to identify the classification problem statement.
Classification Problems:
We will look into how to use logistic regression and the concepts of Sensitivity, Specificity, Recall, Precision and F1. It is important to understand when to use which concept based on the problem statement.
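These metrics all come from the four counts of a confusion matrix. The sketch below computes them in plain Python; the counts passed in are hypothetical, invented for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)        # of predicted positives, how many are correct
    recall = tp / (tp + fn)           # of actual positives, how many were found
    specificity = tn / (tn + fp)      # of actual negatives, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,             # recall is another name for sensitivity
        "specificity": specificity,
        "f1": f1,
    }

# Hypothetical counts: 40 true positives, 10 false positives,
# 20 false negatives, 30 true negatives.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
```

Note that Sensitivity and Recall are two names for the same quantity, so the function returns it once.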
Errors and how to minimize them:
Here you will learn about maximum likelihood estimation. It is also important to know that what the system suggests by default is not always the correct solution. So, what can we do to get optimized thresholds? You will learn about optimal thresholds here. You will also learn about ROC curves, the confusion matrix and concordance.
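One way to pick a threshold is to compute the ROC points for a few candidate thresholds and keep the one with the largest gap between true positive rate and false positive rate (Youden's J statistic). This is a minimal sketch of that idea, assuming hypothetical labels, scores and candidate thresholds; other criteria for the optimal threshold exist and may suit a given problem better.

```python
def roc_points(y_true, scores, thresholds):
    """Return (threshold, false positive rate, true positive rate) for each threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = []
    for t in thresholds:
        # A record is predicted positive when its score is at or above the threshold.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        points.append((t, fp / negatives, tp / positives))
    return points

def best_threshold(points):
    """Pick the threshold that maximises Youden's J = TPR - FPR."""
    return max(points, key=lambda p: p[2] - p[1])[0]

# Hypothetical labels (1 = positive) and model scores, invented for illustration.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.45, 0.9]
points = roc_points(y_true, scores, thresholds=[0.2, 0.35, 0.5, 0.75])
chosen = best_threshold(points)
```

Plotting the false positive rate against the true positive rate for many thresholds gives the ROC curve.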
Decision Trees:
As time progressed, multiple algorithms were devised for solving regression and classification problems apart from the traditional regression techniques. One such method is decision trees. You will learn here about the CART algorithm and also understand the basics of how the trees perform. What are the differences between the basic regression methods and the tree methods? What would be the right time to use the tree methods, and how can they be used to best effect?
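CART classification trees typically score a candidate split with Gini impurity. The sketch below shows that calculation in plain Python for one hypothetical split; the labels are invented for illustration, and a real tree would repeat this for every candidate split and keep the one with the lowest weighted impurity.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def split_gini(left, right):
    """Weighted Gini impurity of the two child nodes produced by a split."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# Hypothetical node with 5 "yes" and 5 "no" records, and one candidate split.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

parent_impurity = gini(parent)
child_impurity = split_gini(left, right)
```

The split lowers the impurity from 0.5 to 0.32, so the child nodes are purer than the parent.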
Supervised Learning - Ensemble Techniques (6 Hours)
Regression models can at times give us results that generalize well but are not highly accurate. On the other hand, trees can overfit and give very high scores on the training data, which is not good for us. So what do we do now? We can rely on ensemble techniques and give them a try. Here you will learn about different ensemble techniques, the concepts behind them, which one gives better results and when to use them.
Bootstrapping and Bagging:
The ensemble techniques work by combining multiple trees together in a specific pattern. This is where you will learn about bagging and random forest algorithms: how they work behind the scenes, their pros and cons, and when to use them. These are black box models and can be hard to explain. We will find ways and discuss how to explain such black box models to management as well.
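The two building blocks of bagging, bootstrap sampling and combining predictions, can be sketched with the standard library. This is a simplified illustration with made-up data: a real bagging model would train one tree on each bootstrap sample and then combine the trees' predictions.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return rng.choices(data, k=len(data))

def majority_vote(predictions):
    """Combine class predictions from several models by taking the most common one."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(7)              # fixed seed so the run is repeatable
data = [3, 5, 7, 9, 11, 13]         # hypothetical training values

# Each model in the ensemble would be trained on its own bootstrap sample.
samples = [bootstrap_sample(data, rng) for _ in range(5)]

# For classification, the ensemble's answer is the majority vote of its models.
vote = majority_vote(["spam", "ham", "spam"])
```

Because sampling is done with replacement, each bootstrap sample has the same size as the original data but usually repeats some records and leaves others out.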
Supervised Learning - Boosting Techniques (6 Hours)
You will now learn about the most commonly tried techniques for supervised learning. Boosting is among the most commonly used families of algorithms in the industry. You will learn the concepts behind the boosting algorithms and how they are implemented in the industry.
Boosting Algorithms:
You will learn here about adaptive boosting, gradient boosting and extreme gradient boosting (XGB). XGB is the most common algorithm being used in the industry. However, it is important to know which algorithm to use and how it works behind the scenes. You will also learn how to tune the hyperparameters of these algorithms and use them to give the required results.
Unsupervised Learning - Clustering (6 Hours)
Unsupervised learning is a very important concept that can be used for grouping data based on properties of the data. This helps a lot in customer segmentation, consumer behaviour analysis and many such case studies. Here you will learn about the concepts of unsupervised learning and the algorithms commonly used for clustering.
K-means Clustering:
This is the most common technique used for clustering in the industry, especially for larger datasets. We will understand here how the k-means algorithm works and how it is used to form clusters. You will also learn how to determine the optimal value of k for k-means clustering.
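The k-means loop of assigning points to the nearest centroid and then moving each centroid to the mean of its points can be sketched in plain Python. This is a simplified one-dimensional version with hand-picked starting centroids and made-up points; library implementations work in many dimensions and choose starting centroids more carefully. The inertia function is included because comparing it across values of k is one common way to look for a suitable k.

```python
def kmeans_1d(points, centroids, n_iter=10):
    """Simplified 1-D k-means: returns the final centroids and their clusters."""
    centroids = list(centroids)
    clusters = []
    for _ in range(n_iter):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

def inertia(centroids, clusters):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum((p - c) ** 2 for c, cluster in zip(centroids, clusters) for p in cluster)

points = [1, 2, 3, 10, 11, 12]                  # hypothetical data
centroids, clusters = kmeans_1d(points, [1, 12])
inertia_k2 = inertia(centroids, clusters)

# With a single cluster the inertia is much larger.
centroids_k1, clusters_k1 = kmeans_1d(points, [5])
inertia_k1 = inertia(centroids_k1, clusters_k1)
```

The sharp drop in inertia from k = 1 to k = 2 reflects that these points fall into two natural groups.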
Hierarchical Clustering:
This algorithm is not preferred for larger datasets because it is computationally expensive. However, it is very useful on smaller datasets. You will learn here about the concept of hierarchical clustering and how it is used on datasets. How do you determine the number of clusters to select based on the cophenetic correlation?
Dimensionality Reduction - PCA (3 Hours)
By now you will have learnt about feature engineering, the curse of dimensionality and overfitting of models. To address these, we can use Principal Component Analysis (PCA). We will learn about the concepts of PCA and how to use it in unsupervised and supervised learning models. We will also look at the pros and cons of using PCA in modeling.
Time Series Forecasting (3 Hours)
This is a crucial technique for forecasting future values. You will learn here about moving averages, smoothing techniques, ARIMA, SARIMA and ARIMAX, and how they are used to forecast the future based on time. The variable here is time, based on which the values change, and we will discuss in detail how this is implemented.
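Two of the simplest techniques mentioned above, the moving average and exponential smoothing, can be sketched in plain Python. The series, the window size and the smoothing factor alpha are made up for illustration.

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]

def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: recent values get more weight as alpha grows."""
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

series = [10, 12, 14, 16, 18]   # hypothetical values observed over time
ma = moving_average(series, window=3)
es = exponential_smoothing(series, alpha=0.5)
```

Both methods smooth out short-term fluctuations. The moving average weights the last few values equally, while exponential smoothing lets the influence of older values fade gradually.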
Your Mentor:
Yogesh Singh (17+ Years of Experience)
Shivam Chugh (5+ Years of Experience)
Shivam Chawla (7+ Years of Experience)
Vikrant Yadav (8+ Years of Experience)
Nilesh Pathak (9+ Years of Experience)
LMS Features
“Welcome to Aspire, the smart AI-enabled learning management system.”
Aspire will be your one-stop place for all your learning and placement needs. In your journey to becoming a data scientist, Aspire is the only place you need to be. It has all the content, assessments, interview preparation and placement services. It comes with adaptive learning, which helps you learn at your own pace and guides you at each step as it learns your learning pattern. Below are a few things which distinguish Aspire from other LMSs in the education space.
On the dashboard you can see all the courses you are enrolled in. All the important notifications on upcoming tests, classes, competitions, job placements and webinars will be visible here. You can simply click and access all of it.
“You can see all your course content with assessment due dates. You need to cover and close all the topics before you can go ahead. Aspire makes sure you are on track with its intelligent insight and assessment of your performance, based on your qualitative and quantitative learning.”
“Access to Discussion Forums to brainstorm and promote peer learning”
You get access to a discussion forum where you can start a topic and everyone can get involved, brainstorm with you, help you get the support you need, build new ideas and clear your doubts.
“You can say hi to your peers and Mentor via Aspire Chat.”
We have Aspire Chat, where you can ping your classmates, mentors and guides and speak to them one on one whenever you have doubts.
“AI enabled Job Assistance”
We take your required information and our AI surfaces the most relevant jobs, which you can apply to in a single click and then follow up on through the rest of the process. You can see the job listings of our placement partners, with jobs curated for you to help you move ahead in your career. You no longer have to go to multiple platforms to find the right job for you. Aspire does all the hard work for you!
We keep releasing national-level hackathons. You can participate, show the world your skills and get hired by the companies looking for you!