
Continue to develop your skills in Unsupervised Traditional (Shallow) Machine Learning and Statistical Modelling with this Intermediate level Data Analysis program.

Data Analysis : Intermediate : DA201

  • DURATION

    4 Months

  • WEEKLY

    45 hours

  • FEE

    USD 5,760 INR 437,760†

About the Course

The problem is not Big Data. The problem is Small Data. We live in a world of data exhaust. Unrefined data is plentiful. But most of the time, we have very Small Data to analyze. This is where Statistical Modelling shines. This course shows you everything you need to get started with the first principles of statistical modelling in a fun and easy-to-digest format. There are four functional roles in Data Science, namely, Business Analyst, Data Analyst, Machine Learning Engineer and Data Engineer. The DA track targets the Data Analyst role. Although this course is on Data Analysis, you will learn some Predictive Analytics too, as it is sometimes difficult to separate the two.

Most people ignore Statistics, thinking that in the world of Deep Learning it's irrelevant. Nothing could be further from the truth. Before you can learn to analyze Big Data, you need to learn to analyze small data. Before you can apply Deep Learning, learn to apply Shallow Learning using traditional non-deep and Statistical Modelling techniques. This is exactly the focus of this course - Shallow Learning and Small Data.

Reason? Unlike Deep Learning models, which are data-hungry black boxes, these statistical models allow you to arrive at an educated guesstimate with very little data, and the results are explainable. For example, they are the foundation of many digital marketing techniques such as A/B testing and Hypothesis Testing. They allow you to infer and predict properties of populations using small samples, which are cheap and easy to collect. Sometimes it's very expensive or outright impossible to collect huge quantities of data. Lastly, there is an entire class of Data problems that can only be solved using Computational Probability and Statistics, period.
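To make the idea of inferring a population property from a small sample concrete, here is a minimal sketch of a percentile bootstrap confidence interval for a mean. It is only an illustration and is not taken from the course material: the function name `bootstrap_mean_ci`, its parameters and the 12 order values are all hypothetical.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=10_000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of a small sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(sample)
    # Resample with replacement many times and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    alpha = 1.0 - confidence
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical data: order values collected from only 12 customers.
sample = [23.5, 19.0, 31.2, 27.8, 22.1, 25.4, 29.9, 18.7, 24.3, 26.6, 21.0, 28.5]
low, high = bootstrap_mean_ci(sample)
print(f"sample mean = {statistics.fmean(sample):.2f}")
print(f"95% bootstrap interval for the population mean: ({low:.2f}, {high:.2f})")
```

The interval is computed by resampling alone, so the same pattern can be reused for other statistics by swapping `statistics.fmean` for another function.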

Most importantly, this is where you learn to solve Data problems from first principles and learn how to program with Data. These techniques are as old as statistics itself and predate modern computers. The twist is that we have made it easy for you to learn and apply these techniques by letting the computer do all the hard work! You learn to understand a problem, choose the right Statistical tool to solve it, formulate a solution and then let the computer do the calculations. This discrete approach does not depend on formulas or Calculus. This allows you to create tests for which formulas do not exist. The discrete computational approach is more flexible, so it is less prone to errors and is easier to understand and apply than Analytical Statistics (the one we hated in school).
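One way to picture a test built by computation rather than by formula is a permutation test on a difference in medians, a statistic with no simple textbook formula. The sketch below is an assumption about what such a test could look like, not the course's own code; `permutation_test`, its parameters and the page-load times are hypothetical.

```python
import random
import statistics


def permutation_test(group_a, group_b, statistic=statistics.median,
                     n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in `statistic` between groups."""
    rng = random.Random(seed)
    observed = abs(statistic(group_a) - statistic(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are interchangeable,
        # so shuffle the pooled data and split it back into two groups.
        rng.shuffle(pooled)
        diff = abs(statistic(pooled[:n_a]) - statistic(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical data: page load times (seconds) for two page designs.
design_a = [12.1, 14.3, 11.8, 15.2, 13.7, 12.9, 14.8, 13.1]
design_b = [15.9, 17.2, 14.6, 18.1, 16.4, 15.3, 17.8, 16.0]
observed, p_value = permutation_test(design_a, design_b)
print(f"observed difference in medians = {observed:.2f}")
print(f"approximate p-value = {p_value:.4f}")
```

Passing a different function as `statistic` (for example `statistics.fmean`) changes the test without any new derivation, which is the flexibility the computational approach is meant to offer.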

This is not your regular school textbook Statistics course filled with theoretical mathematical symbols. This is a practical, hands-on, solution-based approach designed to be used on real-life problems. The course is jam-packed with interactive classes, interesting articles, book references and exciting projects the likes of which you may have never seen!

Prerequisites

  • Curiosity
  • Basic arithmetic skills - Brackets, division, multiplication, addition, subtraction
  • Ability to operate a computer, keyboard and mouse
  • Ability to use a web browser to access and use the internet
  • Ability to install software on your computer
  • Data Analysis : Introduction : DA101

Hardware and Software Requirements

  • Physical operational computer (not in virtualization) - Fedora 34 or greater, OR PopOS/Ubuntu 20.04 or greater, OR Windows 10 or greater, OR macOS 10 or greater
  • 16 GB RAM
  • Broadband internet connection faster than 5 Mbps
  • 100 GB free hard disk space, SSD Drive recommended
  • A dedicated graphics card is not required but recommended. Cloud computing will be used

Learning Objective

Intermediate AI And ML (Shallow/Traditional Learning)
  • Unsupervised Clustering
  • Reinforcement Learning - Multi-armed Bandit problem
  • Lasso And Ridge Regularization
  • Boosting
  • Feature Engineering
  • Dimensionality Reduction
Intermediate Computational Statistics
  • Probability Mass Functions
  • Cumulative Distribution Functions
  • Modelling Distributions
  • Continuous Distributions
  • Probability Density Functions
  • Using Statistical Sampling
  • Relationship Between Variables
  • Estimating Populations And Samples
  • Confidence Intervals
  • Hypothesis Testing
  • Intro To Chi Squared Test
  • Survival Analysis
  • Analytic Methods
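
As a small taste of the computational statistics topics listed above, the following sketch builds a probability mass function and the matching cumulative distribution function directly from data. The helper names `pmf` and `cdf` and the household-size values are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import accumulate


def pmf(values):
    """Map each distinct value to its relative frequency (probabilities sum to 1)."""
    counts = Counter(values)
    n = len(values)
    return {x: counts[x] / n for x in sorted(counts)}


def cdf(pmf_map):
    """Map each value x to P(X <= x) by accumulating the PMF in sorted order."""
    xs = sorted(pmf_map)
    return dict(zip(xs, accumulate(pmf_map[x] for x in xs)))


# Hypothetical data: household sizes from a survey of 10 homes.
household_sizes = [1, 2, 2, 3, 3, 3, 4, 4, 2, 1]
size_pmf = pmf(household_sizes)
size_cdf = cdf(size_pmf)
for size in size_pmf:
    print(f"size {size}: P(X = {size}) = {size_pmf[size]:.2f}, "
          f"P(X <= {size}) = {size_cdf[size]:.2f}")
```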

Learning Outcome

  • Learn to perform unsupervised clustering
  • Understand the Multi-armed Bandit problem and apply solutions to it
  • Understand what overfitting is and how to prevent it
  • Learn to use models that are among the most successful in Data Science competitions, such as XGBoost
  • Perform feature engineering and selection to increase model performance
  • Reduce the number of features to improve model performance
  • Learn to model data distributions using various interconnected techniques
  • Learn to model discrete and continuous distributions
  • Learn to model probability distributions
  • Learn the correct ways to sample and collect data
  • Measure relationships between data
  • Learn to estimate population parameters using sample statistics
  • Perform Null Hypothesis Significance Testing (NHST)
  • Learn to perform survival analysis
  • Contrast and understand the difference between analytical methods and discrete computational methods
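
To illustrate the Multi-armed Bandit outcome above, here is a minimal sketch of one common strategy, epsilon-greedy. This is an assumed example and not necessarily the method taught in the course; the function `epsilon_greedy`, its parameters and the three conversion rates are hypothetical.

```python
import random


def epsilon_greedy(true_rates, n_rounds=5_000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy agent on Bernoulli arms with the given rates."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms        # how many times each arm was tried
    estimates = [0.0] * n_arms  # running mean reward observed per arm
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            # exploit: pick the arm with the best estimate so far
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        pulls[arm] += 1
        # Incremental update of the mean, no need to store every reward.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return pulls, estimates


# Hypothetical conversion rates of three ad variants (unknown to the agent).
pulls, estimates = epsilon_greedy([0.05, 0.10, 0.20])
for arm, (n, est) in enumerate(zip(pulls, estimates)):
    print(f"arm {arm}: pulled {n} times, estimated rate {est:.3f}")
```

A larger `epsilon` spends more rounds exploring, while a smaller one commits sooner to the arm that currently looks best.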

Fineprint

  • The topics presented are tentative, and we reserve the right to add or remove a topic to update or improve the bootcamp, or for technical or time reasons.
  • † 18% Indian taxes extra.
teacher
Manuj Chandra


Data Science

Related Course

Business Intelligence and Information Design : Introduction : BIID101
  • 3 Months
  • Business Intelligence

Generative AI for Marketing Professionals (No Code)
  • 2 Days
  • Data Analytics

Data Analysis : Introduction : DA101
  • 4 Months
  • Data Analysis
