Master the secrets of Traditional (Shallow) Machine Learning and Probabilistic Modelling with this advanced-level Data Analysis program.

Data Analysis : Advance : DA301

  • DURATION

    4 Months

  • WEEKLY

    45 hours

  • FEE

    USD 5,760 / INR 437,760†

About the Course

In May 1968, the U.S. Navy’s nuclear submarine USS Scorpion failed to arrive as expected at her home port of Norfolk, Virginia. The command officers of the U.S. Navy were nearly certain that the vessel had been lost off the Eastern Seaboard, but an extensive search there failed to discover the remains of Scorpion.

Then, a Navy deep-water expert, John P. Craven, suggested that Scorpion had sunk elsewhere. Craven organized a search southwest of the Azores based on a controversial approximate triangulation by hydrophones. He was allocated only a single ship, Mizar, and he took advice from a firm of consultant mathematicians in order to maximize his resources. A Bayesian search methodology was adopted. Experienced submarine commanders were interviewed to construct hypotheses about what could have caused the loss of Scorpion.

The sea area was divided up into grid squares and a probability assigned to each square, under each of the hypotheses, to give a number of probability grids, one for each hypothesis. These were then added together to produce an overall probability grid. The probability attached to each square was then the probability that the wreck was in that square. A second grid was constructed with probabilities that represented the probability of successfully finding the wreck if that square were to be searched and the wreck were to be actually there. The result of combining this grid with the previous grid is a grid which gives the probability of finding the wreck in each grid square of the sea if it were to be searched.
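The grid combination described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the actual search: the 2x2 grids, the equal hypothesis weights, the detection probabilities, and the function names are all made up for the example. The `update_after_miss` step is the standard Bayesian search update applied after an unsuccessful search; it is an assumption added here and is not described in the account above.

```python
# Illustrative sketch of a Bayesian search grid.
# All numbers are hypothetical; none come from the Scorpion search.

def combine_hypotheses(grids, weights):
    """Weighted sum of per-hypothesis location grids (weights sum to 1)."""
    rows, cols = len(grids[0]), len(grids[0][0])
    return [
        [sum(w * g[r][c] for g, w in zip(grids, weights)) for c in range(cols)]
        for r in range(rows)
    ]

def find_probability(location, detection):
    """P(found in a cell) = P(wreck is there) * P(detected | wreck is there)."""
    return [
        [p * d for p, d in zip(p_row, d_row)]
        for p_row, d_row in zip(location, detection)
    ]

def update_after_miss(location, detection, cell):
    """Bayes update of the location grid after searching `cell` without success."""
    r0, c0 = cell
    miss = 1.0 - location[r0][c0] * detection[r0][c0]  # P(the search fails)
    updated = [[p / miss for p in row] for row in location]
    updated[r0][c0] = location[r0][c0] * (1.0 - detection[r0][c0]) / miss
    return updated

# Two hypothetical location grids, one per hypothesis; each sums to 1.
hypothesis_a = [[0.7, 0.1], [0.1, 0.1]]
hypothesis_b = [[0.1, 0.1], [0.2, 0.6]]
location = combine_hypotheses([hypothesis_a, hypothesis_b], weights=[0.5, 0.5])

# Hypothetical chance of spotting the wreck in each cell if it is really there.
detection = [[0.3, 0.8], [0.8, 0.9]]

find = find_probability(location, detection)
cells = [(r, c) for r in range(2) for c in range(2)]
best = max(cells, key=lambda rc: find[rc[0]][rc[1]])  # cell to search first
posterior = update_after_miss(location, detection, best)
```

In this toy setup the most probable location is cell (0, 0), but the best cell to search first is (1, 1), because the wreck would be much easier to detect there. If that search finds nothing, the probability of (1, 1) drops and the other cells rise, while the grid still sums to 1.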

At the end of October 1968, the Navy’s oceanographic research ship, Mizar, located sections of the hull of Scorpion on the seabed, about 740 km southwest of the Azores, under more than 3,000 m of water.

Sound like fun? Then dive in to learn the first principles of Bayesian Probabilistic Programming in a fun, easy-to-digest format. Bayesian statistics is well suited to small data: sometimes we have to make educated guesses from very few data points because gathering more is impossible or expensive. This course will show you how to do it.
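As a small taste of reasoning from very few data points, here is a minimal sketch of a Beta-Binomial update. The choice of model, the uniform prior, and the four made-up observations are assumptions for illustration only; they are not taken from the course material.

```python
import math

def beta_update(prior_a, prior_b, successes, failures):
    """Posterior Beta parameters after observing binary outcomes."""
    return prior_a + successes, prior_b + failures

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

def beta_sd(a, b):
    """Standard deviation of a Beta(a, b) distribution."""
    return math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))

# Hypothetical data: 3 successes and 1 failure, starting from a uniform Beta(1, 1) prior.
a, b = beta_update(1, 1, successes=3, failures=1)

posterior_mean = beta_mean(a, b)  # estimate of the success rate
posterior_sd = beta_sd(a, b)      # how uncertain that estimate still is
```

With only four observations, the posterior mean (2/3) sits below the raw success frequency (3/4) because the prior pulls it toward 1/2, and the wide standard deviation makes the remaining uncertainty explicit.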

The focus of this course is to apply probabilistic modelling to non-trivial problems. There are four functional roles in Data Science, namely, Business Analyst, Data Analyst, Machine Learning Engineer and Data Engineer. The DA track targets the Data Analyst role.

Prerequisites

  • Curiosity
  • Basic arithmetic skills - Brackets, division, multiplication, addition, subtraction
  • Ability to operate a computer, keyboard and mouse
  • Ability to use a web browser to access and use the internet
  • Ability to install software on your computer
  • Data Analysis : Intermediate : DA201

Hardware and Software Requirements

  • Physical computer (not a virtual machine) running Fedora 34 or later, Pop!_OS/Ubuntu 20.04 or later, Windows 10 or later, or macOS 10 or later
  • 16 GB RAM
  • Broadband internet connection faster than 5 Mbps
  • 100 GB free disk space; an SSD is recommended
  • A dedicated graphics card is recommended but not required; the cloud will be used

Learning Objective

Advanced AI And ML (Shallow/Traditional Learning)
  • Timeseries
  • AutoML
  • H2O
  • Explaining Models
Bayesian Computational Statistics
  • What is a Statistical Distribution
  • Introduction to Inverse Probability
  • Introduction to Bayes Theory
  • Advanced Computational Statistics
  • Bayesian Estimation
  • Odds
  • Decision Analysis
  • Probabilistic Prediction
  • Observer Bias and Queuing Theory
  • Applying Bayes Theorem in Two Dimensions
  • B.E.S.T (Bayesian estimation supersedes the t-test): Hypothesis Testing
  • Hierarchical Models
  • Species Problem

Learning Outcome

  • Learn various techniques to perform time-series analysis on non-random-walk data
  • Use AutoML tools to create models
  • Learn to explain models
  • Understand the difference between Frequentist and Bayesian approaches
  • Learn to apply the inverse probability theorem using advanced computational statistics methods
  • Perform estimations and decision analysis
  • Perform probabilistic predictions (for example, which soccer team will win; think Moneyball)
  • Understand and apply queuing theory
  • Apply computational Bayes' Theorem in two dimensions
  • Perform hypothesis testing using Bayesian methods
  • Create hierarchical probabilistic models
  • Apply Bayes' Theorem to the species detection problem

Fineprint

  • The topics presented are tentative; we reserve the right to add or remove topics to update or improve the bootcamp, or for technical or time reasons.
  • † 18% Indian taxes extra.
teacher
Manuj Chandra

Data Science
