IMW - Classification Modelling

Speakers and Syllabus


Detailed Syllabus

20-02-2023 (Day 1)
Lecture 1 & 2. Random experiment, sample space, events, probability and its properties, conditional probability, independent events, Bayes probability, random variables, standard probability density functions, independence, law of large numbers, central limit theorem.
Lecture 3 & 4. Basics of matrix algebra, square symmetric matrices, eigenvalues and eigenvectors, positive definite matrices, spectral decomposition, square root of a square symmetric matrix and its properties, quadratic forms, matrix inequalities and optimization.
Lab: Basic vector computation, matrix operations (addition, multiplication, inverse, determinant, rank, eigenvalues, eigenvectors, SVD for square symmetric matrices, square root of matrices), generating random variables, plotting of pdfs/pmfs, sample mean and its closeness to the population mean.
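
As an illustration of the kind of exercise this lab covers, the following is a minimal sketch that assumes NumPy is available. It computes the square root of a symmetric positive definite matrix through its spectral decomposition and then checks how close the sample mean gets to the population mean as the sample size grows. The matrix, the seed, and the exponential distribution are illustrative choices, not part of the syllabus.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

# Build a symmetric positive definite matrix A = B B^T + I (illustrative choice).
B = rng.normal(size=(3, 3))
A = B @ B.T + np.eye(3)

# Spectral decomposition A = P diag(lam) P^T; eigh is intended for symmetric input.
lam, P = np.linalg.eigh(A)

# Symmetric square root: A^(1/2) = P diag(sqrt(lam)) P^T.
A_sqrt = P @ np.diag(np.sqrt(lam)) @ P.T
sqrt_error = np.max(np.abs(A_sqrt @ A_sqrt - A))

# Law of large numbers: sample means of exponential draws with population mean 2.
pop_mean = 2.0
sample_sizes = [10, 1_000, 100_000]
sample_means = [rng.exponential(scale=pop_mean, size=n).mean() for n in sample_sizes]

print("max |A_sqrt @ A_sqrt - A| =", sqrt_error)
for n, m in zip(sample_sizes, sample_means):
    print(f"n={n:>6}  sample mean={m:.4f}  |error|={abs(m - pop_mean):.4f}")
```

The same steps can be reproduced in R with eigen() and rexp().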

21-02-2023 (Day 2)
Lecture 1 & 2. Data matrix, measures of central tendency, dispersion, skewness, kurtosis, some graphical tools like box plot, histogram, scatterplot, likelihood function, estimation strategies, application of central limit theorem.
Lecture 3 & 4. Basics of testing of hypothesis, type I error and type II error, power, p-value, connection between testing of hypothesis and classification, z-test, t-test, paired t-test.
Lab: Exploration of a few real/simulated data sets, simulation and confirmatory analysis of the empirical significance level for mean testing, along with the power curve.
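
The simulation part of this lab could look roughly like the sketch below, which assumes NumPy and uses a two-sided z-test with known standard deviation. The function name rejection_rate, the sample size, the number of replications, and the grid of mean shifts are all hypothetical choices made for illustration.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(1)  # fixed seed, chosen only for reproducibility

def rejection_rate(true_mean, mu0=0.0, sigma=1.0, n=30, alpha=0.05, reps=20_000):
    """Fraction of simulated samples in which the two-sided z-test rejects H0: mean = mu0."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)           # critical value, about 1.96
    samples = rng.normal(loc=true_mean, scale=sigma, size=(reps, n))
    z = (samples.mean(axis=1) - mu0) / (sigma / np.sqrt(n))
    return float(np.mean(np.abs(z) > z_crit))

# Under H0 the rejection rate estimates the significance level (nominal 0.05).
empirical_level = rejection_rate(true_mean=0.0)

# Under alternatives the rejection rate estimates the power at each shift.
shifts = [0.0, 0.2, 0.4, 0.6, 0.8]
power_curve = [rejection_rate(true_mean=d) for d in shifts]

print("empirical significance level:", empirical_level)
for d, p in zip(shifts, power_curve):
    print(f"true mean = {d:.1f}  estimated power = {p:.3f}")
```

Plotting power_curve against shifts gives the power curve; replacing the z statistic with a t statistic gives the corresponding check for the t-test.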

22-02-2023 (Day 3)
Lecture 1 & 2. General regression setup, logistic regression, parameter estimation and diagnostics of logistic regression model.
Lecture 3 & 4. Linear discriminant analysis of Gaussian populations, misclassification probability matrix, ROC, quadratic discriminant analysis, cross validation method.
Lab: Implementations of logistic regression, LDA and QDA on real/simulated data.

23-02-2023 (Day 4)
Lecture 1 & 2. Naïve Bayes classifier and comparison with logistic regression, LDA and QDA.
Lecture 3 & 4. Support vector classifier, support vector machine with linear/nonlinear boundaries, cross validation method.
Lab: Implementations of Naïve Bayes classifier and SVM on real/simulated data.

24-02-2023 (Day 5)
Lecture 1 & 2. Classification using decision trees, boosting, regularization, random forests, variable importance.
Lecture 3 & 4. K-Nearest Neighbours classifier, notion of distance measures, cross validation, advantages of KNN, comparison, real data application.
Lab: Implementations of decision tree, random forest and KNN on real/simulated data.
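
For the KNN part of this lab, a minimal from-scratch sketch is given below, assuming NumPy, Euclidean distance, and integer class labels. The helper knn_predict and the simulated two-class data are hypothetical and serve only to show the majority-vote idea; in practice a library implementation would normally be used.

```python
import numpy as np

def knn_predict(X_train, y_train, X_new, k=5):
    """Predict labels by majority vote among the k nearest training points (Euclidean)."""
    # Pairwise squared distances, shape (n_new, n_train).
    d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]       # indices of the k nearest neighbours
    votes = y_train[nearest]                      # labels of those neighbours
    # Majority vote per row; labels are assumed to be non-negative integers.
    return np.array([np.bincount(row).argmax() for row in votes])

rng = np.random.default_rng(2)  # fixed seed, chosen only for reproducibility

# Simulated two-class data: Gaussian clouds centred at (0, 0) and (3, 3).
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y = np.repeat([0, 1], 100)

# Random train/test split (150 training points, 50 test points).
idx = rng.permutation(200)
train, test = idx[:150], idx[150:]
y_hat = knn_predict(X[train], y[train], X[test], k=5)
accuracy = float(np.mean(y_hat == y[test]))
print("test accuracy:", accuracy)
```

Repeating the last step over several values of k, with cross validation in place of a single split, is one way to choose k.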

25-02-2023 (Day 6)
Lecture 1. Applications of unsupervised learning, principal component analysis and factor analysis.
Lecture 2. K-means clustering algorithm.
Lecture 3 & 4. Hierarchical clustering algorithms, dendrogram.
Lab: Implementations of K-means clustering and hierarchical clustering algorithms on real/simulated data.
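
The K-means part of this lab can be sketched as follows, assuming NumPy. The iteration is the standard assign-then-update scheme (Lloyd's algorithm). The starting centres come from a simple farthest-first rule, swapped in here instead of random initialization so that the run is deterministic. The function names and the simulated three-cluster data are hypothetical.

```python
import numpy as np

def farthest_first_init(X, k):
    """Deterministic start: take X[0], then repeatedly add the point farthest from the chosen centres."""
    centers = [X[0]]
    for _ in range(k - 1):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    return np.array(centers)

def kmeans(X, k, n_iter=100):
    """Alternate between assigning points to the nearest centre and recomputing centres."""
    centers = farthest_first_init(X, k)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Keep the old centre if a cluster happens to be empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the centres that are returned.
    labels = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    return centers, labels

rng = np.random.default_rng(3)  # fixed seed, chosen only for reproducibility
true_centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
X = np.vstack([rng.normal(c, 0.5, size=(50, 2)) for c in true_centers])

centers, labels = kmeans(X, k=3)
print("estimated centres:\n", centers)
```

On well-separated simulated clusters like these, the estimated centres should lie close to the values used to generate the data; on real data the result depends on the initialization and on the choice of k.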

Book Reference:

  • Searle, S.R. and Khuri, A.I., Matrix Algebra Useful for Statistics, 2nd Edition, Wiley, New York, 2017.
  • Hogg, R., McKean, J. and Craig, A., Introduction to Mathematical Statistics, 8th Edition, Pearson, Boston, 2019.
  • Johnson, R.A. and Wichern, D.W., Applied Multivariate Statistical Analysis, 6th Edition, Prentice Hall, Upper Saddle River, New Jersey, 2007.
  • Hastie, T., Tibshirani, R. and Friedman, J., The Elements of Statistical Learning, 2nd Edition, Springer, New York, 2016.
  • James, G., Witten, D., Hastie, T. and Tibshirani, R., An Introduction to Statistical Learning, 1st Edition, Springer, New York, 2013.

Note: Lab sessions will be held in R/Python. Participants are required to know the basics of R/Python.


Time Table

Each course will be scheduled for six consecutive days in a week and will comprise lectures, tutorials, labs and computing components. The course is intended for participants with an Engineering/Science background and/or work experience in Data Science.

 

Date         Lecture 1        Lecture 2           Lecture 3        Lecture 4       Lab Session
             9:30-10:30 am    11:00 am-12:00 pm   12:00-1:00 pm    2:30-3:30 pm    4:00-5:00 pm
20-02-2023   RS               RS                  SS               SS              RS+SS+SR+JD
21-02-2023   RS               RS                  SS               SS              RS+SS+SR+JD
22-02-2023   SS               SS                  RS               RS              RS+SS+SR+JD
23-02-2023   SS               SS                  RS               RS              RS+SS+SR+JD
24-02-2023   SR               SR                  SS               SS              RS+SS+SR+JD
25-02-2023   SS               RS                  SR               SR              RS+SS+SR+JD

Breaks: tea between Lecture 1 and Lecture 2, lunch between Lecture 3 and Lecture 4, tea between Lecture 4 and the lab session, snacks after the lab session.

SS - Prof. Sanjeev Sabnis (IITB), 11 lectures + 6 labs
RS - Prof. Radhendushka Srivastava (IITB), 9 lectures + 6 labs
SR - Dr. Siddhartha Roy (Industry Expert), 4 lectures + 6 labs
JD - Dr. Jovi D’Silva (NIO), teaching assistant, 6 labs
 
