IMW – Classification Modelling
Speakers and Syllabus
Detailed Syllabus
20-02-2023 (Day 1)
Lecture 1 & 2. Random experiment, sample space, events, probability and its properties, conditional probability, independent events, Bayes probability, random variables, standard probability density functions, independence, law of large numbers, central limit theorem.
Lecture 3 & 4. Basics of matrix algebra, square symmetric matrices, eigenvalues and eigenvectors, positive definite matrices, spectral decomposition, square root of a square symmetric matrix and its properties, quadratic forms, matrix inequalities and optimization.
Lab: Basic vector computation, matrix operations (addition, multiplication, inverse, determinant, rank, eigenvalues, eigenvectors, SVD for square symmetric matrices, square root of matrices), generating random variables, plotting of PDFs/PMFs, sample mean and its closeness to the population mean.
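As an illustration of the last lab item (sample mean and its closeness to the population mean), the following is a minimal sketch using only the Python standard library. The function name `sample_mean_errors`, the normal population and its parameters are assumptions made for this example and are not part of the course material.

```python
import random
import statistics


def sample_mean_errors(sample_sizes, population_mean=2.0, population_sd=3.0, seed=0):
    """Return |sample mean - population mean| for each requested sample size.

    Hypothetical helper: the population is assumed to be Normal(mean, sd).
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    errors = {}
    for n in sample_sizes:
        draws = [rng.gauss(population_mean, population_sd) for _ in range(n)]
        errors[n] = abs(statistics.fmean(draws) - population_mean)
    return errors


# The gap typically shrinks as n grows, in line with the law of large numbers.
for n, err in sample_mean_errors([10, 100, 1000, 10000, 100000]).items():
    print(f"n={n:>6}  |sample mean - population mean| = {err:.4f}")
```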
21-02-2023 (Day 2)
Lecture 1 & 2. Data matrix, measures of central tendency, dispersion, skewness, kurtosis, some graphical tools like box plot, histogram, scatterplot, likelihood function, estimation strategies, application of central limit theorem.
Lecture 3 & 4. Basics of testing of hypothesis, type I error and type II error, power, p-value, connection between testing of hypothesis and classification, z-test, t-test, paired t-test.
Lab: Exploration of a few real/simulated data sets; simulation and confirmatory analysis of the empirical significance level for mean testing, along with the power curve.
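The simulation part of this lab could look roughly like the sketch below, which estimates the empirical significance level and a power curve for a two-sided z-test of the mean with known variance. The function name `rejection_rate`, the sample size, the number of simulations and the grid of true means are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, fmean


def rejection_rate(true_mean, null_mean=0.0, sigma=1.0, n=30,
                   alpha=0.05, n_sim=5000, seed=1):
    """Fraction of simulated samples in which the two-sided z-test
    rejects H0: mu = null_mean (sigma assumed known)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(n_sim):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (fmean(sample) - null_mean) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sim


# At true_mean == null_mean the rejection rate estimates the significance level;
# at other values it estimates the power of the test.
power_curve = {mu: rejection_rate(true_mean=mu) for mu in (0.0, 0.2, 0.4, 0.6, 0.8)}
for mu, rate in power_curve.items():
    print(f"true mean = {mu:.1f}  rejection rate = {rate:.3f}")
```

The entry at a true mean of 0.0 should sit close to the nominal level of 0.05, and the rejection rate should rise towards 1 as the true mean moves away from the null value.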
22-02-2023 (Day 3)
Lecture 1 & 2. General regression setup, logistic regression, parameter estimation and diagnostics of the logistic regression model.
Lecture 3 & 4. Linear discriminant analysis of Gaussian populations, misclassification probability matrix, ROC, quadratic discriminant analysis, cross validation method.
Lab: Implementations of logistic regression, LDA and QDA on real/simulated data.
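For the logistic regression part of this lab, parameter estimation can be sketched with plain gradient ascent on the log-likelihood, as below. This is only one possible approach (statistical software typically uses Newton-type iterations instead), and the toy data, the function names `sigmoid` and `fit_logistic`, the learning rate and the iteration count are assumptions made for the example.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_logistic(xs, ys, learning_rate=0.1, n_iter=5000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by gradient ascent
    on the mean log-likelihood (hypothetical helper, single predictor)."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        residuals = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        b0 += learning_rate * sum(residuals) / n
        b1 += learning_rate * sum(r * x for r, x in zip(residuals, xs)) / n
    return b0, b1


# Toy data (assumed): larger x makes class 1 more likely, with some overlap.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_logistic(xs, ys)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"P(y = 1 | x = 3.0) = {sigmoid(b0 + b1 * 3.0):.3f}")
```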
23-02-2023 (Day 4)
Lecture 1 & 2. Naïve Bayes classifier and comparison with logistic regression, LDA and QDA.
Lecture 3 & 4. Support vector classifier, support vector machine with linear/nonlinear boundaries, cross validation method.
Lab: Implementations of Naïve Bayes classifier and SVM on real/simulated data.
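A Gaussian Naïve Bayes classifier, one common variant that this lab might cover, can be sketched in a few lines of standard-library Python. The function names `fit_gaussian_nb` and `predict_gaussian_nb`, the small variance floor and the toy data are assumptions for illustration only.

```python
import math
from collections import defaultdict
from statistics import fmean, pvariance


def fit_gaussian_nb(rows, labels):
    """Estimate class priors and per-feature means/variances (hypothetical helper)."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        columns = list(zip(*members))
        model[label] = {
            "prior": len(members) / len(rows),
            "means": [fmean(col) for col in columns],
            # small floor (assumed) avoids division by zero for constant features
            "variances": [pvariance(col) + 1e-9 for col in columns],
        }
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the largest log posterior, treating features as
    conditionally independent Gaussians given the class."""
    def log_posterior(params):
        total = math.log(params["prior"])
        for x, mu, var in zip(row, params["means"], params["variances"]):
            total += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        return total
    return max(model, key=lambda label: log_posterior(model[label]))


# Toy data (assumed): two well-separated classes in two dimensions.
rows = [(1.0, 2.1), (1.2, 1.9), (0.8, 2.0), (3.0, 0.5), (3.2, 0.4), (2.9, 0.7)]
labels = ["a", "a", "a", "b", "b", "b"]
model = fit_gaussian_nb(rows, labels)
print(predict_gaussian_nb(model, (1.1, 2.0)))
print(predict_gaussian_nb(model, (3.1, 0.5)))
```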
24-02-2023 (Day 5)
Lecture 1 & 2. Classification using decision trees, boosting, regularization, random forests, variable importance.
Lecture 3 & 4. K-Nearest Neighbours (KNN) classifier, notion of distance measures, cross validation, advantages of KNN, comparison, real data application.
Lab: Implementations of decision tree, random forest and KNN on real/simulated data.
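The KNN part of this lab reduces to a distance computation followed by a majority vote. A minimal sketch follows, assuming Euclidean distance and a hypothetical function name `knn_predict`; the toy data are also assumed.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points
    (Euclidean distance; hypothetical helper for illustration)."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy data (assumed): two small clusters labelled "A" and "B".
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(train_points, train_labels, (1.1, 1.0)))
print(knn_predict(train_points, train_labels, (5.1, 5.0)))
```

Replacing `math.dist` with another distance measure changes which neighbours are selected, which is one way to explore the "notion of distance measures" from the lecture.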
25-02-2023 (Day 6)
Lecture 1. Applications of unsupervised learning, principal component analysis and factor analysis.
Lecture 2. K-means clustering algorithm.
Lecture 3 & 4. Hierarchical clustering algorithms, dendrogram.
Lab: Implementations of K-means clustering and hierarchical clustering algorithms on real/simulated data.
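For the K-means part of this lab, the standard alternating scheme (assign each point to its nearest centre, then recompute each centre as the mean of its points) can be sketched as below. The function name `kmeans`, the optional `initial_centers` argument and the toy data are assumptions for this example; the result depends on the starting centres.

```python
import math
import random


def kmeans(points, k, initial_centers=None, max_iter=100, seed=0):
    """Lloyd-style K-means sketch (hypothetical helper).

    Returns (labels, centers). If no initial centres are given, k data
    points are drawn at random as starting centres.
    """
    rng = random.Random(seed)
    centers = [tuple(c) for c in (initial_centers or rng.sample(points, k))]
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centre for every point.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:  # assignments unchanged, so the algorithm has converged
            break
        labels = new_labels
        # Update step: move each centre to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centre
                centers[j] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return labels, centers


# Toy data (assumed): two well-separated groups of three points each.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels, centers = kmeans(points, k=2, initial_centers=[(0.0, 0.0), (6.0, 6.0)])
print(labels)
print(centers)
```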
Book Reference:
Searle, S.R. and Khuri, A.I., Matrix Algebra Useful for Statistics, 2nd Edition, Wiley, New York, 2017.
Hogg, R., McKean, J. and Craig, A., Introduction to Mathematical Statistics, 8th Edition, Pearson, Boston, 2019.
Johnson, R.A. and Wichern, D.W., Applied Multivariate Statistical Analysis, 6th Edition, Prentice Hall, Upper Saddle River, New Jersey, 2007.
Hastie, T., Tibshirani, R. and Friedman, J., The Elements of Statistical Learning, 2nd Edition, Springer, New York, 2016.
James, G., Witten, D., Hastie, T. and Tibshirani, R., An Introduction to Statistical Learning, 1st Edition, Springer, New York, 2013.
Note: Lab sessions will be held in R/Python. Participants are required to know the basics of R/Python.
Time Table
Each course will be scheduled for six consecutive days in a week and will comprise lectures, tutorials, labs and computing components. It is intended for participants with an Engineering/Science background and/or work experience in Data Science.
Date       | Lecture 1 | Lecture 2 | Lecture 3 | Lecture 4 | Lab Session
20-02-2023 | RS        | RS        | SS        | SS        | RS+SS+SR+JD
21-02-2023 | RS        | RS        | SS        | SS        | RS+SS+SR+JD
22-02-2023 | SS        | SS        | RS        | RS        | RS+SS+SR+JD
23-02-2023 | SS        | SS        | RS        | RS        | RS+SS+SR+JD
24-02-2023 | SR        | SR        | SS        | SS        | RS+SS+SR+JD
25-02-2023 | SS        | RS        | SR        | SR        | RS+SS+SR+JD
Breaks each day: tea after Lecture 1, lunch after Lecture 3, tea after Lecture 4, snacks after the lab session.
SS – Prof. Sanjeev Sabnis (IITB) (11 lectures + 6 labs)
RS – Prof. Radhendushka Srivastava (IITB) (9 lectures + 6 labs)
SR – Dr. Siddhartha Roy (Industry Expert) (4 lectures + 6 labs)
JD – Dr. Jovi D’Silva (NIO) (TA) (6 labs)