IMW - Classification Modelling
Speakers and Syllabus
Detailed Syllabus
20-02-2023 (Day 1)
Lecture 1 & 2. Random experiment, sample space, events, probability and its properties, conditional probability, independent events, Bayes probability, random variables, standard probability density functions, independence, law of large numbers, central limit theorem.
Lecture 3 & 4. Basics of matrix algebra, square symmetric matrices, eigenvalues and eigenvectors, positive definite matrices, spectral decomposition, square root of a square symmetric matrix and its properties, quadratic forms, matrix inequalities and optimization.
Lab: Basic vector computation, matrix operations (addition, multiplication, inverse, determinant, rank, eigenvalues, eigenvectors, SVD for square symmetric matrices, square root of matrices), generating random variables, plotting of PDFs/PMFs, sample mean and its closeness to the population mean.
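As an illustration of two of the Day 1 lab items, here is a minimal Python sketch, assuming NumPy is available. It computes the square root of a symmetric positive definite matrix from its spectral decomposition and then checks how close the sample mean gets to the population mean as the sample size grows. The matrix entries, the exponential distribution, the seed and the sample sizes are illustrative assumptions, not taken from the course material.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

# A symmetric positive definite matrix (illustrative values).
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])

# Spectral decomposition A = P diag(lam) P^T; eigh is intended for symmetric input.
lam, P = np.linalg.eigh(A)

# Symmetric square root: A^(1/2) = P diag(sqrt(lam)) P^T.
A_half = P @ np.diag(np.sqrt(lam)) @ P.T
print("A^(1/2) @ A^(1/2) recovers A:", np.allclose(A_half @ A_half, A))

# Sample mean versus population mean for exponential draws with mean 2.
pop_mean = 2.0
for n in (10, 1_000, 100_000):
    x = rng.exponential(scale=pop_mean, size=n)
    print(f"n = {n:>6}: |sample mean - population mean| = {abs(x.mean() - pop_mean):.4f}")
```

The absolute error printed in the loop typically shrinks as n grows, which is the law of large numbers from Lectures 1 & 2 seen numerically.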
21-02-2023 (Day 2)
Lecture 1 & 2. Data matrix, measures of central tendency, dispersion, skewness, kurtosis, some graphical tools like box plot, histogram, scatterplot, likelihood function, estimation strategies, application of central limit theorem.
Lecture 3 & 4. Basics of testing of hypothesis, type I error and type II error, power, p-value, connection between testing of hypothesis and classification, z-test, t-test, paired t-test.
Lab: Exploration of a few real/simulated data sets, simulation and confirmatory analysis of the empirical significance level for mean testing, along with the power curve.
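The simulation part of the Day 2 lab could be sketched as follows, assuming NumPy and a two-sided z-test with known standard deviation. The function name `rejection_rate`, the sample size, the number of replications and the grid of true means are hypothetical choices made for illustration only.

```python
import numpy as np
from statistics import NormalDist

def rejection_rate(mu_true, mu0=0.0, sigma=1.0, n=30, alpha=0.05,
                   reps=20_000, seed=0):
    """Fraction of simulated samples in which the two-sided z-test rejects H0: mu = mu0."""
    rng = np.random.default_rng(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)       # about 1.96 for alpha = 0.05
    samples = rng.normal(mu_true, sigma, size=(reps, n))
    z = (samples.mean(axis=1) - mu0) / (sigma / np.sqrt(n))
    return float(np.mean(np.abs(z) > z_crit))

# Under H0 the rejection rate estimates the significance level (type I error).
size = rejection_rate(mu_true=0.0)
print("empirical significance level:", size)

# Away from H0 the same quantity traces out the power curve.
mus = (0.0, 0.2, 0.4, 0.6)
power = [rejection_rate(mu_true=m) for m in mus]
for m, p in zip(mus, power):
    print(f"true mean {m:.1f}: estimated power {p:.3f}")
```

The empirical significance level should sit close to the nominal 0.05, and the estimated power should increase as the true mean moves away from the hypothesised value.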
22-02-2023 (Day 3)
Lecture 1 & 2. General regression setup, logistic regression, parameter estimation and diagnostics of logistic regression model.
Lecture 3 & 4. Linear discriminant analysis of Gaussian populations, misclassification probability matrix, ROC, quadratic discriminant analysis, cross validation method.
Lab: Implementations of logistic regression, LDA and QDA on real/simulated data.
23-02-2023 (Day 4)
Lecture 1 & 2. Naïve Bayes classifier and comparison with logistic regression, LDA and QDA.
Lecture 3 & 4. Support vector classifier, support vector machine with linear/nonlinear boundaries, cross validation method.
Lab: Implementations of Naïve Bayes classifier and SVM on real/simulated data.
24-02-2023 (Day 5)
Lecture 1 & 2. Classification using decision trees, boosting, regularization, random forests, variable importance.
Lecture 3 & 4. K-Nearest Neighbours classifier, notion of distance measures, cross validation, advantages of KNN, comparison, real data application.
Lab: Implementations of decision tree, random forest and KNN on real/simulated data.
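For the KNN part of the Day 5 lab, a from-scratch sketch in Python is given below, assuming NumPy, Euclidean distance and binary labels coded as 0/1. The helper `knn_predict`, the simulated Gaussian classes and the 300/100 train/test split are hypothetical and only meant to show the mechanics; a lab session would more likely rely on a library implementation.

```python
import numpy as np

def knn_predict(X_train, y_train, X_new, k=5):
    """Classify each row of X_new by majority vote among its k nearest training points."""
    # Pairwise squared Euclidean distances, shape (n_new, n_train).
    d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]   # indices of the k closest training points
    votes = y_train[nearest]                  # their labels, shape (n_new, k)
    # Labels are 0/1, so the majority vote is a thresholded mean (use odd k to avoid ties).
    return (votes.mean(axis=1) > 0.5).astype(int)

# Simulated data: two Gaussian classes in the plane (illustrative parameters).
rng = np.random.default_rng(1)
n = 200
X = np.vstack([rng.normal(0.0, 1.0, size=(n, 2)),
               rng.normal(3.0, 1.0, size=(n, 2))])
y = np.repeat([0, 1], n)

# Random train/test split.
idx = rng.permutation(2 * n)
train, test = idx[:300], idx[300:]

y_hat = knn_predict(X[train], y[train], X[test], k=5)
accuracy = float((y_hat == y[test]).mean())
print("test accuracy:", accuracy)
```

Replacing the squared Euclidean distance in `d2` with another distance measure is the only change needed to experiment with the "notion of distance measures" topic from Lectures 3 & 4.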
25-02-2023 (Day 6)
Lecture 1. Applications of unsupervised learning, principal component analysis and factor analysis.
Lecture 2. K-means clustering algorithm.
Lecture 3 & 4. Hierarchical clustering algorithms, dendrogram.
Lab: Implementations of K-means clustering and hierarchical clustering algorithms on real/simulated data.
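The K-means part of the Day 6 lab can be sketched with Lloyd's algorithm, assuming NumPy. The function `kmeans`, the random initialisation from data points, the stopping rule and the two simulated blobs are hypothetical illustrative choices, not the course's own implementation.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate nearest-centre assignment and centre updates."""
    rng = np.random.default_rng(seed)
    # Initialise the centres at k distinct data points chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centre.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: each centre becomes the mean of its points
        # (an empty cluster keeps its previous centre).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break  # centres stopped moving
        centers = new_centers
    return labels, centers

# Simulated data: two well-separated Gaussian blobs (illustrative parameters).
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),
               rng.normal(5.0, 1.0, size=(100, 2))])

labels, centers = kmeans(X, k=2)
print("estimated centres:")
print(centers)
```

With blobs this far apart the estimated centres land near (0, 0) and (5, 5); which of the two gets label 0 depends on the random initialisation.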
Book References:
- Searle, S.R. and Khuri, A.I., Matrix Algebra Useful for Statistics, 2nd Edition, Wiley, New York, 2017.
- Hogg, R., McKean, J. and Craig, A., Introduction to Mathematical Statistics, 8th Edition, Pearson, Boston, 2019.
- Johnson, R.A. and Wichern, D.W., Applied Multivariate Statistical Analysis, 6th Edition, Prentice Hall, Upper Saddle River, New Jersey, 2007.
- Hastie, T., Tibshirani, R. and Friedman, J., The Elements of Statistical Learning, 2nd Edition, Springer, New York, 2016.
- James, G., Witten, D., Hastie, T. and Tibshirani, R., An Introduction to Statistical Learning, 1st Edition, Springer, New York, 2013.
Note: Lab sessions will be held in R/Python. Participants are required to know the basics of R/Python.
Time Table
Each course will be scheduled for six consecutive days in a week and will comprise lectures, tutorials, lab and computing components. It is intended for participants with an Engineering/Science background and/or work experience in Data Science.
Date | Lecture 1 | Break | Lecture 2 | Lecture 3 | Break | Lecture 4 | Break | Lab Session | Break
20-02-2023 | RS | Tea | RS | SS | Lunch | SS | Tea | RS+SS+SR+JD | Snacks
21-02-2023 | RS | Tea | RS | SS | Lunch | SS | Tea | RS+SS+SR+JD | Snacks
22-02-2023 | SS | Tea | SS | RS | Lunch | RS | Tea | RS+SS+SR+JD | Snacks
23-02-2023 | SS | Tea | SS | RS | Lunch | RS | Tea | RS+SS+SR+JD | Snacks
24-02-2023 | SR | Tea | SR | SS | Lunch | SS | Tea | RS+SS+SR+JD | Snacks
25-02-2023 | SS | Tea | RS | SR | Lunch | SR | Tea | RS+SS+SR+JD | Snacks
SS - Prof. Sanjeev Sabnis (IITB) (11 Lectures + 6 Labs)
RS - Prof. Radhendushka Srivastava (IITB) (9 Lectures + 6 Labs)
SR - Dr. Siddhartha Roy (Industry Expert) (4 Lectures + 6 Labs)
JD - Dr. Jovi D’Silva (NIO) (TA) (6 Labs)