Convener(s)
| Name: | Prof. Sanjeev Sabnis | Prof. Radhendushka Srivastava |
| Mailing Address: | IIT Bombay (Mathematics) | IIT Bombay (Mathematics) |
| Email: | svs at iitb.ac.in | rsrivastava at iitb.ac.in |
Please Note:
- NCM participants are requested to register on the NCM website and then on the CEP website: http://www.cep.iitb.ac.in (using the Google Chrome browser).
- Shortlisted NCM participants will have to pay a nominal course fee of Rs 1000 + 18% GST = Rs 1180.
- The last date for receiving online applications from participants is 31 Dec 2022 for both the NCM and CEP websites.
Note: Lab sessions will be held in Python. Participants are required to know the basics of Python.
Description
In statistical modelling, ‘classification’ refers to assigning new observations to relevant classes using a statistical decision rule built from training data on a particular phenomenon or field.
Classification is pervasive across fields such as medicine (for example, classifying individuals with or without COVID into classes such as ‘severe symptoms’, ‘non-severe symptoms’, ‘absence of symptoms’), manufacturing (classifying newly manufactured items as ‘defective’ or ‘non-defective’), banking (classifying clients as ‘fraudulent’ or ‘non-fraudulent’), the social sciences, law, and many other fields.
The classification methods covered in this workshop include (i) logistic regression, (ii) linear and quadratic discriminant analysis, (iii) naïve Bayes, (iv) K-nearest neighbours, (v) decision trees, (vi) random forests, and (vii) support vector machines. Each method will be demonstrated on real and simulated data using the open-source software R/Python. In machine learning parlance, these classification methods are also referred to as supervised learning methods.
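As a minimal illustration of a decision rule built from training data, the sketch below implements a K-nearest-neighbours classifier in plain Python. The toy measurements, the class labels (borrowed from the manufacturing example above), and the function name `knn_predict` are illustrative assumptions, not workshop material.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, new_x, k=3):
    """Classify new_x by majority vote among its k nearest training points."""
    # Euclidean distance from the new observation to every training point
    distances = [(math.dist(x, new_x), y) for x, y in zip(train_x, train_y)]
    distances.sort(key=lambda pair: pair[0])
    # Count the class labels of the k closest training points
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training data: two measurements per manufactured item
train_x = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_y = ["non-defective"] * 3 + ["defective"] * 3

label_a = knn_predict(train_x, train_y, (1.0, 1.0))
label_b = knn_predict(train_x, train_y, (3.0, 3.0))
print(label_a, label_b)
```

The training data determine the rule entirely: changing `k` or the training points changes how a new observation is classified.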
Some unsupervised learning techniques (mainly, clustering methods) will also be covered in the workshop. The K-means clustering and hierarchical clustering algorithms will be demonstrated through real data.
Dates: 20-02-2023 to 25-02-2023
Venue:
Venue Address:
Indian Institute of Technology Bombay, Powai
Mumbai-400076
Syllabus:
Detailed Syllabus
20-02-2023 (Day 1)
Lecture 1 & 2. Random experiment, sample space, events, probability and its properties, conditional probability, independent events, Bayes' theorem, random variables, standard probability density functions, independence, law of large numbers, central limit theorem.
Lecture 3 & 4. Basics of matrix algebra, square symmetric matrices, eigenvalues and eigenvectors, positive definite matrices, spectral decomposition, square root of a square symmetric matrix and its properties, quadratic forms, matrix inequalities and optimization.
Lab: Basic vector computation, matrix operations (addition, multiplication, inverse, determinant, rank, eigenvalues, eigenvectors, SVD for square symmetric matrices, square root of matrices), generating random variables, plotting of PDFs/PMFs, sample mean and its closeness to the population mean.
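The last lab item, the closeness of the sample mean to the population mean, could be explored along the lines of the sketch below. It is only an illustration using the standard library: the choice of the Uniform(0, 1) distribution, the sample sizes, and the helper name `sample_mean_error` are assumptions, not the actual lab exercise.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

population_mean = 0.5  # mean of the Uniform(0, 1) distribution

def sample_mean_error(n):
    """Absolute gap between the mean of n Uniform(0, 1) draws and 0.5."""
    draws = [random.random() for _ in range(n)]
    return abs(statistics.fmean(draws) - population_mean)

# Compare the gap for increasingly large samples
errors = {n: sample_mean_error(n) for n in (10, 1_000, 100_000)}
for n, err in errors.items():
    print(f"n = {n:>7}: |sample mean - population mean| = {err:.4f}")
```

By the law of large numbers the gap tends to shrink as the sample size grows, although any single run can deviate from that trend for small samples.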
21-02-2023 (Day 2)
Lecture 1 & 2. Data matrix, measures of central tendency, dispersion, skewness, kurtosis, some graphical tools like box plot, histogram, scatterplot, likelihood function, estimation strategies, application of central limit theorem.
Lecture 3 & 4. Basics of hypothesis testing, type I and type II errors, power, p-value, connection between hypothesis testing and classification, z-test, t-test, paired t-test.
Lab: Exploration of a few real/simulated data sets, simulation and confirmatory analysis of the empirical significance level for mean testing, along with the power curve.
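A simulation of the empirical significance level and of one point on the power curve might look like the following sketch. It assumes a two-sided z-test with known standard deviation; the sample size, number of replications, and function names are hypothetical choices made for illustration only.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the run is reproducible

def z_test_rejects(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: mean = mu0 when sigma is known."""
    z = (fmean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical

def rejection_rate(true_mean, mu0=0.0, sigma=1.0, n=30, reps=2000):
    """Fraction of simulated normal samples for which H0 is rejected."""
    rejections = 0
    for _ in range(reps):
        sample = [random.gauss(true_mean, sigma) for _ in range(n)]
        rejections += z_test_rejects(sample, mu0, sigma)
    return rejections / reps

empirical_level = rejection_rate(true_mean=0.0)  # H0 true: expect about 0.05
power_at_half = rejection_rate(true_mean=0.5)    # H0 false: estimates power
print(empirical_level, power_at_half)
```

Repeating `rejection_rate` over a grid of `true_mean` values traces out an estimated power curve.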
22-02-2023 (Day 3)
Lecture 1 & 2. General regression setup, logistic regression, parameter estimation and diagnostics of logistic regression model.
Lecture 3 & 4. Linear discriminant analysis of Gaussian populations, misclassification probability matrix, ROC, quadratic discriminant analysis, cross validation method.
Lab: Implementations of logistic regression, LDA, and QDA on real/simulated data.
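For orientation, a one-predictor logistic regression can be fitted on simulated data with plain gradient descent on the negative log-likelihood, as sketched below. This is an assumption-laden illustration (simulated Gaussian classes, a fixed learning rate, hypothetical variable names); statistical software typically uses other optimizers, and this is not the workshop's lab code.

```python
import math
import random

random.seed(2)  # fixed seed so the run is reproducible

def sigmoid(t):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-t))

# Simulated data: class 0 ~ N(-1, 1), class 1 ~ N(1, 1)
xs = [random.gauss(-1, 1) for _ in range(200)] + [random.gauss(1, 1) for _ in range(200)]
ys = [0] * 200 + [1] * 200

b0, b1 = 0.0, 0.0  # intercept and slope
rate = 0.1         # learning rate (illustrative choice)
for _ in range(1000):
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        err = sigmoid(b0 + b1 * x) - y  # gradient contribution of one point
        g0 += err
        g1 += err * x
    b0 -= rate * g0 / len(xs)
    b1 -= rate * g1 / len(xs)

# Classify as 1 when the fitted probability exceeds 0.5
predictions = [1 if sigmoid(b0 + b1 * x) > 0.5 else 0 for x in xs]
accuracy = sum(p == y for p, y in zip(predictions, ys)) / len(ys)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, training accuracy = {accuracy:.2f}")
```

The reported accuracy is computed on the training data, so it is optimistic; cross validation gives a fairer estimate.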
23-02-2023 (Day 4)
Lecture 1 & 2. Naïve Bayes classifier and comparison with logistic regression, LDA and QDA.
Lecture 3 & 4. Support vector classifier, support vector machine with linear/nonlinear boundaries, cross validation method.
Lab: Implementations of Naïve Bayes classifier and SVM on real/simulated data.
24-02-2023 (Day 5)
Lecture 1 & 2. Classification using decision trees, boosting, regularization, random forests, variable importance.
Lecture 3 & 4. K-Nearest Neighbours classifier, notion of distance measures, cross validation, advantage of KNN, comparison, real data application.
Lab: Implementations of decision tree, random forest and KNN on real/simulated data.
25-02-2023 (Day 6)
Lecture 1. Applications of unsupervised learning, principal component analysis and factor analysis.
Lecture 2. K-means clustering algorithm.
Lecture 3 & 4. Hierarchical clustering algorithms, dendrogram.
Lab: Implementations of K-means clustering and hierarchical clustering algorithms on real/simulated data.
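The K-means algorithm alternates between assigning each point to its nearest centroid and recomputing each centroid as the mean of its assigned points. The sketch below is a plain-Python illustration of that idea (Lloyd's algorithm) on simulated two-dimensional data; the function name `k_means`, the initialisation from randomly chosen data points, and the fixed number of iterations are assumptions made for the example.

```python
import math
import random

def k_means(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
    return centroids, labels

# Simulated data: two well-separated groups of 50 points each
rng = random.Random(3)
group_a = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(50)]
group_b = [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(50)]
centroids, labels = k_means(group_a + group_b, k=2)
print(centroids)
```

With overlapping groups or a poor starting point the result can depend on the initialisation, which is why several random restarts are commonly used in practice.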
Book Reference:
- Searle, S.R. and Khuri, A.I., Matrix Algebra Useful for Statistics, 2nd Edition, Wiley, New York, 2017.
- Hogg, R., McKean, J. and Craig, A., Introduction to Mathematical Statistics, 8th Edition, Pearson, Boston, 2019.
- Johnson, R.A. and Wichern, D.W., Applied Multivariate Statistical Analysis, 6th Edition, Prentice Hall, Upper Saddle River, New Jersey, 2007.
- Hastie, T., Tibshirani, R. and Friedman, J., The Elements of Statistical Learning, 2nd Edition, Springer, New York, 2016.
- James, G., Witten, D., Hastie, T. and Tibshirani, R., An Introduction to Statistical Learning, 1st Edition, Springer, New York, 2013.
Note: Lab sessions will be held in R/Python. Participants are required to know the basics of R/Python.
Time Table:
Each course will be scheduled for six consecutive days in a week and will comprise lectures, tutorials, lab and computing components. Intended participants: those with an Engineering/Science background and/or work experience in Data Science.
| Date | Lecture 1 | | Lecture 2 | Lecture 3 | | Lecture 4 | | Lab Session | |
| 20-02-2023 | RS | Tea | RS | SS | Lunch | SS | Tea | RS+SS+SR+JD | Snacks |
| 21-02-2023 | RS | Tea | RS | SS | Lunch | SS | Tea | RS+SS+SR+JD | Snacks |
| 22-02-2023 | SS | Tea | SS | RS | Lunch | RS | Tea | RS+SS+SR+JD | Snacks |
| 23-02-2023 | SS | Tea | SS | RS | Lunch | RS | Tea | RS+SS+SR+JD | Snacks |
| 24-02-2023 | SR | Tea | SR | SS | Lunch | SS | Tea | RS+SS+SR+JD | Snacks |
| 25-02-2023 | SS | Tea | RS | SR | Lunch | SR | Tea | RS+SS+SR+JD | Snacks |
SS - Prof. Sanjeev Sabnis (IITB) (11 Lectures + 6 Labs)
RS - Prof. Radhendushka Srivastava (IITB) (9 Lectures + 6 Labs)
SR - Dr. Siddhartha Roy (Industry Expert) (4 Lectures + 6 Labs)
JD - Dr. Jovi D’Silva (NIO) (TA) (6 Labs)
Selected Applicants:
| Serial | SID | Full Name | Gender | Affiliation | Position in College/ University | University/ Institute M.Sc./ M.A. | Ph.D. Degree Date |
| 1 | 45141 | Ms Arulmani Komarasamy | Female | Bharathiar University PG Extension and Research Centre | PhD | Bharathiar University | |
| 2 | 45173 | Mrs. D. Poongodi | Female | Bharathiar University PG extension and research centre | Ph. D | L. R. G govt arts college for women | |
| 3 | 45217 | Ms. Sunita Rani | Female | IIT Bhubaneswar | PhD Student | Kurukshetra University,Kurukshetra | |
| 4 | 45344 | Mr. Tamil Selvan T | Male | NIT Calicut | PhD Student | PSG College of Arts & Science, Coimbatore | |
| 5 | 45465 | Ms. Krishna Mahapatra | Female | Vellore Institute of Technology, Vellore Campus | Research Scholar(PhD) | Diamond Harbour Women's University | |
| 6 | 45559 | Mr Tamizhazhagan S | Male | National Institute of Technology | PhD | Pondicherry University | |
| 7 | 45574 | Mr. Muzammil Khan | Male | Maulana Azad National Institute of Technology Bhopal | Research Scholar | Bundelkhand University | |
| 8 | 45585 | Ms. Bhavana Singh | Female | Maulana Azad National Institute of Technology, Bhopal | Ph.D | National Institute of Technology, Hamirpur | |
| 9 | 45606 | Mr. Nitish Kumar Mahala | Male | Maulana Azad National Institute of technology Bhopal | PhD | National Institute of technology Jamshedpur | |
| 10 | 45610 | Ms. Nikita Yadav | Female | Maulana Azad National Institute of Technology , Bhopal | PhD | Malaviya National Institute of Technology , Jaipur | |
| 11 | 45634 | Mr. Abhinav Gupta | Male | Maulana Azad National Institute of Technology Bhopal | PhD | VBSPU JAUNPUR | |
| 12 | 45659 | Mr. Sandeep Kumar | Male | National Institute of Technology Rourkela | PhD | Pondicherry University | |
How to Reach:
Click on the following link for more details:
https://www.iitb.ac.in/en/about-iit-bombay/getting-to-iit-bombay