TEW - Statistics and Linear Algebra with emphasis on Data Science (2022)

Speakers and Syllabus


| Name of the speaker with affiliation | Topic | No. of Lectures | No. of Lab Sessions |
|---|---|---|---|
| Prof. Sanjeev V. Sabnis, Mathematics Department, IIT Bombay | Basics of matrix algebra, Probability, Basic Statistics, Categorical data analysis, Multivariate | 14 | 0 |
| Dr. Radhendushka Srivastava, Mathematics Department, IIT Bombay | Probability, Basic Statistics, Regression Analysis, Multivariate data, clustering | 10 | 6 |


Time Table

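| Day | Date | Lecture 1 (9:30 to 10:30) | 10:30 to 11:00 | Lecture 2 (11:00 to 12:00) | Lecture 3 (12:00 to 1:00) | 1:00 to 2:30 | Lecture 4 (2:30 to 3:30) | 3:30 to 4:00 | R Lab Session (4:00 to 5:00) | 5:00 to 5:30 |
|---|---|---|---|---|---|---|---|---|---|---|
| Mon | 19 Dec | SS | Tea | SS | RS | Lunch | SS | Tea | RS + (TA1,TA2) | Snacks |
| Tue | 20 Dec | SS | Tea | RS | RS | Lunch | SS | Tea | RS + (TA1,TA2) | Snacks |
| Wed | 21 Dec | SS | Tea | SS | RS | Lunch | RS | Tea | RS + (TA1,TA2) | Snacks |
| Thu | 22 Dec | RS | Tea | SS | SS | Lunch | SS | Tea | RS + (TA1,TA2) | Snacks |
| Fri | 23 Dec | RS | Tea | SS | SS | Lunch | RS | Tea | RS + (TA1,TA2) | Snacks |
| Sat | 24 Dec | RS | Tea | RS | SS | Lunch | SS | Tea | RS + (TA1,TA2) | Snacks |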

 

  • SS - Prof. Sanjeev V. Sabnis
  • RS - Dr. Radhendushka Srivastava

 Detailed Syllabus

 

1 Basics of matrix algebra
  1. Basics of matrix algebra, square symmetric matrices, eigenvalues and eigenvectors, positive definite matrices, and some of their uses and applications in statistics. (SS)
  2. Spectral decomposition, square root of a square symmetric matrix and its properties, quadratic forms, matrix inequalities and optimization. (SS)
    Book Reference:
    Searle, S. R. and Khuri, A. I., Matrix Algebra Useful for Statistics, 2nd Edition, Wiley, New York, 201
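As an illustration of the spectral decomposition and the square root of a symmetric matrix listed above, here is a minimal sketch in Python for the 2x2 case only. The function names are hypothetical, the closed-form eigenvalue formula applies only to 2x2 symmetric matrices, and Python is used purely for illustration (the lab sessions of this course use R).

```python
import math

def spectral_decomposition_2x2(a, b, d):
    """Eigenvalues and orthonormal eigenvectors of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    eigenvalues = (mean + radius, mean - radius)
    if b == 0:
        # Already diagonal: the coordinate axes are the eigenvectors.
        vectors = ((1.0, 0.0), (0.0, 1.0)) if a >= d else ((0.0, 1.0), (1.0, 0.0))
    else:
        vectors = []
        for lam in eigenvalues:
            x, y = b, lam - a  # solves (A - lam*I) v = 0 when b != 0
            norm = math.hypot(x, y)
            vectors.append((x / norm, y / norm))
        vectors = tuple(vectors)
    return eigenvalues, vectors

def symmetric_sqrt_2x2(a, b, d):
    """Square root R = sum_i sqrt(lam_i) v_i v_i^T, so that R @ R equals the input matrix."""
    eigenvalues, vectors = spectral_decomposition_2x2(a, b, d)
    if min(eigenvalues) < 0:
        raise ValueError("matrix must be positive semi-definite")
    root = [[0.0, 0.0], [0.0, 0.0]]
    for lam, v in zip(eigenvalues, vectors):
        for i in range(2):
            for j in range(2):
                root[i][j] += math.sqrt(lam) * v[i] * v[j]
    return root

# Hypothetical example: A = [[2, 1], [1, 2]] has eigenvalues 3 and 1.
R = symmetric_sqrt_2x2(2.0, 1.0, 2.0)
R_squared = [[sum(R[i][k] * R[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
print(R_squared)  # should reproduce A up to floating-point rounding
```

The square root is built from the same eigenvectors as the original matrix, with each eigenvalue replaced by its square root, which is why it is symmetric and requires non-negative eigenvalues.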
2 Probability

  1. Random experiment, sample space, events, probability and its properties, conditional probability, independent events, Bayes' theorem. (RS)

  2. Discrete and continuous random variables, standard probability mass functions and probability density functions (Bernoulli, Binomial, Poisson, Geometric, Uniform, Exponential, Normal, Gamma), independent random variables, joint probability mass/density function, transformation of random variables, probability integral transform. (SS)

  3. Weak law of large numbers, strong law of large numbers, central limit theorem. (SS)

    Book Reference:
    Hogg, R., McKean, J. and Craig, A., Introduction to Mathematical Statistics, 8th Edition, Pearson, Boston, 2019.
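As an illustration of the probability integral transform and the law of large numbers listed above, here is a minimal sketch in Python. It assumes an Exponential distribution with a hypothetical rate of 2; the function name and the seed are illustrative choices, and Python is used purely for illustration (the lab sessions of this course use R).

```python
import math
import random

def sample_exponential(rate, n, seed=0):
    """Draw n Exponential(rate) values by inverting the CDF F(x) = 1 - exp(-rate * x)."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so 1 - u lies in (0, 1] and the logarithm is finite.
    return [-math.log(1.0 - rng.random()) / rate for _ in range(n)]

samples = sample_exponential(rate=2.0, n=20000)
sample_mean = sum(samples) / len(samples)
# By the law of large numbers the sample mean should be close to 1 / rate = 0.5.
print(sample_mean)
```

If U is Uniform(0, 1) and F is a continuous, strictly increasing distribution function, then F^{-1}(U) has distribution F; the sketch applies this with the Exponential distribution function.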

 

3 Statistics

  1. Data exploration and inference
    i. Exploration of data matrix, measures of central tendency, dispersion, skewness, kurtosis. Estimation of frequency distribution of discrete random variables, box plot, histogram, scatter plot of two variables. (RS)
    ii. Likelihood function of a random sample, maximum likelihood estimation and some of its properties, basics of testing of hypothesis, p-value. (RS)
    iii. Z-test, t-test, paired t-test. (SS)
    Book Reference:
    Hogg, R., McKean, J. and Craig, A., Introduction to Mathematical Statistics, 8th Edition, Pearson, Boston, 2019.
  2. Categorical data analysis
    i. Nominal and ordinal random variables, contingency tables, multinomial random vector, goodness of fit (SS)
    ii. Chi-square test of association, relative risk, odds, odds ratio, sensitivity, specificity. (SS)
    Book Reference:
    Agresti, A., An Introduction to Categorical Data Analysis, 3rd Edition, Wiley, New York, 2019.
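As an illustration of the 2x2 table measures listed above (relative risk, odds ratio, sensitivity, specificity), here is a minimal sketch in Python. The function names, the table layout, and all counts are hypothetical, and Python is used purely for illustration (the lab sessions of this course use R).

```python
from fractions import Fraction

def association_measures(a, b, c, d):
    """Relative risk and odds ratio for a 2x2 table laid out as

                  outcome yes   outcome no
        group 1        a             b
        group 2        c             d
    """
    risk_1 = Fraction(a, a + b)  # P(outcome yes | group 1)
    risk_2 = Fraction(c, c + d)  # P(outcome yes | group 2)
    return {
        "relative_risk": risk_1 / risk_2,
        "odds_ratio": Fraction(a * d, b * c),
    }

def diagnostic_measures(true_pos, false_neg, false_pos, true_neg):
    """Sensitivity and specificity of a diagnostic test against the true status."""
    return {
        "sensitivity": Fraction(true_pos, true_pos + false_neg),
        "specificity": Fraction(true_neg, true_neg + false_pos),
    }

# Hypothetical counts: 20 of 100 in group 1 and 10 of 100 in group 2 have the outcome.
print(association_measures(20, 80, 10, 90))
# Hypothetical test results: 45 of 50 diseased and 190 of 200 healthy are classified correctly.
print(diagnostic_measures(45, 5, 10, 190))
```

Exact fractions are used so that the ratios can be checked by hand; for the first table the relative risk is (20/100)/(10/100) and the odds ratio is (20 x 90)/(80 x 10).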
  3. Regression Analysis
    i. Correlation between continuous variables, Simple linear regression, estimation and testing of hypothesis of the parameters. (RS)
    ii. Multiple linear regression, least squares estimation, measure of goodness of fit (R-squared), adjusted R-squared, testing of linear hypothesis, diagnostics of assumptions, testing of homogeneity. (RS)
    iii. Testing of normality of residuals, outlier detection, multicollinearity. Ridge regression. (RS)
    Book Reference:
    Montgomery, D., Peck, E. and Vining, G., Introduction to Linear Regression Analysis, 5th Edition, John Wiley, New York, 2012.
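As an illustration of least squares estimation, R-squared and adjusted R-squared listed above, here is a minimal sketch in Python for the simple (one-predictor) case. The function name and the data are hypothetical, and Python is used purely for illustration (the lab sessions of this course use R).

```python
def fit_simple_linear(x, y):
    """Least squares fit of y = intercept + slope * x, with R-squared and adjusted R-squared."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual and total sums of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    # Adjusted R-squared with p = 1 predictor: 1 - (1 - R^2) * (n - 1) / (n - p - 1).
    adj_r_squared = 1.0 - (1.0 - r_squared) * (n - 1) / (n - 2)
    return slope, intercept, r_squared, adj_r_squared

# Hypothetical data that roughly follow y = 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.2, 4.1, 6.2, 7.9, 10.1]
print(fit_simple_linear(x, y))
```

The sketch needs at least three observations and some variation in both x and y; multiple regression replaces these sums with the normal equations in matrix form.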
  4. Logistic and log linear regression 
    i. Introduction to logistic regression, likelihood based method for parameter estimation, statistical significance of covariates (SS)
    ii. Application of logistic regression in classification. (SS)
    iii. Log-linear regression models for count data, estimation and testing strategy for the parameters and its application. (SS)
    Book Reference:
    Agresti, A., An Introduction to Categorical Data Analysis, 3rd Edition, Wiley, New York, 2019. 
  5. Multivariate data and dimension reduction techniques
    i. Multivariate Normal random vector and data matrix, estimation of mean vector and variance-covariance matrix. (RS)
    ii. Principal Component Analysis. (SS)
    iii. Factor Analysis (latent variable structure on covariance matrix). (SS)
    iv. Cluster analysis using different techniques such as K-means clustering. (RS)
    v. Hierarchical clustering (dendrogram). (RS)
    Book Reference:
    Johnson, R. A. and Wichern, D. W., Applied Multivariate Statistical Analysis, 6th Edition, Prentice Hall, Upper Saddle River, New Jersey, 2007.

 

 

 
