Form No. M-3 (8)
Study Course Description

Multivariate Statistics

Main Study Course Information

Course Code
SL_119
Branch of Science
Other medical sciences; Other Sub-Branches of Medical Sciences
ECTS
3.00
Target Audience
Life Science
LQF
Level 7
Study Type And Form
Full-Time; Part-Time

Study Course Implementer

Course Supervisor
Structural Unit Manager
Structural Unit
Statistics Unit
Contacts

14 Balozu street, Block A, Riga, +371 67060897, statistika@rsu.lv, www.rsu.lv/statlab

About Study Course

Objective

The aim of the course is to introduce the tools and concepts of multivariate data analysis, with a strong focus on applications in R.

Preliminary Knowledge

Higher mathematics, probability, statistics, linear models, basic knowledge of R programming.

Learning Outcomes

Knowledge

1. The student:
• has gained in-depth knowledge of the theoretical probabilistic concepts related to multivariate analysis;
• illustrates visualization techniques for describing multivariate data;
• assesses the most important multivariate techniques, such as principal components analysis, factor analysis, cluster analysis and discriminant analysis.

Skills

1.
• Implements appropriate multivariate data visualizations in R.
• Can independently apply multivariate data analysis techniques in R to carry out research activities or highly qualified professional functions.

Competences

1.
• Can compare and understand the aims of various multivariate data analysis methods and choose the most appropriate one for the analysis of a given data set.
• Can generate hypotheses and make analysis-based decisions related to multivariate data.

Assessment

Individual work

Title
% of total grade
Grade
1.

Individual work

-
-
1. Review of compulsory and additional literature to expand the knowledge acquired in lectures and classes.
2. Students are expected to prepare five R-based home assignments, one on each of the following topics:
a. Principal components analysis (2nd practical class).
b. Factor analysis (3rd practical class).
c. Discriminant analysis (4th practical class).
d. Cluster analysis (5th practical class).
e. Multivariable linear regression (6th practical class).

Examination

Title
% of total grade
Grade
1.

Examination

-
-
Assessment on a 10-point scale according to the RSU Educational Order:
• 5 home assignments to be handed in: 70%.
• Written exam: 30%.

Study Course Theme Plan

FULL-TIME
Part 1
  1. Lecture

Modality
Location
Contact hours
On site
Auditorium
2

Topics

Introduction to multivariate analysis, multivariate dataset examples, covariance, correlation, multivariate normal distribution. Review of basic linear algebra: determinants, inverses, eigenvalues and eigenvectors, decompositions and quadratic forms.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Visualizing multivariate data with R. Practicing matrix algebra calculations in R.
  1. Lecture

Modality
Location
Contact hours
On site
Auditorium
2

Topics

Principal components (PC) analysis. A geometrical approach to reducing data matrix dimension. Definition, interpretation and inference on principal components. Normalized principal components.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Real-world data example analysis in R: calculating PCs, determining statistical significance, drawing plots to interpret PCs.
  1. Lecture

Modality
Location
Contact hours
On site
Auditorium
2

Topics

Factor analysis. Orthogonal factor model. Interpretation of factors. Test for the number of common factors. Comparison with PC analysis.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Factor analysis in R: estimating the factor model, testing the number of factors, drawing plots to interpret factors.
  1. Lecture

Modality
Location
Contact hours
On site
Auditorium
2

Topics

Discriminant analysis. Classes, labels and classification accuracy measures. Linear and quadratic discriminant analysis.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Discriminant analysis in R: estimation, interpretation, comparison of methods.
  1. Lecture

Modality
Location
Contact hours
On site
Auditorium
2

Topics

Cluster analysis. Proximity between objects, distance functions. Various clustering algorithms.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Cluster analysis in R: implementing and comparing various clustering algorithms. Determining the optimal number of clusters.
  1. Lecture

Modality
Location
Contact hours
On site
Auditorium
2

Topics

Multivariable linear regression. Multivariate normality tests.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Multivariable linear regression in R: estimation, testing and interpretation.
Total ECTS (Credit points):
3.00
Contact hours:
24 Academic Hours
Final Examination:
Exam (Written)
PART-TIME
Part 1
  1. Lecture

Modality
Location
Contact hours
On site
Auditorium
1

Topics

Introduction to multivariate analysis, multivariate dataset examples, covariance, correlation, multivariate normal distribution. Review of basic linear algebra: determinants, inverses, eigenvalues and eigenvectors, decompositions and quadratic forms.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Visualizing multivariate data with R. Practicing matrix algebra calculations in R.
  1. Lecture

Modality
Location
Contact hours
On site
Auditorium
1

Topics

Principal components (PC) analysis. A geometrical approach to reducing data matrix dimension. Definition, interpretation and inference on principal components. Normalized principal components.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Real-world data example analysis in R: calculating PCs, determining statistical significance, drawing plots to interpret PCs.
  1. Lecture

Modality
Location
Contact hours
On site
Auditorium
1

Topics

Factor analysis. Orthogonal factor model. Interpretation of factors. Test for the number of common factors. Comparison with PC analysis.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Factor analysis in R: estimating the factor model, testing the number of factors, drawing plots to interpret factors.
  1. Lecture

Modality
Location
Contact hours
On site
Auditorium
1

Topics

Discriminant analysis. Classes, labels and classification accuracy measures. Linear and quadratic discriminant analysis.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Discriminant analysis in R: estimation, interpretation, comparison of methods.
  1. Lecture

Modality
Location
Contact hours
On site
Auditorium
1

Topics

Cluster analysis. Proximity between objects, distance functions. Various clustering algorithms.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Cluster analysis in R: implementing and comparing various clustering algorithms. Determining the optimal number of clusters.
  1. Lecture

Modality
Location
Contact hours
On site
Auditorium
1

Topics

Multivariable linear regression. Multivariate normality tests.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Multivariable linear regression in R: estimation, testing and interpretation.
Total ECTS (Credit points):
3.00
Contact hours:
18 Academic Hours
Final Examination:
Exam (Written)

Bibliography

Required Reading

1.

W. K. Härdle, L. Simar. Applied Multivariate Statistical Analysis. Springer, 2015. Suitable for English stream

2.

D. Zelterman. Applied Multivariate Statistics with R. Springer, Statistics for Biology and Health series, 2015. Suitable for English stream

Additional Reading

1.

R. A. Johnson, D. W. Wichern. Applied Multivariate Statistical Analysis, 6th edition. Prentice Hall, 2007. Suitable for English stream

2.

T. Hothorn, B. Everitt. An Introduction to Applied Multivariate Analysis with R. Springer, Use R! series, 2011. Suitable for English stream