Form No. M-3 (8)
Study Course Description

Statistical Programming and Data Management

Main Study Course Information

Course Code
SL_108
Branch of Science
Mathematics
ECTS
6.00
Target Audience
Life Science
LQF
Level 7
Study Type And Form
Full-Time; Part-Time

Study Course Implementer

Course Supervisor
Structural Unit Manager
Structural Unit
Statistics Unit
Contacts

23 Kapselu street, 2nd floor, Riga, statistika@rsu.lv, +371 67060897

About Study Course

Objective

Data analysis and data management skills are critical to modern applied statistical research. The aim of the course is to introduce students to two statistical software tools used in biostatistics research: R and Jamovi. Thus, the objectives of the course are:
• Introduce students to statistical programming and data management using the R statistical software;
• Prepare students for computer work in other courses of the Biostatistics study programme.

Preliminary Knowledge

There are no specific prerequisites; however, the course relies on basic computer skills and on high-school level algebra and statistics concepts.

Learning Outcomes

Knowledge

1. • Know, select and independently apply the main programming principles in R.
• Apply database management principles in R.
• Understand and use advanced programming elements such as conditional execution, loops and user-defined functions.

Skills

1. After this course, students will be able to:
• Read various types of data into the software and organize them for analysis.
• Carry out data transformation and visualization with R.
• Write and use their own R functions to automate common tasks.

Competences

1. Students will be able to:
• Use R and Jamovi software competently for statistical analysis in other biostatistics courses.
• Recognize the differences between R and Jamovi and choose the more suitable program for their analysis.
• Deepen their statistical programming skills independently in order to perform research or analyse health-related data.

Assessment

Individual work

Title
% of total grade
Grade
1.

Individual work

-
-
• Individual work with the lecture material in preparation for every lecture of the study course.
• Independent preparation of the two assigned computer projects, practicing the concepts studied in the course.
In order to evaluate the quality of the study course as a whole, the student must fill out the study course evaluation questionnaire on the Student Portal.

Examination

Title
% of total grade
Grade
1.

Examination

-
10 points

Assessment on the 10-point scale according to the RSU Educational Order:
• Two computer projects to be submitted: 50% × 2 = 100%

Study Course Theme Plan

FULL-TIME
Part 1
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
2

Topics

Introduction to the R language and RStudio. Interface, workflow, scripts and coding basics. Introduction to data visualization with ggplot2: aesthetic mappings, geometric objects and statistical transformations.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Practice using the R interface. Practice answering simple questions about data by creating ggplot2 visualizations of R built-in datasets.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
2

Topics

Introduction to data wrangling (transformation) with the dplyr package: filter, arrange, select, create new variables and summarize. Introduction to the R projects workflow.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Practice various data transformations with the dplyr package on R built-in datasets.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
2

Topics

Exploratory data analysis: assessing variation and covariation. Statistical summaries with boxplots. Reading various types of data into R.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Practice exploratory data analysis in R: variation (bar plots, histograms, boxplots) and covariation (visualizing the relationship between two variables). Practice reading data with the readr package.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
2

Topics

Principles of organizing data consistently in R with the tidyr package: gathering and spreading datasets, dealing with NA values. Introduction to the R Markdown reporting system, its various output formats and the principles of writing mathematical symbols.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Importing and tidying a real-life dataset. Creating an R Markdown report.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
2

Topics

Introduction to data management in R. Relational data principles: relations, keys, joins and set operations. Some common non-numeric variable types in R. Organizing multiple operations with pipes.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Managing relational data (i.e., multiple tables) with dplyr. Practice dealing with non-numeric variable types in R: factors, strings, dates.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
2

Topics

Advanced R programming: writing your own R functions, function vectorization, conditional execution (if, else statements), for loops and map functions.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Exercises on function writing and programming elements.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
2

Topics

Descriptive statistics in R.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Data visualization in R.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
2

Topics

Qualitative data analysis in R.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Descriptive statistics
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
2

Topics

Qualitative data analysis
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Parametric tests
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
2

Topics

Parametric tests in R.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Non-parametric tests.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
2

Topics

Non-parametric tests in R.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Correlation and linear regression.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
2

Topics

Correlation and linear regression in R.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Work in groups. From paper to prepared data.
Total ECTS (Credit points):
6.00
Contact hours:
48 Academic Hours
Final Examination:
Exam (Written)
PART-TIME
Part 1
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
1

Topics

Introduction to the R language and RStudio. Interface, workflow, scripts and coding basics. Introduction to data visualization with ggplot2: aesthetic mappings, geometric objects and statistical transformations.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Practice using the R interface. Practice answering simple questions about data by creating ggplot2 visualizations of R built-in datasets.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
1

Topics

Introduction to data wrangling (transformation) with the dplyr package: filter, arrange, select, create new variables and summarize. Introduction to the R projects workflow.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Practice various data transformations with the dplyr package on R built-in datasets.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
1

Topics

Exploratory data analysis: assessing variation and covariation. Statistical summaries with boxplots. Reading various types of data into R.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Practice exploratory data analysis in R: variation (bar plots, histograms, boxplots) and covariation (visualizing the relationship between two variables). Practice reading data with the readr package.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
1

Topics

Principles of organizing data consistently in R with the tidyr package: gathering and spreading datasets, dealing with NA values. Introduction to the R Markdown reporting system, its various output formats and the principles of writing mathematical symbols.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Importing and tidying a real-life dataset. Creating an R Markdown report.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
1

Topics

Introduction to data management in R. Relational data principles: relations, keys, joins and set operations. Some common non-numeric variable types in R. Organizing multiple operations with pipes.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Managing relational data (i.e., multiple tables) with dplyr. Practice dealing with non-numeric variable types in R: factors, strings, dates.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
1

Topics

Advanced R programming: writing your own R functions, function vectorization, conditional execution (if, else statements), for loops and map functions.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Exercises on function writing and programming elements.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
1

Topics

Descriptive statistics in R.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Data visualization in R.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
1

Topics

Qualitative data analysis in R.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Descriptive statistics
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
1

Topics

Qualitative data analysis
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Parametric tests
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
1

Topics

Parametric tests in R.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Non-parametric tests.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
1

Topics

Non-parametric tests in R.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Correlation and linear regression.
  1. Lecture

Modality
Location
Contact hours
On site
Computer room
1

Topics

Correlation and linear regression in R.
  1. Class/Seminar

Modality
Location
Contact hours
On site
Computer room
2

Topics

Work in groups. From paper to prepared data.
Total ECTS (Credit points):
6.00
Contact hours:
36 Academic Hours
Final Examination:
Exam (Written)

Bibliography

Required Reading

1.

Wickham, H. and Grolemund, G. 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O'Reilly. Suitable for English stream

2.

Navarro, D. J. and Foxcroft, D. R. 2022. learning statistics with jamovi: a tutorial for psychology students and other beginners. (Version 0.75). Suitable for English stream

Other Information Sources

1.

Statmethods. Suitable for English stream

2.

Cheatsheets

3.

ggplot2. Suitable for English stream