Statistical Programming and Data Management
Study Course Implementer
23 Kapselu street, 2nd floor, Riga, statistika@rsu.lv, +371 67060897
About Study Course
Objective
Data analysis and management skills are critical to modern applied statistical research. The aim of the course is to introduce students to two statistical software tools used in biostatistics research: R and Jamovi. The objectives of the course are therefore to:
• Introduce students to statistical programming and data management using the R statistical software;
• Prepare students for computer work in other courses of the Biostatistics study programme.
Preliminary Knowledge
There are no specific prerequisites; however, basic computer skills and high-school-level algebra and statistics concepts will be used in the course.
Learning Outcomes
Knowledge
• Know, select and independently use the main programming principles in R.
• Use database management principles in R.
• Learn and apply advanced programming elements such as conditional execution, loops and custom functions.
Skills
After this course, students will be able to:
• Read various types of data into the software and organize them for analysis.
• Carry out data transformation and visualization with R.
• Write and use their own R functions to automate common tasks.
Competences
Students will be able to:
• Use R and Jamovi software proficiently for statistical analysis in other biostatistics courses.
• Recognize the differences between the R and Jamovi programs and choose the more suitable one for their analysis.
• Deepen their statistical programming skills independently in order to perform research or analyse health-related data.
Assessment
Individual work
| Title | % from total grade | Grade |
|---|---|---|
| 1. Individual work | - | - |

• Individual work with the lecture material in preparation for every lecture of the study course.
• Independent preparation of the two assigned computer projects, practising the concepts studied in the course.
In order to evaluate the quality of the study course as a whole, the student must fill out the study course evaluation questionnaire on the Student Portal.

Examination
| Title | % from total grade | Grade |
|---|---|---|
| 1. Examination | - | 10 points |

Assessment on the 10-point scale according to the RSU Educational Order:
• Two computer projects to be submitted: 50% × 2 = 100%

Study Course Theme Plan
Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Introduction to the R language and RStudio: interface, workflow, scripts and coding basics. Introduction to data visualization with ggplot2: aesthetic mappings, geometric objects and statistical transformations.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Practice using the R interface. Practice answering simple questions about data by creating ggplot2 visualizations of R built-in datasets.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Introduction to data wrangling (transformation) with the dplyr package: filter, arrange, select, create new variables and summarize. Introduction to the R projects workflow.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Practice various data transformations with the dplyr package on R built-in datasets.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Exploratory data analysis: assessing variation and covariation. Statistical summaries with boxplots. Reading various types of data into R.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Practice exploratory data analysis in R: data variation (barplots, histograms, boxplots) and covariation (visualizing the relationship between two variables). Practice using the readr package for reading data.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Principles of consistently organizing data in R with the tidyr package: gathering and spreading datasets, dealing with NA values. Introduction to the R Markdown reporting system, various Markdown formats and principles of writing mathematical symbols.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Importing and tidying a real-life dataset. Creating an R Markdown report.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Introduction to data management in R. Relational data principles: relations, keys, joins and set operations. Some common non-numeric variable types in R. Organizing multiple operations with pipes.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Managing relational data (i.e., multiple tables) with dplyr. Practice dealing with non-numeric variable types in R: factors, strings, dates.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Advanced R programming: writing your own R functions, function vectorization, conditional execution (if, else statements), for loops and map functions.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Exercises on function writing and programming elements.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Descriptive statistics in R.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Data visualization in R.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Qualitative data analysis in R.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Descriptive statistics.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Qualitative data analysis.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Parametric tests.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Parametric tests in R.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Non-parametric tests.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Non-parametric tests in R.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Correlation and linear regression.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Correlation and linear regression in R.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Work in groups. From paper to prepared data.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 1 |

Topics: Introduction to the R language and RStudio: interface, workflow, scripts and coding basics. Introduction to data visualization with ggplot2: aesthetic mappings, geometric objects and statistical transformations.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Practice using the R interface. Practice answering simple questions about data by creating ggplot2 visualizations of R built-in datasets.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 1 |

Topics: Introduction to data wrangling (transformation) with the dplyr package: filter, arrange, select, create new variables and summarize. Introduction to the R projects workflow.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Practice various data transformations with the dplyr package on R built-in datasets.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 1 |

Topics: Exploratory data analysis: assessing variation and covariation. Statistical summaries with boxplots. Reading various types of data into R.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Practice exploratory data analysis in R: data variation (barplots, histograms, boxplots) and covariation (visualizing the relationship between two variables). Practice using the readr package for reading data.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 1 |

Topics: Principles of consistently organizing data in R with the tidyr package: gathering and spreading datasets, dealing with NA values. Introduction to the R Markdown reporting system, various Markdown formats and principles of writing mathematical symbols.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Importing and tidying a real-life dataset. Creating an R Markdown report.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 1 |

Topics: Introduction to data management in R. Relational data principles: relations, keys, joins and set operations. Some common non-numeric variable types in R. Organizing multiple operations with pipes.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Managing relational data (i.e., multiple tables) with dplyr. Practice dealing with non-numeric variable types in R: factors, strings, dates.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 1 |

Topics: Advanced R programming: writing your own R functions, function vectorization, conditional execution (if, else statements), for loops and map functions.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Exercises on function writing and programming elements.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 1 |

Topics: Descriptive statistics in R.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Data visualization in R.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 1 |

Topics: Qualitative data analysis in R.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Descriptive statistics.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 1 |

Topics: Qualitative data analysis.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Parametric tests.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 1 |

Topics: Parametric tests in R.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Non-parametric tests.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 1 |

Topics: Non-parametric tests in R.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Correlation and linear regression.

Lecture
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 1 |

Topics: Correlation and linear regression in R.

Class/Seminar
| Modality | Location | Contact hours |
|---|---|---|
| On site | Computer room | 2 |

Topics: Work in groups. From paper to prepared data.

Bibliography
Required Reading
Wickham, H. and Grolemund, G. 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O'Reilly. Suitable for English stream.
Navarro, D. J. and Foxcroft, D. R. 2022. learning statistics with jamovi: a tutorial for psychology students and other beginners (Version 0.75). Suitable for English stream.