Form No. M-3 (8)
Study Course Description

Corpus Analysis

Main Study Course Information

Course Code
KF_057
Branch of Science
Communication Theory; Media and communications
ECTS
3.00
Target Audience
Business Management; Communication Science; Information and Communication Science; Juridical Science; Law; Management Science; Pedagogy; Political Science; Psychology; Public Health; Social Anthropology; Social Welfare and Social Work; Sociology
LQF
Level 7
Study Type And Form
Full-Time

Study Course Implementer

Course Supervisor
Structural Unit Manager
Structural Unit
Faculty of Social Sciences
Contacts

Dzirciema street 16, Rīga, szf@rsu.lv

About Study Course

Objective

To learn to work with Latvian-language corpora and to interpret the meaning of lexical units in large texts.

Preliminary Knowledge

Not required.

Learning Outcomes

Knowledge

1. Students are able to recognize the principles of meaning, the basic differences between paradigms of meaning, and the principles of building and using language corpora.

Skills

1. Students can use the Sketch Engine programme to find word frequencies, concordances and collocations, extract statistical data on language use, and interpret the data qualitatively and quantitatively.

Competences

1. Students formulate a problem in the field of their professional interests as a problem of language use, advance a hypothesis, extract and interpret quantitative and qualitative data, and formulate conclusions and recommendations.

Assessment

Individual work

1. Individual work
% of total grade: -
Grade: -
Interpretation of the meaning of essentially contested notions in the student's major discipline, using Latvian-language corpora. In order to evaluate the quality of the study course as a whole, the student must fill out the study course evaluation questionnaire on the Student Portal.

Examination

1. Examination
% of total grade: -
Grade: -
Use of social science terms and/or key words of the student's field of specialization in political, legal, media and everyday public discourse. The report includes the following sections:
- formulation of a research problem related to the use of language in some area of public life (justice, journalism, politics, health care, economy, public relations, education) (1/2 page);
- a list of key words (notions, terms, concepts) and the justification for their choice (1-3 words);
- statistical hypotheses about the use of the key words in corpora and subcorpora (3-5 hypotheses);
- description of the corpora selected for the study and justification of the choice (1/2 page);
- description of the Sketch Engine queries (for one key word, if the queries for the other words are similar) (1/2 page);
- summary of statistical data in tables (frequency, normalized frequency, relative frequency, t-score, MI score, logDice, statistical significance in the z-test, statistical significance in Pearson's χ2 test, Cramer's V, depending on the needs of the research), with a description of the formulas used;
- interpretation of concordances (up to 1 page) with examples of query results;
- interpretation of statistical data (up to 1/2 page);
- integration of qualitative and quantitative data in the conclusions (1/2 page).
Report volume: 4-5 pages of text, plus 1-3 pages of example tables and concordances.
2. Examination
% of total grade: -
Grade: -
The student knows the corpora, identifies a research problem, formulates a hypothesis, selects key words, extracts and interprets collocations and concordances, extracts and processes statistical data, and formulates conclusions. Independent work, participation in classes, exam.

Study Course Theme Plan

FULL-TIME
Part 1
  1. Video Lecture

Modality: On site
Location: Auditorium
Contact hours: 2

Topics

The concept of Corpus. Statistic regularity of language use. Sketch Engine programme.
  1. Video Lecture

Modality: On site
Location: Auditorium
Contact hours: 2

Topics

Word frequency. Data normalization and relative frequency.
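As a minimal illustration of this topic, the sketch below shows why raw counts from corpora of different sizes need normalizing before comparison. It assumes that "normalized frequency" means occurrences per million tokens and that "relative frequency" means the share of all tokens; the course may define these terms differently. The function names and the counts are hypothetical.

```python
def per_million(count: int, corpus_size: int) -> float:
    """Normalized frequency: occurrences per one million tokens."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count * 1_000_000 / corpus_size


def relative_frequency(count: int, corpus_size: int) -> float:
    """Relative frequency: share of all tokens taken up by the word (0 to 1)."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count / corpus_size


# Hypothetical counts: the word is rarer in absolute terms in the small corpus,
# but more frequent once corpus size is taken into account.
small_corpus = per_million(150, 2_000_000)
large_corpus = per_million(900, 30_000_000)
print(small_corpus, large_corpus)
```

With these invented numbers the raw count is six times higher in the larger corpus, yet the normalized frequency is higher in the smaller one.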
  1. Class/Seminar

Modality: On site
Location: Computer room
Contact hours: 2

Topics

Statement of a language use research problem, selection of key words.
  1. Video Lecture

Modality: On site
Location: Auditorium
Contact hours: 2

Topics

Word frequency (cont.). Concordance.
  1. Video Lecture

Modality: On site
Location: Auditorium
Contact hours: 2

Topics

Collocation. Statistical analysis (MI-, MS-, t-scores).
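The association scores named in this topic can be sketched as follows. The formulas are the commonly cited definitions of MI score, t-score and minimum sensitivity (MS); whether the course uses exactly these variants is an assumption. Here `f_ab` is the co-occurrence count of node and collocate, `f_a` and `f_b` are their individual frequencies, and `n` is the corpus size. All counts are hypothetical.

```python
import math


def mi_score(f_ab: int, f_a: int, f_b: int, n: int) -> float:
    """Mutual information: log2 of observed over expected co-occurrence."""
    return math.log2(f_ab * n / (f_a * f_b))


def t_score(f_ab: int, f_a: int, f_b: int, n: int) -> float:
    """T-score: (observed - expected) divided by sqrt(observed)."""
    expected = f_a * f_b / n
    return (f_ab - expected) / math.sqrt(f_ab)


def min_sensitivity(f_ab: int, f_a: int, f_b: int) -> float:
    """Minimum sensitivity: the smaller of the two conditional proportions."""
    return min(f_ab / f_a, f_ab / f_b)


# Hypothetical counts for one node word and one collocate.
f_ab, f_a, f_b, n = 100, 1_000, 2_000, 1_000_000
print(mi_score(f_ab, f_a, f_b, n))
print(t_score(f_ab, f_a, f_b, n))
print(min_sensitivity(f_ab, f_a, f_b))
```

MI tends to favour rare but exclusive pairings, while the t-score tends to favour frequent ones, which is one reason the two are often reported side by side.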
  1. Class/Seminar

Modality: On site
Location: Computer room
Contact hours: 2

Topics

Defining meaning of words: dictionary and corpus.
  1. Video Lecture

Modality: On site
Location: Auditorium
Contact hours: 2

Topics

Collocation (cont.). Statistical analysis (logDice).
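A minimal sketch of the logDice score follows, using the commonly cited formula 14 + log2(2·f_AB / (f_A + f_B)). The function name and the counts are hypothetical. Because corpus size does not appear in the formula, the score can be compared across corpora of different sizes, and its theoretical maximum is 14.

```python
import math


def log_dice(f_ab: int, f_a: int, f_b: int) -> float:
    """logDice: 14 + log2 of the Dice coefficient 2*f_ab / (f_a + f_b)."""
    if f_ab <= 0:
        raise ValueError("f_ab must be positive")
    return 14 + math.log2(2 * f_ab / (f_a + f_b))


# Hypothetical counts: co-occurrences, node frequency, collocate frequency.
print(log_dice(100, 1_000, 2_000))

# Two words that only ever occur together reach the maximum of 14.
print(log_dice(5, 5, 5))
```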
  1. Video Lecture

Modality: On site
Location: Auditorium
Contact hours: 2

Topics

Corpus statistics (Pearson's chi-squared test, Cramer's V, residuals).
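The statistics in this topic can be illustrated with a small contingency table. The sketch below is plain Python and makes several assumptions: the table compares how often a key word occurs in two subcorpora, no continuity correction is applied, and "residuals" means Pearson residuals, (observed - expected) / sqrt(expected). The function name and the counts are hypothetical.

```python
import math


def chi_squared_analysis(table):
    """Return (chi2, cramers_v, residuals) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            residual_row.append((observed - expected) / math.sqrt(expected))
        residuals.append(residual_row)

    # Cramer's V rescales chi2 to the range 0..1 as an effect size.
    k = min(len(table), len(table[0]))
    cramers_v = math.sqrt(chi2 / (n * (k - 1)))
    return chi2, cramers_v, residuals


# Hypothetical counts: rows are two subcorpora,
# columns are [key word, all other tokens].
table = [[30, 970], [10, 990]]
chi2, v, residuals = chi_squared_analysis(table)
print(round(chi2, 2), round(v, 3))
print(residuals)
```

For a 2×2 table there is one degree of freedom, so a chi-squared value above 3.841 is significant at the 0.05 level. A small Cramer's V alongside a significant test is a reminder that significance and effect size answer different questions.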
  1. Class/Seminar

Modality: On site
Location: Computer room
Contact hours: 2

Topics

Collocation of the researched words: qualitative and quantitative analysis.
  1. Class/Seminar

Modality: On site
Location: Computer room
Contact hours: 2

Topics

Presentation and discussion of the research paper.
Total ECTS (Credit points):
3.00
Contact hours:
20 Academic Hours
Final Examination:
Exam (Written)

Bibliography

Required Reading

1.

Kruks, S., I. Skulte. (2016). „Politikas izzušana Saeimas diskursā”. Latvijas Zinātņu akadēmijas Vēstis: 49-56.

2.

Kruks, S. (2020). “Uzticības, sadarbības un vienotības konceptu izpratne Nacionālajā attīstības plānā 2021.-2027. gadam”. Akadēmiskā Dzīve 56: 131-147.

3.

McEnery, Tony and Andrew Hardie. (2012). Corpus Linguistics: Method, Theory and Practice. Cambridge: Cambridge University Press.

4.

Paquot, M. and S. Gries. (2020). A Practical Handbook of Corpus Linguistics. Springer.

Additional Reading

1.

Barczewska, Shala. (2017). Corpus-Based Analysis of US Press Discourse. Cambridge Scholars Publishing.

2.

Cunningham, Clark D. and Jesse Egbert. (2020). Analyzing legal discourse in the United States. Pp. 462-480 in Friginal E., Hardy J. (eds) The Routledge Handbook of Corpus Approaches to Discourse Analysis. London: Routledge.

3.

Darģis, R., G. Rābante-Buša, I. Auziņa, S. Kruks. (2016). „ParliSearch – a system for large text corpus discourse analysis”. Pp. 115-121 in I. Skadiņa, R. Rozis (eds) Human Language Technologies – The Baltic Perspective. IOS Press.

4.

Gerring, J. (1999). "What Makes a Concept Good? A Criterial Framework for Understanding Concept Formation in the Social Sciences". Polity, Vol.31, No.3, (Spring), pp. 357–393.

5.

Kaal, B., I. Maks and A. van Elfrinkhof. (2014). From Text to Political Positions. Amsterdam: John Benjamins.

6.

Marmor, Andrei and Scott Soames. (2011). Philosophical Foundations of Language in the Law. Oxford: Oxford University Press.