Text mining provides organisations with the fundamental capability to extract insights and, ultimately, actionable knowledge captured in human writing.
Online media, which continuously report both facts and opinions in cross-lingual and cross-cultural settings, represent an excellent environment to validate novel algorithms. In this context, after the initial focus on factual information (who, what, where, when), it becomes essential to carry out in-depth comparisons of how topics of interest are embraced and presented across different countries and media.
For this reason, we have recently worked on a large-scale annotation campaign covering 1000 news articles and other web documents in six different languages and involving more than 30 annotators.
The task consisted of identifying the document category (genre), the framing dimensions, and the persuasion techniques.
This report presents the guidelines that were prepared for the annotators and the curators, i.e., the persons responsible for merging the annotations of individual annotators, including an extensive description of the underlying taxonomies.
The report also provides:
- a description of the underlying annotation platform (Inception),
- training material for annotators, and
- some lessons learned.
Originally Published | 14 Mar 2023 |
Knowledge service | Metadata | Text Mining | Europe Media Monitor (EMM) |
Digital Europa Thesaurus (DET) | natural language processing | linguistics |