EU institutions need accurate, targeted, and timely information at almost every stage of policy and decision making. However, such information is increasingly embedded in large amounts of textual data available online, such as traditional or social media, or in large public or proprietary document sets. The sheer volume of this data makes it nearly impossible to manually extract the relevant information it contains.
The Competence Centre on Text Mining and Analysis (CC-TMA) uses various tools to address not only the problem of volume but also the need to deliver timely information in a suitable format and in a variety of contexts.
The application domains relevant to the EU institutions include:
- political current affairs media monitoring;
- targeted information for crisis rooms to improve the EU’s prevention, preparedness and response capabilities;
- information used for security purposes;
- business intelligence based on framework proposals;
- research and innovation monitoring;
- monitoring of health-related issues;
- monitoring of news in the financial sector.
Text mining techniques and tools are highly specific and not directly accessible or useable by decision makers or policy domain experts. The use of these tools and techniques requires a range of complementary skills: from analysis, through research and development of solutions based on computational linguistics, to deployment and operation of the systems, based on sound IT knowledge and practices. It is unlikely that small isolated groups could cover all these aspects or reach the required level of expertise.
The benefits of establishing the Competence Centre on TMA are therefore to:
- provide the expertise needed to offer practical solutions based on TMA: computational linguistic research, applied IT and support;
- maintain, expand and develop knowledge/experience in TMA in an operational environment;
- provide sufficient critical mass to support research in TMA;
- provide sufficient capacity to respond to relevant ad hoc requests;
- promote the harmonisation of tools/techniques allowing for better information exchange between users;
- leverage economies of scale by deploying the same technology/tools; in addition, institutional support for small-scale media monitoring activities can be provided to EU Offices and Agencies;
- provide a clear point of reference for TMA and act as solution broker for TMA needs;
- provide a one-stop-shop for tools, services and training for the EU institutions;
- provide advice on the use of TMA techniques for information extraction;
- support or conduct technical negotiations with external data providers of structured and unstructured textual data;
- reduce the number of external interfaces to data providers;
- together with Eurostat, organise the Data4policy community within the Commission, and ensure interaction with the Data4policy community outside the Institutions.
The CC-TMA was not one of the pilot Knowledge and Competence Centres launched on this Knowledge4Policy Platform in May 2018. While some of its key outputs are available here, its main online presence is therefore divided between two sites:
For more information or to get in touch, please contact us via email: JRC-TMA-CC@ec.europa.eu