CLARIN-PL (Common Language Resources and Technology Infrastructure) is a research infrastructure designed to provide advanced research and development services to companies, academic groups and the public sector. It offers advanced computing services and data storage, with particular emphasis on the use of natural language processing technology in industrial research.

Natural Language Processing

We specialize in advanced language models, semantic text analysis techniques, knowledge bases and speech transcription.


Comprehensive and advanced text analysis services for research and development in science and business.


The most modern computing infrastructure designed to work with machine learning and artificial intelligence.


The project is scheduled to run between 2020 and 2023. Its main goal is to expand the concentrated research infrastructure of CLARIN-PL, which will become a research and development platform for natural language processing and for mining large language data. The infrastructure will provide access to universal language technology components, together with mechanisms for connecting these components to build both general and specialized text analysis systems. As part of the infrastructure, an IT architecture will be created for building effective and efficient systems that mine large language data (texts and speech) and multimodal data.

Achieving the main project objective involves several specific objectives:

  • Technology centre

    creation of a technology centre providing a hardware base for the development of linguistic and multimodal data mining technologies;

  • Language technologies

    equipping the centre with advanced language technologies (databases, software modules, network services and applications) for intelligent processing of large heterogeneous data in areas not supported by existing research infrastructures or open technology;

  • Tools

    providing the market with tools for analyzing linguistic data (i.e. speech and text records) in a simple, structured form (e.g. text with information about its internal structure or relationships with other texts, speech recording with meta tags referring to selected fragments of it) and annotated data (e.g. texts or speech recordings supplemented with hypertext or metadata linked to other resources);

  • Unification

    unifying many language resources and tools for European languages so that they are interoperable;

  • Standards

    developing and implementing appropriate standards for language resources and tools;

  • Access

    providing comprehensive and easy access to the archive of language resources and technologies, as well as a range of applications intended for use by the end user.


Consortium companies

Supporting companies

Project Leader

Wroclaw University of Science and Technology

Wybrzeże Wyspiańskiego 27

50-370 Wrocław

NIP: 896-000-58-51
REGON: 000001614

tel.: +48 71 320 4221

tel.: +48 71 320 4224



The project is financed under the 2014-2020 Smart Development Operational Programme, Priority IV: Increasing the scientific and research potential, Measure 4.2: Development of modern research infrastructure of the science sector, No. POIR.04.02.00-00C002/19, "CLARIN - Common Language Resources and Technology Infrastructure".