Natural Language Processing
We specialize in advanced language models, semantic text analysis techniques, knowledge bases and speech transcription.
Comprehensive and advanced text analysis services for research and development in science and business
State-of-the-art computing infrastructure designed for machine learning and artificial intelligence workloads.
The project is scheduled to run from 2020 to 2023. Its main goal is to expand the concentrated research infrastructure of CLARIN-PL, which will become a research and development platform for natural language processing and for mining large volumes of language data. The infrastructure will provide access to universal language technology components, together with mechanisms for connecting these components to build both general and specialized text analysis systems. As part of the infrastructure, an IT architecture will be created for building effective and efficient systems that mine large volumes of language data (texts and speech) and multimodal data.
Achieving the main project objective involves several specific objectives:
creation of a technology centre providing a hardware base for the development of linguistic and multimodal data mining technologies;
equipping the centre with advanced language technologies (databases, software modules, network services and applications) for intelligent processing of large heterogeneous data in areas not supported by existing research infrastructures or open technology;
providing the market with tools for analyzing linguistic data (i.e. speech and text records) in a simple, structured form (e.g. text with information about its internal structure or its relationships with other texts, or speech recordings with meta tags referring to selected fragments) and annotated data (e.g. texts or speech recordings supplemented with hypertext or metadata linked to other resources);
unifying many language resources and tools for European languages so that they are interoperable;
developing and implementing appropriate standards for language resources and tools;
providing comprehensive and easy access to the archive of language resources and technologies, as well as to a range of applications intended for end users.
The project is financed under the 2014-2020 Smart Development Operational Programme, Priority IV: Increasing the scientific and research potential, Measure 4.2: Development of modern research infrastructure of the science sector, No. POIR.04.02.00-00C002/19, "CLARIN - Common Language Resources and Technology Infrastructure".