UNLEASHING THE POWER OF DEEP LEARNING FOR SLOVAK LANGUAGE

SlovakBERT

trained by

consulted and evaluated by

THE FIRST LARGE-SCALE SLOVAK MASKED LANGUAGE MODEL

Try it yourself!

Type a few words followed by <mask> in place of the word that you want SlovakBERT to fill in.
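
The same fill-in-the-blank query can also be scripted outside the demo. The following is a minimal sketch, assuming the Hugging Face `transformers` library is installed and that the model is available on the Hugging Face Hub under the identifier `gerulata/slovakbert`. That identifier and the helper names `build_prompt` and `fill_mask` are assumptions for illustration and are not taken from this page.

```python
MASK_TOKEN = "<mask>"              # mask token as named on this page
MODEL_ID = "gerulata/slovakbert"   # assumed Hub identifier, not stated on this page


def build_prompt(text: str) -> str:
    """Strip the text and check that it contains exactly one mask token."""
    prompt = text.strip()
    count = prompt.count(MASK_TOKEN)
    if count != 1:
        raise ValueError(f"expected exactly one {MASK_TOKEN}, found {count}")
    return prompt


def fill_mask(text: str, top_k: int = 5):
    """Return (token, score) pairs for the top_k predicted fill-ins."""
    # Imported lazily so build_prompt stays usable without transformers.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model=MODEL_ID)
    results = unmasker(build_prompt(text), top_k=top_k)
    return [(r["token_str"], r["score"]) for r in results]
```

Calling `fill_mask("Bratislava je hlavné <mask> Slovenska.")` would download the model on first use and return the highest-scoring candidates for the masked word.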

    Disclaimer: Results do not represent the views or opinions of KInIT or the SlovakBERT creators. KInIT bears no legal or other responsibility for the results. They are produced from a (so far imperfect) understanding of the processed text corpora, which are mainly web-based, and they mirror the current mood of Slovak society as represented on the internet. Learn how it works and how we plan to make it better.

    How does it work?

    SlovakBERT is an artificial neural network trained to understand natural language. It is the first modern large-scale model for the Slovak language. It was trained on a very large amount of text from the Web using powerful hardware. This huge corpus is a snapshot of how Slovak is used in practice today. Models like SlovakBERT are now used in various AI applications because of their strong linguistic capabilities. They can be easily adapted to solve a variety of NLP tasks, such as document classification or sentiment analysis, with high accuracy using a relatively small amount of additional training data.

    Read more: Blog post, Paper

    SlovakBERT use cases

    Industry

    SlovakBERT is an open-source model that can be used to tackle your Slovak NLP challenges. The model is trained on a general Web-based Slovak corpus, but it can be easily adapted to new domains and new tasks. Our benchmarking shows that the model achieves state-of-the-art results (i.e. the best known results) for the Slovak language. SlovakBERT can handle text classification, entity recognition, token classification, parsing, semantic similarity modeling and many other tasks.

    If you need help with your business tasks using NLP, contact us. We can help you with the whole machine learning lifecycle, from gathering data to training and deploying the model.

    Research

    SlovakBERT can also be used for text analytics in academic research. We plan to use it to study biases, disinformation, hate speech and other forms of antisocial behavior in the Slovak language. You can learn more about our NLP research from our NLP Team. You can use its linguistic capabilities in your own research too. The model itself is open-source, but please cite our work [1] and let us know what you have achieved with SlovakBERT.

    If you are interested in NLP research collaboration, feel free to contact us.

    [1] Pikuliak, Matúš, et al. "SlovakBERT: Slovak Masked Language Model." arXiv preprint arXiv:2109.15254 (2021).

    Applications

    Sentiment analysis

    Recognize sentiment and emotions.

    Adaptable to any domain: social networks, product and service reviews, or customer interaction.

    Document classification

    Categorize a plethora of documents automatically!

    99%

    Outstanding performance even with many document classes (check out our paper).

    Semantic similarity

    Find similar documents/sentences.

    79%

    Much better than traditional search. Cutting-edge performance for the Slovak language.
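
To illustrate how similar sentences can be found once each sentence has been turned into a vector, here is a minimal sketch that ranks candidates by cosine similarity. Cosine similarity is a common choice and an assumption here, since this page does not say which measure is used. The function names and the toy vectors are hypothetical stand-ins for real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, candidates):
    """Sort candidate names by similarity to the query, most similar first.

    candidates maps a sentence (or document id) to its vector.
    """
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors, made up for illustration only.
toy_vectors = {
    "sentence A": [0.9, 0.1, 0.0],
    "sentence B": [0.0, 0.2, 0.9],
}
ranking = rank_by_similarity([1.0, 0.0, 0.0], toy_vectors)
```

In practice the vectors would come from the model rather than being written by hand, and the ranking step stays the same.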

    Universal sentence representations

    Create versatile representations matching your business needs.

    Example sentences: "Tiger is a big cat." "How do lions hunt their prey?" "It rains in autumn."
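
One common way to turn a sentence into a single fixed-size vector is to average the per-token vectors produced by the model while skipping padding positions (mean pooling). This page does not state which pooling method is used, so the sketch below is an assumption. The function name `mean_pool` and the toy numbers are hypothetical.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1.

    token_vectors: list of equal-length lists of floats, one per token.
    attention_mask: list of 0/1 flags, 0 marking padding tokens.
    """
    if len(token_vectors) != len(attention_mask):
        raise ValueError("token_vectors and attention_mask differ in length")
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask excludes every token")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Two real tokens followed by one padding token (toy values).
sentence_vector = mean_pool(
    [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]],
    [1, 1, 0],
)
```

The resulting vector has the same dimensionality for every sentence, which is what lets sentences of different lengths be compared or fed to a downstream classifier.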

    Let’s get in touch


      We process the personal data you provide through this form on the legal basis of GDPR Art. 6 (1) letter (a), consent to the processing of personal data. You can withdraw your consent at any time. For more information, please read our Privacy Policy.