Open-source · MIT License · 5 languages

Topic modeling for the humanities

Discover themes in your text corpus using LDA. No coding, no setup, no compromise on transparency.

lemmata.app

Coherence (C_v): 0.5220 — Good

Coherence is solid. Review the topics qualitatively to confirm they make sense for your research question.

Topics

Perplexity: 649.4

Log-likelihood: -143k

Documents: 144
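Perplexity and log-likelihood in a panel like this are usually two views of the same quantity. A minimal sketch, assuming the common per-token definition perplexity = exp(-log-likelihood / N), where N is the number of tokens in the corpus. Whether Lemmata computes the figure exactly this way is an assumption, and the token count is not shown above, so the numbers below are hypothetical.

```python
import math


def perplexity(log_likelihood: float, n_tokens: int) -> float:
    """Per-token perplexity under the assumed definition exp(-LL / N)."""
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    return math.exp(-log_likelihood / n_tokens)


# Hypothetical values, not taken from the panel above.
example = perplexity(log_likelihood=-6000.0, n_tokens=1000)
print(f"perplexity = {example:.1f}")
```

Lower perplexity means the model assigns higher probability to the text. Implementations differ in the logarithm base and in whether held-out documents are used, so values are only comparable within one tool and one corpus.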

Document-Term Matrix

Vocabulary (total): 5,900

Vocabulary (kept): 1,094

Terms removed: 4,806
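The three figures are consistent: 5,900 - 1,094 = 4,806. The page does not say which filters produce the reduction, so the following is only a sketch of one common approach, pruning by document frequency. The function name and the `min_df` and `max_df_ratio` thresholds are hypothetical, and the three toy documents are invented for illustration.

```python
from collections import Counter


def prune_vocabulary(docs, min_df=2, max_df_ratio=0.9):
    """Split terms into kept and removed sets by document frequency.

    A term is kept if it occurs in at least `min_df` documents and in
    at most `max_df_ratio` of all documents (both thresholds assumed).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    kept = {
        term
        for term, n in doc_freq.items()
        if n >= min_df and n / n_docs <= max_df_ratio
    }
    removed = set(doc_freq) - kept
    return kept, removed


# Toy corpus of already-lemmatized documents.
docs = [
    ["guido", "ada", "parola"],
    ["guido", "augusta", "padre"],
    ["guido", "ada", "donna"],
]
kept, removed = prune_vocabulary(docs)
print(f"total={len(kept) + len(removed)} kept={len(kept)} removed={len(removed)}")
```

In this toy corpus "guido" is dropped for appearing in every document and the single-document terms are dropped for being too rare, which is why the kept vocabulary is much smaller than the total.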

Top Lemmas (pre-LDA)

guido: 634
ada: 512
augusta: 421
carla: 347
parola: 283
donna: 241
padre: 198

Features

Built for researchers who work with texts, not terminals

No code required

Upload your corpus, configure parameters with sliders, and run LDA topic modeling entirely in your browser. Designed for literary scholars, historians, and digital humanists who want rigorous analysis without writing a single line of code.

Transparent preprocessing

Full preprocessing trace: see exactly which tokens were kept, removed, or lemmatized. Nothing is a black box.
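A preprocessing trace of this kind can be pictured as one record per input token. The sketch below is illustrative only: the `TraceEntry` structure, the tiny stopword set, and the lemma lookup table are hypothetical stand-ins, and a real pipeline would take lemmas and stopwords from a language model rather than hand-written tables.

```python
from dataclasses import dataclass


@dataclass
class TraceEntry:
    token: str   # surface form as it appeared in the text
    lemma: str   # normalized form after processing
    action: str  # "kept", "lemmatized", or "removed"
    reason: str  # why the token was removed, empty otherwise


# Hypothetical stand-ins for a real stopword list and lemmatizer.
STOPWORDS = {"le", "di", "il"}
LEMMAS = {"parole": "parola", "donne": "donna"}


def preprocess_with_trace(tokens):
    """Return one TraceEntry per input token, in input order."""
    trace = []
    for token in tokens:
        lowered = token.lower()
        if lowered in STOPWORDS:
            trace.append(TraceEntry(token, lowered, "removed", "stopword"))
            continue
        lemma = LEMMAS.get(lowered, lowered)
        action = "lemmatized" if lemma != lowered else "kept"
        trace.append(TraceEntry(token, lemma, action, ""))
    return trace


trace = preprocess_with_trace(["Le", "parole", "di", "Guido"])
output_lemmas = [entry.lemma for entry in trace if entry.action != "removed"]
for entry in trace:
    print(entry)
```

Because the trace keeps removed tokens alongside kept ones, every term that reaches the model can be traced back to the exact word it came from.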

5 languages

Built-in support for English, Italian, French, German, and Spanish with spaCy language models and per-language stopword lists.

Deterministic

A fixed random seed ensures identical results for the same corpus and settings, so every run is fully reproducible.
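LDA inference typically starts from a random initialization, which is why seeding matters. A minimal sketch of the idea, assuming the random step is the initial assignment of a topic to each token. The function name and the seed value 42 are hypothetical, and this is not Lemmata's actual code.

```python
import random


def init_topic_assignments(doc_lengths, n_topics, seed=42):
    """Assign a random starting topic to every token position.

    A local, seeded generator is used so the result does not depend
    on any global random state elsewhere in the program.
    """
    rng = random.Random(seed)
    return [
        [rng.randrange(n_topics) for _ in range(length)]
        for length in doc_lengths
    ]


run_a = init_topic_assignments([5, 3, 4], n_topics=3)
run_b = init_topic_assignments([5, 3, 4], n_topics=3)
print(run_a == run_b)
```

With the same seed and the same inputs the two runs start from the same state. Changing the corpus, the preprocessing settings, or the number of topics changes the result even when the seed stays fixed.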

Complete export

Export CSV matrices, PNG/SVG charts, a PDF report, and a full ZIP archive.

How It Works

Three steps from corpus to insight

Upload

Drag and drop your text files (TXT, PDF, DOCX, ODT, EPUB) or paste text directly.

Configure

Choose your language, number of topics, POS filters, and stopwords. Smart defaults get you started fast.

Analyse

Explore interactive topic charts, heatmaps, distributions, and word clouds. Export results in one click.

Cite

Using Lemmata in your research?

@software{koran_lemmata_2026,
  author    = {Koran, Oğuz and Yücesan, Barış and Cangır, Hakan},
  title     = {Lemmata: A Multilingual {LDA} Topic Modeling Platform
               for the Humanities},
  year      = {2026},
  doi       = {10.5281/zenodo.19391730},
  url       = {https://lemmata.app},
  note      = {Software available at https://github.com/oguzkoran-max/lemmata}
}

DOI: 10.5281/zenodo.19391730
