Slides on practical Topic Modeling: Preparation, evaluation, visualization

I gave a presentation on Topic Modeling from a practical perspective*, using data about the proceedings of plenary sessions of the 18th German Bundestag as provided by offenesparlament.de. The presentation covers preparation of the text data for Topic Modeling, evaluating models using a variety of model quality metrics and visualizing the complex distributions in the models. You can have a look at the slides here:

Probabilistic Topic Modeling with LDA – Practical topic modeling: Preparation, evaluation, visualization

The source code of the example project is available on GitHub. It shows how to perform the preprocessing and model evaluation steps with Python using tmtoolkit. The models can be inspected using PyLDAVis and some (exemplary) analyses on the data are performed.

* This presentation builds up on a first session on the theory behind Topic Modeling

Comments are closed.

Post Navigation