New (June 2025): We have released a modern Dash interface for MS2LDA (MS2LDA 2.0). The codebase is actively developed at vdHooftCompMet/MS2LDA, and an accompanying pre-print is coming soon. We encourage users to migrate to the new dashboard; the legacy MS2LDAviz interface will remain available during this transition period.
Metabolomics is the large-scale, untargeted study of small molecules (metabolites) involved in life-sustaining chemical processes. Untargeted metabolomics has contributed to medical diagnostics, drug discovery, personalised medicine and many other fields. Measurements are routinely performed using liquid chromatography-mass spectrometry (LC-MS) instruments. With tandem MS (MS/MS), the fragmentation peaks characteristic of a compound can help establish its identity.
Fragmentation spectra provide characteristic fingerprints of compounds and contain structural information: a subset of fragment peaks may correspond to a shared chemical substructure. This site lets users perform unsupervised substructure discovery, decompose experiments into Mass2Motifs (recurring sets of fragments/losses) and integrate these motifs with comparative metabolomics data.
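How does it work? In MS2LDA, discrete fragment and neutral-loss features are extracted from spectra; sets of co-occurring features are then detected with Latent Dirichlet Allocation (LDA). In the analogy to text topic modelling, spectra play the role of documents, fragment and neutral-loss features the role of words, and Mass2Motifs the role of topics. The analogy is illustrated below.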
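The feature-extraction step described above can be sketched in a few lines of Python. This is a minimal illustration and not the MS2LDA implementation: the function name `spectrum_to_words`, the rounding to two decimals, the neutral-loss window of 10 to 200 Da, the intensity scaling and the two toy spectra are all assumptions chosen for the example.

```python
from collections import Counter


def spectrum_to_words(precursor_mz, peaks, decimals=2, min_loss=10.0, max_loss=200.0):
    """Convert one MS2 spectrum into a bag of discrete features ("words").

    peaks is a list of (mz, intensity) tuples. The binning (rounding to
    `decimals`) and the loss window are illustrative assumptions.
    """
    words = Counter()
    if not peaks:
        return words
    base_intensity = max(intensity for _, intensity in peaks)
    if base_intensity <= 0:
        return words
    for mz, intensity in peaks:
        # Scale intensities to 0..100 so they act like word counts.
        weight = round(100 * intensity / base_intensity)
        if weight == 0:
            continue
        words[f"fragment_{mz:.{decimals}f}"] += weight
        # A neutral loss is the mass difference between precursor and fragment.
        loss = precursor_mz - mz
        if min_loss <= loss <= max_loss:
            words[f"loss_{loss:.{decimals}f}"] += weight
    return words


# Two made-up spectra: name -> (precursor m/z, peak list).
spectra = {
    "spectrum_a": (180.0634, [(162.0528, 50.0), (91.0542, 100.0)]),
    "spectrum_b": (150.0550, [(132.0444, 80.0), (91.0542, 40.0)]),
}

# The "corpus": one bag of words per spectrum, ready for a topic model.
corpus = {name: spectrum_to_words(mz, peaks) for name, (mz, peaks) in spectra.items()}

# Features present in both spectra are the kind of co-occurrence LDA picks up on.
shared = sorted(set(corpus["spectrum_a"]) & set(corpus["spectrum_b"]))
print(shared)  # ['fragment_91.05', 'loss_18.01']
```

In this toy corpus both spectra share a fragment and a neutral loss, which is the kind of recurring pattern a Mass2Motif is meant to capture. A real analysis would pass such per-spectrum counts to an LDA implementation rather than intersecting feature sets by hand.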
The tool accepts mzML, MSP and MGF files; an MS1 peak list can optionally be matched before running LDA or Decomposition.
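To give a sense of what one of the accepted input formats looks like, the sketch below reads spectra from MGF text using only the Python standard library. It is an illustrative assumption rather than the parser used by the tool: the helper names `parse_mgf` and `precursor_mz` and the example spectrum are hypothetical, and only the basic `BEGIN IONS` / `END IONS` layout with `KEY=value` lines and peak lines is handled.

```python
def parse_mgf(text):
    """Parse MGF-formatted text into a list of spectra.

    Each spectrum is a dict with "params" (upper-cased KEY -> value string)
    and "peaks" (list of (mz, intensity) tuples).
    """
    spectra = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line and not line[0].isdigit():
                # Header line such as TITLE=..., PEPMASS=..., CHARGE=...
                key, value = line.split("=", 1)
                current["params"][key.strip().upper()] = value.strip()
            else:
                # Peak line: m/z, intensity, optionally further columns.
                parts = line.split()
                if len(parts) >= 2:
                    current["peaks"].append((float(parts[0]), float(parts[1])))
    return spectra


def precursor_mz(spectrum):
    """Return the precursor m/z; PEPMASS may carry an intensity after it."""
    return float(spectrum["params"]["PEPMASS"].split()[0])


# Made-up example input.
MGF_EXAMPLE = """BEGIN IONS
TITLE=example spectrum
PEPMASS=180.0634 1200.0
CHARGE=1+
91.0542 100.0
162.0528 50.0
END IONS
"""

parsed = parse_mgf(MGF_EXAMPLE)
print(len(parsed), precursor_mz(parsed[0]), parsed[0]["peaks"])
```

For real data, an established reader such as those in matchms or pyteomics is likely the more robust choice, since files in the wild vary in headers, comment styles and peak columns.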
Key literature related to MS2LDA:
Other papers citing MS2LDA can be found here.
To host your own instance of the legacy site, see github.com/glasgowcompbio/ms2ldaviz.