Dynamic Word Cloud Generator

TF-IDF Term Weighting · Mask-Shaped Layout · Frame-by-Frame Animation

Abstract

A small text-corpus visualiser built as a teaching exercise on top of nltk, scikit-learn, and the wordcloud library. Tokens are weighted by TF-IDF rather than raw frequency so that domain-specific terms surface above generic high-frequency words. The renderer respects an arbitrary alpha mask (a Pokémon Arceus silhouette, a fall-leaf shape, a Mexico outline), and an animated variant updates the layout frame-by-frame so the cloud "grows" as the corpus is fed in. This is an exploration / demonstration project, not a research result.

Pipeline

  1. Tokenise & clean. Lowercase, strip punctuation, drop English stop-words from nltk.corpus.stopwords, lemmatise with WordNetLemmatizer.
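  2. TF-IDF weighting. Run TfidfVectorizer on the cleaned token stream against a background corpus (English Wikipedia sample). The textbook score for term t in document d is:
    tf-idf(t, d)  =  tf(t, d) · log(N / df(t))
    where tf(t, d) is the frequency of t in d, N is the total number of documents in the background corpus, and df(t) is the number of those documents that contain t. This dampens generic English words and amplifies domain-specific ones. TfidfVectorizer's defaults differ slightly from this form: it uses a smoothed idf, ln((1 + N) / (1 + df(t))) + 1, and L2-normalises each document vector.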
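  3. Mask-aware layout. Pass the binary alpha channel of an input PNG (Arceus silhouette, leaf shape, Mexico outline) into WordCloud(mask=...). The layout engine packs words inside the mask; pixels with alpha = 0 are forbidden. WordCloud itself treats pure-white (255) mask entries as off-limits, so transparent pixels need to map to 255 in the array that is passed in.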
  4. Animation. The dynamic variant feeds the corpus in chunks and re-renders the cloud after each chunk. The resulting frames are stitched into a GIF with imageio.mimsave.
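
The weighting formula in step 2 can be made concrete with a small worked example. This is a minimal pure-Python sketch of the textbook form tf(t, d) · log(N / df(t)), not the project's actual code and not a reimplementation of TfidfVectorizer (whose defaults smooth the idf and normalise the vector). The function name `tf_idf`, the toy background corpus, and the choice to treat terms unseen in the background as df = 1 are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(doc_tokens, background_docs):
    """Score each term in doc_tokens as tf(t, d) * log(N / df(t)).

    tf is the raw count of the term in the document; N and df come from
    the background corpus (a list of token lists).
    """
    n_docs = len(background_docs)

    # df(t): number of background documents containing t at least once.
    df = Counter()
    for doc in background_docs:
        df.update(set(doc))

    scores = {}
    for term, count in Counter(doc_tokens).items():
        # Assumption: a term never seen in the background is treated as
        # df = 1 (maximally rare) to avoid dividing by zero.
        doc_freq = df[term] if df[term] > 0 else 1
        scores[term] = count * math.log(n_docs / doc_freq)
    return scores


# Toy background corpus: "the" appears everywhere, "mask" in one document.
background = [
    ["the", "cat", "sat"],
    ["the", "dog", "ran"],
    ["the", "mask", "layout"],
    ["the", "cloud", "moved"],
]
document = ["the", "the", "the", "mask", "mask", "silhouette"]

weights = tf_idf(document, background)
for term, score in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term:12s} {score:.4f}")
```

Although "the" is the most frequent token in the toy document, it occurs in every background document, so log(N / df) is zero and it drops to the bottom of the ranking. A dictionary of this shape is the kind of input that WordCloud.generate_from_frequencies accepts.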

Mask Examples

Each mask is a binary alpha image. The static cloud (PNG) shows the final layout; the animated cloud (GIF) shows how the layout evolves chunk by chunk as the corpus is fed in.
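
As a rough illustration of how a binary alpha image and a chunked corpus could drive these renders, the sketch below shows one possible arrangement. It is not the project's code: the helper names `alpha_to_mask`, `cumulative_chunks`, and `render_gif`, the chunk size, and the use of raw counts in place of TF-IDF weights are assumptions. The wordcloud library documents white (255) mask entries as masked out, so the first helper maps transparent pixels to 255 and everything else to 0.

```python
import numpy as np


def alpha_to_mask(alpha):
    """Convert an alpha channel (H x W, 0 = transparent) into a mask array
    in which 255 marks forbidden pixels and 0 marks drawable ones."""
    alpha = np.asarray(alpha)
    return np.where(alpha == 0, 255, 0).astype(np.uint8)


def cumulative_chunks(tokens, chunk_size):
    """Yield the growing prefix of tokens after each chunk is fed in."""
    for end in range(chunk_size, len(tokens) + chunk_size, chunk_size):
        yield tokens[:end]


def render_gif(tokens, mask_png, out_path, chunk_size=200):
    """Render one frame per cumulative chunk and stitch the frames into a GIF.

    Hypothetical helper: third-party imports are kept local so the two
    helpers above can be used without them installed.
    """
    from collections import Counter

    import imageio
    from PIL import Image
    from wordcloud import WordCloud

    # Take the alpha channel of the RGBA image and convert it to a mask.
    alpha = np.array(Image.open(mask_png).convert("RGBA"))[:, :, 3]
    mask = alpha_to_mask(alpha)

    frames = []
    for prefix in cumulative_chunks(tokens, chunk_size):
        # Assumption: raw counts stand in for the TF-IDF weights here.
        cloud = WordCloud(mask=mask, background_color="white", random_state=0)
        cloud.generate_from_frequencies(Counter(prefix))
        frames.append(cloud.to_array())

    imageio.mimsave(out_path, frames)
```

Each frame is rendered from the whole prefix seen so far rather than from the latest chunk alone, which is what makes the cloud appear to fill in over time.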

Arceus mask · static

Word cloud arranged inside a Pokémon Arceus silhouette mask, with TF-IDF-weighted terms

Arceus mask · animated

Animated word cloud with the Arceus silhouette mask, showing terms appearing chunk by chunk

Fall leaf mask · static

Word cloud arranged inside a fall-leaf silhouette mask

Fall leaf mask · animated

Animated word cloud with a fall-leaf mask, showing terms appearing as the corpus is fed in

Mexico outline mask · static

Word cloud arranged inside a map outline of Mexico

Mexico outline mask · animated

Animated word cloud built inside a map of Mexico

Built collaboratively with Connor Carpenter, Ryan Lay, Samyak Karnavat, and Yash Shah.