Making a Markov Chain Poem Generator in Python | by Mehrab Jamee | upperlinecode | Medium

Generating Poems with Python: A Markov Chain Adventure

Have you ever felt like some poems are just a random sequence of profound-sounding words? What if a computer could generate poetry based on the patterns it learns from existing poems? This article explores how to build a simple poem generator using Markov chains in Python, a technique that can produce surprisingly creative results.

From Poetry Apathy to Algorithmic Inspiration

Initially unenthusiastic about poetry, I challenged myself to create a program that could mimic the style of a poet. The goal wasn't to create meaningful art, but to explore the possibility of generating text that resembles poetry through statistical analysis. This project turned out to be a fascinating blend of language, probability, and coding.

Unveiling the Power of Markov Chains

The core of this poem generator lies in the concept of Markov chains. In simple terms, a Markov chain is a sequence of events in which the probability of the next event depends only on the current state, not on anything that came before it. In our case, the "states" are words, and the chain picks the next word based only on the word immediately preceding it.

  • How it works: The program analyzes a text file of poems, recording which words follow each word and how often.
  • Building a Dictionary: This information is stored in a Python dictionary, where each word (key) is associated with a list of words that have followed it in the source text (value).
  • Generating Text: To create a poem, the program starts with a random word, then repeatedly looks up the previous word in the dictionary and picks the next word at random from its list.
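The three bullets above can be sketched in a few lines of Python. This is a minimal illustration rather than the project's actual code: the function names `build_chain` and `generate`, the `length` parameter, and the sample sentence are all assumptions made for the example.

```python
import random
from collections import defaultdict


def build_chain(words):
    """Map each word to the list of words that follow it in the source."""
    chain = defaultdict(list)
    for current_word, next_word in zip(words, words[1:]):
        chain[current_word].append(next_word)
    return dict(chain)


def generate(chain, length=8, rng=random):
    """Walk the chain from a random starting word for up to `length` words."""
    word = rng.choice(list(chain))
    output = [word]
    while len(output) < length:
        followers = chain.get(word)
        if not followers:
            # Dead end: this word never has a follower in the source text.
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Made-up sample text, standing in for a real file of poems.
sample = "the sun shines and the sun sets and the birds sing".split()
chain = build_chain(sample)
print(chain["sun"])  # ['shines', 'sets']
print(generate(chain))
```

Note the dead-end check: the last word of the source may have no followers at all, so the walk simply stops there instead of raising an error.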

For example, if the word "sun" is often followed by "shines" and "sets" in the input text, the program will randomly choose between these options whenever it reaches the word "sun" while generating a poem. This statistical approach creates a chain of words that, while potentially nonsensical, often captures the stylistic nuances of the original poet. For a more rigorous treatment of Markov chains, see the Wikipedia article on the topic.
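One detail worth spelling out: if the list of followers keeps duplicates, a plain random pick is automatically weighted by frequency. The sketch below assumes a made-up case where "shines" followed "sun" twice and "sets" followed it once; the counts and the seed are illustrative only.

```python
import random
from collections import Counter

# Hypothetical followers of "sun": "shines" was seen twice, "sets" once.
followers = ["shines", "sets", "shines"]

# Probability of each next word implied by the raw list.
counts = Counter(followers)
probabilities = {word: n / len(followers) for word, n in counts.items()}
print(probabilities)

# Empirical check: draw many times and compare how often each word comes up.
rng = random.Random(42)
draws = Counter(rng.choice(followers) for _ in range(3000))
print(draws.most_common())
```

With these counts, "shines" should come up roughly twice as often as "sets", with no explicit probability table required.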

Choosing a Poetic Muse: Rabindranath Tagore

To make the project more interesting, I decided to use the works of Rabindranath Tagore, a Nobel laureate and a prominent figure in Bengali literature. His collection of short poems, "Stray Birds," available on Project Gutenberg, served as the perfect input for the program. By feeding the text of "Stray Birds" into the generator, I aimed to create poems in a similar style to Tagore's.

Diving into the Code

While sharing the complete source code is beyond this article's scope, here's a simplified overview:

  1. Text Processing: Load the text file and clean the data by removing punctuation and converting all words to lowercase.
  2. Building the Markov Chain: Create a dictionary where keys are words, and values are lists of words that follow them in the text.
  3. Poem Generation: Start with a random word, and iteratively select the next word from the list of possible following words based on the Markov chain.
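As a rough sketch of step 1, the cleaning can be done with the standard library alone. The function name `clean_text`, the file name in the comment, and the sample string are assumptions for illustration, not the project's actual code.

```python
import string


def clean_text(raw_text):
    """Lowercase the text, strip ASCII punctuation, and split it into words."""
    # With three arguments, str.maketrans maps every character in the
    # third argument to None, so translate() deletes those characters.
    remove_punctuation = str.maketrans("", "", string.punctuation)
    return raw_text.lower().translate(remove_punctuation).split()


# In practice the text would come from a file, e.g. (hypothetical file name):
# with open("stray_birds.txt", encoding="utf-8") as f:
#     words = clean_text(f.read())
sample = "The sun shines; the sun sets.\nO Sun, rise again!"
words = clean_text(sample)
print(words)
```

Keep in mind that `string.punctuation` only covers ASCII characters, so curly quotes or dashes in a downloaded text would need separate handling. The resulting word list is what steps 2 and 3 operate on.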

Interested programmers can extend this project with features such as a rhyme scheme or sentiment analysis to influence word choice, or dig into other Natural Language Processing (NLP) techniques.

The Result?

While the generated poems may not rival the artistic depth of Tagore's original work, they often contain surprisingly coherent and evocative phrases. This project demonstrates the potential of using simple statistical models to generate creative text, blurring the lines between human and machine creativity.

By understanding how Markov chains can be used to generate text, you can appreciate the possibilities of Python programming in the realm of creative arts. Feel free to explore other fascinating applications of Python, such as creating interactive data visualizations or building web scraping tools.
