
Yo mama so geeky: generating jokes using Markov Chains

A few days back, I saw the article “How to fake a sophisticated knowledge of Wine with Markov Chains” on the programming subreddit. To my utter delight, the article linked to its code and came with a very detailed explanation, so I spent an hour getting it all to work. The hour was no fault of the original authors; it went into getting a good hang of XPath, which will be the topic of a later post.
The program auto-generates wine reviews by using Markov Chains to string together probable trigrams (three-word sequences). It was great fun spouting my expert-level sommelier reviews, especially considering that I can count on one hand the number of times I have actually tasted wine! The reviews were just the right amount of ambiguous, with a hint of snobbishness (which, according to me, made the whole thing all the more believable).
While I was showing off my new-found expertise in wines, my partner in crime, Rupsa, told me it could probably be used for other similar texts as well. We briefly discussed movie reviews, but getting hold of one-sentence reviews is tough (and we are lazy), and we weren’t sure how well the system would handle longer reviews.
We decided to use Yo mama jokes. For the uninformed, these are politically incorrect, often fat-shaming jokes. They are one sentence long, and can be hilarious (which is good), very insulting (which is not), or both (which makes you feel guilty while laughing).
So we trawled the internet for around 1,000 Yo Mama jokes. There were a few repetitions among them, mainly because of spelling differences, a little swapping of words here and there, and so on.
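For the curious, here is a minimal sketch of the trigram idea in Python. It is not the code from the wine article or from our repository: the function names (build_model, generate), the start and end markers, and the toy jokes are all made up for illustration. The assumption is the usual one for a trigram Markov chain, namely that the next word is picked at random from the words that followed the previous two words in the training text.

```python
import random
from collections import defaultdict

# Hypothetical markers for the start and end of a sentence.
START, END = "<s>", "</s>"

def build_model(sentences):
    """Map each pair of consecutive words to the list of words seen after it."""
    model = defaultdict(list)
    for sentence in sentences:
        words = [START, START] + sentence.split() + [END]
        for first, second, third in zip(words, words[1:], words[2:]):
            # Repeats are kept on purpose: frequent followers get picked more often.
            model[(first, second)].append(third)
    return model

def generate(model, rng=random, max_words=40):
    """Walk the chain from the start state until END or max_words is reached."""
    state = (START, START)
    output = []
    while len(output) < max_words:
        next_word = rng.choice(model[state])
        if next_word == END:
            break
        output.append(next_word)
        state = (state[1], next_word)
    return " ".join(output)

# Toy training data, invented for this example (not from our joke collection).
toy_jokes = [
    "Yo mama so geeky she dreams in XPath",
    "Yo mama so geeky she debugs her grocery list",
    "Yo mama so nerdy she dreams in regular expressions",
]

model = build_model(toy_jokes)
print(generate(model))
```

With only three training sentences the output is mostly a reshuffle of the inputs, for example a "geeky" opening glued to a "nerdy" ending. The more text you feed in, the more such splice points there are, and the more surprising the results get.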
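If you want to weed out such near-duplicates before training, one possible approach (a sketch only, not something our code necessarily does) is to normalize each joke and compare it against the ones already kept using difflib from the Python standard library. The helper names and the 0.9 similarity threshold are assumptions picked for illustration.

```python
import string
from difflib import SequenceMatcher

def normalize(joke):
    """Lowercase, drop punctuation and collapse runs of whitespace."""
    stripped = joke.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(stripped.split())

def drop_near_duplicates(jokes, threshold=0.9):
    """Keep a joke only if it is not too similar to any joke kept so far.

    threshold is a hypothetical cut-off on SequenceMatcher.ratio(),
    which ranges from 0.0 (nothing in common) to 1.0 (identical).
    """
    kept = []
    kept_normalized = []
    for joke in jokes:
        candidate = normalize(joke)
        is_new = all(
            SequenceMatcher(None, candidate, seen).ratio() < threshold
            for seen in kept_normalized
        )
        if is_new:
            kept.append(joke)
            kept_normalized.append(candidate)
    return kept

# Invented examples: the first three differ only in punctuation and spelling.
sample = [
    "Yo mama so fat she influences the tides.",
    "Yo mama so fat, she influences the tides!",
    "Yo momma so fat she influences the tides",
    "Yo mama so geeky she dreams in XPath",
]
unique = drop_near_duplicates(sample)
print(len(unique), "of", len(sample), "jokes kept")
```

This compares every joke with every kept joke, which is slow for large collections but perfectly fine for a thousand one-liners.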
Below are some of the gems the system came up with:
Yo mama so fat that she thought fruit punch was a dirty bomb.
Yo mama so fat if she buys clothes in three sizes: large, extra large, and "Oh my God, it's coming towards us!"
Yo mama so fat when god was making light he told her to break apart.
Yo mama so fat she can't even get high.
Yo mama so fat when they took pictures of Earth it looked like Earth had a family reunion.
Yo mama so dumb she stuck a battery up her panties.
Yo mama so fat, when she walks down the stairs, I wasn't laughing but the ground started cracking up.
Yo mama so ugly that she influences the tides.
Yo mama so fat she was going to Walmart tripped over on 4th Ave, she landed on Burger King.
What we enjoyed about them was that many of them, while not strictly making grammatical or logical sense, were still funny to read. We may not have gone nuts laughing over them, but I distinctly remember an adolescent me who would have found these hilarious for hours on end.
Of course, the system threw up a lot of junk as well, and we have shown only some of the better ones here. One reason for the junk may be that the input jokes themselves were not all models of good language and sentence construction; the relatively small size of the training data also contributed.
Next up is the ambitious goal of generating a full-size text. We could go for a single author with a large body of work, or multiple authors with similar writing styles, or authors in the same genre. Maybe a mashup of Shakespeare and Bacon (that is, using all of their work as the training data), or Chekhov and Tolstoy, or even C.S. Lewis with J.K. Rowling? The possibilities seem endless, even with the very obvious constraint of being able to use only a 3-word window. Of course, the limitations will become more apparent as we use it more and more.
Here is the GitHub link for your enjoyment.
