
Yo mama so geeky: generating jokes using Markov Chains

A few days back, I saw the article “How to fake a sophisticated knowledge of Wine with Markov Chains” on the programming subreddit. To my utter delight, the article came with the code and a very detailed explanation, so I spent an hour getting it all to work. The hour was no fault of the original author; I spent most of it getting a good hang of XPath, which will be the topic of a later post.
The program auto-generates wine reviews by using Markov Chains to produce a sequence of the most probable trigrams. It was great fun spouting my expert-level sommelier reviews, especially considering that I can count on one hand the number of times I have actually tasted wine! The reviews were just the right amount of ambiguous, with a hint of snobbishness (which, to me, made the whole thing all the more believable).
While I was showing off my new-found expertise in wines, my partner in crime, Rupsa, suggested it could probably be used for other short texts as well. We briefly discussed movie reviews, but getting hold of one-sentence reviews is tough (and we are lazy), and we weren’t sure how well the system would handle longer ones.
We decided to use Yo mama jokes instead. For the uninitiated, these are politically incorrect, fat-shaming jokes: one sentence long, and either hilarious (which is good), very insulting (which is not), or both (which makes you feel guilty while laughing).
So we trawled the internet for around a thousand Yo mama jokes. There were a few repetitions among them, mainly because of differences in spelling, a little swapping of words here and there, and so on.
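The core idea is simple enough to sketch in a few lines of Python. This is not the original project's code, just a minimal toy version under stated assumptions: a tiny hard-coded corpus stands in for the ~1000 scraped jokes, each bigram maps to the words that followed it in training, and generation walks that table from a seed bigram.

```python
import random
from collections import defaultdict

# Hypothetical toy corpus; the real project used ~1000 scraped jokes.
jokes = [
    "Yo mama so fat she thought fruit punch was a dirty bomb .",
    "Yo mama so fat she broke the scale .",
    "Yo mama so dumb she thought fruit punch was a workout .",
]

# Build the trigram table: each bigram key maps to every word
# observed immediately after it in the corpus.
model = defaultdict(list)
for joke in jokes:
    words = joke.split()
    for i in range(len(words) - 2):
        model[(words[i], words[i + 1])].append(words[i + 2])

def generate(seed=("Yo", "mama"), max_words=30):
    """Start from a seed bigram and repeatedly sample a plausible next word."""
    w1, w2 = seed
    out = [w1, w2]
    for _ in range(max_words):
        candidates = model.get((w1, w2))
        if not candidates:  # dead end: no trigram starts with this bigram
            break
        w1, w2 = w2, random.choice(candidates)
        out.append(w2)
    return " ".join(out)

print(generate())
```

Because duplicated continuations appear multiple times in the value lists, `random.choice` naturally favours the more frequent trigrams; that is the whole "most probable trigram" machinery in miniature.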
Below are some of the gems the system came up with:
Yo mama so fat that she thought fruit punch was a dirty bomb.
Yo mama so fat if she buys clothes in three sizes: large, extra large, and "Oh my God, it's coming towards us!"
Yo mama so fat when god was making light he told her to break apart.
Yo mama so fat she can't even get high.
Yo mama so fat when they took pictures of Earth it looked like Earth had a family reunion.
Yo mama so dumb she stuck a battery up her panties.
Yo mama so fat, when she walks down the stairs, I wasn't laughing but the ground started cracking up.
Yo mama so ugly that she influences the tides.
Yo mama so fat she was going to Walmart tripped over on 4th Ave, she landed on Burger King.
What we enjoyed was the fact that many of them, while not strictly making grammatical or logical sense, were still funny to read. We may not have gone nuts laughing over them, but I distinctly remember an adolescent me who would have found these hilarious for hours on end.
Of course, a lot of junk was thrown up as well, and the ones above are only some of the better results. One factor behind the junk is that the inputs themselves were not all of top-notch language and sentence construction; the relatively small size of the training data also contributed.
Next up is the more ambitious goal of generating a full-length text. We could go for a single author with a large body of work, multiple authors with similar writing styles, or authors in the same genre. Maybe a mashup of Shakespeare and Bacon (read: using all their work as the training data), or Chekhov and Tolstoy, or even C.S. Lewis with J.K. Rowling? The possibilities seem endless, even with the very obvious constraint of a three-word window. Of course, the limitations will become more apparent the more we use it.
Here is the GitHub link for your enjoyment.

