A few days back, I saw the article “How to fake a sophisticated knowledge of Wine with Markov Chains” on the programming subreddit. To my utter delight, the article referenced the code along with a very detailed explanation, so I spent an hour getting it all to work. The hour was no fault of the original authors; it went into getting a good hang of XPath, which will be the topic of a later post.
The program auto-generates wine reviews by using Markov Chains to come up with a sequence of the most probable trigrams. It was great fun spouting my expert-level sommelier reviews, especially considering that I can count on one hand the number of times I have actually tasted wine! The reviews were just ambiguous enough, with a hint of snobbishness (which, to my mind, only made the whole thing more believable).
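For anyone curious about the mechanics, this is roughly what such a chain does. The snippet below is a minimal Python sketch of the idea, not the original project's code; the function names and the tiny `corpus` are made up for illustration. It records, for every pair of consecutive words, the words that have followed that pair in the training text, and then generates a sentence by repeatedly picking a random continuation.

```python
import random
from collections import defaultdict

def build_trigram_model(corpus):
    """For every pair of consecutive words, remember which words followed it."""
    model = defaultdict(list)
    for sentence in corpus:
        words = sentence.split()
        for w1, w2, w3 in zip(words, words[1:], words[2:]):
            model[(w1, w2)].append(w3)
    return model

def generate(model, seed, max_words=25):
    """Walk the chain from a two-word seed, sampling one continuation per step."""
    w1, w2 = seed
    output = [w1, w2]
    for _ in range(max_words):
        followers = model.get((w1, w2))
        if not followers:
            break  # dead end: this pair never appeared mid-sentence in the corpus
        w1, w2 = w2, random.choice(followers)
        output.append(w2)
    return " ".join(output)

# Hypothetical usage with a couple of made-up training sentences.
corpus = [
    "Yo mama so fat she needs her own zip code.",
    "Yo mama so fat she broke the internet.",
]
model = build_trigram_model(corpus)
print(generate(model, ("Yo", "mama")))
```

Because repeated followers show up multiple times in each list, `random.choice` ends up picking the more frequent continuations more often, which is all the "most probable trigrams" business really amounts to.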
While I was showing off my new-found expertise in wines, my partner in crime, Rupsa, pointed out that it could probably be used for other short texts as well. We briefly discussed movie reviews, but getting hold of one-sentence reviews is tough (and we are lazy), and we weren’t sure how well the system would handle longer reviews.
We decided to use Yo Mama jokes. For the uninformed, these are politically incorrect, fat-shaming jokes. They are one sentence long and can be hilarious (which is good), very insulting (which is not), or both (which makes you feel guilty while laughing).
So we trawled the internet for around a thousand-odd Yo Mama jokes. There were a few repetitions among them, mainly because of differences in spelling, a little swapping of words here and there, and so on.
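If you want to weed out the obvious repeats, a rough normalization pass along these lines would catch the spelling- and punctuation-level duplicates. This is just a sketch for illustration (the helper names are made up), and it won't catch the variants where words got swapped around:

```python
import re

def normalize(joke):
    """Lowercase, strip punctuation, and collapse whitespace so spelling-level
    variants of the same joke compare equal."""
    joke = joke.lower()
    joke = re.sub(r"[^a-z0-9\s]", "", joke)
    return re.sub(r"\s+", " ", joke).strip()

def dedupe(jokes):
    """Keep the first occurrence of each normalized joke."""
    seen = set()
    unique = []
    for joke in jokes:
        key = normalize(joke)
        if key not in seen:
            seen.add(key)
            unique.append(joke)
    return unique

# Hypothetical usage on the scraped list.
jokes = [
    "Yo mama so fat she needs her own zip code.",
    "Yo Mama so fat, she needs her own zip code!",
]
print(dedupe(jokes))  # only the first variant survives
```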
Here are some of the gems the system came up with:
Yo mama so fat that she thought fruit punch was a dirty bomb.
Yo mama so fat if she buys clothes in three sizes: large, extra large, and "Oh my God, it's coming towards us!"
Yo mama so fat when god was making light he told her to break apart.
Yo mama so fat she can't even get high.
Yo mama so fat when they took pictures of Earth it looked like Earth had a family reunion.
Yo mama so dumb she stuck a battery up her panties.
Yo mama so fat, when she walks down the stairs, I wasn't laughing but the ground started cracking up.
Yo mama so ugly that she influences the tides.
Yo mama so fat she was going to Walmart tripped over on 4th Ave, she landed on Burger King.
What we enjoyed was that many of them, while not making strict grammatical or logical sense, were still funny to read. We may not have gone nuts laughing over them, but I distinctly remember an adolescent me who would have found these hilarious for hours on end.
Of course, a lot of junk was thrown up as well; what we have shown here are only some of the better ones. One factor behind the junk is that the inputs themselves were not all top-notch in language and sentence construction, with the relatively small size of the training data also contributing.
Next up is the ambitious goal of generating a full-size text. We could go for a single author with a large body of work, or multiple authors with similar writing styles, or authors in the same genre. Maybe a mashup of Shakespeare and Bacon (read: using all their works as the training data), or Chekhov and Tolstoy, or even C.S. Lewis with J.K. Rowling? The possibilities seem endless, even with the very obvious constraint of only being able to use a three-word window. Of course, the limitations will become more apparent the more we use it.
Here is the GitHub link for your enjoyment.