Movies in Words


Kaffee:  I’ll ask for the fourth time.  You ordered…

Jessep:  You want answers?

Kaffee:  I think I’m entitled to them.

Jessep:  You want answers?!

Kaffee:  I want the truth.

Jessep:  You can’t handle the truth!

This scene from A Few Good Men is one of my favorite scenes of all time.  The performances by Tom Cruise (Kaffee) and Jack Nicholson (Jessep) are brilliant.  These actors convey such passion and intensity that I’m drawn to the edge of my seat each time this scene comes on – and I’ve seen it over a dozen times!

Something that has always fascinated me about movies is the magic that happens between a script and the final product.  I’ve read several scripts after first seeing the movie, and I am blown away by how actors and directors make those words come to life.  Take the six lines above written by Aaron Sorkin.  Pretty simple dialogue.  Short questions.  Short answers.  If I had read these lines prior to seeing the movie, I’m not sure I would have felt the climax as intensely as I did when watching Colonel Jessep lash out at Lieutenant Kaffee with that famous last line.  Those words stuck with me and are ones I will never forget.

This got me thinking about words.  Movie words.  What would a movie look like as a collection of words?


To answer my question, I thought it would be interesting to see what the scripts of some of my favorite movies looked like as word clouds.  I’m a sucker for word clouds.  They’re such a fun and creative way to quickly analyze a whole bunch of text and walk away with an understanding of what was important or largely emphasized in the writing.  To build my word clouds, I needed to find two things – scripts and a word cloud builder (pretty obvious, right?).

Both of these tasks proved easier than expected.  Having read several scripts before, I knew simple Google searches would take me where I needed to go to find them (see references below).  I expected the more difficult task to be building the word clouds.  I have constructed word clouds in R before, but I wanted to venture out and see if there were easier ways to generate these gems.  To my good fortune, a little research pointed me to a fantastic site for generating word clouds – Tagxedo.

After pasting movie scripts into Tagxedo, I was able to customize the outputs in various ways.  Layout, font, color scheme, orientation, and word inclusion were just some of the settings I had control over.  When it came to the movie script content, there were three main ways in which I “cleaned” the data.  First, the web tool allowed me to ignore common words (e.g. the, you, on, and), which were of no interest.  Second, I was able to choose the number of words to be included.  This created a list of the most frequently used words.  Finally, from this list of words, I could choose to ignore those that were of no interest.  Since full movie scripts were loaded into the site, I chose to ignore “script language” such as CUT (as in CUT TO), INT (as in INTERIOR), and BEAT (to name a few).  I also chose to ignore curse words such as f-bombs.
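For anyone who would rather script this than click through a web tool, the three cleaning steps above could be sketched in Python roughly as follows.  To be clear, this is not how Tagxedo works internally (I don’t know that), and it only produces the word counts a cloud would be drawn from, not the picture itself.  The word lists, the function name top_words, and the sample text are all made up for illustration.

```python
import re
from collections import Counter

# Assumed, deliberately short lists for illustration only.
# Tagxedo's real built-in list of common words is not reproduced here.
COMMON_WORDS = {"the", "you", "on", "and", "a", "to", "of", "i", "it", "in", "is"}
SCRIPT_WORDS = {"cut", "int", "beat"}  # "script language" mentioned above


def top_words(script_text, limit=100, extra_ignored=()):
    """Return the `limit` most frequent words as (word, count) pairs."""
    # Lowercase everything and pull out words, keeping contractions like "can't".
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", script_text.lower())
    # Step 1 and step 3: drop common words, script language, and any
    # hand-picked extras (for example, curse words).
    ignored = COMMON_WORDS | SCRIPT_WORDS | set(extra_ignored)
    counts = Counter(word for word in words if word not in ignored)
    # Step 2: keep only the most frequently used words.
    return counts.most_common(limit)


# Hypothetical snippet in the style of a screenplay, not a real excerpt.
sample = (
    "INT. COURTROOM\n"
    "KAFFEE\nI want the truth.\n"
    "JESSEP\nYou can't handle the truth!\n"
    "CUT TO:"
)
print(top_words(sample, limit=3))
```

The (word, count) pairs that come back are the kind of input most word cloud builders work from, with the count deciding how large each word is drawn.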

With all of these settings in place, the final word clouds were generated.  Below are the five movies I decided upon for this exploration.  This sample is a solid representation of the types of movies I enjoy.  For those who have not seen the movies, I have included a brief synopsis from IMDB that paints a picture of the movie plots.

A Few Good Men – “Neo military lawyer Kaffee defends Marines accused of murder; they contend they were acting under orders.” (IMDB)



Back to the Future – “A young man is accidentally sent 30 years into the past in a time-traveling DeLorean invented by his friend, Dr. Emmett Brown, and must make sure his high-school-age parents unite in order to save his own existence.” (IMDB)



The Social Network – “Harvard student Mark Zuckerberg creates the social networking site that would become known as Facebook, but is later sued by two brothers who claimed he stole their idea, and the cofounder who was later squeezed out of the business.” (IMDB)



Good Will Hunting – “Will Hunting, a janitor at M.I.T., has a gift for mathematics, but needs help from a psychologist to find direction in his life.” (IMDB)



The Dark Knight – “When the menace known as the Joker wreaks havoc and chaos on the people of Gotham, the caped crusader must come to terms with one of the greatest psychological tests of his ability to fight injustice.” (IMDB)




From this exploration, I leave with two main takeaways.  First, Tagxedo is a great tool!  The site is extremely easy to use and turns out a high-quality product.  I definitely plan on using it again for future word cloud needs.

My second takeaway is the emphasis of each word cloud – the characters!  Strong characters are the backbone of great movies.  Just looking at these clouds, I see Mark-Eduardo, Sean-Will, Marty-Doc, Kaffee-Jessep, and Batman-Joker and am instantly thrust into my favorite scenes.  The connections we make with the characters are what make most movies special, and these visualizations reinforce just that.

– SD



