Drawing inspiration from this blog post on title virality, I wanted to investigate what makes these top 10,000 titles the best of their breed. Which are the best superlatives? Who/what’s the most popular subject? Let’s start with some statistics:
- On Feb. 03, 14:10:45 (UTC) the all-time top 10,000 submissions on reddit (/r/all) had a total of 82,751,429 upvotes and 62,655,532 downvotes (56.9% liked it).
- 5.2 years between the oldest and newest submission
- 8,331,382 comments. That’s about 833 comments per submission.
- The #1 post has 26,758 – 4,882 = 21,876 points
- The #10,000 post has 15,166 – 13,679 = 1,487 points
- And now some graphs…
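The summary figures in the list above can be re-derived with a few lines of arithmetic. This is only a sanity-check sketch: the numbers are copied from the list, while the variable names and the `points` helper are made up for illustration.

```python
# Figures copied from the statistics list above.
upvotes = 82_751_429
downvotes = 62_655_532
comments = 8_331_382
submissions = 10_000

# Share of all votes that were upvotes ("liked it").
liked_pct = 100 * upvotes / (upvotes + downvotes)

# Average number of comments per submission.
comments_per_submission = comments / submissions


def points(ups, downs):
    """Score of a submission: upvotes minus downvotes (hypothetical helper)."""
    return ups - downs


top_points = points(26_758, 4_882)     # the #1 post
last_points = points(15_166, 13_679)   # the #10,000 post

print(f"{liked_pct:.1f}% liked it")
print(f"{comments_per_submission:.0f} comments per submission")
print(top_points, last_points)
```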
Adjectives – reddit loves “new”, “old”, “good” and “right”
- President Obama’s new campaign poster
- Upvoting everything just to see the new pineapples 😀
- New approach to China
Top Adjective, Superlative – “Best” is the best
- Ricky Gervais has an idea that would not only make the Golden Globes watchable, it would make it the best show of the year
- Best picture of a dog getting hit in the crotch with a tennis ball you will see all day. Yup that’s my dog.
- CSI: Modern computer technology at its best.
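Rankings like the adjective and superlative ones above boil down to counting words by part-of-speech tag. Here is a minimal sketch of that counting step, not necessarily how the graphs were actually produced. It assumes each title has already been turned into (word, tag) pairs with Penn Treebank tags (JJ for adjectives, JJS for superlatives), for example via `nltk.pos_tag(nltk.word_tokenize(title))`, which needs NLTK's tokenizer and tagger data installed. The function name `top_words_by_tag` and the hand-tagged toy input are hypothetical.

```python
from collections import Counter


def top_words_by_tag(tagged_titles, tag, n=5):
    """Count lowercased words carrying exactly `tag` across all titles."""
    counts = Counter(
        word.lower()
        for title in tagged_titles
        for word, word_tag in title
        if word_tag == tag
    )
    return counts.most_common(n)


# Hand-tagged toy input standing in for real tagger output (an assumption;
# a real tagger may label some of these words differently).
tagged = [
    [("President", "NNP"), ("Obama", "NNP"), ("'s", "POS"),
     ("new", "JJ"), ("campaign", "NN"), ("poster", "NN")],
    [("New", "JJ"), ("approach", "NN"), ("to", "TO"), ("China", "NNP")],
    [("Modern", "JJ"), ("computer", "NN"), ("technology", "NN"),
     ("at", "IN"), ("its", "PRP$"), ("best", "JJS")],
]

print(top_words_by_tag(tagged, "JJ"))   # adjectives
print(top_words_by_tag(tagged, "JJS"))  # superlatives
```

Swapping the tag for another Penn Treebank label would cover the other categories in the same way.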
Questions – reddit loves “how”
- Dear Old People. We don’t want to kill you. You’re our parents and grandparents and we love you. But if you throw a cranky fit and keep us from getting decent, affordable health care, you can figure out how to work your own goddamn PCs and cable boxes and remote controls from now on.
- How I got an uncooperative eBay buyer to pay for her purchase. Was it unethical?
- How to report the News
What’s reddit talking about? People.
- Supreme Court ruling comes down – Corporations are people with free speech and the protected right to bribe politicians. Let’s not even pretend anymore folks, democracy in America is dead.
- We, the People of the United States of America, reject the U.S. Supreme Court’s ruling in Citizens United, and move to amend our Constitution to firmly establish that money is not speech, and that human beings, not corporations, are persons entitled to constitutional rights.
- 14 out of 14 people found this review helpful (PIC)
Or news, the president, man…
- This is a news website article about a scientific finding
- I work in News. This is how you stop SOPA.
- Can we all agree, that this is NOT an accident? Fuck you, Fox News. [pic]
Reddit appreciates personal content about you, this, it and I.
- I hate my job…
- Reddit, I don’t give a damn about your aunt, uncle, boyfriend, girlfriend, boss or toothless rabies infested dog who reads Reddit. Less personal crap and more articles please.
- I’ve had a vision and I can’t shake it: Colbert needs to hold a satirical rally in DC.
- I’m the only Caucasian in my part of town. I found this note on my windshield today…
Even NLTK doesn’t understand these…
I’m pretty sure you don’t need example links for these…
The top 10,000 submissions seem to have been posted mostly around 17:00 UTC and rarely around 12:00 UTC.
This isn’t exactly the probability of hitting the front page, since it’s not clear at what time of day the overall submission count is highest. But it’s something.
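For the curious, the hour-of-day tally can be sketched roughly like this. It assumes each submission's creation time is available as a Unix epoch timestamp (reddit's API exposes one as `created_utc`); the function name `submissions_per_hour` and the sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timezone


def submissions_per_hour(timestamps):
    """Return a 24-entry list: submissions posted in each UTC hour (0-23)."""
    counts = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).hour for ts in timestamps
    )
    return [counts.get(hour, 0) for hour in range(24)]


# Made-up sample: two posts at 17:00 UTC, one at 12:00, one at 03:00.
sample = [
    datetime(2012, 2, 3, hour, 0, tzinfo=timezone.utc).timestamp()
    for hour in (17, 17, 12, 3)
]

per_hour = submissions_per_hour(sample)
print(per_hour[17], per_hour[12], per_hour[3])
```

Turning these counts into an actual success rate would mean dividing each hour's count by the number of all submissions made in that hour, which is data this analysis doesn't have.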
This is my first time using NLTK, and though I’m OK at coding, I most certainly have no idea how to parse natural language. Here’s hoping this was somewhat insightful.
More graphs and data
- Python Reddit API
- Python Natural Language Toolkit