Farewell wordpress

I imported all the posts to my new jekyll-bootstrap personal blog at yuvalg.com/blog and this old http://uberpython.wordpress.com won’t see any new posts – it’ll be kept as an archive.

That’s it. I’m done. It took only 3 years from the first complaint to finish. Looking back, I’m very grateful for the service.

What I enjoyed at wordpress

  • Yearly reports – they’re eye pleasing, informative and fun. Maybe google analytics could replace this somehow?
  • Ease of initial setup and maintenance.
  • Stable, even on surprise high traffic days.
  • Free.
  • They allow exporting the data! No lock-in is the mark of a serious business with good guys running it.

Why I’m leaving

  • Huge, annoying, ads that they hide from the blog author so you have to go incognito to see what the rest of the world sees. You can pay wordpress to remove the ads, but what else are they hiding from the authors? I don’t know.
  • No control over CSS. You can pay for that too but…
  • I already have a VPS so why not use it? (no way I’m installing the exploit ridden wordpress-php-mysql stack again though).
  • Jekyll allows me to edit the plain-text posts from any device with ease.
  • I wanted to rename the blog anyway.
  • Jekyll is a much simpler stack – preprocessing that results in a a lot of html files is a better fit for my blog. MySQL/PHP/Cache is oldschool. Though this wasn’t a concern of mine when hosted on wordpress.com – I still get a certain “higher road” satisfaction leaving.

Goodbye wordpress.

http://yuvalg.com/blog/

Secret societies of reddit

Out there in the wild internet there are many dark corridors and places we’ll never be able to visit. Understandably. But on reddit?! I think the people deserve to at least vaguely know the inner workings of their contentocracy. Here’s a list of a few most of us can only see the closed door of:

What are we voting for? What’s running this voting machine?

</tinfoil_hat>

redditp – a fullscreen presentation with reddit

tl;dr – add a “p” before the “.com” to any subreddit you visit and voila, you have a fullscreen presentation of all the images.

I like to show my friends cool stuff on the internet but browsing is a real conversation killer. You can’t really lean back, talk and have fun with friends while operating a website, surely not one as clunky as reddit. Even though RES does help.

So I just had to make this “hands-free” reddit mode. Where I can see:

Easy!

Welp, not that easy, there was a lot of CSS to handle and the design right now is dead ugly but functional. Also, many stories on reddit aren’t images and I skip those that aren’t in a quirky way. If the url’s 4th character from the right is a dot, I display it. That’s a hack that works for imgur (which is most of reddit’s images) so I’m using it for now until I have more time to fix it. Any suggestions are more than welcome - help improve redditp on github! Also, comics are a pain to watch right now. I might implement some sort of scroll wheel zooming in the future, though that really is a bit of a different use case that might deserve a different site.

I guess not too surprisingly the first 200 visits where mostly to gonewild. You internet you….

edit - here are some stats from the launch night

redditp launch night stats

redditp launch night stats

Introducing Absolute Ratio

Let’s define the absolute ratio for positive numbers:

abs_ratio(x) = 1 / x when x < 1, otherwise: x

When x is smaller than 1 return 1 / x, otherwise return x. Here are a few example values:

x abs_ratio(x)
0.5 2
2 2
0.2 5
5 5

And a graph:

Absolute Ratio Graph

Another spelling for the same operator would take 2 positive numbers and give their absolute ratio:

And a graph:

Absolute ratio in 3D

Use case examples

  • Music and audio – an octave of a frequency F is 2F. More generally a harmony of a frequency F is N*F where N is a natural number. To decide if one frequency is a harmony of another we just need to get their absolute ratio and see if it’s whole. E.g. if abs_ratio(F1, F2) == 2 they’re octaves. If abs_ratio(F1, F2) is whole – they’re harmonies.
  • Computer vision – to match shapes that have similar dimensions e.g. their width is only 10% larger or smaller. We don’t care which is the bigger or smaller, we just want to know if 0.91 < W1 / W2 < 1.1 which may be easier to pronounce as abs_ratio(W1, W2) < 1.1
  • Real life – when we see 2 comparable objects we’re more likely to say one is “three times the other” vs “one third the other”. Either way in our brains both statements mean the same concept. We think in absolute ratios.
  • General case – When you want to know if X is K times bigger than Y or vice versa and you don’t care which is the bigger one.

Interesting Properties

  • abs_ratio(Y / X) == abs_ratio(X / Y)
  • log(abs_ratio(X)) = abs(log(X))
  • log(abs_ratio(Y / X)) = abs(log(Y / X)) = abs(log(Y) – log(X))
  • You can see from the above that absolute ratio is somewhat of an absolute value for log-space.

What’s next for absolute ratio

  • I’d love to hear more use cases and relevant contexts.
  • What would be the written symbol or notation?
  • How can we get this operator famous enough to be of use to mainstream minds?
  • About negative numbers and zero – right now that’s undefined as I don’t see a use case for that domain.
  • For some code and graphs in python checkout https://github.com/ubershmekel/abs_ratio

EDIT – I’m growing to like the binary form of the operator more so from now on let’s call it like this in python:

def abs_ratio(a, b):
    return a / b if a > b else b / a

Ah the old Reddit switch-a-roo analyzed

So after clicking through what seemed an infinite amount of tabs from one of these switcheroo comments I finally wrote down the script which analyzed the graph. I’d suggest you ignore the following png and take a gander at the network pdf of the switcharoo graph because you can click through to the links.

The old reddit switch-a-roo analyzed image

To recap – 50 nodes, 52 edges, though there are probably more out there that point into some point of that chain. And here are the awards:

There. I hope that didn’t take away from the magic.

Appendix – The hardships

This was overly hard to do – first of all NSFW links gave me the “are you over 18?” prompt which for some reason I wasn’t able to solve by cookies. I eventually turned to the mobile version of the site (append “.compact”) to avoid the prompts completely. Also, matplotlib and networkx aren’t that fun for drawing graphs it seems. To visualize and output the graph I eventually used gephi which was somewhat easy although has it’s clunkiness baggage.

Statistics on reddit’s top 10,000 titles with NLTK

Drawing inspiration from this blog post on title virality I wanted to investigate what makes these top 10,000 titles the best of their breed. Which are the best superlatives? Who/what’s the most popular subject? Let’s start with some statistics:

  • On Feb. 03, 14:10:45 (UTC) the all-time top 10,000 submissions on reddit (/r/all) had a total of 82,751,429 upvotes and 62,655,532 downvotes (56.9% liked it).
  • 5.2 years between the oldest and newest submission
  • 8,331,382 comments. That’s about 833 comments per submission.
  • The #1 post has 26,758 – 4,882 = 21,876 points
  • The #10,000 post has 15,166 - 13,679 = 1,487 points
  • And now some graphs….

Adjectives – reddit loves “new”, “old”, “good” and “right”

Adjectives

Top Adjective, Superlative – “Best” is the best

Questions reddit loves how?

Questions

What’s reddit talking about? People.

Or news, the president, man…

Reddit appreciates personal content about you, this, it and I.

Even NLTK doesn’t understand these…

I’m pretty sure you don’t need example links for these…

The top 10,000 seem to come mostly from 17:00 UTC and rarely from around 12:00 UTC

This isn’t exactly the probability of succeeding to hit the front page as it’s not clear at what time submission count is highest. But it’s something.

An apology

This is my first time using NLTK and though I’m ok at coding I most certainly have no idea how to parse natural language. Here’s hoping this was somewhat insightful.

I have no idea what I'm doing

Appendix

The Google App Engine Glass Ceiling

With the new billing arrangements, each and every paid GAE app costs at least $2.10 per week which is supposedly $9 per month ($9.125 by my calculation) regardless of quota usage. This cost does cover whatever quotas your app consumes and the regular free quotas are still “free”.

Now I have a GAE app that sends 70-80 emails per day where the free limit is 100. I’d gladly switch over to the paid side of GAE just to be sure that if it ever passes the 100 mark I don’t have any failed email requests but the price of that is 9$ per month. GAE is extremely expensive for apps that just barely brush the end of their free quotas. In order to actually use the $9 per month minimum I’d have to send out 3000 emails per day (at $0.0001 per email).

I don’t know if the free quota on email recipients is really low or if sending out an email is extremely cheap. Either way, GAE expects me to scale from 100 to 3000 while paying the price of 3000. Who knows if I’ll ever even reach that mark?

If google keeps with this plan, I’m probably never going to start another GAE app that has a chance to grow. Every time I have a chance of hitting the quota limits I have 2 choices:

  • Pay google and be screwed over for an indefinite amount of time until I reach the next landmark.
  • Migrate to a cheaper shared hosting option until I reach the next landmark.

Thanks, but no thanks. That’s the GAE glass ceiling.

Appendix

  • Other than this problem I do like GAE. It’s a shame I have to leave it.
  • I’ve made about 11 small python GAE apps. Only 2 of which ever reached the aforementioned glass ceiling.
  • This issue shouldn’t bother you if your app is already big enough to cost more than $9.
  • Maybe google can’t bill less than $9 per month? I doubt it, android apps can cost $0.99.
  • A proposed solution: Google takes $9 of credit at a time from your google wallet and eats quotas out of that. When the $9 run out, it bills another 9. Sounds reasonable and “don’t be evil” to me. Another thing that could be nice would be to allow multiple paid apps to feed from the same budget.