CodeHappy

July 19, 2008

Backslapping Ruby’s average performance once again

Filed under: General Programming — pwrighta @ 6:50 am
I like Joyent. I think as a company they’ve done some very cool things, and the fact that they are now fully invested in improving the performance of Ruby on Rails is a great thing. But, for now at least, it still completely sucks.

Joyent published an article a month or so ago about how they scaled a Facebook application to support millions of hits. The application, BumperSticker, simply serves out customized images to users - online bumper stickers. It’s not hard, it’s not complex, and it processes around 20 to 27 million page views a day. That’s a good number by anyone’s standards.

But this dinky little Ruby on Rails app required the following architecture to do it:

  • 13 Application servers
  • 8 Static content asset servers
  • 4 MySQL databases

That’s a staggering 25 servers just to serve a bunch of images at a rate of no more than 320 hits per second.
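
As a rough check on that figure, here is a small Python sketch of the arithmetic. It assumes traffic is spread evenly across the day, which real traffic never is, so peak rates would be higher than these averages. The names are illustrative only and nothing here comes from Joyent.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(daily_total: float) -> float:
    """Average rate per second, assuming traffic is spread evenly over the day."""
    return daily_total / SECONDS_PER_DAY


low = per_second(20_000_000)   # lower bound of the reported page views per day
high = per_second(27_000_000)  # upper bound of the reported page views per day
servers = 13 + 8 + 4           # application + static asset + MySQL servers

print(f"{low:.1f} to {high:.1f} page views per second")
print(f"{high / servers:.1f} page views per second per server at the high end")
```

Averaged over the day, that works out to roughly 231 to 313 page views per second, which is where the "no more than 320" ceiling comes from.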

Come ON Ruby world!!!! This is not a success story. That level of hardware investment, backed up by a ton of work on software optimizations like F5’s BIG-IP, to serve just a few hundred hits a second is really completely piss poor.

Let’s stop with the “it will scale/it won’t scale” arguments and just look at ‘success’ stories like this one. There are serious issues with Ruby on Rails performance and architecture that need to be addressed, and urgently.

Source: DevCentral

24 Comments »

  1. I think the problem is the ADD Ruby community.. always chasing the next shiny object. Instead of REST (which end users could care less about), they could have focused on i18n, thread-safety, or as you point out, performance. All things that DO matter to end users.

    Comment by Stephen Waits — July 19, 2008 @ 9:46 am

  2. Sounds to me like RoR was a bad choice. I absolutely love RoR if you’re doing a CRUD-heavy website – the bonus you gain from Rails outweighs the performance hit in my view. But if you’re writing a goddamn messaging system (twitter) or a BumperSticker app, you absolutely do not need RoR for the processing portion of your app. Seems like they’re taking a framework designed for one thing and being shocked when it doesn’t work well for other things!

    Comment by Jon Canady — July 19, 2008 @ 10:39 am

  3. What does this show? RoR is not for everyone.

    People continually shout “improve speed, fix this”.

    But what kind of speed improvement can one really bring about when not relying on C or C++?
    It is simply not doable, and the time that has passed since these problems became “known” is proof of it already,
    just google for RoR performance. The title is misleading completely, I want to see how Django fares for
    such big problems.

    Comment by markus — July 19, 2008 @ 12:08 pm

  4. Well, Joyent is seriously lagging behind on this one. I know because the company I work for has servers there AND some custom instances on Amazon EC2.

    Guess what? The EC2 instances kick Joyent’s ass on a daily basis.

    Why? Phusion Passenger (a.k.a. mod_rails - http://modrails.com/), that’s why. Joyent doesn’t support it officially and it’s a huge pain in the ass getting it to work yourself (in fact, I think nobody has actually made it yet).

    Oh, and by the way: it’s not Rails that doesn’t scale; I’m fairly certain it’s the application’s programmers that are doing it wrong. BTW, if the app is so simple, why not use Merb or even Sinatra instead?

    Comment by Fabio FZero — July 19, 2008 @ 3:04 pm

  5. Do you have any stats on how much server power a similar site developed in frameworks using python or php serving similar traffic would need?

    Comment by garg — July 19, 2008 @ 3:05 pm

  6. 27e6 page views * 20 objects / page = 540e6 object views a day. 540e6 object views/day / 86400 s/day = 6250 hits/s. Or 250/s per server. It doesn’t seem so bad in those terms, does it?

    Comment by Elliott Back — July 19, 2008 @ 3:08 pm

  7. It’s funny, I was just talking to a developer about ROR. He’s basically turned away from C# + ASP.NET in favor of Rails and says it’s a joy to work with, code heaven etc. can do anything with it.

    I don’t know that much about Ruby, Rails or any of it but I do know that every developer who is familiar with it seems to LOVE it and claims it’s the best thing since OOP was invented.

    Then I see a blog post like this..

    What is the deal? I don’t know enough about Rails to even begin to compare it to ASP.NET, Python, PHP frameworks etc.

    Comment by Confused — July 19, 2008 @ 3:12 pm

  8. There are serious issues with *Ruby* performance and I find it noteworthy that even Joyent won’t touch that.

    Comment by Jason Dusek — July 19, 2008 @ 3:16 pm

  9. Sorry, but the “8 Static content asset servers” and “4 MySQL Servers” are exactly a Ruby problem how?

    What’s running on the static servers? Apache? Better write them up for poor performance. I mean 8 servers to host up some static images?

    And what exactly needs to be addressed urgently, looks to me like this app is running fine.

    Now if the company couldn’t get it running properly and were running out of funds to make the needed changes to their infrastructure, then yeah, that’s urgent.

    Comment by PaulM — July 19, 2008 @ 3:31 pm

  10. Well, how about we take a look at a real world example of something even bigger? One shop I worked at had an average of 20MM page views per day. We had a cluster that served up advertising images (some small JPEGs, some large JPEGs, and some very large Flash ads.) There were an average of five images per page. 100MM images per day (though, IIRC, our average per day was 120MM.)

    A MySQL backend ran the advertising server, keeping up with current hit counts, the various campaigns, and making a decision each and every time a page was served which ads were to go with that page. We had two of these MySQL servers, one of them a standby (in other words, inactive.) The images were served through three Apache webservers. Our ad software ran as an Apache module, so the webservers were also, in essence, our application servers. The images were stored on an NFS volume (served from our MySQL server.) There were three cache servers in front of the web servers. All handled by a single pair of load balancers.

    To top it all off, we only _really_ needed two Apache servers and two cache servers. The extra one was for redundancy, but we kept ‘em all active (unlike our MySQL server) anyways. We were groovy like that.

    So, let’s look at the numbers again, ‘kay?

    * 3 cache servers
    * 3 Apache web servers
    * 2 MySQL database servers (which also were our NFS servers, and one of them was simply a standby)

    And we successfully served over 100MM images a day, with sometimes-intensive application logic behind which image to serve. The service never failed. Not once. Not even when performing upgrades or maintenance. Not even when I personally crashed our primary MySQL server and primary load balancer at the same time! (Hey, I was new then, give a girl a break!)

    Oh yeah, and this was all five years ago, babe.

    No decent sysadmin or architect in their right mind would call 25 servers to handle 20MM hits a “success”! These guys might do groovy things for Rails, and by the looks of their blog they talk a good game (“loving cloud”, my ass), but when it comes to architecture they still have a lot to learn (and a hell of a lot of growing to do, it seems) before they can make a claim to being in the big leagues.

    (By the way, love the site! Just found it last night while looking for Python stuff!)

    Sam

    Comment by Samantha Kroll — July 19, 2008 @ 3:47 pm

  11. Interesting article. I was looking for something that was really scalable computationally about a year ago and came across this article.

    http://www.sics.se/~joe/apachevsyaws.html

    It requires a change in language away from Ruby and to Erlang, but as I have played with Erlang it has shown me how truly scalable an interpreted language can be.

    On my dual-core laptop I was running 500,000 concurrent requests for processing with little difference in average computation time or wait time from when I ran just 2.

    I was amazed.

    Lee

    Comment by tetontech — July 19, 2008 @ 5:19 pm

  12. This sounds more like an architectural issue than a Ruby performance issue. Until we know more about how they built their system, it’s premature to claim that it’s a result of their choice of language and/or platform. The same thing happened with Twitter, and after they revealed their architecture, it was clear where the performance issues were.

    Cheers,
    Graham

    Comment by Graham Glass — July 19, 2008 @ 5:32 pm

  13. Realize that they’re a hosting company. All that may not even be necessary but they have the resources to cheaply acquire it.

    Premature scaling anyone? ;)

    Comment by Jeremy — July 19, 2008 @ 5:40 pm

  14. “There are serious issues with Ruby on Rails performance and architecture that need to be addressed and urgently.”

    Awesome! Please tell us what they are and fix them.

    Comment by Jonas Nicklas — July 19, 2008 @ 8:37 pm

  15. Looks like I hit something of a nerve here. I just logged on to see if there were any comments and nearly fell off my chair.

    OK, looking through these comments and the ones at YCombinator and Reddit, there’s the usual mix of complete and utter hatred and wholehearted agreement. Of course, on the hatred front, I’m a moron, don’t know what I’m talking about, yadda yadda yadda.

    I code for and manage a dev team responsible for a bunch of Ruby on Rails sites, the most popular of which has around 200K users. I’ve also written, and deployed into the same environment, a few Python sites - one with Pylons, one with just CherryPy.

    I have a CherryPy app that picks up around 4 million hits a day. 1 app server. 1 database server. We had a Rails version of it but it started failing way before a million. It was then ‘optimized’ by the Rails gurus on the team (and we have some of the best in the world) and we broke a million. Then it was ported to Merb and we approached 1.5 million. Then it went to Pylons - 2.5 million, and now just plain old CherryPy - 4 million.

    Our most complex app suffers from Twitter-like complexity, and it’s a bear to manage. Until recently Mongrel was the recommended way to go on the app server front, but ask any dedicated sysadmin what they think of Mongrel and they’ll sneer at you. You want to handle 40 simultaneous hits? Stand up 40 Mongrel servers. It doesn’t really spawn instances/threads/processes on demand like most PHP/Python/Perl solutions do, it’s heavy on the machine, and you tend to end up with a bunch of app servers running a bunch of Mongrels, none of them really working that hard, just chewing memory. And then there’s the awesome synchronous nature of Mongrel itself, the single-threaded nature of Rails and in particular ActiveRecord, the abysmal performance of the Rails routing mechanism (seriously, write a simple site in Rails, port it to Merb and compare the two), etc. etc. etc.

    There are solutions of course - DataMapper over ActiveRecord, Thin and Rack and so on. But they’re not part of the all-encompassing almighty Rails framework, are they?

    I like Rails a great deal, but as I said in the post it has issues and they need fixing. They need fixing not because it’s a pile of shit that needs to be firmed up to gain acceptance. They need fixing because Rails is so awesome and has gained so much ground over the few years of its life. It’s here, it’s here to stay, and so it really needs to change quite a lot since so many people are now depending on it.

    No need to get all nasty… ;)

    Comment by pwrighta — July 19, 2008 @ 9:41 pm

  16. I guess the biggest issue I have with your article is that you conveniently equate page views to hits.

    20 million page views translates to 231 page views per second, not 231 hits per second. How many objects per page? Since some of the pages are galleries, I guess it could be as many as 20 to 30 hits per page. Not including any CSS, Javascript, icons or other page components. So 231 page views per second doesn’t sound so bad for 25 servers.

    Are you just another Rails hater to misrepresent facts like page views equal hits?

    Comment by Tom — July 19, 2008 @ 10:41 pm

  17. Good point Tom. I honestly didn’t think of that.

    And No, I’m not a Rails hater at all. As I said in my comment I’ve been working with it, full time, for about 2 years now.

    Comment by pwrighta — July 19, 2008 @ 10:47 pm

  18. @markus: “But what kind of speed improvement can one really bring about when not relying on C or C++?”

    I worked on an app once-upon-a-time that could serve 120 requests per second from a financial database of 1.8 billion rows (in the MS world, they partition tables, not servers, and even then, each table had ~300 million rows, quad P-III Xeon 733’s).

    This 120rps number? 1 *blade* server. That’s a P-III MOBILE. 933MHz IIRC.

    There were 4 of them. So while I never benched the whole cluster, that’s a theoretical 480 requests per second. On hardware with about the same performance as your bottom end 1U unit today.

    And to be honest, for all my angst, ASP Classic wasn’t exactly torture. No fun sure. No objects. Very plain, flat, and write a page in your sleep sure. But for that kind of performance…

    Yes. I think a lot of Ruby people are focussing on the wrong things. Beauty over efficiency. In library/support code. The _design_ should be beautiful. Composition over Inheritance. Forget the details. Who cares about the individual lines of code.

    It wouldn’t take all that great of an investment to fix this either. DataMapper goes a long ways towards doing just that. There’s optimizations to make sure, but this little project sponsored by the community and some hours from my employer manages to best ActiveRecord two-fold for the most common, most expensive operations. And there’s so much mileage left in optimizations and refinement it’s silly.

    People, your database is not your bottle-neck. Sharding is a waste of time. Get a DBA. Optimize your databases. Then realize that you’re spending more than half your time in your O/RM alone. Throw in the rest of your stack and it’s not your database choking your app if you’ve got your act together. It’s the other way around. (for the 1% of people this doesn’t apply to, you’re smart enough you hopefully don’t need to be told I’m not talking to you :p )

    Gemstone is not a cure-all either. OODB’s are nothing new. Serializing data on the wire on every assignment operation is a *BAD IDEA*. There’s a good reason classic object serialization and remoting strategies encourage “chunky” interfaces.

    If you’re smart (and they obviously are), you can do things to hide this cost, and for most cases you can probably make it mostly transparent, but nothing’s perfect. You have to understand what’s going on under the covers. Reason it out. Profile. Evaluate. *Design* your code, don’t simply “craft” it.

    For the record, I’m a long-time fan of OODB’s. The reality is that most web applications are far from transactional though. It sucks, but that’s just the way it is. Even Twitter. All roads lead to a hybrid future I think.

    Of course it’s easy to say that. Tuning MySQL and PostgreSQL has been a big learning curve for me. No enforced clustered indexes. Mysteriously low performance on simple queries.

    On the other hand, proper UTF-8 support. More robust for complex queries. Much more transparent execution that can make tuning simpler for more complex scenarios. I do like the OSS databases a lot.

    Sometimes I miss the simplicity of MSSQL though. When as long as you had your indexes in place, got rid of scans and replaced with seeks, had an appropriate clustered index, made sure query parameters were of the same type as the columns, and that you were using 3 or fewer joins, you could simply always count on ~10ms queries pretty much no matter how large your tables got. Right out of the box. No tuning the configuration beyond perhaps putting the transaction logs on a different drive set.

    Still, I do believe the Ruby performance problem is solvable. It requires profiling. It requires expanding your view from lines-of-code to objects. It requires the occasional use of C extensions since Ruby’s method-dispatch can turn the tables on algorithmic gains when comparing pure-Ruby to an extension. Combine those extensions with the proper algorithms though. Keep in mind the idea that your object-composition and OODesign is going to pay much bigger dividends than localized code tweaks, but keep in mind that a local convenience can produce a behind-the-scenes stack-trace that’s nothing short of obscene.

    Basic stuff really.

    Do that, and I strongly believe you *can* make Ruby work in pretty much any situation. I’ve certainly made Ruby scripts run circles around C# applications often enough simply because there’s so much less noise in Ruby it’s easier to see the overall design.

    It takes an investment though, and at the end of the day, I simply don’t believe Ruby is less expensive for a business anymore.

    I do believe the right people can bend Ruby to their will in such novel and impressive ways that if you want to truly innovate, then you have to look beyond immediate cost and you’ll find a level of innovation in those people that’s just much more uncommon in other communities. You’ll have your pay-off. But you’ll have less control over it. You’ll have to be more open to changing direction and taking advantage of it, capitalizing on opportunities quickly as they present themselves.

    And because of this, if you’re not in the business of being nimble (and not everyone should be, I’ll trust my bank to be very conservative/disciplined in direction, thank you, as long as the people calling the shots have some modest vision of the future), then I don’t think Ruby’s necessarily a good fit. Hire some seasoned .NET people, give them a lot of money, and let them simply produce what you asked for and nothing more.

    Comment by Sam Smoot — July 20, 2008 @ 1:08 am

  19. Wow, some of my embedded ethernet appliances can serve a good fraction of their `success`. And, oh, they run at 100-150 MHz and burn a couple watts. But we write in C. ;)

    Comment by Chris — July 20, 2008 @ 2:30 am

  20. [...] about another story of Rails performance, I grabbed JMeter to benchmark one of my current projects. Not so much as a comparison for Ruby - [...]

    Pingback by Unscientific Jetty versus Glassfish for REST at Stephans Blog — July 20, 2008 @ 2:59 pm

  21. From 1997 - 1999 I worked for a major corporation that ran its website on a single Sun Ultra-2, Dual 200 UltraSPARC CPUs, 1GB of memory (*expensive* at the time), and 9GB of HDD.

    On that website we ran Netscape Enterprise, Oracle, and Perl (CGI) which was later migrated to Apache, Oracle, and Perl (FCGI) to generate 100% of the pages on the site. I personally spent hours understanding every config line on the box, every line of code, every SQL query and tuned that box.

    This single box could handle about 60 req/sec. This doesn’t seem like a lot, but we have to remember it was 1999 and this server was single handedly serving 5M+ req/day.

    I love Ruby as a language and love using Rails for throwing together quick apps, but with the hardware available today there is *NO* excuse why we can’t easily outperform a server from 10 years ago, let alone struggle to even keep up with one.

    Comment by NeoMike — July 22, 2008 @ 4:30 pm

  22. Bullshit. Get it running, then (if you care) improve performance. In this case, Joyent has their service running, and that’s good enough for them. Let’s make Google run on a single mainframe too while we’re at it. If 25 servers isn’t ‘acceptable’ to Joe Blogger here, what is acceptable? 10? 5? 1? Who cares? Servers are a commodity now, if you’re Joyent it takes 10 seconds to get a new server provisioned and few bucks extra per month to keep it running.

    Comment by Skyeter — July 24, 2008 @ 9:24 am

  23. Skyeter: Ah, thanks for the wonderful insight from the ever persistent bottomless cash pit corner of the room.

    Comment by pwrighta — July 24, 2008 @ 9:25 am

  24. pwrighta. Thanks for an insightful post and for opening the discussion. I think Graham Glass is probably on point with his intuition about architectural choices.

    One of the unspoken secrets of software development is the wastage of compute resources that is the norm with delivered systems. In my experience it isn’t unusual to see delivered systems whose efficiency is 3 or 4 orders of magnitude lower than it could be. I have seen two systems with identical functionality, same language, same OS, where system A (with a user base 10x that of system B) was deployed on four dual CPU Linux hosts and system B was deployed on 400+ similar hosts.

    I have seen 8 CPU Oracle servers that have a peak workload of 2 transactions per second.

    This is clearly a failing of a CS education system that seems to focus on performance in the small and doesn’t seem to teach the skills needed to recognize, let alone build or support, scalable, well-engineered systems. I wonder what it would take for this to change?

    Comment by Peter Booth — August 3, 2008 @ 10:30 pm
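
The point made in comment 15 about Mongrel (one request at a time per process, so 40 simultaneous hits means 40 Mongrels) can be sketched as a sizing rule. The Python below is a back-of-the-envelope model under loud assumptions, not a benchmark: every worker is strictly single-threaded, load is steady, and the 0.1 second response time is a made-up figure for illustration. `workers_needed` is a hypothetical helper, not part of Mongrel or Rails.

```python
import math


def workers_needed(requests_per_second: float, seconds_per_request: float) -> int:
    """Single-threaded workers needed to keep up with a steady request rate."""
    # Little's law: average requests in flight = arrival rate * time per request.
    # A worker that handles one request at a time covers one in-flight request.
    in_flight = requests_per_second * seconds_per_request
    return math.ceil(in_flight)


# 27 million page views a day is 312.5 per second (27_000_000 / 86_400).
# ASSUMPTION: 0.1 s per request, an illustrative number, not a measured one.
print(workers_needed(312.5, 0.1))
```

With those assumed numbers the model asks for 32 workers on average, and bursts would need headroom on top. In this model, halving the response time halves the in-flight requests, which is the case for fixing slow routing or ORM code before adding servers.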
