Jay Taylor's notes

Facebook trapped in MySQL ‘fate worse than death’

Original source (gigaom.com)
Tags: mysql history facebook gigaom.com
Clipped on: 2015-10-28

Jul. 7, 2011 - 1:00 PM PDT

According to database pioneer Michael Stonebraker, Facebook is operating a huge, complex MySQL implementation equivalent to “a fate worse than death,” and the only way out is “bite the bullet and rewrite everything.”

Not that it’s necessarily Facebook’s fault, though. Stonebraker says the social network’s predicament is all too common among web startups that start small and grow to epic proportions.

During an interview this week, Stonebraker explained to me that Facebook has split its MySQL database into 4,000 shards in order to handle the site’s massive data volume, and is running 9,000 instances of memcached in order to keep up with the number of transactions the database must serve. I’m checking with Facebook to verify the accuracy of those numbers, but Facebook’s history with MySQL is no mystery.

The oft-quoted statistic from 2008 is that the site had 1,800 servers dedicated to MySQL and 805 servers dedicated to memcached, although multiple MySQL shards and memcached instances can run on a single server. Facebook even maintains a “MySQL at Facebook” page dedicated to updating readers on the progress of its extensive work to make the database scale along with the site.
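
To make the sharding-plus-caching arrangement described above more concrete, here is a minimal sketch in Python. It is an illustration only and not Facebook's actual code: the hash-based routing scheme, the `shard_for` function and the `ShardedStore` class are hypothetical, and plain dictionaries stand in for memcached and the MySQL shards. The default shard count of 4,000 is simply the unverified figure Stonebraker cited.

```python
import hashlib

NUM_SHARDS = 4000  # figure quoted by Stonebraker; not verified


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index using a stable hash (assumed scheme)."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy cache-aside store: dicts stand in for memcached and MySQL shards."""

    def __init__(self, num_shards: int = NUM_SHARDS) -> None:
        self.num_shards = num_shards
        self.cache = {}                                # stand-in for memcached
        self.shards = [{} for _ in range(num_shards)]  # stand-ins for MySQL shards
        self.db_reads = 0                              # counts reads that miss the cache

    def write(self, user_id: int, value: str) -> None:
        # Write to the owning shard, then invalidate any cached copy.
        self.shards[shard_for(user_id, self.num_shards)][user_id] = value
        self.cache.pop(user_id, None)

    def read(self, user_id: int):
        # Serve from the cache when possible; otherwise hit the shard
        # and populate the cache for the next reader.
        if user_id in self.cache:
            return self.cache[user_id]
        self.db_reads += 1
        value = self.shards[shard_for(user_id, self.num_shards)].get(user_id)
        if value is not None:
            self.cache[user_id] = value
        return value
```

In this sketch a repeated read of the same user never reaches the database stand-in, which is the role the article attributes to memcached: absorbing read traffic so the sharded database only sees cache misses and writes.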

The widely accepted problem with MySQL is that it wasn’t built for webscale applications or those that must handle excessive transaction volumes. Stonebraker said the problem with MySQL and other SQL databases is that they devote too many resources to overhead tasks (e.g., maintaining ACID compliance and handling multithreading) and relatively few to actually finding and serving data. This might be fine for a small application with a small data set, but it quickly becomes too much to handle as data and transaction volumes grow.

This is a problem for a company like Facebook because it has so much user data, and because every user clicking “Like,” updating his status, joining a new group or otherwise interacting with the site constitutes a transaction its MySQL database has to process. Every second a user has to wait while a Facebook service calls the database is time that user might spend wondering if it’s worth the wait.

Not just a Facebook problem

In Stonebraker’s opinion, “old SQL” (as he calls it) “is good for nothing” and needs to be “sent to the home for retired software.” After all, he explained, SQL was created decades ago, before the web, mobile devices and sensors forever changed how and how often databases are accessed.

But products such as MySQL are also open-source and free, and SQL skills aren’t hard to come by. This means, Stonebraker says, that when web startups decide they need to build a product in a hurry, MySQL is a natural choice. But then they hit that hockey-stick-like growth rate like Facebook did, and they don’t really have the time to re-engineer the service from the database up. Instead, he said, they end up applying Band-Aid fixes that solve problems as they occur, but that never really fix the underlying problem of an inadequate data-management strategy.

There have been various attempts to overcome SQL’s performance and scalability problems, including the buzzworthy NoSQL movement that burst onto the scene a couple of years ago. However, it was quickly discovered that while NoSQL might be faster and scale better, it did so at the expense of ACID consistency. As I explained in a post earlier this year about Citrusleaf, a NoSQL provider claiming to maintain ACID properties:

ACID is an acronym for “Atomicity, Consistency, Isolation, Durability” — a relatively complicated way of saying transactions are performed reliably and accurately, which can be very important in situations like e-commerce, where every transaction relies on the accuracy of the data set.

Stonebraker thinks sacrificing ACID is a “terrible idea,” and, he noted, NoSQL databases end up only being marginally faster because they require writing certain consistency and other functions into the application’s business logic.

Stonebraker added, though, that NoSQL is a fine option for storing and serving unstructured or semi-structured data such as documents, which aren’t really suitable for relational databases. Facebook, for example, created Cassandra for certain tasks and also uses the Hadoop-based HBase heavily, but it’s still a MySQL shop for much of its core needs.

Is ‘NewSQL’ the cure?

But Stonebraker, an entrepreneur as much as a computer scientist, has an answer for the shortcomings of both “old SQL” and NoSQL. It’s called NewSQL (a term coined by 451 Group analyst Matthew Aslett) or scalable SQL, as I’ve referred to it in the past. Pushed by companies such as Xeround, Clustrix, NimbusDB, GenieDB and Stonebraker’s own VoltDB, NewSQL products maintain ACID properties while eliminating most of the other functions that slow legacy SQL performance. VoltDB, an online-transaction processing (OLTP) database, uses a number of methods to improve speed, including running entirely in-memory instead of on disk.

It would be easy to accuse Stonebraker of tooting his own horn, but NewSQL vendors have been garnering lots of attention, investment and customers over the past year. There’s no guarantee they’re the solution for Facebook’s MySQL woes — the complexity of Facebook’s architecture and the company’s penchant for open source being among the reasons — but perhaps NewSQL will help the next generation of web startups avoid falling into the pitfalls of their predecessors. Until, that is, it, too, becomes a relic of the Web 3.0 era.

Feature image courtesy of Flickr user jimw; error image courtesy of Flickr user rubenerd.

  1. So… let me get this right? Facebook is in “MySQL Hell”.

    -They have one of, if not the, most visited sites on the net.
    -Their site has next to no lag.
    -And almost every page on… the social network… requires how many db calls?

    I’m trying to figure out what people actually expected to happen here because it’s kind of confusing. Google runs search queries… not impressive. Amazon runs prices and publishers… okay. Facebook runs status updates, games, likes, groups, fan pages, applications, and hosts their own API to manage anyone’s account and pages remotely.

    4,000 Shards…. even if it was a baseless accusation the OP can’t back up… suddenly, doesn’t feel that impressive or bothersome.

  2. “If /dev/null > is fast and web scale, I will use it. Is it webscale? MongoDB is webscale. ” -from http://bit.ly/qMZnFl
    This sums up 80% of what I read here between the people who have little or no idea of the technology barriers demolished by Stonebraker and the disagreements about which company is using which technology. Most if not all of these top 0.001% companies are using multiple technologies depending on the problem space they occupy. NoSQL is fine when it fits. So is Hadoop, Oracle, Mongo etc. When you have ONE hammer everything looks like a nail. When you have a tack hammer, sledgehammer, screwdriver, claw hammer and rubber hammer, you pick the one that fits tacks, spikes, screws, nails, etc. I expect in this interview, as in most, quotes were taken out of context by interviewers who don’t understand the technology very well. Take it all with a grain of salt. -sign me an Oracle dba with time on Ingres, Oracle, dBase 3, MS Access, Informix/Illustra, Postgres, PostgreSQL, DB2, Sybase, UDB, StreamBase, NoSQL, MS SQLServer, and some homegrown stuff.

  3. You have noticed no problems w/FB? This morning I logged on. In my 10 minutes there I had 7 “can’t write to database ” errors.

  4. They might very well be having problems internally but from a user’s point of view, Facebook is running smoothly. I personally have not noticed any significant delays. I’m not aware of anyone leaving Facebook over performance issues, so I don’t think they need a complete redesign.
    I think they are nearing their peak data usage. Proliferation among people under 30 is very high around the world. I’ve noticed that now people in their 40s and 50s are joining in, but they are not as active and thus do not generate a lot of demand. If they keep adding servers they’ll be OK for the next few years.

  5. Right now I’m using NoSQL (Redis in this case) to help MySQL. When there are data spikes we save the data in memory with Redis so we don’t lose it because of slow MySQL.

  6. bullshit. he’s got too much at stake to even listen to him on this front.

  7. Too many trolls in comments. Every php-coder thinks he is a mysql guru. Writing sites for 10 hits/day is a different process from writing sites for 10k hits/second. Come back to mommy, lamers.

  8. There is no way out but to continue with MySQL; it is too painful to bite the bullet and replace MySQL at this stage. Adding more servers may be the way out for now.

  9. I have a lot of respect for Michael Stonebraker as a computer scientist. To give him credit where credit is due, he has made indisputable contributions to the field. Having met him in person, I have to say that he is not just a great computer scientist, but a great guy overall.
    Whether Facebook, or any other technology company for that matter, should use MySQL, NoSQL, a z/OS mainframe or paper and pencil for keeping their data is an engineering decision and a business decision. And a team of people that would be qualified to make that call should consist of businessmen and engineers by trade, not computer scientists. There are many considerations that Stonebraker is not even aware of. Though highly intelligent and knowledgeable, he is nevertheless not an engineer, nor is he privy to the internal business information about Facebook. He simply does not have enough information that would allow him to make that call.
    Facebook is the new Google. Out of 10 engineers in Silicon Valley, at least 9 would drop whatever it is that they are doing and go work for Facebook, if given the chance. Consequently, some of the best engineering talent is already employed by Facebook, and they likely have a good reason to be doing things the way they do. Engineering is ultimately a practical discipline.
    Most relational databases, save for a few column stores, store the data much the same way IBM System R did back in the 1970s. This approach to storing data works, and it’s practical. The computer science part here is ancient history at best. It’s great from the engineering standpoint, because it’s tried and true and it works. Done is better than perfect.

  10. These seem like nice problems to have, regardless of what system(s) they are using.

  11. Caveat: I represent NimbusDB, a NewSQL vendor.

    That OldSQL has failed for web-facing applications is self-evident. Every substantial web-facing application has had to supplement or replace their MySQL, SQL Server, ORACLE etc systems, in addition to caching, sharding, denormalizing and in some cases re-engineering parts of the database system. At NimbusDB we talk to people every day that describe this OldSQL pain.

    The case Mike Stonebraker makes is that the problem is not inherent to SQL but to the 30 year old internal architecture that all of these database systems use. There is no theoretical reason for SQL/ACID not to scale, and there are NewSQL products that provide existence proof of the point.

    Would Facebook and others be in a better place had they started with a SQL database that goes faster when you add nodes to a live database, and is resilient to node or datacenter failure? Obviously yes.

    SQL and ACID do scale out on commodity machines; historical implementations do not.

  12. Yeah, I just had to answer Stonebraker claims on Facebook. So I join the fiction club too (and disclosure – I work for a competing newSQL company) – and say what FaceBook DBAs could have answered this interview – see here

  13. The bottom line is that relational databases have a tremendous amount of overhead consumed by ‘keys’. They are fine for data that can be arranged in specific fields, such as a phone book or accounting data. In other words, the relations between objects (fields) are predetermined and the code takes advantage of this.

    Further, the maintenance cost of ‘tuning’ is huge.

    However, they fall apart when there are non-predetermined associations and the database is ‘navigated’, then the size of the database ‘explodes’ with increasing amount of data and ‘links’ are randomly created by users. This is the paradigm that Facebook uses. Examples of links in Facebook are ‘friend of’, ‘likes’, etc. A robust object-oriented data base is far more suitable for sociability and transaction processing speed. An additional advantage of the OODB-based systems is that code doesn’t have to be rewritten to add object classes and attributes.

    One example of overcoming this conundrum for IBM DB2 is to use an OODB as the front-end transaction processor and to update the DB2 in the background. See IBM paper http://www.redbooks.ibm.com/redbooks/pdfs/sg246561.pdf describing how DB2 uses the Versant OODB to improve performance. Also see http://www.versant.com/index.aspx. There are competitive OODBs around, but I personally know that this one has been around for at least 20 years and have used and customized user applications for a sophisticated system engineering tool.

  14. As soon as someone trots out the term “Webscale”, they lose all credibility in my eyes. It’s such a pointless term. It’s even worse than “NoSQL”, which as troll-marketing terms go is pretty bad.

  15. This is a very shoddy piece of writing. Far too many paragraphs are spent presenting MySQL / “OldSQL” as the problem with little to no substance as to why. Then, finally, after 90% of readers have probably already grown bored and gone away, the reader is introduced to “NoSQL” and “NewSQL”. Then the article ends, leaving the reader with no idea what “NoSQL” and “NewSQL” are except a vague notion that “NoSQL” was an attempt to get away from “OldSQL” and “NewSQL” solves everything.

  16. So let’s say they have 10K low-end machines, each costing $200 a month. That means each machine manages 75,000 users, at a cost of about $0.0027 per user per month (roughly $0.03 per user per year). That is a total cost of $24 million a year for a company that is valued at more than $80 billion.

    How is the solution unsuited?

  17. For business-savvy but non-technical people, you may want to revisit your unintentional, somewhat blanket statement regarding open source: “…the company’s penchant for open source being among the reasons…”. This type of statement makes anyone look biased toward closed source software and makes it look like open source software is part of the problem, which I’m sure you would agree is not true.

    1. > “…the company’s penchant for open source being among the reasons…”.
      I read this sentence as a reason why Facebook selected MySQL and not as a derision against open source software.

  18. It’s nice to bring NewSQL to the forefront as the database technology for handling the woes of the social networks’ limitations, but Facebook has been functioning just fine. There’s a #1 rule for all computer systems that works even today, and that is K.I.S.S. (keep it simple, stupid). If the system works then leave it be, meaning don’t mess with it.

    1. No, that’s only what stupid administrators think…this is why we have tons of bots running around now.

  19. Apparently the so-called NewSQL/MySQL expert who provided the information for this article is all wrong about database issues at Facebook. They widely use the Cassandra database, which is not relational, for their massive content. The system that Facebook innovated is far better than Google’s and Amazon’s technology.
    The author didn’t convince me on his assessment that NoSql will offer only little performance over RDBMS. He should go to Mount Everest and take a break before starting out his professional life.
    For webscale apps, sacrificing ACID properties provides huge performance benefits that cannot be achieved by RDBMS.

    1. Facebook does use Cassandra, but not as widely as you seem to think. The original use case for Cassandra was mail but, as http://www.facebook.com/note.php?note_id=454991608919 explains, they migrated that functionality to HBase several months ago. There’s a good description of their main storage architecture at http://www.prodromus.com/2011/01/27/what-database-does-facebook-use – mostly MySQL in a key/value style plus memcached. Both systems use Haystack (another Facebook invention) for images and other large objects.

      Facebook’s infrastructure, far from being better than that at Google or Amazon, is very similar and developed in parallel with those others. Cassandra, for example, combines the Dynamo distribution model (from Amazon) with the BigTable data model (from Google). HBase, part of Yahoo-derived Hadoop, also has roots at both companies, while Haystack shares many ideas with both Amazon S3 and OpenStack Storage. The primary argument for sacrificing ACID is not performance but tolerance of partitions – an argument most famously advanced by Eric Brewer, formerly of Yahoo and now of Amazon.

      While I find Stonebraker’s comments as misguided and contemptible as you do, I don’t think factual errors and ad hominem attacks make that point very well. Please, study the systems you’re talking about a little before you make general pronouncements about their relative merit.

      1. jeff,
        To be honest with you, I read papers on the BigTable and Dynamo projects. Inside our company, we are also switching some of our apps to use Cassandra on an experimental basis. I didn’t want to make a very long post explaining all the details in the comments.
        Every system has its limitations, but the technology that Facebook is currently using carried them to hundreds of millions of users. Obviously this must be the best technology so far.

  20. 1. Engineers don’t understand data
    2. If Engineers could scale, scalability as a problem wouldn’t exist.
    3. VPs in-charge of data operations have glaring holes in their understanding of data I/O patterns Facebook needs.
    4. The rest is then misalignment of skills/toolsets/ combined with chaos + egos

    “We know the problem. We can’t do anything until the … ” screams “we don’t know the problem and don’t know how to fix it.”

  21. They should call it “SQLSequel”..

  22. From the start of this article it looked like Stonebraker had no idea what he was talking about. His claim that SQL “is good for nothing” is ludicrous right on the face of it. He then further destroys his credibility by claiming that SQL skills “aren’t hard to come by”. Sure, you can find plenty of .NET or Java developers who can write syntactically correct but horrible SQL code or who can design a poorly performing database (there are also some .NET and Java coders that are talented at SQL, so this isn’t a dig at them). The fact is though, even in this job market I get constant calls from recruiters because there is a shortage of *good* SQL developers.

    At the end of the article it all becomes clear – he’s just trying to sell his own product. A product which, despite him referring to it as “NewSQL” includes nothing that isn’t already being done out there with existing SQL engines.

    What a waste of an article (or should I say sales pitch).

  23. So basically….

    A database pioneer, computer scientist… who developed Postgres and Ingres relational databases.. and an active involvement in the development of other types of databases….

    Against

    Others who build and work on database systems…. correcting him….

    Seems fair to me….

  24. Are transactions that important for facebook?

  25. One difference is that Walmart, the federal government, stock exchanges, etc. that have large databases with high transaction volumes have adopted a three-tiered architecture with middleware, i.e. TUXEDO. But that’s so ‘old school’, plus not free nor open source, that it’s seldom considered. Too bad, it works well; there is a reason all the highest TPC benchmark results are still achieved using TUXEDO. I wonder what the future holds now that TUXEDO and MySQL are controlled by Oracle????

  26. Of course, Stonebraker is a businessman who will claim whatever makes him money. But I hate this kind of dumb article, whose title and content are just a single person’s opinion.

  27. There are lots of snippy little genii (at least in your own minds) here and too few bottles available with which to shut them up.

  28. .
  29. This article explains why so many times you cannot get on FB or you get thrown off…thx for the article, it helps to enlighten us non techies.

    1. lol….ur password is stolen girl!!!

  30. From the article: “Facebook is operating a huge, complex MySQL implementation equivalent to ‘a fate worse than death,’ and the only way out is ‘bite the bullet and rewrite everything.’”

    This is *so* disconnected from reality.

    I work for a $20 billion company, which is growing 20% year over year and runs entirely on MySQL.

    Is this ideal? No. Do we have problems? Yes. But would we consider rewriting everything? Never! *That* would be a fate worse than death.

    ” There’s no guarantee they’re the solution for Facebook’s MySQL woes — the complexity of Facebook’s architecture and the company’s penchant for open source being among the reasons”

    The article talks about open source as if it was derogatory.

    I can tell you that my company wouldn’t be possible without several open source projects (OS, database, programming language, etc).

    Heck: great companies like Facebook, Twitter, and Google, wouldn’t exist without open source software.

    Now they are huge companies, and they could afford proprietary licenses; but they still choose free — not because of the cost, but because of the freedom.
