Thursday, May 16, 2013

This Is How To Build A Facebook Clone, According To Facebook Systems Engineer



If you want to build a Facebook-like site where people can play games and share content, here's how to do it.
Chris Scholz, a systems engineer at Facebook, recently explained on Quora how Facebook is built. Some of the information is pretty general, since a lot of what goes into building Facebook is proprietary.
Here's a quick summary:
  • Facebook's front-end is primarily delivered through PHP, a popular scripting language for Web development. It runs on a compiling system similar or equal to HipHop for PHP.
  • Facebook relies on dozens of data centers and thousands of servers to handle things like its user database, Graph Search queries, and photo and video uploads.
  • Facebook designs a lot of its own hardware so the site can be as efficient and cost-effective as possible. It uses "load balancers to make sure that machines which are free take on more load." Facebook physically maintains all of those machines.
  • A lot of internal tools are written in Python or Bash.
  • Lots of back-end tools are written in C, C++, D, Java, or Perl.
Here's the whole answer:

How is a Facebook-like site actually created from scratch?



I'm going to be pretty general, because a lot of the "how" is proprietary. A lot of it is not, though.

Facebook's front end is primarily delivered via PHP (using hundreds of custom libraries), running on something similar or equal to HipHop for PHP (HHVM), a PHP compiling system which rivals native C[++]'s ability to execute code quickly and saves us some CPU cycles for every request when compared to stock interpreters. HHVM is developed by Facebook and is open source. You can download it yourself and play with it (though you'll clearly be a few iterations behind what we're really using). There is also some fancy HTML5 and JavaScript on the front end to allow you to interact with the site without constantly posting back and reloading.

Everything I've said so far is public information, and you can figure it out after a few hours of web searching or with right-click -> View Source.

There is also a very complex infrastructure comprised of dozens of data centers and many thousands of servers. There are web server systems, user database systems, systems which handle Graph Search queries, systems which handle ad delivery, etc. There are even systems specifically created just for storing pictures and videos. A quick calculation using the number of users and pretty conservative values for how many pictures they might upload per month on average, and how big a typical picture is, can get you close to an estimate of around a petabyte of data a month. The exact figures are probably guarded by NDA, but anyone with half a brain can figure out that 1 billion users upload a lot of pictures, and that means a lot of storage. The current publicly available number is 100 petabytes. There are even systems dedicated to eradicating data when a user deletes an account or we need to destroy data on a device prior to it leaving our facility (NOTHING leaves Facebook without being wiped or physically destroyed, even if it's going back to a vendor - that privacy "social contract" we have with our users is sacred!).
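The "quick calculation" mentioned above can be sketched in a few lines. Only the figure of 1 billion users comes from the answer; the photos-per-month and bytes-per-photo values are illustrative assumptions chosen to be conservative, not Facebook numbers.

```python
# Back-of-envelope estimate of monthly photo storage.
# Only the user count comes from the answer; the other two inputs are assumed.
USERS = 1_000_000_000            # "1 billion users" (from the answer)
PHOTOS_PER_USER_PER_MONTH = 5    # assumption: conservative average
BYTES_PER_PHOTO = 200_000        # assumption: roughly 200 KB per stored photo

PETABYTE = 10**15                # decimal petabyte


def monthly_upload_bytes(users, photos_per_user, bytes_per_photo):
    """Return the total number of bytes uploaded per month."""
    return users * photos_per_user * bytes_per_photo


total = monthly_upload_bytes(USERS, PHOTOS_PER_USER_PER_MONTH, BYTES_PER_PHOTO)
print(f"{total / PETABYTE:.1f} PB per month")  # prints "1.0 PB per month"
```

Doubling either assumed input doubles the result, so the estimate is only good to an order of magnitude, which is all the argument needs.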

Facebook has unique data storage and access problems that no one else in the world has. As such, we have systems no one else in the world has, and no one else in the world should be privy to how they work unless they are working on them.

That being said, we design a lot of our own hardware (see the Open Compute Project, started by FB) to be as efficient and cost effective as possible. We use load balancers to make sure that machines which are free take on more load. If a machine goes down, a spare is spun up (think Amazon Cloud Services, but we're the cloud, and we don't use no stinkin' virtualization).
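As a rough illustration of "machines which are free take on more load" and of a spare replacing a failed machine, here is a minimal least-loaded balancer sketch. The `Pool` class, its method names, and the machine names are all hypothetical; the answer does not say which balancing policy Facebook actually uses.

```python
class Pool:
    """Toy least-loaded balancer with a list of spares (illustrative only)."""

    def __init__(self, active, spares):
        self.load = {name: 0 for name in active}  # machine -> open requests
        self.spares = list(spares)

    def assign(self):
        """Send one request to the machine with the fewest open requests."""
        name = min(self.load, key=self.load.get)
        self.load[name] += 1
        return name

    def release(self, name):
        """Mark one request on `name` as finished."""
        if self.load.get(name, 0) > 0:
            self.load[name] -= 1

    def fail(self, name):
        """Remove a dead machine and bring up a spare, if one is available."""
        self.load.pop(name, None)
        if self.spares:
            replacement = self.spares.pop(0)
            self.load[replacement] = 0
            return replacement
        return None


pool = Pool(active=["web1", "web2"], spares=["spare1"])
first = pool.assign()            # web1 (ties go to the first machine listed)
second = pool.assign()           # web2, now the least loaded
replacement = pool.fail("web1")  # spare1 joins with zero load
third = pool.assign()            # spare1, since it has no open requests
```

In this sketch, requests that were in flight on the failed machine are simply dropped; a real system would also need health checks and retries.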

All of those machines need to be maintained physically (we have an incredible team of Site Operations technicians who replace parts and fix basic systems issues - any one of these people could probably be a Systems Administrator at a company of smaller scale or complexity). They also require a very complex provisioning and sustaining infrastructure to bring up new boxes, plus systems that automatically remediate software issues and any hardware issues that can be fixed with a combination of in-band and out-of-band methods. My team primarily handles the large-scale automation of sustaining and provisioning the fleet.
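One way to picture automatic remediation is as an escalation ladder: try the cheapest in-band fix first, fall back to an out-of-band action, and hand the machine to a technician only if both fail. The sketch below is an assumption about the general shape of such a system, not a description of Facebook's tooling; the step names and fault labels are hypothetical, and the steps are stubbed so the code runs without real hardware.

```python
def remediate(host, steps):
    """Try each remediation step in order; return the name of the one that worked.

    `steps` is a list of (name, function) pairs. Each function takes the host
    and returns True if the host is healthy afterwards.
    """
    for name, step in steps:
        if step(host):
            return name
    return "escalate_to_technician"


# Hypothetical steps, stubbed so the sketch runs without real hardware.
def restart_service(host):
    """In-band fix: only works while the host's OS still responds."""
    return host["fault"] == "service_hung"


def power_cycle(host):
    """Out-of-band fix: e.g. a reset issued through a management controller."""
    return host["fault"] in ("service_hung", "os_hung")


STEPS = [("restart_service", restart_service), ("power_cycle", power_cycle)]

print(remediate({"fault": "service_hung"}, STEPS))  # restart_service
print(remediate({"fault": "os_hung"}, STEPS))       # power_cycle
print(remediate({"fault": "bad_disk"}, STEPS))      # escalate_to_technician
```

Ordering the steps from least to most disruptive keeps a simple software fault from triggering a full power cycle.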

The infrastructure of Facebook could be the subject of an entire series of books.

As for "how" it's all built from scratch? Well, it probably ran on a box in Zuck's dorm for a little while, until that wasn't enough. Then a few more boxes were brought online in some data center. Then that wasn't enough. Then we built our own data centers. And we're still building more - we just turned up a state-of-the-art DC in Luleå, Sweden on a site the size of a few soccer fields. All of the support systems to handle the scaling have been built along the way by the most talented people I've ever had the pleasure of working with. A great number of these problems are dealt with ad hoc, which is to say that we don't know how to solve a problem until it presents itself and needs to be fixed (often as fast as possible). We go into everything new understanding the potential problems, and try to build systems to handle them before they ever impact anything. However, new ones always arise in a system this complex. Remember, no one has ever done this before. What sets us apart is how fast we can deal with them and avoid them in the future.

As a company, Facebook does not dictate that any specific language be used. A lot of people like Python, so a lot of internal tools are written in Python. A lot of people (like me) like Bash, so a lot is written in Bash. There have been questions on Quora in the past about why we still use PHP when we could be using something harder, better, faster, stronger… The PHP front end is now at a point where it's a necessary evil. We're millions of lines of PHP code deep and starting over now would be a massive effort for minimal, if any, payoff. In fact, because we're running the most advanced PHP run-time compiling system in the world, I (though not an expert) would argue we'd probably lose performance by switching to something else. Many back-end tools are written in C, C++, D, Java, Perl, etc. As a company we focus on making impact, not following dogma. If you have an idea that will make the company better in some way, and you pick a language based on your skill or its features and application requirements, it'll be accepted into the fold as long as it works with what's already out there and doesn't break anything. Our repository is a little like the Borg Collective - we'll take anything as long as it gets the job done properly, quickly, securely, and has the most impact for the company (and ultimately, the wonderful community who uses the service).

There is no one person who knows "how it all works" - there are a lot of people who know how their black box works, what its inputs and outputs are, and how to build around everyone else's black boxes. It takes a lot of open communication and teamwork to build something like Facebook from the ground up. Getting a new Facebook - as it is - off the ground in one go would be close to impossible and would cost billions of US dollars. Something like this has to evolve organically.

That's as close to an answer as I can give you for two reasons: 1) your question was too general - please specify what you want to know, and 2) a lot of this is covered by Non-Disclosure Agreement and simply can't be discussed. A LOT of this information is already available on the internet. Check out Facebook Newsroom for some more specifics. There are also engineering blogs and tech talks available with some simple web searches.
