1 free chat request
For Facebook Chat, we rolled our own subsystem for logging chat messages (in C ) as well as an epoll-driven web server (in Erlang) that holds online users' conversations in-memory and serves the long-polled HTTP requests.Both subsystems are clustered and partitioned for reliability and efficient failover. In short, because the problem domain fits Erlang like a glove.Even without accounting for the sizeable overhead of spawning an OS process that, on average, twiddles its thumbs for a minute before reporting that no one has sent the user a message, the waiting time could be spent servicing 60-some requests for regular Facebook pages.The result of running out of Apache processes over the entire Facebook web tier is not pretty, nor is the dynamic configuration of the Apache process limits enjoyable.Cache misses and database failure can be detected by the non-database layers and either reported to the user or worked around using replication.
Because the data is persisted, a failed read request can be re-attempted.
One of the things I like most about working at Facebook is the ability to launch products that are (almost) immediately used by millions of people.
Unlike a three-guys-in-a-garage startup, we don't have the luxury of scaling out infrastructure to keep pace with user growth; when your feature's userbase will go from 0 to 70 million practically overnight, scalability has to be baked in from the start.
The naive implementation of sending a notification to all friends whenever a user comes online or goes offline has a worst case cost of O(average friendlist size * peak users * churn rate) messages/second, where churn rate is the frequency with which users come online and go offline, in events/second.
This is wildly inefficient to the point of being untenable, given that the average number of friends per user is measured in the hundreds, and the number of concurrent users during peak site usage is on the order of several millions.