With 22 million registered users and more than 4 million unique visitors every month, LinkedIn is without a doubt one of the most popular destinations on the web.
- Bandwidth (Why Low?): LinkedIn is a basic website that doesn’t attract as much traffic as Facebook or YouTube does. Additionally, the information that it sends back is light in content and does not require a large amount of bandwidth.
- CPU (Why Low?): The content on LinkedIn is mainly text which doesn’t require a high amount of CPU power to be displayed. The CPU will mostly be used to power the search engine and the cataloguing.
- Disk (Why Medium?): Although LinkedIn stores quite a bit of personal information on its servers, it does not need to store as much content-heavy data as a site like Netflix would. A moderate amount of disk space is enough for the user information it keeps.
- RAM (Why Medium?): Because LinkedIn’s primary function is searching for specific people, it needs a moderate amount of RAM to run those searches. But because the data is so simple, it doesn’t need too much RAM to bring that information up.
- Scalability (Why Medium?): While it has implemented a cloud hosting solution, LinkedIn doesn’t need to have a high amount of scalability in order to accommodate its traffic flow.
Originally launched in May of 2003, LinkedIn.com has climbed to the 22nd most visited site on the web. LinkedIn’s founder, Reid Hoffman, created the site for professional networking. Users can connect with coworkers, peers, and companies the world over, especially now that the site is available in French, German, Italian, Portuguese, and Spanish, in addition to English.
Each day the site handles at least 40 million page views, 2 million searches, 250,000 invitations sent, 1 million answers posted, and 2 million email messages.
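To put the daily figures in perspective, the short calculation below converts them into average per-second rates. It is only a rough sketch: it assumes the load is spread evenly across the day, which real traffic is not, and the dictionary keys are illustrative names rather than anything LinkedIn uses.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Daily activity figures quoted in the text; key names are illustrative.
daily_counts = {
    "page_views": 40_000_000,
    "searches": 2_000_000,
    "invitations": 250_000,
    "answers": 1_000_000,
    "email_messages": 2_000_000,
}


def average_per_second(per_day: int) -> float:
    """Average rate over one day, assuming perfectly even load (a simplification)."""
    return per_day / SECONDS_PER_DAY


rates = {name: average_per_second(count) for name, count in daily_counts.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} per second on average")
```

Peak rates would be noticeably higher than these averages, so the numbers are a floor for capacity planning rather than a target.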
LinkedIn started off using one monolithic web application (the WebApp) and one database, the Core Database. A component called The Cloud cached the network graph in memory. The background engine for user searches was Lucene, which ran on the same machine as The Cloud. The WebApp updated the Core Database, which in turn updated The Cloud.
In 2006, to reduce the workload on the Core Database, Replica DB was added to the mix. Updates were then managed by a RepDB server, and user searches were subsequently moved from The Cloud to a separate server. With the addition of Databus, a new updates chain was created:
- Changes originate in the WebApp
- The WebApp updates the Core Database
- The Core Database sends updates to the Databus
- The updates are sent by the Databus to the Replica DBs, The Cloud, and Search
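The update chain above can be sketched as a simple publish/subscribe flow. This is a minimal in-process illustration, not LinkedIn's implementation: the class names, method names, and the shape of the update record are all hypothetical, and a real Databus would deliver changes across machines rather than through direct function calls.

```python
from typing import Callable, Dict, List

Update = Dict[str, str]  # hypothetical shape of a change record


class Databus:
    """Hypothetical stand-in for the Databus: fans each update out to subscribers."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Update], None]] = []

    def subscribe(self, handler: Callable[[Update], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, update: Update) -> None:
        for handler in self._subscribers:
            handler(update)


class CoreDatabase:
    """Stores the write, then forwards it to the Databus."""

    def __init__(self, databus: Databus) -> None:
        self.rows: Dict[str, Update] = {}
        self._databus = databus

    def write(self, update: Update) -> None:
        self.rows[update["member_id"]] = update
        self._databus.publish(update)


class Consumer:
    """A downstream system (Replica DB, The Cloud, or Search) that records updates."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.received: List[Update] = []

    def apply(self, update: Update) -> None:
        self.received.append(update)


def web_app_save_profile(core_db: CoreDatabase, member_id: str, headline: str) -> None:
    """Changes originate in the WebApp, which only talks to the Core Database."""
    core_db.write({"member_id": member_id, "headline": headline})


databus = Databus()
core_db = CoreDatabase(databus)
consumers = [Consumer("replica_db"), Consumer("cloud"), Consumer("search")]
for consumer in consumers:
    databus.subscribe(consumer.apply)

web_app_save_profile(core_db, "42", "Engineer")
```

The point of the arrangement is that the WebApp writes once, and every downstream system receives the same change without the WebApp knowing those systems exist.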
In 2008, LinkedIn used vertical partitioning to give each Service its own domain-specific database, splitting its business logic between the Services and the WebApp. The Services are used to manipulate the Profile, Groups, etc. This architecture also allows other applications, including applications for recruiters, ads, etc., to access LinkedIn.
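As a rough illustration of vertical partitioning, the sketch below gives each service its own store and lets two different applications call the same service. Everything here is hypothetical: plain dictionaries stand in for the domain-specific databases, and the service and function names are invented for the example.

```python
from typing import Any, Dict, Optional


class Service:
    """A service that owns one domain-specific store (a dict stands in for its database)."""

    def __init__(self, domain: str) -> None:
        self.domain = domain
        self._store: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        self._store[key] = value

    def get(self, key: str) -> Optional[Any]:
        return self._store.get(key)


# One service, and therefore one database, per domain.
services = {
    "profile": Service("profile"),
    "groups": Service("groups"),
}


def web_app_view_member(member_id: str) -> Dict[str, Any]:
    """The WebApp assembles a page by calling several services."""
    return {
        "profile": services["profile"].get(member_id),
        "groups": services["groups"].get(member_id) or [],
    }


def recruiter_app_lookup(member_id: str) -> Optional[Any]:
    """A second application reuses the Profile service without touching Groups data."""
    return services["profile"].get(member_id)


services["profile"].put("42", {"headline": "Engineer"})
services["groups"].put("42", ["Python Developers"])

page = web_app_view_member("42")
recruiter_view = recruiter_app_lookup("42")
```

Because each domain lives behind its own service, a new application only needs access to the services it uses, and each database can be sized for its own workload.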
The LinkedIn Cloud caches the entire LinkedIn network graph in memory: a network of 22M nodes and 120M edges that requires 12 GB of RAM. Currently, there are at least 40 instances in production. The Cloud is updated in real time through the Databus. The cache is implemented in C++ and is accessed via JNI.
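The graph figures allow a quick back-of-the-envelope check of how much memory the cache spends per element. The calculation below is only a sketch: it treats 12 GB as decimal gigabytes (an assumption) and lumps all overhead together, so it says nothing about the actual memory layout of the C++ cache.

```python
NODES = 22_000_000
EDGES = 120_000_000
RAM_BYTES = 12 * 10**9  # assumes decimal gigabytes; the source only says "12 GB"

# Average memory budget if all RAM were attributed to edges or to nodes alone.
bytes_per_edge = RAM_BYTES / EDGES
bytes_per_node = RAM_BYTES / NODES

# Average number of edges stored per node in the graph.
edges_per_node = EDGES / NODES

print(f"{bytes_per_edge:.0f} bytes per edge")
print(f"{bytes_per_node:.0f} bytes per node")
print(f"{edges_per_node:.2f} edges per node")
```

Roughly 100 bytes per edge is far more than a bare pair of member IDs would need, which suggests the footprint includes indexes and other per-entry overhead, although the source does not say what.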