Google Search started as an idea from two Stanford University PhD students and now handles one billion searches a day, alongside many other Google products.
- Bandwidth (Why Low?): The data that Google Search returns to the user after a query is relatively small, mainly text or linked images. This does not require much bandwidth.
- CPU (Why Low?): Google Search does not require much computing power to generate the search results the user sees on the screen. The CPU mainly powers its algorithms and user interface.
- Disk (Why High?): Because of Google Search’s indexing system, the storage requirements are extremely high.
- RAM (Why Low?): Much like computing power, Google Search uses a low amount of RAM when processing the user’s search request.
- Scalability (Why High?): Because Google Search receives an enormous amount of traffic worldwide and is heavily depended upon, Google has built its servers to run at full capacity and to keep performing no matter how high the demand.
Sergey Brin and Larry Page registered the domain google.com in 1997, a year after their PhD research led them to an idea that revolutionized how to search the Internet. Today, Google’s most popular product, Google Search, handles more than one billion searches a day. It has become a household name, synonymous with searching the Internet regardless of which search engine is actually used. The rest of Google’s products (AdWords, Gmail, Blogger, etc.) have pervaded the majority of activity on the Internet as well. Here’s a closer look not only at the company’s history, but also at how its hardware and software get it all done.
In 1996, Stanford PhD students Sergey Brin and Larry Page started a research project. The crux of their revolutionary search engine was an algorithm called PageRank, which ranked a website within search results based on how many other pages linked back to the website in question. Prior to PageRank, search engines ranked websites by how often the search term appeared throughout a website. Originally, they called their search engine BackRub because backlinks (that is, links going from one website back to the website being ranked) were the determining PageRank factor.
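To make the backlink idea concrete, here is a minimal sketch of a simplified PageRank in Python. It is an illustration, not Google’s production code: the function name, the toy link graph, and the iteration count are all assumptions, and the damping factor of 0.85 is the value commonly quoted from the original PageRank paper. Each page splits its score among the pages it links to, so a page collects more score when more (and better-ranked) pages link back to it.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated score passing (illustrative only).

    links maps each page to the list of pages it links out to.
    Returns a dict mapping each page to a score; the scores sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: three pages link to c.example, which links to a.example.
toy_web = {
    "a.example": ["c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, c.example ends up on top because it has the most backlinks, and a.example outranks b.example and d.example because its single backlink comes from the highest-ranked page.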
BackRub was later renamed Google, an unintentional misspelling of googol, which is the number one followed by one hundred zeros. Brin and Page picked the term to represent the amount of data they wanted their search engine to provide to users. The domain was available and on September 15, 1997, google.com was born.
One year later, the brainchild of Brin and Page finally became Google Inc. A month before the incorporated title became official, Sun Microsystems co-founder Andy Bechtolsheim gave Brin and Page $100,000 to fund the startup. In early 1999, both PhD students thought their side project was taking too much time away from their other academic responsibilities. They approached George Bell, CEO of the Excite search engine, and offered to sell him their algorithm and company for one million dollars, an offer he rejected. Later that year Google received $25 million in funding from companies like Sequoia Capital and Kleiner Perkins.
Initially, Brin and Page were opposed to the clutter that ads would create on their simple and clean search home page. However, to keep the search free and cover costs, the founders agreed to ads on the Google homepage, but text-only, to keep the design clean and to keep the speed of the search up as well. Advertisers could buy ad space based on keywords that users searched for.
In the following years, Google made acquisition after acquisition to develop and present many of its current products, like Blogger, Google Voice, Google Earth, Google Docs, and more. The company moved multiple times in its history to accommodate its growing staff, most recently landing in Mountain View, California in 2003. The massive complex that houses the Google staff is commonly referred to as the Googleplex.
On August 19, 2004, Google launched its initial public offering, with 19,605,052 shares at $85 a share. Two years later, the verb “google” was included in the Merriam-Webster and Oxford English dictionaries, defined as “to use the Google search engine to obtain information on the Internet.” Since the inception of the search engine, Google has never slowed down, and has only grown more and more a part of daily life for Internet users.
Google is such a large company with so many different products that it’s difficult to summarize them all. The majority of Google’s revenue comes from Google’s ads. Two of Google’s biggest ad revenue streams are AdWords and AdSense. When these are paired with Google Analytics, Google customers have the distinct advantage of buying ad space that targets users’ specific interests and then watching how users respond to the design and content of their websites.
AdWords is Google’s main revenue stream. It offers companies space to advertise their websites under a pay-per-click (PPC) pricing model. Advertisers pick the words that trigger their ads to come up on the side of Google searches. Additionally, advertisers choose the maximum they will pay Google for each click their ads receive. AdWords customers’ ads only appear on Google searches, whereas AdSense customers’ ads show up within the Google Display Network (pages within Google’s family of content, like blogs or Gmail).
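The two advertiser choices described above, trigger keywords and a per-click spending limit, can be sketched in a few lines of Python. Everything here is hypothetical: the `Ad` class, the function names, and the advertisers are made up for illustration, and the price of a click is simply passed in as a parameter because this case study does not cover how Google actually sets it. The sketch only shows that an ad appears when a query contains one of its keywords and that a click is never billed above the advertiser’s cap.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    """Hypothetical ad record; amounts are in cents to avoid rounding issues."""
    advertiser: str
    keywords: frozenset      # words that trigger the ad
    max_cpc_cents: int       # the most the advertiser will pay for one click
    clicks: int = 0
    spend_cents: int = 0


def matching_ads(query, ads):
    """Return the ads with at least one keyword appearing in the query."""
    words = set(query.lower().split())
    return [ad for ad in ads if ad.keywords & words]


def record_click(ad, price_cents):
    """Bill one click, never charging more than the advertiser's cap."""
    charged = min(price_cents, ad.max_cpc_cents)
    ad.clicks += 1
    ad.spend_cents += charged
    return charged


# Made-up advertisers for illustration.
ads = [
    Ad("shoe-shop", frozenset({"shoes", "sneakers"}), max_cpc_cents=50),
    Ad("florist", frozenset({"flowers"}), max_cpc_cents=30),
]
shown = matching_ads("Cheap running SHOES", ads)
print([ad.advertiser for ad in shown])
print(record_click(shown[0], price_cents=80))  # billed at the 50-cent cap
```

Under pay-per-click, the advertiser is billed only when `record_click` runs, not each time the ad is shown.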
In contextual advertisements, Google servers look at the page, determine its high-value keywords, and then present ads that have been associated with those keywords. In site-targeted advertisements, advertisers choose which sites their ads are placed on and pay for those placements. Finally, search advertisements are the “sponsored” advertisements users see within a search results page. DoubleClick, a subsidiary of Google, is another large part of Google’s advertising model, and it works with AdSense. It uploads ads and reports ad performance.
Most of the company’s other products are free to use, with a few limitations. For example, Google’s email product, Gmail, offers users up to 7.4 GB of free storage and charges for extra storage beyond that (from 20 GB to 16 TB). Enterprise versions of many of the same free, individualized products can be purchased for company use. Aside from Gmail and Search, Google’s most popular products include Blogger (the blogging service hosted through Google), Reader (the RSS feed organizer hosted through Google), and Google Docs (the dynamic, multiple-editor-capable online text editor that can also edit spreadsheets and slide presentations).
The majority (if not all) of Google’s architecture is built from in-house software. The three most used programming languages at the Googleplex are Java, C++, and Python. Google made its own Linux-based web server called Google Web Server. Due to trade secrets, specifics on the server software configuration are sparse. The only detail that has been confirmed is that the web server does not run on Apache.
The storage system Google uses is its own original file system, called GoogleFS. Google also researched and developed its own distributed lock manager for its distributed systems. It’s called Chubby and, since its inception, it has evolved to replace DNS as a name server within Google. In 2010, Google added Caffeine to TeraGoogle, its homegrown server index and search system. Caffeine is a continuous indexing system, and it is the bread and butter of a search engine that can return hundreds of thousands of results within a matter of milliseconds.
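Since the internals of Caffeine are not public in the sources used here, the following is only a toy sketch of what “continuous indexing” means in general: an inverted index that is updated one page at a time as pages are crawled or changed, instead of being rebuilt from scratch in large batches. The class and method names are hypothetical.

```python
from collections import defaultdict


class IncrementalIndex:
    """Toy inverted index that is updated one document at a time."""

    def __init__(self):
        self.postings = defaultdict(set)   # word -> ids of documents containing it
        self.doc_words = {}                # doc id -> words currently indexed for it

    def upsert(self, doc_id, text):
        """Index a new document, or re-index a changed one, without a full rebuild."""
        new_words = set(text.lower().split())
        old_words = self.doc_words.get(doc_id, set())
        # Remove postings for words that no longer appear in the document.
        for word in old_words - new_words:
            self.postings[word].discard(doc_id)
            if not self.postings[word]:
                del self.postings[word]
        # Add postings for words that are new to the document.
        for word in new_words - old_words:
            self.postings[word].add(doc_id)
        self.doc_words[doc_id] = new_words

    def search(self, query):
        """Return the ids of documents containing every word in the query."""
        words = query.lower().split()
        if not words:
            return set()
        result = None
        for word in words:
            docs = self.postings.get(word, set())
            result = set(docs) if result is None else result & docs
        return result


index = IncrementalIndex()
index.upsert("page1", "google search engine")
index.upsert("page2", "search results")
print(index.search("search"))

# page1 changes; only its own postings are touched.
index.upsert("page1", "google maps")
print(index.search("search"))
```

Because a lookup only reads the posting sets for the query words, the index answers quickly no matter how many documents it holds. Holding a posting entry for every word on every page is also what drives the high storage requirement.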
To handle its high volume of traffic, Google developed its own server design which, like its software, has minimal features and is equipped to run at the highest capacity possible. Because the company believes this server design is part of its success in keeping costs down and efficiency high, it has always been very tight-lipped about its hardware.
At one press conference in 2009, Google unveiled one of its servers and briefly explained some of its hardware configuration. At the time, every server had its own 12-volt battery in case of a power supply problem in the data center. Each server also had two x86 processors (from either AMD or Intel), two hard drives, a Gigabyte motherboard, and eight memory slots. Google also disclosed that all of its network devices were equipped with 12-volt batteries. No information about the hardware setup has been released since then, so it is likely that the design has changed.
Google has posted information on its own website about where a few of the data centers it currently owns and operates are located. It claims six data centers within the United States. Outside the United States, it has two operating data centers (one in Finland, one in Belgium), with two more planned for completion in the near future (one in Singapore, one in Hong Kong). However, to handle the high volume of traffic it receives, Google would seem to need more computing resources than eight data centers can provide, no matter how state-of-the-art they are. It’s natural for many companies not to disclose the location of their data centers for security purposes. Many have hypothesized that Google has at least thirty data centers worldwide. Since 2005, Google has used shipping containers to build many of its data centers, so they’re modularized for easy removal and addition.
Google has data centers all over the world, hosts ads on the most popular websites, and provides necessary Internet services to the majority of Internet users (e.g. email through Gmail, blogging through Blogger, scheduling through Calendar, etc.). The great or scary thing about Google is that it’s not slowing down any time soon. While that makes their competitors nervous, the constant push to be on top also has provided the world with amazing innovation and better products.
Mimicking the grand scale of Google’s success isn’t an overnight process, but becoming more visible in Google’s search results is a step in the right direction. Check out NetHosting’s SEO packages to take advantage of Google Search’s popularity and be better seen by Google’s one billion users every day.