Google has publicly released BigQuery, its big data analysis software, which can process billions of rows of data in the cloud.
The latest hubbub in the web hosting industry is big data. Big data refers to data sets that are too large to work with using standard database management tools: they become too difficult to capture, store, search, and share, let alone visualize in their entirety. Big data ranges in size from petabytes (~1,000 terabytes) to exabytes (~1,000 petabytes) or even zettabytes (~1,000 exabytes). It is estimated that all the information on the Internet adds up to about half a zettabyte (500 exabytes).
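To put those scales in perspective, here's a quick back-of-the-envelope conversion using the approximate 1,000× steps mentioned above (the function and constant names are ours, purely for illustration):

```python
# Approximate unit steps from the article: 1 PB ~ 1,000 TB,
# 1 EB ~ 1,000 PB, 1 ZB ~ 1,000 EB.
EB_PER_ZB = 1000

def exabytes_to_zettabytes(eb):
    """Convert exabytes to zettabytes using the ~1,000x step."""
    return eb / EB_PER_ZB

# The article's estimate: all information on the Internet is ~500 EB.
print(exabytes_to_zettabytes(500))  # 0.5 zettabytes
```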
So, working with big data and being able to host it has become a revenue generator for web hosting companies. As you might expect, Google is leading the way. On Tuesday, Google announced on its official developer blog that its latest product, BigQuery, is now available to the public. Prior to this release, BigQuery had been in a limited preview. BigQuery is a data analysis tool that lets you query up to 100GB of data per month for free; the other free tier being released covers storage of up to 2TB of data. Unfortunately, 2TB isn't exactly the big data scale that is becoming more and more prevalent in the industry today.
Analysts are calling BigQuery a Platform-as-a-Service (PaaS) product. It is built for massive data sets: a single table can hold billions of rows. BigQuery runs its queries through an SQL-like query language and boasts the ability to analyze massive data sets in seconds. Fans of the service are praising BigQuery over other big data analysis software like Hadoop because there's no setup or extraneous software required. Additionally, the price point is competitive.
Note: It’s difficult to comprehend that we’ve really come up with that much digital information! But, if your company hasn’t quite reached that much data yet, think about NetHosting’s virtual hosting option which comes with either MySQL or MS SQL databases.
Google’s pricing model is based on the amount of data queries process. After the free 100GB per month, it costs companies and organizations $0.035/GB processed, with caps of 1,000 queries per day and 20TB of data processed per day. Note that customers are charged for the data processed in the columns a query reads, not for the whole table. Finally, users have three avenues for interacting with the information BigQuery provides: a browser-based query tool, a Python command-line tool, and a REST API.
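As a rough sketch of how that pricing works out, here is a small cost estimator based only on the figures above. The function name and structure are our own, and the daily caps (1,000 queries, 20TB processed) are not modeled:

```python
FREE_GB_PER_MONTH = 100   # free query allowance per month
PRICE_PER_GB = 0.035      # charged per GB processed beyond the free tier

def monthly_query_cost(gb_processed):
    """Estimate a monthly BigQuery bill for a given amount of data
    processed by queries. Since billing is per column read, gb_processed
    should be the data in the columns queries touch, not whole tables."""
    billable_gb = max(0, gb_processed - FREE_GB_PER_MONTH)
    return billable_gb * PRICE_PER_GB

print(monthly_query_cost(100))   # 0.0  -- entirely within the free tier
print(monthly_query_cost(1100))  # 35.0 -- 1,000 GB beyond the free allowance
```

Processing an extra terabyte beyond the free tier would therefore run about $35/month under this model.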
Another benefit of BigQuery is its Terms of Service, which give each customer full and exclusive control of, and rights to, its own data. That said, Google is most likely gleaning all the information it can from the data running through BigQuery, so companies may still be concerned about privacy.
The advantage that other big data tools like Hadoop and Hive do have, however, is flexibility. BigQuery might be faster out of the box, but if you want more advanced or complicated analysis of your data, it might not be the right software for your company.
As big data becomes a commodity in the hosting industry, it will be interesting to see how it evolves into a de facto feature of future hosting services and suites. Naturally, BigQuery, like most other big data analysis software such as Hadoop and Hive, is available in the cloud. While the amount of data labeled big data might seem inconceivable to you or your company right now, the fact is that more and more people are getting online, and more and more digital information is being stored and exchanged. A few short years ago, terabytes seemed like something only Google had to worry about; now a couple of terabytes is standard on a budget laptop.
To read more about what’s going on in the cloud throughout the hosting industry, take a look at our blog post about Microsoft’s research that proves the cloud has created millions of jobs since its inception.