Google Search Gets Caffeine Jolt

Google has launched a major overhaul of the way it produces its search results. If it works as designed, new content could appear in results much more quickly.

The key to Google, as with any search engine, is that queries don’t literally search the Internet. Instead they search Google’s index, which is kept current by repeatedly scanning web pages.

Until now, the index has been built in several “layers”, each refreshed at a different interval. The main layer was updated roughly once every fortnight, meaning that on average the data Google used for an individual web page would be a week old. (There are exceptions for sites known to be updated more frequently, for example those scanned for Google News.)

That system has been replaced by “Caffeine”, which throws out the layers and breaks the web down into much smaller chunks. Rather than running on a cycle, the index is continuously updated. This means new content goes “live” on the index as soon as Google finds it, rather than at the next scheduled update.
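
The difference between the two approaches can be illustrated with a toy model. This is only a sketch and not Google's implementation: the function names are hypothetical, the 14-day cycle is taken from the "once every fortnight" figure above, and it assumes pages are scanned at evenly spread moments across a cycle and that a continuous index publishes scanned data with no delay at all.

```python
import math

CYCLE_DAYS = 14  # assumed from "roughly once every fortnight"


def live_time_batch(scan_time, cycle=CYCLE_DAYS):
    """Cycle-based model: scanned data goes live at the next scheduled rebuild."""
    return math.ceil(scan_time / cycle) * cycle


def live_time_continuous(scan_time):
    """Continuous model (idealized): scanned data goes live as soon as it is found."""
    return scan_time


def average_delay(live_time, scan_times):
    """Mean number of days between a page being scanned and going live."""
    delays = [live_time(t) - t for t in scan_times]
    return sum(delays) / len(delays)


# Hypothetical scan times: the midpoints of quarter-day slots across one cycle.
scan_times = [(i + 0.5) / 4 for i in range(CYCLE_DAYS * 4)]

batch_delay = average_delay(live_time_batch, scan_times)
continuous_delay = average_delay(live_time_continuous, scan_times)

print(f"cycle-based: {batch_delay:.1f} days")      # cycle-based: 7.0 days
print(f"continuous:  {continuous_delay:.1f} days")  # continuous:  0.0 days
```

Under these assumptions, the cycle-based model reproduces the one-week average quoted for the old system, while the continuous model removes the wait entirely. In practice a continuous index would still take some time to process each chunk, so zero is a lower bound rather than a realistic figure.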

According to the company, this makes its index “50 percent fresher”, whatever that means. It also notes that the entire Caffeine database takes up almost 100 million gigabytes and that it is refreshed at a rate of hundreds of thousands of gigabytes per day.

What’s not yet clear is whether this will have any impact on search engine optimization and the factors which determine where pages appear in the Google rankings. It appears that the frequency at which individual pages are scanned won’t change; what changes is the speed at which the data collected in each scan is built into the live index. Still, that change may affect the number of pages which wind up benefiting from whatever element of Google’s algorithm favors recently written content.