Thursday, July 10, 2008

Technology at Google

There's been a recent flurry of interesting blog posts describing different pieces of technology at Google. Udi Manber introduced the work of the Search Quality group at Google (I work in this group). His post is the first in a series of more in-depth posts. Here's my favorite bit in Udi's post:

...but the goal is always the same: improve the user experience. This is not the main goal, it is the only goal.

Amit Singhal followed up on Udi's post with a discussion of the philosophy underlying Google's ranking algorithm (Amit has promised a follow-up focused on technology). Here's how Amit describes our philosophy:
1) Best locally relevant results served globally.
2) Keep it simple.
3) No manual intervention.

In a separate series of posts on how data is used within Search Quality, Paul Haahr and Steve Baker write about using data to build language models. They observe that:

By analyzing how people use language, we build models that enable us to interpret searches better, offer spelling corrections, understand when alternative forms of words are needed, offer language translation, and even suggest when searching in another language is appropriate.
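
To make the idea of a usage-driven language model a little more concrete, here is a toy sketch of spelling correction based on word frequencies. It is not how Google does it; the tiny corpus, the single-edit candidate generation, and the names `COUNTS`, `edits1`, and `correct` are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical toy corpus standing in for real usage data.
CORPUS = "the quick brown fox jumps over the lazy dog the fox".split()
COUNTS = Counter(CORPUS)  # unigram "language model": how often each word is used
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return every string one edit (delete, transpose, replace, insert) away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def correct(word):
    """Suggest the most frequently used known word within one edit of `word`."""
    if word in COUNTS:
        return word  # already a word people use; leave it alone
    candidates = [w for w in edits1(word) if w in COUNTS]
    if not candidates:
        return word  # nothing plausible nearby; make no suggestion
    # Prefer the most frequent candidate; break ties alphabetically.
    return max(candidates, key=lambda w: (COUNTS[w], w))
```

With this corpus, `correct("teh")` suggests `"the"`, while an unknown string with no close neighbor is returned unchanged. A real system would presumably draw on far more data and context than a single word count.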

Matt Cutts follows up with a post on using data to fight web spam. He says:

Our logs data helps ensure that Google detects and has a chance to counteract new spam trends before it lowers the quality of your search experience.

Outside of Search Quality, I'm particularly pleased about the recent announcement that we have open sourced protocol buffers, our data interchange format. Protocol buffers are pervasive inside Google and are a very effective way of encoding "...almost any sort of structured information which needs to be passed across the network or stored on disk."
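
For a rough feel of what that encoding looks like on the wire, here is a small hand-rolled sketch of the base-128 "varint" scheme that protocol buffers use for integers. It deliberately does not use the protobuf library; the helper names `encode_varint`, `decode_varint`, and `encode_uint_field` are my own, and the sketch covers only unsigned integer fields.

```python
def encode_varint(n):
    """Encode a non-negative integer as a base-128 varint (low 7 bits first)."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F          # take the low 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data):
    """Decode one varint from the start of `data`; return (value, bytes_used)."""
    result = 0
    shift = 0
    for i, byte in enumerate(data):
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, i + 1
        shift += 7
    raise ValueError("truncated varint")


def encode_uint_field(field_number, value):
    """Encode an unsigned integer field as a key varint followed by a value varint."""
    key = (field_number << 3) | 0  # wire type 0 means "varint"
    return encode_varint(key) + encode_varint(value)
```

For example, `encode_uint_field(1, 150)` yields the three bytes `08 96 01`. Small values take a single byte, which is part of why the format is compact.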

Two other announcements are also worth highlighting: Google's C++ testing framework and Google's C++ style guide have both been open sourced.

Finally, here's a video of the Google Factory Tour of Search held back in May.

I have a short segment in there (at about the 65-minute mark) describing some of the work our group has done on query understanding. Earlier in the video (at about the 56-minute mark) Trystan Upstill talks about our group's work on International Search Quality (which is Amit's first point: best locally relevant results served globally). And earlier still (at about the 46-minute mark) Johanna Wright talks about Universal Search.

