Monday, January 29, 2007

Defusing the Googlebomb

A few days ago, Google made some changes to their ranking algorithms to reduce the practice of Googlebombing. A Google bomb is basically a prank to manipulate the ranking of pages in Google’s search engine. It involves getting a lot of people to put a link to a particular web page on their site with the anchor text they want associated with the web page.

For example, if I wanted a search for “basketball stud” to show my blog as the first result, I’d get as many people as I could to place a link on their website that looks like this:
<a href="http://frankmccown.blogspot.com/">basketball stud</a>

Then, when Google crawls the Web and sees a large number of links that look like this, it begins to favor this page over the rest when users search for “basketball stud”.
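To make the mechanism concrete, here is a rough Python sketch of how a crawler might tally anchor text as "votes" for a target page. It is only an illustration under my own assumptions: the class `AnchorCollector` and the function `anchor_votes` are hypothetical names, the pages are passed in as plain HTML strings, and nothing here reflects how Google actually weights links.

```python
from collections import Counter
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects (href, anchor text) pairs from one HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse whitespace and lowercase so small variations still match.
            text = " ".join("".join(self._text).split()).lower()
            self.links.append((self._href, text))
            self._href = None


def anchor_votes(pages, phrase):
    """Count, per target URL, the links whose anchor text equals `phrase`."""
    votes = Counter()
    for html in pages:
        collector = AnchorCollector()
        collector.feed(html)
        for href, text in collector.links:
            if text == phrase.lower():
                votes[href] += 1
    return votes
```

Calling `anchor_votes(pages, "basketball stud")` on a pile of crawled pages returns a counter of target URLs. In this toy model, the page with the most matching links would be the one a bomb pushes to the top.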

One of the most famous Googlebombs involves a search for “miserable failure”. While this used to show George Bush’s web page first in the results, it now brings up more relevant results. Danny Sullivan has written a good article about this.

How did Google reduce the effects of Google bombs? They’re not giving particulars, but they have admitted it’s purely automated. My guess is they analyze several factors:
  1. When and where was the link first found? Possibly Google tracks the growth of particular links.
  2. Does the link make sense for the web page or website? A red flag might be raised when a website about hacking points to a government web page when none of the other links do.
  3. Is the target page actually "about" the anchor text? If the words "miserable failure" aren't on the target page, it could be a bomb.
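The third factor is the easiest to sketch in code. The Python below is a guess at what such a check could look like, not Google's method: `anchor_coverage` is a hypothetical helper that reports what fraction of the anchor-text words actually appear in the target page's text. It assumes the page text has already been extracted from the HTML.

```python
import re


def words(text):
    """Lowercased set of alphanumeric words in `text`."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def anchor_coverage(anchor_text, page_text):
    """Fraction of the anchor-text words that also occur in the page text.

    An empty anchor gives no evidence either way, so it counts as fully covered.
    """
    anchor_words = words(anchor_text)
    if not anchor_words:
        return 1.0
    found = anchor_words & words(page_text)
    return len(found) / len(anchor_words)
```

A coverage near zero for a phrase that thousands of links agree on would be one hint that the links are a prank rather than a description of the page. A real system would presumably combine a signal like this with the other factors instead of relying on it alone.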

If you’d really like to dig into this subject, here’s a master’s thesis on the topic.

6 comments:

  1. Hey Frank,

    This is Jon Wrye. I don't know if you remember me. I was enrolled in your webdev class in fall 2002 (I think), but ended up dropping it. I also know Becky, as does Miriam, my wife. I started classes this spring for a second degree in CS and that's how I found your page (a link from the dept web site).

    Just wanted to say congrats on the baby and to let you know I am looking forward to having you as a professor (again) when you get back. Also, I wanted to share one of my favorite commercials/ads that you may not have seen. It is for the Berlitz Language School. Enjoy!

    http://video.google.com/videoplay?docid=-9180512665135657036

  2. Thanks for this info

  3. Hello Jon. I look forward to seeing you at Harding in the fall. That commercial is one of my all-time favorites! Becky says hello.

    Frank

  4. I would add the following:

    5. Is the linking pattern natural and organic? i.e. Does it follow a gradual curve over time?

    If an "unnatural" pattern occurs -- e.g. too many links with the same anchor text popping up in a short space of time -- they treat this as spam. That's my guess anyway.

  5. Seems like the tag should be completely irrelevant. Shouldn't Google look at the actual page that the link points to, and then capture real keywords from that page? I could link any word in my webpage to any page. I could link the word "to" to "www.google.com". That doesn't mean that www.google.com has any content associated with "to".

    I guess I don't see the point of google taking linked words into account when indexing pages ...?

  6. Dr. Burt- Google takes into account numerous things when trying to determine if a page is relevant to a query or not, and anchor text is just one small piece of the puzzle. They realize that many links may use meaningless anchor text, so they use algorithms to sort out the chaff. When they encounter millions of links all using the same anchor text, they pay attention since it's difficult (but obviously not impossible) for millions of different people to have colluded. You can think of it like the tagging mechanism of social bookmarking sites like del.icio.us.
