March 14th, 2006, 7:30

We've noticed that when clouds are built, some terms that are valid words, and would normally be taken into account, should actually be ignored because of the context around them. Yet ZoomClouds was still treating them as "important".

One of the most obvious cases is the text "Technorati tags: ...", so we'll use it to show what's been going on so far and how we've fixed it.

The text "Technorati tags: ..." often appears at the end of posts. When ZoomClouds found the word "Technorati", it assumed it was a relevant term and included it in the cloud. Since in these cases the term appears in each and every post for that blog, the cloud ended up with Technorati as one of the most relevant terms, if not the most relevant one. Yet chances are the blogger never even mentioned Technorati in the body of his/her posts.

This forced some people to add "Technorati" as an unwanted tag. That somewhat fixed the problem, but should that person ever write something about Technorati, the term would never appear in the cloud, so it certainly wasn't a perfect solution.

Now we've added a small behind-the-scenes context filter to ZoomClouds: when a term is found within certain irrelevant contexts, it is not considered when building the cloud. That's the case with the "Technorati tags" example, as well as a few others.
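
To make the idea concrete, here is a rough sketch of how a context filter of this kind could work. It is not the actual ZoomClouds code: the names IRRELEVANT_CONTEXTS, strip_irrelevant_contexts and count_terms are made up for illustration, and the assumption that the boilerplate sits on its own line starting with "Technorati tags:" is ours.

```python
import re
from collections import Counter

# Hypothetical list of irrelevant contexts. Each pattern matches a whole
# line of boilerplate that is removed before any terms are counted.
IRRELEVANT_CONTEXTS = [
    re.compile(r"^[ \t]*Technorati tags:.*$", re.IGNORECASE | re.MULTILINE),
]

def strip_irrelevant_contexts(text):
    """Remove boilerplate lines, leaving the rest of the post untouched."""
    for pattern in IRRELEVANT_CONTEXTS:
        text = pattern.sub("", text)
    return text

def count_terms(posts):
    """Count lowercase word occurrences across posts after filtering."""
    counts = Counter()
    for post in posts:
        cleaned = strip_irrelevant_contexts(post)
        counts.update(word.lower() for word in re.findall(r"[A-Za-z]+", cleaned))
    return counts

posts = [
    "Notes on my garden.\nTechnorati tags: garden, plants",
    "I tried Technorati search today.\nTechnorati tags: search",
]
counts = count_terms(posts)
```

In this sketch the word "Technorati" is still counted once, because the second post really talks about it, while the two "Technorati tags: ..." lines contribute nothing. That is the difference from listing "Technorati" as an unwanted tag, which would drop the genuine mention as well.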

