Part II – Why Search Engines will be redundant soon…

August 21st, 2006 § 2 comments § permalink

Part II – I Seek You, and your meta-data, too…


The story until now:
Part I was a quick review of traditional Search Engines and their methods, relating them to human conversation, since Web 2.0 is all about ‘conversations in the marketplace’. On to the second part.


What does making sense out of data mean?

In Search Engine terms, it would refer to contextualizing the huge chunk of uncontextual data that is the World Wide Web into information and eventually knowledge. To me, as a human, it simply means attaching certain keywords to any given chunk of data (e.g. a lecture, a passage, a book, a chapter, a conversation) in order to be able to recall it at any time – especially when one of these keywords is mentioned.

For instance, the conversation in the previous post was about a traveller (an out-of-towner) looking for directions to a tobacconist. As I keep reminding myself, Web 2.0 is not a product, it is a process. The process has a lot of conversational threads that keep getting picked up and dropped as newer and more interesting threads or new participants appear in their place.

So what would a contemporary Search engine have to consider in Web 2.0?

‘Weight’ing for Information.

From being a static display of items-for-sale behind elegant window panes, the Internet slowly transformed into a bazaar of sorts, with hawkers all around the place plying their wares. The markets grew to accommodate the new and the old. With the advent of Web 2.0, contextualization of information became the norm and not an option.

It all began with a nifty bookmarking site called del.icio.us that allowed you to access your favorite sites across the web. Technorati extended the concept to Blogs and induced bloggers to ‘tag’ their posts with their choice of keywords/tags.

With the Web evolving like a democracy, the obvious question of authority in the Web-democracy arose. Which voice among the loud babble was to be trusted? As the web evolved, so did the concept of its franchise. Only, in this virtual reality, links were deemed votes and tags were your campaign ads. Let’s take a quick look at the four weights that influence your vote.

  1. Tags – Powerful Keywords
     Each tag is a keyword that associates a particular context, a topic, with a given chunk of data.

  2. Time – The ‘other’ Long Tail
     All topics & data have a peak presence time. The freshness of a particular keyword is of prime importance in its influence.
     Consider this simple example: When Iraq was attacked, almost all of the Search Engines across the world were buzzing with queries consisting of the corresponding keywords, viz., “Iraq” “attack”. The “hotness” of the Search cooled down as the days progressed and the world found other topics to discuss.

  3. Trust & Authority
     Even in flat hierarchies like the Internet there are obvious positions of Trust and Authority. People who blog well and blog often gain a large following and, effectively, the crucial element of Trust.

  4. Authenticity
     A news item on a Microsoft blog would obviously be rated higher in all terms than one quoting a “trusted Source at Microsoft”. The only exceptions to this rule are:

    • The news is a really good bit of juicy gossip – like a rant or a ‘leaked’ secret
    • The blogger has high levels of Trust & Authority
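To make the four weights a little more concrete, here is a toy sketch in Python. Everything in it is my own assumption for illustration: the field names (tags, age_days, trust, first_hand), the one-week half-life, the 0.5 penalty for second-hand news and the choice to multiply the weights together. No real Search Engine is claimed to work this way.

```python
def freshness(age_days, half_life_days=7.0):
    """Time weight: a post loses half its 'hotness' every half-life (assumed: 7 days)."""
    return 0.5 ** (age_days / half_life_days)


def tag_match(post_tags, query_tags):
    """Tag weight: the fraction of the query's tags that the post carries."""
    query = {t.lower() for t in query_tags}
    if not query:
        return 0.0
    return len(query & {t.lower() for t in post_tags}) / len(query)


def score(post, query_tags):
    """Combine the four weights by plain multiplication (an arbitrary choice)."""
    # Authenticity weight: first-hand news counts fully, second-hand news counts half (assumed).
    authenticity = 1.0 if post["first_hand"] else 0.5
    return (tag_match(post["tags"], query_tags)
            * freshness(post["age_days"])
            * post["trust"]
            * authenticity)


# Two hypothetical posts: same tags, same age, same blogger trust,
# but only one of them is first-hand news.
official = {"tags": ["microsoft", "news"], "age_days": 0, "trust": 0.8, "first_hand": True}
rumour = {"tags": ["microsoft", "news"], "age_days": 0, "trust": 0.8, "first_hand": False}

print(score(official, ["microsoft"]) > score(rumour, ["microsoft"]))
```

Because the weights are multiplied, a blogger with very high Trust can partly make up for second-hand news, which is roughly the second exception listed above. A different way of combining the weights would behave differently.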

There’s a common thread that binds all of these… Do you see it yet?

(To be concluded)

Note:
I profusely apologise for disappearing from the Blogging scene all of a sudden. I was forced into a short hiatus by unforeseen circumstances. We recently updated our website platform to a new version. Although the beta is pretty stable, we are still working on a better UI. As a result, I had to spend some sleepless nights and a few Blog-less weeks. ;)

Once again, my sincere apologies for the same…


powered by performancing firefox

Why Search Engines will be redundant soon…

August 7th, 2006 § 0 comments § permalink

Part 1: Search and the Web 1.0: Gorblimey!

Those of you who reached here through Google, Yahoo or MSN are probably laughing as you read this. But do go on, there’s more. :-)

(Un)common Recurring Searches

Often our searches are simple keywords crafted with central themes in mind:

  • A name (e.g. Shrikant Joshi or Performancing)
  • A topic (e.g. Corporate Communications)
  • A context (e.g. “Spanish Omelette” +recipe)

Some of us might even burden the spartan box (or in the old days, the Butler) with an entire question. The faithful zombie then crawls its way through the innards of the web, looking for that occasional diamond stashed away in the back alleys. Usually, in common cases such as the ones listed above, results are returned in the correct context of our request. Often, the SERPs also throw up results that are related yet not within context.

Robert Scoble’s post on Optimization had this line that caught my attention:

It all starts with the blog. Now, why can’t I put my blog on the map? When you go to Live.com and search on “Scoble” why can’t I customize my results there with more information for you?

Well, I don’t agree wholly.

Search for my name on Google. There are at least three different people called Shrikant Joshi who turn up in the top 3. We keep exchanging the first three ranks. And all of us are pretty active bloggers it would seem. The see-sawing of rankings in the Organic Search results is not a matter of concern for me. Nor do I want to customise these search results so that I would get more result-space.

I am not a key-word

What are search engines? Simply speaking, search engines are content-aggregators assigned the additional job of classification. As humans we need to have everything classified into a taxonomy so as to facilitate recollection. Our knowledge depends upon storage which in turn depends upon collection and classification of data. Classification helps recollection and hence improves perceptive retention of knowledge.

Or, in simple words:

The more you know, the wiser you are. Hence, classify and remember.

Similar to how we retain knowledge, Search Engines classify the data they crawl according to keywords. A huge index is built up and referenced and cross-referenced until all the possible avenues of keywords linking to pages and vice-versa are covered. But you probably know all that and more already.
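For the curious, the keyword index described above can be sketched in a few lines of Python. This is only a toy inverted index under my own assumptions: the page names and texts are made up, the tokenizer is deliberately naive, and real Search Engines do far more (ranking, stemming, link analysis and so on).

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(pages):
    """Map each keyword to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the pages that contain every keyword in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical pages, loosely based on the example searches above.
pages = {
    "post-1": "Spanish omelette recipe with potatoes",
    "post-2": "A recipe for corporate communications",
    "post-3": "Spanish lessons for beginners",
}
index = build_index(pages)
print(sorted(search(index, "spanish recipe")))  # prints ['post-1']
```

Note that the index only knows which words appear where. It has no idea that "post-2" is not about cooking at all, which is exactly the context problem discussed next.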

Keywords, mmmm… Aah!

The next step would be making sense out of the data, which eventually leads to contextualization. Don’t get it? Well, simply put:

“A search engine’s job is to make sense out of all that data.”

Let’s take a simple case. Someone in your town happens to own a convenience store named Uncle Tom’s Cabin. Let us imagine that an outsider in your city is searching for it. Here’s how the conversation would go:

Outsider: “Where can I find a convenience store?”
You: “That would have to be Uncle Tom’s Cabin. Go straight down for about two blocks and then take a left. It’s right across the street.”
Outsider: “Would I be likely to get some cigarettes there?”
You: “Oh! If you simply want cigarettes, there’s a tobacconist just round the corner!”

A normal conversation, eh? Well, let’s take a look at it again. Only this time, we’ll look at it the way a search engine would. Let’s insert some key words into it for understanding the flow of the conversation:

1. “Where can I find a convenience store?”
[New Search Query, keyword: "convenience store"]
2. “That would have to be Uncle Tom’s Cabin. Go straight down for about two blocks and then take a left. It’s right across the street.”
[Response keywords: "Uncle Tom's Cabin", "directions"]
3. “Would I be likely to get some cigarettes there?”
[Refine Search Query, keyword: "cigarettes"]
4. “Oh! If you simply want cigarettes, there’s a tobacconist just round the corner!”
[Response keywords: "Tobacconist", "Round the corner"]

With me so far? Here’s the stumper:

If each of these sentences corresponded to an entire blog-post in the Blogosphere, how would you track this conversation? How would you rank each post with respect to the keywords? Would those keywords be enough to cover all aspects of the conversation? Would you call those keywords appropriate descriptors of the conversation? Where would these posts appear in SERPs for the combined keywords {“Your Name” +directions}?

To be continued…

Disclaimer:
I am no Search Engine Expert. These opinions are simply my $0.02 worth. Or maybe less. :)


Netscape.com says, “Hi to all Diggers!”

July 26th, 2006 § 0 comments § permalink

Surprised? Well, read on…

Early this morning, someone submitted a story on Netscape.com. And Digg fans all over the world erupted in laughter and glee. Ever since the story was submitted, two pop-up alerts appear when Netscape is loaded into your browser:

The first is a four word expletive, and the second greets “all you Diggers out there!”

The culprit?

A story titled “Unbearable Cuteness”. Ironic, eh? Here’s the what and why of the entire fiasco.

Analysis:
A quick check of the JavaScript on the page reveals this script:

via <a
title="http://www.cute.com"><script>alert("fuck");
alert("Hi to all you Diggers out there ;)");</script>"
href="http://www.cute.com"><script>alert("fuck");
alert("Hi to all you Diggers out there ;)");</script>"
onclick="trackOutbound(15475);">cute.com"><script>alert("fuck");
alert("Hi to all you Diggers out there ;)");

The link that was submitted with the story exploited an XSS (Cross Site Scripting) vulnerability. PacketStorm had already published this vulnerability a month ago on the 6th of June. Apparently netscape.com does not sanitise submitted input before displaying it. As a result, specially crafted JavaScript (like this one) can be used to inject ‘malicious code’ into the page.
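For those wondering what ‘sanitising’ might look like, here is a minimal sketch in Python using the standard library’s html.escape. The function name render_story_link and the shape of the anchor tag are my own assumptions for illustration; I have no idea how netscape.com actually builds its pages.

```python
import html


def render_story_link(url, title):
    """Build an anchor tag, escaping the user-supplied values first."""
    # html.escape turns &, <, > and (with quote=True) both quote characters
    # into HTML entities, so the values cannot break out of the attributes.
    safe_url = html.escape(url, quote=True)
    safe_title = html.escape(title, quote=True)
    return '<a href="{0}" title="{1}">{1}</a>'.format(safe_url, safe_title)


# A payload shaped like the one in the submitted story.
payload = 'http://www.cute.com"><script>alert("Hi to all you Diggers out there ;)");</script>'
print(render_story_link(payload, payload))
```

With the values escaped, the browser shows the script as harmless text instead of running it. Escaping alone is not a complete defence, though: a careful site would also check that the submitted link really is an ordinary http or https URL.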

While Netscape is looking into the matter, Diggers across the globe are having a field day running multiple ‘Ha Ha! Netscape gets hacked!!’ stories. Most of the l33t Diggers are already publishing their insightful comments on the stories, too.

What can I say? There is a child in all of us… :)


‘GTraffic’ is here. Well, almost…

July 25th, 2006 § 0 comments § permalink

I am pretty sure the guys at Google must be sneakily reading my blog. Remember this?
Google has rolled out a special version of Google Maps for your mobile phone (via Google Blog). Well, well, well…

So is it really true, then? Is Google silently creating a presence in all possible verticals? How long will it be before they integrate all things under one roof?

I am still wondering…


One World (Wide Web). One Identity?

July 21st, 2006 § 0 comments § permalink

How many times have you had people sending you invitational eMails saying, “Try this cool site I found!” or “This is an amazing site!” or “You’ll absolutely love this one!” or lines to that effect?

Too many, I suspect.

Web 2.0 and the concept of User Generated Content have had the world in a tizzy for quite a while now. Innovative ideas and domain name registrations seem to go hand in hand. The people riding the waves of the Internet never had it so good. New services are introduced every day and competition is building up before you can say, “Watch out!”

As the Internet grows, as the flood of ideas increases, so will the number of identities. The number of services we use though, will continue to remain the same, maybe a few additions here and there.

Why? Because we are all loyalists to the core. We all have a list of our favorite sites that we visit regularly and we rarely visit the competition. There are innumerable excuses for this loyalty ranging from the old ‘comfort zone’, to the very latest ‘swanky look’, and the geeky ‘amazing feature-set.’

Truth is, we cannot handle multiple identities.

Having multiple identities is similar to owning two or more cell-phones. The greater the number of phones, the greater the interruption. Each cell-phone contributes an identity (in the vaguest sense of the word). Each eMail address is an identity that we have created for ourselves on the WWW. Each profile on a social network is an identity that we maintain.

The number of eMails in your inbox is a fair indicator of the number of identities you have on the Web. And those of us who are actively tracking the development of the collaborative Web must have eMails running into the hundreds.

One idea would be to have a single secure identity that will cater to logins all across the internet. If such an idea were ever to gather support, it would have some interesting implications:

Naturally, this would imply a unique database to cater to all our identities across the web. But who should get the right to create and maintain such a database? The huge set of meta-data that would result would be a statistician’s dream come true! The flip-side of this is obviously the large ‘corporations’ that would give a few arms and legs (or even take a few) to get a crack at this data. (Ok, so I am a li’l partial to scientific research…)

What could be better than acquiring this data?

Having the data on your own servers! MyOpenID, Windows Live ID and Google Account Authentication are a few names in this context. This probably explains why there is intense competition between the Big Three and a few other key players.

If this sounds fairly Orwellian and reminds you of “1984” and Big Brother, you are probably right. :o)

The virtual world we live in closely resembles the Orwellian 1984. Recent cases (Digg v/s Netscape, for instance) indicate as much. Search Engines indexing our content have the power to convey it to the faceless ‘Thought police.’ We have rich-sounding names like User Generated Content and Long Tail. And we have a faceless Big Brother who ‘purportedly’ keeps everything in check.

Makes you wonder: was Orwell right all along?


Markets are *noisy* conversations.

June 27th, 2006 § 2 comments § permalink

Strange, isn’t it?

All of us hailed the coming of a shareable, collaborative web and ‘lovingly’ named it Web 2.0. But along with it came announcements and offerings, options and varieties, faster than anything else. So much so that the low murmur of the internet rose to a harsh, loud, incoherent noise. So much so that we are beginning to denounce it like no other.

Hypocrisy? Nope, I think “Familiarity breeds Contempt” is more like it…

Web 2.0 was a concept. Each one of us interpreted the concept and put forth ideas of our own. As a result, there was a rush of ideas and hence a flood of communication. People started ‘socializing’ on the web. Social networks boomed and people came ‘closer’.

IMHO, it all started with the advent of broadband connectivity. Being ‘always-on’ had a direct implication: being connected with all your near and dear ones. Web 2.0 looked upon the internet as one huge community, with local groups of people inhabiting it. This concept was publicized and then taken too literally. Thus were born the social networks of today.

The community is a market and markets have alternatives. Working on the same lines, social networks began to sprout, each claiming to offer something different from the other. But, the basic objective of these networks was the same – connecting people and conducting conversations across the globe.

The market analogy gives us yet another insight. Every product has competition. And every competitive product has a seller who is willing to canvass for it. The greater the competition, the larger the canvassing and the noisier the market. In the end the market becomes a large noisy mass of voices with nothing audible or coherent.

Get the drift?

The web as a marketplace has been inundated with offerings. The noise in the marketplace will remain until the day the sellers give up or the stocks dry up. Since there is little chance of the latter happening, we will have to wait for the former to happen and pray that it happens sooner rather than later.

The noise of the eMails and IMs that have been flying back and forth has overwhelmed us to the extent that we now want out. But without them, how would we communicate, let alone converse?

Or, are we wrong in assuming that eMails & IMs are the only methods of communication? What if there IS an alternative?

Will things be different?


Google Doodles

June 15th, 2006 § 0 comments § permalink

This post by Doug (a Xoogler) talks about how different people with different visions read differently into one and the same thing. Well, actually he talks about the Google-Dilbert Logo that *almost* caused quite an internal scandal in the Googleplex.

Those of you who have seen it, know what I am talking about. Those who haven’t, follow this link and read this.

Technorati Tags: Xoogler
