Web Within Web : Where one-size-fits-all “Web Search” doesn’t fit!

[Guest post by Manas Garg. He puts forth an interesting viewpoint on search vs. navigation and on how search fails to serve many kinds of user intent. Do share your comments, and if you want to submit a post, please use this form]

Search is not an application. It is not a feature. It is a navigation method. It is a must-have for any website.

Fortunately, for most websites, Google is there so they don’t have to do their own search. That’s why mysql.com doesn’t have to do search. Google does it for mysql.com. But Google’s one-size-fits-all doesn’t really fit all. SlideShare cannot rely on Google search. Flickr cannot rely on Google search. They have to do their own search. A flight booking portal cannot rely on Google search. It has to do its own search of flight availability.

Now we are getting into an interesting area. We are talking about what Google search cannot do.

But before I dive deeper into what Google search cannot do, I must lay down the yardstick for measuring success in search. Since search is primarily a navigation thing, and navigation is done best when it is almost invisible to people, search is done best when people don’t realize that they are searching. It just flows. I search, I get a definite winner among the top results, I don’t even register that I searched. However, search is bad when I realize that I am searching. If I want to know the best restaurants in Bangalore and I search for them on Google, what do I get? Is there a definite winner? No. So, Google search has failed.

So, where does Google search fail? In several places. In fact, it’s amazing that Google search fails in so many places. And it’s even more amazing that the number of such places is increasing. Let’s look at some broad categories:

1. A rather closed cluster of information. Google search ranks a page by its popularity across the web for the given keywords. But consider SlideShare. It’s a small web within the World Wide Web. Consider Scribd. Consider video sharing sites. Consider social networks. Consider FriendFeed. Consider Twitter. The rules that determine the popularity of content at these places are different. On SlideShare, it matters how many times a PPT was viewed, how many comments it has, how many users saw it through to the end and how many left partway. It has its own mechanism for determining the popularity of a PPT. Search for “website scalability” on Google and on SlideShare and compare the quality of the results. Twitter? Isn’t a lot of interesting stuff happening there? Can Google mine it the way it mines general web pages? The rules for ranking are different.

So, Google search fails (well, mostly ;) when it comes to a web within the web.
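
To make the point concrete, here is a minimal sketch of what a site-specific ranking could look like. Nothing here is SlideShare’s actual method; the `Deck` fields, the `engagement_score` formula and its weights are all assumptions made up for illustration. The only idea taken from the text above is that views, comments and completion matter inside such a cluster.

```python
from dataclasses import dataclass


@dataclass
class Deck:
    title: str
    views: int
    comments: int
    completions: int  # viewers who reached the last slide


def engagement_score(deck: Deck) -> float:
    """Hypothetical score: reward views and comments, scaled by completion rate."""
    if deck.views == 0:
        return 0.0
    completion_rate = deck.completions / deck.views
    # The weight of 10 per comment is an illustrative assumption, not a tuned value.
    return completion_rate * (deck.views + 10 * deck.comments)


def rank(decks: list[Deck], query: str) -> list[Deck]:
    """Keep decks whose title contains every query word, best score first."""
    words = query.lower().split()
    matches = [d for d in decks if all(w in d.title.lower() for w in words)]
    return sorted(matches, key=engagement_score, reverse=True)


decks = [
    Deck("Website scalability basics", views=1000, comments=5, completions=200),
    Deck("Website scalability at a startup", views=400, comments=20, completions=300),
    Deck("Intro to photography", views=5000, comments=50, completions=4000),
]
ranked = rank(decks, "website scalability")
for deck in ranked:
    print(deck.title, engagement_score(deck))
```

In this toy data the deck with fewer views ranks first because far more of its viewers finished it, a signal that a link-based web ranking simply does not have.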

2. Not everything on the web is a webpage. And Google’s one-size-fits-all doesn’t work for non-webpages. That’s why Google has several other specific search implementations: Images, Code, Groups, Blogs etc. And there are many that are yet to come: Word documents, PDF, PPT, microblogs, blog comments and what not. I can go on but you get the idea, right? Google is a clear winner for webpages, but there are several other things on the web which are not webpages and which need to be searched, and Google either doesn’t search them or doesn’t search them with equal maturity. What’s even more interesting is the overlap between point #2 and point #1. A lot of things on the web which are not webpages are found in closed clusters, which have their own custom methods for ranking.

3. There is also data on the web. The list of available flights from Delhi to Mumbai, with their prices, airline names, flight numbers, available seats, timings etc., is data. It is not a webpage or something like a PDF which can simply be displayed. This is data, and it can be shown and used in interesting ways. The same goes for hotels: the list of hotels in a city and the various attributes of those hotels, like location, room sizes, ambience, facilities etc. If you are wondering what search has to do with all this data, let me remind you: search is a navigation method. When I am going to a new city and I want to decide which hotel to stay in, my decision is based on various such parameters.

But Google search doesn’t work here. And it will not. The purpose is different. Google search is for heading to a page whose URL I don’t know but I know some keywords pertaining to that page. It is not for viewing the data.
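
A rough sketch of the difference: searching data means filtering and sorting on fields, not matching keywords against pages. The `Flight` record, the airline names, fares and the `search_flights` helper below are all invented for illustration; a real booking portal would query live inventory.

```python
from dataclasses import dataclass


@dataclass
class Flight:
    airline: str
    origin: str
    destination: str
    departure: str  # "HH:MM", 24-hour clock
    price: int      # fare in rupees (made-up numbers)
    seats_left: int


def search_flights(flights, origin, destination, max_price=None):
    """Filter on structured fields, then sort by price and departure time."""
    hits = [
        f for f in flights
        if f.origin == origin
        and f.destination == destination
        and f.seats_left > 0
        and (max_price is None or f.price <= max_price)
    ]
    return sorted(hits, key=lambda f: (f.price, f.departure))


flights = [
    Flight("Airline A", "Delhi", "Mumbai", "06:30", 4200, 12),
    Flight("Airline B", "Delhi", "Mumbai", "09:15", 3800, 0),
    Flight("Airline C", "Delhi", "Mumbai", "18:45", 3500, 3),
    Flight("Airline A", "Delhi", "Pune", "07:00", 3900, 20),
]
results = search_flights(flights, "Delhi", "Mumbai", max_price=4500)
for f in results:
    print(f.airline, f.departure, f.price)
```

The sold-out flight and the Pune flight drop out, and what remains is ordered by the parameters I actually decide on. There is no page to rank here at all.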

4. Search as an integral component of a service. I could have taken the trip planner example for this, i.e. booking a flight to a city, booking the cab and the hotel, deciding on the sightseeing etc. But since that’s already widely talked about, I’ll take something else, just to show the wide applicability of search and how we need to see search from a different perspective.

Let’s talk about phulki.com. It’s a search engine for desi music. Its interface looks very much like Google search. But is that how it should be? No. I go to Google to navigate to the final destination page. Go to Google, search for “mysql linux download”. Click on the top result and go to mysql.com page. But I go to phulki.com for listening to the music online or for downloading it. The purpose is different. If I search for “Maula mere le le meri jaan”, I don’t want 10 results showing me the same song from different sources. I don’t care which source it comes from. I just want the best listening/downloading experience. So, show it to me as one song with a drop-down list of various sources (the supposedly best one auto-selected) and a download link (yes, just a single download link) and again a drop down list of sources for download.
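
The “one song, many sources” idea could be sketched roughly as below. This is not how phulki works; the hit format, the `site-*.example` hosts and the use of bitrate as a stand-in for “best source” are assumptions for illustration only.

```python
from collections import defaultdict


def group_by_song(hits):
    """Collapse raw hits into one entry per song, with sources best first."""
    grouped = defaultdict(list)
    for hit in hits:
        # Normalise case and spacing so trivially different spellings merge.
        key = " ".join(hit["title"].lower().split())
        grouped[key].append(hit)

    songs = []
    for group in grouped.values():
        # Higher bitrate stands in for "best source"; this is an assumption.
        sources = sorted(group, key=lambda h: h["bitrate_kbps"], reverse=True)
        songs.append({
            "title": sources[0]["title"],
            "default_source": sources[0]["source"],   # auto-selected in the drop-down
            "sources": [h["source"] for h in sources],
        })
    return songs


hits = [
    {"title": "Maula Mere Le Le Meri Jaan", "source": "site-a.example", "bitrate_kbps": 128},
    {"title": "maula mere le le  meri jaan", "source": "site-b.example", "bitrate_kbps": 320},
    {"title": "Maula mere le le meri jaan", "source": "site-c.example", "bitrate_kbps": 192},
]
songs = group_by_song(hits)
print(songs)
```

Three raw hits become one result with a pre-selected source and a list for the drop-down, which is what I want to see as a listener.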

If I search for “chak de india”, it should figure out that it’s an album and show me a list of songs from that movie, nicely ordered with an option to put all songs of that movie in my playlist with a single click.

And how about “Those who listened to Maula-mere-le-le-meri-jaan also listened to ek-parinda-aisa-toota”?
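
A naive version of this is just co-occurrence counting over listening histories. The sketch below assumes a hypothetical `histories` mapping from user to the set of songs they played; a real service would need far more data and care than this.

```python
from collections import Counter


def also_listened(histories, song, top_n=3):
    """Count songs that share a listener with `song`, most frequent first."""
    counts = Counter()
    for listened in histories.values():
        if song in listened:
            counts.update(s for s in listened if s != song)
    return [title for title, _ in counts.most_common(top_n)]


# Made-up listening histories, keyed by a hypothetical user id.
histories = {
    "user1": {"Maula Mere Le Le Meri Jaan", "Ek Parinda Aisa Toota"},
    "user2": {"Maula Mere Le Le Meri Jaan", "Ek Parinda Aisa Toota", "Chak De India"},
    "user3": {"Chak De India"},
}
suggestions = also_listened(histories, "Maula Mere Le Le Meri Jaan")
print(suggestions)
```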

It will have technical challenges. It will require phulki to index differently. But phulki is not a search engine for desi music! It is a website for playing desi music!! Search is just the navigation model for reaching the music that I want.

Take local search, for example. When I search for an electronic appliance shop in the Pune Camp area, I want not only the name/phone/address; I would also like to be shown the reviews people have left for those shops on different sites.

Summary

There are a lot of contexts in which general-purpose “web search”, as provided by Google and others, doesn’t work (at least not with as much maturity). And it’s interesting to note that such contexts are growing. The number of sites which are a web-within-the-web is increasing. More and more non-webpages are making an appearance on the web. Web-based applications have started bringing in data (hotels in a city is data, not a webpage; let’s accept this). And finally, there are several web-based applications coming up (like phulki, local search) where search is an integral part of the workflow.

These trends spell an interesting future for search but that’s a topic for another article some other time.

What’s your take?

[About the author: Manas is interested in a variety of things like psychology, philosophy, sociology, photography, movie making etc. But since there are only 24 hours in a day and most of it goes in sleeping and earning a living, he amuses himself by writing software, reading a bit and sharing his thoughts.]


    12 Responses to Web Within Web : Where one-size-fits-all “Web Search” doesn’t fit!

    1. vikas says:

      Can a business model be built around search for business applications? Meaning, if I post that I want to search for people who would be ready to buy my latest piece of software, the people working in the same space, or customers, could be interviewed on the web and a research report given on a cost basis.

    2. Nice Post .

      I think one place where Google search fails is whenever you need to create information at runtime (airline fares); it’s a search based on data and metadata, not on semantic analysis of the data itself.

      In that sense, you might not realize it, but your post has nothing to do with search and Google. The basic premise you are challenging is an age-old conflict in philosophy: logical argument vs. sophistry. PageRank is an exercise in sophistry. Semantic web means logic. Since “answers” are always subjective, I don’t see an end to this conflict any time soon.

    3. Anshul says:

      Manas,
      I am confused as to why you have selected mysql.com as your example, as it is using the Google Search Appliance. This is far different from the Google custom web search that makes up the majority of your talk and cannot be customized much.

      As far as I know, the Google Search Appliance (part of their enterprise offering) can be adapted very well to your content and is capable of indexing more than 200 formats including presentations, images, databases, etc.

      The only catch is, it’s not free. But if you are able to tell more about your data, it will index it and make it searchable.

      At the bottom of every search is the indexing algorithm and ranking algorithm. And enterprise search offerings let you modify both according to your needs.

      If you have money, you probably don’t need to devise your own search.

    4. Prashant says:

      The examples that have been mentioned (flight booking) rely on very dynamic data (e.g. the next day’s flight schedule can’t be planned 7 days back). But then, nice abstract thinking.

    5. naman says:

      What about Linked CSE from Google? Doesn’t that solve all the problems?

    6. Manas says:

      @PrashantSingh :) I am not sure if the article is about the debate between logical argument Vs sophistry. Intent was to highlight the emerging areas where the application of search is required but the general-purpose-web-search doesn’t fit. I do not see search following a logical argument approach for a long time to come.

      @Anshul I didn’t mean mysql using Google appliance. What I meant to say was “If you have a website, you need to provide navigation to reach different pages of that site. One method is good old navigation which spreads links all over the pages. And the other navigation method as I mentioned in the previous article is ‘search’. You can navigate to any page of mysql by doing a search on Google for that page. Hence, mysql need not provide a search based navigation solution for its website. That’s the case with most of the websites.”

      @Naman In my opinion, no. CSE doesn’t solve these problems. It’s a way to restrict the search to specific sites and then put some refinements to those results. It cannot be applied to the examples that I have given.

      • Anshul says:

        My Bad,
        But in that case I believe the use cases you mentioned were already there; instead, the age-old web search just got better at scouring different types of info. Airline search, social network search, etc. are now handled in a better way (at least Yahoo! does it).

        On the contrary, even for a simple textual website, reliance on the general google.com search is bad, as you don’t have significant control over the page ranks from your own website. You are totally at the mercy of the google.com or Yahoo.com search engine algorithm. This is bad in many ways, as your rankings are mostly a function of what others think, not what you know is best for the information seeker. One such case is a new article on your blog which is not yet indexed by google.com.

        So yes, search has become a mode of navigation, but I strongly advise you to implement it on your own end rather than expecting people to use google.com with the site:xyz.com option. Even for normal text-based sites.

        • Manas Garg says:

          Yes. That’s why search is going to have an interesting future. It will be very interesting to see what directions it takes. I have some thoughts and would put them down one day.

    7. Sameer says:

      Very interesting discussion and Manas makes some relevant points.

      Various approaches to looking at other views of search have evolved over the years – and vertical search engines solve some of them. Google is more a “search the large static repository” without taking into account validity, transientness and other similar attributes of data – and it works fine as a very common denominator search for a huge variety of needs. Creators of content and retrievers of it indirectly collaborate and put in their collective efforts into finding each other. SEO spoils the model a bit, and anything that’s not popular, yet has no commonly known “distinguishing” terms, is screwed. Even so, it solves a large set of problems.

      Vertical searches know a domain deeper, represent a cleaner set of data, which is also more validated. There’s also context around each term – richer metadata – and a browse experience helps. Auto sites, dpreview are good examples. This works even with static data, really.

      As search migrates to the phone too, the pure websearch model will totally not work there – more for behavioral (how you use the device vs the desktop) reasons than connectivity/technology ones. It also opens search up to a large, slightly less savvy audience. Both the depth of data, and the breadth of it, will become critical in this scenario. At the same time, users will expect some level of validation/correctness – not just in terms of matching terms but also for “values”.

      There’s also the huge issue of non-digitized data. The web does represent a large amount of information. The big question is, are there huge niches of info/data/knowledge that are just not *there* on the web – at least not on the static, crawled one. This is true for most parts of the world, but especially so for heavily networked (not online, tho) places like India.

      Search, truly, is in its infancy – we’ll find ourselves very amused a decade or so from now :)

    8. Manas Garg says:

      The structure of the web is changing and the applications on top of web are also changing in nature. That’s going to take search in a different direction altogether.

      Google is such a killer today because from its home page, I hop over to anywhere. But as more and more vertical search engines gain prominence, “Google Web Search” will have to define a new role for itself.

    9. Pingback: Interesting Comments Roundup |Technology and Business Startups in India

    10. Rupesh says:

      Since you have talked about music, one more random thought: what if some music search engine let you hum the tune and search for it… this might be useful when you don’t like the song but can’t remember the lyrics or album information.
      wat say?