Search vs. Subscribe

Kevin Hale has written an article at ParticleTree which talks about the increasing importance of RSS, which in effect means the importance of subscription-based information consumption (as opposed to search). This has profound implications for the business model of, say, Google because it reduces the relevance of plain old searching – people are already having a certain amount of their customised information needs met automatically.

I’ve noticed that on my blog, about 50% of the referrals come from Google and 50% from Technorati. I would imagine that for most blogs with semantic tagging functionality, this would be roughly the same.

Around two years ago, the picture was completely different – almost every referrer was Google (with a few from Yahoo Search, MSN Search etc.). This shows that the “push” model of publishing and the “aggregation” model of reading are really taking off.

Within a short time, will we be typing in “christina_aguilera” as a social tag rather than a Google search term? Google, it seems, is trying to make that question irrelevant. Those at the forefront of web technology believe that Google is working on a system to unify basic search and the semantic keywords usually associated with sites like del.icio.us and technorati. This is, not coincidentally, why I chose to register (syn).onymo.us – tag unification. Semantic browsing. Imagine typing in a word and being able to navigate, not just via the usual links, but also through the dimensions of meaning created by tagging: Start at “Royalty”, navigate to “Empire”, “Rome”, “Italy”, “Pasta”, “Atkins Diet” and so on.
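To make the "dimensions of meaning" idea a bit more concrete, here is a rough sketch of how tag-to-tag navigation could work: treat two tags as neighbours whenever somebody has put both on the same page. Everything here is made up for illustration (the pages, the tags, the function names); it is not how del.icio.us, technorati or Google actually do it.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical data: page URL -> set of tags that users applied to it.
TAGGED_PAGES = {
    "example.org/crowns": {"royalty", "empire"},
    "example.org/caesar": {"empire", "rome"},
    "example.org/colosseum": {"rome", "italy"},
    "example.org/carbonara": {"italy", "pasta"},
    "example.org/low-carb": {"pasta", "atkins diet"},
}


def build_tag_graph(tagged_pages):
    """Link two tags whenever they appear together on at least one page."""
    graph = defaultdict(set)
    for tags in tagged_pages.values():
        for a, b in combinations(sorted(tags), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def related_tags(graph, tag):
    """Tags one hop away from `tag`, sorted so the display order is stable."""
    return sorted(graph.get(tag, ()))


graph = build_tag_graph(TAGGED_PAGES)
print(related_tags(graph, "royalty"))  # ['empire']
print(related_tags(graph, "empire"))   # ['rome', 'royalty']
```

Each hop in the "Royalty" to "Atkins Diet" walk is then just another call to `related_tags` on whichever tag you clicked. A real system would also need to merge synonyms and rank neighbours by how often they co-occur, which this sketch ignores.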

You could develop a seriously awesome browsing experience if that was the case. But wait a second – how do you create that seamless semantic browsing experience without some ghastly frames-based abomination that breaks half the web? You would need some sort of rich client-based front end.

Google has hired some Mozilla.org developers, so one can only assume that they’re serious about releasing browser-integrated functionality. They also have an IE toolbar team in there somewhere, because they released the Googlebar for IE. What we’ll probably see, in my completely irrelevant opinion, is Googlebars (or even whole new main toolbar buttons, right next to “back”, “forward” and “stop”) which provide intuitive ways to browse the web by topic.

So Google, while you’re at it, can I request something? Trust relationships. An easy way to say “this person knows their shit” or “this person is full of shit” (or a spammer) and have that information aggregate through my existing trust network to others who think I’m not full of shit. That way, spammers will be relegated to trust “islands” (unless they can somehow hijack the system) while the rest of us build strong social links that increase the quality of the browsing and aggregation experience.
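Here is a minimal sketch of what I mean by trust aggregating through a network. It assumes each person rates others from -1.0 ("full of it") to 1.0 ("knows their stuff"), and that my opinion of a stranger is the average of my trusted contacts' ratings, weighted by how much I trust each contact. The names, scores and the one-hop rule are all my own assumptions, not a description of any existing system.

```python
# Hypothetical direct ratings: rater -> {rated person: score in [-1.0, 1.0]}.
RATINGS = {
    "me": {"alice": 1.0, "bob": 0.5},
    "alice": {"carol": 1.0, "spammer": -1.0},
    "bob": {"carol": 0.5, "spammer": -1.0},
    "spammer": {"spammer2": 1.0},
    "spammer2": {"spammer": 1.0},
}


def inferred_trust(ratings, viewer, target):
    """Return viewer's trust in target, or None if the network has no opinion.

    A direct rating always wins. Otherwise, average what the viewer's
    positively trusted contacts say, weighted by the viewer's trust in them.
    """
    mine = ratings.get(viewer, {})
    if target in mine:
        return mine[target]

    weighted_sum = 0.0
    total_weight = 0.0
    for contact, weight in mine.items():
        if weight <= 0:
            continue  # ignore the opinions of people the viewer distrusts
        score = ratings.get(contact, {}).get(target)
        if score is None:
            continue  # this contact has never rated the target
        weighted_sum += weight * score
        total_weight += weight

    if total_weight == 0:
        return None  # nobody the viewer trusts vouches either way
    return weighted_sum / total_weight
```

With this data, `inferred_trust(RATINGS, "me", "carol")` comes out positive, `"spammer"` comes out at -1.0, and `"spammer2"` comes out as `None`: the two spammers vouch for each other, but nobody I trust vouches for them, which is the "island" effect. A real version would follow more than one hop and would have to resist the hijacking I mentioned.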

By storing and propagating trust information (and I realise what a hurdle it will be to convince the tinfoil hat brigade that this isn’t the beginning of World War III on privacy), Google can maintain and extend its relevance beyond simple search.

If they don’t do it, someone else will do it via P2P and within 5 years they’ll disappear.

Ok, that last bit was just me being provocative 😉

p.s. Also, please please include some kind of rich support for geographic browsing – and I don’t just mean “where’s the nearest pizza joint”, I mean “what events are happening in a particular region between these dates”, or “who carpools between these locations at 8am from Monday to Friday?”. I know, I’m a whiner.
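For what it's worth, the "events in a region between these dates" query is not complicated once the data carries coordinates and dates. A rough sketch, assuming a simple latitude/longitude bounding box and entirely invented events (the class and function names are mine, and it ignores regions that cross the 180° meridian):

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    name: str
    lat: float
    lon: float
    day: date


@dataclass(frozen=True)
class Region:
    """A plain bounding box in degrees; hypothetical, for illustration only."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat, lon):
        return self.south <= lat <= self.north and self.west <= lon <= self.east


def events_in(events, region, start, end):
    """Events inside the region whose date falls within [start, end]."""
    return [
        e for e in events
        if region.contains(e.lat, e.lon) and start <= e.day <= end
    ]


# Made-up sample data.
EVENTS = [
    Event("Harbour festival", -33.86, 151.21, date(2005, 9, 10)),
    Event("Tech meetup", -33.88, 151.20, date(2005, 10, 2)),
    Event("Book fair", -37.81, 144.96, date(2005, 9, 12)),
]
SYDNEY = Region(south=-34.2, west=150.5, north=-33.5, east=151.5)

september = events_in(EVENTS, SYDNEY, date(2005, 9, 1), date(2005, 9, 30))
print([e.name for e in september])  # ['Harbour festival']
```

The hard part is getting publishers to attach the location and date in the first place, not the filtering.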

Update: I just noticed that Yahoo’s Y!Q service does something similar to what I’m talking about with the toolbar. They also provide an example web page with Y!Q integrated. It’s quite cute, but Yahoo’s branding is less appealing and subtle than Google’s. I know that _shouldn’t_ be important, but it is. If using Y!Q means having a whacking great floating div appear over your website rendered in vomitous purple and yellow, then how many serious websites are going to do it?

45 Replies to “Search vs. Subscribe”

  1. A few points..

    RSS subscriptions and searches are optimal for different sorts of information requirements, but if there wasn’t RSS there would be a lot more searching, because it’s the next best thing for regular information intake on a select group of topics. Ergo, RSS takes away from searching.

    Also, the fact is that if you subscribe to a particular publisher, there is no way that Google can deliver the information before the “lazy-ass publisher” decides to serve it – RSS and mainline content are served at the same time.

    There is quite a bit of functional overlap between the two (search v subscribe) at sites like technorati, which offer blogosphere searching by semantic keyword, and have much better tracking of the “web zeitgeist” through things like the tag page: http://www.technorati.com/tag/.

    As for not liking browser extensions, speaking as someone who has fixed computers for a living, I would say you’re in the minority. I personally hate them too, but if it was done in a platform and corporation-agnostic way (i.e. other search engines could provide the same API and users could switch, a la the Firefox search box), I could see it taking off like wildfire.

    And one final note: It’s not the system that tells a genuine new author apart from new spam, it’s people. People *are* the system. If you are a human, and not a complete hermit, then you will know other humans who can give you a “leg up” into the trust network. Have you ever used PGP? Same deal.

    Thanks for the comment!

    Dan

  2. Hm – it strikes me that we aren’t breaking down the information taxonomy clearly enough to be talking about exactly the same things. So here’s my attempt:

    Information Distribution/Publishing Models:
    (not publishing on your site, but insertion into the “broader web”, i.e. URL distribution channels)
    Crawl. A spider crawls your site at certain intervals determined by an heuristic which estimates the rate at which data changes on your site (see “Google Sitemaps” for an attempt to optimise this)
    Push. The site “pings” other sites to let them know when new content has been added. Those sites then index the new content. Technorati is a prime example.
    Pull (or “Subscribe”). A local or remote RSS aggregator collects newly published articles from a user-selected list of sites, at defined intervals or as a result of pings.
    Folksonomy. Users who browse part of your site publish a set of semantic keywords describing the content in a social bookmarking system.

    Information Dissection/Refinement Models:
    Basic keyword search. Could be any word in the page. Suffers from the “naive indexing” model: not all words on a page accurately describe the content, and determining relevance relies on highly sophisticated statistical models which are engaged in an arms race with spammers.
    Semantic keyword search. Only finds pages that users or publishers have specifically associated with that keyword. Higher quality, but suffers from “synonym syndrome” and relies on having an active user community tagging new pages.

    Either of the information dissection models could be applied to any of the distribution models above, and both have their pros and cons.

    They could also be applied to either locally cached (“subscribed”) data or data out there on the web, or both.

    In answer to your RSS question, what sort of information is delivered depends on the site you’ve subscribed to. For example, you can subscribe to feeds which give you the latest sites tagged with particular keywords at del.icio.us, which is kind of like a “live updated search” for that keyword (e.g. j2ee). The usual model is that you just subscribe to a site, or a topic on a site (e.g. “Technology” at “Washington Post”), and it is aggregated with your other feeds and retrieved at a given interval by your RSS software or site.

    The “people go to web-sites that collect the information that would otherwise be RSSed to them” idea is just an RSS aggregator built as a web application. Google is trying something like this with google.com/ig. Microsoft is experimenting with it at start.com. It’s one way the search giants are trying to keep people coming to web apps for news aggregation rather than having rich desktop clients.

    This comes back to my original point, where I said “Those at the forefront of web technology believe that Google is working on a system to unify basic search and the semantic keywords usually associated with sites like del.icio.us and technorati”. I think Google is going to absorb the threat of subscribed and semantic data on several fronts, by enriching the search experience so that it uses new folksonomy keywords and by integrating RSS-style user-configurable data feeds into traditional search.

    This is why I said that “Google is making the question irrelevant” (i.e. search vs. subscribe, crawl vs. ping vs. folksonomy) by integrating all the methods into their “Google account” and providing multi-dimensional search across the web.