This new meta-data system for fishing tasty information from the vast web sea is pretty amazing:
- In short, there might be information hidden on the web that cannot be gleaned from any individual page, but becomes apparent when many pages are examined together. And that information could be of great commercial value.
The result is a new online service called WebFountain. A big computer at IBM hoovers up web pages and information from other sources such as newsgroups, syndicated content and newswires. Each incoming page is analysed to determine what language it is in. The context—a news report, a page on a company's website, a web-log entry—is determined. Verbs, nouns, adjectives, proper nouns, place names and even entire phrases are extracted, and are analysed for positive or negative connotations. The page is also classified by category - is it about baseball, Iranian politics or global warming?
All the results from these various tests are then fed upwards into another layer of software that gathers statistics across multiple pages, counting references to particular words or phrases in particular contexts, and looking for trends. All of this is then wrapped up in another layer of software that allows users to query the system remotely across the internet as a "web service".
Dr Tomkins hopes to create an ecosystem of service providers who will use the WebFountain service to analyse the web in different ways to serve different markets. A clipping service, for example, which monitors the press for mentions of a particular company or product, could easily be constructed using WebFountain. A corporate public-relations firm could use WebFountain to monitor public attitudes towards its clients or track which other firms they are mentioned alongside. How have new products been received by different age groups? Are customers grumbling about a product in one part of the world, but not in another? WebFountain can send an alert if anything unexpected happens, such as a sudden surge in mentions of a particular keyword. [The Economist]
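To make the "count mentions across many pages, then alert on a surge" idea concrete, here is a minimal Python sketch. It is not how WebFountain works internally, since the article gives no such detail. The `(day, text)` input shape, the function names `daily_mentions` and `surge_days`, the naive substring match, and the "three times the recent average" threshold are all assumptions made for illustration.

```python
from collections import Counter
from statistics import mean


def daily_mentions(pages, keyword):
    """Count how many pages mention `keyword` on each day.

    `pages` is an iterable of (day, text) pairs (an assumed input shape).
    Matching is a naive case-insensitive substring test.
    """
    counts = Counter()
    needle = keyword.lower()
    for day, text in pages:
        if needle in text.lower():
            counts[day] += 1
    return counts


def surge_days(counts, days, factor=3.0, window=3):
    """Return the days whose count exceeds `factor` times the average
    of the preceding `window` days (hypothetical alert rule)."""
    alerts = []
    for i in range(window, len(days)):
        baseline = mean(counts[d] for d in days[i - window:i])
        # Floor the baseline at 1 so a quiet keyword does not alert on a single mention.
        if counts[days[i]] > factor * max(baseline, 1):
            alerts.append(days[i])
    return alerts


# Made-up sample data: one mention a day, then five on Thursday.
pages = [
    ("mon", "Acme ships a new widget"),
    ("tue", "Nothing to see here"),
    ("tue", "More on the acme widget"),
    ("wed", "Acme quarterly news"),
] + [("thu", "Acme recall sparks anger")] * 5

days = ["mon", "tue", "wed", "thu"]
counts = daily_mentions(pages, "acme")
print(surge_days(counts, days))  # ['thu']
```

No single page here says anything unusual; the signal only shows up once the per-day counts are compared, which is the point the article is making.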
I believe the first result they will glean from this is that Blogcritics rules.