BREAKING NEWS & VIEWS
Venuto: Semantic Magic—Infusing Web Content With Meaning

My colleague Rachel Lovinger attended the Semantic Technology Conference in San Jose a few weeks ago. She sees three developments in semantic technologies (autotagging tools, semantic search and Linked Data) that warrant publishers’ attention right now.

These developments are important to note. As the amount of content published on the Web increases at a rapid pace and people have less time to casually peruse it, readers will look for ways to get the information they want more quickly, with less noise in the way. Social networking is one way that people gravitate toward things of interest: links sent or posted by a friend. Users will adopt tools that can accurately predict what they want and provide it on demand.

There’s no better example of this ADD behavior than our findings from a recent user-research study we conducted with adults 18-25 years old. When asked to find information about a particular musician, almost all users started their search with Google (no surprise there). The surprise came when this group of research subjects interacted with the list of search results. They spent zero time looking beyond the very first result. Many of them immediately and automatically went to the first result on the page without even scanning the other results.

The more publishers can infuse flat content (and flat Web pages) with detail and meaning, the more valuable that content is to users. Content is certainly king, but only if users can find it in the context in which they are looking for it. Publishers must now start considering how the content they produce is formatted, in what structure and with what metadata; otherwise they run the risk of their content becoming “invisible.”

The publishing industry’s interest in semantic technologies was evident in its strong presence at the conference.
Tom Tague of Thomson Reuters’ Calais initiative was the opening keynote speaker, and the closing keynote was given by Rob Larson and Evan Sandhaus of the New York Times. The agenda also squeezed in a publishing panel that featured representatives from Huffington Post, Hearst, Tribune Company, CBS Interactive and the New York Times.

Autotagging Tools

There are a variety of approaches to this, but the basic concept is that a program runs through an article or an archive of content, automatically extracting concepts and suggesting appropriate metadata (e.g., tags, labels, categories). Not only does this significantly reduce the editorial burden and provide the means to aggregate related content on the site, but it also creates the opportunity for more meaningful distribution of content by way of RSS, email subscriptions, content partnerships, APIs, etc.

Calais is an open source tool that does this automatic tagging. Tague also mentioned a project called OpenPublish, which combines Calais with Drupal to provide an open-source publishing platform in a box. Other tools build on these basic autotagging capabilities. Digger can automatically create topic pages and semantic SEO tools. Tools like Zemanta and Apture take a step back: they don’t add tags to the content itself, but they do extract concepts and add highly relevant related links and media. Perhaps they are not as appealing to publishers creating high-quality original content, but these tools may be incredibly useful utilities for sites aggregating content or those simply seeking to enhance original content with content curated from other reliable sources on the Web.

Semantic Search

A variety of companies are working to create search engines that better understand natural language queries or make guesses about the types of tasks a person is trying to accomplish.
However, what should make publishers sit up and take notice is the fact that market leaders Google and Yahoo have begun supporting RDFa and microformats, tech speak for two content formatting standards that give more detail about the type of content being presented on a page. Pages with this kind of markup give search engines more information about how to display the results, making those results more useful and compelling to users. Here’s an example of Google’s Rich Snippets treatment for a restaurant review.

Yahoo calls its support of enhanced search results SearchMonkey, and it goes beyond reviews and people to include formats for Citysearch, Wikipedia, Hulu and Facebook. Check out this enhanced search result for a restaurant, an Eames armchair or how Hulu’s video for a Simpsons episode is handled. Facebook also has an enhanced format. As an open platform, Yahoo SearchMonkey is also taking format submissions for additional types of content from other sources. Enterprising publishers with unique and desirable content have the opportunity to work with Yahoo and create more useful, informative and engaging layouts for their search results.

Google’s Rich Snippets isn’t an open platform, but Google is receptive to feedback and suggestions from the community, and is interested in providing support that’s compatible with Yahoo Search. So it’s possible that Google will eventually follow Yahoo’s lead and support a much wider range of content types.

Linked Data

This is a growing movement that proposes that the Internet should be a web of linked data, not just a web of linked pages and documents. A core principle of the semantic Web has been scaled back to a more modest approach. At the simplest level, the goal is to have a common frame of reference so that when you’re talking about Jane Smith and I’m talking about Jane Smith, it is unambiguously clear that we are talking about the same Jane Smith.
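To make the Jane Smith point concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than anything presented at the conference: the example.org identifiers, the two publishers' records and the `link_by_identifier` helper are all hypothetical. It only shows why a shared identifier, rather than a name, lets content from different sources be connected without ambiguity.

```python
from collections import defaultdict

# Hypothetical identifiers: these example.org URIs stand in for whatever
# shared vocabulary two publishers might agree on.
JANE_SMITH_AUTHOR = "http://example.org/people/jane-smith-author"
JANE_SMITH_SENATOR = "http://example.org/people/jane-smith-senator"

# Two publishers each tag their articles with an identifier, not just a name.
publisher_a = [
    {"headline": "Jane Smith publishes new novel", "about": JANE_SMITH_AUTHOR},
    {"headline": "Jane Smith wins Senate seat", "about": JANE_SMITH_SENATOR},
]
publisher_b = [
    {"headline": "Interview: Jane Smith on writing", "about": JANE_SMITH_AUTHOR},
]


def link_by_identifier(*sources):
    """Group headlines from any number of sources by the entity they describe."""
    grouped = defaultdict(list)
    for source in sources:
        for item in source:
            grouped[item["about"]].append(item["headline"])
    return dict(grouped)


linked = link_by_identifier(publisher_a, publisher_b)
for identifier, headlines in linked.items():
    print(identifier, headlines)
```

Matching on the name alone would merge all three headlines under "Jane Smith"; matching on the identifier keeps the novelist and the senator apart while still joining the two publishers' coverage of the novelist.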
The big question is: What is the common frame of reference? Some major sources of data are emerging to facilitate these linkages. Organizations such as the BBC, which discovered that they had a rich store of useful data locked away in their enterprise, have decided to open it to the public. The New York Times is also making moves in this space, announcing at the Semantic Technology Conference its intention to publish its index, a vocabulary of 500,000 tags that has been used to annotate all of its articles going back to 1851, in a form available to the Linked Data community. People will soon be building tools, creating mashups, doing research and exploring the Times’ content in ways they have never done before. If it seems crazy for the New York Times to be giving away this resource, just think about the incredible position it will be in as a cornerstone of this step in the evolution of online content distribution, use and reuse.

Domenic Venuto is SVP and Head of the Media and Entertainment Practice with the New York City office of Razorfish, and is one of our exclusive Minsiders, veterans from the print, digital, advertising and services industries who contribute columns to minonline. The author gratefully acknowledges the contributions to this column from Rachel Lovinger, Content Strategy Lead with the New York City office of Razorfish.

If you have breaking news to share, please contact Steve Smith at ssmith@accessintel.com
Copyright © 2012 Access Intelligence, LLC. All rights reserved. Reproduction in whole or in part in any form or medium without express written permission of Access Intelligence, LLC is prohibited. For more details please see Terms and Conditions.