folksonomy
Library of Congress Reports on Flickr Pilot
Anonymous — January 7, 2009 - 10:04pm
[This post is cross-posted on my personal blog]
Last month the Library of Congress released the report on its ongoing Flickr project, which I have been very interested in and have written about as the project progressed. From their blog post on the report:
"Only nine months into the Library of Congress’ pilot project placing Library photos on the Web site Flickr, the photos have drawn more than 10 million views, 7,166 comments and more than 67,000 tags, according to a new report from the project team overseeing the lively project."
“The popularity and impact of the pilot have been remarkable,” said Michelle Springer, project manager for digital initiatives in the Office of Strategic Initiatives, who said total views reached 10 million in October. The site is averaging 500,000 views a month, she said, adding that Flickr members have marked 79 percent of the photos as “favorites.”
A summary of some of the outcomes:
- Increased awareness of the Library of Congress (LC) digital photograph collection, which has been available for years on the Library's website. The pilot brought not only an engaged audience but also a lot of referral traffic to the Library's website. "Feedback of this nature suggests that as a result of this project the Library is reaching new audiences—people who did not or could not find this material on our own site, and people who never thought to look here."
- Gain a Better Understanding of Social Tagging and Community Input (see below for more details)
- Pilot helped the LC staff gain experience with Web 2.0 online interactions with 'patrons'
Since the beginning of the project I have been very interested in what it would reveal about user tagging versus controlled vocabulary applied through traditional bibliographic cataloging. In the report they share that they used the Flickr API to do a deeper analysis of the tagging done by the community (see pages 19-24 of the full report). The analysis is based on nine categories and provides some interesting insight into issues commonly cited in comparisons of social tagging vs. assignment of controlled vocabulary terms (page 28). The categories analyzed were:
I. LC description-based (words copied from the Library-provided record): e.g., titles, names, subjects, etc.
II. New descriptive words (words not present in the Library-provided description):
- Place: e.g., cities, counties, countries, natural feature names
- Format (physical characteristics of the original photos). Sample tags: LF, large format, black and white, bw, transparencies, glass plate
- Photographic technique. Sample tags: shallow depth of field
- Time period. Sample tags: wartime, WWII, 1912
- Creator name: e.g., photographer’s name
III. New subject words (words not present in the Library-provided description):
- Image (items seen in the image itself). Sample tags: cables, trees, apples, windows, hat, yellow
- Associations/symbolism (phrases and slogans evoked by the image). Sample tags: Rosie the riveter, Norman Rockwell, We can do it!
- Commentary (revealing the tagger’s value judgments). Sample tags: Sunday best, proud, dapper, vintage.
- Transcription (transcribing words found in items such as signs, posters, etc., within the photo)
- Topic (terms that convey the topic of the photo). Sample tags: architecture, navy, baseball, story
- Humor (tags intended to be humorous rather than descriptive) Sample tags: UFO, flying saucer
IV. Emotional/aesthetic responses: (personal reactions of the tagger). Sample tags: wow, pretty, ugly, controversial
V. Personal knowledge/research (tags that could only have been added based on knowledge or research by the tagger, and that could not have been gleaned solely from the description provided or examination of the photo): For example, the tag murder used on a portrait of someone who was later murdered or tags added for the specific county when that information was not part of the description.
VI. Machine tags (added by the community not Library-supplied): e.g., geotags and Iconclass tags
VII. Variant forms (representing terms already tagged but in a different form, such as synonyms (e.g., WW2, WWII, World War II, worldwarii) or plural/singular differences (e.g., transparency/transparencies))
VIII. Foreign language (tags in foreign languages/scripts, whether they are translations of English-language tags, or new tags)
IX. Miscellaneous (tags that are not readily understood, that provide corrections to LC descriptions or to other taggers (e.g., not peaches), or tags later removed)
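To make category VII a little more concrete, here is a minimal Python sketch of how variant forms of a tag might be grouped together. The normalization rules (lowercasing, stripping spaces and punctuation, a very naive singular form) and the small `SYNONYMS` table are assumptions made for illustration only. This is not the method the LC team describes in the report.

```python
import re
from collections import defaultdict

# Hypothetical synonym table: maps a normalized variant to a preferred key.
SYNONYMS = {"ww2": "worldwarii", "wwii": "worldwarii"}

def normalize_tag(tag):
    """Reduce a raw tag to a comparison key."""
    # Lowercase and drop everything except letters and digits.
    key = re.sub(r"[^a-z0-9]", "", tag.lower())
    # Very naive singularization; real vocabularies need a proper stemmer.
    if key.endswith("ies") and len(key) > 4:
        key = key[:-3] + "y"
    elif key.endswith("s") and not key.endswith("ss") and len(key) > 3:
        key = key[:-1]
    # Fold known synonyms into one preferred key.
    return SYNONYMS.get(key, key)

def group_variants(tags):
    """Group raw tags that share the same normalized key."""
    groups = defaultdict(list)
    for tag in tags:
        groups[normalize_tag(tag)].append(tag)
    return dict(groups)
```

Calling `group_variants(["WW2", "WWII", "World War II", "worldwarii"])` puts all four tags under one key. The singular rule is deliberately crude (it would mangle a word like "series"), so treat it as a starting point, not a finished normalizer.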
Some of the Future Tag Analysis Interests (page 29) are also quite interesting, such as actually incorporating popular concepts or variants into the LC's own controlled vocabularies (yeah, something I advocate in the hybrid approach!), bringing the tags into the LC's search environment, and populating bibliographic records with tags. They have already added the Flickr URL to the "additional version available" field (MARC field 530) in some catalog records, which leads users to the appropriate Flickr page that might provide valuable historical information on the image (see sample on page 36).
In the report they also share some of what the staff learned from using Web 2.0 tools to interact with patrons, which can be different from the traditional reference desk exchanges (page 37).
The good news? Skip to page 38 of the full report to see the recommendations and conclusions, including details of the headcount that is necessary for the program to continue and expand. But the report ends with the following good news:
"It should come as no surprise, then, that the Flickr team recommends that this experiment in Web 2.0 cease to be characterized as a pilot and evolve to an expanded involvement in this growing community (and other appropriate social networking opportunities that may arise) as resources permit. The benefits appear to far outweigh the costs and risks. "
The entire set of tags that have been applied can be seen alphabetically or as a tag cloud of the 150 most popular tags.
Many thanks to the Library of Congress staff for taking on this project and continuously sharing their progress through their blog as well as other resources (see Appendix C) and to the authors of the project report: Michelle Springer, Beth Dulabahn, Phil Michel, Barbara Natanson, David Reser, David Woodward, and Helena Zinkham!
Hybrides à la Barbosa: ebook on Taxonomies and Folksonomies now Available in French
Anonymous — January 5, 2009 - 12:00pm
2008 treated my ebook on Folksonomies and Taxonomies extremely well and led to some great conversations with colleagues and clients about the 'advantages' of user tagging when approached via hybrid routes in the Enterprise, which I will be sharing here with you in future posts.
In addition to the ebook's inclusion in many publications, including DMReview, I was also interviewed for a ReadWriteTalk Podcast about why I wrote the book. A lot of great feedback was received, not only about the content and the message of the ebook but also about the gorgeous layout and format that our design team put together. As a treat, our marketing department also had these great aprons made for the Taxonomy Bootcamp sessions.
This year a translated version of the ebook is out in French, titled Le Livre De Cuisine De La Taxonomie Et De La Folksonomie, which I am extremely excited about because it reaches out to a whole new market for my European colleagues (although I admit I do not speak French!).
Hope you enjoy it- Merci!
Synaptica and Dow Jones Taxonomy Services Video Collection
Summary: Here you will find videos that have either been produced by Dow Jones or feature a Dow Jones employee or customer discussing the development, management and governance of controlled vocabularies. This includes customer case studies, conference presentations, panel discussions and product demonstrations.
November 2008: Synaptica: SharePoint Integration
In November we announced our new SharePoint Integration. This video takes you through a short demo of the SharePoint Integration: http://blip.tv/file/1475940
September 2008: Synaptica Case Study: ProQuest: Finding a Common Language: Bringing Complex and Disparate Vocabularies
Paula R McCoy, Manager, Taxonomy Development, ProQuest
Daniela Barbosa, Synaptica Business Development Manager, Dow Jones Client Solutions, Dow Jones & Company
This case study addresses the challenges ProQuest faced in managing multilingual controlled vocabularies using multiple Word documents and authority files maintained in an Oracle database. Speakers describe how implementing a thesaurus management tool helped ProQuest simplify and standardize its business semantic management to create a common language and connect disparate information assets, as well as handling large and varied vocabularies and authority files, linking new and existing editorial systems, enabling hierarchical views, and automating thesaurus management tasks. This session was sponsored by Dow Jones Synaptica.
http://blip.tv/file/1306890
September 2008: Centralized Taxonomy Management for Enterprise Information Systems
Daniela Barbosa, Synaptica Business Development Manager, Dow Jones Client Solutions, Dow Jones & Company
Paula R McCoy, Manager, Taxonomy Development, ProQuest
Now that you have built your taxonomies, you need to manage and maintain them in a centralized environment that can be leveraged by all of your enterprise applications including search tools, portals, and CMS/DMS systems. This session will review some best practices in centralized taxonomy management and go through the implementation of a thesaurus management tool at ProQuest, which enabled them to create a common language to connect disparate information assets using large and varied vocabularies and authority files linked to new and existing editorial systems. This session was sponsored by Dow Jones Synaptica.
http://blip.tv/file/1307166
March 2008: iKMS: Marti Heyman on ROI Analysis for Taxonomy Programs
Video by: Patrick Lambe www.greenchameleon.com
In this talk for the Information and Knowledge Management Society of Singapore (www.ikms.org) on 13 March 2008, Marti Heyman, Director of Taxonomy Services at Dow Jones, discusses the problems associated with ROI for taxonomy programs, and the key steps in ROI analysis. In this first part she discusses the issues around ROI. This session was sponsored by Dow Jones Synaptica.
Part 1 of 3: http://blip.tv/file/917758/
Part 2 of 3: http://blip.tv/file/917962/
Part 3 of 3: http://blip.tv/file/917979/
March 2008: iKMS: Christine Connors on User Driven Taxonomies
Video by: Patrick Lambe www.greenchameleon.com
In this talk for the Information and Knowledge Management Society of Singapore (www.ikms.org) on March 13 2008, Christine Connors, Director of Semantic Technologies at Dow Jones and Business Champion of Synaptica, explains the rationale for a hybrid approach to taxonomy development, harnessing user inputs and activity as well as the traditional controlled approach, giving examples from her pioneering work at Raytheon. This talk was sponsored by Dow Jones Synaptica. In the first part, Christine gives a general rationale for a more user driven approach.
Part 1 of 3: http://blip.tv/file/917603/
Part 2 of 3: http://blip.tv/file/917629/
Part 3 of 3: http://blip.tv/file/917691/
November 2007: Synaptica Case Study Abbott: From Taxonomy to Ontology: Laying the GroundWork for the Semantic Web
Presented by Jennifer Borrell, Associate Information Scientist at Abbott Laboratories
Jennifer takes us through how Abbott Laboratories uses Synaptica to build and maintain their Ontologies. She presents a high-level overview of how Abbott views ontologies and how they are laying the Groundwork to Improve User Productivity. Sponsored by Dow Jones Client Solutions.
http://blip.tv/file/482545
August 2007: Using Tools to Manage Taxonomies
Video by: Patrick Lambe www.greenchameleon.com
Dave Clarke, CEO of Synaptica (Synaptica/Synapse co-founder)
In this video Dave Clarke describes how tools can be used to manage taxonomies, for an iKMS evening talk on 30 August 2007. In part one Dave describes how you can use tools to manage the collaboration required in building and maintaining taxonomies. In part two Dave describes how you can use tools to support the taxonomy creation process, and in part three Dave describes how taxonomy tools can link different enterprise applications, including legacy taxonomies.
Part 1 of 3: http://blip.tv/file/375135/
Part 2 of 3: http://blip.tv/file/375156/
Part 3 of 3: http://blip.tv/file/375196/
Tag, You're It - Keynote From Enterprise Search Summit
Anonymous — September 23, 2008 - 10:00am
I am at the opening day Keynote for Enterprise Search Summit West in San Jose today, after rushing down from Pacifica on this beautiful morning, driving (ok, speeding) down 280 to make this early morning session. Obviously, if you have been following me for a while over on my blog, you know I have a 'thing' for social tagging and recently published an eBook on Hybrid approaches to Folksonomies and Taxonomies in the Enterprise, so I did not want to miss it.

The Keynote is titled 'Tag, You're It: Social Tagging Strategies for the Enterprise' and is being led by Gene Smith, Principal, nForm Experience. Gene is the author of the book 'Tagging: People-Powered Metadata for the Social Web'.
Notes:
Why We're Here? (at the conference)
To figure out how to find *the good* Stuff
19th century explosion in paper records - flourishing of patent filings for ways to store records and information. The one that emerged as the winner was vertical filing. Folders and tabs were a key piece. Tabs in vertical filing are still seen in today's web user interfaces.
Folders have been the dominant organizing principle - then links came into the scene.
Instead of Information explosion- think of it as a stream, immersion in the flow.
the challenge is keeping track and finding what we need later on - tags are
- fast
- simple
- social
- good enough
A tag (word) can mean a lot of different things.
Looking at different tools and why they are interesting:
Zigtag - semantic social bookmarking
When you are about to tag something, you type and pick from the list, and each entry includes a definition.
They have millions of concepts - they mine public data sources for user-generated content and built an inference engine to provide the concepts.
LibraryThing
Any person can make any two tags equivalent, but they can also remove the equivalence. "humor" and "humour" are the same word but carry a different meaning in different cultures (America vs. UK); the authors tagged with each are different.
Value chain of the LibraryThing features
>combine tags> tag mash search>tagsonomies (mapped to existing categories)
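The "anyone can combine two tags, and anyone can undo it" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TagEquivalences` class and its method names are hypothetical, and nothing here reflects how LibraryThing actually implements the feature. Links are stored as individual pairs so that removing one is a simple set operation.

```python
from collections import defaultdict

class TagEquivalences:
    """Hypothetical store of user-made 'these two tags are the same' links."""

    def __init__(self):
        self._links = set()  # each link is a frozenset of two tags

    def combine(self, a, b):
        """Record that tags a and b should be treated as equivalent."""
        if a != b:
            self._links.add(frozenset((a, b)))

    def separate(self, a, b):
        """Remove a previously made link (no error if it does not exist)."""
        self._links.discard(frozenset((a, b)))

    def equivalents(self, tag):
        """Return the tag plus every tag reachable through the links."""
        neighbours = defaultdict(set)
        for link in self._links:
            x, y = tuple(link)
            neighbours[x].add(y)
            neighbours[y].add(x)
        seen, stack = {tag}, [tag]
        while stack:
            for other in neighbours[stack.pop()]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        return seen
```

A search for one tag could then be expanded to `equivalents(tag)`. If someone decides "humor" and "humour" really do mean different things, a single `separate` call splits them again.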
The big problem is getting people to use the tools you provide for them!
Cold-Start Problem-
- creating incentives - reward a person by identifying that they were the first person to tag, or create social proof with a 'feature linker' (who doesn't like to see their name in lights?)
- try to pre-populate the tag box- tags other people have used
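Pre-populating the tag box can be as simple as counting what other people have already used on the same item. Here is a minimal Python sketch. The `suggest_tags` function and the `(user, item, tag)` tuple layout are assumptions made for illustration, not something shown in the keynote.

```python
from collections import Counter

def suggest_tags(taggings, item_id, current_user, limit=5):
    """Suggest the tags other users have applied most often to an item.

    taggings is an iterable of (user, item, tag) tuples.
    """
    counts = Counter(
        tag
        for user, item, tag in taggings
        if item == item_id and user != current_user
    )
    # most_common returns (tag, count) pairs, highest count first.
    return [tag for tag, _ in counts.most_common(limit)]
```

The current user's own tags are left out so the suggestions only reflect what others have done. On a brand-new item the list is empty, which is exactly the cold-start case that the incentives above are meant to address.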
Some other examples:
Wesabe - sticky tags- always applied to the item, but then allow 'not sticky' or one time tags. show you your spending habits by clustering your tags- giving benefits of the tags they used.
Dogear - built internally at IBM - architected so that it produces an RSS feed for every tag. What happened is that as people started using it, groups found interesting things to do with their RSS feeds, like displaying the content in other environments and creating mashups - allowing innovation on the tags so that the value is created by the users' needs.
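A feed-per-tag design is easy to picture with a small example. The Python sketch below builds a bare-bones RSS 2.0 document for one tag using only the standard library. The `BASE_URL`, the `tag_feed` function, and the bookmark dictionary layout are all hypothetical; this is not how Dogear itself is built.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

BASE_URL = "https://bookmarks.example.com"  # hypothetical bookmarking site

def tag_feed(tag, bookmarks):
    """Build an RSS 2.0 feed (as a string) of the bookmarks carrying a tag.

    Each bookmark is a dict with 'title', 'url' and 'tags' keys.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"Bookmarks tagged '{tag}'"
    ET.SubElement(channel, "link").text = f"{BASE_URL}/tags/{quote(tag)}"
    ET.SubElement(channel, "description").text = f"Bookmarks tagged '{tag}'"
    for bookmark in bookmarks:
        if tag in bookmark["tags"]:
            item = ET.SubElement(channel, "item")
            ET.SubElement(item, "title").text = bookmark["title"]
            ET.SubElement(item, "link").text = bookmark["url"]
    return ET.tostring(rss, encoding="unicode")
```

Because every tag gets its own plain feed, any feed reader, portal or mashup can subscribe to it without the bookmarking tool needing to know about them.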
Although the specific slide deck he used does not seem to be there yet, Gene shares his slides over at SlideShare, and you can also follow Gene on Twitter @gsmith.
We Hit the Airwaves with a ReadWriteTalk Podcast
Anonymous — September 7, 2008 - 3:52pm
In early June, I published an ebook about hybrid approaches to Folksonomies and Taxonomies in the Enterprise that has been very well received. Beautifully designed, it provides a high-level overview of why companies should be looking towards user tagging as a part of their content strategies.
So last month when I was approached by ReadWriteTalk about being interviewed for a podcast on the subject of the ebook, I was pretty excited and of course a bit nervous at the same time. ReadWriteTalk podcasts are just one of the many podcasts that I listen to on a weekly basis, so I certainly did not want to embarrass myself! But beyond stumbling over some words, I think I did a decent job discussing the reasons I wrote the ebook and highlighting what the ebook covers. Although it is my face on the cover of the ebook (I share some of the behind the scenes as to how that happened), I also spend some time talking about the design of the book and the wonderful team I worked with to get it produced and made available for free download for everyone.
Sean Ammirati interviewed me and did a great job of not only prepping me for the interview but also making me very comfortable as we began discussing the questions he had. The podcast was also transcribed so if you prefer to read it versus hearing me go on and on and on about why it is important to look at some of the benefits of hybrid approaches...you can.


