taxonomies
Synaptica and ProQuest Present at Taxonomy Bootcamp, 2009
Jim Sweeney — December 9, 2009 - 12:02pm
Synaptica's CEO, Dave Clarke, was proud to join ProQuest's Paula McCoy for a presentation at Taxonomy Boot Camp this year. Their topic was one that often strikes a chord with anyone dealing with the management and indexing of content: "Taxonomies: Tools or People".
The presentation discusses some of the pros and cons of machine indexing of content versus manual indexing. It was well attended and well received, and we are happy to be able to share it with you here.
Should you have any questions on this topic or anything else having to do with the creation, management and dissemination of various types of controlled vocabularies for your organization, please don't hesitate to contact us for more information. We look forward to hearing from you.
Taxonomy is key to Effective ECM
Jim Sweeney — April 22, 2009 - 1:32pm
I recently attended a seminar on the 10 Steps to Business Efficiency with Content, Collaboration and Process, given by the good people at AIIM (http://aiim.org) and all about ECM strategies and best practices. This was a free seminar, well organized and well attended by a broad spectrum of representatives from all types of organizations: large and small, from new and old industries. The topics of discussion ranged from the most effective way to digitize archival assets, to applications that better allow for federated search across various data repositories, and there was certainly a lot of discussion around what has become the most ubiquitous of ECM-type applications, Microsoft SharePoint.
There were of course the usual quotes and statistics from AIIM, Forrester and Gartner regarding information proliferation and management today: The amount of data being produced is doubling every 18 months; 80% of this data is unstructured and 90% of that is entirely unmanaged.
An interesting quote that I will paraphrase here was attributed to Thomas Washington: "The pursuit of knowledge in an age of information overload is less about the process of acquisition than it is about a proficiency of tossing things out." And regarding the storage of all of this information, another interesting fact was thrown out: while 1 GB of storage may now cost an average of 20 cents, it costs $3,500 to review that same 1 GB of data and start to make sense of it in the context of your business. (AIIM)

As I listened to the various presentations and vendors I was struck by one thing: none seemed to offer a unified solution for using taxonomy more effectively to structure, classify and categorize the content that was going into these vast data repositories. Certainly it was agreed that there was value to such a process, but it is something that many organizations have still not recognized as absolutely necessary to fundamentally improve the tagging, organization and discovery of information within these huge libraries of data, documents, and other media.
It is our opinion that taxonomy should be integrated with ECM applications, as well as across the rest of the enterprise, using a centralized and standardized set of vocabularies for navigation, search, discovery, meta-tagging and many other applications. We see this as a necessity in moving towards a unified means of data normalization and discoverability. To achieve this, we offer services to get companies started, as well as tools like Synaptica, which has out-of-the-box integrations with products like SharePoint and also more generic means of integrating with external applications via simple APIs and Web Services.
As data continues to proliferate and the means of digitizing archival records or utilizing native electronic formats become more efficient, storage becomes less a matter of cost and more a matter of management. Efficient means of identifying, tagging, categorizing and sorting information will be key to the effective operation of any organization.
A couple of months back, after attending an AIIM seminar, my colleague also wrote up the 10 Rules of Successful ECM Implementation, which we have found quite useful in talking to business and technology owners about content access strategies.
We see many of our customers at the forefront of addressing these issues, and in working with them we continue to work towards providing better and easier ways for data managers and end users alike to find what they are looking for. We look forward to sharing some of these use cases as well as hearing from you about your successes and struggles!
Image| Flickr | ul Marqa
A CMS is not a Taxonomy Management Tool but a CMS Needs a Good Taxonomy
Daniela Barbosa — December 10, 2008 - 6:08pm
Today on a phone call, I made a point that I often make: "you can build the most 'beautiful' taxonomy ever, but if you have nothing to use it for, it is not going to do you any good". One of the common uses we see for a taxonomy is in conjunction with a Content Management System (CMS), and many of our existing clients have our Synaptica tool integrated into their CMS systems.
Recently at Taxonomy Bootcamp, Stephanie Lemieux from Earley & Associates and Charlie Gray from Motorola presented a great session on 'Integrating Taxonomy with a CMS for Dynamic Content', in which, on slide 12, Stephanie pointed out:
~~~~~~~
Important note....
A CMS is not a taxonomy management tool
-Most requirements will not be met by the CMS, even the big players
-External tool needed to manage taxonomy versioning, scope notes, associative relationships, and more
-CMS taxonomy management is very SLOW…
---1 term with 5 synonyms & 5 translations = 3 minutes
-If the taxonomy is more than 1000 terms, an excel spreadsheet will quickly become unmanageable
---Worse if you are doing multi-lingual
~~~~~~~
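To put the slide's timing figure in perspective, here is a rough back-of-the-envelope calculation. It assumes, purely for illustration, that every term carries the same 5 synonyms and 5 translations and takes the same 3 minutes to enter; real vocabularies will vary, and the function name is made up for this sketch.

```python
# Figure quoted on the slide: 1 term with 5 synonyms and 5 translations = 3 minutes.
MINUTES_PER_TERM = 3


def data_entry_hours(term_count: int, minutes_per_term: float = MINUTES_PER_TERM) -> float:
    """Estimate hours of CMS data entry, assuming every term costs the same (an assumption)."""
    return term_count * minutes_per_term / 60


print(data_entry_hours(1000))  # 50.0
```

At that rate, a 1000-term taxonomy would mean roughly 50 hours of data entry in the CMS alone, before any versioning or relationship management.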
The presentation went on to discuss other key aspects of taxonomy development for content management that I would encourage you to review. The reasons presented above as an 'important note' are just some of the reasons that many customers with robust CMS implementations use Synaptica to centrally manage their taxonomies.
In addition to the obvious core requirements in taxonomy creation and management that Synaptica covers, we also make available a little-known add-on to the core product named the "Synaptica Indexing Management System" (IMS).
IMS is an add-on component designed to be used with the core Synaptica taxonomy and metadata management tool. It enables the human indexing of content against vocabularies stored and managed in the Synaptica system.
The Indexing Management System (IMS) can quickly be integrated with any content authoring/management tool that is already in place within your enterprise. IMS allows the content manager/indexer to search and browse the vocabularies that are stored and managed in Synaptica, dynamically building a “pick list” of indexing terms that are relevant to that piece of content.
Once the indexer completes the selection of indexing terms, IMS passes those terms from Synaptica to the CMS to be stored as metadata. IMS can also simultaneously capture summary information about the piece of content and send it back to Synaptica to build a record within the Synaptica system itself. When IMS posts terms to the CMS it can also automatically expand the user-selected terms using related terms from the Synaptica vocabulary system. [Please see Workflow on the second page of this Spec sheet.]
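The workflow described above (build a pick list, expand it with related terms, hand the result to the CMS) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is not the Synaptica or IMS API, and `Term`, `expand_terms`, `post_to_cms` and the sample vocabulary are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One vocabulary term plus its related (associative) terms."""
    label: str
    related: list[str] = field(default_factory=list)


def expand_terms(selected: list[str], vocabulary: dict[str, Term]) -> list[str]:
    """Return each selected term followed by its related terms, without duplicates."""
    expanded: list[str] = []
    for label in selected:
        term = vocabulary.get(label)
        related = term.related if term else []
        for candidate in [label, *related]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def post_to_cms(cms_store: dict[str, list[str]], content_id: str, terms: list[str]) -> None:
    """Hypothetical stand-in for the hand-off to the CMS: store terms as metadata."""
    cms_store[content_id] = terms


# Made-up vocabulary and pick list, for illustration only.
vocabulary = {
    "Taxonomy": Term("Taxonomy", related=["Controlled vocabulary", "Thesaurus"]),
    "Indexing": Term("Indexing", related=["Metadata"]),
}
pick_list = ["Taxonomy", "Indexing"]

cms_store: dict[str, list[str]] = {}
post_to_cms(cms_store, "article-42", expand_terms(pick_list, vocabulary))
print(cms_store["article-42"])
# ['Taxonomy', 'Controlled vocabulary', 'Thesaurus', 'Indexing', 'Metadata']
```

The point of the sketch is that the indexer only chooses two terms, while the content item ends up tagged with five, all drawn from the one central vocabulary.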
In addition, editors can submit candidate terms directly from the CMS, which will kick off the established governance workflow for candidate terms. This essentially produces a user tagging process for your key editorial staff, without their having to log into the Synaptica system directly to submit candidate terms.
So back to my point: the best taxonomy in the world is useless without a purpose, and by having your content managers/indexers utilize a corporate-wide taxonomy that is stored in a centralized place like Synaptica, you ensure consistency and accuracy in indexing and identifying content across the enterprise.
I am always surprised when customers are blown away by the IMS add-on and say that they had never heard of that type of functionality. Just today a client pointed out why: we do not have any marketing material on our product sites for this very valuable feature... so we need to fix that!
If you would like a demo of the IMS module or would like to learn more about how our other clients are using it to integrate into their CMS systems, please drop me a line at daniela.barbosa@dowjones.com
In Developing a Custom Taxonomy Only Time Can Tell
Laura Dorricott — October 13, 2008 - 3:49pm
OK Quick Monday Quiz: How Many Minutes Does It Take to Create a Category (aka term, node, leaf, etc)???
I suspect that anyone who has worked on developing a taxonomy has heard this question or a variation of it. It seems like we get it daily! Once a client decides they need or want a taxonomy, they need or want it immediately, so figuring out when becomes the next question.
After almost 30 years of being involved in the development of controlled vocabularies, thesauri and taxonomies, I should be able to say it takes X minutes per term, but I’m still forced to tell clients that it will depend on a number of things that are usually covered in the Assessment Phase of any engagement, like:
• What is the topic of the taxonomy?
• What is its intended purpose?
• What systems will you use to develop and maintain it?
Once we’ve answered all these questions, the next one is frequently whether they could just use a taxonomy that is already developed. No matter what approach is ultimately chosen to create a taxonomy – it still takes time and the ultimate answer is that it depends on what the client needs, how many terms there will be, how technical those terms are and the taxonomy development tool that is being used.
Building a taxonomy for an area that you are familiar with can be done fairly quickly, while building one for scientific, technical or medical areas might be much slower. On top of the topic, there is the tool in which the taxonomy is being built. The more efficient the tool, the faster the development once terms have been decided upon and the research for those terms completed.
Experience in developing taxonomies has given me some general metrics that can be used for pricing a taxonomy but the reality is that the best answer is that it all depends on what is needed.
So – how long does it take?? – it takes as long as necessary!!
Image | Flickr | h.koppdelaney
Taxonomies are a Commodity
Christine Connors — October 6, 2008 - 5:48pm
For some reason or another (lots of travel, several hats at home and work) I've had trouble finalizing this post. Earlier today though, I read Paul Miller's latest post on ZDNet. There seems to be some discussion about whether or not data is a commodity. I think there IS most definitely data that are a commodity.
Taxonomies are a valuable raw material in the management of information: files that can be bought and sold and used to improve services. They can be generated by humans, machines, or even better: humans working with machines.
Many taxonomies are a dime a dozen, with little to differentiate between versions of the same data. Some are like Kopi Luwak coffee - rare and extremely valuable. The word "taxonomy" is itself suffering from a kind of genericide. Classical definitions still apply: taxonomies have become commoditized.
The complexity of the controlled vocabulary will determine its value to a degree. A simple pick list should be easy and cheap to acquire - a list of countries, for example. Or colors, seasons, months - you get the idea. What is the value of a list of industries? Or companies? Maintenance is the primary cost factor - frequent changes require frequent updates, but an authority file in and of itself is not that complex. A broad and deep poly-hierarchical taxonomy I would expect to have more value. A poly-hierarchical taxonomy is one where a term in the taxonomy can have more than one parent term. Managing these relationships takes more time. An ontology - well, those aren't quite commodities yet, but they will get there. Why? Because they still require a great deal of thought and effort.
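To make the poly-hierarchy idea concrete, here is a minimal sketch in which each term maps to a set of parent terms. The `Taxonomy` class and the sample terms are invented for illustration and do not reflect any particular taxonomy tool or standard.

```python
class Taxonomy:
    """Minimal poly-hierarchical taxonomy: every term maps to a set of parent terms."""

    def __init__(self) -> None:
        self.parents: dict[str, set[str]] = {}

    def add_term(self, term: str, *parent_terms: str) -> None:
        """Register a term under zero or more parents, creating parents as needed."""
        self.parents.setdefault(term, set()).update(parent_terms)
        for parent in parent_terms:
            self.parents.setdefault(parent, set())

    def ancestors(self, term: str) -> set[str]:
        """Collect every broader term reachable by following any parent link."""
        found: set[str] = set()
        stack = list(self.parents.get(term, ()))
        while stack:
            current = stack.pop()
            if current not in found:
                found.add(current)
                stack.extend(self.parents[current])
        return found

    def polyhierarchical_terms(self) -> list[str]:
        """Terms that sit under more than one parent."""
        return sorted(t for t, p in self.parents.items() if len(p) > 1)


# Made-up sample data: one term with two parents.
taxonomy = Taxonomy()
taxonomy.add_term("Coffee", "Beverages")
taxonomy.add_term("Kopi Luwak", "Coffee", "Luxury goods")

print(sorted(taxonomy.ancestors("Kopi Luwak")))  # ['Beverages', 'Coffee', 'Luxury goods']
print(taxonomy.polyhierarchical_terms())         # ['Kopi Luwak']
```

Even in this toy form, a term with two parents has to be found, displayed and updated along both paths, which hints at why managing these relationships takes more time than maintaining a flat pick list.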
The source of the data will also help determine its value. Data from trusted sources - for whom integrity is paramount - should be valued higher. Is the data accurate? Is it maintained? Is it in a usable format? Does it have high availability? (Many quality vendors can be found at TaxonomyWarehouse.com.)
The uniqueness of the taxonomy will drive its value. Like our coffee example above, a taxonomy as ubiquitous as Starbucks will not be as valuable as say a pharmaceutical research vocabulary. Given the, uh, processes needed to produce Kopi Luwak, it is rare and therefore fetches a higher price, as would our R&D taxonomy.
Information security concerns also impact value. Our pharmaceutical company, or a financial services provider, is not about to release its vocabulary into the wild. It is a significant intellectual asset that merits a substantial IT effort to protect.
I actually like the fact that taxonomies have become commoditized. Why? Competition drives improvement - in quality, in focus, in security and in usability. These are areas that the semantic web community needs to focus on - in my experience, security and usability need attention NOW. Good fences make good neighbors, and when we've got good fences, we can make more links and learn to trust. Icing on the cake!
Flickr image by INeedCoffee

