A CMS is not a Taxonomy Management Tool but a CMS Needs a Good Taxonomy

Today on a phone call, I made a point that I often make: "You can build the most 'beautiful' taxonomy ever, but if you have nothing to use it for, it is not going to do you any good." One of the common uses we see for a taxonomy is in conjunction with a Content Management System (CMS), and many of our existing clients have our Synaptica tool integrated into their CMS.

Recently at Taxonomy Bootcamp, Stephanie Lemieux from Earley & Associates and Charlie Gray from Motorola presented a great session on 'Integrating Taxonomy with a CMS for Dynamic Content' in which on slide 12 Stephanie pointed out:

Important note....
A CMS is not a taxonomy management tool
-Most requirements will not be met by the CMS, even the big players
-External tool needed to manage taxonomy versioning, scope notes, associative relationships, and more
-CMS taxonomy management is very SLOW…
---1 term with 5 synonyms & 5 translations = 3 minutes
-If the taxonomy is more than 1000 terms, an excel spreadsheet will quickly become unmanageable
---Worse if you are doing multi-lingual
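
To put the slide's timing figure in perspective, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that every term in the taxonomy carries the same 5 synonyms and 5 translations and therefore takes the quoted 3 minutes to enter; the function name is hypothetical.

```python
# Figure quoted on the slide: 1 term with 5 synonyms & 5 translations = 3 minutes.
MINUTES_PER_TERM = 3

def entry_hours(term_count, minutes_per_term=MINUTES_PER_TERM):
    """Estimate hours of data entry, assuming every term takes the same time."""
    return term_count * minutes_per_term / 60

# A 1000-term taxonomy, the size the slide mentions.
print(entry_hours(1000))
```

Under that assumption, a 1000-term taxonomy works out to about 50 hours of data entry in the CMS alone.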


The presentation went on to discuss other key aspects of taxonomy development for content management, which I would encourage you to review. The reasons presented above as an 'important note' are just some of the reasons that many customers with robust CMS implementations use Synaptica to centrally manage their taxonomies.

In addition to the obvious core requirements in taxonomy creation and management that Synaptica covers, we also make available a little-known add-on to the core product named the "Synaptica Indexing Management System" (IMS).

IMS is an add-on component designed to be used with the core Synaptica taxonomy and metadata management tool – and enables the human indexing of content against vocabularies stored and managed in the Synaptica system.

The Indexing Management System (IMS) can quickly be integrated with any content authoring/management tool that is already in place within your enterprise. IMS allows the content manager/indexer to search and browse the vocabularies that are stored and managed in Synaptica, dynamically building a “pick list” of indexing terms that are relevant to that piece of content.

Once the indexer completes the selection of indexing terms the IMS system passes those terms from Synaptica to the CMS to be stored as metadata. IMS can also simultaneously capture summary information about the piece of content and send it back to Synaptica to build a record within the Synaptica system itself. When IMS posts terms to the CMS it can also automatically expand the user-selected terms using related terms from the Synaptica vocabulary system. [Please see Workflow on the second page of this Spec sheet.]
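
The automatic expansion step can be pictured with a small sketch. This is not the actual Synaptica or IMS API: the vocabulary dictionary, the term names and the `expand_terms` function are all hypothetical, and the sketch assumes that "expansion" simply means adding each selected term's directly related terms before the set is posted to the CMS.

```python
# Hypothetical vocabulary: each term maps to its related terms.
RELATED_TERMS = {
    "Mobile phones": ["Smartphones", "Wireless carriers"],
    "Smartphones": ["Mobile phones"],
    "Batteries": [],
}

def expand_terms(selected, related=RELATED_TERMS):
    """Return the selected terms plus their directly related terms.

    Order is preserved and duplicates are dropped.
    """
    expanded = []
    for term in selected:
        for candidate in [term, *related.get(term, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded

# The indexer picked two terms; the expanded list is what would be stored as metadata.
print(expand_terms(["Mobile phones", "Batteries"]))
```

Keeping the expansion to one hop, as here, is a design choice for the sketch; a real system might follow relationships further or let the indexer review the additions.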

In addition, editors can submit candidate terms directly from the CMS, which kicks off the established governance workflow for candidate terms. This essentially produces a user tagging process for your key editorial staff, without their having to log into the Synaptica system directly to submit candidate terms.

So back to my point: the best taxonomy in the world is useless without a purpose. By having your content managers and indexers use a corporate-wide taxonomy that is stored in a centralized place like Synaptica, you ensure consistency and accuracy in indexing and identifying content across the enterprise.

I am always surprised when customers are blown away by the IMS add-on and say they had never heard of that type of functionality. Just today a client pointed out why: we do not have any marketing material on our product sites for this very valuable feature... so we need to fix that!

If you would like a demo of the IMS module or would like to learn more about how our other clients are using it to integrate with their CMS, please drop me a line at daniela.barbosa@dowjones.com

Happy anniversary to me, happy anniversary to me...!

I have recently celebrated my one-year anniversary with Dow Jones. It has been quite a year! I wear several hats here and that has given me the opportunity to meet a great many people in all areas of the business. How does it feel after a year? It feels great - I continue to be impressed with the caliber of talent exhibited by my colleagues. The domain knowledge, the business savvy, the passion for their work - it is very exciting and motivating to be surrounded by these people.

Yes - I love working with the core Dow Jones teams: the product champions, the technical staff, the marketing, sales and strategy teams for Factiva, Newswires, the Wall Street Journal. Yes, I also think it's pretty cool to talk to folks in other parts of NewsCorp: MySpace, Slingshot, Fox Interactive.

Today though I want to highlight some of the people I work most closely with. I'll start with some you haven't met yet on our blog - my internally focused team of Metadata Managers who, with their teams, keep our content organized: Frances, Annika and Bouriana. They are three very bright and talented women who have a significant impact on the structure of Dow Jones' Intelligent Indexing; they quietly and diligently work to improve the quality of our content indexing to ensure the most relevant documents are returned in Search and Discovery. They are the champions of new branches of our taxonomies, builders of our ontologies, curators of our primary intellectual assets. And everyone here wants to build on their work - it's a significant part of our metadata platform. Huzzah ladies - and thank you for your dedication!

Then there's Marti Heyman. Who you'd have met by now if Daniela had her way! (Only partially teasing here Marti!) Marti and I joined Dow Jones at the same time to fill the shoes of two incredible folks - Dave Clarke and Trish Yancey - who were moving on after seeing to the smooth integration of their company, Synapse, after its acquisition by Dow Jones. I got the product side, Marti got the consulting side of Taxonomy Services. It's been my pleasure to have known Marti for several years. For a long time it's been a small world, this group of corporate taxonomists, and we've had the pleasure of speaking together, chatting on TaxoCoP calls, and now working together to take this organization to the next level, taxonomically speaking. Marti's depth of knowledge, experience, and willingness to roll up her sleeves continue to impress me. I also love that she gets a few bees in her bonnet! (Perhaps someday we'll have her tell you about why you can't use ROI as a success metric for taxonomies!)

Marti's team has been a great joy to work with too - Ian and Dan have some of the most sophisticated knowledge of practical applications of cataloging and classification I've encountered outside the academic and library world. They are a phenomenal resource for our consulting clients. And how can you not love someone who puts up thousands of Christmas trees - as Laura and her family do each and every year - with every ornament cataloged?! Now, that's a true taxonomy geek!

Of course, this being a blog for Synaptica, I cannot overlook a team that practically runs itself: Jim S., Jim D., Sean and Daniela are the folks who make Synaptica what it is. Jim and Sean are the core of our technical team, and have the ability to deliver excellent code and great customer service. Mostly, I love that they don't groan too much when Daniela and I dream up some crazy new idea! They are usually right there with us, and I appreciate their creativity and willingness to try new things with the product. Jim S. is the pillar of the team, our Product Manager, Customer Champion, Pre-Sales Support, Trainer, Chief Cook and Bottle Washer! He takes great pride in his work and is one of the best PMs it has been my pleasure to work with. What can I say about Daniela? I daresay most of you know her already. One of the next Robert Scobles, Data Portability advocate, Super Librarian, She Geek. Daniela is our Business Development Manager, and in the last year she has done more good for Synaptica and Taxonomy Services than I ever could have hoped for. She is a true customer advocate, true Dow Jones advocate, and isn't afraid to do what it takes to get the job done. I've said how glad I am to work with her before, and I'll say it again: she is a force to be reckoned with - work with her if you can!

There are so many other wonderful people here, and I'm looking forward to getting to know them better. We have an incredible team, and I encourage you to reach out to them to talk shop, to talk tech, to talk business. We are one of the few companies with capabilities that run the full spectrum of content management: indexing & classification, taxonomy management, ontologies, content creation, integration, processing & delivery, archiving and user interaction; and we enjoy our work immensely. We look forward to hearing from you!

Flickr image by daintytime

Online Information 2008 Session: Proving your Value as a Research Team in the Current Financial Situation

Giving the first talk of the day at any event is never easy, but there was a good turnout at the Business Information Forum at Online Information 2008 this morning for my session on "Proving your Value as a Research Team in the Current Financial Situation". This is a subject close to my heart, and not just because of the services Dow Jones provides to many customers around the world, but because of the type of organisation we also are.

Within our own company, my group includes the Market Intelligence team, so we are ourselves a research group which has to prove its value every day. After a few generalities about the information landscape (clouded, turbulent and currently prone to violent eruptions), my talk today was mainly devoted to a case study of the "Research the Researcher" project which Dow Jones carried out to understand the challenges facing professional researchers.

In addition to pointing the way to areas where these researchers feel that they could add more value -- focusing more on producing analysis and recommendations rather than gathering and organising facts -- it also showed us where their pain points are and how we can help. (Of course it was also an excellent example of how a research team -- mine -- can add value to an organisation -- Dow Jones!) And it goes without saying that expertise in areas such as taxonomy and the organisation and management of an organisation's information assets is critical in adding value.

Overall impressions of the Online conference: fewer exhibitors perhaps, with smaller booths, and maybe not quite as much traffic. But that's only to be expected in the current environment. Nonetheless, there's still plenty of activity and plenty of buzz around the place.

If you are interested in the Dow Jones research study results on 'The Evolving Role of the Business Researcher', a recorded Webcast from October 2007 with Product Manager Ken Sickles and Market Research Manager Ellen Maccabe is available on demand.


Online Information 2008 Starts Tuesday December 2nd - Come Visit Us

I am approximately 5,371 miles (8,645 km) from London in sunny California, but I hope that even this far away I can enjoy and learn from the various outputs that I predict will come out of the Online Information 2008 Conference, including making some new connections by joining in on the conversation remotely!

Just yesterday on my Sunday afternoon walk I listened to a few episodes of the Panlibus podcast series leading up to the conference, including Conference Chairman Adrian Dale's overview of the conference and a preview of the keynote by Clay Shirky, whom I am a fan of and who is of course making some interesting predictions about the industry.

If you are at the conference, there is a Crowdvine site set up where you can connect with fellow attendees, which I will keep an eye on. Of course I also plan to follow the Twitter conversation with this search I created of some of the possible hashtags that attendees will be using. There is also a listing on the conference site of bloggers who will probably be blogging the conference, as I am sure many others will be doing.

Our UK-based Dow Jones team will be at the conference and, in addition to attending and exhibiting, Dow Jones will also be presenting these sessions:

Simon Alterman will be conducting a Seminar titled: Proving your Value as a Research Team in the Current Financial Situation - On Tuesday 10:30-11 in the Gallery Rooms where he will take a look at the changing nature of the Info Pro role within organisations and why the technologies and processes they are adopting can act as catalysts for growth. Simon is a dynamic speaker and a great advocate for the profession.

Mark Stapleton will be presenting on Effective News Integration for Better Business Decisions - on Wednesday at 15:30 in Theater A. Mark has years of experience in delivering solutions to clients and has led our European team as they delivered some great solutions that drive our customers' bottom lines.

And me?

Well, I will be in sunny San Francisco at our office downtown and on Twitter (@danielabarbosa and @synaptica). But if you stop by the Dow Jones booth, mention that you saw this post, drop your business card and tell them you want Daniela to send you some California sunshine, and I will send you something special from California. See ya in the cloud! (the cloud where sun and rain don't matter, that is!)

VideoSurf - a new way to search for video?

If you have been keeping up with my posts on this blog you won't be surprised to learn that today I spent my lunch hour exploring a video search offering that's new to me called VideoSurf. I was so interested in this new search tool that I interrupted my usual run of image indexing articles, and my lunch hour, to do some research and write up this post.

In a September press release VideoSurf claimed its computers can now "see inside videos to understand and analyze the content." I would encourage anyone who has an interest in this area to take a look at the company's website, give it a whirl and see what they think.


In my experience, video search engines have relied on a combination of the metadata that is linked to the video clips, scene and key frame analysis, and automatic indexing of sound tracks synched with the video.

For example, a sound track synchronised to the video content can be transformed to text and indexed. The indexed text can then be linked to sections of the video by looking for gaps in the video to identify scenes, with various techniques also used to create key frames that attempt to represent each scene. These techniques are backed up with metadata that accompanies the video clip.
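
As a rough illustration of linking indexed audio to scenes, here is a minimal sketch. It is not how VideoSurf or any particular engine works. It assumes a speech-to-text step has already produced time-stamped words and that scene boundaries have already been detected; the function name and the sample data are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

def index_transcript_by_scene(scene_starts, timed_words):
    """Map each spoken word to the set of scene numbers in which it occurs.

    scene_starts: sorted scene start times in seconds, the first being 0.0.
    timed_words: (time_in_seconds, word) pairs from a speech-to-text step.
    """
    index = defaultdict(set)
    for time, word in timed_words:
        # The scene containing `time` is the last one starting at or before it.
        scene = bisect_right(scene_starts, time) - 1
        index[word.lower()].add(scene)
    return index

# Hypothetical data: three scenes and three recognised words.
scene_starts = [0.0, 12.5, 40.0]
timed_words = [(1.2, "Vampire"), (13.0, "castle"), (41.5, "vampire")]

index = index_transcript_by_scene(scene_starts, timed_words)
print(sorted(index["vampire"]))
```

A search for "vampire" can then jump straight to the first and third scenes, which is only as useful as the dialogue is descriptive.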

If you have worked in the industry you know that video metadata is expensive to create. Most of what people see online is either harvested for free from other sources, or limited in size and scope. Such metadata may cover the title of a video clip, text describing the clip, clip length, etc. It may even include some information about the depicted content in the video, or even abstract concepts which try to specify what a clip is about. Though this level of video metadata is the most time consuming and complex to create, it also offers the fullest level of access for users.

Audio tracks can also be of great use, and many information needs can be met by searching on audio in a video. There are, however, limitations. For example, many VERY SCARY scenes have little dialogue in them and depend heavily on camera-work and music to give the feeling of fear. How easy is it to find these scenes based on dialogue alone, or even based on 'seeing inside a video'? How can you look for 'fear' as a concept?

Content-based image retrieval, looking at textures, basic shapes, and colours in still images, has yet to offer the promised revolution in image indexing and retrieval. In some contexts it works quite well; in many contexts end-users don't really see how it works at all. So adding a layer to video search that tries to analyse the actual content, pixel for pixel, is an interesting development.

To my mind, a full set of access paths to all the layers of a video still demands the use of fairly extensive metadata, especially for depicted content and abstract concepts. Up to now, metadata has always been the way to find what an image, whether it's still or moving, is conceptually about, and what can be seen in individual images and videos. Even when that metadata is actually sounds, turned into text and stored in a database.

Is VideoSurf's offering really any different from what's gone before?

Is this system, which seems to be using Content-Based Image Retrieval (CBIR) technology to some extent, a significant advance?

Reviewing some of the blog posts people have published it seems many others are interested in VideoSurf's offering as well.

For an initial idea as to how VideoSurf works, try taking a look at James McQuivey's OmniVideo blog post, "Video search, are we there yet?". As James describes in the article, one pretty neat aspect of what VideoSurf can do is to match faces, enabling you to look for the same face in different videos, thus reducing the need to have the depicted person mentioned in the metadata. However, this clearly isn't much help if the person you're looking for is mentioned but not depicted, in which case indexed audio would help, or if the person is not well depicted, for example only from the side or the back. Quibbles aside, if this works, then this is a pretty useful function in itself.

Here are some of the other bloggers who have been writing their thoughts on VideoSurf. For example:

Clearly, we're on the right track and there is a lot of interest in the opportunities and technologies around video search. However, I think that there is a long way to go before detailed and automatic object recognition is of any meaningful use to people. As far as I can see, it's still not there with still or moving digital images. Metadata for me is still the 'king' of visual search. There are, however, a growing number of needs that automatic solutions can already resolve, and a growing case for solutions that combine automatic computer recognition of image elements, metadata schemes, and controlled vocabulary search and browse support.

I'd love to know what people think about VideoSurf and other services that provide video search.

How the Semantic Web Will Change Information Management: Three Predictions from fumsi

fumsi is a digital and print publication that provides resources and tools for people who "find, use, manage & share information". They are part of the FreePint family of resources for professionals in the Information Management field. If you watch or subscribe to the Synaptica Central RSS feeds (right menu) you probably saw the recent pointer to the rich write-ups by James Kelway on Creating User Centred Taxonomies, also published on fumsi. James's personal blog, User Pathways, is another must-read if you want to learn about information management from an information architecture, interaction design, and user experience perspective, which I believe is extremely important in today's user-driven information experiences.

This Sunday morning's reading led me to catch up on my multiple feeds, and one item that caught my attention was an article in fumsi titled How the Semantic Web Will Change Information Management: Three Predictions. It is by Silver Oliver, who has a background in Library Science and is currently an Information Architect at the BBC.

Prediction number 1: a move from the pull to the push search paradigm, or more ‘context-aware’ applications

Today's information consumption still starts mostly with information seeking and retrieval, processes that are simply not sustainable for staying competitive in today's fast-moving, information-overloaded companies and cost-conscious enterprises. If you happened to be at Defrag this year and listened to my presentation on Pulling the Threads on User Data, you heard me speak about the need for context-aware applications and standards to make data portable. That leads to the first of Silver's predictions: "The Semantic Web could assist in this area, by publishing data in a way that smart applications can take advantage of and so improve smart context aware recommendations. The right thing, at the right place and at the right time".

Prediction number 2: the battle of the identifiers or the age of pointing at things

Recently here on Synaptica Central, Christine Connors, Director of Semantic Technologies at Dow Jones, published a post that touched on this subject titled "Taxonomies are a Commodity", which she ended with the following:
"I actually like the fact that taxonomies have become commoditized. Why? Competition drives improvement - in quality, in focus, in security and in usability. These are areas that the semantic web community needs to focus on - in my experience, security and usability need attention NOW. Good fences make good neighbors, and when we've got good fences, we can make more links and learn to trust. Icing on the cake!"

Prediction number 3: the changing role of the information professional

Silver ends this prediction with the following statement: "The skills of information professionals will be essential in populating and managing the Web of data and, to make this happen, we must make the shift from thinking repository-scale to thinking Web-scale."
Back in January 2008, I wrote a post over on my personal blog titled "Sexy Hot Trends for 2008 and Beyond- Librarians" where I highlighted some of the opportunities I saw for people with library science degrees (and no, you don't need to be female and wear purple tights! I just love that Super Librarian image!). So I obviously agree with Silver's prediction. The skill sets and experiences that information professionals can bring to the Semantic Web can be huge, and I certainly hope that the Semantic Web community continues to cross-populate even more with the InfoPro communities. Here at Dow Jones we are committed to doing our part to make sure that happens. Working with our InfoPro Alliance Group (headed by Anne Caputo, the new SLA president), we are looking to provide some Webinars in the new year on Semantic Web issues that need to be tackled in the enterprise, by Information Professionals as well as other parts of the organization. So watch this space for more info as we finalize those sessions!

Image|Flickr|Leo Reynolds

Super Librarian image above is from the NJ State Library, which includes the great Super Librarian Comic Book. You can also buy Super Librarian gear if you are so inclined.

Synaptica Announces SharePoint Integration

Our Synaptica product enhancement strategy is to continuously develop useful and innovative ways for our clients to use Synaptica for their taxonomy and metadata management needs. So it wasn't a surprise when some of our clients asked us to provide an 'out-of-the-box' integration point into SharePoint. We know firsthand about the issues with managing taxonomies in SharePoint from our own internal experiences as well as multiple client engagements over the last few years.

Microsoft SharePoint has over one hundred million licenses in place and its adoption continues to grow globally. In 2007 an IDC survey of 300 companies found 61% were deploying SharePoint enterprise-wide, and that 28% of those using SharePoint in specific departments were expected to expand usage to the enterprise within the next 12 months. A year later, things don't seem to be slowing down.

With that kind of adoption and penetration across so many industries, it is impossible to ignore the impact that SharePoint is having as a portal for information and document sharing both internally and externally to the enterprise. As a result, Synaptica is proud to announce an integration with SharePoint that addresses some of the known pain points that users have when trying to successfully use taxonomies within SharePoint to tag, search and discover documents and other content.

In this short video overview we take you through the core elements of our Synaptica: SharePoint Integration:

View Video directly

With this Synaptica integration you can:

1) Import a complete vocabulary into SharePoint as a list: This feature provides for the import, and update, of a vocabulary (taxonomy, thesaurus, authority file, etc.) into SharePoint creating a new list which may then be applied as a column to be linked to content within a document library. As the vocabulary is updated within Synaptica, one may update the list stored in SharePoint to make sure that the most current information is being stored and applied as metadata to documents and content.

2) Provide Dynamic access to Synaptica allowing users to tag content: Employing Web Services, this feature allows SharePoint to access a Synaptica system through either a keyword search or a navigable "tree browse", so that users can locate specific terms and apply them as metadata. This dynamic access makes sure that SharePoint users are employing standardized terminology to tag content, while the same vocabularies may be used across the enterprise and in other applications.

3) Provide Dynamic access to Synaptica for search and discovery: A SharePoint Web Part allows users to search or browse real-time through Synaptica vocabularies - using the same terms that have been applied to tag the content. This feature also can "direct" users to the proper terminology, as opposed to their having to guess at how a piece of content might have been tagged using an uncontrolled, free-text method.
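
To make the three features a little more concrete, here is a minimal sketch of the underlying idea: one hierarchical vocabulary that can be flattened into a list (as in the import) or searched by keyword (as in tagging and discovery). It does not use the Synaptica Web Services or any SharePoint API; the nested-dictionary taxonomy, the term names and both functions are hypothetical.

```python
# Hypothetical taxonomy: each key is a term, its value holds the narrower terms.
TAXONOMY = {
    "Finance": {"Banking": {}, "Insurance": {}},
    "Technology": {"Software": {"Content management": {}}},
}

def flatten(tree, path=()):
    """Yield every term as a full path string, parents before children."""
    for term, children in tree.items():
        current = path + (term,)
        yield " > ".join(current)
        yield from flatten(children, current)

def keyword_search(tree, keyword):
    """Return the paths whose final term contains the keyword, ignoring case."""
    needle = keyword.lower()
    return [p for p in flatten(tree) if needle in p.split(" > ")[-1].lower()]

# Flat list, comparable to what an import into a list column might hold.
flat_list = list(flatten(TAXONOMY))
print(flat_list)

# Keyword lookup, comparable to a user searching for a term to tag with.
print(keyword_search(TAXONOMY, "bank"))
```

Because tagging and search both draw on the same vocabulary, a user who searches for "bank" is steered to the same "Banking" term that was used to tag the content.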

With this initial iteration of the Synaptica: SharePoint integration, we hope to solve some of the biggest problems we hear about with users trying to better organize, tag and discover content within a SharePoint portal. We will be looking at expanding the integration over time and adding improved features as we learn more about how we can assist our customers and SharePoint users with these integral tasks.

For more information about this new integration, and to see if you qualify for a free two-week trial of Synaptica with the SharePoint Integration, please contact daniela.barbosa@dowjones.com or use this Contact us form to submit your details.

10 Rules of Successful ECM Implementation

Last week I attended AIIM’s ECM seminar on Automating Document-centric Processes – Is SharePoint Enough? It was a really interesting and informative event, with a few general sessions, several presentations of case studies, and product demonstrations from various vendors in the ECM realm.

AIIM President John Mancini closed the seminar with his 10 rules for successful ECM implementation:

  1. Build a strategy.
    When implementing an ECM solution, winging it is a bad idea. Especially if you are implementing a solution as viral as SharePoint, you should have a well-defined strategy. You should define business requirements, think about governance, analyze content systems, and identify points of integration. Formulating a strategy will save money and increase the likelihood of a successful project.
  2. Not all content is alike.
    You should think about the nature of the content you are trying to manage. Is it office-based content, transactional content, or persuasive/creative content? You need to pick a solution that matches your content.
  3. Prepare for eDiscovery.
    Sector-based regulations aren’t just a flash in the pan. Just because your business hasn’t had to deal with eDiscovery yet doesn’t mean you won’t have to in the future.
  4. Good enough is better than nothing.
    Doing something to get your content under control is better than doing nothing at all. You don’t have to start with the perfect solution.
  5. Ripping out and replacing is not usually a good starting point.
    This is especially true for more mature ECM organizations. If you have multiple repositories, you have to deal with them and think about policy structure around the information. Think about how you can provide access to information in those various repositories. Look for a vendor who will help with the integration challenge.
  6. Acknowledge the reality that this is a hybrid world.
    Paper is still part of the equation. Although we would like for everything to be digital, that is not the reality. Don’t get hung up on wanting everything to be digital—sometimes digitizing information can be too resource-intensive and unnecessary. Evaluate your strategy.
  7. Be militant about ROI and deployment times when thinking about projects.
  8. Consider alternate delivery models in your ECM approach.
    There will possibly be fewer IT people in the near future because of the economy. Consider hosted solutions as a way to lower risk for management.
  9. Spend some time on standardizing the front-end of your processes.
    For example, ask whether you are figuring out how to digitize things that should have been digital to begin with.
  10. Once you have something digital, keep it that way.
    Why have a digital process all the way until you have to sign a document? Rather than moving from digital to analog and back to digital, consider processes that will keep content digital.

I found this list to be very relevant to some of the work I've been doing lately. Often I talk to clients who are implementing an ECM solution, but they haven't formulated a clear strategy yet. Organizations usually have content stored in several repositories, and employees don't know how to access that information, assuming they even know it exists. That's why we suggest an assessment prior to implementing a new solution. An assessment can be conducted internally if the resources are available, or our Taxonomy Services team can perform one for you. An assessment will help you identify your various content repositories and develop a strategy to access that siloed information.