The Trouble with Tags

Posted in:  Artificial Intelligence, Content Management, Data Science, Editor's Picks, Marketing

Have a lot of content you need to manage and share, but nobody can find anything on your internal network?

Maybe there is an easy solution: if you could just tag each item the right way, then everyone could quickly find exactly what they need. It doesn’t sound that hard. First, you have to figure out the right tags to use. If you are dealing with marketing material, you might label everything with a type, such as “pitch deck,” “whitepaper,” or “price sheet.” You might label it with the set of products that it covers. And the relevant technology trends. And regions where it applies. And so on. Now you have to tag all your items. Well, not all of them, because that would take way too long to be practical. So, you choose a small subset and hope that you picked the right ones. Did you leave out anything that somebody might really need?

Even with just a small fraction of your content, it takes a lot of time to do all that tagging. But just maybe, with a little discipline, your problems would then be over.

We’ve talked to hundreds of customers who have tried it, and they’ve all had the same experience — after you use the system for a while, the results are dismal. And they get steadily worse. Why is that?

Everyone is an Australian Banker

Once upon a time, I was involved with a product that had a lot of users, and we wanted to get to know them better. So, we had a great idea — let’s create some content that they would really like, and the price for getting that content was to answer just two questions: what country are you in, and what is your occupation? Basically, we were asking everyone to tag themselves.

We waited impatiently for the results to pour in so we could really start getting to know our users. To our surprise, a huge percentage of them turned out to be Australian bankers, which was odd for a product mostly used by IT folks in the developed world. The reason, of course, was that Australia was one of the first countries in the dropdown, and banking was the first occupation.

Our users had zero interest in jumping through pointless hoops to satisfy our curiosity — they wanted the content, and they wanted it now. They did whatever they could to get rid of our irritating questions. Very few people are willing to put up with overhead that doesn’t benefit them immediately, and that’s what tagging systems often feel like. So, most things never get tagged at all — people try to dodge the system as much as they can. When you force them to use it, they often do the quickest possible thing to make it go away, and you end up with a lot of documents about banking in Australia.

All Things to All People

Customers also run into the opposite problem. We were visiting one of them who was using a tagging system, and asked how many pieces of content they had in the system. “There are 2,204 items in it.” Then we asked for a common term somebody might actually search for. “How about ‘cloud platform’?” Great, let’s search for that, we suggested. They did. How many items came up? “Ummm, 1,650.”

The problem is that the people authoring content are very excited about it, and they want everyone to find it. Which sales stage is this whitepaper good for? All of them, of course! So, they pepper the item with every tag that might possibly apply. Then when you search, you get back practically every item in the system.
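To put a number on that story, here is a minimal Python sketch of how little a tag tells you once it is on almost everything. The function names are made up for illustration, and the inverse document frequency measure is borrowed from standard search practice rather than from anything this customer's system actually computed. The 2,204 and 1,650 figures come from the conversation above; the 22-item tag is a hypothetical comparison point.

```python
import math


def tag_selectivity(matching_items: int, total_items: int) -> float:
    """Fraction of the whole collection that a tag search returns."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    return matching_items / total_items


def inverse_document_frequency(matching_items: int, total_items: int) -> float:
    """log(N / n): close to zero when a tag is on nearly every item."""
    if matching_items <= 0 or total_items <= 0:
        raise ValueError("counts must be positive")
    return math.log(total_items / matching_items)


# Figures from the customer conversation above.
total = 2204
cloud_platform_hits = 1650

share = tag_selectivity(cloud_platform_hits, total)
idf_broad = inverse_document_frequency(cloud_platform_hits, total)

# Hypothetical tag applied to only 22 items, for comparison.
idf_narrow = inverse_document_frequency(22, total)

print(f"'cloud platform' returns {share:.0%} of the collection")
print(f"IDF of the over-used tag: {idf_broad:.2f}")
print(f"IDF of the hypothetical narrow tag: {idf_narrow:.2f}")
```

A search that hands back roughly three out of every four items has barely narrowed anything down, and the near-zero score for the over-used tag says the same thing in search-engine terms.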

Wait, Which Tag Am I Supposed To Use?

Another problem is that the list of tags is never right. They would have to be defined by some all-knowing expert who understands every activity happening in the company. This remarkable person knows every important characteristic that people might need to look for in a document, and they keep up instantly with the changing nature of the business. Did you refactor your sales territories? Change the products you offer? Have a new technology emerge? A new competitor?

The tag wizard leaps to the rescue, instantly updating the tag hierarchy when that happens. And what about all the content out there that is now using the old out-of-date tags? Who updates all of those? The reality is that nobody can design and maintain a tag system that keeps up with the pace of modern business, and nobody has time to update all the tags in the system whenever the business changes. So, the tags that you are using, and the tags that every piece of content has on it, get more and more out-of-date and disconnected from reality. It’s just not a workable model at any reasonable level of scale.

There is a Better Way

The same problem played out on the Internet. A company called Yahoo got famous for building a directory of the Internet, with teams of people on staff tagging websites and the best webpages by hand. It sort of worked for a while, but as the Web exploded in size, even a large team of dedicated and expert taggers had no hope of keeping up.

Then along came Google with a completely different approach. Instead of relying on humans to tag pages, they let algorithms do the work, analyzing the pages themselves and the links between them, and later applying machine learning to the problem. Thus was born search that actually works. And it gets smarter over time: more content and more users make it better, not worse. How to apply that model to your data is the topic of our next post.
