We all use different ways to find what we need online. One of them is to browse through information that has been organized for you. When you go to the Business section of the New York Times, or look at the items for sale from a particular vendor, you are using a structure that somebody set up with lists or tags. This works well for content that is fairly uniform and controlled by a single person or group. Inside a company, for example, you might have a set of case studies organized by the product line, geographic region, and customer industry they cover. But as we discussed in the last post, this model breaks down over time when you apply it to all the information in a dynamic organization. The structure falls out of date, content isn’t put in the right places, and it becomes harder and harder to find what you need.
The same problem arose on the Internet – every site was organized differently, and nobody could keep track of them all. Even if the perfect web page was out there, you couldn’t find it. Google came up with a solution that scaled incredibly well – use search instead. But search is not easy to do well, because of the sheer number of pages that match almost anything you type in. I tried “machine learning”, for example; even for that pretty narrow search, Google found 54,600,000 results. I obviously don’t want 54 million web pages – it would take me lifetimes to read them all. I want Google to figure out the best ones and put them at the top of the list. The magical thing that Google realized – the foundation of their initial success in search – was that people had already identified which of those millions of web pages are great, by hand-creating links to them. Pages with many incoming links are probably very good, and they are the most likely to answer whatever question I had in mind when I did my search.
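To make the incoming-links idea concrete, here is a minimal sketch that orders matching pages by how many distinct pages link to them. It is a deliberate simplification and not Google's actual algorithm, which also weighs how important each linking page is. The function name, page names, and link data are all made up for illustration.

```python
from collections import Counter


def rank_by_incoming_links(links, matching_pages):
    """Order matching pages by how many distinct pages link to them."""
    incoming = Counter()
    # set() drops duplicate links so one page cannot vote twice for the same target
    for source, target in set(links):
        if source != target:  # a page linking to itself is not a vote
            incoming[target] += 1
    # Most-linked first; ties are broken alphabetically so the order is stable
    return sorted(matching_pages, key=lambda page: (-incoming[page], page))


# Hypothetical toy web: (page containing the link, page it points to)
links = [
    ("blog", "tutorial"),
    ("forum", "tutorial"),
    ("news", "tutorial"),
    ("blog", "course"),
    ("news", "course"),
    ("course", "course"),
    ("forum", "glossary"),
]

# Pages that matched the search text, in no particular order
matches = ["glossary", "course", "tutorial", "orphan"]

ranked = rank_by_incoming_links(links, matches)
print(ranked)
```

The page nobody links to ends up at the bottom, which is exactly the situation most business documents are in.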
But business information is largely composed of documents, and they don’t link to each other. So the approach that Google uses to find the best web pages isn’t going to work when you are looking for a spreadsheet. We had to find a different way to get data, which in machine learning terminology is sometimes called “signal”. Where do we get signal for business information? We can’t ask users to vote – practically nobody will do anything that doesn’t solve their immediate problem as quickly as possible. But they do “cast a vote” by spending time looking at, downloading, and making personal copies of the content that they need.
So our first and best signal is based on user activity – what people do as part of getting their jobs done. If everyone is really interested in a particular deck and spending a lot of time looking at it, that’s a great signal of its relevance. And we can go a lot further. We provide many ways for users to work with content on Highspot, such as downloading the file or leaving a comment. These activities are further hints about the items that people find useful. If somebody searches on the name of a product the company sells, content that matches that name and is in active use is going to rank high in the search results.
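As a rough illustration of ranking by activity, the sketch below adds up a weight for every action taken on an item and uses the total to order search matches. The action names, the weights, and the sample data are assumptions chosen for the example; they are not the values Highspot uses, and the real system draws on many more hints than this.

```python
from collections import defaultdict

# Hypothetical weights: actions that take more effort count as stronger votes
ACTION_WEIGHTS = {"view": 1.0, "comment": 2.0, "download": 3.0, "copy": 4.0}


def activity_scores(events):
    """Sum the weight of every recorded action, per item."""
    scores = defaultdict(float)
    for item, action in events:
        scores[item] += ACTION_WEIGHTS.get(action, 0.0)  # unknown actions add nothing
    return dict(scores)


def search(query, titles, events):
    """Return items whose title contains the query, most actively used first."""
    scores = activity_scores(events)
    hits = [item for item, title in titles.items() if query.lower() in title.lower()]
    # Highest activity first; ties are broken by item id for a stable order
    return sorted(hits, key=lambda item: (-scores.get(item, 0.0), item))


# Made-up content and activity log: (item, action)
titles = {
    "deck-1": "Acme Router overview deck",
    "sheet-7": "Acme Router pricing spreadsheet",
    "doc-3": "Onboarding checklist",
}
events = [
    ("deck-1", "view"),
    ("deck-1", "view"),
    ("deck-1", "download"),
    ("sheet-7", "view"),
    ("doc-3", "copy"),
]

results = search("acme router", titles, events)
print(results)
```

Both Acme Router items match the text, but the deck has seen more use, so it comes first.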
The Knowledge Graph
Highspot uses all the information we gather to build what we call the Knowledge Graph. It computes things like how much influence each person has on each of their co-workers. For example, if you follow me, and you have historically looked at a lot of items that I’ve uploaded, that’s a hint that content from me is a good bet for meeting your needs. Highspot takes in hundreds of hints like these, captures them in the Knowledge Graph, and uses them to rank the results of a search and to choose content for your home page. The more your company uses Highspot, the smarter we can be. The Knowledge Graph helps us connect everyone in your company with the content that will best help them get their job done. We want to help you manage all the content you have so you can put your knowledge to work.
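One way to picture an influence score is sketched below: it measures what share of a person's views went to each co-worker's uploads, then boosts co-workers that person follows. The formula, the 1.5 boost, and every name in the example are hypothetical. The real Knowledge Graph combines hundreds of hints, and this is only a toy with two of them.

```python
from collections import Counter

FOLLOW_BOOST = 1.5  # hypothetical multiplier for co-workers the viewer follows


def influence_on(viewer, views, uploader_of, follows):
    """Estimate how much each co-worker's content influences `viewer`.

    views:       list of (person, item) pairs, one per view
    uploader_of: dict mapping item -> person who uploaded it
    follows:     set of (follower, followed) pairs
    """
    counts = Counter(
        uploader_of[item]
        for person, item in views
        if person == viewer
        and item in uploader_of
        and uploader_of[item] != viewer  # viewing your own uploads is not influence
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        author: (n / total) * (FOLLOW_BOOST if (viewer, author) in follows else 1.0)
        for author, n in counts.items()
    }


# Made-up data: bob follows alice and mostly reads what she uploads
uploader_of = {"deck-a": "alice", "deck-b": "alice", "faq": "alice", "memo": "carol"}
views = [
    ("bob", "deck-a"),
    ("bob", "deck-b"),
    ("bob", "faq"),
    ("bob", "memo"),
    ("carol", "deck-a"),
]
follows = {("bob", "alice")}

influence = influence_on("bob", views, uploader_of, follows)
print(influence)
```

A score like this could then be folded into ranking, so that new content from alice is shown to bob ahead of content from people whose uploads he rarely opens.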