Enterprise Search is Better Than You Imagine
(And You Need It More Than You Realize)
An ideal enterprise search system (also known as “cognitive search” or an “insight engine”) quickly retrieves the documents, conversations, and other content most relevant to the searcher, across all important internal knowledge systems. Historically, the question “What is the PageRank equivalent at work?” has not been well answered. Many challenges have prevented such a system from actually existing, such as:
- Building integrations for every knowledge system
- Representing and enforcing fine-grained permissions in real time
- Understanding who the user is, and therefore what knowledge is most relevant to them
But things have changed. With the dominance of cloud-based collaboration products, modern APIs built into those systems, and universal identity across systems, it’s finally possible to solve enterprise search effectively.
The last quarter-century in workplace search
Google’s 1999 post-raise business plan focused on three revenue streams:
- Showing ads
- Licensing search technology
- Enterprise search
That last piece ultimately shipped as the Google Search Appliance, a product that floundered for years before finally being retired in 2016. Despite a tremendous amount of effort, Google never overcame several key technical obstacles:
- A lot of the content lived in files on people’s hard drives (or in the best case, on file shares)
- Usernames differed across disparate systems
- Most systems had no APIs for getting data out
- Most systems also had limited or no ability to export permission data
Even if an organization successfully deployed agents to collect the data, it still couldn’t determine what information was most relevant to an individual employee, what that employee should even have access to, or, most importantly, “What results should go on page 1?”
Google certainly wasn’t the only player to try and fail in this space. Millions of dollars have been spent on fundamentally broken solutions.
A confluence of several factors has vastly improved today’s landscape for enterprise search:
- Organizations have coalesced on a set of best-of-breed collaboration tools with modern APIs (Slack, Google Workspace, Teams, etc.)
- A common identity layer across systems (Okta, Azure AD, etc.) has made it practical to deeply understand identity across diverse systems
- Modern hybrid search engines (such as Vespa) offer robust scalability, sophisticated search, and machine learning, which older technologies (such as Elastic or Solr) lacked
At the same time, some of those same macro trends have dramatically increased the need for great enterprise search:
- The explosion of SaaS tools has siloed knowledge and made it increasingly painful to find helpful information at work. Employees frequently run the same search across multiple knowledge silos.
- The dramatic shift to remote/hybrid work makes it difficult for employees to visualize social connections and intuitively understand “who knows what” in organizations
So now we have the necessary pieces in place combined with a burning need to solve the core problem: “How do I put the thing the user wants on the first page of the results?”
All sorts of attempts to solve search relevance at work - TF-IDF, BM25, the semantic web, PageRank for work, and even modern semantic search - have been largely unsuccessful. Those approaches tried to solve the problem from an “information-only perspective” - take all of the text, put it in an index, and try to use the text data in isolation to create a relevance model across all the content.
The biggest problem with this approach is that it ignores the clearest signal that a user might care about a given document or conversation: that it involves them or their closely connected coworkers! It turns out that enterprise content creation and distribution closely resembles a social network - so what if we apply social network techniques to enterprise search?
Globally important content (all-hands meeting notes, important company-wide policy documents) needs to be highly visible when relevant, in the same way that viral, popular social posts get high visibility. But most of the time, people really care more about things directly relevant to them personally:
- their own stuff
- their team’s stuff
- stuff directly related to their day-to-day work
This breakthrough insight (marrying a dynamic social graph to modern hybrid search techniques) is the final piece of the puzzle, turning this immense challenge into a tractable set of problems.
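To make the idea concrete, here is a minimal sketch of blending a text relevance score with a social-graph signal. Everything in it is an assumption made for illustration: the `Doc` fields, the `social_affinity` and `rerank` helpers, the weights, and the toy collaboration graph are all hypothetical, and none of it describes how Atolio or any particular search engine actually ranks results. The `text_score` stands in for whatever a hybrid engine would return for the query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    author: str
    text_score: float           # hypothetical relevance score from a text/hybrid engine, in [0, 1]
    company_wide: bool = False  # e.g. all-hands notes or a company-wide policy


def social_affinity(user, author, graph):
    """Return 1.0 for the user's own content, 0.5 for a direct collaborator, else 0.0."""
    if author == user:
        return 1.0
    if author in graph.get(user, set()):
        return 0.5
    return 0.0


def rerank(user, docs, graph, social_weight=0.5, global_boost=0.3):
    """Sort docs by text score plus an illustrative social and company-wide boost."""
    def score(doc):
        total = doc.text_score + social_weight * social_affinity(user, doc.author, graph)
        if doc.company_wide:
            total += global_boost
        return total

    return sorted(docs, key=score, reverse=True)


# Toy collaboration graph: ana and ben work closely together.
graph = {"ana": {"ben"}, "ben": {"ana"}}

docs = [
    Doc("stranger-spec", "zed", 0.80),
    Doc("teammate-notes", "ben", 0.70),
    Doc("own-draft", "ana", 0.60),
    Doc("all-hands", "ceo", 0.55, company_wide=True),
]

ranked_for_ana = [d.doc_id for d in rerank("ana", docs, graph)]
print(ranked_for_ana)
```

In this toy data, the document with the best text match was written by someone ana has no connection to, so her own draft and her teammate's notes rank above it, and the company-wide document gets a modest lift as well. A real system would presumably learn these weights rather than hard-code them and would derive the graph from actual collaboration activity.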
What should be shown on the home page, before the user even types a search query? The social media model suggests a crisp answer: show the results most relevant to the user.
I think it’s safe to say that everyone is looking forward to a world where a company’s collective knowledge is usefully in the hands of everyone. After all, as HP CEO Lew Platt famously said:
“If HP knew what HP knows, we’d be three times more productive.”
David Lanstein is co-founder and CEO at Atolio - workplace search for the modern company.
Atolio is the first good enterprise search tool I've seen, among dozens of failed attempts over two decades. By taking a fresh approach using the collaborative graph inside organizations, Atolio is finally doing for enterprise search what Google did for the web: finding what you want.