How to improve search on HBR?

The goal of any content-driven website is to help its users discover relevant content. Most websites choose ‘search’ as their tool for information discovery (Netflix is a mind-blowing exception to this), and in search, relevance depends heavily on context rather than just on matching strings. Not many websites realise this, and I was surprised to see that Harvard Business Review, known for having one of the best content repositories, gives barely useful results for my search queries.

Now, I am a happy subscriber of Harvard Business Review and use it often to enhance my limited knowledge by discovering interesting content around technology, business, leadership, etc. I have absolutely no doubt about the high quality of its content and its really great contributors. But I am disappointed at how difficult that content is becoming to discover in the first place. Here is what its search looks like:

All looks well: it has a very familiar interface, like most other sites, and the first result does contain the word I searched for, “search”. Don’t go away yet, stay with me till the end 🙂

In my quest to learn more about Artificial Intelligence, I keep looking everywhere for anything related to AI. I thought, let’s see what the repository has on AI. I first search for artificial intelligence and look at the results:

The first result is a “Sales and Marketing” case study about an early-stage company, Empathetics (an organization that teaches empathy to healthcare professionals and staff to improve the patient experience), and I wonder why that is the top result (when sorted by relevance!) for what I searched for. I open it and scratch my head really hard to figure out what exactly is related to AI there, but I cannot find anything. I scroll down and, to my despair, the other results are just as out of this world. Here they are:


Then I thought of trying the ‘exactness’ trick – I searched for “Artificial Intelligence” and boom!

ZERO results! Apparently it does not support exact-phrase search, which ideally it should. I knew that zero could not be the right answer, as the site does have articles on AI. Examples:

It gives irrelevant results in search even though the right content is very much there in the repository.

Moving on, I take my chances with the abbreviation and search for ‘AI’. Here is what happens:

Do you notice it? There is an author, Ai-Ling Jamila Malone, whose name contains “AI”, and the search is simply showing me all the articles by that author. This means it is (probably) giving a higher score to words found in the wrong field (the author field) than to words found in the content itself, and that too without any context.

Now, this could have been done deliberately, on the assumption that most people want to search for author names, but hey, that use case CAN be handled in a better way.
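To make the field-weighting point concrete, here is a minimal sketch of keyword scoring in which a match in the title or body counts for more than a match in the author name. I have no idea how the site actually scores results; the weights, the `score` function and the two sample documents (including the author "Jane Doe" and both titles) are assumptions made up purely for illustration.

```python
import re

# Hypothetical field weights: content fields outrank the author field.
FIELD_WEIGHTS = {"title": 3.0, "body": 2.0, "author": 0.5}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(doc, query):
    """Sum weighted counts of the query tokens found in each field."""
    query_tokens = tokenize(query)
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        tokens = tokenize(doc.get(field, ""))
        total += weight * sum(tokens.count(t) for t in query_tokens)
    return total


# Made-up sample documents, not real articles.
docs = [
    {"title": "Managing remote teams",
     "author": "Ai-Ling Jamila Malone",
     "body": "Tips for distributed teams."},
    {"title": "What AI can do for your business",
     "author": "Jane Doe",
     "body": "AI is changing how firms compete."},
]

ranked = sorted(docs, key=lambda d: score(d, "AI"), reverse=True)
for d in ranked:
    print(score(d, "AI"), d["title"])
```

With weights like these, the article that is actually about AI ranks above the one that merely has “Ai” in its author's name, while a deliberate author search can still be served by a separate author filter.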

Moving further, I check another hot topic – Deep Learning – and the default results (sorted by relevance) appear to be relevant (notice that it now shows me results for artificial intelligence as well).

But the articles are old and I need the latest ones. The moment I try sorting by publication date, I see this:

The first result may seem somewhat relevant given that it has AI in its title, but it is not even remotely related to “Deep Learning”. The only reason it appears in the search results is that the word “deep” appears in one paragraph somewhere and the word “learning” appears in another.

The other results are the same, and one of them tops the chart of weirdness for me. They have the words “deep” and “learning” somewhere in them, and hence they are shown. An important point to consider: I want the latest but still relevant results, and the search fails at that. It does not identify ‘deep learning’ as a concept made of two words and simply looks up the individual words anywhere in the content.
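The difference between “both words appear somewhere” and “the phrase appears” is easy to show in code. The sketch below is only an illustration under my own assumptions: the helper names and the three sample documents are made up, and a real search engine would use a positional index rather than scanning text. The idea is to filter by phrase first and only then sort by date, so that “latest” never overrides “relevant”.

```python
import re
from datetime import date


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_all_words(text, query):
    """Bag-of-words match: every query word appears somewhere."""
    tokens = set(tokenize(text))
    return all(word in tokens for word in tokenize(query))


def contains_phrase(text, phrase):
    """Phrase match: the query words appear adjacent and in order."""
    tokens, wanted = tokenize(text), tokenize(phrase)
    n = len(wanted)
    return any(tokens[i:i + n] == wanted for i in range(len(tokens) - n + 1))


def latest_relevant(docs, phrase):
    """Keep only phrase matches, then sort them newest first."""
    hits = [d for d in docs if contains_phrase(d["body"], phrase)]
    return sorted(hits, key=lambda d: d["published"], reverse=True)


# Made-up sample documents, not real articles.
docs = [
    {"id": "a", "published": date(2017, 6, 1),
     "body": "A deep dive into pricing. Learning from customers matters."},
    {"id": "b", "published": date(2016, 3, 1),
     "body": "Deep learning models now power image recognition."},
    {"id": "c", "published": date(2015, 1, 1),
     "body": "Why deep learning needs large data sets."},
]

print([d["id"] for d in latest_relevant(docs, "deep learning")])
```

Document "a" is the newest and contains both words, so a words-anywhere search sorted by date would put it on top; the phrase filter drops it before the date sort is applied.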

The root causes for all of it can be summarised as follows:

  1. The search on the site still relies on basic keyword-based scoring and has no Ontology of concepts like “Artificial Intelligence” or “Deep Learning”, or of the relationships between those concepts.
  2. It does not account for synonyms and hence is unable to understand that “Artificial Intelligence” and “AI” are the same concept. An Ontology makes it much easier to maintain all the synonyms of any given concept.
  3. It does not identify entities, so it is unable to differentiate between the name of a person, “Ai-Ling Jamila Malone”, and a concept, “AI”.
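The three points above can be sketched together in a few lines. This is a toy under stated assumptions, not how our platform or any real search engine is built: `ONTOLOGY` is a hypothetical two-concept dictionary, `KNOWN_PEOPLE` is a hard-coded list standing in for a real entity recogniser, and the sample texts are made up.

```python
import re

# Hypothetical mini-ontology: canonical concept -> surface forms (synonyms).
ONTOLOGY = {
    "artificial intelligence": ["artificial intelligence", "ai"],
    "deep learning": ["deep learning"],
}

# Hypothetical list of person names, standing in for entity recognition.
KNOWN_PEOPLE = ["Ai-Ling Jamila Malone"]


def normalise(text):
    """Lowercase the text and join its alphanumeric tokens with spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def mask_people(text):
    """Replace known person names so their parts cannot match a concept."""
    for name in KNOWN_PEOPLE:
        text = re.sub(re.escape(name), " PERSON ", text, flags=re.IGNORECASE)
    return text


def concepts_in(text):
    """Return the canonical concepts whose surface forms occur in the text."""
    padded = f" {normalise(mask_people(text))} "
    return {
        concept
        for concept, forms in ONTOLOGY.items()
        if any(f" {normalise(form)} " in padded for form in forms)
    }


def matches(doc_text, query):
    """A document matches when it shares at least one concept with the query."""
    return bool(concepts_in(query) & concepts_in(doc_text))


print(matches("How artificial intelligence changes strategy", "AI"))
print(matches("By Ai-Ling Jamila Malone: managing teams", "AI"))
```

Because both the query and the documents are mapped to concepts, a search for ‘AI’ finds text that only says “artificial intelligence”, while the author's name is masked as a person before any concept lookup happens.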

Based on how we have designed information discovery for our Data as a Service (DaaS) platform, I can say that the primary reason for all of these problems is the missing Ontology, which leads to missing Entity Recognition and disambiguation.

Search is important, but in itself it is not always the best way to discover information. Users are not happy to see millions of search results for what they searched; that only adds to information overload. What matters is this: if you are telling me there are so many possible results out there, then tell me how they are distributed across the different dimensions around my ‘interest’. Let me choose a direction, and don’t force me to go through all the results in a linear fashion – nobody will live long enough to reach the millionth result page. It is not just Innoplexus; I know there are a few other companies out there that follow this philosophy and make Information Discovery easier for their users. One example, which I also mentioned above, is Netflix.
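Showing how results are distributed across dimensions is essentially faceting. Here is a minimal sketch, assuming each result carries a few metadata fields; the field names (`topic`, `year`, `type`) and the sample results are hypothetical and chosen only for illustration.

```python
from collections import Counter


def facet_counts(results, dimensions):
    """Count how the results are distributed along each dimension."""
    return {
        dim: Counter(r[dim] for r in results if dim in r)
        for dim in dimensions
    }


def narrow(results, dimension, value):
    """Let the user pick a direction: keep results with the chosen value."""
    return [r for r in results if r.get(dimension) == value]


# Made-up sample results, not real articles.
results = [
    {"title": "r1", "topic": "Technology", "year": 2017, "type": "Article"},
    {"title": "r2", "topic": "Technology", "year": 2016, "type": "Case study"},
    {"title": "r3", "topic": "Leadership", "year": 2017, "type": "Article"},
]

facets = facet_counts(results, ["topic", "year", "type"])
for dim, counts in facets.items():
    print(dim, dict(counts))

print([r["title"] for r in narrow(results, "topic", "Technology")])
```

Instead of paging through every hit, the user sees the counts per topic, year and type, and picks a direction to narrow down.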

A true information discovery tool has to be a Digital Gyroscope that helps one explore the Data Universe by giving a sense of all possible directions and keeps one from getting lost in hyperspace.

Thanks to Ravi Ranjan for reviewing the article and helping with the headline 🙂

Looking at the world’s best restaurants for Innovation

I came across this article on HBR today which looks at the world’s best restaurants to find out how they balance Innovation and Consistency.

Here’s the summary:

Despite being able to charge hundreds of dollars for a meal and being fully booked months in advance, top restaurants often still have a hard time turning a profit. And they face an added challenge of maintaining flawless consistency, while simultaneously being innovative and cutting-edge. This requires dedicated time and space for research and experimentation, as well as a thorough process for both iterating on and standardizing new inventions. Examples of restaurants that have made both the Michelin Guide and 50 Best Restaurants of the World list show how they encourage creativity and learning beyond the leadership or lab teams, and generate, refine, and standardize ideas.

The original article is here –

All the projects follow a specific development process, alternating between collective ideation or feedback and focused work by a small team. For restaurant dishes, the development team will quickly prototype and iterate through numerous versions of the dish and its components, either in the lab or, if a lab is not available, in the main kitchen during slow hours. The trials can go on for months as numerous variations are tested in a race against seasonal ingredients.

This just goes to show that Consistency and Creativity are not mutually exclusive – restaurants need both at the same time. Innovation that is not the result of a set process will soon become unsustainable – one may be able to innovate without a process at times, but no one can repeat the feat again and again. That consistency is possible only with a process.