Semantic Search Engine #2457
-
Hello!
I'm interested in creating a semantic search engine for the Codecademy docs, and I wanted to start a discussion about doing this. The ChatGPT hype has inspired a lot of projects where users can basically talk to their documentation. Although this is a great solution for talking to ChatGPT based on the context of a document, I wanted to see if we could use one specific aspect of this design to make the docs more accessible: the vector search. Pinecone has a good article that goes over the basics of what a vector search is.
From my understanding, the docs currently do not have a search feature, although the UI is organized in a tree-like structure. As a result, when users are looking for a specific document, they have to start with a broad topic and then narrow down to find the article they want. This might be fine at the current size of the documentation, but as the documentation scales it might become harder for a user to find the exact material they are looking for. In addition, starting broad and then narrowing down carries a risk of missing relevant articles that live in other branches of the tree. This is a missed opportunity for the user.
With vector search, a user can input a query (such as a sentence), and the search engine can return articles whose content matches the user's query. This approach is more scalable, and also more precise than browsing the tree, mainly because a vector database can serve every article that is within a specific threshold of similarity to the query. Now, when a user queries "algorithms" they may get articles in JavaScript, Machine Learning, and Python, while previously they may have just narrowed down to Machine Learning.
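To make the threshold idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `ARTICLES` data is made up, and `embed` is a toy bag-of-words stand-in, not a real embedding model. A real setup would presumably swap in a learned embedding model and a vector database, which is what lets it match on meaning rather than on shared words.

```python
import math
import re
from collections import Counter

# Hypothetical articles; titles and bodies are invented for this example.
ARTICLES = {
    "JavaScript: Sorting": "sorting algorithms in javascript arrays",
    "Machine Learning: Clustering": "clustering algorithms for machine learning",
    "Python: Search": "search algorithms in python",
    "CSS: Flexbox": "flexbox layout for css containers",
}

def embed(text):
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def search(query, articles, threshold=0.2):
    """Return titles whose similarity to the query meets the threshold, best first."""
    query_vec = embed(query)
    hits = []
    for title, body in articles.items():
        score = cosine_similarity(query_vec, embed(body))
        if score >= threshold:
            hits.append((score, title))
    return [title for score, title in sorted(hits, reverse=True)]

results = search("algorithms", ARTICLES)
```

With this toy data, `results` contains the JavaScript, Machine Learning, and Python articles and leaves out the CSS one, regardless of which branch of the tree each article lives in. The `threshold` value of 0.2 is arbitrary and would need tuning against real queries.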
I'd love to build this for the community! Although I think some buy-in is needed from Codecademy. Let me know what you guys think! More information can be provided upon request.
cc: @yangc95 @aherman91 @caupolicandiaz @HishamT @KTom101 @SSwiniarski
-
This is an amazing idea. When I started reading your proposal I was thinking that we already have a search feature on the Codecademy website, and I've tried it myself. It gives you a plethora of results when you search for keywords, like in this image:
image
I've also noticed that if you are more specific it returns the right result. But I love the idea of getting results on the site using a full sentence.
image
image
Also, if we agree on working on this, I would love for it to be open sourced 😅. I'll be reading the article now; you gave me a good thing to read while I'm travelling.
-
If this were a vector search, I would think the database would at least return articles that have "kotlin" in them.
-
Haha yep, to be honest I missed it the first time too. It doesn't work with sentences, though, as you said above. Wait, I'll link the result of a search down below.
image
The results were from Python and JS.
-
Exactly, right now the search doesn't give the right results for sentences. @mdwiltfong, I think if the higher-ups decide to optimise the search engine, it would be a great contribution from your side 👍🏻👏
-
Exactly!
What's also interesting is that the broader your query, the more results. For example "C" returns over 250 results.
-
If we think about it, there is so much we can do with this. We can prioritise results from the topics a person is studying right now: if the topic they are searching for exists in docs or articles, push the results from the domain of that individual's learning paths to the top.
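A rough sketch of that prioritisation in Python, just to show the shape of the idea. The result tuples, the `learning_path_topics` set, and the `boost` value are all hypothetical; nothing here reflects how Codecademy's search actually stores or ranks results.

```python
def personalize(results, learning_path_topics, boost=0.25):
    """Re-rank (title, topic, score) results, boosting topics the user is studying.

    `boost` is an arbitrary illustrative value added to the relevance score
    of any result whose topic is in the user's learning paths.
    """
    def boosted_score(item):
        title, topic, score = item
        return score + boost if topic in learning_path_topics else score

    return sorted(results, key=boosted_score, reverse=True)

# Hypothetical search results: (title, topic, relevance score).
raw_results = [
    ("C: Pointers", "C", 0.80),
    ("Python: Lists", "Python", 0.70),
    ("JavaScript: Arrays", "JavaScript", 0.65),
]

# Hypothetical user who is currently on a Python learning path.
ranked = personalize(raw_results, learning_path_topics={"Python"})
top_titles = [title for title, topic, score in ranked]
```

Here the Python article moves ahead of the C article even though its raw score is lower. An additive boost keeps results from outside the learning path visible, so they are ranked lower rather than filtered out.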
-
Quick edit to my proposal.
From a tooling perspective, my suggestion most likely wouldn't be a good replacement for the existing search. If I were to pivot my proposal, though, I'm wondering whether Codecademy would be interested in an AI-assisted search feature targeted specifically at their docs, similar to what LangChain does in their docs.
-
Hey @mdwiltfong, thank you for putting so much thought into how to make the Docs experience better for people! I think this conversation is incredibly interesting, and I have sent it along to members of our product development team for consideration.