Semantic Search Engine #2457

mdwiltfong started this conversation in Ideas
Jun 10, 2023 · 3 comments · 7 replies
Hello!

I was interested in creating a semantic search engine for the Codecademy docs, and I wanted to start a discussion about it. The ChatGPT hype has inspired a lot of projects where a client can basically talk to their documentation. That is a great way to chat with ChatGPT using a document as context, but I wanted to see if we could use one specific part of that design to make the docs more accessible: the vector search. Pinecone has a good article that goes over the basics of what a vector search is.

From my understanding, the docs currently do not have a search feature; instead, their UI is organized in a tree-like structure. As a result, when users are looking for a specific document, they have to start with a broad topic and then narrow down to find the article they want. This might be fine at the current size of the documentation, but as the documentation scales it may become harder for a user to find the exact material they are looking for. In addition, starting broad and then narrowing down risks missing relevant articles that sit under a different topic. This is a missed opportunity for the user.

With vector search, a user can input a query (such as a sentence), and the search engine can return articles whose content matches the user's query. This approach is more scalable, and can be more precise, than the current way of browsing, mainly because a vector database can serve every article that is within a specific threshold of similarity to the query. Now, when a user queries "algorithms" they may get articles in JavaScript, Machine Learning, and Python, while previously they may have narrowed down to Machine Learning only.
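
To make the threshold idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `embed` function is a toy bag-of-words stand-in for a real embedding model (so it only matches shared words, not meaning), and the article paths, the text in `DOCS`, and the `threshold` value are made up. A real setup would swap in a trained embedding model and a vector database.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, docs: dict, threshold: float = 0.2) -> list:
    """Return (path, score) pairs at or above the threshold, best first."""
    query_vec = embed(query)
    scored = [(path, cosine_similarity(query_vec, embed(body)))
              for path, body in docs.items()]
    hits = [(path, score) for path, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical articles, keyed by "topic/article" path.
DOCS = {
    "javascript/sort": "sorting algorithms for arrays in javascript",
    "machine-learning/k-means": "clustering algorithms in machine learning",
    "python/search": "search algorithms in python",
    "html/tables": "tables display rows and columns in html",
}

results = search("algorithms", DOCS)
for path, score in results:
    print(f"{score:.2f}  {path}")
```

The query is compared against every article regardless of which branch of the tree it lives in, so matches from several topics come back together, and anything below the threshold is left out.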

I'd love to build this for the community, although I think some buy-in is needed from Codecademy. Let me know what you guys think! I can provide more information on request.

cc: @yangc95 @aherman91 @caupolicandiaz @HishamT @KTom101 @SSwiniarski


This is an amazing idea. When I started reading your proposal, I was thinking that we already have a search feature on the Codecademy website, and I've tried it myself. It gives you a plethora of results when you search for keywords, as in the attached screenshot.

I've also noticed that if you are more specific, it returns the right result (see the attached screenshots). But I love the idea of getting results on the site from a full sentence.
Also, if we agree to work on this, I would love for it to be open source 😅. I'll be reading the article now; you gave me a good thing to read while I'm travelling.

If this were a vector search, I would expect the database to at least return articles that have "kotlin" in them.

Haha, yep, to be honest I missed it the first time too. It doesn't work with sentences, though, as you said above. I've attached a screenshot of a search result.
The results were from Python and JS.

Exactly, right now the search doesn't give the right results for sentences. @mdwiltfong, I think if the higher-ups decide to optimise the search engine, it would be a great contribution from your side 👍🏻👏

Exactly!

What's also interesting is that the broader your query, the more results you get. For example, "C" returns over 250 results.

If we think about it, there is so much we can do here. We can prioritise results from the topics a person is studying right now: if the topic they are searching for exists in docs or articles, put the results from the domain of that individual's learning paths at the top.
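
As a rough sketch of that idea in Python: take the search hits with their scores and boost the ones whose topic the learner is currently studying. The `rerank` function, the `boost` factor, the scores, and the "topic/article" path format are all hypothetical, just to show the shape of it.

```python
def rerank(results, studying, boost=1.5):
    """Re-order (path, score) hits, boosting topics the learner is studying.

    `results` is a list of (path, score) pairs where path looks like
    "topic/article"; `studying` is a set of topic names. Both formats
    are assumptions made for this sketch.
    """
    def boosted_score(hit):
        path, score = hit
        topic = path.split("/")[0]
        return score * boost if topic in studying else score

    return sorted(results, key=boosted_score, reverse=True)


# Made-up search hits, already sorted by raw similarity score.
hits = [
    ("python/search", 0.50),
    ("machine-learning/k-means", 0.45),
    ("javascript/sort", 0.41),
]

# A learner on a JavaScript path sees the JavaScript article first.
personalised = rerank(hits, studying={"javascript"})
for path, score in personalised:
    print(path, score)
```

A multiplicative boost keeps the original relevance in play, so an unrelated article from the learner's path still won't outrank a strong match from another topic unless the boost is set very high.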

Quick edit to my proposal.

From a tooling perspective, my suggestion most likely wouldn't be a good replacement. If I were to pivot my proposal, though, I wonder whether Codecademy would be interested in an AI-assisted search feature specifically targeted at their docs, similar to what LangChain does in their docs.

Hey @mdwiltfong, thank you for putting so much thought into how to make the Docs experience better for people! I think this conversation is incredibly interesting, and I have sent it along to members of our product development team for consideration.
