Hi all!
First of all, thank you for the amazing work on making MLflow accessible from Node.js! It's a great tool, and I'm excited to see how it can be applied in more projects.
I'm currently working on a project that leverages OpenAI extensively. Specifically, I use OpenAI to summarize my portfolio, process job descriptions (JDs) into contextual information, and then customize several items that will be used during interviews.
However, I'm facing a challenge: since my prompts are also being generated with the help of AI, it's become really difficult to track the versions of these prompts. Additionally, I've noticed that the AI occasionally "forgets" the context or rules I set for it, leading to inconsistent results.
For example:
- A prompt might successfully apply all of the rules, but after a refinement or two it can drop one or two rules that were applied correctly three or four iterations earlier.
This makes it really important to have a UI or dashboard—like MLflow's interface—to access prompt metrics and properly evaluate whether my prompts are working as intended.
That said, I noticed that MLflow's Python implementation includes an [LLM evaluation feature](https://mlflow.org/docs/latest/llms/llm-evaluate/index.html#quickstart), but I'm not able to find documentation or examples for using this feature in mlflow.js.
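For reference, the Python workflow I'd like to replicate from Node.js looks roughly like this. It's a minimal sketch adapted from the quickstart linked above; the dataset, system prompt, and model name are just placeholders from my use case, and it assumes MLflow >= 2.8, the `openai` package, and `OPENAI_API_KEY` set in the environment:

```python
import mlflow
import openai
import pandas as pd

# Small evaluation dataset: model inputs plus reference answers (placeholder data).
eval_data = pd.DataFrame(
    {
        "inputs": [
            "Summarize my experience with distributed systems.",
            "What are my strongest front-end skills?",
        ],
        "ground_truth": [
            "Three years building event-driven services on Kafka and AWS.",
            "React, TypeScript, and accessibility-focused component design.",
        ],
    }
)

with mlflow.start_run():
    system_prompt = "Answer the question in two sentences, following my resume rules."

    # Log the prompt + model as an MLflow model so it can be evaluated and versioned.
    logged_model_info = mlflow.openai.log_model(
        model="gpt-4",
        task=openai.chat.completions,
        artifact_path="model",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": "{question}"},
        ],
    )

    # Run the built-in LLM evaluation and record the metrics on the run,
    # so they show up in the MLflow UI.
    results = mlflow.evaluate(
        logged_model_info.model_uri,
        eval_data,
        targets="ground_truth",
        model_type="question-answering",
    )
    print(results.metrics)                       # aggregate metrics
    print(results.tables["eval_results_table"])  # per-row results
```

Essentially, I'm looking for the mlflow.js equivalent of `mlflow.evaluate` above, or at least a recommended way to log these evaluation metrics from Node.js.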
Could anyone guide me on how to use the LLM evaluation features with mlflow.js? Is this currently supported or planned for future development? Any help would be greatly appreciated!
Thanks in advance!