Using Models from the Chat panel
After installation, LM Studio lets you download models from the Hugging Face Hub, including a set of preset options.
For example, we can download the Zephyr 7B β model, adapted by TheBloke for llama.cpp's GGUF format.
Downloading a Model
Activating and loading the model into LM Studio is straightforward.
Loading a Model
You can then immediately start using the model from the Chat panel, no Internet connection required.
Chatting with Zephyr
The right panel displays and allows modification of default presets for the model. Memory usage and useful inference metrics are shown in the window's title and below the Chat panel, respectively.
Other models, like CodeLlama Instruct 7B, are also available for download and use.
Using CodeLlama
LM Studio also highlights new models and versions from Hugging Face, making it an invaluable tool for discovering and testing the latest releases.
Accessing Models with APIs
A notable feature of LM Studio is the ability to create Local Inference Servers with just a click.
Local Inference Server
The Automatic Prompt Formatting option simplifies prompt construction to match the model's expected format. The exposed API aligns with the OpenAI format.
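To make the idea of prompt formatting concrete, here is a minimal Python sketch of what turning OpenAI-style messages into a model-specific prompt can look like. It assumes the chat template published for Zephyr 7B β (role tags such as `<|system|>` and the `</s>` end marker); the function name `format_zephyr_prompt` is hypothetical, and this is not LM Studio's own code, only an illustration of the kind of work the option saves you.

```python
def format_zephyr_prompt(messages):
    """Render OpenAI-style chat messages with the Zephyr 7B beta chat template.

    Hypothetical helper for illustration; not part of LM Studio.
    """
    parts = []
    for message in messages:
        # Each turn: a role tag on its own line, the text, then the end marker.
        parts.append(f"<|{message['role']}|>\n{message['content']}</s>\n")
    # A trailing assistant tag cues the model to write its reply.
    parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are an AI assistant answering Tech questions"},
    {"role": "user", "content": "What is Java?"},
]
prompt = format_zephyr_prompt(messages)
print(prompt)
```

With the option enabled, a client only sends the `messages` list and never has to build such a string by hand.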
Here's an example of calling the endpoint with curl:
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "system", "content": "You are an AI assistant answering Tech questions" },
      { "role": "user", "content": "What is Java?" }
    ],
    "temperature": 0.7,
    "max_tokens": -1,
    "stream": false
  }'
The response provides the requested information:
{
  "id": "chatcmpl-iyvpdtqs1qzlv6jqkmdt9",
  "object": "chat.completion",
  "created": 1699806651,
  "model": "~/.cache/lm-studio/models/TheBloke/zephyr-7B-beta-GGUF/zephyr-7b-beta.Q4_K_S.gguf",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Java is a high-level, object-oriented programming language that was first released by Sun Microsystems in 1995. It is now owned by Oracle Corporation. Java is designed to be platform independent, meaning that it can run on any operating system that has a Java Virtual Machine (JVM) installed. Java's primary applications are in the development of desktop applications, web applications, and mobile apps using frameworks such as Android Studio, Spring Boot, and Apache Struts. Its syntax is similar to C++, but with added features for object-oriented programming and memory management that make it easier to learn and use than C++. Java's popularity is due in part to its extensive library of pre-written code (known as the Java Class Library) which makes development faster and more efficient."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 166,
    "total_tokens": 166
  }
}
This feature greatly aids in testing integrations with frontends like chatbots or workflow solutions like Flowise.
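As a starting point for such an integration, the curl call above can be sketched in Python using only the standard library. The URL, port, and payload fields are copied from the curl example; the helper names `build_request`, `extract_reply`, and `ask` are hypothetical, and the parsing assumes the response has the same shape as the one shown above.

```python
import json
import urllib.request

# Address taken from the curl example; adjust if your server uses another port.
BASE_URL = "http://localhost:1234/v1"


def build_request(question, system_prompt="You are an AI assistant answering Tech questions"):
    """Build the same POST request as the curl example (hypothetical helper)."""
    payload = {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
        "temperature": 0.7,
        "max_tokens": -1,
        "stream": False,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Return the assistant's text from a chat.completion JSON string."""
    data = json.loads(response_body)
    return data["choices"][0]["message"]["content"]


def ask(question):
    """Send the question to the local server and return the model's answer."""
    with urllib.request.urlopen(build_request(question)) as response:
        return extract_reply(response.read().decode("utf-8"))
```

With the Local Inference Server running and a model loaded, `print(ask("What is Java?"))` should print an answer similar to the one above. Keeping request building and response parsing in separate functions makes both easy to check without a running server.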
Conclusion
Although not open source, LM Studio is a robust addition to your local toolkit, allowing you to easily experiment with and adopt models from Hugging Face. Its user-friendly interface and versatile features make it an essential resource for anyone looking to delve into the world of large language models.
Video
You can also check the video tutorial from my GenAI's Lamp YouTube series.