This repository was archived by the owner on Jul 4, 2025. It is now read-only.

enhancement: improved cortex model hub #1927

Open
@ramonpzg

Description

The cortex hub currently lists the models that can be pulled with cortex pull <model-name>, without any additional information about each model. In the future, we want the hub to provide high- and low-level details per model: how it performs on the different hardware Cortex can be deployed to, and how it scores on commonly used benchmarks like MMLU or SWE, on both regular and custom hardware. This information won't be the same as the Model Cards provided by HuggingFace; it will be more akin to a menu of useful information from which users can pick whichever recipe suits them best.

For example, each row will have a little drop-down arrow on the right-hand side:

Image

Each arrow will reveal a menu of metrics that the user can tweak for each model.

Image

Each model will have a dedicated page as well with additional information that won't fit in the table above. The table will slightly resemble a model card but will be focused on benchmarks alongside a mini-tutorial for usage with Cortex. We'll call it a "Bench Card."

Image

We want this process to be fully automated:

  1. Pick a model
  2. Trigger script that
    1. Quantizes the model
    2. Runs benchmarks on different hardware
    3. Captures the results
  3. Have an LLM describe
  4. Generate a YAML file and populate it
  5. Add it to the Hub
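
The steps above could be wired together roughly as follows. This is only a sketch under stated assumptions: `BenchResult`, `BenchCard`, and `build_bench_card` are hypothetical names, and the quantization, benchmarking, and LLM-description steps are passed in as plain callables because the issue does not specify how any of them would be implemented.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class BenchResult:
    """Metrics captured for one architecture (hypothetical structure)."""
    architecture: str          # e.g. "x86" or "arm"
    metrics: Dict[str, float]  # e.g. {"time-to-first-token": ...}


@dataclass
class BenchCard:
    """Everything needed to generate the YAML file for one model."""
    model: str
    description: str
    results: List[BenchResult] = field(default_factory=list)


def build_bench_card(
    model: str,
    architectures: List[str],
    quantize: Callable[[str], str],
    benchmark: Callable[[str, str], Dict[str, float]],
    describe: Callable[[str, List[BenchResult]], str],
) -> BenchCard:
    # Step 2.1: quantize the model; the callable returns an artifact identifier.
    artifact = quantize(model)
    # Steps 2.2 and 2.3: run the benchmarks per architecture and capture results.
    results = [
        BenchResult(arch, benchmark(artifact, arch)) for arch in architectures
    ]
    # Step 3: have an LLM (or any stand-in callable) produce a description.
    description = describe(model, results)
    # The returned card is the input for steps 4 and 5 (YAML file, then the Hub).
    return BenchCard(model, description, results)
```

Passing the steps in as callables keeps the orchestration testable with stubs before any real quantization or benchmarking backend exists.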

For the YAML file, we can take inspiration from the one used in the model cards and have something like:

model-index:
  - name: llama3
    results:
      - task:
          type: hardware-benchmark
        architecture:
          - name: x86
          - time-to-first-token: 0.7
          - ...
          - name: arm
          -
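
As a rough illustration of step 4, the file could be generated from captured results with a small helper. The function name `render_bench_yaml` and the input shape (architecture name mapped to a dictionary of metrics) are assumptions made for this sketch. It also places each metric as a key next to `name` inside the architecture entry, which is one possible reading of the draft layout above rather than a settled schema.

```python
from typing import Dict, List


def render_bench_yaml(model: str, results: Dict[str, Dict[str, float]]) -> str:
    """Render benchmark results in a model-index style layout (hypothetical schema)."""
    lines: List[str] = [
        "model-index:",
        f"  - name: {model}",
        "    results:",
        "      - task:",
        "          type: hardware-benchmark",
        "        architecture:",
    ]
    for arch, metrics in results.items():
        # One list entry per architecture, with its metrics as sibling keys.
        lines.append(f"          - name: {arch}")
        for key, value in metrics.items():
            lines.append(f"            {key}: {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Value taken from the draft above; real numbers would come from the benchmarks.
    print(render_bench_yaml("llama3", {"x86": {"time-to-first-token": 0.7}}))
```

The sketch writes the text by hand to stay dependency-free; a real implementation would more likely use a YAML library so that quoting and escaping are handled correctly.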
