lucataco/moondream2 | Run with an API on Replicate

Moondream2

moondream2 is a small vision language model designed to run efficiently on edge devices. Check out the GitHub repository for details, or try it out on the Hugging Face Space!

Benchmarks

Release	VQAv2	GQA	TextVQA	DocVQA	TallyQA (simple/full)	POPE (rand/pop/adv)
2024-07-23 (latest)	79.4	64.9	60.2	61.9	82.0 / 76.8	91.3 / 89.7 / 86.9
2024-05-20	79.4	63.1	57.2	30.5	82.1 / 76.6	91.5 / 89.6 / 86.2
2024-05-08	79.0	62.7	53.1	30.5	81.6 / 76.1	90.6 / 88.3 / 85.0
2024-04-02	77.7	61.7	49.7	24.3	80.1 / 74.2	-
2024-03-13	76.8	60.6	46.4	22.2	79.6 / 73.3	-
2024-03-06	75.4	59.8	43.1	20.9	79.5 / 73.2	-
2024-03-04	74.2	58.5	36.4	-	-	-

Usage

The model is updated regularly, so we recommend pinning the model version to a specific release, such as one of the dated releases listed in the Benchmarks table above.
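As a rough illustration of what pinning could look like, here is a minimal sketch using the Hugging Face `transformers` library. It rests on several assumptions that this page does not confirm: that the weights are published under the repo id `vikhyatk/moondream2`, that each release is tagged with its date in `YYYY-MM-DD` form, and that the model needs `trust_remote_code=True` to load. The helper names `pinned_load_kwargs` and `load_pinned` are hypothetical.

```python
# Sketch only: the repo id and the revision tag format are assumptions,
# not taken from this page.
MODEL_ID = "vikhyatk/moondream2"

# Release dates copied from the Benchmarks table, newest first.
RELEASES = [
    "2024-07-23",
    "2024-05-20",
    "2024-05-08",
    "2024-04-02",
    "2024-03-13",
    "2024-03-06",
    "2024-03-04",
]


def pinned_load_kwargs(release: str) -> dict:
    """Build from_pretrained keyword arguments that pin one release."""
    if release not in RELEASES:
        raise ValueError(f"unknown release: {release!r}")
    # `revision` selects a specific tag, branch, or commit on the Hub.
    return {"revision": release, "trust_remote_code": True}


def load_pinned(release: str = RELEASES[0]):
    """Load the model and tokenizer at a fixed release (downloads weights)."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, **pinned_load_kwargs(release)
    )
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, revision=release)
    return model, tokenizer
```

Calling `load_pinned("2024-05-20")` would then keep an application on that release even after newer ones are published, so results stay comparable to the matching row of the benchmark table.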

Model created over 1 year ago
