The 1.4% counts users who recognized a security problem and filed a report. It does not count users who were affected.
What This Means for Skill Authors and Installers
The dominant failure mode (silent wrong output) has no automatic detection. No error is raised. No alert fires. The skill appears healthy by every operational metric.
For skill authors: return explicit errors on unexpected input rather than plausible-looking wrong output. A skill that fails loudly is easier to debug than one that succeeds silently with bad results. Write correctness tests against representative data before publishing.
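A minimal sketch of the fail-loudly advice, assuming a hypothetical skill that parses invoice records. The names SkillInputError, Invoice, parse_invoice, and SUPPORTED_CURRENCIES are illustrative and do not come from the dataset or from any OpenClaw API. Any input the skill cannot handle with confidence raises an error instead of being coerced into a plausible value.

```python
import math
from dataclasses import dataclass


class SkillInputError(ValueError):
    """Raised when the (hypothetical) skill receives input it cannot handle correctly."""


@dataclass(frozen=True)
class Invoice:
    amount_cents: int
    currency: str


# Assumption for illustration: the skill only claims to support these currencies.
SUPPORTED_CURRENCIES = {"USD", "EUR"}


def parse_invoice(raw: dict) -> Invoice:
    """Parse a raw record, refusing anything that would require guessing."""
    missing = {"amount", "currency"} - raw.keys()
    if missing:
        raise SkillInputError(f"missing fields: {sorted(missing)}")

    currency = raw["currency"]
    if currency not in SUPPORTED_CURRENCIES:
        raise SkillInputError(f"unsupported currency: {currency!r}")

    amount = raw["amount"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        raise SkillInputError(f"amount must be a number, got {type(amount).__name__}")
    if not math.isfinite(amount) or amount < 0:
        raise SkillInputError(f"amount must be finite and non-negative, got {amount!r}")

    return Invoice(amount_cents=round(amount * 100), currency=currency)
```

A silent variant of this function might default an unknown currency to USD or treat a string amount as zero. Either choice would return a well-formed Invoice, which is the failure mode described above.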
For installers: test skills with representative inputs before putting them in any workflow that touches real data. Monitor outputs, not just uptime. A skill that returns something is not the same as a skill that returns the right thing.
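One way an installer might check outputs rather than uptime is to run the skill against a small set of inputs with known correct answers. The sketch below assumes the skill can be called as a plain Python function. check_outputs and naive_word_count are hypothetical names used for illustration, and the deliberately buggy word counter stands in for any skill that returns something without returning the right thing.

```python
from typing import Any, Callable, Iterable, List, Tuple


def check_outputs(
    skill: Callable[[Any], Any],
    cases: Iterable[Tuple[Any, Any]],
) -> List[str]:
    """Run skill on each (input, expected) pair and describe every mismatch."""
    failures = []
    for given, expected in cases:
        try:
            actual = skill(given)
        except Exception as exc:
            # A loud failure is still a failure, but at least it is visible.
            failures.append(f"{given!r}: raised {type(exc).__name__}: {exc}")
            continue
        if actual != expected:
            # The silent case: no error, plausible type, wrong value.
            failures.append(f"{given!r}: expected {expected!r}, got {actual!r}")
    return failures


def naive_word_count(text: str) -> int:
    """Hypothetical skill with a silent bug: it splits on single spaces only."""
    return len(text.split(" "))


# Representative inputs, including edge cases the happy path never exercises.
CASES = [
    ("hello world", 2),
    ("a  b", 2),  # double space
    ("", 0),      # empty input
]

if __name__ == "__main__":
    for line in check_outputs(naive_word_count, CASES):
        print(line)
```

The naive skill never raises and always returns an integer, so an uptime or error-rate monitor would report it as healthy. Only the comparison against expected values exposes the two wrong answers.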
Full methodology and dataset: vesselofone.com/research/ai-agent-skills-ecosystem. Dataset DOI: doi.org/10.5281/zenodo.19691714. Scan scripts: github.com/vesselofone/openclaw-skills, released under MIT + CC BY 4.0.
A free per-skill auditor covering SKILL.md intent, OAuth scope width, and injection patterns is available at vesselofone.com/tools/skill-check.
Vessel is managed OpenClaw hosting on private Linux VMs. Every agent we provision runs the skill auditor at setup. The research and dataset are open source.