New Research Says LLMs Are Surprisingly Good at Compressing... Images and Audio?


Strong compression across modalities reflects an understanding of images, audio, and more at a deep statistical level.

There are inherent tradeoffs between model scale, dataset size, and compression performance. A larger dataset can justify a larger model, but model size has to be matched to the amount of data being compressed.

The results provide a new perspective on model scaling laws: unlike log loss, compression takes the size of the model itself into account, so scaling eventually hits limits.
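To make the point about model size concrete, here is a minimal sketch of one way to account for it: add the model's own size to the compressed output before dividing by the raw size. The function name `compression_rates` and every number below are hypothetical and chosen only for illustration; they are not results from the research.

```python
def compression_rates(raw_bytes: int, compressed_bytes: int, model_bytes: int):
    """Return (raw_rate, adjusted_rate) for a compressor backed by a model.

    raw_rate      ignores the model:  compressed / raw
    adjusted_rate counts the model:   (compressed + model) / raw
    Lower is better; a rate above 1.0 means the result is larger than the input.
    """
    if raw_bytes <= 0:
        raise ValueError("raw_bytes must be positive")
    raw_rate = compressed_bytes / raw_bytes
    adjusted_rate = (compressed_bytes + model_bytes) / raw_bytes
    return raw_rate, adjusted_rate


# Hypothetical scenario: the same 2 GB model compresses data to 10% of its size.
GB = 10**9
MODEL_SIZE = 2 * GB

# On a small 1 GB dataset, the model's own size dominates.
small = compression_rates(1 * GB, GB // 10, MODEL_SIZE)

# On a large 1000 GB dataset, the model's size is nearly negligible.
large = compression_rates(1000 * GB, 100 * GB, MODEL_SIZE)

print("small dataset (raw, adjusted):", small)
print("large dataset (raw, adjusted):", large)
```

Under these made-up numbers the raw rate is identical in both cases, but the adjusted rate is only competitive on the larger dataset, which is the sense in which a bigger dataset can justify a bigger model.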

The equivalence between prediction and compression means these models could have practical applications for compressing images, video and more. However, model size may be prohibitive compared to current methods.
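The link between prediction and compression can be sketched in a few lines: an ideal entropy coder (such as an arithmetic coder) spends about `-log2(p)` bits on a symbol that the model predicted with probability `p`, so better predictions mean fewer bits. The example below is an illustration under simple assumptions, not the method from the research. It swaps the language model for a tiny adaptive byte-frequency model, and `ideal_code_length_bits` is a hypothetical helper name.

```python
import math
from collections import Counter


def ideal_code_length_bits(data: bytes) -> float:
    """Total bits an ideal entropy coder would need for `data`.

    The predictor is an adaptive order-0 model: the probability of the next
    byte is its count so far plus one, divided by the number of bytes seen
    plus 256 (Laplace smoothing over all possible byte values).
    """
    counts = Counter()
    seen = 0
    total_bits = 0.0
    for byte in data:
        p = (counts[byte] + 1) / (seen + 256)  # model's prediction for this byte
        total_bits += -math.log2(p)            # cost of coding it under that prediction
        counts[byte] += 1                      # update the model after coding
        seen += 1
    return total_bits


predictable = b"a" * 1000          # easy to predict after a few symbols
varied = bytes(range(256)) * 4     # every byte value equally common

for name, sample in [("predictable", predictable), ("varied", varied)]:
    bits = ideal_code_length_bits(sample)
    print(f"{name}: {bits:.0f} bits vs {8 * len(sample)} bits uncompressed")
```

Replacing the frequency counter with a stronger predictor lowers the bit count, which is why a model that predicts images or audio well is also, in principle, a good compressor for them.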

The compression viewpoint offers new insights into model generalization, failure modes, tokenization, and other aspects of deep learning.

In summary, this research shows large language models have become adept general-purpose learners. Their exceptional compression capabilities demonstrate an expansive understanding of patterns in textual, visual and audio data. There is still progress to be made, but these models show increasing competence as general systems for automating prediction and compression across modalities.

