
SparseSSM

Efficient Selective Structured State Space Models Can Be Pruned in One-Shot

State-space language models such as Mamba match Transformer quality while permitting linear-complexity inference, yet still comprise billions of parameters that hinder deployment. Existing one-shot pruning methods are tailored to attention blocks and fail to account for the time-shared and discretized state-transition matrix at the heart of the selective state-space module (SSM). In this paper, we introduce SparseSSM, the first training-free pruning framework that extends the classic optimal brain surgeon (OBS) framework to state-space architectures. Our layer-wise algorithm (i) derives an approximate second-order saliency score that aggregates Hessian-trace information across time steps, (ii) incorporates a component sensitivity analysis to guide feed-forward network (FFN) pruning, which also sheds light on where redundancy resides in the Mamba architecture, and (iii) extends easily to semi-structured and structured sparsity. Empirically, we prune 50% of SSM weights without fine-tuning and observe no zero-shot accuracy loss, setting the current state of the art for pruning Mamba-based LLMs.
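For background, the following is a minimal sketch of the classic OBS-style saliency score that SparseSSM extends (weight importance measured against the diagonal of the inverse Hessian). It is the textbook layer-wise criterion, not the paper's time-step-aggregated score, and all names are illustrative.

import torch

def obs_saliency(weight: torch.Tensor, hessian_inv_diag: torch.Tensor) -> torch.Tensor:
    """Classic optimal-brain-surgeon saliency: s_i = w_i^2 / (2 * [H^-1]_ii).

    Smaller saliency means removing that weight perturbs the layer-wise
    loss the least, so one-shot pruning removes the lowest-saliency weights.
    """
    return weight.pow(2) / (2.0 * hessian_inv_diag.clamp_min(1e-12))

def one_shot_mask(weight: torch.Tensor, hessian_inv_diag: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Binary mask (1 = keep) that zeroes the `sparsity` fraction of lowest-saliency weights."""
    scores = obs_saliency(weight, hessian_inv_diag)
    k = int(sparsity * scores.numel())
    threshold = torch.kthvalue(scores.flatten(), k).values if k > 0 else scores.min() - 1
    return (scores > threshold).to(weight.dtype)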

πŸš€ Quick Start

1. Install environment

git clone https://github.com/CFinTech/SparseSSM
cd SparseSSM
pip install -r requirements.txt

2. Download dataset

The calibration data can be downloaded here.
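Alternatively, the wikitext2 calibration corpus referenced in the command below can be fetched directly from the Hugging Face Hub. This is a minimal sketch using the standard datasets loader, bypassing the repository's own data pipeline:

from datasets import load_dataset

# WikiText-2 (raw) is a common calibration corpus for one-shot pruning;
# --nsamples 64 in the command below corresponds to 64 calibration sequences.
traindata = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
print(traindata[0]["text"][:200])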

3. Execute

To prune the SSM module, you can run the following command:

CUDA_VISIBLE_DEVICES=${your_gpu_id} python main.py \
 path/to/your/model wikitext2 \
 --experiment_name your_experiment_name \
 --method "sparsessm_dev" \
 --save path/to/pruned_model \
 --sparsity 0.5 \
 --nsamples 64 \
 --minlayer 0 \
 --maxlayer 100 \
 --prune_A True \
 --do_prune \
 --eval_zero_shot \
 --log_wandb
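As a quick sanity check after pruning, the achieved sparsity can be inspected directly. This is a minimal sketch that assumes the pruned weights at the --save path are an ordinary PyTorch state dict; the actual format written by main.py may differ.

import torch

# Report the fraction of zeroed entries in each weight matrix of the pruned checkpoint.
state_dict = torch.load("path/to/pruned_model", map_location="cpu")
for name, tensor in state_dict.items():
    if tensor.ndim >= 2:  # weight matrices only
        zero_frac = (tensor == 0).float().mean().item()
        print(f"{name}: {zero_frac:.2%} zeros")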

πŸ–ΌοΈ Method Overview


Illustration of SparseSSM. The first row depicts the evolution of the diagonal parameter matrix $A_{\log}$ within the SSM module in Mamba, together with a schematic of the forward-propagation process. In the second row, the left panel shows the procedure for obtaining a mask from the Hessian estimate at a single time step, while the right panel presents our weighted strategy for merging the masks across all time steps; a darker background indicates a larger weight.
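To make the merging step concrete, here is a minimal sketch of one plausible reading of the weighted strategy: per-time-step binary masks are combined by a weighted vote, and the lowest-voted weights are pruned. The weighting scheme, tie handling, and function name are illustrative assumptions, not the paper's exact algorithm.

import torch

def merge_timestep_masks(masks_per_step: torch.Tensor, step_weights: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Merge per-time-step binary pruning masks into a single mask.

    masks_per_step: (T, *weight_shape) binary masks, one per time step (1 = keep).
    step_weights:   (T,) non-negative per-step weights (darker background = larger
                    weight in the figure); normalized to sum to 1 here.
    """
    w = step_weights / step_weights.sum()
    votes = torch.einsum("t...,t->...", masks_per_step.float(), w)  # weighted keep-vote per weight
    k = int(sparsity * votes.numel())                               # number of weights to prune
    threshold = torch.kthvalue(votes.flatten(), k).values if k > 0 else votes.min() - 1
    return (votes > threshold).float()                              # keep the strongest-voted weights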

πŸ“Š Comparison of Experimental Results

Performance analysis for one-shot unstructured pruning of SSM modules in Mamba models at 50% sparsity.


πŸ™ Acknowledgements

Citation

If you find this work useful for your research, please consider citing our paper:

@article{tuo2025sparsessm,
 title={SparseSSM: Efficient Selective Structured State Space Models Can Be Pruned in One-Shot},
 author={Kaiwen Tuo and Huan Wang},
 journal={arXiv preprint arXiv:2506.09613},
 year={2025},
}
