CycloneBoy/slm_sql

SLM-SQL: An Exploration of Small Language Models for Text-to-SQL

Important Links

📖 Arxiv Paper | 🤗 HuggingFace | 🤖 ModelScope

News

  • August 12, 2025: Released the SLM-SQL prediction results on the BIRD dev set.
  • August 1, 2025: The SLM-SQL 1.5B model achieved an execution accuracy (EX) of 70.49% on the BIRD test set, while the 0.5B model attained an EX of 61.82%.
  • July 31, 2025: Uploaded the models to ModelScope and HuggingFace.
  • July 30, 2025: Published the paper on arXiv.

Introduction

Large language models (LLMs) have demonstrated strong performance in translating natural language questions into SQL queries (Text-to-SQL). In contrast, small language models (SLMs) ranging from 0.5B to 1.5B parameters currently underperform on Text-to-SQL tasks due to their limited logical reasoning capabilities. However, SLMs offer inherent advantages in inference speed and suitability for edge deployment. To explore their potential in Text-to-SQL applications, we leverage recent advancements in post-training techniques. Specifically, we use the open-source SynSQL-2.5M dataset to construct two derived datasets: SynSQL-Think-916K for SQL generation and SynSQL-Merge-Think-310K for SQL merge revision. We then apply supervised fine-tuning and reinforcement learning-based post-training to the SLMs, followed by inference using a corrective self-consistency approach. Experimental results validate the effectiveness and generalizability of our method, SLM-SQL. On the BIRD development set, the five evaluated models achieved an average improvement of 31.4 points. Notably, the 0.5B model reached 56.87% execution accuracy (EX), while the 1.5B model achieved 67.08% EX. We will release our dataset, model, and code on GitHub: https://github.com/CycloneBoy/slm_sql.
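The inference step named above, corrective self-consistency with SQL merge revision, can be sketched roughly as follows. This is a minimal illustration and not the released implementation: it assumes that sampled SQL candidates are grouped by their execution result, and that when the groups disagree, one candidate from each of the two largest groups is passed to a merge revision model. The function `corrective_self_consistency`, the `merge_revise` callback, and the toy `singer` table are hypothetical names introduced only for this example.

```python
import sqlite3
from collections import defaultdict
from typing import Callable, List, Optional


def execute(conn: sqlite3.Connection, sql: str) -> Optional[frozenset]:
    """Run a query and return its rows as a frozenset, or None if it fails."""
    try:
        return frozenset(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def corrective_self_consistency(
    conn: sqlite3.Connection,
    candidates: List[str],
    merge_revise: Callable[[str, str], str],
) -> Optional[str]:
    """Pick a final SQL from sampled candidates (assumed procedure, see lead-in)."""
    groups = defaultdict(list)
    for sql in candidates:
        result = execute(conn, sql)
        if result is not None:          # drop candidates that do not execute
            groups[result].append(sql)
    if not groups:
        return None
    # Largest group first; sorted() is stable, so ties keep sampling order.
    ranked = sorted(groups.values(), key=len, reverse=True)
    if len(ranked) == 1:
        return ranked[0][0]             # all executable candidates agree
    # Disagreement: let the (hypothetical) merge revision model decide.
    return merge_revise(ranked[0][0], ranked[1][0])


# Toy database and a stand-in for the merge revision model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER);
INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bob', 45), (3, 'Cy', 45);
""")

calls = []

def pick_first(sql_a: str, sql_b: str) -> str:
    # Placeholder: a real system would prompt a fine-tuned merge model here.
    calls.append((sql_a, sql_b))
    return sql_a

candidates = [
    "SELECT name FROM singer WHERE age > 40",
    "SELECT name FROM singer WHERE age >= 45",
    "SELECT name FROM singer WHERE age < 40",
    "SELEC name FROM singer",           # syntax error, filtered out
]
chosen = corrective_self_consistency(conn, candidates, pick_first)
agreed = corrective_self_consistency(conn, candidates[:2], pick_first)
print(chosen)
```

Grouping by execution result rather than by SQL text means that syntactically different but equivalent queries vote together, which is the usual motivation for execution-based self-consistency.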

Framework

[Figure: slmsql_framework]

Main Results

[Figure: slm_sql_result]

[Figure: slmsql_bird_main]

[Figure: slmsql_spider_main]

Performance comparison of different Text-to-SQL methods on the BIRD dev and test sets.

[Figure: slmsql_ablation_study]
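For readers unfamiliar with the execution accuracy (EX) metric reported in these results, the sketch below shows one common way to compute it: a prediction counts as correct when executing it returns the same rows as the gold query. It is an illustration under assumptions, not the official BIRD evaluation script; in particular, comparing results as sets (ignoring row order and duplicates) is an assumption, and `execution_accuracy` and the toy `singer` table are hypothetical names.

```python
import sqlite3
from typing import List, Optional, Tuple


def run_query(conn: sqlite3.Connection, sql: str) -> Optional[set]:
    """Return the result rows as a set, or None if the query fails to execute."""
    try:
        return set(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn: sqlite3.Connection,
                       pairs: List[Tuple[str, str]]) -> float:
    """Percentage of (predicted_sql, gold_sql) pairs with matching results."""
    correct = 0
    for predicted_sql, gold_sql in pairs:
        gold_rows = run_query(conn, gold_sql)
        predicted_rows = run_query(conn, predicted_sql)
        # A prediction that fails to execute never counts as correct.
        if gold_rows is not None and predicted_rows == gold_rows:
            correct += 1
    return 100.0 * correct / len(pairs)


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER);
INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bob', 45), (3, 'Cy', 45);
""")

pairs = [
    # Different SQL text, same rows: counted as correct.
    ("SELECT name FROM singer WHERE age > 40",
     "SELECT name FROM singer WHERE age >= 45"),
    ("SELECT COUNT(*) FROM singer", "SELECT COUNT(id) FROM singer"),
    # Different rows: counted as wrong.
    ("SELECT name FROM singer WHERE age < 40",
     "SELECT name FROM singer WHERE age > 40"),
    # Syntax error in the prediction: counted as wrong.
    ("SELEC name FROM singer", "SELECT name FROM singer"),
]
ex = execution_accuracy(conn, pairs)
print(f"EX = {ex:.2f}%")
```

Because the comparison is on execution results, two queries with different SQL text can both be counted as correct, as the first two pairs show.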

Model

| Model | Base Model | Train Method | Modelscope | HuggingFace |
|---|---|---|---|---|
| SLM-SQL-Base-0.5B | Qwen2.5-Coder-0.5B-Instruct | SFT | 🤖 Modelscope | 🤗 HuggingFace |
| SLM-SQL-0.5B | Qwen2.5-Coder-0.5B-Instruct | SFT + GRPO | 🤖 Modelscope | 🤗 HuggingFace |
| CscSQL-Merge-Qwen2.5-Coder-0.5B-Instruct | Qwen2.5-Coder-0.5B-Instruct | SFT + GRPO | 🤖 Modelscope | 🤗 HuggingFace |
| SLM-SQL-Base-1.5B | Qwen2.5-Coder-1.5B-Instruct | SFT | 🤖 Modelscope | 🤗 HuggingFace |
| SLM-SQL-1.5B | Qwen2.5-Coder-1.5B-Instruct | SFT + GRPO | 🤖 Modelscope | 🤗 HuggingFace |
| CscSQL-Merge-Qwen2.5-Coder-1.5B-Instruct | Qwen2.5-Coder-1.5B-Instruct | SFT + GRPO | 🤖 Modelscope | 🤗 HuggingFace |
| SLM-SQL-Base-0.6B | Qwen3-0.6B | SFT | 🤖 Modelscope | 🤗 HuggingFace |
| SLM-SQL-0.6B | Qwen3-0.6B | SFT + GRPO | 🤖 Modelscope | 🤗 HuggingFace |
| SLM-SQL-Base-1.3B | deepseek-coder-1.3b-instruct | SFT | 🤖 Modelscope | 🤗 HuggingFace |
| SLM-SQL-1.3B | deepseek-coder-1.3b-instruct | SFT + GRPO | 🤖 Modelscope | 🤗 HuggingFace |
| SLM-SQL-Base-1B | Llama-3.2-1B-Instruct | SFT | 🤖 Modelscope | 🤗 HuggingFace |

Dataset

| Dataset | Modelscope | HuggingFace |
|---|---|---|
| SynsQL-Think-916k | 🤖 Modelscope | 🤗 HuggingFace |
| SynsQL-Merge-Think-310k | 🤖 Modelscope | 🤗 HuggingFace |
| BIRD train and dev dataset | 🤖 Modelscope | 🤗 HuggingFace |

TODO

  • Release inference code
  • Upload Model
  • Release training code
  • Fix bug
  • Update doc

Thanks to the following projects

Citation

@misc{sheng2025slmsqlexplorationsmalllanguage,
 title={SLM-SQL: An Exploration of Small Language Models for Text-to-SQL}, 
 author={Lei Sheng and Shuai-Shuai Xu},
 year={2025},
 eprint={2507.22478},
 archivePrefix={arXiv},
 primaryClass={cs.CL},
 url={https://arxiv.org/abs/2507.22478}, 
}
@misc{sheng2025cscsqlcorrectiveselfconsistencytexttosql,
 title={CSC-SQL: Corrective Self-Consistency in Text-to-SQL via Reinforcement Learning}, 
 author={Lei Sheng and Shuai-Shuai Xu},
 year={2025},
 eprint={2505.13271},
 archivePrefix={arXiv},
 primaryClass={cs.CL},
 url={https://arxiv.org/abs/2505.13271}, 
}
