Rewrite indentation code #31
The objectives are:
1. Simplify the indentation code; the previous implementation has become so complex that it is impossible to maintain.
2. Significantly improve performance; the previous indentation code was painfully slow (see issue #6).
3. Maximise configurability; it should be configured similarly to cljfmt and make previously impossible things possible (e.g. issue #21).

As of this commit, objectives 1 and 2 have been met, but work on objective 3 has not yet begun. There will continue to be further improvements, particularly around performance and the "what if syntax highlighting is disabled?" scenario. These changes will unfortunately be backwards incompatible, but hopefully the improved performance and API will make up for it.
By utilising some ugly mutable state, this change more than doubles the performance of the indentation code. You can expect even further benefits in larger code blocks. This should completely eliminate any need for the `clojure_maxlines` configuration option.
Since maps are far more likely to span multiple lines than vectors (and when they do they tend to be much larger, e.g. EDN files!), we can further optimise indentation by checking whether the line is inside a map *before* checking whether it is inside a vector. (This happens because of the performance improvement made with the addition of the optimised `CheckPair` function.)
The `=` operator will no longer alter the indent of lines within a Clojure multi-line string or regular expression. The previous behaviour was annoying when writing detailed doc-strings, as reformatting the file with `gg=G` would screw up the indentation within them. Now the behaviour matches that of VS Code and Emacs.
Setting the `clojure_align_multiline_strings` option to `-1` switches to traditional Lisp multi-line string indentation, i.e. no indent.
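A minimal configuration sketch for the option above. It assumes the option is read as a global variable with the usual `g:` prefix; the commit message itself only names `clojure_align_multiline_strings`.

```vim
" Assumption: the plugin reads this option from a global variable.
" -1 selects traditional Lisp behaviour: continuation lines of a
" multi-line string receive no indent.
let g:clojure_align_multiline_strings = -1
```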
It seems that when `=`ing over a block, Vim will sometimes not re-highlight strings correctly until after we have already run `searchpairpos`. In this case, we use a hacky workaround which detects the false positive and recovers from it.
Still lots to do here, this is just the initial work.
I've been experimenting with a custom algorithm as per the check list item above. This version eliminates all syntax highlight checks (and therefore no longer requires the various hacks to make those checks work) plus it is approx the same speed as my first rewrite attempt and uses about the same amount of code.
However, the core benefits of this version are that it works when syntax highlighting is switched off, and that it eliminates the performance bottleneck of syntax highlight checks. This means that it will be possible to write a version of the algorithm in Vim9 script, to get another significant performance improvement! (Maintaining both legacy Vim script and Vim9 script versions should be feasible due to how much simpler this is than the original implementation.)
TL;DR: I'll push a new version to this branch soon which will be:
- Simpler with no hacks,
- Can become even faster (via Vim9 script),
- Works even when syntax highlighting is switched off.
Once this branch is feature complete, I'll share benchmarks comparing it to the original as it is on `master`.
Not complete yet, but getting there.
Some refactoring and further optimisations should be possible here. Once all optimisations I can think of have been implemented, I'll try writing an alternate Vim9 script version. (The syntax highlight group checks used in previous implementations of the indentation code were the core bottleneck, so a Vim9 script version would not have been much faster.)
The performance of the new reader-like indentation algorithm has now surpassed the performance of my previous syntax highlighting approach with better accuracy and no hacky code required.
Performance benchmarks as of the current change (+ unpushed Vim9 version). This measures the time to reindent a whole file on a MacBook Pro 2021, so expect much more noticeable performance gains on slower hardware. 🚀
(I'll share new benchmarks (and the full reports generated by Vim) once this branch is feature complete, although I don't expect them to change much.)
Even with the Vim9 implementation in the code too (not pushed yet, still figuring out how best to integrate it in a maintainable way), we can expect the file size of […]
Regarding Neovim, until Vim9 script support is added, the "New (legacy)" code will be used. If anyone would like to volunteer to write a Lua implementation of this code for Neovim, please open an issue and we can discuss how to integrate it.
Previously, backslashes were accidentally detected as tokens by the indentation tokeniser. This meant that character literals would break the indentation of everything after them.
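To illustrate the class of bug described above, here is a rough sketch of a bracket tokeniser that skips backslash-escaped characters, so that a character literal such as `\(` is not mistaken for an opening paren. The function name `BracketTokens` and its return format are hypothetical and are not taken from this branch.

```vim
" Hypothetical helper, not the plugin's tokeniser.
" Returns a list of [char, byte_index] pairs for the delimiters on a line,
" ignoring any character that directly follows a backslash.
function! BracketTokens(line) abort
  let tokens = []
  let i = 0
  let n = strlen(a:line)
  while i < n
    let ch = a:line[i]
    if ch ==# '\'
      " Skip the backslash and the character it escapes.
      let i += 2
      continue
    endif
    if stridx('()[]{}"', ch) >= 0
      call add(tokens, [ch, i])
    endif
    let i += 1
  endwhile
  return tokens
endfunction
```

For example, `BracketTokens('(foo \( [x])')` reports the outer parens and the vector brackets, but not the `(` belonging to the character literal.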
Avoids screen flickering during test execution.
garrett-hopper commented May 8, 2025
FYI, the "uniform" `indent_style` isn't quite following Tonsky's Better Clojure formatting:
- Multi-line lists that start with a symbol are always indented with two spaces,
- Other multi-line lists, vectors, maps and sets are aligned with the first element (1 or 2 spaces).

Multi-line lists that start with a non-symbol are currently indented with 2 spaces instead of 1.
E.g.
    (ns namespace
      (:require
       [clojure.set :as s]))
The final line is indented 1 space because `:require` is a keyword and not a symbol.
if indent_style ==# 'uniform' | return base_indent + 1 | endif
Not sure of the best way for this to check if the first list item is a symbol or not. 🤔
I wrote this for #21 before I realized this rewrite was being done. It's a pattern that will match symbols, adapted from the tree-sitter grammar. (Though it may be easier to just special case non-symbols as first list item.)
    local suffixonly = ":#'0-9\\/" -- not allowed as first character
    local forbidden = {
      [[\[\]\n\r\t \\]], -- brackets, whitespace, slashes
      '(){}"@~^;`,', -- parens, braces, special characters
      suffixonly,
    }
    local prefix = string.format("[^%s]", table.concat(forbidden))
    local suffix = string.format([[\(%s\|[%s]\)*]], prefix, suffixonly)
    local symbol = prefix .. suffix
[^\[\]\n\r\t \\(){}"@~^;`,:#'0-9\/]\([^\[\]\n\r\t \\(){}"@~^;`,:#'0-9\/]\|[:#'0-9\/]\)*
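As a rough sketch of how the pattern above could be used from Vim script, the following wraps it in a symbol check and derives the 1 or 2 space offset from the two rules quoted earlier in this comment. The names `IsClojureSymbol` and `UniformListOffset` are hypothetical and do not exist in the plugin; how this would hook into the real `indent_style` check is left open.

```vim
" Hypothetical helpers, not part of the plugin.
" The pattern is the one quoted above, split into its two character classes.
" Inside single-quoted Vim strings, '' stands for a literal single quote.
let s:symbol_head = '[^\[\]\n\r\t \\(){}"@~^;`,:#''0-9\/]'
let s:symbol_tail = '[:#''0-9\/]'
let s:symbol = s:symbol_head . '\%(' . s:symbol_head . '\|' . s:symbol_tail . '\)*'

" Returns 1 when the whole token matches the symbol pattern, otherwise 0.
function! IsClojureSymbol(token) abort
  return a:token =~# '^' . s:symbol . '$'
endfunction

" Offset from the opening paren of a multi-line list, following the quoted
" rules: 2 spaces when the first element is a symbol, otherwise 1 space
" (aligned with the first element).
function! UniformListOffset(first_token) abort
  return IsClojureSymbol(a:first_token) ? 2 : 1
endfunction
```

With this, `UniformListOffset('ns')` gives 2 and `UniformListOffset(':require')` gives 1, which matches the `ns` example above.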
Thanks for raising this @garrett-hopper. I'll add it to my list and try to fix it soon.
(Yet more mildly irritating rules in the common Clojure indentation styles 😆 😭)
Apologies to anyone waiting patiently for this to be merged. I haven't managed to find much time to dedicate to this and each time I come back to it, it requires a lot of context reloading as the standard Clojure indentation styles are shockingly all very complicated (contrary to Clojure itself) and to implement them efficiently in Vim script is forever painful.
Currently what exists in this branch is IMO already far superior to what is currently on the `master` branch, so my current plan is:
- fix a few of the more major outstanding issues,
- document the `clojure_indent_rules` option with warnings that it may change in the future,
- merge, 🥳
- open issues for the remaining known problems that I don't consider to be dealbreakers,
- gather more feedback on other issues with wider use,
- fix issues critical to the core algorithm,
- later complete #35 (Create Vim 9 script variant of the new indentation algorithm) and #32 (Port performance improvements to Neovim/Lua) once the core algorithm is perfected, to avoid needing to implement the same fixes 3 times.
I hope to get round to the first 4 soon, but can't make any promises on when I will do so.
NoahTheDuke commented Jul 14, 2025
Pretty exciting! Thanks for shepherding this! I've been using this branch exclusively since you opened the PR and I'm quite pleased with it.
Tip
As of the latest change, indentation is 2–3x faster with the potential to surpass 10x with Vim9script and Lua.
This PR contains a rewrite of the entire Clojure indentation code that aims to:
- […] `=` […] #34
- […] `=` operator from altering the indentation of multi-line strings and setting the `'lisp'` option in Clojure buffers.
- […] `lisp` option […] #40

To do
- […] (`#?@(...)`, `#?(...)`).
- […] (`letfn`, `extend-protocol`, etc.).