The .NET regex flavor has a special feature called balancing groups. The main purpose of balancing groups is to match balanced constructs or nested constructs, which is where they get their name from. A technically more accurate name for the feature would be capturing group subtraction. That’s what the feature really does. It’s .NET’s solution to a problem that other regex flavors like Perl, PCRE, and Ruby handle with regular expression recursion. JGsoft V2 supports both balancing groups and recursion.
(?<capture-subtract>regex) or (?'capture-subtract'regex) is the basic syntax of a balancing group. It’s the same syntax used for .NET-style named capturing groups but with two group names delimited by a minus sign. The name of this group is "capture". You can omit the name of the group. (?<-subtract>regex) or (?'-subtract'regex) is the syntax for a non-capturing balancing group.
The name "subtract" must be the name of another group in the regex. When the regex engine enters the balancing group, it subtracts one match from the group "subtract". If the group "subtract" did not match yet, or if all its matches were already subtracted, then the balancing group fails to match. You could think of a balancing group as a conditional that tests the group "subtract", with "regex" as the "if" part and an "else" part that always fails to match. The difference is that the balancing group has the added feature of subtracting one match from the group "subtract", while a conditional leaves the group untouched.
If the balancing group succeeds and it has a name ("capture" in this example), then the group captures the text between the end of the match that was subtracted from the group "subtract" and the start of the match of the balancing group itself ("regex" in this example). It does not capture the text matched by "regex". If you need that, put another capturing group around "regex" inside the balancing group.
The reason this works in .NET is that capturing groups in .NET keep a stack of everything they captured during the matching process that wasn’t backtracked or subtracted. Most other regex engines only store the most recent match of each capturing group. When (\w)+ matches abc, Match.Groups[1].Value returns c as with other regex engines, but Match.Groups[1].Captures stores all three iterations of the group: a, b, and c.
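Python's re module is one of the engines that keep only the most recent match of each group, so it makes a convenient contrast. The sketch below shows that behavior and then builds a plain list as a stand-in for what .NET's CaptureCollection would hold. The list is only a model: Python's re has no equivalent of Match.Groups[1].Captures.

```python
import re

# (\w)+ iterates the group three times over "abc".
match = re.fullmatch(r"(\w)+", "abc")

# Python's re keeps only the last iteration of a repeated group.
last_capture = match.group(1)

# Stand-in for .NET's Match.Groups[1].Captures: one entry per iteration.
# This is a hand-built model, not something the re module exposes.
captures_model = [m.group() for m in re.finditer(r"\w", "abc")]

print(last_capture)    # the most recent capture
print(captures_model)  # every iteration, oldest first
```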
Let’s apply the regex (?'open'o)+(?'between-open'c)+ to the string ooccc. (?'open'o) matches the first o and stores that as the first capture of the group "open". The quantifier + repeats the group. (?'open'o) matches the second o and stores that as the second capture. Repeating again, (?'open'o) fails to match the first c. But the + is satisfied with two repetitions.
The regex engine advances to (?'between-open'c). Before the engine can enter this balancing group, it must check whether the subtracted group "open" has captured something. It has captured the second o. The engine enters the group, subtracting the most recent capture from "open". This leaves the group "open" with the first o as its only capture. Now inside the balancing group, c matches c. The engine exits the balancing group. The group "between" captures the text between the match subtracted from "open" (the second o) and the c just matched by the balancing group. This is an empty string but it is captured anyway.
The balancing group too has + as its quantifier. The engine again finds that the subtracted group "open" captured something, namely the first o. The regex enters the balancing group, leaving the group "open" without any matches. c matches the second c in the string. The group "between" captures oc which is the text between the match subtracted from "open" (the first o) and the second c just matched by the balancing group.
The balancing group is repeated again. But this time, the regex engine finds that the group "open" has no matches left. The balancing group fails to match. The group "between" is unaffected, retaining its most recent capture.
The + is satisfied with two iterations. The engine has reached the end of the regex. It returns oocc as the overall match. Match.Groups["open"].Success returns false because all the captures of that group were subtracted. Match.Groups["between"].Value returns "oc".
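The walkthrough above can be replayed with an explicit stack. The following Python sketch is a hand-written model of (?'open'o)+(?'between-open'c)+ attempted at the start of the subject string. It is not how .NET implements the feature, and the function name simulate_open_between is made up for this illustration. The model only looks at the next character before popping, whereas the real engine subtracts first and restores the capture when it backtracks. The end state is the same.

```python
def simulate_open_between(subject):
    """Model (?'open'o)+(?'between-open'c)+ attempted at position 0."""
    open_stack = []  # (start, end) of each capture still held by "open"
    between = []     # texts captured by "between", oldest first
    pos = 0

    # (?'open'o)+ pushes one capture per o.
    while pos < len(subject) and subject[pos] == "o":
        open_stack.append((pos, pos + 1))
        pos += 1
    if not open_stack:
        return None  # + needs at least one iteration

    # (?'between-open'c)+ pops one capture of "open" per c.
    iterations = 0
    while open_stack and pos < len(subject) and subject[pos] == "c":
        _, subtracted_end = open_stack.pop()
        # Text from the end of the subtracted match to the start of this c.
        between.append(subject[subtracted_end:pos])
        pos += 1
        iterations += 1
    if iterations == 0:
        return None

    return subject[:pos], open_stack, between


overall, open_left, between_captures = simulate_open_between("ooccc")
print(overall)           # overall match
print(open_left)         # captures left in "open"
print(between_captures)  # captures of "between"
```

The last item of between_captures corresponds to Match.Groups["between"].Value, and an empty open_left corresponds to Match.Groups["open"].Success being false.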
We need to modify this regex if we want it to match a balanced number of o’s and c’s. To make sure that the regex won’t match ooccc, which has more c’s than o’s, we can add anchors: ^(?'open'o)+(?'-open'c)+$. This regex goes through the same matching process as the previous one. But after (?'-open'c)+ fails to match its third iteration, the engine reaches $ instead of the end of the regex. This fails to match. The regex engine will backtrack trying different permutations of the quantifiers, but they will all fail to match. No match can be found.
But the regex ^(?'open'o)+(?'-open'c)+$ still matches ooc. The matching process is again the same until the balancing group has matched the first c and left the group "open" with the first o as its only capture. The quantifier makes the engine attempt the balancing group again. The engine again finds that the subtracted group "open" captured something. The regex enters the balancing group, leaving the group "open" without any matches. But now, c fails to match because the regex engine has reached the end of the string.
The regex engine must now backtrack out of the balancing group. When backtracking a balancing group, .NET also backtracks the subtraction. Since the capture of the first o was subtracted from "open" when entering the balancing group, this capture is now restored while backtracking out of the balancing group. The repeated group (?'-open'c)+ is now reduced to a single iteration. But the quantifier is fine with that, as + means "once or more" as it always does. Still at the end of the string, the regex engine reaches $ in the regex, which matches. The whole string ooc is returned as the overall match. Match.Groups["open"].Captures will hold the first o in the string as the only item in the CaptureCollection. That’s because, after backtracking, the second o was subtracted from the group, but the first o was not.
To make sure that the regex matches oc and oocc but not ooc, we need to check that the group "open" has no captures left when the matching process reaches the end of the regex. We can do this with a conditional. (?(open)(?!)) is a conditional that checks whether the group "open" matched something. In .NET, having matched something means still having captures on the stack that weren’t backtracked or subtracted. If the group has captured something, the "if" part of the conditional is evaluated. In this case that is the empty negative lookahead (?!). The empty string inside this lookahead always matches. Because the lookahead is negative, this causes the lookahead to always fail. Thus the conditional always fails if the group has captured something. If the group has not captured anything, the "else" part of the conditional is evaluated. In this case there is no "else" part. This means that the conditional always succeeds if the group has not captured anything. This makes (?(open)(?!)) a proper test to verify that the group "open" has no captures left.
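To see what the extra test changes, here is a small Python model of ^(?'open'o)+(?'-open'c)+$ with an optional "stack must be empty" check standing in for (?(open)(?!)). It is a sketch under the assumption that a simple push and pop loop is enough for this particular regex. The function name matches_o_then_c and its parameter are invented for the example.

```python
def matches_o_then_c(subject, require_empty_stack):
    """Model ^(?'open'o)+(?'-open'c)+$, optionally followed by the
    emptiness test that (?(open)(?!)) performs."""
    open_stack = []
    pos = 0

    # (?'open'o)+ : push one capture per o.
    while pos < len(subject) and subject[pos] == "o":
        open_stack.append(pos)
        pos += 1
    if not open_stack:
        return False

    # (?'-open'c)+ : each c needs a capture of "open" to subtract.
    closed = 0
    while open_stack and pos < len(subject) and subject[pos] == "c":
        open_stack.pop()
        pos += 1
        closed += 1
    if closed == 0:
        return False

    # Stand-in for (?(open)(?!)): fail if "open" still holds a capture.
    if require_empty_stack and open_stack:
        return False

    # $ : the whole subject must have been consumed.
    return pos == len(subject)


for text in ["oc", "oocc", "ooc", "ooccc"]:
    print(text, matches_o_then_c(text, False), matches_o_then_c(text, True))
```

Without the check, ooc is accepted with one o left on the stack. With the check, only strings with equal numbers of o's and c's pass.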
The regex ^(?'open'o)+(?'-open'
The regex ^(?'open'o)+(?'-open'
^(?:(?'open'o)+(?'-open'
^(?>(?'open'o)+(?'-open'
^m*(?>(?>(?'open'
This is the generic solution for matching balanced constructs using .NET’s balancing groups or capturing group subtraction feature. You can replace o, m, and c
^[^()]*(?>(?>(?'open'
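The bookkeeping that the group "open" does for parentheses can also be written as ordinary code. The Python sketch below is not a transcription of the regex above. It only mirrors the idea: push on every opening parenthesis, pop on every closing one, fail when there is nothing to pop, and require an empty stack at the end. How the regex treats text without any parentheses is not modelled here. This function simply calls such text balanced.

```python
def parentheses_balanced(text):
    """Check that every ( has a matching ) using an explicit stack."""
    open_positions = []  # plays the role of the captures of "open"
    for index, char in enumerate(text):
        if char == "(":
            open_positions.append(index)  # like (?'open'\()
        elif char == ")":
            if not open_positions:
                return False              # nothing left to subtract
            open_positions.pop()          # like (?'-open'\))
    # Like (?(open)(?!)): succeed only when no captures remain.
    return not open_positions


for sample in ["(a(b)c)", "(()", "())", "a(b)c"]:
    print(sample, parentheses_balanced(sample))
```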
You can use backreferences to groups that have their matches subtracted by a balancing group. The backreference matches the group’s most recent match that wasn’t backtracked or subtracted. The regex (?'x'[ab]){2}(?'-x')\k'x' matches aaa, aba, bab, or bbb. It does not match aab, abb, baa, or bba. The first and third letters of the string have to be the same.
Let’s see how (?'x'[ab]){2}(?'-x')\k'x' matches aba. The first iteration of (?'x'[ab]) captures a. The second iteration captures b. Now the regex engine reaches the balancing group (?'-x'). It checks whether the group "x" has matched, which it has. The engine enters the balancing group, subtracting the match b from the stack of group "x". There are no regex tokens inside the balancing group. It matches without advancing through the string. Now the regex engine reaches the backreference \k'x'. The match at the top of the stack of group "x" is a. The next character in the string is also an a which the backreference matches. aba is found as an overall match.
When you apply this regex to abb, the matching process is the same, except that the backreference fails to match the second b in the string. Since the regex has no other permutations that the regex engine can try, the match attempt fails.
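The same steps can be traced with a list acting as the stack of group "x". This Python sketch models (?'x'[ab]){2}(?'-x')\k'x' applied to a whole three-letter string. The whole-string requirement and the function name first_equals_third are assumptions made for the illustration, since the regex itself is not anchored.

```python
def first_equals_third(subject):
    """Model (?'x'[ab]){2}(?'-x')\\k'x' against the entire subject."""
    stack = []  # captures of group "x", most recent last
    pos = 0

    # (?'x'[ab]){2} : two iterations, each pushing one letter.
    for _ in range(2):
        if pos >= len(subject) or subject[pos] not in "ab":
            return False
        stack.append(subject[pos])
        pos += 1

    # (?'-x') : subtract the most recent capture without consuming text.
    stack.pop()

    # \k'x' : must match the capture now on top of the stack.
    top = stack[-1]
    if not subject.startswith(top, pos):
        return False
    pos += len(top)

    return pos == len(subject)


for candidate in ["aaa", "aba", "bab", "bbb", "aab", "abb", "baa", "bba"]:
    print(candidate, first_equals_third(candidate))
```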
^(?'letter'[a-z])+
Let’s see how this regex matches the palindrome radar. ^ matches at the start of the string. Then (?'letter'[a-z])+
The regex engine backtracks. (?'letter'[a-z])+
More backtracking follows. (?'letter'[a-z])+
Backtracking once more, the capturing stack of group "letter" is reduced to r and a. Now the tide turns. [
The backreference and balancing group are inside a repeated non-capturing group, so the engine tries them again. The backreference matches r and the balancing group subtracts it from the stack of "letter", leaving the capturing group without any matches. Iterating once more, the backreference fails, because the group "letter" has no matches left on its stack. This makes the group act as a non-participating group. Backreferences to non-participating groups always fail in .NET, as they do in most regex flavors.
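The push, backtrack, and pop sequence of this walkthrough can be imitated in Python with a list as the stack of "letter". The sketch below is a model under stated assumptions rather than a transcription of the regex: it pushes some leading letters, optionally skips one middle letter, then pops one letter for every matching letter that follows, and it retries with fewer pushed letters to imitate backtracking. The function name is_palindrome_word is made up for the example.

```python
def is_palindrome_word(word):
    """Model the push-then-pop palindrome check on lowercase a-z words."""
    if not (word.isascii() and word.isalpha() and word.islower()):
        return False

    # Try the longest run of pushed letters first, then give letters back,
    # imitating how the greedy + backtracks.
    for pushed in range(len(word), 0, -1):
        stack = list(word[:pushed])
        # The optional middle letter: try taking it first, then skipping it.
        for middle in (1, 0):
            if middle and pushed >= len(word):
                continue  # no letter left to serve as the middle
            remaining = list(stack)
            pos = pushed + middle
            popped = 0
            # Backreference plus subtraction: match the top, then pop it.
            while remaining and pos < len(word) and word[pos] == remaining[-1]:
                remaining.pop()
                pos += 1
                popped += 1
            # At least one pop, empty stack, and end of the word.
            if popped >= 1 and not remaining and pos == len(word):
                return True
    return False


for sample in ["radar", "abba", "radio"]:
    print(sample, is_palindrome_word(sample))
```

For radar, the attempt that succeeds is the one with r and a on the stack and d as the middle letter, which is the point in the walkthrough where the tide turns.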
(?:
Page URL: https://www.regular-expressions.info/balancing.html
Page last updated: 16 June 2025
Site last updated: 29 October 2025
Copyright © 2003-2025 Jan Goyvaerts. All rights reserved.