
I'm adding prefixes to headers in markdown using a script. For example, if I have:

# Hello
## World
### Let's add
## Some headers
### Yay!
# Foo
## Bar

I transform it to:

# 1 Hello
## 1.1 World
### 1.1.1 Let's add
## 1.2 Some headers
### 1.2.1 Yay!
# 2 Foo
## 2.1 Bar

I currently have the following working code, but it feels a bit hacky and error-prone:

preg_match_all('/^(#+)\s(.*)$/m', $markdown, $headers);
$levelCount = [];
$currentLevel = 9999;
foreach ($headers[2] as $idx => $header) {
    $level = strlen($headers[1][$idx]);
    if ($level < $currentLevel) {
        // reset:
        for ($i = $level; $i != $currentLevel; $i += 1) {
            array_pop($levelCount);
        }
    }
    if (!isset($levelCount[$level])) {
        $levelCount[$level] = 1;
    } else {
        $levelCount[$level] += 1;
    }
    $prefix = implode('.', array_values($levelCount));
    $currentLevel = $level;
    $markdown = preg_replace(
        '/' . preg_quote($headers[0][$idx]) . '/',
        $headers[1][$idx] . ' ' . $prefix . ' ' . $header,
        $markdown,
        1
    );
}

What are your thoughts on this?

asked Feb 26, 2019 at 8:33

1 Answer


I may not be considering all fringe cases (please enlighten me if this breaks with any realistic input), but all of the logic can be packed into a single preg_replace_callback() call.

  • I start by declaring a 1-dimensional array which will contain all numeric counters.

  • My pattern ~^(#+)\K~m says:
    From the start of each line, capture one or more hash symbols as capture group #1, then restart the full-string match with \K. As a result, the replacement string returned by the custom function does not replace any characters; it adds new characters at the zero-length position marked by \K. This spares you from matching the rest of the line and adding it back in the replacement.

  • Making $levels modifiable by reference with & means that $levels keeps its changes from one regex match to the next, instead of being reset on every call of the callback.

  • After comparing the number of elements in $levels with the number of matched hashes, the counter for the current depth is created or incremented, and any counters for deeper levels are discarded, which yields the desired set of numbers.

  • The return value is a space followed by the dot-imploded $levels array.

  • array_slice() avoids popping in a loop.

  • I don't think your array_values() call is necessary.
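The points above about \K, the by-reference capture and array_values() can be tried in isolation. This is only a minimal sketch; the sample text and the variable names ($text, $inserted, $count, $numbered, $joined) are made up for illustration and are not part of the code below.

```php
<?php
$text = "# A\n## B";

// \K drops the hashes from the reported match, so the replacement is
// inserted right after them instead of overwriting anything.
$inserted = preg_replace('~^(#+)\K~m', ' >', $text);

// Captured by reference, $count keeps its value between callback calls.
$count = 0;
$numbered = preg_replace_callback(
    '~^(#+)\K~m',
    function ($m) use (&$count) {
        return ' ' . ++$count;
    },
    $text
);

// implode() ignores keys, so non-sequential keys need no array_values().
$joined = implode('.', [1 => 1, 2 => 3]);

echo $inserted, "\n";  // "# > A" and "## > B"
echo $numbered, "\n";  // "# 1 A" and "## 2 B"
echo $joined, "\n";    // "1.3"
```

Without the & in use (&$count), the closure would work on its own copy and every header would receive the number 1.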

Code: (Demo)

$headers = <<<HEADERS
# Hello
## World
### Let's add
## Some headers
### Yay!
# Foo
## Bar
## Bar Again
HEADERS;
$levels = [];
echo preg_replace_callback(
    '~^(#+)\K~m',
    function ($m) use (&$levels) {
        $hashes = strlen($m[1]);                // depth of this header
        $index = $hashes - 1;                   // zero-based position of its counter
        $reduction = count($levels) - $hashes;  // number of counters deeper than this header
        if (!isset($levels[$index])) {
            $levels[$index] = 1;                // first header at this depth
        } else {
            ++$levels[$index];
        }
        if ($reduction > 0) {
            // discard the counters of deeper levels
            $levels = array_slice($levels, 0, -$reduction);
        }
        return " " . implode('.', $levels);
    },
    $headers
);

Output:

# 1 Hello
## 1.1 World
### 1.1.1 Let's add
## 1.2 Some headers
### 1.2.1 Yay!
# 2 Foo
## 2.1 Bar
## 2.2 Bar Again
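
On the fringe cases mentioned at the top: with the code above, a header that skips a level (a # line followed directly by a ### line) leaves a gap in $levels, so the prefix comes out as 1.1 instead of three parts, and a line such as #tag without a space is numbered as well. One possible way to harden this is sketched below. The function name prefixHeaders() is hypothetical, and counting a skipped level as 1 is an assumption rather than anything the question specifies.

```php
<?php
// Hypothetical helper, not part of the original answer.
function prefixHeaders(string $markdown): string
{
    $levels = [];
    return preg_replace_callback(
        // Only treat hashes followed by a space or tab as a header.
        '~^(#+)\K(?=[ \t])~m',
        function (array $m) use (&$levels): string {
            $depth = strlen($m[1]);
            // Drop the counters of levels deeper than this header.
            $levels = array_slice($levels, 0, $depth);
            // Assumption: a skipped parent level is counted as 1.
            $levels = array_pad($levels, $depth - 1, 1);
            // Create or increment the counter for the current depth.
            $levels[$depth - 1] = ($levels[$depth - 1] ?? 0) + 1;
            return ' ' . implode('.', $levels);
        },
        $markdown
    );
}

echo prefixHeaders("# A\n### B\n## C\n#tag");
// # 1 A
// ### 1.1.1 B
// ## 1.2 C
// #tag
```

This still numbers lines that start with # inside fenced code blocks, so it is not a full Markdown parser.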
answered Mar 6, 2019 at 13:11
  • Thanks for your answer! I hadn't thought about using preg_replace_callback() for this and passing $levels by reference. Commented Mar 7, 2019 at 7:56
