I'm trying to use sed to remove links like the one below and leave just the title:
## [Some title](#some-title)
This is my command:
sed 's/^\(\#*\) *\[\([^\]]*\)\].*/\1 \2/'
I expect to have just the text without the link:
## Some title
But it doesn't work. What am I doing wrong?
I'm using Linux with GNU sed.
3 Answers
Based on what you've told us so far ("I'm trying to ... leave just the title") and the sample input you provided (## [Some title](#some-title)), this might be what you're trying to do, using any awk:
$ awk -F'[][]' '{print $2}' file
Some title
or any sed:
$ sed 's/.*\[\([^]]*\)].*/\1/' file
Some title
but without more truly representative sample input and expected output that's just a guess.
As for what's wrong with your sed script:
sed -i 's/^\(\#*\) *\[\([^\]]*\)\].*/\1 \2/'
Using -i like that will do "inplace" editing in GNU sed, but other sed versions will do different things with it, even BSD sed, which also supports inplace editing but requires a backup file name. You don't tell us what problem you're experiencing when running your script, but maybe that's it?
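If the file does need to be updated, one way to sidestep the -i differences is to write to a temporary file and copy the result back. This is only a sketch under assumptions: the function name strip_heading_links is made up for illustration, and mktemp is assumed to be available (it is not required by POSIX, though most systems have it).

```shell
# Hypothetical helper: edit a file through a temporary copy instead of -i,
# so the same command behaves the same way with GNU sed, BSD sed, and others.
strip_heading_links() {
    file=$1
    tmp=$(mktemp) || return 1
    # Overwrite the original only if sed succeeded.
    if sed 's/^\(##*\) \{1,\}\[\([^]]*\)].*/\1 \2/' "$file" > "$tmp"; then
        # cat (rather than mv) keeps the original file's permissions.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

It would be called as strip_heading_links file, once per file to be changed.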
Beyond that, in the first regexp segment \(\#*\):
- You're escaping the literal char # as \# which is undefined behavior per POSIX when you wanted just #.
- You're using #* which matches zero or more #s when you wanted 1 or more which is ##* or #\{1,\} in a BRE as sed uses by default (or #+ if you were using an ERE).
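To see why zero-or-more matters here, compare the two forms on a line that is not a heading at all. This is just an illustration; the second input line is invented, not taken from the question.

```shell
# Two input lines: a real heading and a line that merely starts with a link.
input='## [Some title](#some-title)
[not a heading](#x)'

# #* also matches the empty string, so the second line gets rewritten too.
printf '%s\n' "$input" | sed 's/^\(#*\) *\[\([^]]*\)].*/\1 \2/'

# ##* requires at least one #, so the second line is left alone.
printf '%s\n' "$input" | sed 's/^\(##*\) *\[\([^]]*\)].*/\1 \2/'
```

The first command turns the second line into " not a heading" (with a leading space from the empty first group), while the second command prints it unchanged.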
In the separating spaces part <blank>*:
- You're using <blank>* which matches zero or more <blank>s when you wanted 1 or more which is <blank><blank>* or <blank>\{1,\} in a BRE (or <blank>+ if you were using an ERE).
In the last regexp segment \[\([^\]]*\)\].*:
- You're using [^\]] and so escaping ] which is undefined behavior per POSIX when you wanted just [^]].
- You're using \] at the end which is undefined behavior per POSIX since there's no unescaped [ before it when you wanted just ].
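As a quick check of the bracket expression issue, here are the two spellings side by side. The comments describe what GNU sed appears to do with [^\]], namely take the backslash literally inside the brackets; other seds may behave differently.

```shell
line='## [Some title](#some-title)'

# With GNU sed, [^\]]* is read as "[^\]" (any char except a backslash)
# followed by "]*", so the regexp cannot match and the line is unchanged.
printf '%s\n' "$line" | sed 's/^\(#*\) *\[\([^\]]*\)\].*/\1 \2/'

# With ] placed first after the ^ it is an ordinary member of the set,
# and the link is reduced to its title.
printf '%s\n' "$line" | sed 's/^\(#*\) *\[\([^]]*\)].*/\1 \2/'
```

With GNU sed the first command should print the input line as-is and the second should print "## Some title".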
If you fixed all of those issues you'd get:
$ sed 's/^\(##*\) *\[\([^]]*\)].*/\1 \2/' file
## Some title
or
$ sed 's/^\(#\{1,\}\) \{1,\}\[\([^]]*\)].*/\1 \2/' file
## Some title
and since you're using GNU sed which supports EREs you could write that as:
$ sed -E 's/^(#+) +\[([^]]*)].*/\1 \2/' file
## Some title
And then to "leave just the title", as you said you wanted, just means removing the first capture group:
$ sed 's/^##* *\[\([^]]*\)].*/\1/' file
Some title
$ sed 's/^#\{1,\} \{1,\}\[\([^]]*\)].*/\1/' file
Some title
$ sed -E 's/^#+ +\[([^]]*)].*/\1/' file
Some title
It looks like this pattern [^\]] doesn't work in sed.
This seems to work:
sed 's/^\(#*\) \[\(.*\)\].*/\1 \2/'
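One thing to keep in mind with \(.*\) is that it is greedy, so it only works while the link text is followed by exactly one ] on the line. A small illustration, using an invented heading with a second pair of brackets (not from the original sample):

```shell
# Hypothetical input: a heading with more bracketed text after the link.
line='## [Some title](#some-title) [draft]'

# \(.*\) is greedy, so it runs up to the LAST ] on the line.
printf '%s\n' "$line" | sed 's/^\(#*\) \[\(.*\)\].*/\1 \2/'
# prints: ## Some title](#some-title) [draft

# [^]]* cannot cross a ], so it stops at the first one.
printf '%s\n' "$line" | sed 's/^\(#*\) \[\([^]]*\)].*/\1 \2/'
# prints: ## Some title
```

For headings that look exactly like the sample in the question, both commands give the same result.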
- Use extended regexps with sed's -E option to avoid Leaning Toothpick Syndrome. Also, if ] is the first (optionally after a ^) character in a bracket expression, it doesn't need to be escaped (see man regex). e.g. echo '## [Some title](#some-title)' | sed -E 's/^(#+) *\[([^]]*)\].*/\1 \2/'. BTW, note the use of + after # instead of *. + means one-or-more, * means zero-or-more. It's easy to match more than you mean if you use * instead of +: in this case, * would match ALL URLs, not just those in # headers. – cas, Jul 18, 2025 at 3:27
- ALL URLs at the beginning of a line starting with zero-or-more spaces, that is. – cas, Jul 18, 2025 at 3:33
- [^\]] is undefined behavior per POSIX so any sed can do whatever it likes with that. ITYM just []] instead. – Ed Morton, Jul 18, 2025 at 12:21
Writing a pandoc filter can handle the most general version of this problem:
Remove any link within any level of heading.
Headers can differ by depth and style, their contents can be formatted, and header-like strings can appear in comments and code blocks. So, for example, your markdown file could be like this:
## [Some title](#some-title)
Some text here
Another *header [with a `link`](https://www.konami.com/yugioh/)*
------------
wow!
# What if [a header link](#like-this) appears outside of code?
# What if [a header link](#like-this) appears in code?
<!--
# a [header link](#keep-me) that should not be altered
because it's commented out -->
Pandoc knows all about these cases. I don’t want to have to think about them. I just want to say "if you find a link somewhere in a heading, get rid of it." That’s a filter.
Here’s a pandoc filter (Haskell version)
Based on the behead.hs example in the documentation:
#!/usr/bin/env runhaskell
-- removeheaderlinks.hs
import Text.Pandoc.JSON
import Text.Pandoc.Walk
main :: IO ()
main = toJSONFilter removeheaderlinks
-- if this Inline is a link, remove the link but keep the attributes
removelink :: Inline -> Inline
removelink (Link at xs _) = Span at xs
removelink x = x
-- remove all links if the block is a header
removeheaderlinks :: Block -> Block
removeheaderlinks (Header n attr content) = Header n attr $ walk removelink content
removeheaderlinks x = x
You need to have Haskell installed, as well as pandoc-types, so run cabal v2-update && cabal v2-install --lib pandoc-types --package-env . first.
Then run this to convert:
pandoc -f markdown -t markdown --filter removeheaderlinks.hs ./example.md
Result:
## Some title
Some text here
## Another *header with a `link`*
wow!
# What if a header link appears outside of code?
# What if [a header link](#like-this) appears in code?
<!--
# a [header link](#keep-me) that should not be altered
because it's commented out -->
- Don't you think that using Pandoc and Haskell is a bit overkill for something that can be done with a single sed command? I don't need to parse any possible Markdown code. I have a few Markdown files that I've written myself that I need to remove the links from. – jcubic, Jul 20, 2025 at 16:53
- Yeah, hahah, I admit this is overkill for your case (where the headers and links look exactly like this). But I hope it’s useful for somebody else googling "remove links in headers markdown" whose files look different from yours. – wobtax, Jul 20, 2025 at 17:02
- … ## and a space? Can there be text after the (#some-title)? Why are you escaping the #? Most importantly, what OS are you using so we know what sed implementation you have?
- … -i in examples in questions or answers, as then people reading them can't copy/paste your code to test it without trashing their input file. It's trivial for you to add -i (or do anything else that updates the input file) later if you want it.
- … Some title include (, ], #, ), or newlines? Please edit your question to tell us about that and to provide a few lines of truly representative sample input and the expected output given that input.