"after it" means after space ? #2500
"...only if it's followed by a space, and there's 30 somewhere after it: "
Can "after it" be erroneously interpreted as "after space"? Is it ambiguous?
I am not sure.
Please just check it.
This is the shortest way I found to avoid confusion.
Drop the PR or rewrite it if appropriate.
There is another problem:
it doesn't
"...look for X only if..."
it does
look for X, and matches only if...
I think this way solves the ambiguity too.
9-regular-expressions/14-regexp-lookahead-lookbehind/article.md
MuhammedZakir
commented
Feb 25, 2021
"...only if it's followed by a space, and there's 30 somewhere after it: "
Can "after it" be erroneously interpreted as "after space"? is it ambiguous?
Here, "it" means a "whitespace". So that interpretation is not wrong.
Co-authored-by: Muhammed Zakir <8190126+MuhammedZakir@users.noreply.github.com>
Here, "it" means a "whitespace". So that interpretation is not wrong.
If the interpretation is not wrong, the explanation without the "both" still is.
(It fixes only the second problem.)
It's hard to see in this example because it gives the same result.
What if
pattern:\d+(?=.*30)(?=\s) (looks for 30 first, then space)
" ... if there is a 30 and it is followed by space "
If "it" means the 30, that clearly is not what is happening. The pattern still works.
I would allow some redundancy.
Accepted your suggestion and I'll try a new one.
I insist because it is easy to be confused with the old & hard regex. I think it is important to be precise.
I hope the last addition doesn't add confusion... 😕
MuhammedZakir
commented
Feb 25, 2021
Okay, I get what you mean now! :-) And what you added is correct, but it will lead to more confusion. I think we should first explain what's going on when the regex engine does a lookahead or lookbehind. Something like,
The regex engine does not move to the next position in the string while checking a lookahead or lookbehind. That is, for example, when matching \d(?=\w)(?=\w) against 1abc, at the end, the current position in the string will be right after 1 (at a, the 2nd character). The position won't change no matter how many lookahead/lookbehind patterns you add: matching \d(?=\w)(?=\w)(?=\w)(?=\w)(?=\w) against 1abc, the current position will still be at a. Therefore, if you want to match the remaining string, <you have to start from where the match before lookahead/lookbehind ends.> (how to phrase it?). In the example 1abc, the match ends after 1. To match the remaining string, start matching from a, i.e. \d(?=\w)(?=\w)abc, or simply, \d(?=\w)\w+.
The problem is, I don't know how to exactly phrase it in a way newcomers can understand. I also know that the above is a verbose explanation. But like you said, it is really hard to cram everything in a short explanation. We must NOT remove examples (except unneeded ones) because regex explanations are difficult to digest without examples. Of course, above explanation is just a draft. Any idea on improving it or creating a better one?
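As a rough illustration of the draft explanation above, here is a minimal sketch. The sticky flag `y` and the variable names are assumptions added only to make the engine's position observable through `lastIndex`; they are not part of the article.

```javascript
// Sketch: lookaheads test the text ahead without consuming it.
// The sticky flag "y" is used only so that lastIndex reports where the match ended.
const str = "1abc";

const sticky = /\d(?=\w)(?=\w)(?=\w)/y;
const first = sticky.exec(str);
console.log(first[0]);         // "1": the lookahead text is not part of the match
console.log(sticky.lastIndex); // 1: the engine still stands at "a"

// Both lookaheads inspect the same character ("a"), so this cannot match:
console.log(/\d(?=a)(?=b)/.test(str)); // false

// To match the rest of the string, consume it after the lookahead:
const rest = /\d(?=\w)\w+/.exec(str);
console.log(rest[0]); // "1abc"
```

However many lookaheads are chained, they all start from the same position, which is why the remaining text has to be consumed by an ordinary pattern placed after them.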
I'm sorry.
After discovering the ambiguity I forgot to consider the article as a whole. It is NICELY explained some lines later already.
And a few lines earlier: "note it is merely a test, content is not included" (!!!!)
With such good explanations, grammar problems aren't a big deal.
I'll go back to the first suggestion, your fix included.
...but I would add pattern:\d+(?=.*30)(?=\s) <=> pattern:\d+(?=\s)(?=.*30) as a reinforcement 😄
Leaving only the replacement of "looks for X if" with "matches if".
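The equivalence suggested above can be checked with a small sketch. The sample strings and the `firstMatch` helper are hypothetical, chosen only for illustration.

```javascript
// Two lookaheads at the same position are both tested from that position,
// so their order should not change the result.
const spaceFirst = /\d+(?=\s)(?=.*30)/;
const thirtyFirst = /\d+(?=.*30)(?=\s)/;

// Hypothetical helper: the first match as a string, or null if there is none.
function firstMatch(regexp, s) {
  const m = s.match(regexp);
  return m ? m[0] : null;
}

const samples = ["1 turkey costs 30€", "30 turkeys cost 1€", "12 of 300", "no digits"];
const results = samples.map((s) => [firstMatch(spaceFirst, s), firstMatch(thirtyFirst, s)]);
console.log(results);
// [["1", "1"], [null, null], ["12", "12"], [null, null]]
```

In both patterns the conditions apply to the position right after the digits, so "after it" effectively means "after the number", whichever lookahead is written first.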
MuhammedZakir
commented
Feb 26, 2021
I also saw "note it is merely a test, content is not included". However, "content not included" is different from "position not moved". When using lookahead/lookbehind, even users familiar with regex write wrong patterns, mostly because they don't know that the engine does not move the position in the string. For example, when someone wants to get the first char of a string
- whose first char is a digit
- which is followed by letters
- which must not contain any other chars
they would probably use ^\d(?=[A-Za-z]+)$. This creates more problems, especially when writing complex regex patterns. So, personally, I really think we should mention it somewhere in this article. At the very least, as a note.
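A minimal sketch of why that pattern fails, along with one possible correction. The `fixed` pattern is an assumption about the intent described in the list above, not something proposed in this thread.

```javascript
// The lookahead does not consume the letters, so "$" is tested right after
// the digit, where the string has not ended yet.
const wrong = /^\d(?=[A-Za-z]+)$/;
console.log(wrong.test("1abc")); // false
console.log(wrong.test("1"));    // false: now the lookahead finds no letter

// Assumed fix: move the end anchor inside the lookahead, so the letters
// must run up to the end of the string while only the digit is matched.
const fixed = /^\d(?=[A-Za-z]+$)/;
const hit = "1abc".match(fixed);
console.log(hit[0]);             // "1"
console.log(fixed.test("1ab2")); // false: a non-letter follows the letters
```

Since the lookahead and `$` are tested at the same position, the first pattern cannot match any string at all.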
joaquinelio
commented
Feb 27, 2021
I think lines 27-30 are perfect already.
Agreed anyway.
But I'd go for something small.
Also, to avoid suggestions that lead to Frankenstein docs, I leave changes more complex than grammar/typo fixes to issues rather than PRs.
MuhammedZakir
commented
Feb 28, 2021
I am not sure how to write a brief explanation for that. I leave that to @iliakan (or someone else). 😬
iliakan
commented
Feb 28, 2021
Updated:
- For example, pattern:\d+(?=\s)(?=.*30) looks for pattern:\d+ that is followed by a space pattern:(?=\s), and there's 30 somewhere after it pattern:(?=.*30):
Is it better now?
joaquinelio
commented
Mar 1, 2021
oops
My previous suggestions had a similar explanatory "it", but then I erased it.
But your new version solves the two issues (ambiguity and the "only if" thing) much, much better. 👍
iliakan
commented
Mar 1, 2021
Thanks!
joaquinelio
commented
Mar 2, 2021
Hi @MuhammedZakir
We agreed on the original problem, which is solved in my opinion,
but judging from your text, you may not be entirely happy.
Maybe, if Ilya thinks it's useful (he may consider it excess weight),
you can convert your comment with code into a task.
Just an idea, I haven't studied the current tasks on this article.
Regex is always tricky, even the simpler ones carry the dreaded "works most of the time" risk.
iliakan
commented
Mar 2, 2021
@MuhammedZakir a worthy note indeed. A task can be useful!
joaquinelio
commented
Mar 2, 2021
@iliakan fyi
I don't know if English has the same issue to fix, but you may want to know:
I extended Spanish line 5 with a note about "ahead=right, behind=left" because many take "behind" to mean "after" the text.
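A small sketch of the "ahead=right, behind=left" reading. The sample strings are illustrative assumptions, and lookbehind needs an engine that supports it.

```javascript
// Lookahead (?=...) checks what comes to the RIGHT of the match;
// lookbehind (?<=...) checks what comes to the LEFT of it.
const euro = "1 turkey costs 30€";   // currency sign to the right of the number
const dollar = "1 turkey costs $30"; // currency sign to the left of the number

const ahead = euro.match(/\d+(?=€)/);
const behind = dollar.match(/(?<=\$)\d+/);
console.log(ahead[0]);  // "30"
console.log(behind[0]); // "30"

// Swapping the directions finds nothing:
console.log(euro.match(/(?<=€)\d+/));   // null
console.log(dollar.match(/\d+(?=\$)/)); // null
```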
iliakan
commented
Mar 2, 2021
@joaquinelio The text explains these terms immediately after they are introduced, with examples.
joaquinelio
commented
Mar 2, 2021
Ok... I may rollback...
It's hard to explain why. You may explain it, but it doesn't stick in the head. When you use it, you doubt:
following or followed?
The text I added explains the how:
"it follows the flow of reading: ahead means right, the next to read, and..."
I'll roll back.
I never add text on my own.
Besides,
it is not a big issue: in the symbolic form you cannot go wrong, the precedence is pretty clear.
MuhammedZakir
commented
Mar 3, 2021
@MuhammedZakir a worthy note indeed. A task can be useful!
I will open a new issue when I have created a task.