I have text that looks like this:

"Here is your result https://polo.felix.com/stat/content/table-1576073323.16.html Please click on the link to see more.."

I want to extract only the text after the link, which here is "Please click on the link to see more..".

There might not be any text after the link, in which case I should get an empty string.

This is what I tried:

const message = "Here is your result https://polo.felix.com/stat/content/table-1576073323.16.html Please click on the link to see more.."
// Split on the literal 'html' and check whether anything follows it
if (message.split('html')[1].trim() !== '') {
  do_something()
}

But this is not elegant, and it won't work if the link ends with something other than html.

Is there a regex that returns the content to the right of a URL in a text if present, and an empty string otherwise?

asked Dec 11, 2019 at 17:39
  • Try (?:https?:\/{2}\S+\s)(.*) with the result in capture group 1, or, in ECMAScript 2018+, (?<=https?:\/{2}\S+\s).* Commented Dec 11, 2019 at 17:42
  • Are you sure it's always going to end with .html? Commented Dec 11, 2019 at 17:43
  • @Nicolas No, it can end with anything; that's why my solution won't work in such scenarios. Commented Dec 11, 2019 at 17:44

2 Answers

You can use the following regex (results in capture group 1):

(?:https?:\/{2}\S+\s)(.*)

const s = 'Here is your result https://polo.felix.com/stat/content/table-1576073323.16.html Please click on the link to see more..'
const r = /(?:https?:\/{2}\S+\s)(.*)/
const m = s.match(r)
// m is null when no URL followed by whitespace is found
if (m) console.log(m[1])

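This matches a URL beginning with http:// or https:// up to and including the next whitespace character, then captures the remainder of the line in capture group 1. Because the pattern requires whitespace after the URL, it does not match at all when the URL is the last thing in the string; in that case match() returns null.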
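
Since the question asks for an empty string when nothing follows the link, one way to get that is to map the null result to ''. The following is a minimal sketch, not a definitive implementation; the helper name textAfterUrl is hypothetical and is not part of the original answer.

```javascript
// Hypothetical helper: returns the text after the first http(s) URL,
// or '' when there is no URL or nothing follows it.
function textAfterUrl(message) {
  // Same capture-group regex as above: URL, one whitespace character, then the rest.
  const match = message.match(/(?:https?:\/{2}\S+\s)(.*)/);
  // match is null when no URL followed by whitespace exists, so fall back to ''.
  return match ? match[1] : '';
}

const base = 'Here is your result https://polo.felix.com/stat/content/table-1576073323.16.html';
console.log(textAfterUrl(base + ' Please click on the link to see more..'));
// → Please click on the link to see more..
console.log(JSON.stringify(textAfterUrl(base)));
// → ""
```

Returning '' for both "no URL" and "nothing after the URL" keeps the caller's check to a single comparison; if the two cases need to be told apart, returning null for the first would be an alternative.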


Alternatively, in ECMAScript 2018+ (supported by the V8 engine, among others) you can use a lookbehind assertion. Check browser compatibility for lookbehind assertions before relying on it:

(?<=https?:\/{2}\S+\s).*

const s = 'Here is your result https://polo.felix.com/stat/content/table-1576073323.16.html Please click on the link to see more..'
const r = /(?<=https?:\/{2}\S+\s).*/
const m = s.match(r)
// m is null when no URL followed by whitespace is found
if (m) console.log(m[0])

This does the same as the previous regex, but uses a positive lookbehind to assert that the URL precedes the match instead of consuming it. The whole match (index 0) is therefore the remainder of the line after the URL.
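
Because lookbehind support varies between engines, the two patterns can be combined with a runtime check. This is only a sketch under stated assumptions: the helper name textAfterUrlCompat is hypothetical, and the detection relies on the RegExp constructor throwing a SyntaxError for unsupported syntax.

```javascript
// Hypothetical helper: prefers the lookbehind pattern and falls back to the
// capture-group pattern when the engine rejects lookbehind syntax.
function textAfterUrlCompat(message) {
  let lookbehind = null;
  try {
    // Built from a string so that an unsupported pattern throws here at runtime
    // instead of preventing the whole script from being parsed.
    lookbehind = new RegExp('(?<=https?:\\/{2}\\S+\\s).*');
  } catch (e) {
    // SyntaxError: lookbehind assertions are not supported by this engine.
  }
  if (lookbehind) {
    const m = message.match(lookbehind);
    return m ? m[0] : ''; // whole match is the text after the URL
  }
  const m = message.match(/(?:https?:\/{2}\S+\s)(.*)/);
  return m ? m[1] : ''; // capture group 1 is the text after the URL
}

console.log(textAfterUrlCompat('Here is your result https://polo.felix.com/stat/content/table-1576073323.16.html Please click on the link to see more..'));
// → Please click on the link to see more..
```

In practice the capture-group version alone works everywhere, so the check is mainly useful if the lookbehind form is wanted for other reasons.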

answered Dec 11, 2019 at 17:53
  • Positive lookbehinds cannot have greedy quantifiers in most regex engines. Commented Dec 11, 2019 at 18:05
  • @NoodleOfDeath That's why I specified the version it's compatible with and included a link to check browser compatibility (although yes, many engines still don't support variable-length lookbehinds). Commented Dec 11, 2019 at 18:06
  • @ctwheels I will go with the first option, since my browser has some issues with positive lookbehind. Thanks for your answer. Commented Dec 11, 2019 at 18:08
Give this a go.

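var str = 'Here is your result https://polo.felix.com/stat/content/table-1576073323.16.html Please click on the link to see more..'
// \s* skips the whitespace after the URL; the g flag is not needed for a single exec
var expr = /https?:\/\/\S+\s*(.*)/
var match = expr.exec(str)
// match is null when the text contains no URL
console.log(match ? match[1] : '')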
answered Dec 11, 2019 at 17:50
  • Fails against URLs with %?#~:[]@!$&'()*+,;= in them (all of which can be included in a URL). Commented Dec 11, 2019 at 17:56
  • Updated, try that. Commented Dec 11, 2019 at 17:57
