How to extract text after a URL in a string?

Question 1

I have a string that looks like this:

"Here is your result https://polo.felix.com/stat/content/table-1576073323.16.html Please click on the link to see more.."

I want to extract only the text after the link, which here is "Please click on the link to see more..".

There may be no text after the link, in which case I should get an empty string.

This is what I tried:

message = "Here is your result https://polo.felix.com/stat/content/table-1576073323.16.html Please click on the link to see more.."
if(message.split('html')[1].trim() !== '') {
 do_something()
}

But this is not elegant, and it won't work if the link ends with something other than html.

Is there a regex that returns the content to the right of a URL in a text if present, or an empty string otherwise?

Question 2

Try (?:https?:\/{2}\S+\s)(.*), with the result in capture group 1, or (?<=https?:\/{2}\S+\s).*, which works in ECMAScript 2018+.

Question 3

Are you sure it's always going to end with .html?

Question 4

@Nicolas No, it can end with anything; that's why my solution won't work in such scenarios.

Question 5

You can use the following regex (results in capture group 1):

(?:https?:\/{2}\S+\s)(.*)

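const s = 'Here is your result https://polo.felix.com/stat/content/table-1576073323.16.html Please click on the link to see more..';
// The non-capturing group consumes the URL plus one whitespace character; group 1 captures the rest.
const r = /(?:https?:\/{2}\S+\s)(.*)/;
const m = s.match(r);
// m is null when no URL followed by whitespace is found.
if (m) console.log(m[1]);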

Matches a URL beginning with http:// or https:// up to and including the next whitespace character, then captures the rest of the line into capture group 1 (without the s flag, . does not match line breaks).
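
The question also asks for an empty string when nothing follows the link. The pattern above requires a whitespace character after the URL, so it returns no match when the URL is the last thing in the string. A minimal sketch of one way to cover that case, assuming a hypothetical helper name textAfterUrl and a slightly loosened pattern (\s* instead of \s), neither of which comes from the answer:

```javascript
// Hypothetical helper; the name and the \s* variation are illustrative assumptions.
function textAfterUrl(message) {
  // \s* (zero or more) lets a URL at the very end of the string still match,
  // in which case group 1 is the empty string.
  const match = message.match(/https?:\/{2}\S+\s*(.*)/);
  // No URL at all: fall back to an empty string as well.
  return match ? match[1].trim() : '';
}

console.log(textAfterUrl('Here is your result https://polo.felix.com/stat/content/table-1576073323.16.html Please click on the link to see more..'));
// prints "Please click on the link to see more.."
console.log(textAfterUrl('Here is your result https://polo.felix.com/stat/content/table-1576073323.16.html'));
// prints an empty line
```

Returning '' for both "no URL" and "URL with nothing after it" is a design choice; callers that need to tell the two apart could return null for the first case instead.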

Or, in ECMAScript 2018+ (for example, the V8 engine), you can use the following regex; check browser compatibility for lookbehind assertions before relying on it:

(?<=https?:\/{2}\S+\s).*

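const s = 'Here is your result https://polo.felix.com/stat/content/table-1576073323.16.html Please click on the link to see more..';
// The lookbehind asserts that a URL plus one whitespace character precedes the match without consuming it.
const r = /(?<=https?:\/{2}\S+\s).*/;
const m = s.match(r);
// The whole match (m[0]) is the text after the URL.
if (m) console.log(m[0]);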

This does the same as the previous regex, but uses a positive lookbehind to assert that the URL precedes the match rather than consuming it. The whole match (m[0]) is the remainder of the line after the URL.
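
Because lookbehind support varies, one option is to detect it at runtime and fall back to the capture-group version. The following is only a sketch under that assumption; supportsLookbehind and textAfterUrlWithFallback are hypothetical names, not part of the answer:

```javascript
// Hypothetical helper: true if this engine can compile a lookbehind assertion.
function supportsLookbehind() {
  try {
    // Built from a string so that engines without lookbehind throw here at
    // runtime instead of failing to parse the whole script.
    new RegExp('(?<=a)b');
    return true;
  } catch (e) {
    return false;
  }
}

// Hypothetical wrapper that picks whichever of the two regexes the engine supports.
function textAfterUrlWithFallback(s) {
  if (supportsLookbehind()) {
    // Same lookbehind pattern as above, written as a string for the same reason.
    const m = s.match(new RegExp('(?<=https?:\\/{2}\\S+\\s).*'));
    return m ? m[0] : '';
  }
  // Capture-group version for engines without lookbehind.
  const m = s.match(/(?:https?:\/{2}\S+\s)(.*)/);
  return m ? m[1] : '';
}

console.log(textAfterUrlWithFallback('Here is your result https://polo.felix.com/stat/content/table-1576073323.16.html Please click on the link to see more..'));
// prints "Please click on the link to see more.."
```

Since the capture-group regex works everywhere, using it unconditionally is simpler; the detection is only worth it if the lookbehind form is needed for some other reason.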

Question 6

Lookbehinds cannot contain unbounded quantifiers such as + or * in most regex engines.

Question 7

@NoodleOfDeath That's why I specified the version it's compatible with and included a link to check browser compatibility (although yes, many engines still don't support variable-length lookbehinds).

Question 8

@ctwheels I will go with the first option since my browser has some issues with positive lookbehind. Thanks for your answer.

Question 9

Give this a go.

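var str = 'Here is your result https://polo.felix.com/stat/content/table-1576073323.16.html Please click on the link to see more..'
// \S+ consumes the URL; group 1 captures everything after it, including the leading space.
// The g flag is dropped: it is not needed for a single exec() call.
var expr = /https?:\/\/\S+(.*)/
var match = expr.exec(str)
// Guard against a null match and trim the leading space from the captured text.
console.log(match ? match[1].trim() : '')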

Question 10

Fails against URLs with %?#~:[]@!$&'()*+,;= in them (all of which can be included in a URL).

Question 11

Updated, try that.
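
None of the characters listed in the comment above is whitespace, so a \S+-based pattern should consume them as part of the URL. A quick check, using a made-up example.com URL as a placeholder:

```javascript
const pattern = /(?:https?:\/{2}\S+\s)(.*)/;
// Illustrative URL (example.com is a placeholder) containing the characters from the comment.
const url = "https://example.com/a%20b?q=1&r=[2]#frag~:@!$'()*+,;=";
const m = `See ${url} trailing text`.match(pattern);
console.log(m ? m[1] : '');
// prints "trailing text"
```

The flip side is that \S+ also swallows punctuation glued to the end of a URL (for example a trailing comma), which is harmless when only the text after the URL is wanted.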

Accepted answer (the capture-group and lookbehind answer above) by ctwheels, 2019-12-11 17:53:15Z
