2

Is there a simple solution to parse a String by using regex in Java?

I have to adapt an HTML page, so I need to rewrite several strings, e.g.:

href="/browse/PJBUGS-911"
=>
href="PJBUGS-911.html"

The strings differ only in the ID (e.g. 911). My first idea looks like this:

String input = "";
String output = input.replaceAll("href=\"/browse/PJBUGS\\-[0-9]*\"", "href=\"PJBUGS-???.html\"");

I want to replace everything except the ID. How can I do this?

Would be nice if someone can help me :)

Martin Ender
asked Dec 3, 2012 at 19:00

3 Answers

3

You can capture substrings matched by your pattern using parentheses, then refer to each captured group in the replacement string as $n, where n is the number of the group (counting opening parentheses from left to right). For your example:

String output = input.replaceAll("href=\"/browse/PJBUGS-([0-9]*)\"", "href=\"PJBUGS-$1.html\"");

Or if you want:

String output = input.replaceAll("href=\"/browse/(PJBUGS-[0-9]*)\"", "href=\"$1.html\"");
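To see the capture group at work on more than one link, here is a minimal runnable sketch. The class name HrefRewrite, the method name rewrite, and the sample HTML are made up for illustration; it also uses [0-9]+ rather than [0-9]* on the assumption that every ID has at least one digit.

```java
class HrefRewrite {
    // Rewrites href="/browse/PJBUGS-<digits>" to href="PJBUGS-<digits>.html".
    // Group 1 captures the whole issue key, so $1 re-inserts it unchanged.
    static String rewrite(String html) {
        return html.replaceAll("href=\"/browse/(PJBUGS-[0-9]+)\"", "href=\"$1.html\"");
    }

    public static void main(String[] args) {
        // Hypothetical sample input with two links.
        String html = "<a href=\"/browse/PJBUGS-911\">911</a> <a href=\"/browse/PJBUGS-12\">12</a>";
        // prints: <a href="PJBUGS-911.html">911</a> <a href="PJBUGS-12.html">12</a>
        System.out.println(rewrite(html));
    }
}
```

Text that does not match the pattern is left untouched, so the same call can be run over a whole page.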
answered Dec 3, 2012 at 19:02

1 Comment

Thank you for your very quick answer and solution. Works fine :-)
1

This does not use a regex, but it may still solve your problem if input holds a single href attribute:

output = "href=\"" + input.substring(input.lastIndexOf("/") + 1, input.lastIndexOf("\"")) + ".html\"";
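The index arithmetic is easy to get wrong by one, so here is a small runnable sketch of the same idea. It assumes the input is exactly one attribute of the form href="/browse/KEY"; the class and method names are hypothetical. The start index skips the last slash and the end index stops before the closing quote.

```java
class SingleHrefRewrite {
    // Assumes attr looks like: href="/browse/PJBUGS-911"
    static String rewrite(String attr) {
        int start = attr.lastIndexOf('/') + 1; // first character after the last slash
        int end = attr.lastIndexOf('"');       // position of the closing quote
        return "href=\"" + attr.substring(start, end) + ".html\"";
    }

    public static void main(String[] args) {
        // prints: href="PJBUGS-911.html"
        System.out.println(rewrite("href=\"/browse/PJBUGS-911\""));
    }
}
```

This only handles one attribute at a time; for a whole HTML file the regex-based replaceAll is the better fit.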
Rohit Jain
answered Dec 3, 2012 at 19:05

5 Comments

Don't forget to add ".html" to the end
Pretty simple and straightforward this is.
@Vulcan Yes there is. He requires it for his answer.
I believe input is not a single href="/browse/..." but a whole HTML file. Hence, the explicit mentioning of replaceAll in the question.
Thanks for the edit. And yes you're probably right @m.buettner
0

This is how I would do it:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HrefDemo
{
    public static void main(String[] args)
    {
        String text = "href=\"/browse/PJBUGS-911\" blahblah href=\"/browse/PJBUGS-111\" " +
                "blahblah href=\"/browse/PJBUGS-34234\"";
        // Group 1 captures the issue key, e.g. PJBUGS-911
        Pattern ptrn = Pattern.compile("href=\"/browse/(PJBUGS-[0-9]+?)\"");
        Matcher mtchr = ptrn.matcher(text);
        while (mtchr.find())
        {
            String match = mtchr.group(0);    // the whole matched attribute
            String insMatch = mtchr.group(1); // the captured issue key
            // Build the replacement directly instead of using the match as a regex
            String repl = "href=\"" + insMatch + ".html\"";
            System.out.println("orig = <" + match + "> repl = <" + repl + ">");
        }
    }
}

This just shows the regex and replacements, not the final formatted text, which you can get by using Matcher.replaceAll:

String allRepl = mtchr.replaceAll("href=\"$1.html\"");

If you only want to replace all matches, you don't need the loop; I used it only for debugging and to show what the regex matches.
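As a loop-free variant, the pattern can be compiled once and reused with Matcher.replaceAll. The sketch below also uses a named group (available since Java 7) so the replacement reads ${key} instead of a number. The group name key and the class and method names are illustrative assumptions, not part of the answer above.

```java
import java.util.regex.Pattern;

class IssueLinkRewriter {
    // Compiled once and reused; the named group "key" captures e.g. PJBUGS-911.
    private static final Pattern BROWSE_LINK =
            Pattern.compile("href=\"/browse/(?<key>PJBUGS-[0-9]+)\"");

    // Replaces every matching href in the given text in a single pass.
    static String rewrite(String html) {
        return BROWSE_LINK.matcher(html).replaceAll("href=\"${key}.html\"");
    }

    public static void main(String[] args) {
        String text = "href=\"/browse/PJBUGS-911\" blahblah href=\"/browse/PJBUGS-111\"";
        // prints: href="PJBUGS-911.html" blahblah href="PJBUGS-111.html"
        System.out.println(rewrite(text));
    }
}
```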

answered Dec 3, 2012 at 19:23
