112 Create skeleton for regex #118
Changes from all commits
95fbe80
5ad2e00
c220969
5396793
061a34f
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
# <img src="https://raw.githubusercontent.com/bobocode-projects/resources/master/image/logo_transparent_background.png" height=50/>Crazy Regex | ||
|
||
### Pre-conditions ❗ | ||
You're supposed to know how to work with regex and be able to build Patterns and Matchers | ||
Objectives mean the learning goals of this exercise. What you specified here is prerequisites |
||
|
||
### Objectives | ||
* **build Patterns to extract** necessary parts from text ✅ | ||
* **manipulate** extracted text with **Matcher** object ✅ | ||
|
||
### Regular expressions - a sequence of characters that defines a search pattern for text | ||
|
||
--- | ||
|
||
There are 2 pieces of the puzzle: | ||
* Literal characters - I want to match literally the character I specified (like 'a') | ||
* Meta characters - I want to match any character of this kind (more generic/abstract thing) | ||
|
||
Single char | ||
|
||
* \\d -> 0-9 | ||
* \\D -> negation of \\d | ||
* \\w -> A-Za-z0-9_ | ||
* \\W -> negation of \\w | ||
* \\s -> whitespace (space, tab, newline) | ||
* \\S -> negation of \\s | ||
* . -> anything but newline | ||
* \\. -> literal dot | ||
|
||
|
||
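The single-character metas listed above can be tried out with a small sketch like the following. The class name `SingleCharDemo`, the helper `findAll`, and the sample text are assumptions made up for illustration; they are not part of this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SingleCharDemo {

    // Hypothetical helper: collects every match of the regex in the text, in order.
    static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "a1 b_2.c";
        System.out.println(findAll("\\d", text));      // [1, 2]
        System.out.println(findAll("\\w", text));      // [a, 1, b, _, 2, c]
        System.out.println(findAll("\\s", text));      // [ ]
        System.out.println(findAll("\\.", text));      // [.]
        System.out.println(findAll(".", text).size()); // 8, every character matches
    }
}
```

Note that in Java source the backslash itself has to be escaped inside a string literal, which is why `\d` is written as `"\\d"`.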
Quantifiers - specify how many times in a row the preceding character must match | ||
* \* -> zero or more | ||
* \+ -> one or more | ||
* ? -> zero or one | ||
* {min,max} -> from min to max times | ||
* {n} -> exactly n times | ||
|
||
|
||
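A minimal sketch of how the quantifiers above behave, assuming whole-input matching via `Pattern.matches`. The class name `QuantifierDemo`, the helper `matchesWhole`, and the sample inputs are illustrative assumptions.

```java
import java.util.regex.Pattern;

class QuantifierDemo {

    // Hypothetical helper: true only if the regex matches the entire input.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("a*", ""));          // true, zero occurrences are allowed
        System.out.println(matchesWhole("a+", ""));          // false, at least one 'a' is required
        System.out.println(matchesWhole("ab?c", "ac"));      // true, 'b' is optional
        System.out.println(matchesWhole("ab?c", "abbc"));    // false, at most one 'b'
        System.out.println(matchesWhole("\\d{2,3}", "12"));  // true, two digits are within the range
        System.out.println(matchesWhole("\\d{2,3}", "1234")); // false, four digits are too many
        System.out.println(matchesWhole("\\d{3}", "123"));   // true, exactly three digits
    }
}
```

In Java the range quantifier is written without a space after the comma, for example `{2,3}`.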
Position | ||
* ^ -> beginning of the input (or of a line in multiline mode) | ||
* $ -> end of the input (or of a line in multiline mode) | ||
* \\b -> word boundary | ||
|
||
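The position markers above do not consume characters; they only constrain where a match may start or end. A short sketch, where the class name `PositionDemo`, the helper `findAll`, and the sample strings are assumptions for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PositionDemo {

    // Hypothetical helper: collects every match of the regex in the text, in order.
    static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "42 apples and 7";
        System.out.println(findAll("\\d+", text));   // [42, 7]
        System.out.println(findAll("^\\d+", text));  // [42], only at the beginning
        System.out.println(findAll("\\d+$", text));  // [7], only at the end

        String cats = "cat concat cat.";
        System.out.println(findAll("cat", cats).size());       // 3, includes the tail of "concat"
        System.out.println(findAll("\\bcat\\b", cats).size()); // 2, whole words only
    }
}
```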
--- | ||
|
||
Character class - the thing that appears in between []. For example, [abc] -> match 'a' or 'b' or 'c'. | ||
Another example: [-.] -> match a dash or a period. Here . is not a meta character anymore; - and ^ are the special characters inside [] | ||
* [0-5] -> match any digit from 0 to 5. [^0-5] -> match anything that is NOT 0-5 | ||
BUT ^ works as a meta character only when it is in the first position; otherwise it is a literal, like in [a^bc] | ||
|
||
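The character-class rules above can be checked with a sketch along these lines. The class name `CharacterClassDemo`, the helper `findAll`, and the inputs are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharacterClassDemo {

    // Hypothetical helper: collects every match of the regex in the text, in order.
    static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll("[abc]", "cabbage"));      // [c, a, b, b, a]
        System.out.println(findAll("[-.]", "555-555.5555"));  // [-, .], the dot is literal here
        System.out.println(findAll("[0-5]", "0369"));         // [0, 3]
        System.out.println(findAll("[^0-5]", "0369"));        // [6, 9], leading ^ negates the class
        System.out.println(findAll("[a^bc]", "a^z"));         // [a, ^], ^ is literal when not first
    }
}
```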
--- | ||
|
||
Capturing groups - whenever you do a regex search, the whole match is captured as group 0. | ||
* \\d{3}-\\d{3}-\\d{4} -> 212-555-1234 = GROUP 0 | ||
|
||
Parentheses can capture a subgroup: | ||
\\d{3}-(\\d{3})-(\\d{4}) where 212-555-1234 = GROUP 0, 555 = GROUP 1, 1234 = GROUP 2 | ||
|
||
We can refer to these groups by $1 (with $ when we want to replace) and by \\1 (within the regex itself; referring to a capture group this way | ||
is called a back reference) | ||
|
||
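As a rough sketch of the group mechanics described above, the snippet below reads groups 0 to 2 from the phone-number pattern, reuses them in a replacement string, and uses a back reference inside a regex. The class name `GroupDemo`, its method names, and the replacement format `$2/$1` are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {

    // The pattern from the text above: group 1 = middle block, group 2 = last block.
    static final Pattern PHONE = Pattern.compile("\\d{3}-(\\d{3})-(\\d{4})");

    // Returns group 0 (whole match) followed by every captured subgroup of the first match.
    static List<String> groupsOf(String input) {
        List<String> groups = new ArrayList<>();
        Matcher matcher = PHONE.matcher(input);
        if (matcher.find()) {
            for (int i = 0; i <= matcher.groupCount(); i++) {
                groups.add(matcher.group(i));
            }
        }
        return groups;
    }

    // In a replacement string, $1 and $2 refer to the captured groups.
    static String reorder(String input) {
        return PHONE.matcher(input).replaceAll("$2/$1");
    }

    // Inside the regex, \1 (written "\\1" in Java) must repeat what group 1 captured.
    static List<String> repeatedDigits(String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile("(\\d)\\1").matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(groupsOf("212-555-1234"));     // [212-555-1234, 555, 1234]
        System.out.println(reorder("call 212-555-1234")); // call 1234/555
        System.out.println(repeatedDigits("1223 4556"));  // [22, 55]
    }
}
```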
--- | ||
|
||
#### 🆕 First time here? – [See Introduction](https://github.com/bobocode-projects/java-fundamentals-course/tree/main/0-0-intro#introduction) | ||
#### ➡️ Have any feedback? – [Please fill the form ](https://forms.gle/jhXEAzG4TB81S43CA) | ||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<project xmlns="http://maven.apache.org/POM/4.0.0" | ||
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" | ||
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> | ||
<parent> | ||
<artifactId>3-0-java-core</artifactId> | ||
<groupId>com.bobocode</groupId> | ||
<version>1.0-SNAPSHOT</version> | ||
</parent> | ||
<modelVersion>4.0.0</modelVersion> | ||
|
||
<artifactId>3-6-3-crazy-regex</artifactId> | ||
|
||
<properties> | ||
<maven.compiler.source>11</maven.compiler.source> | ||
<maven.compiler.target>11</maven.compiler.target> | ||
</properties> | ||
|
||
</project> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,255 @@ | ||
package com.bobocode.se; | ||
|
||
import com.bobocode.util.ExerciseNotCompletedException; | ||
|
||
import java.util.regex.Pattern; | ||
|
||
/** | ||
* {@link CrazyRegex} is an exercise class. Each method returns a {@link Pattern} that | ||
* should be created from a regular expression. Every method that is not implemented yet | ||
* throws {@link ExerciseNotCompletedException} | ||
* @author Andriy Paliychuk | ||
* TODO: remove exception and implement each method of this class using java.util.regex.Pattern | ||
*/ | ||
public class CrazyRegex { | ||
|
||
/** | ||
@Meitoseshifu every javadoc starts from |
||
* A Pattern that finds all occurrences of the word "Curiosity" in text | ||
* | ||
* @return a pattern that looks for the word "Curiosity" | ||
*/ | ||
public Pattern findSpecificWord() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds first word in text | ||
* | ||
* @return a pattern that looks for the first word in text | ||
*/ | ||
public Pattern findFirstWord() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds last word in text | ||
* | ||
* @return a pattern that looks for the last word in text | ||
*/ | ||
public Pattern findLastWord() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
@Meitoseshifu I believe it makes sense to provide some examples inside javadoc. A text and then what should be the result if you use this pattern to find the matches. In this method, I wasn't sure what means "all numbers". E.g. if I have in the text |
||
* A Pattern that finds all numbers in text. When we have "555-555", "(555)555" and "30th" in text | ||
* our pattern must grab all those numbers: | ||
* "555" - four times, and "30" - once | ||
* | ||
* @return a pattern that looks for numbers | ||
*/ | ||
public Pattern findAllNumbers() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds all dates. For instance: "1971-11-23" | ||
* | ||
* @return a pattern that looks for dates | ||
*/ | ||
public Pattern findDates() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds different variations of word "color". | ||
* We are looking for: "color", "colour", "colors", "colours" | ||
* | ||
* @return a pattern that looks for different variations of word "color" | ||
*/ | ||
public Pattern findDifferentSpellingsOfColor() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
@Meitoseshifu since it's not necessarily clear what is a zip code, it would be nice to have an example here as well. |
Or at least explanation: "a zip code is a 5-digit number without any characters or special symbols," |
||
* A Pattern that finds all zip codes in text. | ||
* Zip code is a 5-digit number without any characters or special symbols. | ||
* For example: 72300 | ||
* | ||
* @return a pattern that looks for zip codes | ||
*/ | ||
public Pattern findZipCodes() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds different variations of word "link". | ||
* We are looking for: "lynk", "link", "l nk", "l(nk" | ||
* | ||
* @return a pattern that looks for different variations of word "link" | ||
*/ | ||
public Pattern findDifferentSpellingsOfLink() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds phone numbers. | ||
* For example: "555-555-5555" | ||
* | ||
* @return a pattern that looks for phone numbers | ||
*/ | ||
public Pattern findSimplePhoneNumber() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds numbers with the following requirements: | ||
* - the number can contain only digits from 0 to 5 | ||
* - length 3 | ||
* | ||
* @return a pattern that looks for numbers of length 3 that contain only digits from 0 to 5 | ||
*/ | ||
public Pattern findNumbersFromZeroToFiveWithLengthThree() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds all words in text that have length 5 | ||
* | ||
* @return a pattern that looks for the words that have length 5 | ||
*/ | ||
public Pattern findAllWordsWithFiveLength() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds words and numbers with the following constraints: | ||
* - not shorter than two symbols | ||
* - not longer than three symbols | ||
* | ||
* @return a pattern that looks for words and numbers that are not shorter than 2 and not longer than 3 symbols | ||
*/ | ||
public Pattern findAllLettersAndDigitsWithLengthThree() { | ||
@Meitoseshifu method name does not correspond to the description. |
||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds all words that begin with capital letter | ||
* | ||
* @return a pattern that looks for the words that begin with capital letter | ||
*/ | ||
public Pattern findAllWordsWhichBeginWithCapitalLetter() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds only the following abbreviations: | ||
* - AK, AL, AR, AZ, CA, CO, CT, PR, PA, PD | ||
* | ||
* @return a pattern that looks for the abbreviations above | ||
*/ | ||
public Pattern findAbbreviation() { | ||
@Meitoseshifu maybe instead of writing |
||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
It would be good to have an example. |
||
* A Pattern that finds all open braces | ||
* | ||
* @return a pattern that looks for all open braces | ||
*/ | ||
public Pattern findAllOpenBraces() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds everything inside [] | ||
* | ||
* @return a pattern that looks for everything inside [] | ||
*/ | ||
public Pattern findOnlyResources() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds all https links in note.txt | ||
* | ||
* @return a pattern that looks for all https links in note.txt | ||
*/ | ||
public Pattern findOnlyLinksInNote() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds all http links in nasa.json | ||
* | ||
* @return a pattern that looks for all http links in nasa.json | ||
*/ | ||
public Pattern findOnlyLinksInJson() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds all .com, .net and .edu emails | ||
* | ||
* @return a pattern that looks for all .com, .net and .edu emails | ||
*/ | ||
public Pattern findAllEmails() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds the following examples of phone numbers: | ||
* - 555-555-5555 | ||
* - 555.555.5555 | ||
* - (555)555-5555 | ||
* | ||
* @return a pattern that looks for phone numbers patterns above | ||
*/ | ||
public Pattern findAllPatternsForPhoneNumbers() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* A Pattern that finds only duplicates | ||
* | ||
* @return a pattern that looks for duplicates | ||
*/ | ||
public Pattern findOnlyDuplicates() { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* You have a text where all names are recorded as first name, last name. | ||
* Create a matcher and use the replaceAll method to record those names as: | ||
* - last name first name | ||
* | ||
* @return String where all names are recorded as last name first name | ||
*/ | ||
public String replaceFirstAndLastNames(String names) { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* You have a text with phone numbers. | ||
* Create a matcher and use the replaceAll method to replace the last digits: | ||
* - 555-XXX-XXXX | ||
* | ||
* @return String where the last 7 digits of every phone number are replaced with X | ||
*/ | ||
public String replaceLastSevenDigitsOfPhoneNumberToX(String phones) { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
/** | ||
* You have a text with resources and links to those resources: | ||
* - [Bobocode](https://www.bobocode.com) | ||
* Create a matcher and use the replaceAll method to get the following result: | ||
* - <a href="https://www.bobocode.com">Bobocode</a> | ||
* | ||
* @return String where every resource is wrapped in an anchor tag with its link as href | ||
*/ | ||
public String insertLinksAndResourcesIntoHref(String links) { | ||
throw new ExerciseNotCompletedException(); | ||
} | ||
|
||
|
||
} |
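Since every exercise method above returns a `Pattern`, a caller or test still needs a `Matcher` to get the actual matches. A minimal sketch of one way to do that on Java 9+ (the module targets Java 11) follows. The class `PatternUsageSketch`, the helper `allMatches`, and the sample pattern are hypothetical; the pattern is illustrative only and is not a solution to any method in this class.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class PatternUsageSketch {

    // Hypothetical helper, not part of the PR: returns every match of the
    // given pattern in the text, in order of appearance.
    static List<String> allMatches(Pattern pattern, String text) {
        return pattern.matcher(text)
                .results()                // Stream<MatchResult>, available since Java 9
                .map(MatchResult::group)  // the whole match (group 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Illustrative pattern only: whole words made of exactly three capital letters.
        Pattern threeCapitalLetters = Pattern.compile("\\b[A-Z]{3}\\b");
        System.out.println(allMatches(threeCapitalLetters, "NASA sent the MRO and MSL to Mars"));
        // prints [MRO, MSL]
    }
}
```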