I'm reading a file using a BufferedReader, so let's say I have
line = br.readLine();
I want to check if this line contains one of many possible strings (which I have in an array). I would like to be able to write something like:
while (!line.matches(stringArray)) { // not sure how to write this conditional
  // do something here
  line = br.readLine(); // store the next line, otherwise the condition never changes
}
I'm fairly new to programming and Java, am I going about this the right way?
3 Answers
Copy all values into a Set<String> and then use contains():
Set<String> set = new HashSet<String> (Arrays.asList (stringArray));
while (!set.contains(line)) { ... }
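As a minimal sketch of how the exact-match loop could look end to end: the class name SetLookupDemo, the helper linesBeforeMatch and the sample input are made up for illustration, and a StringReader stands in for the real file. Note the extra null check, since readLine() returns null at end of input.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SetLookupDemo {
    // Hypothetical helper: counts the lines read before the first line
    // that is exactly equal to one of the strings in the set.
    static int linesBeforeMatch(BufferedReader br, Set<String> set) throws IOException {
        int count = 0;
        String line = br.readLine();
        // readLine() returns null at end of input, so test for that first
        while (line != null && !set.contains(line)) {
            count++;                 // "do something here"
            line = br.readLine();    // advance to the next line
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String[] stringArray = { "stop", "end" };
        Set<String> set = new HashSet<String>(Arrays.asList(stringArray));
        // StringReader is only a stand-in for a FileReader here
        BufferedReader br = new BufferedReader(new StringReader("alpha\nbeta\nend\ngamma"));
        System.out.println(linesBeforeMatch(br, set)); // prints 2
    }
}
```

This only matches when the whole line equals one of the strings.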
[EDIT] If you want to find out whether a part of the line contains a string from the set, you have to loop over the set. Replace set.contains(line) with a call to:
public boolean matches(Set<String> set, String line) {
for (String check: set) {
if (line.contains(check)) return true;
}
return false;
}
Adjust the check accordingly when you use regexp or a more complex method for matching.
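To make the difference between the two checks concrete, here is a small sketch; the class name SubstringMatchDemo and the sample strings are assumptions for illustration only. It shows that set.contains(line) fails on a longer line while the loop with line.contains(check) succeeds.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SubstringMatchDemo {
    // Returns true if any string in the set occurs somewhere inside the line.
    static boolean matches(Set<String> set, String line) {
        for (String check : set) {
            if (line.contains(check)) return true; // stop at the first hit
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<String>(Arrays.asList("error", "fatal"));
        String line = "an error occurred";
        System.out.println(set.contains(line));  // prints false: the whole line is not in the set
        System.out.println(matches(set, line));  // prints true: "error" is a substring of the line
    }
}
```

The loop costs one substring search per entry in the set, so it is linear in the number of strings.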
[EDIT2] A third option is to concatenate the elements of the array into one big regexp, separated by |:
Pattern p = Pattern.compile("str1|str2|str3");
while (!p.matcher(line).find()) { // or matches for a whole-string match
...
}
This may be cheaper if you have many elements in the array, since the pattern is compiled once and the regexp engine can optimize the matching process.
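A sketch of building that pattern from the array rather than typing it by hand. The class name AlternationDemo, the helper anyOf and the sample strings are hypothetical. Each element is passed through Pattern.quote() so that characters such as . or + are matched literally instead of being read as regexp syntax.

```java
import java.util.regex.Pattern;

public class AlternationDemo {
    // Hypothetical helper: joins the quoted elements with | into one pattern.
    // Assumes the array is non-empty; an empty array would yield an empty
    // pattern, which find() matches on every line.
    static Pattern anyOf(String[] words) {
        StringBuilder sb = new StringBuilder();
        for (String w : words) {
            if (sb.length() > 0) sb.append('|');
            sb.append(Pattern.quote(w)); // treat the element as a literal string
        }
        return Pattern.compile(sb.toString());
    }

    public static void main(String[] args) {
        String[] stringArray = { "a.b", "c+d" };
        Pattern p = anyOf(stringArray);
        System.out.println(p.matcher("x a.b y").find()); // prints true
        System.out.println(p.matcher("x aXb y").find()); // prints false: the dot is literal
    }
}
```

Without the quoting, "a.b" would also match "aXb", because an unquoted dot matches any character.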
5 Comments
[...] Pattern.quote() first.
It depends on what stringArray is. If it's a Collection, then fine. If it's a true array, you should turn it into a Collection. The Collection interface has a method called contains() that determines whether a given Object is in the Collection.
A simple way to turn an array into a Collection:
String[] tokens = { ... };
List<String> list = Arrays.asList(tokens);
The problem with a List is that lookup is expensive (technically linear, or O(n)). A better bet is to use a Set, which is unordered but has near-constant (O(1)) lookup. You can construct one like this:
From a Collection:
Set<String> set = new HashSet<String>(stringList);
From an array:
Set<String> set = new HashSet<String>(Arrays.asList(stringArray));
and then set.contains(line) will be a cheap operation.
Edit: Ok, I think your question wasn't clear. You want to see if the line contains any of the words in the array. What you want then is something like this:
BufferedReader in = null;
Set<String> words = ... // construct this as per above
try {
  in = ...
  String line; // Java does not allow a declaration inside the loop condition
  while ((line = in.readLine()) != null) {
    for (String word : words) {
      if (line.contains(word)) {
        // do whatever
      }
    }
  }
} catch (Exception e) {
  e.printStackTrace();
} finally {
  if (in != null) { try { in.close(); } catch (Exception e) { } }
}
This is quite a crude check, which is used surprisingly often and tends to give annoying false positives on words like "scrap". For a more sophisticated solution you probably have to use a regular expression and look for word boundaries:
// \b matches at a word boundary; in a Java string literal it must be written as "\\b"
Pattern p = Pattern.compile("\\b" + word + "\\b");
Matcher m = p.matcher(line);
if (m.find()) {
  // word found
}
You will probably want to do this more efficiently (like not compiling the pattern with every line) but that's the basic tool to use.
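One way to avoid recompiling for every line is to build the pattern once and reuse it, roughly as sketched here. The class name WordBoundaryDemo, the helper wholeWord, the search word "rap" and the sample lines are assumptions chosen to show the "scrap" false positive going away. Pattern.quote() is added so the word is matched literally.

```java
import java.util.regex.Pattern;

public class WordBoundaryDemo {
    // Hypothetical helper: compiles a pattern that finds the word only when
    // it is delimited by word boundaries on both sides.
    static Pattern wholeWord(String word) {
        return Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
    }

    public static void main(String[] args) {
        Pattern p = wholeWord("rap"); // compiled once, reused for every line
        String[] lines = { "a scrap of paper", "a rap song" };
        for (String line : lines) {
            // prints false for the first line and true for the second
            System.out.println(p.matcher(line).find());
        }
    }
}
```

A plain line.contains("rap") would report both lines as matches.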
Using the String.matches(regex) function, what about creating a regular expression that matches any one of the strings in the string array? Something like
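// Assumes array is non-empty and its elements contain no regex metacharacters
String regex = ".*("; // ".*" lets the match start anywhere; a bare "*" is not a valid regex
for (int i = 0; i < array.length - 1; ++i)
  regex += array[i] + "|";
regex += array[array.length - 1] + ").*"; // last valid index is length - 1
while (line.matches(regex))
{
  //. . .
}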