I have a string:
    String example = "this site holds all the examples from The Java Developers Almanac and more. Copy and paste these examples directly into your applications";
After tokenizing the string and doing some processing on it, I have an ArrayList like this:
    ArrayList<token> arl = { "this site holds", "holds all the examples", "the examples from The Java Developers", "Copy and paste" }
For each element I know the start and end position in the string: "this site holds" has start = 1, end = 3; "holds all the examples" has start = 3, end = 6; "the examples from The Java Developers" has start = 5, end = 10; "Copy and paste" has start = 14, end = 17.
As we can see, some elements in arl overlap: "this site holds", "holds all the examples" and "the examples from The Java Developers".
The problem is: how can I merge the overlapping elements to get an ArrayList like this?
    ArrayList result = { "this site holds all the examples from The Java Developers", "Copy and paste" };
Here is my code, but it only merges the first pair of overlapping elements:
    public ArrayList<TextChunks> finalTextChunks(ArrayList<TextChunks> textchunkswithkeyword) {
        ArrayList<TextChunks> result = (ArrayList<TextChunks>) textchunkswithkeyword.clone();
        //System.out.print(result.size());
        int j;
        for (int i = 0; i < result.size(); i++) {
            int index = i;
            if (i + 1 >= result.size()) {
                break;
            }
            j = i + 1;
            if (result.get(i).checkOverlapingTwoTextchunks(result.get(j)) == true) {
                TextChunks temp = new TextChunks();
                temp = handleOverlaping(textchunkswithkeyword.get(i), textchunkswithkeyword.get(j), resultSearchEngine);
                result.set(i, temp);
                result.remove(j);
                i = index;
                continue;
            }
        }
        return result;
    }
Thanks in advance.
- I'm not sure that I understand what you're asking for. Could you clarify your question? Perhaps by using a sample string that doesn't look like part of a question? – jasonmp85, Jun 2, 2010 at 4:25
- Sorry, my English is weak. I have edited my question; I hope you can understand it now! – tiendv, Jun 2, 2010 at 4:44
1 Answer
The following should do it, or at least illustrate an idea for merging the chunks. Basically, I destroy the existing chunks and create new ones. That sounds horrible, but it simplifies a lot: I store every word in a list at its position in the original text, then iterate over that word list to build new (merged!) chunks.
    private List<TextChunks> finalTextChunks(List<TextChunks> textchunkswithkeyword) {
        List<TextChunks> result = new ArrayList<TextChunks>();
        List<String> wordList = new ArrayList<String>();

        // store all words in a list at their original positions;
        // positions not covered by any chunk are represented by null entries
        for (TextChunks chunk : textchunkswithkeyword) {
            int start = chunk.getStartTextchunks();
            List<Token> tokens = chunk.getTokens(); // TODO - implement getTokens() in TextChunks class
            // grow the list first, otherwise set() throws IndexOutOfBoundsException
            while (wordList.size() < start + tokens.size()) {
                wordList.add(null);
            }
            for (int i = 0; i < tokens.size(); i++) {
                wordList.set(start + i, tokens.get(i).toString()); // TODO - override toString() in Token class
            }
        }

        // recreate the chunks
        int start = 0;
        boolean isChunk = false;
        StringBuilder chunkBuilder = null;
        for (int i = 0; i < wordList.size(); i++) {
            String word = wordList.get(i);
            if (word == null) {
                if (isChunk) {
                    // end of chunk detected; its last word sits at position i - 1
                    result.add(new TextChunks(chunkBuilder.toString(), start, i - 1));
                    isChunk = false;
                }
            } else if (isChunk) {
                // chunk gets longer by one word
                chunkBuilder.append(" ").append(word);
            } else {
                // new chunk starts here
                chunkBuilder = new StringBuilder(word);
                start = i;
                isChunk = true;
            }
        }
        if (isChunk) {
            // create and add the last chunk
            result.add(new TextChunks(chunkBuilder.toString(), start, wordList.size() - 1));
        }
        return result;
    }
(Warning - absolutely not tested, I have neither an IDE nor a compiler at hand)
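For anyone who wants to try the idea without the original classes, here is a minimal self-contained sketch of the same word-list approach. `Chunk` and `ChunkMerger` are hypothetical stand-ins (they are not the `TextChunks` and `Token` classes from the question), and positions are assumed to be word indexes with an inclusive end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for TextChunks: words plus inclusive start/end word positions.
final class Chunk {
    final List<String> words;
    final int start;
    final int end;

    Chunk(List<String> words, int start, int end) {
        this.words = words;
        this.start = start;
        this.end = end;
    }

    String text() {
        return String.join(" ", words);
    }
}

final class ChunkMerger {
    static List<Chunk> merge(List<Chunk> chunks) {
        // Find the highest covered position so the slot array can be sized up front.
        int maxEnd = -1;
        for (Chunk c : chunks) {
            maxEnd = Math.max(maxEnd, c.start + c.words.size() - 1);
        }

        // One slot per word position; null marks a position that no chunk covers.
        String[] slots = new String[maxEnd + 1];
        for (Chunk c : chunks) {
            for (int i = 0; i < c.words.size(); i++) {
                slots[c.start + i] = c.words.get(i);
            }
        }

        // Every maximal run of non-null slots becomes one merged chunk.
        List<Chunk> result = new ArrayList<>();
        int i = 0;
        while (i < slots.length) {
            if (slots[i] == null) {
                i++;
                continue;
            }
            int start = i;
            List<String> words = new ArrayList<>();
            while (i < slots.length && slots[i] != null) {
                words.add(slots[i]);
                i++;
            }
            result.add(new Chunk(words, start, i - 1));
        }
        return result;
    }
}
```

Feeding it the four chunks from the question (starting at positions 1, 3, 5 and 14) should give two chunks back. Note that this approach also joins chunks that merely touch (one ends at position n, the next starts at n + 1), because there is no null slot between them.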
EDIT
Changed the code: you said that the TextChunks class holds a token (words?) array. It was just three simple modifications.
EDIT 2
Final edit: I partially adapted my code to your classes. What you need to do:
- implement a `getTokens()` method in `TextChunks` that simply returns the `arrt` field
- implement a `TextChunks` constructor that takes a String (with space-separated words), the start and the end. Your `Token` class already provides a static method to convert the String into an ArrayList of tokens
- override the `toString()` method in the `Token` class so that it simply returns the token String.
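The three additions listed above could look roughly like the sketch below. Since the real `Token` and `TextChunks` classes are not shown in the question, everything here is an assumption: the static helper is called `tokenize` only for illustration (use whatever your `Token` class already provides), `arrt` is assumed to be an `ArrayList<Token>`, and `getEndTextchunks()` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal Token; the real class will have more in it.
class Token {
    private final String text;

    Token(String text) {
        this.text = text;
    }

    // Assumed stand-in for the existing static String-to-tokens helper.
    static ArrayList<Token> tokenize(String s) {
        ArrayList<Token> tokens = new ArrayList<>();
        for (String word : s.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                tokens.add(new Token(word));
            }
        }
        return tokens;
    }

    // Return just the token String so the merge code can rebuild chunk text.
    @Override
    public String toString() {
        return text;
    }
}

// Hypothetical minimal TextChunks showing only the pieces the merge code needs.
class TextChunks {
    private final ArrayList<Token> arrt;
    private final int start;
    private final int end;

    // Constructor taking space-separated words plus start and end positions.
    TextChunks(String words, int start, int end) {
        this.arrt = Token.tokenize(words);
        this.start = start;
        this.end = end;
    }

    // Simply exposes the arrt field.
    List<Token> getTokens() {
        return arrt;
    }

    int getStartTextchunks() {
        return start;
    }

    // Hypothetical accessor name for the end position.
    int getEndTextchunks() {
        return end;
    }
}
```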