Timeline for Replacing string with placeholder and replacing them back after a function.
Current License: CC BY-SA 3.0
15 events
| when | what | action | by | license / comment | |
|---|---|---|---|---|---|
| S Mar 25, 2018 at 10:41 | history | bounty ended | Community Bot | ||
| S Mar 25, 2018 at 10:41 | history | notice removed | Community Bot | ||
| Mar 23, 2018 at 17:03 | answer | added | jferard | timeline score: 1 | |
| Mar 23, 2018 at 3:26 | answer | added | Alain T. | timeline score: 1 | |
| Mar 21, 2018 at 16:02 | answer | added | cryptoplex | timeline score: 2 | |
| Mar 21, 2018 at 8:28 | answer | added | Dmitry Arkhipenko | timeline score: 2 | |
| Mar 20, 2018 at 11:29 | comment | added | Wiktor Stribiżew | Have you tried my approach? Or do you want to switch to FlashText now? | |
| Mar 19, 2018 at 10:02 | comment | added | Wiktor Stribiżew | As for the word boundary, you must be looking for `r"(?<!\w){}(?!\w)".format(phrase)`. Since some of your keywords start with non-word characters, you cannot use `\b`. Could you please provide some more of the logic that you need to implement? It looks like you might need to pass a callback/lambda as the second argument to `re.sub` to replace each match just once. | |
| Mar 19, 2018 at 5:37 | comment | added | alvas | This seems to be one alternative: github.com/vi3k6i5/flashtext | |
| Mar 18, 2018 at 21:09 | comment | added | user557597 | In a single pass, I would match all the words using a regex and put them into a two-dimensional array (or list). Dimension 0 is the string part, dimension 1 is a flag. When you match a non-phrase string part, the flag is 0; when it is a phrase word, the flag is 1. You can then iterate over the array and ignore the elements where the flag is 1. Add, delete, and re-arrange elements as needed, then join them back together. The regex is simple: `((?:(?!phrase1\|phrase2\|phrase3)[\S\s])+)\|(phrase1\|phrase2\|phrase3)`, where capture group 1 is a non-phrase string part and capture group 2 is a phrase. | |
| Mar 18, 2018 at 20:59 | comment | added | user557597 | You're doing this the hard way. Then there'll be some functions to manipulate the text with the placeholders. So, you have a function to work on the text after adding the placeholders. And that function must do a split on whitespace or something. So, now you have an array where you manipulate all the elements except the placeholders, then you want to join the array into a string, then substitute the placeholders back using the real words. Is that correct? | |
| Mar 18, 2018 at 14:50 | comment | added | Ajax1234 | In your desired output, all strings that do not occur in phrases are removed, except for ik. Why is that? | |
| S Mar 17, 2018 at 9:19 | history | bounty started | alvas | ||
| S Mar 17, 2018 at 9:19 | history | notice added | alvas | Authoritative reference needed | |
| Mar 14, 2018 at 8:36 | history | asked | alvas | CC BY-SA 3.0 | |
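
Two of the comments above describe regex techniques: guarding each phrase with `(?<!\w)` and `(?!\w)` instead of `\b`, and walking the text in a single pass that flags each part as phrase or non-phrase. The following is a minimal sketch of both ideas, not the code from any of the answers. The function names and the sample phrases are hypothetical, since the question's data is not part of this timeline. It also uses `re.finditer` and slices the gaps between matches rather than the two-group regex from the comment; that is a substitution made here for readability, assuming the same flagging result is wanted.

```python
import re


def phrase_pattern(phrases):
    # One alternation of all phrases, guarded by lookarounds instead of \b,
    # so that phrases starting or ending with a non-word character
    # (for example "#hashtag") can still match.
    # Longest phrases go first so a longer phrase wins over a shorter prefix.
    ordered = sorted(phrases, key=len, reverse=True)
    alternation = "|".join(re.escape(p) for p in ordered)
    return re.compile(r"(?<!\w)(?:{})(?!\w)".format(alternation))


def split_flagged(text, phrases):
    # Single pass over the text: returns (chunk, is_phrase) pairs that,
    # joined in order, reproduce the original text exactly.
    parts = []
    pos = 0
    for match in phrase_pattern(phrases).finditer(text):
        if match.start() > pos:
            parts.append((text[pos:match.start()], False))
        parts.append((match.group(0), True))
        pos = match.end()
    if pos < len(text):
        parts.append((text[pos:], False))
    return parts


def transform_outside_phrases(text, phrases, func):
    # Apply func only to the non-phrase chunks, then join everything back.
    return "".join(
        chunk if is_phrase else func(chunk)
        for chunk, is_phrase in split_flagged(text, phrases)
    )


# Hypothetical sample input, for illustration only.
sample_phrases = ["New York", "#hashtag"]
sample_text = "I love New York and #hashtag things"
print(transform_outside_phrases(sample_text, sample_phrases, str.upper))
# prints: I LOVE New York AND #hashtag THINGS
```

Because the protected phrases never leave the text, no placeholder has to be substituted in and back out again; whether that fits depends on what the manipulating function needs to see, which the timeline does not specify.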