I need to classify each line as "announce", "whisper" or "chat". Once I have that sorted out, I need to extract certain values to be processed.
Right now my regex is as follows:
var regex = new Regex(@"^\[(\d{2}:\d{2}:\d{2})\]\s*(?:(\[System Message\])?\s*<([^>]*)>|((.+) Whisper You :))\s*(.*)$");
- Group 0 is the entire message.
- Group 1 is the time the message was sent.
- Group 2 indicates whether it was an announcement (it matches only when "[System Message]" is present; otherwise the line is chat).
- Group 3 is who sent the announcement or chat message.
- Group 4 indicates whether it was a whisper.
- Group 5 is who sent the whisper.
- Group 6 is the message sent by the user or system.
Classify each line:
if group 4 matches
    it is a whisper
else if group 2 matches
    it is an announcement
else
    it is normal chat
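As a minimal sketch, the classification above could be written like this in C#, using the numbered groups from my regex unchanged. The class name `ChatLineClassifier`, the `Classify` helper, and the "unknown" result for lines that do not match at all are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ChatLineClassifier
{
    // Same pattern as in the question, unchanged.
    static readonly Regex LineRegex = new Regex(
        @"^\[(\d{2}:\d{2}:\d{2})\]\s*(?:(\[System Message\])?\s*<([^>]*)>|((.+) Whisper You :))\s*(.*)$");

    // Hypothetical helper: returns "whisper", "announce", "chat" or "unknown".
    public static string Classify(string line)
    {
        Match m = LineRegex.Match(line);
        if (!m.Success) return "unknown";             // line does not fit the format
        if (m.Groups[4].Success) return "whisper";    // "<name> Whisper You :" branch matched
        if (m.Groups[2].Success) return "announce";   // "[System Message]" marker present
        return "chat";                                // plain "<name> text" line
    }

    public static void Main()
    {
        Console.WriteLine(Classify("[02:33:03] John Whisper You : Heya"));
        Console.WriteLine(Classify("[02:33:57] [System Message] <John> has left the room"));
        Console.WriteLine(Classify("[02:33:39] <John> heya"));
    }
}
```

Checking `Success` on a group is what makes this work: a group that sits in the alternative that was not taken reports `Success == false`.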
Should I change anything in my regex to make the matches more precise?
Sample data:
[02:33:03] John Whisper You : Heya
[02:33:03] John Whisper You : How is it going
[02:33:12] <John> [02:33:16] [System Message] bla bla
[02:33:39] <John> heya
[02:33:40] <John> hello :S
[02:33:57] <John> hi
[02:33:57] [System Message] <John> has left the room
[02:33:57] [System Message] <John> has entered the room
1 Answer
You can always break it down across multiple lines to make it more readable. You can also use named groups, which take the "magic" out of the group numbers (4 == whisper, 3 == normal, etc.).
var regex = new Regex(@"^\[(?<TimeStamp>\d{2}:\d{2}:\d{2})\]\s*" +
@"(?:" +
@"(?<SysMessage>\[System Message\])?\s*" +
@"<(?<NormalWho>[^>]*)>|" +
@"(?<Whisper>(?<WhisperWho>.+) Whisper You :))\s*" +
@"(?<Message>.*)$");
string data = @"[02:33:03] John Whisper You : Heya
[02:33:03] John Whisper You : How is it going
[02:33:12] <John> [02:33:16] [System Message] bla bla
[02:33:39] <John> heya
[02:33:40] <John> hello :S
[02:33:57] <John> hi
[02:33:57] [System Message] <John> has left the room
[02:33:57] [System Message] <John> has entered the room";
foreach (var msg in data.Split(new char[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries))
{
    Match match = regex.Match(msg);
    if (match.Success)
    {
        if (match.Groups["Whisper"].Success)
        {
            Console.WriteLine("[whis from {0}]: {1}", match.Groups["WhisperWho"].Value, msg);
        }
        else if (match.Groups["SysMessage"].Success)
        {
            Console.WriteLine("[sys msg]: {0}", msg);
        }
        else
        {
            Console.WriteLine("[normal from {0}]: {1}", match.Groups["NormalWho"].Value, msg);
        }
    }
}
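Another way to split the pattern across lines, sketched here as an alternative rather than a required change, is `RegexOptions.IgnorePatternWhitespace`, which lets the pattern live in one verbatim string with `#` comments. The catch is that unescaped spaces in the pattern are then ignored, so the literal spaces in "System Message" and " Whisper You :" have to be escaped. The class name `VerbosePatternDemo` is made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VerbosePatternDemo
{
    // Same named groups as above; literal spaces are written as "\ " because
    // IgnorePatternWhitespace drops unescaped whitespace from the pattern.
    public static readonly Regex LineRegex = new Regex(@"
        ^\[(?<TimeStamp>\d{2}:\d{2}:\d{2})\]\s*            # timestamp in square brackets
        (?:
            (?<SysMessage>\[System\ Message\])?\s*         # optional system marker
            <(?<NormalWho>[^>]*)>                          # sender in angle brackets
          |
            (?<Whisper>(?<WhisperWho>.+)\ Whisper\ You\ :) # whisper prefix with sender
        )\s*
        (?<Message>.*)$                                    # remaining message text
        ", RegexOptions.IgnorePatternWhitespace);

    public static void Main()
    {
        Match m = LineRegex.Match("[02:33:03] John Whisper You : Heya");
        Console.WriteLine(m.Groups["WhisperWho"].Value); // John
        Console.WriteLine(m.Groups["Message"].Value);    // Heya
    }
}
```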
That is pretty cool, dude, thanks. I was looking for a way to split each pattern I wanted like that, but was not aware of how to do it. – Prix, May 31, 2011 at 23:49