I've created a function that will deserialize an email header into a list of key-value pairs. I've run numerous tests using MS Office Outlook 2010 and MS Office 14.0 Object Library, all of which were successful.
However, I'd still like to know if this function could be optimized/improved.
Note
- I didn't use the Dictionary(Of TKey, TValue) class because a
key
can appear multiple times. - The reason as to why I wrote this RegEx pattern is because a
value
can also contain anew line
and/or:
.
Public Shared Function Deserialize(header As String) As List(Of KeyValuePair(Of String, String))
Const pattern As String = "(\n|^)[a-z|A-Z|-]*:\s"
If (header Is Nothing) Then Throw New ArgumentNullException("header")
Dim result As New List(Of KeyValuePair(Of String, String))
Dim matches As MatchCollection = Regex.Matches(header, pattern)
If ((Not matches Is Nothing) AndAlso (matches.Count > 0)) Then
Dim endIndex As Integer = header.Length
Dim startIndex As Integer = Nothing
Dim item As Match = Nothing
Dim key As String = Nothing
Dim value As String = Nothing
For index As Integer = (matches.Count - 1) To 0 Step -1
item = matches.Item(index)
startIndex = (item.Index + item.Length)
key = item.Value.Trim().Replace(":", "")
value = header.Substring(startIndex, (endIndex - startIndex))
result.Insert(0, New KeyValuePair(Of String, String)(key, value))
endIndex = item.Index
Next
End If
Return result
End Function
-
\$\begingroup\$ Take a look at this. The regular expression in MimeMessage uses groups. You could have everything in one regex and get the specific data with matches(index).Groups("header_key") and matches(index).Groups("header_value"). mailutilities.codeplex.com/SourceControl/latest#MailUtilities/… \$\endgroup\$the_lotus– the_lotus2014年09月23日 13:04:09 +00:00Commented Sep 23, 2014 at 13:04
1 Answer 1
There are a couple of things that I find odd about this code. This is the first thing that jumps out at me.
Dim startIndex As Integer = Nothing Dim item As Match = Nothing Dim key As String = Nothing
I see no purpose in initializing variable to Nothing
. Simply declaring them would be cleaner in my opinion.
Dim startIndex As Integer
Dim item as Match
Dim key As String
I also find your For
loop a bit strange. Is there a reason you're stepping backwards? If there is, I don't see it and you should comment on why in the code.
For index As Integer = (matches.Count - 1) To 0 Step -1
Personally, I would rewrite this to loop forward like this.
For index As Integer = 0 To matches.Count - 1
'...
Next index
You could also use a comment here.
Const pattern As String = "(\n|^)[a-z|A-Z|-]*:\s"
Regular Expression patterns are rarely obvious. Tell Mr. Maintainer in English what you're matching on.
-
\$\begingroup\$ Call it a bad (?) habit but I always set my variables to "something", but I agree that it's not necessary. I used a reverse loop to avoid peeking the next item (if any) in each iteration. I agree 100% with your last part. \$\endgroup\$Bjørn-Roger Kringsjå– Bjørn-Roger Kringsjå2014年09月19日 13:54:15 +00:00Commented Sep 19, 2014 at 13:54
-
\$\begingroup\$ Good enough reason for looping backwards I suppose, but I would definitely add a comment explaining that. \$\endgroup\$RubberDuck– RubberDuck2014年09月19日 14:00:26 +00:00Commented Sep 19, 2014 at 14:00