My program needs to parse an e-mail string. An e-mail address can be entered in two ways: with an alias, or as the plain e-mail address without one.
1st possibility:
string addressWithAlias = "test my address <[email protected]>";
2nd possibility:
string addressWithoutAlias = "[email protected]";
So, I wrote two functions:
private static string[] getAddressPartsRegex(string address)
{
string plainaddress = address.Trim();
Regex reg = new Regex(@"(.+?(?=<))<(.*@.*?)>");
var gr = reg.Match(plainaddress).Groups;
return gr.Count == 1
? new[] { plainaddress }
: new[] { gr[1].Value.Trim(), gr[2].Value.Trim() };
}
private static string[] getAddressParts(string address)
{
var splittedAdress = address.Split(' ');
return splittedAdress.Last().Trim().StartsWith("<")
? new[] { string.Join(" ", splittedAdress.Take(splittedAdress.Length - 1)), splittedAdress.Last().Trim(' ', '<', '>') }
: splittedAdress;
}
They both work fine and the results are the same. One uses regex, the other uses Split and Join.
Which one would you suggest using, and why? Which is the more beautiful function?
Are there any bugs I didn't see?
3 Answers
Consider taking advantage of existing features that could provide an additional layer of validation, mainly System.Net.Mail.MailAddress.
Also, as mentioned in a comment, there is no need to create the regular expression every time the function is called.
static Regex mailExpression = new Regex(@"(.+?(?=<))<(.*@.*?)>");
private static MailAddress getAddress(string address) {
if (address == null) throw new ArgumentNullException("address");
if (string.IsNullOrWhiteSpace(address)) throw new ArgumentException("invalid address", "address");
var plainaddress = address.Trim();
var groups = mailExpression.Match(plainaddress).Groups;
return groups.Count == 1
? new MailAddress(plainaddress)
: new MailAddress(groups[2].Value.Trim(), groups[1].Value.Trim());
}
According to the reference source code, MailAddress will internally try to parse the address given to it.
This avoids having to roll your own parser, as one already exists out of the box that has been tried, tested and is stable.
private static MailAddress getAddress(string address) {
if (address == null) throw new ArgumentNullException("address");
if (string.IsNullOrWhiteSpace(address)) throw new ArgumentException("invalid address", "address");
address = address.Trim();
return new MailAddress(address);
}
You have the added advantage of having a strongly typed object model to work with that will provide you with usable properties.
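As a rough sketch of what those properties look like in practice: the helper below is hypothetical (`TryGetAddress` and `Describe` are names made up for illustration, and the sample addresses are placeholders), and it assumes you would rather get `null` back than handle the `FormatException` that the `MailAddress` constructor throws for input it cannot parse.

```csharp
using System;
using System.Net.Mail;

public static class MailAddressSketch
{
    // Hypothetical helper, not part of the answer above: returns null
    // instead of throwing when the input cannot be parsed.
    public static MailAddress TryGetAddress(string address)
    {
        if (string.IsNullOrWhiteSpace(address)) return null;
        try
        {
            return new MailAddress(address.Trim());
        }
        catch (FormatException)
        {
            // MailAddress rejects strings that are not in address form.
            return null;
        }
    }

    // Shows the typed properties: alias, local part, and domain.
    public static string Describe(string address)
    {
        var parsed = TryGetAddress(address);
        return parsed == null
            ? "invalid"
            : parsed.DisplayName + "|" + parsed.User + "|" + parsed.Host;
    }
}
```

For example, `MailAddressSketch.Describe("Jane Doe <jane@example.com>")` should give `Jane Doe|jane|example.com`, while an address without an alias leaves `DisplayName` empty.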
The following Unit Test demonstrates the desired behavior.
[TestClass]
public class EmailParserTest {
[TestMethod]
public void Should_Parse_EmailAddress_With_Alias() {
//Arrange
var expectedAlias = "test my address";
var expectedAddress = "[email protected]";
string addressWithAlias = "test my address <[email protected]>";
//Act
var mailAddressWithAlias = getAddress(addressWithAlias);
//Assert
mailAddressWithAlias
.Should()
.NotBeNull()
.And.Match<MailAddress>(_ => _.Address == expectedAddress && _.DisplayName == expectedAlias);
}
[TestMethod]
public void Should_Parse_EmailAddress_Without_Alias() {
//Arrange
var addressWithoutAlias = "[email protected]";
//Act
var mailAddressWithoutAlias = getAddress(addressWithoutAlias);
//Assert
mailAddressWithoutAlias
.Should()
.NotBeNull()
.And.Match<MailAddress>(_ => _.Address == addressWithoutAlias && _.DisplayName == string.Empty);
}
private static MailAddress getAddress(string address) {
if (address == null) throw new ArgumentNullException("address");
if (string.IsNullOrWhiteSpace(address)) throw new ArgumentException("invalid address", "address");
address = address.Trim();
return new MailAddress(address);
}
}
-
I'm working with MimeKit.MailboxAddress here; unfortunately, the MimeKit library doesn't do me that favour, MailKit throws an error. But you are right, of course :) (Matthias Burger, Mar 1, 2018 at 15:15)
-
I now let System.Net.Mail do this. It's better indeed. And an interesting unit test. Which library do you use? (Matthias Burger, Mar 1, 2018 at 15:29)
-
I used the standard VS testing tools for the test runner and Fluent Assertions to assert. (Nkosi, Mar 1, 2018 at 15:34)
-
If you want the regex, go one step further and declare it as
static readonly Regex mailExpression = new Regex(@"(.+?(?=<))<(.*@.*?)>", RegexOptions.Compiled);
(Jesse C. Slicer, Jan 17, 2023 at 18:51)
Let me suggest an alternative regular expression, that correctly handles both cases:
(.*?)<?(\b\S+@\S+\b)>?
This regular expression correctly identifies both patterns you want to support. Somewhat noteworthy here is the use of \S in the email address to exclude whitespace characters, which are incorrectly allowed in your original regex. That led to accepting something like the following as a valid email specification:
bla bla <te st@ exampl e.com>
Another thing this regex does is accept specifications in which the email address is not enclosed in <>. It does so by requiring that the address be surrounded by word boundaries (\b).
You should be able to easily use it like so:
static Regex mailExpression = new Regex(@"(.*?)<?(\b\S+@\S+\b)>?");
private static string[] getAddressParts(string addressSpec)
{
var groups = mailExpression.Match(addressSpec).Groups;
// Compare the captured text, not the Group object itself.
return groups[1].Value == ""
? new[] { groups[2].Value }
: new[] { groups[1].Value.Trim(), groups[2].Value };
}
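The whitespace difference described above can be checked side by side. This is only a sketch: the class and method names (`RegexComparison`, `AddressOrNull`, `WithOriginal`, `WithAlternative`) are made up for illustration, and the sample inputs are placeholders. Both patterns are copied unchanged from the question and from this answer.

```csharp
using System.Text.RegularExpressions;

public static class RegexComparison
{
    // Pattern from the question: '.' also matches spaces inside the brackets.
    static readonly Regex Original = new Regex(@"(.+?(?=<))<(.*@.*?)>");

    // Pattern from this answer: \S does not match whitespace.
    static readonly Regex Alternative = new Regex(@"(.*?)<?(\b\S+@\S+\b)>?");

    // Returns the captured address (group 2), or null when nothing matches.
    public static string AddressOrNull(Regex pattern, string input)
    {
        var match = pattern.Match(input);
        return match.Success ? match.Groups[2].Value : null;
    }

    public static string WithOriginal(string input)
    {
        return AddressOrNull(Original, input);
    }

    public static string WithAlternative(string input)
    {
        return AddressOrNull(Alternative, input);
    }
}
```

With the input `bla bla <te st@ exampl e.com>`, `WithOriginal` returns `te st@ exampl e.com`, while `WithAlternative` returns `null` because the only `@` is followed by a space. Checking `Match.Success` rather than the group count makes the no-match case explicit.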
This does not, of course, preclude using the very valid suggestion by Nkosi.
-
FWIW this regex does not correctly deal with quoted-string local parts... A fix is pretty simple, though, and left as an exercise in regex for the reader. (Vogel612, Mar 1, 2018 at 16:51)
If you are using MimeKit, then you can use the methods Parse and TryParse of MailboxAddress. Here's a modified version of the above unit test.
[TestMethod]
public void Should_Parse_EmailAddress_With_Alias()
{
//Arrange
var expectedAlias = "test my address";
var expectedAddress = "[email protected]";
string addressWithAlias = "test my address <[email protected]>";
//Act
var mailAddressWithAlias = MimeKit.MailboxAddress.Parse(addressWithAlias);
//Assert
Assert.AreEqual(expectedAddress, mailAddressWithAlias.Address);
Assert.AreEqual(expectedAlias, mailAddressWithAlias.Name);
}
James Bond <[email protected]> will display in Outlook as just James Bond instead of his e-mail address. So, this is kind of a standard :)