\$\begingroup\$

I am trying to implement a substring extractor that takes a "start keyword" and an "end keyword". The extracted result runs from the start keyword to the end keyword, excluding both keywords. For example:

| Input String | Start Keyword | End Keyword | Output |
|---|---|---|---|
| "C# was developed around 2000 by Microsoft as part of its .NET initiative" | "" (empty string) | "" (empty string) | "C# was developed around 2000 by Microsoft as part of its .NET initiative" |
| "C# was developed around 2000 by Microsoft as part of its .NET initiative" | "" (empty string) | ".NET" | "C# was developed around 2000 by Microsoft as part of its" |
| "C# was developed around 2000 by Microsoft as part of its .NET initiative" | "C#" | "" (empty string) | "was developed around 2000 by Microsoft as part of its .NET initiative" |
| "C# was developed around 2000 by Microsoft as part of its .NET initiative" | "C#" | ".NET" | "was developed around 2000 by Microsoft as part of its" |
| "C# was developed around 2000 by Microsoft as part of its .NET initiative" | ".NET" | "" (empty string) | "initiative" |
| "C# was developed around 2000 by Microsoft as part of its .NET initiative" | "" (empty string) | "C#" | "" (empty string) |
| "C# was developed around 2000 by Microsoft as part of its .NET initiative" | ".NET" | "C#" | "" (empty string) |

The experimental implementation

The experimental implementation is as below.

private static string GetTargetString(string stringInput, string startKeywordInput, string endKeywordInput)
{
    int startIndex;
    if (String.IsNullOrEmpty(startKeywordInput))
    {
        startIndex = 0;
    }
    else
    {
        if (stringInput.IndexOf(startKeywordInput) >= 0)
        {
            startIndex = stringInput.IndexOf(startKeywordInput) + startKeywordInput.Length;
        }
        else
        {
            return "";
        }
    }
    int endIndex;
    if (String.IsNullOrEmpty(endKeywordInput))
    {
        endIndex = stringInput.Length;
    }
    else
    {
        if (stringInput.IndexOf(endKeywordInput) > startIndex)
        {
            endIndex = stringInput.IndexOf(endKeywordInput);
        }
        else
        {
            return "";
        }
    }

    // Check startIndex and endIndex
    if (startIndex < 0 || endIndex < 0 || startIndex >= endIndex)
    {
        return "";
    }
    if (endIndex.Equals(0).Equals(true))
    {
        endIndex = stringInput.Length;
    }
    int TargetStringLength = endIndex - startIndex;
    return stringInput.Substring(startIndex, TargetStringLength).Trim();
}

Test cases

string test_string1 = "C# was developed around 2000 by Microsoft as part of its .NET initiative";
Console.WriteLine(GetTargetString(test_string1, "", ""));
Console.WriteLine(GetTargetString(test_string1, "", ".NET"));
Console.WriteLine(GetTargetString(test_string1, "C#", ""));
Console.WriteLine(GetTargetString(test_string1, "C#", ".NET"));
Console.WriteLine(GetTargetString(test_string1, ".NET", ""));
Console.WriteLine(GetTargetString(test_string1, "", "C#"));
Console.WriteLine(GetTargetString(test_string1, ".NET", "C#"));

The output of the above test cases.

C# was developed around 2000 by Microsoft as part of its .NET initiative
C# was developed around 2000 by Microsoft as part of its
was developed around 2000 by Microsoft as part of its .NET initiative
was developed around 2000 by Microsoft as part of its
initiative

If there is any possible improvement, please let me know.

asked Dec 31, 2020 at 8:37
\$\endgroup\$
  • \$\begingroup\$ Why not also provide an optional StringComparison parameter (docs.microsoft.com/en-us/dotnet/api/system.stringcomparison)? \$\endgroup\$ Commented Dec 31, 2020 at 9:17
  • \$\begingroup\$ It is good practice to specify the culture when doing string operations such as string.IndexOf. \$\endgroup\$ Commented Jan 1, 2021 at 21:00

2 Answers

\$\begingroup\$

Maybe the last if-statement could be simplified by removing the Equals(true), for Equals(0) already returns a bool, doesn’t it?

Edit:
Actually, I think you could skip the whole if block, because if endIndex is 0 it couldn’t get past the preceding if-statement, could it?

If endIndex is 0 and startIndex is 0 or greater, an empty string is returned because startIndex >= endIndex.

If startIndex is less than 0, an empty string is returned as well.

So how could endIndex be 0 at the last if-statement?
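To make the argument concrete, here is a minimal sketch of the question's method with that final block removed and each `IndexOf` result stored once. The class name `Extractor` and the local `found` are made up for illustration; the `IndexOf(string)` calls are the same ones the question uses, so the search behaviour should be unchanged.

```csharp
using System;

public static class Extractor
{
    // Hypothetical wrapper class; the logic mirrors the question's method
    // with the unreachable `endIndex == 0` block removed.
    public static string GetTargetString(string input, string startKeyword, string endKeyword)
    {
        int startIndex = 0;
        if (!string.IsNullOrEmpty(startKeyword))
        {
            int found = input.IndexOf(startKeyword);
            if (found < 0) return "";
            startIndex = found + startKeyword.Length;
        }

        int endIndex = input.Length;
        if (!string.IsNullOrEmpty(endKeyword))
        {
            endIndex = input.IndexOf(endKeyword);
            // Same rule as the question: the end keyword must begin after startIndex.
            if (endIndex <= startIndex) return "";
        }

        // startIndex is never negative here, so this one guard also covers endIndex == 0.
        if (startIndex >= endIndex) return "";

        return input.Substring(startIndex, endIndex - startIndex).Trim();
    }
}
```

Called with the question's test string, `Extractor.GetTargetString(test_string1, "C#", ".NET")` should still give `was developed around 2000 by Microsoft as part of its`.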

Sᴀᴍ Onᴇᴌᴀ
answered Dec 31, 2020 at 13:56
\$\endgroup\$
\$\begingroup\$

I tend to put the "error handling" code at the beginning of the method, which usually makes the rest of the method simpler.

private static string GetTargetString(string input, string startKeyword, string endKeyword, StringComparison comparer)
{
    if (!string.IsNullOrEmpty(startKeyword) && input.IndexOf(startKeyword, comparer) < 0) return "";
    if (!string.IsNullOrEmpty(endKeyword) && input.IndexOf(endKeyword, comparer) < 0) return "";
    var startIndex = string.IsNullOrEmpty(startKeyword)
        ? 0
        : input.IndexOf(startKeyword, comparer) + startKeyword.Length;
    var endIndex = string.IsNullOrEmpty(endKeyword)
        ? input.Length
        : input.IndexOf(endKeyword, comparer);
    if (startIndex < 0 || endIndex < 0 || startIndex >= endIndex) return "";
    return input.Substring(startIndex, endIndex - startIndex).Trim();
}
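
A possible variation, sketched here under assumptions that go beyond the question: search each keyword only once, and look for the end keyword starting at `startIndex` rather than from the beginning of the input. The class name `KeywordExtractor` and the default of `StringComparison.Ordinal` are my own choices for illustration. Note that this changes behaviour when the end keyword also occurs before the start keyword, which may or may not be what is wanted.

```csharp
using System;

public static class KeywordExtractor
{
    // Hypothetical variant: the end keyword is searched from startIndex onward.
    public static string GetTargetString(
        string input, string startKeyword, string endKeyword,
        StringComparison comparison = StringComparison.Ordinal)
    {
        int startIndex = 0;
        if (!string.IsNullOrEmpty(startKeyword))
        {
            int found = input.IndexOf(startKeyword, comparison);
            if (found < 0) return "";
            startIndex = found + startKeyword.Length;
        }

        int endIndex = input.Length;
        if (!string.IsNullOrEmpty(endKeyword))
        {
            // Assumption: only an end keyword located after the start keyword counts.
            endIndex = input.IndexOf(endKeyword, startIndex, comparison);
            if (endIndex < 0) return "";
        }

        // endIndex >= startIndex is guaranteed here, so Substring cannot throw.
        return input.Substring(startIndex, endIndex - startIndex).Trim();
    }
}
```

For an input such as `"end start middle end"` with keywords `"start"` and `"end"`, this variant returns `middle`, whereas the versions above return an empty string because the first `"end"` sits before the start keyword.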
answered Jan 1, 2021 at 21:17
\$\endgroup\$
