Custom Contains() function

Question 1

I have the following function and I was wondering if anyone could come up with something more efficient.

It just has to split both the content and the search string and determine how many words are required to be matched before being true.

For example,

"A dog can run.".ContainsV2("Red Dog Run", 4) = false
"A dog can run.".ContainsV2("Red Dog Run", 2) = true
"A dog can run.".ContainsV2("Red Dog Run") = true
"A dog can run.".ContainsV2("Red Dog Run", MatchAllWords: true) = false

/// <summary>
/// Checks to see if any of the words are contained in the string. 
/// </summary>
/// <param name="content"></param>
/// <param name="SearchString">Search string</param>
/// <param name="NumberOfMatches">Specify how many words have to match in the search string.
/// This value is automatically dropped when the user searches less words.
/// I.e. NumberOfMatches = 3 and the user searches 'Bob', NumberOfMatches gets set to 1.</param>
/// <param name="MatchAllWords">If set to true, all of the words in the search string must be contained in what's being checked</param>
/// <returns></returns>
public static bool ContainsV2(this string content, string SearchString, int NumberOfMatches = 1, bool MatchAllWords = false)
{
 var matches = 0;
 var numberOfWords = SearchString.Split(' ').Length;
 // Update required number of matches
 NumberOfMatches = NumberOfMatches > numberOfWords ? 
 numberOfWords : 
 NumberOfMatches;
 if (!string.IsNullOrEmpty(content) && !string.IsNullOrEmpty(SearchString))
 {
 SearchString.Split(' ')
 .ToList()
 .ForEach(w => //Gets the index of each word after being split
 matches += content
 .IndexOf(
 StripNonAlphaNumeric(w), 
 StringComparison.OrdinalIgnoreCase
 ) >= 0 ? 1 : 0
 );
 }
 if (MatchAllWords)
 {
 return matches == numberOfWords;
 }
 return matches >= NumberOfMatches;
}

Question 2

Please don't add extra code after you've received answers. That invalidates the answers.

Question 3

@Quill It didn't really change anything as far as the answers, just explained what StripNonAlphaNumeric() did.

Question 4

A few remarks about your code:

Methods should have meaningful names. Your extension method looks like a newer (better?) version of the already existing Contains method but your extension method does something else than the original. A better, meaningful name would be ContainsWords.

The same can be said about your arguments. In my opinion SearchString does not cover the meaning of the argument. wordsToMatch is more appropriate. I renamed the other arguments as well as you can see in my code.

Argument names should be camel cased.

You extension method only accepts a string containing words to match. For convenient usage I'd add an overload that takes an collections of words.

Here's what I made of it:

using System;
using System.Collections.Generic;
using System.Linq;
public static class StringExtensions
{
 public static bool ContainsWords(this String source, IEnumerable<String> wordsToMatch, int matchAtLeast = 1, bool matchAll = false)
 {
 if(wordsToMatch == null)
 {
 throw new ArgumentNullException();
 }
 var sourceWords = GetWords(source);
 var matches = sourceWords.Intersect(wordsToMatch, StringComparer.CurrentCultureIgnoreCase);
 if(matchAll || wordsToMatch.Count() < matchAtLeast)
 {
 return matches.Count() == sourceWords.Count();
 }
 return matches.Count() >= matchAtLeast; 
 }
 public static bool ContainsWords(this String source, String wordsToMatch, int matchAtLeast = 1, bool matchAll = false)
 {
 if(wordsToMatch == null)
 {
 throw new ArgumentNullException();
 }
 return source.ContainsWords(GetWords(wordsToMatch), matchAtLeast, matchAll);
 }
 // Splits string in individual words.
 private static IEnumerable<String> GetWords(String source)
 {
 return source.Split(new [] {' '}, StringSplitOptions.RemoveEmptyEntries)
 .Select(x => x.Trim('.'));
 // Optionally: trim other punctuation.
 }
}

Update concerning efficiency

In your code you are splitting the search string twice. That's not very efficient. Next, you are using IndexOf on the source string for every word in the search string. That's not very efficient either. I'm not very good at algorithm analysis but my guess is that splitting the source and the words to match and then applying an Intersect on the resulting arrays is more efficient.

Question 5

I like it, but the question was on efficiency. I'll accept it for readability though.

Question 6

Here is my first review. See what you can make of it! I left comments for you. They always discuss the code right below.

It is bad style to leave documentation tags empty. Either have no param tag or fill all with appropriate descriptions:

/// <param name="content"></param>
/// <param name="SearchString">Search string</param>
/// <param name="NumberOfMatches">Specify how many words have to match in the search string.

Too much implementation detail in my opinion:

/// This value is automatically dropped when the user searches less words.
/// I.e. NumberOfMatches = 3 and the user searches 'Bob', NumberOfMatches gets set to 1.</param>

Self explanatory name:

/// <param name="MatchAllWords">If set to true, all of the words in the search string must be contained in what's being checked</param>

This is self explanatory, remove the line:

/// <returns></returns>

Use a more telling name like ContainsWords:

public static bool ContainsV2(this string content, string SearchString, int NumberOfMatches = 1, bool MatchAllWords = false)
{
 var matches = 0;
 var numberOfWords = SearchString.Split(' ').Length;

Don't use empty comments like "Sets a variable" or "Updates value". Explain it better or leave out the comment:

 // Update required number of matches

The ?: operator is hard to understand and not necessary here. Rather, use something like in my comment:

 // if (NumberOfMatches > numberOfWords)
 // {
 // NumberOfMatches = numberOfWords;
 // }
 NumberOfMatches = NumberOfMatches > numberOfWords ? 
 numberOfWords : 
 NumberOfMatches;

You already split the search string above. Either don't check for null or do it in the beginning:

 if (!string.IsNullOrEmpty(content) && !string.IsNullOrEmpty(SearchString))
 {

This is confusing. Reduce the complexity and nesting depth:

 SearchString.Split(' ')
 .ToList()
 .ForEach(w => //Gets the index of each word after being split
 matches += content
 .IndexOf(
 StripNonAlphaNumeric(w), 
 StringComparison.OrdinalIgnoreCase
 ) >= 0 ? 1 : 0
 );
 }

Rather set this at the start of the function. It is confusing here:

 if (MatchAllWords)
 {
 return matches == numberOfWords;
 }
 return matches >= NumberOfMatches;
}

Question 7

Your remarks seem reasonable, but this is not a very readable way of commenting on the pasted code

Question 8

@vauhochzett in my company as well, but this is a Q&A site ;) He could paste it, but there's other readers who probably won't. Look around the forum and see that people usually don't format their answers like that here.

Question 9

@ThomasWeller "describe" being the operational word ;) Empty tag doesn't describe anything, and "Search string" doesn't do anything to describe searchString either. If that was the case, this would be nothing else but gaming an imperfect system to adhere to certain rules on the surface, while disregarding their actual purpose

Question 10

@KonradMorawski Thanks for the explanation. You are completely correct. I will update my post accordingly.

Question 11

Sure thing, and grab an upvote - your suggestions are on the nose

venerik venerik 3972 silver badges12 bronze badges · Accepted Answer · 2016-04-05 19:23:35Z

A few remarks about your code:

Methods should have meaningful names. Your extension method looks like a newer (better?) version of the already existing Contains method but your extension method does something else than the original. A better, meaningful name would be ContainsWords.

The same can be said about your arguments. In my opinion SearchString does not cover the meaning of the argument. wordsToMatch is more appropriate. I renamed the other arguments as well as you can see in my code.

Argument names should be camel cased.

You extension method only accepts a string containing words to match. For convenient usage I'd add an overload that takes an collections of words.

Here's what I made of it:

using System;
using System.Collections.Generic;
using System.Linq;
public static class StringExtensions
{
 public static bool ContainsWords(this String source, IEnumerable<String> wordsToMatch, int matchAtLeast = 1, bool matchAll = false)
 {
 if(wordsToMatch == null)
 {
 throw new ArgumentNullException();
 }
 var sourceWords = GetWords(source);
 var matches = sourceWords.Intersect(wordsToMatch, StringComparer.CurrentCultureIgnoreCase);
 if(matchAll || wordsToMatch.Count() < matchAtLeast)
 {
 return matches.Count() == sourceWords.Count();
 }
 return matches.Count() >= matchAtLeast; 
 }
 public static bool ContainsWords(this String source, String wordsToMatch, int matchAtLeast = 1, bool matchAll = false)
 {
 if(wordsToMatch == null)
 {
 throw new ArgumentNullException();
 }
 return source.ContainsWords(GetWords(wordsToMatch), matchAtLeast, matchAll);
 }
 // Splits string in individual words.
 private static IEnumerable<String> GetWords(String source)
 {
 return source.Split(new [] {' '}, StringSplitOptions.RemoveEmptyEntries)
 .Select(x => x.Trim('.'));
 // Optionally: trim other punctuation.
 }
}

Update concerning efficiency

In your code you are splitting the search string twice. That's not very efficient. Next, you are using IndexOf on the source string for every word in the search string. That's not very efficient either. I'm not very good at algorithm analysis but my guess is that splitting the source and the words to match and then applying an Intersect on the resulting arrays is more efficient.

I like it, but the question was on efficiency. I'll accept it for readability though.

Stack Exchange Network

Custom Contains() function

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Custom Contains() function

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions