I'm working on a "Did you mean ..." kind of system, similar to Google's. The spelling part is trivial (with PHP's pspell library), but what I can't solve is the case problem.

Let's say the misspelled word is "GoVeNMeNt"; then the suggested word should be "GoVerNMeNt" (as Google does it), but the pspell library returns suggestions in a single case only (usually lower case).

So how do I write a function transformCase which takes in the actual string ($string) and the suggestion string ($subject)? I have written the following implementation which doesn't handle all cases:

function transformCase($string, $subject) {
    for ($i = 0, $marker = 0; $i < strlen($string); ++$i)
        if (strcasecmp($string[$i], $subject[$marker]) == 0) {
            $subject[$marker] = $string[$i];
            $marker += 1;
        }
        elseif (strlen($string) == strlen($subject))
            $marker += 1;
    return $subject;
}
echo transformCase("AbSaNcE", 'absence')."\n";      # AbSeNcE :)
echo transformCase("StRioNG", 'string')."\n";       # StRiNG :)
echo transformCase("GOVERMENt", 'government')."\n"; # GOVERNment :<

In the last case the output should be GOVERnMENt. The algorithm also doesn't work on various other queries.

So I'd be happy if someone helps me with the algorithm :)

asked Jan 5, 2020 at 17:39
  • Why does the case matter? Commented Jan 5, 2020 at 17:44
  • Don't use exclamation points, you're not yelling at us (and if you are, this is not the place for those kind of posts). Rather than answer your question, a counter-question: why do you need to match case? If someone searched for GOVORnMENt, your autosuggester saying "did you mean government?" is fine. Why is it important to preserve case, when your search backend is going to do case insensitive matching anyway? Commented Jan 5, 2020 at 17:45
  • The case matters because I want to make it very similar to Google! Try searching for GoVERMENt on Google and it will say "Did you mean GoVERNMENt"! So that's why the case matters. Commented Jan 6, 2020 at 4:49

1 Answer

Try the following modification to your algorithm:

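function transformCase($string, $subject) {
    $n = strlen($string);
    $m = strlen($subject);
    // Stop once either string is exhausted, so no offset is read out of range.
    for ($i = 0, $marker = 0; $i < $n && $marker < $m; ++$i) {
        if (strcasecmp($string[$i], $subject[$marker]) == 0) {
            // Same letter: copy the case of the typed character.
            $subject[$marker] = $string[$i];
            $marker += 1;
        }
        // Look for the next same character in $string: skip typed characters
        // until the one after $i matches, so the loop's ++$i lands on it.
        while ($marker < $m && $i + 1 < $n
                && strcasecmp($string[$i + 1], $subject[$marker]) != 0) {
            $i += 1;
        }
    }
    return $subject;
}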

The comparison elseif (strlen($string)==strlen($subject)) does not guarantee that the function works as you need. Beyond that, you can make further modifications for better performance.
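A single left-to-right scan can lose sync when the typed word has a missing, extra, or substituted letter, so here is a rough sketch of a different technique: align the two words with a longest-common-subsequence (LCS) table and copy the case only where letters line up. The name transformCaseAligned is hypothetical and not from the question, and the sketch assumes single-byte ASCII words (strlen and string offsets are byte-based). Letters that exist only in the suggestion, or that replace a different typed letter, are left in whatever case pspell returned; that is a design choice, not something the question specifies.

```php
<?php
// Hypothetical helper (assumption, not part of the original post).
// Assumes ASCII input: strlen() and string offsets work on bytes.
function transformCaseAligned(string $typed, string $suggestion): string
{
    $n = strlen($typed);
    $m = strlen($suggestion);
    $a = strtolower($typed);
    $b = strtolower($suggestion);

    // $lcs[$i][$j] = LCS length of the suffixes of $a from $i and of $b from $j.
    $lcs = array_fill(0, $n + 1, array_fill(0, $m + 1, 0));
    for ($i = $n - 1; $i >= 0; --$i) {
        for ($j = $m - 1; $j >= 0; --$j) {
            $lcs[$i][$j] = ($a[$i] === $b[$j])
                ? $lcs[$i + 1][$j + 1] + 1
                : max($lcs[$i + 1][$j], $lcs[$i][$j + 1]);
        }
    }

    // Walk both words from the start, following the table.
    $i = 0;
    $j = 0;
    while ($i < $n && $j < $m) {
        if ($a[$i] === $b[$j]) {
            // Letters line up: take the case the user typed.
            $suggestion[$j] = $typed[$i];
            ++$i;
            ++$j;
        } elseif ($lcs[$i + 1][$j + 1] >= max($lcs[$i + 1][$j], $lcs[$i][$j + 1])) {
            // Substituted letter: skip one character in both words.
            ++$i;
            ++$j;
        } elseif ($lcs[$i + 1][$j] >= $lcs[$i][$j + 1]) {
            // Extra letter in the typed word: skip it.
            ++$i;
        } else {
            // Letter missing from the typed word: keep the suggested letter as is.
            ++$j;
        }
    }
    return $suggestion;
}

echo transformCaseAligned("AbSaNcE", "absence"), "\n";      // AbSeNcE
echo transformCaseAligned("StRioNG", "string"), "\n";       // StRiNG
echo transformCaseAligned("GOVERMENt", "government"), "\n"; // GOVERnMENt
```

The table costs O(n·m) time and memory, which should be negligible for single search terms. For multibyte text, the same idea would need the words split into characters first (for example with mb_str_split) rather than indexed by byte.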

answered Jan 6, 2020 at 6:36