I need to perform a bunch of string manipulation tasks on an original string, then output the final string. Best I could think of is extracting some functionality into functions. Any better way to structure it? Also, are the variable/function names too long?
- strip out the unicode chars (runs of Han characters) from original string A and collect them, in reverse order, into string B
- in the remaining string A, remove duplicate words
- append: A + B
function extract_unicode_characters_from_string($str) {
    preg_match_all('/\p{Han}+/u', $str, $unicode_characters_only, PREG_SET_ORDER);
    $squashed = array_column($unicode_characters_only, 0);
    return $squashed;
}

function strip_unicode_characters_from_string($str) {
    return preg_replace('/\p{Han}+/u', '', $str);
}

function strip_duplicates($str){
    $exploded_str = explode(' ', $str);
    $exploded_str_no_empties = array_filter($exploded_str, function($item) { return !is_null($item) && $item !== ''; });
    return implode(' ', array_unique($exploded_str_no_empties));
}

$original_string = 'one two three 喞 喝 four 刷囿 two 跏正 吁';
$array_of_unicode_characters = extract_unicode_characters_from_string($original_string);
$reversed_unicode_characters = implode(' ', array_reverse($array_of_unicode_characters));
$string_without_unicode_characters = strip_unicode_characters_from_string($original_string);
$english_words_no_dupes = strip_duplicates($string_without_unicode_characters);
echo $english_words_no_dupes . ' ' . $reversed_unicode_characters;
1 Answer
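Yes, I think your variable names are a bit long. While it is good to be expressive, you don't want to push your lines beyond the recommended maximum width if you can avoid it (PSR-12, for example, sets a soft limit of 120 characters and suggests keeping lines to 80 or fewer).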
I don't know if you need many custom functions here. The one piece worth extracting is the shared logic that filters out empty and duplicate strings; everything else is single-use.
Consider this snippet, which simplifies much of the script by capturing the Han (multibyte) substrings and the remaining non-whitespace substrings in separate groups from the start. There is no extra exploding, and only one implode call.
Code:
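// Drop empty strings, then drop duplicates (the first occurrence is kept).
function uniqueNoEmpty($array) {
    return array_unique(array_filter($array, 'strlen'));
}

$original_string = 'one two three 喞 喝 four 刷囿 two 跏正 吁';
// Group 1 captures runs of Han characters; group 2 captures any other non-whitespace run.
if (!preg_match_all('~(\p{Han}+)|(\S+)~u', $original_string, $out)) {
    echo 'no qualifying strings';
} else {
    // array_unique() always returns an array, so no null fallback is needed.
    $singleBytes = uniqueNoEmpty($out[2]);
    $multiBytes = array_reverse(uniqueNoEmpty($out[1]));
    echo implode(' ', array_merge($singleBytes, $multiBytes));
}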
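If the same transformation is needed in more than one place, the snippet above could be wrapped in a single function. The following is only a sketch under that assumption: `rearrange` is a hypothetical name that does not appear in the original post, and returning an empty string when nothing matches is an illustrative choice, not something the question specifies.

```php
<?php
// Same helper as in the answer: drop empty strings, then drop duplicates.
function uniqueNoEmpty(array $items): array
{
    return array_unique(array_filter($items, 'strlen'));
}

// Hypothetical wrapper (name and empty-string fallback are assumptions).
function rearrange(string $input): string
{
    // Group 1: runs of Han characters. Group 2: any other non-whitespace run.
    // With the default PREG_PATTERN_ORDER flag, the group that did not take
    // part in a given match is filled with an empty string, which
    // uniqueNoEmpty() then filters out.
    if (!preg_match_all('~(\p{Han}+)|(\S+)~u', $input, $out)) {
        return '';
    }

    $others = uniqueNoEmpty($out[2]);
    $han = array_reverse(uniqueNoEmpty($out[1]));

    return implode(' ', array_merge($others, $han));
}

echo rearrange('one two three 喞 喝 four 刷囿 two 跏正 吁');
// prints: one two three four 吁 跏正 刷囿 喝 喞
```

Two behaviours differ from the code in the question and are worth checking against your real data. First, duplicate Han runs are removed as well, whereas the original script only deduplicates the remaining words. Second, a token that starts with a non-Han character and contains Han characters later (for example `abc喝`) is captured whole by `\S+`, while the original script would strip the Han part out of it.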