PHP code structure for string manipulation tasks

Question

I need to perform several string manipulation tasks on an original string and then output the final string. The best I could come up with was extracting some of the functionality into functions. Is there a better way to structure it? Also, are the variable and function names too long?

Strip the Unicode (Han) characters out of the original string A and reverse their order into string B.
In the remaining string A, remove duplicate words.
Append: A + B.

function extract_unicode_characters_from_string($str) {
 preg_match_all('/\p{Han}+/u', $str, $unicode_characters_only, PREG_SET_ORDER);
 $squashed = array_column($unicode_characters_only, 0);
 return $squashed; 
}
function strip_unicode_characters_from_string($str) {
 return preg_replace('/\p{Han}+/u', '', $str);
}
function strip_duplicates($str){
 $exploded_str = explode(' ', $str);
 $exploded_str_no_empties = array_filter($exploded_str, function($item) { return !is_null($item) && $item !== ''; });
 return implode(' ', array_unique($exploded_str_no_empties));
}
$original_string = 'one two three 喞 喝 four 刷囿 two 跏正 吁';
$array_of_unicode_characters = extract_unicode_characters_from_string($original_string);
$reversed_unicode_characters = implode(' ', array_reverse($array_of_unicode_characters));
$string_without_unicode_characters = strip_unicode_characters_from_string($original_string);
$english_words_no_dupes = strip_duplicates($string_without_unicode_characters);
echo $english_words_no_dupes . ' ' . $reversed_unicode_characters;

Answer

Yes, I think your variable names are a bit long. It is good to be expressive, but avoid pushing lines past the recommended maximum width where you can.

I don't think you need many custom functions here. The calls that filter out empty and duplicate strings are shared, so they can go into one custom function. Everything else is single-use.

Consider this snippet, which simplifies much of the script by grouping the multibyte (Han) and single-byte non-whitespace substrings from the start. No extra exploding is needed, and there is only one implode call.
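
To see why no extra exploding is needed, it may help to look at what preg_match_all returns for a two-group alternation. The following is only a small sketch; the `$sample` string is a made-up, shortened input and not part of the original post.

```php
<?php
// Hypothetical sample input, shorter than the one in the question.
$sample = 'one 喞 two';

// Group 1 captures runs of Han characters; group 2 captures any other
// run of non-whitespace characters. With the default PREG_PATTERN_ORDER
// flag, $out[1] and $out[2] are parallel arrays, and a group that did not
// take part in a given match is recorded as an empty string.
preg_match_all('~(\p{Han}+)|(\S+)~u', $sample, $out);

var_export($out[1]); // the Han column:     ['', '喞', '']
var_export($out[2]); // the non-Han column: ['one', '', 'two']
```

The empty-string placeholders are what the filtering step has to remove before the two columns are joined.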

Code:

// Drop empty strings, then drop repeated values.
function uniqueNoEmpty($array) {
 return array_unique(array_filter($array, 'strlen'));
}
$original_string = 'one two three 喞 喝 four 刷囿 two 跏正 吁';
// Group 1 captures runs of Han characters; group 2 captures any other non-whitespace run.
if (!preg_match_all('~(\p{Han}+)|(\S+)~u', $original_string, $out)) {
 echo 'no qualifying strings';
} else {
 $singleBytes = uniqueNoEmpty($out[2]);
 $multiBytes = array_reverse(uniqueNoEmpty($out[1]));
 echo implode(' ', array_merge($singleBytes, $multiBytes));
}
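
If the logic is needed in more than one place, the same approach could be wrapped in a single function. This is a minimal sketch, assuming PHP 7.0 or later for the scalar type declarations; the name `moveHanToEnd` is hypothetical and does not come from the original post. Note that, like the snippet above, it also de-duplicates the Han substrings, which the code in the question does not do.

```php
<?php
/**
 * Hypothetical helper: unique non-Han words first, followed by the unique
 * Han substrings in reverse order of appearance.
 */
function moveHanToEnd(string $input): string
{
    // Group 1: runs of Han characters; group 2: any other non-whitespace run.
    if (!preg_match_all('~(\p{Han}+)|(\S+)~u', $input, $out)) {
        return '';
    }

    // Drop the empty placeholders left by the non-participating group,
    // then drop repeated values.
    $clean = function (array $items): array {
        return array_unique(array_filter($items, 'strlen'));
    };

    return implode(' ', array_merge($clean($out[2]), array_reverse($clean($out[1]))));
}

echo moveHanToEnd('one two three 喞 喝 four 刷囿 two 跏正 吁');
// one two three four 吁 跏正 刷囿 喝 喞
```

Returning an empty string when nothing matches is a design choice made for this sketch; the snippet above prints a message instead.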

mickmackusa · Accepted Answer · 2019-12-22 12:48:15Z
