Forming error messages from a multidimensional associative array

Question 1

I'm working with an input array, which is 3 levels deep, and creating error message strings accordingly in a separate output array. Below is the code.

$e['invalid']['key'][] = 'a123';
$e['invalid']['key'][] = 'a456';
$e['invalid']['color'][] = 'red';
$e['missing']['key'][] = 'b72';
$e['missing']['color'][] = 'blue';
$e['missing']['color'][] = 'green';
echo '<pre>' . print_r($e, 1) . '</pre>';
function generateErrorMessages($e)
{
 $errors;
 foreach ($e as $type => $array)
 {
 foreach ($array as $key => $values)
 { 
 foreach ($values as $v)
 {
 switch($type)
 {
 case 'invalid':
 $errors[$type][$key][] = $v . ' is not a valid ' . $key;
 break;
 case 'missing':
 $errors[$type][$key][] = $v . ' is a required ' . $key . ' - other functions have dependencies on it';
 break;
 default: 
 // do nothing 
 }
 }
 }
 }
 return $errors;
}
$msgs = generateErrorMessages($e);
echo '<pre>' . print_r($msgs, 1) . '</pre>';

While this achieves the desired outcome, the nested foreach loops in the generateErrorMessages() function seem onerous and difficult to read. Is there a more concise way to generate the error messages?

Question 2

How do you define "difficult to read"? Like any explicit code, all these loops are quite straightforward to follow, and making them more concise will likely make them more cryptic as well.

Question 3

The current question title, which states your concerns about the code, is too general to be useful here. Please edit to the site standard, which is for the title to simply state the task accomplished by the code. Please see How to get the best value out of Code Review: Asking Questions for guidance on writing good question titles.

Question 4

I strongly suggest looking into JSON.

Question 5

@Andrew what exactly would be the benefit of looking into JSON?

Question 6

@mickmackusa Lots of key->value (mapped) string data, several layers deep: storing and using it as JSON would be ideal. You also then get tons of JSON library support which will help you to parse through the collection easily and intuitively. There's lots of support for e.g. (language) object <-> string JSON manipulation too, and you could use just 100% the object side if wanted. It doesn't directly address the OP's multi-loop problem, but it might help if nothing else.

Question 7

In my opinion, this code is all right. Like any explicit code, these loops are quite straightforward to follow, and making them more concise will likely make them more cryptic as well. I would only make a minor brush-up: removing the unnecessary default clause and fixing the indentation. Variable interpolation is also more natural to read, in my opinion.

Also, variables must be explicitly initialized; just mentioning them is not enough. If you need an empty array, then you must initialize an empty array. Although in the present code the $errors variable doesn't exist before the loop, the code could evolve over time, and a variable with the same name might end up being used above. If you just mention it, as $errors; it will keep the previous value. If you initialize it as $errors = []; you will always know it's an empty array.

function generateErrorMessages($e)
{
    $errors = [];
    foreach ($e as $type => $array)
    {
        foreach ($array as $key => $values)
        {
            foreach ($values as $v)
            {
                switch($type)
                {
                    case 'invalid':
                        $errors[$type][$key][] = "$v is not a valid $key";
                        break;
                    case 'missing':
                        $errors[$type][$key][] = "$v is a required $key - other functions have dependencies on it";
                        break;
                }
            }
        }
    }
    return $errors;
}
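To illustrate the initialization point, here is a minimal sketch. The two function names and the 'stale entry' value are made up for the demonstration; they only simulate a variable of the same name having been used earlier in the function.

```php
<?php
// Hypothetical demo: a bare `$errors;` statement does not reset the variable.
function collectWithoutReset(array $values): array
{
    $errors = ['stale entry']; // simulates an earlier use of the same name
    $errors;                   // no-op: the previous value is kept
    foreach ($values as $value) {
        $errors[] = $value;
    }
    return $errors;
}

// Same function, but with an explicit reset before the loop.
function collectWithReset(array $values): array
{
    $errors = ['stale entry']; // simulates an earlier use of the same name
    $errors = [];              // explicit initialization: always an empty array
    foreach ($values as $value) {
        $errors[] = $value;
    }
    return $errors;
}

print_r(collectWithoutReset(['a123'])); // still contains 'stale entry'
print_r(collectWithReset(['a123']));    // contains only 'a123'
```

Only the second version guarantees that the returned array holds nothing but the values added in the loop.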

Question 8

Can you expound on the comment that variables must be explicitly initialized?

Question 9

That's simple. If you need an empty array, then you must initialize an empty array. Although in the present code the $errors variable doesn't exist before the loop, the code could evolve over time, and an array with the same name might already have some contents. If you just mention it, as $errors; it will keep the previous value. If you initialize it as $errors = []; you will always know it's an empty array.

Question 10

Can you edit that into your answer? Thank you.

Question 11

I agree with what "Your Common Sense" said, but I would like to have better variable names: names that make clear what a variable contains. Seen in isolation, variable names like $e, $array and $v don't really convey what they contain. Abbreviated variable names don't make your code easier to read. Also, $errors does contain errors, but more precisely it contains error messages. So, I would write this as:

function convertToErrorMessages($errors)
{
    $errorMessages = [];
    foreach ($errors as $errorType => $error)
    {
        foreach ($error as $property => $values)
        {
            $valueMessages = [];
            foreach ($values as $value)
            {
                switch($errorType)
                {
                    case 'invalid':
                        $valueMessages[] = "$value is not a valid $property";
                        break;
                    case 'missing':
                        $valueMessages[] = "$value is a required $property".
                                           " - other functions have dependencies on it";
                        break;
                }
            }
            $errorMessages[$errorType][$property] = $valueMessages;
        }
    }
    return $errorMessages;
}

Note that I also collect all the value messages before assigning them to the error messages array. This prevents code repetition, especially when you have many error types. I also don't like very long lines, so I split those.

One thing I sometimes do when there are many nested braces, is leaving out braces that will never be useful. Like this:

function convertToErrorMessages($errors)
{
    $errorMessages = [];
    foreach ($errors as $errorType => $error)
    foreach ($error as $property => $values)
    {
        $valueMessages = [];
        foreach ($values as $value)
        switch($errorType)
        {
            case 'invalid':
                $valueMessages[] = "$value is not a valid $property";
                break;
            case 'missing':
                $valueMessages[] = "$value is a required $property".
                                   " - other functions have dependencies on it";
                break;
        }
        $errorMessages[$errorType][$property] = $valueMessages;
    }
    return $errorMessages;
}

This might not be to everyone's liking, but I think it is acceptable or even slightly easier to read.

I also had a look at mickmackusa's solution. A lookup array can make sense, for instance when you're using multiple languages. Also, the idea of separating the configuration from the processor code is a valid one, but for now it only seems to complicate the code (PS: mickmackusa updated his answer and it looks a lot better now).

Question 12

In your second example, although I am not a fan of missing out braces just for the sake of it, the lack of indentation of the switch makes the code worse IMHO.

Question 13

I think this is the purpose of comments. Renaming every variable name to >10 characters makes typing code take much longer on large projects, and in most cases short names work. If not, just use a comment. Typing errorMessages each time in a large project would be seriously annoying, especially when something like eMsgs or msgs would work just as well.

Question 14

@RedwolfPrograms I do understand that long names can require more typing, but for me clarity and readability of code are more important. The worst example I came across recently is the CAMT.053 file format, where they abbreviate everything: things like CdtDbtInd, AddtlInfInd, PmtInfId, RltdPties and so on. My point is: it might be clear to the author what these cryptic things mean, but to someone new to the code it is not.

Question 15

@redwolf programs Long variable names shouldn't bother you. You should use an IDE, and IDEs will autocomplete the name after you type a few characters. Readability should be preferred.

Question 16

@KIKOSoftware That's why I mentioned comments. $AddtlInfInd; //Addition Info Indicator would be better IMO than $additionalInfoIndicator

Question 17

Before I get started with the script polishing, I just want to voice that I don't think it makes sense to bloat/hydrate your otherwise lean data storage with redundant text. If you are planning on presenting this to the end user as a means to communicate on a "human-friendly" level, then abandon the array-like structure and write plain English sentences.

How can you make your code more manageable? I recommend a lookup array. This will allow you to separate your "processing" code from your "config" code. The processing part will be built "generally" so that it will appropriately handle incoming data based on the "specific" data in the lookup array.

By declaring the lookup as a constant (because it won't vary -- it doesn't need to be a variable), the lookup will enjoy a global scope. This will benefit you if you plan to write a custom function to encapsulate the processing code. IOW, you won't need to pass the lookup into your function as an argument or use [shiver] global.

Now whenever you want to extend your error array-hydrating function to accommodate new types, you ONLY need to add a single line of code to the lookup (ERROR_LOOKUP) -- you never need to touch the processor. In contrast, a switch block will require 3 new lines of code for each new allowance. This makes scaling your script much easier, cleaner, and more concise.

Code: (Demo)

define("ERROR_LOOKUP", [
    'invalid' => '%s is not a valid %s',
    'missing' => '%s is a required %s - other functions have dependencies on it',
]);
$errors = [];
foreach ($e as $type => $subtypes) {
    if (!isset(ERROR_LOOKUP[$type])) {
        continue;
    }
    foreach ($subtypes as $subtype => $entry) {
        foreach ($entry as $string) {
            // value first, then subtype, to match the placeholder order in the templates
            $errors[] = sprintf(ERROR_LOOKUP[$type], $string, $subtype);
        }
    }
}
echo implode("\n", $errors);

Output:

a123 is not a valid key
a456 is not a valid key
red is not a valid color
b72 is a required key - other functions have dependencies on it
blue is a required color - other functions have dependencies on it
green is a required color - other functions have dependencies on it

Your output strings may have static text on either/both sides of the $subtype value, so sprintf() makes the variable insertion very clean and flexible. Credit to @NigelRen for suggesting this improvement to my snippet.

The continue in the outermost loop ensures that no iteration is wasted on dead-end array data. No "do nothing" outcomes on inner loops. Alternatively, you could use array_intersect_key() to replace the conditional continue (Demo).
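A minimal sketch of the array_intersect_key() variant, since the linked demo is not reproduced here. The sample input, including the 'unknown' type, is made up for illustration; array_intersect_key() keeps only the entries of $e whose keys also exist in the lookup, so the outer loop never sees unsupported types.

```php
<?php
define('ERROR_LOOKUP', [
    'invalid' => '%s is not a valid %s',
    'missing' => '%s is a required %s - other functions have dependencies on it',
]);

// Hypothetical sample input; 'unknown' has no template in the lookup.
$e = [
    'invalid' => ['key' => ['a123'], 'color' => ['red']],
    'unknown' => ['key' => ['zzz']],
    'missing' => ['color' => ['blue']],
];

$errors = [];
// Filter out unsupported types up front instead of using `continue` in the loop.
foreach (array_intersect_key($e, ERROR_LOOKUP) as $type => $subtypes) {
    foreach ($subtypes as $subtype => $entry) {
        foreach ($entry as $string) {
            $errors[] = sprintf(ERROR_LOOKUP[$type], $string, $subtype);
        }
    }
}
echo implode("\n", $errors), "\n";
```

Whether this reads better than the isset() check is a matter of taste; both skip the same data.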

p.s. I have a deep-seated hatred for switch block syntax with so many break lines. This is why I often replace them with lookup arrays.

p.p.s. If you are writing an OOP structured script/project, see @slepic's post.

Question 18

Well if it's supposed to be returned in response to AJAX call, "littering" seems to be quite a standard approach

Question 19

I don't see anything to suggest that the OP is generating an AJAX response.

Question 20

@mickmackusa This is an awesome way to replace the switch statement! Prior to using the switch block I had experimented with using

$message['invalid'] = $v . ' is not a valid ' . $key; $message['missing'] = $v . ' is a required ' . $key . ' - other functions have dependencies on it'; $errors[$type][$key][] = $message[$type];

in the innermost foreach loop. I abandoned that because it didn't seem right to create the $message array for every single pass through the loop. Your solution rectifies that issue AND the lookup's values are strings instead of variables. Utterly brilliant!

Question 21

Yet you accept another review? Okay.

Question 22

Only because the other one directly addressed the original question about the nested foreach loops. The intention originally was to try and find a more concise way to achieve the outcome without that approach. I had wondered about array_walk and array_map but neither of those were suggested. Your response about replacing the switch was a bonus.
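Since array_map was mentioned but never shown, here is one possible sketch of how it could replace the innermost loop while keeping the nested output structure. The function name mapErrorMessages() and the idea of passing the templates as a parameter are assumptions for this example, not something proposed in the answers.

```php
<?php
// Hypothetical helper: builds the same nested output as generateErrorMessages(),
// using array_map() for the innermost level.
function mapErrorMessages(array $e, array $templates): array
{
    $messages = [];
    // Only iterate over types that have a template.
    foreach (array_intersect_key($e, $templates) as $type => $properties) {
        $template = $templates[$type];
        foreach ($properties as $property => $values) {
            $messages[$type][$property] = array_map(
                function ($value) use ($template, $property) {
                    return sprintf($template, $value, $property);
                },
                $values
            );
        }
    }
    return $messages;
}

$e = [
    'invalid' => ['key' => ['a123', 'a456'], 'color' => ['red']],
    'missing' => ['key' => ['b72'], 'color' => ['blue', 'green']],
];
$templates = [
    'invalid' => '%s is not a valid %s',
    'missing' => '%s is a required %s - other functions have dependencies on it',
];
print_r(mapErrorMessages($e, $templates));
```

This still needs two explicit loops, which supports the point made above that the nested foreach version is not unreasonable.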

Question 23

I know this question has already been answered, but what I am showing here is just a response to mickmackusa's answer, which the OP called a "bonus". It lets the OP, and anyone who wanders in here in the future, see one possible way to avoid polluting the global scope with constants, as happens in mickmackusa's answer. The code below is his solution encapsulated in a static class:

final class ErrorMessagesConverter
{
    private static $lookupTable = [
        'invalid' => '%s is not a valid %s',
        'missing' => '%s is a required %s - other functions have dependencies on it',
    ];

    // private constructor: the class is only meant to be used statically
    private function __construct() {}

    public static function convert(iterable $input): array
    {
        $errors = [];
        foreach ($input as $type => $subtypes) {
            if (!\array_key_exists($type, self::$lookupTable)) {
                continue;
            }
            foreach ($subtypes as $subtype => $entry) {
                foreach ($entry as $string) {
                    // value first, then subtype, to match the placeholder order in the templates
                    $errors[] = sprintf(self::$lookupTable[$type], $string, $subtype);
                }
            }
        }
        return $errors;
    }
}

Shall I add that every $subtypes and $entry should be checked for being iterable before actually iterating them...
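One way such checks might look, as a rough sketch: the function name convertGuarded() and the choice to silently skip malformed entries are assumptions for this example (throwing an exception would be an equally valid policy). is_iterable() is available from PHP 7.1.

```php
<?php
// Hypothetical guarded variant: skips anything that cannot be iterated.
function convertGuarded(iterable $input, array $lookupTable): array
{
    $errors = [];
    foreach ($input as $type => $subtypes) {
        if (!isset($lookupTable[$type]) || !is_iterable($subtypes)) {
            continue; // unknown type or malformed second level
        }
        foreach ($subtypes as $subtype => $entry) {
            if (!is_iterable($entry)) {
                continue; // malformed third level, e.g. a plain string
            }
            foreach ($entry as $string) {
                $errors[] = sprintf($lookupTable[$type], $string, $subtype);
            }
        }
    }
    return $errors;
}

$lookupTable = [
    'invalid' => '%s is not a valid %s',
    'missing' => '%s is a required %s - other functions have dependencies on it',
];
// Made-up malformed input: 'invalid' is a string, 'missing.key' is a string.
$input = [
    'invalid' => 'oops',
    'missing' => ['key' => 'b72', 'color' => ['blue']],
];
print_r(convertGuarded($input, $lookupTable));
```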

Question 24

Fair enough. If the OP would have made any indication that their script/project was OOP designed, I may have suggested a class too. In my opinion isset() is more ideal for the continue check because 1. it is slightly faster and 2. it will continue if an existing key's value is null (null is not iterable anyhow). May I ask why you bothered declaring the empty constructor? (+1) ...p.s. Duh, I don't know why I used in_array(), I'm always pushing for key-based lookups -- I'll fix that now with isset().
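The difference between the two checks shows up only when a key exists with a null value. A small sketch, with a made-up 'disabled' entry in the lookup:

```php
<?php
// Hypothetical lookup where one type is present but deliberately set to null.
$lookup = [
    'invalid'  => '%s is not a valid %s',
    'disabled' => null,
];

$issetResult     = isset($lookup['disabled']);            // false: the value is null
$keyExistsResult = array_key_exists('disabled', $lookup); // true: the key is present

var_dump($issetResult, $keyExistsResult);
```

With isset(), a null template is skipped by the continue; with array_key_exists(), it would reach sprintf().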

Question 25

@mickmackusa Well, it doesn't have to be OOP to avoid global constants. That would be a global function with a static local variable. Another option would be an instantiable class where you could actually define the message templates upon construction... But I understand your point... As for isset vs array_key_exists, yeah, that definitely could be; I wasn't really thinking about this. I just replaced in_array to avoid two variables. And as for the private constructor, yeah, it is definitely not necessary. It's just that in those rare cases when I define a static class, I make sure no one tries to instantiate it.
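The instantiable-class option mentioned here could look roughly like the sketch below. The class name ErrorMessageFormatter and its constructor signature are assumptions for illustration; the templates are the ones from the answers above.

```php
<?php
// Hypothetical instantiable variant: templates are injected on construction,
// so no global constant and no static state is needed.
final class ErrorMessageFormatter
{
    /** @var array<string, string> */
    private $templates;

    public function __construct(array $templates)
    {
        $this->templates = $templates;
    }

    public function convert(iterable $input): array
    {
        $errors = [];
        foreach ($input as $type => $subtypes) {
            if (!isset($this->templates[$type])) {
                continue;
            }
            foreach ($subtypes as $subtype => $entry) {
                foreach ($entry as $string) {
                    $errors[] = sprintf($this->templates[$type], $string, $subtype);
                }
            }
        }
        return $errors;
    }
}

$formatter = new ErrorMessageFormatter([
    'invalid' => '%s is not a valid %s',
    'missing' => '%s is a required %s - other functions have dependencies on it',
]);
$messages = $formatter->convert([
    'invalid' => ['key' => ['a123']],
    'missing' => ['color' => ['blue']],
]);
echo implode("\n", $messages), "\n";
```

A second instance with different templates (for another language, say) could then be created without touching the class.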

score 11 · Accepted Answer · 2019-11-04 09:18:03Z

