My goal is to write more idiomatic Ruby when working with nested data structures. More specifically, I need to break my habit of looping through everything with .each and nested .each calls. Here's a simple example that illustrates this:
I have the following data structure which cannot be changed:
structure_hash = {
  "website" => {
    "base" => "luma.demo",
    "custom_b2c_website" => "homegoods.demo",
  },
  "store_view" => {
    "custom_store_view" => "fr.homegoods.ca"
  }
}
I want to reshape this such that I return the data as an array of hashes like so:
structure_array = [
  {
    :scope => "website",
    :code => "base",
    :url => "luma.demo"
  },
  {
    :scope => "website",
    :code => "custom_b2c_website",
    :url => "homegoods.demo"
  },
  {
    :scope => "store",
    :code => "custom_store_view",
    :url => "fr.homegoods.ca"
  }
]
I have achieved this with the following helper method with comments that illustrate the thought process I am trying to improve:
def get_vhost_data
  structure_hash = { ... } # The first hash above, etc.
  vhost_data = [] # We're returning an array, so initialize one
  # Loop through the containing hash. Each key will be our scope, so we need to
  # access that
  structure_hash.each do |scope, scope_hash|
    # We need to transform and add to the data in these hashes.
    scope_hash.each do |code, url|
      # Best to return a new hash to house the old values + the transformed ones
      demo_data = {}
      # The "scope" key from the outer hash needs to be changed conditionally
      demo_data[:scope] = scope == 'store_view' ? scope.gsub('store_view', 'store') : scope
      # The rest of the data is fine
      demo_data[:code] = code
      demo_data[:url] = url
      # Add the newly-created hash to the containing array
      vhost_data << demo_data
    end
  end
  # Return the containing array
  vhost_data
end
What Smells?
As best I can tell, the following things are fishy:
1. I shouldn't need to initialize an empty array -- surely .each_with_object?
2. Nested .each here seems tedious -- is there a better way to think about what I'm trying to do that would result in something more idiomatic? For example, instead of resorting to "Okay, we need to go through each hash and..." is it more idiomatic to say: "Since you're only manipulating one of the keys of the outer hash, use a select instead"? (Just an example, not sure that select does what I want, although it could also take care of creating the containing array...)
3. Again, initializing the empty hash seems wrong -- .each_with_object again?
4. Looping through a hash to create a new hash from the existing hash's content and add to it. At first I thought map would be better somehow, but in my limited understanding, .map takes existing elements and transforms them -- it doesn't add additional elements...
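On the last point, it may help to see that map is not limited to returning a tweaked copy of each element: whatever the block returns becomes the new element, so it can be a hash with keys the input never had. A minimal sketch, where the pairs hash and the hard-coded "website" scope are made up purely for illustration:

```ruby
# Hypothetical sample data, not the real structure_hash.
pairs = { "base" => "luma.demo", "custom_b2c_website" => "homegoods.demo" }

# Each key/value pair is turned into a brand-new hash. The :scope key
# did not exist in the input; the block simply adds it.
rows = pairs.map do |code, url|
  { scope: "website", code: code, url: url }
end

p rows
```

The number of elements stays the same (one per pair), but the shape of each element is entirely up to the block.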
What I've Tried
So far, I've tried the following to address the code smells above:
def get_vhost_data_refactor
  structure_hash = {...}
  structure_hash.each_with_object([]) do |(scope, scope_hash), vhost_arr|
    scope_hash.each_with_object({}) do |(code, url), data_hash|
      data_hash[:scope] = scope == 'store_view' ? scope.gsub('store_view', 'store') : scope
      data_hash[:code] = code
      data_hash[:url] = url
      vhost_arr << data_hash
    end
  end
end
which yields:
[
  {
    :scope => "website",
    :code => "custom_b2c_website_3",
    :url => "sierra.demo"
  },
  {
    :scope => "website",
    :code => "custom_b2c_website_3",
    :url => "sierra.demo"
  },
  {
    :scope => "store",
    :code => "custom_store_view",
    :url => "fr.homegoods.ca"
  }
]
This is close, but obviously, the .each_with_object combination doesn't loop through each of the inner hashes properly, and more importantly, even if it did work, it's not idiomatic; it just replaces nested .each with the slightly more helpful each_with_object.
Any advice on how I can solve this "the Ruby way" and any tips for questions to ask in order to think "the Ruby way" would be greatly appreciated.
1 Answer
I would agree with your 1st and 3rd points: initializing intermediate arrays and hashes is almost always a smell. each_with_object is one solution, as is reduce/inject. each is definitely not very idiomatic Ruby.
Your second solution is much better. The reason it gives an odd result is that data_hash is the same object for every iteration of the inner loop; you just keep modifying it, so every entry pushed for a given scope ends up showing the values from the last iteration.
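To make the aliasing visible, here is a small sketch. The scope_hash below is just the "website" part of the question's data, and collected, same_object and fixed are illustrative names, not anything from the original code:

```ruby
scope_hash = { "base" => "luma.demo", "custom_b2c_website" => "homegoods.demo" }

collected = []
# each_with_object({}) creates ONE hash and yields that same hash
# to the block on every iteration.
scope_hash.each_with_object({}) do |(code, url), data_hash|
  data_hash[:code] = code
  data_hash[:url]  = url
  collected << data_hash # pushes another reference to the same hash
end

# Both array slots point at the very same object.
same_object = collected[0].equal?(collected[1])
p same_object
p collected

# Building a fresh hash literal per iteration avoids the problem.
fixed = scope_hash.each_with_object([]) do |(code, url), rows|
  rows << { code: code, url: url }
end
p fixed
```

Because both slots reference one hash, they both show whatever was written last, which matches the duplicated rows in the question's output.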
I would say that this is exactly what map is for; it is still a kind of transformation. I don't know of any way to select into the inner loop or avoid the nesting, but I would use flat_map, which does the map and then flattens the result by one level (you can try with just map to see what I mean).
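As a quick illustration of the difference, the sketch below runs the same block through map and flat_map. It builds plain [scope, code, url] triples rather than hashes only to keep the output short; nested and flat are illustrative names:

```ruby
structure_hash = {
  "website" => {
    "base" => "luma.demo",
    "custom_b2c_website" => "homegoods.demo"
  },
  "store_view" => {
    "custom_store_view" => "fr.homegoods.ca"
  }
}

# With map, each outer key contributes its own sub-array of triples,
# so the result has one element per scope.
nested = structure_hash.map do |scope, scope_hash|
  scope_hash.map { |code, url| [scope, code, url] }
end

# With flat_map, those sub-arrays are concatenated (flattened one level),
# so the result has one element per inner pair.
flat = structure_hash.flat_map do |scope, scope_hash|
  scope_hash.map { |code, url| [scope, code, url] }
end

p nested.length
p flat.length
```

flat_map only removes one level of nesting, which is why the triples themselves stay intact.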
Something like this:
structure_hash.flat_map do |scope, scope_hash|
  scope = 'store' if scope == 'store_view'
  scope_hash.map do |code, url|
    {
      scope: scope,
      code: code,
      url: url,
    }
  end
end
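For anyone who wants to run it end to end, here is the same idea as a self-contained sketch against the data from the question. The method name vhost_data and the choice to pass the hash in as an argument are my assumptions, not part of the original helper:

```ruby
STRUCTURE_HASH = {
  "website" => {
    "base" => "luma.demo",
    "custom_b2c_website" => "homegoods.demo"
  },
  "store_view" => {
    "custom_store_view" => "fr.homegoods.ca"
  }
}.freeze

# Hypothetical wrapper: takes the nested hash, returns an array of flat hashes.
def vhost_data(structure_hash)
  structure_hash.flat_map do |scope, scope_hash|
    # Only the "store_view" scope is renamed; others pass through unchanged.
    scope = 'store' if scope == 'store_view'
    scope_hash.map do |code, url|
      { scope: scope, code: code, url: url }
    end
  end
end

result = vhost_data(STRUCTURE_HASH)
result.each { |row| p row }
```

Each inner pair produces a fresh hash literal, so there is no shared object to overwrite and no accumulator to initialize.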
- This is great information, thanks! I'd forgotten that map, being a "souped up" each, allowed me to pull out the key (scope) and the inner hash at the same time. And the use of flat_map saves me from using .flatten everywhere. I was trying all sorts of merge operations, and even thought of experimenting with reduce, too, but my fear was always sacrificing clarity for convention or idiom. I think the take-away here is that nesting is correct in this case, and your double map is the elegant approach. Thanks again. – Steve K, Jul 6, 2021 at 21:16
- After trying unsuccessfully to use a combination of flattening to arrays and then thinking about zip and splat to line them up and reduce inside of a single each, I still think the above answer is the cleanest solution. – Steve K, Jul 8, 2021 at 13:38
- Note that the advice from radarbob's comment is still good advice: Ruby is an object-oriented language, not a hash-of-strings-to-hash-of-strings-to-strings-oriented language nor an array-of-hashes-of-symbols-to-strings-oriented language. I found that with proper domain modeling, all of these complex low-level-data-structure-manipulation problems go away because there are no complex low-level data structures anymore. – Jörg W Mittag, Jul 10, 2021 at 10:37
- Here are a few examples: stackoverflow.com/a/61119757/2988 stackoverflow.com/a/20612726/2988 stackoverflow.com/a/28051415/2988 stackoverflow.com/a/31388268/2988 stackoverflow.com/a/32502358/2988 stackoverflow.com/a/33125872/2988 stackoverflow.com/a/43033758/2988 stackoverflow.com/a/45847433/2988 stackoverflow.com/a/59532605/2988 – Jörg W Mittag, Jul 10, 2021 at 11:14
- Absolutely, never meant to imply that the comment was not good advice, and I appreciate his taking the time to get me started in the right direction. @JörgWMittag, what I'm gathering from your comment and some of the examples you shared is that I need to alter the way I think about solving a problem. Instead of going directly to low-level structures, it's more ruby-esque to create an Object which is structured to match the data I have, is that what you're saying? (Your second SO example was particularly poignant). – Steve K, Jul 12, 2021 at 22:05
structure_hash up front. Iterate the flattened hashes array to transform it. Nested .each becomes sequential. If anything is a Ruby idiom it's "everything is an object" and "objects work like you expect them to". That means plenty of helpful methods. Hmmm... Consider a Class that can return its own flattened representation. Then, I suppose a Vhost.flatten.transform call could have a default-value code block parameter.
def flatten_hash(hash, subkey); hash.map { |k, v| v.dup.tap { |h| h[subkey] = k } }; end
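One possible reading of the object-oriented suggestion in the comments, sketched with a Struct. This is only an assumption about what such a model could look like: the Vhost name is borrowed from the comment above, while SCOPE_ALIASES and from_structure are hypothetical names introduced here for illustration.

```ruby
# Hypothetical mapping of raw scope names to the names we want to expose.
SCOPE_ALIASES = { "store_view" => "store" }.freeze

# Hypothetical value object: one Vhost per (scope, code, url) entry.
Vhost = Struct.new(:scope, :code, :url, keyword_init: true) do
  # Builds a flat list of Vhost objects from the nested hash.
  def self.from_structure(structure_hash)
    structure_hash.flat_map do |scope, scope_hash|
      scope_hash.map do |code, url|
        new(scope: SCOPE_ALIASES.fetch(scope, scope), code: code, url: url)
      end
    end
  end
end

structure_hash = {
  "website" => {
    "base" => "luma.demo",
    "custom_b2c_website" => "homegoods.demo"
  },
  "store_view" => {
    "custom_store_view" => "fr.homegoods.ca"
  }
}

vhosts = Vhost.from_structure(structure_hash)
rows = vhosts.map(&:to_h) # back to plain hashes when a caller needs them
p rows
```

The nesting is still handled by flat_map and map, but it lives in one place, and the rest of the code can work with vhost.scope, vhost.code and vhost.url instead of digging through hashes.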