I have an unusual data structure that I need to search - the structure is the result of parsing a JSON file returned from an HTTP request.
The top-level object is an Array, but from top to bottom I have Array->Hash->Array->Hash.
I'm writing code to check the value of a given key in the innermost hash, across the entire data structure.
As it stands, my code is like this:
json = JSON.parse(File.read(file))
json.each do |hash1|
  hash1.keys.each do |key|
    hash1[key].each do |inner_hash|
      # search in the inner hash
    end
  end
end
I know this is ugly. I know there are better ways to iterate over collections and to combine selections together. I'm just not sure what I should do in this instance given the repeated sliding from array to hash.
What I'm Searching
In the above code, I have the search snipped out. My goal with this iteration structure is to skim through entries (the hashes at the entry_id level) for various conditions and either immediately act on or return the Hash object when the conditions are met. For example, if the condition is that field1 contains something, I might want to return:
[
  {
    "entry_id": 544,
    "field1": "something",
    "field2": "something else",
    "field3": 456
  },
  {
    "entry_id": 546,
    "field1": "something!",
    "field2": "something else!",
    "field3": 012
  }
]
Example of Data Structure
[
  {
    "12345": [
      {
        "entry_id": 543,
        "field1": "value",
        "field2": "other value",
        "field3": 123
      },
      {
        "entry_id": 544,
        "field1": "something",
        "field2": "something else",
        "field3": 456
      }
    ],
    "23456": [
      {
        "entry_id": 545,
        "field1": "new value",
        "field2": "other new value",
        "field3": 789
      },
      {
        "entry_id": 546,
        "field1": "something!",
        "field2": "something else!",
        "field3": 012
      }
    ]
  }
]
- Please paste a short (but complete) example of that data structure, otherwise you are forcing everyone to prepare one for testing purposes. (tokland, Jul 18, 2013 at 16:26)
- D'oh, I should have done that! Thanks, example added. (asfallows, Jul 18, 2013 at 17:12)
- Thanks. More questions: what's in the search? What value must be returned? (tokland, Jul 18, 2013 at 17:16)
- Edited question to address your question. This is an example - there is more than one potential use and more than one potential return result. At times I may want to collect an array of objects, at times I may want to handle matching objects inline. (asfallows, Jul 18, 2013 at 17:26)
3 Answers
You can iterate over the hashes/arrays using a custom enumerator (modified from a Stack Overflow answer):
def dfs(obj, &blk)
  return enum_for(:dfs, obj) unless blk
  yield obj if obj.is_a? Hash
  if obj.is_a?(Hash) || obj.is_a?(Array)
    obj.each do |*a|
      dfs(a.last, &blk)
    end
  end
end
You can then use this enumerator builder method in any number of other helper methods for whatever you need. For example, to perform your example search, you could define:
def find_node_with_value(obj, key, value)
  dfs(obj).select do |node|
    node[key].respond_to?(:include?) && node[key].include?(value)
  end
end
And then use it like:
find_node_with_value(json_data, "field1", "something")
# [{"entry_id"=>544, "field1"=>"something" ...}, {"entry_id"=>546, "field1"=>"something!" ...}]
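Because dfs returns an Enumerator when called without a block, the same traversal should cover both cases raised in the comments: collecting matches and handling them inline. A minimal sketch, with dfs repeated so the snippet runs on its own; the `data` literal is a trimmed stand-in for the parsed JSON, and `first_match` and `seen_ids` are names made up for illustration:

```ruby
# Same traversal as above, repeated so this snippet is self-contained.
def dfs(obj, &blk)
  return enum_for(:dfs, obj) unless blk
  yield obj if obj.is_a?(Hash)
  if obj.is_a?(Hash) || obj.is_a?(Array)
    obj.each { |*a| dfs(a.last, &blk) }
  end
end

# Trimmed sample in the shape Array -> Hash -> Array -> Hash (illustrative data).
data = [
  { "12345" => [{ "entry_id" => 543, "field1" => "value" },
                { "entry_id" => 544, "field1" => "something" }],
    "23456" => [{ "entry_id" => 546, "field1" => "something!" }] }
]

# Stop at the first matching hash instead of collecting all of them.
# to_s guards against hashes that have no "field1" key (such as the outer one).
first_match = dfs(data).find { |node| node["field1"].to_s.include?("something") }

# Handle matches inline by passing a block directly.
seen_ids = []
dfs(data) do |node|
  seen_ids << node["entry_id"] if node["field1"].to_s.include?("something")
end
```

Note that dfs also yields the outer hash (the one keyed by "12345" and "23456"), so any condition has to tolerate a missing key.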
You may want to look into JSONPath, which gives you XPath-like querying of JSON objects.
Note that the leading zero in your sample JSON makes it invalid.
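One way to check this, assuming Ruby's standard json library: the parser rejects a number written with a leading zero, so the sample would need 12 (or a quoted "012") to load. The `valid_json?` helper below is a hypothetical name used only for this sketch:

```ruby
require 'json'

# Hypothetical helper: true when the text parses as JSON, false when the parser rejects it.
def valid_json?(text)
  JSON.parse(text)
  true
rescue JSON::ParserError
  false
end

with_leading_zero = valid_json?('{"field3": 012}')   # number with a leading zero
without_zero      = valid_json?('{"field3": 12}')    # plain integer
as_string         = valid_json?('{"field3": "012"}') # leading zero kept inside a string
```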
You could do something like this...
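json = JSON.parse(File.read(file))
# Enumerator::Lazy has no flatten method, so flatten each hash's values inside flat_map
inner_hashes = json.lazy.flat_map { |hash| hash.values.flatten }
# results is still lazy; call .first or .to_a on it to run the search
results = inner_hashes.select { |x| ... }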
lazy is from Ruby 2.0. It's just there to let you deal with the items one at a time as they are found.
This solution assumes you don't care about the keys of the first level of hashes.
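As a sketch of this approach on a trimmed copy of the question's data (the `data` literal and the `matches` name are illustrative, and the condition is just the question's field1 example), chaining two `flat_map` calls peels off one level at a time and stays lazy throughout:

```ruby
# Trimmed sample in the shape Array -> Hash -> Array -> Hash (illustrative data).
data = [
  { "12345" => [{ "entry_id" => 543, "field1" => "value" },
                { "entry_id" => 544, "field1" => "something" }],
    "23456" => [{ "entry_id" => 545, "field1" => "new value" },
                { "entry_id" => 546, "field1" => "something!" }] }
]

# First flat_map: outer hashes -> their value arrays (the first-level keys are dropped).
# Second flat_map: value arrays -> the individual entry hashes.
inner_hashes = data.lazy.flat_map(&:values).flat_map { |entries| entries }

matches = inner_hashes.select { |entry| entry["field1"].include?("something") }

first_hit = matches.first # walks the data only until the first match
all_hits  = matches.to_a  # forces the full search
```

Dropping `.lazy` gives the same results eagerly, which is likely fine for small files.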