\$\begingroup\$

There is no intended end use for this. It is for learning and development only.

I got carried away following an example and ended up with a miniaturised vehicle registration system. You can bulk generate records, each of which is given a unique registration number in keeping with British registration format. I've written a script that compares and lists all registration numbers that fall afoul of the criteria.

It works, but it is incredibly slow (over an hour to check 1 million records). I am looking for critique of the logic and any optimisations I may have missed.

Example string: AA99AAA

Example criteria: A?9?AAA

def full_search(offensive_list)
  p 'Full check:'
  p "Comparing #{$all_vehicles.count} records against #{offensive_list.count} banned combinations"
  p 'This will take a few minutes'
  vrm_array, example_array = [], []
  vrm_list = $all_vehicles.keys.sort
  vrm_list.each do |vrm|
    vrm_array << vrm.split("") # We split each reg into an array of characters
  end
  offensive_list.each do |example|
    example.strip!
    example_array << example.split("") # and the same with our banned combinations
  end
  vrm_array.each do |vrm|
    example_array.each do |example| # iterate through vrms x examples
      @formatted_vrm = vrm.dup
      if example.length == vrm.length
        example.each_index do |index|
          if example[index] == "?" # for each wildcard we add a wildcard to the vrm for comparison
            @formatted_vrm[index] = "?"
          end
        end
        if @formatted_vrm == example then offensive_found(vrm, example) end
      end
    end
  end
end

def offensive_found(vrm, example)
  built_vrm = ""
  built_example = ""
  if vrm.class == Array # clean up formatting so we can store it
    vrm.each do |character|
      built_vrm << character
    end
    example.each do |character|
      built_example << character
    end
  else
    built_example = example # clearly redundant, but it works so...
    built_vrm = vrm
  end
  if $bad_vrms[built_example] # if we already have a record
    prev_matched = $bad_vrms[built_example] # just add to the array
    prev_matched << built_vrm
    $bad_vrms.store(built_example, prev_matched)
  else
    new_match = [built_vrm] # or create a new hash key
    $bad_vrms.store(built_example, new_match)
  end
  # p "#{built_vrm} - matched with #{built_example}"
end

If you'd prefer, you can clone the full thing from GitHub: https://github.com/Niall47/RubySomDemo

Vogel612
asked Mar 19, 2019 at 9:00
\$\endgroup\$

1 Answer

\$\begingroup\$

You could use regular expressions to speed up the search. The cleanest way to do so would be to store the patterns in your full_search.txt as regular expressions. For example, A?9?AAA would need to be changed to A.9.AAA (in a regex, . matches any single character other than a newline).

Then you could change your full_search method to look like this:

def full_search(offensive_list)
  vrm_list = $all_vehicles.keys.sort
  # strip surrounding whitespace and newlines from each banned pattern
  offensive_examples = offensive_list.map(&:strip)
  offensive_examples.each do |offensive_example|
    # grep returns every vrm that matches the anchored pattern
    vrm_list.grep(/^#{offensive_example}$/).each do |offensive_vrm|
      offensive_found(offensive_vrm, offensive_example)
    end
  end
end

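In the regex, ^ anchors the match to the start of a line and $ to the end of a line. For a single-line string such as a registration number that is the same as the start and end of the string (Ruby's \A and \z anchor to the string itself). This ensures that substrings are not matched: for example, abcd matches the regex .c but not ^.c$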
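To make the effect of the anchors concrete, here is a small sketch that can be pasted into irb. The constant names are only for illustration, and the \A and \z variant is an addition that is not part of the method above; it matters only if a string could contain a newline.

```ruby
# Unanchored: the pattern may match anywhere inside the string.
UNANCHORED = /.c/
# ^ and $ anchor to the start and end of a line.
LINE_ANCHORED = /^.c$/
# \A and \z anchor to the start and end of the whole string.
STRING_ANCHORED = /\A.c\z/

puts "abcd".match?(UNANCHORED)         # true, "bc" is found inside "abcd"
puts "abcd".match?(LINE_ANCHORED)      # false, the line is longer than the pattern
puts "bc".match?(LINE_ANCHORED)        # true
puts "bc\nxyz".match?(LINE_ANCHORED)   # true, the first line matches on its own
puts "bc\nxyz".match?(STRING_ANCHORED) # false, the whole string must match
```

Since the patterns are stripped before use and the registration numbers are single-line keys, either style of anchor should behave the same here.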

If you don't want to modify your list, you could do something that dynamically creates the regex in Ruby. For example, offensive_example.gsub("?", ".") would replace every ? with a . character.
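A minimal sketch of that dynamic conversion, assuming the banned list keeps the ? wildcard syntax. The helper name wildcard_to_regexp and the sample registrations are made up for illustration. It also escapes the pattern first, which goes slightly beyond the plain gsub, so that any other regex metacharacter in the list is treated literally.

```ruby
# Hypothetical helper: turn a banned pattern such as "A?9?AAA"
# into an anchored Regexp.
def wildcard_to_regexp(pattern)
  # Escape everything first so regex metacharacters are taken literally,
  # then turn each escaped wildcard ("\?") into "." (any single character).
  body = Regexp.escape(pattern.strip).gsub('\?', '.')
  /\A#{body}\z/
end

banned = wildcard_to_regexp("A?9?AAA")
vrms = %w[AA99AAA AB91AAA AA99AAB XAA99AAA] # made-up sample data
p vrms.grep(banned) # ["AA99AAA", "AB91AAA"]
```

Building each Regexp once and reusing it for the whole grep avoids recompiling the pattern for every registration number.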


A couple unrelated pointers...

  • Ruby has a String#chars method that is (arguably) more readable and might have some minor performance improvements over string.split("").
  • Instead of assigning an initial empty value and building it in an each, consider using Array#map to reduce the amount of variable reassignments.
# before
vrm_array = []
vrm_list.each do |vrm|
  vrm_array << vrm.split("")
end
vrm_array.each do |vrm|
  # do stuff
end
# after
vrm_array = vrm_list.map { |vrm| vrm.split("") } # this calls `split` on each element in `vrm_list`
vrm_array.each do |vrm|
  # do stuff
end
  • If you need to take an array of characters and join them together into a single string, consider using Array#join.
irb(main):001:0> ['a', 'b', 'c'].join
=> "abc"
  • Consider giving the hash a default block, so that a missing key is initialised to an empty array the first time it is accessed.
$bad_vrms = Hash.new { |h, k| h[k] = [] } # the block runs for a missing key and stores a new empty array under it
# now we don't need to check if a key exists and can simply push to it
$bad_vrms[example] << vrm
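The default-block hash can be tried in isolation with a short sketch. The method name new_match_store and the local variable are hypothetical stand-ins for the $bad_vrms global, and the registrations are made-up sample data.

```ruby
# Hypothetical stand-in for the $bad_vrms global used in the question.
def new_match_store
  # The block runs only when a missing key is read; it stores a fresh
  # array under that key and returns it.
  Hash.new { |hash, key| hash[key] = [] }
end

matches = new_match_store
matches["A?9?AAA"] << "AA99AAA" # first access creates the array
matches["A?9?AAA"] << "AB91AAA" # later accesses reuse it
p matches["A?9?AAA"]      # ["AA99AAA", "AB91AAA"]
p matches.key?("B??????") # false, key? does not trigger the default block
```

Note that the block form matters: Hash.new([]) would hand the same array object to every missing key and would not store it in the hash.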
answered Mar 21, 2019 at 10:37
\$\endgroup\$
