1
\$\begingroup\$

I have sample database in GES files and I wrote importer in Ruby.

Sample file: WAR_APO.GES: https://gist.github.com/regedarek/cd0ce73c5d1e14ff9818

I will have several of GES files which I want to move to my data folder in app and I want to process them manually.

  1. For most of them logic is the same but the difference are filename, model and attributes hash.
  2. I want to also and new cases like: Update when type is U or Delete if is D

How should I refactor that:

class PrototypePharmaceuticImporter
 ID = '00'
 K = "#{ID}K"
 F = "#{ID}F"
 FB = "#{ID}FB"
 D = "#{ID}D"
 I = "#{ID}I"
 U = "#{ID}U"
 E = "#{ID}E"
 TYPES = [K, F, FB, D, I, U, E]
 ARTICLE_TYPE = 57
 BARCODE = 15
 GROUP_CODE = 1
 GROUP_KEY = 40
 GROUP_NAME = 2
 FORM_CODE = 1
 FORM_KEY = 38
 FORM_NAME = 2
 FORM_SHORT_NAME = 3
 LONGNAME = 28
 PACKAGE_TYPE = 46
 PHARMACY_ONLY = 3
 PRESCRIPTION_ONLY = 47
 PRICE = 2
 WEIGHT = 14
 attr_accessor :file, :k, :f
 def initialize
 @pharmaceutics_file = File.open 'data/PAC_APO.GES', 'r'
 @forms_file = File.open 'data/DAR_APO.GES', 'r'
 @groups_file = File.open 'data/WAR_APO.GES', 'r'
 @k = []
 @f = []
 extract_groups_to_database
 end
 def extract_pharmaceutics_to_database
 add = true
 tmp = []
 type = K
 @pharmaceutics_file.each_with_index do |line, i|
 break if (i > 96 * 1000 + 760 + 9)
 _type = line.strip
 _changed = TYPES.include? _type
 if _changed && i > 0
 case type
 when K then @k << tmp
 when F then @f << tmp
 when FB then @f << tmp
 when I, U, D
 hash = {
 article_type: tmp[ARTICLE_TYPE],
 price: tmp[PRICE],
 weight: tmp[WEIGHT],
 package_type: tmp[PACKAGE_TYPE],
 group_code: tmp[GROUP_KEY],
 form_code: tmp[FORM_KEY],
 name: tmp[LONGNAME],
 barcode: tmp[BARCODE],
 pharmacy_only: tmp[PHARMACY_ONLY],
 prescription_only: tmp[PRESCRIPTION_ONLY]
 }
 Pharmaceutic.create(hash)
 end
 tmp = []
 type = _type
 end
 tmp << clean(line)
 end
 end
 def extract_forms_to_database
 add = true
 tmp = []
 type = K
 @forms_file.each_with_index do |line, i|
 break if (i > 96 * 1000 + 760 + 9)
 _type = line.strip
 _changed = TYPES.include? _type
 if _changed && i > 0
 case type
 when K then @k << tmp
 when F then @f << tmp
 when FB then @f << tmp
 when I, U, D
 hash = {
 code: tmp[FORM_CODE],
 name: tmp[FORM_NAME],
 short_name: tmp[FORM_SHORT_NAME]
 }
 PharmaceuticForm.create(hash)
 end
 tmp = []
 type = _type
 end
 tmp << clean(line)
 end
 end
 def extract_groups_to_database
 add = true
 tmp = []
 type = K
 @groups_file.each_with_index do |line, i|
 break if (i > 96 * 1000 + 760 + 9)
 _type = line.strip
 _changed = TYPES.include? _type
 if _changed && i > 0
 case type
 when K then @k << tmp
 when F then @f << tmp
 when FB then @f << tmp
 when I, U, D
 hash = {
 code: tmp[GROUP_CODE],
 name: tmp[GROUP_NAME]
 }
 PharmaceuticGroup.create(hash)
 end
 tmp = []
 type = _type
 end
 tmp << clean(line)
 end
 end
 private
 def clean line
 line.strip
 .gsub(/^[\d]{2}/, '')
 .gsub(/\\[s|S]39/, 'ß')
 .gsub(/\\a25/, 'ä')
 .gsub(/\\A25/, 'Ä')
 .gsub(/\\o25/, 'ö')
 .gsub(/\\O25/, 'Ö')
 .gsub(/\\u25/, 'ü')
 .gsub(/\\U25/, 'Ü')
 end
end

Basically, I want to get rid of duplications in these methods.

asked Jul 26, 2013 at 9:08
\$\endgroup\$

1 Answer 1

1
\$\begingroup\$

Some notes:

  • As usual, I'd recommend a more functional approach. It's not a theoretical issue, functional code is more concise, more clear. When I see @k = [] I get the shakes thinking that this variable can (and will) be modified just everywhere in the class (an instance variable is just a nasty global variable -for the scope a class, granted- with a pretty name).

  • Also related to FP, your code is hard to abstract because it's written with procedures instead of functions: they don't take values (as arguments) and return values, everything works by side-effects through instance variables. No referential transparency, no idempotence...

  • Don't create so many individual constants, group them in hashes.

  • File.open 'data/PAC_APO.GES', 'r'. Hardcoded files in classes? mmmm.

  • 96 * 1000 + 760 + 9. Magic number, define it as a constant so it has a name.

Answering your particular question about how to avoid repeating code for those 3 methods: whatever is different put it as arguments. If some value depends on something the caller cannot know yet about (the code under when I, U, D), then use an argument block as callback. So it might be written:

def extract_to_database(collection)
 add = true
 tmp = []
 type = K
 collection.each_with_index do |line, i|
 break if (i > 96 * 1000 + 760 + 9)
 _type = line.strip
 _changed = TYPES.include? _type
 if _changed && i > 0
 case type
 when K then @k << tmp
 when F then @f << tmp
 when FB then @f << tmp
 when I, U, D
 yield(tmp)
 end
 tmp = []
 type = _type
 end
 tmp << clean(line)
 end
end

Note that I kept your old imperative code (a functional approach would require rewriting the full class), but at least is now DRYed. I'd strongly encourage to try the functional paradigm, it's hard at first but it pays off on the long run. My article on this matter.

answered Jul 26, 2013 at 12:31
\$\endgroup\$
2
  • \$\begingroup\$ Could you give me simple advice how to get rid of using @k = [] from this method? I read your article and it is great but I have problem with implementing some of your hints. So I changed it to use sql for inserting and it looks like that: gist.github.com/regedarek/6172861. Thanks for this article it is great. \$\endgroup\$ Commented Aug 7, 2013 at 10:21
  • 1
    \$\begingroup\$ I'll try to write a refactor of the gist later. For now, note that @f is defined nowhere. Try to write it without instance variables. \$\endgroup\$ Commented Aug 7, 2013 at 10:30

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.