4
\$\begingroup\$

The code is working fine however, I would like to fix code after unless and can I store thread ID in the database too? Is there a way to improve performance?

require 'csv'
 class CsvDb
 class << self
 def convert_save(model_name, csv_data, field_name=nil)
 target_model = model_name.classify.constantize
 csv_file = csv_data.read
 row_headers={}
 counter=0;
 #Thread.new do
 CSV.parse(csv_file) do |row| 
 if counter==0
 temp=row
 row_headers = Hash[temp.map.with_index.to_a]
 counter +=1
 next
 end
 unless row[row_headers["name"]].nil?
 temp={}
 business_type = row[row_headers["business_type_id"]]
 business_type_id = business_type=='Online' ? 1: business_type=='Local' ? 2: 3
 temp[:business_type_id] = business_type_id
 temp[:user_id] = row[row_headers["user_id"]]
 temp[:name] = row[row_headers["name"]]
 temp[:country_id] = row[row_headers["country_id"]]
 temp[:homepage] = row[row_headers["homepage"]] ||=""
 temp[:telephone] = row[row_headers["telephone"]] ||=""
 temp[:email] = row[row_headers["email"]] ||=""
 temp[:address] = row[row_headers["address"]] ||=""
 temp[:latitude] = row[row_headers["latitude"]] 
 temp[:longitude] = row[row_headers["longitude"]]
 temp[:facebook] = row[row_headers["facebook"]] ||=""
 temp[:twitter] = row[row_headers["twitter"]] ||=""
 temp[:google] = row[row_headers["google"]] ||=""
 temp[:instagram] = row[row_headers["instagram"]] ||=""
 temp[:pinterest] = row[row_headers["pinterest"]] ||=""
 temp[:free_shipping] = row[row_headers["free_shipping"]]
 temp[:ship_details] = row[row_headers["ship_details"]] ||=""
 temp[:category_ids] = [row[row_headers["category_ids"]]]
 temp[:style_ids] = [row[row_headers["style_ids"]]]
 temp[:shipping_country_ids] = [row[row_headers["shipping_country_ids"]]]
 business = target_model.new(temp)
 business.save
 end
 end
 ActiveRecord::Base.connection.close
 end
 #end
 end 
 end
asked Apr 12, 2016 at 10:58
\$\endgroup\$
6
  • 1
    \$\begingroup\$ I would like to fix code after unless and can I store thread ID in the database too? Could you re-phrase that? I have no idea what you're trying to convey. \$\endgroup\$ Commented Apr 12, 2016 at 11:27
  • 1
    \$\begingroup\$ What do you mean by fix after unless, are you talking about formatting ? You know fixing code is off-topic. \$\endgroup\$ Commented Apr 12, 2016 at 14:43
  • \$\begingroup\$ By "fix code" do you mean "improve"? \$\endgroup\$ Commented Apr 13, 2016 at 3:53
  • \$\begingroup\$ I am not asking about formatting, I am asking about using map filter that will make 14 line to two or three \$\endgroup\$ Commented Apr 13, 2016 at 6:20
  • \$\begingroup\$ Fix means to improve not formatting, Second concern is that if thread start a task how could I know that thread complete the task or not. \$\endgroup\$ Commented Apr 13, 2016 at 6:21

2 Answers 2

3
+50
\$\begingroup\$

The following code is not tested, maybe some minor changes will be necessary: I'm not 100% sure it will work as is, don't hesitate to tell me if there is anything wrong with the code

Refactoring the existing code

The first thing I did was to set the headers option to true it will enable the parsing of the first line of the CSV and instead of giving you an array for each row it will return an hashmap with keys extracted from the header. It replace the code you wrote to parse the header of the file.

Then I replace the imbricated ternary operations to use a case when expression for the business_type_id which is far more readable.

Next I made two small function to extract all the needed values, I use reduce to automatically fill a hashmap from an array of symboles.

At least I merge all the hashmap in the constructor of the target_model before saving.

CSV.parse(csv_file, {headers: true, header_converters: :symbol}) do |row|
 business_type_id = case row[:business_type_id]
 when 'Online' then 1
 when 'Local' then 2
 else 3
 end
 target_model.new( {business_type_id: business_type_id} + extract_required_fields(row) + extract_optionals_fiels(row) )
 .save()
end
def extract_required_fields(row)
 [:user_id, :name, :country_id, :free_shipping, :category_ids, :style_ids, :shipping_country_ids]
 .reduce({}) do |carry, item|
 carry[item] = row[item]
 end
end
def extract_optionals_fiels(row)
 [:homepage, :telephone, :email, :address, :facebook, :twitter, :google, :instagram, :pinterest, :ship_details]
 .reduce({}) do |carry, item|
 carry[item] = row[item] ||= ''
 end
end

Bulk update

But this code, even if it has a better separation of concern and less duplication will not really speed your import process because your performance bottleneck is in the insert operation performed for each item by the save method.

So the solution is to make a bulk update instead of saving each row separately: you'll group the insert !

To do so, you can use the https://github.com/zdennis/activerecord-import gem which will allow you to do something like this :

entities = []
CSV.parse(csv_file, {headers: true, header_converters: :symbol}) do |row|
 business_type_id = case row[:business_type_id]
 when 'Online' then 1
 when 'Local' then 2
 else 3
 end
 entities << target_model.new( {business_type_id: business_type_id} + extract_required_fields(row) + extract_optionals_fiels(row) )
 target_model.import entities if entities.size == 100 
end
#don't forget to save the remaining entities:
target_model.import entities unless entities.size == 0

In this case we proceed the save operation each 100 row, which will make the import process way faster !

Getting the Thread's ID

To collect the thread's Id, I suggest to use the inspect method of the current_thread, you'll get more information than only the thread's id but you'll be able to have all the needed informations about which thread processed the insert operation.

thread_identifier = Thread.current.inspect
answered Apr 17, 2016 at 18:59
\$\endgroup\$
8
  • \$\begingroup\$ Should I have to write code in model now or I can use same code as it is placed now. \$\endgroup\$ Commented Apr 18, 2016 at 5:18
  • \$\begingroup\$ You should be able to use the code as it is placed now \$\endgroup\$ Commented Apr 18, 2016 at 7:59
  • \$\begingroup\$ But there is issue with bulk update, I have to skip few columns which is not possible by gem. \$\endgroup\$ Commented Apr 18, 2016 at 8:59
  • \$\begingroup\$ You have to skip columns or row ? For the columns, just remove them from the constructor's hash, for the rows you can make the append (<<) conditional. \$\endgroup\$ Commented Apr 18, 2016 at 9:28
  • \$\begingroup\$ Need to remove columns because the column name is same as of nested model. That model is just for main user help who create csv: CSV has fields [:user_id, :shipping_countries, shipping_country_ids] here shipping_countries is useless. The first case cover this case but gem one not. \$\endgroup\$ Commented Apr 18, 2016 at 9:40
2
\$\begingroup\$

No idea what "thread ID" means.

I'll just focus on the the translation of the CSV rows into an object. There's just to much repetition in the unless body. All those rows should be mentioned once, then handled in a loop instead of hard coding them like this. E.g.

unless row[row_headers["name"]].nil?
 temp={}
 business_type = row[row_headers["business_type_id"]]
 business_type_id = business_type == "Online" ? 1: business_type == "Local" ? 2 : 3
 temp[:business_type_id] = business_type_id
 for name in [:user_id, :name, :country_id, :latitude, :longitude, :free_shipping, :category_ids, :style_ids, :shipping_country_ids]
 temp[name] = row[row_headers[name.to_s]]
 end
 for name in [:homepage, :telephone, :email, :address, :facebook, :twitter, :google, :instagram, :pinterest, :ship_details]
 temp[name] = row[row_headers[name.to_s]] ||= ""
 end
 business = target_model.new(temp)
 business.save
end

That said, I'm sure there are better ways to do this in Ruby ...

answered Apr 17, 2016 at 11:34
\$\endgroup\$
1
  • \$\begingroup\$ thanks, since records are in millions and process take much time so I would like to do it using thread to improve performance. Here concern is that how can I store thread data and its status in the database because thread id can't be remain same. \$\endgroup\$ Commented Apr 17, 2016 at 18:01

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.