\$\begingroup\$

I'm trying to write code that reads data from a CSV and pushes it to a Salesforce application using the API. My code processes the data in a for loop, but it takes a long time (3 hours) to run the function. What can I do to optimize my code to run faster?

Here's an example of my code, which reads Patient Diagnosis data from a flat file that has more than 200k records. Inside the for loop, I query the patient list, which has 100k+ records, transform the object, then add it to a list for bulk processing. My code looks like this:

Iterating over ptdiag which contains flatfile data

for (int i = 0; i < ptdiags.Count; i += BATCH_SIZE)
{
 var batchContents = SFToBTMapping.Bulk_PtDiag_Content(ptdiags.Skip(i).Take(BATCH_SIZE).ToList(),sfPatients);
 var batch = BulkUpsert(job.Id, batchContents);
}

Function that transforms the object. Here I query sfpatients to link a patientid to the diagnosis object

 public static string Bulk_PtDiag_Content(List<Ptdiag> ptdiags, List<SfPatient__c> sfpatients)
 {
 string res = "Patient__c,DiagKey__c,NickName__c" +
 ",Sequence__c,ShortDescr__c,PTDiagKey__c" + Environment.NewLine;
 foreach (var d in ptdiags)
 {
 var sfd = Map_BTSQL_Patientdiag_To_SF_Patientdiag(d);
 sfd.Patient__c = sfpatients.FirstOrDefault(c => c.PatientKey__c == d.Ptkey.ToString())?.Id;
 res += string.Join(",", sfd.Patient__c, sfd.DiagKey__c, sfd.NickName__c
 , sfd.Sequence__c, sfd.ShortDescr__c.Replace(",",""), sfd.PTDiagKey__c);
 if (ptdiags.Last() != d)
 res += Environment.NewLine;
 }
 return res;
 }

Method that creates a mapping for Ptdiag

 public static SfPatientDiag__c Map_BTSQL_Patientdiag_To_SF_Patientdiag(Ptdiag d)
 {
 return new SfPatientDiag__c
 {
 DiagKey__c = d.Diagkey.ToString(),
 Diagnosis__r = new SfDiagnosis__c { Diagnosis_Key__c = d.Diagkey.ToString() },
 NickName__c = d.Nickname,
 Patient__r = new SfPatient__c { PatientKey__c = d.Ptkey.ToString() },
 Sequence__c = d.Sequence != null ? Convert.ToDouble(d.Sequence) : 0,
 ShortDescr__c = d.Shortdescr,
 PTDiagKey__c = d.Ptdiagkey.ToString()
 };
 }
Jamal
asked May 13, 2021 at 13:40
\$\endgroup\$
  • \$\begingroup\$ If your code is processing 200k records and it takes 3 hours to do so, then 99% of that time will be spent in the BulkUpsert method. Did you try commenting it out and running your code to see the results? \$\endgroup\$ Commented May 13, 2021 at 19:12
  • \$\begingroup\$ please share BulkUpsert code as well. \$\endgroup\$ Commented May 13, 2021 at 22:53
  • \$\begingroup\$ Bulk Upsert really wasn't an issue. Querying a large object in a for loop is what was causing the overhead. I replaced list with a dictionary to improve the performance significantly. Now it takes less than 2 mins to complete. \$\endgroup\$ Commented May 15, 2021 at 11:20

1 Answer

\$\begingroup\$

In addition to the post by @aepot, I made the following changes which reduced the completion time significantly.

I used a dictionary instead of a list to look up the patient Id. Searching a large list of objects inside a for loop is what really slowed down the processing. Here's what the code looks like now:

for (int i = 0; i < ptdiags.Count; i += BATCH_SIZE)
{
 var batchContents = SFToBTMapping.Bulk_PtDiag_Content(ptdiags.Skip(i).Take(BATCH_SIZE).ToList(), sfPatients.ToDictionary(p => p.PatientKey__c));
 var batch = BulkUpsert(job.Id, batchContents);
}

public static string Bulk_PtDiag_Content(List<Ptdiag> ptdiags, Dictionary<string, SfPatient__c> sfpatients)
 {
 var sb = new StringBuilder();
 sb.AppendLine("Patient__c,DiagKey__c,NickName__c,Sequence__c,ShortDescr__c,PTDiagKey__c");
 foreach (Ptdiag d in ptdiags)
 {
 SfPatientDiag__c sfd = Map_BTSQL_Patientdiag_To_SF_Patientdiag(d);
 sfd.Patient__c = sfpatients.GetValueOrDefault(d.Ptkey.ToString()) != null ? sfpatients[d.Ptkey.ToString()].Id : "";
 sb.Append(string.Join(",", sfd.Patient__c, sfd.DiagKey__c, sfd.NickName__c
 , sfd.Sequence__c, sfd.ShortDescr__c.Replace(",", ""), sfd.PTDiagKey__c));
 if (ptdiags.Last() != d)
 sb.AppendLine();
 }
 return sb.ToString();
 }
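
Two further refinements seem possible, shown here only as a sketch under stated assumptions. The loop above calls ToDictionary on every batch, so the lookup is rebuilt each time; it could be built once before the loop instead. The method also reads the dictionary twice per record, and a single TryGetValue call would do. In the sketch, `Diag`, `Patient`, `LookupSketch`, `BuildPatientIdLookup` and `BuildBatch` are hypothetical stand-ins, not the real Ptdiag / SfPatient__c types or methods, and the CSV is cut down to two columns to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-ins for the real Ptdiag and SfPatient__c types.
public record Diag(int Ptkey, string Shortdescr);
public record Patient(string Id, string PatientKey);

public static class LookupSketch
{
    // Build the PatientKey -> Id map once, before the batch loop.
    public static Dictionary<string, string> BuildPatientIdLookup(IEnumerable<Patient> patients)
    {
        var map = new Dictionary<string, string>();
        foreach (var p in patients)
            map[p.PatientKey] = p.Id; // if a key repeats, the last entry wins
        return map;
    }

    // Build the CSV content for rows [start, start + count) by index,
    // without copying each batch into a new list.
    public static string BuildBatch(IReadOnlyList<Diag> diags, int start, int count,
                                    IReadOnlyDictionary<string, string> patientIds)
    {
        var sb = new StringBuilder();
        sb.AppendLine("Patient__c,ShortDescr__c");
        int end = Math.Min(start + count, diags.Count);
        for (int i = start; i < end; i++)
        {
            var d = diags[i];
            // One hash lookup per record; use an empty Id when the key is missing.
            if (!patientIds.TryGetValue(d.Ptkey.ToString(), out var id))
                id = "";
            sb.Append(id).Append(',').Append((d.Shortdescr ?? "").Replace(",", ""));
            if (i < end - 1)
                sb.AppendLine(); // no trailing newline after the last row
        }
        return sb.ToString();
    }
}
```

With this shape, the calling loop would build the lookup once (`var lookup = LookupSketch.BuildPatientIdLookup(...)`) and then pass `i`, `BATCH_SIZE` and `lookup` to the batch builder on each iteration, keeping the BulkUpsert call as it is.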
answered May 15, 2021 at 11:17
\$\endgroup\$
  • \$\begingroup\$ try-catch is a bad practice in this case, use Dictionary.TryGetValue instead. \$\endgroup\$ Commented May 15, 2021 at 11:36
  • \$\begingroup\$ Thanks for the suggestion. I have updated my code to use a dictionary null checker. \$\endgroup\$ Commented May 15, 2021 at 13:53
  • \$\begingroup\$ sfpatients.GetValueOrDefault(d.Ptkey)?.Id ?? "" would be enough. One attempt to read the value is twice as fast as two. \$\endgroup\$ Commented May 15, 2021 at 13:57
