3
\$\begingroup\$

I need to optimize this snippet to be faster:

Dim lst = (From t In DocElet.ChildNodes Select ID = t.item("ID").outerxml).Distinct().ToList()
 Parallel.For(0, lst.Count, Sub(i)
 Dim P As XmlElement = GetElement(lst(i))
 Dim ls = (From t In DocElet.ChildNodes Where t.item("ID").innerText = P.InnerText Select t)
 Parallel.ForEach(ls, Sub(D)
 Dim verif_date As String = D.Item("DAD").InnerText
 Sej.ID = D.Item("ID").InnerText
 End Sub)
 End Sub)

This is the XML structure:

<?xml version="1.0" encoding="utf-8"?>
<PatientData>
<Sejour><ID></ID><DAD></DAD></Sejour>
 </PatientData>

I'm asking how I can optimize my code because it takes a lot of time (50 sec) in the case where the list contains 20000 elements.

Ocelot20
1,89311 silver badges18 bronze badges
asked Jun 27, 2014 at 14:48
\$\endgroup\$

1 Answer 1

2
\$\begingroup\$

The indentation makes your snippet quite hard to read - the VB.NET syntax for inline delegates is fairly bulky (at least compared to C#), I'd give it more air; being able to read the code is step one.

Dim lst = (From t In DocElet.ChildNodes 
 Select ID = t.item("ID").outerxml).Distinct()
 .ToList()
Parallel.For(0, lst.Count, 
 Sub(i)
 Dim P As XmlElement = GetElement(lst(i))
 Dim ls = (From t In DocElet.ChildNodes 
 Where t.item("ID").innerText = P.InnerText 
 Select t)
 Parallel.ForEach(ls, 
 Sub(D)
 Dim verif_date As String = D.Item("DAD").InnerText
 Sej.ID = D.Item("ID").InnerText
 End Sub)
 End Sub)

Ok. So you're running a Parallel.For, and inside that, you're running a Parallel.ForEach. That's quite a lot of overhead just there, it's likely that you're shooting yourself in the foot here, I'd first remove the inner Parallel.ForEach:

Parallel.For(0, lst.Count, 
 Sub(i)
 Dim P As XmlElement = GetElement(lst(i))
 Dim ls = (From t In DocElet.ChildNodes 
 Where t.item("ID").innerText = P.InnerText 
 Select t)
 For Each foo In ls
 Dim verif_date As String = D.Item("DAD").InnerText
 Sej.ID = D.Item("ID").InnerText
 Next
 End Sub)

What's the purpose of verif_date? If it's not used, remove it:

 For Each foo In ls
 Sej.ID = D.Item("ID").InnerText
 Next

Now the outer loop is making yet another selection:

 Dim ls = (From t In DocElet.ChildNodes 
 Where t.item("ID").innerText = P.InnerText 
 Select t)

I'd try to flatten that. I'm not very familiar with the LINQ syntax in VB.NET, but what you want is a SelectMany, where you'd select the child nodes where "ID"'s inner text matches the element's inner text for the node you're iterating - in other words the ls query gets outside the outer loop and into the lst query, so you only need a single loop (which can probably run in parallel).

If the XML is massive, it could be worth trying some PLINQ:

Dim lst = (From t In DocElet.ChildNodes 
 Select ID = t.item("ID").outerxml)
 .Distinct()
 .AsParallel()
 .SelectMany(...)
 .ToList()

Now you can iterate lst with Parallel.ForEach (time it with a normal For Each first) and do your thing.

As an aside, I have to say your naming is inconsistent and hard to follow: Sub(i) looks good (Sub(index) would be better), Sub(D) doesn't. I realize VB.NET isn't case-sensitive, but if you're going to have a convention to make function parameters camelCase, by all means stick to it. lst would probably be better off as elements or whatever makes sense - use real, readable words for your identifiers, not just arbitrary abbreviations.

answered Jun 27, 2014 at 15:21
\$\endgroup\$
2
  • 1
    \$\begingroup\$ is it true : accessing XML from multiple threads with Parallel.For/Parallel.ForEach is not guaranteed to work correctly as these classes are not thread safe. \$\endgroup\$ Commented Jun 27, 2014 at 15:36
  • 1
    \$\begingroup\$ Awesome, thanks for that comment, that falls under "good to know"! ;) \$\endgroup\$ Commented Jun 27, 2014 at 16:36

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.