Wednesday, January 23, 2008
Namespace Strangenesses in XML Infosets Transformed by LINQ to XML
The purpose of namespace prefixes is to provide abbreviations for global or group-level namespaces, which otherwise would bloat the already substantial overhead of XML Infosets. I've found LINQ to XML not to process namespace declarations as I expected when processing some semi-real-world documents.
Updated 1/23/2008: See end of post.
Bloating All Prefixed Elements with Duplicate Local Namespace Declarations
LINQ to XML works exclusively with expanded namespace prefixes, so relatively simple documents with a few namespaces become unwieldy to store and difficult for humans to read. For example, this simple Atom 1.0-formatted source Infoset returned by an ADO.NET Data Services URL query is quite easy to read:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <feed xml:base="http://localhost:50539/Northwind.svc/" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns="http://www.w3.org/2005/Atom"> <entry adsm:type="NorthwindModel.Orders"> <id>http://localhost:50539/Northwind.svc/Orders(11077)</id> <updated /> <title /> <author> <name /> </author> <link rel="edit" href="Orders(11077)" title="Orders" /> <content type="application/xml"> <ads:OrderID adsm:type="Int32">11077</ads:OrderID> <ads:OrderDate adsm:type="Nullable`1[System.DateTime]">2007年05月06日T00:00:00</ads:OrderDate> <ads:RequiredDate adsm:type="Nullable`1[System.DateTime]">2007年06月03日T00:00:00</ads:RequiredDate> <ads:ShippedDate adsm:type="Nullable`1[System.DateTime]" ads:null="true" /> <ads:Freight adsm:type="Nullable`1[System.Decimal]">8.5300</ads:Freight> <ads:ShipName>Rattlesnake Canyon Grocery</ads:ShipName> <ads:ShipAddress>2817 Milton Dr.</ads:ShipAddress> <ads:ShipCity>Albuquerque</ads:ShipCity> <ads:ShipRegion>NM</ads:ShipRegion> <ads:ShipPostalCode>87110</ads:ShipPostalCode> <ads:ShipCountry>USA</ads:ShipCountry> </content> <link rel="related" title="Customers" href="Orders(11077)/Customers" type="application/atom+xml;type=entry" /> <link rel="related" title="Employees" href="Orders(11077)/Employees" type="application/atom+xml;type=entry" /> <link rel="related" title="Order_Details" href="Orders(11077)/Order_Details" type="application/atom+xml;type=feed" /> <link rel="related" title="Shippers" href="Orders(11077)/Shippers" type="application/atom+xml;type=entry" /> </entry> <!-- ... --> </feed>
Applying a LINQ to XML query that returns only abbreviated <entry> groups for the USA in reverse OrderDate order results in this mess:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <feed> <entry> <content type="application/xml" xmlns="http://www.w3.org/2005/Atom"> <ads:OrderID adsm:type="Int32" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">11077</ads:OrderID> <ads:OrderDate adsm:type="Nullable`1[System.DateTime]" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb"> 2007年05月06日T00:00:00 </ads:OrderDate> <ads:RequiredDate adsm:type="Nullable`1[System.DateTime]" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb"> 2007年06月03日T00:00:00 </ads:RequiredDate> <ads:ShippedDate adsm:type="Nullable`1[System.DateTime]" ads:null="true" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" /> <ads:Freight adsm:type="Nullable`1[System.Decimal]" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">8.5300</ads:Freight> <ads:ShipName xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">Rattlesnake Canyon Grocery</ads:ShipName> <ads:ShipAddress xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">2817 Milton Dr.</ads:ShipAddress> <ads:ShipCity xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">Albuquerque</ads:ShipCity> <ads:ShipRegion xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">NM</ads:ShipRegion> <ads:ShipPostalCode xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">87110</ads:ShipPostalCode> <ads:ShipCountry xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb">USA</ads:ShipCountry> </content> </entry>
<!-- ... --> </feed>
You can remove the duplicate local namespaces by string manipulation but doing so results in brittle code.
Bloating Some Unprefixed Elements with Duplicate Group Namespace Declarations
An alternative is to transform, rather than filter, the document because the compiler is reported to cache the namespaces you add with code and remove them from the output. My Visual Basic 9.0 XML literal transform code is similar to the following abbreviated version (the three namespaces are imported with Imports directives, which aren't shown):
Private Sub TransformOrders() Dim xdOrders As XDocument = XDocument.Load(strDataPath & "Orders.xml", _ LoadOptions.PreserveWhitespace) Dim xdDetails As XDocument = XDocument.Load(strDataPath & _ "Order_Details.xml", LoadOptions.PreserveWhitespace) Dim Orders As XDocument = _ <?xml version="1.0" encoding="utf-8" standalone="yes"?> <feed xmlns="http://www.w3.org/2005/Atom" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" <%= From o In xdOrders...<content> _ Where o...<ads:ShipCountry>.Value = "USA" _ Order By o...<ads:OrderDate>.Value Descending _ Select New XElement( _ <entry> <content type="application/xml"> <Order> <ads:OrderID adsm:type="Int32"> <%= o...<ads:OrderID>.Value %> </ads:OrderID> <!-- ... --> <ads:ShipCountry> <%= o...<ads:ShipCountry>.Value %> </ads:ShipCountry> </Order> </content> </entry>) %>> </feed> End Sub
However, the compiler doesn't get rid of all duplicate namespaces, as illustrated by the following output Infoset:
<feed xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns="http://www.w3.org/2005/Atom"> <entry xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns="http://www.w3.org/2005/Atom"> <content type="application/xml"> <Order> <ads:OrderID adsm:type="Int32">11006</ads:OrderID> <ads:OrderDate adsm:type="Nullable`1[System.DateTime]">2007年04月07日T00:00:00</ads:OrderDate> <ads:RequiredDate adsm:type="Nullable`1[System.DateTime]">2007年05月05日T00:00:00</ads:RequiredDate> <ads:ShippedDate adsm:type="Nullable`1[System.DateTime]">2007年04月15日T00:00:00</ads:ShippedDate> <ads:Freight adsm:type="Nullable`1[System.Decimal]">25.1900</ads:Freight> <ads:ShipName>Great Lakes Food Market</ads:ShipName> <ads:ShipAddress>2732 Baker Blvd.</ads:ShipAddress> <ads:ShipCity>Eugene</ads:ShipCity> <ads:ShipRegion>OR</ads:ShipRegion> <ads:ShipPostalCode>97403</ads:ShipPostalCode> <ads:ShipCountry>USA</ads:ShipCountry> </Order> </content> </entry> <!-- ... --> </feed>
The <entry> element (shown set bold) has a duplicate set of namespace declarations.
Bloating More Unprefixed Elements with Duplicate Group Namespace Declarations
Adding code to insert related Order_Detail elements, as shown below, results in namespace declaration duplication in the those elements.
Private Sub TransformOrderAndOrderDetails() Dim xdOrders As XDocument = XDocument.Load(strDataPath & "Orders.xml", _ LoadOptions.PreserveWhitespace) Dim xdDetails As XDocument = XDocument.Load(strDataPath & _ "Order_Details.xml", LoadOptions.PreserveWhitespace) Dim Orders As XDocument = _ <?xml version="1.0" encoding="utf-8" standalone="yes"?> <feed xmlns="http://www.w3.org/2005/Atom" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" <%= From o In xdOrders...<content> _ Where o...<ads:ShipCountry>.Value = "USA" _ Order By o...<ads:OrderDate>.Value Descending _ Select New XElement( _ <entry> <content type="application/xml"> <Order> <ads:OrderID adsm:type="Int32"> <%= o...<ads:OrderID>.Value %> </ads:OrderID> <!-- ... --> <ads:ShipCountry> <%= o...<ads:ShipCountry>.Value %> </ads:ShipCountry> <Order_Details <%= From d In xdDetails...<content> _ Where d...<ads:OrderID>.Value = o...<ads:OrderID>.Value _ Select New XElement( _ <Order_Detail> <ads:OrderID adsm:type="Int32"> <%= d...<ads:OrderID>.Value %> </ads:OrderID> <ads:ProductID adsm:type="Int32"> <%= d...<ads:ProductID>.Value %> </ads:ProductID> <ads:Quantity adsm:type="Short"> <%= d...<ads:Quantity>.Value %> </ads:Quantity> <ads:QuantityPerUnit> <%= p...<ads:QuantityPerUnit>.Value %> </ads:QuantityPerUnit> <ads:UnitPrice adsm:type="Decimal"> <%= d...<ads:UnitPrice>.Value %> </ads:UnitPrice> <ads:Discount adsm:type="Single"> <%= d...<ads:Discount>.Value %> </ads:Discount> </Order_Detail>) _ %>> </Order_Details> </Order> </content> </entry>) %>> </feed> End Sub
Each Order_Detail group has its own duplicate namespace declarations:
<feed xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns="http://www.w3.org/2005/Atom"> <entry xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns="http://www.w3.org/2005/Atom"> <content type="application/xml"> <Order> <ads:OrderID adsm:type="Int32">11006</ads:OrderID> <ads:OrderDate adsm:type="Nullable`1[System.DateTime]">2007年04月07日T00:00:00</ads:OrderDate> <ads:RequiredDate adsm:type="Nullable`1[System.DateTime]">2007年05月05日T00:00:00</ads:RequiredDate> <ads:ShippedDate adsm:type="Nullable`1[System.DateTime]">2007年04月15日T00:00:00</ads:ShippedDate> <ads:Freight adsm:type="Nullable`1[System.Decimal]">25.1900</ads:Freight> <ads:ShipName>Great Lakes Food Market</ads:ShipName> <ads:ShipAddress>2732 Baker Blvd.</ads:ShipAddress> <ads:ShipCity>Eugene</ads:ShipCity> <ads:ShipRegion>OR</ads:ShipRegion> <ads:ShipPostalCode>97403</ads:ShipPostalCode> <ads:ShipCountry>USA</ads:ShipCountry> <Order_Details> <Order_Detail xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns="http://www.w3.org/2005/Atom"> <ads:OrderID adsm:type="Int32">11006</ads:OrderID> <ads:ProductID adsm:type="Int32">1</ads:ProductID> <ads:Quantity adsm:type="Short">8</ads:Quantity> <ads:UnitPrice adsm:type="Decimal">18.0000</ads:UnitPrice> <ads:Discount adsm:type="Single">0</ads:Discount> </Order_Detail> <Order_Detail xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns="http://www.w3.org/2005/Atom"> <ads:OrderID adsm:type="Int32">11006</ads:OrderID> <ads:ProductID adsm:type="Int32">29</ads:ProductID> <ads:Quantity adsm:type="Short">2</ads:Quantity> <ads:UnitPrice adsm:type="Decimal">123.7900</ads:UnitPrice> <ads:Discount adsm:type="Single">0.25</ads:Discount> </Order_Detail> </Order_Details> </Order> </content> </entry> <!-- ... --> </feed>
Meaningless Empty Namespace Inserted with Functional Construction
C# 3.0 and VB 9.0 functional construction code doesn't repeat global namespace declarations, but it isn't immune to namespaces strangeness either. This C# code (and its corresponding VB port) adds the empty namespace shown below:
private void btnOrder_DetailsLookup_Click(object sender, EventArgs e) { XDocument xdOrders = XDocument.Load(strDataPath + "Orders.xml", LoadOptions.PreserveWhitespace); XDocument xdDetails = XDocument.Load(strDataPath + "Order_Details.xml", LoadOptions.PreserveWhitespace); XNamespace atom = "http://www.w3.org/2005/Atom"; XNamespace ads = "http://schemas.microsoft.com/ado/2007/08/dataweb"; XNamespace adsm = "http://schemas.microsoft.com/ado/2007/08/dataweb/metadata"; XDocument Orders = new XDocument(new XDeclaration("1.0", "utf-8", "yes"), new XElement(atom + "feed", new XAttribute("xmlns", "http://www.w3.org/2005/Atom"), new XAttribute(XNamespace.Xmlns + "ads", "http://schemas.microsoft.com/ado/2007/08/dataweb"), new XAttribute(XNamespace.Xmlns + "adsm", "http://schemas.microsoft.com/ado/2007/08/dataweb/metadata"), (from o in xdOrders.Descendants(atom + "content") where o.Element(ads + "ShipCountry").Value == "USA" orderby o.Element(ads + "OrderDate").Value descending select new XElement("entry", new XElement("content", new XAttribute("type", "application/xml"), new XElement("Order", new XElement(ads + "OrderID", o.Element(ads + "OrderID").Value, new XAttribute(adsm + "type", "int32")), <!-- ... --> new XElement(ads + "ShipCountry", o.Element(ads + "ShipCountry").Value), new XElement("Order_Details", from d in xdDetails.Descendants(atom + "content") where d.Element(ads + "OrderID").Value == o.Element(ads +"OrderID").Value select new XElement("Order_Detail", new XElement(ads + "OrderID", d.Element(ads + "OrderID").Value, new XAttribute(adsm + "type", "System.Int32")), new XElement(ads + "ProductID", d.Element(ads + "ProductID").Value, new XAttribute(adsm + "type", "System.Int32")), new XElement(ads + "Quantity", d.Element(ads + "Quantity").Value, new XAttribute(adsm + "type", "System.Int16")), new XElement(ads + "UnitPrice", d.Element(ads + "UnitPrice").Value, new XAttribute(adsm + "type", "System.Decimal")), new XElement(ads + "Discount", d.Element(ads + "Discount").Value, new XAttribute(adsm + "type", "System.Single"))))))))); }
The resulting Infoset's <entry> group has a spurious xmlns="" attribute:
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:ads="http://schemas.microsoft.com/ado/2007/08/dataweb" xmlns:adsm="http://schemas.microsoft.com/ado/2007/08/dataweb/metadata"> <entry xmlns=""> <content type="application/xml"> <Order> <ads:OrderID adsm:type="int32">11006</ads:OrderID> <ads:OrderDate adsm:type="Nullable`1[System.DateTime]">2007年04月07日T00:00:00</ads:OrderDate> <ads:RequiredDate adsm:type="Nullable`1[System.DateTime]">2007年05月05日T00:00:00</ads:RequiredDate> <ads:ShippedDate adsm:type="Nullable`1[System.DateTime]">2007年04月15日T00:00:00</ads:ShippedDate> <ads:Freight adsm:type="Nullable`1[System.Decimal]">25.1900</ads:Freight> <ads:ShipName>Great Lakes Food Market</ads:ShipName> <ads:ShipAddress>2732 Baker Blvd.</ads:ShipAddress> <ads:ShipCity>Eugene</ads:ShipCity> <ads:ShipRegion>OR</ads:ShipRegion> <ads:ShipPostalCode>97403</ads:ShipPostalCode> <ads:ShipCountry>USA</ads:ShipCountry> <Order_Details> <Order_Detail> <ads:OrderID adsm:type="System.Int32">11006</ads:OrderID> <ads:ProductID adsm:type="System.Int32">1</ads:ProductID> <ads:Quantity adsm:type="System.Int16">8</ads:Quantity> <ads:UnitPrice adsm:type="System.Decimal">18.0000</ads:UnitPrice> <ads:Discount adsm:type="System.Single">0</ads:Discount> </Order_Detail> <Order_Detail> <ads:OrderID adsm:type="System.Int32">11006</ads:OrderID> <ads:ProductID adsm:type="System.Int32">29</ads:ProductID> <ads:Quantity adsm:type="System.Int16">2</ads:Quantity> <ads:UnitPrice adsm:type="System.Decimal">123.7900</ads:UnitPrice> <ads:Discount adsm:type="System.Single">0.25</ads:Discount> </Order_Detail> </Order_Details> </Order> </content> </entry> <!-- ... --> </feed>
Update 1/23/2008: After further consideration, I've concluded that the spurious xmlns="" (empty namespace) declaration isn't benign. The source document at the top of this post has <entry> and <content> groups without a namespace declaration; these groups are within the Atom namespace. Therefore, there's no justification that I can see for removing them from the Atom namespace with xmlns="".
(削除) The xmlns="" attribute probably is benign or even useful because the inferred schema for the source document doesn't include Order, Order_Details, and Order_Detail groups. However, I didn't include the XML document, or reference or infer an XML schema in the C# project because C# doesn't support IntelliSense for functional construction. (削除ここまで)
VB MVP Bill McCarthy, who has written extensively about LINQ to XML, suggested in his comment to this post:
Last one first: the xmlns="" is required because you have a default namespace at the root of the document, then you have entry, content, Order, Order_Details and Order_Detail all without a namespace (or more correctly the empty namespace. If they were meant to be in the default namespace, and you are using the explicit XElement constructors, you need to supply that namespace. VB makes it easy if you use XML literals.
Bill is correct that you must "supply that namespace." Thanks, Bill.
But the only way I can find to supply it in my sample code is to add an expanded name to each element that precedes an element in a prefixed namespace, as shown in black in the following snippets:
select new XElement("{http://www.w3.org/2005/Atom}entry", // Following doesn't solve the xmlns="" issue; it throws a "The prefix '' cannot be redefined // from '' to 'http://www.w3.org/2005/Atom' within the same start element tag. //new XAttribute("xmlns", "http://www.w3.org/2005/Atom"), new XElement("{http://www.w3.org/2005/Atom}content", new XAttribute("type", "application/xml"), new XElement("{http://www.w3.org/2005/Atom}Order", new XElement(ads + "OrderID", o.Element(ads + "OrderID").Value, new XAttribute(adsm + "type", "int32")), new XElement("{http://www.w3.org/2005/Atom}Order_Details", from d in xdDetails.Descendants(atom + "content") where d.Element(ads + "OrderID").Value == o.Element(ads +"OrderID").Value select new XElement("{http://www.w3.org/2005/Atom}Order_Detail",
It's interesting that attempts to add the default namespace declaration with an attribute throw the runtime exception noted in the first comment. I don't recall seeing any documentation about this peculiar expanded name syntax requirement.
Note: For the record, Order, Order_Details and Order_Detail elements aren't in the actual Atom or ads namespace. They were inserted into the document for demonstration purposes.
Bill's comment also includes the following observation about the problem with the VB literal implementation:
As to the second [to] last example, the VB code has a superfluous New XElement that is stuffing you up I think.
I don't think that either New XElement statement is superfluous. The first is required for each Order group and the second is required for each Order_Detail group in the enumeration.
So I don't believe mystery is solved for VB's XML literal syntax.
Posted by Roger Jennings (--rj) at 9:30 AM 1 comments
Labels: LINQ, LINQ to XML, XQuery, XSLT
Tuesday, January 15, 2008
LINQ to XML: Grouping and Aggregation Gotchas, Part II
LINQ to XML: Grouping and Aggregation Gotchas, Part I of January 7, 2008 bemoaned the lack of running sum examples for business document generation with LINQ to XML. LINQ to XML's powerful grouping and aggregation features make it a logical replacement for inscrutable XSLT 1.0 or 2.0 transforms.
It's very difficult to write a narrative on applying LINQ to XML aggregation to XML Infosets generated with VB 9.0's XML literal data type. Non-trivial examples appear to be better suited to demonstrate these techniques. As Julie Lerman observed in her LINQ - Learn by example post of January 7, 2008:
I have often likened learning LINQ as well as learning Entity SQL to learning T-SQL. Learning T-SQL has, for me, been an evolution. I still have to look up the Cast & Convert topic 99% of the time in the docs whenever I need to convert a datetime to a desired char format.
Even if there were 1001 examples, I know that there will always be something I want to do that I can't find an example for and will have to rely on gaining a better understanding of LINQ in order to accomplish it.
It's a common practice to roll up totals within an individual business document, such as the count and sum of an order's or invoice's extended line items, then add sales taxes and prepaid freight charges, if applicable, and provide a total order or invoice amount. The next rollup(s) might be by territory, customer or both, following by grand totals and averages for the entire Infoset.
Following is an example of the last order from an Infoset generated from hierarchical Northwind Customer, Order, and Order_Detail entities as well as many:1 associations with related Employee, Shipper and Product entities with rollups by Customer and grand totals for the first 100 orders with rollups in blue:
<Customers> ... <Customer> <CompanyName CustomerID="WHITC">White Clover Markets</CompanyName> <City>Seattle</City> <Country>USA</Country> <Orders> <Order CustomerID="WHITC"> <OrderID>10344</OrderID> <SalesPerson EmployeeID="4">Margaret Peacock</SalesPerson> <OrderDate>11/1/1996</OrderDate> <RequiredDate>11/29/1996</RequiredDate> <ShippedDate>11/5/1996</ShippedDate> <ShipperName ShipperID="2">United Package</ShipperName> <ShipToName>White Clover Markets</ShipToName> <ShipToCountry>USA</ShipToCountry> <Details> <Order_Detail OrderID="10344"> <Quantity>35</Quantity> <ProductID>4</ProductID> <ProductName>Chef Anton's Cajun Seasoning</ProductName> <SKU>48 - 6 oz jars</SKU> <UnitPrice>17ドル.60</UnitPrice> <Discount>0.0 %</Discount> <Extended>616ドル.00</Extended> </Order_Detail> <Order_Detail OrderID="10344"> <Quantity>70</Quantity> <ProductID>8</ProductID> <ProductName>Northwoods Cranberry Sauce</ProductName> <SKU>12 - 12 oz jars</SKU> <UnitPrice>32ドル.00</UnitPrice> <Discount>25.0 %</Discount> <Extended>1,680ドル.00</Extended> </Order_Detail> </Details> <OrderItems>2</OrderItems> <OrderSubtotal>2,296ドル.00</OrderSubtotal> <PrepaidFreight>23ドル.29</PrepaidFreight> <OrderTotal>2,319ドル.29</OrderTotal> </Order> <Order CustomerID="WHITC"> <OrderID>10269</OrderID> <SalesPerson EmployeeID="5">Steven Buchanan</SalesPerson> <OrderDate>7/31/1996</OrderDate> <RequiredDate>8/14/1996</RequiredDate> <ShippedDate>8/9/1996</ShippedDate> <ShipperName ShipperID="1">Speedy Express</ShipperName> <ShipToName>White Clover Markets</ShipToName> <ShipToCountry>USA</ShipToCountry> <Details> <Order_Detail OrderID="10269"> <Quantity>60</Quantity> <ProductID>33</ProductID> <ProductName>Geitost</ProductName> <SKU>500 g</SKU> <UnitPrice>2ドル.00</UnitPrice> <Discount>5.0 %</Discount> <Extended>114ドル.00</Extended> </Order_Detail> <Order_Detail OrderID="10269"> <Quantity>20</Quantity> <ProductID>72</ProductID> <ProductName>Mozzarella di Giovanni</ProductName> <SKU>24 - 200 g pkgs.</SKU> <UnitPrice>27ドル.80</UnitPrice> <Discount>5.0 %</Discount> <Extended>528ドル.20</Extended> </Order_Detail> </Details> <OrderItems>2</OrderItems> <OrderSubtotal>642ドル.20</OrderSubtotal> <PrepaidFreight>4ドル.56</PrepaidFreight> <OrderTotal>646ドル.76</OrderTotal> </Order> </Orders> <CustomerOrders>2</CustomerOrders> <OrdersSubtotal>2,938ドル.20</OrdersSubtotal> <OrdersFreight>27ドル.85</OrdersFreight> <OrdersTotal>2,966ドル.05</OrdersTotal> </Customer> <GrandTotals> <TotalOrders>100</TotalOrders> <TotalNetSales>124,898ドル.40</TotalNetSales> <AverageNetSale>1,248ドル.98</AverageNetSale> <TotalFreight>5,746ドル.95</TotalFreight> <AverageFreight>57ドル.47</AverageFreight> <TotalGrossSales>130,645ドル.40</TotalGrossSales> </GrandTotals> </Customers>
Here's the code that created the preceding Infoset:
Private Sub OrdersByCustomer ' Create hierarchical Orders and Order Details by Customer Infoset using 1:many and ' many:1 associations with order summary and orders by customer summary aggregations Dim CustomerOrders = _ <Customers> <%= From c In CustomerList _ Select _ <Customer> <CompanyName CustomerID=<%= c.CustomerID %>> <%= c.CompanyName %> </CompanyName> <City><%= c.City %></City> <Country><%= c.Country %></Country> <Orders> <%= From o In c.Orders _ Order By o.OrderID Descending _ Select _ <Order CustomerID=<%= o.CustomerID %>> <OrderID><%= o.OrderID %></OrderID> <SalesPerson EmployeeID=<%= o.EmployeeID %>> <%= o.Employee.FirstName & " " & o.Employee.LastName %> </SalesPerson> <OrderDate> <%= o.OrderDate.Value.ToShortDateString() %> </OrderDate> <RequiredDate> <%= o.RequiredDate.Value.ToShortDateString() %> </RequiredDate> <ShippedDate> <%= o.ShippedDate.Value.ToShortDateString() %> </ShippedDate> <ShipperName ShipperID=<%= o.ShipVia %>> <%= o.Shipper.CompanyName %> </ShipperName> <ShipToName><%= o.ShipName %></ShipToName> <ShipToCountry><%= o.ShipCountry %></ShipToCountry> <Details> <%= From d In o.Order_Details _ Let s = CDec(d.Quantity) * CDec(d.UnitPrice) * _ (1 - CDec(d.Discount)) _ Select _ <Order_Detail OrderID=<%= d.OrderID %>> <Quantity><%= d.Quantity %></Quantity> <ProductID><%= d.ProductID %></ProductID> <ProductName> <%= d.Product.ProductName %> </ProductName> <SKU><%= d.Product.QuantityPerUnit %></SKU> <UnitPrice> <%= String.Format("{0:c}", d.UnitPrice) %> </UnitPrice> <Discount> <%= String.Format("{0:p1}", d.Discount) %> </Discount> <Extended> <%= String.Format("{0:c}", s) %> </Extended> </Order_Detail> %> </Details> <OrderItems><%= o.Order_Details.Count %></OrderItems> <OrderSubtotal> <%= String.Format("{0:c}", _ (From t In o.Order_Details _ Select CDec(t.Quantity) * _ CDec(t.UnitPrice) * _ (1 - CDec(t.Discount))).Sum()) %> </OrderSubtotal> <PrepaidFreight> <%= String.Format("{0:c}", o.Freight.Value) %> </PrepaidFreight> <OrderTotal> <%= String.Format("{0:c}", o.Freight + _ (From t In o.Order_Details _ Select CDec(t.Quantity) * CDec(t.UnitPrice) * _ (1 - CDec(t.Discount))).Sum()) %></OrderTotal> </Order> %> </Orders> <CustomerOrders><%= c.Orders.Count() %></CustomerOrders> <OrdersSubtotal> <%= String.Format("{0:c}", Aggregate o In c.Orders Into _ Sum(Aggregate d In o.Order_Details _ Into Sum(d.Quantity * d.UnitPrice * (1 - d.Discount)))) %> </OrdersSubtotal> <OrdersFreight> <%= String.Format("{0:c}", Aggregate f In c.Orders _ Into Sum(f.Freight)) %> </OrdersFreight> <OrdersTotal> <%= String.Format("{0:c}", (From f In c.Orders _ Select f.Freight).Sum() + Aggregate o In c.Orders _ Into Sum(Aggregate d In o.Order_Details _ Into Sum(d.Quantity * d.UnitPrice * (1 - d.Discount)))) %> </OrdersTotal> </Customer> %> <GrandTotals> <TotalOrders><%= OrderList.Count %></TotalOrders> <TotalNetSales> <%= String.Format("{0:c}", Aggregate d In Order_DetailList _ Into Sum(d.Quantity * d.UnitPrice * (1 - d.Discount))) %> </TotalNetSales> <AverageNetSale> <%= String.Format("{0:c}", Aggregate o In OrderList _ Into Average(Aggregate d In o.Order_Details _ Into Sum(d.Quantity * d.UnitPrice * (1 - d.Discount)))) %> </AverageNetSale> <TotalFreight> <%= String.Format("{0:c}", Aggregate o In OrderList _ Into Sum(o.Freight)) %> </TotalFreight> <AverageFreight> <%= String.Format("{0:c}", Aggregate o In OrderList _ Into Average(o.Freight)) %> </AverageFreight> <TotalGrossSales> <%= String.Format("{0:c}", (Aggregate o In OrderList _ Into Sum(o.Freight)) + Aggregate o In OrderList _ Into Sum(Aggregate d In o.Order_Details _ Into Sum(d.Quantity * d.UnitPrice * (1 - d.Discount)))) %> </TotalGrossSales> </GrandTotals> </Customers> End Sub
Notice the repetition of aggregation code in expressions for OrderSubtotal and OrderTotal, OrdersSubtotal and OrdersTotal, and TotalNetSales and TotalGrossSales elements. Let variables are scoped to their own expression hole, so they can't be used in successive calculations.
Despite the extra aggregation operations, performance is excellent. Writing the 4,614-element document from in-memory List<T> objects created by my LINQ In-Memory Object Generator (LIMOG) utility requires an average of only 84 ms. on my 2.8 (削除) MHz (削除ここまで) GHz dual-core development machine.
You can open in IE the complete 181-kB XML document from SkyDrive.
Posted by Roger Jennings (--rj) at 9:49 AM 3 comments
Labels: Aggregate Functions, LINQ, LINQ to XML, XPath, XQuery, XSLT