I have the code below that takes SQL data and exports it to XML or CSV. The XML is processed at a good speed, but the CSV sometimes takes 1 hour depending on how many data rows it exports. My question: is my code writing the CSV effectively, or can it be improved for speed? The table has around 40 columns and a variable number of rows, which can be over 10k.
public static void SqlExtract(this string queryStatement, string xFilePath, string fileName)
{
    string connectionString = @"Data Source=ipaddress; Initial Catalog=name; User ID=username; Password=password";
    using (SqlConnection _con = new SqlConnection(connectionString))
    {
        using (SqlCommand _cmd = new SqlCommand(queryStatement, _con))
        {
            using (SqlDataAdapter _dap = new SqlDataAdapter(_cmd))
            {
                if (fileName == "item1" || fileName == "item2" || fileName == "item3")
                {
                    DataSet ds = new DataSet("FItem");
                    _con.Open();
                    _dap.Fill(ds);
                    _con.Close();

                    FileStream fs = new FileStream(xFilePath, FileMode.Create, FileAccess.Write, FileShare.None);
                    StreamWriter writer = new StreamWriter(fs, Encoding.UTF8);
                    ds.WriteXml(writer, XmlWriteMode.IgnoreSchema);
                    fs.Close();

                    StringWriter sw = new StringWriter();
                    ds.WriteXml(sw, XmlWriteMode.IgnoreSchema);
                    string OutputXML = sw.ToString();
                    OutputXML = OutputXML.Replace("Table", "Item");
                    System.IO.File.WriteAllText(xFilePath, OutputXML);
                }
                else
                {
                    DataTable table1 = new DataTable("Table1");
                    _con.Open();
                    _dap.Fill(table1);
                    _con.Close();

                    string exportCSV = string.Empty;
                    foreach (DataRow row in table1.Rows)
                    {
                        int i = 1;
                        foreach (DataColumn column in table1.Columns)
                        {
                            if (row[1].ToString() == "002" && i > 41 || row[1].ToString() == "END" && i > 4)
                            {
                                //do nothing
                            }
                            else
                            {
                                if (i > 1)
                                {
                                    exportCSV += ";";
                                }
                                exportCSV += row[column.ColumnName].ToString();
                            }
                            i++;
                        }
                        exportCSV += "\r\n";
                    }
                    //Write CSV
                    File.WriteAllText(xFilePath, exportCSV.ToString());
                }
            }
        }
    }
}
- also, you may look into turning this into an async Task, utilizing `await _con.OpenAsync().ConfigureAwait(false);` and `_cmd.ExecuteReaderAsync` – yob, Feb 25, 2020
3 Answers
Your biggest performance problem is the use of string concatenation via `exportCSV +=`. Each time such a line is executed, a new string object is created, which takes time. By using a `StringBuilder` like so:
StringBuilder exportCSV = new StringBuilder(1024);
foreach (DataRow row in table1.Rows)
{
    int i = 1;
    foreach (DataColumn column in table1.Columns)
    {
        if (row[1].ToString() == "002" && i > 41 || row[1].ToString() == "END" && i > 4)
        {
            //do nothing
        }
        else
        {
            if (i > 1)
            {
                exportCSV.Append(";");
            }
            exportCSV.Append(row[column.ColumnName].ToString());
        }
        i++;
    }
    exportCSV.AppendLine();
}
the performance will get a lot better.
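To make the difference concrete, here's a minimal, hypothetical micro-benchmark using `System.Diagnostics.Stopwatch` (the row/column counts are made up to roughly match the question; even at 1,000 rows the gap is dramatic, and at 10k+ rows the `+=` version gets far worse):

using System;
using System.Diagnostics;
using System.Text;

static class ConcatBenchmark
{
    static void Main()
    {
        const int rows = 1000, cols = 40;

        var sw = Stopwatch.StartNew();
        string concat = string.Empty;
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                concat += "value;"; // allocates and copies a brand-new string every time
            }
            concat += "\r\n";
        }
        sw.Stop();
        Console.WriteLine($"string +=     : {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        var sb = new StringBuilder();
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                sb.Append("value;"); // appends into a growing internal buffer
            }
            sb.AppendLine();
        }
        string built = sb.ToString();
        sw.Stop();
        Console.WriteLine($"StringBuilder : {sw.ElapsedMilliseconds} ms");
    }
}

Concatenation is quadratic: every `+=` copies the entire string built so far, while `StringBuilder` appends into a buffer and only materializes the string once at the end.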
The XML part should be rewritten like so:

DataSet ds = new DataSet("FItem");
_con.Open();
_dap.Fill(ds);
_con.Close();

StringWriter sw = new StringWriter();
ds.WriteXml(sw, XmlWriteMode.IgnoreSchema);
string outputXML = sw.ToString().Replace("Table", "Item");
System.IO.File.WriteAllText(xFilePath, outputXML);
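As an aside, the `Replace` call can be avoided entirely: `SqlDataAdapter.Fill(DataSet)` names the filled table `Table` by default, and `WriteXml` uses `DataTable.TableName` for the row elements, so renaming the table up front produces `<Item>` directly. A minimal sketch, assuming (as the original `Replace("Table", "Item")` implies) that renaming those elements is all that's needed:

DataSet ds = new DataSet("FItem");
_dap.Fill(ds); // Fill opens and closes the connection itself if it is closed
ds.Tables[0].TableName = "Item"; // row elements now serialize as <Item> instead of <Table>
using (var writer = new StreamWriter(xFilePath, false, Encoding.UTF8))
{
    ds.WriteXml(writer, XmlWriteMode.IgnoreSchema);
}

This also avoids accidentally replacing the text "Table" anywhere it might occur inside the data itself.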
- thanks so much for the info, it greatly improved performance; now it's under a minute. Learned something new today, thanks a lot. – QuickSilver, Feb 24, 2020
- Do you have any tips for improving speed on the XML part? – QuickSilver, Feb 24, 2020
- sorry, what do you mean I'm writing it twice? If I leave out one or the other, I will not be able to access the XML; I'll get a "failed to parse document" error. – QuickSilver, Feb 24, 2020
- You are using a `StreamWriter` to write to a `FileStream`, and later on you use a `StringWriter` and use its content to call `File.WriteAllText()`, which overwrites the formerly written file. See: docs.microsoft.com/en-us/dotnet/api/… – Heslacher, Feb 24, 2020
- I need to do that to replace the tag name: when I write the first time, the tag is "Table" by default (I think), but I need it to be "Item". – QuickSilver, Feb 24, 2020
Some quick remarks:
- Don't start variable names with underscores unless they're class-wide `private` ones.
- Local variables should be camelCased. `OutputXML` doesn't follow that rule.
- Split your code into smaller methods that don't intermix data retrieval and file writing. Your method does multiple things, depending on some (odd) logic. Even its name doesn't make sense: it's called `SqlExtract`, yet it generates CSV or XML files depending on seemingly arbitrary conditions.
- Don't hardcode your connection string. Put it in a .config file and access it via the `ConfigurationManager` (a sketch follows below).
- Why `_con.Open();` and `_con.Close();`? The `using` block takes care of disposal, and `SqlDataAdapter.Fill` opens and closes the connection itself if it isn't already open.
- Don't loop through `table1.Rows`. Instead, have a method that transforms a `DataRow` into a CSV line and use that to construct an `IEnumerable<string>`, which you can then use in combination with `string.Join()`. Something like `var csvContents = string.Join(Environment.NewLine, table1.Rows.Cast<DataRow>().Select(x => DataRowToCsvLine(x)));`. That's one line (instead of 20) which expresses far better what your code does (and yes, the `DataRowToCsvLine` method is also a couple of lines, but it doesn't pollute the main logic; a sketch follows this list).
- `row[1].ToString() == "END"`: are you sure it is always going to be `"END"`? Or is it possible it can be `"end"`? Consider using `string.Equals()` instead, which allows for case-insensitive comparisons. Same for `fileName == "item1"` etc.
- Properly dispose of the `StringWriter` with a `using` block. Example: https://social.msdn.microsoft.com/Forums/vstudio/en-US/7d949d5c-a41c-4dc7-bbcb-429761f851d1/is-calling-flush-and-close-necesary-for-stringwriter-and-xmltextwriter?forum=netfxbcl
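A minimal sketch of the `DataRowToCsvLine` helper mentioned above (the semicolon delimiter is carried over from the original code; CSV quoting/escaping is omitted, just as in the original):

using System.Data;
using System.Linq;

internal static class CsvHelpers
{
    // Joins every field in the row with the delimiter used by the original code.
    public static string DataRowToCsvLine(DataRow row)
    {
        return string.Join(";", row.ItemArray.Select(field => field?.ToString()));
    }
}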
None of the above might improve the speed of your code, but it will make it much more maintainable and consistent.
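One more sketch, for the connection-string remark above; the entry name `ExportDb` is made up for illustration:

// Requires a reference to System.Configuration, plus an App.config entry like:
//   <connectionStrings>
//     <add name="ExportDb" connectionString="Data Source=...; Initial Catalog=...; ..." />
//   </connectionStrings>
using System.Configuration;

string connectionString =
    ConfigurationManager.ConnectionStrings["ExportDb"].ConnectionString;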
- Thanks for the tips, I will amend these mistakes. As for the last part of the comments: I can guarantee that END will be uppercase. Same with the item name. – QuickSilver, Feb 24, 2020
Something along these lines, where your "entry point" is the extension method `DataTable.SqlToFile()` (see `DoWork` below for a usage example). I'd suggest breaking your main logic of "serializing" a DataTable as XML/CSV into separate methods: `SqlToFile` (saves the "serialized" DataTable into a file), `SqlExtract` (extracts the DataTable rows as XML or CSV), and two separate methods, `SqlAsXML()` and `SqlAsCSV()`, to get the XML and CSV strings respectively. To serialize into XML I'm using `GetXml()` (or you can use `WriteXml()` as you did). To serialize into CSV, I use a `StringBuilder`, iterating over `Rows` and "saving" each row as a delimiter-separated string.

In addition, data access logic is implemented as its own class.
public static class SQLExtensions
{
    public static bool SqlToFile(this System.Data.DataTable dt, string xFilePath, string fileName)
    {
        try
        {
            string result = dt.SqlExtract(fileName);
            System.IO.File.WriteAllText(xFilePath, result);
            return true;
        }
        catch (System.IO.IOException)
        {
            return false;
        }
        catch (Exception)
        {
            return false;
        }
    }

    public static string SqlExtract(this System.Data.DataTable dt, string fileName)
    {
        string result = String.Empty;
        var xmlfiles = new[] { "item1", "item2", "item3" };
        if (xmlfiles.Contains(fileName))
        {
            result = dt.SqlAsXML(fileName);
        }
        else
        {
            string delimiter = ";";
            result = dt.FilterData().SqlAsCSV(delimiter);
        }
        return result;
    }

    public static string SqlAsXML(this System.Data.DataTable dt, string fileName)
    {
        var ds = new System.Data.DataSet();
        ds.Tables.Add(dt);
        string xml = ds.GetXml();
        return xml;
    }

    #region CSV related

    // Note: this filters out whole rows by row index; the original code skipped
    // trailing *columns* within a row instead.
    public static IEnumerable<System.Data.DataRow> FilterData(this System.Data.DataTable dtIns)
    {
        var list = dtIns
            .Rows.Cast<System.Data.DataRow>()
            .Where(r => !((r[1].ToString() == "002" && dtIns.Rows.IndexOf(r) > 41)
                       || (r[1].ToString() == "END" && dtIns.Rows.IndexOf(r) > 4)));
        return list;
    }

    public static string SqlAsCSV(this System.Data.DataTable dt, string delimiter = ",")
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < dt.Rows.Count; i++) // < Count, not Count - 1, so the last row isn't dropped
        {
            System.Data.DataRow dr = dt.Rows[i];
            sb.AppendLine(dr.RowDataAsCSV(delimiter));
        }
        return sb.ToString();
    }

    public static string SqlAsCSV(this IEnumerable<System.Data.DataRow> rows, string delimiter = ",")
    {
        StringBuilder sb = new StringBuilder();
        foreach (var row in rows)
        {
            sb.AppendLine(row.RowDataAsCSV(delimiter));
        }
        return sb.ToString();
    }

    public static string RowDataAsCSV(this System.Data.DataRow row, string delimiter = ",")
    {
        string rowdata = String.Join(delimiter, row.ItemArray.Select(f => f.ToString()));
        return rowdata;
    }

    #endregion
}
public class DataAccess
{
    public async Task<System.Data.DataTable> DataAsync(string connectionInfo, string commandSQL)
    {
        System.Data.DataTable dt = new System.Data.DataTable("data");
        using (System.Data.SqlClient.SqlConnection connection = new System.Data.SqlClient.SqlConnection(connectionInfo))
        {
            try
            {
                await connection.OpenAsync().ConfigureAwait(false);
            }
            catch (InvalidOperationException)
            {
                return null;
            }
            catch (System.Data.SqlClient.SqlException)
            {
                return null;
            }

            using (System.Data.SqlClient.SqlCommand command = new System.Data.SqlClient.SqlCommand(commandSQL, connection))
            {
                using (var r = await command.ExecuteReaderAsync(System.Data.CommandBehavior.SequentialAccess).ConfigureAwait(false))
                {
                    dt.Load(r);
                }
            }
            connection.Close();
        }
        return dt;
    }
}
public class MainMainMain
{
    public async Task DoWork(string commandSQL, string xFilePath, string fileName)
    {
        string connectionString = @"Data Source=ipaddress; Initial Catalog=name; User ID=username; Password=password";
        var da = new DataAccess();
        var dt = await da.DataAsync(connectionString, commandSQL).ConfigureAwait(false);
        if (dt != null)
        {
            bool converted = dt.SqlToFile(xFilePath, fileName);
        }
        else
        {
            //oh-ho
        }
    }
}
Furthermore, you could add a common interface and have the XML and CSV serializers as implementations; a sketch follows.
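A minimal sketch of that idea (the interface and class names are made up for illustration):

using System.Data;
using System.Linq;
using System.Text;

// Hypothetical interface; the names are illustrative only.
public interface ITableSerializer
{
    string Serialize(DataTable dt);
}

public class XmlTableSerializer : ITableSerializer
{
    public string Serialize(DataTable dt)
    {
        // Assumes dt is not already attached to another DataSet.
        var ds = new DataSet();
        ds.Tables.Add(dt);
        return ds.GetXml();
    }
}

public class CsvTableSerializer : ITableSerializer
{
    private readonly string delimiter;

    public CsvTableSerializer(string delimiter = ",")
    {
        this.delimiter = delimiter;
    }

    public string Serialize(DataTable dt)
    {
        var sb = new StringBuilder();
        foreach (DataRow row in dt.Rows)
        {
            sb.AppendLine(string.Join(delimiter, row.ItemArray.Select(f => f.ToString())));
        }
        return sb.ToString();
    }
}

The caller then picks a serializer based on `fileName` and simply writes `serializer.Serialize(dt)` to the target file.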
- Could you add why you made the changes you did? It's not much help to the OP if you just modify their code without an explanation of why they should change it. – user211881, Feb 25, 2020
- good point, will do – yob, Feb 25, 2020
- @yob Can you let me know whether making these changes will increase performance? At the moment, with all the changes made, it takes around 2 minutes for 10k rows, which is not bad considering it originally took around 15 minutes for CSV. Are you able to provide more info on why this is a better approach? Thanks. – QuickSilver, Feb 25, 2020
- @QuickSilver – I tried an export to CSV selecting 8 columns and 2 columns from a table containing 49,700 rows; the timings were ~23 seconds and ~13 seconds. – yob, Feb 25, 2020
- @QuickSilver – two points: 1) check whether data "extraction" is the bottleneck, and 2) remove the `if (row[1].ToString() == "002" && i > 41 || row[1].ToString() == "END" && i > 4)` check from within the loop, and instead use a `DataView` with `RowFilter` (sketch below). – yob, Feb 25, 2020
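A minimal sketch of that last suggestion. Note that `RowFilter` works on column *names*, so the name `RecordType` for the second column is made up, and that `RowFilter` drops whole rows, which only matches the original intent if entire rows (rather than trailing columns) were meant to be skipped:

// Hypothetical: assumes the second column is named "RecordType".
var view = new System.Data.DataView(table1)
{
    RowFilter = "RecordType <> '002' AND RecordType <> 'END'"
};
System.Data.DataTable filtered = view.ToTable(); // filter applied once, outside the write loop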