I am trying to implement a CSV data plotting program in C#. For the plotting part, I use ScottPlot.NET. The given CSV file has five columns, so I use Channel1 to Channel5 to represent these five channels. The following program plots only the first channel's data. The CSV file has millions of rows, and I am wondering how to improve performance when drawing millions of points.
The experimental implementation
using ScottPlot.WinForms;
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;

namespace PlottingApp
{
    public partial class Form1 : Form
    {
        private readonly ScottPlot.WinForms.FormsPlot formsPlot1 = new ScottPlot.WinForms.FormsPlot();
        private ScottPlot.Plottables.DataLogger Logger1;

        public Form1()
        {
            InitializeComponent();
        }

        private void button_add_channel_Click(object sender, EventArgs e)
        {
            OpenFileDialog openFileDialog = new OpenFileDialog();
            // Set the title and filter for the dialog
            openFileDialog.Title = "Open File";
            openFileDialog.Filter = "CSV files (*.csv)|*.csv|Text files (*.txt)|*.txt|All files (*.*)|*.*";
            // Display the dialog and check if the user clicked OK
            if (openFileDialog.ShowDialog() == DialogResult.OK)
            {
                // Get the selected file name
                string fileName = openFileDialog.FileName;
                var contents = File.ReadAllText(fileName);
                string[] lines = contents.Split(
                    new string[] { Environment.NewLine },
                    StringSplitOptions.None
                );
                foreach (var item in lines)
                {
                    string[] eachData = item.Split(
                        new string[] { "," },
                        StringSplitOptions.None
                    );
                    Channels channels = new Channels();
                    if (!double.TryParse(eachData[0], out channels.Channel1))
                    {
                        continue;
                    }
                    if (!double.TryParse(eachData[1], out channels.Channel2))
                    {
                        continue;
                    }
                    if (!double.TryParse(eachData[2], out channels.Channel3))
                    {
                        continue;
                    }
                    if (!double.TryParse(eachData[3], out channels.Channel4))
                    {
                        continue;
                    }
                    if (!double.TryParse(eachData[4], out channels.Channel5))
                    {
                        continue;
                    }
                    Logger1.Add(channels.Channel1);
                    formsPlot1.Refresh();
                }
            }
            else
            {
                Console.WriteLine("No file selected.");
            }
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            panel1.Controls.Add(formsPlot1);
            formsPlot1.Width = panel1.Width;
            formsPlot1.Height = panel1.Height;
            // create loggers and add them to the plot
            Logger1 = formsPlot1.Plot.Add.DataLogger();
            Logger1.LineWidth = 3;
        }
    }
}
Channels class
public class Channels
{
    public double Channel1;
    public double Channel2;
    public double Channel3;
    public double Channel4;
    public double Channel5;

    public override string ToString()
    {
        return Channel1.ToString() + '\t' + Channel2.ToString() + '\t' + Channel3.ToString() + '\t' + Channel4.ToString() + '\t' + Channel5.ToString();
    }
}
2 Answers
I am wondering how to improve the performance for the case of drawing millions of points.
Example of how to read and write a CSV file
Be sure to read the C# documentation on FileStream. It has ways of dealing with large files.
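As a minimal sketch of the large-file point, assuming only the first column is needed for plotting: `File.ReadLines` enumerates the file lazily, so the full text is never held in memory the way it is with `File.ReadAllText` followed by `Split`. The names `CsvColumnReader` and `ReadFirstColumn` are hypothetical, and unlike the original code this sketch parses with the invariant culture and does not require the other four columns to be numeric.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

public static class CsvColumnReader
{
    // Reads the file one line at a time and returns the parsed first column.
    // Rows whose first field is not a number (a header or a blank line) are skipped.
    public static List<double> ReadFirstColumn(string path)
    {
        var values = new List<double>();
        foreach (string line in File.ReadLines(path))
        {
            // Take the text before the first comma without splitting the whole row.
            int comma = line.IndexOf(',');
            string first = comma < 0 ? line : line.Substring(0, comma);
            if (double.TryParse(first, NumberStyles.Float,
                                CultureInfo.InvariantCulture, out double value))
            {
                values.Add(value);
            }
        }
        return values;
    }
}
```

Because only one line is alive at a time, peak memory is dominated by the resulting list of doubles rather than by the file's text.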
ToString Override
Excellent. Thus the object's "instance information" fits naturally into the app's OO design and the .NET framework. For example, Console.WriteLine implicitly calls ToString:
Console.WriteLine(myChannelObject);
And if you create a custom collection class, its ToString can look like this:
StringBuilder me = new StringBuilder();
foreach (var record in this.CSVrecords)
    me.AppendLine(record.ToString());
return me.ToString();
Encapsulate Object Instantiation
Constructor parameters tell you what is required for a new object. Otherwise you have to read the source code and hope you didn't miss anything; a constructor catches that kind of mistake easily.
OpenFileDialog openFileDialog = new OpenFileDialog(
"Open File",
"CSV files (*.csv)|*.csv|Text files (*.txt)|*.txt|All files (*.*)|*.*"
);
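Note that `OpenFileDialog` itself exposes only a parameterless constructor, so the call above does not compile as written. One way to get the same effect, sketched here with a hypothetical `FileDialogs.Create` helper, is a small factory method that takes the required settings as parameters:

```csharp
using System.Windows.Forms;

public static class FileDialogs
{
    // Hypothetical helper: the required settings are parameters,
    // so a caller cannot forget to supply them.
    public static OpenFileDialog Create(string title, string filter)
    {
        return new OpenFileDialog
        {
            Title = title,
            Filter = filter
        };
    }
}
```

The handler would then call `FileDialogs.Create("Open File", "CSV files (*.csv)|*.csv|All files (*.*)|*.*")` instead of setting the properties one by one.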
CSV Collection Class
Drastically simplify and encapsulate collection manipulation. For example, take advantage of built-in List<T> functionality to add, sort, and filter records.
MSDN says not to inherit from List<T>, so "have a" List<T> instead. This hides the extensive List<T> public members, and you control the class's API: this is a "Domain Specific Language" in action.
If client code needs to iterate the collection itself, or even a filtered subset, .NET collections have iterators (an iterator is an object). An iterator enables foreach in the client code.
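A minimal sketch of such a "has a" collection might look like the following. The class name `RecordCollection<T>` and its `Filter` method are assumptions made for illustration; the answer does not prescribe a particular API.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Text;

// Hypothetical collection that wraps ("has a") List<T> instead of inheriting from it.
public class RecordCollection<T> : IEnumerable<T>
{
    private readonly List<T> records = new List<T>();

    public int Count => records.Count;

    public void Add(T record) => records.Add(record);

    // Returns a new collection holding only the matching records;
    // the inner list itself is never exposed to callers.
    public RecordCollection<T> Filter(Func<T, bool> predicate)
    {
        var result = new RecordCollection<T>();
        foreach (T record in records)
        {
            if (predicate(record))
            {
                result.Add(record);
            }
        }
        return result;
    }

    // Implementing IEnumerable<T> is what enables foreach in client code.
    public IEnumerator<T> GetEnumerator() => records.GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    // One record per line, using each record's own ToString.
    public override string ToString()
    {
        var text = new StringBuilder();
        foreach (T record in records)
        {
            text.Append(record).AppendLine();
        }
        return text.ToString();
    }
}
```

Only `Add`, `Filter`, `Count`, and enumeration are visible to callers, so the rest of the `List<T>` surface stays hidden.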
CSV Record
There are five column in the given CSV file
And those columns are named "channel1", etc.?
You will want record-objects to enable/facilitate collection filtering, sorting, testing or enforcing uniqueness, etc.
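Because CSV is a well-defined format, you don't need elaborate manipulation to parse out column values. Every row accounts for all columns, even "blank" values.
public class CSVrecord {
    public string channel1 { get; protected set; }
    public string channel2 { get; protected set; }
    public string channel3 { get; protected set; }
    public string channel4 { get; protected set; }
    public string channel5 { get; protected set; }

    // Splits one raw CSV line into its five fields.
    public CSVrecord(string aRecord) {
        string[] fields = aRecord.Split(
            new string[] { "," },
            StringSplitOptions.None
        );
        channel1 = fields[0];
        channel2 = fields[1];
        channel3 = fields[2];
        channel4 = fields[3];
        channel5 = fields[4];
    }

    // Writes the record back out as one quoted CSV line.
    public override string ToString() {
        return string.Format("\"{0}\",\"{1}\",\"{2}\",\"{3}\",\"{4}\"",
            channel1, channel2, channel3, channel4, channel5);
    }
}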
CLIENT CODE
That massive if block all but disappears.
CSVrecords records = new CSVrecords();
// reading file stream redacted
foreach (var record in rawRecords)
    records.Add(new CSVrecord(record));
Console.WriteLine(records);
- Thank you for answering. How about the plotting part? The critical performance issue is at drawing progress. – JimmyHu, Jun 7, 2024 at 14:00
- Need more info. I don't understand the channel class & plot-types. What kind of graphs? Each channel a separate graph? A million data points for ONE graph? Do you need 10^6 points for curve fitting? Is the plot going to cover a wall? Seriously, For paper, 11.5x8.5 or A4 very few points are needed. And generally this is true anyway. A visual graph is not a precision instrument. It's a data summary, as a practical matter – radarbob, Jun 8, 2024 at 4:57
- Thank you for answering. In the point of "Encapsulate Object Instantiation", it seems there is no OpenFileDialog constructor which takes that two parameters. – JimmyHu, Jun 11, 2024 at 11:18
- String.Format isn't necessary. you can just use $"{channel1},{channel2},{channel3},{channel4},{channel5}"; – Short Int, Jul 25, 2025 at 21:25
I suggest using CsvHelper instead of doing it yourself. This would save time and effort, as CsvHelper is memory-optimized and covers most cases that you might encounter with CSV serialization and deserialization. It also supports mapping to models out of the box.
Second, I see you're refreshing the form on each iteration, which means you're trying to show the changes in real time. If so, you need to convert your work into an asynchronous operation; this avoids user-interface hangs (and has other advantages as well).
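A rough sketch of the asynchronous approach, under the assumption that a live, point-by-point display is not actually required: parse the file on a thread-pool thread, then add the points and redraw once on the UI thread. `ChannelLoader` and `LoadFirstChannelAsync` are hypothetical names; the handler in the trailing comment reuses `Logger1` and `formsPlot1` from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Threading.Tasks;

public static class ChannelLoader
{
    // Parses the first column on a thread-pool thread so the UI thread stays responsive.
    public static Task<double[]> LoadFirstChannelAsync(string path)
    {
        return Task.Run(() =>
        {
            var values = new List<double>();
            foreach (string line in File.ReadLines(path))
            {
                string[] fields = line.Split(',');
                // Skip rows whose first field is not numeric (header, blank line).
                if (double.TryParse(fields[0], NumberStyles.Float,
                                    CultureInfo.InvariantCulture, out double value))
                {
                    values.Add(value);
                }
            }
            return values.ToArray();
        });
    }
}

// Inside the form, the click handler could then be shaped like this (sketch):
//
// private async void button_add_channel_Click(object sender, EventArgs e)
// {
//     // file dialog code as in the question, producing fileName
//     double[] channel1 = await ChannelLoader.LoadFirstChannelAsync(fileName);
//     foreach (double y in channel1)
//         Logger1.Add(y);
//     formsPlot1.Refresh();   // a single redraw after all points are added
// }
```

Moving `Refresh` out of the loop matters as much as the `await`: the original code redraws the whole plot once per row, so the total drawing work grows with the square of the row count.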
- Sounds like a good idea. "Mapping to models" sounds interesting but I'm not certain the model (channel class) is optimal for feeding the plotting function. – radarbob, Jun 8, 2024 at 5:03
- Mapping models is just one feature out of many. You can configure your mapping with either Attributes or ClassMap. There are many configurations and features that will make working with CSV much easier to maintain. – iSR5, Jun 8, 2024 at 12:34