My console app needs to read several big CSV files to update its data, which are stored in a SQLite database. I am new to C# and would like to know if there are ways to improve this sample code I wrote to simulate the app flow.
What the code does:
The DataRequester
asks a DataLoader
object for data. For simplicity, here there is just one DataLoader
object.
The DataLoader
check if fileExists. If not, it triggers an event. The event handler is in the DataRequester
object. If yes, it loads it asyncrously, parse it then triggers an event. The handler is in the DataRequester
object and will store the data in the database.
My doubts are about:
- The way I implemented the async await pattern
- The way I used
Task.Delay().Wait()
- If I should expect better performance using a
ParallelFor
instead than the simplefor
loop but the bottleneck would be the disk - Should I limit the created threads?
Note: The following code has been checked in Visual Studio Code with .Net Core 1.1.1
using System;
using System.Threading.Tasks;
using System.Collections.Generic;
using static System.Console;
namespace tasks
{
public class Program
{
static void Main(string[] args)
{
WriteLine("** Program Starts");
WriteLine("** Update Data");
var requester = new DataRequester();
requester.UpdateData();
WriteLine("** Program Ends");
Console.ReadLine();
}
}
public class DataEventArgs: EventArgs
{
public string FileName {get; set;}
public List<string> Data {get; set;}
public DataEventArgs(string fileName, List<string> data)
{
FileName = fileName;
Data = data;
}
}
public class DataRequester
{
int numberOfCSVFiles = 1000;
int completed = 0;
public void UpdateData()
{
// The real code loads fileNames from a CSV File
var fileNames = new string[numberOfCSVFiles];
var loader = new DataLoader(fileNames);
loader.OnDataLoaded += HandleDataLoaded;
loader.OnDataNotFound += HandleDataNotFound;
loader.RequestData();
}
public void HandleDataLoaded(object sender, DataEventArgs args)
{
completed++;
float percentage = ((float) completed/ (float) numberOfCSVFiles)*100;
WriteLine($"Received the data for file: {args.FileName}. Request Completion: {percentage.ToString("n2")}%");
// Update SQLite Database ...
}
public void HandleDataNotFound(object sender, DataEventArgs args)
{
completed++;
float percentage = ((float) completed/ (float) numberOfCSVFiles)*100;
WriteLine($"Data not found for file: {args.FileName}. Request Completion: {percentage.ToString("n2")}%");
}
}
public class DataLoader
{
private static readonly Random random = new Random();
public event EventHandler<DataEventArgs> OnDataLoaded;
public event EventHandler<DataEventArgs> OnDataNotFound;
public string[] FileNames {get; set;}
public DataLoader(string[] fileNames){FileNames = fileNames;}
private bool CheckFileName(string fileName)
{
var fileExists = (random.Next(10)<9)?true:false;
return fileExists;
}
private DataEventArgs LoadData(string fileName)
{
// Simulate data loading
int waiting;
lock(random)
{
waiting = random.Next(10)*1000;
}
Task.Delay(waiting).Wait();
return new DataEventArgs(fileName, new List<string>());
}
private Task<DataEventArgs> LoadDataAsync(string fileName)
{
return Task.Run<DataEventArgs>(() =>
{
return LoadData(fileName);
});
}
public async void RequestData()
{
var tasks = new List<Task>();
for(int i=0; i<FileNames.Length; i++)
{
var fileName = String.Concat("file #",(i+1).ToString());
if(CheckFileName(fileName))
{
var task = LoadDataAsync(fileName).ContinueWith((t) =>
{
OnDataLoaded?.Invoke(this, t.Result);
});
tasks.Add(task);
}
else
{
var args = new DataEventArgs(fileName, new List<string>());
OnDataNotFound?.Invoke(this, args);
}
}
await Task.WhenAll(tasks);
WriteLine("Request completed.");
}
}
}
1 Answer 1
Event mistakes
You're not using events correctly.
public class DataEventArgs : EventArgs { public string FileName { get; set; } public List<string> Data { get; set; } public DataEventArgs(string fileName, List<string> data) { FileName = fileName; Data = data; } }
The properties should be read-only.
public event EventHandler<DataEventArgs> OnDataLoaded; public event EventHandler<DataEventArgs> OnDataNotFound;
Events do not have the On
prefix. You use it only for methods that raise a particular event like:
private void OnDataLoaded(..) => DataLoaded(this, ..);
private DataEventArgs LoadData(string fileName)
or
private Task<DataEventArgs> LoadDataAsync(string fileName)
We do not return EventArgs
from methods (unless it's a factory method). You should return the actual result or fire the event there.
public void HandleDataLoaded(object sender, DataEventArgs args)
and
public void HandleDataNotFound(object sender, DataEventArgs args)
These event handlers should not be public
and their names are by convention wrong. It should be e.g.
public void loader_DataNotFound(object sender, DataEventArgs args)
public void loader_DataNotFound(object sender, DataEventArgs args)
Also by convention the EventArgs
parameter is usually named just e
. args
is the parameter you get from Main
.
async mistakes
You're not using async
correctly and you did it exactly the wrong way around.
private Task<DataEventArgs> LoadDataAsync(string fileName)
This method should be decorated with async
but it isn't whereas
public async void RequestData()
this one should not. async
needs to be used with a return type. If it's void
then you should use just Task
. You may/should use void
only for async
event handlers.
Naming mistakes
The classes and method names are not optimal e.g. the DataLoader
has a method called RequestData
and the DataRequester
a method called UpdateData
.
If I saw them for the first time I expect the DataLoader
to have a LoadData
method and the DataRequester
a RequestData
method. If it's updating something then it should be an Updater
or otherwise it's really very confusing.
Explore related questions
See similar questions with these tags.
private static readonly Random random = new Random();
and feel free tolock
on it around its usages if you're worried about concurrency. \$\endgroup\$private static int
inside theUpdateData
method? This is not a valid C# code and thus unfortunatelly off-topic. \$\endgroup\$