What is the proper way to save all lines from a text file to objects? I have a .txt file that looks something like this:
0001Marcus Aurelius 20021122160 21311
0002William Shakespeare 19940822332 11092
0003Albert Camus 20010715180 01232
I know the position of each field in this file, and all the data is formatted with fixed widths:
Line number is from 0 to 3
Book author is from 4 to 30
Publish date is from 31 to 37
Page num. is from 38 to 43
Book code is from 44 to 49
I made a class Data, which holds information about the start position, end position, value, and error.
Then I made a class Line that holds a list of Data and a list of all errors found in that line. After loading the data from a line into Data objects, I loop through lineError and add the errors from every line to a list, because I need to save the errors from each line to a database.
My question is: is this a proper way to load data from a file into objects and, after processing it, save it to a database? Can you advise a better approach?
public class Data
{
    public int startPosition = 0;
    public int endPosition = 0;
    public object value = null;
    public string fieldName = "";
    public Error error = null;

    public Data(int start, int end, string name)
    {
        this.startPosition = start;
        this.endPosition = end;
        this.fieldName = name;
    }

    public void SetValueFromLine(string line)
    {
        // The end position is inclusive (e.g. 0 to 3 is four characters), hence the + 1.
        string valueFromLine = line.Substring(this.startPosition, this.endPosition - this.startPosition + 1);
        // if/else statement that checks the validity of the data (length, empty value)
        this.value = valueFromLine;
    }
}
public class Line
{
    public List<Data> lineData = new List<Data>();
    public List<Error> lineError = new List<Error>();

    public Line()
    {
        AddObjectDataToList();
    }

    public void AddObjectDataToList()
    {
        lineData.Add(new Data(0, 3, "lineNumber"));
        lineData.Add(new Data(4, 30, "bookAuthor"));
        lineData.Add(new Data(31, 37, "publishData"));
        lineData.Add(new Data(38, 43, "pageNumber"));
        lineData.Add(new Data(44, 49, "bookCode"));
    }

    public void LoadLineDataToObjects(string line)
    {
        foreach (Data s in lineData)
        {
            s.SetValueFromLine(line);
        }
    }

    public void GetAllErrorFromData()
    {
        foreach (Data s in lineData)
        {
            if (s.error != null)
            {
                lineError.Add(s.error);
            }
        }
    }
}
public class File
{
    public string fileName;
    public List<Line> lines = new List<Line>();
}
- You may want to research serialization. If it has been saved to a DB, though, why do you need the text form anymore? Please read How to Ask and take the tour. – Ňɏssa Pøngjǣrdenlarp, Jan 20, 2018 at 18:39
- So your question is actually how to parse the text file to a database? – Rafael, Jan 20, 2018 at 18:50
- No, my question is what the best approach is to save data from the file to objects. Once all lines from the file are saved to objects, I need to validate the data, and it is easier to loop through all the data from a line and check, for example, whether the author, book code, etc. exist in my database. If a line contains data that is not in my database, I need to skip saving that line. I have no problem saving data to the database; that works fine. I only need advice on whether this model is good for loading the data from one line into objects and checking whether some of the data exists. – TJacken, Jan 20, 2018 at 19:00
- Are you re-inventing msdn.microsoft.com/en-us/library/… – Mark Schultheiss, Jan 20, 2018 at 19:14
- No, thanks for this, it seems interesting, but my file is not CSV. I do not have a delimiter. As you can see in my example above, the line number and the author name are joined together. I only know the position of each field in the file, and that position is constant. – TJacken, Jan 20, 2018 at 19:21
1 Answer
I assume that the focus is on using OOP. I also assume that parsing is a secondary task, so I will not consider options for implementing it.
First of all, we need to identify the main object. Strange as it may seem, it is not a Book but the line itself (e.g. DataLine). Initially, I wanted to create a Book from a string (through a separate constructor), but that would be a mistake.
What actions should DataLine be able to perform? In fact, only one: process. I see two acceptable options for this method:
1. process returns a Book or throws exceptions (Book process()).
2. process returns nothing but interacts with another object (void process(IResults result)).
The first option has the following drawbacks:
- It is difficult to test (although this also applies to the second option). All validation is hidden inside DataLine.
- It is impossible, or at least difficult, to return several errors.
- The program is meant to work with incorrect data, so expected exceptions would be thrown often. This goes against the purpose of exceptions. There is also a small concern about slower performance.
The second option does not have the last two drawbacks. IResults can contain the methods error(...), to report several errors, and success(Book book).
The testability of the process method can be improved significantly by adding IValidator. This object could be passed as a parameter to the DataLine constructor, but that is not entirely correct. First, it is an unnecessary memory expense that gives no tangible benefit. Second, it does not match the essence of the DataLine class: DataLine represents only a line that can be processed in one particular way. Thus, a good solution is void process(IValidator validator, IResults result).
To summarize the above (the code may contain syntax errors):
interface IResults {
    void error(string message);
    void success(Book book);
}

interface IValidator {
    // just an example
    bool checkBookCode(string bookCode);
}

class DataLine {
    private readonly string _rawData;

    public DataLine(string rawData) {
        _rawData = rawData;
    }

    public void process(IValidator validator, IResults result) {
        // parse _rawData into its fields here (bookCode, ...)
        bool isValid = true; // just an example! maybe better to add IResults.hasErrors()
        if (!validator.checkBookCode(bookCode)) {
            result.error("Bad book code");
            isValid = false;
        }
        if (isValid) {
            result.success(new Book(...));
            // or even result.success(...); to avoid coupling with Book
        }
    }
}
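To make the sketch above concrete, here is a minimal self-contained version. It assumes the inclusive column positions from the question (author at 4 to 30, book code at 44 to 49). The shape of Book, the CollectingResults and KnownCodesValidator classes, and the Slice helper are hypothetical names introduced only for illustration; a real validator would presumably query the database instead of an in-memory set.

```csharp
using System;
using System.Collections.Generic;

// Assumed minimal shape of Book; the real class is not shown in the question.
public class Book
{
    public string Author;
    public string BookCode;
}

public interface IResults
{
    void error(string message);
    void success(Book book);
}

public interface IValidator
{
    bool checkBookCode(string bookCode);
}

// Hypothetical IResults that collects everything reported for one line.
public class CollectingResults : IResults
{
    public readonly List<string> Errors = new List<string>();
    public Book Book;

    public void error(string message) { Errors.Add(message); }
    public void success(Book book) { Book = book; }
}

// Hypothetical validator backed by an in-memory set instead of a database.
public class KnownCodesValidator : IValidator
{
    private readonly HashSet<string> _codes;

    public KnownCodesValidator(IEnumerable<string> codes) { _codes = new HashSet<string>(codes); }

    public bool checkBookCode(string bookCode) { return _codes.Contains(bookCode); }
}

public class DataLine
{
    private readonly string _rawData;

    public DataLine(string rawData) { _rawData = rawData; }

    // Inclusive start/end positions. A field cut short by the end of the line is
    // returned as far as it goes; null means the line ends before the field starts.
    private string Slice(int start, int end)
    {
        if (_rawData == null || start >= _rawData.Length) return null;
        int length = Math.Min(end, _rawData.Length - 1) - start + 1;
        return _rawData.Substring(start, length).Trim();
    }

    public void process(IValidator validator, IResults result)
    {
        string author = Slice(4, 30);
        string bookCode = Slice(44, 49);
        bool isValid = true;

        if (string.IsNullOrEmpty(author))
        {
            result.error("Missing author");
            isValid = false;
        }
        if (bookCode == null || !validator.checkBookCode(bookCode))
        {
            result.error("Bad book code");
            isValid = false;
        }
        if (isValid)
        {
            result.success(new Book { Author = author, BookCode = bookCode });
        }
    }
}
```

A caller would create one CollectingResults per line, call process, and then either save the resulting Book or store the collected Errors for that line.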
The next step is to model the file with its lines. Here again there are many options and nuances, but I would like to draw attention to IEnumerable<DataLine>. Ideally, we would create a DataLines class that implements IEnumerable<DataLine> and can be loaded from a file or from an IEnumerable<string>. However, this approach is relatively complex and redundant; it makes sense only in large projects. A much simpler version:
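using System.Collections.Generic;
using System.IO;
using System.Linq;

interface DataLinesProvider {
    IEnumerable<DataLine> Lines();
}

class DataLinesFile : DataLinesProvider {
    private readonly string _fileName;

    public DataLinesFile(string fileName) {
        _fileName = fileName;
    }

    public IEnumerable<DataLine> Lines() {
        // File.ReadAllLines loads the whole file into memory;
        // File.ReadLines would read it lazily instead.
        return File
            .ReadAllLines(_fileName)
            .Select(x => new DataLine(x));
    }
}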
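The same interface can also be fed from an IEnumerable<string>, as mentioned above, which is convenient for tests. A minimal sketch: DataLinesInMemory is a hypothetical name, and DataLine is reduced here to a stub that only stores the raw text.

```csharp
using System.Collections.Generic;
using System.Linq;

// Stub standing in for the DataLine class discussed above.
public class DataLine
{
    public readonly string RawData;

    public DataLine(string rawData) { RawData = rawData; }
}

public interface DataLinesProvider
{
    IEnumerable<DataLine> Lines();
}

// Hypothetical provider that wraps any sequence of strings.
public class DataLinesInMemory : DataLinesProvider
{
    private readonly IEnumerable<string> _source;

    public DataLinesInMemory(IEnumerable<string> source) { _source = source; }

    public IEnumerable<DataLine> Lines()
    {
        // Select is lazy, so DataLine objects are created only as the caller enumerates.
        return _source.Select(x => new DataLine(x));
    }
}
```

Code that consumes a DataLinesProvider does not need to know whether the lines come from disk or from memory.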
You can improve the code endlessly and keep introducing new abstractions, but you should be guided by common sense and the specific problem at hand.
P.S. Sorry for the "strange" English. Google does not always translate such complex topics correctly.
4 Comments
DataLine just stores a single line and validates it only on demand). If you are working with really big files, then you will need to break them into chunks, and maybe even add some bulk validation requests to the DB.