I need to write 5 crores records with 72 columns into text file, the file size is growing as 9.7gb .
I need to check each and every columm length need to format as according to the length as defined in XML file.
Reading records from oracle one by one and checking the format and writing into text file.
To write 5 crores records it is taking more than 24 hours. How can I increase the performance in the below code?
Dim valString As String = Nothing
Dim valName As String = Nothing
Dim valLength As String = Nothing
Dim valDataType As String = Nothing
Dim validationsArray As ArrayList = GetValidations(Directory.GetCurrentDirectory() + "\ReportFormat.xml")
Console.WriteLine("passed xml")
Dim k As Integer = 1
Try
Console.WriteLine(System.DateTime.Now())
Dim selectSql As String = "select * from table where
" record_date >= To_Date('01-01-2014','DD-MM-YYYY') and record_date <= To_Date('31-12-2014','DD-MM-YYYY')"
Dim dataTable As New DataTable
Dim oracleAccess As New OracleConnection(System.Configuration.ConfigurationManager.AppSettings("OracleConnection"))
Dim cmd As New OracleCommand()
cmd.Connection = oracleAccess
cmd.CommandType = CommandType.Text
cmd.CommandText = selectSql
oracleAccess.Open()
Dim Tablecolumns As New DataTable()
Using oracleAccess
Using writer = New StreamWriter(Directory.GetCurrentDirectory() + "\FileName.txt")
Using odr As OracleDataReader = cmd.ExecuteReader()
Dim sbHeaderData As New StringBuilder
For i As Integer = 0 To odr.FieldCount - 1
sbHeaderData.Append(odr.GetName(i))
sbHeaderData.Append("|")
Next
writer.WriteLine(sbHeaderData)
While odr.Read()
Dim sbColumnData As New StringBuilder
Dim values(odr.FieldCount - 1) As Object
Dim fieldCount As Integer = odr.GetValues(values)
For i As Integer = 0 To fieldCount - 1
Dim vals As Array = validationsArray(i).ToString.ToUpper.Split("|")
valName = vals(0).trim
valDataType = vals(1).trim
valLength = vals(2).trim
Select Case valDataType
Case "VARCHAR2"
If values(i).ToString().Length = valLength Then
sbColumnData.Append(values(i).ToString())
'sbColumnData.Append("|")
ElseIf values(i).ToString().Length > valLength Then
sbColumnData.Append(values(i).ToString().Substring(0, valLength))
'sbColumnData.Append("|")
Else
sbColumnData.Append(values(i).ToString().PadRight(valLength))
'sbColumnData.Append("|")
End If
Case "NUMERIC"
valLength = valLength.Substring(0, valLength.IndexOf(","))
If values(i).ToString().Length = valLength Then
sbColumnData.Append(values(i).ToString())
'sbColumnData.Append("|")
Else
sbColumnData.Append(values(i).ToString().PadLeft(valLength, "0"c))
'sbColumnData.Append("|")
End If
'sbColumnData.Append((values(i).ToString()))
End Select
Next
writer.WriteLine(sbColumnData)
k = k + 1
Console.WriteLine(k)
End While
End Using
writer.WriteLine(System.DateTime.Now())
End Using
End Using
Console.WriteLine(System.DateTime.Now())
'Dim Adpt As New OracleDataAdapter(selectSql, oracleAccess)
'Adpt.Fill(dataTable)
Return Tablecolumns
Catch ex As Exception
Console.WriteLine(System.DateTime.Now())
Console.WriteLine("Error: " & ex.Message)
Console.ReadLine()
Return Nothing
End Try
-
\$\begingroup\$ Writing to the console is very slow. If odr contains a lot of rows, you might see improvement by removing the console.writeline \$\endgroup\$the_lotus– the_lotus2015年09月21日 15:34:31 +00:00Commented Sep 21, 2015 at 15:34
1 Answer 1
So, what is happening here
Select Case valDataType
Case "VARCHAR2"
If values(i).ToString().Length = valLength Then
sbColumnData.Append(values(i).ToString())
'sbColumnData.Append("|")
ElseIf values(i).ToString().Length > valLength Then
sbColumnData.Append(values(i).ToString().Substring(0, valLength))
'sbColumnData.Append("|")
Else
sbColumnData.Append(values(i).ToString().PadRight(valLength))
'sbColumnData.Append("|")
End If
Case "NUMERIC"
valLength = valLength.Substring(0, valLength.IndexOf(","))
If values(i).ToString().Length = valLength Then
sbColumnData.Append(values(i).ToString())
'sbColumnData.Append("|")
Else
sbColumnData.Append(values(i).ToString().PadLeft(valLength, "0"c))
'sbColumnData.Append("|")
End If
'sbColumnData.Append((values(i).ToString()))
End Select
if values(i).ToString().Length
is < valLength
? You are calling 3 times .ToString()
on the object. A much faster and better way would be to just do it once like so
Dim currentValue As String = values(i).ToString()
Select Case valDataType
Case "VARCHAR2"
If vcurrentValue.Length = valLength Then
sbColumnData.Append(currentValue)
ElseIf currentValue.Length > valLength Then
sbColumnData.Append(currentValue.Substring(0, valLength))
Else
sbColumnData.Append(currentValue.PadRight(valLength))
End If
Case "NUMERIC"
valLength = valLength.Substring(0, valLength.IndexOf(","))
If currentValue.Length = valLength Then
sbColumnData.Append(currentValue)
Else
sbColumnData.Append(currentValue.PadLeft(valLength, "0"c))
End If
End Select
I don't like this
Dim vals As Array = validationsArray(i).ToString.ToUpper.Split("|") valName = vals(0).trim valDataType = vals(1).trim valLength = vals(2).trim
for a couple of reasons. First you are creating a string array out of this ArrayList
(however this is created) and then store it in an Array
so casting it to an object to cast it again to a String
on the next lines.
But then you use the valLength String
's to compare with Length
which is an Integer
.
The valName
variable is never used, so you can just remove it along with the commented code. Code which isn't used or is commented is just dead code which should be removed to increase readability.
Another thing bothering me is the use of abbreviations for variables names. You won't gain any performancy increase by doing so, but you loose a lot of readability.
Dim validationValues As String() = validationsArray(i).ToString.ToUpper.Split("|")
Dim valueDataType As String = validationValues(1).Trim()
Dim valueLength As String = validationValues(1).Trim()
Dim currentValue As String = values(i).ToString()
Select Case valDataType
Case "VARCHAR2"
Dim length As Integer = Convert.ToInt32(valueLength)
If vcurrentValue.Length = length Then
sbColumnData.Append(currentValue)
ElseIf currentValue.Length > length Then
sbColumnData.Append(currentValue.Substring(0, length))
Else
sbColumnData.Append(currentValue.PadRight(length))
End If
Case "NUMERIC"
Dim length As Integer = Convert.ToInt32(valueLength.Substring(0, valueLength.IndexOf(",")))
If currentValue.Length = length Then
sbColumnData.Append(currentValue)
Else
sbColumnData.Append(currentValue.PadLeft(length, "0"c))
End If
End Select
Another thing which is slowing down the performance is that you do this
Dim vals As Array = validationsArray(i).ToString.ToUpper.Split("|") valName = vals(0).trim valDataType = vals(1).trim valLength = vals(2).trim
for each datarow and each column. You should store these values once an reuse the values for each other datarow.
k = k + 1 Console.WriteLine(k)
This doesn't serve any real purpose but will slow down your code. Get rid of it.
You are using using
statements to enclose objects which implements IDisposable
which is good, but why don't you be consistent ? The OracleCommand
also implements IDisposable
but there is no using.
Dim Tablecolumns As New DataTable()
this isn't used anywhere except for the return value of the method. You can simply change it to a Sub
and remove this variable.
Doing a select * from table where...
won't help any in the meaning of increasing performance. You only need columns which are either VARCHAR2
or NUMERIC
so you should only add these columns to your select query which meets this requirement. There is for instance no use for any Date
field like record_date
. You don't need to retrieve this column to restrict the query to the value of it.
-
\$\begingroup\$ Heslacher the code is running very fast and it came to very less time i.e, 2days to 2 hrs . Thank you very much for quick help \$\endgroup\$user96888– user968882015年09月15日 13:00:33 +00:00Commented Sep 15, 2015 at 13:00
Explore related questions
See similar questions with these tags.