This is the largest help request I've ever written. My usual process is to keep narrowing a problem down until I have the smallest piece of code that reproduces the error.
I have downloaded the attributes_active csv table from https://www.ffiec.gov/npw/FinancialReport/DataDownload .
I'm trying to load that into a table in SQL Server, but it won't import directly. (It eventually needs to happen automatically, so I need to work out the Python side anyway.)
Interesting note: I can import this file directly into an MS Access database, which produces an error table. If I tell Access that the double-quote character is the text qualifier, it imports with no errors.
So I used Access to produce the error table, and it tells me, for example, that row 53 is a problem (it's actually line 54 if you count the header).
The first few lines of the Access import errors look like this:
Error Field Row
Type Conversion Failure ORG_TYPE_CD 53
Type Conversion Failure STATE_CD 53
Type Conversion Failure ID_FDIC_CERT 53
Type Conversion Failure SLHC_TYPE_IND 53
Type Conversion Failure CNTRY_INC_CD 53
Type Conversion Failure ORG_TYPE_CD 56
Type Conversion Failure STATE_CD 56
Type Conversion Failure ID_FDIC_CERT 56
Type Conversion Failure SLHC_TYPE_IND 56
Type Conversion Failure CNTRY_INC_CD 56
Type Conversion Failure ORG_TYPE_CD 523
Type Conversion Failure STATE_CD 523
Type Conversion Failure ID_FDIC_CERT 523
Type Conversion Failure SLHC_TYPE_IND 523
Type Conversion Failure CNTRY_INC_CD 523
Type Conversion Failure ID_FDIC_CERT 610
Type Conversion Failure SLHC_TYPE_IND 610
Type Conversion Failure CNTRY_INC_CD 610
Type Conversion Failure ORG_TYPE_CD 714
Type Conversion Failure STATE_CD 714
Type Conversion Failure ID_FDIC_CERT 714
Type Conversion Failure SLHC_TYPE_IND 714
Type Conversion Failure CNTRY_INC_CD 714
Type Conversion Failure ORG_TYPE_CD 759
Type Conversion Failure STATE_CD 759
Type Conversion Failure ID_FDIC_CERT 759
Type Conversion Failure SLHC_TYPE_IND 759
Type Conversion Failure CNTRY_INC_CD 759
Type Conversion Failure ORG_TYPE_CD 796
Type Conversion Failure STATE_CD 796
Type Conversion Failure ID_FDIC_CERT 796
Type Conversion Failure SLHC_TYPE_IND 796
Type Conversion Failure CNTRY_INC_CD 796
The problem is that when I look at the raw text of those lines and those fields, there is nothing odd to distinguish them from the same fields in the adjacent lines. I do think it's significant that I get no import errors when I specifically tell Access to use the double quote as the text qualifier, but I'm telling pandas the same thing!
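For reference, this is the kind of raw-text check I mean: dump just the flagged fields with repr() so trailing spaces, stray quote characters, or non-printing characters become visible (a sketch; it assumes the file name used in the code below and that Access's row numbers count data rows starting at 1).

import csv

# Print the Access-flagged fields, with repr(), for the flagged rows and their
# neighbors, so hidden differences show up.
flagged = ['ORG_TYPE_CD', 'STATE_CD', 'ID_FDIC_CERT', 'SLHC_TYPE_IND', 'CNTRY_INC_CD']

with open('CSV_ATTRIBUTES_ACTIVE.CSV', newline='') as f:
    for row_num, row in enumerate(csv.DictReader(f), start=1):
        if row_num in (52, 53, 54, 55, 56, 57):
            print(row_num, {k: repr(row[k]) for k in flagged})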
I'm using the following code to try to read this into SQL Server:
def ingest_npw_attributes_active():
    filename = 'CSV_ATTRIBUTES_ACTIVE.CSV'
    full_filename = os.path.join(raw_data_dir, filename)
    print(full_filename)
    dt = {'AUTH_REG_DIST_FRS': int,
          'STREET_LINE2': "string",
          'ID_THRIFT': "string",
          'ID_TAX': int
          }
    init_connection()
    df = pd.read_csv(full_filename, dtype=dt, header=0, quotechar='"')
    place_holder = "?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,"
    place_holder += "?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?"
    sql_str = "exec dbo.save_attributes_active " + place_holder
    # build assignment string for sql
    values = "("
    for col_name in df.columns:
        values += "row['" + col_name + "'], "
    values = values[:-1] + ")"
    for index, row in df.iterrows():
        if index < 2:
            print(row)
        values = (row['#ID_RSSD'], row['D_DT_START'], row['D_DT_END'], row['BHC_IND'], row['BROAD_REG_CD'],
                  row['CHTR_AUTH_CD'], row['CHTR_TYPE_CD'], row['FBO_4C9_IND'], row['FHC_IND'], row['FUNC_REG'],
                  row['INSUR_PRI_CD'], row['MBR_FHLBS_IND'], row['MBR_FRS_IND'], row['SEC_RPTG_STATUS'],
                  row['EST_TYPE_CD'], row['BANK_CNT'], row['BNK_TYPE_ANALYS_CD'], row['D_DT_EXIST_CMNC'],
                  row['D_DT_EXIST_TERM'], row['FISC_YREND_MMDD'], row['D_DT_INSUR'], row['D_DT_OPEN'],
                  row['FNCL_SUB_HOLDER'], row['FNCL_SUB_IND'], row['IBA_GRNDFTHR_IND'], row['IBF_IND'],
                  row['ID_RSSD_HD_OFF'], row['MJR_OWN_MNRTY'], row['NM_LGL'], row['NM_SHORT'], row['NM_SRCH_CD'],
                  row['ORG_TYPE_CD'], row['REASON_TERM_CD'], row['CNSRVTR_CD'], row['ENTITY_TYPE'],
                  row['AUTH_REG_DIST_FRS'], row['ACT_PRIM_CD'], row['CITY'], row['CNTRY_NM'], row['ID_CUSIP'],
                  row['STATE_ABBR_NM'], row['PLACE_CD'], row['STATE_CD'], row['STATE_HOME_CD'], row['STREET_LINE1'],
                  row['STREET_LINE2'], row['ZIP_CD'], row['ID_THRIFT'], row['ID_THRIFT_HC'], row['DOMESTIC_IND'],
                  row['ID_ABA_PRIM'], row['ID_FDIC_CERT'], row['ID_NCUA'], row['COUNTY_CD'], row['DIST_FRS'],
                  row['ID_OCC'], row['CNTRY_CD'], row['DT_END'], row['DT_EXIST_CMNC'], row['DT_EXIST_TERM'],
                  row['DT_INSUR'], row['DT_OPEN'], row['DT_START'], row['ID_TAX'], row['PROV_REGION'], row['URL'],
                  row['SLHC_IND'], row['SLHC_TYPE_IND'], row['PRIM_FED_REG'], row['STATE_INC_CD'], row['CNTRY_INC_CD'],
                  row['STATE_INC_ABBR_NM'], row['CNTRY_INC_NM'], row['ID_LEI'], row['IHC_IND'])
        # print(values)
        return_key = cursor.execute(sql_str, values).fetchval()
        print('return_key =', return_key)
    conn.close()
    return
I have the SQL Server table defined as follows:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[npwAttributesActive]
(
[id] [bigint] IDENTITY(1,1) NOT NULL,
[ID_RSSD] [int] NOT NULL,
[D_DT_START] [nvarchar](50) NULL,
[D_DT_END] [nvarchar](50) NULL,
[BHC_IND] [int] NULL,
[BROAD_REG_CD] [int] NULL,
[CHTR_AUTH_CD] [int] NULL,
[CHTR_TYPE_CD] [int] NULL,
[FBO_4C9_IND] [int] NULL,
[FHC_IND] [int] NULL,
[FUNC_REG] [int] NULL,
[INSUR_PRI_CD] [int] NULL,
[MBR_FHLBS_IND] [int] NULL,
[MBR_FRS_IND] [int] NULL,
[SEC_RPTG_STATUS] [int] NULL,
[EST_TYPE_CD] [int] NULL,
[BANK_CNT] [nvarchar](1) NULL,
[BNK_TYPE_ANALYS_CD] [int] NULL,
[D_DT_EXIST_CMNC] [nvarchar](50) NULL,
[D_DT_EXIST_TERM] [nvarchar](50) NULL,
[FISC_YREND_MMDD] [int] NULL,
[D_DT_INSUR] [nvarchar](50) NULL,
[D_DT_OPEN] [nvarchar](50) NULL,
[FNCL_SUB_HOLDER] [int] NULL,
[FNCL_SUB_IND] [int] NULL,
[IBA_GRNDFTHR_IND] [int] NULL,
[IBF_IND] [int] NULL,
[ID_RSSD_HD_OFF] [int] NULL,
[MJR_OWN_MNRTY] [int] NULL,
[NM_LGL] [nvarchar](150) NULL,
[NM_SHORT] [nvarchar](50) NULL,
[NM_SRCH_CD] [int] NULL,
[ORG_TYPE_CD] [int] NULL,
[REASON_TERM_CD] [int] NULL,
[CNSRVTR_CD] [int] NULL,
[ENTITY_TYPE] [nvarchar](50) NULL,
[AUTH_REG_DIST_FRS] [int] NULL,
[ACT_PRIM_CD] [nvarchar](50) NULL,
[CITY] [nvarchar](50) NULL,
[CNTRY_NM] [nvarchar](50) NULL,
[ID_CUSIP] [nvarchar](50) NULL,
[STATE_ABBR_NM] [nvarchar](50) NULL,
[PLACE_CD] [int] NULL,
[STATE_CD] [int] NULL,
[STATE_HOME_CD] [int] NULL,
[STREET_LINE1] [nvarchar](50) NULL,
[STREET_LINE2] [nvarchar](50) NULL,
[ZIP_CD] [nvarchar](50) NULL,
[ID_THRIFT] [int] NULL,
[ID_THRIFT_HC] [nvarchar](50) NULL,
[DOMESTIC_IND] [nvarchar](50) NULL,
[ID_ABA_PRIM] [int] NULL,
[ID_FDIC_CERT] [int] NULL,
[ID_NCUA] [int] NULL,
[COUNTY_CD] [int] NULL,
[DIST_FRS] [int] NULL,
[ID_OCC] [int] NULL,
[CNTRY_CD] [int] NULL,
[DT_END] [int] NULL,
[DT_EXIST_CMNC] [int] NULL,
[DT_EXIST_TERM] [int] NULL,
[DT_INSUR] [int] NULL,
[DT_OPEN] [int] NULL,
[DT_START] [int] NULL,
[ID_TAX] [int] NULL,
[PROV_REGION] [nvarchar](50) NULL,
[URL] [nvarchar](50) NULL,
[SLHC_IND] [int] NULL,
[SLHC_TYPE_IND] [int] NULL,
[PRIM_FED_REG] [nvarchar](50) NULL,
[STATE_INC_CD] [int] NULL,
[CNTRY_INC_CD] [int] NULL,
[STATE_INC_ABBR_NM] [nvarchar](50) NULL,
[CNTRY_INC_NM] [nvarchar](50) NULL,
[ID_LEI] [nvarchar](50) NULL,
[IHC_IND] [int] NULL
) ON [PRIMARY]
The stored procedure is
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[save_attributes_active]
@ID_RSSD int,
@D_DT_START nvarchar(50),
@D_DT_END nvarchar(50),
@BHC_IND int,
@BROAD_REG_CD int,
@CHTR_AUTH_CD int,
@CHTR_TYPE_CD int,
@FBO_4C9_IND int,
@FHC_IND int,
@FUNC_REG int,
@INSUR_PRI_CD int,
@MBR_FHLBS_IND int,
@MBR_FRS_IND int,
@SEC_RPTG_STATUS int,
@EST_TYPE_CD int,
@BANK_CNT nvarchar(1),
@BNK_TYPE_ANALYS_CD int,
@D_DT_EXIST_CMNC nvarchar(50),
@D_DT_EXIST_TERM nvarchar(50),
@FISC_YREND_MMDD int,
@D_DT_INSUR nvarchar(50),
@D_DT_OPEN nvarchar(50),
@FNCL_SUB_HOLDER int,
@FNCL_SUB_IND int,
@IBA_GRNDFTHR_IND int,
@IBF_IND int,
@ID_RSSD_HD_OFF int,
@MJR_OWN_MNRTY int,
@NM_LGL nvarchar(150),
@NM_SHORT nvarchar(50),
@NM_SRCH_CD int,
@ORG_TYPE_CD int,
@REASON_TERM_CD int,
@CNSRVTR_CD int,
@ENTITY_TYPE nvarchar(50),
@AUTH_REG_DIST_FRS int,
@ACT_PRIM_CD nvarchar(50),
@CITY nvarchar(50),
@CNTRY_NM nvarchar(50),
@ID_CUSIP nvarchar(50),
@STATE_ABBR_NM nvarchar(50),
@PLACE_CD int,
@STATE_CD int,
@STATE_HOME_CD int,
@STREET_LINE1 nvarchar(50),
@STREET_LINE2 nvarchar(50),
@ZIP_CD nvarchar(50),
@ID_THRIFT int,
@ID_THRIFT_HC nvarchar(50),
@DOMESTIC_IND nvarchar(50),
@ID_ABA_PRIM int,
@ID_FDIC_CERT int,
@ID_NCUA int,
@COUNTY_CD int,
@DIST_FRS int,
@ID_OCC int,
@CNTRY_CD int,
@DT_END int,
@DT_EXIST_CMNC int,
@DT_EXIST_TERM int,
@DT_INSUR int,
@DT_OPEN int,
@DT_START int,
@ID_TAX int,
@PROV_REGION nvarchar(50),
@URL nvarchar(50),
@SLHC_IND int,
@SLHC_TYPE_IND int,
@PRIM_FED_REG nvarchar(50),
@STATE_INC_CD int,
@CNTRY_INC_CD int,
@STATE_INC_ABBR_NM nvarchar(50),
@CNTRY_INC_NM nvarchar(50),
@ID_LEI nvarchar(50),
@IHC_IND int
AS
BEGIN
SET NOCOUNT ON;
INSERT INTO dbo.npwAttributesActive ([ID_RSSD], [D_DT_START], [D_DT_END], [BHC_IND], [BROAD_REG_CD], [CHTR_AUTH_CD], [CHTR_TYPE_CD], [FBO_4C9_IND], [FHC_IND], [FUNC_REG], [INSUR_PRI_CD],
[MBR_FHLBS_IND], [MBR_FRS_IND], [SEC_RPTG_STATUS], [EST_TYPE_CD], [BANK_CNT], [BNK_TYPE_ANALYS_CD], [D_DT_EXIST_CMNC], [D_DT_EXIST_TERM], [FISC_YREND_MMDD], [D_DT_INSUR], [D_DT_OPEN],
[FNCL_SUB_HOLDER], [FNCL_SUB_IND], [IBA_GRNDFTHR_IND], [IBF_IND], [ID_RSSD_HD_OFF], [MJR_OWN_MNRTY], [NM_LGL], [NM_SHORT], [NM_SRCH_CD], [ORG_TYPE_CD], [REASON_TERM_CD], [CNSRVTR_CD],
[ENTITY_TYPE], [AUTH_REG_DIST_FRS], [ACT_PRIM_CD], [CITY], [CNTRY_NM], [ID_CUSIP], [STATE_ABBR_NM], [PLACE_CD], [STATE_CD], [STATE_HOME_CD], [STREET_LINE1], [STREET_LINE2], [ZIP_CD],
[ID_THRIFT], [ID_THRIFT_HC], [DOMESTIC_IND], [ID_ABA_PRIM], [ID_FDIC_CERT], [ID_NCUA], [COUNTY_CD], [DIST_FRS], [ID_OCC], [CNTRY_CD], [DT_END], [DT_EXIST_CMNC], [DT_EXIST_TERM], [DT_INSUR],
[DT_OPEN], [DT_START], [ID_TAX], [PROV_REGION], [URL], [SLHC_IND], [SLHC_TYPE_IND], [PRIM_FED_REG], [STATE_INC_CD], [CNTRY_INC_CD],
[STATE_INC_ABBR_NM], [CNTRY_INC_NM], [ID_LEI], [IHC_IND] )
VALUES (@ID_RSSD, @D_DT_START, @D_DT_END, @BHC_IND, @BROAD_REG_CD, @CHTR_AUTH_CD, @CHTR_TYPE_CD, @FBO_4C9_IND, @FHC_IND, @FUNC_REG, @INSUR_PRI_CD,
@MBR_FHLBS_IND, @MBR_FRS_IND, @SEC_RPTG_STATUS, @EST_TYPE_CD, @BANK_CNT, @BNK_TYPE_ANALYS_CD, @D_DT_EXIST_CMNC, @D_DT_EXIST_TERM, @FISC_YREND_MMDD,
@D_DT_INSUR, @D_DT_OPEN, @FNCL_SUB_HOLDER, @FNCL_SUB_IND, @IBA_GRNDFTHR_IND, @IBF_IND, @ID_RSSD_HD_OFF, @MJR_OWN_MNRTY, @NM_LGL, @NM_SHORT, @NM_SRCH_CD,
@ORG_TYPE_CD, @REASON_TERM_CD, @CNSRVTR_CD, @ENTITY_TYPE, @AUTH_REG_DIST_FRS, @ACT_PRIM_CD, @CITY, @CNTRY_NM, @ID_CUSIP, @STATE_ABBR_NM, @PLACE_CD, @STATE_CD,
@STATE_HOME_CD, @STREET_LINE1, @STREET_LINE2, @ZIP_CD, @ID_THRIFT, @ID_THRIFT_HC, @DOMESTIC_IND, @ID_ABA_PRIM, @ID_FDIC_CERT, @ID_NCUA, @COUNTY_CD, @DIST_FRS,
@ID_OCC, @CNTRY_CD, @DT_END, @DT_EXIST_CMNC, @DT_EXIST_TERM, @DT_INSUR, @DT_OPEN, @DT_START, @ID_TAX, @PROV_REGION, @URL, @SLHC_IND, @SLHC_TYPE_IND,
@PRIM_FED_REG, @STATE_INC_CD, @CNTRY_INC_CD, @STATE_INC_ABBR_NM, @CNTRY_INC_NM, @ID_LEI, @IHC_IND
)
SELECT 1
END
However, I'm getting the following error:
Traceback (most recent call last):
  File "C:/Users/kgreen/Source/Repos/MapTools/ingest_source.py", line 142, in <module>
    main()
  File "C:/Users/kgreen/Source/Repos/MapTools/ingest_source.py", line 136, in main
    ingest_npw_data()
  File "C:/Users/kgreen/Source/Repos/MapTools/ingest_source.py", line 130, in ingest_npw_data
    ingest_npw_attributes_active()
  File "C:/Users/kgreen/Source/Repos/MapTools/ingest_source.py", line 122, in ingest_npw_attributes_active
    return_key = cursor.execute(sql_str, values).fetchval()
pyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Parameter 19 (""): The supplied value is not a valid instance of data type float. Check the source data for invalid values. An example of an invalid value is data of numeric type with scale greater than precision. (8023) (SQLExecDirectW)')
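Since the error complains about data type float even though none of the table's columns are declared float, it may help to see what pandas actually inferred (a diagnostic sketch, reusing df from the function above; the .iloc index assumes pandas keeps the file's row order).

# Diagnostic sketch, reusing df from the function above.
print(df.dtypes[df.dtypes == 'float64'])      # int columns silently promoted to float by NaN
print(df.isna().sum()[df.isna().sum() > 0])   # per-column count of blank cells
print(df.iloc[52])                            # Access row 53 -> 0-based index 52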
Addendum: first few lines of the CSV:
#ID_RSSD,D_DT_START,D_DT_END,BHC_IND,BROAD_REG_CD,CHTR_AUTH_CD,CHTR_TYPE_CD,FBO_4C9_IND,FHC_IND,FUNC_REG,INSUR_PRI_CD,MBR_FHLBS_IND,MBR_FRS_IND,SEC_RPTG_STATUS,EST_TYPE_CD,BANK_CNT,BNK_TYPE_ANALYS_CD,D_DT_EXIST_CMNC,D_DT_EXIST_TERM,FISC_YREND_MMDD,D_DT_INSUR,D_DT_OPEN,FNCL_SUB_HOLDER,FNCL_SUB_IND,IBA_GRNDFTHR_IND,IBF_IND,ID_RSSD_HD_OFF,MJR_OWN_MNRTY,NM_LGL,NM_SHORT,NM_SRCH_CD,ORG_TYPE_CD,REASON_TERM_CD,CNSRVTR_CD,ENTITY_TYPE,AUTH_REG_DIST_FRS,ACT_PRIM_CD,CITY,CNTRY_NM,ID_CUSIP,STATE_ABBR_NM,PLACE_CD,STATE_CD,STATE_HOME_CD,STREET_LINE1,STREET_LINE2,ZIP_CD,ID_THRIFT,ID_THRIFT_HC,DOMESTIC_IND,ID_ABA_PRIM,ID_FDIC_CERT,ID_NCUA,COUNTY_CD,DIST_FRS,ID_OCC,CNTRY_CD,DT_END,DT_EXIST_CMNC,DT_EXIST_TERM,DT_INSUR,DT_OPEN,DT_START,ID_TAX,PROV_REGION,URL,SLHC_IND,SLHC_TYPE_IND,PRIM_FED_REG,STATE_INC_CD,CNTRY_INC_CD,STATE_INC_ABBR_NM,CNTRY_INC_NM,ID_LEI,IHC_IND
37,"04/15/2009 00:00:00","12/31/9999 00:00:00",0,1,2,200,0,0,0,7,1,0,0,1,,0,,"12/31/9999 00:00:00",0,"01/01/1934 00:00:00","09/01/1904 00:00:00",0,0,0,0,0,0,"BANK OF HANCOCK COUNTY ","BANK OF HANCOCK CTY ",1072861144,1,0,0,"NMB",6,"52211 ","SPARTA","UNITED STATES ","0","GA",72584,13,0,"12855 BROAD STREET","0","31087 ",16553,"0","Y",61107146,10057,0,141,6,0,1007,99991231,0,99991231,19340101,19040901,20090415,0,"0","0",0,0,"FDIC",0,0,"0 ","","0",0
73,"12/31/2008 00:00:00","12/31/9999 00:00:00",0,2,1,330,0,0,0,3,0,0,0,1,,0,,"12/31/9999 00:00:00",0,"01/04/1971 00:00:00","01/01/1936 00:00:00",0,0,0,0,0,0,"UTILITY EMPLOYEES FEDERAL CREDIT UNION ","UTILITY EMPL FCU ",788018087,6,0,0,"FCU",12,"52213 ","HOQUIAM","UNITED STATES ","0","WA",0,53,0,"220 MYRTLE STREET","0","98550 ",0,"0","Y",325179988,0,1851,27,12,0,1007,99991231,0,99991231,19710104,19360101,20081231,910591861,"0","0",0,0,"NCUA",0,0,"0 ","","0",0
242,"01/01/2012 00:00:00","12/31/9999 00:00:00",0,1,2,200,0,0,0,7,1,1,0,1,,0,,"12/31/9999 00:00:00",0,"01/01/1934 00:00:00","01/01/1922 00:00:00",0,0,0,0,0,0,"FIRST COMMUNITY BANK XENIA-FLORA ","FIRST CMNTY BK XENIA FLORA ",574907456,1,0,0,"SMB",8,"52211 ","XENIA","UNITED STATES ","0","IL",83739,17,0,"260 FRONT STREET","0","62899 ",0,"0","Y",81220537,3850,0,25,8,0,1007,99991231,0,99991231,19340101,19220101,20120101,370274860,"0","WWW.FCBXENIAFLORA.COM/INDEX.HTML",0,0,"FRS",17,0,"IL","","0",0
279,"01/01/2012 00:00:00","12/31/9999 00:00:00",0,1,2,300,0,0,0,7,1,0,0,1,,0,,"12/31/9999 00:00:00",0,"01/01/1997 00:00:00","01/01/1934 00:00:00",0,0,0,0,0,0,"MINEOLA COMMUNITY BANK, SSB ","MINEOLA CMNTY BK SSB ",98889854,6,0,0,"SSB",11,"52211 ","MINEOLA","UNITED STATES ","0","TX",48648,48,0,"215 W BROAD","0","75773 ",2523,"0","Y",311972526,28868,0,499,11,0,1007,99991231,0,99991231,19970101,19340101,20120101,750440734,"0","0",0,0,"FDIC",48,0,"TX","","0",0
354,"12/04/2019 00:00:00","12/31/9999 00:00:00",0,1,2,200,0,0,0,7,0,0,0,1,,0,"01/01/1901 00:00:00","12/31/9999 00:00:00",0,"03/21/1934 00:00:00","01/01/1901 00:00:00",0,0,0,0,0,0,"BISON STATE BANK ","BISON ST BK ",715161421,1,0,0,"NMB",10,"52211 ","BISON","UNITED STATES ","0","KS",6950,20,0,"223 MAIN STREET","0","67520 ",0,"0","Y",101107475,14083,0,165,10,0,1007,99991231,19010101,99991231,19340321,19010101,20191204,0,"0","0",0,0,"FDIC",0,0,"0 ","","0",0
Comment: "@Parfait I added the CSV sample to the end of the question." - elbillaf, Jul 21, 2020 at 3:02
2 Answers
Without actually seeing the full set of CSV data, your issue may essentially come down to blanks showing up in CSV cells.
CSV -> MS Access Type Conversion
Without a specified table, MS Access imports data based on reading the first few rows. Before you specified double quotes, Access read "" as a zero-length string, which is problematic if Access initially defined the corresponding fields as int, double, or date/time.
Pandas -> SQL Server Type Conversion
Similarly, pandas infers types from the first few rows unless they are specified, as you did with the dtype argument. Empty strings may be the issue when migrating the pandas DataFrame to the SQL Server table. Per your error, you may have CSV cells containing "" that are being passed to non-string columns:
Parameter 19 (""): The supplied value is not a valid instance of data type float.
By default, read_csv converts "" in integer columns to NaN, which promotes the entire column to float, no longer int, and that can affect the migration (a short self-contained demonstration follows the list below). There may be no single fix for you, but consider an array of solutions that may need to be combined:
- Use the na_filter option in read_csv, which does not replace blanks with NaN but does leave number columns containing empty strings as object, which may affect the SQL Server import.

      df = pd.read_csv(full_filename, dtype=dt, na_filter=False)

  Note: quotechar='"' is already the default, and header=0 may be redundant if you are not replacing names.

- Read as normal, then convert all columns to object and replace blanks with None, which translates properly to NULL in SQL. Possibly pyodbc will attempt to cast to the final types.

      df = pd.read_csv(full_filename, dtype=dt)
      df = df.astype(object).where(pd.notnull(df), None)

- Alternatively, dump to a staging temp table of all varchar and run an UPDATE or MERGE with type casting into the final table. This can all be handled in the stored procedure.

- Individually adjust each number column to anticipate blanks. Maybe place a placeholder value such as 9999 in blank integers, to be cleaned out after migration.

      df['D_DT_EXIST_TERM'] = df['D_DT_EXIST_TERM'].replace('', None)
      df['FISC_YREND_MMDD'] = df['FISC_YREND_MMDD'].fillna(9999).astype('int')
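Here is the short self-contained demonstration promised above (a constructed example, not data from the question): a blank cell in an otherwise-integer column forces the whole column to float64, while pandas' nullable Int64 dtype, or the object/None conversion from the second bullet, keeps the blank distinct.

import io
import pandas as pd

# Constructed example: one blank cell in an integer column.
sample = io.StringIO("ID_FDIC_CERT,STATE_CD\n10057,13\n,53\n3850,17\n")

df1 = pd.read_csv(sample)
print(df1.dtypes)    # ID_FDIC_CERT: float64 (because of the NaN); STATE_CD: int64

# With a reasonably recent pandas, the nullable Int64 dtype keeps the column
# integral and stores <NA> for the blank (it would still need mapping to None
# before handing rows to pyodbc).
sample.seek(0)
df2 = pd.read_csv(sample, dtype={'ID_FDIC_CERT': 'Int64', 'STATE_CD': 'Int64'})
print(df2.dtypes)

# The object/None conversion: blanks become None, which pyodbc sends as NULL.
df3 = df1.astype(object).where(pd.notnull(df1), None)
print(df3.iloc[1].tolist())   # [None, 53]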
Finally, maybe you can avoid the heavy pandas library altogether, since it is best suited for data analytics, and migrate with the built-in csv module using csv.reader. This may better handle blanks cast as None (and not NumPy's np.nan float entity). Many online blogs, tutorials, and questions can help with this approach. To better control CSV-column-to-SQL-column alignment, look into csv.DictReader.
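To make that last suggestion concrete, here is a minimal sketch of the csv.DictReader route (assumptions: sql_str is the exec dbo.save_attributes_active placeholder string from the question and cursor is the already-open pyodbc cursor; blank cells are sent as None so they arrive as NULL, and the numeric parameters rely on SQL Server implicitly converting the string values).

import csv

def ingest_with_csv_module(full_filename, cursor, sql_str):
    # Sketch only: stream the file row by row, passing '' as None so it
    # reaches the stored procedure as NULL instead of NaN.
    with open(full_filename, newline='') as f:
        reader = csv.DictReader(f)
        for row in reader:
            values = tuple(row[col] or None       # '' -> None -> NULL
                           for col in reader.fieldnames)
            cursor.execute(sql_str, values)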
# A more generic approach: infer each column's SQL type and width from the
# DataFrame, build the CREATE TABLE and INSERT statements dynamically, and
# load the rows. Note this example targets MySQL via mysql.connector rather
# than SQL Server.
from getpass import getpass
from mysql.connector import connect, Error
import pandas as pd


def query_execute(cmd):
    # Print and run a statement, echoing any result rows or the error.
    print(f"{cmd}")
    try:
        query.execute(cmd)
        for i in query:
            print(i)
    except Error as e:
        print(f"{e}\n\n")


def dblq(ar):
    # Render a row as a comma-separated value list, double-quoting strings.
    b = '\"'
    e = '\",'
    dbl = ""
    for i in ar[:-1]:
        if isinstance(i, str):
            dbl += f"{b}{i}{e}"
        else:
            dbl += f"{i},"
    if isinstance(ar[-1], str):
        dbl += f"{b}{ar[-1]}{b}"
    else:
        dbl += f"{ar[-1]}"
    return dbl


def noqs(ar):
    # Render the column names as a comma-separated list without quotes.
    noq = ""
    for i in ar[:-1]:
        noq += f"{i}, "
    noq += f"{ar[-1]}"
    return noq


mydb = connect(
    host="localhost",
    user=input("Enter username: "),
    password=getpass("Enter Password: "),
    database="entropia"
)
query = mydb.cursor()

# Dictionary mapping pandas dtypes to SQL types
map_types = {'object': 'VARCHAR', 'int64': 'INT', 'float64': 'FLOAT'}

df = pd.read_csv('./Data/EU-Assets.csv')
info_ar = []
dfd = df.dtypes
for c, i in enumerate(list(df.columns.values)):
    # Record each column's name, maximum value width, and pandas dtype.
    info_ar.append([i, max(df[i].astype(str).map(len)), str(dfd[c])])
names = tuple(df.columns.values)
Values = [f"{i} {map_types[info_ar[c][2]]}({info_ar[c][1]})" for c, i in enumerate(names)]
make_table = ["CREATE TABLE assetz ("]
for i in Values[:-1]:
    make_table.append(f"{i},")
make_table.append(f"{Values[-1]})")
create_table = str(tuple(make_table)).replace("'", "").replace(",,", ",").replace("(,", "(").replace(")))", "))")
query_execute(f"{create_table[1:]}")

for index, row in df.iterrows():
    if index < len(df):
        query_execute(f"INSERT INTO assetz ({noqs(names)}) VALUES ({dblq(tuple(row))})")
mydb.commit()
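The quoting helpers above build literal SQL for MySQL; with pyodbc against the question's SQL Server target, the same dynamic-insert idea is usually written with parameter placeholders instead. A sketch under those assumptions (an open pyodbc connection conn and a DataFrame df; NaN is mapped to None so blanks arrive as NULL):

import pandas as pd

def insert_dataframe(conn, df, table_name):
    # Sketch: build one parameterized INSERT and let the driver handle quoting.
    # table_name and the column names are interpolated, so they must be trusted.
    cols = ", ".join(f"[{c}]" for c in df.columns)
    marks = ", ".join("?" for _ in df.columns)
    sql = f"INSERT INTO {table_name} ({cols}) VALUES ({marks})"
    rows = df.astype(object).where(pd.notnull(df), None).values.tolist()
    cursor = conn.cursor()
    cursor.fast_executemany = True     # pyodbc option that speeds up executemany
    cursor.executemany(sql, rows)
    conn.commit()

In the question's case the DataFrame column #ID_RSSD would also need renaming to ID_RSSD to line up with the table.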