I am new to SQLAlchemy ORM. I am trying to build an AWS S3 ingestion program that ingests any CSV file from an S3 bucket into Postgres through the ORM. I read the first row of the CSV file and store the column names in a list (columns_names). The code gives this error:
could not assemble any primary key columns for mapped table.
The table is created in the database only after I declare a PRIMARY KEY column. Is a primary key mandatory for creating a table via the ORM? Also, how do I dynamically create columns from the list columns_names?
Here is my code:
import boto
import boto3
import botocore
import os
from datetime import datetime
import s3fs
import pandas as pd
import configparser
import re
from sqlalchemy import create_engine
from sqlalchemy import MetaData, Table, Column, Integer, String
from sqlalchemy.orm.session import sessionmaker
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base
config = configparser.ConfigParser(allow_no_value=True)
config.read('IngestionConfig.config')
table_name = config.get('db-settings','table_name')
S3Bucket = config.get('AWS-settings','BucketName')
S3Key = config.get('AWS-settings','filename')
s3_client = boto3.client('s3')
response = s3_client.get_object(Bucket = S3Bucket, Key= S3Key)
file = response["Body"]
filedata = file.read()
contents = filedata.decode('utf-8')
first_line = contents.split('\n',1)[0]
col_names = re.sub(r"\s+", '_', first_line).replace('"', r'')
columns_names= []
columns_names = col_names.split(',')
postgresql_db = create_engine('postgresql://ayan.putatunda@localhost/postgres',echo = True)
Base = declarative_base()
class test(Base):
    __tablename__ = table_name
    for name in columns_names:
        name = Column(String)
Base.metadata.create_all(postgresql_db)
1 Answer
The SQLAlchemy ORM does require a primary key: it needs a way to identify the row that corresponds to each object, so a table without a primary key cannot be mapped to an ORM class.
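The requirement applies to ORM-mapped classes, not to tables in general. As a minimal sketch, a SQLAlchemy Core Table can be created and used without any primary key. The in-memory SQLite engine stands in for Postgres here, and the table name, column names, and sample row are assumptions made for illustration.

```python
from sqlalchemy import Column, MetaData, String, Table, create_engine, inspect

# In-memory SQLite stands in for Postgres (an assumption for this sketch).
engine = create_engine("sqlite://")
metadata = MetaData()

# A Core Table needs no primary key; only ORM-mapped classes do.
staging = Table(
    "csv_staging",  # hypothetical table name
    metadata,
    Column("first_name", String),
    Column("last_name", String),
)
metadata.create_all(engine)

# Rows can be inserted and read back without any ORM mapping.
with engine.begin() as conn:
    conn.execute(staging.insert(), [{"first_name": "Ada", "last_name": "Lovelace"}])
    rows = conn.execute(staging.select()).fetchall()

# Ask the database which columns were actually created.
created_columns = [col["name"] for col in inspect(engine).get_columns("csv_staging")]
```

This route gives up ORM features such as sessions and object identity, so it fits best when the rows are only bulk-loaded and queried.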
You can dynamically create tables by first creating a dictionary with your table information:
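col_lst = ['col_1', 'col_2', 'col_3']
# The ORM needs a primary key, so add one next to __tablename__
# ('id' is just an illustrative name).
attr_dict = {'__tablename__': 'myTableName',
             'id': Column(Integer, primary_key=True)}
for col in col_lst:
    attr_dict[col] = Column(Integer)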
Next, use the built-in type function to create the table class, with a class returned by SQLAlchemy's declarative_base function as its base:
Base = declarative_base()
MyTableClass = type('MyTableClass', (Base,), attr_dict)
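Putting the two steps together, a minimal end-to-end sketch might look like the following. It assumes SQLAlchemy 1.4 or later (for sqlalchemy.orm.declarative_base) and uses an in-memory SQLite engine in place of Postgres. The names csv_import, CsvRow, and uniqueid, the column list, and the sample row are all illustrative stand-ins.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

# Stand-in for the names read from the CSV header.
columns_names = ["first_name", "last_name", "city"]

Base = declarative_base()

# Class attributes for the mapped class: table name, a surrogate
# primary key (required by the ORM), then one column per header name.
attr_dict = {
    "__tablename__": "csv_import",  # hypothetical table name
    "uniqueid": Column(Integer, primary_key=True),
}
for col in columns_names:
    attr_dict[col] = Column(String)

# type(name, bases, namespace) builds the class at runtime.
CsvRow = type("CsvRow", (Base,), attr_dict)

# In-memory SQLite stands in for Postgres (an assumption for this sketch).
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

# Insert one row and read it back through the generated class.
Session = sessionmaker(bind=engine)
session = Session()
session.add(CsvRow(first_name="Ada", last_name="Lovelace", city="London"))
session.commit()
stored = session.query(CsvRow).one()
stored_city = stored.city
stored_id = stored.uniqueid
session.close()
```

Because the columns come from a dictionary, the same code handles 10 or 50 columns without change; only columns_names differs.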
-
Thank you! In my case I don't know how many columns the table will have; it depends on the CSV file and may be 10 or 50. That is why I copied the column names into a list (columns_names). How would I achieve this with the code above? – Ayan Putatunda, Jan 24, 2020 at 5:19
-
@AyanPutatunda I updated my answer to show creating the dictionary keys from list items. – bmcculley, Jan 24, 2020 at 5:34
-
U rock! Thank you so much! I had to add the SERIAL datatype for the primary key, but this works exactly the way I wanted. Thanks again! `attr_dict = {'__tablename__': table_name, 'uniqueid': Column(Integer, primary_key=True)} for col in columns_names: attr_dict[col] = Column(String)` – Ayan Putatunda, Jan 24, 2020 at 5:46
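The loop in the comment above depends on columns_names holding clean names. As a hedged sketch, the header line could be parsed with the standard csv module rather than a plain split on commas, so quoted fields are handled and a trailing carriage return from Windows line endings does not turn into a stray underscore. The helper name header_to_column_names and the sample header are assumptions for illustration.

```python
import csv
import re


def header_to_column_names(first_line):
    """Turn one CSV header line into underscore-separated column names."""
    # Drop surrounding whitespace, including a trailing '\r' or '\n'.
    line = first_line.strip()
    # csv.reader understands quoted fields, unlike str.split(',').
    fields = next(csv.reader([line]))
    # Replace runs of whitespace inside each name with a single underscore.
    return [re.sub(r"\s+", "_", field.strip()) for field in fields]


# Hypothetical header line, as it might come from contents.split('\n', 1)[0].
columns_names = header_to_column_names('"First Name","Last Name",City\r')
```

The resulting list can be fed straight into the attr_dict loop. Checking it for duplicates or empty names before building the class may also be worthwhile, since each name becomes a dictionary key.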