
I am new to SQLAlchemy ORM. I am trying to build an AWS S3 ingestion program that will ingest any CSV file from an S3 bucket into Postgres through the ORM. I am trying to read the first row of the CSV file and store the result in a list (columns_names). The code gives an error:

could not assemble any primary key columns for mapped table.

The table is created in the database only after I declare a PRIMARY KEY column. Is a primary key mandatory for creating a table via the ORM? Also, how do I dynamically create columns from the list columns_names?

Here is my code:

import boto
import boto3
import botocore
import os
from datetime import datetime
import s3fs
import pandas as pd 
import configparser
import re
from sqlalchemy import create_engine
from sqlalchemy import MetaData, Table, Column, Integer, String
from sqlalchemy.orm.session import sessionmaker
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base
config = configparser.ConfigParser(allow_no_value=True)
config.read('IngestionConfig.config')
table_name = config.get('db-settings','table_name')
S3Bucket = config.get('AWS-settings','BucketName')
S3Key = config.get('AWS-settings','filename')
s3_client = boto3.client('s3')
response = s3_client.get_object(Bucket = S3Bucket, Key= S3Key)
file = response["Body"]
filedata = file.read() 
contents = filedata.decode('utf-8')
first_line = contents.split('\n',1)[0]
col_names = re.sub(r"\s+", '_', first_line).replace('"', r'')
columns_names= []
columns_names = col_names.split(',') 
postgresql_db = create_engine('postgresql://ayan.putatunda@localhost/postgres',echo = True)
Base = declarative_base()
class test(Base):
    __tablename__ = table_name
    for name in columns_names:
        name = Column(String)
Base.metadata.create_all(postgresql_db)
asked Jan 24, 2020 at 4:41

1 Answer


SQLAlchemy ORM does require a primary key, because its design needs a way to identify the row corresponding to each object, so it is not possible to map a table without a primary key in the ORM.
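For example, a minimal sketch of a mapped class that satisfies this requirement just declares a surrogate integer primary key (the class, table, and column names here are only placeholders):

from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Example(Base):
    __tablename__ = 'example'
    # Surrogate primary key; the ORM needs at least one primary_key column
    # to map the class. On PostgreSQL, SQLAlchemy emits an Integer primary
    # key as SERIAL by default.
    id = Column(Integer, primary_key=True)
    some_col = Column(String)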

You can dynamically create tables by first creating a dictionary with your table information:

col_lst = ['col_1', 'col_2', 'col_3']
attr_dict = {'__tablename__': 'myTableName'}
for col in col_lst:
    attr_dict[col] = Column(Integer)

Next, use the type function to create the table class, with SQLAlchemy’s declarative_base as its base:

Base = declarative_base()
MyTableClass = type('MyTableClass', (Base,), attr_dict)
answered Jan 24, 2020 at 5:14
  • Thank you! So in my case I don't know the number of columns that the table will have; it depends on the CSV file. It may be 10 or 50 columns. Hence I was copying the column names into a list (columns_names). How would I achieve this with the code given above? Commented Jan 24, 2020 at 5:19
  • @AyanPutatunda I updated my answer to show creating the dictionary keys from list items. Commented Jan 24, 2020 at 5:34
  • You rock! Thank you so much! I had to add a SERIAL primary key column, but this works exactly the way I wanted. Thanks again! `attr_dict = {'__tablename__': table_name, 'uniqueid': Column(Integer, primary_key=True)}` then `for col in columns_names: attr_dict[col] = Column(String)` Commented Jan 24, 2020 at 5:46
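Putting the answer and the follow-up comment together, a complete sketch for the original use case might look like the following. Here table_name, columns_names, and postgresql_db come from the question's code; the uniqueid surrogate key and the CsvTable class name are just example names:

from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

# Build the class attributes dynamically from the CSV header columns.
attr_dict = {
    '__tablename__': table_name,
    # Surrogate primary key so the ORM can identify rows; on PostgreSQL
    # an Integer primary key is emitted as SERIAL by default.
    'uniqueid': Column(Integer, primary_key=True),
}
for col in columns_names:
    attr_dict[col] = Column(String)

# Create the mapped class dynamically, then create the table in Postgres.
CsvTable = type('CsvTable', (Base,), attr_dict)
Base.metadata.create_all(postgresql_db)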
