Below is a script that updates a Google sheet with values from a database. It works, but it is rather slow because of the sleep call I had to add to avoid the "quota exceeded" error shown below. Without the sleep it runs much faster, but after a certain number of rows have been updated it throws that error.
I have to update three columns, A, B and C, with values starting at row 4. So basically I need advice/recommendations to improve this code on the following points:
- How to make the updates faster without triggering the error below.
- Is there a way to generalise this script, i.e. without hardcoding the spreadsheet name? (The DB details I can already read from a config in a generalised way; they are hardcoded here only for testing.)
gspread.exceptions.APIError: {
  "error": {
    "code": 429,
    "message": "Insufficient tokens for quota 'ReadGroup' and limit 'USER-100s' of service 'sheets.googleapis.com' for consumer 'project_number:*******'.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.Help",
        "links": [
          {
            "description": "Google developer console API key",
            "url": "https://console.developers.google.com/project/***/apiui/credential"
          }
        ]
      }
    ]
  }
}
Python Code:
import psycopg2
import gspread
from oauth2client.service_account import ServiceAccountCredentials
from time import sleep
import datetime

def update_sheet(sheet, table, rangeStart='A', rangeEnd='C'):
    for index, row in enumerate(table):
        range = '{start}{i}:{end}{i}'.format(start=rangeStart, end=rangeEnd, i=index+4)
        cell_list = sheet.range(range)
        for i, cell in enumerate(cell_list):
            start_time = datetime.datetime.now()
            cell.value = row[i]
            sheet.update_cells(cell_list)
            end_time = datetime.datetime.now()
            if (end_time - start_time).total_seconds() < 1:
                sleep(1.01 - (end_time - start_time).total_seconds())

cnx_psql = psycopg2.connect(host="xxx.xxx.xxx.xx", database="postgres", user="postgres", password="******", port="5432")
print('DB connected')
psql_cursor = cnx_psql.cursor()
METADATA_QUERY = '''select product_id, CAST(low_stock_date as TEXT) low_stock_date, sku from test.low_stock_date;'''
psql_cursor.execute(METADATA_QUERY)
results = psql_cursor.fetchall()
cell_values = results

scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive']
creds = ServiceAccountCredentials.from_json_keyfile_name('/Users/lins/Documents/GS_secret/secret_key.json', scope)
client = gspread.authorize(creds)
sheet = client.open_by_url('https://docs.google.com/spreadsheets/d/1****zQ/edit#gid=0').sheet1

# Function call
update_sheet(sheet, cell_values)

psql_cursor.close()
cnx_psql.close()
1 Answer
The update_cells call can update multiple cells at once, so you should calculate all new values locally and push the changes in one update. This should use only one token instead of one per cell (untested code):
def update_sheet(sheet, table, start='A', end='C'):
    to_update = []
    for i, row in enumerate(table):
        cells = sheet.range(f'{start}{i+4}:{end}{i+4}')
        for cell, value in zip(cells, row):
            cell.value = value
        to_update.extend(cells)
    sheet.update_cells(to_update)
I also changed the names and indentation to follow Python's official style guide, PEP 8, and used an f-string for easier string formatting.
This still has the problem that it needs to get each row from the sheet, and each of those fetches is an API call. Instead you can get each column as a range and update one column at a time, which saves calls whenever your table has fewer columns than rows. For this we need to transpose the table first, though:
def update_sheet(sheet, table, columns="ABC", header=4):
    to_update = []
    table = list(zip(*table))  # transpose the table
    for col_name, col in zip(columns, table):  # iterating over columns now
        r = f"{col_name}{header}:{col_name}{len(col)+header}"  # changed range
        print(r)  # for debugging
        cells = sheet.range(r)
        for cell, value in zip(cells, col):
            cell.value = value
        to_update.extend(cells)
    sheet.update_cells(to_update)
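Even with batching, a long-running job can still occasionally hit the 429 quota error, so it is worth wrapping the remaining API calls in a retry with exponential backoff. Below is a sketch of my own (the names with_backoff, retriable, etc. are not part of gspread); in this script you would pass retriable=(gspread.exceptions.APIError,):

```python
import random
import time

def with_backoff(fn, retries=5, base_delay=1.0, retriable=(Exception,)):
    """Call fn(), retrying with exponential backoff on retriable errors."""
    for attempt in range(retries):
        try:
            return fn()
        except retriable:
            if attempt == retries - 1:
                raise  # give up after the last attempt
            # wait ~1s, ~2s, ~4s, ... plus jitter to spread out the retries
            time.sleep(base_delay * (2 ** attempt + random.random()))

# usage sketch:
# with_backoff(lambda: sheet.update_cells(to_update),
#              retriable=(gspread.exceptions.APIError,))
```

This way the sleep only happens when the quota is actually exhausted, instead of unconditionally on every row.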
With this it is no problem to use e.g. this table of size 2500 x 3:
import numpy as np
...
table = list(map(list, np.arange(2500*3).reshape(-1, 3).astype(str)))
update_sheet(sheet, table)
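As for the second question (not hardcoding the spreadsheet): one option is to move the spreadsheet URL and credentials path into the same config file as the DB details. A minimal sketch using configparser from the standard library; the section and key names are my own choice and the spreadsheet URL is a placeholder:

```python
import configparser

# example settings.ini content (placeholder values)
SAMPLE = """\
[postgres]
host = xxx.xxx.xxx.xx
database = postgres
user = postgres
port = 5432

[sheets]
credentials_file = /Users/lins/Documents/GS_secret/secret_key.json
spreadsheet_url = https://docs.google.com/spreadsheets/d/<your-sheet-id>/edit#gid=0
"""

def load_config(path):
    """Read an INI-style config file and return the parsed sections."""
    parser = configparser.ConfigParser()
    parser.read(path)
    return parser

# usage sketch (the password is best kept out of the file, e.g. in an env var):
# cfg = load_config("settings.ini")
# cnx_psql = psycopg2.connect(password=os.environ["PGPASSWORD"], **cfg["postgres"])
# sheet = client.open_by_url(cfg["sheets"]["spreadsheet_url"]).sheet1
```

With that, the same script can be pointed at a different database or spreadsheet by editing the config file only.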
- @Linu: How large is your table? It worked for me for a small test example (100 rows, 3 columns). If you have many more rows than columns, transposing the table first might save you some calls. — Graipher, Nov 22, 2019 at 14:19
- @Graipher: The table has a maximum of 2500 rows. Any other way to deal with this? — Linu, Nov 22, 2019 at 14:21
- @Linu: No need to change your SQL, just transpose your list of lists, e.g. using the common Python idiom list(zip(*table)). I added it in the second function. The only other difference is that the range gets the whole column now, so we need all column names instead of start and end. But for that you could also write a convenience function that gives you all column names between start and end, possibly even using this recent answer of mine. — Graipher, Nov 22, 2019 at 14:32
- Being non-working code (with an error), should you not have voted to close it instead? — Rohit Gupta, Jan 29, 2023 at 12:53
- @RohitGupta: The code in the OP was working, albeit slowly, because it contained a workaround to avoid running into the rate limit on API requests to Google Sheets. My answer shows some ways to make fewer API calls, making that workaround unnecessary. — Graipher, Feb 1, 2023 at 21:31
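The convenience function mentioned in the comments could, at least for single-letter columns, look like the sketch below (column_names is my own name; multi-letter columns such as AA would need the full base-26 treatment):

```python
from string import ascii_uppercase

def column_names(start="A", end="C"):
    """All single-letter spreadsheet column names from start to end, inclusive."""
    return ascii_uppercase[ascii_uppercase.index(start):ascii_uppercase.index(end) + 1]

# column_names("A", "C") == "ABC", so it plugs straight into
# update_sheet(sheet, table, columns=column_names("A", "C"))
```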
- Why do you call sheet.update_cells on every iteration of the for loop?
- sheet = client.open_by_url('https://docs.google.com/spreadsheets/d/XXXXXXWwZdIWMYcQthyb_-UuepKS9xxxxxgQ/edit#gid=1424234960').sheet1
- What do you mean by "Quota exceeding error"? By definition, any time an error message is reported the code is not working as intended. We can't debug the issue on this site, nor can we tell you how to work around the issue in a more efficient manner.