I am pulling some financial data from the IEX Cloud API. Since it is a paid API with rate limits, I would like to do this as efficiently as possible (i.e., with as few calls as possible). Given that the tickers list is likely to have thousands of entries (oftentimes with duplicates), is there a better/more efficient way than what I'm currently doing?
(Feel free to use the token in the code, as there is no limit to the number of API calls that you can make to sandboxed data.)
import os
from iexfinance.stocks import Stock
import iexfinance
import pandas as pd

# Set IEX Finance API Token (Test)
os.environ['IEX_API_VERSION'] = 'iexcloud-sandbox'
os.environ['IEX_TOKEN'] = 'Tsk_5798c0ab124d49639bb1575b322841c4'

# List of ticker symbols
tickers = ['MSFT', 'V', 'ERROR', 'INVALID', 'BRK.B', 'MSFT']

# Create output Dataframe
output_df = pd.DataFrame(columns=['Net Income', 'Book Value'])

# Loop through companies in tickers list
for ticker in tickers:
    # Call IEX Cloud API, and output to Dataframe
    try:
        company = Stock(ticker, output_format='pandas')
        try:
            # Get income from last 4 quarters, sum it, and store to temp Dataframe
            df_income = company.get_income_statement(period="quarter", last='4')
            df_income['TTM'] = df_income.sum(axis=1)
            income_ttm = int(df_income.loc['netIncome', 'TTM'])
            # Get book value from most recent quarter, and store to temp Dataframe
            df_book = company.get_balance_sheet(period="quarter")
            book_value = int(df_book.loc['shareholderEquity'])
        # If IEX Cloud errors, make stats = 0
        except (iexfinance.utils.exceptions.IEXQueryError, TypeError, KeyError) as e:
            book_value = 0
            income_ttm = 0
        # Store stats to output Dataframe
        output_df.loc[ticker] = [income_ttm, book_value]
    except ValueError:
        pass

print(output_df)
- Please do not edit the question, especially the code, after an answer has been posted. Changing the question may cause answer invalidation. Everyone needs to be able to see what the reviewer was referring to. See "What to do after the question has been answered". The code must be in the question. – pacmaninbw ♦, Feb 26, 2023 at 17:31
- If you put something in the code that invalidates your privacy please notify a moderator so that they might be able to help. – pacmaninbw ♦, Feb 26, 2023 at 17:32
2 Answers
This seems mostly to be an API question. Assuming all that data for each of the companies is required and there's no combined API endpoint that can be fed with, say, tens or hundreds of symbols, this would seem to be the best way. Even then, there would still be the option of concurrent execution: making multiple API requests at the same time could possibly get more throughput, though the number of API calls would stay the same.
The streaming endpoints look interesting at first, but they don't support the data needed here; the other endpoints all look like they only accept single symbols.
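As a rough sketch of the concurrency idea, assuming the per-symbol work lives in a function (here a hypothetical stand-in called `fetch_one`, not part of iexfinance) and that the API tolerates a few parallel requests, duplicates can be dropped first and the remaining symbols fetched through a small thread pool. The `max_workers` value is an arbitrary assumption and should respect the provider's rate limits.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_one(ticker):
    # Hypothetical stand-in for the real per-symbol API call;
    # it only returns dummy numbers so the sketch runs offline.
    return len(ticker), 0


def fetch_all(tickers, fetch=fetch_one, max_workers=4):
    # dict.fromkeys drops duplicates while keeping first-seen order,
    # so each symbol costs at most one API call.
    unique = list(dict.fromkeys(tickers))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as its input
        results = list(pool.map(fetch, unique))
    return dict(zip(unique, results))


stats = fetch_all(['MSFT', 'V', 'BRK.B', 'MSFT'])
```

Deduplicating is the only part that actually reduces the number of calls; the thread pool only shortens the wall-clock time.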
What you should definitely do is create a function (or a few) and ideally get rid of the try blocks. At first glance I have no idea where exceptions get thrown, though I'm guessing it's mostly keys not being present. Preferably this would be handled directly (with default values or similar features) in a way that does not require lots of exception handling. Yes, Python might encourage this, but simply catching ValueError around a huge block will eventually catch unintended exceptions too.
So at least something like this would be a good start:
def fetch_company_info(ticker):
    company = Stock(ticker, output_format='pandas')

    book_value = 0
    income_ttm = 0

    try:
        # Get income from last 4 quarters, sum it, and store to temp Dataframe
        df_income = company.get_income_statement(period="quarter", last=4)
        df_income['TTM'] = df_income.sum(axis=1)
        income_ttm = int(df_income.loc['netIncome', 'TTM'])

        # Get book value from most recent quarter, and store to temp Dataframe
        df_book = company.get_balance_sheet(period="quarter")
        book_value = int(df_book.loc['shareholderEquity'])

    # Ignore IEX Cloud errors
    except iexfinance.utils.exceptions.IEXQueryError:
        pass

    return income_ttm, book_value
I also shuffled the variables a bit around - with the right order this could even be type checked ...
In any case, I can't see anything wrong here and I've removed the other two exceptions as they look more like hiding problems than actual proper handling of the exact problem that was encountered. With the example names I also couldn't trigger them at all.
The main body of the script is now rather small:
# Loop through companies in tickers list
for ticker in tickers:
    # Call IEX Cloud API, and store stats to Dataframe
    output_df.loc[ticker] = fetch_company_info(ticker)

print(output_df)
If there's no need for reusability then that's it; otherwise I'd suggest a main function and moving all currently global definitions into it, then adding some argument parser or configuration file handling, etc. to make it a more fully grown solution.
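A minimal sketch of what such a main function could look like, assuming the ticker symbols are passed on the command line. The `fetch_company_info` here is only a placeholder that returns zeros so the example runs without API access, and `parse_args` is a helper name made up for this sketch.

```python
import argparse


def fetch_company_info(ticker):
    # Placeholder for the API-backed function; returns dummy values.
    return 0, 0


def parse_args(argv=None):
    parser = argparse.ArgumentParser(
        description="Fetch TTM net income and book value per ticker.")
    parser.add_argument("tickers", nargs="+",
                        help="ticker symbols, e.g. MSFT V BRK.B")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Drop duplicate symbols while keeping their original order
    unique = dict.fromkeys(args.tickers)
    results = {ticker: fetch_company_info(ticker) for ticker in unique}
    for ticker, (income_ttm, book_value) in results.items():
        print(ticker, income_ttm, book_value)
    return results
```

In a script, `main()` would typically be called under an `if __name__ == "__main__":` guard, so the module can still be imported without side effects.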
After some discussion in the comments it turns out that splitting the
return values from the assignment can lead to easy mistakes like the
order of the values being wrong. With that in mind, either the output_df
global should be kept in the function, or (better) be passed in as
a variable instead:
for ticker in tickers:
    fetch_company_info(ticker, output_df)

and

def fetch_company_info(ticker, output_df):
    ...
    output_df.loc[ticker] = [income_ttm, book_value]
- I did come across this recent post from the API provider, and it sounds like you can batch calls up to 100 at a time. Not sure if that changes any of the approach or not though? – user53526356, Sep 2, 2019 at 23:23
- Ah yes, go for that. The general approach would be largely the same, you'll just have to split the ticker symbols into (arbitrary-sized) batches and then detangle the output into separate rows again. Plus the error handling might be a bit more complicated if only some of the symbols aren't available etc., but that's the price for more efficient calls I'd say. – ferada, Sep 3, 2019 at 9:38
- Okay cool, thanks for the help. So I should store all the symbols to one global list, then loop through that list 100 indexes at a time, each time writing to the output_df? – user53526356, Sep 3, 2019 at 19:12
- Never mind, looks like it's just returning income_ttm and book_value in the opposite order from what I was expecting. My mistake. – user53526356, Sep 4, 2019 at 15:28
- Oops. Fixed. To be fair that's a good confirmation that it should be moved into the function :) – ferada, Sep 4, 2019 at 19:17
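The batching approach discussed in these comments could be sketched roughly as follows. The limit of 100 symbols per call is taken from the comment above; `fetch_batch` is a hypothetical callable that takes a list of symbols and returns a dict of per-symbol results, and how it talks to the API is left open here.

```python
def batches(symbols, size=100):
    # Drop duplicates (keeping order) so no symbol is requested twice
    unique = list(dict.fromkeys(symbols))
    for start in range(0, len(unique), size):
        yield unique[start:start + size]


def collect(symbols, fetch_batch, size=100):
    # fetch_batch is assumed to return {symbol: (income_ttm, book_value)}
    results = {}
    for batch in batches(symbols, size):
        results.update(fetch_batch(batch))
    return results
```

With thousands of symbols this would need roughly one call per 100 unique tickers instead of one (or more) per ticker, at the cost of having to untangle partial failures within a batch.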
The most suspicious thing to me is this:
except ValueError:
    pass
It's on a very broad try block, which leads me to believe that it was thrown in to attempt a pseudo fail-safe loop. Fail-safe loops are not a bad thing, but this isn't a great way to go about it. Contrary to most circumstances, it's actually a good idea here to broaden your caught exception class to Exception, so long as you output the error before continuing with the loop.
I'll also say: if ValueError is produced by a condition that you know and understand, try to check for that condition before it raises an exception, print a message and continue. And/or, if you understand the source line of this common exception but aren't able to check for failure conditions beforehand, apply a narrower try/except to only that line, again emitting a sane message and continuing on with your loop.
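A minimal sketch of such a fail-safe loop, assuming the per-ticker work has been moved into a function that is passed in as `fetch` (the names `process_all` and `fetch` are made up for this example). The broad `except Exception` is deliberate, and every failure is logged and recorded before the loop moves on.

```python
import logging

logger = logging.getLogger(__name__)


def process_all(tickers, fetch):
    results = {}
    failures = {}
    for ticker in tickers:
        try:
            results[ticker] = fetch(ticker)
        except Exception as exc:  # deliberately broad: keep the loop alive
            # Report the problem instead of silently passing
            logger.warning("Skipping %s: %r", ticker, exc)
            failures[ticker] = exc
    return results, failures
```

Returning the failures alongside the results makes it possible to inspect or retry the skipped symbols afterwards, which a bare `pass` would hide.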