Conditional traffic light system

Question 1

I have written Python code which uses multiple if conditions and a for loop. The main objective of the code is to produce a traffic light system based on certain conditions.

Red = -1
Yellow = 0
Green = 1

It takes 4 months (m0, m1, m2, m3) and dataframe as an input, runs the condition through each row, and returns -1, 0 or 1.

scenarios

The code compares Month 1 with Month 0, Month 2 with Month 1 and Month 3 with Month 2.

For input:

if month+1 < Month for any value, then red else green.

For example, if the revenue of July 2020 is less than June 2020 then the input is red, otherwise green. Based on three comparisons the outcome is calculated. The outcome could be either 1-, 0, or 1.

The code I have written works well but is not optimised in any way. Is there a better way to do this?

This would be an O(n) operation but, at least there should be a way to write it concisely in Python. Or if code can be improved operationally as well.

def getTrafficLightData(df, dimension, m1, m2, m3, m4):
'''
Inputs - 
 Dataframe
 dimension = on which we want to calculate traffic light system
 m1, m2, m3, m4 - Could be any for months but we have taken consecutive months for Traffic Light System.
 Example Call - getTrafficLightData(report6_TLS_data, "Revenue_","2020-6","2020-7","2020-8","2020-9")
'''
TFS_df = pd.DataFrame(columns=[dimension + "_TLS"])
if dimension == "Overstrike_":
 suffix = "%"
 for i in range(len(df)):
 if (
 (
 df[dimension + m1 + suffix].iloc[i]
 > df[dimension + m2 + suffix].iloc[i]
 )
 and (
 df[dimension + m2 + suffix].iloc[i]
 > df[dimension + m3 + suffix].iloc[i]
 )
 and (
 df[dimension + m3 + suffix].iloc[i]
 > df[dimension + m4 + suffix].iloc[i]
 )
 ):
 TFS_df.loc[i] = [1]
 elif (
 (
 df[dimension + m1 + suffix].iloc[i]
 < df[dimension + m2 + suffix].iloc[i]
 )
 and (
 df[dimension + m2 + suffix].iloc[i]
 < df[dimension + m3 + suffix].iloc[i]
 )
 and (
 df[dimension + m3 + suffix].iloc[i]
 < df[dimension + m4 + suffix].iloc[i]
 )
 ):
 TFS_df.loc[i] = [-1]
 elif (
 (
 df[dimension + m1 + suffix].iloc[i]
 < df[dimension + m2 + suffix].iloc[i]
 )
 and (
 df[dimension + m2 + suffix].iloc[i]
 > df[dimension + m3 + suffix].iloc[i]
 )
 and (
 df[dimension + m3 + suffix].iloc[i]
 < df[dimension + m4 + suffix].iloc[i]
 )
 ):
 TFS_df.loc[i] = [-1]
 elif (
 (
 df[dimension + m2 + suffix].iloc[i]
 > df[dimension + m2 + suffix].iloc[i]
 )
 and (
 df[dimension + m2 + suffix].iloc[i]
 < df[dimension + m3 + suffix].iloc[i]
 )
 and (
 df[dimension + m3 + suffix].iloc[i]
 < df[dimension + m4 + suffix].iloc[i]
 )
 ):
 TFS_df.loc[i] = [-1]
 elif (
 (
 df[dimension + m1 + suffix].iloc[i]
 > df[dimension + m2 + suffix].iloc[i]
 )
 and (
 df[dimension + m2 + suffix].iloc[i]
 > df[dimension + m3 + suffix].iloc[i]
 )
 and (
 df[dimension + m3 + suffix].iloc[i]
 < df[dimension + m4 + suffix].iloc[i]
 )
 ):
 TFS_df.loc[i] = [0]
 elif (
 (
 df[dimension + m1 + suffix].iloc[i]
 > df[dimension + m2 + suffix].iloc[i]
 )
 and (
 df[dimension + m2 + suffix].iloc[i]
 < df[dimension + m3 + suffix].iloc[i]
 )
 and (
 df[dimension + m3 + suffix].iloc[i]
 > df[dimension + m4 + suffix].iloc[i]
 )
 ):
 TFS_df.loc[i] = [1]
 elif (
 (
 df[dimension + m1 + suffix].iloc[i]
 < df[dimension + m2 + suffix].iloc[i]
 )
 and (
 df[dimension + m2 + suffix].iloc[i]
 > df[dimension + m3 + suffix].iloc[i]
 )
 and (
 df[dimension + m3 + suffix].iloc[i]
 > df[dimension + m4 + suffix].iloc[i]
 )
 ):
 TFS_df.loc[i] = [1]
 elif (
 (
 df[dimension + m1 + suffix].iloc[i]
 < df[dimension + m2 + suffix].iloc[i]
 )
 and ( #
 df[dimension + m2 + suffix].iloc[i]
 < df[dimension + m3 + suffix].iloc[i]
 )
 and ( #
 df[dimension + m3 + suffix].iloc[i]
 > df[dimension + m4 + suffix].iloc[i]
 )
 ):
 TFS_df.loc[i] = [0]
 else:
 TFS_df.loc[i] = [0]
 return TFS_df
else:
 if dimension == "Margin_":
 suffix = "%"
 else:
 suffix = ""
 for i in range(len(df)):
 if (
 (
 df[dimension + m1 + suffix].iloc[i]
 > df[dimension + m2 + suffix].iloc[i]
 )
 and (
 df[dimension + m2 + suffix].iloc[i]
 > df[dimension + m3 + suffix].iloc[i]
 )
 and (
 df[dimension + m3 + suffix].iloc[i]
 > df[dimension + m4 + suffix].iloc[i]
 )
 ):
 TFS_df.loc[i] = [-1]
 elif (
 (
 df[dimension + m1 + suffix].iloc[i]
 < df[dimension + m2 + suffix].iloc[i]
 )
 and (
 df[dimension + m2 + suffix].iloc[i]
 < df[dimension + m3 + suffix].iloc[i]
 )
 and (
 df[dimension + m3 + suffix].iloc[i]
 < df[dimension + m4 + suffix].iloc[i]
 )
 ):
 TFS_df.loc[i] = [1]
 elif (
 (
 df[dimension + m1 + suffix].iloc[i]
 < df[dimension + m2 + suffix].iloc[i]
 )
 and (
 df[dimension + m2 + suffix].iloc[i]
 > df[dimension + m3 + suffix].iloc[i]
 )
 and (
 df[dimension + m3 + suffix].iloc[i]
 < df[dimension + m4 + suffix].iloc[i]
 )
 ):
 TFS_df.loc[i] = [1]
 elif (
 (
 df[dimension + m1 + suffix].iloc[i]
 > df[dimension + m2 + suffix].iloc[i]
 )
 and (
 df[dimension + m2 + suffix].iloc[i]
 < df[dimension + m3 + suffix].iloc[i]
 )
 and (
 df[dimension + m3 + suffix].iloc[i]
 < df[dimension + m4 + suffix].iloc[i]
 )
 ):
 TFS_df.loc[i] = [1]
 elif (
 (
 df[dimension + m1 + suffix].iloc[i]
 > df[dimension + m2 + suffix].iloc[i]
 )
 and (
 df[dimension + m2 + suffix].iloc[i]
 > df[dimension + m3 + suffix].iloc[i]
 )
 and (
 df[dimension + m3 + suffix].iloc[i]
 < df[dimension + m4 + suffix].iloc[i]
 )
 ):
 TFS_df.loc[i] = [0]
 elif (
 (
 df[dimension + m1 + suffix].iloc[i]
 > df[dimension + m2 + suffix].iloc[i]
 )
 and (
 df[dimension + m2 + suffix].iloc[i]
 < df[dimension + m3 + suffix].iloc[i]
 )
 and (
 df[dimension + m3 + suffix].iloc[i]
 > df[dimension + m4 + suffix].iloc[i]
 )
 ):
 TFS_df.loc[i] = [-1]
 elif (
 (
 df[dimension + m1 + suffix].iloc[i]
 < df[dimension + m2 + suffix].iloc[i]
 )
 and (
 df[dimension + m2 + suffix].iloc[i]
 > df[dimension + m3 + suffix].iloc[i]
 )
 and (
 df[dimension + m3 + suffix].iloc[i]
 > df[dimension + m4 + suffix].iloc[i]
 )
 ):
 TFS_df.loc[i] = [-1]
 elif (
 (
 df[dimension + m1 + suffix].iloc[i]
 < df[dimension + m2 + suffix].iloc[i]
 )
 and (
 df[dimension + m2 + suffix].iloc[i]
 < df[dimension + m3 + suffix].iloc[i]
 )
 and (
 df[dimension + m3 + suffix].iloc[i]
 > df[dimension + m4 + suffix].iloc[i]
 )
 ):
 TFS_df.loc[i] = [0]
 else:
 TFS_df.loc[i] = [0]
 return TFS_df

The function is called in below way -

report6_TLS_data['Revenue_TLS']=getTrafficLightData(report6_TLS_data, "Revenue_","2020-6","2020-7","2020-8","2020-9")
report6_TLS_data["Margin_TLS"]=getTrafficLightData(report6_TLS_data, "Margin_","2020-6","2020-7","2020-8","2020-9")
report6_TLS_data["Overstrike_TLS"]=getTrafficLightData(report6_TLS_data, "Overstrike_","2020-6","2020-7","2020-8","2020-9")

Any pointers would be helpful.

The input data is of form -

ym PART NUMBER BranchCode Revenue_2019-1 Revenue_2019-10 Revenue_2019-11 Revenue_2019-12 Revenue_2019-2 Revenue_2019-3 Revenue_2019-4 Revenue_2019-5 Revenue_2019-6 Revenue_2019-7 Revenue_2019-8 Revenue_2019-9 Revenue_2020-1 Revenue_2020-2 Revenue_2020-3 Revenue_2020-4 Revenue_2020-5 Revenue_2020-6 Revenue_2020-7 Revenue_2020-8 Revenue_2020-9 Margin_2019-1 Margin_2019-10 Margin_2019-11 Margin_2019-12 Margin_2019-2 Margin_2019-3 Margin_2019-4 Margin_2019-5 Margin_2019-6 Margin_2019-7 Margin_2019-8 Margin_2019-9 Margin_2020-1 Margin_2020-2 Margin_2020-3 Margin_2020-4 Margin_2020-5 Margin_2020-6 Margin_2020-7 Margin_2020-8 Margin_2020-9 Overstrike_2019-1 Overstrike_2019-10 Overstrike_2019-11 Overstrike_2019-12 Overstrike_2019-2 Overstrike_2019-3 Overstrike_2019-4 Overstrike_2019-5 Overstrike_2019-6 Overstrike_2019-7 Overstrike_2019-8 Overstrike_2019-9 Overstrike_2020-1 Overstrike_2020-2 Overstrike_2020-3 Overstrike_2020-4 Overstrike_2020-5 Overstrike_2020-6 Overstrike_2020-7 Overstrike_2020-8 Overstrike_2020-9 Transactions_2019-1 Transactions_2019-10 Transactions_2019-11 Transactions_2019-12 Transactions_2019-2 Transactions_2019-3 Transactions_2019-4 Transactions_2019-5 Transactions_2019-6 Transactions_2019-7 Transactions_2019-8 Transactions_2019-9 Transactions_2020-1 Transactions_2020-2 Transactions_2020-3 Transactions_2020-4 Transactions_2020-5 Transactions_2020-6 Transactions_2020-7 Transactions_2020-8 Transactions_2020-9 Margin_2019-1% Margin_2019-10% Margin_2019-11% Margin_2019-12% Margin_2019-2% Margin_2019-3% Margin_2019-4% Margin_2019-5% Margin_2019-6% Margin_2019-7% Margin_2019-8% Margin_2019-9% Margin_2020-1% Margin_2020-2% Margin_2020-3% Margin_2020-4% Margin_2020-5% Margin_2020-6% Margin_2020-7% Margin_2020-8% Margin_2020-9% Overstrike_2019-1% Overstrike_2019-10% Overstrike_2019-11% Overstrike_2019-12% Overstrike_2019-2% Overstrike_2019-3% Overstrike_2019-4% Overstrike_2019-5% Overstrike_2019-6% Overstrike_2019-7% Overstrike_2019-8% Overstrike_2019-9% Overstrike_2020-1% Overstrike_2020-2% Overstrike_2020-3% Overstrike_2020-4% Overstrike_2020-5% Overstrike_2020-6% Overstrike_2020-7% Overstrike_2020-8% Overstrike_2020-9%
0 BAGG001 BC 71.75 90.00 20.25 43.50 42.50 30.00 70.00 44.25 45.00 46.75 129.50 58.00 81.00 36.00 33.25 0.75 15.00 24.75 0.00 0.00 2.50 32.97 39.15 8.95 14.31 18.95 7.86 30.68 19.27 19.74 18.12 59.38 22.30 34.95 17.59 14.10 0.32 6.35 5.30 0.00 0.00 1.06 0.00 0.00 0.00 1.00 0.00 1.00 0.00 0.00 3.00 3.00 1.00 1.00 2.00 0.00 0.00 0.00 0.00 2.00 0.00 0.00 0.00 8 16 5 9 5 6 12 7 10 7 13 10 13 5 11 1 2 4 0 0 1 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 1.00
1 BAGG001 PK 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 25.50 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 9.90 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Question 2

Your syntax is incorrect. Please fix your indentation.

Question 3

Looks like a copy paste error...df[dimension + m2 + suffix].iloc[i] > df[dimension + m2 + suffix].iloc[i]

Question 4

It's important to understand that Pandas is a wrapper around Numpy, and both are geared toward vectorization. Avoid loops and repetition as much as possible, and this method can really be boiled down. It's worth saying that

if month+1 < Month for any value, then red else green.

is absolutely not what's going on. For now I've made a proposed solution that deals with a fixed number of months using a lookup table. You can adapt it as necessary. It's based on the table that you showed, interpreting comparisons as bits in a field.

Otherwise: don't bake an underscore into your dimension string; don't produce a double-underscore in the output dataframe; add type hints; and accept the months as a variadic argument or tuple.

import numpy as np
import pandas as pd
lookup = np.array((-1, 0, -1, 1, -1, 1, 0, 1))
to_binary = np.array((4, 2, 1)).T
def get_traffic_light_data_new(
 df: pd.DataFrame,
 dimension: str,
 *months: str,
) -> np.array:
 assert len(months) == 4
 by_month = df[[f'{dimension}_{month}' for month in months]].to_numpy()
 signs = by_month[:, 1:] - by_month[:, :-1] > 0
 index = np.matmul(signs, to_binary)
 return lookup[index]
def test():
 scenarios = pd.DataFrame(
 [
 [3, 2, 1, 0],
 [1, 2, 3, 4],
 [1, 2, 1.5, 1.7],
 [2, 1, 3, 4],
 [3, 2, 1, 4],
 [3, 2, 2.5, 2.4],
 [5, 6, 1, 0],
 [3, 4, 5, 4],
 ],
 columns=(
 'Revenue_2020-6',
 'Revenue_2020-7',
 'Revenue_2020-8',
 'Revenue_2020-9',
 ),
 )
 args = (
 scenarios,
 'Revenue',
 '2020-6',
 '2020-7',
 '2020-8',
 '2020-9',
 )
 new_result = get_traffic_light_data_new(*args)
 print(new_result)
if __name__ == '__main__':
 test()

Question 5

the condition I have stated is for input. what i mean here is the colurs on the inputs are based on above side. for example, if there are 4 months ( June july, Aug and September), and if you see first row on the input side, it means july revenue was less than june, august revenue was less than july and sep revenue was less than august, so all three are red. If all three are red, output should be red. and similarly for other conditions. The input data contains values for revenue for 4 months, then based on which condition it satisfies on the input side, output should be accordingly.

Question 6

Your description violates scenario 3. There is a decrease, yet the output is green. Both your old implementation and my proposed implementation match the coloured table and not the description.

Question 7

I guess I have not been clear in my comment. Apologies. What I am saying is if my data satisfies input condition for all 3 columns of the left side, my code should give me right hand (output) color. Thank you.

Reinderien Reinderien 71k5 gold badges76 silver badges256 bronze badges · Answer 1 · 2020-10-28 23:10:46Z

It's important to understand that Pandas is a wrapper around Numpy, and both are geared toward vectorization. Avoid loops and repetition as much as possible, and this method can really be boiled down. It's worth saying that

if month+1 < Month for any value, then red else green.

is absolutely not what's going on. For now I've made a proposed solution that deals with a fixed number of months using a lookup table. You can adapt it as necessary. It's based on the table that you showed, interpreting comparisons as bits in a field.

Otherwise: don't bake an underscore into your dimension string; don't produce a double-underscore in the output dataframe; add type hints; and accept the months as a variadic argument or tuple.

import numpy as np
import pandas as pd
lookup = np.array((-1, 0, -1, 1, -1, 1, 0, 1))
to_binary = np.array((4, 2, 1)).T
def get_traffic_light_data_new(
 df: pd.DataFrame,
 dimension: str,
 *months: str,
) -> np.array:
 assert len(months) == 4
 by_month = df[[f'{dimension}_{month}' for month in months]].to_numpy()
 signs = by_month[:, 1:] - by_month[:, :-1] > 0
 index = np.matmul(signs, to_binary)
 return lookup[index]
def test():
 scenarios = pd.DataFrame(
 [
 [3, 2, 1, 0],
 [1, 2, 3, 4],
 [1, 2, 1.5, 1.7],
 [2, 1, 3, 4],
 [3, 2, 1, 4],
 [3, 2, 2.5, 2.4],
 [5, 6, 1, 0],
 [3, 4, 5, 4],
 ],
 columns=(
 'Revenue_2020-6',
 'Revenue_2020-7',
 'Revenue_2020-8',
 'Revenue_2020-9',
 ),
 )
 args = (
 scenarios,
 'Revenue',
 '2020-6',
 '2020-7',
 '2020-8',
 '2020-9',
 )
 new_result = get_traffic_light_data_new(*args)
 print(new_result)
if __name__ == '__main__':
 test()

the condition I have stated is for input. what i mean here is the colurs on the inputs are based on above side. for example, if there are 4 months ( June july, Aug and September), and if you see first row on the input side, it means july revenue was less than june, august revenue was less than july and sep revenue was less than august, so all three are red. If all three are red, output should be red. and similarly for other conditions. The input data contains values for revenue for 4 months, then based on which condition it satisfies on the input side, output should be accordingly.
Your description violates scenario 3. There is a decrease, yet the output is green. Both your old implementation and my proposed implementation match the coloured table and not the description.
I guess I have not been clear in my comment. Apologies. What I am saying is if my data satisfies input condition for all 3 columns of the left side, my code should give me right hand (output) color. Thank you.

Stack Exchange Network

Conditional traffic light system

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Conditional traffic light system

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions