pd.read_sql_query("""SELECT Tab1.Title, NewTab.NewCol1 FROM
 (SELECT Col1 AS NewCol, COUNT(*) AS NewCol1
 FROM Tab2 GROUP BY Col1) AS NewTab
 JOIN Tab1 ON NewTab.NewCol=Tab1.Id
 WHERE Tab1.Num=1
 ORDER BY NewCol1 DESC""", conn)

My goal is to rewrite it using only pandas methods and functions. First, I'd like to assign a new column NewCol that would also contain a new column PostId, but I doubt that this should take two steps. Could anyone please guide me towards a solution, or provide full code that I could analyze?

asked Dec 8, 2019 at 20:49

1 Answer


Do you want to rewrite this query in pandas as a single line? It can be done, but the result is hard to read. Splitting it into steps looks much neater. Start with the subquery, which counts the rows of Tab2 for each value of Col1:

NewTab = Tab2.groupby('Col1').size().reset_index(name = 'NewCol1').rename(columns = {'Col1': 'NewCol'})

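Now you can merge the two tables and then keep only the rows where Num is 1. The filter has to be a separate step, because result_df does not exist yet while the merge expression is being evaluated:

result_df = pd.merge(NewTab, Tab1, left_on = 'NewCol', right_on = 'Id')
result_df = result_df[result_df.Num == 1]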

You can now sort the merged data frame in descending order, to match ORDER BY NewCol1 DESC, and select the columns:

result_df = result_df.sort_values(by=['NewCol1'], ascending = False)
result_df = result_df[['Title','NewCol1']]
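To see the steps working together, here is a minimal end-to-end sketch. The sample rows are made up for illustration (only the table and column names come from the question), and the in-memory SQLite database is just an assumption used to cross-check the pandas result against the original query:

```python
import sqlite3
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
Tab1 = pd.DataFrame({
    "Id": [1, 2, 3],
    "Title": ["first", "second", "third"],
    "Num": [1, 1, 2],
})
Tab2 = pd.DataFrame({"Col1": [1, 2, 2, 2, 3, 3]})

# Subquery: count the rows of Tab2 per Col1 value.
NewTab = (
    Tab2.groupby("Col1").size()
    .reset_index(name="NewCol1")
    .rename(columns={"Col1": "NewCol"})
)

# JOIN, then WHERE, ORDER BY ... DESC and SELECT.
merged = pd.merge(NewTab, Tab1, left_on="NewCol", right_on="Id")
result_df = (
    merged[merged["Num"] == 1]
    .sort_values(by="NewCol1", ascending=False)
    [["Title", "NewCol1"]]
    .reset_index(drop=True)
)

# Cross-check against the original SQL on an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
Tab1.to_sql("Tab1", conn, index=False)
Tab2.to_sql("Tab2", conn, index=False)
sql_df = pd.read_sql_query("""SELECT Tab1.Title, NewTab.NewCol1 FROM
 (SELECT Col1 AS NewCol, COUNT(*) AS NewCol1
 FROM Tab2 GROUP BY Col1) AS NewTab
 JOIN Tab1 ON NewTab.NewCol=Tab1.Id
 WHERE Tab1.Num=1
 ORDER BY NewCol1 DESC""", conn)
conn.close()

# Compare row by row; True means both approaches agree on this sample.
print(result_df.values.tolist() == sql_df.values.tolist())
```

The sample counts are all different on purpose: when several rows share the same NewCol1, neither SQL nor sort_values guarantees a particular order among them, so a row-by-row comparison could differ on ties.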
answered Dec 8, 2019 at 22:11