Skip to main content
Code Review

Return to Question

Question Protected by Jamal
added 4 characters in body; edited title
Source Link
200_success
  • 145.5k
  • 22
  • 190
  • 478

Better implementation of Excel's SUMIFS implemented using pandasPANDAS, the Python Data Analysis Library

I've implemented Excel's SUMIFS function in Pandas using the following code. Is there a better—more Pythonic—implementationbetter — more Pythonic — implementation?

from pandas import Series, DataFrame
import pandas as pd
df = pd.read_csv('data.csv')
# pandas equivalent of Excel's SUMIFS function
df.groupby('PROJECT').sum().ix['A001']

One concern I have with this implementation is that I'm not explicitly specifying the column to be summed.

Data File

Here's an example CSV data file (data.csv), although I'm displaying | instead of commas to improve the visual appearance.

DATE | EMPLOYEE | PROJECT | HOURS
02/01/14 | Smith, John | A001 | 4.0
02/01/14 | Smith, John | B002 | 4.0
02/01/14 | Doe, Jane | A001 | 3.0
02/01/14 | Doe, Jane | C003 | 5.0
02/02/14 | Smith, John | B002 | 2.0
02/02/14 | Smith, John | C003 | 6.0
02/02/14 | Doe, Jane | A001 | 8.0

Equivalent Excel SUMIFS Function

If I were to open data.csv in Excel and wanted to determine how many hours were worked on project A001, I would use the SUMIFS formula as follows:

=SUMIFS($D2:$D8, $C2:$C8, "A001")

Where the SUMIFS function syntax is:

=SUMIFS(sum_range, criteria_range1, criteria1, [criteria_range2,
 criteria2], ...)

Better implementation of Excel's SUMIFS using pandas, the Python Data Analysis Library

I've implemented Excel's SUMIFS function in Pandas using the following code. Is there a better—more Pythonic—implementation?

from pandas import Series, DataFrame
import pandas as pd
df = pd.read_csv('data.csv')
# pandas equivalent of Excel's SUMIFS function
df.groupby('PROJECT').sum().ix['A001']

One concern I have with this implementation is that I'm not explicitly specifying the column to be summed.

Data File

Here's an example CSV data file (data.csv), although I'm displaying | instead of commas to improve the visual appearance.

DATE | EMPLOYEE | PROJECT | HOURS
02/01/14 | Smith, John | A001 | 4.0
02/01/14 | Smith, John | B002 | 4.0
02/01/14 | Doe, Jane | A001 | 3.0
02/01/14 | Doe, Jane | C003 | 5.0
02/02/14 | Smith, John | B002 | 2.0
02/02/14 | Smith, John | C003 | 6.0
02/02/14 | Doe, Jane | A001 | 8.0

Equivalent Excel SUMIFS Function

If I were to open data.csv in Excel and wanted to determine how many hours were worked on project A001, I would use the SUMIFS formula as follows:

=SUMIFS($D2:$D8, $C2:$C8, "A001")

Where the SUMIFS function syntax is:

=SUMIFS(sum_range, criteria_range1, criteria1, [criteria_range2,
 criteria2], ...)

Excel's SUMIFS implemented using PANDAS, the Python Data Analysis Library

I've implemented Excel's SUMIFS function in Pandas using the following code. Is there a better — more Pythonic — implementation?

from pandas import Series, DataFrame
import pandas as pd
df = pd.read_csv('data.csv')
# pandas equivalent of Excel's SUMIFS function
df.groupby('PROJECT').sum().ix['A001']

One concern I have with this implementation is that I'm not explicitly specifying the column to be summed.

Data File

Here's an example CSV data file (data.csv), although I'm displaying | instead of commas to improve the visual appearance.

DATE | EMPLOYEE | PROJECT | HOURS
02/01/14 | Smith, John | A001 | 4.0
02/01/14 | Smith, John | B002 | 4.0
02/01/14 | Doe, Jane | A001 | 3.0
02/01/14 | Doe, Jane | C003 | 5.0
02/02/14 | Smith, John | B002 | 2.0
02/02/14 | Smith, John | C003 | 6.0
02/02/14 | Doe, Jane | A001 | 8.0

Equivalent Excel SUMIFS Function

If I were to open data.csv in Excel and wanted to determine how many hours were worked on project A001, I would use the SUMIFS formula as follows:

=SUMIFS($D2:$D8, $C2:$C8, "A001")

Where the SUMIFS function syntax is:

=SUMIFS(sum_range, criteria_range1, criteria1, [criteria_range2,
 criteria2], ...)
added 90 characters in body; edited tags
Source Link
Jamal
  • 35.2k
  • 13
  • 134
  • 238

I've implemented Excel's SUMIFS function in pandasPandas using the following code. Is there a better—more Pythonic—implementation?

from pandas import Series, DataFrame
import pandas as pd
df = pd.read_csv('data.csv')
# pandas equivalent of Excel's SUMIFS function
df.groupby('PROJECT').sum().ix['A001']

One concern I have with this implementation is that I'm not explicitly specifying the column to be summed.

Data File

Here's an example CSV data file (data.csv), although I'm displaying | instead of commas to improve the visual appearance.

DATE | EMPLOYEE | PROJECT | HOURS
02/01/14 | Smith, John | A001 | 4.0
02/01/14 | Smith, John | B002 | 4.0
02/01/14 | Doe, Jane | A001 | 3.0
02/01/14 | Doe, Jane | C003 | 5.0
02/02/14 | Smith, John | B002 | 2.0
02/02/14 | Smith, John | C003 | 6.0
02/02/14 | Doe, Jane | A001 | 8.0
DATE | EMPLOYEE | PROJECT | HOURS
02/01/14 | Smith, John | A001 | 4.0
02/01/14 | Smith, John | B002 | 4.0
02/01/14 | Doe, Jane | A001 | 3.0
02/01/14 | Doe, Jane | C003 | 5.0
02/02/14 | Smith, John | B002 | 2.0
02/02/14 | Smith, John | C003 | 6.0
02/02/14 | Doe, Jane | A001 | 8.0

Equivalent Excel SUMIFS Function

If I were to open data.csv in Excel and wanted to determine how many hours were worked on project A001, I would use the SUMIFS formula as follows:

=SUMIFS($D2:$D8, $C2:$C8, "A001")
=SUMIFS($D2:$D8, $C2:$C8, "A001")

Where the SUMIFS function syntax is:

=SUMIFS(sum_range, criteria_range1, criteria1, [criteria_range2,
 criteria2], ...)
=SUMIFS(sum_range, criteria_range1, criteria1, [criteria_range2,
 criteria2], ...)

I've implemented Excel's SUMIFS function in pandas using the following code. Is there a better—more Pythonic—implementation?

from pandas import Series, DataFrame
import pandas as pd
df = pd.read_csv('data.csv')
# pandas equivalent of Excel's SUMIFS function
df.groupby('PROJECT').sum().ix['A001']

One concern I have with this implementation is that I'm not explicitly specifying the column to be summed.

Data File

Here's an example CSV data file (data.csv), although I'm displaying | instead of commas to improve the visual appearance.

DATE | EMPLOYEE | PROJECT | HOURS
02/01/14 | Smith, John | A001 | 4.0
02/01/14 | Smith, John | B002 | 4.0
02/01/14 | Doe, Jane | A001 | 3.0
02/01/14 | Doe, Jane | C003 | 5.0
02/02/14 | Smith, John | B002 | 2.0
02/02/14 | Smith, John | C003 | 6.0
02/02/14 | Doe, Jane | A001 | 8.0

Equivalent Excel SUMIFS Function

If I were to open data.csv in Excel and wanted to determine how many hours were worked on project A001, I would use the SUMIFS formula as follows:

=SUMIFS($D2:$D8, $C2:$C8, "A001")

Where the SUMIFS function syntax is:

=SUMIFS(sum_range, criteria_range1, criteria1, [criteria_range2,
 criteria2], ...)

I've implemented Excel's SUMIFS function in Pandas using the following code. Is there a better—more Pythonic—implementation?

from pandas import Series, DataFrame
import pandas as pd
df = pd.read_csv('data.csv')
# pandas equivalent of Excel's SUMIFS function
df.groupby('PROJECT').sum().ix['A001']

One concern I have with this implementation is that I'm not explicitly specifying the column to be summed.

Data File

Here's an example CSV data file (data.csv), although I'm displaying | instead of commas to improve the visual appearance.

DATE | EMPLOYEE | PROJECT | HOURS
02/01/14 | Smith, John | A001 | 4.0
02/01/14 | Smith, John | B002 | 4.0
02/01/14 | Doe, Jane | A001 | 3.0
02/01/14 | Doe, Jane | C003 | 5.0
02/02/14 | Smith, John | B002 | 2.0
02/02/14 | Smith, John | C003 | 6.0
02/02/14 | Doe, Jane | A001 | 8.0

Equivalent Excel SUMIFS Function

If I were to open data.csv in Excel and wanted to determine how many hours were worked on project A001, I would use the SUMIFS formula as follows:

=SUMIFS($D2:$D8, $C2:$C8, "A001")

Where the SUMIFS function syntax is:

=SUMIFS(sum_range, criteria_range1, criteria1, [criteria_range2,
 criteria2], ...)
Source Link

Better implementation of Excel's SUMIFS using pandas, the Python Data Analysis Library

I've implemented Excel's SUMIFS function in pandas using the following code. Is there a better—more Pythonic—implementation?

from pandas import Series, DataFrame
import pandas as pd
df = pd.read_csv('data.csv')
# pandas equivalent of Excel's SUMIFS function
df.groupby('PROJECT').sum().ix['A001']

One concern I have with this implementation is that I'm not explicitly specifying the column to be summed.

Data File

Here's an example CSV data file (data.csv), although I'm displaying | instead of commas to improve the visual appearance.

DATE | EMPLOYEE | PROJECT | HOURS
02/01/14 | Smith, John | A001 | 4.0
02/01/14 | Smith, John | B002 | 4.0
02/01/14 | Doe, Jane | A001 | 3.0
02/01/14 | Doe, Jane | C003 | 5.0
02/02/14 | Smith, John | B002 | 2.0
02/02/14 | Smith, John | C003 | 6.0
02/02/14 | Doe, Jane | A001 | 8.0

Equivalent Excel SUMIFS Function

If I were to open data.csv in Excel and wanted to determine how many hours were worked on project A001, I would use the SUMIFS formula as follows:

=SUMIFS($D2:$D8, $C2:$C8, "A001")

Where the SUMIFS function syntax is:

=SUMIFS(sum_range, criteria_range1, criteria1, [criteria_range2,
 criteria2], ...)
lang-py

AltStyle によって変換されたページ (->オリジナル) /