0

I have data grabbed to Python that is in table form:

Name Sport Score 
John Golf 100
Jill Rugby 55
John Hockey 100
Bob Golf 45

How can I structure this table in Python so that it is easy to sort or group the items? For example, I might want to see the names of everyone who played Golf, everyone who scored 100 in any sport, or all of the data for just John.

asked Oct 24, 2015 at 19:11
9
  • Please clarify; is your problem storing this data, or printing this data? Commented Oct 24, 2015 at 19:13
  • ordered dictionary or named tuple may serve your purpose Commented Oct 24, 2015 at 19:13
  • @JesseTG The data will be stored and then written to excel Commented Oct 24, 2015 at 19:21
  • @AhsanulHaque Everyone says that dictionaries are inherently unordered? Commented Oct 24, 2015 at 19:21
  • Is there a reason why you do not use pandas? Commented Oct 24, 2015 at 19:22

5 Answers

1

A pandas DataFrame is the way to go:

import pandas as pd
df = pd.DataFrame({'Name': ['John', 'Jill', 'John', 'Bob'],
                   'Sport': ['Golf', 'Rugby', 'Hockey', 'Golf'],
                   'Score': [100, 55, 100, 45]})
# the names of people that played Golf
df[df['Sport'] == 'Golf']['Name'].unique()
>> ['John' 'Bob']
# all of the people that scored 100 on any sport.
df[df['Score'] == 100]['Name'].unique()
>> ['John']
# all of the data for just John.
df[df['Name'] == 'John']
>>    Name   Sport  Score
   0  John    Golf    100
   2  John  Hockey    100
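
The question also asks about sorting and grouping. Here is a minimal sketch of both, assuming the same three columns as above; the variable names `by_score`, `players_by_sport`, and `mean_score` are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Jill', 'John', 'Bob'],
                   'Sport': ['Golf', 'Rugby', 'Hockey', 'Golf'],
                   'Score': [100, 55, 100, 45]})

# Sort the rows by score, highest first
by_score = df.sort_values('Score', ascending=False)

# Group by sport and collect the names found in each group
players_by_sport = df.groupby('Sport')['Name'].apply(list).to_dict()

# Average score per person
mean_score = df.groupby('Name')['Score'].mean().to_dict()
```
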
answered Oct 24, 2015 at 19:23

1

map and filter with namedtuples and lambdas can be used for this task.

from collections import namedtuple
# Create a named tuple to store the rows
Row = namedtuple('Row', ('name', 'sport', 'score'))
data = '''Name Sport Score 
 John Golf 100
 Jill Rugby 55
 John Hockey 100
 Bob Golf 45'''
# Read the data, skip the first line
lines = data.splitlines()[1:]
rows = []
for line in lines:
    name, sport, score = line.strip().split()
    rows.append(Row(name, sport, int(score)))
# On Python 3, filter and map return iterators, so wrap them in list()
# People that played Golf
golf_filter = lambda row: row.sport == 'Golf'
golf_players = list(filter(golf_filter, rows))
# People that scored 100 on any sport
score_filter = lambda row: row.score == 100
scorers = list(filter(score_filter, rows))
# People named John
john_filter = lambda row: row.name == 'John'
john_data = list(filter(john_filter, rows))
# If you want a specific column, then you can map the data
# Names of golf players
get_name = lambda row: row.name
golf_players_names = list(map(get_name, golf_players))

Results:

>>> golf_players
[Row(name='John', sport='Golf', score=100),
 Row(name='Bob', sport='Golf', score=45)]
>>> john_data
[Row(name='John', sport='Golf', score=100),
 Row(name='John', sport='Hockey', score=100)]
>>> scorers
[Row(name='John', sport='Golf', score=100),
 Row(name='John', sport='Hockey', score=100)]
>>> golf_players_names
['John', 'Bob']
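
Sorting and grouping, which the question also mentions, work with the same named tuples. The following is a rough sketch using only the standard library; `rows_by_score` and `names_by_sport` are names chosen here for illustration, and the rows are written out directly rather than parsed:

```python
from collections import defaultdict, namedtuple
from operator import attrgetter

Row = namedtuple('Row', ('name', 'sport', 'score'))
rows = [
    Row('John', 'Golf', 100),
    Row('Jill', 'Rugby', 55),
    Row('John', 'Hockey', 100),
    Row('Bob', 'Golf', 45),
]

# Sort by score, lowest first; sorted() is stable, so ties keep their input order
rows_by_score = sorted(rows, key=attrgetter('score'))

# Group the names by sport
names_by_sport = defaultdict(list)
for row in rows:
    names_by_sport[row.sport].append(row.name)
```

Passing `reverse=True` to `sorted` would put the highest scores first.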
answered Oct 24, 2015 at 19:39

1

What about this one?

yourDS={"name":["John","Jill","John","Bob"],
 "sport":["Golf","Rugby","Hockey","Golf"],
 "score":[100,55,100,45]
}

Because lists are ordered, the entries at the same index in each list belong to the same row.

A name can appear more than once in a result, so build a set from the result list to remove duplicates.

For the queries you describe, you can do something like this (here x is the score you are looking for):

x = 100
for index, value in enumerate(yourDS["score"]):
    if value == x:
        print(yourDS["name"][index])

It is better to collect the results in a list and then convert it to a set, which handles cases such as one person scoring x in two different games.
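
As a sketch of that idea, the lookup can be wrapped in a small function that returns a set, so a person who scored x in several games appears only once. The helper name `names_with_score` is hypothetical, and it builds the set directly instead of going through an intermediate list:

```python
yourDS = {"name": ["John", "Jill", "John", "Bob"],
          "sport": ["Golf", "Rugby", "Hockey", "Golf"],
          "score": [100, 55, 100, 45]}


def names_with_score(table, x):
    """Return the set of names whose score equals x (hypothetical helper)."""
    # zip pairs up the entries that share an index, i.e. belong to the same row
    return {name for name, score in zip(table["name"], table["score"])
            if score == x}


top_scorers = names_with_score(yourDS, 100)
```
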

answered Oct 24, 2015 at 19:44

1 Comment

But how would I "query" it? For example, how would I get all of the people that have score x?
-1

You can create a list of lists, where each row is a list inside the outer list.

lst1 = [['John', 'Golf', 100], ['Jill', 'Rugby', 55], ['John', 'Hockey', 100], ['Bob', 'Golf', 45]]
# Collect every row whose score (index 2) is 100
lst100 = []
for lst in lst1:
    if lst[2] == 100:
        lst100.append(lst)
print(lst100)
answered Oct 24, 2015 at 19:23

-1

If you want to retrieve information based on your data, I'd go with SQL. It's well-suited to answering questions like these:

...to see all the names of people that played Golf...

...all of the people that scored 100 on any sport...

...all of the data for just John.

SQL is the most widely used database query language, and Python supports it out of the box through the sqlite3 module in the standard library.
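
As a rough illustration, the three queries from the question might look like this with sqlite3 and an in-memory database. The table name `results` and its column names are assumptions made for this sketch:

```python
import sqlite3

# An in-memory database; pass a file name instead to keep the data on disk
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (name TEXT, sport TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [("John", "Golf", 100), ("Jill", "Rugby", 55),
     ("John", "Hockey", 100), ("Bob", "Golf", 45)],
)

# Names of people that played Golf
golfers = [r[0] for r in conn.execute(
    "SELECT name FROM results WHERE sport = ? ORDER BY name", ("Golf",))]

# People that scored 100 in any sport (DISTINCT removes repeats)
scored_100 = [r[0] for r in conn.execute(
    "SELECT DISTINCT name FROM results WHERE score = ?", (100,))]

# All of the data for John
john_rows = conn.execute(
    "SELECT name, sport, score FROM results WHERE name = ? ORDER BY sport",
    ("John",)).fetchall()

conn.close()
```

The `?` placeholders let sqlite3 substitute the values safely instead of building the query with string formatting.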

SQL, while not a monumental task to learn, is beyond the scope of this answer. To learn that, I'd recommend checking out Codecademy, Code School, or SQLZOO (they're all interactive).

Or, if you just want to read it in and write it out without caring about what it actually means, consider using the csv module, which is also built-in.
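
A minimal sketch of that round trip with the csv module follows. It writes to an in-memory buffer so that it is self-contained; with a real file you would use `open(path, newline="")` instead, and the names `buffer` and `records` are illustrative:

```python
import csv
import io

rows = [("John", "Golf", 100), ("Jill", "Rugby", 55),
        ("John", "Hockey", 100), ("Bob", "Golf", 45)]

# Write a header line followed by the data rows
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Name", "Sport", "Score"])
writer.writerows(rows)

# Read it back; DictReader uses the header line for the keys
buffer.seek(0)
records = list(csv.DictReader(buffer))
```

Note that csv reads every field back as a string, so scores need an explicit `int()` conversion before any numeric comparison.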

answered Oct 24, 2015 at 19:25

2 Comments

Yes, I thought this all sounds perfect for SQL, but I want to develop this code for users that may not have SQL installed (I may even ship it as an .exe with dependencies bundled in). How would that affect things?
Actually, sqlite3 is bundled with Python, so it's already there. It's part of the standard library.
