
I tried to read a GeoJSON file with Pandas, but I got a ValueError message:

'ValueError: Expected object or value'

Here's the approach I used:

import pandas as pd
geojsonPath = r"Z:\dems\address.geojson"
pd_json = pd.io.json.read_json(geojsonPath,lines=True) 
pd_json.head()

Attached is an extract from the file

{
"type": "FeatureCollection",
"name": "cameron-addresses-county",
"crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } },
"features": [
{ "type": "Feature", "properties": { "X": -78.1422444, "Y": 41.3286117, "hash": "93dd7b7e3ee3e8af", "number": "501", "street": "CASTLE GARDEN RD", "unit": null, "city": null, "district": null, "region": null, "postcode": null, "id": 7579 }, "geometry": { "type": "Point", "coordinates": [ -78.1422444, 41.3286117 ] } },
{ "type": "Feature", "properties": { "X": -78.143584, "Y": 41.3284045, "hash": "853eb0c5f6e70fe3", "number": "64", "street": "BELDIN DR", "unit": null, "city": null, "district": null, "region": null, "postcode": null, "id": 4502 }, "geometry": { "type": "Point", "coordinates": [ -78.143584, 41.3284045 ] } },
{ "type": "Feature", "properties": { "X": -78.1711061, "Y": 41.3282128, "hash": "99a13ba635404d80", "number": "9760", "street": "MIX RUN RD", "unit": null, "city": null, "district": null, "region": null, "postcode": null, "id": 8448 }, "geometry": { "type": "Point", "coordinates": [ -78.1711061, 41.3282128 ] } },
{ "type": "Feature", "properties": { "X": -78.1429278, "Y": 41.3282883, "hash": "70319cf9e435b858", "number": null, "street": null, "unit": null, "city": null, "district": null, "region": null, "postcode": null, "id": null }, "geometry": { "type": "Point", "coordinates": [ -78.1429278, 41.3282883 ] } },
{ "type": "Feature", "properties": { "X": -78.1427173, "Y": 41.3282733, "hash": "759f051e7a587eb2", "number": "465", "street": "CASTLE GARDEN RD", "unit": null, "city": null, "district": null, "region": null, "postcode": null, "id": 6447 }, "geometry": { "type": "Point", "coordinates": [ -78.1427173, 41.3282733 ] } },
{ "type": "Feature", "properties": { "X": -78.1433463, "Y": 41.3282308, "hash": "9fbb571fc16a6cb2", "number": "61", "street": "BELDIN DR", "unit": null, "city": null, "district": null, "region": null, "postcode": null, "id": 4466 }, "geometry": { "type": "Point", "coordinates": [ -78.1433463, 41.3282308 ] } },
{ "type": "Feature", "properties": { "X": -78.1432403, "Y": 41.3282179, "hash": "8f837d813626f1e1", "number": null, "street": null, "unit": null, "city": null, "district": null, "region": null, "postcode": null, "id": null }, "geometry": { "type": "Point", "coordinates": [ -78.1432403, 41.3282179 ] } },
{ "type": "Feature", "properties": { "X": -78.1715165, "Y": 41.3280965, "hash": "5004ba87bd6e668b", "number": "9736", "street": "MIX RUN RD", "unit": null, "city": null, "district": null, "region": null, "postcode": null, "id": 7434 }, "geometry": { "type": "Point", "coordinates": [ -78.1715165, 41.3280965 ] } }
Taras
asked Jun 14, 2022 at 0:45
  • There is a dedicated library for that: GeoPandas. Do you work with Anaconda? Commented Jun 14, 2022 at 1:07
  • Please post the full error and the full stack trace. Commented Jun 14, 2022 at 2:40
  • Please do not forget about "What should I do when someone answers my question?" Commented Jun 21, 2022 at 5:07

1 Answer


There are several things to keep in mind:

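  • Make sure the GeoJSON is properly closed with ]}. The extract you posted ends without it.
  • There is no need to call read_json() via pd.io.json.read_json; pd.read_json is enough, even though the function is implemented in pandas/pandas/io/json/.
  • The "ValueError: Expected object or value" error means pandas could not parse the content as JSON. The geojsonPath variable is a valid string, but with lines=True pandas expects one complete JSON object per line, whereas a GeoJSON FeatureCollection is a single JSON document spread over many lines. A truncated file (see the first point) would also fail to parse.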
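To illustrate the last point, here is a minimal sketch that uses only the standard library. The sample string is a shortened, hypothetical stand-in for the real file, and parse_as_json_lines is a helper name invented for this example. It mimics a line-by-line (JSON Lines) reader, which is roughly what lines=True asks pandas to be, and compares it with parsing the whole text as one document.

```python
import json

# Shortened, hypothetical stand-in for a multi-line GeoJSON file.
geojson_text = """{
"type": "FeatureCollection",
"features": [
{ "type": "Feature", "properties": { "id": 7579 }, "geometry": { "type": "Point", "coordinates": [ -78.1422444, 41.3286117 ] } }
]
}"""


def parse_as_json_lines(text):
    """Parse every non-empty line as its own JSON document (JSON Lines style)."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


try:
    parse_as_json_lines(geojson_text)
    line_mode_ok = True
except json.JSONDecodeError:
    # The first line is just "{", which is not a complete JSON value.
    line_mode_ok = False

# Parsing the whole text as a single JSON document works.
document = json.loads(geojson_text)
print(line_mode_ok, document["type"], len(document["features"]))
```

The line-by-line attempt fails on the very first line, while the single-document parse succeeds, which is why dropping lines=True or using one of the approaches below is needed.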

So, to get everything working you can either:

  1. As was commented by @SalimRodríguez, try to read your GeoJSON with GeoPandas

    Output data format: GeoDataFrame

    import geopandas as gpd
    absolute_path_to_file = 'C:/Documents/Python Scripts/address.geojson'
    addresses = gpd.read_file(absolute_path_to_file)
    print(addresses)
     X Y ... id geometry
    0 -78.142244 41.328612 ... 7579.0 POINT (-78.14224 41.32861)
    1 -78.143584 41.328404 ... 4502.0 POINT (-78.14358 41.32840)
    2 -78.171106 41.328213 ... 8448.0 POINT (-78.17111 41.32821)
    3 -78.142928 41.328288 ... NaN POINT (-78.14293 41.32829)
    4 -78.142717 41.328273 ... 6447.0 POINT (-78.14272 41.32827)
    5 -78.143346 41.328231 ... 4466.0 POINT (-78.14335 41.32823)
    6 -78.143240 41.328218 ... NaN POINT (-78.14324 41.32822)
    7 -78.171516 41.328097 ... 7434.0 POINT (-78.17152 41.32810)
    
  2. If geometry is not important, you can skip it by parsing your GeoJSON as ordinary JSON and keeping only the properties of each feature

    Output data format: DataFrame

    import json
    import pandas as pd
    absolute_path_to_file = 'C:/Documents/Python Scripts/address.geojson'
    with open(absolute_path_to_file) as f:
        data = json.load(f)
    raw_data = [feature['properties'] for feature in data['features']]
    addresses = pd.DataFrame(raw_data)
    print(addresses)
     X Y hash ... region postcode id
    0 -78.142244 41.328612 93dd7b7e3ee3e8af ... None None 7579.0
    1 -78.143584 41.328404 853eb0c5f6e70fe3 ... None None 4502.0
    2 -78.171106 41.328213 99a13ba635404d80 ... None None 8448.0
    3 -78.142928 41.328288 70319cf9e435b858 ... None None NaN
    4 -78.142717 41.328273 759f051e7a587eb2 ... None None 6447.0
    5 -78.143346 41.328231 9fbb571fc16a6cb2 ... None None 4466.0
    6 -78.143240 41.328218 8f837d813626f1e1 ... None None NaN
    7 -78.171516 41.328097 5004ba87bd6e668b ... None None 7434.0
    
  3. If geometry still matters, parse your GeoJSON as ordinary JSON in a slightly different way, building a shapely Point for each feature

    Output data format: DataFrame

    import json
    import pandas as pd
    from shapely.geometry import Point
    absolute_path_to_file = 'C:/Documents/Python Scripts/address.geojson'
    with open(absolute_path_to_file) as f:
        data = json.load(f)
    # The dict union operator | requires Python 3.9 or newer
    raw_data = [
        feature['properties'] | {'geometry': Point(feature['geometry']['coordinates'])}
        for feature in data['features']
    ]
    addresses = pd.DataFrame(raw_data)
    print(addresses)
     X Y ... id geometry
    0 -78.142244 41.328612 ... 7579.0 POINT (-78.1422444 41.3286117)
    1 -78.143584 41.328404 ... 4502.0 POINT (-78.143584 41.3284045)
    2 -78.171106 41.328213 ... 8448.0 POINT (-78.1711061 41.3282128)
    3 -78.142928 41.328288 ... NaN POINT (-78.1429278 41.3282883)
    4 -78.142717 41.328273 ... 6447.0 POINT (-78.1427173 41.3282733)
    5 -78.143346 41.328231 ... 4466.0 POINT (-78.1433463 41.3282308)
    6 -78.143240 41.328218 ... NaN POINT (-78.1432403 41.3282179)
    7 -78.171516 41.328097 ... 7434.0 POINT (-78.1715165 41.3280965)
    

If a GeoDataFrame is still needed as the final output, it can be built from either result (after import geopandas as gpd):

  • for option (2):

     gdf = gpd.GeoDataFrame(addresses, geometry=gpd.points_from_xy(addresses["X"], addresses["Y"]))
    
  • or for option (3):

     gdf = gpd.GeoDataFrame(addresses, geometry=addresses["geometry"])
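If separate longitude and latitude columns are wanted instead of a geometry object, the coordinates can be read directly from each feature while flattening it. The following is a minimal sketch with the standard library only. The helper name feature_to_row and the column names lon and lat are my own choices, the sample is a shortened stand-in for the file (the second feature with a null geometry is hypothetical), and only Point geometries are handled.

```python
import json


def feature_to_row(feature):
    """Flatten one GeoJSON feature into a plain dict (hypothetical helper)."""
    row = dict(feature["properties"])
    geometry = feature.get("geometry")
    if geometry is not None and geometry["type"] == "Point":
        # GeoJSON stores coordinates as [longitude, latitude].
        row["lon"], row["lat"] = geometry["coordinates"][:2]
    else:
        # Missing or non-Point geometry: leave the coordinates empty.
        row["lon"], row["lat"] = None, None
    return row


# Shortened sample; the second feature is a made-up case without geometry.
sample = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    { "type": "Feature",
      "properties": { "number": "501", "street": "CASTLE GARDEN RD", "id": 7579 },
      "geometry": { "type": "Point", "coordinates": [ -78.1422444, 41.3286117 ] } },
    { "type": "Feature",
      "properties": { "number": null, "street": null, "id": null },
      "geometry": null }
  ]
}
""")

rows = [feature_to_row(feature) for feature in sample["features"]]
for row in rows:
    print(row)
```

The resulting list of dicts can be passed to pd.DataFrame(rows) in the same way as raw_data in option (2).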
    


answered Jun 14, 2022 at 6:14
  • This is a high-quality answer. I learned new stuff, thank you Commented Jun 14, 2022 at 12:36
  • How do I split the 'geometry' column into long and lat? Commented Sep 10, 2022 at 23:23
