I'm not sure whether this is a bug or I'm just not structuring the data correctly. I couldn't find any examples of writing maps.
Given a table with a simple schema containing a map field:
from pyiceberg.schema import Schema
from pyiceberg.types import StringType, MapType, NestedField
map_type = MapType(key_id=1001, key_type=StringType(), value_id=1002, value_type=StringType())
schema = Schema(NestedField(field_id=1, name='my_map', field_type=map_type))
table = catalog.create_table(..., schema=schema)
table
map.test(
  1: my_map: optional map<string, string>
),
partition by: [],
sort order: [],
snapshot: null
I first construct an Arrow table with the converted schema:
import pyarrow as pa
data = {'my_map': [{'symbol': 'BTC'}]}
pa_table = pa.Table.from_pydict(data, schema=schema.as_arrow())
pyarrow.Table
my_map: map<large_string, large_string>
  child 0, entries: struct<key: large_string not null, value: large_string not null> not null
      child 0, key: large_string not null
      child 1, value: large_string not null
----
my_map: [[keys:["symbol"]values:["BTC"]]]
When writing, though, schema validation complains that the key and value fields are missing:
>>> table.append(pa_table)
┏━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃    ┃ Table field                             ┃ Dataframe field                         ┃
┡━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ ✅ │ 1: my_map: optional map<string, string> │ 1: my_map: optional map<string, string> │
│ ❌ │ 2: key: required string                 │ Missing                                 │
│ ❌ │ 3: value: required string               │ Missing                                 │
└────┴─────────────────────────────────────────┴─────────────────────────────────────────┘
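One detail in the output above may be worth checking: the validation table lists the key and value fields under IDs 2 and 3, while the schema passed to `create_table` declared them as 1001 and 1002. If the catalog assigned fresh field IDs when the table was created, and if the check pairs fields by ID, then an Arrow schema derived from the original `schema` object would still carry the old IDs. Both of those are assumptions, not confirmed behavior. The following is a toy model of that idea in plain Python, not pyiceberg's actual code; `match_by_id` and the dict layout are hypothetical.

```python
# Toy model only: this is NOT pyiceberg's implementation.
# Each schema is a hypothetical {field_id: description} mapping.

def match_by_id(table_fields, dataframe_fields):
    """Pair each table field with the dataframe field sharing its ID."""
    return {
        field_id: (desc, dataframe_fields.get(field_id, "Missing"))
        for field_id, desc in table_fields.items()
    }

# IDs as printed in the validation output above.
table_fields = {
    1: "my_map: optional map<string, string>",
    2: "key: required string",
    3: "value: required string",
}

# IDs as declared in the original Schema object (key_id=1001, value_id=1002).
dataframe_fields = {
    1: "my_map: optional map<string, string>",
    1001: "key: required string",
    1002: "value: required string",
}

report = match_by_id(table_fields, dataframe_fields)
for field_id, (table_desc, df_desc) in report.items():
    status = "ok" if df_desc != "Missing" else "MISSING"
    print(f"{status:8}{field_id}: {table_desc} | {df_desc}")
```

If that is what is happening here, building the Arrow table from `table.schema().as_arrow()` instead of `schema.as_arrow()` might be worth trying, since the table's own schema would carry whatever IDs the catalog assigned.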
asked Feb 7, 2025 at 18:16
bphi
Try defining the data with explicit "key" and "value" entries; I think the schema needs them. I would try it like this: data = {'my_map': [[{"key": "symbol", "value": "BTC"}]]} – r5tr3, Feb 7, 2025 at 18:29
Same error, unfortunately. Arrow parses that into the same table structure as in the question. – bphi, Feb 7, 2025 at 18:50