In PyIceberg 0.10.0, it is now possible to pass a botocore session to a REST catalog, so:

import io
import os

import pandas as pd
import pyarrow as pa
from boto3 import Session
from pyiceberg.catalog import load_catalog

boto3_session = Session(profile_name='a_profile', region_name='us-east-1')
catalog = load_catalog(
    "catalog",
    type="rest",
    botocore_session=boto3_session._session,
    warehouse="arn:aws:s3tables:us-east-1:XXXXXXXXXXX:bucket/a_bucket",
    uri="https://s3tables.us-east-1.amazonaws.com/iceberg",
    **{
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "s3tables",
        "rest.signing-region": "us-east-1",
    },
)

table = catalog.load_table("namespace.a_table")
json_string = "[{\"data\":\"000000000000\", ...}]"
df = pd.read_json(io.StringIO(json_string), orient='records')
arrow_table = pa.Table.from_pandas(df=df, schema=table.schema().as_arrow())
table.overwrite(arrow_table)

Everything works until the overwrite call, which fails with:

OSError: When reading information for key 'metadata/snap-6778585584222594295-0-3ae9518f-fd1c-488f-b3d2-4ca1724317a1.avro' in bucket '2c8e7acb-67a1-4dc9-8ym9eg38966b8bazzfjn487w5o9wruse1b--table-s3': AWS Error UNKNOWN (HTTP status 400) during HeadObject operation: No response body.

As a workaround, we can export the session's credentials as environment variables before the overwrite:

boto3_session = Session(profile_name='a_profile', region_name='us-east-1')
catalog = load_catalog(
    "catalog",
    type="rest",
    botocore_session=boto3_session._session,
    warehouse="arn:aws:s3tables:us-east-1:XXXXXXXXXXX:bucket/a_bucket",
    uri="https://s3tables.us-east-1.amazonaws.com/iceberg",
    **{
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "s3tables",
        "rest.signing-region": "us-east-1",
    },
)

table = catalog.load_table("namespace.a_table")
json_string = "[{\"data\":\"000000000000\", ...}]"
df = pd.read_json(io.StringIO(json_string), orient='records')
arrow_table = pa.Table.from_pandas(df=df, schema=table.schema().as_arrow())

# Export the session's (possibly temporary) credentials so the S3 FileIO
# used for data files can pick them up from the environment.
credentials = boto3_session.get_credentials().get_frozen_credentials()
os.environ["AWS_ACCESS_KEY_ID"] = credentials.access_key
os.environ["AWS_SECRET_ACCESS_KEY"] = credentials.secret_key
if credentials.token:
    os.environ["AWS_SESSION_TOKEN"] = credentials.token

table.overwrite(arrow_table)

which works but defeats the purpose of passing a botocore session in the first place.

Read-only catalog calls such as .schema() work fine, so it seems the overwrite path is not using the proper SigV4Adapter (pyiceberg/catalog/rest/__init__.py).

I am not able to fix it. I'd like to avoid setting environment variables to access Iceberg tables in S3 Tables buckets.
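A slightly less global variant of the env-var workaround might be to pass the frozen credentials to load_catalog as S3 FileIO properties ("s3.access-key-id", "s3.secret-access-key", "s3.session-token", "s3.region", per PyIceberg's configuration docs) instead of mutating os.environ. The helper below is a hypothetical sketch of that mapping; it still freezes the credentials once, so it does not solve the refreshable-credentials case either:

```python
def s3_props_from_credentials(credentials, region="us-east-1"):
    """Map frozen botocore credentials to PyIceberg S3 FileIO properties.

    `credentials` is expected to look like the object returned by
    boto3_session.get_credentials().get_frozen_credentials(), i.e. it
    has .access_key, .secret_key, and an optional .token attribute.
    """
    props = {
        "s3.access-key-id": credentials.access_key,
        "s3.secret-access-key": credentials.secret_key,
        "s3.region": region,
    }
    # Temporary (STS/profile) credentials carry a session token; static
    # IAM-user credentials do not, so only set the property when present.
    if getattr(credentials, "token", None):
        props["s3.session-token"] = credentials.token
    return props
```

These properties would then be merged into the existing load_catalog(**{...}) kwargs alongside the rest.sigv4-* options, keeping the credentials scoped to this one catalog instead of the whole process.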

asked Oct 22, 2025 at 21:52
  • Did you manage to solve the issue? I have the same issue trying to use refreshable credentials. Somehow pyiceberg needs the credentials (key/token) to work properly - related issue: using botocore Commented Dec 9, 2025 at 8:41
  • No, it's still an issue: github.com/apache/iceberg-python/issues/2657 Commented Dec 9, 2025 at 13:53
