
Introspection query used for array with native types #1015

Answered by antonmyronyuk
ipeluffo asked this question in Q&A

Hi all,

I found that asyncpg runs an introspection query that was taking too long and causing issues on a Postgres 12 instance.

After doing some research I found:

  1. Slow introspection when using multiple custom types in a query #530
  2. https://docs.sqlalchemy.org/en/14/dialects/postgresql.html#disabling-the-postgresql-jit-to-improve-enum-datatype-handling
  3. Delay on first access to enum columns using prepared statements #727

However, most (if not all) of the people reporting this issue use custom types such as enums, whereas in my case the query is introspecting array types.

Below is the result of the query run by asyncpg:

+------+------------+----------+------+----------+----------+-----------+---------------+-------------+-----------+-------+---------------+-------------------+--------------------+
| oid  | ns         | name     | kind | basetype | elemtype | elemdelim | range_subtype | attrtypoids | attrnames | depth | basetype_name | elemtype_name     | range_subtype_name |
+------+------------+----------+------+----------+----------+-----------+---------------+-------------+-----------+-------+---------------+-------------------+--------------------+
| 1043 | pg_catalog | varchar  | b    | null     | 0        | null      | null          | null        | null      | 1     | null          | -                 | null               |
| 2950 | pg_catalog | uuid     | b    | null     | 0        | null      | null          | null        | null      | 1     | null          | -                 | null               |
| 2951 | pg_catalog | _uuid    | b    | null     | 2950     | ,         | null          | null        | null      | 0     | null          | uuid              | null               |
| 1015 | pg_catalog | _varchar | b    | null     | 1043     | ,         | null          | null        | null      | 0     | null          | character varying | null               |
+------+------------+----------+------+----------+----------+-----------+---------------+-------------+-----------+-------+---------------+-------------------+--------------------+

Then, my questions are:

  1. Why does asyncpg need to do introspection of arrays of native types? Is this behaviour expected?
  2. Is it possible to register codecs to prevent asyncpg from running the introspection query?

It's worth mentioning that I've seen the suggestion to disable JIT in Postgres, which is an option I'm considering.
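For reference, the JIT workaround is a one-line session setting with asyncpg. This is a minimal sketch, assuming the PostgreSQL `jit` GUC (available since PostgreSQL 11) and using a placeholder DSN; `connect_kwargs` is just an illustrative helper, not part of asyncpg:

```python
def connect_kwargs(dsn: str) -> dict:
    """Build keyword arguments for asyncpg.connect with JIT disabled."""
    return {
        "dsn": dsn,
        # server_settings are applied as session GUCs at connect time, so
        # "jit": "off" disables JIT for every query on this connection,
        # including asyncpg's own introspection query.
        "server_settings": {"jit": "off"},
    }

# usage (requires asyncpg and a running server, so not executed here):
# conn = await asyncpg.connect(**connect_kwargs("postgresql://localhost/mydb"))
```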

Thanks in advance for any help





I faced this exact issue with CockroachDB when querying arrays of UUIDs and arrays of varchars, and here's how I got around it. I hope it's helpful for anyone facing this.

from time import time

from sqlalchemy import event


# original function to get these records
async def set_array_type_codecs(conn):
    """
    CockroachDB stores array types of native types in the same namespace as
    the native types themselves in 'pg_catalog.pg_type'. asyncpg assumes new
    composite types will never live there, so it has no built-in codecs for
    these arrays and falls back to a very expensive introspection query to
    infer them.

    This lists all array types in CockroachDB, creates codecs for them, and
    registers them.

    Some array types are commented out because they are not used in our code.

    Solution loosely based on the one in
    https://github.com/MagicStack/asyncpg/issues/413
    """
    start = time()
    # Got the array types by querying:
    #   SELECT * FROM pg_catalog.pg_type WHERE typcategory = 'A';
    typenames = [
        "_bit",
        "_bool",
        # "_box2d",
        "_bpchar",
        "_bytea",
        "_char",
        "_date",
        "_float4",
        "_float8",
        # "_geography",
        # "_geometry",
        # "_inet",
        "_int2",
        # "_int2vector",
        "_int4",
        "_int8",
        "_interval",
        # "_jsonb",
        "_name",
        "_numeric",
        # "_oid",
        # "_oidvector",
        # "_record",
        # "_regclass",
        # "_regnamespace",
        # "_regproc",
        # "_regprocedure",
        # "_regrole",
        # "_regtype",
        "_text",
        "_time",
        "_timestamp",
        "_timestamptz",
        "_timetz",
        # "_tsquery",
        # "_tsvector",
        "_uuid",
        # "_varbit",
        "_varchar",
    ]
    conn._check_open()
    timeout = 5  # seconds
    # Shape the result to match asyncpg's own introspection output.
    TYPE_BY_NAMES = f"""\
SELECT
    t.oid,
    t.typelem AS elemtype,
    t.typtype AS kind,
    t.typname AS name,
    parent.typname AS elemtype_name,
    'pg_catalog' AS ns,
    parent.typdelim AS elemdelim,
    0 AS depth,
    NULL AS basetype,
    NULL AS range_subtype,
    NULL AS attrtypoids,
    NULL AS attrnames,
    NULL AS basetype_name,
    NULL AS range_subtype_name
FROM
    pg_catalog.pg_type AS t
LEFT JOIN
    pg_catalog.pg_type AS parent ON parent.oid = t.typelem
WHERE
    t.typname IN {str(tuple(typenames))}
"""
    array_types_records = []
    query_args = []
    array_types = await conn._execute(
        TYPE_BY_NAMES,
        query_args,
        0,
        timeout,
        ignore_custom_codec=True,
    )
    array_types_records.extend(array_types)
    conn._protocol.get_settings().register_data_types(array_types_records)
    end = time()
    print(f"Finished in {end - start} seconds")


# Registered from inside a class that holds the SQLAlchemy AsyncEngine
# as self.engine:
@event.listens_for(self.engine.sync_engine, "connect")
def register_custom_types(dbapi_connection, connection_record):
    # dbapi_connection is SQLAlchemy's adapted asyncpg connection here;
    # run_async bridges into the driver's event loop.
    dbapi_connection.run_async(set_array_type_codecs)

And then later I turned the output in array_types_records into a hardcoded list of dictionaries, so that it doesn't spend time running the query on every connection.
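That caching step can be sketched generically. This is a hypothetical helper (the `fetch` coroutine stands in for the introspection query above): the query runs once per process, and every later connection reuses the same records:

```python
import asyncio

# Hypothetical module-level cache: the introspection query runs once per
# process; subsequent connections reuse the cached records.
_array_type_records = None


async def get_array_type_records(fetch):
    """Return cached type records; `fetch` is any coroutine function
    that performs the (expensive) introspection query."""
    global _array_type_records
    if _array_type_records is None:
        _array_type_records = await fetch()
    return _array_type_records
```

In the setup above you would pass a small wrapper around `conn._execute(TYPE_BY_NAMES, ...)` as `fetch`, then hand the cached records to `register_data_types` as before.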


@elprans Is this ever going to be fixed? I think there are ~5 issues similar to this one.


The fix is actually in the master branch: 5c9986c

A new version needs to be released.

Answer selected by ipeluffo