I'm trying to calculate the maximum length of the NUMERIC columns in a Postgres database. There are a number of tables in the database, and most of them contain several numeric columns.
I'm importing a fairly large amount of JSON data into the database. SQLModel/pydantic fails to insert numeric fields if the destination column's precision/scale is smaller than that of the input. For now I'm seeding the data into generic NUMERIC(16,5)
columns, but I'd like to reduce storage space by right-sizing the columns. (Mine is a semi-read-only dataset; the column sizes won't change much in the future.)
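Once I know the real maximums, my plan is to shrink each column with something along these lines (my_table, amount, and the numeric(9,2) target are just placeholder names for illustration):

-- Shrink a column once its true precision/scale are known
-- (my_table / amount / numeric(9,2) are placeholders)
ALTER TABLE my_table
    ALTER COLUMN amount TYPE numeric(9, 2);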
For reference, the following is my abortive stab at solving the problem:
SELECT
table_schema,
TABLE_NAME,
COLUMN_NAME,
(
xpath (
'/row/max/text()',
query_to_xml (
format (
'SELECT LENGTH ( CAST ( MAX ( %I ) AS CHARACTER VARYING ( 40 ) ) ) from %I.%I',
COLUMN_NAME,
table_schema,
TABLE_NAME
),
TRUE,
TRUE,
''
)
)
) [ 1 ] :: TEXT :: INT AS max_length
FROM
information_schema.COLUMNS
WHERE
table_schema = 'public'
AND data_type = 'numeric'
ORDER BY
table_schema,
TABLE_NAME,
COLUMN_NAME;
Even better would be to split the max column lengths into precision & scale.
2 Answers
Since the numbers come from an external source, the best place to check them is outside of the database. Use any language, scripting or compiled, to scan the JSON files. You can also use a tool like jq to extract the field and find the maximum value.
PostgreSQL by itself can handle numeric values up to ridiculous sizes:
up to 131072 digits before the decimal point; up to 16383 digits after the decimal point
But if numeric(16,5) works for you now, then you can replace it with the bigint datatype and store the value as a scaled integer. Just do not forget to divide it by 10^5 (100,000) on the client before displaying it. Speed-wise you will get an improvement: a bigint is just 8 bytes and is supported natively by 64-bit processors.
Splitting the value into two integers would also be possible, but if your values really require (16,5), the plain integer data type may not be enough for the integer part: a 4-byte integer cannot hold values up to 10^11, so you will need to go to bigint anyway.
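If you do go the scaled-bigint route, the conversion could look roughly like this (just a sketch; my_table and amount are made-up names, and it assumes the column currently holds numeric(16,5) values):

-- Convert numeric(16,5) into a scaled bigint: shift 5 decimal places
-- (my_table / amount are placeholder names)
ALTER TABLE my_table
    ALTER COLUMN amount TYPE bigint
    USING (round(amount * 100000))::bigint;

The client then divides by 100000 again before displaying the value.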
In the sample code you showed you are working with XML, but earlier you stated that the data comes as JSON. So which one is it? If you want to parse the JSON blobs in the database, you can use the JSON functions. Or was it a mistake and your data actually comes in XML format?
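For example, if the raw rows were first staged in a jsonb column, something like this sketch could measure the digits straight from the JSON text (staging and price are made-up names):

-- Hypothetical staging table: staging(doc jsonb) with a numeric field "price"
-- Longest integer part (sign stripped) and longest fractional part seen so far
SELECT
    max(length(ltrim(split_part(doc ->> 'price', '.', 1), '-'))) AS max_int_digits,
    max(length(split_part(doc ->> 'price', '.', 2)))             AS max_frac_digits
FROM staging
WHERE doc ? 'price';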
The input data is JSON. The code shown above is stolen from dba.stackexchange.com/a/215809/273417 ... unfortunately I'm not familiar with the XML functions and wasn't able to adapt it for NUMERIC data types. – masroore, May 20, 2023 at 3:36
Alright, I got it working. Fixed the column name in the XPath (length):
SELECT
table_schema,
TABLE_NAME,
COLUMN_NAME,
(
-- query_to_xml() runs the per-column query and returns its one-row result as XML;
-- the inner query's output column is named "length" (after the length() function),
-- so the XPath has to read /row/length, not /row/max.
xpath (
'/row/length/text()',
query_to_xml (
format (
'SELECT LENGTH ( CAST ( MAX ( %I ) AS CHARACTER VARYING ( 40 ) ) ) from %I.%I',
COLUMN_NAME,
table_schema,
TABLE_NAME
),
TRUE,
TRUE,
''
)
)
) [ 1 ] :: TEXT :: INT AS max_length
FROM
information_schema.COLUMNS
WHERE
table_schema = 'public'
AND data_type = 'numeric'
ORDER BY
table_schema,
TABLE_NAME,
COLUMN_NAME;
Is there a way to split the above values into numeric scale and precision?
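The closest I have so far is a per-column sketch like the one below (my_table / my_col are placeholders; the text round-trip strips the trailing zeros that NUMERIC(16,5) pads onto every value). On PostgreSQL 13+ the min_scale() function would give the needed fractional digits directly.

-- max_scale      : longest fractional part with trailing zeros stripped
-- max_int_digits : longest integer part, minus sign ignored
-- a candidate declaration is numeric(max_int_digits + max_scale, max_scale)
SELECT
    max(length(rtrim(split_part(my_col::text, '.', 2), '0'))) AS max_scale,
    max(length(ltrim(split_part(my_col::text, '.', 1), '-'))) AS max_int_digits
FROM my_table;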
Whether you declare numeric(3,2) or numeric(100,50) doesn't matter space-wise. The more digits a value actually has, the more space it takes.
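If you want to check that on your own values, pg_column_size() reports how many bytes a given value occupies, e.g.:

-- Compare the stored size of the same value under different declarations
SELECT pg_column_size(1.5::numeric)       AS plain_numeric,
       pg_column_size(1.5::numeric(16,5)) AS numeric_16_5,
       pg_column_size(150000::bigint)     AS scaled_bigint;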