In this answer (https://stackoverflow.com/questions/517579/strings-as-primary-keys-in-sql-database) a single remark caught my eye:
Also keep in mind that there's often a very big difference between a CHAR and a VARCHAR when doing index comparisons
Does this apply / still apply for Postgres?
I found pages on Oracle claiming that CHAR
is more or less an alias for VARCHAR
and so index performance is the same, but I found nothing definitive on Postgres.
4 Answers 4
CHAR
and VARCHAR
are implemented exactly the same in Postgres (and Oracle). There is no difference in speed when using those data types.
However, there is one difference that can make a difference in performance: a char
column is always padded to the defined length. So if you define a column as char(100)
and one as varchar(100)
but only store 10 characters in each, the char(100)
column uses 100 characters for each value (the 10 characters you stored, plus 90 spaces), whereas the varchar
column only stores 10 characters.
Comparing 100 character with 100 characters is going to be slower than comparing 10 characters with 10 characters - although I doubt you can actually measure this difference in a SQL query.
If you declare both with the length of 10 characters and always store exactly 10 characters in them, then there is absolutely no difference whatsoever (this is true for Oracle and Postgres)
So the only difference is the padding that is done for the char
data type.
Also keep in mind that there's often a very big difference between a CHAR and a VARCHAR when doing index comparisons
The above quote is only true if (and only if) the char
column is defined too wide (i.e. you are wasting space due to padding). If the length of the char
column is always used completely (so no padding occurs), then the above quote is wrong (at least for Postgres and Oracle)
From my point of view, the char
data type does not really have any real-word use. Just use varchar
(or text
in Postgres) and forget that char
exists.
-
4Comparing 100 character with 100 characters is going to be slower than comparing 10 characters with 10 characters - although I doubt you can actually measure this difference in a SQL query. – Depending on what the query does in addition to sorting, the difference can be huge. That’s why Postgres 9.5 has a new "abbreviated keys" feature: pgeoghegan.blogspot.de/2015/01/…chirlu– chirlu2016年01月12日 23:57:38 +00:00Commented Jan 12, 2016 at 23:57
I agree with everything said by a_horse_with_no_name, and I generally agree with Erwin's comment advice:
No, char is inferior (and outdated). text and varchar perform (almost) the same.
Metadata
With one minor exception, the only time I use char()
is when I want the meta-data to say this MUST have have x-characters. Though I know that char()
only complains if the input is over the limit, I'll frequently protect against underruns in a CHECK
constraint. For example,
CREATE TABLE foo (
x char(10) CHECK ( length(x) = 10 )
);
INSERT INTO foo VALUES (repeat('x', 9));
I do this for a few reasons,
char(x)
is sometimes inferred with schema-loaders as being a fixed-width column. This may make a difference in a language that is optimized for fixed-width strings.- It establishes a convention that makes sense and is easily enforced. I can write a schema-loader in a language to generate code from this convention.
Need an example of where I may do this,
- Two-letter state abbreviations, though because this list can be enumerated, I'll typically do it with an
ENUM
. - Vehicle Identification Numbers
- Model Numbers (of fixed size)
On errors
Notice some people may be uncomfortable with the incongruity of error messages on both sides of the limit, but it doesn't bother me
test=# INSERT INTO foo VALUES (repeat('x', 9));
ERROR: new row for relation "foo" violates check constraint "foo_x_check"
DETAIL: Failing row contains (xxxxxxxxx ).
test=# INSERT INTO foo VALUES (repeat('x', 11));
ERROR: value too long for type character(10)
Contrast with varchar
Moreover, I think the above suggestion fits really well with a convention of almost always use text
. You ask about varchar(n)
too. I never use that. At least, I can't remember the last time I used varchar(n)
.
- If a spec has a static-width field that I trust, I use
char(n)
, - Otherwise, I use
text
which is effectivelyvarchar
(no limit)
If I found a spec that had variable-length text-keys that were meaningful and that I trusted to have a constant max-length, I would use varchar(n)
too. However, I can't think of anything that fits that criteria.
Additional notes
char
here is not to be confused with"char"
which is a one-byte type and has solid performance and space-saving benefits.
Related Q & A:
-
1And then there is Don't use char(n) even for fixed-length identifiersuser1822– user18222020年08月27日 12:07:30 +00:00Commented Aug 27, 2020 at 12:07
Postgresql
sales_reporting_db=# create table x (y char(2));
CREATE TABLE
sales_reporting_db=# insert into x values ('Y');
INSERT 0 1
sales_reporting_db=# select '*' || y || '*' from x;
?column?
----------
*Y*
Oracle
SQL> create table x ( y char(2));
Table created.
SQL> insert into x values ('Y');
1 row created.
SQL> select '*' || y || '*' from x;
'*'|
----
*Y *
Postgresql did not pad with spaces.
-
1That's just an optical illusion in Postgres. Try
SELECT pg_column_size(y) FROM x;
András Váczi– András Váczi2018年05月20日 10:17:10 +00:00Commented May 20, 2018 at 10:17
I found this most useful, and a fast 3 line explanation:
From CHAR(n) Vs VARCHAR(N) Vs Text In Postgres
- If you want to store some text with an unknown length, use the
TEXT
data type.- If you want to store some text with an unknown length, but you know the maximum length, use
VARCHAR(n)
.- If you want to store some text with a known exact length, use
CHAR(N)
.
-
2I was wondering why this got voted down. I think it's because check constraints and the like are considered a better way of enforcing string length because there is no penalty for changing them later. Due to that, most people suggest always using TEXT type.ChetPrickles– ChetPrickles2020年05月15日 01:03:54 +00:00Commented May 15, 2020 at 1:03