We use PostgreSQL 9.1.7 on Ubuntu Linux 12.04 on a master server and PostgreSQL 9.1.7 on FreeBSD 9.0-RELEASE on a replica server. The replica and master servers return different results on the same SQL query. A query plan shows that an index (a BTree one, we do not use hash indexes at all) is used to get the result so it looks like an index is in inconsistent or incomplete state on the replica server. The query on the master server:
db1=# select id from users where email='[email protected]';
id
---------
1698116
(1 row)
db1=#
The query on the replica server:
db1=> select id from users where email='[email protected]';
id
----
(0 rows)
db1=> select created_at from users where id=1698116;
created_at
----------------------------
2013年03月04日 10:40:05.221214
(1 row)
db1=>
As you can see the replica DB already contains a user with proper ID so the data is in place but just not indexed yet for some reason. We double checked the replica was in receiving/reapplying state so this was not a temporary outage. The user never got indexed. We also used to experience similar problems with PostgreSQL 9.0 on CentOS 5.6 so we don't think this is something FreeBSD- or PostgreSQL 9.1-specific.
We use the replica server to run lots of heavy SQL queries, can this be a root of the problem? Anyway how can we efficiently detect and prevent situations like this in the future? The replica was not down today and there was no single error line in logs so we detected this inconsistency only by occasion.
1 Answer 1
Assuming the locale of the database is en_GB.UTF-8
both in Ubuntu (master) and FreeBSD (slave), I believe the differences in sort semantics alone may account for the fact that the index is unusable on the slave.
Here's an example of how they sort differently:
On Ubuntu 12.04:
$ export LANG=en_GB.UTF-8
$ cat >file
"0102"
0102
$ sort file
0102
"0102"
On FreeBSD 9.0-RELEASE:
$ export LANG=en_GB.UTF-8
$ cat >file
"0102"
0102
$ sort file
"0102"
0102
This shows that the same locale orders differently the two strings "0102"
and 0102
(even though they do not even contain any character outside the US-ASCII set...)
Here's a test that I suggest you try on your own dataset:
On the master:
$ psql -d dbname -Atc 'select email from users' | LC_COLLATE=en_GB.UTF-8 >email.master
On the slave:
$ psql -d dbname -Atc 'select email from users' | LC_COLLATE=en_GB.UTF-8 >email.slave
Now compare email.master
and email.slave
with diff or cmp. I suspect you'll find that they are not identical. In which case it demonstrates that the index replica can't be used, since its build rules on the master differ from the scanning rules on the slave.
-
1Yes, if you replicate to an OS with a different locale implementation, things will break badly.Peter Eisentraut– Peter Eisentraut2013年03月04日 16:49:22 +00:00Commented Mar 4, 2013 at 16:49
Explore related questions
See similar questions with these tags.
locale
is used and can it be trusted to compare strings exactly the same between linux and freebsd? If it can't, that may cause the sort of problem you're describing.LANG
set toC
and the master host hasLANG
set toen_GB.UTF-8
. Most other rows in theusers
table can be found by theiremail
fields on the replica DB without a problem (well we did not check every other row just a couple of them).