Here is my table definition
-- Table: public.my_table
-- DROP TABLE public.my_table;
CREATE TABLE public.my_table
(
my_row_id text COLLATE pg_catalog."default" NOT NULL,
my_item_id text COLLATE pg_catalog."default" NOT NULL,
my_timestamp timestamp with time zone NOT NULL DEFAULT now(),
-- ...... some other data
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
ALTER TABLE public.my_table
OWNER to my_db;
The cardinality of the table is about 10-20 million rows
Here is the query I want to speed up
select my_row_id, my_item_id
from my_table
where my_timestamp > timestamp '2018-09-13 00:00:00'
order by my_timestamp desc
I am tempted to just index the timestamp, but I worry that since the precision of timestamp is millisecond, this means the the cardinality of the index itself will be almost as large as the cardinality of the table (eg. the number of rows will be equal to the number of index entry)
Should I have custom index that index with the precision of a day?
Like store an extra column text
2018年08月17日
from timestamp with timezone
2018年08月17日 07:06:43.508778
And index the text instead?
-
1Indexed are based on your requirement, if you query frequently based on timestamp column, then its mandatory to make index. If it is too huge user partition.TheDataGuy– TheDataGuy2018年08月19日 10:46:38 +00:00Commented Aug 19, 2018 at 10:46
1 Answer 1
For the query you give, the optimal general index would be on (my_timestamp, my_row_id, my_item_id)
, with the order of the last two columns possibly switched. This will allow an index-only scan to be used which possibly doesn't visit the table at all, since all columns you reference would be available in the index. How much of a performance improvement this would give you depends on how much of your table meets the timestamp WHERE condition, and how well-vacuumed you keep the table. You might do even better with a partial index if the timestamp hardcoded into your query never changes, or changes in a very rote manner.
Like store an extra column text 2018年08月17日 from timestamp with timezone 2018年08月17日 07:06:43.508778
There is little reason to think that this would be helpful in general, and certainly wouldn't help with the query you show (you would have to rewrite the query before it could use such an index, and then what happens if the timestamp is exactly '2018-09-13 00:00:00' ?)