2

I'm using Postgres' full text search to return results, but it seems any query with 3 characters or fewer returns no results. Is there a way to configure Postgres so that I can query with 2 and 3 character long strings? My data set is extremely small, so performance would not be an issue.

For reference, here's my table definition:

CREATE TABLE doctokens (
 id serial primary key,
 inner_text TEXT NOT NULL,
 tokens tsvector
)
INSERT INTO doctokens (inner_text) VALUES ('Route builders give you a fluent syntax')

After an insert, I create the token tsvector:

UPDATE doctokens SET tokens = setweight(to_tsvector('english', COALESCE(inner_text, '')), 'A')

And here's how I'm querying the table:

SELECT ts_headline('english', inner_text, query, 'StartSel = <em>, StopSel = </em>, ShortWord = 1') as inner_text_highlights
FROM (SELECT inner_text, ts_rank_cd(tokens, query) AS rank, query
 FROM doctokens, plainto_tsquery('english', 'rou') AS query
 WHERE tokens @@ query
 ORDER BY rank DESC
 LIMIT 5) AS query_results;
asked Sep 2, 2019 at 19:14
1
  • I have other rows that contain something like Routing (inserted the same way as above), and searching for routi does not yield any results, either. So, maybe what I'm running into is just a limitation of the capabilities of Postgres' full text search? Commented Sep 2, 2019 at 20:34

1 Answer 1

2

'route' and 'routing' stem to 'rout'. Neither 'rou' nor 'routi' do. So, they don't match.

You can use the prefix match to make 'rou' match, but plainto_tsquery doesn't understand that so you have to make the tsquery yourself:

SELECT ts_headline('english', inner_text, query, 'StartSel = <em>, StopSel = </em>, ShortWord = 1') as inner_text_highlights
FROM (SELECT inner_text, ts_rank_cd(tokens, query) AS rank, query
 FROM doctokens, (select 'rou:*'::tsquery as query ) AS query
 WHERE tokens @@ query
 ORDER BY rank DESC
 LIMIT 5) AS query_results;
 inner_text_highlights 
--------------------------------------------------
 <em>Route</em> builders give you a fluent syntax
 Does <em>Routing</em> rhyme with looting?

Prefix matching won't let you match 'routi:*', because the things it would match against get the 'i' stripped off when stemming.

answered Sep 2, 2019 at 20:59
5
  • If I wanted to get the benefits of both search methods, should I just have two separate queries in my SELECT - once with plainto_tsquery() and one with prefix matching? Commented Sep 2, 2019 at 21:53
  • @Dave You could do that, probably by applying a UNION ALL (rather than an OR). But I think that the ts_rank_cd function, executed on two different query strings, might not return comparable results, so you might not want to sort them together. I'd probably just execute the plainto_tsquery one first, and then if it returned no rows, try the other one. (Or give the user a checkbox, 'plain' or 'advanced') Commented Sep 2, 2019 at 22:05
  • Is there any way to make routi also match, or is that not really possible with tsquery? I know I could resort to using WHERE inner_text LIKE '%routi%', but was curious if there is any way to make it work without using LIKE. Commented Sep 2, 2019 at 22:20
  • You could use a configuration which doesn't do stemming, like 'simple' rather than 'english'. But at that point, I don't know if you are really doing full text search anymore. If is really important, you could keep two tsvector columns or FTS indexes, with different configurations to choose between them. Or one FTS and one pg_trgm: You can search with LIKE and rank with <->, but pg_trgm doesn't offer something like ts_headline. Commented Sep 3, 2019 at 1:11
  • Thanks. That's the route I ultimately took - I created an English tsvector and a simple tsvector at insertion. Then, I query the English lexemes with plainto_tsquery('english', :query) and the simple lexemes using an approach similar to what you laid out, union the two data sets together, and grab the top 5 with the highest rank. Commented Sep 3, 2019 at 14:50

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.