I have the following table with data as shown below; linenr
s are monotonically increasing but not necessarily consecutive; when the key
field contains an ellipsis ...
that indicates an entry that is continued from above:
create table source (
linenr integer unique not null,
key text not null,
value text );
insert into source values
( 2, 'tags', 'a' ),
( 3, '...', 'b' ),
( 4, 'title', 'The Title' ),
( 5, 'note', 'this is' ),
( 6, '...', 'an EXAMPLE' ),
( 8, 'title', 'over' ),
( 9, '...', 'three' ),
( 10, '...', 'lines' ),
( 11, 'about', 'grouping' );
Now I want to have a view where a group number is assigned based on the content of the key
field; group numbers do not have to be consecutive but should be distinct for each group of rows that starts with a key other than ...
and continues thru all lines where key
is ...
, like this:
╔════════╤═══════╤═══════╤════════════╗
║ linenr │ group │ key │ value ║
╠════════╪═══════╪═══════╪════════════╣
║ 2 │ 1 │ tags │ a ║
║ 3 │ 1 │ ... │ b ║
║ 4 │ 2 │ title │ The Title ║
║ 5 │ 3 │ note │ this is ║
║ 6 │ 3 │ ... │ an EXAMPLE ║
║ 8 │ 4 │ title │ over ║
║ 9 │ 4 │ ... │ three ║
║ 10 │ 4 │ ... │ lines ║
║ 11 │ 5 │ about │ grouping ║
╚════════╧═══════╧═══════╧════════════╝
I tried to do this with windows / partitions and the tabibitosan pattern, but haven't been able to come up with anything working; also, using at the preceding row with lag()
doesn't help where there are multiple consecutive lines with ...
. In a spreadsheet this is an extremely easy thing to do but in SQL I can seemingly not refer to the previous row of the current query, can I?
2 Answers 2
create table source ( linenr integer unique not null, key text not null, value text ); insert into source values ( 2, 'tags', 'a' ), ( 3, '...', 'b' ), ( 4, 'title', 'The Title' ), ( 5, 'note', 'this is' ), ( 6, '...', 'an EXAMPLE' ), ( 8, 'title', 'over' ), ( 9, '...', 'three' ), ( 10, '...', 'lines' ), ( 11, 'about', 'grouping' );
You can get it using a simple CASE.
select linenr, key, value, case when key <> '...' then 1 end as rst from source;
linenr | key | value | rst -----: | :---- | :--------- | ---: 2 | tags | a | 1 3 | ... | b | null 4 | title | The Title | 1 5 | note | this is | 1 6 | ... | an EXAMPLE | null 8 | title | over | 1 9 | ... | three | null 10 | ... | lines | null 11 | about | grouping | 1
Now SUM(rst) ignores null values and do the trick.
select linenr, key, value, sum(rst) over (order by linenr) grp from ( select linenr, key, value, case when key <> '...' then 1 end as rst from source ) x;
linenr | key | value | grp -----: | :---- | :--------- | --: 2 | tags | a | 1 3 | ... | b | 1 4 | title | The Title | 2 5 | note | this is | 3 6 | ... | an EXAMPLE | 3 8 | title | over | 4 9 | ... | three | 4 10 | ... | lines | 4 11 | about | grouping | 5
dbfiddle here
-
I'm glad to be able to help.McNets– McNets2018年02月05日 21:48:37 +00:00Commented Feb 5, 2018 at 21:48
Cheaper and simpler with a single window function:
SELECT *
, count(key <> '...' OR NULL) OVER (ORDER BY linenr) AS grp
FROM source;
dbfiddle here
Related:
-
!!! this. To be sure, the accepted solution is probably more instructive to the neophyte; OTOH the lazy evaluation part of
key != '...' or null
is something useful to keep in mind. Problem solved with a crisp, performant one-liner, what's not to like?John Frazer– John Frazer2018年02月06日 10:24:29 +00:00Commented Feb 6, 2018 at 10:24
Explore related questions
See similar questions with these tags.