sql - delete duplicate rows but keep preferred row -
i have simple database table
create table demo ( id integer primary key, fv integer, sv text, rel_id integer, foreign key (rel_id) references demo(id));
and want delete duplicate rows grouped fv
, sv
. fairly popular question great answers. need twist on scenario. in cases rel_id
null
want keep row. in other case goes.
so using following values
insert demo (id,fv,sv,rel_id) values (1,1,'somestring',null), (2,2,'somemorestring',1), (3,1,'anotherstring',null), (4,2,'somemorestring',3), (5,1,'somestring',3)
either
id | fv | sv | rel_id ---+----+------------------+------- 1 | 1 | 'somestring' | null 2 | 2 | 'somemorestring' | 1 3 | 1 | 'anotherstring' | null
or
id | fv | sv | rel_id ---+----+------------------+------- 1 | 1 | 'somestring' | null 3 | 1 | 'anotherstring' | null 4 | 2 | 'somemorestring' | 3
would valid results. as
id | fv | sv | rel_id ---+----+------------------+------- 3 | 1 | 'anotherstring' | null 4 | 2 | 'somemorestring' | 3 5 | 1 | 'somestring' | 3
would not be. first entry had null
rel_id
takes presidency above not null
.
i have (which answer on basic duplicate question) query remove duplicates not sure how continue modify query fit needs.
delete demo id not in (select min(id) id demo group fv,sv)
as not null
entry inserted database before null
entry not null
1 not deleted. guaranteed rel_id
point entry rel_id
null
therefore there no danger of deleting referenced entry. further guaranteed there no 2 rows in same group rel_id null
. therefore row rel_id null
unique whole table.
or basic algorithm:
- go on rows , group them
fv
,sv
- look each group row
rel_id null
. if there keep row (and delete rest). else pick 1 row of choice , delete rest.
i seem have worked out
delete demo id not in (select min(id) id demo out_buff rel_id null or not exists (select id demo in_buff rel_id null , in_buff.fv = out_buff.fv , in_buff.sv = out_buff.sv) group fv,sv);
by selecting in inner select
either row rel_id
value null
or rows matching on group by
arguments, using anti-condition existence of row rel_id null
. query looks ineffective. naive assumption put running time @ at least o(n^2)
.
Comments
Post a Comment