sql - delete duplicate rows but keep preferred row -


i have simple database table

create table demo (   id     integer primary key,   fv     integer,   sv     text,   rel_id integer,   foreign key (rel_id)   references demo(id)); 

and want delete duplicate rows grouped fv , sv. fairly popular question great answers. need twist on scenario. in cases rel_id null want keep row. in other case goes.

so using following values

insert demo (id,fv,sv,rel_id) values (1,1,'somestring',null),        (2,2,'somemorestring',1),        (3,1,'anotherstring',null),        (4,2,'somemorestring',3),        (5,1,'somestring',3) 

either

id | fv |     sv           | rel_id ---+----+------------------+-------  1 | 1  | 'somestring'     | null  2 | 2  | 'somemorestring' | 1  3 | 1  | 'anotherstring'  | null 

or

id | fv |     sv           | rel_id ---+----+------------------+-------  1 | 1  | 'somestring'     | null  3 | 1  | 'anotherstring'  | null  4 | 2  | 'somemorestring' | 3 

would valid results. as

id | fv |     sv           | rel_id ---+----+------------------+-------  3 | 1  | 'anotherstring'  | null  4 | 2  | 'somemorestring' | 3  5 | 1  | 'somestring'     | 3 

would not be. first entry had null rel_id takes presidency above not null.

i have (which answer on basic duplicate question) query remove duplicates not sure how continue modify query fit needs.

delete demo  id not in (select min(id) id                   demo                   group fv,sv) 

as not null entry inserted database before null entry not null 1 not deleted. guaranteed rel_id point entry rel_id null therefore there no danger of deleting referenced entry. further guaranteed there no 2 rows in same group rel_id null. therefore row rel_id null unique whole table.

or basic algorithm:

  1. go on rows , group them fv , sv
  2. look each group row rel_id null. if there keep row (and delete rest). else pick 1 row of choice , delete rest.

sqlfiddle

i seem have worked out

delete demo      id not in (select min(id) id                       demo out_buff                       rel_id null or                             not exists (select id demo in_buff                                         rel_id null ,                                               in_buff.fv = out_buff.fv ,                                               in_buff.sv = out_buff.sv)                       group fv,sv); 

by selecting in inner select either row rel_id value null or rows matching on group by arguments, using anti-condition existence of row rel_id null. query looks ineffective. naive assumption put running time @ at least o(n^2).


Comments

Popular posts from this blog

node.js - Mongoose: Cast to ObjectId failed for value on newly created object after setting the value -

[C++][SFML 2.2] Strange Performance Issues - Moving Mouse Lowers CPU Usage -

gradle error "Cannot convert the provided notation to a File or URI" -