본문 바로가기

DB/Oracle

START WITH and CONNECT BY in Oracle SQL

START WITH and CONNECT BY in Oracle SQL



select
... start with initial-condition connect by nocycle recurse-condition select ... connect by recurse-condition select ... start with initial-condition connect by nocycle recurse-condition select ... connect by recurse-condition
start with .. connect by문장은 계층적 관계를 가진 데이터베이스를 select할때 사용되는 문장이다. (보통 parent -> child나 boss->employee나 thing->parts의 관계에 주로 이용된다.)
execution plain이나 pxplained 쿼리를 이용할때 주로 이용하는 문장이다.
recurse-condition은 prior키워드의 사용으로 만든다.

connect by
  prior foo = bar
이것은 recursion을 구성한다. 모든 레코드는 다음 하위 계층구조 레벨의 부분을 가진다고 할때 bar=foo를 찾는다. foo가 현재 계층 레벨에 발견된 레벨이 된다.

A simple example

다음 예제에서 이러한 attributes의 집합으로 선택된 데이터를 보여준다. parent와 child 구조를 검색. child는 테이블에서 유니크 제약조건을 전제로 한다. 이것은 이것은 즉 자식은 2부모를 가질 수 없다는 것을의미한다.

테이블에 들어간 데이터는 동일한 부모를 가진 자식 테이블들과, 부모 데이터로 구성된다.
set feedback off

create table test_connect_by (
  parent     number,
  child      number,
  constraint uq_tcb unique (child)
);
5 = 2+3
insert into test_connect_by values ( 5, 2);
insert into test_connect_by values ( 5, 3);

18 = 11+7
insert into test_connect_by values (18,11);
insert into test_connect_by values (18, 7);

17 = 9+8
insert into test_connect_by values (17, 9);
insert into test_connect_by values (17, 8);
26 = 13+1+12
insert into test_connect_by values (26,13);
insert into test_connect_by values (26, 1);
insert into test_connect_by values (26,12);
15=10+5
insert into test_connect_by values (15,10);
insert into test_connect_by values (15, 5);
38=15+17+6
insert into test_connect_by values (38,15);
insert into test_connect_by values (38,17);
insert into test_connect_by values (38, 6);
38, 26 and 18 have no parents (the parent is null)
insert into test_connect_by values (null, 38);
insert into test_connect_by values (null, 26);
insert into test_connect_by values (null, 18);
이제 계층구조의 데이터를 셀렉트 해보자.
select lpad(' ',2*(level-1)) || to_char(child) s 
  from test_connect_by 
  start with parent is null
  connect by prior child = parent;
This select statement results in:
38
  15
    10
    5
      2
      3
  17
    9
    8
  6
26
  13
  1
  12
18
  11
  7

Interpreting connect by statements

어떻게 start with ... connect by 셀렉트문이 읽고, 인터프리터 하는가? 만약 오라클이 그러한 SQL문장을 만나게 되면, 다음 수도코드에 보여지는 부분을 처리하게 된다.
for rec in (select * from some_table) loop
  if FULLFILLS_START_WITH_CONDITION(rec) then
    RECURSE(rec, rec.child);
  end if;
end loop;

procedure RECURSE (rec in MATCHES_SELECT_STMT, parent_id IN field_type) is
  begin
  APPEND_RESULT_LIST(rec);     
  for rec_recurse in (select * from some_table) loop
    if FULLFILLS_CONNECT_BY_CONDITION(rec_recurse.id, parent_id) then
      RECURSE(rec_recurse,rec_recurse.id);
    end if;
  end loop;
end procedure RECURSE;
Thanks to Frank Trenkamp who spotted an error in the logic in the above pseudo code and corrected it.
Thanks also to Abhishek Ghose who made me think about a better way to describe the logic.

Pruning branches (가지치기)

가끔, 계층형 트리에서 일정 부분만을 요구할 때가 있다. 그리고 가지 치기를 원한다. 여기에는 데이터 트리가 채워져 있다. 각 차일드는 오른쪽에 새로운 숫자를 추가해서 부모에 추가된 것의 숫자이다.
create table prune_test (
  parent  number,
  child   number
);

insert into prune_test values (null,   1);
insert into prune_test values (null,   6);
insert into prune_test values (null,   7);

insert into prune_test values (   1,  12);
insert into prune_test values (   1,  14);
insert into prune_test values (   1,  15);

insert into prune_test values (   6,  61);
insert into prune_test values (   6,  63);
insert into prune_test values (   6,  65);
insert into prune_test values (   6,  69);

insert into prune_test values (   7,  71);
insert into prune_test values (   7,  74);

insert into prune_test values (  12, 120);
insert into prune_test values (  12, 124);
insert into prune_test values (  12, 127);

insert into prune_test values (  65, 653);

insert into prune_test values (  71, 712);
insert into prune_test values (  71, 713);
insert into prune_test values (  71, 715);

insert into prune_test values (  74, 744);
insert into prune_test values (  74, 746);
insert into prune_test values (  74, 748);

insert into prune_test values ( 712,7122);
insert into prune_test values ( 712,7125);
insert into prune_test values ( 712,7127);

insert into prune_test values ( 748,7481);
insert into prune_test values ( 748,7483);
insert into prune_test values ( 748,7487);

이제 트리를 참조할 수 있게 되었다. 그러나 1과 71에 모든 브렌치를 제거하고자 한다. 이것은 connect by문장을 포함하고 있다.
select
  lpad(' ', 2*level) || child
from
  prune_test
start with
  parent is null
connect by
  prior child=parent 
  and parent not in (1, 71); 
   
This returns:
  1
  6
    61
    63
    65
      653
    69
  7
    71
    74
      744
      746
      748
        7481
        7483
        7487
See also another example for pruning.

Do two items stand in a ancestor descendant relationship

Sometimes, one want's to know if two items are in an ancestor descendant relationship, that is if XYZ as grandfather, or grand-grandfather, or ... of ABC. The following template of a query can be used to determine that.
set feedback off

drop table parent_child;

create table parent_child(parent_ varchar2(20), child_ varchar2(20));

insert into parent_child values (null,  'a')

insert into parent_child values (  'a',  'af');
insert into parent_child values (  'a',  'ab');
insert into parent_child values (  'a',  'ax');

insert into parent_child values ( 'ab', 'abc');
insert into parent_child values ( 'ab', 'abd');
insert into parent_child values ( 'ab', 'abe');

insert into parent_child values ('abe','abes');
insert into parent_child values ('abe','abet');

insert into parent_child values ( null,   'b');

insert into parent_child values (  'b',  'bg');
insert into parent_child values (  'b',  'bh');
insert into parent_child values (  'b',  'bi');

insert into parent_child values ( 'bi', 'biq');
insert into parent_child values ( 'bi', 'biv');
insert into parent_child values ( 'bi', 'biw');
The following query 'asks' for a parent and a supposed child (grand child, grand grand child) and answers the question if the are indeed in an ancester successor relationship.
set verify off

select
  case when count(*) > 0 then
    '&&parent is an ancestor of &&child' else
    '&&parent is no ancestor of &&child' end 
    "And here's the answer"
from
  parent_child
where
  child_ = '&&child'
start with
  parent_ = '&&parent'
connect by 
  prior child_ = parent_;

undefine child
undefine parent