Try it here
Subscribe
Oracle SQL Hierarchical Queries

Hierarchical Queries

hierarchical_queries

What is mean by Hierarchy?

Hierarchy is nothing but the parent child relationship within same table or the view.

Hierarchical queries are very traditional and one of the useful feature in day to day life. Every company has hierarchical data. The traversing of the hierarchy is bottom up or top down.

You can use hierarchical queries to travel along parent-child relationships in your data. For example, family trees, computer directory structures, and company organization charts.

Bottom up hierarchy :

If user wants to find the top hierarchy of employee then bottom up hierarchy is used.

Top down hierarchy :

If user finds out the employees under manager then top down hierarchy is used.

There are 2 types of Hierarchical Queries in oracle:

  1. Basic hierarchical query
  2. Cyclic hierarchical query

1.PRIOR

PRIOR is a unary operator which is used or indicates that "father of" the records or first record.

2.Connect By

Connect by is an Oracle-specific way to create data trees using SQL. It has two key clauses, start with and connect by.

You state the parent-child relationship here. This links the columns that store the parent and child values. You access values from the parent row using the keyword prior.

In a company each employee's "parent" is their manager. Thus you need to join the parent row's employee_id to the child's manager_id. So you connect the prior employee_id to the current manager_id, like so:

connect by prior employee_id = manager_id

Put this all together and you get the following query:

SELECT * FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr;

3.Start With

You state which rows are the roots here. These are the rows that appear at the "top" of the tree.

In a company org chart this is the CEO. Here that's employee_id 7839, Steven King. So you can begin the chart with him using:

start with manager_id = 7839

But if you do this, you need to change your query when a new CEO replaces him!

It's better to go with a more generic method. The CEO has no manager they report to. So their manager_id is null. So you could identify them with:

start with manager_id is null

In a company each employee's "parent" is their manager. Thus you need to join the parent row's employee_id to the child's manager_id. So you connect the prior employee_id to the current manager_id, like so:

connect by prior employee_id = manager_id

Put this all together and you get the following query:

Example :

SELECT * FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr;

4.Level

With the queries so far it's hard to tell how senior someone is in the company. Adding the generation they belong to makes this easier. This allows you to see who's at the same level in the hierarchy. How you do this depends on whether you're using connect by or recursive with.

With connect by you can use the pseudo-column level. This returns the current depth in the tree, starting with 1 for the roots. Each new set of children increases the depth by one. So all a manager's reports are one level below them.

You can select this on its own to show the current row's depth in the tree:

SELECT LEVEL, empno, ename, mgr FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr;

This helps. But it's still tricky to tell how senior someone is in the company. Indenting their name based on their position makes this easier. Combining level with lpad makes this easy.

The syntax of lpad is:

lpad ( str1, N, str2 )

It adds the characters in str2 before those in str1 until the string is N characters long. So you can use level to place spaces before each employee's name. This makes it easy to see where people place in the organization:

SELECT LEVEL, empno, LPAD( , level,  ) || ename emp_name, mgr FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr;

Sorting Output: Connect By

When you build a hierarchical query, the database returns the rows in an order matching the tree structure.

Connect by returns rows in depth-first search order. If you use a regular order by you'll lose this sort.

But you can preserve the depth-first tree and sort rows with the same parent. You do this with the siblings clause of order by.

So to show a manager's reports after them, sorting employees with the same manager by hire date (first to last), you can do:

SELECT LEVEL, empno, LPAD( , level,  ) || ename emp_name, mgr,hiredate FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr
ORDER  SIBLINGS BY hiredate;

If Manager is manager of himself then what to do? (Detecting Loops)

It's possible to store loops in your hierarchy. Usually this is a data error. But some structures may contain loops by design.

For example, the following sets the CEO's manager to be a lowly programmer:

UPDATE emp
SET    mgr = 7839
WHERE  empno = 7839;

This leads to a circle in your data. So you could get stuck in an infinite loop. Luckily Oracle Database has cycle detection to stop this.

If you try and build a hierarchy that contains a loop, connect by throws an ORA-1436 error:

SELECT LEVEL, empno, LPAD( , level,  ) || ename emp_name, mgr FROM emp
START WITH mgr =7839
CONNECT BY PRIOR empno = mgr;

You can avoid this using the nocycle keyword. This spots when the query returns to the same row. The database hides the repeated row and continues processing the tree.

To use it, place nocycle after connect by:

SELECT LEVEL, empno, LPAD( , level,  ) || ename emp_name, mgr FROM emp
START WITH mgr =7839
CONNECT BY NOCYCLE PRIOR empno = mgr;

Displaying Tree Details:

Using level makes it possible to see how deep the current row is in the tree. But it can still be tricky to see how the rows relate to each other. Connect by has many options to help with this.

Connect_by_root

The operator connect_by_root returns the value of a column from the root row.

SELECT LEVEL, empno, LPAD( , level*3,  ) || ename emp_name, mgr,hiredate,connect_by_root ename boss  FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr
ORDER  SIBLINGS BY hiredate;

Sys_connect_by_path : If user wants to see the bosses of employee

In this case user needs to use keyword 'SYS_CONNECT_BY_PATH(column_name,Delimiter)'

It can be useful to see values from all the rows between the root and the current row. Sys_connect_by_path allows you to do this. It builds up a string, adding the value from the first argument for the current row to the end of the list. It separates these using the second argument.

SELECT LEVEL, empno, LPAD( ' ', level*3, ' ' ) || ename emp_name, mgr,hiredate,SYS_CONNECT_BY_PATH(ename,'/') boss  FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr
ORDER  SIBLINGS BY hiredate;

Connect_by_isleaf

You can identify the leaf rows using connect_by_isleaf. This returns 1 if the current row is a leaf. Otherwise it returns 0.

Combining these allows you to display:

  • The last_name of the CEO (the root row) on every row
  • A comma separated list of the management chain from the current employee up to the CEO
  • The employees who aren't managers (the leaves)

The complete query for this is:

SELECT LEVEL, empno, LPAD( ' ', level*3, ' ' ) || ename emp_name, mgr,hiredate,
connect_by_root ename boss,
sys_connect_by_path(ename,'/') chart,
connect_by_isleaf is_leaf  FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr
ORDER  SIBLINGS BY hiredate;

The virtual columns used in the Hierarchical query

Level Show the level of node (depth level), root node with level 1
Prior Operator shows the previous record
Connect_By_Isleaf Leaf node or not? the leaf node is 1, if not is 0
CONNECT_BY_ROOT(Column) Return the column value of root node (level = 1)
NOCYCLE Operator in the 'connect by' clause to remove endless loop. For example, A manages B, B manages C, C manages A. In case, connect by determines manager who will be repeated infinitely and? NOCYCLE operator will be helpful in this case.
CONNECT_BY_ISCYCLE Column determine the value is be duplicated or not? The duplicated result is 1 and if not is 0. The column needs to be used together with? NOCYCLE operator in 'connect by' clause
SYS_CONNECT_BY_PATH(column, separated values) Create column value path from root node to current node, each of note is separated by "separated values".
ORDER SIBLINGS BY Arrange child nodes in the same level.

Recursive Query

Recursive With

Oracle Database 11.2 introduced another method for accessing trees: recursive sub-query factoring. Aka recursive with.

This is the ANSI compliant way to build hierarchies in SQL. It's composed of two queries. A base query and a recursive one.

Base Query

You use this to define the root rows in your tree. This is like the start with clause in connect by. So to begin the chart with the CEO, use:

  SELECT empno, ename, mgr 
  FROM   emp WHERE mgr IS NULL

This maps to the connect by clause. Here you join the source table to the with clause on the columns storing the parent-child values.

For the company chart, you need to join each employee to their manager. This is the employee from the previous row. So you're linking the org_chart to the employees table.

This gives the following query:

  SELECT e.empno, e.ename, e.mgr 
  FROM   org_chart oc
  JOIN   emp e
  ON     e.mgr = oc.employee_id;

When you use recursive with, you must provide aliases for all the columns it returns. These go between the query name and "as":

WITH org_chart (
  employee_id, first_name, manager_id
) AS ( ...

To complete the query you need to union all the base and recursive queries together. Put this all together and you get:

WITH org_chart (
  employee_id, first_name, manager_id
) AS (
  SELECT empno, ename, mgr 
  FROM   emp WHERE mgr IS NULL
  UNION  ALL
  SELECT e.empno, e.ename, e.mgr 
  FROM   org_chart oc
  JOIN   emp e
  ON     e.mgr = oc.employee_id
)
  SELECT * FROM org_chart ;

Recursive with doesn't have an in-built equivalent of level. You need to build your own. You can do this by selecting the value 1 in the base query. And incrementing it by one in the recursive part.

WITH org_chart (
  employee_id, first_name, manager_id,lvl
) AS (
  SELECT empno, ename, mgr ,1 lvl
  FROM   emp WHERE mgr IS NULL
  UNION  ALL
  SELECT e.empno, e.ename, e.mgr , oc.lvl+1
  FROM   org_chart oc
  JOIN   emp e
  ON     e.mgr = oc.employee_id
)
  SELECT * FROM org_chart ;

As with connect by, you can use this pad out values in the final select.

Sorting Output: Recursive With

Recursive with allows you to choose whether you want to traverse the tree using depth-first or breadth-first search. You define this in the search clause.

Depth-First Search

This starts at the root. Then picks one of its children. It then gets the child's child. And so on, down the tree accessing child nodes first. When it hits a leaf, it goes back up the tree until it finds an unvisited child.

So it always goes as far down the tree it can before accessing another row at the same level.

To use depth-first search, specify this in the search clause. The columns you sort by defines which order the database returns siblings. And the set clause defines a new column storing this sequence. It starts with 1 at for the first root. For each new row it increments by 1.

So to return employees in a depth first tree, sorting employees with the same manager by the date they were hired (first to last), use:

search depth first by hire_date set hire_seq

Which gives the following complete query:

WITH org_chart (
  employee_id, first_name, manager_id,lvl
) AS (
  SELECT empno, ename, mgr ,1 lvl
  FROM   emp WHERE mgr IS NULL
  UNION  ALL
  SELECT e.empno, e.ename, e.mgr , oc.lvl+1
  FROM   org_chart oc
  JOIN   emp e
  ON     e.mgr = oc.employee_id
)SEARCH DEPTH FIRST BY hiredate SET hire_seq
  SELECT * FROM org_chart 
ORDER BY hire_seq;

Breadth-First Search

Instead of travelling down the tree, you can go across it. This is breadth-first search.

Again, this starts with the root. But it accesses all the rows at the same level before going down to any children. So if you have many roots, it visits all these first. Then all the first generation children before going down to the second generation. And so on.

The sorting columns define which order you access nodes at the same depth.Unlike depth-first search, successive rows at the same level may alternate between parent rows.

So the following returns all the employees at the same rank next to each other. It sorts these by their hire date, first to last:

search breadth first by hire_date set hire_seq

Which completes the query like so:

WITH org_chart (
  employee_id, first_name, manager_id,lvl
) AS (
  SELECT empno, ename, mgr ,1 lvl
  FROM   emp WHERE mgr IS NULL
  UNION  ALL
  SELECT e.empno, e.ename, e.mgr , oc.lvl+1
  FROM   org_chart oc
  JOIN   emp e
  ON     e.mgr = oc.employee_id
)SEARCH BREADTH FIRST BY hiredate SET hire_seq
  SELECT * FROM org_chart 
ORDER BY hire_seq;

Detecting Loops(Recursive With)

You control loop detection using the cycle clause of recursive with. Here you state which columns mark a loop. The database keeps track of the values it sees in these columns. If the current row's values for these appear in one of it's ancestors, you have a loop.

The syntax for this is:

cycle  set <loop_column> to <loop_value> default <default_value>

The <loop_column> is a new one you define. When the database detects a cycle, it changes this from the default value to the <loop_value>

The org chart has a loop if you visit the same employee_id twice. So you want to check this column for cycles.

The snippet below does this and adds the column looped to your results. This starts with the value N. If the database finds a cycle, it sets it to Y:

cycle employee_id set looped to 'Y' default 'N'

You place this after your with clause, which gives the following complete query:

WITH org_chart (
  employee_id, first_name, manager_id,deptno,lvl
) AS (
  SELECT empno, ename, mgr ,deptno, 1 lvl
  FROM   emp WHERE mgr = 7839
  UNION  ALL
  SELECT e.empno, e.ename, e.mgr ,e.deptno, oc.lvl+1
  FROM   org_chart oc
  JOIN   emp e
  ON     e.mgr = oc.employee_id
)CYCLE employee_id SET looped TO 'Y' DEFAULT 'N'
  SELECT * FROM org_chart;

Unlike connect by, this includes the rows you visit twice. So the CEO, Steven King, appears twice in the results. If you want to exclude these, filter them out in your final where clause using the loop column you defined.

Using recursive with you can choose any columns in your query to mark a "loop". This allows you to stop processing before you get back to the same row.

For example, you can cycle on deptno. So if you cycle on deptno, like so:

cycle deptno set looped to 'Y' default 'N'

Displaying Tree Details: Recursive With

Recursive with doesn't have the built-in options like connect by. But you can emulate them.

Values from the root (connect_by_root)

To return a value from the parent row, select it in the base query. And in the recursive part, return this column from the with clause name.

Showing the path from root to current row (sys_connect_by_path)

Again, start by selecting the value you want in the base query. In the recursive part, append the values you want to add with an appropriate separator.

Identifying the leaves (connect_by_isleaf)

Displaying leaf rows is more complex with recursive with. To find these, add the depth (level) to the tree. Then sort it using depth-first search.

Depth-first search always goes as far down the tree it can. After hitting a leaf, it goes back up to the next unvisited child. So you know a row is a leaf if the next row is at the same or higher depth in the tree. So you need to check if the next row's level is less than or equal to the current.

You can test for this using lead. This gets a value from the next row in the results. Assuming seq follows a depth-first order, the following returns LEAF if the row is a leaf. And null otherwise:

 CASE 
           WHEN LEAD ( lvl, 1, 1 ) OVER ( ORDER BY seq ) <= lvl THEN 'LEAF'
         END is_leaf

Putting these all together gives:

WITH org_chart (
  employee_id, employee_name, manager_id,
	      root_emp, chart, lvl
) AS (
SELECT empno, ename, mgr, ename root_emp, ename chart, 1 lvl
  FROM   emp
  WHERE  mgr IS NULL
  UNION  ALL
  SELECT e.empno, e.ename, e.mgr,
         oc.root_emp, oc.chart || ', ' || e.ename, oc.lvl+1
  FROM   org_chart oc
  JOIN   emp e
  ON     e.mgr = oc.employee_id
) search DEPTH FIRST BY manager_id SET seq
  SELECT oc.*, 
         CASE 
           WHEN LEAD ( lvl, 1, 1 ) OVER ( ORDER BY seq ) <= lvl THEN 'LEAF'
         END is_leaf
  FROM   org_chart oc;

Writer profile pic

Uk01 on Nov 09, 2015 at 12:09 am


If you like dEexams.com and would like to contribute, you can write your article here or mail your article to admin@deexams.com . See your article appearing on the dEexams.com main page and help others to learn.



Post Comment

Comments( 0)

×

Forgot Password

Please enter your email address below and we will send you information to change your password.