Tags are a useful means of associating keyword metadata with a data structure such as an image, a video, audio or a piece of text. There are many ways of storing tags in a database – for example one could store a string of tags separated by a space, comma or semi-colon. Alternatively, one could store tags in a database table and then create another table to hold the list of tags for a given entity. My preferred method of storing tags, in an Oracle database at least, is to use the concept of nested tables.
Nested tables are a type of PL/SQL collection which may be compared to arrays or objects in other programming languages. One of the advantages of using nested tables over other PL/SQL collection types is that they don’t have a fixed size, and they can be stored in a database table (not just in a program unit, like regular PL/SQL tables or associative arrays). The best way to show how nested tables work is to use an example, so let’s dive right in and set up our database for the tags example.
The first thing we want to do is create the nested table type in our database. I’m going to keep it nice and simple and name my type “tags”:
CREATE TYPE tags IS TABLE OF VARCHAR2(50);
If you are familiar with PL/SQL tables, the above should look very similar to the syntax for creating a PL/SQL table, but without the typical “index by binary_integer” bit stuck at the end. With our type created, we can now go ahead and use this in a database table. I’m going to create a table named “post”, which one might use to store blog posts. This table will have three columns – post_id, post_title and post_tags.
CREATE TABLE post ( post_id NUMBER(11), post_title VARCHAR2(50), post_tags tags ) NESTED TABLE post_tags STORE AS tags_table;
Great – you now have a database table with a column of type “tags” which is itself a table. The next thing we want to do is insert some records into our new table.
INSERT INTO post(post_id, post_title, post_tags) VALUES(1, 'First Post', tags('oracle', 'tags', 'example')); INSERT INTO post(post_id, post_title, post_tags) VALUES(2, 'Second Post', tags('oracle', 'tags')); INSERT INTO post(post_id, post_title, post_tags) VALUES(3, 'Third Post', tags('oracle')); COMMIT;
As you can see from the above INSERT statements, inserting tags into our nested table is very straightforward, we simply tell Oracle that we are using the type “tags” and pass the VARCHAR2 values we want to store to it. If you try to insert a tag with over 50 characters you will get an error as we defined the tags table as a table of VARCHAR2(50) fields. So what does our new table look like when we go to retrieve data from it? Let’s take a look.
SELECT * FROM post;
You should see output like the following
POST_ID POST_TITLE POST_TAGS
------------ ----------------------------------------------------------- ---------
1 First Post <Object>
2 Second Post <Object>
3 Third Post <Object>
Hardly ideal is it? Fortunately, we can use the TABLE function to easily get back a list of tags for a given post quite easily. Try the following SQL statement:
SELECT column_value FROM TABLE( SELECT post_tags FROM post WHERE post_id = 1 );
This outputs the following:
COLUMN_VALUE -------------------------------------------------- oracle tags example
Try the statement again, but this time pass in the post_id value for another row. This is useful, as it gives us a means of getting the tag data out of the database, but what if we want to get back all the tags when retrieving the other columns in the table? This requires a bit more work, but it’s not exactly complex:
SELECT p.post_id, p.post_title, t.column_value FROM post p, TABLE( SELECT post_tags FROM post WHERE post_id = p.post_id ) t;
The above statement should produce the following:
POST_ID POST_TITLE COLUMN_VALUE
------------ ------------------------------------------------------- ---------------
1 First Post oracle
1 First Post tags
1 First Post example
2 Second Post oracle
2 Second Post tags
3 Third Post oracle
A step in the right direction, but the post_id and post_title are duplicated for each tag for that record. Surely there’s a way of outputting all the tags on a single row, joining them together with a delimiting character such as a comma? There are a number of ways of doing this, from using a stored function to using 11g’s WITHIN GROUP feature. The method with the least code that works on 9i and above is to use the XMLAGG function to produce the concatenated result. The SQL to achieve the desired result in this manner is as follows:
SELECT p.post_id, p.post_title, RTRIM(XMLAGG(XMLELEMENT(e, t.column_value || ',')).EXTRACT('//text()'), ',') tags FROM post p, TABLE( SELECT post_tags FROM post WHERE post_id = p.post_id ) t GROUP BY p.post_id, p.post_title;
This produces the following result:
POST_ID POST_TITLE TAGS
------------ ---------------------------------------- -----------------------
1 First Post oracle,tags,example
2 Second Post oracle,tags
3 Third Post oracle
So far we haven’t seen any particular reason why one would necessarily use a nested table to store tags over any other method. For me, the primary advantage is that it makes it very simple to perform group functions on the tags – making it ridiculously easy to generate tag clouds based on how many times a tag is used. In the above example, we can easily see that the tag “oracle” is used 3 times, “tags” is used twice and “example” only once. But in a scenario where you have many posts and many, many tags, it is impossible to do this counting in your head. Let’s take a look at just how easy it is to do this counting with our nested table.
SELECT t.column_value, COUNT(*) FROM post p, TABLE( SELECT post_tags FROM post WHERE post_id = p.post_id ) t GROUP BY t.column_value;
And the output:
COLUMN_VALUE COUNT(*) -------------------------------------------------- ---------- tags 2 example 1 oracle 3
Want to order the results? Simply add an ORDER BY clause and Bob’s your uncle.
SELECT t.column_value, COUNT(*) FROM post p, TABLE( SELECT post_tags FROM post WHERE post_id = p.post_id ) t GROUP BY t.column_value ORDER BY 2 DESC
The result:
COLUMN_VALUE COUNT(*) -------------------------------------------------- ---------- oracle 3 tags 2 example 1
For me, this is the main reason I store tags in a nested table – but there are other advantages, too. What if you are editing a post and you want to change one of the tags? Nested tables make this simple:
UPDATE TABLE(SELECT post_tags FROM post WHERE post_id = 1) SET column_value = 'sample' WHERE column_value = 'example'
Similarly, if you wanted to delete a tag from a row, this is also very straightforward:
DELETE FROM TABLE( SELECT post_tags FROM post WHERE post_id = 1 ) WHERE column_value = 'tags';
Nested tables are a powerful PL/SQL collection type, and you can do even more powerful things with them if you start using PL/SQL procedures and functions. But the beauty about nested tables for me is how powerful and flexible they are using nothing but raw SQL statements. For further information about nested tables and other PL/SQL collection types, see the following resources:
- http://www.developer.com/db/article.php/10920_3379271_1/Oracle-Programming-with-PLSQL-Collections.htm
- http://www.dba-oracle.com/t_display_multiple_column_values_same_rows.htm
- http://www.devshed.com/c/a/Oracle/Database-Interaction-with-PLSQL-Nested-Tables/
- http://soft.buaa.edu.cn/oracle/bookshelf/Oreilly/prog2/ch19_01.htm
- http://sql-plsql.blogspot.com/2007/05/oracle-plsql-nested-tables.html
- http://download.oracle.com/docs/cd/B10501_01/appdev.920/a96624/05_colls.htm
- http://www.smart-soft.co.uk/Oracle/oracle-plsql-tutorial-part-9.htm