Wednesday, August 25, 2010

XML Parsing with Namespaces via XMLTable

In Basic XML Parsing via XMLTable I covered the basics for parsing simple XML stored in the database via SQL, but most XML has one or more namespaces associated with it. So how do you deal with that? With XMLTable it is simple. As the XMLTable documentation shows, there is an XMLnamespaces_clause where this information can be provided. This translates into the XMLNamespaces(), parameter within the XMLTable syntax. The documentation doesn't show how to use it, but the XML DB FAQ Thread on the OTN XML DB forum does (see the "How do I declare namespace prefix mapping with XMLTable()?" thread). As before in the PL/SQL example showing namespaces, I've taken the previous XML and added two namespaces to it and updated the appropriate calls to make this example.

The updated XML I used.
Connected to Oracle Database 11g Enterprise Edition Release 11.1.0.6.0

SQL>
SQL> CREATE TABLE your_table (xml_col XMLTYPE);

Table created
SQL> INSERT INTO your_table
2 VALUES
3 ('<a:employees xmlns:a="abc.com/123" xmlns:b="xyz.net/456">
4 <a:emp>
5 <a:name>Scott</a:name>
6 <b:favorites>
7 <b:color>red</b:color>
8 <b:color>orange</b:color>
9 </b:favorites>
10 </a:emp>
11 <a:emp>
12 <a:name>John</a:name>
13 <b:favorites>
14 <b:color>blue</b:color>
15 <b:color>green</b:color>
16 </b:favorites>
17 </a:emp>
18 </a:employees>');

1 row inserted
SQL> commit;

Commit complete

SQL> -- Method 3
SQL> SELECT xt.nam
2 FROM your_table yt,
3 XMLTable(XMLNamespaces('abc.com/123' AS "a"),
4 'a:employees/a:emp'
5 PASSING yt.xml_col -- defines source of XMLType, can define a join
6 COLUMNS
7 nam VARCHAR2(20) PATH 'a:name') xt;

NAM
--------------------
Scott
John
SQL> --
SQL> -- Method 4
SQL> SELECT xt.nam
2 FROM your_table yt,
3 XMLTable(XMLNamespaces('abc.com/123' AS "a"),
4 'a:employees/a:emp[2]'
5 PASSING yt.xml_col
6 COLUMNS
7 nam VARCHAR2(20) PATH 'a:name') xt;

NAM
--------------------
John
SQL> --
SQL> -- Method 5.a
SQL> SELECT xt.color
2 FROM your_table yt,
3 XMLTable(XMLNamespaces('abc.com/123' AS "a",
4 'xyz.net/456' AS "b"),
5 'a:employees/a:emp/b:favorites/b:color'
6 PASSING yt.xml_col
7 COLUMNS
8 color VARCHAR2(10) PATH '.') xt;

COLOR
----------
red
orange
blue
green
SQL> --
SQL> -- Method 5.b
SQL> SELECT xt.nam, xt2.color
2 FROM your_table yt,
3 XMLTable(XMLNamespaces('abc.com/123' AS "a",
4 'xyz.net/456' AS "b"),
5 'a:employees/a:emp'
6 PASSING yt.xml_col
7 COLUMNS
8 nam VARCHAR2(20) PATH 'a:name',
9 color_t XMLTYPE PATH 'b:favorites') xt, -- path to the node that repeats
10 XMLTable(XMLNamespaces('xyz.net/456' AS "b"),
11 'b:favorites/b:color'
12 PASSING xt.color_t -- define input XMLType as output of above, aka a join
13 COLUMNS
14 color VARCHAR2(10) PATH '.') xt2;

NAM COLOR
-------------------- ----------
Scott red
Scott orange
John blue
John green
SQL> --
SQL> -- Method 5.c
SQL> SELECT xt.nam, xt2.color
2 FROM your_table yt,
3 XMLTable(XMLNamespaces('abc.com/123' AS "a",
4 'xyz.net/456' AS "b"),
5 'a:employees/a:emp'
6 PASSING yt.xml_col
7 COLUMNS
8 nam VARCHAR2(20) PATH 'a:name',
9 color_t XMLTYPE PATH 'b:favorites/b:color') xt, -- path to the node that repeats
10 XMLTable(XMLNamespaces('xyz.net/456' AS "b"),
11 '/b:color'
12 PASSING xt.color_t -- define input XMLType as output of above, aka a join
13 COLUMNS
14 color VARCHAR2(10) PATH '.') xt2;

NAM COLOR
-------------------- ----------
Scott red
Scott orange
John blue
John green


which is exactly the same output as the previous example. This is what we expected. As always, you only need to pass along namespaces that are referenced directly in the XMLTable and not all the namespaces associated with the XML.

Trouble shooting tips:
  • Don't forgot to include the comma after the closing parenthesis on the XMLNamespaces(), structure otherwise you'll get ORA-02000: missing , keyword

Tuesday, July 20, 2010

Basic XML Parsing via XMLTable

In Basic XML Parsing via PL/SQL, I looked at "How do I parse this XML?" when the XML resides in PL/SQL. Oftentimes, the XML resides in the database and so Oracle provides another option for parsing the XML, the XMLTable command. This command became available with the release of 10.2. On versions <= 10.1, you have to use TABLE(xmlsequence(extract(...))). I might cover that at some point in the future but will see. While reading that documentation may make XMLTable seem complex, which it can be, it is very easy to use. Let us start with putting our XML into the DB via:
Connected to Oracle Database 11g Enterprise Edition Release 11.1.0.6.0

SQL>
SQL> CREATE TABLE your_table
2 (xml_col XMLTYPE);

Table created

SQL>
SQL> INSERT INTO your_table
2 VALUES
3 ('<employees>
4 <emp>
5 <name>Scott</name>
6 <favorites>
7 <color>red</color>
8 <color>orange</color>
9 </favorites>
10 </emp>
11 <emp>
12 <name>John</name>
13 <favorites>
14 <color>blue</color>
15 <color>green</color>
16 </favorites>
17 </emp>
18 </employees>');

1 row inserted

SQL> commit;

Commit complete

Now let's get out the same data we did in the original PL/SQL example. For this we will only be doing Methods 3 - 5 since they cover Methods 1 and 2 as well.
SQL> -- Method 3
SQL> SELECT xt.nam
2 FROM your_table yt,
3 XMLTable('employees/emp'
4 PASSING yt.xml_col -- defines source of XMLType, can define a join
5 COLUMNS
6 nam VARCHAR2(20) PATH 'name') xt;

NAM
--------------------
Scott
John

SQL> --
SQL> -- Method 4
SQL> SELECT xt.nam
2 FROM your_table yt,
3 XMLTable('employees/emp[2]'
4 PASSING yt.xml_col
5 COLUMNS
6 nam VARCHAR2(20) PATH 'name') xt;

NAM
--------------------
John

SQL> --
SQL> -- Method 5.a
SQL> SELECT xt.color
2 FROM your_table yt,
3 XMLTable('employees/emp/favorites/color'
4 PASSING yt.xml_col
5 COLUMNS
6 color VARCHAR2(10) PATH '.') xt;

COLOR
----------
red
orange
blue
green

SQL> --
SQL> -- Method 5.b
SQL> SELECT xt.nam, xt2.color
2 FROM your_table yt,
3 XMLTable('employees/emp'
4 PASSING yt.xml_col
5 COLUMNS
6 nam VARCHAR2(20) PATH 'name',
7 color_t XMLTYPE PATH 'favorites') xt, -- path to the node that repeats
8 XMLTable('favorites/color'
9 PASSING xt.color_t -- define input XMLType as output of above, aka a join
10 COLUMNS
11 color VARCHAR2(10) PATH '.') xt2;

NAM COLOR
-------------------- ----------
Scott red
Scott orange
John blue
John green

SQL> --
SQL> -- Method 5.c
SQL> SELECT xt.nam, xt2.color
2 FROM your_table yt,
3 XMLTable('employees/emp'
4 PASSING yt.xml_col
5 COLUMNS
6 nam VARCHAR2(20) PATH 'name',
7 color_t XMLTYPE PATH 'favorites/color') xt, -- path to the node that repeats
8 XMLTable('/color'
9 PASSING xt.color_t -- define input XMLType as output of above, aka a join
10 COLUMNS
11 color VARCHAR2(10) PATH '.') xt2;

NAM COLOR
-------------------- ----------
Scott red
Scott orange
John blue
John green

Methods 3 - 5.a here extract the same information as we did via PL/SQL.

Breaking down the Oracle documentation syntax, I will cover XMLnamespaces_clause in another post. The XQuery_string can be a simple XPath as shown above or more complex XQuery statements to be demonstrated later. The XML_passing_clause is just the PASSING line where the input XML is defined. This can join the XMLTable to another table/XMLTable, a variable, or be XML generated right there. The XML_table_column section following COLUMNS simply lists how to find the information referenced by the XQuery_string. The information is relative to the results of the XQuery_string.

Trouble shooting tips:
  • If you are unsure of what the XMLType looks like at some point within an XMLTable, simply create a column of XMLType and include it in the SELECT list. In example 5.c, adding xt.color_t to the select list allows us to see what Oracle was passing into the second XMLTable.

Oracle Documentation

One thing I have noticed while being on the OTN forums is that not everyone knows about the online documentation for the version of Oracle they are using. Thankfully, Oracle has the handy Oracle Documentation to start from.

Wednesday, June 30, 2010

XML Parsing with Default Namespaces via PL/SQL

In my previous post I covered parsing of XML where a namespace prefix existed for each namespace. What happens when you have to parse XML where some of the XML is in a default namespace? Make up a namespace prefix. No W3C complaint parser will care about what the prefix is in the XML and the prefix in the XPath is as long as the URI (the stuff within the " quotes) is the same for the node in question. Here is a simple example based off the ongoing example
DECLARE
l_doc VARCHAR2(2000);
l_domdoc dbms_xmldom.DOMDocument;
l_nodelist dbms_xmldom.DOMNodeList;
l_node dbms_xmldom.DOMNode;
l_value VARCHAR2(30);
l_a_ns VARCHAR2(30);
l_both_ns VARCHAR2(60);

l_xmltype XMLTYPE;
l_empx XMLTYPE;
l_index PLS_INTEGER;
l_col_ind PLS_INTEGER;
BEGIN
l_doc := '<employees xmlns="abc.com/123" xmlns:b="xyz.net/456">
<emp>
<name>Scott</name>
<b:favorites>
<b:color>red</b:color>
<b:color>orange</b:color>
</b:favorites>
</emp>
<emp>
<name>John</name>
<b:favorites>
<b:color>blue</b:color>
<b:color>green</b:color>
</b:favorites>
</emp>
</employees>';

l_a_ns := 'xmlns:a="abc.com/123"'; -- assign random prefix to namespace
-- assigning a different prefix to this namespace
l_both_ns := l_a_ns || ' xmlns:c="xyz.net/456"'; -- separated by a space

l_xmltype := XMLTYPE(l_doc);

-- Method 4
dbms_output.new_line;
dbms_output.put_line('Method 4');
l_value := l_xmltype.extract('/employees/emp[2]/name/text()', 'xmlns="abc.com/123"').getStringVal();
dbms_output.put_line('Emp Name: '||l_value);

-- Method 5
dbms_output.new_line;
dbms_output.put_line('Method 5');
l_index := 1;
WHILE l_xmltype.Existsnode('/a:employees/a:emp[' || To_Char(l_index) || ']', l_a_ns) > 0
LOOP
l_empx := l_xmltype.extract('/a:employees/a:emp[' || To_Char(l_index) || ']', l_a_ns);
l_col_ind := 1;
WHILE l_empx.Existsnode('/a:emp/c:favorites/c:color[' || To_Char(l_col_ind) || ']', l_both_ns) > 0
LOOP
l_value := l_empx.extract('/a:emp/c:favorites/c:color[' || To_Char(l_col_ind) || ']/text()', l_both_ns).getStringVal();
dbms_output.put_line('Color: '||l_value);
l_col_ind := l_col_ind + 1;
END LOOP;
l_index := l_index + 1;
END LOOP;
END;

In the XML, we have a default namespace of "abc.com/123" for the elements and the previous "xyz.net/456" namespace. In the code, I randomly picked a prefix of "a" for the default namespace and so anytime an element in the default namespace is referenced, the corresponding prefix is used.

As Method 4 shows, you could just provide the default namespace as is for the third parm and then there is no need to include a prefix in the XPath itself. I prefer to go the route of method 5 as I have encountered XML where the "default" namespace has changed several nodes deep in the XML. Since the default changes, you need to specify a namespace prefix to keep things in order with the namespace each node resides in.

Monday, June 28, 2010

XML Parsing with Namespaces via PL/SQL

My previous post covered the basics for parsing simple XML in PL/SQL, but most XML has one or more namespaces associated with it. So how do you deal with that? Simple pretty much. Oracle provides a third parm (usually it is the third) to pass namespace URI information along to the underlying parser. I've taken the previous XML and added two namespaces to it and updated the appropriate calls to make this example.
DECLARE
l_doc VARCHAR2(2000);
l_domdoc dbms_xmldom.DOMDocument;
l_nodelist dbms_xmldom.DOMNodeList;
l_node dbms_xmldom.DOMNode;
l_value VARCHAR2(30);
l_a_ns VARCHAR2(30);
l_both_ns VARCHAR2(60);

l_xmltype XMLTYPE;
l_empx XMLTYPE;
l_index PLS_INTEGER;
l_col_ind PLS_INTEGER;
BEGIN
l_doc := '<a:employees xmlns:a="abc.com/123" xmlns:b="xyz.net/456">
<a:emp>
<a:name>Scott</a:name>
<b:favorites>
<b:color>red</b:color>
<b:color>orange</b:color>
</b:favorites>
</a:emp>
<a:emp>
<a:name>John</a:name>
<b:favorites>
<b:color>blue</b:color>
<b:color>green</b:color>
</b:favorites>
</a:emp>
</a:employees>';

l_domdoc := dbms_xmldom.newDomDocument(l_doc);
l_a_ns := 'xmlns:a="abc.com/123"';
l_both_ns := l_a_ns || ' xmlns:b="xyz.net/456"'; -- separated by a space

-- Method 1
dbms_output.put_line('Method 1');
-- third parm for namespace info
l_nodelist := dbms_xslprocessor.selectNodes(dbms_xmldom.makeNode(l_domdoc),'/a:employees/a:emp/a:name', l_a_ns);
FOR cur_emp IN 0 .. dbms_xmldom.getLength(l_nodelist) - 1 LOOP
l_node := dbms_xmldom.item(l_nodelist, cur_emp);
l_value := dbms_xmldom.getnodevalue(dbms_xmldom.getfirstchild(l_node));
dbms_output.put_line('Emp Name: '||l_value);
END LOOP;

-- Method 2
dbms_output.new_line;
dbms_output.put_line('Method 2');
l_nodelist := dbms_xmldom.getelementsbytagname(l_domdoc, 'name'); -- This doesn't care about namespaces
-- get first item from list, could loop as shown above
l_node := dbms_xmldom.item(l_nodelist, 0);
l_value := dbms_xmldom.getnodevalue(dbms_xmldom.getfirstchild(l_node));
dbms_output.put_line('Emp Name: '||l_value);

-- Done with DOMDocument examples, setup for XMLType based examples
dbms_xmldom.freeDocument(l_domdoc);
l_xmltype := XMLTYPE(l_doc);

-- Method 3
dbms_output.new_line;
dbms_output.put_line('Method 3');
l_index := 1;
WHILE l_xmltype.Existsnode('/a:employees/a:emp[' || To_Char(l_index) || ']', l_a_ns) > 0
LOOP
l_value := l_xmltype.extract('/a:employees/a:emp[' || To_Char(l_index) || ']/a:name/text()', l_a_ns).getStringVal();
dbms_output.put_line('Emp Name: '||l_value);
l_index := l_index + 1;
END LOOP;

-- Method 4
dbms_output.new_line;
dbms_output.put_line('Method 4');
l_value := l_xmltype.extract('/a:employees/a:emp[2]/a:name/text()', l_a_ns).getStringVal();
dbms_output.put_line('Emp Name: '||l_value);

-- Method 5
dbms_output.new_line;
dbms_output.put_line('Method 5');
l_index := 1;
WHILE l_xmltype.Existsnode('/a:employees/a:emp[' || To_Char(l_index) || ']', l_a_ns) > 0
LOOP
l_empx := l_xmltype.extract('/a:employees/a:emp[' || To_Char(l_index) || ']', l_a_ns);
l_col_ind := 1;
WHILE l_empx.Existsnode('/a:emp/b:favorites/b:color[' || To_Char(l_col_ind) || ']', l_both_ns) > 0
LOOP
l_value := l_empx.extract('/a:emp/b:favorites/b:color[' || To_Char(l_col_ind) || ']/text()', l_both_ns).getStringVal();
dbms_output.put_line('Color: '||l_value);
l_col_ind := l_col_ind + 1;
END LOOP;
l_index := l_index + 1;
END LOOP;
END;
Running this script produces
Method 1
Emp Name: Scott
Emp Name: John

Method 2
Emp Name: Scott

Method 3
Emp Name: Scott
Emp Name: John

Method 4
Emp Name: John

Method 5
Color: red
Color: orange
Color: blue
Color: green
which is exactly the same output as the previous example. This is what we expected. The only call that did not have to change was the one for Method 2. DBMS_XMLDOM is pretty much ignores namespaces when performing operations so this does have useful situations. As Methods 1, 3, and 4 show, the namespace parm only needs to contain the URI information for namespaces referenced in the Xpath, not in the XML. All the namespaces information can be passed along, but it is not needed. Pretty simple, excluding the whole "wait, what namespace is that node in again?" issue.