free ebooks
 Home Computers & InternetProgramming > General  
Transferring data between XML documents and relational databases

1) Overview

In this paper we will discuss strategies for transferring data between XML documents and relational databases according to two mappings (a table-based mapping and an object-based mapping) commonly used to map DTDs to relational databases. Although the discussion largely focuses on the difference between using SAX- and DOM-based tools to transfer data, it also discusses a number of strategies for traversing both the XML and database hierarchies and the tradeoffs among them.

IMPORTANT: The table-based mapping and the object-based mapping are not discussed in this paper. However, you must understand them before reading this paper. For more information, see Mapping DTDs to Databases.

WARNING: This is not the best paper I have ever written, largely due to lack of time to research all of the issues involved. If you find factual errors -- I'm sure there are more than what have already been found -- please contact me. Thanks.

2) Table-based mapping

The table-based mapping views an XML document as a serialized table. That is, the structure of the document must be as follows, where the outermost (<Tables>) element is needed only if the document contains more than one table:

   <Tables>        <Table>           <Row>              <Column_1>...</Column_1>              ...              <Column_n>...</Column_n>           </Row>           <Row>              <Column_1>...</Column_1>              ...              <Column_n>...</Column_n>           </Row>           ...        </Table>     </Tables>  

The table-based mapping has the advantage of simplicity, which makes it useful as a model for writing data transfer tools. However, because the model applies only to a limited set of XML documents, the tools are of limited use. They are most useful for using XML to transfer data between relational databases or for providing single-table access to the database.

For a complete discussion of the table-based mapping, see see Mapping DTDs to Databases".

2.1) Transferring data from XML to the database

The code to transfer data from XML to the database follows a common pattern, regardless of whether it uses SAX or DOM:

  1. Table element start: prepare an INSERT statement
  2. Row element start: clear INSERT statement parameters
  3. Column elements: buffer PCDATA and set INSERT statement parameters
  4. Row element end: execute INSERT statement
  5. Table element end: close INSERT statement

The code does not make any assumptions about the names of the tags. In fact, it uses the name of the table-level tag to build the INSERT statement and the names of the column-level tags to identify parameters in the INSERT statement. Thus, these names could correspond exactly to the names in the database or could be mapped to names in the database using a configuration file.

Here is the code using SAX for a document containing a single table:

   int state = UNKNOWN;     PreparedStatement stmt;     StringBuffer data;          public void startElement(String uri, String name, String qName,                              Attributes attr) {        if (state == UNKNOWN) {           stmt = getInsertStmt(name);           state = TABLE;                 }        else if (state == TABLE) {           state = ROW;           stmt.clearParameters();        } else if (state == ROW) {           state = COLUMN;           data = new StringBuffer();        } else { // if (state == COLUMN)           throw new SAXException("XML document nested too deep.");        }     }          public void characters (char[] chars, int start, int length) {        if (state == COLUMN)           data.append(chars, start, length);     }          public void endElement(String uri, String name, String qName) {        if (state == TABLE) {           stmt.close();           state = UNKNOWN;        }        else if (state == ROW) {           stmt.executeUpdate();           state = TABLE;        } else if (state == COLUMN) {           setParameter(stmt, name, data.toString());           state = ROW;        } else { // if (state == UNKNOWN)           throw new SAXException("Invalid program state.");        }     }  

And here is the code using DOM for a document containing a single table:

   void processTable(Document doc) {        Element table = doc.getDocumentElement();        PreparedStatement stmt = getInsertStmt(table.getNodeName());        Element row = (Element)table.getFirstChild();        while (row != null) {           stmt.clearParameters();           processRow(stmt, row);           row = row.getNextSibling();        }     }          void processRow(PreparedStatement stmt, Element row) {        Element column = (Element)row.getFirstChild();        while (column != null) {           processColumn(stmt, column);           column = column.getNextSibling();        }        stmt.executeUpdate();     }          void processColumn(PreparedStatement stmt, Element column) {        String data = getTextChildren(column);        setParameter(stmt, column.getNodeName(), data);     }  

As you can see, the code is functionally identical, the only difference being the use of SAX or DOM.


 Additional Info
 No. 282
 Posted on 8 June, 2006
Bookmark This Page
Facebook MySpace Twitter Digg stumbleupon friendfeed Delicious


Link to us from your website or blog by using the code below in your html
@2008 ebooklobby privacy policy email: info [at]