XSLT

[Figure: XSL transformation processing]

XSLT (short for XSL Transformations) is an XML-based language used to transform XML documents. The original document is not changed; instead, a new document is created based on the content of the existing one. The new document may be XML or some other format, such as HTML or plain text. XSLT is most often used to convert data between XML schemas, or to convert XML data into web pages or PDF documents.

XSLT grew out of the W3C's efforts to develop the Extensible Stylesheet Language (XSL) during 1998 and 1999, which also produced XSL Formatting Objects (XSL-FO) and the XML Path Language, XPath.

The editor of the first version (and the principal designer of the language) was James Clark. The version in most common use today is XSLT 1.0, which became a W3C Recommendation on 16 November 1999. Version 2.0, edited under the direction of Michael Kay, adds many extensions and reached W3C Candidate Recommendation status on 3 November 2005.

Overview


The XSLT language is declarative: rather than listing an imperative sequence of actions to perform in a stateful environment, an XSLT stylesheet consists of a collection of template rules. Each rule specifies what to add to the result tree when the XSLT processor, scanning the source tree according to a fixed algorithm, finds a node that meets the rule's conditions. Instructions within template rules are processed as if they were sequential instructions, but in fact they are functional expressions that stand for their evaluated results: ultimately, nodes to be added to the result tree.

The XSLT specification defines a transformation in terms of source and result trees to avoid locking implementations into system-specific APIs and memory, network and file I/O issues. For example, the specification does not mandate that a source tree always be derived from an XML file, since it may be more efficient for the processor to read from an in-memory DOM object or some other implementation-specific representation. Output may be in a format not envisioned by the XSLT language's designers. However, XSLT processing often begins by reading a serialized XML input document into the source tree and ends by writing the result tree to an output document. The output document may be XML, but can be HTML, RTF, TeX, delimited files, plain text or any other format that the XSLT processor is capable of producing.

XSLT relies upon the W3C's XPath language for identifying subsets of the source document tree, as well as for performing calculations. XPath also provides a range of functions, which XSLT itself further augments. This reliance upon XPath adds a great deal of power and flexibility to XSLT.
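
As a rough illustration of the kind of node selection XPath provides, the sketch below uses Python's standard `xml.etree.ElementTree` module. That module supports only a limited subset of XPath and is not an XSLT processor; the sample document and the variable names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for this illustration.
SOURCE = """
<persons>
  <person username="MP123456"><name>John</name><family_name>Smith</family_name></person>
  <person username="PK123456"><name>Morka</name><family_name>Ismincius</family_name></person>
</persons>
"""

root = ET.fromstring(SOURCE)

# Path expression: every <name> child of every <person> child of the root element.
names = [node.text for node in root.findall("person/name")]

# A predicate narrows the selection to the person with a given attribute value.
family_name = root.find("person[@username='MP123456']/family_name").text

print(names)        # ['John', 'Morka']
print(family_name)  # Smith
```

In a stylesheet, expressions of this kind appear in attributes such as `select` and `match`.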

Most current operating systems ship with an XSLT processor. For example, Windows XP comes with the MSXML3 library, which includes an XSLT processor. Earlier versions may be upgraded, and there are many alternatives; see the External links section.

The W3C finalized the XSLT 1.0 specification in 1999. The XSLT 2.0 specification is currently a Candidate Recommendation.

Example 1 (transforming XML to XML)

Transforming the XML document

<persons>
  <person username="MP123456">
    <name>John</name>
    <family_name>Smith</family_name>
  </person>
  <person username="PK123456">
    <name>Morka</name>
    <family_name>Ismincius</family_name>
  </person>
</persons>

by the following XSLT transform:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>
  <!-- Root rule: wrap all results in a <transform> element -->
  <xsl:template match="/">
    <transform>
      <xsl:apply-templates/>
    </transform>
  </xsl:template>
  <!-- One <record> per <person>, keeping the username and the given name -->
  <xsl:template match="person">
    <record>
      <username>
        <xsl:value-of select="@username"/>
      </username>
      <name>
        <xsl:value-of select="name"/>
      </name>
    </record>
  </xsl:template>
</xsl:stylesheet>

we obtain a new document with a different structure:

<?xml version="1.0" encoding="UTF-8"?>
 <transform>
    <record>
       <username>MP123456</username>
       <name>John</name>
    </record>
    <record>
       <username>PK123456</username>
       <name>Morka</name>
    </record>  
 </transform>
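
To make the mapping between the two documents concrete, the following sketch performs the same restructuring by hand with Python's standard `xml.etree.ElementTree` module. It is not an XSLT processor and does not read the stylesheet; the function name `transform` is an assumption for this example, and whitespace text nodes are simply ignored.

```python
import xml.etree.ElementTree as ET

SOURCE = """
<persons>
  <person username="MP123456"><name>John</name><family_name>Smith</family_name></person>
  <person username="PK123456"><name>Morka</name><family_name>Ismincius</family_name></person>
</persons>
"""

def transform(source_root):
    """Hand-written stand-in for the two template rules of Example 1."""
    # Role of the rule matching "/": wrap everything in <transform>.
    result = ET.Element("transform")
    for person in source_root.findall("person"):
        # Role of the rule matching "person": emit one <record> per person.
        record = ET.SubElement(result, "record")
        ET.SubElement(record, "username").text = person.get("username")
        ET.SubElement(record, "name").text = person.findtext("name")
    return result

result = transform(ET.fromstring(SOURCE))
output = ET.tostring(result, encoding="unicode")
print(output)
```

The loop here is written out explicitly, whereas in the stylesheet the processor decides which rule to apply to each node it visits.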

Example 2 (transforming XML to XHTML)

Example XSLT Stylesheet:

<?xml version="1.0" encoding="UTF-8" ?>

 <xsl:stylesheet version="1.0" 
         xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
         xmlns="http://www.w3.org/1999/xhtml">
     <xsl:output method="xml" indent="yes"
         doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN" 
         doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/>
     
     <!--XHTML document outline--> 
     <xsl:template match="/">
         <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
             <head>
                 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
                 <title>test1</title>
                 <style type="text/css">
                     h1          { padding: 10px; padding-width: 100%; background-color: silver }
                     td, th      { width: 40%; border: 1px solid silver; padding: 10px }
                     td:first-child, th:first-child  { width: 20% } 
                     table       { width: 650px }
                 </style>
             </head>
             <body>
                 <xsl:apply-templates/>
             </body>
         </html>
     </xsl:template>
     
     <!--Table headers and outline-->
     <xsl:template match="domains/*">
         <h1><xsl:value-of select="@ownedBy"/></h1>
         <p>The following host names are currently in use at
           <strong><xsl:value-of select="local-name(.)"/></strong>
         </p>
         <table>
             <tr><th>Host name</th><th>URL</th><th>Used by</th></tr>
             <xsl:apply-templates/>
         </table>
     </xsl:template>
     
     <!--Table row and first two columns-->
     <xsl:template match="host">
         <!--Create variable for 'url', as it's used twice-->
         <xsl:variable name="url" select=
             "normalize-space(concat('http://', normalize-space(node()), '.', local-name(..)))"/>
         <tr>
             <td><xsl:value-of select="node()"/></td>
             <td><a href="{$url}"><xsl:value-of select="$url"/></a></td>
             <xsl:apply-templates select="use"/>
         </tr>
     </xsl:template>

     <!--'Used by' column-->
     <xsl:template match="use">
         <td><xsl:value-of select="."/></td>
     </xsl:template>
         
 </xsl:stylesheet>

Example input XML for the above stylesheet:

<?xml version="1.0" encoding="UTF-8"?>

 <domains>
     <sun.com ownedBy="Sun Microsystems Inc.">
         <host>
             www
             <use>World Wide Web site</use>
         </host>
         <host>
             java
             <use>Java info</use>
         </host>
     </sun.com>
     
     <w3.org ownedBy="The World Wide Web Consortium">
         <host>
             www
             <use>World Wide Web site</use>
         </host>
         <host>
             validator
             <use>web developers who want to get it right</use>
         </host>
     </w3.org>
 </domains>

Output XHTML that this would produce (whitespace has been adjusted here for clarity):

<?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
                       "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
   <head>
     <meta content="text/html;charset=UTF-8" http-equiv="Content-Type" />
     <title>test1</title>
     <style type="text/css">
       h1          { padding: 10px; padding-width: 100%; background-color: silver }
       td, th      { width: 40%; border: 1px solid silver; padding: 10px }
       td:first-child, th:first-child  { width: 20% } 
       table       { width: 650px }
     </style>
   </head>
   <body>
     <h1>Sun Microsystems Inc.</h1>
     <p>The following host names are currently in use at <strong>sun.com</strong></p>
     <table>
       <tr>
         <th>Host name</th>
         <th>URL</th>
         <th>Used by</th>
       </tr>
       <tr>
         <td>www</td>
         <td><a href="http://www.sun.com">http://www.sun.com</a></td>
         <td>World Wide Web site</td>
       </tr>
       <tr>
         <td>java</td>
         <td><a href="http://java.sun.com">http://java.sun.com</a></td>
         <td>Java info</td>
       </tr>
     </table>
     
     <h1>The World Wide Web Consortium</h1>
     <p>The following host names are currently in use at <strong>w3.org</strong></p>
     <table>
       <tr>
         <th>Host name</th>
         <th>URL</th>
         <th>Used by</th>
       </tr>
       <tr>
         <td>www</td>
         <td><a href="http://www.w3.org">http://www.w3.org</a></td>
         <td>World Wide Web site</td>
       </tr>
       <tr>
         <td>validator</td>
         <td><a href="http://validator.w3.org">http://validator.w3.org</a></td>
         <td>web developers who want to get it right</td>
       </tr>
     </table>
   </body>
 </html>

Template rule processing

XSLT stylesheets are declarative, not procedural; rather than defining a sequence of operations to execute, they define rules and other hints applied during processing, according to a fixed algorithm. The algorithm, which is somewhat complicated, is described below, although many of its esoteric details have been omitted.

Every XSLT processor is required to behave as if it had performed the following steps:

  1. Read the XSLT stylesheet with an XML parser and convert its content to an abstract tree of nodes (the stylesheet tree), according to the XPath data model. "Compile-time" stylesheet syntax errors are detected at this stage. Stylesheets can be modular, so any inclusions (xsl:include and xsl:import instructions) are also handled at this stage, bringing template rules and other top-level stylesheet elements from other XSLT documents into the stylesheet tree.
  2. Read the input XML with an XML parser and convert its content to a tree of nodes (the source tree), according to the XPath data model. The stylesheet may reference other XML sources via document() function calls. These are typically evaluated at run time, since their locations may have to be calculated and the function calls may never be reached. (The examples above do not reference any other source documents.)
  3. Strip whitespace-only text nodes from the stylesheet tree, except those that are descendants of xsl:text elements. This allows nested elements in template rules to be on separate ('pretty') lines in the original XSLT without resulting in unintended whitespace being added to the result tree.
  4. Strip whitespace-only text nodes from the source tree, if xsl:strip-space instructions are present in the stylesheet. This allows 'pretty' input XML to be processed in a manner that ignores extraneous whitespace. (The example above does not use this feature.)
  5. Supplement the stylesheet tree with a trio of built-in template rules that provide default behavior for any node type that might be encountered during processing. One template rule is provided for processing the root node or any element node; it directs the processor to continue and process each child node. Another template rule is provided for any text node or attribute node; it directs the processor to copy the node's text to the result tree. A third template rule is provided for any comment node or processing instruction node; it is a no-op. Templates explicitly provided in the stylesheet override some or all of these. If the stylesheet contains no explicit template rules, the built-in rules cause a recursive descent of the source tree in which only text nodes are copied to the result tree (attribute nodes are not reached because they are not "children" of their parent elements). This result is rarely desirable, as it tends to be just a concatenation of the non-markup character data from the XML source.
  6. Process the root node of the source tree. The procedure for node processing is described below.
  7. Serialize the result tree, if desired, according to hints provided in the xsl:output instruction.
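
The effect of the built-in template rules in step 5 can be approximated with a few lines of Python. This is only a sketch of the described behavior using the standard `xml.etree.ElementTree` module, not an XSLT processor; the function name `apply_builtin_rules` and the sample document are assumptions for this example.

```python
import xml.etree.ElementTree as ET

def apply_builtin_rules(element):
    """Approximate what a stylesheet with no explicit rules would emit.

    Elements are descended into, text is copied, and attributes are never
    visited because they are not children of their element. Comments and
    processing instructions are already dropped by ElementTree's default
    parser, which mirrors the no-op rule.
    """
    pieces = []
    if element.text:
        pieces.append(element.text)                # text node: copy its text
    for child in element:
        pieces.append(apply_builtin_rules(child))  # element node: process children
        if child.tail:
            pieces.append(child.tail)              # text that follows the child
    return "".join(pieces)

doc = ET.fromstring(
    '<person username="MP123456"><!-- ignored -->'
    "<name>John</name> <family_name>Smith</family_name></person>"
)
text = apply_builtin_rules(doc)
print(text)  # John Smith
```

The attribute value `MP123456` does not appear in the result, which matches the remark that attribute nodes are not reached by the default descent.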

When processing a node, the following steps are undertaken:

  1. The best-matching template rule for that node is located. This is facilitated by each template rule's "match" pattern (an XPath-like expression), indicating the nodes to which it can be applied. Each template is assigned a relative priority and import precedence by the processor to help ease conflict resolution. The order of template rules in the stylesheet can also help resolve conflicts between templates which match the same nodes, but it does not affect the order in which nodes are processed.
  2. Template rule contents are instantiated. Elements in the XSLT namespace are treated as instructions and have special semantics that guide how they are interpreted. Such elements are typically prefixed with xsl:, but it is the namespace identifier bound to the prefix, not the prefix itself, that matters. Other elements and text nodes in the template rule are copied verbatim (namespaces and all) to the result tree. Comments and processing instructions are ignored.
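
The conflict resolution in step 1 can be sketched as follows. This is a simplified model, not the algorithm of any particular processor: the `TemplateRule` class and `best_rule` function are hypothetical names, match patterns are replaced by plain Python predicates on the element name, and the final tie-break (taking the rule that appears last) models only the recovery behavior that XSLT 1.0 permits when two rules are otherwise equal.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class TemplateRule:
    label: str
    matches: Callable[[str], bool]  # hypothetical stand-in for a match pattern
    priority: float
    import_precedence: int = 0

def best_rule(rules, tag):
    """Pick the matching rule with the highest import precedence, then the
    highest priority, then the latest position in the stylesheet."""
    candidates = [
        (rule.import_precedence, rule.priority, position, rule)
        for position, rule in enumerate(rules)
        if rule.matches(tag)
    ]
    if not candidates:
        return None  # a real processor would fall back to the built-in rules
    return max(candidates, key=lambda c: c[:3])[3]

rules = [
    # A pattern such as "*" has default priority -0.5 in XSLT 1.0.
    TemplateRule("any element", lambda tag: True, priority=-0.5),
    # A pattern that is a plain element name has default priority 0.
    TemplateRule("person", lambda tag: tag == "person", priority=0.0),
]

print(best_rule(rules, "person").label)  # person
print(best_rule(rules, "name").label)    # any element
```

Note that the order of the rules only decides between otherwise equal candidates; it plays no part in deciding which nodes are visited.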

The XSLT instruction xsl:apply-templates, when processed, results in a new set of nodes being selected for processing. The nodes are identified via an XPath expression. By default, the nodes are processed in document order (the relative order in which they appear in the original document).
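
Document order can be seen in a small Python sketch using the standard `xml.etree.ElementTree` module, whose `iter()` method walks a tree in that order. The abbreviated sample document below is an assumption loosely based on Example 2.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<domains>"
    "<sun.com><host>www</host><host>java</host></sun.com>"
    "<w3.org><host>validator</host></w3.org>"
    "</domains>"
)

# Elements are visited in the order their start tags appear in the document.
all_tags = [element.tag for element in doc.iter()]

# Selecting only <host> elements keeps that same relative order.
hosts = [host.text for host in doc.iter("host")]

print(all_tags)  # ['domains', 'sun.com', 'host', 'host', 'w3.org', 'host']
print(hosts)     # ['www', 'java', 'validator']
```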

XSLT extends XPath's function library and allows XPath variables to be defined. These variables have different scopes in the stylesheet, depending on where they are defined, and their values can originate outside the stylesheet. A variable's value cannot be changed during processing.

Although this procedure may sound complicated, its net effect is to make XSLT much like other web templating languages. If the stylesheet consists only of a single template rule that matches the root node, everything in the template is essentially copied to the output, except for the XSLT instructions (the 'xsl:…' elements), which are replaced by computed content. XSLT even offers an abbreviated stylesheet format ("literal result element as stylesheet") for these simple, single-template transformations. However, the ability to define separate template rules greatly increases XSLT's versatility and efficiency, especially when producing output that is very similar to the input.

See also

  • XML transformation language, a computer language designed specifically to transform an input XML document into an output XML document which satisfies some specific goal.

External links

Implementations

  • Implementations for Java:
  • Implementations for C or C++:
  • Implementations for Perl:
  • Implementations for Python:
    • 4XSLT, in the 4Suite toolkit by Fourthought, Inc.
    • lxml by Martijn Faassen is a Pythonic wrapper of the libxslt C library
  • Implementations for JavaScript:
    • Google AJAXSLT: an implementation of XSLT in JavaScript, intended for use in Ajax applications. Because XSLT uses XPath, it is also an implementation of XPath that can be used independently of XSLT.
  • Implementations for specific operating systems:
    • Microsoft's MSXML library may be used in various Microsoft Windows application development environments and languages, such as .NET, Visual Basic, C, and JScript.
    • Saxon.NET (project weblog), an IKVM.NET-based port of Michael Kay's and Saxonica's Saxon processor, provides XSLT 2.0, XPath 2.0, and XQuery 1.0 support on the .NET platform.
  • Implementations integrated into web browsers (see Comparison of layout engines (XML)):
    • Mozilla has native XSLT support based on TransforMiiX.
    • Safari 1.3+ has native XSLT support.
    • X-Smiles has native XSLT support.
    • Opera has native XSLT support since Version 9.
    • Internet Explorer 6 supports XSLT 1.0 via the MSXML library (described above). IE5 and IE5.5 came with an earlier MSXML component that only supported an older, nonrecommended dialect of XSLT. A newer version of MSXML can be downloaded and installed separately to enable IE5 and IE5.5 to support XSLT 1.0 through scripting, and if certain Windows Registry keys are modified, the newer library will replace the older version as the default used by IE.
