Employee Details Employee Name John Roger



Extensible Markup Language (XML) is a new markup language that is used to store data in a structured format. A markup language provides you with a set of tags that you can use to describe and format the text in a document. Consider a situation in which you need to create a Web page, which involves deciding the structure of the page and the content to be included in the page. You can do all of this easily by using the tags of a markup language.

For example, markup languages such as HTML have a set of predetermined tags that you can use to format the text in a document. However, sometimes the elements that a markup language provides do not meet your needs. In such cases, you need to create your own tags. This can be achieved using XML. Therefore, XML is called a meta-markup language.

This chapter introduces the Standard Generalized Markup Language (SGML) and the markup languages derived from it, such as XML and Hypertext Markup Language (HTML). This enables you to compare the different markup languages that are available and identify the benefits that XML provides. In addition, this chapter focuses on the various components of an XML document. These components include type definitions, tags, attributes, and comments. Using this knowledge, you will learn to create a sample XML document. Finally, this chapter discusses namespaces and XML schemas used with XML documents.


Having discussed markup languages, such as SGML and HTML, we will now look at XML and present the advantages offered by XML over these languages.

XML, a subset of SGML, is a text-based markup language used to describe data. However, you cannot format data using XML. To do so, you need to use style sheets. You will learn about style sheets later in this section.

You can create an XML document by using authoring tools that are simple and readily available. Similar to HTML documents, you can create an XML document in a text editor, such as Notepad. To store a text file as an XML document, you need to save it with the .xml extension. You will learn to create an XML document later in this section.

Another significant advantage of XML is its ability to create tags with names that users can identify easily. This implies that XML allows you to create elements, attributes, and containers with meaningful names, as and when the user requires.

Now we will consider another example of an HTML file:

height="40" align="right">

Home Page

Employee Details

Employee Name

John Roger

Employee Name

George Smith

Employee Name

Bob Murray

Employee Name

Daniel Clark

The output of the preceding code is shown in the following figure.

Figure: Another HTML Document.


As you can see, the tags that HTML uses are difficult to interpret. For example, it would be easier for a user to interpret a tag with the name Employee_Name than one of the generic HTML tags used in the preceding document. Therefore, the preceding document can include custom tags by using XML, as shown:

John Roger

George Smith

Bob Murray

Daniel Clark

You can view the output of the preceding code by saving the Notepad file with the name Employee.xml and opening the file in Internet Explorer. The XML document appears as shown in the following figure.

Figure: A Sample XML Document.
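As a minimal sketch of how an application might consume such custom tags, the following Python snippet parses a small document with the standard-library xml.etree.ElementTree module. Only the Employee_Name tag comes from the text above. The Employees root element is an assumption added here because a well-formed XML document needs a single root element, and the function name employee_names is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: only Employee_Name is taken from the text;
# the Employees root element is assumed for well-formedness.
EMPLOYEE_XML = """<?xml version="1.0"?>
<Employees>
  <Employee_Name>John Roger</Employee_Name>
  <Employee_Name>George Smith</Employee_Name>
  <Employee_Name>Bob Murray</Employee_Name>
  <Employee_Name>Daniel Clark</Employee_Name>
</Employees>"""


def employee_names(xml_text):
    """Return the text of every Employee_Name element, in document order."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter("Employee_Name")]


if __name__ == "__main__":
    print(employee_names(EMPLOYEE_XML))
```

Because the tag name describes the data, the code can ask for employee names directly instead of guessing which table cell or heading holds them.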


The ability to create custom and meaningful tags in XML makes it a hardware and software independent markup language. This implies that an XML document can be interpreted easily by any computer that is running on any operating system. Therefore, XML is widely used as a markup language that transfers structured data over a network. In addition, XML is used to transfer structured data in high-level Business-to-Business (B2B) transactions.

The following list summarizes the advantages of XML as a markup language:

  • In recent times, XML has been used as a standard markup language for the exchange of data over a network. This is because XML allows you to describe content in a text-based format that both the sending and the receiving application can understand easily. Due to its extensibility and universal format, XML is widely used for data exchange between applications.

  •  XML can make searching for information on the Internet easier. At present, popular search engines return huge amounts of data because they search either the entire text of an HTML page or the keywords stored in its metadata. However, a search based on metadata keywords is not accurate because keywords can be misleading, so the search engines need to do a full-page search, which is time consuming. In addition, because HTML tags describe only the format of a page and not the content stored in it, the results returned by searching the HTML tags are not satisfactory.

For example, if you search for sites on Linux programming by specifying the keywords Linux programming, the search engine returns all pages that contain these two words.

The search result might also include Web pages on Windows programming that mention Linux only in passing. The search engine cannot judge the context in which the words Linux and programming are used in a Web page.

However, consider a situation in which XML is used to create Web pages, which include tags as shown in the following example:



If a search engine parses the document and retrieves the data only from those tags, then the result returned by the search engine is accurate. This helps to omit the several thousand Web pages that contain these keywords in a different context. However, this scenario is not realistic because the Web will not become XML based in the near future.
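The contrast between a full-text search and a tag-aware search can be sketched as follows. The page contents and the tag names Page, Topic, and Summary are hypothetical and chosen only for illustration; they are not the tags from the original example.

```python
import xml.etree.ElementTree as ET

# Two hypothetical pages. Both mention Linux and programming somewhere,
# but only the first one is actually about Linux programming.
PAGES = {
    "linux.xml": "<Page><Topic>Linux programming</Topic>"
                 "<Summary>Writing system tools for Linux.</Summary></Page>",
    "windows.xml": "<Page><Topic>Windows programming</Topic>"
                   "<Summary>Mentions Linux in passing.</Summary></Page>",
}


def full_text_matches(pages, words):
    """Keyword search: a page matches if every word appears anywhere in it."""
    return sorted(
        name for name, text in pages.items()
        if all(word.lower() in text.lower() for word in words)
    )


def topic_matches(pages, words):
    """Tag-aware search: only the text of the Topic element is examined."""
    hits = []
    for name, text in pages.items():
        topic = ET.fromstring(text).findtext("Topic", default="")
        if all(word.lower() in topic.lower() for word in words):
            hits.append(name)
    return sorted(hits)
```

With the sample data, the keyword search returns both pages, whereas the tag-aware search returns only the page whose topic is Linux programming.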

  • XML allows you to create custom tags. The tags that you create using XML can be based on the requirements of a document; therefore, they can be used as a vocabulary for all related documents. For example, you can create a vocabulary of tags for describing the details of an employee. After you create the tags, you can use them as a template to describe the information about all the employees of the organization.

An example of such a template is shown in the following code:

John Smith


HR Executive

Human Resources

XML is used to describe data. However, you cannot use XML to format data for display in a Web page. A markup language, such as HTML, is used to describe data, and it includes information about the presentation of data in a Web page. Therefore, to present the same data in different formats by using HTML, you need to create separate Web pages.

To format the data in an XML document, you can use style sheets, such as Extensible Stylesheet Language (XSL) or Cascading Style Sheets (CSS). These style sheets contain information about how to present the data in the document. As a result, using XML, you can present the same data in different formats.
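A style sheet processor is beyond the scope of a short example, so the following sketch only imitates the idea in plain Python: one XML document holds the data, and two separate functions decide how to present it. The tag names Employee, Name, Designation, and Department and both function names are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical employee record; the tag names are assumed.
EMPLOYEE = ("<Employee><Name>John Smith</Name>"
            "<Designation>HR Executive</Designation>"
            "<Department>Human Resources</Department></Employee>")


def as_plain_text(xml_text):
    """Present the record as one 'tag: value' line per child element."""
    root = ET.fromstring(xml_text)
    return "\n".join(f"{child.tag}: {child.text}" for child in root)


def as_html_table(xml_text):
    """Present the same record as a single HTML table."""
    root = ET.fromstring(xml_text)
    rows = "".join(
        f"<tr><th>{escape(child.tag)}</th><td>{escape(child.text or '')}</td></tr>"
        for child in root
    )
    return f"<table>{rows}</table>"
```

The data is written once. Only the presentation code changes, which is the role that an XSL or CSS style sheet plays for an XML document.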

Components of an XML Document

An XML document consists of several components, such as declaration statements, elements, tags, and attributes. The following sections discuss these components in detail.

Markup Syntax

Components, such as declaration statements or markup tags, define the syntax for creating an XML document. The syntax used to create an XML document is called markup syntax. The markup syntax is used to define the structure of the data in the document. The markup syntax includes all tags, DOCTYPE declaration statements, comments, DTDs, and character references.

XML Entities

In addition to the markup syntax, an XML document consists of the content or data to be displayed. Consider the following example:

John Smith


HR Executive

Human Resources

In this case, the tags that enclose the data are the markup syntax for the Employees.xml file. The content of the XML file is the data enclosed within the tags, such as John Smith, 30, HR Executive, and Human Resources.

The data stored in an XML document is in the form of text, and it is commonly called an XML entity or a text entity. The text entity is used to store the text in the form of character values as defined in the Unicode Character Set. The following example shows the text data in an XML document:

John Smith

XML Declaration Statement

The XML declaration statement is included at the beginning of an XML document. It is used to indicate that the specified document is an XML document. The XML declaration statement includes a keyword, xml, preceded by a question mark (?). This statement includes the XML specification to which the XML document adheres. For example, if the XML document that you create is based on XML Specification 1.0, then the XML declaration statement would be as shown here:

<?xml version="1.0"?>

In addition to the information about the XML version being used, you might provide information such as whether external markup declaration statements are included in the XML document. To do this, you can use the standalone keyword. Consider the following declaration statement:

<?xml version="1.0" standalone="yes"?>

The attribute value of yes in the preceding code indicates that no external markup declarations are used in the XML document. You can include external markup declaration statements in the XML document by changing the attribute value to no.
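To see how a parser reports the declaration, the following sketch uses the XmlDeclHandler callback of Python's standard xml.parsers.expat module. The function name read_declaration and the Employees element are assumptions made for this example.

```python
import xml.parsers.expat


def read_declaration(xml_text):
    """Return (version, standalone) from the XML declaration, or None."""
    found = []

    def on_declaration(version, encoding, standalone):
        # standalone is 1 for "yes", 0 for "no", and -1 when it is omitted.
        found.append((version, standalone))

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_declaration
    parser.Parse(xml_text, True)
    return found[0] if found else None


if __name__ == "__main__":
    print(read_declaration('<?xml version="1.0" standalone="yes"?><Employees/>'))
```

A document without a declaration still parses; the handler is simply never called, so the function returns None.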


Another important component of an XML document is comment entries. Comments allow you to include instructions or notes in an XML document. These comments help you to provide any metadata about the document to the users of the document. Any data that is not part of the main content or the markup syntax can be included in comment entries.

The XML processor ignores any text that you include in comment entries. This implies that the XML processor does not execute the text in the comment entries. Therefore, you need to be careful while writing comments.

The syntax for writing a comment in an XML document is the same as that for writing a comment in an HTML document. The syntax is as shown:

<!-- comment text -->

The exclamation mark (!) in the preceding code indicates that the text within the tags is a comment entry. Consider the following example:

the organization.-->

John Smith


HR Executive

Human Resources

The use of comments in a document provides users with additional information about the document. However, while writing comments, you need to follow the listed guidelines:

  • You cannot include the comment entries in the beginning of an XML document. The first line in the XML document is essentially the XML declaration statement. Therefore, the following code snippet results in an error:


  •  organization.-->


  •  You cannot include two consecutive hyphens (--) within the comment text. For instance, the following code statement produces an error:


  •  organization.-->

  •  You cannot include comments within tags. For example, the following code statement produces an error:


  •  organization.--> >

  •  You cannot have nested comment entries. For example, the following code statement produces an error:


  •  of the organization.-->
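The four guidelines above can be checked with a parser. The following sketch feeds one valid document and four rule-breaking documents to Python's xml.etree.ElementTree; the Employees element, the comment texts, and the helper name is_well_formed are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A comment after the XML declaration is allowed.
VALID = '<?xml version="1.0"?><!-- Employee list --><Employees/>'
# A comment before the XML declaration is rejected.
BEFORE_DECLARATION = '<!-- Employee list --><?xml version="1.0"?><Employees/>'
# Two consecutive hyphens inside the comment text are rejected.
DOUBLE_HYPHEN = '<Employees><!-- name -- department --></Employees>'
# A comment placed inside a tag is rejected.
INSIDE_TAG = '<Employees <!-- Employee list --> ></Employees>'
# A nested comment is rejected.
NESTED = '<Employees><!-- outer <!-- inner --> --></Employees>'

if __name__ == "__main__":
    for text in (VALID, BEFORE_DECLARATION, DOUBLE_HYPHEN, INSIDE_TAG, NESTED):
        print(is_well_formed(text), text)
```

The nested case fails for the same reason as the double-hyphen case: the inner opening delimiter puts two consecutive hyphens inside the outer comment.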

XML Namespaces

The biggest advantage of using XML is the ability to create custom tags. You can create a vocabulary of tags that can be applied to an application or similar applications. Consider the Student.xml file that you created in the previous section. The file contains tags, such as Student, Name, Address, State, and Zip, which are used to describe the information about the students who are studying in Form 5.

Because XML allows you to declare user-defined tags, it is likely that another user will create a tag with the same name and use it in a different context. Consider a situation in which you create a tag to store the average scores of students who are studying in Form 5. However, another user can create a tag with the same name to store the average number of students who enroll for a course in a month. The following examples show use of the tag in different contexts:

of students

studying in form 5.-->

Harry Brown


Now, consider the following code that uses the tag in a different context:

of students

who enroll in a course in a month.-->



As you can see, use of the tag is different. This situation can lead to a problem if you try to integrate data from these two documents.

The World Wide Web Consortium (W3C), the group that issues XML specifications, found a solution to this problem in the form of namespaces. An XML namespace is a collection of names of elements or attributes. This collection of names can be referred to by a Uniform Resource Identifier (URI). Therefore, you can include a namespace in an XML document to uniquely identify the elements and attributes in the XML document.

Element and attribute names are stored in a structured format in a namespace. A DTD can be considered an example of a namespace that is referred to by its URL.

Before using a namespace, you need to declare it. The following section discusses how to declare namespaces.

Declaring XML Namespaces

You declare XML namespaces by using the xmlns keyword. A namespace is referred to by its URI. Therefore, while you are declaring a namespace, you also need to mention the URI that you use to access the namespace. The syntax for declaring a namespace is as follows:


As you can see, xmlns is an attribute that takes the URI of the namespace as its value. After you have created a namespace, you can create elements with the same name in different namespaces. For example, you can declare two namespaces and have a tag with the same name in both of these namespaces. Consider the following code:



To uniquely identify an element, you need to prefix it with the name of the namespace. However, you cannot prefix an element with the URI of the namespace. Therefore, while declaring a namespace, you can assign an alias name to the namespace. This alias name is called a namespace prefix, or simply a prefix. You then prefix the element name with the alias name to uniquely identify the element. The syntax for declaring a namespace with a prefix is as shown:


You can provide an alias name as follows:



The syntax to access an element in a namespace is this:


Now, the tag in the student namespace can be accessed as follows:

Using XML Namespaces

After declaring a namespace, you can use it in an XML document. The following code declares a namespace and uses it in the previous example:


student.dtd" ?>

average scores

of students who are studying in Form 5.-->

Harry Brown
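The namespace mechanics described in this section can be sketched with Python's xml.etree.ElementTree. The element names Report and Average, the two example.com URIs, and the values are all hypothetical; only the prefix student and the idea of two same-named tags in different namespaces come from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two namespaces are declared with xmlns:prefix,
# and both contain an element named Average.
DOCUMENT = """<?xml version="1.0"?>
<Report xmlns:student="http://example.com/ns/student"
        xmlns:course="http://example.com/ns/course">
  <student:Average>82</student:Average>
  <course:Average>35</course:Average>
</Report>"""

# Prefix-to-URI map used for lookups; the URIs must match the document.
NAMESPACES = {
    "student": "http://example.com/ns/student",
    "course": "http://example.com/ns/course",
}


def averages(xml_text):
    """Return the text of the Average element from each namespace."""
    root = ET.fromstring(xml_text)
    return {
        prefix: root.findtext(f"{prefix}:Average", namespaces=NAMESPACES)
        for prefix in NAMESPACES
    }


def qualified_tags(xml_text):
    """Return the fully qualified tag of each child of the root element."""
    return [child.tag for child in ET.fromstring(xml_text)]
```

Internally the parser replaces each prefix with its URI, so the two Average elements get different qualified names and can no longer be confused when documents are merged.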


XML Schemas

XML schemas are used to define the structure of an XML document. In this context, schemas are similar to DTDs that we discussed earlier. Because XML documents are platform independent, they are used extensively to transfer data over a network. This implies that an XML document should adhere to a standard structure. This standard structure is referred to as the XML schema.

When an XML document is transferred over a network, the receiving application might need to process the data to produce a result. However, first the application needs to validate the data in the XML document. Validating the data ensures that no errors are generated while processing the data. In addition, validating the data ensures that the application does not produce erroneous results after processing the data. To validate the data, the receiving application verifies that the data adheres to the XML schema. If the XML document does not adhere to the schema, an error is produced.

Defining XML Schemas

An XML schema is a document written in XML syntax that defines the structure of other XML documents. An XML schema consists of elements, attributes, and data types that are allowed to be included in an XML document. In addition, XML schemas define the rules that you need to follow while designing the structure of an XML document. An XML document that adheres to an XML schema is called the document instance of the schema and is an example of a valid document.

As discussed earlier, XML schemas provide a standard against which an XML document is validated. The following section discusses how to validate XML documents in detail.

Validating XML Documents

When an XML document is created and needs to be transferred over a network, both the sending and receiving applications need to mutually agree on a set of elements, attributes, data types, and the structure of the XML document. All this information is included in the XML schema on which the XML document is based.

However, doing this limits the user to only the components that are included in the XML schema. Therefore, the XML schema constrains the user, as described in the following list:

  • Data type constraint. A data type constraint defines the permissible data types that you can include in an XML document.

  • Content type constraint. As discussed, an XML schema defines the structure of XML data. Therefore, the content type constraint defines the structure and sequence of the data in an XML document.

When an XML document is transferred, the parsers of the receiving application validate the data based on the data type and content type constraints. The parsers then verify the validity of the document.
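Python's standard library does not include an XML Schema validator, so the following sketch does not use a real schema. Instead, it hand-codes the two kinds of constraint described above so that the difference between them is visible. The RULES table, the Employee element and its children, and the function name validate_employee are all assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules standing in for a schema: the required child elements
# in order (content type constraint) and a converter for each one
# (data type constraint).
RULES = [("Name", str), ("Age", int), ("Department", str)]


def validate_employee(xml_text):
    """Return a list of constraint violations; an empty list means valid."""
    root = ET.fromstring(xml_text)
    errors = []

    # Content type constraint: the children must appear in the given order.
    expected = [name for name, _ in RULES]
    actual = [child.tag for child in root]
    if actual != expected:
        errors.append(f"expected children {expected}, found {actual}")
        return errors

    # Data type constraint: each value must convert to the required type.
    for child, (name, convert) in zip(root, RULES):
        try:
            convert(child.text or "")
        except ValueError:
            errors.append(f"{name} is not a valid {convert.__name__}")
    return errors
```

A receiving application would run a check of this kind before processing the data and reject the document if the returned list is not empty.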

An example of an XML schema is a DTD. A DTD also defines the structure of a document; therefore, a DTD is used to validate an XML document. However, DTDs have some limitations because of which W3C had to look for an alternative in the form of XML schemas.

The following section compares traditional DTDs with XML schemas.

Comparing DTDs with XML Schemas

Before comparing XML schemas with DTDs, we will list the problems that users face while working with DTDs. This will help you analyze the advantages of XML schemas over DTDs.

XML Document Object Model (DOM)

The Document Object Model (DOM) class is an in-memory representation of an XML document. The DOM allows you to programmatically read, manipulate, and modify an XML document. The XmlReader class also reads XML; however, it provides non-cached, forward-only, read-only access. This means that the XmlReader cannot edit the value of an attribute or the content of an element, and it cannot insert or remove nodes. Editing is the primary function of the DOM. It is the common and structured way that XML data is represented in memory, although the actual XML data is stored in a linear fashion when in a file or coming in from another object. The following is XML data.







The following illustration shows how memory is structured when this XML data is read into the DOM structure.

XML document structure


Within the XML document structure, each circle in this illustration represents a node, which is called an XmlNode object. The XmlNode object is the basic object in the DOM tree. The XmlDocument class, which extends XmlNode, supports methods for performing operations on the document as a whole, for instance, loading it into memory or saving the XML to a file. In addition, XmlDocument provides a means to view and manipulate the nodes in the entire XML document. Both XmlNode and XmlDocument have performance and usability enhancements, and have methods and properties to:

  • Access and modify nodes specific to DOM, such as element nodes, entity reference nodes, and so on.

  • Retrieve entire nodes, in addition to the information the node contains, such as the text in an element node.

Note   If an application does not require the structure or editing capabilities provided by the DOM, the XmlReader and XmlWriter classes provide non-cached, forward-only stream access to XML. For more information, see Reading XML with the XmlReader and Writing XML with the XmlWriter.

Node objects have a set of methods and properties, as well as basic and well-defined characteristics. Some of these characteristics are:

  • Nodes have a single parent node, a parent node being the node directly above a given node. The only nodes that do not have a parent are the Document root, because it is the top-level node that contains the document itself, and document fragments.

  • Most nodes can have multiple child nodes, which are the nodes directly below a given node. The following is a list of node types that can have child nodes.

    • Document

    • DocumentFragment

    • EntityReference

    • Element

    • Attribute

The XmlDeclaration, Notation, Entity, CDATASection, Text, Comment, ProcessingInstruction, and DocumentType nodes do not have child nodes.

  • Nodes that are at the same level, represented in the diagram by the book and pubinfo nodes, are siblings.

One characteristic of the DOM is how it handles attributes. Attributes are not nodes that are part of the parent, child, and sibling relationships. Attributes are considered a property of the element node, and are made up of a name and value pair. For example, if you have XML data consisting of format="dollar" associated with the element price, the word format is the name, and the value of the format attribute is dollar. To retrieve the format="dollar" attribute of the price node, you call the GetAttribute method when the cursor is located at the price element node. For more information, see Accessing Attributes in the DOM.
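The node relationships described above can be explored with any DOM implementation. The following sketch uses Python's xml.dom.minidom rather than the .NET XmlDocument class discussed in the text, so the method names differ in case (getAttribute instead of GetAttribute). The book, pubinfo, and price format="dollar" names come from the text; the other element names and all values are assumptions made for this illustration.

```python
from xml.dom import minidom

# Reconstruction for illustration: book, pubinfo, and price format="dollar"
# come from the text; the remaining element names and values are assumed.
BOOKS_XML = (
    "<books>"
    "<book><author>Carson</author>"
    '<price format="dollar">31.95</price></book>'
    "<pubinfo><publisher>MSPress</publisher></pubinfo>"
    "</books>"
)


def describe(xml_text):
    """Collect a few facts about parents, siblings, and attributes."""
    document = minidom.parseString(xml_text)
    root = document.documentElement
    book, pubinfo = root.childNodes  # BOOKS_XML has no whitespace nodes
    price = book.getElementsByTagName("price")[0]
    return {
        # The document node is the parent of the root element.
        "root_parent": root.parentNode.nodeName,
        # The document node itself has no parent.
        "document_parent": document.parentNode,
        # book and pubinfo are at the same level, so they are siblings.
        "siblings": book.nextSibling is pubinfo,
        # The attribute is not a child node; only the text node is.
        "price_children": [node.nodeName for node in price.childNodes],
        # The attribute is read as a property of the element.
        "format": price.getAttribute("format"),
    }
```

The price element has exactly one child, its text node. The format attribute does not appear among the children and is reachable only through the element.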

As XML is read into memory, nodes are created. However, not all nodes are the same type. An element, in XML, has different rules and syntax than a processing instruction. So as various data is read, a node type is assigned to each node. This node type determines the characteristics and functionality of the node.

Microsoft has extended the APIs that are available in the W3C DOM Level 1 and Level 2 to make it easier to work with an XML document. While fully supporting the W3C standards, the additional classes, methods, and properties add functionality beyond what can be done using the W3C XML DOM. New classes enable you to access relational data, giving you methods for synchronizing with ADO.NET data, simultaneously exposing data as XML. For more information, see Synchronizing a DataSet with an XmlDataDocument.

The DOM is most useful for reading XML data into memory to change its structure, to add or remove nodes, or to modify the data held by a node, such as the text contained by an element. However, other classes are available that are faster than the DOM in other scenarios. For fast, non-cached, forward-only stream access to XML, use the XmlReader and XmlWriter classes. If you need random access with a cursor model and XPath, use the XPathNavigator class.

Types of XML Parsers

  • PHP provides three different built-in XML parsers

    1. Expat: This parser is a simple parser that linearly scans through an XML file and generates three types of "events": 1) start element events, 2) end element events, and 3) data content events. A program provides handlers for each of these three types of events.

      • Advantages

        1. Incrementally reads the file so large XML files can be easily accommodated

      • Disadvantages

        1. If you want to retain state information about the file you will need to allocate your own data structures

        2. If you want to be able to interpret data content then you typically must maintain your own stack of elements so that you know which element the data content belongs to.

        3. Even simple tasks require code that is more unwieldy than the simple XML parser discussed next

    2. SimpleXML: This parser creates a tree of objects from an XML file and provides a few simple commands that allow you to easily navigate through the tree.

      • Advantages

        1. It produces easy-to-read programs. It was the most recent of the three parsers to be added to PHP and essentially exploits the 80-20 rule: 80% of the things you want to do with an XML file are simple and hence should be handled with a simple parser.

        2. It creates an easy to navigate tree

      • Disadvantages

        1. It can only be used to read an XML document or modify the contents of an element. It cannot be used to alter the structure of an XML document by adding or deleting elements.

        2. It must read the entire XML document into memory.

        3. The functions it creates for navigating the tree are named after the element tags so if the names of the element tags change, then you must also edit the names of all the function calls. Hence you should only use simpleXML with a relatively stable XML document.

    3. DOM (Document Object Model): This parser builds a tree from the XML specification and provides numerous operations for navigating the tree, modifying the data content, and modifying the structural content by either adding or deleting elements.

      • Advantages

        1. It builds an easy to navigate tree

        2. You can add/delete elements from the XML document

        3. You can build an XML document from scratch

      • Disadvantages

        1. It must read the entire XML file into memory

        2. The functions for generating a tree are more generic than for simpleXML and hence the code is less easy to read.

Expat Parser

  • Creating the parser: $parser=xml_parser_create()

  • Must specify three handler functions to handle each of the three events:

    1. function start_element($parser, $element, $attrs)

      • $parser is a reference to the parser

      • $element is a string in capital letters with the element name (e.g., 'CATALOG', 'ARTIST')

      • $attrs is a reference to an associative array of attributes for this element

    2. function end_element($parser, $element)

      • $parser is a reference to the parser

      • $element is a string in capital letters with the element name (e.g., 'CATALOG', 'ARTIST'): Note that the '/' is stripped

    3. function data_content($parser, $data)

      • $parser is a reference to the parser

      • $data is a reference to the data content of an element

  • Associating functions with appropriate element handlers

    1. xml_set_element_handler($parser,"start_element","end_element"): sets the handlers for the start element and end element events

    2. xml_set_character_data_handler($parser,"data_content"): sets the handler for the data content

  • Template code for opening an XML file and parsing it. Replace the test.xml filename with your own filename.

    //Open XML file
    $fp=fopen("test.xml","r");

    //Read data
    while ($data=fread($fp,4096))
    {
        // 1) xml_parse returns true if it successfully parses the block of data
        // 2) feof($fp) tells the parser whether or not this will be the last
        //    block of data that it processes; feof returns true if $fp is now
        //    at end-of-file
        // 3) if the parse fails then print an appropriate error message and
        //    terminate the program
        xml_parse($parser,$data,feof($fp)) or
            die (sprintf("XML Error: %s at line %d",
                xml_error_string(xml_get_error_code($parser)),
                xml_get_current_line_number($parser)));
    }

    //Close the file and free the XML parser
    fclose($fp);
    xml_parser_free($parser);
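The same event-driven pattern can be tried with Python's xml.parsers.expat module, which wraps the same Expat library. This is a sketch for experimentation, not a translation of the PHP template: the function name collect_text is hypothetical, and unlike PHP's default behavior, Python's binding does not convert element names to capital letters. The sketch also shows the stack of open elements that the disadvantages listed earlier say you must maintain yourself.

```python
import io
import xml.parsers.expat


def collect_text(stream, chunk_size=4096):
    """Parse a byte stream in chunks; return (element path, text) pairs."""
    stack = []    # names of the elements that are currently open
    text = []     # pieces of character data seen since the last start tag
    results = []

    def start_element(name, attrs):
        stack.append(name)
        text.clear()

    def data_content(data):
        # Expat may deliver one run of text in several pieces.
        text.append(data)

    def end_element(name):
        content = "".join(text).strip()
        if content:
            results.append(("/".join(stack), content))
        text.clear()
        stack.pop()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = data_content

    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            parser.Parse(b"", True)  # tell the parser the input is finished
            break
        parser.Parse(chunk, False)
    return results


if __name__ == "__main__":
    sample = io.BytesIO(b"<CATALOG><CD><ARTIST>Bob Dylan</ARTIST></CD></CATALOG>")
    print(collect_text(sample))
```

Because the file is fed to the parser one chunk at a time, memory use stays small even for large files. The price is the bookkeeping in the three handlers. The sketch handles only text inside leaf elements; mixed content would need more state.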
