Markup is all about structuring documents so that they can be manipulated by programs such as web browsers or screen readers. An inherent problem since the early days of the web has been conflicting ideas about what the markup should be. Despite considerable effort to standardise, there is still a long way to go before all participants, such as browser developers, standards bodies, commercial companies and software developers, agree on a common standard. The elephant in the room has been Microsoft Internet Explorer's failure to comply with standards. Nevertheless, considerable advances have been made with XHTML and HTML5.

Let us clarify what we mean by XHTML5 with a practical, minimal document structure for an article we will be writing while learning HTML.

<!doctype html>
<html lang="en">
	<head>
		<title>My web document</title>
		<meta charset="utf-8" />
		<meta name="author" content="your name" />
		<meta name="description" content="Web development exercises" />
		<link href="main.css" rel="stylesheet" type="text/css" />
	</head>
	<body>
		<article>
			<header>
				<h1>TODO: page header</h1>
				<nav>
					<h1>TODO: Main navigation</h1>
				</nav>
			</header>
			<section id="main_section">
				<h1>TODO: Main content section</h1>
			</section>
			<footer>
				<h1>TODO: Page footer</h1>
			</footer>
		</article>
	</body>
</html>

  • The doctype declaration specifies the version of HTML and should always be the first line of code. Browsers will attempt to render documents without a doctype in their so-called "quirks mode", with possibly unexpected results.
  • The html element should always be included. It is referred to as the root element because it contains all other elements.
  • The head element wraps other elements that describe metadata, or information about the document, such as the title, author, character set and style sheet.
  • The body element wraps other elements that define document structure and content, such as sections, headings, paragraphs and so on.
  • Unlike other elements, the doctype, html, head, title and body may each appear only once per document.

HTML structure

The structure of an HTML document is simple; it starts with a doctype declaration for the version of HTML being used, followed by an html element which has the head and the body elements nested within it. The title element nested within the head element is the only other required element.
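
As a rough illustration of the paragraph above, the smallest document containing only the required parts might look like the sketch below. The title and paragraph text are placeholders invented for this example, and in practice you would usually also declare the character set, as in the fuller example earlier.

```html
<!doctype html>
<html lang="en">
	<head>
		<!-- title is the only other required element, nested in head -->
		<title>Minimal document</title>
	</head>
	<body>
		<!-- placeholder content; the body holds everything the reader sees -->
		<p>Hello, web.</p>
	</body>
</html>
```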

Document structure

The document content is inserted within the body element. Structuring your document is key to making it accessible to programs such as a browser. Documents are varied, and the challenge is to choose the right element(s) to mark up the content. At your disposal is the complete set of HTML elements; take a brief look, as we shall return to it in our tour of HTML.

The purpose of the example above is to write a web document for learning HTML. There are several alternatives, and at this stage we have marked up an article element that includes a header, a section and a footer for the article. This will evolve as we structure additional content and use further elements.

HTML syntax

Any new language looks cryptic and confusing at first, but a little practice will soon get you started. Here are some general syntax rules you should note.

  1. Elements generally have an opening and a closing tag, such as <h1> ... Some content ... </h1>, with the end tag having a / character to distinguish it from the opening tag.
  2. All elements can have attributes. Attributes are included in the opening tag of elements. Some attribute values are chosen by you and others are pre-defined. For example, in <section id="container"> the attribute is id and its value, chosen by us for this example, is container. Other attribute values are pre-defined: the link element, <link href="main.css" rel="stylesheet" type="text/css" />, has 3 attributes, href, rel and type. Each attribute is assigned a value in double quotation marks; href (short for hyperlink reference) has a value assigned by us, "main.css", whereas rel (short for relationship to this document) is assigned the value stylesheet, chosen from a list of permitted values. Similarly, the type attribute has pre-defined values.
  3. Some elements have all the data they need in attribute values and are self-closing without an end tag; for example, the link tag is self-closed with /> instead of </link>.
  4. Elements must be properly nested. For example, <p><em>Some content</em></p> is correctly nested, whereas <p><em>Some content</p></em> is not, because the em element is closed after the p element that contains it.
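
The four points above can be seen side by side in the sketch below. These are isolated fragments for illustration rather than a complete document, and the text content is made up for this example.

```html
<!-- 1. An opening and a closing tag around some content -->
<h1>Some content</h1>

<!-- 2. Attributes go in the opening tag: the id value is our own choice -->
<section id="container">
	<p>Section content</p>
</section>

<!-- 3. A self-closing element: ends with /> and has no separate end tag -->
<link href="main.css" rel="stylesheet" type="text/css" />

<!-- 4. Correct nesting: strong opens and closes inside the paragraph -->
<p>This paragraph is <strong>properly nested</strong>.</p>
```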

The X in XHTML

We defined hypertext as text that contains links to other texts; this is the HT in XHTML, and ML stands for Markup Language. So what about the X? The X comes from XML (Extensible Markup Language), a general-purpose markup language with stricter syntax rules than HTML. Why is that important? Because stricter rules lead to well-structured documents that are easier to develop and maintain.

All of this syntax can be overwhelming at first but remembering a handful of XML rules will pay off in the long run; here is a summary:

  1. Elements must be properly nested.
  2. Elements must have a start and an end tag, or be self-closed with />.
  3. Element names are case-sensitive.
  4. Attribute values must be enclosed in quotation marks; this tutorial uses double quotation marks throughout.
  5. Attributes may not be repeated within the same element.

Documents adhering to XML syntax are referred to as "well-formed" and are ideal for manipulation by computer programs such as browsers.
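
To make the five rules concrete, here is a small sketch: one fragment that follows them, followed by one counter-example per rule. The counter-examples are kept inside comments so that they are never parsed as markup, and the id values and text are invented for this illustration.

```html
<!-- Well-formed: nested, closed, lowercase, quoted, no repeated attributes -->
<section id="intro">
	<p>First line<br />second line</p>
</section>

<!-- Not well-formed (each line breaks one rule): -->
<!-- Rule 1: <p><em>overlapping tags</p></em> -->
<!-- Rule 2: <p>paragraph with no end tag -->
<!-- Rule 3: <P>start and end tag differ in case</p> -->
<!-- Rule 4: <section id=intro> value is not quoted -->
<!-- Rule 5: <section id="a" id="b"> attribute is repeated -->
```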

The 5 in XHTML5

Each HTML version has its own vocabulary of elements, and the 5 refers to the particular release of HTML. Remember, we specified this using the doctype declaration.

A document that conforms to a particular doctype and is also "well-formed" is considered a valid document, which is heaven for developers.

There are validation programs and Integrated Development Environments (IDEs) that will help you create valid documents, so don't be too concerned with all of the detail here; we shall return to and practise many of these concepts. Let us start with an exercise to clarify some of the fundamental points.

Exercise – XHTML5 ‐ Structure

The aim is to create a valid XHTML5 document.

Leave the presentation (spacing, font, position, colour, etc.) to the browser's default styling; we shall return to it in later tutorials.

  1. Create a directory named wd. Notice that names are case-sensitive. We shall use this wd directory to store the files for all exercises, which will make it easier to host the site.
  2. In the wd directory, create a file named structure.html.
  3. Copy and paste the HTML code above into it and save the structure.html file.
  4. Open the structure.html file in a web browser (use Firefox, not Internet Explorer). You should see the TODO: messages.
  5. Replace the TODO: text in the header h1 with your name. Save the file and refresh the browser to see the change.
  6. Replace the TODO: text in the section element with a series of paragraphs, using the p element, that mark up the following questions and their answers (you will find the answers in the tutorials!):
    1. What is a markup?
    2. What is hypertext and hypermedia?
    3. What is a doctype and why is it important to include it in every HTML document?
    4. What language can be used to style web documents?
    5. Which elements are required for a valid XHTML5 document?
    6. What is a self-closing HTML element? Give an example.
    7. What is XML and why is it important to web developers?
    8. Describe 4 XML rules.
    9. What is the difference between a "well-formed" and a valid web document?
  7. Save and refresh the browser.
  8. Consider the HTML code above. For each element in the example, insert a paragraph describing your understanding of it in the structure.html file. The answers can be found on the complete set of HTML elements page. Note that you can click on each element, and the pop-out window contains links to the Mozilla site with the answer you are looking for.
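
As a hint for step 6, the section might start out something like the sketch below. This is only one possible layout: the heading text and the placeholder answers are assumptions made for this illustration, and you should write the answers in your own words.

```html
<section id="main_section">
	<h1>Questions and answers</h1>
	<!-- one paragraph for the question, one for its answer -->
	<p>What is a markup?</p>
	<p>Replace this paragraph with your answer.</p>
	<p>What is hypertext and hypermedia?</p>
	<p>Replace this paragraph with your answer.</p>
	<!-- continue in the same way for the remaining questions -->
</section>
```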