HTML Primer

If you're an HTML wizard and have an in-depth understanding of the structure of HTML, you can probably skip this section, but for those of you who've never dissected HTML code or the HTML specification, read on. You'll recall that PHP's "middle name" is hypertext and that alone tells you that PHP is intertwined with HTML (Hypertext Markup Language). Understanding how HTML—particularly the HTML <form> element—works is very important to proficiency with PHP.

HTML grew out of the World Wide Web project that Tim Berners-Lee, later joined by Robert Cailliau, began at CERN in 1989. It is an application of Standard Generalized Markup Language (SGML), which the International Organization for Standardization (ISO) published in 1986 as ISO 8879:1986. SGML is designed to provide a common format for markup languages. HTML is called an SGML application because it is a language defined using SGML, whereas XML is a simplified subset of SGML that you use to define your own markup languages (more on XML in Chapter 8).

Like most SGML applications, HTML includes a Document Type Definition (DTD) that specifies the syntax of markup elements. You'll see examples of the HTML DTD throughout this primer.

The World Wide Web Consortium (W3C) can be found at www.w3.org. This organization maintains the HTML specification (now the XHTML specification). Visit the site and look for the HTML 4.01 specification to see all the elements and attributes.

HTML is a markup language, not a programming language. The primary purpose of HTML is to display data or content (such as text and images) along with hypertext links. HTML tags (the "commands" in HTML) help the Web page designer arrange the display of text, graphics, and multimedia. The only elements that give something resembling programmatic functionality are used to make tables, links, forms, and frames.

HTML is written in plain text, and when a page is requested all the code is sent in plain text format. Here's a simple HTML Web page (without a body):

<html>
<head>
<title>The Title</title>
</head>
</html>

Although the convention for many years was to write HTML tags in uppercase (as in <HTML>), the HTML specification actually has no preference, and you can write conforming HTML tags either way, or even in a mixture of uppercase and lowercase (as in <hTmL>). However, the latest standard for HTML is now XHTML, which adheres to the XML specification, and XML is case-sensitive, so in XHTML there is a difference between uppercase and lowercase tags. XHTML specifies lowercase for tag names, which is why nearly every HTML tag in this book is lowercase. Browsers won't care whether the tags in an HTML page are uppercase or lowercase, but using lowercase will make it a lot easier to change your HTML to conform to XHTML.

An HTML Web page is made up of HTML tags, and most (but not all) of these tags have both beginning (opening) and ending (closing) tags. HTML tags are delimited by the angle brackets (<>). An HTML tag is named for the element it represents. For example, the tags <html> and </html> are the opening and closing tags for the HTML element. These tags signify the beginning and ending of the entire HTML document. Within these tags are the tags for the <head> of the document and for the title of the document. Tags contained within other tags are said to be nested.

Some HTML elements have only a beginning tag, such as the IMG element, which inserts an external image file into a Web page. When writing an IMG element, all you write is <img>, without an ending </img>. However, to tell the browser where to find the external image file, you place what is called an attribute in the beginning tag. HTML attributes are like fields in a database, or properties in an object, or variables in a program. They have names (such as SRC), and are containers for values. In this case, you set the value of the SRC attribute in the <img> tag to the URL of the image file (like this: <img src="http://www.example.com/images/example.gif">). When the user's browser receives the HTML of the Web page, the browser reads the HTML, finds the URL of the image file, requests that file as well, and then inserts the image into the rendered Web page at the appropriate spot.
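To make the browser's first step a little more concrete, here is a minimal sketch of reading an <img> tag and pulling out the value of its SRC attribute. It's written in Python (not PHP) with the standard library's html.parser module, and the names ImageSourceFinder and find_image_sources are made up for this illustration. A real browser does far more than this; the sketch only shows the "find the URL" part.

```python
from html.parser import HTMLParser


class ImageSourceFinder(HTMLParser):
    """Collects the SRC value of every IMG element it sees (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # html.parser hands us tag and attribute names already lowercased,
        # so <IMG SRC=...> and <img src=...> are treated the same way.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def find_image_sources(html):
    """Return the list of image URLs referenced by IMG elements in html."""
    finder = ImageSourceFinder()
    finder.feed(html)
    finder.close()
    return finder.sources


page = '<html><body><IMG SRC="http://www.example.com/images/example.gif"></body></html>'
image_urls = find_image_sources(page)
print(image_urls)
```

Each URL collected this way is what the browser would request next, as a separate request to the server.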

The HTML Document Type Definition

A DTD declares which elements and attributes (and a few other things) are allowed in an HTML document. Although an HTML document is made up of HTML tags, the HTML DTD uses a special format to specify what elements and attributes you can use. For example, because the HTML DTD specifies an IMG element, you can use the IMG element in a Web page.

But it's still up to the maker of your browser to properly recognize and display elements and attributes specified in the HTML DTD. In fact, deviations from the HTML specification are the primary reason a Web page may look (and work) fine in one browser and not in another.

Technically, HTML documents should start with a line indicating the DTD to be used, given in a <!DOCTYPE> declaration. The DOCTYPE declaration indicates to the browser the proper DTD to use, but the inclusion of this line is not enforced by browsers. Many Web pages have no DOCTYPE declaration, but are still rendered correctly in browsers. Here's a DOCTYPE declaration inserted by Dreamweaver (a popular Web page design tool):

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

The Form and Input Elements

One of the primary HTML elements you'll be working with is the <form> element. Take a look at how it's specified in the HTML DTD:

<!ELEMENT FORM - - (%block;|SCRIPT)+ -(FORM) -- interactive form -->
<!ATTLIST FORM
  %attrs;                              -- %coreattrs, %i18n, %events --
  action      %URI;          #REQUIRED -- server-side form handler --
  method      (GET|POST)        GET    -- HTTP method used to submit the form --
  enctype     %ContentType;  "application/x-www-form-urlencoded"
  accept      %ContentTypes; #IMPLIED  -- list of MIME types for file upload --
  name        CDATA          #IMPLIED  -- name of form for scripting --
  onsubmit    %Script;       #IMPLIED  -- the form was submitted --
  onreset     %Script;       #IMPLIED  -- the form was reset --
  accept-charset %Charsets;  #IMPLIED  -- list of supported charsets --
  >

The DTD for <form> begins with a line that names it as an element, and then specifies a list of attributes (ATTLIST). Notice the action attribute, which tells the browser where to send the contents of the form, and the method attribute, which tells the browser how to send the contents of the form.

The <input> element makes text fields, radio buttons, check boxes, and so on in a form. Here's its DTD:

<!ENTITY % InputType
  "(TEXT | PASSWORD | CHECKBOX |
    RADIO | SUBMIT | RESET |
    FILE | HIDDEN | IMAGE | BUTTON)"
   >
<!-- attribute name required for all but submit and reset -->
<!ELEMENT INPUT - o EMPTY             -- form control -->
<!ATTLIST INPUT
  %attrs;                             -- %coreattrs, %i18n, %events --
  type        %InputType; TEXT        -- what kind of widget is needed --
  name        CDATA          #IMPLIED -- submit as part of form --
  value       CDATA          #IMPLIED -- Specify for radio buttons and checkboxes --
  checked     (checked)      #IMPLIED -- for radio buttons and check boxes --
  disabled    (disabled)     #IMPLIED -- unavailable in this context --
  readonly    (readonly)     #IMPLIED -- for text and passwd --
  size        CDATA          #IMPLIED -- specific to each type of field --
  maxlength   NUMBER         #IMPLIED -- max chars for text fields --
  src         %URI;          #IMPLIED -- for fields with images --
  alt         CDATA          #IMPLIED -- short description --
  usemap      %URI;          #IMPLIED -- use client-side image map --
  ismap       (ismap)        #IMPLIED -- use server-side image map --
  tabindex    NUMBER         #IMPLIED -- position in tabbing order --
  accesskey   %Character;    #IMPLIED -- accessibility key character --
  onfocus     %Script;       #IMPLIED -- the element got the focus --
  onblur      %Script;       #IMPLIED -- the element lost the focus --
  onselect    %Script;       #IMPLIED -- some text was selected --
  onchange    %Script;       #IMPLIED -- the element value was changed --
  accept      %ContentTypes; #IMPLIED -- list of MIME types for file upload --
  >

The type attribute of the element specifies the type of control that will appear on the screen in your browser (text makes a text field, radio makes a radio button, and so on).

To create a Web page with a form in it, you could write the following HTML code in plain text, upload it to a Web server (or even just open it directly in your browser), and it would display as a nicely formatted Web page:

<html>
<head>
<title>
</title>
</head>
<body bgcolor="white">
<form method="post" action="http://www.example.com">
Username:<input type="text" name="username"><br>
Password:<input type="password" name="password"><br>
<input type="submit" value="Login">
</form>
</body>
</html>

This code creates a simple form with two fields (username and password) and a submit button. When submitted, the form's contents are sent to www.example.com.
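If you're curious what "sent" actually looks like, the following sketch builds the kind of request body a browser would typically produce for this form with the default application/x-www-form-urlencoded encoding. It's written in Python, using urllib.parse.urlencode from the standard library, and the username and password values are invented for illustration.

```python
from urllib.parse import urlencode

# Hypothetical values a user might type into the two fields of the form.
fields = {"username": "alice", "password": "p@ss word"}

# urlencode joins name=value pairs with "&" and escapes reserved characters,
# which closely matches the application/x-www-form-urlencoded format.
body = urlencode(fields)

# Headers a browser would typically send along with a POST of this body.
headers = {
    "Content-Type": "application/x-www-form-urlencoded",
    "Content-Length": str(len(body)),
}

print(body)     # username=alice&password=p%40ss+word
print(headers)
```

Note that the names on the left of each equals sign come from the name attributes of the <input> elements. With method="get", the same encoded string would instead be appended to the action URL after a question mark.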

How can you make HTML forms and PHP work together to make Web pages that are dynamically generated (rather than simply copied to your browser by the Web server)? First, you create a Web page using plain text HTML tags, and include an HTML form within that page. Then, you write PHP code within the Web page, making sure the code is properly enclosed by the <?php and ?> delimiters.

When a browser requests the Web page, the PHP scripting engine processes any PHP code in it before the results are returned to the user, and the output of each piece of PHP code is placed in exactly the same spot the code occupied. You can also write your code so that it is processed only when a form is submitted (you'll see how in the next few sections). The code may even use some of the submitted form data in its processing.

The end result of the PHP processing is ordinary HTML, but because the PHP scripting engine has a chance to perform some processing before the final version of the Web page is sent, some of the content of the page (any parts generated by the PHP code) may differ each time the page is requested. And if processing takes place in response to a form submission, a great variety of interactive features can be created.
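The idea that output lands "in exactly the same spot" as the code can be illustrated with a toy sketch. This is not how the PHP scripting engine is implemented; it's a small Python stand-in that finds each <?php ... ?> block and swaps in the result of a pretend evaluator. The function names render and toy_evaluate are hypothetical, and the evaluator understands only a single statement of the form echo "some text";.

```python
import re

# Matches one <?php ... ?> block, capturing the code between the delimiters.
PHP_BLOCK = re.compile(r"<\?php(.*?)\?>", re.DOTALL)


def toy_evaluate(code):
    """Pretend engine: handles only  echo "literal";  and returns the literal."""
    match = re.fullmatch(r'echo\s+"([^"]*)"\s*;', code)
    return match.group(1) if match else ""


def render(page, evaluate):
    """Replace every <?php ... ?> block with evaluate(code); leave other text alone."""
    return PHP_BLOCK.sub(lambda m: evaluate(m.group(1).strip()), page)


page = '<p>Hello, <?php echo "world"; ?>!</p>'
rendered = render(page, toy_evaluate)
print(rendered)  # <p>Hello, world!</p>
```

Everything outside the delimiters passes through unchanged, which is why the browser receives what looks like an ordinary HTML page.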