
The Foundation of the Web

The universal language of the web is called HTML, the Hypertext Markup Language. As the name implies, it's not code, but rather, markup—a system of annotation that gives the browser an idea of what it's looking at. See, a web browser will happily show plain text, but it won't look fancier without HTML to give it a clue as to its contents.

Thankfully, HTML is relatively simple to read and write; most of your learning time will go toward getting to know the different elements you might use on a page.

Elements

An HTML document is built up of elements, bits of content and layout that, all together, make up a page. So we're not working in the abstract, I'll give you a visual you're probably familiar with: word processors.

A page written in Google Docs, a word processor. What you do in a word processor isn't substantially different from how you build a web page.

In HTML, that heading is an element. Each paragraph is an element. That bit of bold text is an element. Images too are elements.

Elements generally consist of a few pieces: tags, which mark the start and end of the element; the content of the element (the text you'll see on the final page); and often attributes, which act as settings that modify that specific element.

Anatomy of an element
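
To make those pieces concrete, here's a small sketch using a link element. The address and the link text are placeholders made up for illustration; the comments point out which part is which.

```html
<!-- Opening tag: <a href="https://example.com/">, carrying one attribute (href) -->
<!-- Content: the text "Visit the example site" -->
<!-- Closing tag: </a> -->
<a href="https://example.com/">Visit the example site</a>
```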

Tags

Tags normally come in pairs, with an opening tag and a closing tag to tell the browser where the element starts and where it ends. For the bold text in our example:

text formatting like <strong>boldface</strong>,

text formatting like boldface,

The <strong> is the opening tag. All text after that point will look bolded. In order to stop the boldface, you'll need to close the element with the closing tag, </strong>. Note that the comma falls after the closing tag and thus isn't bolded.

It's imperative that you close off your elements with their closing tags when necessary. Leaving an element open forces the browser to guess where it ends, which can cause rendering errors that spill over into the rest of the page. (Thankfully, most IDEs and text editors will give you some visual indicator of when you've left an element open, or even better, add in the closing tag for you.)
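
As a rough illustration, here's the bold text example written twice, once with the closing tag left off and once with it in place. The extra words after "boldface" are made up for the comparison, and exactly how far the stray boldface spreads can vary with the rest of the page.

```html
<!-- Closing tag missing: the boldface typically carries on past the word it was meant for. -->
text formatting like <strong>boldface, and everything after it

<!-- Closed properly: only the word "boldface" is bold. -->
text formatting like <strong>boldface</strong>, and nothing after it
```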

Content

Text in between tags is what'll get displayed in some form or fashion on the end page. This is the content of the element. In the example, "boldface" is the content of the strong element.

Attributes

An attribute acts as a modifier for that element; something about it will act differently when you set one of its attributes. Images are a prime example of the use of attributes; instead of having content, the image gets loaded in through an attribute. All attributes go inside the opening tag of an element.

<img src="lynx.jpg" alt="Good lad">

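In this example, src="lynx.jpg" is an attribute of the img element that tells the browser which file to load, and alt="Good lad" is a second attribute that supplies alternative text for when the image can't be displayed. (Because their content is loaded in through an attribute, images have no closing tag.) Here's how that'll appear on your page: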

[Image: lynx.jpg, with the alternative text "Good lad"]
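
An element can carry more than one attribute at a time, separated by spaces inside the opening tag. The sketch below adds width and height to the same image; the pixel values are assumptions picked for illustration, not the real dimensions of lynx.jpg.

```html
<!-- src: which file to load. alt: text used if the image can't be displayed. -->
<!-- width and height are in pixels; these values are illustrative only. -->
<img src="lynx.jpg" alt="Good lad" width="400" height="300">
```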

Don't worry if all this seems complicated. You'll get used to it as you experiment.

An example page

The best way to learn HTML is to have something to experiment with. The following snippet is a fully valid HTML document you can copy and paste into your text editor, save to a file, and experiment with. (If you'd prefer to just download a page, here's the sample in full.) After the snippet, I'll explain what each line does to give you a better idea of what exactly the page is doing and what you can expect if you change some of it.

<!DOCTYPE html>
<html lang="en">
<head>
	<title>Web Page Title</title>
</head>
<body>
	<h1>A Web Page</h1>
	<p>Well, here we are, a basic web page. It's not much, but we all gotta start somewhere, I suppose.</p>
</body>
</html>

When you open it up in your web browser, it'll look something like this:

The results of our little test page

To explain what each line means and what effect it has on the page:

  1. <!DOCTYPE html> (Document type declaration): The doctype tells the browser what kind of page it's looking at to aid in rendering it. There are many different kinds of doctypes, but only one is relevant to us, <!DOCTYPE html>. Always have it on the first line of your page.
  2. <html lang="en">...</html> (Document start + page language attribute): This signals the official start of the page. No content other than the doctype can come before the opening tag, and none can come after the closing tag. The lang="en" is another example of an attribute, and it tells the browser the page's language. There are many other language codes, in case you're writing a page that isn't in English.
  3. <head>...</head> (Page head): The page head is for metadata, or information about the page itself. Stylesheets are loaded in the head. <meta> tags and Open Graph tags go in the head. Nothing in the head is visibly displayed on the page, aside from the...
  4. <title>Web Page Title</title> (Page title): The page title will display in the browser tab, the browser window's title, and the bookmark of the page. If you don't set it, browsers usually fall back to showing the page's address or file name instead.
  5. <body>...</body> (Page body): This is where you start writing your page. All the text, images, headings, music, videos, and decorative things go after the opening tag of the body and before the closing tag of the body. In other words, they're the page's content.
  6. <h1>A Web Page</h1> (Top-level heading): Though it mostly looks like big text, the top-level heading element is good for titles and is semantically considered the most important heading on the page.
  7. <p>...</p> (Paragraph): As this bit of text is a paragraph, you'll want to wrap it in <p>...</p> tags to make it look a little cleaner and give it some meaning to the browser. You can have as many paragraphs as you want on a page; just remember to close off each paragraph before you start another. You'll learn more about both paragraphs and headings on their respective pages.
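
To give a sense of what a fuller page head might hold, here's a sketch that adds the kinds of things mentioned in item 3. The stylesheet name style.css is hypothetical, and the Open Graph tag simply reuses the page title; none of this is required for the sample page to work.

```html
<head>
	<!-- Metadata: the character encoding the file is saved in -->
	<meta charset="utf-8">
	<!-- The one part of the head a reader sees: the tab and bookmark title -->
	<title>Web Page Title</title>
	<!-- Loads a stylesheet; "style.css" is a hypothetical file name -->
	<link rel="stylesheet" href="style.css">
	<!-- An Open Graph tag, read by sites that build link previews -->
	<meta property="og:title" content="Web Page Title">
</head>
```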

It's now up to you to experiment. Start by changing the text in between the tags. Write something ridiculous. Change the title. Change the heading. Add more paragraphs. Take stuff out. Remember to save and refresh every time you change something to see what you did. The nice thing about HTML is that, because it's markup and not code, you can't harm anything by making an error.
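
If you'd like a starting point, here's one possible first experiment: the same sample page with a new title, a new heading, and a second paragraph. The wording is made up; swap in anything you like.

```html
<!DOCTYPE html>
<html lang="en">
<head>
	<title>My First Experiment</title>
</head>
<body>
	<h1>A Slightly Different Web Page</h1>
	<p>This paragraph replaces the original text.</p>
	<!-- The first paragraph is closed before the second one begins. -->
	<p>This second paragraph is new.</p>
</body>
</html>
```

Save the file and refresh the browser tab, and the tab title, the heading, and both paragraphs should all reflect the changes.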

Page sources

My parting suggestion is that you familiarize yourself with looking at page sources. Every web browser has a built-in function to let you see the raw markup of a page. How to get to it differs from browser to browser, but the shortcut is usually Ctrl+U (on a Mac, Cmd+U or Cmd+Option+U, depending on the browser). You can also try right-clicking on the page and checking for a "View source" or similar option in the context menu.

The unfortunate thing is that many modern sites and especially web apps (like YouTube, Twitter, or Dropbox) are so messily written that reading the page source is practically impossible. Thankfully, most text-heavy sites do have fairly readable page sources: Wikipedia is a good example, and of course, Tesserae is built for readability.

Keep this page handy. In the later pages of this section, I'll take you through formatting your text with bolds and italics, adding more headings, linking to other sites and pages, making lists, and adding images.

Summary:

  1. Web pages are simply text files annotated using HTML to give them structure and meaning. HTML is known as markup because its elements mark up plain text.
  2. Each HTML document is essentially a nest of elements. If you think of a word processor, every bit of bold text is its own element. Every heading is an element. Every image and footnote is an element.
  3. Most elements start with an opening tag and end with a closing tag, with content in the middle.
    • Bold text looks like <strong>Bold text</strong> in the page source; the <strong> is the opening tag, the Bold text is the content, and </strong> is the closing tag.
  4. Occasionally, attributes are used in the opening tag of an element.
    • Attributes act as modifiers or settings for their respective element. In <img src="lynx.jpg">, the src="lynx.jpg" is an attribute that tells the browser to use lynx.jpg for the image to load.
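
As a small sketch of what "a nest of elements" means in point 2, here's a paragraph with a strong element nested inside it. The sentence itself is made up for illustration.

```html
<!-- The strong element opens and closes entirely inside the p element. -->
<p>A paragraph with some <strong>bold text</strong> inside it.</p>
```

The inner element is closed before the outer one, the same way the sample page's head and body are both closed before the html element is.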