Meta Tag Guidelines for HTML Pages and PDF files

Online documentation has grown not only in quantity, but also in complexity. Sun alone has millions of pages available to our internal, external, and partner customers. Because of the sheer volume of content, using consistent HTML code and including meta tags (which contain information about the document) are more important than ever. All customers must be able to find the right document quickly and confidently.

This document describes the HTML tags and meta tags that all Sun documentation should contain. By updating your content with these simple tags, you will help ensure our customers - including internal employees and partners - find the document they need the first time.

HTML and Meta Tag Summary Table

Each tag in the table below is described in its own section, which covers:

  • Descriptions
  • Options
  • Editorial best practices
  • Usage guidelines
  • Examples
  • Precautions
  • PDF Guidelines

| Tag | Type | Format | Applicable to PDF Files | Localizable |
| --- | --- | --- | --- | --- |
| Title | HTML | <title>Place your title text here</title> | YES | YES |
| Description | Meta | <meta name="Description" content="Page summary goes here." > | YES | YES |
| Date | Meta | <meta name="date" content="YYYY-MM-DD" > | YES | NO |
| Keyword | Meta | <meta name="keywords" content="word 1, word 2, word n"> | YES | YES |
| Doc Type | HTML | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> | | NO |
| HTML Lang | HTML | <html lang="ll-CC"> or <html lang="ll"> | | YES |
| Content-Type | HTML | <meta http-equiv="Content-Type" content="text/html; charset=xxx-x"> | | YES |
| Content-Language | HTML | <meta http-equiv="content-language" content="xx-XX"> | | YES |
| Anchor Text (for HREF links) | HTML | "<a class="linkgrey" href="http://www.mozilla.org/js/">What is JavaScript?</a>" | | YES |
| Alt Tag for Images | HTML | <img src="/en/img/nav/xxxx.gif" alt="Add image description text here." border="0"> | | YES |

Introduction

Meta information is simply defined as "data about a document." Often, as in the case of HTML, this information is embedded within the document itself. In the case of PDF files, the meta data is embedded in the file's properties.

Content owners should create meta tags, as documented below, and maintain them as part of the regular Web publishing process.

The correct application of meta data is very important to a number of software applications, search engines being one. Search engines, whether they be enterprise or WWW engines, can be configured to use meta data as a means to index additional information about content and to display meta data in the search results. The most important tags for search include the HTML TITLE, ALT, and Anchor tags (link text descriptions), and the DESCRIPTION and DATE meta tags.

An example of a fully compliant HTML page appears near the end of this document.

Display Characteristics of Meta Tags

Certain meta data are viewable to the user:

| Tag Name | Displays in the Web Page | Displays in Sun Search Results |
| --- | --- | --- |
| Title | Yes (displays in browser "chrome") | Yes |
| Description | No | Varies (dependent on search method used) |
| Date | No | Yes |
| Alt | Yes (during mouseover) | No |
| Anchor | Yes (hyperlink text) | No |

Spamming

Sun adheres to a NO SPAMMING policy and will remove any content that shows evidence of spamming in the page content or meta data. Spamming includes excessive repetition of a word in a page, optimizing a page for a word that is unrelated to the contents of the site, and using invisible or tiny text.


TITLE Tag

The Title is perhaps the most important meta tag. The content in the Title tag appears:

  1. in the browser "chrome" (at the top of the browser window)
  2. as the document title in search results
  3. as the "Bookmark" or "Favorites" description

Note: The <title> tag must be written in the same language as the document, that is, it must be localized.

Editorial Best Practices

  • Give each page a unique title. For large documents that are composed of several Web pages, differentiate the Title tag on these pages by associating additional subsection text. For example: "Networking Tutorial--UDP Packets."
  • Make sure the title is specific enough to accurately reflect the content of the document
  • Keep the title short--use as few words as possible and do not exceed 100 characters
  • Include product version numbers, when applicable
  • Do not use trademark (TM) symbols in the title
  • Use text that is the same or similar to the title of the document minus any trademark symbols
  • Make sure the TITLE contains words a user may use to search for that page
  • Make sure any special characters used in the title follow HTML escape rules. For example, use the HTML escape sequence &gt; for the greater than symbol.
  • Refer to "Sun Microsystems" as "Sun" unless it is a Sun corporate information page.
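
Several of the practices above (the 100-character limit, no trademark symbols, escaped special characters) can be checked mechanically. The following is a minimal Python sketch under stated assumptions: the function name build_title_tag and the choice to strip both the "(TM)" text and the trademark character are illustrative, not part of any Sun tool.

```python
import html

MAX_TITLE_CHARS = 100  # upper limit given in the practices above


def build_title_tag(title_text: str) -> str:
    """Return a <title> element that follows the editorial practices above."""
    # Drop trademark markers, then collapse any doubled whitespace left behind.
    cleaned = title_text.replace("\u2122", "").replace("(TM)", "")
    cleaned = " ".join(cleaned.split())
    if not cleaned:
        raise ValueError("title must not be empty")
    if len(cleaned) > MAX_TITLE_CHARS:
        raise ValueError(
            f"title is {len(cleaned)} characters; limit is {MAX_TITLE_CHARS}"
        )
    # html.escape replaces &, < and > with their HTML escape sequences.
    return f"<title>{html.escape(cleaned, quote=False)}</title>"


print(build_title_tag("Networking Tutorial--UDP Packets > 1 KB"))
```

The sketch rejects an over-long title rather than truncating it, on the assumption that an author should choose which words to drop.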

When to Use

Use the TITLE tag:

  • For ALL HTML content
  • For ALL content that can have an HTML wrapper
  • For ALL PDF documents (Refer to the section below for additional information on PDF titles)

TITLE Tag Format:

<title>Place your title text here</title>

PDF Guidelines

All PDF files must have appropriate titles.

When you create a PDF file, the document title is autogenerated from the PDF file name. For example, an autogenerated PDF title might look like this: c:\work\whitepaper.pdf or Template01.pdf.

To edit the title of a PDF document after it is created, use Adobe Acrobat Professional or a free tool such as A-PDF INFO Changer.

For instructions on adding meta data during PDF file generation, refer to the PDF Guidelines.


DESCRIPTION Tag

The DESCRIPTION tag summarizes the page content and should accurately describe the intent and value of the document.

Note: The DESCRIPTION tag must be written in the same language as the document, that is, it must be localized.

Editorial Best Practices

  • Use capitalization, punctuation, and grammar in the same way you would in the content body of the web page.
  • The description must be contextually relevant to the content on the page.
  • No spamming (refer to Spamming for additional information).
  • Make sure the description has words the search engine can use to further identify the document.
  • The description text for HTML pages can be virtually any length, although the search engine will only display the contents of the Description tag under certain circumstances. When it does, only approximately the first 150 characters will display.
  • Make sure any special characters used in the Description follow HTML escape rules. For example, use the HTML escape sequence &gt; for the greater than symbol.
  • Each description should be unique.

When to Use

Use the DESCRIPTION tag:

  • For ALL HTML content
  • For ALL content that can have an HTML wrapper
  • For ALL PDF documents

Note: Make sure to use a Description tag for ALL documents or pages that begin with something other than text, such as pages and documents that begin with a Table, Picture, TOC, etc.

DESCRIPTION Tag Format

Use this format:

<meta name="Description" content="Your description text goes here.">

NOTE: If there is no DESCRIPTION tag and a particular search engine requires one, the engine will usually display the first 150 characters of the page content. If these characters are not simple text (e.g., control characters that format a table, figure numbers, picture names, or a TOC) or are not contextually relevant (e.g., license terms or author information), the description will be poor. The result is a negative user experience, fewer click-throughs, and less effective content.
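
The fallback described in the note can be sketched as follows. This is only an illustration in Python: the function name search_snippet is hypothetical, the 150-character limit is the approximate figure quoted above, and real engines may choose snippets differently.

```python
from typing import Optional

SNIPPET_CHARS = 150  # approximate display length quoted above


def search_snippet(description: Optional[str], body_text: str) -> str:
    """Pick the text a results page might show for one document."""
    # Prefer the DESCRIPTION tag; fall back to the start of the page content.
    has_description = bool(description and description.strip())
    source = description if has_description else body_text
    # Collapse runs of whitespace before truncating.
    return " ".join(source.split())[:SNIPPET_CHARS]


# A page that opens with a table of contents and has no description.
print(search_snippet(None, "Contents  1.1 Intro\n1.2 Setup\n1.3 Usage"))
# The same page with a description supplied.
print(search_snippet("Overview of UDP packet handling.", "Contents 1.1 Intro"))
```

The first call shows why pages that begin with a table or TOC need a description: without one, the snippet is just the leading layout text.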

PDF Guidelines

A description is required for all PDF files that are published to Sun's sites.

When you create a PDF file, the file description (called the "subject" in a PDF) is empty. To add a description to a PDF document after it is generated, use Adobe Acrobat Professional to update the "subject" field, or a free tool such as: A-PDF INFO Changer.

For instructions on adding meta data during PDF file generation, refer to the PDF Guidelines.

The length of the description text should be kept at or under 150 characters.


DATE Tag

The DATE tag can reflect either the creation date or revision date.

Note: The DATE tag should not be localized. Do not modify the numeric format of the Date tag.

When to Use

The date of a document is an important piece of metadata for both the search engine and the user. The search engine uses the date in its ranking algorithm, and users look to the date to see if the document is recently updated or too old to consider.

Sun's search engine determines the date of a document in three ways; it uses the first one it finds, in this order:

  1. The DATE metatag
  2. The date from the HTTP header (Last-Modified)
  3. The last date the document was indexed by the search engine
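
The precedence order above can be expressed as a short Python sketch. The function name resolve_document_date and its parameters are hypothetical; how Sun's search engine actually implements the lookup is not described in this document.

```python
from datetime import date
from email.utils import parsedate_to_datetime
from typing import Optional


def resolve_document_date(
    meta_date: Optional[str],
    last_modified: Optional[str],
    indexed_on: date,
) -> date:
    """Return the first available date, in the order listed above."""
    if meta_date:
        # DATE metatag content, expected as YYYY-MM-DD.
        return date.fromisoformat(meta_date)
    if last_modified:
        # HTTP Last-Modified header, e.g. "Tue, 23 Nov 2004 10:00:00 GMT".
        return parsedate_to_datetime(last_modified).date()
    # Fall back to the date the crawler last indexed the document.
    return indexed_on


print(resolve_document_date(None, "Tue, 23 Nov 2004 10:00:00 GMT", date(2005, 1, 5)))
```

Because the metatag wins whenever it is present, a stale DATE value hides a correct Last-Modified header.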

Include the DATE metatag in a document only if the HTTP header date will be incorrect.

Some example scenarios where a DATE metatag is warranted:

  • The document has had a minor revision, but the publisher does not want the date of the document to change.
  • The document is being published on a system where the HTTP header is known to be incorrect.
  • The document has a fixed publication date, and the publisher wants to ensure that the document always shows that date.

Manually adding a DATE metatag can introduce problems when the page is updated, but the publisher does not change the date. Each time a page with a DATE metatag is revised, the tag should be checked.

Editorial Best Practices

To determine what date to use in the tag, refer to the following guidelines:

| Content Status | Date to Use | Comments |
| --- | --- | --- |
| Initial publication or posting | Actual publication date | If there is considerable lag time between creation and posting, use the creation date rather than the posting date |
| Minor revision | No date change required | Minor revisions include those changes that do not significantly alter the document meaning or content |
| Major revision | Actual revision date | Major revisions include those changes that significantly alter the document meaning or content |

DATE Tag Format

Use the following format:

<meta name="date" content="YYYY-MM-DD">
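
If pages are generated by a script, the tag can be produced from a date value so that the numeric format never varies. A minimal Python sketch, with build_date_tag as a hypothetical helper name:

```python
from datetime import date


def build_date_tag(published: date) -> str:
    """Return the DATE metatag in the fixed YYYY-MM-DD form."""
    # isoformat() always yields YYYY-MM-DD, regardless of the system locale.
    return f'<meta name="date" content="{published.isoformat()}">'


print(build_date_tag(date(2004, 11, 23)))
# prints: <meta name="date" content="2004-11-23">
```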

Note: The DATE tag should not be localized.

PDF Guidelines

A PDF file date is a property of the file that cannot be edited. Therefore, each time you edit a PDF document you will change the file date.


KEYWORD Meta Tags

The KEYWORD meta tag allows you to provide additional text for the search engine crawler to index along with the HTML body content. The KEYWORD meta tag can be used to specify additional key words or synonyms that describe the contents of a site. KEYWORD meta tags are used in the indexing process but will not display on the web page.

Note: The Keyword tag must be written in the same language as the document, that is, it must be localized.

Editorial Considerations

Some rules around keywords include:

  • Do not "copy and paste": all keywords should be contextually relevant for the body of the page and should not be duplicated across other pages.

  • Do not use too many keywords--keywords start to diminish in importance the more that you have.

  • Do not use keywords on multi-page documents unless they are tailored to each individual page

  • Localize your keywords--they should be in the same language as the document, except for Product names and terms which are not translated.

  • Use the KEYWORD tag:
    • For synonyms and alternate spellings of important words
    • For acronyms and terms that don't otherwise appear on the page
    • To associate old product names with new product names

When to Use

Keywords should be used judiciously--if you are unsure whether or not to use keywords, please do not use them.

Contact your Web Publisher if you have any questions.

KEYWORD Tag Format

<meta name="keywords" content="solaris, malloc, multithread, multiprocessor, heap">


DOCTYPE (HTML Version Information) Tag

A valid HTML document declares what version of HTML was used to create the document. The Doctype declaration names the document type definition (DTD) used for the document.

Editorial Considerations

None.

When to Use

Use the Doctype tag on every HTML page.

Format

For most purposes, the following document type declaration should be used in HTML pages:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

Note: The doctype declaration, though it refers to "EN", does not declare the content language of the document. It only declares the content language of the DTD.

For a further discussion of this topic refer to the W3C Recommendations.


HTML LANG (Language) Tag

The language attribute is used by the browser to choose the correct font for a document.

Editorial Considerations

None.

When to Use

Use the HTML Lang tag on every HTML page.

Format

The <html> tag should declare the content language (or natural language) of the document. For example,

<html lang="ll-CC">

where "ll" is the language code, and "CC" is the country code.

Please use:

<html lang="en-US">

or

<html lang="ja-JP">

The shorter "ll" option uses the language code only: <html lang="en">. This is acceptable, though it less accurately describes the actual language of the page.
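
A quick shape check for the two accepted forms ("ll-CC" and "ll") might look like the Python sketch below. The pattern and the function name is_expected_lang_form are assumptions made for illustration; the check confirms only the form of the value, not that the language or country code actually exists.

```python
import re

# "ll" or "ll-CC": two lowercase letters, optionally followed by a hyphen
# and two uppercase letters.
LANG_PATTERN = re.compile(r"[a-z]{2}(-[A-Z]{2})?")


def is_expected_lang_form(value: str) -> bool:
    """Return True if value looks like "ll" or "ll-CC"."""
    return LANG_PATTERN.fullmatch(value) is not None


for candidate in ("en-US", "ja-JP", "en", "en_US", "EN-us"):
    print(candidate, is_expected_lang_form(candidate))
```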

For a list of language options, see: http://www.i18nguy.com/unicode/language-identifiers.html.


CONTENT-TYPE Tag

The CONTENT-TYPE declaration tells the client application what type of content is being served. In addition, it specifies the character encoding of the content. It should be the first thing declared in the <head> tag.

Note: If the Content-Type tag is not placed before the Title tag, titles that contain non-ASCII characters may not be interpreted correctly.

Editorial Considerations

None.

When to Use

Use the Content-Type tag on every HTML page.

Content-Type Tag Format

Declare within the <head> tag using this format:

<meta http-equiv="Content-Type" content="text/html; charset=xxxxx">

Set the charset attribute to the correct encoding of the document. This could be UTF-8, ISO-8859-1 or many others. For example,

  • charset=UTF-8
  • charset=ISO-8859-1

Note 1: Changing the charset attribute on its own does not change the character encoding of the document.
Note 2: UTF-8 is the preferred format for page generation.
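
Note 1 can be demonstrated with a few lines of Python. In this sketch the helper name render_with_declared_charset is hypothetical; it simply decodes the saved bytes the way a client that trusts the declaration would.

```python
def render_with_declared_charset(raw: bytes, declared_charset: str) -> str:
    """Decode page bytes the way a client trusting the declaration would."""
    return raw.decode(declared_charset)


text = "R\u00e9sum\u00e9"            # the word "Resume" with two accented e's
saved_bytes = text.encode("utf-8")   # the file is actually saved as UTF-8

# Declaration matches the real encoding: the text survives intact.
correct = render_with_declared_charset(saved_bytes, "utf-8")

# Declaration changed to ISO-8859-1 while the bytes stay UTF-8:
# each accented letter turns into two unrelated characters.
garbled = render_with_declared_charset(saved_bytes, "iso-8859-1")

print(correct == text, garbled == text)
```

To change the encoding, the file itself has to be re-saved or converted; the declaration then needs to be updated to match.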


CONTENT-LANGUAGE Tag

The content language of the document needs to be declared a second time. This is declared as meta information within the <head> tag.

Editorial Considerations

None.

When to Use

Use the CONTENT-LANGUAGE tag on every HTML page.

Format

Use this format:

<meta http-equiv="content-language" content="ll-CC">

Where "ll" is the language code and "CC" is the country code. For example, the correct content language value for Canadian French is content="fr-CA".

For a list of language options, see: http://www.i18nguy.com/unicode/language-identifiers.html.


Anchor Text (for HREF Links) Tag

From a user experience perspective, a descriptive link is, in most circumstances, more appropriate than providing the user with an actual http://... reference.

Editorial Considerations

  • Make sure the link description is specific enough to accurately reflect the URL
  • Keep the link description short--use as few words as possible and do not exceed 100 characters
  • Include as much pertinent information as possible
  • Do not use trademark (TM) symbols in the link description
  • Do not use "click here"

When to Use

Every link should have a text description.

Anchor Description Format

Adapt the sample below to conform to your needs and the style conventions of your Web site:

Page source:

...For additional information about JavaScript, refer to the mozilla.org page titled, "<a class="linkgrey" href="http://www.mozilla.org/js/">What is JavaScript?</a>"...

User display:

For additional information about JavaScript, refer to the mozilla.org page titled, "What is JavaScript?"
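
Link text can be screened against the considerations above with Python's standard html.parser module. The class and function names below are hypothetical, and the rules checked (empty text, "click here", a bare URL as the text, more than 100 characters) are one reading of these guidelines rather than an official validator.

```python
from html.parser import HTMLParser


class LinkTextCollector(HTMLParser):
    """Collect (href, link text) pairs from <a> elements."""

    def __init__(self) -> None:
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a> element.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def weak_link_text(page_source: str) -> list:
    """Return links whose text is empty, 'click here', a bare URL, or too long."""
    collector = LinkTextCollector()
    collector.feed(page_source)
    return [
        (href, text)
        for href, text in collector.links
        if not text
        or text.lower() == "click here"
        or text.lower().startswith(("http://", "https://"))
        or len(text) > 100
    ]


sample = (
    '<a class="linkgrey" href="http://www.mozilla.org/js/">What is JavaScript?</a> '
    '<a href="/docs">Click here</a>'
)
print(weak_link_text(sample))
```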


ALT Tag for Images

ALT tags allow text to be associated with images. ALT tags are used by screen readers and other tools that enable accessibility for disabled users. In addition, they display in most browsers during mouseover.

ALT tags support compliance with the Section 508 Federal guidelines for accessibility. For additional information, refer to: http://www.sun.com/access/508/index.html

Note: The ALT tag must be written in the same language as the document, that is, it must be localized.

Editorial Considerations

  • Make sure the image description is contextual. For example, "download now" does not specify a product name. "Download Java software" is better, yet could be improved upon even further.
  • ALT tags should reflect text used in the image, if any
  • Keep the tag text short--use as few words as possible
  • Do not use trademark (TM) symbols in the ALT tag description
  • Do not use the same text used on the page to describe the image

When to Use

Every image used to identify programmatic features, such as controls, status indicators, and user actionable elements, requires an ALT tag description. Images used for page formatting do not require ALT tag descriptions.

ALT Tag Format

Adapt the sample below to conform to your needs and the style conventions of your Web site:

Page source:

<img src="/en/img/nav/tagline.gif" usemap="#tagline" width="200" height="55" alt="Welcome to java.com. Brought to you by Sun Microsystems." border="0" />

Mouseover display:

Welcome to java.com. Brought to you by Sun Microsystems.
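
Images without ALT text can be listed for review with a short Python sketch built on the standard html.parser module. The names MissingAltFinder and images_missing_alt are hypothetical. The sketch flags every image that lacks ALT text, so a reviewer still has to decide which ones are formatting images that do not need a description.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Record the src of every <img> with a missing or empty alt attribute."""

    def __init__(self) -> None:
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag == "img":
            attributes = dict(attrs)
            if not (attributes.get("alt") or "").strip():
                self.missing.append(attributes.get("src") or "")


def images_missing_alt(page_source: str) -> list:
    finder = MissingAltFinder()
    finder.feed(page_source)
    return finder.missing


sample = (
    '<img src="/en/img/nav/tagline.gif" alt="Welcome to java.com." border="0" />'
    '<img src="/en/img/nav/button.gif">'
)
print(images_missing_alt(sample))
```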


Example Of Fully Compliant HTML Page

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="zh-CN">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta http-equiv="content-language" content="zh-CN">
<title>Page Title Goes Here</title>
<meta name="description" content="Java technology is a portfolio of products. Learn more about the powerful features and advantages of Java technology from Sun Microsystems.">
<meta name="keywords" content="word 1, word 2, word 3">
<meta name="date" content="2004-11-23">
</head>
<body .......> //etc.
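
A page head like the one above can be checked automatically. The Python sketch below is illustrative only: the names HeadTagCollector, REQUIRED, and head_problems are hypothetical, and the check covers only the tags this guideline asks for on every HTML page (Content-Type, Content-Language, Title, Description) plus the rule that Content-Type comes before Title. It does not inspect the DOCTYPE or the lang attribute.

```python
from html.parser import HTMLParser


class HeadTagCollector(HTMLParser):
    """Record, in document order, the title and the labelled meta tags."""

    def __init__(self) -> None:
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "title":
            self.seen.append("title")
        elif tag == "meta":
            # A meta tag is identified by either http-equiv or name.
            label = attributes.get("http-equiv") or attributes.get("name") or ""
            if label:
                self.seen.append(label.lower())


REQUIRED = ("content-type", "content-language", "title", "description")


def head_problems(page_source: str) -> list:
    """Return a list of human-readable problems; empty means none found."""
    collector = HeadTagCollector()
    collector.feed(page_source)
    seen = collector.seen
    problems = [f"missing {name}" for name in REQUIRED if name not in seen]
    if "content-type" in seen and "title" in seen:
        if seen.index("content-type") > seen.index("title"):
            problems.append("content-type should come before title")
    return problems


page = (
    "<head><title>Page Title Goes Here</title>"
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head>'
)
print(head_problems(page))
```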

Additional Information

Meta data specifications are available from the W3C: HTML 4.01 Specification Details, Chapter 7: The Global Structure of an HTML Document:

http://www.w3.org/TR/REC-html40/struct/global.html#h-7.4.4

How the Sun Microsystems Search Engine Works

When Sun's search engine parses a document, it looks at the meta tag values and body text. For most content areas, Sun's search engine weights meta tags relative to the body text of the document. These weightings are used when determining the ranking of the document for query terms.

  • TITLE (8 times the weight of the content in the body of the web page)
  • DESCRIPTION (4 times the weight of the body)
  • KEYWORDS (4 times the weight of the body)
  • ALT text (1 times the weight of the body)
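
To see how these weights could affect ranking, consider the Python sketch below. It is a simplified illustration, not Sun's ranking algorithm: it assumes a score is the weighted count of exact word matches per field, with body text as the baseline weight of 1. The names FIELD_WEIGHTS and weighted_term_score are hypothetical.

```python
# Field weights relative to body text, as listed above.
FIELD_WEIGHTS = {"title": 8, "description": 4, "keywords": 4, "alt": 1, "body": 1}


def weighted_term_score(term: str, fields: dict) -> int:
    """Sum weighted occurrence counts of a term across document fields."""
    needle = term.lower()
    score = 0
    for field, weight in FIELD_WEIGHTS.items():
        words = fields.get(field, "").lower().split()
        score += weight * words.count(needle)
    return score


document = {
    "title": "Solaris malloc tuning",
    "description": "How malloc behaves on Solaris",
    "body": "malloc malloc heap",
}
# title: 8 * 1, description: 4 * 1, body: 1 * 2
print(weighted_term_score("malloc", document))  # prints 14
```

Under this assumption, a single occurrence in the title counts as much as eight occurrences in the body, which is why the Title tag practices matter most.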

To optimize your pages for the search engine, ensure that you are following the Title tag best practices along with the guidelines for Description and Keyword tags.