Lektor Out-of-the-Box Step 2: An HTML Sitemap for Your Site

- August 3, 2018

This is the second blog post in the Lektor Out-of-the-Box: Step-by-Step series. This blog series uses the Lektor static CMS "out-of-the-box" (without any plugins) to build a demo website in a series of steps. Each step adds a little more functionality to the result of the previous one.

Step 1 is discussed in the Basic Demo Site blog post. Step 2 builds on that, so if you haven't already read that post, now might be a good time to do so. This step adds an HTML sitemap to the website. Then it (optionally) adds a DuckDuckGo site search box to the HTML sitemap. The resulting Demo Website with an HTML sitemap is hosted on GitHub Pages.

The instructions and output below are Windows-specific. However, since Lektor is cross-platform, they should be similar on Linux or macOS.

What's a Sitemap?

A sitemap (also spelled "site map") is a hierarchical listing of the pages in your website. It should list only those pages that search engines have permission to crawl, based on meta tags within the pages themselves and on settings in your website's robots.txt file.

Here's what the demo website looks like just after adding an HTML sitemap:

[Screenshot: Lektor Out-of-the-Box Step-by-Step - Step 2A - HTML Sitemap]

A Robots.txt File

To discover and index Web pages on the Internet, search engines traverse pages using what are known as Web robots, Web crawlers, Web wanderers, or spiders. A website's robots.txt file tells robots what they're allowed to access. (Be aware, though, that some robots may disregard the restrictions in your robots.txt file and access pages you've declared off-limits.) If you don't have a robots.txt file, robots assume they have full access to all of the files on your site. The same is true if you have an empty robots.txt file. You can explicitly give robots permission to access all of your files by using the following 2-line robots.txt file:

User-agent: *
Disallow:

That file literally means "disallow no files for all user-agents". Put more plainly, it allows every user-agent to access every file.
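To see how a well-behaved robot might read that file, here is a minimal sketch using Python's standard-library urllib.robotparser. The helper name may_fetch and the example.com URLs are only illustrative, and real crawlers may interpret robots.txt differently.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt, url, user_agent="*"):
    """Return True if robots_txt lets user_agent fetch url.

    may_fetch is a hypothetical helper name used for this sketch.
    """
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# An empty Disallow value means "nothing is off-limits".
ALLOW_ALL = "User-agent: *\nDisallow:\n"
# For contrast: "Disallow: /" blocks the whole site.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

if __name__ == "__main__":
    page = "https://example.com/blog/"
    print(may_fetch(ALLOW_ALL, page))
    print(may_fetch(BLOCK_ALL, page))
```

An empty robots.txt behaves like the allow-all file in this sketch, which matches the description above.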

For more information, see The Web Robots Pages.

The Robots <Meta> Tag

Individual Web pages may contain a Robots <meta> tag, such as:

<meta name="robots" content="index, follow">

That's the default, and it tells Web robots that they may:

  1. Index the page
  2. Follow links contained on the page, to find other pages to index
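As a rough illustration of how those two permissions can be read from a page, the following sketch uses Python's standard-library html.parser. The names RobotsMetaParser and robots_permissions are made up for this example, and it only handles the common index/noindex, follow/nofollow, and none directives.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of a <meta name="robots"> tag, if present."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma-separated and case-insensitive.
            self.directives |= {
                part.strip().lower() for part in content.split(",") if part.strip()
            }


def robots_permissions(html):
    """Return (may_index, may_follow) for a page's HTML source."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # With no tag at all, the defaults (index, follow) apply.
    blocked_all = "none" in parser.directives
    may_index = not blocked_all and "noindex" not in parser.directives
    may_follow = not blocked_all and "nofollow" not in parser.directives
    return may_index, may_follow


if __name__ == "__main__":
    print(robots_permissions('<meta name="robots" content="index, follow">'))
    print(robots_permissions('<meta name="robots" content="noindex, follow">'))
```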

What's an HTML Sitemap, and why should you add one to your website?

There are 2 main types of sitemaps: HTML sitemaps and XML sitemaps. XML sitemaps are used by Web robots to help them better crawl through a website's pages, index them, and rank them. Since search engines can do all of those things without an XML sitemap, the main purpose of an XML sitemap is SEO (Search Engine Optimization). I hope to discuss why and how to add an XML sitemap to your Lektor-built website in a future blog post in this series.

While XML sitemaps are made with Web robots in mind, HTML sitemaps are intended for humans to read. They're sometimes called human-readable sitemaps. They help human users locate Web pages within your site. And they can be especially helpful to people with certain cognitive disabilities.

In fact, providing a human-readable sitemap is one of several possible alternatives for meeting the "should have" (Level AA) WCAG 2.0 Success Criterion 2.4.5 Multiple Ways. WCAG (Web Content Accessibility Guidelines) 2.0 is the accessibility standard that many websites worldwide are legally required to comply with. Success Criterion 2.4.5 - Multiple Ways is defined as "More than one way is available to locate a Web page within a set of Web pages except where the Web Page is the result of, or a step in, a process."

How to add an HTML Sitemap to your Lektor-built website

There are numerous HTML sitemap generators available, as well as numerous tutorials on how to build your own HTML sitemap. But for a static website, using a static site generator like Lektor, it's fairly straightforward to build one. Only 3 steps are required.

But we'll add an optional fourth step: adding a DuckDuckGo site search box to the sitemap. A site search box is another of the alternatives for meeting Success Criterion 2.4.5 - Multiple Ways.

Steps to Add an HTML Sitemap:

Step 1. Create a Sitemap Template

Create the following \templates\sitemap.html file:

{% extends "layout.html" %}
{% block title %}Sitemap{% endblock %}
{% block body %}
<ul class="sitemap">
  {% for page in [site.root] if page.record_label recursive %}
  <li><a href="{{ page|url }}">{{ page.record_label }}</a>
    {% if page.children %}
      <ul>{{ loop(page.children) }}</ul>
    {% endif %}
  </li>
  {% endfor %}
</ul>
{% endblock %}
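The "recursive" keyword is what lets the template call loop(page.children) to render each level of the page tree with the same loop body. The plain-Python sketch below mimics that depth-first walk. The Page dataclass and render_sitemap function are hypothetical stand-ins for Lektor's records and the template, not part of Lektor's API.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Page:
    """Hypothetical stand-in for a Lektor record."""
    url: str
    label: str
    children: list = field(default_factory=list)


def render_sitemap(pages, indent=0):
    """Return the lines of a nested <ul> for pages and their descendants."""
    pad = "  " * indent
    lines = [f"{pad}<ul>"]
    for page in pages:
        if not page.label:
            # Like the template's "if page.record_label" filter.
            continue
        lines.append(
            f'{pad}  <li><a href="{escape(page.url)}">{escape(page.label)}</a>'
        )
        if page.children:
            # Same role as {{ loop(page.children) }} in the template.
            lines.extend(render_sitemap(page.children, indent + 2))
        lines.append(f"{pad}  </li>")
    lines.append(f"{pad}</ul>")
    return lines


if __name__ == "__main__":
    root = Page("/", "Welcome", [
        Page("/blog/", "Blog", [Page("/blog/first-post/", "First Post")]),
        Page("/about/", "About"),
    ])
    print("\n".join(render_sitemap([root])))
```

Every list and list item opened in the sketch is also closed, which is the same balance the template needs in order to produce valid HTML.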

Step 2. Create Sitemap Contents

Create the following \content\sitemap\contents.lr file:

title: Sitemap
---
_template: sitemap.html
---
_model: none
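If you'd rather create that file from a script (for example, to avoid Windows-specific path separators), here is a small sketch using Python's pathlib. The function name write_sitemap_contents is made up for this example. It assumes the usual Lektor layout with a "content" folder at the project root, and it writes the fields separated by "---" lines, which is how Lektor's contents.lr format delimits fields.

```python
from pathlib import Path

# Fields in a contents.lr file are separated by lines containing only "---".
SITEMAP_CONTENTS = (
    "title: Sitemap\n"
    "---\n"
    "_template: sitemap.html\n"
    "---\n"
    "_model: none\n"
)


def write_sitemap_contents(project_root):
    """Create content/sitemap/contents.lr under project_root and return its path.

    write_sitemap_contents is a hypothetical helper, not a Lektor command.
    """
    target = Path(project_root) / "content" / "sitemap" / "contents.lr"
    # Create the sitemap folder if it doesn't exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(SITEMAP_CONTENTS, encoding="utf-8")
    return target


if __name__ == "__main__":
    # Assumes the script is run from the Lektor project's root folder.
    print(write_sitemap_contents("."))
```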

Step 3. Add the Sitemap to the Main Navigation Menu

Change \templates\layout.html's <nav> tag from:

{% for href, title in [
  ['/blog', 'Blog'],
  ['/projects', 'Projects'],
  ['/about', 'About']
] %}

to:

{% for href, title in [
  ['/blog', 'Blog'],
  ['/projects', 'Projects'],
  ['/about', 'About'],
  ['/sitemap', 'Sitemap']
] %}

Then build your site with Lektor.

Step 4 (Optional). Add a DuckDuckGo Site Search Box

Add it to your \templates\sitemap.html file:

  1. Go to the DuckDuckGo Search Box builder.
  2. Adjust the settings on that page to customize your search box. The settings I adjusted are:
    1. Entered the URL of this demo website ("https://russelljqa.github.io/lektor-out-of-the-box-step-02/" [without the quotes]) in the "Site search:" input text box.
    2. Replaced "Search DuckDuckGo" with "Search this site" in the "Prefill:" input text box.
  3. The DuckDuckGo Search Box builder generates quite compact code. It does this by generating an inline frame (iframe) element which displays the DuckDuckGo Search (Box) webpage inline on your webpage. Copy the <iframe> code it generates into your \templates\sitemap.html file. Place it just before the {% endblock %} tag at the end of the file. Here's the code I copied:
            <iframe src="https://duckduckgo.com/search.html?site=https://russelljqa.github.io/lektor-out-of-the-box-step-02/&prefill=Search this site" style="overflow:hidden;margin:0;padding:0;width:408px;height:40px;" frameborder="0"></iframe>
  4. Put some appropriate HTML code before the search box, to clarify what it's there for. Here's the code I used:
            <h2 tabindex='0'>Search this site</h2>
            <p>Haven't found what you're looking for? Then search this site below, using DuckDuckGo — "the search engine that doesn't track you".</p>
    The tabindex='0' attribute puts the added <h2> heading into the tab order. That can be helpful for users who tab through the webpage instead of using a mouse.
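The generated src attribute contains literal spaces and a bare "&", which browsers tolerate but HTML validators may flag. As a sketch, the same iframe can be built with the query string percent-encoded and the attribute value HTML-escaped. The function duckduckgo_iframe is a made-up name, the "site" and "prefill" parameter names are taken from the generated code above, and I'm assuming DuckDuckGo decodes an encoded query string the usual way.

```python
from html import escape
from urllib.parse import urlencode


def duckduckgo_iframe(site, prefill="Search this site", width=408, height=40):
    """Return an <iframe> tag for a DuckDuckGo site search box.

    duckduckgo_iframe is a hypothetical helper; the URL and parameter
    names mirror the code generated by the Search Box builder.
    """
    # urlencode() percent-encodes the values and turns spaces into "+".
    query = urlencode({"site": site, "prefill": prefill})
    src = f"https://duckduckgo.com/search.html?{query}"
    style = (
        "overflow:hidden;margin:0;padding:0;"
        f"width:{width}px;height:{height}px;"
    )
    # escape() turns "&" into "&amp;" so the attribute value is valid HTML.
    return f'<iframe src="{escape(src)}" style="{style}" frameborder="0"></iframe>'


if __name__ == "__main__":
    print(duckduckgo_iframe(
        "https://russelljqa.github.io/lektor-out-of-the-box-step-02/"
    ))
```

If you use this, check in a browser that the search box still restricts results to your site before publishing.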

Then rebuild your site with Lektor.