Add robots.txt to a Django website

A "bot" is a general term for an automated program that does things like crawl the web. Google and other search engines rely on bots to periodically crawl the internet. Newer companies such as OpenAI, the maker of ChatGPT, also crawl the internet (and other copyrighted sources) to gather training data for their models. A robots.txt file tells bots which URLs they can and cannot access on your site. It lives at the root of your site; for example, on this site, it is located at https://learndjango.com/robots.txt.

Unfortunately, a robots.txt file is more like a "Code of Conduct" sign than a set of hard-and-fast rules. It is not enforceable on its own. Good actors will follow the rules; bad actors will not, and managing them is a separate area of study for large-scale websites.

Adding a robots.txt file to your site is relatively straightforward and recommended for all websites. Adam Johnson wrote an excellent post on this topic that I highly recommend reading. This tutorial is my take on how to do it.

Initial Set Up

Start by creating a new Django project which can live anywhere on your computer. In this example, we're putting it on the Desktop in a folder called django_robots. Create and activate a new virtual environment, install Django, and create a new project called django_project.

# Windows
$ cd onedrive\desktop\
$ mkdir django_robots
$ cd django_robots
$ python -m venv .venv
$ .venv\Scripts\Activate.ps1
(.venv) $ python -m pip install django~=5.0.0
(.venv) $ django-admin startproject django_project .
(.venv) $ python manage.py migrate
(.venv) $ python manage.py runserver

# macOS
$ cd ~/desktop/
$ mkdir django_robots
$ cd django_robots
$ python3 -m venv .venv
$ source .venv/bin/activate
(.venv) $ python3 -m pip install django~=5.0.0
(.venv) $ django-admin startproject django_project .
(.venv) $ python manage.py migrate
(.venv) $ python manage.py runserver

Navigating to http://127.0.0.1:8000, you'll see the Django welcome screen.

Django welcome page

Create a robots.txt File

A robots.txt file exists for the entire project and is not app-specific; therefore, it should be in a project-level templates directory. After quitting the development server with Control + c, create that directory from the command line.

(.venv) $ mkdir templates

Then we need to tell Django about it, so update the TEMPLATES section of the settings.py file by changing the line for DIRS.

# django_project/settings.py
TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "DIRS": [BASE_DIR / "templates"],  # new
        "APP_DIRS": True,
        "OPTIONS": {
            "context_processors": [
                "django.template.context_processors.debug",
                "django.template.context_processors.request",
                "django.contrib.auth.context_processors.auth",
                "django.contrib.messages.context_processors.messages",
            ],
        },
    },
]

Now Django knows to look for a templates folder in the root directory. Add a new file with your text editor, templates/robots.txt. On this website, the contents are as follows:

# robots.txt
User-Agent: *
Disallow: /privatestuff/

User-agent: GPTBot
Disallow: /

The first group applies to all bots and asks them not to crawl any URL under /privatestuff/. The second group tells GPTBot, the crawler OpenAI uses to gather training data, to stay away from the entire site. Google has a detailed guide on robots.txt files with more information on how to customize them.
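To see how a well-behaved crawler would read these rules, here is a minimal sketch using urllib.robotparser from Python's standard library. It parses the rules from an inline string rather than fetching them from a server. "Googlebot" simply stands in for any bot without its own group, and the example paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The same rules as templates/robots.txt, inlined for illustration.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /privatestuff/

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(user_agent, path):
    """Return True if the parsed rules permit user_agent to fetch path."""
    return parser.can_fetch(user_agent, path)


# "Googlebot" has no group of its own, so the "*" rules apply to it.
print(allowed("Googlebot", "/"))                     # True
print(allowed("Googlebot", "/privatestuff/a.html"))  # False
# GPTBot matches its own group, which disallows everything.
print(allowed("GPTBot", "/"))                        # False
```

This only models what a cooperative bot does. Nothing here stops a crawler that never consults the file.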

Option 1: Template

The simplest way to serve a robots.txt file is to add a new view directly in the URLconf. For example, we can import TemplateView and then specify the template name and its content type (we must specify text/plain rather than the default content type of text/html).

# django_project/urls.py
from django.contrib import admin
from django.urls import path
from django.views.generic.base import TemplateView  # new

urlpatterns = [
    path("admin/", admin.site.urls),
    # robots.txt path below
    path(
        "robots.txt",
        TemplateView.as_view(template_name="robots.txt", content_type="text/plain"),
    ),
]

On the command line, type the command python manage.py runserver, and you should be able to view the file at http://127.0.0.1:8000/robots.txt.

Robots.txt page
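If you would rather check the response from a script than from the browser, the following is a rough sketch that uses only the standard library. The helper name fetch_robots is hypothetical, and the default URL assumes the development server is running at the address used in this tutorial.

```python
from urllib.request import urlopen


def fetch_robots(url="http://127.0.0.1:8000/robots.txt"):
    """Return (status, content_type, body) for a robots.txt URL.

    The default URL is an assumption: it matches the local development
    server address used in this tutorial.
    """
    with urlopen(url, timeout=5) as response:
        content_type = response.headers.get("Content-Type", "")
        body = response.read().decode("utf-8")
        return response.status, content_type, body


# Example usage, with `python manage.py runserver` active in another terminal:
#
#     status, content_type, body = fetch_robots()
#     print(status, content_type)
#     print(body)
```

The value worth looking at is the content type: if it comes back as text/html, the content_type argument was probably left off the view.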

While this approach is as simple as it gets, I don't like mixing view logic with URL logic.

Option 2: View/URL

My preferred method is to create a dedicated pages app for all simple pages since there are typically multiple ones, such as an about page, contact page, etc.

To implement a robots.txt file, start by creating a new app called pages.

(.venv) $ python manage.py startapp pages 

Immediately add it to your INSTALLED_APPS setting.

# django_project/settings.py
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    "pages",  # new
]

Then create a custom view called RobotsTxtView that relies on the built-in TemplateView.

# pages/views.py 
from django.views.generic import TemplateView


class RobotsTxtView(TemplateView):
    template_name = "robots.txt"

Now we need to configure the URLs. Create a new pages/urls.py file with the following code:

# pages/urls.py
from django.urls import path
from .views import RobotsTxtView

urlpatterns = [
    path("robots.txt", RobotsTxtView.as_view(content_type="text/plain"), name="robots"),
]

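And then update the project-level django_project/urls.py file to include the app's URLs. If you followed Option 1, remove its robots.txt path and the TemplateView import so the route is defined only once.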

# django_project/urls.py
from django.contrib import admin
from django.urls import path, include  # new

urlpatterns = [
    path("admin/", admin.site.urls),
    path("", include("pages.urls")),  # new
]

Make sure that you have python manage.py runserver running and then visit the page at http://127.0.0.1:8000/robots.txt.

Robots.txt page

Tests

No code is complete without automated tests that can be run to make sure nothing breaks in the future. If you have taken the second approach of creating a dedicated pages app, the startapp command has already created a pages/tests.py file.

Here is one example of what tests might look like:

# pages/tests.py
from http import HTTPStatus
from django.test import SimpleTestCase


class RobotsTxtTests(SimpleTestCase):
    def test_get(self):
        response = self.client.get("/robots.txt")

        # unittest assertions report both values when a check fails
        self.assertEqual(response.status_code, HTTPStatus.OK)
        self.assertEqual(response["content-type"], "text/plain")

Run the tests in the usual way.

(.venv) $ python manage.py test
Found 1 test(s).
System check identified no issues (0 silenced).
.
----------------------------------------------------------------------
Ran 1 test in 0.002s

OK