Add robots.txt to a Django website
A "bot" is a general term for an automated program that does things like crawl the web. Google and other search engines rely on bots to periodically crawl the internet. Newer companies like OpenAI, the maker of ChatGPT, also crawl the internet (and other copyrighted sources) for training data for their models. A robots.txt file tells bots which URLs they can and cannot access on your site. It lives at the root of your site; for example, on this site, it is located at https://learndjango.com/robots.txt.
Unfortunately, a robots.txt file is more like a "Code of Conduct" sign than a set of hard-and-fast rules. It is not enforceable on its own. Good actors will follow the rules; bad actors will not, and managing them is a separate area of study on large-scale websites.
Adding a robots.txt file to your site is relatively straightforward and recommended for all websites. Adam Johnson wrote an excellent post on this topic that I highly recommend reading. This tutorial is my take on how to do it.
Initial Set Up
Start by creating a new Django project, which can live anywhere on your computer. In this example, we're putting it on the Desktop in a folder called django_robots. Create and activate a new virtual environment, install Django, and create a new project called django_project.
# Windows
$ cd onedrive\desktop\
$ mkdir django_robots
$ cd django_robots
$ python -m venv .venv
$ .venv\Scripts\Activate.ps1
(.venv) $ python -m pip install django~=5.0.0
(.venv) $ django-admin startproject django_project .
(.venv) $ python manage.py migrate
(.venv) $ python manage.py runserver
# macOS
$ cd ~/desktop/
$ mkdir django_robots
$ cd django_robots
$ python3 -m venv .venv
$ source .venv/bin/activate
(.venv) $ python3 -m pip install django~=5.0.0
(.venv) $ django-admin startproject django_project .
(.venv) $ python manage.py migrate
(.venv) $ python manage.py runserver
If you navigate to http://127.0.0.1:8000, you'll see the Django welcome screen.
Create a robots.txt File
A robots.txt file exists for the entire project and is not app-specific; therefore, it should be in a project-level templates directory. After quitting the development server with Control + c, create one from the command line.
(.venv) $ mkdir templates
Then we need to tell Django about it, so update the TEMPLATES section of the settings.py file by changing the line for DIRS.
# django_project/settings.py
TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "DIRS": [BASE_DIR / "templates"],  # new
        "APP_DIRS": True,
        "OPTIONS": {
            "context_processors": [
                "django.template.context_processors.debug",
                "django.template.context_processors.request",
                "django.contrib.auth.context_processors.auth",
                "django.contrib.messages.context_processors.messages",
            ],
        },
    },
]
Now Django knows to look for a templates folder in the root directory. Add a new file with your text editor, templates/robots.txt. On this website, the contents are as follows:
# robots.txt
User-Agent: *
Disallow: /privatestuff/
User-agent: GPTBot
Disallow: /
The first group applies to all bots and tells them to stay out of a folder called privatestuff. The second group tells GPTBot, the crawler run by OpenAI (the company behind ChatGPT), to ignore the entire site. Google has a detailed guide on robots.txt files with more information on how to customize them.
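To see how a rule-following bot would interpret these rules, here is a minimal sketch using urllib.robotparser from Python's standard library. The SomeBot name and the example paths are made up for illustration, and Python's parser may differ in edge cases from the matching logic that individual search engines apply.

```python
from urllib.robotparser import RobotFileParser

# The same rules as the robots.txt file above.
ROBOTS_TXT = """\
# robots.txt
User-Agent: *
Disallow: /privatestuff/

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
# parse() accepts an iterable of lines, so no network request is needed.
parser.parse(ROBOTS_TXT.splitlines())

# "SomeBot" is a hypothetical crawler that only matches the "*" group.
print(parser.can_fetch("SomeBot", "/about/"))              # True
print(parser.can_fetch("SomeBot", "/privatestuff/notes/"))  # False
# GPTBot has its own group, which blocks every path on the site.
print(parser.can_fetch("GPTBot", "/about/"))               # False
```

A polite crawler runs a check like this before requesting a URL. Nothing on the server enforces the answer, which is why the file only works for bots that choose to honor it.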
Option 1: Template
The simplest way to display a robots.txt file is by defining the view directly in the URLconf. For example, we can import TemplateView and then specify the template name and its content type (we must specify text/plain rather than the default format of text/html).
# django_project/urls.py
from django.contrib import admin
from django.urls import path
from django.views.generic.base import TemplateView  # new

urlpatterns = [
    path("admin/", admin.site.urls),
    # robots.txt path below
    path(
        "robots.txt",
        TemplateView.as_view(template_name="robots.txt", content_type="text/plain"),
    ),
]
On the command line, type the command python manage.py runserver, and you should be able to view the file at http://127.0.0.1:8000/robots.txt.
While this approach is as simple as it gets, I don't like mixing view logic with URL logic.
Option 2: View/URL
My preferred method is to create a dedicated pages app for all simple pages since there are typically multiple ones, such as an about page, contact page, etc.
To implement a robots.txt file, start by creating a new app called pages.
(.venv) $ python manage.py startapp pages
Immediately add it to your INSTALLED_APPS setting.
# django_project/settings.py
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    "pages",  # new
]
Then create a custom view called RobotsTxtView that relies on the built-in TemplateView.
# pages/views.py
from django.views.generic import TemplateView


class RobotsTxtView(TemplateView):
    template_name = "robots.txt"
Now we need to configure the URLs. Create a new pages/urls.py file with the following code:
# pages/urls.py
from django.urls import path

from .views import RobotsTxtView

urlpatterns = [
    path("robots.txt", RobotsTxtView.as_view(content_type="text/plain"), name="robots"),
]
And then, update the project-level django_project/urls.py file as well.
# django_project/urls.py
from django.contrib import admin
from django.urls import path, include  # new

urlpatterns = [
    path("admin/", admin.site.urls),
    path("", include("pages.urls")),  # new
]
Make sure that python manage.py runserver is running and then visit the page at http://127.0.0.1:8000/robots.txt.
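If you would rather confirm the response from a script than from the browser, the sketch below fetches the file with Python's standard library and reports the status code and content type. The fetch_robots helper is a hypothetical name for this example, and the default address assumes the development server is running on its usual port.

```python
from urllib.request import ProxyHandler, build_opener


def fetch_robots(base_url="http://127.0.0.1:8000"):
    """Fetch <base_url>/robots.txt and return (status, content_type, body)."""
    # An empty ProxyHandler bypasses any configured proxy, which suits a local server.
    opener = build_opener(ProxyHandler({}))
    with opener.open(f"{base_url}/robots.txt", timeout=5) as response:
        # get_content_type() returns the MIME type without any charset suffix.
        content_type = response.headers.get_content_type()
        body = response.read().decode("utf-8")
        return response.status, content_type, body
```

With the development server running in another terminal, calling fetch_robots() should return a 200 status and text/plain as the content type if the view is wired up as described. A text/html content type would suggest the content_type argument was left out.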
Tests
No code is complete without automated tests that can be run to make sure nothing breaks in the future. If you have taken the second approach of creating a dedicated pages app, there should already be a pages/tests.py file in it.
Here is one example of what tests might look like:
# pages/tests.py
from http import HTTPStatus

from django.test import SimpleTestCase


class RobotsTxtTests(SimpleTestCase):
    def test_get(self):
        response = self.client.get("/robots.txt")

        assert response.status_code == HTTPStatus.OK
        assert response["content-type"] == "text/plain"
Run the tests in the usual way.
(.venv) $ python manage.py test
Found 1 test(s).
System check identified no issues (0 silenced).
.
----------------------------------------------------------------------
Ran 1 test in 0.002s
OK