
Personal Search Engine with SearXNG and Docker


If you are one of the nerds out there who love self-hosting, you may find a self-hosted search engine to be an appealing prospect.

That’s where SearXNG comes in handy. SearXNG (which I assume is pronounced like “searching”) is an open source metasearch engine that was forked from the popular Searx metasearch engine.

What makes it different from a traditional search engine is that instead of keeping its own search index and web crawlers, it proxies your search queries through other search engines and aggregates the results for display.

Why Host Your Own Search Engine?

It’s true that today there are a variety of search engines to choose from besides Google. Many people avoid products from Microsoft and Google for privacy reasons (myself included). SearXNG can help with that, but truthfully I don’t believe it to be as private as some other options.

The real reason you might want to host your own SearXNG instance is simple: finer control over your results and over which sources you are searching. You can also host the engine in another country or proxy it through another country.

It is absolutely more private than a Google or Bing search, but if you are self-hosting, remember that your queries will come from the IP of the server you are hosting on. If you are clever enough to purchase your server without any identifying information, this may not be an issue, but it will not prevent outside services from building a profile on you based on the queries you send.

For private searches it’s probably better to use something like Brave Search or Ecosia, because your queries there are mixed in with those of the many other people using the same engine, which obscures who you are.

The final reason to host it yourself is just because you can. If you are reading this, you are probably the type of person who loves to tinker around in a Linux terminal.

Spin Up Your Own SearXNG with Docker

First off, this project requires that you are already running the jwilder/nginx-proxy Docker container with the Let’s Encrypt proxy companion, which I outline in this article here.

With that out of the way, here is the link to the GitHub repository for my own take on SearXNG.

First, let’s take a look at the .env file. Some changes here are required before doing anything else.

# SearXNG falls back to https://localhost if SEARXNG_HOSTNAME is unset.
# Set both values below to your SearXNG hostname, without the https:// prefix.
# VIRTUAL_HOST is also used to request the Let's Encrypt certificate.

SEARXNG_HOSTNAME=your.domain.com
VIRTUAL_HOST=your.domain.com

You will need a DNS entry pointing to your server IP. Replace your.domain.com in SEARXNG_HOSTNAME with the domain you chose, without the https:// portion, then set the VIRTUAL_HOST environment variable to the same value.
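If you want to guard against accidentally pasting a full URL into the .env file, a small shell helper along these lines can strip it down to a bare hostname. This is only a sketch: bare_hostname is a hypothetical name used for illustration and is not part of SearXNG or the project files.

```shell
#!/bin/sh
# Hypothetical helper: reduce a URL-like value to a bare hostname
# suitable for SEARXNG_HOSTNAME and VIRTUAL_HOST.
bare_hostname() {
    host="$1"
    host="${host#http://}"     # drop a leading http://
    host="${host#https://}"    # drop a leading https://
    host="${host%%/*}"         # drop any path or trailing slash
    printf '%s\n' "$host"
}

bare_hostname "https://your.domain.com/"   # prints: your.domain.com
```

A value that is already a bare hostname passes through unchanged, so it is safe to run on whatever you were about to paste.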

Next, let’s look at our docker-compose.yml file.

version: '3.7'

services:

  redis:
    container_name: redis
    image: "redis:alpine"
    command: redis-server --save "" --appendonly "no"
    restart: unless-stopped
    labels:
      - "com.centurylinklabs.watchtower.enable=true"
    tmpfs:
      - /var/lib/redis
    cap_drop:
      - ALL
    cap_add:
      - SETGID
      - SETUID
      - DAC_OVERRIDE

  searxng:
    container_name: searxng
    image: searxng/searxng:latest
    ports:
      - "8080:8080"
    volumes:
      - ./searxng:/etc/searxng:rw
#      - ./searxng.png:/usr/local/searxng/searx/static/themes/simple/img/searxng.png  #Uncomment this line to add your custom logo on the page
#      - ./favicon.png:/usr/local/searxng/searx/static/themes/simple/img/favicon.png  #Uncomment this line to add your own favicon
    restart: unless-stopped
    environment:
      - SEARXNG_BASE_URL=https://${SEARXNG_HOSTNAME:-localhost}/
      - VIRTUAL_HOST=${VIRTUAL_HOST}
      - LETSENCRYPT_HOST=${VIRTUAL_HOST}
    labels:
      - "com.centurylinklabs.watchtower.enable=true"
    cap_drop:
      - ALL
    cap_add:
      - CHOWN
      - SETGID
      - SETUID
      - DAC_OVERRIDE
    logging:
      driver: "json-file"
      options:
        max-size: "1m"
        max-file: "1"
networks:
  default:
    # Join the existing proxy network; external.name is deprecated since file format 3.5
    name: nginx-proxy
    external: true

Starting at the top with the ‘redis’ service: this portion is completely optional. However, it’s recommended to use Redis here because it helps with caching and speeds up the search engine.

The Docker containers in future projects I post here will also point to this Redis instance for caching.

Next we get to our ‘searxng’ service. Much of this portion is based on the official compose file for SearXNG on GitHub, though I’ve made a few changes.

The port is 8080 by default, but you can change it to pretty much anything except 80 or 443, as long as it doesn’t conflict with something else on your server.

Under ‘volumes’ I mount the searxng config folder in the project directory. This allows for easy access to config files later if needed.

The next two volume mounts are optional but if you have a custom favicon and/or logo you can place them in the working directory of this project and uncomment the two lines. The favicon will replace the icon you see on bookmarks or in the URL bar of your web browser. The logo replaces the default searxng logo you see on the landing page.

The ‘environment’ section is mostly self-explanatory and is filled in from the variables in your .env file. This is also where the container interfaces with the NGINX proxy to automate the proxying and the Let’s Encrypt certificate renewals.

Finally, the ‘networks’ section at the end should match the network defined for your NGINX proxy container. If you followed my guide, the default name here is fine.
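Before starting the stack, it can be worth confirming that the proxy network really exists. The sketch below assumes the network is named nginx-proxy, as in the compose file above. check_proxy_network is a hypothetical helper name; docker network inspect is the standard Docker CLI command.

```shell
#!/bin/sh
# Hypothetical helper: report whether the named Docker network exists.
# Returns 0 if it does, 1 if it does not.
check_proxy_network() {
    if docker network inspect "$1" >/dev/null 2>&1; then
        echo "ok: network '$1' exists"
    else
        echo "missing: network '$1' not found; start the NGINX proxy stack first"
        return 1
    fi
}

# Usage (needs a running Docker daemon):
#   check_proxy_network nginx-proxy
```

If the check reports the network as missing, the name in your proxy setup probably differs from the one in this compose file, and the two need to agree.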

Crank It Up

To run the containers, make sure all your environment variables are set and your docker-compose.yml file looks correct, then run:

docker-compose up -d

This should bring up the search engine, which can be accessed at https://<the domain you entered>. If it doesn’t work immediately, wait 30 seconds and try again. Sometimes it takes a minute for Let’s Encrypt to work its magic. If you still run into errors, try accessing the page from a private tab to make sure caching is not the issue.
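Rather than refreshing the browser by hand while the certificate is being issued, you could poll the site from the terminal. This is a rough sketch that assumes curl is installed; retry is a hypothetical helper, and your.domain.com stands in for your own hostname.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times, pausing between attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0                             # stop at the first success
        [ "$i" -lt "$attempts" ] && sleep "$delay"   # no pause after the last try
        i=$((i + 1))
    done
    return 1
}

# Usage: try the site six times, ten seconds apart.
#   retry 6 10 curl -fsS -o /dev/null "https://your.domain.com/"
```

The -f flag makes curl exit with an error on HTTP failures, so the loop keeps going until the proxy serves the page successfully. If it still fails after the last attempt, docker-compose logs searxng is a reasonable next place to look.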

Final Thoughts

I’ve found my custom search engine to be quite functional and have been using it exclusively for the last 6 months. To get the most out of your instance, remember to go to ‘preferences’ in the top right, open the ‘engines’ tab, and enable or disable the search engines of your choosing.

There are more tabs there that cover non-standard search engines, and it’s worth looking through all of them to select what you want. I find that enabling Reddit under ‘Social Media’ is useful, as well as ‘Stack Overflow’ under ‘IT’.

If you are getting slow search results, check each of your enabled search engines and make sure its response time, shown in the ‘Max Time’ column to its right, is reasonable.

That’s all! Happy searxng!
