
Setting up your own search engine using SearXNG, NixOS and Tailscale Funnel


Since the latest Google layoff announcements, there have been a lot of takes on Google, its slow demise, and its descent into the realm of traditional software companies.

I recently stumbled upon this blog article:
https://social.clawhammer.net/blog/posts/2005-09-25-FirstWeekAtGoogle/

There is a lot of nostalgia in it, but also a big disappointment.
Back then, the prospect of a layoff would have seemed unfathomable: how is it possible that a company with an annual revenue comparable to a nation's has to fire people?

The truth is that companies are not nonprofits. Google could live the dream as long as its revenue kept growing, but in the current scenario, where online advertising has become less lucrative, it quickly has to abandon its "don't be evil" mantra, and soon enough it will alienate both its employees and its customers.
In this context it is interesting to investigate the alternatives in 2024, especially in the self-hosted landscape.

Goals to achieve

  • eliminate tracking
  • eliminate ads
  • improved search results
  • lighter browser resource usage
  • improved speed/performance
  • customization/hacking/tinkering perspectives

Introducing SearXNG

SearXNG is a metasearch engine. I remember the concept from the 2000s, when there was a lot of competition but only small players: some apps would provide this functionality, querying all engines in parallel and aggregating the results.

In 2024, SearXNG does mostly the same thing, but with a client/server model and different priorities, especially around privacy.
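
To make the "query in parallel, then aggregate" idea concrete, here is a minimal sketch in Python. It is not SearXNG's actual code: the two engines are hypothetical stubs that return fixed lists of URLs, and the merge rule (more engines first, then best rank) is just one plausible choice.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for real engines: each takes a query and returns
# an ordered list of result URLs. Real engines would make HTTP requests.
def engine_a(query):
    return ["https://clojure.org/", "https://en.wikipedia.org/wiki/Clojure"]

def engine_b(query):
    return ["https://clojure.org/", "https://github.com/clojure/clojure"]

ENGINES = {"engine_a": engine_a, "engine_b": engine_b}

def metasearch(query, engines=ENGINES):
    """Query every engine in parallel and merge the results by URL."""
    with ThreadPoolExecutor(max_workers=max(1, len(engines))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in engines.items()}
        per_engine = {name: fut.result() for name, fut in futures.items()}

    merged = {}
    for name, urls in per_engine.items():
        for position, url in enumerate(urls, start=1):
            entry = merged.setdefault(
                url, {"url": url, "engines": [], "positions": []}
            )
            entry["engines"].append(name)
            entry["positions"].append(position)

    # Results found by more engines come first; ties go to the best rank.
    return sorted(
        merged.values(),
        key=lambda r: (-len(r["engines"]), min(r["positions"])),
    )

if __name__ == "__main__":
    for result in metasearch("clojure"):
        print(result["url"], result["engines"], result["positions"])
```

Each merged entry records which engines returned the URL and at which rank, which is enough to build a combined ordering.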

NixOS setup

Setting up SearXNG with Nix is extremely easy and requires only a couple of lines in an existing server setup:

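  services.searx.enable = true;
  services.searx.redisCreateLocally = true;  # provision a local Redis server for SearXNG
  services.searx.settings.server.secret_key = "test";  # placeholder, replace with a random secret
  services.searx.settings.server.port = 8080;
  services.searx.settings.server.bind_address = "0.0.0.0";  # listen on all interfaces
  services.searx.settings.search.formats = ["html" "json" "rss"];  # "json" is needed for the JSON API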
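
The secret_key above is set to "test", which is fine for a quick trial but should be a random value on an instance that is reachable from the internet. A minimal sketch for generating one with Python's standard library follows; the function name is only illustrative, and any source of random bytes would do.

```python
import secrets

def generate_secret_key(num_bytes=32):
    """Return a random hex string (two characters per byte)."""
    return secrets.token_hex(num_bytes)

if __name__ == "__main__":
    # Paste the printed value into services.searx.settings.server.secret_key.
    print(generate_secret_key())
```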

Self-Hosting using Tailscale Funnel

I have both a server instance in a datacenter and a home server, and I'm gradually trying to get rid of the online server in an attempt to self-host everything at home.

To do so, I usually set up some port forwarding on my OpenWrt router.

This comes with some drawbacks:

  • I need to dedicate a port per application
  • I'm exposing my home IP address, which is a security concern

This time I want to try another approach: setting up a tunnel with Tailscale. Assuming Tailscale is already running, the following commands will expose your port seamlessly using your Tailscale network name:

sudo tailscale serve https:443 / http://127.0.0.1:8080
sudo tailscale serve --serve-port 8443 funnel on
sudo tailscale funnel status

I can now access my SearXNG instance from anywhere using a host like this: https://search.foo4242.ts.net

The only compromise with this system is that I cannot use my own domain name as it breaks the SSL chain.

Admin API Sample

GET /config

Summarized using ChatGPT:

{
  "brand": {
    "DOCS_URL": "https://docs.searxng.org/",
    "GIT_URL": "https://github.com/searxng/searxng"
  },
  "categories": [
    "general",
    "social media",
    "files",
    "apps",
    "it",
    "software wikis",
    "images",
    "science",
    "scientific publications",
    "music",
    "videos",
    "web",
    "news",
    "repos",
    "other",
    "packages",
    "weather",
    "map",
    "dictionaries",
    "lyrics",
    "movies",
    "radio",
    "q&a",
    "wikimedia"
  ],
  "default_doi_resolver": "oadoi.org",
  "default_theme": "simple",
  "doi_resolvers": [
    "oadoi.org",
    "doi.org",
    "doai.io",
    "sci-hub.se",
    "sci-hub.st",
    "sci-hub.ru"
  ],
  "engines": [
    {
      "categories": ["social media"],
      "enabled": false,
      "paging": true,
      "name": "9gag",
      "shortcut": "9g",
      "timeout": 3.0
    },
    {
      "categories": ["files"],
      "enabled": false,
      "language_support": true,
      "languages": ["af", "zh", "...", "zh_Hant"],
      "name": "annas archive",
      "shortcut": "aa",
      "timeout": 3.0
    },
    {
      "categories": ["files", "apps"],
      "enabled": false,
      "paging": true,
      "name": "apk mirror",
      "shortcut": "apkm",
      "timeout": 4.0
    },
    // ... (other engines)
  ],
  "instance_name": "SearXNG",
  "limiter": {
    "botdetection.ip_lists.pass_searxng_org": true
  },
  "locales": {
    "af": "Afrikaans",
    "ar": "العربية (Arabic)",
    // ... (other locales)
  },
  "plugins": [
    {
      "enabled": true,
      "name": "Hash plugin"
    },
    {
      "enabled": true,
      "name": "Self Information"
    },
    {
      "enabled": true,
      "name": "Tracker URL remover"
    }
  ],
  "public_instance": false,
  "safe_search": 0,
  "version": "2023.10.31+b05a1554"
}
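
A document like this is easier to digest once reduced to a few fields. The sketch below assumes only the keys visible in the sample above ("engines", "plugins", "version", and the per-entry "name" and "enabled"); the function names and the base URL you pass in are illustrative.

```python
import json
from urllib.request import urlopen

def fetch_config(base_url):
    """Fetch and decode the /config document of an instance."""
    with urlopen(f"{base_url.rstrip('/')}/config", timeout=10) as resp:
        return json.load(resp)

def summarize_config(config):
    """Reduce a /config document to version, engines and plugins."""
    engines = config.get("engines", [])
    plugins = config.get("plugins", [])
    return {
        "version": config.get("version"),
        "enabled_engines": sorted(
            e["name"] for e in engines if e.get("enabled")
        ),
        "disabled_engines": sorted(
            e["name"] for e in engines if not e.get("enabled")
        ),
        "enabled_plugins": sorted(
            p["name"] for p in plugins if p.get("enabled")
        ),
    }

if __name__ == "__main__":
    # The host below is the example name used in this post.
    config = fetch_config("https://search.foo4242.ts.net")
    print(json.dumps(summarize_config(config), indent=2))
```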

Search API Sample

GET /search?q=clojure&format=json
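
A minimal sketch of issuing this request from Python, using only the standard library. The function names are illustrative, and the host is the example name from the Tailscale section. It assumes "json" is listed in search.formats, as in the NixOS configuration above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def build_search_url(base_url, query, fmt="json"):
    """Build the /search URL for a query, URL-encoding the parameters."""
    params = urlencode({"q": query, "format": fmt})
    return f"{base_url.rstrip('/')}/search?{params}"

def search(base_url, query):
    """Run a search and return the decoded JSON response."""
    with urlopen(build_search_url(base_url, query), timeout=10) as resp:
        return json.load(resp)

if __name__ == "__main__":
    response = search("https://search.foo4242.ts.net", "clojure")
    for result in response.get("results", [])[:5]:
        print(result["title"], result["url"])
```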

Summarized using ChatGPT:

{
  "query": "clojure",
  "number_of_results": 0,
  "results": [
    {
      "url": "https://clojure.org/",
      "title": "Clojure",
      "content": "Clojure is a compiled language, a dialect of Lisp, with a code-as-data philosophy and a powerful macro system.",
      "img_src": "",
      "engine": "google",
      "parsed_url": ["https", "clojure.org", "/", "", "", ""],
      "engines": ["brave", "qwant", "duckduckgo", "google"],
      "positions": [1, 1, 1, 1],
      "score": 16.0
    },
    {
      "url": "https://en.wikipedia.org/wiki/Clojure",
      "title": "Clojure",
      "content": "Clojure is a dynamic, functional dialect of Lisp on the Java platform, with syntax built on S-expressions.",
      "img_src": "",
      "engine": "google",
      "parsed_url": ["https", "en.wikipedia.org", "/wiki/Clojure", "", "", ""],
      "engines": ["brave", "qwant", "duckduckgo", "google"],
      "positions": [4, 2, 2, 2],
      "score": 7.0
    },
    {
      "url": "https://clojure.org/guides/learn/syntax",
      "title": "Learn Clojure - Syntax",
      "content": "A guide to Clojure's syntax, supporting multiple data structures, expressions, and functions.",
      "img_src": null,
      "engine": "google",
      "parsed_url": ["https", "clojure.org", "/guides/learn/syntax", "", "", ""],
      "engines": ["qwant", "duckduckgo", "google"],
      "positions": [5, 3, 4],
      "score": 2.35
    },
    {
      "url": "https://github.com/clojure/clojure",
      "title": "GitHub - clojure/clojure: The Clojure programming language",
      "content": "Clojure supports multiple paradigms like functional, object-oriented, and data-driven programming.",
      "img_src": "",
      "engine": "brave",
      "parsed_url": ["https", "github.com", "/clojure/clojure", "", "", ""],
      "engines": ["brave", "qwant", "duckduckgo"],
      "positions": [5, 7, 7],
      "score": 1.457142857142857
    },
    {
      "url": "https://clojure.org/about/functional_programming",
      "title": "Functional Programming",
      "content": "Clojure is a functional language that avoids mutable state and emphasizes functions as first-class objects.",
      "img_src": null,
      "engine": "google",
      "parsed_url": ["https", "clojure.org", "/about/functional_programming", "", "", ""],
      "engines": ["qwant", "duckduckgo", "google"],
      "positions": [9, 8, 10],
      "score": 1.0083333333333333
    },
    // ... (other results)
  ],
  "infoboxes": [
    {
      "infobox": "Clojure",
      "content": "Clojure is a dynamic and functional dialect of the Lisp programming language on the Java platform.",
      "img_src": "https://upload.wikimedia.org/wikipedia/commons/thumb/5/5d/Clojure_logo.svg/500px-Clojure_logo.svg.png",
      "urls": [
        {"title": "Wikipedia", "url": "https://en.wikipedia.org/wiki/Clojure"},
        {"title": "Official website", "url": "https://clojure.org/"},
        // ... (other URLs)
      ],
      "attributes": [
        {"label": "Inception", "value": "2007"},
        {"label": "Developer", "value": "Richard Hickey"},
        {"label": "Copyright license", "value": "Eclipse Public License"},
        // ... (other attributes)
      ],
      "engine": "wikidata",
      "engines": ["wikidata", "wikipedia"]
    }
  ],
  "suggestions": ["clojure tutorial", "clojure example", "clojure map", "clojure github", "clojure main", "clojure exercises", "clojure hello world", "clojure js"]
}
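
The "score" values in this sample are consistent with a simple rule: each engine that returned the result contributes (number of engines) / (position in that engine). This is inferred from the five results shown, assuming every engine has a weight of 1. It is not taken from the SearXNG documentation, so treat it as a sketch:

```python
def result_score(positions):
    """Score a result from the ranks it got in each engine.

    Assumes an engine weight of 1, so the weight is simply the number
    of engines that returned the result.
    """
    weight = len(positions)
    return sum(weight / position for position in positions)

if __name__ == "__main__":
    # "positions" lists copied from the sample response above.
    for positions in ([1, 1, 1, 1], [4, 2, 2, 2], [5, 3, 4]):
        print(positions, result_score(positions))
```

A result ranked first by four engines therefore scores far higher than one ranked around fifth by three, which matches the ordering of the sample.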

Date: 2024-01-20 Sat 00:00

Author: Guillaume Buisson

Created: 2024-03-20 Wed 16:10
