added visited sites functionality to crawler

partisan 2025-01-01 23:48:47 +01:00
parent c71808aa1e
commit 918e1823df
5 changed files with 178 additions and 63 deletions

@ -7,30 +7,30 @@
</p>
<p align="center">
A self-hosted private <a href="https://en.wikipedia.org/wiki/Metasearch_engine">metasearch engine</a> that aims to be more resource-efficient than its competition.
A self-hosted private search engine designed to be scalable and more resource-efficient than its competitors.
</p>
# Bear in mind that this project is still WIP
## Comparison to other search engines
## Comparison to other open-source search engines
| Feature | Whoogle [1] | Araa-Search | LibreY | 4get | SearchXNG | *QGato* |
| :------------------------- | ------------------ | ------------------------- | ------------------------ | ------------------------ | ------------------------- | ---------------------------------------------------- |
| Works without JavaScript | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Music search | | ❌ | ❌ | ✅ | ✅ | ✅ |
| Torrent search | ❌ | ✅ | ✅ | ❌ | ✅ | ✅ |
| API | ❌ | ❓ [2] | ✅ | ✅ | ✅ | ✅ |
| Scalable | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Not Resource Hungry | ❓ Moderate | ❌ Very resource hungry | ❌ Moderate 200-400mb~ | ❌ Moderate 200-400mb~ | ❌ Moderate 200-300MiB~ | ✅ about 15-20MiB at idle, 17-22MiB when searching |
| Result caching | ❌ | ❌ | ❓ | ❓ | ❓ | ✅ |
| Dynamic Page Loading | ❓ Not specified | ❌ | ❌ | ❌ | ✅ | ✅ |
| User themable | ❌ | ✅ | ❌ | ❌ | ✅[3] | ✅ |
| Unusual logo choice | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ |
| Feature | Whoogle [1] | Araa-Search | LibreY | 4get | SearchXNG | *QGato* |
| :------------------------- | ------------- | ------------------------- | ------------------------ | ------------------------ | ------------------------- | --------------------------------------- |
| Works without JavaScript | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Music search | | ❌ | ❌ | ✅ | ✅ | ✅ |
| Torrent search | ❌ | ✅ | ✅ | ❌ | ✅ | ✅ |
| API | ❌ | ❌ [2] | ✅ | ✅ | ✅ | ✅ |
| Scalable | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Not Resource Hungry | ❓ Moderate | ❌ Very resource hungry | ❌ Moderate 200-400mb~ | ❌ Moderate 200-400mb~ | ❌ Moderate 200-300MiB~ | ✅ about 15-30MiB even when searching |
| Result caching | ❓ | ❓ | ❓ | ❓ | ❓ | ✅ |
| Dynamic Page Loading | | ❌ | ❌ | ❌ | ✅ | ✅ |
| User themable | ❌ | ✅ | ❌ | ❌ | ❓[3] | ✅ |
| Unusual logo choice | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ |
[1]: I was not able to check this since their site does not work, same for the community instances.
[2]: In the project repo they specify that it has API, but It looks like they are no loger supporting it. Or just removed "API" button and documentation, since I was not able to find it anymore.
[2]: In the project repo they specify that it has an API, but it looks like they are no longer supporting it, or they just removed the "API" button and documentation, since I was not able to find them anymore.
[3]: It is called 'User Themable' because you want to give the user freedom of choice for their theme, not by hard-setting one theme in the backend and calling it themable.
@ -48,7 +48,7 @@ A self-hosted private <a href="https://en.wikipedia.org/wiki/Metasearch_engine">
### For Self-Hosting
- **Self-hosted option** - Run on your own server for even more privacy.
- **Lightweight** - Low memory footprint (15-22MiB) even during searches.
- **Lightweight** - Low memory footprint (15-30MiB) even during searches.
- **Decentralized** - No single point of failure.
- **Results caching in RAM** - Faster response times through caching.
- **Configurable** - Tweak features via `config.ini`.
@ -67,7 +67,7 @@ A self-hosted private <a href="https://en.wikipedia.org/wiki/Metasearch_engine">
### Prerequisites
- Go (version 1.18 or higher recommended)
- Go (version 1.23 or higher recommended)
- Git (unexpected)
- Access to the internet for fetching results (even more unexpected)