Server: Difference between revisions
Latest revision as of 17:50, 18 November 2025
⭐️ We have a server running locally that provides a few services to residents as well as guests.
File sharing services
The server provides the possibility to store and exchange data. Some services are publicly available (e.g. connecting as an anonymous user); for others you need a user account with some privileges. All file services are only available in the full network and are served via Samba. Use your computer's file manager to browse the available network computers and locate the server as KANTHAUS-SERVER. If it doesn't show up in your file manager or the link is broken, you can try entering smb://kanthaus-server/ directly into your file manager's address bar. This should work on most Linux environments.
Getting a user account
To get a user account, speak to an admin (e.g. Antonin or Tilmann). The admin will add you to the Ansible user configuration and ask you to set a temporary password using your account. You can change the password yourself, e.g. via smbpasswd -r kanthaus-server -U yourusername.
Actually, you have two passwords:
- System user account: Used for local access and SSH access. Change the password using passwd when logged in.
- Samba account: Used for accessing the Samba network shares. Change the password remotely using the command above, or using smbpasswd when logged in.
Kanthaus cloud copy
- The share kanthaus-public offers an anonymously usable read-only copy of the public part of the Kanthaus cloud.
- The share cloud.kanthaus.online offers a read-only copy of the whole Kanthaus cloud. You need to have a user with the permissions class kanthaus.
The cloud copy is synchronized from the Kanthaus cloud once every minute.
Internal cloud
The share internalcloud stores data which should only be available from inside Kanthaus (e.g. financial data) and is only available to users with the permissions class internal. Please only put security-sensitive material in here, and when you have access, make sure not to leak your user credentials or the contents of this folder.
This folder is part of the daily backup.
Home folder
Every user account also has their personal home folder available as the homes storage. All data you put here is only available to yourself. Inside the home folder, there is a directory called storage. This folder lies on an easily expandable, cheap hard-disk storage. It is slower to access but suitable to store lots of data (e.g. backups of your computer).
Your home folder is part of a daily backup. Put a file called .nobackup into any folder that you don't want to be backed up (e.g. to save storage space). The storage folder is not part of the backup, but the hard disks are in a RAID configuration that tolerates the loss of one hard disk.
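The marker-file convention can be sketched with two small shell helpers. The names mark_nobackup and list_nobackup are hypothetical and not part of the server setup; how the server's Borgmatic configuration evaluates the marker is not documented on this page, so treat this only as a way to place markers and to see which folders carry one.

```shell
# Hypothetical helpers for illustration only.

# mark_nobackup DIR
# Place an empty .nobackup marker file in DIR.
mark_nobackup() {
    touch "$1/.nobackup"
}

# list_nobackup ROOT
# Print every folder below ROOT that contains a .nobackup marker, sorted.
list_nobackup() {
    find "$1" -type f -name .nobackup -exec dirname {} \; | sort
}
```

For example, mark_nobackup ~/Downloads followed by list_nobackup ~ would show ~/Downloads among the marked folders.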
Shell access
When you have a user account, you can also use SSH to connect to the server and use it for computing tasks. To set an initial password, ask an admin. If you already have file sharing access, you can add your SSH public key to homes/.ssh/authorized_keys and use that for logging in. As described above, your home directory contains a symlinked folder called storage which is on spinning disks, whereas the rest of your home folder is on limited SSD space.
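Adding a public key can be sketched as follows, assuming the homes share (or the home directory itself) is reachable as a local path. The function name install_pubkey and both arguments are hypothetical; on a mounted Samba share the chmod calls may have no effect, in which case the permissions have to be fixed after logging in.

```shell
# Hypothetical helper for illustration only.

# install_pubkey HOME_DIR PUBKEY_FILE
# Append the (single-line) public key in PUBKEY_FILE to
# HOME_DIR/.ssh/authorized_keys unless that exact line is already present.
install_pubkey() {
    home_dir=$1
    pubkey_file=$2
    mkdir -p "$home_dir/.ssh"
    chmod 700 "$home_dir/.ssh"
    touch "$home_dir/.ssh/authorized_keys"
    if ! grep -qxF -- "$(cat "$pubkey_file")" "$home_dir/.ssh/authorized_keys"; then
        cat "$pubkey_file" >> "$home_dir/.ssh/authorized_keys"
    fi
    chmod 600 "$home_dir/.ssh/authorized_keys"
}
```

Running the helper twice with the same key leaves a single entry, so it is safe to repeat.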
Limiting crawlers
Our web services all run behind an NGINX reverse proxy. The recent intensification of crawling for LLM training was significantly affecting our Forgejo instance, with periodic spikes of more than 100% above baseline. To diagnose the issue, ngxtop was installed; it can be run with /opt/kh-services/ngxtop/bin/ngxtop --group-by remote_addr -n 50 -l /data/services/data/nginx-logs/forgejo-access.log. This provides a "top-like" overview of the HTTP responses the web server sends back, grouped by (bucketed) IP. Additionally, tail -f /data/services/data/nginx-logs/<service>-access.log provides a more detailed, chronological view of HTTP responses.
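Where ngxtop is not at hand, a rough per-address request count can be sketched with standard tools. This assumes the access log uses a format in which the client address is the first whitespace-separated field (as in nginx's default "combined" format); the actual log format on the server is not stated here, and top_ips is a hypothetical helper name.

```shell
# Hypothetical helper for illustration only.

# top_ips LOG_FILE [LIMIT]
# Print "<count> <address>" for the LIMIT (default 10) most frequent
# client addresses, assuming the address is the first field of each line.
top_ips() {
    log_file=$1
    limit=${2:-10}
    awk '{ count[$1]++ } END { for (ip in count) print count[ip], ip }' "$log_file" \
        | sort -k1,1nr -k2,2 \
        | head -n "$limit"
}
```

For example, top_ips /data/services/data/nginx-logs/forgejo-access.log 50 would list the 50 busiest addresses in the Forgejo access log.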
At the time of writing, for countermeasures we have
- an nginx rate limit of 15 requests / second in addition to some burst allowances
- a robots.txt that forbids all robots entry
- an nginx blocklist that blocks all AI robots user agents
- go-away that blocks requests based on some non-javascript challenges, based on the default example for forgejo (https://git.gammaspectra.live/git/go-away/src/branch/master/examples/forgejo.yml), with the only differences that js-refresh is replaced by header-refresh and the final js-pow-sha256 challenge is removed
Stats of go-away are available on Grafana (https://grafana.yunity.org/d/f7bd6db8-b503-48b3-8a4d-fd1011aac8e0/go-away).
SFTP Access
You can use software like FileZilla to access your home folder through sftp://kanthaus-server providing your username. See the Shell Access section above for other details.
How to unlock the encrypted kanthaus-server via network
- be in kh-admin
- ssh -p 2222 root@192.168.178.249
  - your key must be stored on the server in /etc/dropbear-initramfs/authorized_keys -> update-initramfs -u
  - ED25519 key fingerprint: SHA256:mvCVYx8D/Fv/qYq+a/H4MoRAcfExAUsAFW3L2NVHnD0
- enter password (stored in keepass -> Server)
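Before entering the password, the host key offered on port 2222 can be compared with the fingerprint listed above. The following is a minimal sketch: fingerprint_matches is a hypothetical helper that reads ssh-keygen -l style output from standard input and succeeds only if one line carries the expected fingerprint.

```shell
# Hypothetical helper for illustration only.

# fingerprint_matches EXPECTED
# Reads lines of the form "<bits> <fingerprint> <comment> (<type>)" from
# stdin and exits 0 if any line's fingerprint equals EXPECTED, else 1.
fingerprint_matches() {
    expected=$1
    awk -v want="$expected" '$2 == want { found = 1 } END { exit found ? 0 : 1 }'
}

# Possible usage (requires network access to the server, so left commented):
# ssh-keyscan -p 2222 -t ed25519 192.168.178.249 2>/dev/null \
#     | ssh-keygen -lf - \
#     | fingerprint_matches 'SHA256:mvCVYx8D/Fv/qYq+a/H4MoRAcfExAUsAFW3L2NVHnD0' \
#     && echo "fingerprint matches"
```

The same check happens interactively when ssh shows the fingerprint on first connection; the helper only makes the comparison scriptable.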
System specs
- System is designed to save power but still have some computing resources.
- CPU: Intel Core i5-2500K (4x 3.3 GHz)
- Ram: 16 GB DDR3L
- SSD: 1 TB Samsung 860 Evo as root file system
- HDD: BTRFS pool with 2 disks. Current usable size 3 TB
Backups
Backups are done using Borgmatic. The backup target is the local hard-disk storage, so it does not protect us against fire or theft of the computer. We might think about adding a remote backup as well.
Other services
- Foodsharing GitLab CI runner (dockerized)
- House bus services
- local Web interface (dockerized)
- Logging daemon to externally hosted influxdb
- Virtual machines (kvm/libvirt)