Page view stats with Traefik + Loki
One thing I have been missing after migrating my personal web apps from (mainly) Vercel to this little hosting thing is web traffic statistics, like page views and client/browser data. Not that it really matters much for smaller personal projects, but it’s nice to have some idea of how the apps are getting hit. Luckily Traefik, which handles the routing on the host server, has good support for this. Time for some stats per project. As one might expect, ‘how hard can it be?’ ends up at ‘it depends’ or ‘what is hard?’ and a few more hours of work than I probably should have spent on this. Good fun though. I’ll try to run through the whole setup, from nothing to a working dashboard tab with a page view graph for a project.
With the right config, Traefik can be set up to write a json file with some useful web traffic data. I am setting up Traefik (and everything else) on the host server using Ansible, with Docker running it, so first I need to add some flags in the docker-compose.traefik.yml.j2 config file under the services/traefik/command section:

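Roughly these flags (a sketch of the relevant part, not the full command list):

```yaml
# services/traefik/command (excerpt)
- "--accesslog=true"
- "--accesslog.format=json"
- "--accesslog.filepath=/var/log/traefik/access.json"
- "--accesslog.fields.defaultmode=keep"
- "--accesslog.fields.headers.defaultmode=keep"
- "--accesslog.fields.headers.names.Authorization=redact"
- "--accesslog.fields.headers.names.Cookie=redact"

# services/traefik/volumes -- the bind mount that puts the file on the host
- "./logs:/var/log/traefik"
```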
These lines basically set things up so that I get a logs/access.json file. I keep most request/response fields and headers for debugging and stats, while redacting sensitive things like the Authorization and Cookie headers so I don’t end up exposing those.

This is the reason the file will end up as ./logs/access.json on the host (the container path /var/log/traefik is bind-mounted to ./logs in Compose).
Next up is adjusting the Ansible Traefik task to also create the folder for the access.json file and set up Logrotate, a small Linux utility that automates compressing and cleaning up files just like access.json here, so they won’t grow and fill up too much disk space.

The access.json logrotate setup keeps Traefik’s access log in a single file on the host and rotates it without requiring a Traefik reload. The /logs folder is created on the host, logrotate must be installed, and a rule is added that watches {{ traefik_directory }}/logs/access.json. Logrotate rotates daily or when the file reaches 50 MB, keeps the last 7 rotated copies, compresses older ones, and won’t complain if the file is missing or empty (missingok, notifempty). Copytruncate is used so Traefik can keep writing to the same access.json path: logrotate copies the current content to a rotated file and then truncates access.json in place.
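Spelled out as a logrotate rule, what the Ansible template renders looks something like this (a sketch, with the path variable left in; the file name under /etc/logrotate.d is an assumption):

```
# /etc/logrotate.d/traefik-access
{{ traefik_directory }}/logs/access.json {
    daily
    maxsize 50M
    rotate 7
    compress
    missingok
    notifempty
    copytruncate
}
```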
Now, I may be skipping some serious debugging and troubleshooting here (I can’t remember), but with the updated configuration I should get an access.json file with content something like this adjusted/compressed/masked example of one request:

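(The entry below is illustrative rather than copied straight from my log; the hostnames and values are made up, but the field names match Traefik’s JSON access log format.)

```json
{
  "ClientAddr": "203.0.113.42:51234",
  "ClientHost": "203.0.113.42",
  "DownstreamStatus": 200,
  "Duration": 2480000,
  "RequestHost": "someapp.example.com",
  "RequestMethod": "GET",
  "RequestPath": "/",
  "RequestProtocol": "HTTP/2.0",
  "RouterName": "someapp@docker",
  "ServiceName": "someapp@docker",
  "StartUTC": "2025-11-25T09:12:33.123456789Z",
  "entryPointName": "websecure",
  "request_Authorization": "REDACTED",
  "request_Sec-Fetch-Dest": "document",
  "request_User-Agent": "Mozilla/5.0 (...)",
  "time": "2025-11-25T09:12:33Z"
}
```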
So now Traefik is collecting data about requests to the hosted apps in the access.json file. The next step is to ship those logs from the host to the control plane, where I can query them centrally and use them for the UI. For this I will use two more tools, Promtail and Loki. Promtail is a log collector/shipper which will run on the host server and ship logs to the consumer, the control plane server in this case, and Loki is a kind of log database which will receive the logs from Promtail, store them and make them queryable for the UI. Both tools are part of the Grafana stack, an open source platform for handling and visualizing observability data. As I am writing this and reading up on Grafana I see that Promtail is actually deprecated and replaced by a tool called Alloy. LTS support ends in March 2026, but since the pipeline is already set up and working at this point I can’t be bothered about this right now. Maybe a blog post about migrating to Alloy at some point…
The server setup for Promtail and Loki follows the same pattern as before: both the host and control plane servers are set up using Ansible, so this is done by directly adjusting their playbooks/tasks and rerunning them in a trial and error loop until things work. That is the process I have followed at least; not sure if that is good practice or not, but it gets the job done. The nice part with Ansible is that it’s idempotent. You can tweak a task, run the playbook again, and only the changed bits apply. And when it is done it is done: I can reproduce a server setup anytime by re-running the playbook. The promtail.yml file is intentionally shortened and compressed a bit for readability in this post. The excerpts are obviously not meant for copy/paste/run, they are more to give a high-level idea of what I’m doing here. If you are interested in the actual code just ping me somewhere. For Promtail, a new promtail.yml file is set up with tasks to make sure it has a directory, render a config file, create a Docker compose file and then start Promtail, and this new file is included in the Ansible playbook’s list of tasks to run (site.yml):


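In shortened form the tasks look roughly like this (a sketch; variable names and file names are placeholders rather than the exact ones in my repo). The rendered promtail config is what tails logs/access.json and pushes the entries to Loki on the control plane:

```yaml
# promtail.yml (Ansible tasks)
- name: Ensure promtail directory exists
  ansible.builtin.file:
    path: "{{ promtail_directory }}"
    state: directory
    mode: "0755"

- name: Render promtail config
  ansible.builtin.template:
    src: promtail-config.yml.j2
    dest: "{{ promtail_directory }}/promtail-config.yml"

- name: Render promtail docker compose file
  ansible.builtin.template:
    src: docker-compose.promtail.yml.j2
    dest: "{{ promtail_directory }}/docker-compose.yml"

- name: Start promtail
  community.docker.docker_compose_v2:
    project_src: "{{ promtail_directory }}"
    state: present
```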
Next is setting up Loki, again using Ansible, to receive and store the log streams on the control plane server. What is good about Loki is that it stores logs cheaply by only indexing labels, not the log content, which makes it easy to filter on things like ‘project’, ‘status’ and ‘method’ and then run LogQL (Loki’s query language) functions like count_over_time or rate. In practice this means fast queries and low storage use. I did run into a bug thanks to not scrutinizing which version the LLM suggested when setting up Loki: it was an older version with buggy log indexing, so traffic hits initially showed up when querying but were lost after a while. Lots of debugging later, moving to the latest version solved it. Below is an excerpt from the Loki Ansible tasks.

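Again shortened and with placeholder names; it mirrors the Promtail tasks, the main differences being the data directory and the (recent) Loki image pinned in the compose template:

```yaml
# loki.yml (Ansible tasks)
- name: Ensure loki directories exist
  ansible.builtin.file:
    path: "{{ item }}"
    state: directory
  loop:
    - "{{ loki_directory }}"
    - "{{ loki_directory }}/data"   # chunks and index end up here

- name: Render loki config
  ansible.builtin.template:
    src: loki-config.yml.j2
    dest: "{{ loki_directory }}/loki-config.yml"

- name: Render loki docker compose file
  # the template pins a recent grafana/loki image; an older tag is what
  # caused the indexing bug mentioned above
  ansible.builtin.template:
    src: docker-compose.loki.yml.j2
    dest: "{{ loki_directory }}/docker-compose.yml"

- name: Start loki
  community.docker.docker_compose_v2:
    project_src: "{{ loki_directory }}"
    state: present
```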
I am still keeping an eye on this, but Loki seems pretty good at compression and I should have no problem storing traffic data for the few projects I have running for at least a year back, probably more, on my cheap 40GB server.
One thing that took some serious time, and probably shouldn’t have, was setting up a LogQL filter that only lets through actual real traffic hits. Which also touches on an interesting fact I only had a vague idea about before this project: how much noise traffic there is out there, for web servers but also for pretty much anything you expose to the internet. Scanners, crawlers, random http clients hitting known paths looking for exploits, it goes on. Which at first sent me off creating, with an LLM, the most intricate queries with long lists of known bots, paths to exclude, http status codes to ignore, do this, do that… The final query seems to only let through actual traffic, and it is painfully simple compared to what I came up with first.

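Something along these lines (a sketch: the job/project label names and the JSON field names depend on how Promtail and Traefik are set up, so treat them as placeholders):

```logql
sum(count_over_time(
  {job="traefik", project="myproject"}
    | json
    | request_Sec_Fetch_Dest = "document"
    | DownstreamStatus >= 200 and DownstreamStatus < 400
    | request_User_Agent != ""
  [24h]
))
```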
This query only counts page loads (Sec-Fetch-Dest=document) for a project during the last 24 hours, only includes successful/redirected responses (status 200-399) and only requests that have a User-Agent header. I will spare you my previous attempts; I take as little credit for those as I do for the one above. It is possible that this lets through a false hit occasionally, but after watching it for a while the traffic data seems pretty trustworthy.
So at this point I have Loki storing traffic data on the control plane server and a query for pulling the relevant data for a project. Since Loki is running on the same server as the app I’ll be querying from, it’s pretty simple to fetch the data. Loki exposes an http query endpoint and, because the server is reasonably locked down and only exposes ports that need to be public (and Loki’s port is not one of them), the backend can just call Loki directly over http without any extra authentication. All I need is an http client with a function that makes the request with my nice little LogQL query. Below is an excerpt.

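Or rather, a simplified sketch of it: the module name and the Req http client are stand-ins, but the endpoint (Loki’s /loki/api/v1/query_range) and the general flow are the point here:

```elixir
defmodule MyApp.Loki do
  # Sketch of the Loki client; the module name and the Req dependency are placeholders.
  # Loki is only reachable locally on the control plane, so plain http without auth is fine.
  @query_range_url "http://localhost:3100/loki/api/v1/query_range"

  @doc "Page loads per day for a project over the last `days` days."
  def pageviews_last_days(project, days \\ 7) do
    now = DateTime.utc_now() |> DateTime.to_unix()

    # `project` is one of my own project slugs, not user input.
    query = """
    sum(count_over_time({job="traefik", project="#{project}"}
      | json
      | request_Sec_Fetch_Dest = "document"
      | DownstreamStatus >= 200 and DownstreamStatus < 400
      | request_User_Agent != ""
      [24h]))
    """

    params = %{
      "query" => query,
      "start" => now - days * 86_400,
      "end" => now,
      "step" => "24h"
    }

    # Crash early on anything but a 200; good enough for a dashboard.
    %Req.Response{status: 200, body: body} = Req.get!(@query_range_url, params: params)

    # Loki answers with a "matrix": a list of series, each carrying [unix_ts, value] pairs.
    body
    |> get_in(["data", "result"])
    |> List.first(%{})
    |> Map.get("values", [])
  end
end
```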
Almost done. Calling the pageviews_last_days function shown above returns something like:
[
[1763769600, "1"],
[1763942400, "1"],
[1764028800, "4"],
[1764115200, "2"],
[1764201600, "3"]
]
The first value in each pair in this list is a Unix timestamp (seconds since 1970-01-01 UTC, my new favourite time format) and the second value is the number of page loads in that bucket, returned as a string. The example here would translate to this:
- 2025-11-21: 1 page load
- 2025-11-23: 1 page load
- 2025-11-24: 4 page loads
- 2025-11-25: 2 page loads
- 2025-11-26: 3 page loads
Since days without hits are not included in the raw result, I run this through a small helper function to fill in the blanks and end up with a complete list of dates and hits for the requested time range.
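Roughly like this (a sketch with made-up names, assuming the [unix_ts, count] pairs from above):

```elixir
defmodule MyApp.Stats do
  @doc "Fill in zero-count days so the chart gets exactly one point per date."
  def fill_missing_days(values, days) do
    # Loki hands back [unix_ts, count-as-string] pairs; key them by date.
    # (Depending on how the buckets are aligned you may want to shift the date by a day.)
    counts =
      Map.new(values, fn [ts, count] ->
        {ts |> trunc() |> DateTime.from_unix!() |> DateTime.to_date(), String.to_integer(count)}
      end)

    today = Date.utc_today()

    # Oldest date first, zero where Loki had nothing.
    for offset <- (days - 1)..0//-1 do
      date = Date.add(today, -offset)
      {date, Map.get(counts, date, 0)}
    end
  end
end
```

Setting up the final UI and chart (Phoenix LiveView and Chart.js) to show the traffic data is out of scope for this post, so let’s just skip to the final result: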

So there we have it. A very basic view of the last week’s “real” traffic. It could probably do with some design tweaks, and some obvious future improvements would be things like selecting a custom timespan for the stats or more detailed per-request stats (client IP, user agent, country/region), but some of that would drag in extra services (GeoIP) and complexity, and I’m ok with what I’ve got at the moment. This might well be done as it is, since it kind of made my point of proving to myself that this was doable. If you have suggestions for improvements (or can tell me what I did wrong), I’m all ears; this was my first time setting this up, so chances are there’s a cleaner way.
That’s about it for this time around. Not sure what’s next, but I feel like I need to run through how the build/deploy live logs and the agent are set up, and a bunch of other pieces as well, partly so I don’t forget how any of this works, partly to convince myself I’ve actually learned something, and partly to have some kind of written trail I can come back to when everything inevitably falls out of my head. If you made it this far, I salute you.
Things mentioned above
- spwnr - Personal PaaS platform
- Vercel - Hosting platform
- Traefik - Reverse proxy and load balancer
- Ansible - Infrastructure automation tool
- Docker - Container platform
- Logrotate - Log file management utility
- Promtail - Log shipping agent (deprecated)
- Loki - Log aggregation system
- Grafana - Open source observability platform
- Alloy - Successor to Promtail
- LogQL - Loki’s query language
- Elixir - Programming language