RFC: Server side website tracking with Matomo

Since cookie consent is becoming an ever more serious topic, server-side tracking is becoming more interesting. I’m not an expert on the current law, so if somebody knows more, please give your input.

I created an issue https://github.com/Flowpack/neos-matomo/issues/37 and linked the official Matomo PHP Library.

The following questions arise for me at the moment:

  • What can be done in a GDPR-safe way, and what cannot be done without consent?
  • How do we separate requests that are relevant for tracking from irrelevant ones?
  • What is the impact on response times when the tracking is executed during request processing (e.g. as an HTTP component)?
  • What data do we need as a minimum to generate any usable tracking data?
  • How to ignore crawlers, bots, etc. (or will Matomo take care of that when we supply the necessary information)?
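
On the "minimum data" question, here is a rough sketch of what a single page view payload could look like. It assumes the Matomo HTTP Tracking API parameters `idsite`, `rec`, `apiv`, `url`, `action_name` and `ua`; the function name `build_tracking_query` is made up for illustration, and Python is used only to keep the sketch short (the real package would use the Matomo PHP library).

```python
from urllib.parse import urlencode


def build_tracking_query(site_id, page_url, title, user_agent):
    """Build the query string for one page view sent to Matomo's tracking endpoint.

    Hypothetical helper: only the parameter names come from the Tracking API.
    """
    params = {
        "idsite": site_id,      # ID of the website in Matomo
        "rec": 1,               # must be 1 for the request to be recorded
        "apiv": 1,              # Tracking API version
        "url": page_url,        # full URL of the tracked document
        "action_name": title,   # page title shown in the reports
        "ua": user_agent,       # User-Agent of the visitor, not of our server
    }
    return urlencode(params)


query = build_tracking_query(1, "https://example.org/", "Home", "Mozilla/5.0")
print(query)
```

Anything beyond that (referrer, visitor IP, visitor ID) is optional and is exactly where the consent question starts to matter.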

And the most important question:

Who would like to help with the implementation? 🙂

Where in the chain must a component like this be placed?

  • What is the impact on response times when the tracking is executed during request processing (e.g. as an HTTP component)?

Could we queue it (using a job queue) to make it async?

We could do it after Neos resolves a node, but it also has to work with the FullPageCache, for example.
At that point we would already know that it’s a page and could retrieve the title, etc.

A queue is a possibility; maybe make it available as an option, as long as we don’t have a queue as a core dependency.
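
To illustrate the queue idea, a minimal sketch: the request only puts the payload on a queue, and a background worker does the slow part. This uses an in-process Python queue as a stand-in for a real job queue, and `start_tracking_worker` and `send` are hypothetical names, not part of any existing package.

```python
import queue
import threading


def start_tracking_worker(send):
    """Start a background thread that passes queued payloads to `send`."""
    jobs = queue.Queue()

    def worker():
        while True:
            payload = jobs.get()
            if payload is None:          # sentinel value: stop the worker
                jobs.task_done()
                break
            try:
                send(payload)            # e.g. the HTTP call to Matomo
            except Exception:
                pass                     # a failed tracking call must not break the site
            finally:
                jobs.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return jobs, thread


sent = []
jobs, thread = start_tracking_worker(sent.append)
jobs.put({"url": "https://example.org/"})   # returns immediately
jobs.put(None)
jobs.join()
```

The request path only pays for `jobs.put(...)`, so the response time no longer depends on how fast Matomo answers. A persistent queue would additionally survive restarts, which this sketch does not.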

My 2c, without knowing Matomo in depth (we could actually do the same for Google Analytics as well):

As soon as you store any personal data (e.g. the IP address), you need consent for that. It is not entirely clear what happens if we anonymize the IP address (which we should do if we save it).
If you set a cookie (or use local storage or similar) that is not technically necessary, you have to inform the user about why you set it and what you do with it.
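
For reference, anonymizing an IP address before it is stored or forwarded could look roughly like this. It is only a sketch of the idea: the prefix lengths (16 bits for IPv4, 48 bits for IPv6) are assumptions, `anonymize_ip` is a hypothetical name, and whether this is enough legally is exactly the open question. Matomo also has its own IP anonymization setting, which this does not replace.

```python
import ipaddress


def anonymize_ip(ip, v4_prefix=16, v6_prefix=48):
    """Zero out the host part of an IP address, keeping only the network prefix."""
    addr = ipaddress.ip_address(ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False lets us pass a host address and get its enclosing network
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


print(anonymize_ip("203.0.113.77"))            # 203.0.0.0
print(anonymize_ip("2001:db8:abcd:1234::1"))   # 2001:db8:abcd::
```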

On the server side we can mostly track page views, so we would need to operate on the document (URL) level.
We could also track interactions (like pagination, forms, …), which would give more reliable data with regard to conversions. (Could a tracking API like “@tracking” in Fusion be helpful?)
If you also add tracking on the web server level, you can track downloads such as PDFs as well (which might be one of the benefits of server-side tracking compared to client-side).

Afaik it is done with the “Device Detector” library (matomo-org/device-detector on GitHub), which parses any User-Agent and detects the browser, operating system, device used (desktop, tablet, mobile, TV, car, console, etc.), brand and model. So what we need to supply is the User-Agent.
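
If we wanted to skip obvious bots before even sending a request, a cheap pre-filter on the User-Agent could look like the sketch below. This is a naive keyword heuristic and not what Device Detector does; the pattern and the name `looks_like_bot` are assumptions for illustration only.

```python
import re

# Naive keyword list, nowhere near as complete as a maintained bot database
BOT_PATTERN = re.compile(r"bot|crawl|spider|slurp|curl|wget", re.IGNORECASE)


def looks_like_bot(user_agent):
    """Return True for empty User-Agents and ones containing a typical bot keyword."""
    if not user_agent:
        return True
    return bool(BOT_PATTERN.search(user_agent))


print(looks_like_bot("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
print(looks_like_bot("Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"))
```

Something like this would only save requests; the actual classification would still be left to Matomo via the supplied User-Agent.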