RFC: Node Context and domains

Hey,

and another one from me… Again more in the category “it’s broken, let’s discuss how to fix it”.

To set the stage: I am working on allowing references to nodes outside the current site. That in itself isn’t hard, and I have the change ready to push, but it has one bigger issue: what to show in the backend for those nodes. Obviously the uriPath alone won’t be helpful for nodes in other sites, and it cannot be produced at all for things that are not in a site.
I have some ideas what to show, but that pointed me to the actual issue I would like to discuss:

We cannot reliably link to other sites currently for several reasons:
  • The domain entity we have has a property hostPattern, but can we trust it contains a valid domain?
  • How do we know which domain is the primary one?
This would be the first issue to solve. I am not entirely sure how to proceed, but I would suggest a primary flag (which should actually be mutually exclusive with being disabled) and a transition to make hostPattern an actual host. Or we add a separate host property that is required for the primary domain?
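
To make the proposed constraint a bit more concrete, here is a minimal sketch of the check. The function names are made up for illustration and are not part of Neos, and it assumes that anything PHP does not accept as a host name (for example a pattern containing a wildcard) cannot be primary:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: true when $hostPattern is a plain host name and not a pattern.
 * Relies on PHP's own host name validation, which rejects characters such as "*".
 */
function isPlainHostName(string $hostPattern): bool
{
    return filter_var($hostPattern, FILTER_VALIDATE_DOMAIN, FILTER_FLAG_HOSTNAME) !== false;
}

/**
 * Hypothetical rule from the proposal above: a primary domain must be
 * active (primary and disabled are mutually exclusive) and a real host name.
 */
function canBePrimary(string $hostPattern, bool $active): bool
{
    return $active && isPlainHostName($hostPattern);
}
```

With this rule, an active `example.com` could be flagged as primary, while `*.example.com` or a disabled domain could not.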

Addition: The \TYPO3\Neos\Controller\Backend\MenuHelper::buildSiteList method does link to other sites, but to me that looks far from reliable…

Then we have a second problem: ContentContext holds currentSite and currentDomain, and this information is used all over the place (link building, for example), but it is created in a less than optimal way, which results in link building that doesn’t work for other sites.

See, this code snippet is used in several places:

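// Look up the domain that matches the current HTTP request
$currentDomain = $this->domainRepository->findOneByActiveRequest();
if ($currentDomain !== null) {
    // A domain matched: use the site it belongs to
    $contextProperties['currentSite'] = $currentDomain->getSite();
    $contextProperties['currentDomain'] = $currentDomain;
} else {
    // No domain matched: fall back to the first online site
    $contextProperties['currentSite'] = $this->siteRepository->findFirstOnline();
}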

This is fine for me in (incoming) routing, as there we need to go by the current request. But it is also used in the NodeConverter, for example, and there it is IMHO wrong: the result is not deterministic based on the input alone but depends on the request. I rather think we should derive the site from the node path given to the converter. I also have a change ready for that, but of course it is quite a big change (even if mostly not noticeable). The same goes for any other place this type of code is used.
Pushed that here: https://github.com/neos/neos-development-collection/pull/133
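
Just to illustrate the idea of deriving the site from the node path (this is not the code from the pull request), a minimal sketch could look like this. It assumes site nodes live directly below /sites, and the function name is invented for the example:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: extracts the site node name from a node path.
 * Assumes site nodes are direct children of "/sites".
 * Returns null for paths that do not belong to any site.
 */
function siteNodeNameFromPath(string $nodePath): ?string
{
    // Stop at "/" (next path segment) or "@" (start of a context suffix)
    if (preg_match('#^/sites/([^/@]+)#', $nodePath, $matches) === 1) {
        return $matches[1];
    }
    return null;
}
```

The result depends only on the input path, so the same path always yields the same site regardless of the request. A path such as `/sites/neosdemo/home/about` gives `neosdemo`, while a path outside /sites gives null and the caller has to decide what that means.
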
Then the additional problem is how to deal with fetching nodes within an existing ContentContext (if you only have the path). Imagine you grab a node from a site that is not currentSite. You then have a node whose context doesn’t match its site, which breaks link building (and probably other things as well).

IMHO we have two routes to go with ContentContext, and I am not sure yet which I like more. I have to think the scenarios through a bit more:

  1. A ContentContext is deliberately about one specific site; any nodes outside of the currentSite are basically non-existent for that context. This would be pretty consistent BUT make our lives VERY hard, because we would have to build logic to switch contexts everywhere we create nodes (doable) and make sure we expect different contexts. Everything not in a site would need to use a CR Context and could not use a ContentContext.

  2. We basically “remove” currentSite from the ContentContext or treat it differently than we do now (really only as the current site and not as the site all nodes belong to, which is what we do now). That would definitely mean it shouldn’t be part of the context properties anymore. I still need to look through the code to see what that would imply, but I think it would be doable. This would make the ContentContext mostly a “place to grab some request specific information”, but the context properties wouldn’t differ from those of the CR. The task would be to provide a central place that you can give a node and get the matching site (and domain) from.

As said, I wanted to start discussion about it without having a master plan yet. I like both options but I know that option 1 will make some things much harder as the context starts to depend on the node path in some way.
Especially for the basic idea behind the currentSite I would like to pull in @aertmann @kdambekalns @robert @christopher and @sebastian

No one? :frowning:
Would be great to have some feedback and discussion here. This topic should be tackled…

Well, we have to trust it is a usable pattern. It does not have to be a valid domain, that’s why it is a pattern.

No, please don’t remove the ability to specify a pattern here. Being able to act on subdomains might be a useful feature for someone out there. The question of “which is the canonical domain” is something different and needs to be answered anyway (canonical link, anyone?). So, yes, why not be able to mark one domain per site as primary/canonical and require it to be a FQDN?

If the current domain is in the request, that should be fine. I can imagine findFirstOnline causing trouble, though; it should return a (matching) primary domain as per above.

Obviously that needs to use the primary domain for that site. Which leads me to the conclusion that one must have a primary domain per site…

All that being said…

To me option 2 definitely sounds better.

Sorry, I started writing this last Sunday, but it took quite some time and I had to reflect on it a bit, so I switched focus to the release stuff.

Anyway thanks for bringing this important topic up. We’ve discussed parts of this before when talking about structured editing and multi-site support, but haven’t taken any action to fix them yet.

In general the current implementation is broken by design and the concept is not generic, since the whole thing is tightly coupled to a site, which happened for pragmatic reasons. We should come up with a solution where sites are just one example of a “bounded context” of the CR. There are many aspects to consider here, so we should think things through before deciding on a path.


Decouple URL from the backend
One of the first things to do is to decouple the frontend URL from the backend. This will allow us to show any node from the CR and allow showing entities as well. Additionally it will allow viewing a site regardless of the domain, which is very useful for multi-sites. The backend URL could either use the node identifier with a context string, e.g. 75a28524-6a48-11e4-bd7d-7831c1d118bc@user-admin;language=en_US, or the full path. The uriPath would then stay accessible as a fallback and for flexibility & backwards compatibility.
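
As a rough illustration of what such a backend reference carries, here is a small standalone parser for the format shown in the example. It is a sketch only and not the CR’s own implementation; the function name, the “live” default workspace and the “&” / “,” separators for multiple dimensions and values are assumptions:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical parser for "<identifier or path>@<workspace>;<dimension>=<value>".
 * Assumes "&" separates dimensions and "," separates values of one dimension.
 *
 * @return array{node: string, workspace: string, dimensions: array<string, string[]>}
 */
function parseNodeReference(string $reference): array
{
    $parts = explode('@', $reference, 2);
    $workspace = 'live'; // assumed default when no context string is given
    $dimensions = [];

    if (isset($parts[1])) {
        $contextParts = explode(';', $parts[1], 2);
        if ($contextParts[0] !== '') {
            $workspace = $contextParts[0];
        }
        if (isset($contextParts[1])) {
            foreach (explode('&', $contextParts[1]) as $pair) {
                [$name, $values] = array_pad(explode('=', $pair, 2), 2, '');
                if ($name !== '') {
                    $dimensions[$name] = $values === '' ? [] : explode(',', $values);
                }
            }
        }
    }

    return ['node' => $parts[0], 'workspace' => $workspace, 'dimensions' => $dimensions];
}
```

For the example reference above this yields the identifier, the workspace user-admin and a single language dimension with the value en_US, none of which depends on a site or domain.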

This means the backend URL won’t be as “accessible” as before, but it will be more robust, and the URL for the page could still be displayed in the backend. A speaking URL should be an optional enhancement, not a requirement.


Domain
I found it odd that the current implementation accepts patterns. Although potentially useful, it currently doesn’t always work as expected and people get confused by it.

The only reason we need a host name is for linking from one site to another. In all other scenarios, using the request to determine the URL should suffice, and the current solution works for that.

Now there are two ways of solving that problem. Either add a primary flag (canonical) as you suggest, which requires that it’s a host name and not a pattern. Or we make it possible to order them and the first one with a host name is used. We can easily display a warning if a site in a multi-site setup doesn’t have at least one domain with a host name.
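
To illustrate the ordering variant, here is a minimal sketch that picks the first entry that is a plain host name. The function name is invented, and it assumes that whatever PHP rejects as a host name (e.g. a wildcard pattern) is skipped:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper for the "ordered domains" variant.
 *
 * @param string[] $orderedHostPatterns host patterns of one site, in configured order
 * @return string|null the first entry that is a plain host name, or null if there is none
 */
function firstLinkableHost(array $orderedHostPatterns): ?string
{
    foreach ($orderedHostPatterns as $hostPattern) {
        // Entries that are not valid host names (e.g. "*.example.com") are skipped
        if (filter_var($hostPattern, FILTER_VALIDATE_DOMAIN, FILTER_FLAG_HOSTNAME) !== false) {
            return $hostPattern;
        }
    }
    return null;
}
```

A null result would be the case where the warning mentioned above gets displayed. Note that a pattern sorted to the top is skipped silently here, so the effective host is not necessarily the first entry in the list.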

I’m not sure which solution is best, but I’m leaning towards the priority one from a UX perspective and since it would fit the existing concept.

Btw. there’s already a ticket for adding a primary flag to the domain entity.


Removing site from context

The current implementation to determine the site in other places than the route part handler is indeed a pragmatic solution.

Option 2 sounds reasonable to me, but I’d like to get a better overview of what the content context actually helps with.

  • Node linking?
      • Could be determined using the node hierarchy instead with a fallback as suggested further up.
  • Resource linking?
      • Resource linking should be done relatively instead of using absolute paths (unless using a CDN ofc.)
  • Node tree limiting?
      • Node tree should be bound to a context and starting path, not a site, see related issue. Multiple trees can cater for different context needs.
  • Node / asset search limiting?
      • Currently these are limited by the viewed site, however if using a context with a starting path, the same could be used here.

So in general I think we should try to come up with a solution that doesn’t involve a site, but rather a flexible context that’s being passed around (like in CR).

Maybe @radmiraal would be interested in chiming in, since he’s considering some scenarios similar to this.

Related to all this are the challenges around site-specific configuration. Currently site packages have quite a few built-in conventions, which might not be the optimal solution. But I’m not entirely certain on this topic, so having some kind of concept brainstorming would be great.


Actually there’s more to it than that: Flow in general doesn’t handle linking to different hosts. Here’s a ticket.


Sorry this isn’t a direct answer to your questions, but I think we need to take a step back to be able to make the right decisions here.

Ps. while looking into this matter I found a site dependency in CR

Full ack here. I think the default behavior should be to use the speaking URL if possible. This will make the change a bit less ‘shocky’ :stuck_out_tongue: Would a setting to completely ignore the speaking URL for the backend make sense? I think I’d like that, as I’d prefer consistency in my installs…

I definitely prefer the primary flag over ordering. If we do use ordering, I think the design / context should be really clear about which one is the primary. But as one could sort a pattern to the first position, this could become really confusing.

In short we’re going to put our content outside of a site node, and will just use site nodes to create websites that are displaying that content based on filters. So everything that removes the default site requirement is making me happy :wink:

What I’m not sure about yet is how to solve the dimension issue / specific site configuration. To me, which dimensions are used is site configuration (so that should be configurable on a site level). But I doubt whether I think the same about the dimension fallbacks… If they’re not global it might become overly complex, while I’m not even sure yet if it has a use case.

And as a reply to the original post: option 2 sounds better to me :stuck_out_tongue: If we could manage to get the Neos ContentContext out of the way completely I’d even be more happy. We try to build an application that is based on Flow / CR only but can be extended by using Neos if one would like to have a public website too. The differences in context are a possible source of headaches in the future :wink:

By now I have also moved on from the first ideas, which came from a specific problem. I also see the need for a generic bounded context. I am still unsure about the way it should behave though.

Option 1 seems more and more difficult, as it would mean that in such a bounded context you can never access other sites (or other trees in general). On the other hand, just having something like a “startingPoint” node (path) doesn’t carry much meaning. I mean, what indication would that give? And what about the other things, like domain configuration per site, etc.? I see we need to take a broad sweep here so we don’t solve one problem and maneuver two other topics into a dead end. So let’s evolve this topic a bit more and see where that leads us.

From what I see in the code, the currentSiteNode(), which is almost the only one of the three (currentSite, currentDomain, currentSiteNode) that is used, is mostly used for convenience and could easily be replaced. Probably a singleton NeosRequestService or similar providing this information would help, and maybe we can even make it not a singleton but rely on having some other node to fetch a site based on.

Also I thought about adding an extra domain entry to the site, the canonicalDomain, which expects a full domain name and scheme. Then it’s pretty clear what this is about.
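
A minimal sketch of how such a canonicalDomain value could be checked and used for a cross-site link. The function name is invented, and the only rule assumed is “scheme and host must both be present”:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: builds an absolute URI for a node in another site.
 *
 * @param string $canonicalDomain scheme plus host, e.g. "https://example.com"
 * @param string $uriPath the site-relative path of the target node
 * @throws InvalidArgumentException if scheme or host is missing
 */
function buildCrossSiteUri(string $canonicalDomain, string $uriPath): string
{
    $parts = parse_url($canonicalDomain);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        throw new InvalidArgumentException('canonicalDomain needs a scheme and a host: ' . $canonicalDomain);
    }
    // Keep an explicit port if one was configured
    $port = isset($parts['port']) ? ':' . $parts['port'] : '';

    return $parts['scheme'] . '://' . $parts['host'] . $port . '/' . ltrim($uriPath, '/');
}
```

Because the value has to carry a scheme, a bare `example.com` is rejected, which is the kind of clarity the separate entry is meant to give.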


Tackled some of it with https://github.com/neos/neos-development-collection/pull/439, but I haven’t done any cleanup of the context yet.
