The topic sprint about the “Event-Sourced Content Repository” in Kiel has started!
Here’s a brief summary of the first "meeting"¹
General Concepts
We came up with a basis for the concept at a workshop together with Mathias Verraes in December last year. This is quite a while ago, so we had to refresh our minds a little:
Editing Session
One notion we came up with is the Editing Session. It is bound to a specific user working on one workspace (remark: maybe even in one context, i.e. including dimensions!?).
The Editing Session is started as soon as the user starts to edit (remark: we need to find out a good way to enforce this, maybe we add an additional interaction step to the UX).
And it ends as soon as the changes are published or revoked (remark: that means that an Editing Session can be very short, e.g. with auto-publishing enabled, or last multiple days or weeks).
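To make the lifecycle a bit more tangible, here is a minimal sketch of how such a session could look in PHP (8.1+). All names in it (EditingSession, record(), the event type string) are made up for illustration and are not part of Neos.ContentRepository:

```php
<?php
declare(strict_types=1);

// Hypothetical sketch: none of these names exist in Neos.ContentRepository.
final class EditingSession
{
    private string $state = 'active';

    /** @var list<array{type: string, payload: array}> events recorded during the session */
    private array $events = [];

    private function __construct(
        public readonly string $userId,
        public readonly string $workspaceName,
    ) {}

    /** Started as soon as the user begins to edit; bound to one user and one workspace. */
    public static function start(string $userId, string $workspaceName): self
    {
        return new self($userId, $workspaceName);
    }

    public function record(string $eventType, array $payload): void
    {
        $this->assertActive();
        $this->events[] = ['type' => $eventType, 'payload' => $payload];
    }

    /** Ends the session and returns the events that should reach the workspace. */
    public function publish(): array
    {
        $this->assertActive();
        $this->state = 'published';
        return $this->events;
    }

    /** Ends the session and discards its events. */
    public function revoke(): void
    {
        $this->assertActive();
        $this->state = 'revoked';
        $this->events = [];
    }

    public function isActive(): bool
    {
        return $this->state === 'active';
    }

    private function assertActive(): void
    {
        if ($this->state !== 'active') {
            throw new LogicException("Editing Session already {$this->state}");
        }
    }
}
```

With auto-publishing, start(), record() and publish() would simply happen right after each other; otherwise the object (or rather its event stream) could live for days or weeks.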
Read Model
One advantage of CQRS/Event-Sourcing is that we can have multiple dedicated Read Models, suited for their use case.
But for the core API we’ll probably end up with a main Read Model built upon some proper graph implementation (see Bernhard’s prototype).
To be defined: Are changes of an Editing Session part of that Read Model or in a separate layer? (remark: an extreme case could be to replay Editing Session events in memory on user login and maintain that Read Model in the browser state)
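As a rough illustration of "multiple Read Models from the same events", the following sketch projects one event list into two different shapes. The event structure and the NodeWasCreated type are assumptions for this example only, and the adjacency list is just a tiny stand-in for a real graph implementation:

```php
<?php
declare(strict_types=1);

// Hypothetical event shape, for illustration only.
$events = [
    ['type' => 'NodeWasCreated', 'nodeId' => 'home', 'parentId' => null],
    ['type' => 'NodeWasCreated', 'nodeId' => 'blog', 'parentId' => 'home'],
    ['type' => 'NodeWasCreated', 'nodeId' => 'post-1', 'parentId' => 'blog'],
];

/** Read Model 1: adjacency list (parent id => child ids). */
function projectGraph(array $events): array
{
    $children = [];
    foreach ($events as $event) {
        if ($event['type'] === 'NodeWasCreated' && $event['parentId'] !== null) {
            $children[$event['parentId']][] = $event['nodeId'];
        }
    }
    return $children;
}

/** Read Model 2: a plain node count, e.g. for some statistics view. */
function projectNodeCount(array $events): int
{
    return count(array_filter(
        $events,
        fn (array $event): bool => $event['type'] === 'NodeWasCreated'
    ));
}
```

Both functions only read the events, so either Read Model could be thrown away and rebuilt by replaying the stream.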
Conflicts / Rebase
The current CR implementation uses Optimistic Concurrency in that the last write “wins”. This can lead to nasty side effects and even to an inconsistent state.
With the append-only nature of an Event Store we can no longer work around that problem.
Fortunately there’s a good model for what we’re trying to achieve: Git.
The current idea is that during an Editing Session, events are published into a separate (remark: possibly temporary) stream (think branch in Git).
Before the changes can be published, any intermediate changes published to the underlying workspace(s) have to be incorporated into that Editing Session stream (think rebase in Git).
Hard constraint: Trying to publish an outdated Editing Session must fail.
(remark: the rebase can happen in the "background" from time to time, e.g. upon user login).
In case of a conflict (i.e. the same node has been changed in both branches), user interaction might be needed (remark: in a first implementation we could ignore this and fall back to Optimistic Concurrency).
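The branch/rebase idea and the hard constraint could be sketched roughly like this. It is only a toy model under simplifying assumptions: Workspace and SessionStream are hypothetical names, events are plain arrays, and a "conflict" is nothing more than the same node id showing up on both sides:

```php
<?php
declare(strict_types=1);

// Hypothetical sketch; class and method names are illustrative assumptions.
final class Workspace
{
    /** @var list<array{nodeId: string, change: string}> */
    public array $events = [];

    public function version(): int
    {
        return count($this->events);
    }
}

final class SessionStream
{
    /** @var list<array{nodeId: string, change: string}> */
    private array $events = [];
    private int $baseVersion;

    public function __construct(private Workspace $workspace)
    {
        // Remember where we "branched off" the workspace stream.
        $this->baseVersion = $workspace->version();
    }

    public function append(string $nodeId, string $change): void
    {
        $this->events[] = ['nodeId' => $nodeId, 'change' => $change];
    }

    /** Incorporates intermediate workspace changes; returns conflicting node ids. */
    public function rebase(): array
    {
        $intermediate = array_slice($this->workspace->events, $this->baseVersion);
        $conflicts = array_values(array_unique(array_intersect(
            array_column($intermediate, 'nodeId'),
            array_column($this->events, 'nodeId')
        )));
        if ($conflicts === []) {
            $this->baseVersion = $this->workspace->version();
        }
        return $conflicts;
    }

    /** Hard constraint: publishing an outdated session must fail. */
    public function publish(): void
    {
        if ($this->baseVersion !== $this->workspace->version()) {
            throw new RuntimeException('Editing Session is outdated, rebase first');
        }
        $this->workspace->events = array_merge($this->workspace->events, $this->events);
        $this->events = [];
        $this->baseVersion = $this->workspace->version();
    }
}
```

If rebase() returns a non-empty list, the session stays outdated and publish() keeps failing; that would be the point where user interaction (or a simpler fallback strategy) comes in.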
Migration path
There are many ways we could approach this beast.
We will probably have to adjust along the way, but for now we decided to go the following route:
- Create a new branch in the neos-development-collection
- Keep the current PHP API of the Neos.ContentRepository package and replace implementation piece by piece
- Adapt the Neos importer so that it converts the current XML format to NodeWasImported events (wrapped in some kind of Editing or Importing Session)
- Provide a (GraphQL) API for HTTP
To be found out: Do we start with a projector that generates the current database structure, or can we immediately “bend” the PHP API (including Flow Query, …) to use the graph-based structure?
Also to be found out: The “Active Record” kind of way the NodeInterface behaves today might cause problems.
We will deprecate at least the mutating methods in favor of proper Commands, but it will be a challenge (if not impossible) to provide a (PHP) API that behaves as if it were synchronous while adding support for asynchronicity from the start.
For example, continuing to support something like this won’t be easy to achieve:
$node = $this->context->getNode('/some/path');
$node->createNode('Child', $someNodeType);
$childNodes = $node->getChildNodes();
// ...
One approach might be to work with some kind of Promises that block the code until some projection has processed a given command… But maybe it’s more feasible to break compatibility here in order to make the asynchronous nature explicit…
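Just to illustrate the "blocking" variant, here is a small sketch of how a command result could wait for a projection to catch up. EventStore, ChildNodesProjection and CommandResult are invented names, and the projection is driven from within the same process here, whereas in a real setup a separate worker would do the projecting while the caller polls or sleeps:

```php
<?php
declare(strict_types=1);

// Hypothetical sketch: all class names below are made up for illustration.
final class EventStore
{
    /** @var list<array{parent: string, name: string}> */
    public array $events = [];

    /** Appends an event and returns its (1-based) sequence number. */
    public function append(array $event): int
    {
        $this->events[] = $event;
        return count($this->events);
    }
}

final class ChildNodesProjection
{
    public int $lastProcessed = 0;

    /** @var array<string, list<string>> parent path => child node names */
    public array $children = [];

    /** Processes at most one pending event; stands in for an asynchronous worker. */
    public function catchUpOneStep(EventStore $store): void
    {
        if ($this->lastProcessed < count($store->events)) {
            $event = $store->events[$this->lastProcessed];
            $this->children[$event['parent']][] = $event['name'];
            $this->lastProcessed++;
        }
    }
}

final class CommandResult
{
    public function __construct(
        private int $sequenceNumber,
        private EventStore $store,
        private ChildNodesProjection $projection,
    ) {}

    /** Blocks until the projection has seen the event written by the command. */
    public function block(int $maxAttempts = 100): void
    {
        for ($i = 0; $this->projection->lastProcessed < $this->sequenceNumber; $i++) {
            if ($i >= $maxAttempts) {
                throw new RuntimeException('Projection did not catch up in time');
            }
            // A real implementation would sleep/poll here instead.
            $this->projection->catchUpOneStep($this->store);
        }
    }
}

// Usage: the read side is only consistent after block() has returned.
$store = new EventStore();
$projection = new ChildNodesProjection();
$sequenceNumber = $store->append(['parent' => '/some/path', 'name' => 'Child']);
$result = new CommandResult($sequenceNumber, $store, $projection);
$result->block();
```

The explicit block() call is the important bit: a caller that skips it may read stale child nodes, which is exactly the asynchronous behavior that the current API hides.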
¹ Unfortunately I couldn’t join the sprint face-to-face, but my beloved teammates were so kind as to invite me via Slack call.