Irregular database error: "The optimistic lock on an entity failed"

Hey guys,

the Neos website of one of our customers fails at irregular intervals for a few minutes with the error message: “The optimistic lock on an entity failed”.
A self-developed ViewCounter is integrated on the website, which counts up a corresponding integer property of the current document node when called. According to the stack trace, the error occurs as soon as the counter wants to save its changes. (see the following portion of the exception log for reference)

Exception in line 483 of .../Packages/Libraries/doctrine/orm/lib/Doctrine/ORM/Persisters/Entity/BasicEntityPersister.php: The optimistic lock on an entity failed.

41 Doctrine\ORM\OptimisticLockException::lockFailed(Neos\ContentRepository\Domain\Model\NodeData)
40 Doctrine\ORM\Persisters\Entity\BasicEntityPersister::updateTable(Neos\ContentRepository\Domain\Model\NodeData, "neos_contentrepository_domain_model_nodedata", array|2|, TRUE)
39 Doctrine\ORM\Persisters\Entity\BasicEntityPersister::update(Neos\ContentRepository\Domain\Model\NodeData)
38 Doctrine\ORM\UnitOfWork::executeUpdates(Neos\Flow\Persistence\Doctrine\Mapping\ClassMetadata)
37 Doctrine\ORM\UnitOfWork::commit(NULL)
36 Doctrine\ORM\EntityManager::flush()
35 Neos\Flow\Persistence\Doctrine\PersistenceManager_Original::persistAll()
34 Vendor\Site\Fusion\CounterImplementation_Original::evaluate()   <- This is the view counter implementation
...

The page counter component was developed by a former colleague, so we can’t ask him about it anymore, but at first sight I was very surprised that $this->persistenceManager->persistAll() is called manually. Since this call is the origin of the error message, I suspect the problem here, but am not quite sure.
If a NodeData model is changed, the PersistenceManager should recognize this as a data change and automatically trigger the saving process when shutting down the framework, or am I wrong? So could you delete the mentioned line to fix the problem or is there another possible cause?

Apart from that, I am also interested in the actual explanation for the error. Based on the information I have, I don’t quite get what’s wrong here. What exactly does the error message “The optimistic lock on an entity failed” mean? On the web I have only found very vague information, according to which some version does not match between database and application, but that didn’t help me. Besides, I don’t understand why the problem only happens once in a while.

I hope you can help me out in some way and thanks in advance for your participation.
Best regards, Max

Hey Maximilian,

I’m trying to shed some light on this :slight_smile:

Locking

“Locking” is a term used for a set of concepts to ensure that only one participant writes data at one point in time, i.e. you don’t accidentally override what others just did. There are various ways to implement locking, i.e. optimistic and pessimistic locking.

The basic idea is that instead of user A overriding data for user B (silently), you want to detect this scenario.

Optimistic Locking

Optimistic Locking works in the following way:

  • each object has some kind of “counter” in the database, which is incremented whenever it’s written to this object. (a “modified timestamp” etc will also work)
  • when reading from the DB, you remember the current state of that counter in your object.
  • then, you modify your object.
  • when you write your object back, the write will only succeed in case the counter is still the one you have read previously. So that means, here are two cases now:
    • Case 1: Nobody touched the object in the meantime; i.e. the counter is still at the value you read initially. All works; on writing, you set the new value and update the counter.
    • Case 2: Somebody else updated the object in the meantime; i.e. the counter will be different than what you read. In this case the system throws an exception with the message pointed out above.

Your scenario

  • this means, whenever two people access a page in your system “almost simultaneously” so that the above scenario with concurrent writes happens, one of them will get the exception from above.
  • that explains why it only happens from time to time.

Options to solve the problem

  • (workaround): you can wrap the call to the PersistenceManager with try/catch; and simply ignore the error. This will lead to some non-exact visitor counts, as when two people visit concurrently, only one will be counted
  • ideally, a visitor count should be implemented differently; i.e. by storing the counter in a separate database table. Then, you could use direct SQL (so no doctrine ORM) to run a query like UPDATE myCount = myCount + 1 WHERE page = ? - This will atomically increment the counter, and the database will take care of this.

I hope this somehow explains it, and gives some background :slight_smile: If some parts are still unclear, don’t hesitate to ask and I’ll do my best to answer!

All the best,
Sebastian

1 Like

Good morning Sebastian,

first of all thank you very much for the detailed answer, I think I have now understood the subject.
The idea with the try catch block already came to me, but I think I’ll try your second suggestion with the direct database access. Thanks also for that, it helped me a lot. :blush:

Best regards, Max

1 Like

In addition to Sebastians great explanation you could have a look at a similar thread that discussed different ways to approach what you are trying to achieve: [SOLVED] Counter for survey or page views

That’s really an interesting discussion. I look at it in detail and see if I can completely rebuild the component based on these information. It seems to me that during our initial development of the counter many aspects were only solved improperly, which obviously leads to problems now.

Thank you very much for your participation.

1 Like

Hey Maximilian,

glad that our responses helped :slight_smile: Don’t hesitate to ask further questions :upside_down_face:

All the best,
Sebastian
PS: Are you aware of Neos Conference in May? Might contain some useful topics :slight_smile: (at least we try hard :wink: )

Hey Sebastian,

yes, I know about the Neoscon, but unfortunately I cannot participate personally this time. But I will follow the livestream as eagerly as last year and if possible I will listen to all talks. Next year I’ll hopefully finally participate there in person, but one way or the other I really look forward to it.

Best regards, Max

I strongly suggest to disable and rethink that view counter, it sounds very dangerous and I have seen similar ideas with devastating effects.

Hey @christianm,

do you have any specifics in mind? Would be interesting if you could share them :slight_smile:

All the best,
Sebastian

Yes of course.

Sebastian explained the problems of the optimistic Lock in NodeData alreaady, that is the main reason it shoudln’t be done in NodeData.

Additionally when you think about workspaces the whole thing just implodes, because what will happen. Let’s say I have a page in my workspace for a week. In the meantime the live node counter is updated a lot of times. Then when I finally publish to live from my workspace the counter is suddenly back to the value it was at a week ago.

Ideally don’t implement a visitor counter at all. If you really must have one, I would look into web statistics tools like Google Analytics or Piwik and actually fetch the visitor count from them.

I personally would stop at this point but then if you really must, do as sebastian suggested, take a separate table and do the direct query updates.

1 Like

Thanks @christianm
really good explanation :slight_smile: :slight_smile: