Bi-directional ManyToMany relation

Hi everybody,

I’m trying to create a bi-directional ManyToMany Relation, but I’m stuck as every way I’ve tried so far simply does not work. I feel quite stupid right now as I couldn’t find anyone with that problem so far and couldn’t find anything in the docs either.

As an example consider the following classes:

/**
 * @Flow\Entity
 */
class Testuser
{
    /**
     * @var integer $id
     *
     * @ORM\Column(name="id", type="integer", nullable=false)
     * @ORM\Id
     * @ORM\GeneratedValue(strategy="IDENTITY")
     */
    protected $id;

    /**
     * @var string $name
     */
    protected $name;

    /**
     * @var \Doctrine\Common\Collections\Collection<\[...]\Testgroup>
     * @ORM\ManyToMany
     */
    protected $groups;

}

/**
 * @Flow\Entity
 */
class Testgroup
{
    /**
     * @var integer $id
     *
     * @ORM\Column(name="id", type="integer", nullable=false)
     * @ORM\Id
     * @ORM\GeneratedValue(strategy="IDENTITY")
     */
    protected $id;

    /**
     * @var string $name
     */
    protected $name;

    /**
     * @var \Doctrine\Common\Collections\Collection<\[...]\Testuser>
     * @ORM\ManyToMany
     */
    protected $users;

}

If I defined them just like this, Flow will create two tables in the database. This is of course not what I want. If I explicitly set the doctrine annotations on one side (or on both sides fot that matter), the software will work, but a doctrine:update will complain that the join table already exists.
So basically my question is: what’s the intended way to model a bi-directional ManyToMany relation like this in Flow?

First of all: No need to feel stupid! Handling relations correctly is in fact one of the hard parts, even though it might look easy on first sight. There are apparently some gotchas. :slight_smile:

Second: Unless you really really need it, avoid using bidirectional relations. They cause a lot of trouble in a lot of cases that could be prevented, because you actually don’t ever need to traverse the object graph in both directions (often you can instead replace single occurences of this need with a custom repository query). See also https://flowframework.readthedocs.io/en/stable/TheDefinitiveGuide/PartIII/Persistence.html#on-unidirectional-onetomany-relations

Third (in case you really do need the bidirectional relation):
A bidirectional ManyToMany relation always requires a JoinTable. It’s impossible to model such a relation in SQL otherwise (without hacks like comma-separated-fields… which no, just don’t). See e.g. https://dzone.com/articles/how-to-handle-a-many-to-many-relationship-in-datab for a bit of an explanation.
Also, a bidirectional relation in doctrine requires you to define one side as the so called “owning side”. This means that when you persist the object graph, you need to update that model in it’s repository. It defines how doctrine will traverse the object graph to find objects it needs to persist, and it will only traverse in one direction. So if you do this wrongly, your relation will not be persisted.
See Association Mapping - Doctrine Object Relational Mapper (ORM) and https://www.doctrine-project.org/projects/doctrine-orm/en/2.9/reference/unitofwork-associations.html#association-updates-owning-side-and-inverse-side for more information.

Now assuming your Testgroup is the logical owning side in your domain model, the annotations should look roughly like that:

/**
 * @Flow\Entity
 */
class Testuser
{
    /**
     * @var integer $id
     *
     * @ORM\Column(name="id", type="integer", nullable=false)
     * @ORM\Id
     * @ORM\GeneratedValue(strategy="IDENTITY")
     */
    protected $id;

    /**
     * @var string $name
     */
    protected $name;

    /**
     * @var \Doctrine\Common\Collections\Collection<\[...]\Testgroup>
     * @ORM\ManyToMany(mappedBy="users")
     */
    protected $groups;

}

/**
 * @Flow\Entity
 */
class Testgroup
{
    /**
     * @var integer $id
     *
     * @ORM\Column(name="id", type="integer", nullable=false)
     * @ORM\Id
     * @ORM\GeneratedValue(strategy="IDENTITY")
     */
    protected $id;

    /**
     * @var string $name
     */
    protected $name;

    /**
     * @var \Doctrine\Common\Collections\Collection<\[...]\Testuser>
     * @ORM\ManyToMany(inversedBy="groups")
     */
    protected $users;

}

However, that means that to persist a user’s groups, you have to do something like this:

$user->addGroup($someNewlyAssignedGroup);
$someNewlyAssignedGroup->addUser($user); // could be done inside the `addGroup` method
$testgroupsRepository->update[/add]($someNewlyAssignedGroup);

This in turn means that doctrine will potentially (see lazy loading collections - optimally use extra_lazy) iterate all users belonging to this group and check them for changes to be persisted, which could be a huge performance issue. You can prevent this by using lazy, or even better extra_lazy loading for those collections (see Extra Lazy Associations - Doctrine Object Relational Mapper (ORM)), but you see bidirectional relations quickly start to become tricky to deal with.

Fourth and just as a sidenote: I suggest to avoid “IDENTITY” or “AUTO” (they are the same on MySQL) strategy identifiers. The reason is many-fold, but just to name a few:

  • when inserting new objects, for every single object the database needs to be hit at least twice - once for inserting the object, a second time for retrieving the DB assigned identity in order to have it available in the object model. Combined with a potentially huge collection/batch insert, this can quickly become a bottleneck
  • you have to deal with guessable identities in your application if you somewhere use the identity in your URLs. Also, you disclose the amount of objects in your database
  • inside the DB an auto-inc field requires a coordination and serialisation of all writes. This in turn means that multi-writer setups become slower than they could be and writes in general can not be parallelized (would lead to duplicate primary keys)
  • when you eventually need to deal with backups and import old data into a still running system, you will have to deal with potentially duplicate keys and manually fix them. Also you can’t just take a dump from one server and import it on another (e.g. a testing instance) without completely overwriting everything there.

All those and more are the reasons why Flow uses UUID identifiers by default. So unless you are dealing with a legacy DB schema that you need to map in your application, rather stick with Flow’s default or manually assign a UUID on entity instanciation, but don’t let the DB generate an ID for you.
See https://www.clever-cloud.com/blog/engineering/2015/05/20/why-auto-increment-is-a-terrible-idea/
and
The case against auto-increment in MySQL |Official Pyhian blog
for more in-depth explanation on why auto-inc is not a good idea.

Hope this helps you in your endeavour!

1 Like

Hi Alexander,

thanks very much for your help and detailed explanations. Adding the mappedBy / inversedBy Information without the Doctrine mapping annotations acutally was something I hadn’t tried yet. Still, this led to an exception within Doctrines SqlWalker which I didn’t really want to investigate any further (probably some issue within a query), so I decided to follow your advice and overthink if I really need a bidirectional relation and dropped it.

So basically: problem solved (and yes, it’s a legacy database, but still thanks for the links - it’s always good to reiterate some things from time to time) - and thanks a lot for all the work you obviuosly put into the post. It’s much appreciated!

1 Like