Flow object inconsistencies

christianm · October 11, 2016, 5:25pm

Hey,

I recently looked at some API implementations and noticed that we have a bunch of inconsistencies built into the whole thing:

PropertyMapper accepts __identity and __type as “magic” properties to load a specific entity (and / or create one with a specified class).

Now the JsonView understands the following configuration options:
_exposeObjectIdentifier (which will by default expose the identity/identifier as __identity). That’s great, although we mix “identity” and “identifier”.

_exposeClassName, which will expose the specific PHP class with the key __class. That key cannot be configured differently, so at this point a data round trip with __type is not possible because neither side can be configured to accept another key.
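To make the round-trip problem concrete, here is a minimal, language-neutral sketch in Python. It is not Flow code: `serialize` and `deserialize` are hypothetical stand-ins for the view and the converting side, and `Acme\Book` is a made-up class name. Only the key names `__identity`, `__class` and `__type` come from the description above.

```python
# Hypothetical stand-ins, not Flow APIs: only the key names come from the post.

def serialize(identifier, class_name):
    """Mimics a view that exposes the identifier and the class name."""
    return {"__identity": identifier, "__class": class_name}


def deserialize(payload):
    """Mimics a consumer that only looks at __identity and __type."""
    return {
        "identifier": payload.get("__identity"),
        "target_type": payload.get("__type"),  # "__class" is never read
    }


sent = serialize("42", "Acme\\Book")
received = deserialize(sent)
```

In this sketch the identifier survives the round trip, but the type information is lost because the two sides disagree on the key.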

Additionally we have the f:format.identifier() viewhelper which will expose the “identifier” / “identity” and again we mix both words.

Same goes for the PersistenceManagerInterface which declares getIdentifierByObject and getObjectByIdentifier but also convertObjectToIdentityArray (which internally calls getIdentifierByObject).

My point being this seems inconsistent to me and makes it hard to build round trip APIs and find the right method/VH names.

If you have good reasons why the naming is exactly right in all these places, let me know. Otherwise I would like to discuss which term to keep and how we can make that consistent across everything.

goli · October 12, 2016, 12:35pm

Hey there.

Here’s the way I think about identities and identifiers. To be clear: those are my personal feelings about them. I know there are other views, and I’m perfectly fine with that.

To me, an “identity” completely defines the whole thing, whereas the “identifier” only targets a distinct one out of a given set. This gets a little irrelevant when using UUIDs, but still. Think back to the olden days, when a MySQL-based auto_increment field would be a good identifier pointing to a single record of a database. But completely defining the whole thing means at least mentioning the database table name as well.

So, asking a certain repository for “findByIdentifier” is good since the target type is clear based on the repository. But asking the persistence manager about “getObjectByIdentifier” is bad because the identifier might be ambiguous in the world of all entities being retrievable by that persistence manager.

This leads to questioning data types of identifiers and identities as a side note. An identity to me clearly consists of at least two values, the type name and the identifier. So the identity to me is multi valued. Even though it can be converted to a scalar value, that’s not its natural type in the first place. The identifier, in turn, of course can be multi valued. But the most common format of the identifier would be a string or integer, so this one is perfectly fine to be a scalar right from the beginning.
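A small Python sketch of that distinction, under the assumption that an identity is a (type name, identifier) pair. All names here (`Identity`, `store`, the two lookup functions) are hypothetical and only borrow the naming of findByIdentifier and getObjectByIdentifier for illustration.

```python
from typing import NamedTuple


class Identity(NamedTuple):
    """Hypothetical: type name plus identifier, as described above."""
    type_name: str
    identifier: int


# Two "tables" with auto-increment ids: the identifier 1 exists in both.
store = {
    Identity("Book", 1): "some book",
    Identity("Author", 1): "some author",
}


def find_by_identifier(type_name, identifier):
    """Repository-style lookup: the type is fixed, so the result is unique."""
    return store.get(Identity(type_name, identifier))


def get_object_by_identifier(identifier):
    """Manager-style lookup across all types: may match several objects."""
    return [obj for identity, obj in store.items()
            if identity.identifier == identifier]
```

With auto-increment identifiers, the repository-style lookup returns exactly one object, while the lookup across all types returns two candidates for the identifier 1. With UUIDs the second function would practically always return at most one match, which is why the distinction matters less there.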

And as stated before: I know there are others thinking differently. The most common way of thinking differently about it is to call the identifier the materialization of the concept of things having an identity, like the number printed on my passport, which is an identifier, but my identity is the person I am and the way I interact with my environment.

Regards,
Stephan.

bwaidelich · October 18, 2016, 9:05am

I think the terms are defined in DDD:
An entity has an identity and we can refer to it in code via its technical identifier.

I think at least in the documentation we got that right.

IMO it was a mistake to call the hidden field __identity back then. I think the reasoning was that it could be something different from the technical identifier that unambiguously identifies the object in question, but I think that just adds confusion.

BTW, a side note re the PropertyMapper accepting __identity and __type as “magic” properties:

It’s not the PropertyMapper that accepts those magic strings but certain TypeConverters and I think that’s bad practice. The ArrayFromObjectConverter even sets those properties in the resulting array which has caused some trouble…

IMO we should always use a 2nd channel, namely the PropertyMapperConfiguration to change the behavior. But that’s a different topic ofc

aberl · October 18, 2016, 10:42pm

I understood the concept of identifier vs. identity inside Flow like this:

“identifier” is the technical identifier as Bastian already stated. It is an immutable property that uniquely identifies the entity (in the persistence). If not specified by @ORM\Id Annotation(s), it is a UUID AOP’d as Persistence_Object_Identifier
“identity” can be any combination of fields that uniquely identify the entity. They are not technical but have a domain meaning and hence can be used e.g. as a slug (as done in routing).
Only properties annotated as @Flow\Identity are considered part of the identity inside the class schema, not the technical identifier.

The identity vs. identifier difference is also used in the UniqueEntityValidator, for example, where it checks for the existence of entities in the repository with the same identity properties but a different identifier. That would mean someone is attempting to create a duplicate entity (with a new identifier).

Maybe a more practical example:
Books have an identity of (title, author[, year]) and an identifier of the respective ISBN.

The ISBN is a number not directly calculated from the identity, so you could attempt to publish a book with the same title, author (and year) but different ISBN, which should not be allowed. Likewise, publishing a book with different title and same ISBN is not allowed either. The identity has a meaning in the domain of book reading, the ISBN only in the registry (or if you want to find a specific book quickly).
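The book example and the uniqueness check can be sketched like this in Python. This is only an illustration of the check as described above, not the actual UniqueEntityValidator implementation. The `Book` class, `is_duplicate`, and the sample values are all made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    isbn: str    # technical identifier
    title: str   # identity property
    author: str  # identity property

    def identity(self):
        """The domain identity: the combination of identity properties."""
        return (self.title, self.author)


def is_duplicate(candidate, existing):
    """True if another book has the same identity but a different identifier."""
    return any(
        other.identity() == candidate.identity() and other.isbn != candidate.isbn
        for other in existing
    )


shelf = [Book("isbn-1", "Some Title", "Some Author")]
same_book = Book("isbn-1", "Some Title", "Some Author")
clone = Book("isbn-2", "Some Title", "Some Author")
other_book = Book("isbn-3", "Other Title", "Some Author")
```

Here `clone` would be rejected because it repeats the identity of an existing book under a new identifier, while `same_book` (same identifier) and `other_book` (different identity) pass this particular check. Rejecting a reused ISBN with a different title would be a separate uniqueness check on the identifier itself.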

Now of course that’s just my understanding of the current implementation, which, as stated already, is pretty ambiguous in several places. Reading the other posts, it seems there is a very different mental model of those two concepts in everyone’s head.

So I’m all in for clearing that up and getting rid of the ambiguous, intermixed use of “identity” and “identifier”. We should be more consistent in the wording and be explicit in cases where we deal with both concepts.

Regarding the other topic of making __type vs. __class more consistent: I’d totally buy into making either/both configurable and hell, I’d even consider changing either one to the other by default for the next major.
So it would be breaking, but could be configured to non-breaking behaviour.
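As a rough illustration of what “configurable” could mean, here is a Python sketch with hypothetical function and parameter names (`to_array`, `from_array`, `type_key`). It does not reflect any existing Flow option. It only shows that a round trip works as soon as one side can be told which key to use.

```python
# Hypothetical sketch: "type_key" is an invented option, not an existing setting.

def to_array(identifier, class_name, type_key="__class"):
    """Exposes the class name under a configurable key (default: __class)."""
    return {"__identity": identifier, type_key: class_name}


def from_array(payload, type_key="__type"):
    """Reads the target type from a configurable key (default: __type)."""
    return (payload.get("__identity"), payload.get(type_key))


# Current defaults: the type is lost on the way back.
lossy = from_array(to_array("42", "Acme\\Book"))

# One side reconfigured: the round trip keeps the type.
intact = from_array(to_array("42", "Acme\\Book", type_key="__type"))
```

Changing one of the defaults in a major release would make the first call behave like the second, and the parameter would let existing consumers keep the old key.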

bwaidelich · October 19, 2016, 2:01pm

That’s just the fallback. You can of course have an identifier property annotated @ORM\Id, too