Event Serialization and PII Data (GDPR)
Last updated
Last updated
Ecotone use Converters in order to convert Events into serializable form. This means we can customize process of serializing and deserializing specific Events, to adjust it to our Application.
So let's assume UserCreated Event:
If we would want to change how the Event is serialized, we would define Converter
Then the Event Stream would look like above
This basically means we can serialize the Event in the any format we want.
Having customized Converters for specific Events, is also useful when we need to adjust some legacy Events to new format. We can hook into the deserialization process, and modify the payload to match new structure.
When using JMS Converter support, we can even customize how we want to serialize given class, that is used within Events.
For example we could have User Created Event which make use of UserName
class.
the UserName
would be a simple Class which contains of validation so the name is not empty:
Now if we would serialize it without telling JMS, how to handle this class we would end up with following JSON in the Event Stream:
Now this is fine for short-lived applications and testing, however in the long living application this may become a problem. The problem may come from changes, if we would simply change property name in UserName.value
to UserName.data
it would break deserialization of our previous Events.
As data
does not exists under name
key.
Therefore we want to keep take over the serialization of objects, to ensure stability along the time.
Now with above Converter, whenever we will use UserName
class, we will be actually serializing it to simple string type, and then when deserialize back from simple type to UserName class:
With this, with few lines of code we can ensure consistency across different Events, and keeping our Events bullet proof for code refactor and changes.
In case of storing sensitive data, we may be forced by law to ensure that data should be forgotten (e.g. GDPR). This basically means, if Customer will ask to us to remove his data, we will be obligated by law to ensure that this will happen.
However in case of Event Sourced System we rather do not want to delete events, as this is critical operation which is considered dangerous. Deleting Events could affect running Projections, deleting too much may raise inconsistencies in the System, and in some cases we may actually want to drop only part of the data - not everything. Therefore dropping Events from Event Stream is not suitable solution and we need something different.
Solution that we can use, is to change the way we serialize the Event. We can hook into serialization process just as we did for normal serialization, and then customize the process. Converter in reality is an Service registered in Dependency Container, so we may inject anything we want there in order to modify the serialization process.
So let's assume that we want to encrypt UserCreated
Event:
So what we do here, is we hook into serialization/deserialization
process and pass the data to EncryptionService
. As you can see here, we don't store the payload here, we simply store an reference in form o a key.
EncryptionService can as simple as storing this data in database table using key as Primary Key, so we can fetch it easily. It can also be stored with encryption in some cryptographic service, yet it may also be stored as plain text. It all depends on our Domain.
However what is important is that we've provided the resource id to the EncryptionService
Now this could be used to delete related Event's data. When Customer comes to us and say, he wants his data deleted, we simply delete by resource:
That way this Data won't be available in the System anymore. Now we could just allow Converters fails, if those Events are meant to be deserialized, or we could check if given key exists and then return dummy data instead.
If we allow Converters to fail when Serialization happens, we should ensure that related Projections are using simple arrays instead of classes, and handle those cases during Projecting. If we decide to return dummy data, we can keep deserializing those Events for Projections, as they will be able to use them.