Introduction
This section explains how to delete documents from Shopware with the output-module. The following entities/documents are deletable:
Document | Shopware Table | Delete Modes |
---|---|---|
Category | category | soft/hard |
Customer | customer | soft/hard |
Product | product | soft/hard |
PropertyOption | property_group | hard |
PropertyValue | property_group_option | hard |
Other document types are not necessarily excluded from deletion. Please contact us if you need to delete other document types.
Delete Modes
There are two delete modes available: "hard" and "soft". While a "hard delete" deletes the entity from Shopware, a "soft delete" will
set the target entity to inactive. Since not every entity provides an active field in Shopware, the soft delete is not supported for
every document type.
Delete Mode Configuration
The delete-mode must be configured individually for each subsection. Example:
{
  "...": "...",
  "subsections": {
    "subsectionName": {
      "deleteMode": "{DELETE_MODE}"
    }
  }
}
The module does not delete any entities unless deletion is explicitly authorized via configuration. If you do not want to delete documents for
a subsection, you can simply omit the deleteMode entry (or enter a value that is not equal to "hard" or "soft").
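The opt-in rule above can be sketched as follows. This is only an illustration, not the module's code: the function name resolve_delete_mode and the dictionary-based configuration are assumptions, as is the exact (case-sensitive) string comparison.

```python
from typing import Optional

# Hypothetical helper for illustration; not part of the module's API.
VALID_DELETE_MODES = ("hard", "soft")


def resolve_delete_mode(subsection_config: dict) -> Optional[str]:
    """Return "hard" or "soft" if deletion is authorized, otherwise None."""
    mode = subsection_config.get("deleteMode")
    # Assumption: only the exact strings "hard" and "soft" enable deletion.
    if isinstance(mode, str) and mode in VALID_DELETE_MODES:
        return mode
    # Missing entry or any other value: nothing is deleted for this subsection.
    return None
```

With this sketch, an empty subsection configuration and a value such as "archive" both resolve to None, so no deletion would take place.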
Delete Timestamps
As soon as a document contains a value for the deletedAt timestamp, it will be handled by the module as deleted. Even if the document
contains a value for updatedAt or createdAt that is newer than the value for deletedAt, it will still be handled as deleted.
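A minimal sketch of this precedence rule, assuming documents are available as dictionaries with ISO 8601 timestamp strings. The function name is hypothetical, and the validation of the timestamp format is left out.

```python
def is_marked_deleted(document: dict) -> bool:
    """A document counts as deleted as soon as deletedAt carries a value."""
    # updatedAt and createdAt are intentionally ignored: a newer value
    # in either field does not revive the document.
    return bool(document.get("deletedAt"))


document = {
    "createdAt": "2023-01-01T00:00:00Z",
    "deletedAt": "2023-06-01T00:00:00Z",
    "updatedAt": "2024-01-01T00:00:00Z",  # newer than deletedAt
}
print(is_marked_deleted(document))  # True
```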
Delta Deletion
Removing a document from your data set does not lead to its deletion in Shopware. There is no delta-check in place whatsoever. In
order to delete a document from Shopware, you have to set a valid value for the field deletedAt.
Heuristic Anomaly Detection
This feature prevents the unexpected deletion of large amounts of data at once. The main purpose is to intercept possible errors of
input-modules that mistakenly provide all documents with a deletedAt timestamp - e.g. caused by incorrect delete commands from a
data source that are not detected by the input module.
Overview
So how does the anomaly detection work? From the user's point of view, only the following summary is relevant:
- The module analyzes each delete-progress and detects the amount of documents that will actually be deleted from Shopware (documents already deleted in previous executions are respected in the anomaly calculations).
- If an anomaly is detected (more than n % of the shop's entities would be deleted), the module will stop and persist a confirmation string.
- This confirmation string must be added to the configuration at lockouts.deleteAnomalyConfirmation.
- During the next execution, the anomaly detection is suppressed for the affected subsection.
- The persisted string is deleted when the input is correct, so the anomaly detection is active again during the next flow execution.
If you are interested in the details, you can read the section "Details - How Delete Anomalies are Detected" below.
Configuration
{
  "lockouts": {
    "deleteAnomalyConfirmation": "..."
  }
}
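The confirmation step can be pictured with the following sketch. It assumes that the module suppresses the detection when the configured value equals the persisted confirmation string. The text does not describe how the string is generated or compared, so the function name and the equality check are purely illustrative.

```python
from typing import Optional


def anomaly_confirmed(config: dict, persisted_confirmation: Optional[str]) -> bool:
    """Return True if the user has confirmed the detected anomaly."""
    if not persisted_confirmation:
        # No anomaly was persisted, so there is nothing to confirm.
        return False
    configured = config.get("lockouts", {}).get("deleteAnomalyConfirmation")
    # Assumption: confirmation means the configured value matches exactly.
    return configured == persisted_confirmation
```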
Threshold
The default threshold used by the module is 75%. The threshold defines the percentage of entities from the target shop that have to be deleted from Shopware in order to stop the execution.
Example: If >=75% of all products will be deleted you have to confirm this via configuration.
You can set the threshold globally via configuration:
{
  "deleteAnomalyThresholdPercentage": 75
}
You can also override this value specifically for each subsection:
{
  "...": "...",
  "subsections": {
    "subsectionName": {
      "deleteAnomalyThresholdPercentage": 35
    }
  }
}
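The precedence of the two settings can be sketched like this, assuming the configuration is available as a nested dictionary. The helper name resolve_threshold is hypothetical; only the key names and the default of 75 come from the text above.

```python
DEFAULT_THRESHOLD_PERCENTAGE = 75  # module default according to the text
THRESHOLD_KEY = "deleteAnomalyThresholdPercentage"


def resolve_threshold(config: dict, subsection_name: str) -> float:
    """Subsection value wins over the global value, which wins over the default."""
    subsection = config.get("subsections", {}).get(subsection_name, {})
    if THRESHOLD_KEY in subsection:
        return subsection[THRESHOLD_KEY]
    return config.get(THRESHOLD_KEY, DEFAULT_THRESHOLD_PERCENTAGE)


config = {
    "deleteAnomalyThresholdPercentage": 60,
    "subsections": {"subsectionName": {"deleteAnomalyThresholdPercentage": 35}},
}
print(resolve_threshold(config, "subsectionName"))   # 35
print(resolve_threshold(config, "otherSubsection"))  # 60
```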
Details - How Delete Anomalies are Detected
Mapping subsections that support the deletion of entities contain the subsections delta-count and anomaly-check. Those subsections,
as well as the progress initialization, are part of the delete anomaly detection process.
The following sections explain each step of the anomaly detection.
Step 1 - Analyze Progress
The subsection init-progress does two things in order to prepare the anomaly detection:
- initializes the delete-progresses with the number of documents with a valid deletedAt timestamp
- compares each delete-progress with the number of entities available in Shopware
By comparing the delete-progress with the entity count in Shopware, the module is able to detect "potential delete anomalies".
A potential anomaly is present when the amount of documents to delete is higher than the configured threshold. This is what we call
a "progress anomaly". Note that we do not know how many documents are actually going to be deleted yet - so we do not know if the
"progress anomaly" is actually a "delete anomaly". We have to make a distinction between progress- and delete-anomalies due to a
very common scenario for full-imports:
- Module runs full-imports every N days
- The transfer database contains 900 products with a deletedAt timestamp, Shopware contains 1000 products
- The init-progress subsection finds a potential delete anomaly: 90% of all products are potentially going to be deleted
- But: 899 of the 900 products are already deleted by previous runs, the actual amount to delete is 1
- That is why the heuristic detection cannot take place after the progress initialization - the detection needs access to the cached entities in order to calculate the actual number of documents for which a Shopware entity is (still) present
To avoid requiring the user to confirm potential anomalies on every full import, there are two additional subsections in
place: delta-count and anomaly-check. They are responsible for calculating the amount of entities that will actually be deleted in the
current flow execution.
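The numbers from the full-import scenario can be worked through as follows. The sketch assumes the default threshold of 75% and a strict "higher than" comparison, as described in this step (the example in the Threshold section states ">=75%", so the exact boundary behavior is not confirmed here). The function name and the handling of an empty shop are assumptions.

```python
def exceeds_threshold(to_delete: int, entities_in_shop: int, threshold: float = 75.0) -> bool:
    """Check whether the share of entities to delete is higher than the threshold."""
    if entities_in_shop == 0:
        # Assumption: an empty shop never triggers an anomaly.
        return False
    return 100.0 * to_delete / entities_in_shop > threshold


# 900 documents carry a deletedAt timestamp, Shopware contains 1000 products.
progress_anomaly = exceeds_threshold(900, 1000)  # 90% > 75%

# Only 1 of those 900 documents still has a Shopware entity.
delete_anomaly = exceeds_threshold(1, 1000)  # 0.1% is far below 75%

print(progress_anomaly, delete_anomaly)  # True False
```

The progress anomaly alone would force a confirmation on every full import, while the count of entities that are still present shows that no real delete anomaly exists.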
Step 2 - Delta Count
If no "progress-anomaly" was detected during the progress-initialization, this subsection finishes immediately.
If the progress is suspicious, the affected documents will be further analyzed in the subsection delta-count. This subsection checks
for each document if its corresponding Shopware entity is still available. This allows the calculation of the actual amount of
entities that are going to be deleted in the current flow execution in the next subsection anomaly-check.
Step 3 - Anomaly Check
The only reason why delta-count and anomaly-check are two separate subsections is to be able to generate the delta counts in
parallel. After the delta-map has been generated, the actual anomaly check takes place. If the amount of entities to be deleted is higher
than the configured threshold, a lockout string is generated and the delete-subsection will not run.
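The three steps can be combined into one simplified, sequential sketch. It assumes that documents and Shopware entities can be matched by a shared ID and that the set of existing entity IDs is available in memory. All function names are hypothetical, and the generation of the lockout string as well as the parallel execution are left out.

```python
from typing import Iterable, Set


def count_still_present(marked_ids: Iterable[str], shop_ids: Set[str]) -> int:
    """Step 2: count documents marked as deleted whose entity still exists."""
    return sum(1 for doc_id in marked_ids if doc_id in shop_ids)


def delete_may_run(marked_ids: Iterable[str], shop_ids: Set[str], threshold: float = 75.0) -> bool:
    """Return False if a delete anomaly is detected, True otherwise."""
    marked = list(marked_ids)
    total = len(shop_ids)
    if total == 0:
        return True  # assumption: nothing in the shop, nothing to protect

    # Step 1: progress anomaly check based on the raw number of marked documents.
    if 100.0 * len(marked) / total <= threshold:
        return True

    # Step 2: only suspicious progresses are analyzed further.
    actual = count_still_present(marked, shop_ids)

    # Step 3: anomaly check on the number of entities that would really be deleted.
    return 100.0 * actual / total <= threshold
```

For a shop with 1000 products, 900 marked documents of which only one still exists in Shopware would pass this check, while 900 marked documents that all still exist would block the delete-subsection.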