
Implement account deletion in Cloudv4
Closed, Public

Authored by Anna Sirota (railla) on Jan 20 2021, 6:51 PM.

Details

Summary

Goal

Handle deletion requests received via the Blender ID webhook by deleting user records after a grace period of roughly 2 weeks.

Special considerations

Not all accounts can be deleted

Accounts that have certain types of content linked to them cannot be deleted. This includes the following:

  • Any film-related content:
    • assets;
    • collections;
    • production logs etc.
  • Blog posts.

For these types of content, anonymization is not an option either (e.g. a blog post from an anonymous user makes no sense, and the same goes for a training).

Theoretically, if the author is replaced on these, it should be possible to delete the original author's account. However, the original ownership is then lost, which might not be a good option either, and the replacement has to be done manually.

See Django's Collector for details on how to collect data that has to be deleted along with an object, including protected objects: https://github.com/django/django/blob/stable/3.0.x/django/db/models/deletion.py#L64

The rest of this document refers to such content as *protected data*.

Comments and likes, on the other hand, are designed to be unlinked from deleted accounts but remain publicly visible, and training progress can be safely deleted.

⇒ as long as the account is an *ordinary* account, not a content editor account, it's safe to delete its data.
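
The rules above can be read as a per-content-type deletion policy, loosely mirroring Django's on_delete options (PROTECT, SET_NULL, CASCADE). Below is a minimal plain-Python sketch of that reading. The content-type names and the helpers `protected_data` and `can_delete_account` are hypothetical and are not taken from the actual patch, which relies on Django's Collector.

```python
from enum import Enum


class Policy(Enum):
    PROTECT = "protect"  # blocks account deletion
    UNLINK = "unlink"    # stays publicly visible, author reference is cleared
    CASCADE = "cascade"  # deleted together with the account


# Hypothetical content-type names, chosen for illustration only.
POLICIES = {
    "asset": Policy.PROTECT,
    "collection": Policy.PROTECT,
    "production_log": Policy.PROTECT,
    "blog_post": Policy.PROTECT,
    "comment": Policy.UNLINK,
    "like": Policy.UNLINK,
    "training_progress": Policy.CASCADE,
}


def protected_data(linked_counts):
    """Return the sorted protected content types the account has records for.

    Unknown content types are treated as protected, to stay on the safe side.
    """
    return sorted(
        kind
        for kind, count in linked_counts.items()
        if count > 0 and POLICIES.get(kind, Policy.PROTECT) is Policy.PROTECT
    )


def can_delete_account(linked_counts):
    """An account is deletable only when no protected data is linked to it."""
    return not protected_data(linked_counts)
```

For example, `can_delete_account({"comment": 3, "training_progress": 5})` holds for an ordinary account, while any non-zero count of blog posts or film content blocks deletion.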

Cloudv3 auth

We assume Cloudv3 also receives an update from Blender ID and soft-deletes the user as well.

What if subscription is currently active?

There are no subscription checks at the moment: it's possible to request account deletion in Blender ID while having an active subscription.

Notification and activity (actstream)

actstream's actions don't use SET NULL in their generic foreign keys (GFKs) to users, so notification records linked to deleted accounts also get deleted.

Implementation

It's possible to request your account deletion from your settings page

This is a separate tab in the Settings page (e.g. /settings/delete). The tab links to Blender ID, where deletion can be requested. Blender ID then sends out a user-modified event that has date_deletion_requested set.
An info box is displayed if there's protected data linked to this account on Blender Cloud.

Webhook user-modified with non-null date_deletion_requested
  • checks what kind of protected data is linked to this account;
  • if there's protected data linked to this account, logs an error and does nothing else;
  • otherwise, deactivates the account: user.is_active = False;
  • sets user.date_deletion_requested to the date in the webhook's payload.
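
The webhook steps above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the `User` dataclass is a stand-in for the real model, `handle_user_modified` and its `protected` argument are hypothetical names, and the payload date is assumed to be an ISO 8601 string, which the real Blender ID payload may or may not use.

```python
import logging
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

logger = logging.getLogger(__name__)


@dataclass
class User:
    """Stand-in for the real user model; only the fields used here."""

    pk: int
    is_active: bool = True
    date_deletion_requested: Optional[datetime] = None


def handle_user_modified(user, payload, protected):
    """Apply a user-modified event; return True if the account was deactivated.

    `protected` is the list of protected data found for this account.
    """
    raw_date = payload.get("date_deletion_requested")
    if raw_date is None:
        return False  # not a deletion request, nothing to do here
    if protected:
        # Protected data is linked: log an error and do nothing else.
        logger.error("Cannot delete user pk=%s: protected data %s", user.pk, protected)
        return False
    user.is_active = False
    # Assumption: the payload carries an ISO 8601 timestamp.
    user.date_deletion_requested = datetime.fromisoformat(raw_date)
    return True
```

In the real view the user would also be saved after these changes; persistence is left out of the sketch.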

Actual deletion

Command queue_deletion_requests
  • called once a day (or less often);
  • selects all Users that have date_deletion_requested at least 2 weeks in the past;
  • for each of them, calls users.tasks.handle_deletion_request(pk);
  • scheduled via systemd timer.

Task users.tasks.handle_deletion_request
  • deletes a User with a given pk along with all data that it cascades to.
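
The selection step of the command can be sketched without Django as a cutoff comparison. The function signatures, the `requests` mapping and the `enqueue` callback below are hypothetical stand-ins for the ORM query and the background task call; only the names queue_deletion_requests, date_deletion_requested and the 2-week period come from the summary.

```python
from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(weeks=2)


def due_for_deletion(requests, now):
    """Return sorted pks whose deletion was requested at least GRACE_PERIOD ago.

    `requests` maps user pk -> date_deletion_requested (None if not requested).
    """
    cutoff = now - GRACE_PERIOD
    return sorted(
        pk
        for pk, requested in requests.items()
        if requested is not None and requested <= cutoff
    )


def queue_deletion_requests(requests, now, enqueue):
    """Queue one deletion task per due user via `enqueue`; return the count."""
    due = due_for_deletion(requests, now)
    for pk in due:
        enqueue(pk)
    return len(due)
```

With the Django ORM, the same selection would presumably be a filter along the lines of `date_deletion_requested__lte=cutoff`, with `enqueue` being the call that schedules users.tasks.handle_deletion_request(pk).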

What if they change their mind within 2 weeks?

We can, in theory, re-activate the account and clear the deletion request (set date_deletion_requested = None) manually; however, this is unlikely to happen often enough to be of concern.

See https://developer.blender.org/T82339

Event Timeline

Anna Sirota (railla) requested review of this revision. Jan 20 2021, 6:51 PM
Anna Sirota (railla) created this revision.
Anna Sirota (railla) edited the summary of this revision. Jan 21 2021, 3:19 PM

Could you clarify the difference between "process" and "handle" deletion request? I see that one is done via cronjob, the other using background tasks. Would it make sense to merge them?

users/models.py
62

Can you add more info about when this is used?

Anna Sirota (railla) added a comment. Edited Jan 21 2021, 3:29 PM

Could you clarify the difference between "process" and "handle" deletion request? I see that one is done via cronjob, the other using background tasks. Would it make sense to merge them?

background_tasks doesn't have reliable repeating tasks (the ones in its docs look flimsy at best, and the issues on GitHub are telling), so the process_ command is there for repeating checks, and handle_ is for a single deletion request (the task also remains in the DB when completed, which makes it easier to verify that the pipeline actually works).

users/models.py
62

This is not used: it shouldn't be necessary, as sessions will expire eventually and get collected by clearsessions, which is already called once a day.
My bad, I forgot to remove it from the patch.

  • Remove unused method
  • Remove an unused API view
Anna Sirota (railla) marked an inline comment as done. Jan 21 2021, 3:38 PM

May I suggest renaming process to "queue"? That makes it a bit more explicit that no actual processing happens in this command. Besides that, I have no further comments.

  • Rename process_ command to queue_deletion_requests
Anna Sirota (railla) edited the summary of this revision.
  • Renamed the test module as well
This revision is now accepted and ready to land. Jan 22 2021, 3:26 PM