Page MenuHome

Blender ID integration: storing avatar, full name and roles
ClosedPublic

Authored by Anna Sirota (railla) on Sep 28 2020, 3:05 PM.

Details

Summary

This changeset adds integration with Blender ID, storing avatar and full name in a separate table added by profiles app.

Changes to avatar, email, full name and roles are handled by a new webhook.
Blender ID roles are stored as user groups, and currently not in use yet (the assumption is that templates will be using them to display or not display content).

How does this fit into the rest of the project?

Blender Studio already has Blender ID login working via blender_id_auth_client, however basic profile information and changes to subscription status of users are not available to Blender Studio. The purpose of this change is to add this profile information, and handle changes to both profile and subscription status.

Which functionality should be covered by this code?

This code takes care of

  • Storing additional profile data about a user of Blender Studio, namely avatar, full name and roles;
  • Updating this data when notified by Blender ID via a webhook;
  • Relevant tests.

Which parts of the code are going to be responsible for handling errors, and retrying in case of failure?

Blender ID already has a queue that stores failed webhooks requests, which seems to be a good enough tool for retrying in case of failure.
As for handling errors, Blender Studio will have a Sentry handler that will help with notifying about failures, so this change doesn't have code specific to retrying due to unexpected erros.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Anna Sirota (railla) edited the summary of this revision. (Show Details)Sep 28 2020, 3:22 PM
Sybren A. Stüvel (sybren) requested changes to this revision.Sep 28 2020, 4:58 PM

The overall functionality seems ok, but code-wise there could be some improvements. Mostly the code should be very careful when it comes to error handling. Blender ID will go down, or become unreachable, even right after it has sent a webhook request to us. Timeouts, connection errors, disk I/O errors, such things need to be handled properly. At the minimum, I would suggest to move Blender ID communication into its own module, and wrap various exceptions that can occur with one or more custom exceptions. These can then be handled locally when necessary, or result in a custom error page when not caught.

profiles/models.py
23

Missing return type declaration. This also should be added to other functions, but here it's especially important as sometimes self.id is returned. This is incompatible with the official docs for __str__, which say that a string should be returned.

26

This connects to a remote system and fetches the avatar. Such complex behaviour, with various different ways in which it can fail or cause slowdowns, shouldn't be hidden behind a "plain setter function" name.

27

Not all users may have an oauth_info.oauth_user_id. What happens when this is the case? How will the user be notified of this when they do try and set an avatar?

29–31

This function is doing too many things at once:

  • Obtain a user's Blender ID account number.
  • Construct an OAuth2 session object.
  • Construct the URL that can be used to obtain the avatar.
  • Figure out the avatar's filename.
  • Download the avatar image.
  • Store the avatar on the Profile.

It's probably better to have a separate class for this That class can then also manage the Requests session.

30

This will create a new Requests session for each call to this function. Fetching avatars can be quite a lot of work with thousands of users in the database. Since the request doesn't have to be authenticated with a user-specific token, it's better to have a dedicated requests.Session (or OAuth2Session) that manages the TCP/IP connection and TLS handshake, and allows re-use.

32

pathlib.Path shouldn't be used to parse URLs. Use urllib instead.

profiles/queries.py
4

This module should have a docstring that explains what it's for. What kind of code is expected to call these functions?

6

This is an indication that this functionality should actually be in a custom model manager of OAuthUserInfo.

profiles/signals.py
22

Logging 'User pk={instance.pk} ... will make it clear what the logged number is. Blender ID account numbers are also numerical.

25

Use oauth_info.get('full_name') or '' for safety. Also put the fact that 'full_name' is used in the docstring.

26

How are communication errors handled here? And disk I/O errors saving the avatar?

31

Add a docstring that explains what this function is for.

34

if not getattr(instance, 'profile', None) will cover both the case that it just isn't there, or the hypothetical case that the property is there but set to None.

profiles/tests/test_webhooks.py
18

Add type declarations.

44

Include non-ASCII characters in names.

132

Test the entire name. Now a bug that ignores the name and turns it into ".jpg" won't be detected.

profiles/views/webhooks.py
25

Return type declaration & docstring are missing.

25

This function is doing too many things at once, better to split it up into a few smaller functions.

37

You're careful here with the length check, which is nice. An extra layer of carefulness could be added by using body = request.body[:content_length+1]. That way you read at most only a byte too many, avoiding DoS attacks.

42

If the secret is going to be encoded anyway, why not store it in the config file as bytes?

71

What if the response is something else?

75

What if there is already a user with that username?

77

What if there is already a user with that email address?

87–98

This is doing quite a few DB queries. It'll probably be simpler to get all the groups in the same query that fetches the user (with select_related()), then use sets to compute the differences.

This revision now requires changes to proceed.Sep 28 2020, 4:58 PM
Anna Sirota (railla) marked 7 inline comments as done.

Addressing most of the review comments:

  • Reuse Django's User.__str__ in Profile.__str__, instead of adding more PII into it;
  • Wrap all Blender ID integration into a separate module;
  • Use requests.raise_for_status to handle unexpected Blender ID responses;
  • Include non-ASCII characters in names;
  • Add type declaration to a test helper func.

Other changes:

  • Profile should have the same ID, otherwise there are 3 different IDs including Blender ID;
  • Add todos about handling duplicate username/email (will update in a bit).
Anna Sirota (railla) marked 2 inline comments as done.Sep 29 2020, 2:45 PM
Anna Sirota (railla) added inline comments.
profiles/models.py
27

From what I understand, there'll be no other way to set an avatar other than doing that in Blender ID.
All the profile information in Blender Studio is supposed to be read-only and Blender ID will be the sole authority over it.

profiles/queries.py
6

I think a lot of what's implemented here should be part of blender_id_oauth_client (@Francesco Siddi (fsiddi) disagrees afaik), but I'd like to avoid changing it now, and instead implement profiles in a way that this functionality could be "hotswapped" later if it gets implemented in blender_id_oauth_client.

profiles/tests/test_webhooks.py
132

The name is generated by the storage and IRL will never match the original name because it's hashed (which is tested elsewhere) and naming/storing logic is not specific to the avatar in any way.
Checking the full name would require mocking the storage and/or the hashing, and it seems out of scope of this change, which should assume that storage just works.

I've added a todo https://developer.blender.org/T81290 about configuring test storage in the backlog, because currently it's using the same configuration staging is using, which is not ideal.

profiles/views/webhooks.py
37

This is a copypaste from Blender Cloud's code.
I think this shouldn't be here at all, and should instead be handled by the webserver (e.g. nginx), not the application.

Sybren A. Stüvel (sybren) requested changes to this revision.Sep 29 2020, 4:39 PM

Docstrings should adhere to PEP 257. I know that in our existing codebase this isn't always the case either, but for new/updated comments I think it's a good idea to follow. One of the notes in that PEP states:

The docstring is a phrase ending in a period. It prescribes the function or method's effect as a command ("Do this", "Return that"), not as a description; e.g. don't write "Returns the pathname ...".

This is also consistent with the comments in Blender itself, and the more I use it the more natural it becomes.

What I'm missing most in this patch is a design document/description. How does this fit into the rest of the project? Which functionality should be covered by this code? Which parts of the code are going to be responsible for handling errors, and retrying in case of failure?

profiles/models.py
29

This should document what happens when it fails. Does it automatically retry after a while? What will be returned when it fails?

31

Missing f string prefix.

profiles/tests/test_webhooks.py
132

Fair enough, but why then bother checking the avatar name at all?

This revision now requires changes to proceed.Sep 29 2020, 4:39 PM
profiles/tests/test_webhooks.py
132

It's just a sanity check: the goal of the test is to test that avatar change is being handled in some meaningful way ("avatar used to be empty, check that it's not empty after the webhook was called").

Afaik, the only way to check that a FileField is "not empty", is to check either bool(file_field) or file_field.name, and out of these two I'd prefer the explicit latter one, and just to remind whomever reading the test that it's an image, also check for the extension.

Addressing more review comments:

  • Move modification of user groups into a separate function and module
  • Missing f in a formatted string
  • pydocstyle everything
Anna Sirota (railla) marked an inline comment as done.Sep 29 2020, 7:31 PM
Anna Sirota (railla) added inline comments.
profiles/views/webhooks.py
87–98

Updated the code to use select_related('user') and a bulk query for adding to groups, same a with removing from groups.

Anna Sirota (railla) retitled this revision from Blender ID integration to Blender ID integration: storing avatar, full name and roles.Sep 30 2020, 10:35 AM
Anna Sirota (railla) edited the summary of this revision. (Show Details)

Docstrings should adhere to PEP 257. I know that in our existing codebase this isn't always the case either, but for new/updated comments I think it's a good idea to follow. One of the notes in that PEP states:

The docstring is a phrase ending in a period. It prescribes the function or method's effect as a command ("Do this", "Return that"), not as a description; e.g. don't write "Returns the pathname ...".

This is also consistent with the comments in Blender itself, and the more I use it the more natural it becomes.

Updated docstrings.
It'd be great to have flake8 with flake8-docstrings (and maybe a few more opinionated plugins) set up as pre-commit hooks instead of black, but so far I hadn't had time for that yet. Should probably make time though because that might save some double reviewing time =)

What I'm missing most in this patch is a design document/description. How does this fit into the rest of the project? Which functionality should be covered by this code? Which parts of the code are going to be responsible for handling errors, and retrying in case of failure?

Updated the description of the patch.

Anna Sirota (railla) marked 7 inline comments as done.Sep 30 2020, 11:16 AM
  • Document what copy_avatar_from_blender_id does in case of error
Anna Sirota (railla) marked an inline comment as done.Sep 30 2020, 11:25 AM
  • Catch all exception when attempting to copy an avatar, log an INFO message when successful
Anna Sirota (railla) marked 2 inline comments as done.Sep 30 2020, 11:36 AM
Sybren A. Stüvel (sybren) requested changes to this revision.Oct 1 2020, 2:36 PM

Thanks for the patch description update, that clarifies a lot.

Blender ID already has a queue that stores failed webhooks requests, which seems to be a good enough tool for retrying in case of failure.

In that case, I would still recommend handling errors explicitly and reliably. That means identifying possible points of failure, and raising explicit exceptions for them (or returning explicit HTTP error status codes). That'll make things a lot easier to understand when they go wrong, also on the Blender ID side. It's much easier when the error message says "unable to reach Blender ID" or "hard disk full" than just "internal server error".

As the code of the webhooks.handle_user_modified function is currenty implemented, there is no error status returned. This means that Blender ID will not retry, so the code isn't in line with the intended functionality. The same is seen in the unit tests, where "everything is OK" status codes are returned when there were errors handling the request.

This revision now requires changes to proceed.Oct 1 2020, 2:36 PM
  • Handle storage exceptions when copying profile avatars

Thanks for the patch description update, that clarifies a lot.

Blender ID already has a queue that stores failed webhooks requests, which seems to be a good enough tool for retrying in case of failure.

In that case, I would still recommend handling errors explicitly and reliably. That means identifying possible points of failure, and raising explicit exceptions for them (or returning explicit HTTP error status codes). That'll make things a lot easier to understand when they go wrong, also on the Blender ID side. It's much easier when the error message says "unable to reach Blender ID" or "hard disk full" than just "internal server error".

As the code of the webhooks.handle_user_modified function is currenty implemented, there is no error status returned. This means that Blender ID will not retry, so the code isn't in line with the intended functionality. The same is seen in the unit tests, where "everything is OK" status codes are returned when there were errors handling the request.

The reason the webhook is returning "everything is OK" even when handling of the payload fails is that all of that logic will be later moved into a background task (as a TODO mentions), and there won't be any way to return any HTTP codes concerning handling of the payload.
Any additional error messages and tracebacks will be collected in Sentry in logger.exception and visible in the error log, in my experience, this is more than enough information to fix any issues that might arise.

I've updated copy_avatar_from_blender_id to log different exception messages for storage and communications with Blender ID related failures, so that they get into different event groups in Sentry, as a compromise.

(Also kept black's reformatting of affected files this time)

  • Style changes: black reformatting and PEP8 + PEP257 checks
  • Fix a test that disables logging and does not restore it back, breaking Profile tests.
Sybren A. Stüvel (sybren) requested changes to this revision.Oct 6 2020, 11:15 AM

in my experience, this is more than enough information to fix any issues that might arise.

👍

profiles/blender_id.py
60 ↗(On Diff #29474)

A return type Dict[str, Any] would work here as well I think.

profiles/queries.py
6

👍

18–23

This is of quadratic complexity (len(group_names) × len(current_groups)). How about something like this?

def set_groups(user: User, group_names: Set[str]) -> None:
    current_group_names = {group.name for group in current_groups}

    names_to_add_to = group_names - current_group_names
    groups_to_add_to = [Group.objects.get_or_create(name=group_name) for group_name in names_to_add_to]
    user.groups.add(*groups_to_add_to)

    names_to_remove_from = current_group_names - group_names
    groups_to_remove_from = [
        group := Group.objects.filter(name=group_name).first
        for group_name in names_to_remove_from
        if group is not None]
    user.groups.remove(*groups_to_remove_from)

Disclaimer: I haven't tested this code ;-)
I think that for computing which groups to add/remove, the set semantics makes things more readable. What do you think about such an approach?

profiles/tests/test_webhooks.py
23

There is a lot happening on this one line. It'll probably make debugging easier if it's split up into a few smaller lines.

63

Nicknames in Blender ID aren't limited to ASCII, so be sure to test with non-ASCII nicknames too.

132

Ok, fair enough.

This revision now requires changes to proceed.Oct 6 2020, 11:15 AM
  • Make addition/removal of groups more readable
  • Test Unicode in username;
  • Make HMAC test helper more readable;
profiles/blender_id.py
60 ↗(On Diff #29474)

If this is about

def get_user_info(self, oauth_user_id: str) -> Any:

it complains

profiles/blender_id.py:84: error: Returning Any from function declared to return "Dict[str, Any]"

because of requests .json()

profiles/queries.py
18–23

Yes, it does indeed look more readable with sets.

Updated the code, with few adjustments: get_or_create doesn't "fit" into list comprehension and an extra Group.object.filter query can be avoided.

profiles/blender_id.py
60 ↗(On Diff #29474)

You can fix that by doing a type assertion:

payload = resp.json()
assert isinstance(payload, dict)
return payload

This will make mypy understand it's a dict, and then it won't even bother complaining about the key type.

  • More mypy-compatible typing
Anna Sirota (railla) marked 2 inline comments as done.Oct 6 2020, 12:44 PM
This revision is now accepted and ready to land.Oct 6 2020, 12:55 PM