Override model serializer globally, big integer field #9722
First of all, thanks for this project; I can't use Django without DRF 😀
I'm looking for a way to override serializers.ModelSerializer globally.
A little bit of context:
Our application has grown large, and we want to switch our Django backend from BigAutoField to SnowflakeAutoField.
My problem:
- TS/JS (my frontend) does not safely support integers above 2^53 − 1,
- I don't want to use BigInt in TS,
- I don't want to touch MAX_SAFE_INTEGER.
So the best solution, in my opinion, is to serialize bigints as strings.
from django.db import models
from rest_framework import serializers


class SnowflakeField(serializers.Field):
    def to_representation(self, value):
        return str(value)

    def to_internal_value(self, data):
        try:
            return int(data)
        except (TypeError, ValueError):
            # TypeError covers non-string, non-numeric input such as None
            raise serializers.ValidationError("Invalid integer value")


class SnowflakeModelSerializer(serializers.ModelSerializer):
    """
    Snowflake is a bigint, so convert it to a string to avoid
    precision problems on the frontend
    """

    def build_standard_field(self, field_name, model_field):
        field_class, field_kwargs = super().build_standard_field(field_name, model_field)
        # Use SnowflakeField for BigIntegerField
        if isinstance(model_field, models.BigIntegerField):
            field_class = SnowflakeField
        return field_class, field_kwargs
Anyway, since Django uses bigint for IDs, serializing them as strings could be the better option in any case, since IDs can grow large.
Possible ways
- use my custom serializer throughout the code (not so clean)

class MyModelSerializer(SnowflakeModelSerializer):
    class Meta:
        model = MyModel
        fields = '__all__'
- Write a custom JSONRenderer and JSONParser and do something like this, with a lot of assumptions (spoiler: bad programming)

from rest_framework.renderers import JSONRenderer


class SnowflakeJSONRenderer(JSONRenderer):
    def _convert_big_integers(self, obj):
        if isinstance(obj, dict):
            return {key: self._convert_big_integers(value) for key, value in obj.items()}
        elif isinstance(obj, list):
            return [self._convert_big_integers(item) for item in obj]
        elif isinstance(obj, int) and obj > 2**53:
            return str(obj)
        return obj
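As a rough sketch of the same traversal outside DRF, the function below walks a payload and stringifies only integers outside JavaScript's safe range. The name `convert_big_integers` and the constant `MAX_SAFE_INTEGER` are illustrative, not part of DRF. Compared with the snippet above, it also handles negative values and leaves booleans alone (in Python, `bool` is a subclass of `int`).

```python
MAX_SAFE_INTEGER = 2**53 - 1  # same value as JavaScript's Number.MAX_SAFE_INTEGER


def convert_big_integers(obj):
    """Return a copy of obj with unsafe integers replaced by strings."""
    if isinstance(obj, dict):
        return {key: convert_big_integers(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [convert_big_integers(item) for item in obj]
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(obj, int) and not isinstance(obj, bool):
        if abs(obj) > MAX_SAFE_INTEGER:
            return str(obj)
    return obj


converted = convert_big_integers({"id": 2**60, "count": 5, "flags": [True, -(2**60)]})
```

In a DRF project, something like this could presumably be called from an overridden `JSONRenderer.render` before delegating to the parent implementation, but that wiring is an assumption and is not shown here.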
What I'm looking for
As far as I know, there is no option to do this:
REST_FRAMEWORK = {
'DEFAULT_MODEL_SERIALIZER_CLASS': 'path.to.utils.serializers.CustomModelSerializer',
}
Am I missing something? In your opinion, what would be the best solution? I'm open to any suggestions.
I'd recommend going with the first option. It doesn't look pleasant at first, but it's easily maintainable if you add a Django system check to validate that the correct class is used.
Do you have a specific reason to use snowflake btw? If you are open to alternatives, I'd recommend UUID7 as a solid one.
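To make the "system check" idea concrete, here is a minimal sketch of the core logic: find every serializer that inherits from the plain base class but not from the project's custom one. The classes below are stand-ins so the snippet runs without Django or DRF, and the helper names `iter_subclasses` and `find_offenders` are hypothetical. In a real project this logic would presumably sit inside a function registered with `django.core.checks.register` and report each offender as a check message.

```python
class ModelSerializer:
    """Stand-in for rest_framework.serializers.ModelSerializer."""


class SnowflakeModelSerializer(ModelSerializer):
    """Stand-in for the project's custom base serializer."""


class GoodSerializer(SnowflakeModelSerializer):
    pass


class BadSerializer(ModelSerializer):
    pass


def iter_subclasses(cls):
    """Yield all direct and indirect subclasses of cls."""
    for sub in cls.__subclasses__():
        yield sub
        yield from iter_subclasses(sub)


def find_offenders(base, required):
    """Names of subclasses of base that do not inherit from required."""
    return sorted({
        sub.__name__
        for sub in iter_subclasses(base)
        if not issubclass(sub, required)
    })


offenders = find_offenders(ModelSerializer, SnowflakeModelSerializer)
```

Only classes that have already been imported are visible to `__subclasses__()`, so a real check would need to run after all serializer modules are loaded.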
Thanks for the reply, @ulgens.
Actually, we are also considering UUID; it's certainly a simpler implementation and probably a better compromise in our situation.
I'm trying to evaluate the snowflake option for these reasons:
- shorter URLs, so easier to copy and read (and exposing incremental IDs is not the best option)
- smaller and faster in the DB than UUID
- distributed generation
- time-based sorting, but in later releases
- I don't want to change IDs again 😂
It's probably overkill, but I'm not committed to the switch yet.
I've run some tests with this code and it works, though it's still a quick implementation.
I still need to consider:
- snowflake instance ID sync, probably with Redis, which we already use as a cache
- time sync, which Azure already handles in our case
from django.db import models
from django.utils.encoding import smart_str
from rest_framework import serializers


class SnowflakeField(serializers.Field):
    """Convert snowflake data to string"""

    def to_representation(self, value):
        return smart_str(value)

    def to_internal_value(self, data):
        try:
            return int(data)
        except (TypeError, ValueError):
            raise serializers.ValidationError("Invalid integer value")


class SnowflakePrimaryKeyRelatedField(serializers.PrimaryKeyRelatedField, SnowflakeField):
    """Handle related primary keys as snowflake"""

    def to_representation(self, value):
        if self.pk_field is not None:
            return self.pk_field.to_representation(value.pk)
        return str(value.pk)

    def to_internal_value(self, data):
        # Parse the incoming string into an int first, then let
        # PrimaryKeyRelatedField resolve it to a model instance.
        data = SnowflakeField.to_internal_value(self, data)
        return serializers.PrimaryKeyRelatedField.to_internal_value(self, data)


class SnowflakeSlugRelatedField(serializers.SlugRelatedField):
    """Implement if used"""


class SnowflakeModelSerializer(serializers.ModelSerializer):
    """
    Snowflake is a bigint, so convert it to a string to avoid
    precision problems on the frontend
    """

    serializer_field_mapping = {
        **serializers.ModelSerializer.serializer_field_mapping,
        models.BigAutoField: SnowflakeField,
        models.BigIntegerField: SnowflakeField,
    }
    serializer_related_field = SnowflakePrimaryKeyRelatedField
    serializer_related_to_field = SnowflakeSlugRelatedField
My primary key model:

from django.db import models


class SnowflakeBigAutoField(models.BigAutoField):
    def pre_save(self, model_instance, add):
        if add and not getattr(model_instance, self.attname, None):
            value = self.get_next_value()
            setattr(model_instance, self.attname, value)
        return super().pre_save(model_instance, add)

    @staticmethod
    def get_next_value():
        # `sf` is the snowflake ID generator; its definition is not shown here
        return next(sf)


# settings.py
DEFAULT_AUTO_FIELD = 'core.snowflake.SnowflakeBigAutoField'
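The pk model above calls `next(sf)` without showing what `sf` is. As a hedged illustration only, here is one way such a generator could look, using the commonly cited Twitter-style layout (41-bit millisecond timestamp, 10-bit instance ID, 12-bit sequence). The epoch, the bit widths, and the name `snowflake_generator` are assumptions for this sketch, not the author's production code.

```python
import time

EPOCH_MS = 1_577_836_800_000  # 2020-01-01T00:00:00Z, an arbitrary custom epoch
INSTANCE_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


def _now_ms():
    """Milliseconds elapsed since the custom epoch."""
    return time.time_ns() // 1_000_000 - EPOCH_MS


def snowflake_generator(instance_id):
    """Yield increasing IDs laid out as timestamp | instance | sequence."""
    if not 0 <= instance_id < (1 << INSTANCE_BITS):
        raise ValueError("instance_id out of range")
    last_ms = -1
    sequence = 0
    while True:
        now_ms = _now_ms()
        if now_ms < last_ms:
            # Clock moved backwards: keep using the last timestamp.
            now_ms = last_ms
        if now_ms == last_ms:
            sequence = (sequence + 1) & MAX_SEQUENCE
            if sequence == 0:
                # Sequence exhausted: spin until the next millisecond.
                while now_ms <= last_ms:
                    now_ms = _now_ms()
        else:
            sequence = 0
        last_ms = now_ms
        yield (
            (now_ms << (INSTANCE_BITS + SEQUENCE_BITS))
            | (instance_id << SEQUENCE_BITS)
            | sequence
        )


sf = snowflake_generator(instance_id=1)
first_id, second_id = next(sf), next(sf)
```

A Python generator cannot be advanced from two threads at once, so a multi-threaded deployment would need a lock around `next(sf)`, and each process needs its own unique `instance_id` (the sync problem mentioned above).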
If we deploy with snowflake, I'd like to share the production code here, because I didn't find any useful resources about snowflake IDs with Django.
DEFAULT_MODEL_SERIALIZER_CLASS
In any case, do you think it would be possible to introduce this setting? I can submit a pull request.
I looked at the code and it looks possible; am I missing something?
then some other unique identifier that is shorter for URLs
We are doing this right now almost everywhere except on some dashboard elements, but it's kind of a pain 🥲
At one point we also introduced the ability for users to customize the slug used to reach certain resources; it was more confusing than useful, so we switched back to autogeneration.
I like something like Discord: only one ID in and out, no names, no personalization, no problems. It's OK even if a UUID is a bit longer. I don't see problems with this approach; am I wrong?
Not having this dict configurable without customizing the class seems problematic to me
Next week I'll try to implement this solution and let you know.
My understanding is that you have a "Snowflake ID to JSON" conversion problem
There is actually a kind of bug: if you take a BigInt in DB, for example 2^64 and represent as int, JavaScript/TypeScript Number can only safely represent integers up to 2^53−1. JSON follows the same limitation, so although no error occurs, it is still technically incorrect.
It is possible to find a workaround on client side, but it is not guaranteed to work on all browsers.
There is actually a kind of bug: if you take a BigInt in DB, for example 2^64 and represent as int, JavaScript/TypeScript Number can only safely represent integers up to 2^53−1. JSON follows the same limitation, so although no error occurs, it is still technically incorrect.
Sorry if I missed it, but what's the result of this behaviour? Does it fail? Does it produce an incorrect response? If a Django native datatype can't be converted to proper JSON by DRF, we may consider adding a warning and updating the docs about it.
Sorry for the late reply; in the meantime I've done some research and extensive tests.
JSON does not have a native BigInt type. The JSON grammar itself places no limit on number size, but many parsers, including JavaScript's, store numbers as IEEE 754 double-precision floats, which are exact for integers only up to 2^53 − 1 (i.e. 9,007,199,254,740,991). Beyond this threshold, numbers lose precision and may be silently rounded when parsed in JavaScript or other languages.
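The threshold can be reproduced in a few lines of Python. This is only an illustration: Python's own `json` module keeps integers exact, so the conversion through `float` below stands in for what a double-based parser does with the same number.

```python
import json

MAX_SAFE_INTEGER = 2**53 - 1  # same value as JavaScript's Number.MAX_SAFE_INTEGER

big_id = 2**53 + 1  # smallest positive integer a double cannot represent exactly

# Python integers are arbitrary precision, so a JSON round trip is lossless here.
round_tripped = json.loads(json.dumps({"id": big_id}))["id"]

# A parser that stores numbers as doubles effectively performs this conversion.
as_double = int(float(big_id))

# Sending the ID as a string sidesteps the problem entirely.
payload = json.dumps({"id": str(big_id)})

print(big_id, "->", as_double)
```

The round trip inside Python returns the original value, while the double conversion silently lands on a neighbouring integer, which is the failure mode a JavaScript client would see.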
It depends on the JSON library implementation: this particular issue doesn't happen in Python, but behavior is not consistent across JSON libraries in other languages.
The Twitter example:
Twitter's API exposes both a numeric id and a string id_str to avoid this exact problem.
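A minimal sketch of that pattern, assuming a plain dict payload: keep the numeric field for clients that can handle it and add a string twin next to it. The helper name `with_string_ids` and the sample record are made up for illustration.

```python
import json


def with_string_ids(record, id_fields=("id",)):
    """Return a copy of record with a '<name>_str' twin for each ID field."""
    out = dict(record)
    for name in id_fields:
        if name in record:
            out[f"{name}_str"] = str(record[name])
    return out


# Illustrative record; the ID is just above the double-precision safe range.
tweet = with_string_ids({"id": 2**53 + 1, "text": "hello"})
body = json.dumps(tweet)
```

The cost is a slightly larger payload and two fields to keep in sync, which is why serializing only as a string may be the simpler choice for a new API.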
Why does the DRF test frontend work?
Because it uses FormData, so basically everything is sent as strings, and since server-side rendering is handled by Python, no bigint errors occur.
Resources:
Since the majority of frontends that call the API are written in JavaScript, this is an issue that has to be addressed, in my opinion. I know the bigint problem with IDs only appears at very large scale and many people will probably never encounter it, but if we find a solution it would be wonderful.
COERCE_DECIMAL_TO_STRING
I suggest adding a flag like this one for bigint as well; it addresses the exact same problem, only for decimal/floating-point values, which JS also does not support well :')
👏🏻 I think this approach has better focus on the actual issue, and precisely defines what needs to be handled. Thanks for all the details. I'll be following people's reactions under #9733, and I can help with the PR when it's ready.