Identifiers
There are alternatives to UUID
Published: Friday, Oct 22, 2021 Last modified: Monday, Dec 9, 2024
Update: base62, as introduced by GitHub, seems pretty nice!
A modern computer can express a 64-bit number natively.
How large is a 64-bit number?
$ python -c "print (2**64-1)"
18446744073709551615
It’s very large, and for a private system it is probably enough range to uniquely identify every item you are dealing with.
But how do you express such a large number compactly, in a way a human can work with?
For example, YouTube uses 11 characters of base64. How many base64 characters does it take to express a 64-bit number? Each character carries 6 bits, so (2^6)^x = 2^64, giving x = 64/6 ≈ 10.67, i.e. eleven when rounded up.
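The same arithmetic can be checked for other alphabets. Here is a minimal Python sketch (the helper name chars_needed is made up for illustration) that finds the smallest number of characters whose combinations cover every 64-bit value:

```python
def chars_needed(base: int, bits: int = 64) -> int:
    """Smallest n such that base**n >= 2**bits, using exact integer maths."""
    n = 0
    capacity = 1  # number of distinct strings of length n
    while capacity < 2**bits:
        capacity *= base
        n += 1
    return n


if __name__ == "__main__":
    alphabets = [("decimal", 10), ("hex", 16), ("base32", 32),
                 ("base62", 62), ("base64", 64)]
    for name, base in alphabets:
        print(f"{name:8} {chars_needed(base)} characters")
```

Notably, base62 also fits 64 bits into 11 characters, so dropping the two punctuation symbols costs nothing in length at this size.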
$ for i in {1..3}; do head -c9 /dev/urandom | base64 | tr '+/' '_-'; done
fFwtamv4YJXP
HB1C9dG7cHGQ
-HTzHvHLwpyf
https://youtu.be/gocwRvLhDf8?t=85
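The shell loop above reads 9 bytes (72 bits), which is why it prints 12 characters rather than 11. For exactly 64 bits, one option is to encode 8 bytes and strip the single "=" of padding. A minimal Python sketch, where new_id and decode_id are hypothetical helper names:

```python
import base64
import secrets


def new_id() -> str:
    """Return 64 random bits as an 11-character URL-safe base64 string."""
    raw = secrets.token_bytes(8)  # 8 bytes = 64 bits from the OS CSPRNG
    # urlsafe_b64encode uses the RFC 4648 alphabet: '-' and '_' replace '+' and '/'.
    # 8 bytes encode to 12 characters, the last one being '=' padding; strip it.
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def decode_id(text: str) -> int:
    """Turn an 11-character ID back into its 64-bit integer."""
    raw = base64.urlsafe_b64decode(text + "=")  # restore the stripped padding
    return int.from_bytes(raw, "big")


if __name__ == "__main__":
    for _ in range(3):
        print(new_id())
```

Being able to decode the string back to an integer means the database can store a plain 64-bit column while URLs carry the short text form.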
A UUID is 128 bits wide, i.e. values up to 2^128-1, and whilst the probability that a UUID will be duplicated is not zero, it is close enough to zero to be negligible. However, it’s opaque and generic, and 64 bits is more than enough. For better 128-bit identifiers, look at https://github.com/ulid/spec
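To give a feel for what the ULID layout buys you, here is a rough Python sketch of the encoding described in that spec: a 48-bit millisecond timestamp followed by 80 random bits, written as 26 characters of Crockford base32. It is an illustration only, not a complete implementation (it skips the spec's monotonic ordering within the same millisecond), and new_ulid is a made-up name.

```python
import secrets
import time

# Crockford base32 alphabet: digits plus letters, without I, L, O and U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid(timestamp_ms=None) -> str:
    """Build a 26-character ULID-style string (timestamp first, then randomness)."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000  # milliseconds since the Unix epoch
    if not 0 <= timestamp_ms < 2**48:
        raise ValueError("timestamp must fit in 48 bits")
    value = (timestamp_ms << 80) | secrets.randbits(80)  # 128 bits in total
    chars = []
    for _ in range(26):  # 26 characters x 5 bits covers the 128 bits
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))  # most significant character first


if __name__ == "__main__":
    print(new_ulid())
```

Because the timestamp occupies the leading characters, IDs created in different milliseconds sort by creation time as plain strings, which is something a random UUID cannot offer.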
Classes they should teach at CompSci classes: How to make an Identifier that doesn't SUCK. Hint: 64bit is good enough & it shouldn't always need be a UUID.
— Kai Hendry (@kaihendry) October 19, 2021
small touch but I really like how Stripe prefixes IDs with {sub,cus,evt}_* and especially sk_test_* for the test secret
— TJ Holowaychuk 🇺🇦 (@tjholowaychuk) August 3, 2016
It turns out that *ULIDs* are probably the best choice for the Primary Keys of your database.
(better than UUIDs)
Here's why:
- https://t.co/LStZhLBHFh
- https://t.co/jbPsN9qTIw
- https://t.co/NdAvagFpEq
Thanks @Derek_Perkins
Re: https://t.co/FONhWCeCmd
— Beyond Code (AJ ONeal) (@_beyondcode) October 20, 2021
ULID aside, I do prefer base64-encoded 64-bit numbers, aka the “YouTube-style identifier”: https://stackoverflow.com/questions/69675161/how-to-generate-a-youtube-id-in-go
Although unlikely, a duplicate (aka collision) random 64-bit number is possible in a large public system, so generated identifiers should ideally be checked before use. To play it safe, combine a timestamp with a YouTube-style ID.
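How unlikely is unlikely? The usual birthday-bound approximation gives a rough answer, and a retry loop covers the "check before use" part. A minimal Python sketch, where the function names and the in-memory set are stand-ins (a real system would more likely lean on a unique constraint in its database):

```python
import math
import secrets


def collision_probability(n_ids: int, bits: int = 64) -> float:
    """Birthday-bound approximation: 1 - exp(-n(n-1) / (2 * 2**bits))."""
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * 2**bits))


def new_unique_id(taken: set) -> int:
    """Draw random 64-bit numbers until one is not already in `taken`."""
    while True:
        candidate = secrets.randbits(64)
        if candidate not in taken:  # the check before use
            taken.add(candidate)
            return candidate


if __name__ == "__main__":
    for n in (10**6, 10**9, 2**32):
        print(f"{n:>13,} ids: {collision_probability(n):.3g}")
```

With a million random 64-bit IDs the chance of any collision is roughly 3 in 100 million; at about 4.3 billion IDs (2^32) it climbs to roughly 39%, which is where the check starts to earn its keep.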
I’d argue that for a private system, 64 bits should be more than enough to identify everything you need (use a prefix!), like so: https://github.com/auth0/id-generator
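As a rough illustration of the prefix idea (this is not the API of the library linked above, and the names prefixed_id and split_id are hypothetical), a Python sketch might look like this:

```python
import base64
import secrets


def prefixed_id(prefix: str) -> str:
    """Return e.g. 'cus_' followed by 64 random bits as 11 base64url characters."""
    # The random part may itself contain '_', so keep underscores out of the
    # prefix; that way the first '_' always marks the boundary.
    if not prefix.isalnum():
        raise ValueError("prefix must be alphanumeric")
    raw = secrets.token_bytes(8)  # 64 random bits
    body = base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")
    return f"{prefix}_{body}"


def split_id(identifier: str):
    """Split 'cus_xxxxxxxxxxx' into its prefix and random part."""
    prefix, _, body = identifier.partition("_")  # splits on the first '_' only
    return prefix, body


if __name__ == "__main__":
    print(prefixed_id("cus"))
    print(prefixed_id("evt"))
```

The prefix makes an ID self-describing in logs and support tickets, and lets code reject an event ID passed where a customer ID was expected before it ever reaches the database.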