Generating Random Unique IDs
While looking at the Aftership API, I noticed their public tracking IDs looked interesting. Example: h4qys6mtkkhnkkjvco35a017
This led me down a path of trying to understand what those values could be and how to make them on my own.
Considerations
- Do these IDs need to be sortable by time?
- roughly time sortable?
- Firebase uses a modified base64 alphabet so that sorting still works lexicographically, and they do this on the client side!
- Modeled after base64 web-safe chars, but ordered by ASCII.
- Size requirements?
- 64-bit integers cannot be represented exactly in some languages. JavaScript numbers are doubles, so integers above 2^53 - 1 lose precision.
- always provide IDs as strings, not numerics.
- Case Sensitivity
- file naming issues
- does your alphabet contain things like /?
- Limiting curse words in hashes
- There are ways to prevent bad words from being generated in the IDs.
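To make the sortability point concrete, here is a minimal sketch of why alphabet order matters. It assumes fixed-width encoding of an integer timestamp; the 64-character ASCII-ordered alphabet is the one commonly attributed to Firebase push IDs, and the helper name `encode_fixed` is made up for this example.

```python
import string

# 64 web-safe characters in ascending ASCII order (assumed here to match
# the alphabet Firebase push IDs use).
ORDERED_ALPHABET = "-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz"
# Web-safe base64 ordering from RFC 4648, which is NOT in ASCII order.
URLSAFE_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"


def encode_fixed(value: int, alphabet: str, width: int = 8) -> str:
    """Encode a non-negative integer as a fixed-width, big-endian string."""
    base = len(alphabet)
    if value < 0 or value >= base ** width:
        raise ValueError("value does not fit in the requested width")
    chars = []
    for _ in range(width):
        value, digit = divmod(value, base)
        chars.append(alphabet[digit])
    return "".join(reversed(chars))


# Increasing "timestamps"; 26 and 52 map to 'a' and '0' in the RFC 4648 ordering.
timestamps = [1, 26, 52, 63, 64]
ordered = [encode_fixed(t, ORDERED_ALPHABET) for t in timestamps]
standard = [encode_fixed(t, URLSAFE_ALPHABET) for t in timestamps]

print(ordered == sorted(ordered))    # True: string order matches numeric order
print(standard == sorted(standard))  # False: 'a' sorts after '0'
```

The fixed width matters as much as the alphabet: variable-length strings would sort "9" after "10".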
Encodings
- Base16
- Base32
- alphanumeric, single case
- Base58
- alphanumeric
- excludes letters which might look ambiguous when printed (0 - zero, I - capital i, O - capital o and l - lower case L).
- Base62
- alphanumeric, excludes + and / from the alphabet
- Base64
- Certain characters, notably “+” and “/” in the base 64 alphabet, are treated as word-breaks by legacy text search/index tools.
See RFC4648 for Base16, Base32 and Base64 specs.
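As a rough illustration of how these alphabets get used, the sketch below encodes an integer in an arbitrary alphabet and also draws a random fixed-length ID. The Base58 alphabet shown is the Bitcoin-style one; Base62 has no single standard ordering, so the one here is an assumption. The function names are invented for this example, and nothing here claims to be how Aftership builds its IDs.

```python
import secrets
import string

# Bitcoin-style Base58: no 0, O, I, or l.
BASE58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
# One possible Base62 ordering (assumption: digits, then upper, then lower).
BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase
# Single-case alphanumeric, like the characters in the example ID above.
BASE36 = string.digits + string.ascii_lowercase


def encode_int(value: int, alphabet: str) -> str:
    """Encode a non-negative integer using the given alphabet (big-endian)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return alphabet[0]
    base = len(alphabet)
    digits = []
    while value:
        value, remainder = divmod(value, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))


def random_id(alphabet: str, length: int = 24) -> str:
    """Pick `length` characters uniformly at random from `alphabet`."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(encode_int(secrets.randbits(128), BASE58))  # 128 random bits in Base58
print(random_id(BASE36))                          # 24 lowercase alphanumerics
```

Sampling characters directly, as `random_id` does, avoids the modulo bias that creeps in when random bytes are mapped onto an alphabet whose size is not a power of two.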
UUIDs
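There are several versions. Version 4 is the most widely used and is almost entirely random: 122 of its 128 bits are random, and the rest encode the version and variant.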
- 128 bits
- UUIDv4 values are not sortable by time
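For reference, a short sketch using Python's standard `uuid` module to generate a version 4 UUID and confirm its size:

```python
import uuid

u = uuid.uuid4()

print(u)                 # hyphenated hex string, 36 characters long
print(u.version)         # 4
print(len(u.bytes) * 8)  # 128 (bits)
```

Because the value is random, two UUIDs generated a second apart have no predictable ordering.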
ULIDs
- Sortable by time
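A minimal sketch of how a ULID gets its time ordering, based on my reading of the ULID spec rather than anything stated above: a 48-bit millisecond timestamp followed by 80 random bits, written as 26 characters of Crockford Base32. The function name `new_ulid` is made up for this example, and the sketch skips the spec's optional monotonic handling within the same millisecond.

```python
import secrets
import time
from typing import Optional

# Crockford Base32: excludes I, L, O, U and is in ascending ASCII order.
CROCKFORD32 = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid(timestamp_ms: Optional[int] = None) -> str:
    """Build a ULID-style string: 48-bit ms timestamp + 80 random bits."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    if not 0 <= timestamp_ms < (1 << 48):
        raise ValueError("timestamp must fit in 48 bits")
    # Timestamp goes in the high bits so it dominates the sort order.
    value = (timestamp_ms << 80) | secrets.randbits(80)
    chars = []
    for _ in range(26):  # 26 chars * 5 bits = 130 bits, top 2 bits are zero
        value, digit = divmod(value, 32)
        chars.append(CROCKFORD32[digit])
    return "".join(reversed(chars))


earlier = new_ulid(1_700_000_000_000)
later = new_ulid(1_700_000_000_001)
print(earlier < later)  # True: the later timestamp sorts after the earlier one
```

IDs created within the same millisecond still compare by their random part, so ordering inside a millisecond is arbitrary in this sketch.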
Other
Distributed Key Generation
- Twitter’s Snowflake
- MongoDB’s ObjectID
- Flickr uses two ticket DBs (one on odd numbers, the other on even) to avoid a single point of failure
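To give a feel for the Snowflake approach, here is a sketch under stated assumptions. The bit layout (41-bit millisecond timestamp, 10-bit worker ID, 12-bit sequence) is the commonly described Snowflake layout, not something taken from this post; the class name and the custom epoch value are hypothetical.

```python
import threading
import time

EPOCH_MS = 1_600_000_000_000  # hypothetical custom epoch (assumption)
WORKER_BITS = 10
SEQUENCE_BITS = 12
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1


class SnowflakeLikeGenerator:
    """Generates 63-bit, roughly time-ordered integer IDs for one worker."""

    def __init__(self, worker_id: int, epoch_ms: int = EPOCH_MS):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id must fit in 10 bits")
        self.worker_id = worker_id
        self.epoch_ms = epoch_ms
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = time.time_ns() // 1_000_000
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & SEQUENCE_MASK
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = time.time_ns() // 1_000_000
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - self.epoch_ms) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )


gen = SnowflakeLikeGenerator(worker_id=5)
print(str(gen.next_id()))  # hand IDs to clients as strings, not numbers
```

Each worker can generate IDs without coordinating with the others, as long as worker IDs are unique. The values exceed 2^53, which is why the earlier advice to expose IDs as strings applies here.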
MongoDB ObjectID
- 4-byte timestamp value, representing the ObjectId’s creation, measured in seconds since the Unix epoch
- 5-byte random value
- 3-byte incrementing counter, initialized to a random value
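The three fields above can be sketched as follows. This is an illustration of the layout, not MongoDB's driver code: the function name is made up, and generating the 5-byte random value once per process is my understanding of the spec rather than something stated in the list.

```python
import itertools
import secrets
import struct
import time
from typing import Optional

# Assumption: the 5-byte random value is chosen once per process.
_PROCESS_RANDOM = secrets.token_bytes(5)
# 3-byte counter, initialized to a random value.
_counter = itertools.count(secrets.randbelow(1 << 24))


def new_object_id(now: Optional[int] = None) -> str:
    """Return a 12-byte ObjectID-style value as 24 hex characters."""
    timestamp = int(time.time()) if now is None else now
    counter = next(_counter) % (1 << 24)   # wrap around at 3 bytes
    raw = (
        struct.pack(">I", timestamp)       # 4-byte big-endian seconds
        + _PROCESS_RANDOM                  # 5-byte random value
        + counter.to_bytes(3, "big")       # 3-byte incrementing counter
    )
    return raw.hex()


oid = new_object_id()
print(oid, len(oid))  # 24 hex characters
```

Putting the big-endian timestamp first is what makes these IDs roughly sortable by creation time, at one-second resolution.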
Resources
- Sharding & IDs at Instagram
- Honeybadger: Going deep on UUIDs and ULIDs
- Callicoder: Generating unique IDs in a distributed environment at high scale.