A simple way is to use Modular Multiplicative Inverse. You can keep using the serial ID and obscure it for public consumption. FROM things WHERE scramble_thing_id(thing_id, salt) = 'dQw4w9WgXcQ' ĮDIT: Clarification, also I may be developing a form of dyslexia that only affects reading before I post the following is a very cheap index scan Use plpgsql if you want */ĬREATE INDEX scramble_thing_id ON things (scramble_thing_id(thing_id, salt)) Do the following only if an SQL-centric approach is desirableĬREATE FUNCTION scramble_thing_id(the_id BIGINT, the_salt SMALLINT) RETURNS text LANGUAGE sql IMMUTABLE /* more options */ AS $f$ Salt SMALLINT NOT NULL DEFAULT some_random() ,ĬONSTRAINT pk_things PRIMARY KEY (thing_id) If you do the latter you can then index the expression itself and perform index scans directly with the query string.Įxample (completely untested): CREATE TABLE things( My recommendation is to use a traditional serial and a randomly generated smallint salt which does not need uniqueness checks and allows you to scramble & encode the binary value in a deterministic, reversible way, in a character set with a reasonably large base to achieve short strings (base64 would work but some of the characters are not URL-friendly).įor the obfuscation, you can then either use the client code or do it through an SQL expression/immutable function. They also are monstrously wide at 16 bytes of storage). That being said should I try to avoid any solution that involves a serial column and instead find an alternate way to do paging?Īlthough undeniably handy, purely randomly generated primary keys are not necessarily good practice as they completely degenerate the binary tree in the single most important index in the table (such as the widespread habit of using UUIDs when not necessary. *Edit: Additionally, I've heard things about serial IDs not scaling to networked solutions well. This could work but the IDs are longer than my preference and I'm not experienced to know what the pros/cons of this system would be.Įager to hear about any advice/thoughts/systems you've had experience with. So if we were to make the primary ID a randomly generated string like youtube does, then what's the best way to preserve page sorting? Is it bad practice to have a randomly generated technical ID along with a serial column? Maybe it's better to have a created timestamp column?Īn interesting alternative based on what instagram did is described here (basically also what twitter did): /9/a-better-id-generator-for-postgresql I like the way the youtube ids are formatted, but I probably don't need as many characters. Preferably all IDs are the same length and are relatively short for user experience (6-8 characters). createdTimestamp column or is there a better way? A way to safely page through entries if primary key is randomly generated like youtube does. I'd like to have a system that fulfills the following: I'm currently using a single technical serial "id" column in my DB. non sequential, so not telling you that you're viewing the 324234th video uploaded to youtube). This seems to be a very common system used for sites all around the web.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |