Unique Identifiers – Part 2, About Time

Time pervades all things. I remember in my first Geometry class, the teacher talked about three dimensions, and time being the fourth. This concept bothered me for months; it just didn’t seem right to me. A line, for instance, cannot be if it doesn’t exist for a period of time. And when old teach was trying to describe to us what a point was, the geometric concept of it, my confusion cleared up. A point, he said, had no length, breadth, or height; it was only a point. At which one of my little imaginary friends whispered in my ear, “But it has time. It is a point in time.”

I digress from the subject at hand to make my own point. Mankind’s understanding of the true nature of time is weak. We may think we can go back and forth in time if we can just find a DeLorean and build a flux capacitor out of junk parts in the garage. Or we can accept Stephen King’s explanation of the Langoliers, which eat away the past after the present is gone.

In my previous missive on Unique ID’s, I went through all the mental hoops I could to describe how an ID generator could itself be identified by thread, process, computer, and public parts. Now, we only require a series of numbers to finish off the Unique ID. That could be as simple as a counting loop within the thread of execution, and we might want to keep that in mind.

But usually, computer programs have a very simple way to get a time value, and that can come to our aid. At the risk of defying the You Ain’t Gonna Need It (YAGNI) concept, we’ll go on to figure out how we can incorporate time into our Unique ID’s, or at least leave room for implementation if we want.

Suppose we have gone through all the hoops to figure out “the best thing to do” for our computer process, and we have two or more items of equal weight (whatever we’ve defined that to be) in contention to be the winner. We could pick a random choice, or as random as our machine will give us, and hope that luck will lead us clearly to our goals. But if we keep arriving at this same decision, we could end up with a value that is never chosen, our luck having gone sour and karma having caught up with us. So, one way out of this mess is to choose the oldest value.

Having a timestamp at this point in the consideration process is extremely handy, in that we don’t need to implement another decision level. We just pick the ID with the smallest time value. Next time we’re at this juncture, our old value will be gone, and we’ll pick the oldest ID now available.
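To make the tie-break concrete, here is a minimal sketch in Python. The `Candidate` record and its field names are my own invention for illustration, and it assumes the time value has already been pulled out of the ID as POSIX seconds. It is one way to do it, not the final design.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical record: field names are assumptions for this sketch."""
    weight: int    # whatever "best" has been defined to mean
    created: int   # POSIX seconds embedded in the ID
    ident: str     # the remaining parts of the ID


def pick(candidates):
    """Highest weight wins; among equal weights, the smallest time value wins."""
    return min(candidates, key=lambda c: (-c.weight, c.created))


queue = [
    Candidate(weight=5, created=300, ident="a"),
    Candidate(weight=5, created=100, ident="b"),
    Candidate(weight=3, created=50, ident="c"),
]

winner = pick(queue)      # "b": tied on weight with "a", but older
queue.remove(winner)
runner_up = pick(queue)   # "a": now the oldest of the best
```

Because the oldest entry always wins the tie, no value can be passed over forever the way it could with a coin flip.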

One of the downsides of doing this is that we’re now shifting our decision process from a comparison of whole ID’s to a relation of embedded values. That will add complexity to our program thread, and we’ll only be able to determine its effectiveness when we finally get to implementation.

If we look back at the original spec for UUID’s, we see that time is captured at 100 nanosecond intervals, which gives a granularity of 10 million values per second. When programming, though, we’re very lucky if we can get anything better than granularity to the second without using an external library or weird procedure. I’ll suggest here, then, that we keep our programming simple and call seconds “close enough” for our purpose. This will probably bite me in the butt later as well, because YAGNI.
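For a sense of scale, here is a small Python sketch that moves between whole POSIX seconds and UUID-style 100 nanosecond ticks. The function names are my own. The offset constant is the number of ticks between the UUID spec’s starting date (October 15, 1582) and the POSIX epoch, which is a detail from the spec rather than anything settled in this series.

```python
TICKS_PER_SECOND = 10_000_000            # 100 ns intervals in one second
GREGORIAN_OFFSET = 0x01B21DD213814000    # ticks from 1582-10-15 to 1970-01-01


def uuid_ticks_from_seconds(posix_seconds: int) -> int:
    """Convert whole POSIX seconds to a UUID-style 100 ns tick count."""
    return posix_seconds * TICKS_PER_SECOND + GREGORIAN_OFFSET


def seconds_from_uuid_ticks(ticks: int) -> int:
    """Drop the sub-second part and return whole POSIX seconds."""
    return (ticks - GREGORIAN_OFFSET) // TICKS_PER_SECOND
```

Settling for seconds means every ID made within the same second carries the same time value, so something else in the ID (a counter, say) still has to tell them apart.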

So, let’s look at the value that gets returned when we ask for a POSIX time value. It is “the number of seconds elapsed since midnight UTC of January 1, 1970, not counting leap seconds.” If we store this value in a signed 32-bit integer, we’ll run out of space on January 19th, 2038. But this doesn’t bother us, as we’re adamant about storing our ID in base 32, and seven characters (32^7 seconds, or about 1,089 years) will get us through the rest of this millennium.
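Here is a rough Python sketch of that seven-character encoding. The alphabet is an assumption on my part (digits, then A through V), chosen only because it keeps the text sorting in the same order as the numbers. The function names are mine as well.

```python
from datetime import datetime, timedelta, timezone

# Assumed alphabet: ASCII order matches numeric order, so fixed-width
# strings sort the same way the underlying numbers do.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUV"
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_seconds(seconds: int, width: int = 7) -> str:
    """Encode POSIX seconds as a fixed-width base-32 string."""
    if not 0 <= seconds < 32 ** width:
        raise ValueError("time value does not fit in the given width")
    chars = []
    for _ in range(width):
        seconds, digit = divmod(seconds, 32)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))


def decode_seconds(text: str) -> int:
    """Reverse of encode_seconds."""
    value = 0
    for ch in text:
        value = value * 32 + ALPHABET.index(ch)
    return value


# Where a signed 32-bit counter gives out: 2038-01-19 03:14:07 UTC
limit_32_bit = EPOCH + timedelta(seconds=2 ** 31 - 1)

# Where seven base-32 characters give out: some time in the year 3058
limit_base_32 = EPOCH + timedelta(seconds=32 ** 7 - 1)
```

Calling `encode_seconds(int(time.time()))` after an `import time` would give the current value. Padding to a fixed width matters here, since that is what lets a plain string comparison find the oldest ID.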

Something cropped up when I was originally researching UUID’s for this project a few years back: what do you do when someone sets the clock backwards? Well, we could just ignore it if we trusted our system to check for duplicate values. That’s not in my nature, though, as we should always try to do the right thing when dealing with lower level functionality.

The UUID spec says they would use a Clock ID (the spec calls it a clock sequence) set at a random value, then increment the ID by one if a reset was detected. Well, that’s fine, it keeps the ID unique. But it doesn’t help with the sorting, because you’d have to intersperse the clock ID into the right spot in the series somehow. Thankfully, these resets are few and far between, and hopefully the clock resets on a fairly regular basis.
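The Clock ID idea can be sketched in a few lines of Python. The class name, the injected `clock` function, and the choice to keep the Clock ID small enough for one base-32 character are all assumptions for illustration; the UUID spec uses a wider field.

```python
import random
import time


class TimeSource:
    """Hands out (seconds, clock_id) pairs; bumps clock_id on a backwards step."""

    CLOCK_ID_RANGE = 32  # assumed: one base-32 character's worth

    def __init__(self, clock=time.time):
        self._clock = clock                      # injectable so it can be tested
        self._last = int(clock())
        self.clock_id = random.randrange(self.CLOCK_ID_RANGE)

    def next(self):
        now = int(self._clock())
        if now < self._last:
            # The clock went backwards: change the Clock ID so new values
            # cannot collide with ones issued before the reset.
            self.clock_id = (self.clock_id + 1) % self.CLOCK_ID_RANGE
        self._last = now
        return now, self.clock_id
```

This keeps the pairs unique across a reset, but as noted above it does nothing for sorting: a pair issued after the reset can still carry a smaller time value than one issued before it.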

That just about wipes me out as far as using time as part of our Unique ID scheme. I ain’t gonna use it, mostly because of the shifting context problem. We’ll think about keeping space for it in our final version, as we can capture it with eight characters.
