SRCREV and git packages

Werner Almesberger werner at openmoko.org
Thu Jun 5 00:47:02 CEST 2008


Andy Green wrote:
> Isn't this an issue because only we don't generate source packages, you
> need a magic tag to go to in the scm to recover the sources
> corresponding to the binary package?

As I understand the problem, you need that "magic tag" (i.e., a
sequence number) anyway, since the git hash doesn't allow you to
easily tell which of two heads is the more recent one.

For simple queries (e.g., "is there something I don't have yet ?"),
I think the git hash should be a sufficiently good approximation for
the magic tag that one could do without it. (I'd like to see Holger's
opinion on that, though.)

For more complex queries (e.g., "is this more recent than X ?"), the
magic tag looks like a reasonable solution.

The problem is to translate git hashes, which are easily obtained,
to sequence numbers.

This could be solved with a dictionary that allows one to translate
from git hashes to sequence numbers and back.
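
A minimal sketch of such a two-way dictionary, in Python. This is only an
illustration, not something OE provides: the class name RevisionDictionary,
its methods, and the hash strings are all hypothetical.

```python
class RevisionDictionary:
    """Two-way mapping between git hashes and sequence numbers (hypothetical)."""

    def __init__(self):
        self._seq_by_hash = {}
        self._hash_by_seq = {}

    def add(self, git_hash, seq):
        # Refuse entries that would make the two directions disagree.
        if self._seq_by_hash.get(git_hash, seq) != seq:
            raise ValueError(f"{git_hash} already maps to another sequence number")
        if self._hash_by_seq.get(seq, git_hash) != git_hash:
            raise ValueError(f"sequence {seq} already maps to another hash")
        self._seq_by_hash[git_hash] = seq
        self._hash_by_seq[seq] = git_hash

    def seq_of(self, git_hash):
        return self._seq_by_hash[git_hash]

    def hash_of(self, seq):
        return self._hash_by_seq[seq]


# Placeholder hashes, shortened for readability.
d = RevisionDictionary()
d.add("a1b2c3", 1)
d.add("d4e5f6", 2)
```

Keeping both directions in one object means a lookup either way is a single
dictionary access; the cost lies in building and distributing the contents.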

One way to obtain such a dictionary would indeed be through creating
it when making a source package, but, if we only look at the dictionary
side of it, the overhead would be phenomenal.

Please note that I'm not entirely sure we actually need sequence numbers
for acceptable performance. OE uses them because all other SCMs provide
them for free. But there may be a number of currently unexplored
constraints we can apply that would provide sufficiently similar
functionality without them.

Examples:

Q: "Do I have the latest version of package X ?"
A1: local_X.seq < upstream_X[latest].seq
A2: local_X.hash != upstream_X[latest].hash

  A1 is how this question would be answered if all we had were sequence
  numbers, like with CVS, SVN, etc.

  A2 is a close approximation of A1 that does not need sequence numbers.
  It would only yield a different answer if upstream actually reverted
  to a version earlier than the local one.

  Whether the local version should be rightfully considered to be
  "newer" in this case is a philosophical question.
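
The difference between A1 and A2 can be sketched in Python as follows. The
Revision type, the function names, and the sample values are assumptions made
for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    hash: str  # git hash (placeholder strings below)
    seq: int   # sequence number, as CVS/SVN-style SCMs would provide


def outdated_by_seq(local, upstream_latest):
    # A1: needs sequence numbers on both sides.
    return local.seq < upstream_latest.seq


def outdated_by_hash(local, upstream_latest):
    # A2: needs only the hashes.
    return local.hash != upstream_latest.hash


local = Revision("c3", 3)
same = Revision("c3", 3)
newer = Revision("d4", 4)
reverted = Revision("b2", 2)  # upstream went back to an earlier revision
```

The two checks agree for `same` and `newer`. They differ only for `reverted`,
where A1 reports nothing new and A2 reports a change.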

Q: "Is my local version of package X > sequence N ?"
A1: sequence(local_X) > N
A2: local_X.seq > N

  A1 answers the question by generating the sequence number, which is
  a resource-intensive process.

  A2 answers the question by looking it up in the local package database.
  In order to be able to do so, this information has to have been recorded
  along with the hash (which I assume here to be the primary designator
  for a specific revision).
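
A2 could look roughly like this, assuming a local package database that is
just a dictionary keyed by package name. The names package_db, record_install
and is_newer_than are hypothetical.

```python
# Hypothetical local package database: name -> recorded revision info.
package_db = {}


def record_install(name, git_hash, seq):
    # The hash is the primary designator; the sequence number is stored
    # next to it at install time so it never has to be regenerated.
    package_db[name] = {"hash": git_hash, "seq": seq}


def is_newer_than(name, n):
    # A2: a plain lookup, no access to the repository needed.
    return package_db[name]["seq"] > n


record_install("X", "d4e5f6", 42)
```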

Q: "Is my local version of package X >= hash G ?"
A1: local_X.hash >= G
A2: local_X.seq >= upstream_X[G].seq
A3: G in local_X.hashes (*)

  A1 can only be answered if we have defined a partial order over hashes.
  In our case, this would be an expensive operation.

  A2 is equally expensive, except if we have access to a dictionary.

  A3 assumes that, at the time we installed X, we recorded all the hashes
  in the history of X (which were trivially accessible at that time).

  Note that A3 only answers Q correctly if changesets are never removed
  from the revision history. I think the current git-rev-list method has
  effectively the same requirement.

  (*) This is a bit simplified. If our local record of hashes also includes
      versions more recent than the one presently installed, we would have
      to do a comparison on that partial order as well.
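
A3 can be sketched as a set membership test. The sketch assumes the recorded
history contains only the installed revision and its ancestors (i.e., the
simplified case, without the complication from the footnote). The record
layout and function names are hypothetical.

```python
def record_history(installed_hash, history):
    """Build the local record for a package at install time.

    `history` is every hash in the revision history up to and including
    `installed_hash`.
    """
    if installed_hash not in history:
        raise ValueError("installed revision must be part of its own history")
    return {"hash": installed_hash, "hashes": frozenset(history)}


def at_least(local, g):
    # A3: G is in our history, so the local version is G or a descendant.
    return g in local["hashes"]


local_X = record_history("c3", ["a1", "b2", "c3"])
```

With this record, `at_least(local_X, "b2")` holds, while a hash that was never
part of the recorded history, such as "d4", yields False.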

Q: "I need a version > sequence N of upstream package X. I have no local
   information about X. Which hash do I ask for ?"
A: The latest.

  Trivial, uh ? :-) Note that this answer is incorrect if no such version
  exists. This should be a highly anomalous situation, but we'd have to
  check for it anyway (after downloading that latest version).
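
A sketch of "ask for the latest, then check". How the sequence number of the
downloaded version is obtained is left open here: seq_after_download is a
hypothetical callable standing in for that step, and the values are made up.

```python
def choose_hash(latest_hash, seq_after_download, n):
    """Return latest_hash if its sequence number is > n, else raise."""
    seq = seq_after_download(latest_hash)
    if seq <= n:
        # The anomalous case: no version newer than n exists upstream.
        raise LookupError(f"latest version has sequence {seq}, need > {n}")
    return latest_hash


# Stand-in for "download, then determine the sequence number".
fake_upstream = {"d4e5f6": 42}
chosen = choose_hash("d4e5f6", fake_upstream.__getitem__, 40)
```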

Q: As above, but we want any version for which A <= sequence N <= B.

  Assuming that version A exists at the time the package description is
  made, the package description maker could include a hint in the
  description that indicates which hash corresponds to sequence A.

Q: As above, but we want the _latest_ version for which A <= sequence N <= B.

  I think this is the point where we need a full dictionary.
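
With a full dictionary from sequence numbers to hashes, the query becomes a
simple search. A sketch, where the function name and the sample dictionary
are assumptions:

```python
def latest_in_range(hash_by_seq, a, b):
    """Return the hash with the highest sequence number N, a <= N <= b.

    `hash_by_seq` maps sequence numbers to hashes. Returns None if no
    known version falls inside the range.
    """
    candidates = [seq for seq in hash_by_seq if a <= seq <= b]
    if not candidates:
        return None
    return hash_by_seq[max(candidates)]


# Placeholder dictionary; note that sequence 4 is missing.
upstream_X = {1: "a1", 2: "b2", 3: "c3", 5: "e5"}
```

Here `latest_in_range(upstream_X, 2, 4)` picks "c3", which a single hint for
sequence 2 could not have told us.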

Sorry for the not overly rigid pseudo-formal notation. It's more meant
to illustrate the operations than to provide a proper framework.

- Werner


