<br><div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">From:&nbsp;Ian Darwin &lt;<a href="mailto:ian@darwinsys.com">ian@darwinsys.com</a>&gt;<br>

I did mention when I jumped in that I was talking about a slightly different problem than what you seem to be trying to solve. That said...<br>

<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">

Storing your MD5s will let you know *if* you are repeating a build. &nbsp;It will not (reasonably) let you repeat a build.<br>

</blockquote>

<br>

You completely misunderstood what I said. Storing the file name, URL *and* its MD5 lets you be sure you are able to reproduce the build. And it&#39;s the only way you can (well, we actually use SHAs and RMDs in addition to MD5s).<br>

</blockquote><div><br>If I misunderstood it, I still do.&nbsp; MD5 absolutely will not help you reproduce a build.&nbsp; It will only help you verify that your procedure (whatever it is) retrieved the same files.&nbsp; It won&#39;t help you find the correct file if the file you got was wrong.<br>
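<br>To make the distinction concrete, here is a minimal Python sketch of what a stored manifest can and cannot do. The manifest layout (name, url, md5) and the function names are assumptions for illustration only, not anything Hudson or any particular ports tree uses. It reports which files differ from what was recorded; it has no way to fetch the right one.<br>
<pre>
```python
import hashlib
import os
import tempfile


def md5_of(path):
    """Return the hex MD5 digest of the file at path, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(manifest, directory):
    """Return the names of files that are missing or whose MD5 differs.

    manifest is a list of dicts with hypothetical keys: name, url, md5.
    The url is only recorded here; this function never downloads anything.
    """
    bad = []
    for entry in manifest:
        path = os.path.join(directory, entry["name"])
        if not os.path.isfile(path) or md5_of(path) != entry["md5"]:
            bad.append(entry["name"])
    return bad
```
</pre>
A non-empty result from <code>verify</code> only says that the inputs changed. Getting the correct file back still depends on the URL or the repository continuing to serve it.<br>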

</div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>

It does not guarantee to build *exactly* the same binary, but you can *never ever* make that guarantee in general because of changes to shared libraries (in the Java world you have a greater chance, but even then, even the smallest point upgrade to the JDK could in theory change the generated .class files), so there is really no point whatever using MD5s on generated binaries. None. I have no idea why Hudson bothers, if that&#39;s what it&#39;s using them for.<br>

</blockquote><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">

You need some way of identifying the file you want to build *to the revision control system* (so you can download that version) if you want repeatable builds.<br>

</blockquote>

<br>

In the case I&#39;m talking about - where there are about 5,000 different external programs from about 2,500 different sources - you don&#39;t even think about going to the revision control system (6 or 8 different services over 2,000 different hosts). You can ONLY think in terms of ftp/http download of version-numbered tarballs. Period.<br>

</blockquote><div><br>No, you MOSTLY think in terms of ftp/http download.&nbsp; For the projects on which you&#39;re doing work (at least) you absolutely go to the revision control system.<br><br>Are we talking about the same thing?&nbsp; I&#39;m talking about a repeatable build so I can do development work, modifying existing code.&nbsp; To do that, you&#39;d better work with the revision control system.&nbsp; Even if you don&#39;t have submit access (which the vast majority of devs probably shouldn&#39;t have), in some cases you have an RCS that supports creating smart diffs that you can send to someone to review &amp; submit trivially (e.g. git or darcs), and in other cases you almost certainly want to submit a patch from the latest possible revision.<br>

</div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">

MD5s sound nice to verify, if you don&#39;t trust your revision control system (or perhaps the admins ;-)<br>

</blockquote>

<br>

See above for why revision control is completely irrelevant to us. </blockquote></div><br>Who is &#39;us&#39;?<br><br>Bobby<br clear="all">-- <br>If it doesn&#39;t make you smile, you&#39;re doing something wrong.