[PATCH] makepkg: introduce SOURCE_DATE_EPOCH


[PATCH] makepkg: introduce SOURCE_DATE_EPOCH

Allan McRae
This patch introduces the SOURCE_DATE_EPOCH environment variable.  All files
in a package are adjusted to have their modification times set to the value
of SOURCE_DATE_EPOCH, which defaults to the output of "date +%s".

Setting this variable allows a package that is built twice in the same
environment to be (potentially) reproducible in that the checksum of the
generated package file will be the same.

Signed-off-by: Allan McRae <[hidden email]>
---
 scripts/makepkg.sh.in | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
index c019ae3b..529b51f7 100644
--- a/scripts/makepkg.sh.in
+++ b/scripts/makepkg.sh.in
@@ -87,6 +87,8 @@ SPLITPKG=0
 SOURCEONLY=0
 VERIFYSOURCE=0
 
+SOURCE_DATE_EPOCH=${SOURCE_DATE_EPOCH:-$(date +%s)}
+
 PACMAN_OPTS=()
 
 shopt -s extglob
@@ -620,7 +622,6 @@ write_kv_pair() {
 }
 
 write_pkginfo() {
- local builddate=$(date -u "+%s")
  if [[ -n $PACKAGER ]]; then
  local packager="$PACKAGER"
  else
@@ -654,7 +655,7 @@ write_pkginfo() {
 
  write_kv_pair "pkgdesc" "$spd"
  write_kv_pair "url" "$url"
- write_kv_pair "builddate" "$builddate"
+ write_kv_pair "builddate" "$SOURCE_DATE_EPOCH"
  write_kv_pair "packager" "$packager"
  write_kv_pair "size" "$size"
  write_kv_pair "arch" "$pkgarch"
@@ -738,10 +739,14 @@ create_package() {
  [[ -f $pkg_file ]] && rm -f "$pkg_file"
  [[ -f $pkg_file.sig ]] && rm -f "$pkg_file.sig"
 
+ # ensure all elements of the package have the same mtime
+ find . -exec touch -d @$SOURCE_DATE_EPOCH {} \;
+
  msg2 "$(gettext "Generating .MTREE file...")"
- list_package_files | LANG=C bsdtar -cnzf .MTREE --format=mtree \
+ list_package_files | LANG=C bsdtar -cnf - --format=mtree \
  --options='!all,use-set,type,uid,gid,mode,time,size,md5,sha256,link' \
- --null --files-from - --exclude .MTREE
+ --null --files-from - --exclude .MTREE | gzip -c -f -n > .MTREE
+ touch -d @$SOURCE_DATE_EPOCH .MTREE
 
  msg2 "$(gettext "Compressing package...")"
  # TODO: Maybe this can be set globally for robustness
--
2.12.0
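
As a rough illustration of the mtime normalization the patch performs, the sketch below applies the same idea to a throwaway directory outside of makepkg. The staging directory and the names pkgdir and distinct_mtimes are made up for the example, and it assumes GNU touch, stat and find.

```shell
#!/bin/bash
# Sketch only: mirrors the idea of the patch outside of makepkg.
# Assumes GNU touch (-d @EPOCH), GNU stat (-c %Y) and GNU find.

# Use the caller's value if set, otherwise fall back to the current time.
SOURCE_DATE_EPOCH=${SOURCE_DATE_EPOCH:-$(date +%s)}

# Hypothetical stand-in for a package staging directory.
pkgdir=$(mktemp -d)
mkdir -p "$pkgdir/usr/bin"
echo 'hello' > "$pkgdir/usr/bin/demo"

# Give every file and directory in the tree the same modification time.
find "$pkgdir" -exec touch -d "@$SOURCE_DATE_EPOCH" {} +

# Collect the distinct mtimes that remain; there should be exactly one.
distinct_mtimes=$(find "$pkgdir" -exec stat -c %Y {} + | sort -u)
echo "mtimes in tree: $distinct_mtimes (expected $SOURCE_DATE_EPOCH)"
```

Running it with SOURCE_DATE_EPOCH exported beforehand pins the timestamps to that value; without it, each run uses the current time, which is the non-reproducible default the patch keeps.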

Re: [PATCH] makepkg: introduce SOURCE_DATE_EPOCH

Allan McRae
On 17/04/17 20:41, Allan McRae wrote:

> + # ensure all elements of the package have the same mtime
> + find . -exec touch -d @$SOURCE_DATE_EPOCH {} \;
> +
>   msg2 "$(gettext "Generating .MTREE file...")"
> - list_package_files | LANG=C bsdtar -cnzf .MTREE --format=mtree \
> + list_package_files | LANG=C bsdtar -cnf - --format=mtree \
>   --options='!all,use-set,type,uid,gid,mode,time,size,md5,sha256,link' \
> - --null --files-from - --exclude .MTREE
> + --null --files-from - --exclude .MTREE | gzip -c -f -n > .MTREE
> + touch -d @$SOURCE_DATE_EPOCH .MTREE
>  
>   msg2 "$(gettext "Compressing package...")"
>   # TODO: Maybe this can be set globally for robustness
>

These touch commands have had a -h added.

A
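
To show what -h changes, here is a small sketch, assuming GNU touch and GNU stat: with -h the timestamp is applied to a symbolic link itself rather than to the file it points at. The paths and the epoch value are arbitrary examples.

```shell
#!/bin/bash
# Sketch only; assumes GNU touch and GNU stat.
epoch=1492387200   # arbitrary example value

workdir=$(mktemp -d)
echo data > "$workdir/target"
ln -s target "$workdir/link"

# Give the target a known, different mtime first.
touch -d '@1000000000' "$workdir/target"

# With -h the timestamp is applied to the symlink itself, not its target.
touch -h -d "@$epoch" "$workdir/link"

# GNU stat does not follow symlinks unless -L is given.
link_mtime=$(stat -c %Y "$workdir/link")
target_mtime=$(stat -c %Y "$workdir/target")
echo "link=$link_mtime target=$target_mtime"
rm -rf "$workdir"
```

Without -h, the same touch call would follow the link and change the target instead, leaving the link's own mtime at whatever it was when the link was created.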

Re: [PATCH] makepkg: introduce SOURCE_DATE_EPOCH

Andrew Gregory
In reply to this post by Allan McRae
On 04/17/17 at 08:41pm, Allan McRae wrote:
> This patch introduces the SOURCE_DATE_EPOCH environment variable.  All files
> in a package are adjusted to have their modification times set to the value
> of SOURCE_DATE_EPOCH, which defaults to the output of "date +%s".
>
> Setting this variable allows a package that is built twice in the same
> environment to be (potentially) reproducible in that the checksum of the
> generated package file will be the same.
>
> Signed-off-by: Allan McRae <[hidden email]>

I'm of the opinion that makepkg is the wrong place to work on
reproducible builds.  We could probably take care of the low-hanging
fruit directly in makepkg, but a number of packages are going to
require more fine-grained control over the environment than I think we
should be putting in makepkg.  If you look at `perl -V`, for instance,
it embeds the output of `uname -a` and a timestamp directly in the
executable.  I suspect that any effort we put into reproducible builds
with makepkg would eventually have to be duplicated with a more
powerful wrapper script in order to handle packages like perl that
record more of their environment than we should be manipulating in
makepkg.

apg

Re: [PATCH] makepkg: introduce SOURCE_DATE_EPOCH

Andrew Gregory
In reply to this post by Allan McRae
On 04/17/17 at 10:04pm, Allan McRae wrote:

> On 17/04/17 20:41, Allan McRae wrote:
> > + # ensure all elements of the package have the same mtime
> > + find . -exec touch -d @$SOURCE_DATE_EPOCH {} \;
> > +
> >   msg2 "$(gettext "Generating .MTREE file...")"
> > - list_package_files | LANG=C bsdtar -cnzf .MTREE --format=mtree \
> > + list_package_files | LANG=C bsdtar -cnf - --format=mtree \
> >   --options='!all,use-set,type,uid,gid,mode,time,size,md5,sha256,link' \
> > - --null --files-from - --exclude .MTREE
> > + --null --files-from - --exclude .MTREE | gzip -c -f -n > .MTREE
> > + touch -d @$SOURCE_DATE_EPOCH .MTREE
> >  
> >   msg2 "$(gettext "Compressing package...")"
> >   # TODO: Maybe this can be set globally for robustness
> >
>
> These touch commands have had a -h added.

touch -h and date +%s are not POSIX; are they available everywhere we
support?

Why the change to gzip for .MTREE?

apg

Re: [PATCH] makepkg: introduce SOURCE_DATE_EPOCH

Levente Polyak-3
In reply to this post by Andrew Gregory
On 04/17/2017 03:34 PM, Andrew Gregory wrote:

> On 04/17/17 at 08:41pm, Allan McRae wrote:
>> This patch introduces the SOURCE_DATE_EPOCH environment variable.  All files
>> in a package are adjusted to have their modification times set to the value
>> of SOURCE_DATE_EPOCH, which defaults to the output of "date +%s".
>>
>> Setting this variable allows a package that is built twice in the same
>> environment to be (potentially) reproducible in that the checksum of the
>> generated package file will be the same.
>>
>> Signed-off-by: Allan McRae <[hidden email]>
>
> I'm of the opinion that makepkg is the wrong place to work on
> reproducible builds.  We could probably take care of the low-hanging
> fruit directly in makepkg, but a number of packages are going to
> require more fine-grained control over the environment than I think we
> should be putting in makepkg.  If you look at `perl -V`, for instance,
> it embeds the output of `uname -a` and a timestamp directly in the
> executable.  I suspect that any effort we put into reproducible builds
> with makepkg would eventually have to be duplicated with a more
> powerful wrapper script in order to handle packages like perl that
> record more of their environment than we should be manipulating in
> makepkg.
>
> apg
>
makepkg is the place that we control and need to work on to make
packages created by makepkg reproducible. Currently they are not,
precisely because of the issues these patches address, and there is
literally no way to get reproducible package artifacts without these
patches. In particular, a deterministic way to pass in SOURCE_DATE_EPOCH
is a requirement for the cases you mentioned, and downstream projects
using dates in any produced artifacts should implement
SOURCE_DATE_EPOCH. An incredibly high number of projects already do so,
and more and more are adopting it, as this is becoming a de facto
standard (actually it already is).
No complex wrapper scripts should be needed anywhere to achieve
reproducibility.

cheers,
Levente
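
For a project that embeds a date in its output, honouring SOURCE_DATE_EPOCH might look roughly like the sketch below. format_build_date is a hypothetical helper, not something from makepkg or any particular project, and the -d "@..." syntax assumes GNU date (BSD date takes the epoch via -r instead).

```shell
#!/bin/bash
# Sketch only; assumes GNU date (-d @EPOCH).

# format_build_date is a hypothetical helper, not part of makepkg.
format_build_date() {
    # Prefer the externally supplied epoch; fall back to the current time.
    local epoch=${SOURCE_DATE_EPOCH:-$(date +%s)}
    # -u keeps the result independent of the builder's timezone.
    date -u -d "@$epoch" '+%Y-%m-%d'
}

# With a fixed epoch the embedded date is the same on every run.
build_date=$(SOURCE_DATE_EPOCH=0 format_build_date)
echo "build date: $build_date"
```

The point of the fallback is that builds without the variable behave as before, while builds that set it get a date that no longer depends on when they ran.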



Re: [PATCH] makepkg: introduce SOURCE_DATE_EPOCH

Andrew Gregory
On 04/17/17 at 03:53pm, Levente Polyak wrote:

> On 04/17/2017 03:34 PM, Andrew Gregory wrote:
> > On 04/17/17 at 08:41pm, Allan McRae wrote:
> >> This patch introduces the SOURCE_DATE_EPOCH environment variable.  All files
> >> in a package are adjusted to have their modification times set to the value
> >> of SOURCE_DATE_EPOCH, which defaults to the output of "date +%s".
> >>
> >> Setting this variable allows a package that is built twice in the same
> >> environment to be (potentially) reproducible in that the checksum of the
> >> generated package file will be the same.
> >>
> >> Signed-off-by: Allan McRae <[hidden email]>
> >
> > I'm of the opinion that makepkg is the wrong place to work on
> > reproducible builds.  We could probably take care of the low-hanging
> > fruit directly in makepkg, but a number of packages are going to
> > require more fine-grained control over the environment than I think we
> > should be putting in makepkg.  If you look at `perl -V`, for instance,
> > it embeds the output of `uname -a` and a timestamp directly in the
> > executable.  I suspect that any effort we put into reproducible builds
> > with makepkg would eventually have to be duplicated with a more
> > powerful wrapper script in order to handle packages like perl that
> > record more of their environment than we should be manipulating in
> > makepkg.
> >
> > apg
> >
>
> makepkg is the place that we control and need to work on to make
> packages created by makepkg reproducible. Currently they are not,
> precisely because of the issues these patches address, and there is
> literally no way to get reproducible package artifacts without these
> patches. In particular, a deterministic way to pass in SOURCE_DATE_EPOCH
> is a requirement for the cases you mentioned, and downstream projects
> using dates in any produced artifacts should implement
> SOURCE_DATE_EPOCH. An incredibly high number of projects already do so,
> and more and more are adopting it, as this is becoming a de facto
> standard (actually it already is).
> No complex wrapper scripts should be needed anywhere to achieve
> reproducibility.
>
> cheers,
> Levente

I have no problem with making makepkg's own output more controllable
(e.g.  allowing builddate to be set rather than using the current
time).  But, a lot of the time, reproducing an identical package is
going to require a very precise environment, especially for compiled
software.  The environmental factors that influence the built software
vary from project to project and can get their values from a variety
of locations.  I think that trying to manage all of that from makepkg
would be a mistake, if it is even possible.  Some things, like
building in a chroot for software that embeds the build directory,
would almost certainly be easier from a script that wraps makepkg.
I would prefer to see effort be put toward such a script rather than
have it go into makepkg only to have to be moved to a separate script
later.

apg

Re: [PATCH] makepkg: introduce SOURCE_DATE_EPOCH

Levente Polyak-3
On 04/17/2017 08:42 PM, Andrew Gregory wrote:

> I have no problem with making makepkg's own output more controllable
> (e.g.  allowing builddate to be set rather than using the current
> time).  But, a lot of the time, reproducing an identical package is
> going to require a very precise environment, especially for compiled
> software.  The environmental factors that influence the built software
> vary from project to project and can get their values from a variety
> of locations.  I think that trying to manage all of that from makepkg
> would be a mistake, if it is even possible.  Some things, like
> building in a chroot for software that embeds the build directory,
> would almost certainly be easier from a script that wraps makepkg.
> I would prefer to see effort be put toward such a script rather than
> have it go into makepkg only to have to be moved to a separate script
> later.
>
> apg
>

I fully agree with your points... actually exactly that is the plan and
the reason the .BUILDINFO file exists -- to be able to recreate the very
precise environment that was used to build a package. This is of course
needed, as you mentioned, for things like some binary software (gcc
version)... but we actually include the .BUILDINFO file in the package
itself. This has IMO a lot of advantages, but it already implies the
requirement to have an exactly identical environment to be reproducible.

The current set of adjustments is needed for makepkg itself. I'm sure
nobody intends to go a lot further and include environment recreation
things or explicit software-dependent stuff (like PERL_BUILD_DATE).

makechrootpkg and things like that are project (like Arch) specific.
Surely there will be the need of a wrapper around it to recreate an
identical environment from the .BUILDINFO file to be able to reproduce a
package beyond invoking it twice (something like makerepropkg).
On top of that, there will always be some need to add some things to
PKGBUILD files that are software dependent. An example would be to
define PERL_BUILD_DATE="${SOURCE_DATE_EPOCH}" and I agree that something
like PERL_BUILD_DATE is not to be included in makepkg itself.

I hope I could settle some of your concerns :)

cheers,
Levente
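
A wrapper of the kind described above would, among other things, need to recover the original build date before calling makepkg again. The sketch below shows one possible way to do that. It assumes .BUILDINFO holds "key = value" lines including a builddate key; buildinfo_value and the sample file contents are hypothetical.

```shell
#!/bin/bash
# Sketch only. Assumes .BUILDINFO holds "key = value" lines and includes a
# builddate key; buildinfo_value is a hypothetical helper.

buildinfo_value() {
    local key=$1 file=$2
    # Print the value of the first line that starts with "<key> = ".
    sed -n "s/^${key} = //p" "$file" | head -n 1
}

# Hypothetical sample file standing in for one extracted from a package.
sample=$(mktemp)
printf '%s\n' 'format = 1' 'pkgname = demo' 'builddate = 1492387200' > "$sample"

# A wrapper could export this before invoking makepkg again.
SOURCE_DATE_EPOCH=$(buildinfo_value builddate "$sample")
export SOURCE_DATE_EPOCH
echo "SOURCE_DATE_EPOCH=$SOURCE_DATE_EPOCH"
rm -f "$sample"
```

Recreating the rest of the environment (installed package versions, build directory and so on) is the harder part and is left out here.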



Re: [PATCH] makepkg: introduce SOURCE_DATE_EPOCH

Allan McRae
In reply to this post by Andrew Gregory
On 17/04/17 23:37, Andrew Gregory wrote:

> On 04/17/17 at 10:04pm, Allan McRae wrote:
>> On 17/04/17 20:41, Allan McRae wrote:
>>> + # ensure all elements of the package have the same mtime
>>> + find . -exec touch -d @$SOURCE_DATE_EPOCH {} \;
>>> +
>>>   msg2 "$(gettext "Generating .MTREE file...")"
>>> - list_package_files | LANG=C bsdtar -cnzf .MTREE --format=mtree \
>>> + list_package_files | LANG=C bsdtar -cnf - --format=mtree \
>>>   --options='!all,use-set,type,uid,gid,mode,time,size,md5,sha256,link' \
>>> - --null --files-from - --exclude .MTREE
>>> + --null --files-from - --exclude .MTREE | gzip -c -f -n > .MTREE
>>> + touch -d @$SOURCE_DATE_EPOCH .MTREE
>>>  
>>>   msg2 "$(gettext "Compressing package...")"
>>>   # TODO: Maybe this can be set globally for robustness
>>>
>>
>> These touch commands have had a -h added.
>
> touch -h and date +%s are not POSIX; are they available everywhere we
> support?
>
> Why the change to gzip for .MTREE?
>

A timestamp is embedded in a gz file unless gzip -n is used.

A
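
The effect of -n can be checked with a short sketch like the one below, assuming GNU gzip and GNU touch. It compresses the same content twice with different file mtimes and compares the results; the file names are arbitrary.

```shell
#!/bin/bash
# Sketch only; assumes GNU gzip and GNU touch.
workdir=$(mktemp -d)
echo 'same content' > "$workdir/data"

# Compress the same content twice, changing only the file's mtime in between.
touch -d '@1000000000' "$workdir/data"
gzip -c -n "$workdir/data" > "$workdir/first.gz"
touch -d '@2000000000' "$workdir/data"
gzip -c -n "$workdir/data" > "$workdir/second.gz"

# With -n neither the name nor the mtime is stored, so the outputs match.
if cmp -s "$workdir/first.gz" "$workdir/second.gz"; then
    gzip_n_result=identical
else
    gzip_n_result=different
fi
echo "gzip -n outputs are $gzip_n_result"
rm -rf "$workdir"
```

Dropping -n from both gzip calls would normally make the two files differ, since the gzip header then records the input file's modification time.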

Re: [PATCH] makepkg: introduce SOURCE_DATE_EPOCH

Allan McRae
In reply to this post by Andrew Gregory
On 17/04/17 23:37, Andrew Gregory wrote:

> On 04/17/17 at 10:04pm, Allan McRae wrote:
>> On 17/04/17 20:41, Allan McRae wrote:
>>> + # ensure all elements of the package have the same mtime
>>> + find . -exec touch -d @$SOURCE_DATE_EPOCH {} \;
>>> +
>>>   msg2 "$(gettext "Generating .MTREE file...")"
>>> - list_package_files | LANG=C bsdtar -cnzf .MTREE --format=mtree \
>>> + list_package_files | LANG=C bsdtar -cnf - --format=mtree \
>>>   --options='!all,use-set,type,uid,gid,mode,time,size,md5,sha256,link' \
>>> - --null --files-from - --exclude .MTREE
>>> + --null --files-from - --exclude .MTREE | gzip -c -f -n > .MTREE
>>> + touch -d @$SOURCE_DATE_EPOCH .MTREE
>>>  
>>>   msg2 "$(gettext "Compressing package...")"
>>>   # TODO: Maybe this can be set globally for robustness
>>>
>>
>> These touch commands have had a -h added.
>
> touch -h and date %s are not POSIX, are they available everywhere we
> support?
>

touch -h is in BSDs.  date +%s is mentioned in the FreeBSD man page, so
I assume it works.

A
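
Rather than relying on man pages, availability could also be probed on the running system. The following is only a sketch of that idea; have_date_epoch and have_touch_h are hypothetical helper names and nothing like them exists in makepkg.

```shell
#!/bin/bash
# Sketch only: probes the running system rather than relying on documentation.

# True if "date +%s" prints a plain non-negative integer.
have_date_epoch() {
    case "$(date +%s 2>/dev/null)" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# True if touch accepts -h on a symlink (tested in a scratch directory).
have_touch_h() {
    local dir rc
    dir=$(mktemp -d) || return 1
    ln -s missing "$dir/link"
    touch -h "$dir/link" 2>/dev/null
    rc=$?
    rm -rf "$dir"
    return "$rc"
}

if have_date_epoch; then echo "date +%s: supported"; else echo "date +%s: unsupported"; fi
if have_touch_h; then echo "touch -h: supported"; else echo "touch -h: unsupported"; fi
```

A probe like this only says something about the machine it runs on, so it complements rather than replaces checking each supported platform.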

Re: [PATCH] makepkg: introduce SOURCE_DATE_EPOCH

Allan McRae
In reply to this post by Andrew Gregory
On 17/04/17 23:34, Andrew Gregory wrote:

> On 04/17/17 at 08:41pm, Allan McRae wrote:
>> This patch introduces the SOURCE_DATE_EPOCH environment variable.  All files
>> in a package are adjusted to have their modification times set to the value
>> of SOURCE_DATE_EPOCH, which defaults to the output of "date +%s".
>>
>> Setting this variable allows a package that is built twice in the same
>> environment to be (potentially) reproducible in that the checksum of the
>> generated package file will be the same.
>>
>> Signed-off-by: Allan McRae <[hidden email]>
>
> I'm of the opinion that makepkg is the wrong place to work on
> reproducible builds.  We could probably take care of the low-hanging
> fruit directly in makepkg, but a number of packages are going to
> require more fine-grained control over the environment than I think we
> should be putting in makepkg.  If you look at `perl -V`, for instance,
> it embeds the output of `uname -a` and a timestamp directly in the
> executable.  I suspect that any effort we put into reproducible builds
> with makepkg would eventually have to be duplicated with a more
> powerful wrapper script in order to handle packages like perl that
> record more of their environment than we should be manipulating in
> makepkg.

I agree that makepkg is not the place for much of this.  However, the
SOURCE_DATE_EPOCH variable is a standard and we require makepkg to
understand it and make a few other minor changes for any tool to have a
chance of recreating a package from its PKGBUILD and .BUILDINFO file.  I
am not looking to extend the changes beyond this initial patchset.

Allan