[PATCH 1/5] makepkg: remove build date from .PKGINFO header


Levente Polyak-3
From: Allan McRae <[hidden email]>

This information is duplicated (in less friendly format) in the "builddate"
entry and removing it improves reproducible packaging.

Signed-off-by: Allan McRae <[hidden email]>
Signed-off-by: Levente Polyak <[hidden email]>
---
 scripts/makepkg.sh.in | 1 -
 1 file changed, 1 deletion(-)

diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
index 0218e13b..86abb177 100644
--- a/scripts/makepkg.sh.in
+++ b/scripts/makepkg.sh.in
@@ -635,7 +635,6 @@ write_pkginfo() {
  msg2 "$(gettext "Generating %s file...")" ".PKGINFO"
  printf "# Generated by makepkg %s\n" "$makepkg_version"
  printf "# using %s\n" "$(fakeroot -v)"
- printf "# %s\n" "$(LC_ALL=C date -u)"
 
  write_kv_pair "pkgname" "$pkgname"
  if (( SPLITPKG )) || [[ "$pkgbase" != "$pkgname" ]]; then
--
2.12.2

[PATCH 4/5] makepkg: add more information to .BUILDINFO

Levente Polyak-3
The .BUILDINFO file should retain all the information needed to reproducibly
build a package.  Add some extra information to the file and also provide a
version number to keep track of future changes.

Signed-off-by: Allan McRae <[hidden email]>
Signed-off-by: Levente Polyak <[hidden email]>
---
 scripts/makepkg.sh.in | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
index 7bdf72b9..bd92c526 100644
--- a/scripts/makepkg.sh.in
+++ b/scripts/makepkg.sh.in
@@ -688,13 +688,17 @@ write_pkginfo() {
 write_buildinfo() {
  msg2 "$(gettext "Generating %s file...")" ".BUILDINFO"
 
- write_kv_pair "builddir"  "${BUILDDIR}"
+ write_kv_pair "format" "1"
+ write_kv_pkgname
+ write_kv_pkgver
 
  local sum="$(sha256sum "${BUILDFILE}")"
  sum=${sum%% *}
-
  write_kv_pair "pkgbuild_sha256sum" $sum
 
+ write_kv_pair "packager" "$(get_packager)"
+ write_kv_pair "builddate" "${SOURCE_DATE_EPOCH}"
+ write_kv_pair "builddir"  "${BUILDDIR}"
  write_kv_pair "buildenv" "${BUILDENV[@]}"
  write_kv_pair "options" "${OPTIONS[@]}"
 
--
2.12.2
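
For illustration, here is a rough sketch of the kind of file this patch would produce. The body of `write_kv_pair` is an assumption (only its first lines are visible in the diff context of the other patches), `write_buildinfo_sketch` is a hypothetical name, and every value is a made-up placeholder rather than output from a real build.

```bash
# Sketch only: write_kv_pair is assumed to print "key = value",
# one line per value passed after the key.
write_kv_pair() {
  local key="$1"
  shift

  local val
  for val in "$@"; do
    printf '%s = %s\n' "$key" "$val"
  done
}

# Hypothetical stand-ins for the values makepkg would compute itself.
write_buildinfo_sketch() {
  write_kv_pair "format" "1"
  write_kv_pair "pkgname" "example"
  write_kv_pair "pkgver" "1.0-1"
  write_kv_pair "packager" "Unknown Packager"
  write_kv_pair "builddate" "${SOURCE_DATE_EPOCH:-0}"
  write_kv_pair "buildenv" '!distcc' 'color'
}
```

Multi-valued keys such as "buildenv" would come out as one line per value under this assumption, so a parser has to accept repeated keys.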

[PATCH 2/5] makepkg: introduce SOURCE_DATE_EPOCH

Levente Polyak-3
In reply to this post by Levente Polyak-3
From: Allan McRae <[hidden email]>

This patch introduces the SOURCE_DATE_EPOCH environmental variable.  All files
in a package are adjusted to have their modification dates set to the value
of SOURCE_DATE_EPOCH, which defaults to "date +%s".

Setting this variable allows a package that is built twice in the same
environment to be (potentially) reproducible in that the checksum of the
generated package file will be the same.

Signed-off-by: Levente Polyak <[hidden email]>
---
 scripts/makepkg.sh.in | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
index 86abb177..c302c4ad 100644
--- a/scripts/makepkg.sh.in
+++ b/scripts/makepkg.sh.in
@@ -87,6 +87,8 @@ SPLITPKG=0
 SOURCEONLY=0
 VERIFYSOURCE=0
 
+export SOURCE_DATE_EPOCH=${SOURCE_DATE_EPOCH:-$(date +%s)}
+
 PACMAN_OPTS=()
 
 shopt -s extglob
@@ -620,7 +622,6 @@ write_kv_pair() {
 }
 
 write_pkginfo() {
- local builddate=$(date -u "+%s")
  if [[ -n $PACKAGER ]]; then
  local packager="$PACKAGER"
  else
@@ -654,7 +655,7 @@ write_pkginfo() {
 
  write_kv_pair "pkgdesc" "$spd"
  write_kv_pair "url" "$url"
- write_kv_pair "builddate" "$builddate"
+ write_kv_pair "builddate" "$SOURCE_DATE_EPOCH"
  write_kv_pair "packager" "$packager"
  write_kv_pair "size" "$size"
  write_kv_pair "arch" "$pkgarch"
@@ -738,10 +739,14 @@ create_package() {
  [[ -f $pkg_file ]] && rm -f "$pkg_file"
  [[ -f $pkg_file.sig ]] && rm -f "$pkg_file.sig"
 
+ # ensure all elements of the package have the same mtime
+ find . -exec touch -h -d @$SOURCE_DATE_EPOCH {} +
+
  msg2 "$(gettext "Generating .MTREE file...")"
- list_package_files | LANG=C bsdtar -cnzf .MTREE --format=mtree \
+ list_package_files | LANG=C bsdtar -cnf - --format=mtree \
  --options='!all,use-set,type,uid,gid,mode,time,size,md5,sha256,link' \
- --null --files-from - --exclude .MTREE
+ --null --files-from - --exclude .MTREE | gzip -c -f -n > .MTREE
+ touch -d @$SOURCE_DATE_EPOCH .MTREE
 
  msg2 "$(gettext "Compressing package...")"
  # TODO: Maybe this can be set globally for robustness
--
2.12.2
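
A small sketch of the two ideas in this patch, outside of makepkg: unify the mtimes first, then compress through `gzip -n` so that no name or timestamp lands in the gzip header. `mtime_manifest` is a hypothetical helper, and a plain `stat` listing stands in for the real bsdtar mtree output; GNU `touch` and `stat` are assumed.

```bash
# Sketch: normalize mtimes, then emit a compressed listing without a
# gzip timestamp. Not the real .MTREE generation.
mtime_manifest() {
  local dir="$1" epoch="$2"

  # same mtime on every entry; -h touches symlinks themselves
  find "$dir" -exec touch -h -d "@$epoch" {} +

  # sorted "name mtime" listing, compressed with -n so the gzip header
  # carries neither the original name nor a timestamp
  (cd "$dir" && find . -exec stat -c '%n %Y' {} + | LC_ALL=C sort) |
    gzip -c -f -n
}
```

Running it twice on the same tree with the same epoch should give byte-identical output, which is the property the patch is after for .MTREE.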

[PATCH 3/5] makepkg: extract parts of the write_pkginfo for use elsewhere

Levente Polyak-3
In reply to this post by Levente Polyak-3
Signed-off-by: Allan McRae <[hidden email]>
Signed-off-by: Levente Polyak <[hidden email]>
---
 scripts/makepkg.sh.in | 42 ++++++++++++++++++++++++++----------------
 1 file changed, 26 insertions(+), 16 deletions(-)

diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
index c302c4ad..7bdf72b9 100644
--- a/scripts/makepkg.sh.in
+++ b/scripts/makepkg.sh.in
@@ -608,6 +608,15 @@ find_libprovides() {
  (( ${#libprovides[@]} )) && printf '%s\n' "${libprovides[@]}"
 }
 
+get_packager() {
+ if [[ -n $PACKAGER ]]; then
+ local packager="$PACKAGER"
+ else
+ local packager="Unknown Packager"
+ fi
+ printf "%s\n" "$packager"
+}
+
 write_kv_pair() {
  local key="$1"
  shift
@@ -621,13 +630,22 @@ write_kv_pair() {
  done
 }
 
-write_pkginfo() {
- if [[ -n $PACKAGER ]]; then
- local packager="$PACKAGER"
- else
- local packager="Unknown Packager"
+write_kv_pkgname() {
+ write_kv_pair "pkgname" "$pkgname"
+ if (( SPLITPKG )) || [[ "$pkgbase" != "$pkgname" ]]; then
+ write_kv_pair "pkgbase" "$pkgbase"
+ fi
+}
+
+write_kv_pkgver() {
+ local fullver=$(get_full_version)
+ write_kv_pair "pkgver" "$fullver"
+ if [[ "$fullver" != "$basever" ]]; then
+ write_kv_pair "basever" "$basever"
  fi
+}
 
+write_pkginfo() {
  local size="$(@DUPATH@ @DUFLAGS@)"
  size="$(( ${size%%[^0-9]*} * 1024 ))"
 
@@ -637,16 +655,8 @@ write_pkginfo() {
  printf "# Generated by makepkg %s\n" "$makepkg_version"
  printf "# using %s\n" "$(fakeroot -v)"
 
- write_kv_pair "pkgname" "$pkgname"
- if (( SPLITPKG )) || [[ "$pkgbase" != "$pkgname" ]]; then
- write_kv_pair "pkgbase" "$pkgbase"
- fi
-
- local fullver=$(get_full_version)
- write_kv_pair "pkgver" "$fullver"
- if [[ "$fullver" != "$basever" ]]; then
- write_kv_pair "basever" "$basever"
- fi
+ write_kv_pkgname
+ write_kv_pkgver
 
  # TODO: all fields should have this treatment
  local spd="${pkgdesc//+([[:space:]])/ }"
@@ -656,7 +666,7 @@ write_pkginfo() {
  write_kv_pair "pkgdesc" "$spd"
  write_kv_pair "url" "$url"
  write_kv_pair "builddate" "$SOURCE_DATE_EPOCH"
- write_kv_pair "packager" "$packager"
+ write_kv_pair "packager" "$(get_packager)"
  write_kv_pair "size" "$size"
  write_kv_pair "arch" "$pkgarch"
 
--
2.12.2

[PATCH 5/5] makepkg: unify source file times for improved build reproducibility

Levente Polyak-3
In reply to this post by Levente Polyak-3
Signed-off-by: Levente Polyak <[hidden email]>
---
 scripts/makepkg.sh.in | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
index bd92c526..83c80fa7 100644
--- a/scripts/makepkg.sh.in
+++ b/scripts/makepkg.sh.in
@@ -1731,6 +1731,9 @@ if (( !REPKG )); then
  if (( PREPAREFUNC )); then
  run_prepare
  fi
+
+ # unify source times before building for reproducibility
+ find "$srcdir" -exec touch -h -d "@${SOURCE_DATE_EPOCH}" {} +
  fi
 
  if (( PKGVERFUNC )); then
--
2.12.2

Re: [PATCH 5/5] makepkg: unify source file times for improved build reproducibility

Andrew Gregory
On 05/12/17 at 12:41pm, Levente Polyak wrote:

> Signed-off-by: Levente Polyak <[hidden email]>
> ---
>  scripts/makepkg.sh.in | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
> index bd92c526..83c80fa7 100644
> --- a/scripts/makepkg.sh.in
> +++ b/scripts/makepkg.sh.in
> @@ -1731,6 +1731,9 @@ if (( !REPKG )); then
>   if (( PREPAREFUNC )); then
>   run_prepare
>   fi
> +
> + # unify source times before building for reproducibility
> + find "$srcdir" -exec touch -h -d "@${SOURCE_DATE_EPOCH}" {} +
>   fi
>  
>   if (( PKGVERFUNC )); then

I'm still not convinced we should be doing this in makepkg and I'm not
sure exactly where our disagreement about this is at this point.  So,
let me describe how I would handle reproducible packages and you can
tell me why your approach is better.

First, I'm only concerned about manipulating things that makepkg is
not directly responsible for.  Anything that improves the
reproducibility of makepkg's own output is fine (e.g. removing the
timestamp from .PKGINFO and .MTREE files).  

Beyond that, I don't think makepkg is the place for trying to make
a package reproducible.  We seem to be in agreement that a separate
script will be necessary to actually reproduce a built package.
I would say that the same script should be used to build the original
package in the first place.  Aside from the fact that changes like
this break existing usage patterns for makepkg, makepkg is never going
to be able to guarantee that a package is reproducible.  There are
simply too many variables that can influence the resulting package for
makepkg to ever record them all.

Building a package that is reproducible requires a controlled
environment.  The script that handles reproducing the package has to
be able to set up such an environment, so why not let it set up the
environment for the initial build?  People that don't care about the
reproducibility of the resulting package can continue to use makepkg
as they do now and those that do care can use the wrapper script to
build it.  And, makepkg doesn't have to concern itself with
reproducibility beyond making sure that its own output is
reproducible.

Such a script could handle the timestamp manipulation in this patch.
We already provide several makepkg options to control which steps are
run.  The wrapper would invoke makepkg once to extract/prepare the
sources, the wrapper would then adjust the timestamps itself, finally
it would invoke makepkg again to continue the build.

apg
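
The three-step flow described above could look roughly like this. It is a sketch, not an existing script: `repro_makepkg` and the `MAKEPKG` override are hypothetical names, and the source directory is assumed to be `./src` unless passed explicitly. `--nobuild` and `--noextract` are the existing makepkg options for stopping after prepare and for resuming from an already extracted tree.

```bash
# Sketch of a wrapper that adjusts timestamps between two makepkg runs.
# MAKEPKG is a hypothetical hook so the flow can be tried with a stub.
repro_makepkg() {
  local makepkg="${MAKEPKG:-makepkg}"
  local srcdir="${1:-$PWD/src}"

  : "${SOURCE_DATE_EPOCH:=$(date +%s)}"
  export SOURCE_DATE_EPOCH

  # 1. download, extract and prepare the sources only
  "$makepkg" --nobuild || return

  # 2. unify the source timestamps outside of makepkg
  find "$srcdir" -exec touch -h -d "@${SOURCE_DATE_EPOCH}" {} + || return

  # 3. resume the build from the existing source tree
  "$makepkg" --noextract
}
```

Invoked as `SOURCE_DATE_EPOCH=1494585660 repro_makepkg`, both the original build and a later rebuild would go through the same steps.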

Re: [PATCH 5/5] makepkg: unify source file times for improved build reproducibility

Allan McRae
On 13/05/17 01:09, Andrew Gregory wrote:

> On 05/12/17 at 12:41pm, Levente Polyak wrote:
>> Signed-off-by: Levente Polyak <[hidden email]>
>> ---
>>  scripts/makepkg.sh.in | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/scripts/makepkg.sh.in b/scripts/makepkg.sh.in
>> index bd92c526..83c80fa7 100644
>> --- a/scripts/makepkg.sh.in
>> +++ b/scripts/makepkg.sh.in
>> @@ -1731,6 +1731,9 @@ if (( !REPKG )); then
>>   if (( PREPAREFUNC )); then
>>   run_prepare
>>   fi
>> +
>> + # unify source times before building for reproducibility
>> + find "$srcdir" -exec touch -h -d "@${SOURCE_DATE_EPOCH}" {} +
>>   fi
>>  
>>   if (( PKGVERFUNC )); then
>
> I'm still not convinced we should be doing this in makepkg and I'm not
> sure exactly where our disagreement about this is at this point.  So,
> let me describe how I would handle reproducible packages and you can
> tell me why your approach is better.
>
> First, I'm only concerned about manipulating things that makepkg is
> not directly responsible for.  Anything that improves the
> reproducibility of makepkg's own output is fine (e.g. removing the
> timestamp from .PKGINFO and .MTREE files).  
>
> Beyond that, I don't think makepkg is the place for trying to make
> a package reproducible.  We seem to be in agreement that a separate
> script will be necessary to actually reproduce a built package.
> I would say that the same script should be used to build the original
> package in the first place.  Aside from the fact that changes like
> this break existing usage patterns for makepkg, makepkg is never going
> to be able to guarantee that a package is reproducible.  There are
> simply too many variables that can influence the resulting package for
> makepkg to ever record them all.
>
> Building a package that is reproducible requires a controlled
> environment.  The script that handles reproducing the package has to
> be able to set up such an environment, so why not let it set up the
> environment for the initial build?  People that don't care about the
> reproducibility of the resulting package can continue to use makepkg
> as they do now and those that do care can use the wrapper script to
> build it.  And, makepkg doesn't have to concern itself with
> reproducibility beyond making sure that its own output is
> reproducible.
>
> Such a script could handle the timestamp manipulation in this patch.
> We already provide several makepkg options to control which steps are
> run.  The wrapper would invoke makepkg once to extract/prepare the
> sources, the wrapper would then adjust the timestamps itself, finally
> it would invoke makepkg again to continue the build.
>

From my understanding, the reproducible build goal is to have a package
build identically on two invocations of the build tool.  There are some
specific things that need to be set to achieve this (mostly
SOURCE_DATE_EPOCH), but the rest of the environment can be quite
variable.  In fact, the testing framework to ensure reproducibility varies:

date and time,
build path,
hostname,
domain name,
filesystem,
environment variables,
timezone,
language,
locale,
user name,
user id,
group name,
group id,
kernel version,
umask,
CPU type,
number of CPU cores.

So the environment does not necessarily need to be exactly the same
between builds for it to be reproducible.
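
A toy version of that kind of check, as a sketch: run the same command under two deliberately different environments and compare the output checksums. `same_under_varied_env` is a hypothetical helper, and it varies only the umask, the working directory and the timezone, a small subset of the list above.

```bash
# Sketch: succeed only if the command's output is identical under two
# different environments. Works for external commands (run via env).
same_under_varied_env() {
  local sum1 sum2

  sum1=$( (umask 022; cd /; TZ=UTC env "$@") | sha256sum )
  sum2=$( (umask 077; cd /tmp; TZ=Pacific/Kiritimati env "$@") | sha256sum )

  [ "$sum1" = "$sum2" ]
}
```

For example, `same_under_varied_env printf 'hello'` succeeds, while `same_under_varied_env pwd` fails because the output depends on the working directory.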


Given I think python packages are the primary problem here, I'm going to
propose another solution....  Clearly embedding the timestamp in the
pyc/o files is a design decision and not going to be changed.  Could we
however, have a pass in makepkg that generates these files?  In the
"tidy" loop.  That would allow us to set times on any .py files in
the package, and then generate pyc/o files.   No setting of source times
needed.

Allan
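
Such a pass might be sketched as follows. `tidy_pyc` is a hypothetical name, not an existing makepkg function, and using `python3` for every file is an assumption that leaves open the question of which interpreter applies to which file.

```bash
# Sketch of a post-build pass: unify .py mtimes, then byte-compile.
tidy_pyc() {
  local pkgdir="$1"

  # give every .py file in the package the same mtime
  find "$pkgdir" -type f -name '*.py' \
    -exec touch -d "@${SOURCE_DATE_EPOCH:?}" {} +

  # byte-compile afterwards, so any mtime recorded in the compiled
  # files is the unified one (skipped when python3 is not installed)
  if command -v python3 >/dev/null 2>&1; then
    python3 -m compileall -q "$pkgdir"
  fi
}
```

Only the packaged .py files are touched here; the source tree under $srcdir is left alone.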

Re: [PATCH 5/5] makepkg: unify source file times for improved build reproducibility

Eli Schwartz
On 05/15/2017 08:51 PM, Allan McRae wrote:

> Given I think python packages are the primary problem here, I'm going to
> propose another solution....  Clearly embedding the timestamp in the
> pyc/o files is a design decision and not going to be changed.  Could we
> however, have a pass in makepkg that generates these files?  In the
> "tidy" loop.  That would allow us to set times on any .py files in
> the package, and then generate pyc/o files.   No setting of source times
> needed.
>
> Allan
>
As I said on IRC, this is easier said than done. We'd have to somehow
figure out which files are python2 and which ones are python3; while
most will be in the appropriate /usr/lib/python$ver directory, some will
be elsewhere, e.g. Sigil installs python3 files used for its private
plugin interface under /usr/share/sigil instead.

Cinnamon seems to do the same (not that we ship pyc/pyo for any of
that), as does bleachbit, but I am not sure why (since its launcher
executable apparently claims either that or site-packages are both
expected places to find itself???) but whatever, different topic.

Reading the shebang could help cover those cases, but then again, not
all python modules actually come with shebangs, probably because the
author doesn't expect people to care. "Why do you need them, syntax
highlighting is okay because .py and you're not running them, you're
importing them."

...

I was thinking of a different alternative. In keeping with other
software that respects SOURCE_DATE_EPOCH, perhaps we should depend on
the user opting in to reproducible builds by setting
`SOURCE_DATE_EPOCH=something makepkg`
Sort of like `makepkg --reproducible`, but without actually needing a flag.

If SOURCE_DATE_EPOCH is set by the user or script calling makepkg, then
makepkg would respect it for its own internal use, as well as touch'ing
files to that date.
I think that should make everyone more or less happy, except the people
who want to see users silently opted into reproducible builds for their
own good. :D

--
Eli Schwartz
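
A sketch of that opt-in logic, under the assumption that a non-empty SOURCE_DATE_EPOCH in the caller's environment is the only signal. `init_source_date_epoch`, `maybe_unify_times` and the `REPRODUCIBLE` flag are hypothetical names.

```bash
# Sketch: a caller-provided SOURCE_DATE_EPOCH acts as the opt-in.
init_source_date_epoch() {
  if [ -n "${SOURCE_DATE_EPOCH:-}" ]; then
    REPRODUCIBLE=1
  else
    REPRODUCIBLE=0
    SOURCE_DATE_EPOCH=$(date +%s)
  fi
  export SOURCE_DATE_EPOCH
}

# Touch files to the chosen date only when the caller opted in.
maybe_unify_times() {
  [ "$REPRODUCIBLE" -eq 1 ] || return 0
  find "$1" -exec touch -h -d "@${SOURCE_DATE_EPOCH}" {} +
}
```

With this shape, a plain `makepkg` run would still get a builddate from the current time but would leave file times alone.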



Re: [PATCH 5/5] makepkg: unify source file times for improved build reproducibility

Allan McRae
On 16/05/17 22:28, Eli Schwartz wrote:

> On 05/15/2017 08:51 PM, Allan McRae wrote:
>> Given I think python packages are the primary problem here, I'm going to
>> propose another solution....  Clearly embedding the timestamp in the
>> pyc/o files is a design decision and not going to be changed.  Could we
>> however, have a pass in makepkg that generates these files?  In the
>> "tidy" loop.  That would allow us to set times on any .py files in
>> the package, and then generate pyc/o files.   No setting of source times
>> needed.
>>
>> Allan
>>
>
> As I said on IRC, this is easier said than done. We'd have to somehow
> figure out which files are python2 and which ones are python3; while
> most will be in the appropriate /usr/lib/python$ver directory, some will
> be elsewhere, e.g. Sigil installs python3 files used for its private
> plugin interface under /usr/share/sigil instead.
>
> Cinnamon seems to do the same (not that we ship pyc/pyo for any of
> that), as does bleachbit, but I am not sure why (since its launcher
> executable apparently claims either that or site-packages are both
> expected places to find itself???) but whatever, different topic.
>
> Reading the shebang could help cover those cases, but then again, not
> all python modules actually come with shebangs, probably because the
> author doesn't expect people to care. "Why do you need them, syntax
> highlighting is okay because .py and you're not running them, you're
> importing them."

Other options...

1) We could actually use makepkg-template and have the python package
provide a function to do this.

2) We could follow most other distros and have the pyo/c files
generated by a hook  (I am not a fan due to having more untracked files...)

Allan