Bug 15163 - Make Wireshark builds reproducible
Summary: Make Wireshark builds reproducible
Status: RESOLVED FIXED
Alias: None
Product: Wireshark
Classification: Unclassified
Component: Build process
Version: Git
Hardware: All Linux
Importance: Low Enhancement
Target Milestone: ---
Assignee: Bugzilla Administrator
URL:
Depends on:
Blocks:
 
Reported: 2018-10-02 16:59 UTC by Peter Wu
Modified: 2019-01-22 11:35 UTC
CC List: 2 users

See Also:


Description Peter Wu 2018-10-02 16:59:36 UTC
Build Information:
v2.9.0rc0-2071-gf4392340d6
--
"Reproducible builds are a set of software development practices that create an independently-verifiable path from source code to the binary code used by computers." -- https://reproducible-builds.org/

Wireshark builds are currently not reproducible on either Debian or Arch Linux. It would be nice to fix this.

https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/diffoscope-results/wireshark.html (autotools, 2.6.3-1)
https://tests.reproducible-builds.org/archlinux/community/wireshark/wireshark-qt-2.6.3-1-x86_64.pkg.tar.xz.html (autotools, wireshark-qt package is somehow not reproducible)

Observed issues so far:
- CMake enables RPATH by default, which results in different files depending on the build directory: https://gitlab.kitware.com/cmake/cmake/issues/18413
- filesystem.c uses BUILD_TIME_DATAFILE_DIR, which hard-codes the build directory.
- pod2man includes the date in the footer (solved by setting SOURCE_DATE_EPOCH).
- Use of __FILE__: if the build directory is within (a subdirectory of) the source tree, it is a relative path. Otherwise CMake passes a full path and the full source tree path ends up in __FILE__. Workaround: users of GCC 8.1 or newer should set -fmacro-prefix-map=$srcdir=. (or -ffile-prefix-map=$srcdir=.). This option does not exist yet in Clang 7, see https://bugs.llvm.org/show_bug.cgi?id=38135
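
A quick way to check for the path-related issues above is to search the build output for the absolute source or build directory. A minimal sketch, assuming a grep that supports -r (GNU or BSD); the function name find_path_leaks and the example paths are hypothetical and not part of Wireshark:

```shell
# List files under a directory whose contents contain a given string,
# e.g. an absolute source or build path that leaked into a binary.
find_path_leaks() {
    dir=$1
    needle=$2
    # -r: recurse, -l: print only file names, -F: match a fixed string
    grep -rlF -- "$needle" "$dir"
}

# Usage (hypothetical paths):
#   find_path_leaks /tmp/reproducible/b2/run /tmp/wireshark
```

An empty result suggests, but does not prove, that the directory name is not embedded; compressed debug sections, for example, would hide it.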

Fixing autotools builds (on 2.6) is out of scope. The focus is on fixing CMake builds (mainly master, but 2.6 too if deemed useful).
Comment 1 Gerrit Code Review 2018-10-02 17:36:12 UTC
Change 29984 had a related patch set uploaded by Peter Wu:
wsutil: get_datafile_dir: avoid hard-coded build directory

https://code.wireshark.org/review/29984
Comment 2 Gerrit Code Review 2018-10-03 03:49:21 UTC
Change 29984 merged by Anders Broman:
wsutil: get_datafile_dir: avoid hard-coded build directory

https://code.wireshark.org/review/29984
Comment 3 Peter Wu 2018-10-19 12:00:44 UTC
Another potential issue:
lemon (used to generate epan/dfilter/grammar.c, epan/dtd_grammar.c and plugins/epan/mate/mate_grammar.c) embeds the path to the source file, e.g.:
#line 2 "/tmp/wireshark/epan/dfilter/grammar.lemon"

Potential solution:
Change cmake/modules/UseLemon.cmake:
1. If the source and binary dir do not match (out-of-tree build), copy the source to the binary dir.
2. Change ${_in} to a relative path instead.
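
The second step amounts to computing the input path relative to the directory where lemon runs. A minimal sketch, assuming GNU coreutils realpath (for --relative-to, -m and -s); the directory names are hypothetical, and the actual change would live in CMake (which offers file(RELATIVE_PATH ...) for the same computation) rather than in shell:

```shell
# Hypothetical paths, for illustration only.
srcdir=/tmp/wireshark
builddir=/tmp/reproducible/b2
input="$srcdir/epan/dfilter/grammar.lemon"

# Path of the input relative to the directory lemon would run in.
# -m: paths need not exist, -s: do not resolve symlinks (purely lexical).
rel_input=$(realpath -m -s --relative-to="$builddir/epan/dfilter" "$input")
echo "$rel_input"
```

This relative path still changes when the relative layout of the source and build directories changes, which is presumably why step 1 copies the file into the binary dir first.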
Comment 4 Peter Wu 2018-12-03 13:04:07 UTC
FYI, CMake 3.14 gains a "CMAKE_BUILD_RPATH_USE_ORIGIN" variable (which initializes the BUILD_RPATH_USE_ORIGIN target property) to enforce relative RPATHs:
https://gitlab.kitware.com/cmake/cmake/merge_requests/2456
Comment 5 Gerrit Code Review 2019-01-18 10:28:41 UTC
Change 31586 had a related patch set uploaded by Peter Wu:
CMake: set CMAKE_BUILD_RPATH_USE_ORIGIN

https://code.wireshark.org/review/31586
Comment 6 Peter Wu 2019-01-18 11:41:52 UTC
Status update with this environment:
CMake 3.14 (or presumably older CMake versions with CMAKE_BUILD_WITH_INSTALL_RPATH=ON or CMAKE_SKIP_RPATH=ON)

export SOURCE_DATE_EPOCH=1547766001
cmake -GNinja -DCMAKE_BUILD_TYPE=Debug -DCMAKE_{C,CXX}_FLAGS="-ffile-prefix-map=$PWD=srcdir -ffile-prefix-map=/tmp/wireshark=builddir" /tmp/wireshark
ninja

Varying these settings (srcdir, builddir ($PWD), username):
- /tmp/ws-review /tmp/reproducible/mybuild, peter
- /tmp/ws-review /tmp/reproducible/anotheruser, peter-test
- /tmp/wireshark /tmp/reproducible/b2, peter

All binaries except for wireshark (Qt) are reproducible, regardless of the build and source directory. wireshark is possibly not reproducible due to differences in:
- ui/qt/CMakeFiles/qtui.dir/qtui_autogen/EJRQKI7XPS/qrc_i18n.cpp.o
- ui/qt/qtui_autogen/EJRQKI7XPS/qrc_i18n.cpp

which is probably the result of changes in:
- ui/qt/CMakeFiles/qtui_autogen.dir/AutogenOldSettings.txt (moc)
- ui/qt/CMakeFiles/qtui_autogen.dir/RCCi18nSettings.txt (rcc)
- ui/qt/qtui_autogen/EJRQKI7XPS/qrc_i18n.cpp

With a different srcdir, more autogen/moc/qrc files differ due to absolute paths in these source files (mostly in comments).
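
A comparison like the one above can be scripted by hashing every file in two build outputs. A minimal sketch, assuming GNU findutils and coreutils (sort -z, xargs -r, sha256sum); the function names are hypothetical:

```shell
# Print "sha256  ./relative/path" for every regular file below a directory,
# sorted by path so that the output is stable.
manifest() {
    (cd "$1" && find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r sha256sum)
}

# Succeeds only if both trees contain the same files with the same contents.
same_build() {
    [ "$(manifest "$1")" = "$(manifest "$2")" ]
}

# Usage (directories from the list above):
#   same_build /tmp/reproducible/mybuild/run /tmp/reproducible/b2/run
```

Comparing only the run directory avoids noise from intermediate files such as CMakeCache.txt, which records absolute paths by design.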
Comment 7 Gerrit Code Review 2019-01-18 12:58:25 UTC
Change 31593 had a related patch set uploaded by Peter Wu:
CMake: avoid including file modification time for RCC

https://code.wireshark.org/review/31593
Comment 8 Gerrit Code Review 2019-01-18 15:46:39 UTC
Change 31593 merged by Peter Wu:
CMake: avoid including file modification time for RCC

https://code.wireshark.org/review/31593
Comment 9 Peter Wu 2019-01-18 15:57:16 UTC
With the latest changes, Wireshark 3.0 should be reproducible. Additional requirements:

To make documentation builds reproducible, set SOURCE_DATE_EPOCH.
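
As an illustration, a minimal sketch of setting the variable, assuming GNU date for the conversion. The timestamp is the one from comment 6; the git-based alternative in the comment is a common convention, not something this bug prescribes:

```shell
# Fixed timestamp (the value used in comment 6); any constant works.
export SOURCE_DATE_EPOCH=1547766001

# Alternative often used with git checkouts: the last commit's timestamp.
#   SOURCE_DATE_EPOCH=$(git log -1 --format=%ct)

# Human-readable UTC form of the timestamp (GNU date syntax).
build_date=$(date -u -d "@$SOURCE_DATE_EPOCH" +%Y-%m-%dT%H:%M:%SZ)
echo "$build_date"
```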

To avoid including the build and source directories in the build environment[1]:

- Use the -ffile-prefix-map=OLD=NEW option (equivalent to specifying both
  -fdebug-prefix-map and -fmacro-prefix-map) to strip the source and build
  directories.
- Ensure that the relative path from the build directory to the source directory
  remains the same, otherwise CMake automoc will result in different binaries:
  https://gitlab.kitware.com/cmake/cmake/issues/18793
- To prevent build-time rpaths from affecting the build IDs of binaries, either:
  1) use CMake 3.14 or newer, or
  2) set CMAKE_BUILD_WITH_INSTALL_RPATH=ON, or
  3) set CMAKE_SKIP_RPATH=ON

 [1]: https://reproducible-builds.org/docs/perimeter/
Comment 10 Gerrit Code Review 2019-01-21 11:19:08 UTC
Change 31645 had a related patch set uploaded by Peter Wu:
CMake: strip directory prefixes from __FILE__ macros

https://code.wireshark.org/review/31645
Comment 11 Gerrit Code Review 2019-01-21 13:29:44 UTC
Change 31645 merged by Peter Wu:
CMake: strip directory prefixes from __FILE__ macros

https://code.wireshark.org/review/31645
Comment 12 Peter Wu 2019-01-22 11:35:06 UTC
Update to comment 9: the final requirements for reproducible builds are:

- Set the SOURCE_DATE_EPOCH environment variable to a fixed timestamp.

To avoid including the build and source directories in the build environment[1]:
- Use -fdebug-prefix-map=OLD=NEW (Debian and Arch Linux do this).
  (Wireshark sets -fmacro-prefix-map if supported, currently GCC >= 8)
- Ensure that the relative path from the build directory to the source directory
  remains the same. See https://gitlab.kitware.com/cmake/cmake/issues/18793
- To prevent build-time rpaths from affecting the build IDs of binaries, either:
  1) use CMake 3.14 or newer, or
  2) set CMAKE_SKIP_RPATH=ON (requires LD_LIBRARY_PATH=$builddir/run while running tests).


I tried to find a workaround to solve the last issue out-of-the-box on older CMake versions, but it appears impossible:
- Setting the BUILD_RPATH property (since CMake 3.8) will prepend that value
  to the existing, absolute paths in RPATH. Relocatable, but not reproducible.
- Setting CMAKE_BUILD_WITH_INSTALL_RPATH=ON and adjusting the INSTALL_RPATH
  property does make the build reproducible and relocatable, but requires
  additional code to strip RPATH at installation time.


Distributors who care about reproducibility independent of the build directory should probably use "cmake -DCMAKE_SKIP_RPATH=ON" and set the environment variable LD_LIBRARY_PATH=$builddir/run while running tests. This workaround is no longer necessary with CMake 3.14 (not yet released).
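
Put together, the recommendation above could look roughly like this. A sketch under assumptions: the directories, the function names and the "." replacement value are hypothetical choices, and the build function is only defined here, not invoked:

```shell
srcdir=/tmp/wireshark            # hypothetical source checkout
builddir=/tmp/reproducible/b2    # hypothetical build directory

export SOURCE_DATE_EPOCH=1547766001

# Strip both directories from debug info; "." is an arbitrary replacement.
map_flags="-fdebug-prefix-map=$srcdir=. -fdebug-prefix-map=$builddir=."

# Configure and build without rpaths.
configure_and_build() {
    mkdir -p "$builddir" && cd "$builddir" || return 1
    cmake -GNinja -DCMAKE_SKIP_RPATH=ON \
        -DCMAKE_C_FLAGS="$map_flags" -DCMAKE_CXX_FLAGS="$map_flags" \
        "$srcdir" && ninja
}

# Run a command (e.g. the test suite) so that it finds the freshly built
# shared libraries despite the missing rpath.
run_with_build_libs() {
    LD_LIBRARY_PATH="$builddir/run" "$@"
}
```

For example, "run_with_build_libs ./run/tshark -v" would start the uninstalled tshark from the build directory with the matching libraries.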