5.0 KiB
Code Size Test App
The goal of this test app is to track the binary size of a potential app built
on top of TFRT. Such apps need to link the core components of TFRT libraries
including :hostcontext, :support, :befexecutor and various kernel
libraries.
This document describes the test app setup and explains various techniques we used to reduce the binary size, and also documents tools that we have found useful for analyzing binary size.
Test App Setup
This test app computes the Fibonacci sequence. Today, the application logic
needs to be written in MLIR, and converted to BEF format using tfrt_translate -mlir-to-bef tool, before getting run by a simplified BEF executor We also
need to link basic_kernels library for the kernel implementations to compute
Fibonacci sequence.
With the TFRT Host Runtime APIs, a potential application can be built on top of C++ API directly without going through MLIR to BEF conversion.
To run the test:
$ bazel test //mlir_tests/code_size_test_app:fib.mlir.test
What really happens can be inferred from the first line of
fib.mlir:
// RUN: tfrt_translate -mlir-to-bef %s | code_size_test_driver | FileCheck %s
tfrt_translate -mlir-to-befconvertsfib.mlirto a BEF file.code_size_test_driverreads the BEF file fromstdinand runs it.- the output of the test is
FileCheck-ed against expected results annotated infib.mlir.
Techniques to Reduce Binary Size
As of March 31 2020, the code_size_test_driver minimal binary size is 78KB. We
statically link core TFRT libraries, the required kernel libraries and
llvm::support library, while dynamically link the C++ runtime library and
system libraries. If we also statically link the C++ runtime, as done by default
in Bazel, we get 405KB.
$ bazel build --config=code_size_test
third_party/tf_runtime:code_size_test_driver
The above configuration applies a few well-known techniques in mobile app/embedded world to reduce the binary size:
Disable RTTI and exceptions
We use a rtti_and_exceptions Skylark flag to disable RTTI and exceptions
through the corresponding copts -fno-rtti and -fno-exceptions. As a side
effect, we also need to disable header modules.
Strip symbols
We pass linker options --linkopt=-Wl,--strip-all to strip all symbols.
Ask compiler to optimize for binary size.
We ask compiler to optimize for size by specifying --copt=-Os, where -Os
enables all -O2 optimizations except those that often increase code size.
Safe ICF
We ask compiler to merge identical functions into a single copy whenever safe to
do so with --linkopt=-Wl,--icf=safe.
Tools for Analyzing Binary Size
llvm-nm: Dump all the symbols.
$ llvm-nm --demangle --print-size --radix=d --size-sort --reverse-sort nonstripped_binary | head -n 20
0000000000006208 0000000000002011 T main
0000000000005728 0000000000000308 T ReadFromStdInToBuffer()
0000000000006048 0000000000000146 W std::__u::vector<unsigned char, std::__u::allocator<unsigned char> >::reserve(unsigned long)
0000000000008288 0000000000000104 W llvm::SmallVectorImpl<tfrt::AsyncValue*>::assign(unsigned long, tfrt::AsyncValue* const&)
0000000000008416 0000000000000103 T __libc_csu_init
0000000000005376 0000000000000042 T _start
0000000000004808 0000000000000032 r std::__u::__function::__policy const* std::__u::__function::__policy::__choose_policy<std::__u::__function::__default_alloc_func<main::$_0, void (tfrt::DecodedDiagnostic const&)> >(std::__u::integral_constant<bool, true>)::__policy_
0000000000004840 0000000000000032 r std::__u::__function::__policy const* std::__u::__function::__policy::__choose_policy<std::__u::__function::__default_alloc_func<main::$_0, void (tfrt::DecodedDiagnostic)> >(std::__u::integral_constant<bool, true>)::__policy_
0000000000016672 0000000000000008 B stdout
0000000000016664 0000000000000008 b dtor_idx.6689
0000000000016656 0000000000000001 b completed.6687
0000000000008528 0000000000000001 T __libc_csu_fini
0000000000008224 0000000000000001 t void std::__u::__function::__policy_invoker<void (tfrt::DecodedDiagnostic const&)>::__call_impl<std::__u::__function::__default_alloc_func<main::$_0, void (tfrt::DecodedDiagnostic const&)> >(std::__u::__function::__policy_storage const*, tfrt::DecodedDiagnostic const&)
0000000000008256 0000000000000001 t void std::__u::__function::__policy_invoker<void (tfrt::DecodedDiagnostic)>::__call_impl<std::__u::__function::__default_alloc_func<main::$_0, void (tfrt::DecodedDiagnostic)> >(std::__u::__function::__policy_storage const*, tfrt::DecodedDiagnostic&&)
U strlen
0000000000005488 0000000000000000 t register_tm_clones
U puts
U putchar
U printf
U memcpy