netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC bpf-next 0/2] bpf: Fix BTF data for modules
@ 2021-10-23 12:04 Jiri Olsa
  2021-10-23 12:04 ` [PATCH bpf-next 1/2] kbuild: Unify options for BTF generation for vmlinux and modules Jiri Olsa
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Jiri Olsa @ 2021-10-23 12:04 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

hi,
I'm trying to enable BTF for kernel module in fedora,
and I'm getting big increase on modules sizes on s390x arch.

Size of modules in total - kernel dir under /lib/modules/VER/
from kernel-core and kernel-module packages:

               current   new
      aarch64      60M   76M
      ppc64le      53M   66M
      s390x        21M   41M
      x86_64       64M   79M

The reason for higher increase on s390x was that dedup algorithm
did not detect some of the big kernel structs like 'struct module',
so they are duplicated in the kernel module BTF data. The s390x
has many small modules that increased significantly in size because
of that even after compression.

First issues was that the '--btf_gen_floats' option is not passed
to pahole for kernel module BTF generation.

The other problem is more tricky and is the reason why this patchset
is RFC ;-)

The s390x compiler generates multiple definitions of the same struct
and dedup algorithm does not seem to handle this at the moment.

I put the debuginfo and btf dump of the s390x pnet.ko module in here:
  http://people.redhat.com/~jolsa/kmodbtf/

Please let me know if you'd like to see other info/files.

I found code in dedup that seems to handle such situation for arrays,
and added 'some' fix for structs. With that change I can no longer
see vmlinux's structs in kernel module BTF data, but I have no idea
if that breaks anything else.

thoughts? thanks,
jirka


---
Jiri Olsa (2):
      kbuild: Unify options for BTF generation for vmlinux and modules
      bpf: Add support to detect and dedup instances of same structs

 Makefile                  |  3 +++
 scripts/Makefile.modfinal |  2 +-
 scripts/link-vmlinux.sh   | 11 +----------
 scripts/pahole-flags.sh   | 20 ++++++++++++++++++++
 tools/lib/bpf/btf.c       | 12 ++++++++++--
 5 files changed, 35 insertions(+), 13 deletions(-)
 create mode 100755 scripts/pahole-flags.sh


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH bpf-next 1/2] kbuild: Unify options for BTF generation for vmlinux and modules
  2021-10-23 12:04 [RFC bpf-next 0/2] bpf: Fix BTF data for modules Jiri Olsa
@ 2021-10-23 12:04 ` Jiri Olsa
  2021-10-26  4:56   ` Andrii Nakryiko
  2021-10-23 12:04 ` [PATCH bpf-next 2/2] bpf: Add support to detect and dedup instances of same structs Jiri Olsa
  2021-10-26  4:54 ` [RFC bpf-next 0/2] bpf: Fix BTF data for modules Andrii Nakryiko
  2 siblings, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2021-10-23 12:04 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

Using new PAHOLE_FLAGS variable to pass extra arguments to
pahole for both vmlinux and modules BTF data generation.

Adding new scripts/pahole-flags.sh script that detect and
prints pahole options.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 Makefile                  |  3 +++
 scripts/Makefile.modfinal |  2 +-
 scripts/link-vmlinux.sh   | 11 +----------
 scripts/pahole-flags.sh   | 20 ++++++++++++++++++++
 4 files changed, 25 insertions(+), 11 deletions(-)
 create mode 100755 scripts/pahole-flags.sh

diff --git a/Makefile b/Makefile
index 437ccc66a1c2..ee514b80c62e 100644
--- a/Makefile
+++ b/Makefile
@@ -480,6 +480,8 @@ LZ4		= lz4c
 XZ		= xz
 ZSTD		= zstd
 
+PAHOLE_FLAGS	= $(shell PAHOLE=$(PAHOLE) scripts/pahole-flags.sh)
+
 CHECKFLAGS     := -D__linux__ -Dlinux -D__STDC__ -Dunix -D__unix__ \
 		  -Wbitwise -Wno-return-void -Wno-unknown-attribute $(CF)
 NOSTDINC_FLAGS :=
@@ -534,6 +536,7 @@ export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE
 export KBUILD_AFLAGS AFLAGS_KERNEL AFLAGS_MODULE
 export KBUILD_AFLAGS_MODULE KBUILD_CFLAGS_MODULE KBUILD_LDFLAGS_MODULE
 export KBUILD_AFLAGS_KERNEL KBUILD_CFLAGS_KERNEL
+export PAHOLE_FLAGS
 
 # Files to ignore in find ... statements
 
diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
index 1fb45b011e4b..7f39599e9fae 100644
--- a/scripts/Makefile.modfinal
+++ b/scripts/Makefile.modfinal
@@ -40,7 +40,7 @@ quiet_cmd_ld_ko_o = LD [M]  $@
 quiet_cmd_btf_ko = BTF [M] $@
       cmd_btf_ko = 							\
 	if [ -f vmlinux ]; then						\
-		LLVM_OBJCOPY="$(OBJCOPY)" $(PAHOLE) -J --btf_base vmlinux $@; \
+		LLVM_OBJCOPY="$(OBJCOPY)" $(PAHOLE) -J $(PAHOLE_FLAGS) --btf_base vmlinux $@; \
 		$(RESOLVE_BTFIDS) -b vmlinux $@; 			\
 	else								\
 		printf "Skipping BTF generation for %s due to unavailability of vmlinux\n" $@ 1>&2; \
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index d74cee5c4326..3ea7cece7c97 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -205,7 +205,6 @@ vmlinux_link()
 gen_btf()
 {
 	local pahole_ver
-	local extra_paholeopt=
 
 	if ! [ -x "$(command -v ${PAHOLE})" ]; then
 		echo >&2 "BTF: ${1}: pahole (${PAHOLE}) is not available"
@@ -220,16 +219,8 @@ gen_btf()
 
 	vmlinux_link ${1}
 
-	if [ "${pahole_ver}" -ge "118" ] && [ "${pahole_ver}" -le "121" ]; then
-		# pahole 1.18 through 1.21 can't handle zero-sized per-CPU vars
-		extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_vars"
-	fi
-	if [ "${pahole_ver}" -ge "121" ]; then
-		extra_paholeopt="${extra_paholeopt} --btf_gen_floats"
-	fi
-
 	info "BTF" ${2}
-	LLVM_OBJCOPY="${OBJCOPY}" ${PAHOLE} -J ${extra_paholeopt} ${1}
+	LLVM_OBJCOPY="${OBJCOPY}" ${PAHOLE} -J ${PAHOLE_FLAGS} ${1}
 
 	# Create ${2} which contains just .BTF section but no symbols. Add
 	# SHF_ALLOC because .BTF will be part of the vmlinux image. --strip-all
diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
new file mode 100755
index 000000000000..2b99fc77019c
--- /dev/null
+++ b/scripts/pahole-flags.sh
@@ -0,0 +1,20 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+extra_paholeopt=
+
+if ! [ -x "$(command -v ${PAHOLE})" ]; then
+	return
+fi
+
+pahole_ver=$(${PAHOLE} --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/')
+
+if [ "${pahole_ver}" -ge "118" ] && [ "${pahole_ver}" -le "121" ]; then
+	# pahole 1.18 through 1.21 can't handle zero-sized per-CPU vars
+	extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_vars"
+fi
+if [ "${pahole_ver}" -ge "121" ]; then
+	extra_paholeopt="${extra_paholeopt} --btf_gen_floats"
+fi
+
+echo ${extra_paholeopt}
-- 
2.31.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH bpf-next 2/2] bpf: Add support to detect and dedup instances of same structs
  2021-10-23 12:04 [RFC bpf-next 0/2] bpf: Fix BTF data for modules Jiri Olsa
  2021-10-23 12:04 ` [PATCH bpf-next 1/2] kbuild: Unify options for BTF generation for vmlinux and modules Jiri Olsa
@ 2021-10-23 12:04 ` Jiri Olsa
  2021-10-26  4:59   ` Andrii Nakryiko
  2021-10-26  4:54 ` [RFC bpf-next 0/2] bpf: Fix BTF data for modules Andrii Nakryiko
  2 siblings, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2021-10-23 12:04 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

The s390x compiler generates multiple definitions of the same struct
and dedup algorithm does not seem to handle this at the moment.

I found code in dedup that seems to handle such situation for arrays,
and added btf_dedup_is_equiv call for structs.

With this change I can no longer see vmlinux's structs in kernel
module BTF data, but I have no idea if that breaks anything else.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/lib/bpf/btf.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 3a01c4b7f36a..ec164d0cee30 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -3920,8 +3920,16 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id,
 		 * types within a single CU. So work around that by explicitly
 		 * allowing identical array types here.
 		 */
-		return hypot_type_id == cand_id ||
-		       btf_dedup_identical_arrays(d, hypot_type_id, cand_id);
+		struct btf_type *t;
+
+		if (hypot_type_id == cand_id)
+			return 1;
+		t = btf_type_by_id(d->btf, hypot_type_id);
+		if (btf_is_array(t))
+			return btf_dedup_identical_arrays(d, hypot_type_id, cand_id);
+		if (btf_is_struct(t))
+			return btf_dedup_is_equiv(d, hypot_type_id, cand_id);
+		return 0;
 	}
 
 	if (btf_dedup_hypot_map_add(d, canon_id, cand_id))
-- 
2.31.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC bpf-next 0/2] bpf: Fix BTF data for modules
  2021-10-23 12:04 [RFC bpf-next 0/2] bpf: Fix BTF data for modules Jiri Olsa
  2021-10-23 12:04 ` [PATCH bpf-next 1/2] kbuild: Unify options for BTF generation for vmlinux and modules Jiri Olsa
  2021-10-23 12:04 ` [PATCH bpf-next 2/2] bpf: Add support to detect and dedup instances of same structs Jiri Olsa
@ 2021-10-26  4:54 ` Andrii Nakryiko
  2021-10-26 12:03   ` Jiri Olsa
  2 siblings, 1 reply; 19+ messages in thread
From: Andrii Nakryiko @ 2021-10-26  4:54 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Sat, Oct 23, 2021 at 5:05 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> hi,
> I'm trying to enable BTF for kernel module in fedora,
> and I'm getting big increase on modules sizes on s390x arch.
>
> Size of modules in total - kernel dir under /lib/modules/VER/
> from kernel-core and kernel-module packages:
>
>                current   new
>       aarch64      60M   76M
>       ppc64le      53M   66M
>       s390x        21M   41M
>       x86_64       64M   79M
>
> The reason for higher increase on s390x was that dedup algorithm
> did not detect some of the big kernel structs like 'struct module',
> so they are duplicated in the kernel module BTF data. The s390x
> has many small modules that increased significantly in size because
> of that even after compression.
>
> First issues was that the '--btf_gen_floats' option is not passed
> to pahole for kernel module BTF generation.
>
> The other problem is more tricky and is the reason why this patchset
> is RFC ;-)
>
> The s390x compiler generates multiple definitions of the same struct
> and dedup algorithm does not seem to handle this at the moment.
>
> I put the debuginfo and btf dump of the s390x pnet.ko module in here:
>   http://people.redhat.com/~jolsa/kmodbtf/
>
> Please let me know if you'd like to see other info/files.
>

Hard to tell what's going on without vmlinux itself. Can you upload a
corresponding kernel image with BTF in it?

> I found code in dedup that seems to handle such situation for arrays,
> and added 'some' fix for structs. With that change I can no longer
> see vmlinux's structs in kernel module BTF data, but I have no idea
> if that breaks anything else.
>
> thoughts? thanks,
> jirka
>
>
> ---
> Jiri Olsa (2):
>       kbuild: Unify options for BTF generation for vmlinux and modules
>       bpf: Add support to detect and dedup instances of same structs
>
>  Makefile                  |  3 +++
>  scripts/Makefile.modfinal |  2 +-
>  scripts/link-vmlinux.sh   | 11 +----------
>  scripts/pahole-flags.sh   | 20 ++++++++++++++++++++
>  tools/lib/bpf/btf.c       | 12 ++++++++++--
>  5 files changed, 35 insertions(+), 13 deletions(-)
>  create mode 100755 scripts/pahole-flags.sh
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 1/2] kbuild: Unify options for BTF generation for vmlinux and modules
  2021-10-23 12:04 ` [PATCH bpf-next 1/2] kbuild: Unify options for BTF generation for vmlinux and modules Jiri Olsa
@ 2021-10-26  4:56   ` Andrii Nakryiko
  2021-10-26 12:03     ` Jiri Olsa
  0 siblings, 1 reply; 19+ messages in thread
From: Andrii Nakryiko @ 2021-10-26  4:56 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Sat, Oct 23, 2021 at 5:05 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> Using new PAHOLE_FLAGS variable to pass extra arguments to
> pahole for both vmlinux and modules BTF data generation.
>
> Adding new scripts/pahole-flags.sh script that detect and
> prints pahole options.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---

LGTM. I suggest posting it separately from the BTF dedup hack.

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>  Makefile                  |  3 +++
>  scripts/Makefile.modfinal |  2 +-
>  scripts/link-vmlinux.sh   | 11 +----------
>  scripts/pahole-flags.sh   | 20 ++++++++++++++++++++
>  4 files changed, 25 insertions(+), 11 deletions(-)
>  create mode 100755 scripts/pahole-flags.sh
>

[...]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 2/2] bpf: Add support to detect and dedup instances of same structs
  2021-10-23 12:04 ` [PATCH bpf-next 2/2] bpf: Add support to detect and dedup instances of same structs Jiri Olsa
@ 2021-10-26  4:59   ` Andrii Nakryiko
  0 siblings, 0 replies; 19+ messages in thread
From: Andrii Nakryiko @ 2021-10-26  4:59 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Sat, Oct 23, 2021 at 5:05 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> The s390x compiler generates multiple definitions of the same struct
> and dedup algorithm does not seem to handle this at the moment.
>
> I found code in dedup that seems to handle such situation for arrays,
> and added btf_dedup_is_equiv call for structs.
>
> With this change I can no longer see vmlinux's structs in kernel
> module BTF data, but I have no idea if that breaks anything else.
>

I'm pretty sure this is not the right way to handle this. Let's figure
out what is causing a difference in types (i.e., why they are not
equivalent), before attempting to patch up BTF dedup algorithm.

> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/lib/bpf/btf.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 3a01c4b7f36a..ec164d0cee30 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -3920,8 +3920,16 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id,
>                  * types within a single CU. So work around that by explicitly
>                  * allowing identical array types here.
>                  */
> -               return hypot_type_id == cand_id ||
> -                      btf_dedup_identical_arrays(d, hypot_type_id, cand_id);
> +               struct btf_type *t;
> +
> +               if (hypot_type_id == cand_id)
> +                       return 1;
> +               t = btf_type_by_id(d->btf, hypot_type_id);
> +               if (btf_is_array(t))
> +                       return btf_dedup_identical_arrays(d, hypot_type_id, cand_id);
> +               if (btf_is_struct(t))
> +                       return btf_dedup_is_equiv(d, hypot_type_id, cand_id);
> +               return 0;
>         }
>
>         if (btf_dedup_hypot_map_add(d, canon_id, cand_id))
> --
> 2.31.1
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC bpf-next 0/2] bpf: Fix BTF data for modules
  2021-10-26  4:54 ` [RFC bpf-next 0/2] bpf: Fix BTF data for modules Andrii Nakryiko
@ 2021-10-26 12:03   ` Jiri Olsa
  2021-10-27  4:12     ` Andrii Nakryiko
  0 siblings, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2021-10-26 12:03 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Mon, Oct 25, 2021 at 09:54:48PM -0700, Andrii Nakryiko wrote:
> On Sat, Oct 23, 2021 at 5:05 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > hi,
> > I'm trying to enable BTF for kernel module in fedora,
> > and I'm getting big increase on modules sizes on s390x arch.
> >
> > Size of modules in total - kernel dir under /lib/modules/VER/
> > from kernel-core and kernel-module packages:
> >
> >                current   new
> >       aarch64      60M   76M
> >       ppc64le      53M   66M
> >       s390x        21M   41M
> >       x86_64       64M   79M
> >
> > The reason for higher increase on s390x was that dedup algorithm
> > did not detect some of the big kernel structs like 'struct module',
> > so they are duplicated in the kernel module BTF data. The s390x
> > has many small modules that increased significantly in size because
> > of that even after compression.
> >
> > First issues was that the '--btf_gen_floats' option is not passed
> > to pahole for kernel module BTF generation.
> >
> > The other problem is more tricky and is the reason why this patchset
> > is RFC ;-)
> >
> > The s390x compiler generates multiple definitions of the same struct
> > and dedup algorithm does not seem to handle this at the moment.
> >
> > I put the debuginfo and btf dump of the s390x pnet.ko module in here:
> >   http://people.redhat.com/~jolsa/kmodbtf/
> >
> > Please let me know if you'd like to see other info/files.
> >
> 
> Hard to tell what's going on without vmlinux itself. Can you upload a
> corresponding kernel image with BTF in it?

sure, uploaded

jirka

> 
> > I found code in dedup that seems to handle such situation for arrays,
> > and added 'some' fix for structs. With that change I can no longer
> > see vmlinux's structs in kernel module BTF data, but I have no idea
> > if that breaks anything else.
> >
> > thoughts? thanks,
> > jirka
> >
> >
> > ---
> > Jiri Olsa (2):
> >       kbuild: Unify options for BTF generation for vmlinux and modules
> >       bpf: Add support to detect and dedup instances of same structs
> >
> >  Makefile                  |  3 +++
> >  scripts/Makefile.modfinal |  2 +-
> >  scripts/link-vmlinux.sh   | 11 +----------
> >  scripts/pahole-flags.sh   | 20 ++++++++++++++++++++
> >  tools/lib/bpf/btf.c       | 12 ++++++++++--
> >  5 files changed, 35 insertions(+), 13 deletions(-)
> >  create mode 100755 scripts/pahole-flags.sh
> >
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next 1/2] kbuild: Unify options for BTF generation for vmlinux and modules
  2021-10-26  4:56   ` Andrii Nakryiko
@ 2021-10-26 12:03     ` Jiri Olsa
  0 siblings, 0 replies; 19+ messages in thread
From: Jiri Olsa @ 2021-10-26 12:03 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Mon, Oct 25, 2021 at 09:56:37PM -0700, Andrii Nakryiko wrote:
> On Sat, Oct 23, 2021 at 5:05 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > Using new PAHOLE_FLAGS variable to pass extra arguments to
> > pahole for both vmlinux and modules BTF data generation.
> >
> > Adding new scripts/pahole-flags.sh script that detect and
> > prints pahole options.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> 
> LGTM. I suggest posting it separately from the BTF dedup hack.

ok, will post today

thanks,
jirka

> 
> Acked-by: Andrii Nakryiko <andrii@kernel.org>
> 
> >  Makefile                  |  3 +++
> >  scripts/Makefile.modfinal |  2 +-
> >  scripts/link-vmlinux.sh   | 11 +----------
> >  scripts/pahole-flags.sh   | 20 ++++++++++++++++++++
> >  4 files changed, 25 insertions(+), 11 deletions(-)
> >  create mode 100755 scripts/pahole-flags.sh
> >
> 
> [...]
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC bpf-next 0/2] bpf: Fix BTF data for modules
  2021-10-26 12:03   ` Jiri Olsa
@ 2021-10-27  4:12     ` Andrii Nakryiko
  2021-10-27  8:53       ` Jiri Olsa
  0 siblings, 1 reply; 19+ messages in thread
From: Andrii Nakryiko @ 2021-10-27  4:12 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Tue, Oct 26, 2021 at 5:03 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Mon, Oct 25, 2021 at 09:54:48PM -0700, Andrii Nakryiko wrote:
> > On Sat, Oct 23, 2021 at 5:05 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > hi,
> > > I'm trying to enable BTF for kernel module in fedora,
> > > and I'm getting big increase on modules sizes on s390x arch.
> > >
> > > Size of modules in total - kernel dir under /lib/modules/VER/
> > > from kernel-core and kernel-module packages:
> > >
> > >                current   new
> > >       aarch64      60M   76M
> > >       ppc64le      53M   66M
> > >       s390x        21M   41M
> > >       x86_64       64M   79M
> > >
> > > The reason for higher increase on s390x was that dedup algorithm
> > > did not detect some of the big kernel structs like 'struct module',
> > > so they are duplicated in the kernel module BTF data. The s390x
> > > has many small modules that increased significantly in size because
> > > of that even after compression.
> > >
> > > First issues was that the '--btf_gen_floats' option is not passed
> > > to pahole for kernel module BTF generation.
> > >
> > > The other problem is more tricky and is the reason why this patchset
> > > is RFC ;-)
> > >
> > > The s390x compiler generates multiple definitions of the same struct
> > > and dedup algorithm does not seem to handle this at the moment.
> > >
> > > I put the debuginfo and btf dump of the s390x pnet.ko module in here:
> > >   http://people.redhat.com/~jolsa/kmodbtf/
> > >
> > > Please let me know if you'd like to see other info/files.
> > >
> >
> > Hard to tell what's going on without vmlinux itself. Can you upload a
> > corresponding kernel image with BTF in it?
>
> sure, uploaded
>

vmlinux.btfdump:

[174] FLOAT 'float' size=4
[175] FLOAT 'double' size=8

VS

pnet.btfdump:

[89318] INT 'float' size=4 bits_offset=0 nr_bits=32 encoding=(none)
[89319] INT 'double' size=8 bits_offset=0 nr_bits=64 encoding=(none)


> jirka
>
> >
> > > I found code in dedup that seems to handle such situation for arrays,
> > > and added 'some' fix for structs. With that change I can no longer
> > > see vmlinux's structs in kernel module BTF data, but I have no idea
> > > if that breaks anything else.
> > >
> > > thoughts? thanks,
> > > jirka
> > >
> > >
> > > ---
> > > Jiri Olsa (2):
> > >       kbuild: Unify options for BTF generation for vmlinux and modules
> > >       bpf: Add support to detect and dedup instances of same structs
> > >
> > >  Makefile                  |  3 +++
> > >  scripts/Makefile.modfinal |  2 +-
> > >  scripts/link-vmlinux.sh   | 11 +----------
> > >  scripts/pahole-flags.sh   | 20 ++++++++++++++++++++
> > >  tools/lib/bpf/btf.c       | 12 ++++++++++--
> > >  5 files changed, 35 insertions(+), 13 deletions(-)
> > >  create mode 100755 scripts/pahole-flags.sh
> > >
> >
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC bpf-next 0/2] bpf: Fix BTF data for modules
  2021-10-27  4:12     ` Andrii Nakryiko
@ 2021-10-27  8:53       ` Jiri Olsa
  2021-10-27 17:53         ` Andrii Nakryiko
  0 siblings, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2021-10-27  8:53 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Tue, Oct 26, 2021 at 09:12:31PM -0700, Andrii Nakryiko wrote:
> On Tue, Oct 26, 2021 at 5:03 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Mon, Oct 25, 2021 at 09:54:48PM -0700, Andrii Nakryiko wrote:
> > > On Sat, Oct 23, 2021 at 5:05 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > >
> > > > hi,
> > > > I'm trying to enable BTF for kernel module in fedora,
> > > > and I'm getting big increase on modules sizes on s390x arch.
> > > >
> > > > Size of modules in total - kernel dir under /lib/modules/VER/
> > > > from kernel-core and kernel-module packages:
> > > >
> > > >                current   new
> > > >       aarch64      60M   76M
> > > >       ppc64le      53M   66M
> > > >       s390x        21M   41M
> > > >       x86_64       64M   79M
> > > >
> > > > The reason for higher increase on s390x was that dedup algorithm
> > > > did not detect some of the big kernel structs like 'struct module',
> > > > so they are duplicated in the kernel module BTF data. The s390x
> > > > has many small modules that increased significantly in size because
> > > > of that even after compression.
> > > >
> > > > First issues was that the '--btf_gen_floats' option is not passed
> > > > to pahole for kernel module BTF generation.
> > > >
> > > > The other problem is more tricky and is the reason why this patchset
> > > > is RFC ;-)
> > > >
> > > > The s390x compiler generates multiple definitions of the same struct
> > > > and dedup algorithm does not seem to handle this at the moment.
> > > >
> > > > I put the debuginfo and btf dump of the s390x pnet.ko module in here:
> > > >   http://people.redhat.com/~jolsa/kmodbtf/
> > > >
> > > > Please let me know if you'd like to see other info/files.
> > > >
> > >
> > > Hard to tell what's going on without vmlinux itself. Can you upload a
> > > corresponding kernel image with BTF in it?
> >
> > sure, uploaded
> >
> 
> vmlinux.btfdump:
> 
> [174] FLOAT 'float' size=4
> [175] FLOAT 'double' size=8
> 
> VS
> 
> pnet.btfdump:
> 
> [89318] INT 'float' size=4 bits_offset=0 nr_bits=32 encoding=(none)
> [89319] INT 'double' size=8 bits_offset=0 nr_bits=64 encoding=(none)

ugh, that's with no fix applied, sry

I applied the first patch and uploaded new files

now when I compare the 'module' struct from vmlinux:

	[885] STRUCT 'module' size=1280 vlen=70

and same one from pnet.ko:

	[89323] STRUCT 'module' size=1280 vlen=70

they seem to completely match, all the fields
and yet it still appears in the kmod's BTF

thanks,
jirka

> 
> 
> > jirka
> >
> > >
> > > > I found code in dedup that seems to handle such situation for arrays,
> > > > and added 'some' fix for structs. With that change I can no longer
> > > > see vmlinux's structs in kernel module BTF data, but I have no idea
> > > > if that breaks anything else.
> > > >
> > > > thoughts? thanks,
> > > > jirka
> > > >
> > > >
> > > > ---
> > > > Jiri Olsa (2):
> > > >       kbuild: Unify options for BTF generation for vmlinux and modules
> > > >       bpf: Add support to detect and dedup instances of same structs
> > > >
> > > >  Makefile                  |  3 +++
> > > >  scripts/Makefile.modfinal |  2 +-
> > > >  scripts/link-vmlinux.sh   | 11 +----------
> > > >  scripts/pahole-flags.sh   | 20 ++++++++++++++++++++
> > > >  tools/lib/bpf/btf.c       | 12 ++++++++++--
> > > >  5 files changed, 35 insertions(+), 13 deletions(-)
> > > >  create mode 100755 scripts/pahole-flags.sh
> > > >
> > >
> >
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC bpf-next 0/2] bpf: Fix BTF data for modules
  2021-10-27  8:53       ` Jiri Olsa
@ 2021-10-27 17:53         ` Andrii Nakryiko
  2021-10-27 18:18           ` Jiri Olsa
  2021-10-28  1:44           ` Kumar Kartikeya Dwivedi
  0 siblings, 2 replies; 19+ messages in thread
From: Andrii Nakryiko @ 2021-10-27 17:53 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Wed, Oct 27, 2021 at 1:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Tue, Oct 26, 2021 at 09:12:31PM -0700, Andrii Nakryiko wrote:
> > On Tue, Oct 26, 2021 at 5:03 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > On Mon, Oct 25, 2021 at 09:54:48PM -0700, Andrii Nakryiko wrote:
> > > > On Sat, Oct 23, 2021 at 5:05 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > >
> > > > > hi,
> > > > > I'm trying to enable BTF for kernel module in fedora,
> > > > > and I'm getting big increase on modules sizes on s390x arch.
> > > > >
> > > > > Size of modules in total - kernel dir under /lib/modules/VER/
> > > > > from kernel-core and kernel-module packages:
> > > > >
> > > > >                current   new
> > > > >       aarch64      60M   76M
> > > > >       ppc64le      53M   66M
> > > > >       s390x        21M   41M
> > > > >       x86_64       64M   79M
> > > > >
> > > > > The reason for higher increase on s390x was that dedup algorithm
> > > > > did not detect some of the big kernel structs like 'struct module',
> > > > > so they are duplicated in the kernel module BTF data. The s390x
> > > > > has many small modules that increased significantly in size because
> > > > > of that even after compression.
> > > > >
> > > > > First issues was that the '--btf_gen_floats' option is not passed
> > > > > to pahole for kernel module BTF generation.
> > > > >
> > > > > The other problem is more tricky and is the reason why this patchset
> > > > > is RFC ;-)
> > > > >
> > > > > The s390x compiler generates multiple definitions of the same struct
> > > > > and dedup algorithm does not seem to handle this at the moment.
> > > > >
> > > > > I put the debuginfo and btf dump of the s390x pnet.ko module in here:
> > > > >   http://people.redhat.com/~jolsa/kmodbtf/
> > > > >
> > > > > Please let me know if you'd like to see other info/files.
> > > > >
> > > >
> > > > Hard to tell what's going on without vmlinux itself. Can you upload a
> > > > corresponding kernel image with BTF in it?
> > >
> > > sure, uploaded
> > >
> >
> > vmlinux.btfdump:
> >
> > [174] FLOAT 'float' size=4
> > [175] FLOAT 'double' size=8
> >
> > VS
> >
> > pnet.btfdump:
> >
> > [89318] INT 'float' size=4 bits_offset=0 nr_bits=32 encoding=(none)
> > [89319] INT 'double' size=8 bits_offset=0 nr_bits=64 encoding=(none)
>
> ugh, that's with no fix applied, sry
>
> I applied the first patch and uploaded new files
>
> now when I compare the 'module' struct from vmlinux:
>
>         [885] STRUCT 'module' size=1280 vlen=70
>
> and same one from pnet.ko:
>
>         [89323] STRUCT 'module' size=1280 vlen=70
>
> they seem to completely match, all the fields
> and yet it still appears in the kmod's BTF
>

Ok, now struct module is identical down to the types referenced from
the fields, which means it should have been deduplicated completely.
This will require a more time-consuming debugging, though, so I'll put
it on my TODO list for now. If you get to this earlier, see where the
equivalence check fails in btf_dedup (sprinkle debug outputs around to
see what's going on).

> thanks,
> jirka
>
> >
> >
> > > jirka
> > >
> > > >
> > > > > I found code in dedup that seems to handle such situation for arrays,
> > > > > and added 'some' fix for structs. With that change I can no longer
> > > > > see vmlinux's structs in kernel module BTF data, but I have no idea
> > > > > if that breaks anything else.
> > > > >
> > > > > thoughts? thanks,
> > > > > jirka
> > > > >
> > > > >
> > > > > ---
> > > > > Jiri Olsa (2):
> > > > >       kbuild: Unify options for BTF generation for vmlinux and modules
> > > > >       bpf: Add support to detect and dedup instances of same structs
> > > > >
> > > > >  Makefile                  |  3 +++
> > > > >  scripts/Makefile.modfinal |  2 +-
> > > > >  scripts/link-vmlinux.sh   | 11 +----------
> > > > >  scripts/pahole-flags.sh   | 20 ++++++++++++++++++++
> > > > >  tools/lib/bpf/btf.c       | 12 ++++++++++--
> > > > >  5 files changed, 35 insertions(+), 13 deletions(-)
> > > > >  create mode 100755 scripts/pahole-flags.sh
> > > > >
> > > >
> > >
> >
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC bpf-next 0/2] bpf: Fix BTF data for modules
  2021-10-27 17:53         ` Andrii Nakryiko
@ 2021-10-27 18:18           ` Jiri Olsa
  2021-10-28 19:12             ` Jiri Olsa
  2021-10-28  1:44           ` Kumar Kartikeya Dwivedi
  1 sibling, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2021-10-27 18:18 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Wed, Oct 27, 2021 at 10:53:55AM -0700, Andrii Nakryiko wrote:
> On Wed, Oct 27, 2021 at 1:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Tue, Oct 26, 2021 at 09:12:31PM -0700, Andrii Nakryiko wrote:
> > > On Tue, Oct 26, 2021 at 5:03 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > >
> > > > On Mon, Oct 25, 2021 at 09:54:48PM -0700, Andrii Nakryiko wrote:
> > > > > On Sat, Oct 23, 2021 at 5:05 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > >
> > > > > > hi,
> > > > > > I'm trying to enable BTF for kernel module in fedora,
> > > > > > and I'm getting big increase on modules sizes on s390x arch.
> > > > > >
> > > > > > Size of modules in total - kernel dir under /lib/modules/VER/
> > > > > > from kernel-core and kernel-module packages:
> > > > > >
> > > > > >                current   new
> > > > > >       aarch64      60M   76M
> > > > > >       ppc64le      53M   66M
> > > > > >       s390x        21M   41M
> > > > > >       x86_64       64M   79M
> > > > > >
> > > > > > The reason for higher increase on s390x was that dedup algorithm
> > > > > > did not detect some of the big kernel structs like 'struct module',
> > > > > > so they are duplicated in the kernel module BTF data. The s390x
> > > > > > has many small modules that increased significantly in size because
> > > > > > of that even after compression.
> > > > > >
> > > > > > First issues was that the '--btf_gen_floats' option is not passed
> > > > > > to pahole for kernel module BTF generation.
> > > > > >
> > > > > > The other problem is more tricky and is the reason why this patchset
> > > > > > is RFC ;-)
> > > > > >
> > > > > > The s390x compiler generates multiple definitions of the same struct
> > > > > > and dedup algorithm does not seem to handle this at the moment.
> > > > > >
> > > > > > I put the debuginfo and btf dump of the s390x pnet.ko module in here:
> > > > > >   http://people.redhat.com/~jolsa/kmodbtf/
> > > > > >
> > > > > > Please let me know if you'd like to see other info/files.
> > > > > >
> > > > >
> > > > > Hard to tell what's going on without vmlinux itself. Can you upload a
> > > > > corresponding kernel image with BTF in it?
> > > >
> > > > sure, uploaded
> > > >
> > >
> > > vmlinux.btfdump:
> > >
> > > [174] FLOAT 'float' size=4
> > > [175] FLOAT 'double' size=8
> > >
> > > VS
> > >
> > > pnet.btfdump:
> > >
> > > [89318] INT 'float' size=4 bits_offset=0 nr_bits=32 encoding=(none)
> > > [89319] INT 'double' size=8 bits_offset=0 nr_bits=64 encoding=(none)
> >
> > ugh, that's with no fix applied, sry
> >
> > I applied the first patch and uploaded new files
> >
> > now when I compare the 'module' struct from vmlinux:
> >
> >         [885] STRUCT 'module' size=1280 vlen=70
> >
> > and same one from pnet.ko:
> >
> >         [89323] STRUCT 'module' size=1280 vlen=70
> >
> > they seem to completely match, all the fields
> > and yet it still appears in the kmod's BTF
> >
> 
> Ok, now struct module is identical down to the types referenced from
> the fields, which means it should have been deduplicated completely.
> This will require a more time-consuming debugging, though, so I'll put
> it on my TODO list for now. If you get to this earlier, see where the
> equivalence check fails in btf_dedup (sprinkle debug outputs around to
> see what's going on).

it failed for me on that hypot_type_id check where I did fix,
I thought it's the issue of multiple same struct in the kmod,
but now I see I might have confused cannon_id with cand_id ;-)
I'll check more on this

jirka


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC bpf-next 0/2] bpf: Fix BTF data for modules
  2021-10-27 17:53         ` Andrii Nakryiko
  2021-10-27 18:18           ` Jiri Olsa
@ 2021-10-28  1:44           ` Kumar Kartikeya Dwivedi
  1 sibling, 0 replies; 19+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2021-10-28  1:44 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

On Wed, Oct 27, 2021 at 11:23:55PM IST, Andrii Nakryiko wrote:
> On Wed, Oct 27, 2021 at 1:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Tue, Oct 26, 2021 at 09:12:31PM -0700, Andrii Nakryiko wrote:
> > > On Tue, Oct 26, 2021 at 5:03 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > >
> > > > On Mon, Oct 25, 2021 at 09:54:48PM -0700, Andrii Nakryiko wrote:
> > > > > On Sat, Oct 23, 2021 at 5:05 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > >
> > > > > > hi,
> > > > > > I'm trying to enable BTF for kernel module in fedora,
> > > > > > and I'm getting big increase on modules sizes on s390x arch.
> > > > > >
> > > > > > Size of modules in total - kernel dir under /lib/modules/VER/
> > > > > > from kernel-core and kernel-module packages:
> > > > > >
> > > > > >                current   new
> > > > > >       aarch64      60M   76M
> > > > > >       ppc64le      53M   66M
> > > > > >       s390x        21M   41M
> > > > > >       x86_64       64M   79M
> > > > > >
> > > > > > The reason for higher increase on s390x was that dedup algorithm
> > > > > > did not detect some of the big kernel structs like 'struct module',
> > > > > > so they are duplicated in the kernel module BTF data. The s390x
> > > > > > has many small modules that increased significantly in size because
> > > > > > of that even after compression.
> > > > > >
> > > > > > First issues was that the '--btf_gen_floats' option is not passed
> > > > > > to pahole for kernel module BTF generation.
> > > > > >
> > > > > > The other problem is more tricky and is the reason why this patchset
> > > > > > is RFC ;-)
> > > > > >
> > > > > > The s390x compiler generates multiple definitions of the same struct
> > > > > > and dedup algorithm does not seem to handle this at the moment.
> > > > > >
> > > > > > I put the debuginfo and btf dump of the s390x pnet.ko module in here:
> > > > > >   http://people.redhat.com/~jolsa/kmodbtf/
> > > > > >
> > > > > > Please let me know if you'd like to see other info/files.
> > > > > >
> > > > >
> > > > > Hard to tell what's going on without vmlinux itself. Can you upload a
> > > > > corresponding kernel image with BTF in it?
> > > >
> > > > sure, uploaded
> > > >
> > >
> > > vmlinux.btfdump:
> > >
> > > [174] FLOAT 'float' size=4
> > > [175] FLOAT 'double' size=8
> > >
> > > VS
> > >
> > > pnet.btfdump:
> > >
> > > [89318] INT 'float' size=4 bits_offset=0 nr_bits=32 encoding=(none)
> > > [89319] INT 'double' size=8 bits_offset=0 nr_bits=64 encoding=(none)
> >
> > ugh, that's with no fix applied, sry
> >
> > I applied the first patch and uploaded new files
> >
> > now when I compare the 'module' struct from vmlinux:
> >
> >         [885] STRUCT 'module' size=1280 vlen=70
> >
> > and same one from pnet.ko:
> >
> >         [89323] STRUCT 'module' size=1280 vlen=70
> >
> > they seem to completely match, all the fields
> > and yet it still appears in the kmod's BTF
> >
>
> Ok, now struct module is identical down to the types referenced from
> the fields, which means it should have been deduplicated completely.
> This will require a more time-consuming debugging, though, so I'll put
> it on my TODO list for now. If you get to this earlier, see where the
> equivalence check fails in btf_dedup (sprinkle debug outputs around to
> see what's going on).
>

Hello Andrii,

I think I'm seeing something similar when working on the conntrack patches [0],
I was looking to match whether the type in a PTR_TO_BTF_ID register is same as
struct nf_conn, but it seems that there can be two BTF IDs for the same struct
type.

When doing bpftool dump, I see:

 ; bpftool btf dump file /sys/kernel/btf/vmlinux format raw | grep nf_conn
 ...
[89224] STRUCT 'nf_conn' size=256 vlen=15
 ...

 ; bpftool btf dump file /sys/kernel/btf/nf_conntrack format raw | grep nf_conn
 ...
[103077] STRUCT 'nf_conn' size=256 vlen=15
[104988] STRUCT 'nf_conn' size=256 vlen=15
[106490] STRUCT 'nf_conn' size=256 vlen=15
[108187] STRUCT 'nf_conn' size=256 vlen=15
 ...

Inside the kernel, when trying to match both, register PTR_TO_BTF_ID refers to
the nf_conntrack BTF ID, while the BTF_ID_LIST resolves to the one in vmlinux,
this ends up making the job of matching the two struct types a bit difficult
(for now, I am thinking of going with btf_struct_ids_match). My original plan
was to compare the result of btf_types_by_id.

[0]: https://github.com/kkdwivedi/linux/commits/conntrack-bpf

> > thanks,
> > jirka
> >
> > >
> > >
> > > > jirka
> > > >
> > > > >
> > > > > > I found code in dedup that seems to handle such situation for arrays,
> > > > > > and added 'some' fix for structs. With that change I can no longer
> > > > > > see vmlinux's structs in kernel module BTF data, but I have no idea
> > > > > > if that breaks anything else.
> > > > > >
> > > > > > thoughts? thanks,
> > > > > > jirka
> > > > > >
> > > > > >
> > > > > > ---
> > > > > > Jiri Olsa (2):
> > > > > >       kbuild: Unify options for BTF generation for vmlinux and modules
> > > > > >       bpf: Add support to detect and dedup instances of same structs
> > > > > >
> > > > > >  Makefile                  |  3 +++
> > > > > >  scripts/Makefile.modfinal |  2 +-
> > > > > >  scripts/link-vmlinux.sh   | 11 +----------
> > > > > >  scripts/pahole-flags.sh   | 20 ++++++++++++++++++++
> > > > > >  tools/lib/bpf/btf.c       | 12 ++++++++++--
> > > > > >  5 files changed, 35 insertions(+), 13 deletions(-)
> > > > > >  create mode 100755 scripts/pahole-flags.sh
> > > > > >
> > > > >
> > > >
> > >
> >

--
Kartikeya

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC bpf-next 0/2] bpf: Fix BTF data for modules
  2021-10-27 18:18           ` Jiri Olsa
@ 2021-10-28 19:12             ` Jiri Olsa
  2021-11-01 23:14               ` Andrii Nakryiko
  0 siblings, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2021-10-28 19:12 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Wed, Oct 27, 2021 at 08:18:11PM +0200, Jiri Olsa wrote:
> On Wed, Oct 27, 2021 at 10:53:55AM -0700, Andrii Nakryiko wrote:
> > On Wed, Oct 27, 2021 at 1:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > On Tue, Oct 26, 2021 at 09:12:31PM -0700, Andrii Nakryiko wrote:
> > > > On Tue, Oct 26, 2021 at 5:03 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > >
> > > > > On Mon, Oct 25, 2021 at 09:54:48PM -0700, Andrii Nakryiko wrote:
> > > > > > On Sat, Oct 23, 2021 at 5:05 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > > >
> > > > > > > hi,
> > > > > > > I'm trying to enable BTF for kernel module in fedora,
> > > > > > > and I'm getting big increase on modules sizes on s390x arch.
> > > > > > >
> > > > > > > Size of modules in total - kernel dir under /lib/modules/VER/
> > > > > > > from kernel-core and kernel-module packages:
> > > > > > >
> > > > > > >                current   new
> > > > > > >       aarch64      60M   76M
> > > > > > >       ppc64le      53M   66M
> > > > > > >       s390x        21M   41M
> > > > > > >       x86_64       64M   79M
> > > > > > >
> > > > > > > The reason for higher increase on s390x was that dedup algorithm
> > > > > > > did not detect some of the big kernel structs like 'struct module',
> > > > > > > so they are duplicated in the kernel module BTF data. The s390x
> > > > > > > has many small modules that increased significantly in size because
> > > > > > > of that even after compression.
> > > > > > >
> > > > > > > First issues was that the '--btf_gen_floats' option is not passed
> > > > > > > to pahole for kernel module BTF generation.
> > > > > > >
> > > > > > > The other problem is more tricky and is the reason why this patchset
> > > > > > > is RFC ;-)
> > > > > > >
> > > > > > > The s390x compiler generates multiple definitions of the same struct
> > > > > > > and dedup algorithm does not seem to handle this at the moment.
> > > > > > >
> > > > > > > I put the debuginfo and btf dump of the s390x pnet.ko module in here:
> > > > > > >   http://people.redhat.com/~jolsa/kmodbtf/
> > > > > > >
> > > > > > > Please let me know if you'd like to see other info/files.
> > > > > > >
> > > > > >
> > > > > > Hard to tell what's going on without vmlinux itself. Can you upload a
> > > > > > corresponding kernel image with BTF in it?
> > > > >
> > > > > sure, uploaded
> > > > >
> > > >
> > > > vmlinux.btfdump:
> > > >
> > > > [174] FLOAT 'float' size=4
> > > > [175] FLOAT 'double' size=8
> > > >
> > > > VS
> > > >
> > > > pnet.btfdump:
> > > >
> > > > [89318] INT 'float' size=4 bits_offset=0 nr_bits=32 encoding=(none)
> > > > [89319] INT 'double' size=8 bits_offset=0 nr_bits=64 encoding=(none)
> > >
> > > ugh, that's with no fix applied, sry
> > >
> > > I applied the first patch and uploaded new files
> > >
> > > now when I compare the 'module' struct from vmlinux:
> > >
> > >         [885] STRUCT 'module' size=1280 vlen=70
> > >
> > > and same one from pnet.ko:
> > >
> > >         [89323] STRUCT 'module' size=1280 vlen=70
> > >
> > > they seem to completely match, all the fields
> > > and yet it still appears in the kmod's BTF
> > >
> > 
> > Ok, now struct module is identical down to the types referenced from
> > the fields, which means it should have been deduplicated completely.
> > This will require a more time-consuming debugging, though, so I'll put
> > it on my TODO list for now. If you get to this earlier, see where the
> > equivalence check fails in btf_dedup (sprinkle debug outputs around to
> > see what's going on).
> 
> it failed for me on that hypot_type_id check where I did fix,
> I thought it's the issue of multiple same struct in the kmod,
> but now I see I might have confused cannon_id with cand_id ;-)
> I'll check more on this

with more checking I got to the same conclusion as before,
now maybe with little more details ;-)

the problem seems to be that in some cases the module BTF
data stores same structs under new/different IDs, while the
kernel BTF data is already dedup-ed

the dedup algo keeps hypot_map of kernel IDs to kmod IDs,
and in my case it will get to the point that the kernel ID
is already 'known' and points to certain kmod ID 'A', but it
is also equiv to another kmod ID 'B' (so kmod ID 'A' and 'B'
are equiv structs) but the dedup will claim as not equiv


This is where the dedup fails for me on that s390 data:

The pt_regs is defined as:

        struct pt_regs
        {
                union {
                        user_pt_regs user_regs;
                        struct {
                                unsigned long args[1];
                                psw_t psw;
                                unsigned long gprs[NUM_GPRS];
                        };
                };
                ...
        };

considering just the first union:

        [186] UNION '(anon)' size=152 vlen=2
                'user_regs' type_id=183 bits_offset=0
                '(anon)' type_id=181 bits_offset=0

        [91251] UNION '(anon)' size=152 vlen=2
                'user_regs' type_id=91247 bits_offset=0
                '(anon)' type_id=91250 bits_offset=0


---------------------------------------------------------------

Comparing the first member 'user_regs':

        struct pt_regs
        {
                union {
    --->                user_pt_regs user_regs;
                        struct {
                                unsigned long args[1];
                                psw_t psw;
                                unsigned long gprs[NUM_GPRS];
                        };
                };

Which looks like:

        typedef struct {
                unsigned long args[1];
                psw_t psw;
                unsigned long gprs[NUM_GPRS];
        } user_pt_regs;


and is also equiv to the next union member struct.. and that's what
kernel knows but not kmod... anyway,


the dedup will compare 'user_pt_regs':

        [183] TYPEDEF 'user_pt_regs' type_id=181

        [91247] TYPEDEF 'user_pt_regs' type_id=91245


        [181] STRUCT '(anon)' size=152 vlen=3
                'args' type_id=182 bits_offset=0
                'psw' type_id=179 bits_offset=64
                'gprs' type_id=48 bits_offset=192

        [91245] STRUCT '(anon)' size=152 vlen=3
                'args' type_id=91246 bits_offset=0
                'psw' type_id=91243 bits_offset=64
                'gprs' type_id=91132 bits_offset=192

and make them equiv by setting hypot_type_id for 181 to be 91245


---------------------------------------------------------------

Now comparing the second member:

        struct pt_regs
        {
                union {
                        user_pt_regs user_regs;
    --->                struct {
                                unsigned long args[1];
                                psw_t psw;
                                unsigned long gprs[NUM_GPRS];
                        };
                };


kernel knows it's same struct as user_pt_regs and uses ID 181

        [186] UNION '(anon)' size=152 vlen=2
                'user_regs' type_id=183 bits_offset=0
                '(anon)' type_id=181 bits_offset=0

but kmod has new ID 91250 (not 91245):

        [91251] UNION '(anon)' size=152 vlen=2
                'user_regs' type_id=91247 bits_offset=0
                '(anon)' type_id=91250 bits_offset=0


and 181 and 91250 are equiv structs:

        [181] STRUCT '(anon)' size=152 vlen=3
                'args' type_id=182 bits_offset=0
                'psw' type_id=179 bits_offset=64
                'gprs' type_id=48 bits_offset=192

        [91250] STRUCT '(anon)' size=152 vlen=3
                'args' type_id=91246 bits_offset=0
                'psw' type_id=91243 bits_offset=64
                'gprs' type_id=91132 bits_offset=192


now hypot_type_id for 181 is 91245, but we have brand new struct
ID 91250, so we fail

what the patch tries to do is at this point to compare ID 91250
with 91245 and if it passes then we are equal and we throw away
ID 91250 because the hypot_type_id for 181 stays 91245


ufff.. thoughts? ;-)

thanks,
jirka


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC bpf-next 0/2] bpf: Fix BTF data for modules
  2021-10-28 19:12             ` Jiri Olsa
@ 2021-11-01 23:14               ` Andrii Nakryiko
  2021-11-02 14:14                 ` Jiri Olsa
  0 siblings, 1 reply; 19+ messages in thread
From: Andrii Nakryiko @ 2021-11-01 23:14 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Thu, Oct 28, 2021 at 12:12 PM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Wed, Oct 27, 2021 at 08:18:11PM +0200, Jiri Olsa wrote:
> > On Wed, Oct 27, 2021 at 10:53:55AM -0700, Andrii Nakryiko wrote:
> > > On Wed, Oct 27, 2021 at 1:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > >
> > > > On Tue, Oct 26, 2021 at 09:12:31PM -0700, Andrii Nakryiko wrote:
> > > > > On Tue, Oct 26, 2021 at 5:03 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, Oct 25, 2021 at 09:54:48PM -0700, Andrii Nakryiko wrote:
> > > > > > > On Sat, Oct 23, 2021 at 5:05 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > > > >
> > > > > > > > hi,
> > > > > > > > I'm trying to enable BTF for kernel module in fedora,
> > > > > > > > and I'm getting big increase on modules sizes on s390x arch.
> > > > > > > >
> > > > > > > > Size of modules in total - kernel dir under /lib/modules/VER/
> > > > > > > > from kernel-core and kernel-module packages:
> > > > > > > >
> > > > > > > >                current   new
> > > > > > > >       aarch64      60M   76M
> > > > > > > >       ppc64le      53M   66M
> > > > > > > >       s390x        21M   41M
> > > > > > > >       x86_64       64M   79M
> > > > > > > >
> > > > > > > > The reason for higher increase on s390x was that dedup algorithm
> > > > > > > > did not detect some of the big kernel structs like 'struct module',
> > > > > > > > so they are duplicated in the kernel module BTF data. The s390x
> > > > > > > > has many small modules that increased significantly in size because
> > > > > > > > of that even after compression.
> > > > > > > >
> > > > > > > > First issues was that the '--btf_gen_floats' option is not passed
> > > > > > > > to pahole for kernel module BTF generation.
> > > > > > > >
> > > > > > > > The other problem is more tricky and is the reason why this patchset
> > > > > > > > is RFC ;-)
> > > > > > > >
> > > > > > > > The s390x compiler generates multiple definitions of the same struct
> > > > > > > > and dedup algorithm does not seem to handle this at the moment.
> > > > > > > >
> > > > > > > > I put the debuginfo and btf dump of the s390x pnet.ko module in here:
> > > > > > > >   http://people.redhat.com/~jolsa/kmodbtf/
> > > > > > > >
> > > > > > > > Please let me know if you'd like to see other info/files.
> > > > > > > >
> > > > > > >
> > > > > > > Hard to tell what's going on without vmlinux itself. Can you upload a
> > > > > > > corresponding kernel image with BTF in it?
> > > > > >
> > > > > > sure, uploaded
> > > > > >
> > > > >
> > > > > vmlinux.btfdump:
> > > > >
> > > > > [174] FLOAT 'float' size=4
> > > > > [175] FLOAT 'double' size=8
> > > > >
> > > > > VS
> > > > >
> > > > > pnet.btfdump:
> > > > >
> > > > > [89318] INT 'float' size=4 bits_offset=0 nr_bits=32 encoding=(none)
> > > > > [89319] INT 'double' size=8 bits_offset=0 nr_bits=64 encoding=(none)
> > > >
> > > > ugh, that's with no fix applied, sry
> > > >
> > > > I applied the first patch and uploaded new files
> > > >
> > > > now when I compare the 'module' struct from vmlinux:
> > > >
> > > >         [885] STRUCT 'module' size=1280 vlen=70
> > > >
> > > > and same one from pnet.ko:
> > > >
> > > >         [89323] STRUCT 'module' size=1280 vlen=70
> > > >
> > > > they seem to completely match, all the fields
> > > > and yet it still appears in the kmod's BTF
> > > >
> > >
> > > Ok, now struct module is identical down to the types referenced from
> > > the fields, which means it should have been deduplicated completely.
> > > This will require a more time-consuming debugging, though, so I'll put
> > > it on my TODO list for now. If you get to this earlier, see where the
> > > equivalence check fails in btf_dedup (sprinkle debug outputs around to
> > > see what's going on).
> >
> > it failed for me on that hypot_type_id check where I did fix,
> > I thought it's the issue of multiple same struct in the kmod,
> > but now I see I might have confused cannon_id with cand_id ;-)
> > I'll check more on this
>
> with more checking I got to the same conclusion as before,
> now maybe with little more details ;-)
>
> the problem seems to be that in some cases the module BTF
> data stores same structs under new/different IDs, while the
> kernel BTF data is already dedup-ed
>
> the dedup algo keeps hypot_map of kernel IDs to kmod IDs,
> and in my case it will get to the point that the kernel ID
> is already 'known' and points to certain kmod ID 'A', but it
> is also equiv to another kmod ID 'B' (so kmod ID 'A' and 'B'
> are equiv structs) but the dedup will claim as not equiv
>
>
> This is where the dedup fails for me on that s390 data:
>
> The pt_regs is defined as:
>
>         struct pt_regs
>         {
>                 union {
>                         user_pt_regs user_regs;
>                         struct {
>                                 unsigned long args[1];
>                                 psw_t psw;
>                                 unsigned long gprs[NUM_GPRS];
>                         };
>                 };
>                 ...
>         };
>
> considering just the first union:
>
>         [186] UNION '(anon)' size=152 vlen=2
>                 'user_regs' type_id=183 bits_offset=0
>                 '(anon)' type_id=181 bits_offset=0
>
>         [91251] UNION '(anon)' size=152 vlen=2
>                 'user_regs' type_id=91247 bits_offset=0
>                 '(anon)' type_id=91250 bits_offset=0
>
>
> ---------------------------------------------------------------
>
> Comparing the first member 'user_regs':
>
>         struct pt_regs
>         {
>                 union {
>     --->                user_pt_regs user_regs;
>                         struct {
>                                 unsigned long args[1];
>                                 psw_t psw;
>                                 unsigned long gprs[NUM_GPRS];
>                         };
>                 };
>
> Which looks like:
>
>         typedef struct {
>                 unsigned long args[1];
>                 psw_t psw;
>                 unsigned long gprs[NUM_GPRS];
>         } user_pt_regs;
>
>
> and is also equiv to the next union member struct.. and that's what
> kernel knows but not kmod... anyway,
>
>
> the dedup will compare 'user_pt_regs':
>
>         [183] TYPEDEF 'user_pt_regs' type_id=181
>
>         [91247] TYPEDEF 'user_pt_regs' type_id=91245
>
>
>         [181] STRUCT '(anon)' size=152 vlen=3
>                 'args' type_id=182 bits_offset=0
>                 'psw' type_id=179 bits_offset=64
>                 'gprs' type_id=48 bits_offset=192
>
>         [91245] STRUCT '(anon)' size=152 vlen=3
>                 'args' type_id=91246 bits_offset=0
>                 'psw' type_id=91243 bits_offset=64
>                 'gprs' type_id=91132 bits_offset=192
>
> and make them equiv by setting hypot_type_id for 181 to be 91245
>
>
> ---------------------------------------------------------------
>
> Now comparing the second member:
>
>         struct pt_regs
>         {
>                 union {
>                         user_pt_regs user_regs;
>     --->                struct {
>                                 unsigned long args[1];
>                                 psw_t psw;
>                                 unsigned long gprs[NUM_GPRS];
>                         };
>                 };
>
>
> kernel knows it's same struct as user_pt_regs and uses ID 181
>
>         [186] UNION '(anon)' size=152 vlen=2
>                 'user_regs' type_id=183 bits_offset=0
>                 '(anon)' type_id=181 bits_offset=0
>
> but kmod has new ID 91250 (not 91245):
>
>         [91251] UNION '(anon)' size=152 vlen=2
>                 'user_regs' type_id=91247 bits_offset=0
>                 '(anon)' type_id=91250 bits_offset=0
>
>
> and 181 and 91250 are equiv structs:
>
>         [181] STRUCT '(anon)' size=152 vlen=3
>                 'args' type_id=182 bits_offset=0
>                 'psw' type_id=179 bits_offset=64
>                 'gprs' type_id=48 bits_offset=192
>
>         [91250] STRUCT '(anon)' size=152 vlen=3
>                 'args' type_id=91246 bits_offset=0
>                 'psw' type_id=91243 bits_offset=64
>                 'gprs' type_id=91132 bits_offset=192
>
>
> now hypot_type_id for 181 is 91245, but we have brand new struct
> ID 91250, so we fail
>
> what the patch tries to do is at this point to compare ID 91250
> with 91245 and if it passes then we are equal and we throw away
> ID 91250 because the hypot_type_id for 181 stays 91245
>
>
> ufff.. thoughts? ;-)

Oh, this is a really great analysis, thanks a lot! It makes everything
clear. Basically, BTF dedup algo does too good job deduping vmlinux
BTF. :)

What's not clear is what to do about that, because a (current)
fundamental assumption of is_equiv() check is that any type within CU
(or in this case deduped vmlinux BTF) has exactly one unique mapping.
Clearly that's not the case now. That array fix you mentioned worked
around GCC bug where this assumption broke. In this case it's not a
bug of a compiler (neither of algo, really), we just need to make algo
smarter.

Let me think about this a bit, we'll need to make the equivalence
check be aware that there could be multiple equivalent mappings and be
ok with that as long as all candidates are equivalent between
themselves. Lots of equivalence and recursion to think about.

It would be great to have a simplified test case to play with that. Do
you mind distilling the chain of types above into a selftests and
posting it to the mailing list so that I can play with it? It
shouldn't be hard to write given BTF writing APIs. And we'll need a
selftests anyway once we improve the algo, so it's definitely not a
wasted work.

And thanks again for analysis and writing this down, it would take me
ages to get to this otherwise.

P.S. If the improved BTF dedup algo will be able to handle this, we
should also remove the array workaround, because that one should work
automatically. I don't know if we have a test for duplicate array
scenario, but it's probably good to have that as well.


>
> thanks,
> jirka
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC bpf-next 0/2] bpf: Fix BTF data for modules
  2021-11-01 23:14               ` Andrii Nakryiko
@ 2021-11-02 14:14                 ` Jiri Olsa
  2021-11-07 14:57                   ` Jiri Olsa
  0 siblings, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2021-11-02 14:14 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Mon, Nov 01, 2021 at 04:14:29PM -0700, Andrii Nakryiko wrote:
> On Thu, Oct 28, 2021 at 12:12 PM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Wed, Oct 27, 2021 at 08:18:11PM +0200, Jiri Olsa wrote:
> > > On Wed, Oct 27, 2021 at 10:53:55AM -0700, Andrii Nakryiko wrote:
> > > > On Wed, Oct 27, 2021 at 1:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > >
> > > > > On Tue, Oct 26, 2021 at 09:12:31PM -0700, Andrii Nakryiko wrote:
> > > > > > On Tue, Oct 26, 2021 at 5:03 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > > >
> > > > > > > On Mon, Oct 25, 2021 at 09:54:48PM -0700, Andrii Nakryiko wrote:
> > > > > > > > On Sat, Oct 23, 2021 at 5:05 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > hi,
> > > > > > > > > I'm trying to enable BTF for kernel module in fedora,
> > > > > > > > > and I'm getting big increase on modules sizes on s390x arch.
> > > > > > > > >
> > > > > > > > > Size of modules in total - kernel dir under /lib/modules/VER/
> > > > > > > > > from kernel-core and kernel-module packages:
> > > > > > > > >
> > > > > > > > >                current   new
> > > > > > > > >       aarch64      60M   76M
> > > > > > > > >       ppc64le      53M   66M
> > > > > > > > >       s390x        21M   41M
> > > > > > > > >       x86_64       64M   79M
> > > > > > > > >
> > > > > > > > > The reason for higher increase on s390x was that dedup algorithm
> > > > > > > > > did not detect some of the big kernel structs like 'struct module',
> > > > > > > > > so they are duplicated in the kernel module BTF data. The s390x
> > > > > > > > > has many small modules that increased significantly in size because
> > > > > > > > > of that even after compression.
> > > > > > > > >
> > > > > > > > > First issues was that the '--btf_gen_floats' option is not passed
> > > > > > > > > to pahole for kernel module BTF generation.
> > > > > > > > >
> > > > > > > > > The other problem is more tricky and is the reason why this patchset
> > > > > > > > > is RFC ;-)
> > > > > > > > >
> > > > > > > > > The s390x compiler generates multiple definitions of the same struct
> > > > > > > > > and dedup algorithm does not seem to handle this at the moment.
> > > > > > > > >
> > > > > > > > > I put the debuginfo and btf dump of the s390x pnet.ko module in here:
> > > > > > > > >   http://people.redhat.com/~jolsa/kmodbtf/
> > > > > > > > >
> > > > > > > > > Please let me know if you'd like to see other info/files.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Hard to tell what's going on without vmlinux itself. Can you upload a
> > > > > > > > corresponding kernel image with BTF in it?
> > > > > > >
> > > > > > > sure, uploaded
> > > > > > >
> > > > > >
> > > > > > vmlinux.btfdump:
> > > > > >
> > > > > > [174] FLOAT 'float' size=4
> > > > > > [175] FLOAT 'double' size=8
> > > > > >
> > > > > > VS
> > > > > >
> > > > > > pnet.btfdump:
> > > > > >
> > > > > > [89318] INT 'float' size=4 bits_offset=0 nr_bits=32 encoding=(none)
> > > > > > [89319] INT 'double' size=8 bits_offset=0 nr_bits=64 encoding=(none)
> > > > >
> > > > > ugh, that's with no fix applied, sry
> > > > >
> > > > > I applied the first patch and uploaded new files
> > > > >
> > > > > now when I compare the 'module' struct from vmlinux:
> > > > >
> > > > >         [885] STRUCT 'module' size=1280 vlen=70
> > > > >
> > > > > and same one from pnet.ko:
> > > > >
> > > > >         [89323] STRUCT 'module' size=1280 vlen=70
> > > > >
> > > > > they seem to completely match, all the fields
> > > > > and yet it still appears in the kmod's BTF
> > > > >
> > > >
> > > > Ok, now struct module is identical down to the types referenced from
> > > > the fields, which means it should have been deduplicated completely.
> > > > This will require a more time-consuming debugging, though, so I'll put
> > > > it on my TODO list for now. If you get to this earlier, see where the
> > > > equivalence check fails in btf_dedup (sprinkle debug outputs around to
> > > > see what's going on).
> > >
> > > it failed for me on that hypot_type_id check where I did fix,
> > > I thought it's the issue of multiple same struct in the kmod,
> > > but now I see I might have confused cannon_id with cand_id ;-)
> > > I'll check more on this
> >
> > with more checking I got to the same conclusion as before,
> > now maybe with little more details ;-)
> >
> > the problem seems to be that in some cases the module BTF
> > data stores same structs under new/different IDs, while the
> > kernel BTF data is already dedup-ed
> >
> > the dedup algo keeps hypot_map of kernel IDs to kmod IDs,
> > and in my case it will get to the point that the kernel ID
> > is already 'known' and points to certain kmod ID 'A', but it
> > is also equiv to another kmod ID 'B' (so kmod ID 'A' and 'B'
> > are equiv structs) but the dedup will claim as not equiv
> >
> >
> > This is where the dedup fails for me on that s390 data:
> >
> > The pt_regs is defined as:
> >
> >         struct pt_regs
> >         {
> >                 union {
> >                         user_pt_regs user_regs;
> >                         struct {
> >                                 unsigned long args[1];
> >                                 psw_t psw;
> >                                 unsigned long gprs[NUM_GPRS];
> >                         };
> >                 };
> >                 ...
> >         };
> >
> > considering just the first union:
> >
> >         [186] UNION '(anon)' size=152 vlen=2
> >                 'user_regs' type_id=183 bits_offset=0
> >                 '(anon)' type_id=181 bits_offset=0
> >
> >         [91251] UNION '(anon)' size=152 vlen=2
> >                 'user_regs' type_id=91247 bits_offset=0
> >                 '(anon)' type_id=91250 bits_offset=0
> >
> >
> > ---------------------------------------------------------------
> >
> > Comparing the first member 'user_regs':
> >
> >         struct pt_regs
> >         {
> >                 union {
> >     --->                user_pt_regs user_regs;
> >                         struct {
> >                                 unsigned long args[1];
> >                                 psw_t psw;
> >                                 unsigned long gprs[NUM_GPRS];
> >                         };
> >                 };
> >
> > Which looks like:
> >
> >         typedef struct {
> >                 unsigned long args[1];
> >                 psw_t psw;
> >                 unsigned long gprs[NUM_GPRS];
> >         } user_pt_regs;
> >
> >
> > and is also equiv to the next union member struct.. and that's what
> > kernel knows but not kmod... anyway,
> >
> >
> > the dedup will compare 'user_pt_regs':
> >
> >         [183] TYPEDEF 'user_pt_regs' type_id=181
> >
> >         [91247] TYPEDEF 'user_pt_regs' type_id=91245
> >
> >
> >         [181] STRUCT '(anon)' size=152 vlen=3
> >                 'args' type_id=182 bits_offset=0
> >                 'psw' type_id=179 bits_offset=64
> >                 'gprs' type_id=48 bits_offset=192
> >
> >         [91245] STRUCT '(anon)' size=152 vlen=3
> >                 'args' type_id=91246 bits_offset=0
> >                 'psw' type_id=91243 bits_offset=64
> >                 'gprs' type_id=91132 bits_offset=192
> >
> > and make them equiv by setting hypot_type_id for 181 to be 91245
> >
> >
> > ---------------------------------------------------------------
> >
> > Now comparing the second member:
> >
> >         struct pt_regs
> >         {
> >                 union {
> >                         user_pt_regs user_regs;
> >     --->                struct {
> >                                 unsigned long args[1];
> >                                 psw_t psw;
> >                                 unsigned long gprs[NUM_GPRS];
> >                         };
> >                 };
> >
> >
> > kernel knows it's same struct as user_pt_regs and uses ID 181
> >
> >         [186] UNION '(anon)' size=152 vlen=2
> >                 'user_regs' type_id=183 bits_offset=0
> >                 '(anon)' type_id=181 bits_offset=0
> >
> > but kmod has new ID 91250 (not 91245):
> >
> >         [91251] UNION '(anon)' size=152 vlen=2
> >                 'user_regs' type_id=91247 bits_offset=0
> >                 '(anon)' type_id=91250 bits_offset=0
> >
> >
> > and 181 and 91250 are equiv structs:
> >
> >         [181] STRUCT '(anon)' size=152 vlen=3
> >                 'args' type_id=182 bits_offset=0
> >                 'psw' type_id=179 bits_offset=64
> >                 'gprs' type_id=48 bits_offset=192
> >
> >         [91250] STRUCT '(anon)' size=152 vlen=3
> >                 'args' type_id=91246 bits_offset=0
> >                 'psw' type_id=91243 bits_offset=64
> >                 'gprs' type_id=91132 bits_offset=192
> >
> >
> > now hypot_type_id for 181 is 91245, but we have brand new struct
> > ID 91250, so we fail
> >
> > what the patch tries to do is at this point to compare ID 91250
> > with 91245 and if it passes then we are equal and we throw away
> > ID 91250 because the hypot_type_id for 181 stays 91245
> >
> >
> > ufff.. thoughts? ;-)
> 
> Oh, this is a really great analysis, thanks a lot! It makes everything
> clear. Basically, BTF dedup algo does too good job deduping vmlinux
> BTF. :)
> 
> What's not clear is what to do about that, because a (current)
> fundamental assumption of is_equiv() check is that any type within CU
> (or in this case deduped vmlinux BTF) has exactly one unique mapping.
> Clearly that's not the case now. That array fix you mentioned worked
> around GCC bug where this assumption broke. In this case it's not a
> bug of a compiler (neither of algo, really), we just need to make algo
> smarter.
> 
> Let me think about this a bit, we'll need to make the equivalence
> check be aware that there could be multiple equivalent mappings and be
> ok with that as long as all candidates are equivalent between
> themselves. Lots of equivalence and recursion to think about.
> 
> It would be great to have a simplified test case to play with that. Do
> you mind distilling the chain of types above into a selftests and
> posting it to the mailing list so that I can play with it? It
> shouldn't be hard to write given BTF writing APIs. And we'll need a
> selftests anyway once we improve the algo, so it's definitely not a
> wasted work.
> 
> And thanks again for analysis and writing this down, it would take me
> ages to get to this otherwise.
> 
> P.S. If the improved BTF dedup algo will be able to handle this, we
> should also remove the array workaround, because that one should work
> automatically. I don't know if we have a test for duplicate array
> scenario, but it's probably good to have that as well.

right, I'll try to add test for both

thanks for checking on this

jirka


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC bpf-next 0/2] bpf: Fix BTF data for modules
  2021-11-02 14:14                 ` Jiri Olsa
@ 2021-11-07 14:57                   ` Jiri Olsa
  2021-11-09 23:23                     ` Andrii Nakryiko
  0 siblings, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2021-11-07 14:57 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Tue, Nov 02, 2021 at 03:14:52PM +0100, Jiri Olsa wrote:
> On Mon, Nov 01, 2021 at 04:14:29PM -0700, Andrii Nakryiko wrote:
> > On Thu, Oct 28, 2021 at 12:12 PM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > On Wed, Oct 27, 2021 at 08:18:11PM +0200, Jiri Olsa wrote:
> > > > On Wed, Oct 27, 2021 at 10:53:55AM -0700, Andrii Nakryiko wrote:
> > > > > On Wed, Oct 27, 2021 at 1:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > >
> > > > > > On Tue, Oct 26, 2021 at 09:12:31PM -0700, Andrii Nakryiko wrote:
> > > > > > > On Tue, Oct 26, 2021 at 5:03 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Oct 25, 2021 at 09:54:48PM -0700, Andrii Nakryiko wrote:
> > > > > > > > > On Sat, Oct 23, 2021 at 5:05 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > hi,
> > > > > > > > > > I'm trying to enable BTF for kernel module in fedora,
> > > > > > > > > > and I'm getting big increase on modules sizes on s390x arch.
> > > > > > > > > >
> > > > > > > > > > Size of modules in total - kernel dir under /lib/modules/VER/
> > > > > > > > > > from kernel-core and kernel-module packages:
> > > > > > > > > >
> > > > > > > > > >                current   new
> > > > > > > > > >       aarch64      60M   76M
> > > > > > > > > >       ppc64le      53M   66M
> > > > > > > > > >       s390x        21M   41M
> > > > > > > > > >       x86_64       64M   79M
> > > > > > > > > >
> > > > > > > > > > The reason for higher increase on s390x was that dedup algorithm
> > > > > > > > > > did not detect some of the big kernel structs like 'struct module',
> > > > > > > > > > so they are duplicated in the kernel module BTF data. The s390x
> > > > > > > > > > has many small modules that increased significantly in size because
> > > > > > > > > > of that even after compression.
> > > > > > > > > >
> > > > > > > > > > First issues was that the '--btf_gen_floats' option is not passed
> > > > > > > > > > to pahole for kernel module BTF generation.
> > > > > > > > > >
> > > > > > > > > > The other problem is more tricky and is the reason why this patchset
> > > > > > > > > > is RFC ;-)
> > > > > > > > > >
> > > > > > > > > > The s390x compiler generates multiple definitions of the same struct
> > > > > > > > > > and dedup algorithm does not seem to handle this at the moment.
> > > > > > > > > >
> > > > > > > > > > I put the debuginfo and btf dump of the s390x pnet.ko module in here:
> > > > > > > > > >   http://people.redhat.com/~jolsa/kmodbtf/
> > > > > > > > > >
> > > > > > > > > > Please let me know if you'd like to see other info/files.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Hard to tell what's going on without vmlinux itself. Can you upload a
> > > > > > > > > corresponding kernel image with BTF in it?
> > > > > > > >
> > > > > > > > sure, uploaded
> > > > > > > >
> > > > > > >
> > > > > > > vmlinux.btfdump:
> > > > > > >
> > > > > > > [174] FLOAT 'float' size=4
> > > > > > > [175] FLOAT 'double' size=8
> > > > > > >
> > > > > > > VS
> > > > > > >
> > > > > > > pnet.btfdump:
> > > > > > >
> > > > > > > [89318] INT 'float' size=4 bits_offset=0 nr_bits=32 encoding=(none)
> > > > > > > [89319] INT 'double' size=8 bits_offset=0 nr_bits=64 encoding=(none)
> > > > > >
> > > > > > ugh, that's with no fix applied, sry
> > > > > >
> > > > > > I applied the first patch and uploaded new files
> > > > > >
> > > > > > now when I compare the 'module' struct from vmlinux:
> > > > > >
> > > > > >         [885] STRUCT 'module' size=1280 vlen=70
> > > > > >
> > > > > > and same one from pnet.ko:
> > > > > >
> > > > > >         [89323] STRUCT 'module' size=1280 vlen=70
> > > > > >
> > > > > > they seem to completely match, all the fields
> > > > > > and yet it still appears in the kmod's BTF
> > > > > >
> > > > >
> > > > > Ok, now struct module is identical down to the types referenced from
> > > > > the fields, which means it should have been deduplicated completely.
> > > > > This will require a more time-consuming debugging, though, so I'll put
> > > > > it on my TODO list for now. If you get to this earlier, see where the
> > > > > equivalence check fails in btf_dedup (sprinkle debug outputs around to
> > > > > see what's going on).
> > > >
> > > > it failed for me on that hypot_type_id check where I did fix,
> > > > I thought it's the issue of multiple same struct in the kmod,
> > > > but now I see I might have confused cannon_id with cand_id ;-)
> > > > I'll check more on this
> > >
> > > with more checking I got to the same conclusion as before,
> > > now maybe with little more details ;-)
> > >
> > > the problem seems to be that in some cases the module BTF
> > > data stores same structs under new/different IDs, while the
> > > kernel BTF data is already dedup-ed
> > >
> > > the dedup algo keeps hypot_map of kernel IDs to kmod IDs,
> > > and in my case it will get to the point that the kernel ID
> > > is already 'known' and points to certain kmod ID 'A', but it
> > > is also equiv to another kmod ID 'B' (so kmod ID 'A' and 'B'
> > > are equiv structs) but the dedup will claim as not equiv
> > >
> > >
> > > This is where the dedup fails for me on that s390 data:
> > >
> > > The pt_regs is defined as:
> > >
> > >         struct pt_regs
> > >         {
> > >                 union {
> > >                         user_pt_regs user_regs;
> > >                         struct {
> > >                                 unsigned long args[1];
> > >                                 psw_t psw;
> > >                                 unsigned long gprs[NUM_GPRS];
> > >                         };
> > >                 };
> > >                 ...
> > >         };
> > >
> > > considering just the first union:
> > >
> > >         [186] UNION '(anon)' size=152 vlen=2
> > >                 'user_regs' type_id=183 bits_offset=0
> > >                 '(anon)' type_id=181 bits_offset=0
> > >
> > >         [91251] UNION '(anon)' size=152 vlen=2
> > >                 'user_regs' type_id=91247 bits_offset=0
> > >                 '(anon)' type_id=91250 bits_offset=0
> > >
> > >
> > > ---------------------------------------------------------------
> > >
> > > Comparing the first member 'user_regs':
> > >
> > >         struct pt_regs
> > >         {
> > >                 union {
> > >     --->                user_pt_regs user_regs;
> > >                         struct {
> > >                                 unsigned long args[1];
> > >                                 psw_t psw;
> > >                                 unsigned long gprs[NUM_GPRS];
> > >                         };
> > >                 };
> > >
> > > Which looks like:
> > >
> > >         typedef struct {
> > >                 unsigned long args[1];
> > >                 psw_t psw;
> > >                 unsigned long gprs[NUM_GPRS];
> > >         } user_pt_regs;
> > >
> > >
> > > and is also equiv to the next union member struct.. and that's what
> > > kernel knows but not kmod... anyway,
> > >
> > >
> > > the dedup will compare 'user_pt_regs':
> > >
> > >         [183] TYPEDEF 'user_pt_regs' type_id=181
> > >
> > >         [91247] TYPEDEF 'user_pt_regs' type_id=91245
> > >
> > >
> > >         [181] STRUCT '(anon)' size=152 vlen=3
> > >                 'args' type_id=182 bits_offset=0
> > >                 'psw' type_id=179 bits_offset=64
> > >                 'gprs' type_id=48 bits_offset=192
> > >
> > >         [91245] STRUCT '(anon)' size=152 vlen=3
> > >                 'args' type_id=91246 bits_offset=0
> > >                 'psw' type_id=91243 bits_offset=64
> > >                 'gprs' type_id=91132 bits_offset=192
> > >
> > > and make them equiv by setting hypot_type_id for 181 to be 91245
> > >
> > >
> > > ---------------------------------------------------------------
> > >
> > > Now comparing the second member:
> > >
> > >         struct pt_regs
> > >         {
> > >                 union {
> > >                         user_pt_regs user_regs;
> > >     --->                struct {
> > >                                 unsigned long args[1];
> > >                                 psw_t psw;
> > >                                 unsigned long gprs[NUM_GPRS];
> > >                         };
> > >                 };
> > >
> > >
> > > kernel knows it's same struct as user_pt_regs and uses ID 181
> > >
> > >         [186] UNION '(anon)' size=152 vlen=2
> > >                 'user_regs' type_id=183 bits_offset=0
> > >                 '(anon)' type_id=181 bits_offset=0
> > >
> > > but kmod has new ID 91250 (not 91245):
> > >
> > >         [91251] UNION '(anon)' size=152 vlen=2
> > >                 'user_regs' type_id=91247 bits_offset=0
> > >                 '(anon)' type_id=91250 bits_offset=0
> > >
> > >
> > > and 181 and 91250 are equiv structs:
> > >
> > >         [181] STRUCT '(anon)' size=152 vlen=3
> > >                 'args' type_id=182 bits_offset=0
> > >                 'psw' type_id=179 bits_offset=64
> > >                 'gprs' type_id=48 bits_offset=192
> > >
> > >         [91250] STRUCT '(anon)' size=152 vlen=3
> > >                 'args' type_id=91246 bits_offset=0
> > >                 'psw' type_id=91243 bits_offset=64
> > >                 'gprs' type_id=91132 bits_offset=192
> > >
> > >
> > > now hypot_type_id for 181 is 91245, but we have brand new struct
> > > ID 91250, so we fail
> > >
> > > what the patch tries to do is at this point to compare ID 91250
> > > with 91245 and if it passes then we are equal and we throw away
> > > ID 91250 because the hypot_type_id for 181 stays 91245
> > >
> > >
> > > ufff.. thoughts? ;-)
> > 
> > Oh, this is a really great analysis, thanks a lot! It makes everything
> > clear. Basically, BTF dedup algo does too good job deduping vmlinux
> > BTF. :)
> > 
> > What's not clear is what to do about that, because a (current)
> > fundamental assumption of is_equiv() check is that any type within CU
> > (or in this case deduped vmlinux BTF) has exactly one unique mapping.
> > Clearly that's not the case now. That array fix you mentioned worked
> > around GCC bug where this assumption broke. In this case it's not a
> > bug of a compiler (neither of algo, really), we just need to make algo
> > smarter.
> > 
> > Let me think about this a bit, we'll need to make the equivalence
> > check be aware that there could be multiple equivalent mappings and be
> > ok with that as long as all candidates are equivalent between
> > themselves. Lots of equivalence and recursion to think about.
> > 
> > It would be great to have a simplified test case to play with that. Do
> > you mind distilling the chain of types above into a selftests and
> > posting it to the mailing list so that I can play with it? It
> > shouldn't be hard to write given BTF writing APIs. And we'll need a
> > selftests anyway once we improve the algo, so it's definitely not a
> > wasted work.
> > 


I ended up with simply test, where the idea is to use
type id which is defined after currently processed type

the last VALIDATE_RAW_BTF fails

I'm not sending full atch, because I assume this is not
to merge yet also I assume you might want to change that
anyway ;-)

I'll check later on that special array case

thanks,
jirka


---
 .../bpf/prog_tests/btf_dedup_split.c          | 113 ++++++++++++++++++
 1 file changed, 113 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c b/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
index 64554fd33547..2ad54e185221 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
@@ -314,6 +314,117 @@ static void test_split_struct_duped() {
 	btf__free(btf1);
 }
 
+static void btf_add_data(struct btf *btf, int start_id)
+{
+#define ID(n) (start_id + n)
+	btf__set_pointer_size(btf, 8); /* enforce 64-bit arch */
+
+	btf__add_int(btf, "int", 4, BTF_INT_SIGNED);	/* [1] int */
+
+	btf__add_struct(btf, "s", 8);			/* [2] struct s { */
+	btf__add_field(btf, "a", ID(3), 0, 0);		/*      struct anon a; */
+	btf__add_field(btf, "b", ID(4), 0, 0);		/*      struct anon b; */
+							/* } */
+
+	btf__add_struct(btf, "(anon)", 8);		/* [3] struct anon { */
+	btf__add_field(btf, "f1", ID(1), 0, 0);		/*      int f1; */
+	btf__add_field(btf, "f2", ID(1), 32, 0);	/*      int f2; */
+							/* } */
+
+	btf__add_struct(btf, "(anon)", 8);		/* [4] struct anon { */
+	btf__add_field(btf, "f1", ID(1), 0, 0);		/*      int f1; */
+	btf__add_field(btf, "f2", ID(1), 32, 0);	/*      int f2; */
+							/* } */
+#undef ID
+}
+
+static void test_split_struct_missed()
+{
+	struct btf *btf1, *btf2;
+	int err;
+
+	/* generate the base data.. */
+	btf1 = btf__new_empty();
+	if (!ASSERT_OK_PTR(btf1, "empty_main_btf"))
+		return;
+
+	btf_add_data(btf1, 0);
+
+	VALIDATE_RAW_BTF(
+		btf1,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] STRUCT 's' size=8 vlen=2\n"
+		"\t'a' type_id=3 bits_offset=0\n"
+		"\t'b' type_id=4 bits_offset=0",
+		"[3] STRUCT '(anon)' size=8 vlen=2\n"
+		"\t'f1' type_id=1 bits_offset=0\n"
+		"\t'f2' type_id=1 bits_offset=32",
+		"[4] STRUCT '(anon)' size=8 vlen=2\n"
+		"\t'f1' type_id=1 bits_offset=0\n"
+		"\t'f2' type_id=1 bits_offset=32");
+
+	/* ..dedup them... */
+	err = btf__dedup(btf1, NULL, NULL);
+	if (!ASSERT_OK(err, "btf_dedup"))
+		goto cleanup;
+
+	VALIDATE_RAW_BTF(
+		btf1,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] STRUCT 's' size=8 vlen=2\n"
+		"\t'a' type_id=3 bits_offset=0\n"
+		"\t'b' type_id=3 bits_offset=0",
+		"[3] STRUCT '(anon)' size=8 vlen=2\n"
+		"\t'f1' type_id=1 bits_offset=0\n"
+		"\t'f2' type_id=1 bits_offset=32");
+
+	/* and add the same data on top of it */
+	btf2 = btf__new_empty_split(btf1);
+	if (!ASSERT_OK_PTR(btf2, "empty_split_btf"))
+		goto cleanup;
+
+	btf_add_data(btf2, 3);
+
+	VALIDATE_RAW_BTF(
+		btf2,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] STRUCT 's' size=8 vlen=2\n"
+		"\t'a' type_id=3 bits_offset=0\n"
+		"\t'b' type_id=3 bits_offset=0",
+		"[3] STRUCT '(anon)' size=8 vlen=2\n"
+		"\t'f1' type_id=1 bits_offset=0\n"
+		"\t'f2' type_id=1 bits_offset=32",
+		"[4] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[5] STRUCT 's' size=8 vlen=2\n"
+		"\t'a' type_id=6 bits_offset=0\n"
+		"\t'b' type_id=7 bits_offset=0",
+		"[6] STRUCT '(anon)' size=8 vlen=2\n"
+		"\t'f1' type_id=4 bits_offset=0\n"
+		"\t'f2' type_id=4 bits_offset=32",
+		"[7] STRUCT '(anon)' size=8 vlen=2\n"
+		"\t'f1' type_id=4 bits_offset=0\n"
+		"\t'f2' type_id=4 bits_offset=32");
+
+	err = btf__dedup(btf2, NULL, NULL);
+	if (!ASSERT_OK(err, "btf_dedup"))
+		goto cleanup;
+
+	/* after dedup it should match the original data */
+	VALIDATE_RAW_BTF(
+		btf2,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] STRUCT 's' size=8 vlen=2\n"
+		"\t'a' type_id=3 bits_offset=0\n"
+		"\t'b' type_id=3 bits_offset=0",
+		"[3] STRUCT '(anon)' size=8 vlen=2\n"
+		"\t'f1' type_id=1 bits_offset=0\n"
+		"\t'f2' type_id=1 bits_offset=32");
+
+cleanup:
+	btf__free(btf2);
+	btf__free(btf1);
+}
+
 void test_btf_dedup_split()
 {
 	if (test__start_subtest("split_simple"))
@@ -322,4 +433,6 @@ void test_btf_dedup_split()
 		test_split_struct_duped();
 	if (test__start_subtest("split_fwd_resolve"))
 		test_split_fwd_resolve();
+	if (test__start_subtest("split_struct_missed"))
+		test_split_struct_missed();
 }
-- 
2.32.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC bpf-next 0/2] bpf: Fix BTF data for modules
  2021-11-07 14:57                   ` Jiri Olsa
@ 2021-11-09 23:23                     ` Andrii Nakryiko
  2021-11-12  7:23                       ` Jiri Olsa
  0 siblings, 1 reply; 19+ messages in thread
From: Andrii Nakryiko @ 2021-11-09 23:23 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Sun, Nov 7, 2021 at 6:57 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Tue, Nov 02, 2021 at 03:14:52PM +0100, Jiri Olsa wrote:
> > On Mon, Nov 01, 2021 at 04:14:29PM -0700, Andrii Nakryiko wrote:
> > > On Thu, Oct 28, 2021 at 12:12 PM Jiri Olsa <jolsa@redhat.com> wrote:
> > > >
> > > > On Wed, Oct 27, 2021 at 08:18:11PM +0200, Jiri Olsa wrote:
> > > > > On Wed, Oct 27, 2021 at 10:53:55AM -0700, Andrii Nakryiko wrote:
> > > > > > On Wed, Oct 27, 2021 at 1:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > > >
> > > > > > > On Tue, Oct 26, 2021 at 09:12:31PM -0700, Andrii Nakryiko wrote:
> > > > > > > > On Tue, Oct 26, 2021 at 5:03 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Oct 25, 2021 at 09:54:48PM -0700, Andrii Nakryiko wrote:
> > > > > > > > > > On Sat, Oct 23, 2021 at 5:05 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > hi,
> > > > > > > > > > > I'm trying to enable BTF for kernel module in fedora,
> > > > > > > > > > > and I'm getting big increase on modules sizes on s390x arch.
> > > > > > > > > > >
> > > > > > > > > > > Size of modules in total - kernel dir under /lib/modules/VER/
> > > > > > > > > > > from kernel-core and kernel-module packages:
> > > > > > > > > > >
> > > > > > > > > > >                current   new
> > > > > > > > > > >       aarch64      60M   76M
> > > > > > > > > > >       ppc64le      53M   66M
> > > > > > > > > > >       s390x        21M   41M
> > > > > > > > > > >       x86_64       64M   79M
> > > > > > > > > > >
> > > > > > > > > > > The reason for higher increase on s390x was that dedup algorithm
> > > > > > > > > > > did not detect some of the big kernel structs like 'struct module',
> > > > > > > > > > > so they are duplicated in the kernel module BTF data. The s390x
> > > > > > > > > > > has many small modules that increased significantly in size because
> > > > > > > > > > > of that even after compression.
> > > > > > > > > > >
> > > > > > > > > > > First issues was that the '--btf_gen_floats' option is not passed
> > > > > > > > > > > to pahole for kernel module BTF generation.
> > > > > > > > > > >
> > > > > > > > > > > The other problem is more tricky and is the reason why this patchset
> > > > > > > > > > > is RFC ;-)
> > > > > > > > > > >
> > > > > > > > > > > The s390x compiler generates multiple definitions of the same struct
> > > > > > > > > > > and dedup algorithm does not seem to handle this at the moment.
> > > > > > > > > > >
> > > > > > > > > > > I put the debuginfo and btf dump of the s390x pnet.ko module in here:
> > > > > > > > > > >   http://people.redhat.com/~jolsa/kmodbtf/
> > > > > > > > > > >
> > > > > > > > > > > Please let me know if you'd like to see other info/files.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Hard to tell what's going on without vmlinux itself. Can you upload a
> > > > > > > > > > corresponding kernel image with BTF in it?
> > > > > > > > >
> > > > > > > > > sure, uploaded
> > > > > > > > >
> > > > > > > >
> > > > > > > > vmlinux.btfdump:
> > > > > > > >
> > > > > > > > [174] FLOAT 'float' size=4
> > > > > > > > [175] FLOAT 'double' size=8
> > > > > > > >
> > > > > > > > VS
> > > > > > > >
> > > > > > > > pnet.btfdump:
> > > > > > > >
> > > > > > > > [89318] INT 'float' size=4 bits_offset=0 nr_bits=32 encoding=(none)
> > > > > > > > [89319] INT 'double' size=8 bits_offset=0 nr_bits=64 encoding=(none)
> > > > > > >
> > > > > > > ugh, that's with no fix applied, sry
> > > > > > >
> > > > > > > I applied the first patch and uploaded new files
> > > > > > >
> > > > > > > now when I compare the 'module' struct from vmlinux:
> > > > > > >
> > > > > > >         [885] STRUCT 'module' size=1280 vlen=70
> > > > > > >
> > > > > > > and same one from pnet.ko:
> > > > > > >
> > > > > > >         [89323] STRUCT 'module' size=1280 vlen=70
> > > > > > >
> > > > > > > they seem to completely match, all the fields
> > > > > > > and yet it still appears in the kmod's BTF
> > > > > > >
> > > > > >
> > > > > > Ok, now struct module is identical down to the types referenced from
> > > > > > the fields, which means it should have been deduplicated completely.
> > > > > > This will require a more time-consuming debugging, though, so I'll put
> > > > > > it on my TODO list for now. If you get to this earlier, see where the
> > > > > > equivalence check fails in btf_dedup (sprinkle debug outputs around to
> > > > > > see what's going on).
> > > > >
> > > > > it failed for me on that hypot_type_id check where I did fix,
> > > > > I thought it's the issue of multiple same struct in the kmod,
> > > > > but now I see I might have confused cannon_id with cand_id ;-)
> > > > > I'll check more on this
> > > >
> > > > with more checking I got to the same conclusion as before,
> > > > now maybe with little more details ;-)
> > > >
> > > > the problem seems to be that in some cases the module BTF
> > > > data stores same structs under new/different IDs, while the
> > > > kernel BTF data is already dedup-ed
> > > >
> > > > the dedup algo keeps hypot_map of kernel IDs to kmod IDs,
> > > > and in my case it will get to the point that the kernel ID
> > > > is already 'known' and points to certain kmod ID 'A', but it
> > > > is also equiv to another kmod ID 'B' (so kmod ID 'A' and 'B'
> > > > are equiv structs) but the dedup will claim as not equiv
> > > >
> > > >
> > > > This is where the dedup fails for me on that s390 data:
> > > >
> > > > The pt_regs is defined as:
> > > >
> > > >         struct pt_regs
> > > >         {
> > > >                 union {
> > > >                         user_pt_regs user_regs;
> > > >                         struct {
> > > >                                 unsigned long args[1];
> > > >                                 psw_t psw;
> > > >                                 unsigned long gprs[NUM_GPRS];
> > > >                         };
> > > >                 };
> > > >                 ...
> > > >         };
> > > >
> > > > considering just the first union:
> > > >
> > > >         [186] UNION '(anon)' size=152 vlen=2
> > > >                 'user_regs' type_id=183 bits_offset=0
> > > >                 '(anon)' type_id=181 bits_offset=0
> > > >
> > > >         [91251] UNION '(anon)' size=152 vlen=2
> > > >                 'user_regs' type_id=91247 bits_offset=0
> > > >                 '(anon)' type_id=91250 bits_offset=0
> > > >
> > > >
> > > > ---------------------------------------------------------------
> > > >
> > > > Comparing the first member 'user_regs':
> > > >
> > > >         struct pt_regs
> > > >         {
> > > >                 union {
> > > >     --->                user_pt_regs user_regs;
> > > >                         struct {
> > > >                                 unsigned long args[1];
> > > >                                 psw_t psw;
> > > >                                 unsigned long gprs[NUM_GPRS];
> > > >                         };
> > > >                 };
> > > >
> > > > Which looks like:
> > > >
> > > >         typedef struct {
> > > >                 unsigned long args[1];
> > > >                 psw_t psw;
> > > >                 unsigned long gprs[NUM_GPRS];
> > > >         } user_pt_regs;
> > > >
> > > >
> > > > and is also equiv to the next union member struct.. and that's what
> > > > kernel knows but not kmod... anyway,
> > > >
> > > >
> > > > the dedup will compare 'user_pt_regs':
> > > >
> > > >         [183] TYPEDEF 'user_pt_regs' type_id=181
> > > >
> > > >         [91247] TYPEDEF 'user_pt_regs' type_id=91245
> > > >
> > > >
> > > >         [181] STRUCT '(anon)' size=152 vlen=3
> > > >                 'args' type_id=182 bits_offset=0
> > > >                 'psw' type_id=179 bits_offset=64
> > > >                 'gprs' type_id=48 bits_offset=192
> > > >
> > > >         [91245] STRUCT '(anon)' size=152 vlen=3
> > > >                 'args' type_id=91246 bits_offset=0
> > > >                 'psw' type_id=91243 bits_offset=64
> > > >                 'gprs' type_id=91132 bits_offset=192
> > > >
> > > > and make them equiv by setting hypot_type_id for 181 to be 91245
> > > >
> > > >
> > > > ---------------------------------------------------------------
> > > >
> > > > Now comparing the second member:
> > > >
> > > >         struct pt_regs
> > > >         {
> > > >                 union {
> > > >                         user_pt_regs user_regs;
> > > >     --->                struct {
> > > >                                 unsigned long args[1];
> > > >                                 psw_t psw;
> > > >                                 unsigned long gprs[NUM_GPRS];
> > > >                         };
> > > >                 };
> > > >
> > > >
> > > > kernel knows it's same struct as user_pt_regs and uses ID 181
> > > >
> > > >         [186] UNION '(anon)' size=152 vlen=2
> > > >                 'user_regs' type_id=183 bits_offset=0
> > > >                 '(anon)' type_id=181 bits_offset=0
> > > >
> > > > but kmod has new ID 91250 (not 91245):
> > > >
> > > >         [91251] UNION '(anon)' size=152 vlen=2
> > > >                 'user_regs' type_id=91247 bits_offset=0
> > > >                 '(anon)' type_id=91250 bits_offset=0
> > > >
> > > >
> > > > and 181 and 91250 are equiv structs:
> > > >
> > > >         [181] STRUCT '(anon)' size=152 vlen=3
> > > >                 'args' type_id=182 bits_offset=0
> > > >                 'psw' type_id=179 bits_offset=64
> > > >                 'gprs' type_id=48 bits_offset=192
> > > >
> > > >         [91250] STRUCT '(anon)' size=152 vlen=3
> > > >                 'args' type_id=91246 bits_offset=0
> > > >                 'psw' type_id=91243 bits_offset=64
> > > >                 'gprs' type_id=91132 bits_offset=192
> > > >
> > > >
> > > > now hypot_type_id for 181 is 91245, but we have brand new struct
> > > > ID 91250, so we fail
> > > >
> > > > what the patch tries to do is at this point to compare ID 91250
> > > > with 91245 and if it passes then we are equal and we throw away
> > > > ID 91250 because the hypot_type_id for 181 stays 91245
> > > >
> > > >
> > > > ufff.. thoughts? ;-)
> > >
> > > Oh, this is a really great analysis, thanks a lot! It makes everything
> > > clear. Basically, BTF dedup algo does too good job deduping vmlinux
> > > BTF. :)
> > >
> > > What's not clear is what to do about that, because a (current)
> > > fundamental assumption of is_equiv() check is that any type within CU
> > > (or in this case deduped vmlinux BTF) has exactly one unique mapping.
> > > Clearly that's not the case now. That array fix you mentioned worked
> > > around GCC bug where this assumption broke. In this case it's not a
> > > bug of a compiler (neither of algo, really), we just need to make algo
> > > smarter.
> > >
> > > Let me think about this a bit, we'll need to make the equivalence
> > > check be aware that there could be multiple equivalent mappings and be
> > > ok with that as long as all candidates are equivalent between
> > > themselves. Lots of equivalence and recursion to think about.
> > >
> > > It would be great to have a simplified test case to play with that. Do
> > > you mind distilling the chain of types above into a selftests and
> > > posting it to the mailing list so that I can play with it? It
> > > shouldn't be hard to write given BTF writing APIs. And we'll need a
> > > selftests anyway once we improve the algo, so it's definitely not a
> > > wasted work.
> > >
>
>
> I ended up with simply test, where the idea is to use
> type id which is defined after currently processed type
>
> the last VALIDATE_RAW_BTF fails
>
> I'm not sending full atch, because I assume this is not
> to merge yet also I assume you might want to change that
> anyway ;-)
>

Thanks, Jiri! I'll get to playing with this some time this week,
hopefully. I hope this is not a huge blocker for you?


> I'll check later on that special array case
>
> thanks,
> jirka
>
>
> ---
>  .../bpf/prog_tests/btf_dedup_split.c          | 113 ++++++++++++++++++
>  1 file changed, 113 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c b/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
> index 64554fd33547..2ad54e185221 100644
> --- a/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
> +++ b/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
> @@ -314,6 +314,117 @@ static void test_split_struct_duped() {
>         btf__free(btf1);
>  }
>
> +static void btf_add_data(struct btf *btf, int start_id)
> +{
> +#define ID(n) (start_id + n)
> +       btf__set_pointer_size(btf, 8); /* enforce 64-bit arch */
> +
> +       btf__add_int(btf, "int", 4, BTF_INT_SIGNED);    /* [1] int */
> +
> +       btf__add_struct(btf, "s", 8);                   /* [2] struct s { */
> +       btf__add_field(btf, "a", ID(3), 0, 0);          /*      struct anon a; */
> +       btf__add_field(btf, "b", ID(4), 0, 0);          /*      struct anon b; */
> +                                                       /* } */
> +
> +       btf__add_struct(btf, "(anon)", 8);              /* [3] struct anon { */
> +       btf__add_field(btf, "f1", ID(1), 0, 0);         /*      int f1; */
> +       btf__add_field(btf, "f2", ID(1), 32, 0);        /*      int f2; */
> +                                                       /* } */
> +
> +       btf__add_struct(btf, "(anon)", 8);              /* [4] struct anon { */
> +       btf__add_field(btf, "f1", ID(1), 0, 0);         /*      int f1; */
> +       btf__add_field(btf, "f2", ID(1), 32, 0);        /*      int f2; */
> +                                                       /* } */
> +#undef ID
> +}
> +
> +static void test_split_struct_missed()
> +{
> +       struct btf *btf1, *btf2;
> +       int err;
> +
> +       /* generate the base data.. */
> +       btf1 = btf__new_empty();
> +       if (!ASSERT_OK_PTR(btf1, "empty_main_btf"))
> +               return;
> +
> +       btf_add_data(btf1, 0);
> +
> +       VALIDATE_RAW_BTF(
> +               btf1,
> +               "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
> +               "[2] STRUCT 's' size=8 vlen=2\n"
> +               "\t'a' type_id=3 bits_offset=0\n"
> +               "\t'b' type_id=4 bits_offset=0",
> +               "[3] STRUCT '(anon)' size=8 vlen=2\n"
> +               "\t'f1' type_id=1 bits_offset=0\n"
> +               "\t'f2' type_id=1 bits_offset=32",
> +               "[4] STRUCT '(anon)' size=8 vlen=2\n"
> +               "\t'f1' type_id=1 bits_offset=0\n"
> +               "\t'f2' type_id=1 bits_offset=32");
> +
> +       /* ..dedup them... */
> +       err = btf__dedup(btf1, NULL, NULL);
> +       if (!ASSERT_OK(err, "btf_dedup"))
> +               goto cleanup;
> +
> +       VALIDATE_RAW_BTF(
> +               btf1,
> +               "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
> +               "[2] STRUCT 's' size=8 vlen=2\n"
> +               "\t'a' type_id=3 bits_offset=0\n"
> +               "\t'b' type_id=3 bits_offset=0",
> +               "[3] STRUCT '(anon)' size=8 vlen=2\n"
> +               "\t'f1' type_id=1 bits_offset=0\n"
> +               "\t'f2' type_id=1 bits_offset=32");
> +
> +       /* and add the same data on top of it */
> +       btf2 = btf__new_empty_split(btf1);
> +       if (!ASSERT_OK_PTR(btf2, "empty_split_btf"))
> +               goto cleanup;
> +
> +       btf_add_data(btf2, 3);
> +
> +       VALIDATE_RAW_BTF(
> +               btf2,
> +               "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
> +               "[2] STRUCT 's' size=8 vlen=2\n"
> +               "\t'a' type_id=3 bits_offset=0\n"
> +               "\t'b' type_id=3 bits_offset=0",
> +               "[3] STRUCT '(anon)' size=8 vlen=2\n"
> +               "\t'f1' type_id=1 bits_offset=0\n"
> +               "\t'f2' type_id=1 bits_offset=32",
> +               "[4] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
> +               "[5] STRUCT 's' size=8 vlen=2\n"
> +               "\t'a' type_id=6 bits_offset=0\n"
> +               "\t'b' type_id=7 bits_offset=0",
> +               "[6] STRUCT '(anon)' size=8 vlen=2\n"
> +               "\t'f1' type_id=4 bits_offset=0\n"
> +               "\t'f2' type_id=4 bits_offset=32",
> +               "[7] STRUCT '(anon)' size=8 vlen=2\n"
> +               "\t'f1' type_id=4 bits_offset=0\n"
> +               "\t'f2' type_id=4 bits_offset=32");
> +
> +       err = btf__dedup(btf2, NULL, NULL);
> +       if (!ASSERT_OK(err, "btf_dedup"))
> +               goto cleanup;
> +
> +       /* after dedup it should match the original data */
> +       VALIDATE_RAW_BTF(
> +               btf2,
> +               "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
> +               "[2] STRUCT 's' size=8 vlen=2\n"
> +               "\t'a' type_id=3 bits_offset=0\n"
> +               "\t'b' type_id=3 bits_offset=0",
> +               "[3] STRUCT '(anon)' size=8 vlen=2\n"
> +               "\t'f1' type_id=1 bits_offset=0\n"
> +               "\t'f2' type_id=1 bits_offset=32");
> +
> +cleanup:
> +       btf__free(btf2);
> +       btf__free(btf1);
> +}
> +
>  void test_btf_dedup_split()
>  {
>         if (test__start_subtest("split_simple"))
> @@ -322,4 +433,6 @@ void test_btf_dedup_split()
>                 test_split_struct_duped();
>         if (test__start_subtest("split_fwd_resolve"))
>                 test_split_fwd_resolve();
> +       if (test__start_subtest("split_struct_missed"))
> +               test_split_struct_missed();
>  }
> --
> 2.32.0
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC bpf-next 0/2] bpf: Fix BTF data for modules
  2021-11-09 23:23                     ` Andrii Nakryiko
@ 2021-11-12  7:23                       ` Jiri Olsa
  0 siblings, 0 replies; 19+ messages in thread
From: Jiri Olsa @ 2021-11-12  7:23 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Tue, Nov 09, 2021 at 03:23:25PM -0800, Andrii Nakryiko wrote:
> On Sun, Nov 7, 2021 at 6:57 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Tue, Nov 02, 2021 at 03:14:52PM +0100, Jiri Olsa wrote:
> > > On Mon, Nov 01, 2021 at 04:14:29PM -0700, Andrii Nakryiko wrote:
> > > > On Thu, Oct 28, 2021 at 12:12 PM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > >
> > > > > On Wed, Oct 27, 2021 at 08:18:11PM +0200, Jiri Olsa wrote:
> > > > > > On Wed, Oct 27, 2021 at 10:53:55AM -0700, Andrii Nakryiko wrote:
> > > > > > > On Wed, Oct 27, 2021 at 1:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, Oct 26, 2021 at 09:12:31PM -0700, Andrii Nakryiko wrote:
> > > > > > > > > On Tue, Oct 26, 2021 at 5:03 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, Oct 25, 2021 at 09:54:48PM -0700, Andrii Nakryiko wrote:
> > > > > > > > > > > On Sat, Oct 23, 2021 at 5:05 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > hi,
> > > > > > > > > > > > I'm trying to enable BTF for kernel module in fedora,
> > > > > > > > > > > > and I'm getting big increase on modules sizes on s390x arch.
> > > > > > > > > > > >
> > > > > > > > > > > > Size of modules in total - kernel dir under /lib/modules/VER/
> > > > > > > > > > > > from kernel-core and kernel-module packages:
> > > > > > > > > > > >
> > > > > > > > > > > >                current   new
> > > > > > > > > > > >       aarch64      60M   76M
> > > > > > > > > > > >       ppc64le      53M   66M
> > > > > > > > > > > >       s390x        21M   41M
> > > > > > > > > > > >       x86_64       64M   79M
> > > > > > > > > > > >
> > > > > > > > > > > > The reason for higher increase on s390x was that dedup algorithm
> > > > > > > > > > > > did not detect some of the big kernel structs like 'struct module',
> > > > > > > > > > > > so they are duplicated in the kernel module BTF data. The s390x
> > > > > > > > > > > > has many small modules that increased significantly in size because
> > > > > > > > > > > > of that even after compression.
> > > > > > > > > > > >
> > > > > > > > > > > > First issues was that the '--btf_gen_floats' option is not passed
> > > > > > > > > > > > to pahole for kernel module BTF generation.
> > > > > > > > > > > >
> > > > > > > > > > > > The other problem is more tricky and is the reason why this patchset
> > > > > > > > > > > > is RFC ;-)
> > > > > > > > > > > >
> > > > > > > > > > > > The s390x compiler generates multiple definitions of the same struct
> > > > > > > > > > > > and dedup algorithm does not seem to handle this at the moment.
> > > > > > > > > > > >
> > > > > > > > > > > > I put the debuginfo and btf dump of the s390x pnet.ko module in here:
> > > > > > > > > > > >   http://people.redhat.com/~jolsa/kmodbtf/
> > > > > > > > > > > >
> > > > > > > > > > > > Please let me know if you'd like to see other info/files.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Hard to tell what's going on without vmlinux itself. Can you upload a
> > > > > > > > > > > corresponding kernel image with BTF in it?
> > > > > > > > > >
> > > > > > > > > > sure, uploaded
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > vmlinux.btfdump:
> > > > > > > > >
> > > > > > > > > [174] FLOAT 'float' size=4
> > > > > > > > > [175] FLOAT 'double' size=8
> > > > > > > > >
> > > > > > > > > VS
> > > > > > > > >
> > > > > > > > > pnet.btfdump:
> > > > > > > > >
> > > > > > > > > [89318] INT 'float' size=4 bits_offset=0 nr_bits=32 encoding=(none)
> > > > > > > > > [89319] INT 'double' size=8 bits_offset=0 nr_bits=64 encoding=(none)
> > > > > > > >
> > > > > > > > ugh, that's with no fix applied, sry
> > > > > > > >
> > > > > > > > I applied the first patch and uploaded new files
> > > > > > > >
> > > > > > > > now when I compare the 'module' struct from vmlinux:
> > > > > > > >
> > > > > > > >         [885] STRUCT 'module' size=1280 vlen=70
> > > > > > > >
> > > > > > > > and same one from pnet.ko:
> > > > > > > >
> > > > > > > >         [89323] STRUCT 'module' size=1280 vlen=70
> > > > > > > >
> > > > > > > > they seem to completely match, all the fields
> > > > > > > > and yet it still appears in the kmod's BTF
> > > > > > > >
> > > > > > >
> > > > > > > Ok, now struct module is identical down to the types referenced from
> > > > > > > the fields, which means it should have been deduplicated completely.
> > > > > > > This will require a more time-consuming debugging, though, so I'll put
> > > > > > > it on my TODO list for now. If you get to this earlier, see where the
> > > > > > > equivalence check fails in btf_dedup (sprinkle debug outputs around to
> > > > > > > see what's going on).
> > > > > >
> > > > > > it failed for me on that hypot_type_id check where I did fix,
> > > > > > I thought it's the issue of multiple same struct in the kmod,
> > > > > > but now I see I might have confused cannon_id with cand_id ;-)
> > > > > > I'll check more on this
> > > > >
> > > > > with more checking I got to the same conclusion as before,
> > > > > now maybe with little more details ;-)
> > > > >
> > > > > the problem seems to be that in some cases the module BTF
> > > > > data stores same structs under new/different IDs, while the
> > > > > kernel BTF data is already dedup-ed
> > > > >
> > > > > the dedup algo keeps hypot_map of kernel IDs to kmod IDs,
> > > > > and in my case it will get to the point that the kernel ID
> > > > > is already 'known' and points to certain kmod ID 'A', but it
> > > > > is also equiv to another kmod ID 'B' (so kmod ID 'A' and 'B'
> > > > > are equiv structs) but the dedup will claim as not equiv
> > > > >
> > > > >
> > > > > This is where the dedup fails for me on that s390 data:
> > > > >
> > > > > The pt_regs is defined as:
> > > > >
> > > > >         struct pt_regs
> > > > >         {
> > > > >                 union {
> > > > >                         user_pt_regs user_regs;
> > > > >                         struct {
> > > > >                                 unsigned long args[1];
> > > > >                                 psw_t psw;
> > > > >                                 unsigned long gprs[NUM_GPRS];
> > > > >                         };
> > > > >                 };
> > > > >                 ...
> > > > >         };
> > > > >
> > > > > considering just the first union:
> > > > >
> > > > >         [186] UNION '(anon)' size=152 vlen=2
> > > > >                 'user_regs' type_id=183 bits_offset=0
> > > > >                 '(anon)' type_id=181 bits_offset=0
> > > > >
> > > > >         [91251] UNION '(anon)' size=152 vlen=2
> > > > >                 'user_regs' type_id=91247 bits_offset=0
> > > > >                 '(anon)' type_id=91250 bits_offset=0
> > > > >
> > > > >
> > > > > ---------------------------------------------------------------
> > > > >
> > > > > Comparing the first member 'user_regs':
> > > > >
> > > > >         struct pt_regs
> > > > >         {
> > > > >                 union {
> > > > >     --->                user_pt_regs user_regs;
> > > > >                         struct {
> > > > >                                 unsigned long args[1];
> > > > >                                 psw_t psw;
> > > > >                                 unsigned long gprs[NUM_GPRS];
> > > > >                         };
> > > > >                 };
> > > > >
> > > > > Which looks like:
> > > > >
> > > > >         typedef struct {
> > > > >                 unsigned long args[1];
> > > > >                 psw_t psw;
> > > > >                 unsigned long gprs[NUM_GPRS];
> > > > >         } user_pt_regs;
> > > > >
> > > > >
> > > > > and is also equiv to the next union member struct.. and that's what
> > > > > kernel knows but not kmod... anyway,
> > > > >
> > > > >
> > > > > the dedup will compare 'user_pt_regs':
> > > > >
> > > > >         [183] TYPEDEF 'user_pt_regs' type_id=181
> > > > >
> > > > >         [91247] TYPEDEF 'user_pt_regs' type_id=91245
> > > > >
> > > > >
> > > > >         [181] STRUCT '(anon)' size=152 vlen=3
> > > > >                 'args' type_id=182 bits_offset=0
> > > > >                 'psw' type_id=179 bits_offset=64
> > > > >                 'gprs' type_id=48 bits_offset=192
> > > > >
> > > > >         [91245] STRUCT '(anon)' size=152 vlen=3
> > > > >                 'args' type_id=91246 bits_offset=0
> > > > >                 'psw' type_id=91243 bits_offset=64
> > > > >                 'gprs' type_id=91132 bits_offset=192
> > > > >
> > > > > and make them equiv by setting hypot_type_id for 181 to be 91245
> > > > >
> > > > >
> > > > > ---------------------------------------------------------------
> > > > >
> > > > > Now comparing the second member:
> > > > >
> > > > >         struct pt_regs
> > > > >         {
> > > > >                 union {
> > > > >                         user_pt_regs user_regs;
> > > > >     --->                struct {
> > > > >                                 unsigned long args[1];
> > > > >                                 psw_t psw;
> > > > >                                 unsigned long gprs[NUM_GPRS];
> > > > >                         };
> > > > >                 };
> > > > >
> > > > >
> > > > > kernel knows it's same struct as user_pt_regs and uses ID 181
> > > > >
> > > > >         [186] UNION '(anon)' size=152 vlen=2
> > > > >                 'user_regs' type_id=183 bits_offset=0
> > > > >                 '(anon)' type_id=181 bits_offset=0
> > > > >
> > > > > but kmod has new ID 91250 (not 91245):
> > > > >
> > > > >         [91251] UNION '(anon)' size=152 vlen=2
> > > > >                 'user_regs' type_id=91247 bits_offset=0
> > > > >                 '(anon)' type_id=91250 bits_offset=0
> > > > >
> > > > >
> > > > > and 181 and 91250 are equiv structs:
> > > > >
> > > > >         [181] STRUCT '(anon)' size=152 vlen=3
> > > > >                 'args' type_id=182 bits_offset=0
> > > > >                 'psw' type_id=179 bits_offset=64
> > > > >                 'gprs' type_id=48 bits_offset=192
> > > > >
> > > > >         [91250] STRUCT '(anon)' size=152 vlen=3
> > > > >                 'args' type_id=91246 bits_offset=0
> > > > >                 'psw' type_id=91243 bits_offset=64
> > > > >                 'gprs' type_id=91132 bits_offset=192
> > > > >
> > > > >
> > > > > now hypot_type_id for 181 is 91245, but we have brand new struct
> > > > > ID 91250, so we fail
> > > > >
> > > > > what the patch tries to do is at this point to compare ID 91250
> > > > > with 91245 and if it passes then we are equal and we throw away
> > > > > ID 91250 because the hypot_type_id for 181 stays 91245
> > > > >
> > > > >
> > > > > ufff.. thoughts? ;-)
> > > >
> > > > Oh, this is a really great analysis, thanks a lot! It makes everything
> > > > clear. Basically, BTF dedup algo does too good job deduping vmlinux
> > > > BTF. :)
> > > >
> > > > What's not clear is what to do about that, because a (current)
> > > > fundamental assumption of is_equiv() check is that any type within CU
> > > > (or in this case deduped vmlinux BTF) has exactly one unique mapping.
> > > > Clearly that's not the case now. That array fix you mentioned worked
> > > > around GCC bug where this assumption broke. In this case it's not a
> > > > bug of a compiler (neither of algo, really), we just need to make algo
> > > > smarter.
> > > >
> > > > Let me think about this a bit, we'll need to make the equivalence
> > > > check be aware that there could be multiple equivalent mappings and be
> > > > ok with that as long as all candidates are equivalent between
> > > > themselves. Lots of equivalence and recursion to think about.
> > > >
> > > > It would be great to have a simplified test case to play with that. Do
> > > > you mind distilling the chain of types above into a selftests and
> > > > posting it to the mailing list so that I can play with it? It
> > > > shouldn't be hard to write given BTF writing APIs. And we'll need a
> > > > selftests anyway once we improve the algo, so it's definitely not a
> > > > wasted work.
> > > >
> >
> >
> > I ended up with simply test, where the idea is to use
> > type id which is defined after currently processed type
> >
> > the last VALIDATE_RAW_BTF fails
> >
> > I'm not sending full atch, because I assume this is not
> > to merge yet also I assume you might want to change that
> > anyway ;-)
> >
> 
> Thanks, Jiri! I'll get to playing with this some time this week,
> hopefully. I hope this is not a huge blocker for you?

I switched off s390x for the moment, so we can move on
I don't think it's huge blocker, nobody is screaming yet ;-)

thanks,
jirka

> 
> 
> > I'll check later on that special array case
> >
> > thanks,
> > jirka
> >
> >
> > ---
> >  .../bpf/prog_tests/btf_dedup_split.c          | 113 ++++++++++++++++++
> >  1 file changed, 113 insertions(+)
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c b/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
> > index 64554fd33547..2ad54e185221 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
> > @@ -314,6 +314,117 @@ static void test_split_struct_duped() {
> >         btf__free(btf1);
> >  }
> >
> > +static void btf_add_data(struct btf *btf, int start_id)
> > +{
> > +#define ID(n) (start_id + n)
> > +       btf__set_pointer_size(btf, 8); /* enforce 64-bit arch */
> > +
> > +       btf__add_int(btf, "int", 4, BTF_INT_SIGNED);    /* [1] int */
> > +
> > +       btf__add_struct(btf, "s", 8);                   /* [2] struct s { */
> > +       btf__add_field(btf, "a", ID(3), 0, 0);          /*      struct anon a; */
> > +       btf__add_field(btf, "b", ID(4), 0, 0);          /*      struct anon b; */
> > +                                                       /* } */
> > +
> > +       btf__add_struct(btf, "(anon)", 8);              /* [3] struct anon { */
> > +       btf__add_field(btf, "f1", ID(1), 0, 0);         /*      int f1; */
> > +       btf__add_field(btf, "f2", ID(1), 32, 0);        /*      int f2; */
> > +                                                       /* } */
> > +
> > +       btf__add_struct(btf, "(anon)", 8);              /* [4] struct anon { */
> > +       btf__add_field(btf, "f1", ID(1), 0, 0);         /*      int f1; */
> > +       btf__add_field(btf, "f2", ID(1), 32, 0);        /*      int f2; */
> > +                                                       /* } */
> > +#undef ID
> > +}
> > +
> > +static void test_split_struct_missed()
> > +{
> > +       struct btf *btf1, *btf2;
> > +       int err;
> > +
> > +       /* generate the base data.. */
> > +       btf1 = btf__new_empty();
> > +       if (!ASSERT_OK_PTR(btf1, "empty_main_btf"))
> > +               return;
> > +
> > +       btf_add_data(btf1, 0);
> > +
> > +       VALIDATE_RAW_BTF(
> > +               btf1,
> > +               "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
> > +               "[2] STRUCT 's' size=8 vlen=2\n"
> > +               "\t'a' type_id=3 bits_offset=0\n"
> > +               "\t'b' type_id=4 bits_offset=0",
> > +               "[3] STRUCT '(anon)' size=8 vlen=2\n"
> > +               "\t'f1' type_id=1 bits_offset=0\n"
> > +               "\t'f2' type_id=1 bits_offset=32",
> > +               "[4] STRUCT '(anon)' size=8 vlen=2\n"
> > +               "\t'f1' type_id=1 bits_offset=0\n"
> > +               "\t'f2' type_id=1 bits_offset=32");
> > +
> > +       /* ..dedup them... */
> > +       err = btf__dedup(btf1, NULL, NULL);
> > +       if (!ASSERT_OK(err, "btf_dedup"))
> > +               goto cleanup;
> > +
> > +       VALIDATE_RAW_BTF(
> > +               btf1,
> > +               "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
> > +               "[2] STRUCT 's' size=8 vlen=2\n"
> > +               "\t'a' type_id=3 bits_offset=0\n"
> > +               "\t'b' type_id=3 bits_offset=0",
> > +               "[3] STRUCT '(anon)' size=8 vlen=2\n"
> > +               "\t'f1' type_id=1 bits_offset=0\n"
> > +               "\t'f2' type_id=1 bits_offset=32");
> > +
> > +       /* and add the same data on top of it */
> > +       btf2 = btf__new_empty_split(btf1);
> > +       if (!ASSERT_OK_PTR(btf2, "empty_split_btf"))
> > +               goto cleanup;
> > +
> > +       btf_add_data(btf2, 3);
> > +
> > +       VALIDATE_RAW_BTF(
> > +               btf2,
> > +               "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
> > +               "[2] STRUCT 's' size=8 vlen=2\n"
> > +               "\t'a' type_id=3 bits_offset=0\n"
> > +               "\t'b' type_id=3 bits_offset=0",
> > +               "[3] STRUCT '(anon)' size=8 vlen=2\n"
> > +               "\t'f1' type_id=1 bits_offset=0\n"
> > +               "\t'f2' type_id=1 bits_offset=32",
> > +               "[4] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
> > +               "[5] STRUCT 's' size=8 vlen=2\n"
> > +               "\t'a' type_id=6 bits_offset=0\n"
> > +               "\t'b' type_id=7 bits_offset=0",
> > +               "[6] STRUCT '(anon)' size=8 vlen=2\n"
> > +               "\t'f1' type_id=4 bits_offset=0\n"
> > +               "\t'f2' type_id=4 bits_offset=32",
> > +               "[7] STRUCT '(anon)' size=8 vlen=2\n"
> > +               "\t'f1' type_id=4 bits_offset=0\n"
> > +               "\t'f2' type_id=4 bits_offset=32");
> > +
> > +       err = btf__dedup(btf2, NULL, NULL);
> > +       if (!ASSERT_OK(err, "btf_dedup"))
> > +               goto cleanup;
> > +
> > +       /* after dedup it should match the original data */
> > +       VALIDATE_RAW_BTF(
> > +               btf2,
> > +               "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
> > +               "[2] STRUCT 's' size=8 vlen=2\n"
> > +               "\t'a' type_id=3 bits_offset=0\n"
> > +               "\t'b' type_id=3 bits_offset=0",
> > +               "[3] STRUCT '(anon)' size=8 vlen=2\n"
> > +               "\t'f1' type_id=1 bits_offset=0\n"
> > +               "\t'f2' type_id=1 bits_offset=32");
> > +
> > +cleanup:
> > +       btf__free(btf2);
> > +       btf__free(btf1);
> > +}
> > +
> >  void test_btf_dedup_split()
> >  {
> >         if (test__start_subtest("split_simple"))
> > @@ -322,4 +433,6 @@ void test_btf_dedup_split()
> >                 test_split_struct_duped();
> >         if (test__start_subtest("split_fwd_resolve"))
> >                 test_split_fwd_resolve();
> > +       if (test__start_subtest("split_struct_missed"))
> > +               test_split_struct_missed();
> >  }
> > --
> > 2.32.0
> >
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2021-11-12  7:23 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-23 12:04 [RFC bpf-next 0/2] bpf: Fix BTF data for modules Jiri Olsa
2021-10-23 12:04 ` [PATCH bpf-next 1/2] kbuild: Unify options for BTF generation for vmlinux and modules Jiri Olsa
2021-10-26  4:56   ` Andrii Nakryiko
2021-10-26 12:03     ` Jiri Olsa
2021-10-23 12:04 ` [PATCH bpf-next 2/2] bpf: Add support to detect and dedup instances of same structs Jiri Olsa
2021-10-26  4:59   ` Andrii Nakryiko
2021-10-26  4:54 ` [RFC bpf-next 0/2] bpf: Fix BTF data for modules Andrii Nakryiko
2021-10-26 12:03   ` Jiri Olsa
2021-10-27  4:12     ` Andrii Nakryiko
2021-10-27  8:53       ` Jiri Olsa
2021-10-27 17:53         ` Andrii Nakryiko
2021-10-27 18:18           ` Jiri Olsa
2021-10-28 19:12             ` Jiri Olsa
2021-11-01 23:14               ` Andrii Nakryiko
2021-11-02 14:14                 ` Jiri Olsa
2021-11-07 14:57                   ` Jiri Olsa
2021-11-09 23:23                     ` Andrii Nakryiko
2021-11-12  7:23                       ` Jiri Olsa
2021-10-28  1:44           ` Kumar Kartikeya Dwivedi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).