Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 24 Jul 2011 11:02:21 +0300
From:      Mikolaj Golub <trociny@freebsd.org>
To:        freebsd-virtualization@freebsd.org
Cc:        Kostik Belousov <kib@FreeBSD.org>, Marko Zec <zec@FreeBSD.org>, "Bjoern A. Zeeb" <bz@freebsd.org>, Robert Watson <rwatson@freebsd.org>, Pawel Jakub Dawidek <pjd@FreeBSD.org>
Subject:   FIX: accessing module's virtualized global variables from another module
Message-ID:  <86r55ghzsi.fsf@kopusha.home.net>

next in thread | raw e-mail | index | archive | help
--=-=-=

Hi,

Some time ago I reported about this problem (the thread "vnet: acessing
module's virtualized global variables from another module"). I got a reply
only from Marko but failed to convinced him that it was not issue with curvnet
context being not set properly.

I will try to explain the issue again, as this time I come with a patch that
might be a solution. Below are some things that may be obvious for people who
are involved in the subject. Please skip to the last paragraphs then :-).

So, kernel and modules have 'set_vnet' linker set and the virtualized global
variables (defined via VNET_DEFINE() macro) are placed in this set. This
serves two purposes:

1) When a new virtual network stack instance is allocated it is initialized
copying the globals from kernel 'set_vnet' linker set.

2) The variable name acts as unique global name by which the variable can be
referred to. The location of a per-virtual instance variable is calculated at
run-time using accessor macros (VNET_VNET, VNET, ...), like in the example
below for ifnet variable in the vnet0:

	vnet0->vnet_data_base + (uintptr_t) & vnet_entry_ifnet

When a vnet is created its vnet_data_base is set accordingly to make this
work.

This works for virtualized variables from kernel 'set_vnet' linker set, but
modules need additional work. When a module is loaded its global variables
from 'set_vnet' linker set are copied to the kernel 'set_vnet', which has
additional space ('modspace') reserved for this purpose. This makes (1) work.
And to make the formula (2) valid for the module's virtual global the linker
reallocates all its references according to this formula:

	(x - ef->vnet_start) + ef->vnet_base

(see kern/link_elf.c:elf_relocaddr()).

Here is an example. In the ipfw module layer3_chain virtual global is defined:

kopusha:~% nm /boot/kernel/ipfw.ko |grep -i 'set_vnet\|layer3_chain'
00010820 A __start_set_vnet
00010b0c A __stop_set_vnet
00010840 d vnet_entry_layer3_chain

So it has relative address 0x00010840. The module is loaded at 0x9337f000:

kopusha:~% kldstat|grep ipfw.ko
22    2 0x9337f000 11000    ipfw.ko

and vnet_entry_layer3_chain is at 0x9337f000 + 0x00010840 = 0x9338f840:

(kgdb) p &vnet_entry_layer3_chain
$1 = (struct ip_fw_chain *) 0x9338f840

But inside ipfw.ko, the linker relocated the 0x9338f840 references to
0x86dac580, otherwise VNET macros will access wrong location:

(kgdb) p *(struct ip_fw_chain *)(vnet0->vnet_data_base + (uintptr_t) &vnet_entry_layer3_chain)
Error accessing memory address 0x9908b2c0: Bad address.
(kgdb) p *(struct ip_fw_chain *)(vnet0->vnet_data_base + (uintptr_t) 0x86dac580)
$2 = {rules = 0x0, reap = 0x872d296c, default_rule = 0x937bdbd0, n_rules = -1823040548, 
  ...

So far so good. But what if the global defined in one module is accessed from
another module (as it is in the case of layer3_chain, which is accessed from
ipfw_nat)? The linker relocates references only for "local" globals (defined
in its own module), so inside ipfw_nat the global is accessed using unmodified
vnet_entry_layer3_chain location, thus by wrong address.

The fix could be to make the linker on a module load recognize "external"
virtual globals defined in previously loaded modules and relocate them
accordingly. For this we need set_vnet list, where the addresses of modules
'set_vnet' linker sets are stored in. The attach patch implements this and
works for me.

Also, could someone explain what is kern/link_elf_obj.c for? The patch does
not deal with this file, but I see the reloactions here similar to those in
link_elf.c. So I might need to do similar modifiactions in link_elf_obj.c too,
but I don't know where this code is used (it looks like it is not executed on
my box, so I don't know how to test it).

-- 
Mikolaj Golub


--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=vnet.vnet_set.patch

Index: sys/net/vnet.c
===================================================================
--- sys/net/vnet.c	(revision 224286)
+++ sys/net/vnet.c	(working copy)
@@ -152,6 +152,11 @@ struct vnet *vnet0;
  * initialized, in order to propagate the new static initializations to all
  * existing virtual network stack instances so that the soon-to-be executing
  * module will find every network stack instance with proper default values.
+ *
+ * The addresses of modules 'set_vnet' linker sets are stored in
+ * vnet_set list. The list is used by the kernel module linker to bind
+ * references of virtualized global variables defined in another
+ * modules.
  */
 
 /*
@@ -302,6 +307,102 @@ vnet_destroy(struct vnet *vnet)
 }
 
 /*
+ * set_vnet list.
+ */
+MALLOC_DEFINE(M_SET_VNET, "set_vnet", "VNET data");
+
+struct set_vnet {
+	uintptr_t       vs_start;
+	uintptr_t       vs_stop;
+	uintptr_t	vs_base;
+	TAILQ_ENTRY(set_vnet)	vs_link;
+};
+
+static TAILQ_HEAD(set_vnet_head, set_vnet) set_vnet_list;
+
+static void
+set_vnet_insert(struct set_vnet *set)
+{
+	struct set_vnet *iter;
+
+	TAILQ_FOREACH(iter, &set_vnet_list, vs_link) {
+
+		KASSERT((set->vs_start < iter->vs_start && set->vs_stop < iter->vs_stop) ||
+		    (set->vs_start > iter->vs_start && set->vs_stop > iter->vs_stop),
+		    ("vnet sets intersection: to insert: 0x%jx-0x%jx; inserted: 0x%jx-0x%jx",
+		    (uintmax_t)set->vs_start, (uintmax_t)set->vs_stop,
+		    (uintmax_t)iter->vs_start, (uintmax_t)iter->vs_stop));
+
+		if (iter->vs_start > set->vs_start) {
+			TAILQ_INSERT_BEFORE(iter, set, vs_link);
+			break;
+		}
+	}
+
+	if (iter == NULL)
+		TAILQ_INSERT_TAIL(&set_vnet_list, set, vs_link);
+}
+
+static void
+set_vnet_remove(struct set_vnet *set)
+{
+	TAILQ_REMOVE(&set_vnet_list, set, vs_link);
+}
+
+static struct set_vnet *
+set_vnet_search(uintptr_t addr)
+{
+	struct set_vnet *set;
+
+	TAILQ_FOREACH(set, &set_vnet_list, vs_link) {
+		if (addr < set->vs_start)
+			return (NULL);
+		if (addr < set->vs_stop)
+			return (set);
+	}
+	return (NULL);
+}
+
+void
+set_vnet_add(uintptr_t start, uintptr_t stop, uintptr_t base)
+{
+	struct set_vnet *set;
+
+	set = malloc(sizeof(*set), M_SET_VNET, M_WAITOK | M_ZERO);
+	set->vs_start = start;
+	set->vs_stop = stop;
+	set->vs_base = base;
+	set_vnet_insert(set);
+}
+
+void
+set_vnet_delete(uintptr_t start)
+{
+	struct set_vnet *set;
+
+	set = set_vnet_search(start);
+	KASSERT(set != NULL, ("deleting unknown vnet set (start = 0x%jx)",
+	    (uintmax_t)start));
+	set_vnet_remove(set);
+	free(set, M_SET_VNET);
+}
+
+int
+set_vnet_find(uintptr_t addr, uintptr_t *start, uintptr_t *base)
+{
+	struct set_vnet *set;
+
+	set = set_vnet_search(addr);
+	if (set != NULL) {
+		*start = set->vs_start;
+		*base = set->vs_base;
+		return (1);
+	} else {
+		return (0);
+	}
+}
+
+/*
  * Boot time initialization and allocation of virtual network stacks.
  */
 static void
@@ -312,6 +413,7 @@ vnet_init_prelink(void *arg)
 	sx_init(&vnet_sxlock, "vnet_sxlock");
 	sx_init(&vnet_sysinit_sxlock, "vnet_sysinit_sxlock");
 	LIST_INIT(&vnet_head);
+	TAILQ_INIT(&set_vnet_list);
 }
 SYSINIT(vnet_init_prelink, SI_SUB_VNET_PRELINK, SI_ORDER_FIRST,
     vnet_init_prelink, NULL);
Index: sys/net/vnet.h
===================================================================
--- sys/net/vnet.h	(revision 224286)
+++ sys/net/vnet.h	(working copy)
@@ -109,6 +109,20 @@ struct vnet *vnet_alloc(void);
 void	vnet_destroy(struct vnet *vnet);
 
 /*
+ * Functions to manage vnet_set list.
+ */
+void vnet_set_add(uintptr_t start, uintptr_t end, uintptr_t base);
+void vnet_set_delete(uintptr_t start);
+int vnet_set_find(uintptr_t addr, uintptr_t *start, uintptr_t *base);
+
+/*
+ * Functions to manage set_vnet list.
+ */
+void set_vnet_add(uintptr_t start, uintptr_t end, uintptr_t base);
+void set_vnet_delete(uintptr_t start);
+int set_vnet_find(uintptr_t addr, uintptr_t *start, uintptr_t *base);
+
+/*
  * The current virtual network stack -- we may wish to move this to struct
  * pcpu in the future.
  */
Index: sys/kern/link_elf.c
===================================================================
--- sys/kern/link_elf.c	(revision 224286)
+++ sys/kern/link_elf.c	(working copy)
@@ -544,6 +544,7 @@ parse_vnet(elf_file_t ef)
 		return (ENOSPC);
 	memcpy((void *)ef->vnet_base, (void *)ef->vnet_start, count);
 	vnet_data_copy((void *)ef->vnet_base, count);
+	set_vnet_add(ef->vnet_start, ef->vnet_stop, ef->vnet_base);
 
 	return (0);
 }
@@ -1000,6 +1001,7 @@ link_elf_unload_file(linker_file_t file)
 	if (ef->vnet_base != 0) {
 		vnet_data_free((void *)ef->vnet_base,
 		    ef->vnet_stop - ef->vnet_start);
+		set_vnet_delete(ef->vnet_start);
 	}
 #endif
 #ifdef GDB
@@ -1438,6 +1440,10 @@ elf_lookup(linker_file_t lf, Elf_Size symidx, int
 	elf_file_t ef = (elf_file_t)lf;
 	const Elf_Sym *sym;
 	const char *symbol;
+	Elf_Addr addr;
+#ifdef VIMAGE
+	uintptr_t vnet_start, vnet_base;
+#endif
 
 	/* Don't even try to lookup the symbol if the index is bogus. */
 	if (symidx >= ef->nchains)
@@ -1469,7 +1475,13 @@ elf_lookup(linker_file_t lf, Elf_Size symidx, int
 	if (*symbol == 0)
 		return (0);
 
-	return ((Elf_Addr)linker_file_lookup_symbol(lf, symbol, deps));
+	addr = ((Elf_Addr)linker_file_lookup_symbol(lf, symbol, deps));
+
+#ifdef VIMAGE
+	if (set_vnet_find(addr, &vnet_start, &vnet_base))
+		addr = addr - vnet_start + vnet_base;
+#endif
+	return addr;
 }
 
 static void

--=-=-=--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?86r55ghzsi.fsf>