From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Wed Jul 25 2007 - 09:53:23 PDT
Cedric Le Goater wrote:
> Cedric Le Goater wrote:
>
>>> Hmm, not 100% sure on that one. There is code in
>>> cr_module/cr_module.c:cr_init_module() that compares the addresses of
>>> two symbols as probed from the System.map at configure time against
>>> their addresses as resolved by the kernel's module linker/loader. BLCR
>>> refuses to load the module if these don't match, since it will be making
>>> function calls to other addresses obtained in the same way (and we
>>> really don't want to invoke code at random addresses in kernel context).
>>>
>>> So, my best guess is that message means what it says, perhaps due to
>>> BLCR's autoconf machinery locating the wrong System.map file. You
>>> should try comparing the output of
>>>     $ grep register_chrdev /proc/kyms
>>> against that of
>>>     $ grep register_chrdev [MAPFILE]
>>> where [MAPFILE] is the System.map file being used by BLCR (try "grep
>>> LINUX_SYMTAB_FILE Makefile" in your BLCR build directory). If they
>>> match, then it is possible that something is happening with respect to
>>> kernel linking/relocation that BLCR is not prepared to deal with. If
>>> they don't match then it might still be a relocation issue, but more
>>> likely it means BLCR found the wrong System.map file. If that is the
>>> case, try passing --with-system-map=[WHATEVER] when configuring BLCR.
>>> Let us know what you find.
>>>
>> hmm, the addresses in the /boot/System.map-2.6.22.1-27.fc7 file and
>> the ones from /proc/kallsyms (same kernel shipped by fedora) are
>> different. weird. I'll investigate.
>>
>
> dunno why this is not the same.
>

My guess is that fc7 has enabled relocation of the kernel image, in
which case the difference between the addresses in System.map and
/proc/kallsyms would probably be a multiple of the page size. Could you
check for CONFIG_RELOCATABLE in your kernel config (probably in
/boot/config-2.6.22.1-27.fc7)?
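A minimal sketch of how the page-size guess could be checked, assuming both files use the usual "address type name" line layout. The function names and the 4096-byte page size are illustrative assumptions, not anything taken from BLCR.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def parse_symbols(text):
    """Map symbol name -> address from 'address type name' lines."""
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        addr, _type, name = fields[:3]  # ignore a trailing "[module]" field
        try:
            # Keep the first occurrence if a name appears more than once.
            symbols.setdefault(name, int(addr, 16))
        except ValueError:
            continue  # address field was not hexadecimal
    return symbols


def relocation_offset(map_text, kallsyms_text, symbol="register_chrdev"):
    """Return the kallsyms address minus the System.map address for symbol.

    Returns None if the symbol is missing from either input.
    """
    static = parse_symbols(map_text).get(symbol)
    runtime = parse_symbols(kallsyms_text).get(symbol)
    if static is None or runtime is None:
        return None
    return runtime - static


def is_page_multiple(offset, page_size=PAGE_SIZE):
    """True if offset is a whole number of pages (zero included)."""
    return offset % page_size == 0
```

Passing the contents of the System.map file and of /proc/kallsyms to relocation_offset() gives the difference for register_chrdev. A nonzero result for which is_page_multiple() returns True would be consistent with a relocated kernel image.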
If that is the case, then BLCR is going to need to find a way to deal
with this (and I don't have any bright ideas at the moment).

>
>>>> what about glibc 2.6 ?
>>>>
>>> What about it? I don't have any systems running glibc 2.6. If you have
>>> specific problems (once past the System.map problem), please let us know
>>> and we'll see what we can do to sort them out.
>>>
>> I will as soon as i get that module loaded ! :)
>>
>
> so I generated a real System.map with :
>
> $ cat /proc/kallsyms > System.map
>

If that works for you, then --with-system-map=/proc/kallsyms should work
as well. Did that really work with no other changes? I've been unable
to use that approach on other systems because the symbol "_end" (which
we key off to validate System.map) is missing from /proc/kallsyms where
I've tried it.

>
> configured, built and run 'make check' :
>
> PASS: atomics
> PASS: cr_run
> PASS: bug2003
> PASS: stage0001.st
> PASS: stage0002.st
> PASS: stage0003.st
> PASS: critical_sections.st
> PASS: replace_cb.st
> PASS: failed_cb.st
> PASS: pid_in_use.st
> PASS: simple.ct
> PASS: simple_pthread.ct
> PASS: cwd.ct
> PASS: dup.ct
> PASS: filedescriptors.ct
> PASS: pipe.ct
> PASS: named_fifo.ct
> PASS: cloexec.ct
> PASS: get_info.ct
> PASS: orphan.ct
> PASS: overlap.ct
> PASS: child.ct
> PASS: mmaps.ct
> No hugetlbfs mount point found (test skipped)
> SKIP: hugetlbfs.ct
> PASS: readdir.ct
> PASS: dev_null.ct
> PASS: cr_signal.ct
> PASS: linked_fifo.ct
> ======================
> All 27 tests passed
> (1 tests were not run)
>
> it seems fine on a bare fc7. I suppose that the tests are doing a
> checkpoint+restart sequence. right ?
>
>

Yes, all the tests that end in ".ct" are doing checkpoint+restart (some
of them multiple times).

> thanks !
>
> C.
>

You are welcome.

-Paul

-- 
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900