Re: sparc implementation

From: Vincentius Robby (vincentius_at_umich_dot_edu)
Date: Wed Sep 03 2008 - 08:54:39 PDT

  • Next message: Vincentius Robby: "Re: sparc implementation"
    Hello Paul,
    
    This is my current code for compare_and_swap:
    CR_INLINE unsigned int
    cri_cmp_swap(cri_atomic_t *p, unsigned int oldval, unsigned int newval)
    {
       register unsigned char ret;
       int tmp;
       static unsigned char lock;
       __asm__ __volatile__("1:      ldstub  [%1], %0\n\t"
                            "        cmp     %0, 0\n\t"
                            "        bne     1b\n\t"
                            "         nop"
                            : "=&r" (tmp)
                            : "r" (&lock)
                            : "memory");
       if (*p != oldval)
         ret = 0;
       else {
         *p = newval;
         ret = 1;
       }
       __asm__ __volatile__("stb     %%g0, [%0]"
                            : /* no outputs */
                            : "r" (&lock)
                            : "memory");
    
       return ret;
    }
    
    The ldstub instruction seems to be intended for memory locks. The code
    that uses the cas instruction seems effective, although cas may not be
    available on V8.
    
    I put this together for add_fetch, though I do not know whether it works:
    CR_INLINE unsigned int
    cri_atomic_add_fetch(cri_atomic_t *p, int op)
    {
       static unsigned char lock;
       unsigned int oldval, newval;
       int tmp;
    
       __asm__ __volatile__("1:	ldstub	[%1], %0\n\t"
    		       "	cmp	%0, 0\n\t"
    		       "	bne	1b\n\t"
    		       "	 nop"
    		       : "=&r" (tmp)
    		       : "r" (&lock)
    		       : "memory");
       oldval = *p;
       *p += op;
       newval = *p;
       __asm__ __volatile__("stb	%%g0, [%0]"
    		       : /* no outputs */
    		       : "r" (&lock)
    		       : "memory");
    
       return newval;
    }
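
For comparison, here is a minimal portable sketch of the same lock-based approach, assuming a C11 compiler. `atomic_flag_test_and_set` plays the role of the `ldstub` spin loop (`ldstub` atomically loads a byte and sets it to 0xFF), and `atomic_flag_clear` plays the role of the `stb %g0` release. The names `locked_lock`, `locked_cmp_swap`, and `locked_add_fetch` are hypothetical and not part of BLCR. One design point it illustrates: both operations take the same lock, since two separate `static` locks would not exclude each other when they touch the same counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names for illustration; not BLCR's actual API. */
static atomic_flag locked_lock = ATOMIC_FLAG_INIT;

static void locked_acquire(void)
{
    /* Spin until the previous flag value was clear (like ldstub returning 0). */
    while (atomic_flag_test_and_set_explicit(&locked_lock, memory_order_acquire))
        ;
}

static void locked_release(void)
{
    /* Clear the flag (like storing %g0 to the lock byte). */
    atomic_flag_clear_explicit(&locked_lock, memory_order_release);
}

/* Returns 1 if *p equaled oldval and was replaced by newval, else 0. */
static unsigned int locked_cmp_swap(volatile unsigned int *p,
                                    unsigned int oldval, unsigned int newval)
{
    unsigned int swapped = 0;
    assert(p != NULL);
    locked_acquire();
    if (*p == oldval) {
        *p = newval;
        swapped = 1;
    }
    locked_release();
    return swapped;
}

/* Adds op to *p under the same lock and returns the updated value. */
static unsigned int locked_add_fetch(volatile unsigned int *p, int op)
{
    unsigned int newval;
    assert(p != NULL);
    locked_acquire();
    newval = *p + (unsigned int)op;
    *p = newval;
    locked_release();
    return newval;
}
```

A lock of this kind only protects callers that go through these functions; any direct write to the counter bypasses it.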
    
    For the syscalls, I borrowed pieces from here and there. Does this look
    like it will work?
    
    #define cri_syscall_cleanup(res,errno_p) \
         __asm__ volatile ("" ::: "memory", "cc", "cx", "dx");\
         if ((unsigned long)res >= (unsigned long)(-4096)) {	\
    	if (errno_p != NULL) { *errno_p = -res; }	\
    	res = -1;					\
         }
    
    #define cri_syscall0(type,name,nr)					\
    type name(int *errno_p)							\
    ({									\
    	register long __o0 __asm__ ("o0");				\
    	register long __g1 __asm__ ("g1") = name;			\
    	__asm __volatile (type : "=r" (__g1), "=r" (__o0) :		\
    			  "0" (__g1) :					\
    			  __SYSCALL_CLOBBERS);				\
    	__o0;								\
    	cri_syscall_cleanup(__g1, errno_p);				\
    })
    
    #define cri_syscall1(type,name,arg1)					\
    type name(type1 arg1,int *errno_p)					\
    ({									\
    	register long __o0 __asm__ ("o0") = (long)(arg1);		\
    	register long __g1 __asm__ ("g1") = name;			\
    	__asm __volatile (type : "=r" (__g1), "=r" (__o0) :		\
    			  "0" (__g1), "1" (__o0) :			\
    			  __SYSCALL_CLOBBERS);				\
    	__o0;								\
    	cri_syscall_cleanup(__g1, errno_p);				\
    })
    
    I am not sure what the assembly code in cri_syscall_cleanup does.
    Would it need modification for sparc?
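
As a rough illustration of what the non-assembly part of `cri_syscall_cleanup` does, the sketch below expresses the errno handling as a plain C function. It assumes the raw result follows the convention the macro tests for, where values in the top 4096 of the unsigned range encode a negated errno; the function name `sketch_syscall_cleanup` is hypothetical. The empty `__asm__` statement in the macro emits no instructions and only acts as a compiler barrier, and the `"cx"` and `"dx"` entries in its clobber list are x86 register names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper mirroring the C part of the cri_syscall_cleanup macro. */
static long sketch_syscall_cleanup(long res, int *errno_p)
{
    /* Same range test as the macro: raw results in [-4096, -1] are errors. */
    if ((unsigned long)res >= (unsigned long)(-4096)) {
        if (errno_p != NULL)
            *errno_p = (int)-res;   /* recover the positive errno value */
        return -1;                  /* conventional failure return */
    }
    return res;                     /* success: pass the result through */
}
```

Passing `NULL` for `errno_p` simply skips the errno store, as in the macro.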
    
    Also, this is the last line before the errors:
    make[2]: Entering directory `/opt/blcr-0.7.3_vincent/builddir/libcr'
    if /bin/sh ../libtool --mode=compile gcc -DHAVE_CONFIG_H -I.  
    -I../../libcr -I.. -D_GNU_SOURCE -D_REENTRANT -I../include  
    -I../../include -I../../libcr/arch//   -Wall -Wno-unused-function  
    -fno-stack-protector  -g -O2 -MT libcr_la-cr_async.lo -MD -MP -MF  
    ".deps/libcr_la-cr_async.Tpo" -c -o libcr_la-cr_async.lo `test -f  
    'cr_async.c' || echo '../../libcr/'`cr_async.c; \
             then mv -f ".deps/libcr_la-cr_async.Tpo"  
    ".deps/libcr_la-cr_async.Plo"; else rm -f  
    ".deps/libcr_la-cr_async.Tpo"; exit 1; fi
      gcc -DHAVE_CONFIG_H -I. -I../../libcr -I.. -D_GNU_SOURCE  
    -D_REENTRANT -I../include -I../../include -I../../libcr/arch// -Wall  
    -Wno-unused-function -fno-stack-protector -g -O2 -MT  
    libcr_la-cr_async.lo -MD -MP -MF .deps/libcr_la-cr_async.Tpo -c  
    ../../libcr/cr_async.c  -fPIC -DPIC -o .libs/libcr_la-cr_async.o
    In file included from ../../libcr/cr_async.c:37:
    ../../libcr/cr_private.h:63:23: error: cr_atomic.h: No such file or directory
    In file included from ../../libcr/cr_private.h:65,
                      from ../../libcr/cr_async.c:37:
    
    Note that instead of sparc or sparc64, there is nothing between the
    slashes (-I../../libcr/arch//). Is this because no macro is defined
    for sparc? I tried looking for the Makefile that contains this snippet
    but have not found it yet. Could you give me some more pointers?
    
    Thank you for all the help.
    
    -- 
    Vincentius Robby
    
    Quoting "Paul H. Hargrove" <PHHargrove_at_lbl_dot_gov>:
    
    > My best guess is that the -I options are not quite right to include  
    > the libcr/arch/sparc directory.  Take a look at the full command  
    > line that make executes for a file in libcr.  For example, on an i686
    > I see:
    >
    > /bin/sh ../libtool --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I.  
    > -I.. -I../../libcr -D_GNU_SOURCE -D_REENTRANT -I../include  
    > -I../../include -I../../libcr/arch/i386/   -Wall  
    > -Wno-unused-function -fno-stack-protector  -g -O2 -MT  
    > libcr_la-cr_async.lo -MD -MP -MF .deps/libcr_la-cr_async.Tpo -c -o  
    > libcr_la-cr_async.lo `test -f 'cr_async.c' || echo  
    > '../../libcr/'`cr_async.c
    >
    > Notice the -I../../libcr/arch/i386/ part.  My best guess is that
    > yours says -I../../libcr/arch/sparc64/, rather than
    > -I../../libcr/arch/sparc/ as you are expecting (note the ../.. is
    > based on where my build dir is located in relation to my source dir,
    > so your -I's may differ).  If this is the case, then you should
    > consider setting up the sparc and sparc64 dirs with the same relation
    > as the existing ppc and ppc64 dirs (the 64-bit dir includes the
    > 32-bit code).
    >
    > As for the inc and dec-and-test, you can implement them using the following:
    >
    > CR_INLINE unsigned int
    > __cri_atomic_add_fetch(cri_atomic_t *p, unsigned int op)
    > {
    >    unsigned long oldval, newval;
    >    do {
    >        oldval = cri_atomic_read(p);
    >        newval = oldval + op;
    >    } while (!cri_cmp_swap(p, oldval, newval));
    >    return newval;
    > }
    >
    > CR_INLINE void
    > cri_atomic_inc(cri_atomic_t *p)
    > {
    >    (void)__cri_atomic_add_fetch(p, 1);
    > }
    >
    > CR_INLINE int
    > cri_atomic_dec_and_test(cri_atomic_t *p)
    > {
    >    return (__cri_atomic_add_fetch(p, -1) == 0);
    > }
    >
    >
    > These should be considered the "reference" implementations and would  
    > appear in a porting guide if I had written one.
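
The retry loop above can also be sketched in portable C11, which may help when checking a hand-written port against a known-good version. This is an illustration under assumptions, not BLCR code: the `ref_` names and the `ref_atomic_t` type are hypothetical, and `atomic_compare_exchange_weak` stands in for `cri_cmp_swap`.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for cri_atomic_t. */
typedef atomic_uint ref_atomic_t;

/* Add op to *p with a compare-and-swap retry loop; return the new value. */
static unsigned int ref_add_fetch(ref_atomic_t *p, unsigned int op)
{
    unsigned int oldval = atomic_load(p);
    unsigned int newval;
    do {
        newval = oldval + op;
        /* On failure, oldval is refreshed with the current contents of *p. */
    } while (!atomic_compare_exchange_weak(p, &oldval, newval));
    return newval;
}

static void ref_inc(ref_atomic_t *p)
{
    (void)ref_add_fetch(p, 1u);
}

/* Returns nonzero when the decrement brought the counter to zero. */
static int ref_dec_and_test(ref_atomic_t *p)
{
    /* Adding UINT_MAX is the unsigned form of subtracting 1. */
    return ref_add_fetch(p, (unsigned int)-1) == 0;
}
```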
    >
    > However, I think the following would be the (nearly) "optimal"  
    > __cri_atomic_add_fetch() for UltraSPARC and newer:
    >
    > CR_INLINE unsigned int
    > __cri_atomic_add_fetch(cri_atomic_t *p, unsigned int op)
    > {
    >    register unsigned int oldval, newval;
    >    __asm__ __volatile__ (
    >        "ld       [%4],%0    \n\t" /* oldval = *addr; */
    >        "0:                  \t"
    >        "add      %0,%3,%1   \n\t" /* newval = oldval + op; */
    >        "cas      [%4],%0,%1 \n\t" /* if (*addr == oldval) SWAP(*addr,newval); else newval = *addr; */
    >        "cmp      %0, %1     \n\t" /* check if newval == oldval (swap succeeded) */
    >        "bne,a,pn %%icc, 0b  \n\t" /* otherwise, retry (,pn == predict not taken; ,a == annul) */
    >        "  mov    %1, %0     "     /* oldval = newval; (branch delay slot, annulled if not taken) */
    >        : "=&r"(oldval), "=&r"(newval), "=m"(*p)
    >        : "rn"(op), "r"(p), "m"(*p) );
    >    return newval;
    > }
    >
    >
    > If I got that right (based on a different SPARC atomics project I  
    > worked on), the generated asm for atomic inc and dec-and-test will  
    > use immediate +1 and -1 arguments to the add instruction.
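
To make the `cas` semantics concrete: the instruction always leaves the previous memory contents in its destination register, and the swap happened exactly when that value equals the comparison value. The sketch below models this with GCC's `__sync_val_compare_and_swap` builtin, which likewise returns the prior contents. It is an illustration under assumptions (the name `cas_add_fetch` is hypothetical, and the builtin is a stand-in for the instruction, not what BLCR uses). It computes the value to return from the observed old value rather than from the register that `cas` overwrites.

```c
#include <assert.h>

/* Hypothetical model of an add-and-fetch built on a value-returning CAS. */
static unsigned int cas_add_fetch(volatile unsigned int *p, unsigned int op)
{
    unsigned int expected = *p;          /* like the initial ld */
    for (;;) {
        unsigned int desired = expected + op;
        /* Returns the previous contents of *p, as cas leaves them in rd. */
        unsigned int seen = __sync_val_compare_and_swap(p, expected, desired);
        if (seen == expected)
            return desired;              /* swap took place */
        expected = seen;                 /* retry with the fresh value */
    }
}
```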
    >
    > I'll need to think about whether you need any memory barriers to  
    > make this 100% correct on the SPARC architecture.  Have you  
    > considered that for the cri_cmp_swap()?
    >
    > The syscall code is found in glibc.  For instance in  
    > glibc-2.6/sysdeps/unix/sysv/linux/sparc/sysdep.h, where you'll want  
    > to use guts of the inline_syscall0() through inline_syscall5()  
    > macros, combined with the errno handling as seen in  
    > blcr/libcr/arch/i386/cr_arch.h:cri_syscall_cleanup().
    >
    > Just to be 100% proper about this code I've included here:
    > Signed-off-by: Paul H. Hargrove <PHHargrove_at_lbl_dot_gov>
    >
    > -Paul
    >
    > Vincentius Robby wrote:
    >> Thank you Paul,
    >>
    >> Now even after I put the cr_arch.h and cr_atomic.h under  
    >> libcr/arch/sparc, the following errors appear:
    >> In file included from ../../libcr/cr_async.c:37:
    >> ../../libcr/cr_private.h:63:23: error: cr_atomic.h: No such file or directory
    >> In file included from ../../libcr/cr_private.h:65,
    >>                 from ../../libcr/cr_async.c:37:
    >> [some more errors]
    >> ../../libcr/cr_private.h:66:21: error: cr_arch.h: No such file or directory
    >>
    >> Do I have to change something else for blcr to recognize the files?
    >> Also, would you be able to point me to some other resources for the
    >> assembly code? For cr_atomic.h, I understood how to implement
    >> compare_and_swap, but I am not able to infer the atomic increment
    >> and decrement-and-test from the glibc sources or from the
    >> sources for other architectures. For the syscall functions, do
    >> you know where I can look? Should glibc have these?
    >>
    >> Thank you very much for the help. I've been slowly looking into
    >> these for a while, but my inexperience keeps me from advancing
    >> quickly.
    >>
    >
    >
    > -- 
    > Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    > Future Technologies Group
    > HPC Research Department                   Tel: +1-510-495-2352
    > Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
    >
    >
    >
    >
    
