generating unaligned vector load instructions using gcc

+1 vote

I wonder how to get the compiler to generate the "movdqu" instruction, since the vector extensions always seem to assume that everything is aligned to 16 bytes.
I tried using a packed struct, but this didn't help much. Of course one can always resort to inline assembly, but this should not be necessary.

Compile with:

gcc -O2 -S -msse2 testvecs.c

Using built-in specs.

Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.7.2-5' 
--enable-languages=c,c++,go,fortran,objc,obj-c++ --prefix=/usr 
--program-suffix=-4.7 --enable-shared --enable-linker-build-id 
--with-system-zlib --libexecdir=/usr/lib --without-included-gettext 
--enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.7 
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu 
--enable-libstdcxx-debug --enable-libstdcxx-time=yes 
--enable-gnu-unique-object --enable-plugin --enable-objc-gc 
--enable-targets=all --with-arch-32=i586 --with-tune=generic 
--enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu 
Thread model: posix
gcc version 4.7.2 (Debian 4.7.2-5)
posted Sep 18, 2013 by Jagan Mishra

1 Answer

+1 vote

I do see a movdqu across a range of 64-bit gcc versions, from 4.4.6 to 4.9. Some of these compilers complain about mixed data type arithmetic on lines 29 and 42.
I don't know whether it applies here, but splitting an unaligned memory move is likely to be the right thing on platforms up through Intel Westmere. If you are building for a newer CPU, specify -march=native so that gcc optimizes for it.
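
For reference, here is a minimal sketch of one way to ask for an unaligned vector load without inline assembly. It is not taken from testvecs.c, which is not shown here. It copies the bytes into a vector with memcpy, which makes no alignment assumption about the source pointer. With -O2 -msse2, gcc will usually turn such a copy into a single unaligned load (movdqu, or a split load depending on tuning), but check the generated assembly for your compiler version. The names v4si, load_unaligned and sum4_unaligned are hypothetical. On x86 the SSE2 intrinsic _mm_loadu_si128 from <emmintrin.h> is the target-specific alternative.

```c
#include <assert.h>
#include <string.h>

/* 16-byte vector of four ints, using gcc's vector extension. */
typedef int v4si __attribute__((vector_size(16)));

/* Hypothetical helper: load four ints from a pointer that carries
   no alignment guarantee. memcpy does not assume any alignment. */
static inline v4si load_unaligned(const void *p)
{
    v4si v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: sum the four ints stored at an arbitrary address. */
int sum4_unaligned(const void *p)
{
    v4si v = load_unaligned(p);
    return v[0] + v[1] + v[2] + v[3];
}
```

Compiling this with gcc -O2 -S -msse2 and searching the output for movdqu shows whether the compiler chose a single unaligned load on your setup.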

answer Sep 18, 2013 by Ahmed Patel
Similar Questions
0 votes

I am working on an embedded processor with GCC 4.6.4. I need to add load/store reverse instructions to the MD file. My instructions look as follows:

 LWX Rd,Ra,Rb
 operation: Addr := Ra + Rb
 Rd := *Addr (loading data with the opposite endianness)
 SWX Rd,Ra,Rb
 operation: Addr := Ra + Rb
 *Addr := Rd (storing data with the opposite endianness)
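
The intended behaviour of the two instructions can be modelled in C as a rough sketch. This is only an illustration of the semantics described above, not part of the port. The function names lwx_model and swx_model are hypothetical, and a 32-bit word is assumed because the pattern uses SI mode. __builtin_bswap32 is the gcc builtin that reverses the bytes of a 32-bit value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of LWX: Rd := byte-reversed word at address Ra + Rb. */
uint32_t lwx_model(const unsigned char *ra, size_t rb)
{
    uint32_t w;
    assert(ra != NULL);
    memcpy(&w, ra + rb, sizeof w);   /* plain load in native byte order */
    return __builtin_bswap32(w);     /* reverse the four bytes */
}

/* Hypothetical model of SWX: word at address Ra + Rb := byte-reversed Rd. */
void swx_model(uint32_t rd, unsigned char *ra, size_t rb)
{
    uint32_t w = __builtin_bswap32(rd);
    assert(ra != NULL);
    memcpy(ra + rb, &w, sizeof w);   /* plain store in native byte order */
}
```

Storing a value with swx_model and reading it back with lwx_model at the same address returns the original value, since the byte reversal is applied twice.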

To add these instructions, I tried the following pattern in the md file:

 (define_insn "movsi_rev"
 [(set (match_operand:SI 0 "nonimmediate_operand" "=d,m")
 (bswap:SI (match_operand:SI 1 "move_src_operand" "m,d")))]
 [(set_attr "type" "load,store")
 (set_attr "mode" "SI")
 (set_attr "length" "4,4")])

I wrote a small test case that generates the swx instruction, but it is emitted with an extra operand, so it fails in the assembler phase.

 For example, instead of swx r0,r2,r0 it generates swx r0,r2,0,r0

Can anyone please help me remove the extra operand from the above instruction?