DevMaster.net Forums
Old 08-01-2008, 08:00 AM   #1
Enlight
 
Posts: n/a
Default

The other day a co-worker (Leo Benaducci) and I started a small contest: add support for the swizzle operator, as found in shader languages (HLSL, Cg, GLSL), to any standard Vector 2, 3 or 4 class in C++. Something like:

Code:
Vector3 a;
Vector3 b;
Vector3 c;
a = b.xyy;
c.yz = b.zz;
An hour later (more or less), we had each come up with a solution, using different approaches. Leo solved it with a template, and I just used a couple of macros. Both solutions produce optimal assembly code with the VC++ 2005 compiler, with no additional overhead from the swizzle operator. Here are both versions, with examples showing how to use them.

Leo swizzle with mini vector3 class:
Code:
class LVector3
{
public:
    float x,y,z;

    inline LVector3() { x=y=z=0; }
    inline LVector3(float _x, float _y, float _z) { x = _x; y = _y; z = _z; }

    template<int srcSwz, int dstSwz>
    __forceinline void _swz(LVector3 v)
    {
        const int _srcSwz = srcSwz;
        const int _dstSwz = dstSwz;
        const char *srcSwizzle = (const char*) &_srcSwz;
        const char *dstSwizzle = (const char*) &_dstSwz;

        int i = 255;
        if(*(char*)&i & 255)
        {
            *(&x + (srcSwizzle[2] - 'x')) = *(&v.x + (dstSwizzle[2] - 'x'));
            *(&x + (srcSwizzle[1] - 'x')) = *(&v.x + (dstSwizzle[1] - 'x'));
            *(&x + (srcSwizzle[0] - 'x')) = *(&v.x + (dstSwizzle[0] - 'x'));
        }
        else
        {
            *(&x + (srcSwizzle[0] - 'x')) = *(&v.x + (dstSwizzle[0] - 'x'));
            *(&x + (srcSwizzle[1] - 'x')) = *(&v.x + (dstSwizzle[1] - 'x'));
            *(&x + (srcSwizzle[2] - 'x')) = *(&v.x + (dstSwizzle[2] - 'x'));
        }
    }
};

#define swz(a, b, c) _swz<#@a, #@c>(b)
Now you can simply use the operator like this:
Code:
LVector3 a;
LVector3 b;
LVector3 c;
b.swz(xzy, a, zyx);
c.swz(xxx, a, zzz); // just to test optimizer!
Looks complex, doesn't it? But don't be afraid of using it, because the compiler resolves everything at compile time and generates optimal code. The first operation unfolds to 3 floating-point assignments and the second to only 1.

My version only uses macros, but it additionally supports any operation (not just copying) and works with any vector 2, 3, or 4 class. The only requirement is that the vector class implements the [] operator to access its elements.

Enlight swizzle:
Code:
#define i__(e,c) (((*(c+(char*)(e))-0x77)-1)&3)
#define SW2(src,ss,op,dest,ds) src[i__(#ss,0)] op dest[i__(#ds,0)]; \
                               src[i__(#ss,1)] op dest[i__(#ds,1)];
#define SW3(src,ss,op,dest,ds) SW2(src,ss,op,dest,ds) \
                               src[i__(#ss,2)] op dest[i__(#ds,2)];
#define SW4(src,ss,op,dest,ds) SW3(src,ss,op,dest,ds) \
                               src[i__(#ss,3)] op dest[i__(#ds,3)];
Now, use it like this:
Code:
Vec4 a(1,2,3,4);
Vec4 b(5,6,7,8);
SW4(a,xyzw,=,b,xxyy); // a.xyzw = b.xxyy; (a=5,5,6,6)
SW2(b,xy,+=,b,zz);    // b.xy += b.zz; (b=12,13,7,8)
The tricky part is the i__ macro. It converts "xyzw" characters to values 0,1,2 and 3. Then, I simply use those values to access vector elements one by one, and let the optimizer do the rest.
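
As a sketch of what i__ computes (not part of the original post), the same arithmetic can be written as a constexpr function. The name swizzle_index is made up for illustration, and the code assumes an ASCII character set and a C++11 compiler:

```cpp
#include <cassert>

// Same arithmetic as the i__ macro, applied to a single character.
// In ASCII: 'w' = 0x77, 'x' = 0x78, 'y' = 0x79, 'z' = 0x7A.
// Subtracting 0x78 maps x,y,z to 0,1,2 and w to -1; "& 3" wraps -1 to 3.
constexpr int swizzle_index(char c) {
    return ((c - 0x77) - 1) & 3;
}

static_assert(swizzle_index('x') == 0, "x maps to element 0");
static_assert(swizzle_index('y') == 1, "y maps to element 1");
static_assert(swizzle_index('z') == 2, "z maps to element 2");
static_assert(swizzle_index('w') == 3, "w wraps around to element 3");
```

Because of the final "& 3", any other character is also folded into the range 0..3 rather than rejected.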

Have fun!

Enlight.


Old 08-01-2008, 09:01 AM   #2
Nautilus
Valued Member
 
Join Date: Nov 2004
Location: Milan -ITALY-
Posts: 297
Default Re: Swizzle operator in C++ !

Within Leo's template:
Code:
int i = 255;
if(*(char*)&i & 255)
{
    // ...
}
else
{
    // ...
}
You may want to fix that.

Ciao ciao : )
___________________________________________
-Nautilus

1.551640271931635485e+1292913986 ?
Why, that's: 2 ^ ((2 ^ (2 ^ ((2 ^ 2) + (2 ^ (2 - 2))))) - (2 ^ (2 - 2))). Now verify, please...
Old 08-01-2008, 10:18 AM   #3
Reedbeta
DevMaster Staff
 
Join Date: Oct 2004
Location: Seattle, WA
Posts: 3,707
Default Re: Swizzle operator in C++ !

I believe that if statement is checking the endianness of the machine. You see it sets an integer to 255 and then checks its first byte, which will be 255 on a little-endian machine and 0 on a big-endian one.

However, the cast should really be to unsigned char, as signed char can't hold the value 255 (although bitwise-and may be ignoring signed/unsigned differences anyway).
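
A minimal sketch of the same endianness probe using unsigned char, as suggested above. The function name is_little_endian is hypothetical; it copies the first byte of the integer out with std::memcpy instead of casting the pointer:

```cpp
#include <cassert>
#include <cstring>

// Returns true when the lowest-addressed byte of an unsigned int holds its
// least significant byte (little-endian layout).
inline bool is_little_endian() {
    const unsigned int probe = 255u;
    unsigned char first_byte = 0;
    std::memcpy(&first_byte, &probe, 1);  // read only the first byte
    return first_byte == 255u;
}
```

With a signed char the byte 0xFF typically reads back as -1, and the "& 255" in Leo's code turns it into 255 again after integer promotion, so the original test works in practice; unsigned char just states the intent directly.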
___________________________________________
Currently working at Sucker Punch
reedbeta.com - OpenGL demos and other projects
Luabridge - a lightweight, dependency-free C++/Lua binding library.
CD Lite - an unobtrusive, minimal CD player application for Windows.
Old 08-01-2008, 11:05 AM   #4
Nautilus
Valued Member
 
Join Date: Nov 2004
Location: Milan -ITALY-
Posts: 297
Default Re: Swizzle operator in C++ !

I didn't notice. Right you are.

Ciao ciao : )
Old 08-01-2008, 11:46 AM   #5
.oisyn
DevMaster Staff
 
 
Join Date: Sep 2005
Location: The Netherlands
Posts: 1,442
Default Re: Swizzle operator in C++ !

Hmm, it looks a bit inefficient with the strings and all. For the second implementation, you should at least put the code of the macros inside a do {...} while(0) block. Otherwise you could get into serious problems when using things like if-statements without braces.
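
To illustrate the do {...} while(0) advice, here is a small sketch. COPY2 and copy_or_clear are hypothetical names, not the SW2 macro from the first post:

```cpp
#include <cassert>

// Hypothetical two-statement macro, wrapped so it behaves like one statement.
#define COPY2(dst, src)        \
    do {                       \
        (dst)[0] = (src)[0];   \
        (dst)[1] = (src)[1];   \
    } while (0)

// Without the wrapper, only the first assignment would belong to the `if`,
// and the `else` below would no longer compile.
inline void copy_or_clear(bool copy, float* dst, const float* src) {
    if (copy)
        COPY2(dst, src);
    else
        dst[0] = dst[1] = 0.0f;
}
```

The trailing semicolon is supplied by the caller, which is what lets the macro sit in an unbraced if/else like an ordinary statement.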

My version:
Code:
template<int X, int Y, int Z, int W> struct swizzle { };
template<int X> struct swizzle1 : public swizzle<X,X,X,X> { };
template<int X, int Y> struct swizzle2 : public swizzle<X,Y,Y,Y> { };
template<int X, int Y, int Z> struct swizzle3 : public swizzle<X,Y,Z,Z> { };

template<int X, int Y>
swizzle2<X,Y> operator,(swizzle1<X>, swizzle1<Y>) { return swizzle2<X,Y>(); }

template<int X, int Y, int Z>
swizzle3<X,Y,Z> operator,(swizzle2<X,Y>, swizzle1<Z>) { return swizzle3<X,Y,Z>(); }

template<int X, int Y, int Z, int W>
swizzle<X,Y,Z,W> operator,(swizzle3<X,Y,Z>, swizzle1<W>) { return swizzle<X,Y,Z,W>(); }

static swizzle1<0> _x;
static swizzle1<1> _y;
static swizzle1<2> _z;
static swizzle1<3> _w;

template<int X, int Y, int Z, int W> struct SwizzledVector;

struct Vector
{
    float x, y, z, w;

    Vector() { }
    Vector(float x, float y, float z, float w) : x(x), y(y), z(z), w(w) { }

    template<int X, int Y, int Z, int W>
    SwizzledVector<X,Y,Z,W> & operator[](swizzle<X,Y,Z,W>)
    {
        return reinterpret_cast<SwizzledVector<X,Y,Z,W>&>(*this);
    }

    template<int X, int Y, int Z, int W>
    const SwizzledVector<X,Y,Z,W> & operator[](swizzle<X,Y,Z,W>) const
    {
        return reinterpret_cast<const SwizzledVector<X,Y,Z,W>&>(*this);
    }
};

template<int X, int Y, int Z, int W, int I> struct SwizzleSelector;
template<int X, int Y, int Z, int W> struct SwizzleSelector<X,Y,Z,W,0> { static const int index = X; };
template<int X, int Y, int Z, int W> struct SwizzleSelector<X,Y,Z,W,1> { static const int index = Y; };
template<int X, int Y, int Z, int W> struct SwizzleSelector<X,Y,Z,W,2> { static const int index = Z; };
template<int X, int Y, int Z, int W> struct SwizzleSelector<X,Y,Z,W,3> { static const int index = W; };

template<int X, int Y, int Z, int W>
struct SwizzledVector
{
    template<int I> float & Get() { return reinterpret_cast<float*>(this)[SwizzleSelector<X,Y,Z,W,I>::index]; }
    template<int I> float Get() const { return const_cast<SwizzledVector&>(*this).Get<I>(); }
    template<int I> float & Get(swizzle1<I>) { return Get<I>(); }
    template<int I> float Get(swizzle1<I>) const { return Get<I>(); }

    SwizzledVector & operator=(const Vector & v)
    {
        Get<0>() = v.x;
        Get<1>() = v.y;
        Get<2>() = v.z;
        Get<3>() = v.w;
        return *this;
    }

    // The parameter uses X2..W2 (the post had X..W here, which left the
    // template parameters unused and undeducible).
    template<int X2, int Y2, int Z2, int W2>
    SwizzledVector & operator=(const SwizzledVector<X2,Y2,Z2,W2> & v)
    {
        Get<0>() = v.template Get<0>();
        Get<1>() = v.template Get<1>();
        Get<2>() = v.template Get<2>();
        Get<3>() = v.template Get<3>();
        return *this;
    }

    operator Vector() const { return Vector(Get<0>(), Get<1>(), Get<2>(), Get<3>()); }
};

// usage
int main()
{
    Vector a(1, 2, 3, 4);
    Vector b = a[_w,_y,_x,_w]; // (4, 2, 1, 4)
    Vector c = a[_z];          // (3, 3, 3, 3)
    c[_x,_z,_y,_w] = a;        // (1, 3, 2, 4)
    c[_y,_w] = a[_x];          // (*, 1, *, 1)
}

Of course you could generate combinations like _xxy etc. to get rid of the commas. The cool thing about this implementation is that, because it uses compile-time constants, you could even make use of compiler intrinsics to permute actual SSE registers.
___________________________________________
C++ addict
-
Currently working on: the 3D engine for Tomb Raider: Underworld and Deus Ex 3.
Old 08-01-2008, 12:28 PM   #6
Enlight
New Member
 
Join Date: Jun 2007
Location: Argentina
Posts: 25
Default Re: Swizzle operator in C++ !

Not inefficient at all. Check the assembler output. Write the code however you wish; the assembly output is perfect in both cases, although I haven't tried your code...
Old 08-01-2008, 12:33 PM   #7
leobenaducci
New Member
 
Join Date: Aug 2008
Posts: 3
Post Re: Swizzle operator in C++ !

I'm not sure why you say that; check the generated asm:

Code:
          __asm int 3
0040107C  int         3
          b.swz(xzy, a, zyx);
0040107D  fld         dword ptr [esp+1Ch]
00401081  fstp        dword ptr [esp+8]
00401085  fld         dword ptr [esp+18h]
00401089  fstp        dword ptr [esp+10h]
0040108D  fld         dword ptr [esp+14h]
00401091  fstp        dword ptr [esp+0Ch]
          __asm int 3
00401095  int         3

Do you think there is a faster way to do this?
Old 08-01-2008, 12:52 PM   #8
Shuank
New Member
 
Join Date: Aug 2008
Location: Argentina
Posts: 1
Default Re: Swizzle operator in C++ !

I like Leo's solution the most; it seems clearer for the programmer, at least for me (a noob), and more OO.

Greetings!
Old 08-01-2008, 01:07 PM   #9
Kenneth Gorking
Senior Member
 
 
Join Date: Aug 2004
Location: Århus, Denmark
Posts: 688
Default Re: Swizzle operator in C++ !

They both suffer from the fact that they rely on strings:

b.swz(aaa, a, bbb);
SW4(a,beer,=,b,good);

Both statements will compile just fine, and crash at runtime.
___________________________________________
"Stupid bug! You go squish now!!" - Homer Simpson
Old 08-01-2008, 01:16 PM   #10
leobenaducci
New Member
 
Join Date: Aug 2008
Posts: 3
Default Re: Swizzle operator in C++ !

Sorry, but both versions can be made to raise compile-time asserts.
Old 08-01-2008, 02:33 PM   #11
Reedbeta
DevMaster Staff
 
Join Date: Oct 2004
Location: Seattle, WA
Posts: 3,707
Default Re: Swizzle operator in C++ !

Kenneth's right; there's no compile-time protection against using letters outside the 'w' to 'z' range. Although with the Enlight version, due to the '& 3' in the macro, any other letters will get silently remapped into the 'w'-'z' range; that's slightly better than the Leo version, in which other letters will result in runtime out-of-bounds array accesses.

To fix this, you could add compile-time asserts that each character is within the expected range. Here is a bit of code to do a compile-time assert (from boost):
Code:
template <bool> struct STATIC_ASSERTION_FAILURE;
template <> struct STATIC_ASSERTION_FAILURE<true> {};

#define STATIC_ASSERT(f) \
    sizeof(STATIC_ASSERTION_FAILURE<(bool)(f)>);

(The real one is slightly more complicated, as it is designed to work outside of a function scope, but that gives you the idea.)
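
As a sketch of how such a check could look with a newer compiler (an assumption: C++14 or later, which did not exist when this thread was written), the built-in static_assert can validate the stringified swizzle directly. The names is_swizzle_char, is_valid_swizzle and CHECK_SWIZZLE are hypothetical:

```cpp
#include <cassert>
#include <cstddef>

// True for the four component letters; relies on 'w'..'z' being contiguous (ASCII).
constexpr bool is_swizzle_char(char c) {
    return c >= 'w' && c <= 'z';
}

// Checks every character of a string literal (excluding the trailing '\0').
template <std::size_t N>
constexpr bool is_valid_swizzle(const char (&s)[N]) {
    for (std::size_t i = 0; i + 1 < N; ++i) {
        if (!is_swizzle_char(s[i])) return false;
    }
    return N > 1;  // reject the empty string
}

// Stringifies the swizzle name, as the macros in this thread do,
// and rejects bad letters at compile time.
#define CHECK_SWIZZLE(name) \
    static_assert(is_valid_swizzle(#name), "invalid swizzle: " #name)

CHECK_SWIZZLE(xyzw);    // compiles
// CHECK_SWIZZLE(beer); // would stop the build with "invalid swizzle: beer"
```

For a three-component vector like Leo's, the accepted range would need to be narrowed to 'x'..'z'.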
Old 08-01-2008, 03:30 PM   #12
leobenaducci
New Member
 
Join Date: Aug 2008
Posts: 3
Default Re: Swizzle operator in C++ !

Here it is: out-of-bounds protection.

Code:
#define OUT_OF_BOUNDS(a, b, c) ((a)<b || (a)>c)

#define UN_OP(name, op) template<int srcSwz, int dstSwz> \
    __forceinline void _##name(LVector3 v) \
    { \
        const int _srcSwz = srcSwz; \
        const int _dstSwz = dstSwz; \
        const char *srcSwizzle = (char*) &_srcSwz; \
        const char *dstSwizzle = (char*) &_dstSwz; \
        \
        if(OUT_OF_BOUNDS(*(srcSwizzle+0), 'x', 'z')) \
            return; \
        if(OUT_OF_BOUNDS(*(dstSwizzle+0), 'x', 'z')) \
            return; \
        if(OUT_OF_BOUNDS(*(srcSwizzle+1), 'x', 'z')) \
            return; \
        if(OUT_OF_BOUNDS(*(dstSwizzle+1), 'x', 'z')) \
            return; \
        if(OUT_OF_BOUNDS(*(srcSwizzle+2), 'x', 'z')) \
            return; \
        if(OUT_OF_BOUNDS(*(dstSwizzle+2), 'x', 'z')) \
            return; \
        \
        int i = 255; \
        if(*(char*)&i & 255) \
        { \
            *(&x + (srcSwizzle[2] - 'x')) op *(&v.x + (dstSwizzle[2] - 'x')); \
            *(&x + (srcSwizzle[1] - 'x')) op *(&v.x + (dstSwizzle[1] - 'x')); \
            *(&x + (srcSwizzle[0] - 'x')) op *(&v.x + (dstSwizzle[0] - 'x')); \
        } \
        else \
        { \
            *(&x + (srcSwizzle[0] - 'x')) op *(&v.x + (dstSwizzle[0] - 'x')); \
            *(&x + (srcSwizzle[1] - 'x')) op *(&v.x + (dstSwizzle[1] - 'x')); \
            *(&x + (srcSwizzle[2] - 'x')) op *(&v.x + (dstSwizzle[2] - 'x')); \
        } \
    }

and the asm

Code:
          __asm int 3
0040103A  int         3
          b.mul(feb, a, zxy);
          __asm int 3
0040103B  int         3

Last edited by leobenaducci : 08-01-2008 at 03:34 PM.
Old 08-02-2008, 04:24 AM   #13
Kenneth Gorking
Senior Member
 
 
Join Date: Aug 2004
Location: Århus, Denmark
Posts: 688
Default Re: Swizzle operator in C++ !

Quote:
Originally Posted by leobenaducci
and the asm
Code:
          __asm int 3
0040103A  int         3
          b.mul(feb, a, zxy);
          __asm int 3
0040103B  int         3
Clearly something has gone wrong here...

Anyways, I tried to create some kind of swizzle support for a float4 class I was working on, but didn't have much luck. After seeing this thread, I decided to go back and give it another go, and finally succeeded. Here it is, in its entirety:


Code:
#pragma once
#include <xmmintrin.h>

struct __declspec(align(16)) float4
{
private:
    template <unsigned mask = ((3 << 6) | (2 << 4) | (1 << 2) | 0)>
    struct swizzle_proxy
    {
        __m128 &ref;

        swizzle_proxy(__m128 &ref) : ref(ref) { }

        __m128 get_swizzled() const { return _mm_shuffle_ps(ref, ref, mask); }

        swizzle_proxy& operator = (const float4 &other);

        template<unsigned other_mask>
        swizzle_proxy& operator = (const swizzle_proxy<other_mask> &other)
        {
            __m128 data = other.get_swizzled();
            ref = _mm_shuffle_ps(data, data, mask);
            return *this;
        }
    };

public:
    float4() { }
    float4(const float4 &other) : x(other.x) , y(other.y) , z(other.z) , w(other.w) { }
    explicit float4(const __m128 _xmm) : xmm(_xmm) { }
    float4(float a, float b, float c, float d) : x(a) , y(b) , z(c) , w(d) { }
    template<unsigned mask> float4(const swizzle_proxy<mask> &other) : xmm(other.get_swizzled()) { }

    float4 operator + (const float4 &other) const { return float4(_mm_add_ps(xmm, other.xmm)); }
    float4 operator - (const float4 &other) const { return float4(_mm_sub_ps(xmm, other.xmm)); }
    float4 operator * (const float4 &other) const { return float4(_mm_mul_ps(xmm, other.xmm)); }
    float4 operator / (const float4 &other) const { return float4(_mm_div_ps(xmm, other.xmm)); }
    float4 operator & (const float4 &other) const { return float4(_mm_and_ps(xmm, other.xmm)); }
    float4 operator | (const float4 &other) const { return float4(_mm_or_ps(xmm, other.xmm)); }
    float4 operator ^ (const float4 &other) const { return float4(_mm_xor_ps(xmm, other.xmm)); }
    float4 andnot(const float4 &other) const { return float4(_mm_andnot_ps(xmm, other.xmm)); } // "~this & other"

    float4 operator + (float f) const { return float4(_mm_add_ps(xmm, _mm_set_ps1(f))); }
    float4 operator - (float f) const { return float4(_mm_sub_ps(xmm, _mm_set_ps1(f))); }
    float4 operator * (float f) const { return float4(_mm_mul_ps(xmm, _mm_set_ps1(f))); }
    float4 operator / (float f) const { return float4(_mm_div_ps(xmm, _mm_set_ps1(f))); }
    float4 operator & (float f) const { return float4(_mm_and_ps(xmm, _mm_set_ps1(f))); }
    float4 operator | (float f) const { return float4(_mm_or_ps(xmm, _mm_set_ps1(f))); }
    float4 operator ^ (float f) const { return float4(_mm_xor_ps(xmm, _mm_set_ps1(f))); }
    float4 andnot(float f) const { return float4(_mm_andnot_ps(xmm, _mm_set_ps1(f))); } // "~this & f"

    template<unsigned mask> float4& operator = (const swizzle_proxy<mask> &other) { xmm = other.get_swizzled(); return *this; }
    float4& operator = (const float4 &other) { xmm = other.xmm; return *this; }

    float4& operator += (const float4 &other) { xmm = _mm_add_ps(xmm, other.xmm); return *this; }
    float4& operator -= (const float4 &other) { xmm = _mm_sub_ps(xmm, other.xmm); return *this; }
    float4& operator *= (const float4 &other) { xmm = _mm_mul_ps(xmm, other.xmm); return *this; }
    float4& operator /= (const float4 &other) { xmm = _mm_div_ps(xmm, other.xmm); return *this; }
    float4& operator &= (const float4 &other) { xmm = _mm_and_ps(xmm, other.xmm); return *this; }
    float4& operator |= (const float4 &other) { xmm = _mm_or_ps(xmm, other.xmm); return *this; }
    float4& operator ^= (const float4 &other) { xmm = _mm_xor_ps(xmm, other.xmm); return *this; }
    float4& andnot_asg(const float4 &other) { xmm = _mm_andnot_ps(xmm, other.xmm); return *this; } // "this = ~this & other"

    float4& operator += (float f) { xmm = _mm_add_ps(xmm, _mm_set_ps1(f)); return *this; }
    float4& operator -= (float f) { xmm = _mm_sub_ps(xmm, _mm_set_ps1(f)); return *this; }
    float4& operator *= (float f) { xmm = _mm_mul_ps(xmm, _mm_set_ps1(f)); return *this; }
    float4& operator /= (float f) { xmm = _mm_div_ps(xmm, _mm_set_ps1(f)); return *this; }
    float4& operator &= (float f) { xmm = _mm_and_ps(xmm, _mm_set_ps1(f)); return *this; }
    float4& operator |= (float f) { xmm = _mm_or_ps(xmm, _mm_set_ps1(f)); return *this; }
    float4& operator ^= (float f) { xmm = _mm_xor_ps(xmm, _mm_set_ps1(f)); return *this; }
    float4& andnot_asg(float f) { xmm = _mm_andnot_ps(xmm, _mm_set_ps1(f)); return *this; } // "this = ~this & f"

    // Scalar divided by vector: f / a per component (the post multiplied a by 1/f, which computes a / f).
    friend float4 operator / (float f, const float4 &a) { return float4(_mm_div_ps(_mm_set_ps1(f), a.xmm)); }

    friend float4 sqrt(const float4 &a) { return float4(_mm_sqrt_ps(a.xmm)); }
    friend float4 rcp(const float4 &a) { return float4(_mm_rcp_ps(a.xmm)); }
    friend float4 rsqrt(const float4 &a) { return float4(_mm_rsqrt_ps(a.xmm)); }
    friend float4 horizontal_add(const float4 &a) { return float4(_mm_add_ss(a.xmm,_mm_add_ss(_mm_shuffle_ps(a.xmm, a.xmm, 1),_mm_add_ss(_mm_shuffle_ps(a.xmm, a.xmm, 2),_mm_shuffle_ps(a.xmm, a.xmm, 3))))); }

    friend float4 min(const float4 &a, const float4 &b) { return float4(_mm_min_ps(a.xmm, b.xmm)); }
    friend float4 max(const float4 &a, const float4 &b) { return float4(_mm_max_ps(a.xmm, b.xmm)); }

    friend float4 dot(const float4 &a, const float4 &b) { return horizontal_add(a*b); }
    friend float4 length(const float4 &a) { return sqrt(dot(a, a)); }
    friend float4 rlength(const float4 &a) { return rsqrt(dot(a, a)); }
    friend float4 normalize(const float4 &a) { return a * rlength(a); }
    friend float4 distance(const float4 &a, const float4 &b) { return length(a-b); }
    friend float4 clamp(const float4 &x, const float4 &a, const float4 &b) { return max(a, min(b, x)); }

    friend float4 cross(const float4 &a, const float4 &b)
    {
        enum
        {
            shuf_yzxw = _MM_SHUFFLE(3, 0, 2, 1),
            shuf_zxyw = _MM_SHUFFLE(3, 1, 0, 2)
        };
        __m128 left  = _mm_mul_ps(_mm_shuffle_ps(a.xmm, a.xmm, shuf_yzxw), _mm_shuffle_ps(b.xmm, b.xmm, shuf_zxyw));
        __m128 right = _mm_mul_ps(_mm_shuffle_ps(a.xmm, a.xmm, shuf_zxyw), _mm_shuffle_ps(b.xmm, b.xmm, shuf_yzxw));
#if 0
        return float4(_mm_add_ps(_mm_set_ps(1.0f, 0.0f, 0.0f, 0.0f), _mm_sub_ps(left, right)));
#else
        return float4(_mm_sub_ps(left, right)); // .w equals zero
#endif
    }

    // NewtonRaphson Reciprocal
    // [2 * rcpps(a) - (a * rcpps(a) * rcpps(a))]
    friend float4 rcp_nr(const float4 &a)
    {
        float4 ra0 = rcp(a);
        return (ra0 + ra0) - (a * ra0 * ra0);
    }

    template<const unsigned a, const unsigned b, const unsigned c, const unsigned d>
    swizzle_proxy<(d << 6) | (c << 4) | (b << 2) | a> shuffle()
    {
        swizzle_proxy<(d << 6) | (c << 4) | (b << 2) | a> sw(xmm);
        return sw;
    }

public:
    union
    {
        struct { float x,y,z,w; };
        struct { float r,g,b,a; };
        __m128 xmm;
    };
};

template<unsigned mask>
float4::swizzle_proxy<mask>& float4::swizzle_proxy<mask>::operator = (const float4 &other)
{
    ref = _mm_shuffle_ps(other.xmm, other.xmm, mask);
    return *this;
}

// Test-defines
#define xyzw shuffle<0,1,2,3>()
#define wzyx shuffle<3,2,1,0>()
#define xyxy shuffle<0,1,0,1>()
#define yzyx shuffle<1,2,1,0>()
#define xxxx shuffle<0,0,0,0>()
#define yyyy shuffle<1,1,1,1>()
#define zzzz shuffle<2,2,2,2>()
#define wwww shuffle<3,3,3,3>()

Using the supplied defines, it is now possible to write code like this (which is pretty close to Cg):
Code:
float4 float4_test()
{
    float4 f1 = float4(1,2,3,4);
    printf("'f1' : %f, %f, %f, %f\n", f1.x, f1.y, f1.z, f1.w);

    float4 f3;
    f3.wzyx = f1;
    printf("'f3.wzyx = f1' : %f, %f, %f, %f\n", f3.x, f3.y, f3.z, f3.w);

    float4 f2 = f1.yyyy;
    printf("'f2 = f1.yyyy' : %f, %f, %f, %f\n", f2.x, f2.y, f2.z, f2.w);

    f2.wzyx = f3.xyxy;
    printf("'f2.wzyx = f3.xyxyx' : %f, %f, %f, %f\n", f2.x, f2.y, f2.z, f2.w);

    float4 f4 = f2 + f1.wzyx;
    f1 = f1.wzyx;
    printf("'f1.wzyx' : %f, %f, %f, %f\n", f1.x, f1.y, f1.z, f1.w);
    printf("'f2.xyzw' : %f, %f, %f, %f\n", f2.x, f2.y, f2.z, f2.w);
    printf("'f4 = f2 + f1.wzyx' : %f, %f, %f, %f\n", f4.x, f4.y, f4.z, f4.w);

    float4 f5 = f4 * f2.yzyx;
    printf("f5 = 'f4 * f2.yzyx' : %f, %f, %f, %f\n", f5.x, f5.y, f5.z, f5.w);

    return f5;
}

which results in the following output:
'f1' : 1.000000, 2.000000, 3.000000, 4.000000
'f3.wzyx = f1' : 4.000000, 3.000000, 2.000000, 1.000000
'f2 = f1.yyyy' : 2.000000, 2.000000, 2.000000, 2.000000
'f2.wzyx = f3.xyxyx' : 3.000000, 4.000000, 3.000000, 4.000000
'f1.wzyx' : 4.000000, 3.000000, 2.000000, 1.000000
'f2.xyzw' : 3.000000, 4.000000, 3.000000, 4.000000
'f4 = f2 + f1.wzyx' : 7.000000, 7.000000, 5.000000, 5.000000
f5 = 'f4 * f2.yzyx' : 28.000000, 21.000000, 20.000000, 15.000000
And finally, the generated assembly (without all the printf calls):
Code:
; 6 : float4 f1 = float4(1,2,3,4);

    fld1
    fstp    DWORD PTR _f1$[esp+16]
    fld     DWORD PTR __real@40000000
    fstp    DWORD PTR _f1$[esp+20]
    fld     DWORD PTR __real@40400000
    fstp    DWORD PTR _f1$[esp+24]
    fld     DWORD PTR __real@40800000
    fstp    DWORD PTR _f1$[esp+28]

; 7 : //printf("'f1' : %f, %f, %f, %f\n", f1.x, f1.y, f1.z, f1.w);
; 8 :
; 9 : float4 f3;
; 10 : f3.wzyx = f1;

    movaps  xmm1, XMMWORD PTR _f1$[esp+16]
    shufps  xmm1, xmm1, 27                  ; 0000001bH

; 11 : //printf("'f3.wzyx = f1' : %f, %f, %f, %f\n", f3.x, f3.y, f3.z, f3.w);
; 12 :
; 13 : float4 f2 = f1.yyyy;
; 14 : //printf("'f2 = f1.yyyy' : %f, %f, %f, %f\n", f2.x, f2.y, f2.z, f2.w);
; 15 :
; 16 : f2.wzyx = f3.xyxy;

    movaps  xmm0, xmm1
    shufps  xmm0, xmm1, 68                  ; 00000044H
    shufps  xmm0, xmm0, 27                  ; 0000001bH

; 17 : //printf("'f2.wzyx = f3.xyxyx' : %f, %f, %f, %f\n", f2.x, f2.y, f2.z, f2.w);
; 18 :
; 19 : float4 f4 = f2 + f1.wzyx;
; 20 : f1 = f1.wzyx;
; 21 : //printf("'f1.wzyx' : %f, %f, %f, %f\n", f1.x, f1.y, f1.z, f1.w);
; 22 : //printf("'f2.xyzw' : %f, %f, %f, %f\n", f2.x, f2.y, f2.z, f2.w);
; 23 : //printf("'f4 = f2 + f1.wzyx' : %f, %f, %f, %f\n", f4.x, f4.y, f4.z, f4.w);
; 24 :
; 25 : float4 f5 = f4 * f2.yzyx;

    movaps  xmm2, xmm0
    shufps  xmm2, xmm0, 25                  ; 00000019H
    addps   xmm1, xmm0
    mulps   xmm2, xmm1
    movaps  XMMWORD PTR [eax], xmm2

; 26 : //printf("f5 = 'f4 * f2.yzyx' : %f, %f, %f, %f\n", f5.x, f5.y, f5.z, f5.w);
; 27 :
; 28 : return f5;
; 29 : }
Last edited by Kenneth Gorking : 08-02-2008 at 06:49 AM.
Old 08-02-2008, 09:38 AM   #14
Kenneth Gorking
Senior Member
 
 
Join Date: Aug 2004
Location: Århus, Denmark
Posts: 688
Default Re: Swizzle operator in C++ !

A small addendum to the above: instead of the single 'shuffle' function, there should be two, a read-only version and a read-write version.

Code:
template<const unsigned a, const unsigned b, const unsigned c, const unsigned d>
swizzle_proxy<(d << 6) | (c << 4) | (b << 2) | a> rw_shuffle()
{
    swizzle_proxy<(d << 6) | (c << 4) | (b << 2) | a> sw(xmm);
    return sw;
}

template<const unsigned a, const unsigned b, const unsigned c, const unsigned d>
const swizzle_proxy<(d << 6) | (c << 4) | (b << 2) | a> ro_shuffle()
{
    swizzle_proxy<(d << 6) | (c << 4) | (b << 2) | a> sw(xmm);
    return sw;
}

...

#define xyzw rw_shuffle<0,1,2,3>()
#define wzyx rw_shuffle<3,2,1,0>()
#define xyxy ro_shuffle<0,1,0,1>()
#define yzyx ro_shuffle<1,2,1,0>()
#define xxxx ro_shuffle<0,0,0,0>()
#define yyyy ro_shuffle<1,1,1,1>()
#define zzzz ro_shuffle<2,2,2,2>()
#define wwww ro_shuffle<3,3,3,3>()

This way you can't accidentally perform operations like 'v1.xyxy = float4(1234)'
Old 08-03-2008, 01:04 AM   #15
Enlight
New Member
 
Join Date: Jun 2007
Location: Argentina
Posts: 25
Default Re: Swizzle operator in C++ !

Quote:
Originally Posted by Kenneth Gorking
Clearly something has gone wrong here...

No, it is fine! It should NOT compile anything because it has bad syntax:

Code:
b.mul(feb, a, zxy); // b.feb *= a.zxy;
what is "feb"?, nothing, so it doesn't do anything hehehe

Now, about your work on the swizzle operator, I'm just amazed, I didn't imagine someone would get *that* far...

I didn't try it out yet, but looks amazing, great work, specially for using 128 bit registers.
Old 08-03-2008, 05:10 AM   #16
Kenneth Gorking
Senior Member
 
 
Join Date: Aug 2004
Location: Århus, Denmark
Posts: 688
Default Re: Swizzle operator in C++ !

Quote:
Originally Posted by Enlight
No, it is fine! It should NOT compile anything because it has bad syntax:

Code:
b.mul(feb, a, zxy); // b.feb *= a.zxy;
what is "feb"?, nothing, so it doesn't do anything hehehe
Oh yeah, I missed that part
Instead of just doing nothing, maybe you should use a compile-time assert to alert the user to his mistake?

Quote:
Originally Posted by Enlight
Now, about your work on the swizzle operator, I'm just amazed, I didn't imagine someone would get *that* far...

I didn't try it out yet, but looks amazing, great work, specially for using 128 bit registers.
Thanks
Old 08-03-2008, 06:10 AM   #17
.oisyn
DevMaster Staff
 
 
Join Date: Sep 2005
Location: The Netherlands
Posts: 1,442
Default Re: Swizzle operator in C++ !

Btw, it's pretty pointless to make template integer arguments const
Old 08-07-2008, 09:39 AM   #18
Groove
New Member
 
 
Join Date: Mar 2006
Posts: 26
Default Re: Swizzle operator in C++ !

I have also implemented a swizzle operator in my math library (glm.g-truc.net).

My implementation is based on a helper class that contains only references.

It follows the GLSL syntax, so that we can write something like this:

vec4 v1(1, 2, 3, 4);
vec4 v2(1);
v2.yzx = v1.xyz + v1.yzx;
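
A minimal sketch of the reference-proxy idea, not GLM's actual code. Ref3, Vec3 and the member functions below are hypothetical names, and the swizzles are plain member function calls rather than the macro-wrapped GLSL syntax:

```cpp
#include <cassert>

struct Vec3;

// Proxy holding references to three components, in swizzled order.
struct Ref3 {
    float& a;
    float& b;
    float& c;
    Ref3(float& a_, float& b_, float& c_) : a(a_), b(b_), c(c_) {}
    Ref3& operator=(const Vec3& v);  // writes through to the referenced components
    operator Vec3() const;           // lets a proxy be read as a plain vector
};

struct Vec3 {
    float x, y, z;
    Vec3(float x_, float y_, float z_) : x(x_), y(y_), z(z_) {}

    Ref3 yzx() { return Ref3(y, z, x); }        // left-hand side: writable view
    Vec3 yzx() const { return Vec3(y, z, x); }  // right-hand side: swizzled copy
    Vec3 xyz() const { return *this; }
};

inline Ref3& Ref3::operator=(const Vec3& v) {
    a = v.x;
    b = v.y;
    c = v.z;
    return *this;
}

inline Ref3::operator Vec3() const { return Vec3(a, b, c); }

inline Vec3 operator+(const Vec3& l, const Vec3& r) {
    return Vec3(l.x + r.x, l.y + r.y, l.z + r.z);
}
```

With this sketch, `v2.yzx() = v1.xyz() + v1.yzx();` evaluates the right-hand side into a temporary Vec3 first and then writes it through the three references, so for v1 = (1, 2, 3) the result lands in v2 as y = 3, z = 5, x = 4.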

Here are some details of the implementation... it was annoying to do because of the #defines, like yzx, that wrap function calls. I have no SSE optimization for this yet.

I will definitely come back to this post to have a closer look at your implementations!

Enjoy:

Code:
namespace glm{
namespace detail{

    template <typename T>
    class _xref4
    {
    public:
        _xref4(T& x, T& y, T& z, T& w);
        _xref4<T>& operator= (const _xref4<T>& r);
        _xref4<T>& operator+=(const _xref4<T>& r);
        _xref4<T>& operator-=(const _xref4<T>& r);
        _xref4<T>& operator*=(const _xref4<T>& r);
        _xref4<T>& operator/=(const _xref4<T>& r);
        _xref4<T>& operator= (const _xvec4<T>& v);
        _xref4<T>& operator+=(const _xvec4<T>& v);
        _xref4<T>& operator-=(const _xvec4<T>& v);
        _xref4<T>& operator*=(const _xvec4<T>& v);
        _xref4<T>& operator/=(const _xvec4<T>& v);
        T& x;
        T& y;
        T& z;
        T& w;
    };

} //namespace detail
} //namespace glm

namespace glm{
namespace detail{

    template <typename T>
    class _cvec4
    {
    public:
        typedef T value_type;
        typedef int size_type;
        static const size_type value_size;

        ////////////////
        // Components //
#ifndef GLM_SINGLE_COMP_NAME
#if GLM_FORCE_HALF_COMPATIBILITY || (defined(GLM_COMPILER) && (GLM_COMPILER & GLM_COMPILER_VC) && (GLM_COMPILER <= GLM_COMPILER_VC71))
        union
        {
            struct{T x, y, z, w;};
            struct{T r, g, b, a;};
            struct{T s, t, p, q;};
        };
#else
        union{T x, r, s;};
        union{T y, g, t;};
        union{T z, b, p;};
        union{T w, a, q;};
#endif
#else
        T x, y;
#endif//GLM_SINGLE_COMP_NAME
        // Components //
        ////////////////

        const T* _address() const{return (T*)(this);}
        T* _address(){return (T*)(this);}

        // Constructor
        _cvec4(){}
        _cvec4(const T x, const T y, const T z, const T w);

        // Accesses
        T& operator[](size_type i);
        T operator[](size_type i) const;

#if (!defined(GLM_AUTO_CAST) || (GLM_AUTO_CAST == GLM_ENABLE))
        operator T*();
        operator const T*() const;
#endif//GLM_AUTO_CAST

#if defined(GLM_SWIZZLE)
        // Left hand side 2 components common swizzle operators
        _xref2<T> _yx();
        _xref2<T> _zx();
        _xref2<T> _wx();
        _xref2<T> _xy();
        _xref2<T> _zy();
        _xref2<T> _wy();
        _xref2<T> _xz();
        _xref2<T> _yz();
        _xref2<T> _wz();
        _xref2<T> _xw();
        _xref2<T> _yw();
        _xref2<T> _zw();

        // Right hand side 2 components common swizzle operators
        const _xvec2<T> _xx() const;
        const _xvec2<T> _yx() const;
        const _xvec2<T> _zx() const;
const _xvec2<T> _wx() const; const _xvec2<T> _xy() const; const _xvec2<T> _yy() const; const _xvec2<T> _zy() const; const _xvec2<T> _wy() const; const _xvec2<T> _xz() const; const _xvec2<T> _yz() const; const _xvec2<T> _zz() const; const _xvec2<T> _wz() const; const _xvec2<T> _xw() const; const _xvec2<T> _yw() const; const _xvec2<T> _zw() const; const _xvec2<T> _ww() const; // Left hand side 3 components common swizzle operators _xref3<T> _zyx(); _xref3<T> _wyx(); _xref3<T> _yzx(); _xref3<T> _wzx(); _xref3<T> _ywx(); _xref3<T> _zwx(); _xref3<T> _zxy(); _xref3<T> _wxy(); _xref3<T> _xzy(); _xref3<T> _wzy(); _xref3<T> _xwy(); _xref3<T> _zwy(); _xref3<T> _yxz(); _xref3<T> _wxz(); _xref3<T> _xyz(); _xref3<T> _wyz(); _xref3<T> _xwz(); _xref3<T> _ywz(); _xref3<T> _yxw(); _xref3<T> _zxw(); _xref3<T> _xyw(); _xref3<T> _zyw(); _xref3<T> _xzw(); _xref3<T> _yzw(); // Right hand side 3 components common swizzle operators const _xvec3<T> _xxx() const; const _xvec3<T> _yxx() const; const _xvec3<T> _zxx() const; const _xvec3<T> _wxx() const; const _xvec3<T> _xyx() const; const _xvec3<T> _yyx() const; const _xvec3<T> _zyx() const; const _xvec3<T> _wyx() const; const _xvec3<T> _xzx() const; const _xvec3<T> _yzx() const; const _xvec3<T> _zzx() const; const _xvec3<T> _wzx() const; const _xvec3<T> _xwx() const; const _xvec3<T> _ywx() const; const _xvec3<T> _zwx() const; const _xvec3<T> _wwx() const; const _xvec3<T> _xxy() const; const _xvec3<T> _yxy() const; const _xvec3<T> _zxy() const; const _xvec3<T> _wxy() const; const _xvec3<T> _xyy() const; const _xvec3<T> _yyy() const; const _xvec3<T> _zyy() const; const _xvec3<T> _wyy() const; const _xvec3<T> _xzy() const; const _xvec3<T> _yzy() const; const _xvec3<T> _zzy() const; const _xvec3<T> _wzy() const; const _xvec3<T> _xwy() const; const _xvec3<T> _ywy() const; const _xvec3<T> _zwy() const; const _xvec3<T> _wwy() const; const _xvec3<T> _xxz() const; const _xvec3<T> _yxz() const; const _xvec3<T> _zxz() const; const _xvec3<T> _wxz() 
const; const _xvec3<T> _xyz() const; const _xvec3<T> _yyz() const; const _xvec3<T> _zyz() const; const _xvec3<T> _wyz() const; const _xvec3<T> _xzz() const; const _xvec3<T> _yzz() const; const _xvec3<T> _zzz() const; const _xvec3<T> _wzz() const; const _xvec3<T> _xwz() const; const _xvec3<T> _ywz() const; const _xvec3<T> _zwz() const; const _xvec3<T> _wwz() const; const _xvec3<T> _xxw() const; const _xvec3<T> _yxw() const; const _xvec3<T> _zxw() const; const _xvec3<T> _wxw() const; const _xvec3<T> _xyw() const; const _xvec3<T> _yyw() const; const _xvec3<T> _zyw() const; const _xvec3<T> _wyw() const; const _xvec3<T> _xzw() const; const _xvec3<T> _yzw() const; const _xvec3<T> _zzw() const; const _xvec3<T> _wzw() const; const _xvec3<T> _xww() const; const _xvec3<T> _yww() const; const _xvec3<T> _zww() const; const _xvec3<T> _www() const; // Left hand side 4 components common swizzle operators _xref4<T> _wzyx(); _xref4<T> _zwyx(); _xref4<T> _wyzx(); _xref4<T> _ywzx(); _xref4<T> _zywx(); _xref4<T> _yzwx(); _xref4<T> _wzxy(); _xref4<T> _zwxy(); _xref4<T> _wxzy(); _xref4<T> _xwzy(); _xref4<T> _zxwy(); _xref4<T> _xzwy(); _xref4<T> _wyxz(); _xref4<T> _ywxz(); _xref4<T> _wxyz(); _xref4<T> _xwyz(); _xref4<T> _yxwz(); _xref4<T> _xywz(); _xref4<T> _zyxw(); _xref4<T> _yzxw(); _xref4<T> _zxyw(); _xref4<T> _xzyw(); _xref4<T> _yxzw(); _xref4<T> _xyzw(); // Right hand side 4 components common swizzle operators const _xvec4<T> _xxxx() const; const _xvec4<T> _yxxx() const; const _xvec4<T> _zxxx() const; const _xvec4<T> _wxxx() const; const _xvec4<T> _xyxx() const; const _xvec4<T> _yyxx() const; const _xvec4<T> _zyxx() const; const _xvec4<T> _wyxx() const; const _xvec4<T> _xzxx() const; const _xvec4<T> _yzxx() const; const _xvec4<T> _zzxx() const; const _xvec4<T> _wzxx() const; const _xvec4<T> _xwxx() const; const _xvec4<T> _ywxx() const; const _xvec4<T> _zwxx() const; const _xvec4<T> _wwxx() const; const _xvec4<T> _xxyx() const; const _xvec4<T> _yxyx() const; const _xvec4<T> _zxyx() 
const; const _xvec4<T> _wxyx() const; const _xvec4<T> _xyyx() const; const _xvec4<T> _yyyx() const; const _xvec4<T> _zyyx() const; const _xvec4<T> _wyyx() const; const _xvec4<T> _xzyx() const; const _xvec4<T> _yzyx() const; const _xvec4<T> _zzyx() const; const _xvec4<T> _wzyx() const; const _xvec4<T> _xwyx() const; const _xvec4<T> _ywyx() const; const _xvec4<T> _zwyx() const; const _xvec4<T> _wwyx() const; const _xvec4<T> _xxzx() const; const _xvec4<T> _yxzx() const; const _xvec4<T> _zxzx() const; const _xvec4<T> _wxzx() const; const _xvec4<T> _xyzx() const; const _xvec4<T> _yyzx() const; const _xvec4<T> _zyzx() const; const _xvec4<T> _wyzx() const; const _xvec4<T> _xzzx() const; const _xvec4<T> _yzzx() const; const _xvec4<T> _zzzx() const; const _xvec4<T> _wzzx() const; const _xvec4<T> _xwzx() const; const _xvec4<T> _ywzx() const; const _xvec4<T> _zwzx() const; const _xvec4<T> _wwzx() const; const _xvec4<T> _xxwx() const; const _xvec4<T> _yxwx() const; const _xvec4<T> _zxwx() const; const _xvec4<T> _wxwx() const; const _xvec4<T> _xywx() const; const _xvec4<T> _yywx() const; const _xvec4<T> _zywx() const; const _xvec4<T> _wywx() const; const _xvec4<T> _xzwx() const; const _xvec4<T> _yzwx() const; const _xvec4<T> _zzwx() const; const _xvec4<T> _wzwx() const; const _xvec4<T> _xwwx() const; const _xvec4<T> _ywwx() const; const _xvec4<T> _zwwx() const; const _xvec4<T> _wwwx() const; const _xvec4<T> _xxxy() const; const _xvec4<T> _yxxy() const; const _xvec4<T> _zxxy() const; const _xvec4<T> _wxxy() const; const _xvec4<T> _xyxy() const; const _xvec4<T> _yyxy() const; const _xvec4<T> _zyxy() const; const _xvec4<T> _wyxy() const; const _xvec4<T> _xzxy() const; const _xvec4<T> _yzxy() const; const _xvec4<T> _zzxy() const; const _xvec4<T> _wzxy() const; const _xvec4<T> _xwxy() const; const _xvec4<T> _ywxy() const; const _xvec4<T> _zwxy() const; const _xvec4<T> _wwxy() const; const _xvec4<T> _xxyy() const; const _xvec4<T> _yxyy() const; const _xvec4<T> _zxyy() const; const 
_xvec4<T> _wxyy() const; const _xvec4<T> _xyyy() const; const _xvec4<T> _yyyy() const; const _xvec4<T> _zyyy() const; const _xvec4<T> _wyyy() const; const _xvec4<T> _xzyy() const; const _xvec4<T> _yzyy() const; const _xvec4<T> _zzyy() const; const _xvec4<T> _wzyy() const; const _xvec4<T> _xwyy() const; const _xvec4<T> _ywyy() const; const _xvec4<T> _zwyy() const; const _xvec4<T> _wwyy() const; const _xvec4<T> _xxzy() const; const _xvec4<T> _yxzy() const; const _xvec4<T> _zxzy() const; const _xvec4<T> _wxzy() const; const _xvec4<T> _xyzy() const; const _xvec4<T> _yyzy() const; const _xvec4<T> _zyzy() const; const _xvec4<T> _wyzy() const; const _xvec4<T> _xzzy() const; const _xvec4<T> _yzzy() const; const _xvec4<T> _zzzy() const; const _xvec4<T> _wzzy() const; const _xvec4<T> _xwzy() const; const _xvec4<T> _ywzy() const; const _xvec4<T> _zwzy() const; const _xvec4<T> _wwzy() const; const _xvec4<T> _xxwy() const; const _xvec4<T> _yxwy() const; const _xvec4<T> _zxwy() const; const _xvec4<T> _wxwy() const; const _xvec4<T> _xywy() const; const _xvec4<T> _yywy() const; const _xvec4<T> _zywy() const; const _xvec4<T> _wywy() const; const _xvec4<T> _xzwy() const; const _xvec4<T> _yzwy() const; const _xvec4<T> _zzwy() const; const _xvec4<T> _wzwy() const; const _xvec4<T> _xwwy() const; const _xvec4<T> _ywwy() const; const _xvec4<T> _zwwy() const; const _xvec4<T> _wwwy() const; const _xvec4<T> _xxxz() const; const _xvec4<T> _yxxz() const; const _xvec4<T> _zxxz() const; const _xvec4<T> _wxxz() const; const _xvec4<T> _xyxz() const; const _xvec4<T> _yyxz() const; const _xvec4<T> _zyxz() const; const _xvec4<T> _wyxz() const; const _xvec4<T> _xzxz() const; const _xvec4<T> _yzxz() const; const _xvec4<T> _zzxz() const; const _xvec4<T> _wzxz() const; const _xvec4<T> _xwxz() const; const _xvec4<T> _ywxz() const; const _xvec4<T> _zwxz() const; const _xvec4<T> _wwxz() const; const _xvec4<T> _xxyz() const; const _xvec4<T> _yxyz() const; const _xvec4<T> _zxyz() const; const _xvec4<T> 
_wxyz() const; const _xvec4<T> _xyyz() const; const _xvec4<T> _yyyz() const; const _xvec4<T> _zyyz() const; const _xvec4<T> _wyyz() const; const _xvec4<T> _xzyz() const; const _xvec4<T> _yzyz() const; const _xvec4<T> _zzyz() const; const _xvec4<T> _wzyz() const; const _xvec4<T> _xwyz() const; const _xvec4<T> _ywyz() const; const _xvec4<T> _zwyz() const; const _xvec4<T> _wwyz() const; const _xvec4<T> _xxzz() const; const _xvec4<T> _yxzz() const; const _xvec4<T> _zxzz() const; const _xvec4<T> _wxzz() const; const _xvec4<T> _xyzz() const; const _xvec4<T> _yyzz() const; const _xvec4<T> _zyzz() const; const _xvec4<T> _wyzz() const; const _xvec4<T> _xzzz() const; const _xvec4<T> _yzzz() const; const _xvec4<T> _zzzz() const; const _xvec4<T> _wzzz() const; const _xvec4<T> _xwzz() const; const _xvec4<T> _ywzz() const; const _xvec4<T> _zwzz() const; const _xvec4<T> _wwzz() const; const _xvec4<T> _xxwz() const; const _xvec4<T> _yxwz() const; const _xvec4<T> _zxwz() const; const _xvec4<T> _wxwz() const; const _xvec4<T> _xywz() const; const _xvec4<T> _yywz() const; const _xvec4<T> _zywz() const; const _xvec4<T> _wywz() const; const _xvec4<T> _xzwz() const; const _xvec4<T> _yzwz() const; const _xvec4<T> _zzwz() const; const _xvec4<T> _wzwz() const; const _xvec4<T> _xwwz() const; const _xvec4<T> _ywwz() const; const _xvec4<T> _zwwz() const; const _xvec4<T> _wwwz() const; const _xvec4<T> _xxxw() const; const _xvec4<T> _yxxw() const; const _xvec4<T> _zxxw() const; const _xvec4<T> _wxxw() const; const _xvec4<T> _xyxw() const; const _xvec4<T> _yyxw() const; const _xvec4<T> _zyxw() const; const _xvec4<T> _wyxw() const; const _xvec4<T> _xzxw() const; const _xvec4<T> _yzxw() const; const _xvec4<T> _zzxw() const; const _xvec4<T> _wzxw() const; const _xvec4<T> _xwxw() const; const _xvec4<T> _ywxw() const; const _xvec4<T> _zwxw() const; const _xvec4<T> _wwxw() const; const _xvec4<T> _xxyw() const; const _xvec4<T> _yxyw() const; const _xvec4<T> _zxyw() const; const _xvec4<T> _wxyw() const; 
const _xvec4<T> _xyyw() const; const _xvec4<T> _yyyw() const; const _xvec4<T> _zyyw() const; const _xvec4<T> _wyyw() const; const _xvec4<T> _xzyw() const; const _xvec4<T> _yzyw() const; const _xvec4<T> _zzyw() const; const _xvec4<T> _wzyw() const; const _xvec4<T> _xwyw() const; const _xvec4<T> _ywyw() const; const _xvec4<T> _zwyw() const; const _xvec4<T> _wwyw() const; const _xvec4<T> _xxzw() const; const _xvec4<T> _yxzw() const; const _xvec4<T> _zxzw() const; const _xvec4<T> _wxzw() const; const _xvec4<T> _xyzw() const; const _xvec4<T> _yyzw() const; const _xvec4<T> _zyzw() const; const _xvec4<T> _wyzw() const; const _xvec4<T> _xzzw() const; const _xvec4<T> _yzzw() const; const _xvec4<T> _zzzw() const; const _xvec4<T> _wzzw() const; const _xvec4<T> _xwzw() const; const _xvec4<T> _ywzw() const; const _xvec4<T> _zwzw() const; const _xvec4<T> _wwzw() const; const _xvec4<T> _xxww() const; const _xvec4<T> _yxww() const; const _xvec4<T> _zxww() const; const _xvec4<T> _wxww() const; const _xvec4<T> _xyww() const; const _xvec4<T> _yyww() const; const _xvec4<T> _zyww() const; const _xvec4<T> _wyww() const; const _xvec4<T> _xzww() const; const _xvec4<T> _yzww() const; const _xvec4<T> _zzww() const; const _xvec4<T> _wzww() const; const _xvec4<T> _xwww() const; const _xvec4<T> _ywww() const; const _xvec4<T> _zwww() const; const _xvec4<T> _wwww() const; #endif// defined(GLM_SWIZZLE) }; } //namespace detail } //namespace glm #include "_cvec4.inl" #endif//glm_core_cvec4
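
A side note on the lists in the header above: the left hand side lists are much shorter than the right hand side ones because a swizzle that repeats a component (for example `xx`) cannot sensibly be assigned to, which is also the rule GLSL uses. The following is only a minimal sketch that reproduces the counts; `swizzleNames` is a hypothetical helper written for this illustration and is not part of GLM.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of GLM): builds every swizzle name of the
// given length over the components "xyzw". When distinctOnly is true, names
// that repeat a component (e.g. "xx") are skipped, which is the rule the
// left hand side lists follow.
std::vector<std::string> swizzleNames(int length, bool distinctOnly)
{
    const std::string components = "xyzw";
    std::vector<std::string> names(1, std::string());
    for (int i = 0; i < length; ++i) {
        std::vector<std::string> next;
        for (const std::string& prefix : names) {
            for (char c : components) {
                // Skip a repeated component when building assignable swizzles.
                if (distinctOnly && prefix.find(c) != std::string::npos)
                    continue;
                next.push_back(prefix + c);
            }
        }
        names.swap(next);
    }
    return names;
}
```

With this helper, `swizzleNames(3, true)` yields 24 names and `swizzleNames(3, false)` yields 64, matching the 3 component lists; for 4 components the counts are 24 and 256. The order of the generated names is not the same as the order used in the header.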
___________________________________________
www.g-truc.net - glm.g-truc.net

Last edited by Reedbeta : 08-07-2008 at 10:09 AM.
Old 08-07-2008, 10:09 AM   #19
Reedbeta
DevMaster Staff
 
Join Date: Oct 2004
Location: Seattle, WA
Posts: 3,707
Default Re: Swizzle operator in C++ !

Groove, thanks for your post, but please use the [code]...[/code] tags.
___________________________________________
Currently working at Sucker Punch
reedbeta.com - OpenGL demos and other projects
Luabridge - a lightweight, dependency-free C++/Lua binding library.
CD Lite - an unobtrusive, minimal CD player application for Windows.
Old 08-12-2008, 10:28 AM   #20
Groove
New Member
 
Join Date: Mar 2006
Posts: 26
Default Re: Swizzle operator in C++ !

Sorry about that, I'll try to remember next time!
Thanks!
Old 09-08-2008, 07:38 PM   #21
WizardOfOzzz
New Member
 
Join Date: Sep 2008
Location: Vancouver
Posts: 1
Default Re: Swizzle operator in C++ !

Hi Groove,

Wouldn't using the intermediate references cause pointers to be created and extra overhead in the assembly code?

The GLM library looks awesome! Keep up the good work.

Eric
Old 09-08-2008, 08:10 PM   #22
Groove
New Member
 
Join Date: Mar 2006
Posts: 26
Default Re: Swizzle operator in C++ !

Some people were already worried about that, but a compiler that supports cross-function optimizations can simply eliminate all of those references.

Have a look at this:

vec2 v1(1.0);
vec2 v2(2.0);
vec2 v3(3.0);

The following is just the asm code for this line:
v1.xy = vec2(v2.xy) + vec2(v3.xy);

GCC 3.4.5:
Code:
0x00401354 <main+100>: mov %ebx,0xffffffc0(%ebp)
0x00401357 <main+103>: lea 0xffffffec(%ebp),%eax
0x0040135a <main+106>: mov %eax,0xffffffc4(%ebp)
0x0040135d <main+109>: mov 0xffffffc0(%ebp),%eax
0x00401360 <main+112>: mov %esi,0xffffffb0(%ebp)
0x00401363 <main+115>: mov 0xffffffc4(%ebp),%edx
0x00401366 <main+118>: mov %eax,0xffffffb8(%ebp)
0x00401369 <main+121>: lea 0xffffffdc(%ebp),%eax
0x0040136c <main+124>: mov %eax,0xffffffb4(%ebp)
0x0040136f <main+127>: mov 0xffffffb0(%ebp),%eax
0x00401372 <main+130>: mov %edx,0xffffffbc(%ebp)
0x00401375 <main+133>: mov 0xffffffb4(%ebp),%edx
0x00401378 <main+136>: mov %eax,0xffffffa8(%ebp)
0x0040137b <main+139>: mov 0xffffffa8(%ebp),%eax
0x0040137e <main+142>: mov %edx,0xffffffac(%ebp)
0x00401381 <main+145>: flds (%eax)
0x00401383 <main+147>: mov 0xffffffb8(%ebp),%eax
0x00401386 <main+150>: fadds (%eax)
0x00401388 <main+152>: mov 0xffffffac(%ebp),%eax
0x0040138b <main+155>: flds (%eax)
0x0040138d <main+157>: mov 0xffffffbc(%ebp),%eax
0x00401390 <main+160>: fadds (%eax)
0x00401392 <main+162>: fxch %st(1)
0x00401394 <main+164>: fsts 0xffffffc8(%ebp)

GCC 4.3.0:
Code:
0040809F mov eax,DWORD PTR [ebp+12]
004080A2 mov ecx,DWORD PTR [ebp+16]
004080A5 mov edx,DWORD PTR [ebp+20]
004080A8 fld DWORD PTR [edx+4]
004080AB fadd DWORD PTR [ecx+4]
004080AE fld DWORD PTR [edx]
004080B0 fadd DWORD PTR [ecx]
004080B2 fstp DWORD PTR [eax]
004080B4 fstp DWORD PTR [eax+4]

GCC 4.3.0 with SSE:

Code:
00408123 mov edx,DWORD PTR [ebp+20]
00408126 mov ecx,DWORD PTR [ebp+16]
00408129 mov eax,DWORD PTR [ebp+12]
0040812C movss xmm1,DWORD PTR [edx+4]
00408131 movss xmm0,DWORD PTR [edx]
00408135 addss xmm1,DWORD PTR [ecx+4]
0040813A addss xmm0,DWORD PTR [ecx]
0040813E movss DWORD PTR [eax+4],xmm1
00408143 movss DWORD PTR [eax],xmm0

VC8:
Code:
mov eax, DWORD PTR _b$[esp-4]
fld DWORD PTR [eax]
fld DWORD PTR [eax+4]
mov eax, DWORD PTR _a$[esp-4]
fld DWORD PTR [eax+4]
fld DWORD PTR [eax]
mov eax, DWORD PTR ___$ReturnUdt$[esp-4]
faddp ST(3), ST(0)
fxch ST(2)
fstp DWORD PTR [eax]
faddp ST(1), ST(0)
fstp DWORD PTR [eax+4]

VC8 with SSE:

Code:
mov eax, DWORD PTR _b$[esp-4]
movss xmm2, DWORD PTR [eax]
movss xmm3, DWORD PTR [eax+4]
mov eax, DWORD PTR _a$[esp-4]
movss xmm0, DWORD PTR [eax]
movss xmm1, DWORD PTR [eax+4]
mov eax, DWORD PTR ___$ReturnUdt$[esp-4]
addss xmm0, xmm2
addss xmm1, xmm3
movss DWORD PTR [eax], xmm0
movss DWORD PTR [eax+4], xmm1

GCC 3.4.5 shows the problem you point out, but GCC 3.x didn't support cross-function optimizations. They have been available since GCC 4.1, I think, and maybe partly in GCC 4.0. Visual Studio has supported them for ages under the name "whole program optimization", maybe even since Visual C++ 6, but I'm not sure.
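
For anyone who wants to reproduce the comparison, here is a minimal sketch of the kind of code being measured. `Ref2` and `Vec2` are hypothetical stand-ins written for this post, not the real GLM classes, and `addSwizzled` / `addDirect` are made-up names. The idea is to compile both functions with optimizations on and compare the generated assembly.

```cpp
// Hypothetical stand-ins for illustration only; NOT the real GLM types.
struct Vec2;

// Left hand side proxy: stores references to the two selected components.
struct Ref2 {
    float& a;
    float& b;
    Ref2(float& a_, float& b_) : a(a_), b(b_) {}
    Ref2& operator=(const Vec2& v);  // writes through to the original vector
};

struct Vec2 {
    float x, y;
    Vec2(float x_, float y_) : x(x_), y(y_) {}
    Vec2(const Ref2& r) : x(r.a), y(r.b) {}  // copies the referenced values
    Ref2 xy() { return Ref2(x, y); }
    Ref2 yx() { return Ref2(y, x); }
};

inline Ref2& Ref2::operator=(const Vec2& v)
{
    a = v.x;
    b = v.y;
    return *this;
}

inline Vec2 operator+(const Vec2& l, const Vec2& r)
{
    return Vec2(l.x + r.x, l.y + r.y);
}

// Same shape as the measured line: v1.xy = vec2(v2.xy) + vec2(v3.xy);
void addSwizzled(Vec2& v1, Vec2& v2, Vec2& v3)
{
    v1.xy() = Vec2(v2.xy()) + Vec2(v3.xy());
}

// Hand-written equivalent to compare the assembly against.
void addDirect(Vec2& v1, const Vec2& v2, const Vec2& v3)
{
    v1.x = v2.x + v3.x;
    v1.y = v2.y + v3.y;
}
```

If the compiler inlines everything, the `Ref2` temporaries should never need to exist in memory, so both functions would be expected to compile to roughly the same loads, adds and stores. Whether that actually happens depends on the compiler and the optimization flags, as the GCC 3.4.5 listing shows.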

I'm doing my best to keep GLM going, and GLM 0.8.x will be a good step in that direction: GLSL 1.30 support, of course, but also a lot of internal improvements.
All times are GMT -7.