DevMaster.net Forums
[[ Home | Forums | 3D Engines Database | Wiki | Articles/Tutorials | Game Dev Jobs | IRC Chat Network | Contact Us ]]

Go Back   DevMaster.net Forums > Site Discussions > Code & Snapshot Discussion
User Name
Password
Register FAQ Members List Search Today's Posts Mark Forums Read

Reply
 
Thread Tools Search this Thread Display Modes
Old 11-24-2007, 08:00 AM   #1
Nick
Senior Member
 
Join Date: Aug 2004
Location: Ghent, Belgium
Posts: 1,056
Default

Hi all,

Let me start with a famous qoute from Butler Lampson: "All problems in computer science can be solved by another level of indirection". This is quite true when building complex systems and forms the basis of abstraction. Unfortunately, when performance and memory usage are of primary concern his words often have to be appended with: "Except the problem of having too many levels of indirection"...

One such situations is a (Pascal) string class. To create variable-length strings with an explicit size field (for fast processing), we would normally start with something like this:
Code:
struct PascalString { PascalString(int newSize) : size(newSize) { string = new char[size + 1]; } short size; char *string; };
Note however that if we want to access the actual characters, 'string' needs to be dereferenced to get the actual array. Also, this PascalString struct is 6 bytes*, plus the size of the string itself. And because of memory management these two allocations can actually consume even more memory.

What we really want is the string to follow directly after the 'size' field. That way there would be no extra pointer dereferencing, and it can be one allocation with just 2 bytes extra.

This can be achieved with placement new. Placement new takes an extra parameter, the pre-allocated memory location where to construct the object:
Code:
struct PascalString { static PascalString *newString(int newSize) // Factory method { char *buffer = new char[sizeof(size) + newSize + 1]; // Allocate flat memory for size field, string, and closing 0 return new (buffer) PascalString(newSize); // Placement new } short size; char string[1]; private: PascalString(short newSize) : size(newSize) { } };
Note that with templates this is easy to extend to any kind of arrays. Personally I use it for a class that has multiple variable size arrays, with variable size elements. This previously required several pointer indirections, noticable affecting performance and memory usage. Modern CPU's really don't like 'pointer chasing', and loading a whole cache line for just a pointer quickly evicts more important data.

So I hope this inspires you to create more efficient data structures too. One word of caution though: don't blindly apply this trick. Profile before you optimize!

Cheers,

Nicolas

(*) Assuming a 32-bit environment. It's even worse with 64-bit...

Last edited by Nick : 11-25-2007 at 04:28 AM.
Nick is offline   Reply With Quote
Old 11-24-2007, 12:30 PM   #2
Reedbeta
DevMaster Staff
 
Join Date: Oct 2004
Location: Seattle, WA
Posts: 3,707
Default Re: C++ Pascal strings with placement new

Nice trick, but shouldn't the factory function be static?
___________________________________________
Currently working at Sucker Punch
reedbeta.com - OpenGL demos and other projects
Luabridge - a lightweight, dependency-free C++/Lua binding library.
CD Lite - an unobtrusive, minimal CD player application for Windows.
Reedbeta is offline   Reply With Quote
Old 11-24-2007, 03:57 PM   #3
.oisyn
DevMaster Staff
 
.oisyn's Avatar
 
Join Date: Sep 2005
Location: The Netherlands
Posts: 1,442
Default Re: C++ Pascal strings with placement new

Why not create a placement new that takes the size of the string?
Code:
struct PascalString { void * operator new(size_t, size_t stringSize) { PascalString * s = reinterpret_cast<PascalString*>(new char[stringSize + sizeof(short) + 1]); s->size = stringSize; return s; } PascalString() // do NOT init 'size', it will be initialized by the placement new { } // etc }; int main() { PascalString * s = new(10) PascalString; }
But I don't really like this approach anyway. What if you create a PascalString on the stack? I'd rather create a wrapper object around the PascalString pointer that serves as a proxy. Then you can just create it on the stack, perhaps do some refcounting, etc.

Code:
struct PascalStringData { short size; char string[1]; static PascalStringData * Create(size_t size); static PascalStringData * Copy(const PascalStringData * other); // etc. }; class PascalString { public: PascalString(size_t size) : data(PascalStringData::Create(size)) { } PascalString(PascalStringData * theData) : data(theData) { } // etc PascalString Copy() const { return PascalString(PascalStringData::Copy(data)); } // etc private: PascalStringData * data; };

Basically generates the same code as your version, but the user doesn't have the burdon of the implementation details like the placement new and such. It can simply use the PascalString object like any other.
___________________________________________
C++ addict
-
Currently working on: the 3D engine for Tomb Raider: Underworld and Deus Ex 3.
.oisyn is offline   Reply With Quote
Old 11-24-2007, 07:56 PM   #4
Reedbeta
DevMaster Staff
 
Join Date: Oct 2004
Location: Seattle, WA
Posts: 3,707
Default Re: C++ Pascal strings with placement new

.oisyn, doesn't your second version defeat Nick's stated purpose (which was to remove an indirection)?
___________________________________________
Currently working at Sucker Punch
reedbeta.com - OpenGL demos and other projects
Luabridge - a lightweight, dependency-free C++/Lua binding library.
CD Lite - an unobtrusive, minimal CD player application for Windows.
Reedbeta is offline   Reply With Quote
Old 11-25-2007, 04:29 AM   #5
Nick
Senior Member
 
Join Date: Aug 2004
Location: Ghent, Belgium
Posts: 1,056
Default Re: C++ Pascal strings with placement new

Quote:
Originally Posted by Reedbeta
Nice trick, but shouldn't the factory function be static?
Fixed it. Thanks for noticing!
Nick is offline   Reply With Quote
Old 11-25-2007, 04:59 AM   #6
Nick
Senior Member
 
Join Date: Aug 2004
Location: Ghent, Belgium
Posts: 1,056
Default Re: C++ Pascal strings with placement new

.oisyn, thanks for the stack version! As Reedbeta noticed though, it defeats the purpose when allocating strings on the heap.

I don't think it's possible to combine both and keep all benefits, but I'd love to be proven wrong. Anyway, this trick is really about being aware of the relationship between indirections and performance. It's only useful for very specific bottlenecks, and the objects will likely be on the heap. In case they're on the stack your solution would be preferable though.
Nick is offline   Reply With Quote
Old 11-25-2007, 09:30 AM   #7
.oisyn
DevMaster Staff
 
.oisyn's Avatar
 
Join Date: Sep 2005
Location: The Netherlands
Posts: 1,442
Default Re: C++ Pascal strings with placement new

Quote:
Originally Posted by Reedbeta
.oisyn, doesn't your second version defeat Nick's stated purpose (which was to remove an indirection)?
How? For nick's version, you need a nick::PascalString* on the stack. For my version, you need oisyn::PascalString on the stack, which contains a single pointer (to a oisyn::PascalStringData, which is functionally equivalent to nick::PascalString). How is that different? In a way, my implementation is a wrapper around the pointer you need for Nick's implementation, which makes the usage (imho ) a bit easier.
___________________________________________
C++ addict
-
Currently working on: the 3D engine for Tomb Raider: Underworld and Deus Ex 3.

Last edited by .oisyn : 11-25-2007 at 09:33 AM.
.oisyn is offline   Reply With Quote
Old 11-26-2007, 03:22 AM   #8
ThomasYoung
New Member
 
Join Date: Aug 2005
Posts: 5
Default Re: C++ Pascal strings with placement new

Hey Nick,

I like the gem, simple but I can see how this could make a big difference in some situations.
Are there any issues with respect to portability, or other subtle gotchas to be aware of?

Quote:
How? For nick's version, you need a nick::PascalString* on the stack. For my version, you need oisyn::PascalString on the stack, which contains a single pointer (to a oisyn::PascalStringData, which is functionally equivalent to nick::PascalString). How is that different? In a way, my implementation is a wrapper around the pointer you need for Nick's implementation, which makes the usage (imho ) a bit easier.

The gem really targets the situation where the objects have to be allocated dynamically. If you're going to put them on the stack then you may as well use the very first implementation with size and string pointer both as data members.

Thomas
ThomasYoung is offline   Reply With Quote
Old 11-26-2007, 04:08 AM   #9
.oisyn
DevMaster Staff
 
.oisyn's Avatar
 
Join Date: Sep 2005
Location: The Netherlands
Posts: 1,442
Default Re: C++ Pascal strings with placement new

Argh! Is it me or are you simply not getting what I'm saying? I'M NOT PUTTING THE WHOLE STRING ON THE STACK, I SIMPLY SUGGEST CREATING A WRAPPER FOR THE POINTER TO MAKE THE USAGE EASIER AND SAFER!!! Geez...

There is absolutely no difference between a single pointer and an object containing a single pointer. Whichever one you use, there is no other way than putting it on the stack.
___________________________________________
C++ addict
-
Currently working on: the 3D engine for Tomb Raider: Underworld and Deus Ex 3.

Last edited by .oisyn : 11-26-2007 at 04:11 AM.
.oisyn is offline   Reply With Quote
Old 11-26-2007, 06:00 AM   #10
ThomasYoung
New Member
 
Join Date: Aug 2005
Posts: 5
Default Re: C++ Pascal strings with placement new

Quote:
Originally Posted by .oisyn
Whichever one you use, there is no other way than putting it on the stack.

Hmm.. so I guess you're saying that where Nick might have something like:
std::list<PascalStringWithPlacementNewTrick*>
you would look to replace this with something like:
std::list<PascalStringWithPlacementNewTrickPointer >
?

I guess that is then preferrable to:
std::list<OriginalPascalStringWithSizeAndPointerMe mbers>
because, for example, copying the list doesn't have to move around so much data and so on, even if accessing the actual string data through this list involves exactly the same number of indirections..

btw. Your user gif fits really nicely with the text of that last message.
ThomasYoung is offline   Reply With Quote
Old 11-26-2007, 02:55 PM   #11
.oisyn
DevMaster Staff
 
.oisyn's Avatar
 
Join Date: Sep 2005
Location: The Netherlands
Posts: 1,442
Default Re: C++ Pascal strings with placement new

Quote:
Originally Posted by ThomasYoung
Hmm.. so I guess you're saying that where Nick might have something like:
std::list<PascalStringWithPlacementNewTrick*>
you would look to replace this with something like:
std::list<PascalStringWithPlacementNewTrickPointer >
Well yes, exactly . And the code generated for these template instantiations would probably even be binary identical.

Quote:
btw. Your user gif fits really nicely with the text of that last message.
Yeah sorry 'bout that, I felt misunderstood by everyone
___________________________________________
C++ addict
-
Currently working on: the 3D engine for Tomb Raider: Underworld and Deus Ex 3.
.oisyn is offline   Reply With Quote
Reply


Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Forum Jump


All times are GMT -7. The time now is 03:03 AM.


Powered by vBulletin
Copyright ©2000 - 2009, Jelsoft Enterprises Ltd.