Reducing DirectX DrawPrimitive calls
skusey
12-03-2006, 04:04 AM
Hi,
I have read all over the place about the importance of reducing DrawPrimitive/DrawIndexedPrimitive calls, and how it can speed things up. In some cases I can see how this easily applies, but in others, like laser beams in a game, I can't. Let's say these are textured quads, with their movement being updated each frame. I originally thought I would have a single static vertex buffer that describes this type of laser beam's quad, then just calculate a new world matrix for each laser beam describing its orientation and position for each frame and call DrawPrimitive. I could potentially have hundreds of laser beams at any one time, so that means hundreds of DrawPrimitive calls, each drawing just a single quad. So I guess I'm asking: how do other people handle this kind of situation? Is a single vertex buffer used instead and that is updated each frame with all the laser beams' quads, then a single DrawPrimitive call made? Would writing a new vertex buffer each frame and discarding the old be slow? Is that a common practice? Sorry, I'm rambling a bit now, so I'll shut up! Any help or thoughts at all on this would be great :).
Cheers,
skusey
juhnu
12-03-2006, 05:26 AM
Hi,
Is a single vertex buffer used instead and that is updated each frame with all the laser beams' quads, then a single DrawPrimitive call made?
Yes.
Would writing a new vertex buffer each frame and discarding the old be slow? Is that a common practice?
It's fast, and a lot faster than issuing separate DrawPrimitive commands. It's pretty much the common practice for dynamic, low-polygon geometry. Just make sure you create your dynamic vertex buffer with the proper usage flags (D3DUSAGE_DYNAMIC | D3DUSAGE_WRITEONLY) in the proper memory pool (D3DPOOL_DEFAULT), and don't forget to lock it with the correct flags (D3DLOCK_DISCARD to discard the old contents, D3DLOCK_NOOVERWRITE to append).
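To make the batching idea concrete, here is a minimal CPU-side sketch in C++ of collecting every beam into one vertex array. The `Vertex` and `Beam` layouts and the function names are assumptions for illustration only, and the quad is simply expanded along the world Y axis rather than towards the camera. No Direct3D calls are made here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vertex layout: position plus one texture coordinate.
struct Vertex {
    float x, y, z;
    float u, v;
};

// Hypothetical beam description: two end points and a half-width.
struct Beam {
    float x0, y0, z0;
    float x1, y1, z1;
    float halfWidth;
};

// Appends one beam as two triangles (6 vertices) to 'out'.
// Positions are already in world space, so no per-beam world matrix
// (and therefore no per-beam draw call) is needed.
void appendBeamQuad(const Beam& b, std::vector<Vertex>& out) {
    const float w = b.halfWidth;
    const Vertex a0{b.x0, b.y0 - w, b.z0, 0.0f, 1.0f};
    const Vertex a1{b.x0, b.y0 + w, b.z0, 0.0f, 0.0f};
    const Vertex b0{b.x1, b.y1 - w, b.z1, 1.0f, 1.0f};
    const Vertex b1{b.x1, b.y1 + w, b.z1, 1.0f, 0.0f};
    out.push_back(a0); out.push_back(a1); out.push_back(b1);
    out.push_back(a0); out.push_back(b1); out.push_back(b0);
}

// Builds the whole batch for this frame, writing vertices in linear order.
std::vector<Vertex> buildBeamBatch(const std::vector<Beam>& beams) {
    std::vector<Vertex> verts;
    verts.reserve(beams.size() * 6);
    for (const Beam& b : beams) {
        appendBeamQuad(b, verts);
    }
    return verts;
}

// Number of triangles to submit in a single triangle-list draw call.
std::size_t primitiveCount(const std::vector<Vertex>& verts) {
    assert(verts.size() % 3 == 0);
    return verts.size() / 3;
}
```

In a Direct3D 9 renderer the resulting array would presumably be copied into the locked dynamic vertex buffer and drawn with one DrawPrimitive call using D3DPT_TRIANGLELIST and the triangle count from `primitiveCount`.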
skusey
12-03-2006, 08:07 AM
Thanks for the reply, juhnu. I will take that approach with what I'm doing.
So would the VertexBuffer that's being used to render these multiple laser beams, etc., be only created once at a fixed size? Something large enough to cover all situations, or would that be re-created each time a new object is added to it? Also, would it make sense to just keep an array of vertex data for the laser beams locally, updating and amending that as required, then memcpy'ing it over into the locked buffer each frame? If I did use that approach would I still use the D3DLOCK_NOOVERWRITE flag, or just the D3DLOCK_DISCARD on its own?
Many thanks for the help btw, it's great to be able to talk it over with some of you guys who really know what you're doing.
Nils Pipenbrinck
12-03-2006, 10:49 AM
The main issue with dynamic geometry like your laser beams is timing. Sure, collecting vertex data and writing it to a vertex buffer takes time, but it usually pays off to do some extra work on the CPU.
If you do this after you've issued all draw commands for static geometry, the GPU is busy drawing and the CPU has enough time to do more expensive things.
Parallelism and pipelining usually help to overlap CPU and GPU work to some extent, but games sometimes require a fast mouse/gamepad response and force the GPU into lockstep (a rant about that is here: http://xyzw.de/c120.html ). So timing such CPU-heavy jobs is still good practice.
skusey
12-03-2006, 11:54 AM
Ahhh... that is interesting, I had never considered that. I have a terrain in my game, rendered in large patches, maybe 5 - 15 DrawIndexedPrimitive calls for the whole terrain. Obviously it is a rather GPU-intensive aspect of the game, so rendering this first in the game cycle will ensure the GPU is busy while I get on with the other aspects on the CPU? Does that mean the GPU queues up DrawPrimitive commands received from the CPU? It doesn't complete it before returning control to the CPU? Thanks for the advice, Nils Pipenbrinck!
Reedbeta
12-03-2006, 12:28 PM
Does that mean the GPU queues up DrawPrimitive commands received from the CPU? It doesn't complete it before returning control to the CPU?
Yep, pretty much all commands to the GPU go into a queue, allowing the GPU and CPU to execute concurrently. One normally synchronizes at the end of each frame or so as that article explains.
Nils Pipenbrinck
12-03-2006, 01:24 PM
If you draw your terrain with just a dozen calls, you will have plenty of CPU time to combine your laser-beam geometry.
Some drivers are even smart enough to batch draw calls together if you haven't changed textures or render state, so you might not see a performance improvement. But other drivers aren't as smart, so you'd better do it on your own, just to make sure it runs fast everywhere.
juhnu
12-03-2006, 07:01 PM
So would the VertexBuffer that's being used to render these multiple laser beams, etc., be only created once at a fixed size? Something large enough to cover all situations, or would that be re-created each time a new object is added to it?
Allocate your buffers on demand at a fixed size, and when you run out of space allocate a new buffer. At the beginning of each frame mark all the buffers as unused. When you lock a buffer for the first time in a frame, use the D3DLOCK_DISCARD flag, and after that lock with D3DLOCK_NOOVERWRITE. This means you can use one bigger buffer for many smaller sections of vertex data.
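The bookkeeping behind that scheme could look roughly like the C++ sketch below. `LockMode`, `Reservation` and `DynamicBufferCursor` are made-up names, not Direct3D types; the two modes stand in for D3DLOCK_DISCARD and D3DLOCK_NOOVERWRITE. As a simplification, when the buffer is full this sketch starts over with a discard instead of allocating a second buffer.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for the two lock modes discussed above; in Direct3D 9 they
// would correspond to D3DLOCK_DISCARD and D3DLOCK_NOOVERWRITE.
enum class LockMode { Discard, NoOverwrite };

// Result of reserving space: where to write and how to lock.
struct Reservation {
    std::size_t offsetBytes;
    LockMode mode;
};

// Hypothetical bookkeeping for one fixed-size dynamic vertex buffer.
class DynamicBufferCursor {
public:
    explicit DynamicBufferCursor(std::size_t capacityBytes)
        : capacity_(capacityBytes), cursor_(0), usedThisFrame_(false) {}

    // Call once at the start of every frame: marks the buffer as unused.
    void beginFrame() { usedThisFrame_ = false; }

    // Reserves 'bytes' of space for the next batch of vertices.
    Reservation reserve(std::size_t bytes) {
        assert(bytes <= capacity_);
        // First lock of the frame, or not enough room left: start again
        // at offset 0 and discard the old contents.
        if (!usedThisFrame_ || cursor_ + bytes > capacity_) {
            usedThisFrame_ = true;
            cursor_ = bytes;
            return Reservation{0, LockMode::Discard};
        }
        // Otherwise append behind data that earlier draw calls may still
        // be reading, promising not to overwrite it.
        const std::size_t offset = cursor_;
        cursor_ += bytes;
        return Reservation{offset, LockMode::NoOverwrite};
    }

private:
    std::size_t capacity_;
    std::size_t cursor_;
    bool usedThisFrame_;
};
```

The returned offset and mode would then be passed to the actual lock call, and the draw call would start at the matching vertex offset.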
Also would it make sense to just keep an array of vertex data for the laser beams locally, updating and amending that as required, then memcpy'ing it over into the locked buffer each frame? If I did use that approach would I still use the D3DLOCK_NOOVERWRITE flag, or just the D3DLOCK_DISCARD on its own?
It might be faster just to recalculate the vertices on the fly while you are updating the data (if it's going to change every frame anyway). Also remember to write to the vertex buffer in linear order. Do not access it randomly, as that would lose you the benefit of the write cache.
skusey
12-04-2006, 03:05 AM
Thanks everyone for your comments, that's just the sort of information I needed.
One last question: as discussed, the GPU works away and leaves time free for the CPU, so would it not make more sense to have the main game cycle start with its Draw methods and finish with its Update methods? Traditionally I have always done it the other way round, but that seems unproductive after this chat.
Reedbeta
12-04-2006, 12:28 PM
If you sync at the end of the game cycle, then yes, do the drawing first and then the updating. Actually, you can make it multithreaded and do the updates and drawing in separate threads, as long as you maintain two copies of the state data so that you're not updating something at the same time you're trying to render it.
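A rough C++ sketch of the "two copies of the state" idea follows. `GameState` and `DoubleBufferedState` are invented names with placeholder fields, and the sketch assumes exactly one update thread and one render thread. It is only one possible way to arrange this.

```cpp
#include <array>
#include <cassert>
#include <mutex>

// Hypothetical game state; a real one would hold beam positions etc.
struct GameState {
    int frame = 0;
    float beamX = 0.0f;
};

// Keeps two copies: the update thread writes one while the render
// thread reads the other.
class DoubleBufferedState {
public:
    // Update thread only: the copy currently being written.
    GameState& writable() { return states_[writeIndex_]; }

    // Render thread: a copy of the most recently published state.
    GameState snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return states_[1 - writeIndex_];
    }

    // Update thread only: makes the written copy visible to the renderer
    // and carries its contents over as the starting point for the next
    // update.
    void publish() {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(writeIndex_ == 0 || writeIndex_ == 1);
        writeIndex_ = 1 - writeIndex_;
        states_[writeIndex_] = states_[1 - writeIndex_];
    }

private:
    mutable std::mutex mutex_;
    std::array<GameState, 2> states_{};
    int writeIndex_ = 0;
};
```

The update thread would modify `writable()` and call `publish()` once per tick, while the render thread calls `snapshot()` at the start of each frame and draws from that copy.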
Nils Pipenbrinck
12-04-2006, 03:03 PM
The multithreading approach is the best of both worlds today.