#1 | |||
|
New Member
Join Date: Sep 2006
Location: Japan
Posts: 25
|
Quote:
I don't know about the new render-to-vertex-array support from ATI with DX9, so I will describe the two current render-to-vertex-array methods for NVidia cards that I know of, as a kind of tutorial. Quote:
The example source below works (I just finished my own implementation). Since there are many traps and things you should know before implementing render to vertex array, I hope this little how-to helps others. Comments on improvements and other issues are of course welcome.

Overview:

Method 1: Texture read in the vertex shader.
Advantages:
Disadvantages:

Method 2: Copy to Pixel Buffer Object (PBO).
Advantages:
Disadvantages:
Older graphics cards don't have multiple render targets!

Implementation:

Method 1. Texture read in the vertex shader.

Step 1: Create FBO
Create an FBO and attach the texture that will hold the vertex data (see further down for how to create one).

Step 2: Create VBO
Create a vertex buffer object (VBO) holding texture coordinates that reference the FBO texture, e.g. a position VBO with 2 float coordinates per vertex (see further down for how to create one).

Step 3: Render the FBO
(see further down for details)

Step 4: Render the VBO
Here the vertex coordinates from the VBO are used to look up the FBO texture that contains the actual coordinates. Here is an example of the vertex program code:
Code:
Allowed texture formats for using a texture in the vertex shader:
Quote:

Method 2. Copy to Pixel Buffer Object (PBO).

Step 1: Create a VBO as pixel buffer object
Here is a sample:
Code:

Step 2: Create an FBO
Multiple render targets can be helpful for writing vertex/normal/binormal at the same time. Here is an example that creates an FBO:
Code:
Here is an example that creates a float texture:
Code:
Rect textures can be used for GPGPU, but not for copying FBO -> PBO! Only GL_TEXTURE_2D works for that.
Even if an RGB texture is created, it may be RGBA internally. In that case glReadPixels is slow when RGB is the destination format, because of the format conversion.

Step 3: Render FBO
The input textures contain the data (vertex positions etc.) needed to compute the outputs. Example of binding the buffer:
Code:
Example of rendering the buffer using a quad:
Code:
gluOrtho2D is necessary! If it is missing, glReadPixels won't work.
glClampColorARB is only necessary when the shader writes its output by some means other than gl_FragData[0..n]; it allows unclamped color values.
To write to multiple targets, here is an example fragment shader (GLSL):
Code:

Step 4: Copy from FBO to PBO
Example:
Code:
(Of course, vbo_vertices.size() == tex_width * tex_height.)
If glReadPixels is too slow, differing formats (such as RGBA -> RGB) might be the problem, since a simple copy doesn't work then!

Step 5: Render VBO as usual
Code:
References
http://oss.sgi.com/projects/ogl-samp...fer_object.txt
http://developer.nvidia.com/object/u..._textures.html
http://wiki.delphigl.com/index.php/GLSL_Partikel (German)
http://www.mathematik.uni-dortmund.d...l.html#arrays3
http://download.developer.nvidia.com...s/samples.html
Last edited by spacerat : 10-28-2006 at 08:07 AM. |
|||
|
|
|
|
#2 |
|
Senior Member
Join Date: Jan 2003
Posts: 949
|
I didn't quite understand... are you showing us stuff or are you asking for help?
Last edited by Mihail121 : 10-28-2006 at 05:15 AM. |
|
|
|
|
#3 | |
|
Senior Member
Join Date: Aug 2004
Location: Århus, Denmark
Posts: 752
|
Quote:
It is described in detail in the ATI SDK, including 13 samples on: animation, cloth, IK, NPatches, footprints in snow, particle collision, particle system, particle sorting, shadow volume generation, ocean simulation, water physics and terrain morphing. They even run on my crummy Radeon 9700.
___________________________________________
"Stupid bug! You go squish now!!" - Homer Simpson |
|
|
|
|
|
#4 | |
|
New Member
Join Date: Sep 2006
Location: Japan
Posts: 25
|
Quote:
Yes, it seems that render to vertex buffer is not a special feature of the hardware. It's just the software that didn't support it directly so far, isn't it?
I hope there will also be some good GL extensions soon. Hm... but after taking a look at the next GL version, it might still take a while: http://www.khronos.org/developers/li...06/OpenGL_BOF/
But I wonder why R2VB is not possible on NVidia cards at the moment. I mean, isn't it just a matter of setting one pointer to the render buffer in the GPU?
By the way, does anybody have a benchmark of the ATI crowd simulation with 10,000 characters? How many vertices/s can be transformed on a newer card?
Last edited by spacerat : 10-28-2006 at 05:14 PM. |
|
|
|
|
|
#5 |
|
New Member
Join Date: Sep 2006
Location: Japan
Posts: 25
|
Update:
If the above implementation is to be used on an older graphics card, such as a GF FX5200, then the FBO texture type should be GL_FLOAT_RGB32_NV, with GL_TEXTURE_RECTANGLE_ARB instead of GL_TEXTURE_2D. The texture coordinates then change range from (0..1, 0..1) to (0..tex_width, 0..tex_height), and the fragment program has to use sampler2DRect and texture2DRect.
If only one render target is available, gl_FragColor must sometimes be used instead of gl_FragData[]. In this case, it might be necessary to switch off color clamping (glClampColorARB(..)).
The implementation at the top was done on Windows XP with the 81.98 NVidia drivers. With the current 91.47 drivers there seems to be a bug with glReadPixels for PBOs: even though no GL error is reported, nothing is copied from the FBO to the PBO/VBO.
Last edited by spacerat : 10-30-2006 at 11:52 PM. |
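As a rough sketch of how the lookup coordinates for the texture-coordinate VBO differ between the two texture targets: the helper below is hypothetical (its name and the texel-center offset of 0.5 are assumptions, not taken from the original implementation). It fills one 2-float coordinate per texel, either normalized for GL_TEXTURE_2D or in texel units for GL_TEXTURE_RECTANGLE_ARB.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): builds one 2-float
// lookup coordinate per texel of a tex_width x tex_height vertex texture.
// rect == false: normalized coordinates in (0..1), as for GL_TEXTURE_2D.
// rect == true : coordinates in (0..tex_width, 0..tex_height),
//                as for GL_TEXTURE_RECTANGLE_ARB.
// Assumption: coordinates point at texel centers (the +0.5 offset).
std::vector<float> make_lookup_coords(int tex_width, int tex_height, bool rect)
{
    std::vector<float> coords;
    coords.reserve(static_cast<std::size_t>(tex_width) * tex_height * 2);
    for (int y = 0; y < tex_height; ++y) {
        for (int x = 0; x < tex_width; ++x) {
            float u = x + 0.5f;
            float v = y + 0.5f;
            if (!rect) {
                // Normalize to the 0..1 range used by GL_TEXTURE_2D.
                u /= tex_width;
                v /= tex_height;
            }
            coords.push_back(u);
            coords.push_back(v);
        }
    }
    return coords;
}
```

The resulting array would be uploaded once as the 2-float-per-vertex VBO; only the rect flag has to change when switching texture targets.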
|
|
|
|
#6 |
|
New Member
Join Date: Sep 2006
Location: Japan
Posts: 25
|
Another important thing is to call glFinish() at the end of each frame! Since glReadPixels works asynchronously, there will be large frame-rate fluctuations otherwise. Here is an example for rendering about 1M vertices on a GF FX 5200:

Without glFinish():
Average Fps: 11, 85.833 ms
Time Range: 38 - 152 ms
Frame: 73 Time: 59 ms = 16.949 fps
Frame: 74 Time: 152 ms = 6.579 fps
Frame: 75 Time: 38 ms = 26.316 fps
Frame: 76 Time: 95 ms = 10.526 fps
Frame: 77 Time: 130 ms = 7.692 fps
Frame: 78 Time: 76 ms = 13.158 fps
Frame: 79 Time: 152 ms = 6.579 fps
Frame: 80 Time: 53 ms = 18.868 fps
Frame: 81 Time: 98 ms = 10.204 fps
Frame: 82 Time: 105 ms = 9.524 fps
Frame: 83 Time: 84 ms = 11.905 fps

With glFinish():
Average Fps: 11, 85.167 ms
Time Range: 92 - 102 ms
Frame: 93 Time: 96 ms = 10.417 fps
Frame: 94 Time: 97 ms = 10.309 fps
Frame: 95 Time: 90 ms = 11.111 fps
Frame: 96 Time: 91 ms = 10.989 fps
Frame: 97 Time: 91 ms = 10.989 fps
Frame: 98 Time: 92 ms = 10.870 fps
Frame: 99 Time: 91 ms = 10.989 fps
Frame: 100 Time: 89 ms = 11.236 fps
Frame: 101 Time: 102 ms = 9.804 fps
Frame: 102 Time: 100 ms = 10.000 fps
Frame: 103 Time: 94 ms = 10.638 fps
Frame: 104 Time: 94 ms = 10.638 fps
Frame: 105 Time: 94 ms = 10.638 fps
Frame: 106 Time: 92 ms = 10.870 fps
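To compare such runs, the jitter can be reduced to a few numbers. The following is only a sketch: FrameStats and summarize are hypothetical names, and the input is just the frame times listed above, not the full measurement window behind the quoted averages.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical helper: summary of a series of frame times in milliseconds.
struct FrameStats {
    double average_ms;
    double min_ms;
    double max_ms;
    double spread_ms;   // max - min, a simple measure of frame-time jitter
    double average_fps; // 1000 / average_ms
};

// Assumes frame_ms is non-empty and every entry is greater than zero.
FrameStats summarize(const std::vector<double>& frame_ms)
{
    const auto mm = std::minmax_element(frame_ms.begin(), frame_ms.end());
    const double sum = std::accumulate(frame_ms.begin(), frame_ms.end(), 0.0);
    FrameStats s;
    s.average_ms = sum / frame_ms.size();
    s.min_ms = *mm.first;
    s.max_ms = *mm.second;
    s.spread_ms = s.max_ms - s.min_ms;
    s.average_fps = 1000.0 / s.average_ms;
    return s;
}

// Frame times (ms) copied from the listing above.
const std::vector<double> without_finish =
    {59, 152, 38, 95, 130, 76, 152, 53, 98, 105, 84};
const std::vector<double> with_finish =
    {96, 97, 90, 91, 91, 92, 91, 89, 102, 100, 94, 94, 94, 92};
```

Calling summarize on the two series gives a spread of 114 ms without glFinish() and 13 ms with it, over the listed frames only.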
|
|
|
|
#7 |
|
New Member
Join Date: Nov 2006
Posts: 1
|
Are you sure it's a driver bug? I haven't tested the new driver yet. Maybe you need to report the bug to NVidia? (Or did they do it on purpose?) |
|
|
|
|
#8 |
|
New Member
Join Date: Sep 2006
Location: Japan
Posts: 25
|
I just tested everything with the new NVPerfKit, where the driver gives you additional debug information, but I couldn't figure out what was going wrong. I asked NVidia customer service a few days ago whether there have been important changes, but I don't have an exact answer yet. I will post an update when I know what happened or what needs to be fixed.
|
|
|
|
|
#9 |
|
New Member
Join Date: Sep 2006
Location: Japan
Posts: 25
|
On the GF8, an extra copy operation from FBO to PBO/VBO won't be necessary anymore. EXT_texture_buffer_object seems to solve this issue.
http://developer.nvidia.com/object/n...ngl_specs.html |
|
|
|
|
#10 |
|
New Member
Join Date: Dec 2005
Location: Chandler, AZ
Posts: 17
|
Not to sound like a dick or anything, but is there a reason why this isn't in the DevMaster Wiki under OpenGL?
Great post by the way.
___________________________________________
Someone13 |
|
|
|
|
#11 |
|
New Member
Join Date: May 2007
Posts: 5
|
It's great! I've tried it, and it works on my NVidia 5700.
By the way, I don't understand why my program continues to use the CPU when I render the scene. In my main loop I copy the FBO to the PBO/VBO, then draw the VBO. Normally that should use only the GPU, shouldn't it? |
|
|
|
|
#12 |
|
DevMaster Staff
Join Date: Oct 2004
Location: Seattle, WA
Posts: 4,015
|
You'll still see 100% CPU usage if your program is busy waiting. Basically it's looping and continually asking the GPU if it's done before rendering the next frame.
___________________________________________
Currently working at Sucker Punch reedbeta.com - OpenGL demos and other projects Luabridge - a lightweight, dependency-free C++/Lua binding library. CD Lite - an unobtrusive, minimal CD player application for Windows. |
|
|
|
|
#13 |
|
New Member
Join Date: May 2007
Posts: 5
|
Thanks, that's right. But when I ran some tests with different texture resolutions, I noticed that a texture with a higher resolution uses more CPU than usual.
The bottleneck must be in the function that copies the FBO to the PBO/VBO:
glReadBuffer(GL_COLOR_ATTACHMENT0_EXT);
glBindBufferARB(GL_PIXEL_PACK_BUFFER_EXT, vbo_vertices_handle);
glReadPixels(0, 0, imgWidth, imgHeight, GL_RGBA, GL_FLOAT, 0);
glReadBuffer(GL_NONE);
glBindBufferARB(GL_PIXEL_PACK_BUFFER_EXT, 0);
and in drawing the VBO:
glBindBufferARB(GL_ARRAY_BUFFER_ARB, vbo_vertices_handle);
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(4, GL_FLOAT, 4 * sizeof(float), (char *) 0);
glDrawArrays(GL_POINTS, 0, imgWidth * imgHeight);
glDisableClientState(GL_VERTEX_ARRAY);
Because when I comment them out, the CPU stays idle. But I don't understand why. |
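For a sense of scale, the amount of data touched per frame by this readback grows with the texture area. A minimal sketch, assuming 4-byte floats; readback_bytes is a hypothetical helper, not part of any GL API:

```cpp
#include <cstddef>

// Hypothetical helper: bytes written by one float readback of a
// width x height render target with `components` floats per texel
// (4 for GL_RGBA, 3 for GL_RGB).
std::size_t readback_bytes(std::size_t width, std::size_t height,
                           std::size_t components)
{
    return width * height * components * sizeof(float);
}
```

With GL_RGBA and GL_FLOAT as above, a 512 x 512 texture comes to 4 MiB per frame and a 1024 x 1024 texture to 16 MiB, so any part of the copy that ends up on the CPU would cost roughly four times as much at the higher resolution.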
|
|
|
|
#14 |
|
DevMaster Staff
Join Date: Oct 2004
Location: Seattle, WA
Posts: 4,015
|
Oh, you're using glReadPixels...that might very well engage the CPU to copy all the data, depending on the driver. Are you sure you're using the latest updated drivers for your 5700?
|
|
|
|
#15 |
|
New Member
Join Date: May 2007
Posts: 5
|
Yes, I've downloaded the latest NVidia driver (93.71), and it is the same with my Quadro FX 3400. I've run some tests with R2VB, VTF, and vertex mapping. I've noticed that R2VB is a little faster on my Quadro FX, but its FPS is not constant, whereas vertex mapping works very well (I don't know why).
|
|
|
|
|
#16 |
|
DevMaster Staff
Join Date: Oct 2004
Location: Seattle, WA
Posts: 4,015
|
What do you mean by vertex mapping?
|
|
|
|
#17 |
|
New Member
Join Date: May 2007
Posts: 5
|
I found it in a tutorial on Vertex Buffer Objects: http://www.g-truc.net/article/vbo-en.pdf
With this technique, I send the data from RAM to the OpenGL server, and every frame I manipulate this data directly from my program as the client:
// In init
// Allocate vertex buffer
size = imgWidth * imgHeight * 4 * sizeof(GLfloat);
glGenBuffersARB(1, &vbo_vm);
glBindBufferARB(GL_ARRAY_BUFFER, vbo_vm);
glBufferData(GL_ARRAY_BUFFER, size, 0, GL_DYNAMIC_DRAW);
// In my main loop
// Update VBO
glBindBufferARB(GL_ARRAY_BUFFER_ARB, vbo_vm);
glBufferData(GL_ARRAY_BUFFER, 4 * imgWidth * imgHeight * sizeof(GLfloat), 0, GL_DYNAMIC_DRAW);
GLvoid* gpu_vertices = glMapBuffer(GL_ARRAY_BUFFER_ARB, GL_WRITE_ONLY);
memcpy(gpu_vertices, ram_vertices, 4 * imgWidth * imgHeight * sizeof(GLfloat));
glUnmapBuffer(GL_ARRAY_BUFFER);
// Draw VBO as usual
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(4, GL_FLOAT, 4 * sizeof(GLfloat), (char *) 0);
glDisable(GL_LIGHTING);
glColor3f(1, 1, 1);
glDrawArrays(GL_POINTS, 0, imgWidth * imgHeight);
glEnable(GL_LIGHTING);
glDisableClientState(GL_VERTEX_ARRAY); |
|
|
|
|
#18 |
|
DevMaster Staff
Join Date: Oct 2004
Location: Seattle, WA
Posts: 4,015
|
Oh, okay. You mean mapping the vertex buffer into system memory. It's not usually called 'vertex mapping'.
Anyways, for a vertex buffer object, the driver might keep a copy in system memory (lets you read from it quickly), and when writing to it, the driver just sends it to the video card via DMA. For render-to-vertex-buffer, it's possible the driver copies the data to system memory and then back to the card...it shouldn't really, but who knows?
|
|
|
|
#19 |
|
New Member
Join Date: May 2007
Posts: 5
|
Thanks Reedbeta, I think you're right, because I suspected the same thing. I ran a test and noticed that my program uses more memory than what I allocated, but who knows?
Anyway, I'll use VTF: it's simple to use, consumes less memory, and should be rather efficient on future graphics cards. |
|
|
|
|
#20 |
|
New Member
Join Date: Sep 2006
Location: Japan
Posts: 25
|
I just released the source of my project (Deformation Styles).
There, render to vertexbuffer is one of the main parts. You can download everything here: http://www.xinix.org/sven/main/publications.htm |
|
|
|
|
#21 |
|
New Member
Join Date: Jul 2007
Posts: 6
|
Hello,
I am trying to render to a vertex buffer like you do. I try to copy to the PBO and then to the VBO for rendering, but nothing happens. I am not really good at OpenGL, so I am not really surprised. I use a Cg shader to calculate the positions and put the result in a texture. My draw function is here: Code:
I used almost the same thing as in the tutorial. InitVBO is called in main: Code:
1) Is something wrong in the algorithm? 2) targetVBO = GL_PIXEL_PACK_BUFFER_EXT; I saw other examples with GL_ARRAY_BUFFER... is that better? 3) In glDrawArrays(GL_TRIANGLES, 0, 4); I want to display a pyramid with 4 points. Should I use GL_POINTS? (My texture holds the positions of the 4 points.) |
|
|
|