View Full Version : Hashing for int tuple matching
Wernaeh
03-05-2006, 02:57 AM
Hi there :)
Just a short question this time. I want to hash a number of integer tuples (x, y, z) into a linear array to quickly find matching tuples. The tuples don't have any special spatial properties (i.e. are distributed quite randomly).
What would be a good mapping function from a tuple to an array index ?
hashindex = (x * MediumPrime + y * SmallPrime + z) % LargePrimeAndArraySize;
came to my mind, but I'm missing the experience to decide here.
Are there any alternatives ? For example, might using some tree-based sorting (octree, etc) be any faster than simple hashing ?
The point behind this question is that I recently implemented wavefront .obj loading, and it stores 3d geometry data in a very strange way. It has seperate vertex, texture, and normal vertices, and each face seperately indexes into each of these. Consequently, I want to merge vertices that share the same (vertex index, texture index, normal index) tuple. Linear searching results in an O(vertex*face) runtime, and thus takes forever on large meshes.
Any further input ? :)
Thank you,
Cheers,
- Wernaeh
anubis
03-05-2006, 05:36 AM
How often does ist happen, that a vertex shares all three of those properties ? Any statistics on that ?
Why not just loop through each face and build new vertices (including all data) from the separate vertices that are indexed?
MJeannig
03-05-2006, 10:00 AM
The point behind this question is that I recently implemented wavefront .obj loading, and it stores 3d geometry data in a very strange way. It has seperate vertex, texture, and normal vertices, and each face seperately indexes into each of these. Consequently, I want to merge vertices that share the same (vertex index, texture index, normal index) tuple. Linear searching results in an O(vertex*face) runtime, and thus takes forever on large meshes.
This is the way most modelers store meshs. I had optimized it once, by keeping the correspondance between the index of the position in the modeler and the splitted vertices with this index. This will give you a smaller list to search your vertex : only as long as the number of times you had to split the vertex. Then I flattened the list.
M
Wernaeh
03-05-2006, 12:23 PM
First, thank you all for the replies :)
How often does ist happen, that a vertex shares all three of those properties ? Any statistics on that ?
For a normal mesh, I'd assume that at least half of the vertices share all indices with at least another vertex. Consider a simple circle made from triangles with a simple planar texture mapping. Here, each edge of each triangle shares normals, texture coordinates, and position with at least another triangle edge, and it would be nice to use that for optimizing rendering. Afaik modern hardware caches vertex transforms, so for this case I'd get quite some performance gain.
Why not just loop through each face and build new vertices (including all data) from the separate vertices that are indexed?
This approach results in FaceCount * 3 vertices, even if most vertices within the mesh share coordinates. This is why I want to merge "completely equal" vertices.
I had optimized it once, by keeping the correspondance between the index of the position in the modeler and the splitted vertices with this index.
So you're saying that you first run through all vertices as in the file, and create a new vertex list for each of these, with one entry, namely the original vertex. Then, you assign texture coordinates and normal coordinates on the first time you access this vertex from an index, and on each further time, you check whether the vertex / coordinate / normal pair is already inside the list ?... Interesting approach, I have to admit :) Sounds very reasonable, I think I'll give it a shot. :)
Thanks again :D
Cheers,
- Wernaeh
This approach results in FaceCount * 3 vertices, even if most vertices within the mesh share coordinates. This is why I want to merge "completely equal" vertices.
Not necessarily. When you build a new vertex, just check if this vertex (position) has been built before. (Just keep an array of bools, where index == original_triangle_face_vertex_position_index. Got that?) If so, check if all data are identical. If not identical, build another vertex instead.
It should be reasonably simple to code, unless I am missing something fundamental. (Wich happens a lot. :-)
skynet
03-05-2006, 06:22 PM
I have build such a function using std::map. It is quite fast.
First you store all vertices in the map, using the vertexindex/normalindex/texcoordindex triple as key (you need to use a special predicate which can order pairs of triples). You get a map filled with exactly the triples that are used by the model data. Next you walk the map from begin to end setting the "value" of each map entry to an increasing index, starting from 0. This will be used as "new vertex index" into your final array. While doing this, you can fill your new vertex array with the corresponding vertex data also. After that you can linearly walk your old indices, using the "old tuple" as key into your map, finding the corresponding index for the new final index array.
anubis
03-05-2006, 06:58 PM
For a normal mesh, I'd assume that at least half of the vertices share all indices with at least another vertex. Consider a simple circle made from triangles with a simple planar texture mapping. Here, each edge of each triangle shares normals, texture coordinates, and position with at least another triangle edge, and it would be nice to use that for optimizing rendering. Afaik modern hardware caches vertex transforms, so for this case I'd get quite some performance gain.
I know why you use index meshes, thank you :)
I'm not sure what the point of my question actually was, but then again I put the question this morning, without consulting my coffee pot up in front.
edit : I'm still not sure that I understand the question. If you have say 20 vertex coords, 10 normals and 10 texture coords, by definition you have 20 distinct vertices and while passing a vertex array to the card you just need to dublicate some data. You may use some fancy algorithm to do that, but I wouldn't see the point. Just loop over the longest of the three arrays and pull in the other values as you build up your vertex array.
Nils Pipenbrinck
03-06-2006, 12:37 PM
hashindex = (x * MediumPrime + y * SmallPrime + z) % LargePrimeAndArraySize;
came to my mind, but I'm missing the experience to decide here.
That's basically what I use when I need a hash-function. I don't go to the extreme when using primes. Something in the range of 311 to 1021 (my favorite picks are 311, 571 and 1021 for three components to be exact).
When I need a lightweight hash function i just shift and xor the components together until I get a kind of good distribution of hashes for my test-data. That might not make a difference on todays pc's, but for slower machines such "suboptimal" hashes have their merits.
vBulletin, Copyright ©2000-2009, Jelsoft Enterprises Ltd.