3d engine programming blog


v71
01-04-2009, 07:22 AM
Hi, I started a blog a few days ago about 3D engine programming. I will post code snippets, theory and everything else related to my work on building a complete 3D engine using OpenGL and C++.
Everything is free for you to use.

http://vp8671.blogspot.com/

Bye

starstutter
01-04-2009, 09:45 PM
huh, I don't normally read blogs but this might be pretty cool

I read a lot of the first page, and even though it's just basic information, it's actually refreshing to read the beginner-level stuff every once in a while. It's like a good breather from the 300+ pages of code I'm usually swimming around in.

Well anyway, keep up the interesting updates and you'll have at least one reader :)

v71
01-05-2009, 03:57 AM
It's meant for beginners, but it will get pretty hardcore in time, when I deal with advanced stuff like optimized data structures and shaders.
In a few days I will post my whole fastmath library for everyone to use.
Of course, constructive criticism is always welcome.

Nick
01-05-2009, 04:24 AM
Starting with utilities like 'fast math' seems like the right thing to do at first, but in my experience it's not very productive.

You're bound to spend a lot of time adding features that eventually won't be used. This makes the code bigger and harder to maintain. It starts with small things so you see no reason for concern, but before you know it they add up to something ugly and smelly. Also, I think it's pointless to be looking for optimizations at this extremely early point in development. Another concern is that this probably won't be tested thoroughly till it's actually used. So months or even years later some of these functions might finally find a use, assuming it works as expected without giving it any further thought, and wasting another day of debugging. It might not be so dramatic for these particular functions, but it will be if someone continues using this approach.

Some function-specific advice:


NextPowerOfTwo: Why does it only go to 65536? If the input value is higher, 0 is returned, which I highly doubt is what one expects and is bound to lead to trouble!
IsPowerOf2: Is 0 a power of 2? This function returns non-zero while the result should be false. Is -2 a power of 2? This function returns zero while the result should be true. In fact, it doesn't work for any negative numbers. So use an unsigned argument to clarify this or first take the absolute value. The issue with 0 can be solved by using "(x & -x) == x" instead.
IsOdd/IsEven: Note that this only works for numbers in 2's complement (http://en.wikipedia.org/wiki/Signed_number_representations) representation. It's probably safe for game development, but I wouldn't want this code in say automotive software. The right way to do this is to use "x % 2 == 0" (for even numbers). It's the compiler's job to optimize this with an AND instruction, so it's not likely to lose any performance. It might even have a better approach (not likely this one but in many other occasions the compiler can optimize things best if you avoid bit tricks and such).
CeilPow2: Isn't this supposed to be the same thing as NextPowerOfTwo? Quite confusing. Also, for an input of 0 it returns 2, while clearly 1 is the right answer.
IsNaN: This function also returns true for 0 / 0, which is not a NaN (http://en.wikipedia.org/wiki/NaN#Displaying_NaN) but an indeterminate form. You probably want to catch those too but just for clarity I'd advise to rename the function to IsNaNorIndeterminate or something like that.
IFloor: Beware that this function relies on the IEEE 754 representation of floats. Again, fine for games but don't recycle the code in any other project. You could use preprocessor directives that test for certain CPUs that are guaranteed to use IEEE 754 and fall back to floor() otherwise. In fact I would always use floor() unless the fully completed project has been profiled and I found floor() to be a significant hotspot. Also, the latest x86 processors even include single instruction floor operations that are faster than this bit manipulating trick. So you should probably leave it to the compiler to optimize things unless you know for a fact that you can do better and it's necessary.
FloatToInt: __asm is compiler-specific, and it's only available in 32-bit mode. While this only means it would have to be reimplemented on other platforms, the use of the frndint instruction is a bigger concern. Its actual rounding operation depends on a CPU control word. So while at one point it might be doing a floor operation the next time it could be a ceil or round to nearest! For example the DirectX libraries are known to modify the rounding control.
RoundFloatToInt/FloatToIntRet: Same remarks as FloatToInt. And having three variants of a function is definitely going to lead to confusion.
FloatToByte: This trick also relies on IEEE 754. Furthermore, the return value for input out of range can be quite unexpected. So I would comment that thoroughly or just not use this function unless I absolutely have to at the very end of the project.
log2: It's faster to multiply with the reciprocal of log(2).
FastSin/FastCos/FastSinCos: Let the compiler handle these. In fact Visual C++ will call a function which is faster than the FPU assembly instruction on Pentium 4 and up.
FastDotProduct: This was clearly written in the time of the original Pentium. Instruction latencies and scheduling rules are quite different now. So unless you're an assembly pro and you know exactly what it does, keep away from other people's assembly snippets. Simply writing "x*x+y*y+z*z" is very likely to be faster nowadays, and if dot products are really holding performance down you should look at SSE instead when you're done implementing all functionality.
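
For the integer helpers in the list above, here is a minimal sketch of what plain, unsigned versions might look like. The names `is_power_of_two`, `is_even` and `is_odd` are made up for illustration and are not the blog's code. One caveat: as far as I can tell, `(x & -x) == x` still holds for x = 0, so this sketch uses an explicit zero check instead.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration, not the blog's code.
// The argument is unsigned, so the question of negative inputs never comes up.
constexpr bool is_power_of_two(std::uint32_t x)
{
    // A power of two has exactly one bit set; x & (x - 1) clears the lowest set bit.
    // The x != 0 test is needed because 0 has no bits set at all.
    return x != 0 && (x & (x - 1)) == 0;
}

// Parity via the remainder operator. This does not depend on the signed
// number representation, and the compiler is free to emit a bit test for it.
constexpr bool is_even(int x) { return x % 2 == 0; }
constexpr bool is_odd(int x)  { return x % 2 != 0; }
```

With these, `is_power_of_two(0)` is false and `is_odd(-3)` is true, which are the two edge cases the bit-trick versions tend to get wrong.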

So while you might have been quite pleased with your collection of 'fast' functions, the reality is that almost every single one of them is seriously flawed. Sorry for the harshness, but the only other alternative would be to learn it the hard way with several weeks of debugging/reimplementing/reoptimizing.

My advice is to just throw ALL of it away. Seriously. Implement only what is needed and in the most straightforward way. Don't waste time trying to optimize anything before the fully functional code has been thoroughly profiled. One day of optimizing the actual hotspots is worth a whole month of needlessly optimizing what one might blindly assume needs optimizing (not counting the frustration caused by bloating the code and debugging malfunctions).
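
To make "the most straightforward way" concrete, here is a rough sketch of how three of the functions discussed above could look without bit tricks or inline assembly. The names `floor_to_int`, `log2_via_ln` and `dot3` are hypothetical, and whether these are fast enough is something only a profile of the finished code can tell.

```cpp
#include <cmath>

// Hypothetical names; plain versions with no bit tricks or inline assembly.

// Floor to int via the standard library: no dependence on the float bit layout
// or on the current FPU rounding mode. The floored value must fit in an int.
inline int floor_to_int(float x)
{
    return static_cast<int>(std::floor(x));
}

// log2 via the natural log, multiplying by 1/ln(2) instead of dividing by ln(2).
inline double log2_via_ln(double x)
{
    const double inv_ln2 = 1.4426950408889634; // 1 / ln(2)
    return std::log(x) * inv_ln2;
}

// Dot product written out directly; scheduling is left to the compiler.
inline float dot3(const float a[3], const float b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
```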

Instead, I would advise concentrating on how to get things functional. Not just anything, only the things that are truly needed. Therefore, it is of primary importance to have very clear design goals and to not slip away from them. I would even draw up a time schedule and stick to it as closely as possible. Otherwise you might just spend months coding, not achieve much, and eventually get demotivated. By sticking to a plan you'll be able to get amazing results in minimal time.

Good luck! :yes:

kusma
01-05-2009, 04:49 AM
CeilPow2 and NextPowerOfTwo should pretty much be the same, but there's a bug in CeilPow2 it seems. It should OR all the shifted values together instead of just effectively right-shifting everything with 31.
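
Since the blog's code did not survive posting intact, here is a hypothetical reconstruction of the usual "OR the shifted values together" approach, not the original function. It assumes a 32-bit unsigned input and returns 1 for an input of 0.

```cpp
#include <cstdint>

// Hypothetical reconstruction, not the blog's code: round up to a power of two
// by OR-ing shifted copies so every bit below the highest set bit becomes 1.
// Valid for x <= 2^31; larger inputs wrap around to 0.
inline std::uint32_t ceil_pow2(std::uint32_t x)
{
    if (x <= 1) return 1;   // 0 and 1 both round up to 1
    --x;                    // so exact powers of two map to themselves
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;           // all bits below the top bit are now set
    return x + 1;           // carry into the next power of two
}
```

If the `|` operators are dropped, each line only overwrites x with a shifted copy, which matches the "right-shifting everything" behaviour described above.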

Nick
01-05-2009, 05:15 AM
It should OR all the shifted values together instead of just effectively right-shifting everything with 31.
Actually it looks like all the & and | symbols have been eaten by the blog.

kusma
01-05-2009, 05:40 AM
Ah, that makes sense. In that case, the only functional difference I can see between those functions is that NextPowerOfTwo is capped at a positive 16 bit range (and "handles" negative values).

v71
01-05-2009, 06:41 AM
Yes, I am having problems with posting code in the blog; I will try to correct them ASAP.

Grumpy
01-05-2009, 08:52 AM
I have found it hard to keep track of what you have recently added. If you could break the blog into subsequent pages to track the additions and transitions, it would be greatly appreciated. Other than that, the blog is a good idea and is off to a good start.


In other news; I'm a long time viewer, first time poster ;)

v71
01-05-2009, 10:42 AM
I have decided to post only theory or explanations about the code, because someone reported malfunctioning code fragments. I also noticed that Blogger omitted & and |, so future code will be added only as a zip file.

KPBeast
01-05-2009, 01:11 PM
I have to recommend a few things myself, seeing as I have had prior experience developing engines. You have already been given some great advice. One thing I can't stress enough, and which has actually already been talked about, is premature optimization. It's simply ridiculous unless it has been TRIED & PROVEN. Otherwise, you're wasting time that could be spent doing far better things for your engine.

Another thing: always write UNIT TESTS, even for the smallest, tiniest detail. Unit tests will help you, and they give you a basis for knowing whether things are still working over time. It's easier to run through a load of unit tests than it is to write up a new test every single time and try to remember what you might have affected with your modifications.
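
As a rough idea of how small such a test can be, here is a table-driven sketch with no framework at all. The function under test, `float_to_byte`, is a hypothetical clamping conversion made up for this example; the point is only that the out-of-range and NaN cases get written down once and re-checked on every run.

```cpp
#include <cstdint>
#include <cstdio>
#include <limits>

// Hypothetical function under test: clamp to [0, 255] and round to nearest.
inline std::uint8_t float_to_byte(float x)
{
    if (!(x > 0.0f)) return 0;        // negative values, zero and NaN
    if (x >= 255.0f) return 255;      // everything above the range
    return static_cast<std::uint8_t>(x + 0.5f);
}

struct Case { float input; int expected; };

// Returns the number of failing cases, so any test runner can call it.
inline int run_float_to_byte_tests()
{
    const Case cases[] = {
        {-10.0f, 0}, {0.0f, 0}, {0.4f, 0}, {0.5f, 1},
        {127.2f, 127}, {255.0f, 255}, {1000.0f, 255},
        {std::numeric_limits<float>::quiet_NaN(), 0},
    };
    int failures = 0;
    for (const Case& c : cases) {
        const int got = float_to_byte(c.input);
        if (got != c.expected) {
            std::printf("float_to_byte(%f) = %d, expected %d\n",
                        static_cast<double>(c.input), got, c.expected);
            ++failures;
        }
    }
    return failures;
}
```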

And my final recommendation: WRITE GAMES with your engine. They will help you learn the engine better, what works and what doesn't, and they are another basis on top of the unit tests that you can use to see what has changed over time, and what needs to be fixed, etc.

Well, that's all I've got for now. I hope that helps, and overall just have fun! Which it already appears that you are. :D

JarkkoL
01-05-2009, 01:47 PM
Regarding "premature optimization", I must say that you can't delay all your optimizations to the last stages of development either, at least if you are developing games. You must know relatively early in development how well the final code is going to perform in order to know how rich the content you create can be or how many features you can run simultaneously. If you have a good idea about final performance early, you can focus your efforts better.

v71
01-05-2009, 03:08 PM
"So while you might have been quite pleased with your collection of 'fast' functions, the reality is that almost every single one of them is seriously flawed..."

"My advise is to just throw ALL of it away..."

Man,that made me laugh like 30 seconds straight...thanks ;-)

TheNut
01-05-2009, 06:02 PM
That's cause Nick was talking assembly when he was only 2 milliseconds old. That's a looong time in the computer world! For the rest of us humans, it's always interesting to see what's out there ;) It would be good if you bundled benchmarks with them. Compare standard C/C++ generated code versus use of your optimizations. I'd probably be inclined to implement something along the lines of a fast type converter. I do try to avoid them, but sometimes due to design I frequently change between ints and floats. The C++ notation for typecasting is ugly as hell too ;) I'd also be interested in knowing the performance of faster sin and cos since I'd like to maximize the performance of my DSP filters that use them, even if at a slight cost of quality.
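
A minimal sketch of what such a bundled benchmark might look like, assuming a hypothetical `fast_sin` (a short Taylor polynomial, only reasonable for |x| <= pi/2) and a hypothetical timing helper `time_calls`. Neither comes from the blog, and the numbers it produces depend entirely on compiler, flags and CPU.

```cpp
#include <chrono>
#include <cmath>

// Hypothetical "fast" sine: Taylor polynomial up to x^7, for |x| <= pi/2 only.
inline float fast_sin(float x)
{
    const float x2 = x * x;
    return x * (1.0f - x2 / 6.0f * (1.0f - x2 / 20.0f * (1.0f - x2 / 42.0f)));
}

// Times `iterations` calls of f over inputs in [0, pi/2) and returns seconds.
// The running sum is written to `sink` so the compiler cannot drop the loop.
template <typename F>
double time_calls(F f, int iterations, float& sink)
{
    const auto start = std::chrono::steady_clock::now();
    float sum = 0.0f;
    for (int i = 0; i < iterations; ++i) {
        const float x = 1.5707963f * static_cast<float>(i) / static_cast<float>(iterations);
        sum += f(x);
    }
    sink = sum;
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling `time_calls(fast_sin, n, sink)` and `time_calls([](float x) { return std::sin(x); }, n, sink)` with the same n gives two timings to compare. Checking the error against `std::sin` at the same time shows what the speed, if any, costs in quality.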

kusma
01-06-2009, 02:39 AM
I'd probably be inclined to implement something along the lines of a fast type converter. I do try to avoid them, but sometimes due to design I frequently change between ints and floats.
IIRC, the Intel Core2 architecture can convert 4 floats to ints (or the other way) per clock while still having 4 of the 5 execution units ready to do "real" work. You shouldn't usually need to optimize this any more - but if you really do, use some SSE intrinsics to vectorize it.
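
For reference, a sketch of what vectorizing the conversion with SSE2 intrinsics could look like. The function name `floats_to_ints` and the `HAS_SSE2` macro are made up for this example. It truncates toward zero like a C cast and falls back to a scalar loop when SSE2 is not available.

```cpp
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>   // SSE2 intrinsics
#define HAS_SSE2 1       // hypothetical macro, local to this example
#else
#define HAS_SSE2 0
#endif

// Converts `count` floats to 32-bit ints, truncating toward zero.
inline void floats_to_ints(const float* in, std::int32_t* out, int count)
{
    int i = 0;
#if HAS_SSE2
    for (; i + 4 <= count; i += 4) {
        const __m128 f = _mm_loadu_ps(in + i);       // load 4 floats (unaligned)
        const __m128i n = _mm_cvttps_epi32(f);       // 4 truncating conversions
        _mm_storeu_si128(reinterpret_cast<__m128i*>(out + i), n);
    }
#endif
    for (; i < count; ++i)                           // scalar tail or fallback
        out[i] = static_cast<std::int32_t>(in[i]);
}
```

`_mm_cvttps_epi32` always truncates, so unlike the frndint approach it does not depend on the current rounding mode.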

Nick
01-06-2009, 04:05 AM
Man,that made me laugh like 30 seconds straight...thanks ;-)
I fail to see how that's hilarious, but every laugh is positive! :lol:
Regarding "premature optimization", I must say that you can't delay all your optimizations to the last stages of development either, at least if you are developing games. You must know relatively early in development how well the final code is going to perform in order to know how rich the content you create can be or how many features you can run simultaneously. If you have a good idea about final performance early, you can focus your efforts better.
While I totally agree about the last part, I believe that premature optimization and performance estimation are two separate things.

This is why we have prototypes. If you have no idea how something performs, write a minimal implementation that will allow you to evaluate the algorithm. It doesn't have to be fine-tuned at all. If during profiling you discover that a floor() function takes half of the execution time, leave it. The important thing is the algorithmic complexity and to have an estimate of how it will perform in the final product. Of course the latter requires some experience and might need some additional experimenting on its own, but that still doesn't fall into the category of premature optimization. Prototypes should be thrown away; failure to do so commonly results in the lava flow (http://en.wikipedia.org/wiki/Lava_flow_(programming)) anti-pattern.
That's cause Nick was talking assembly when he was only 2 milliseconds old. That's a looong time in the computer world!. For the rest of us humans, it's always interesting to see what's out there ;)
Heh, actually, I'm just trying to forewarn people about the type of mistakes I make over and over again. :whistle: I'm the first to admit it's great fun to experiment with these kinds of 'performance' tricks, but more often than not it's simply counter-productive.

JarkkoL
01-06-2009, 05:33 AM
This is why we have prototypes. If you have no idea how something performs, write a minimal implementation that will allow you to evaluate the algorithm.
I don't see what prototyping has to do with this discussion because you still have to optimize the feature to have a good idea how it performs. If you can ever afford prototyping in real game projects, you only test if the approach you are taking is feasible functionality-wise. Once you start to optimize your code, it becomes your final implementation. When an artist asks you how many polygons they can have on a character or how many characters they can have simultaneously on the screen, you are clueless without optimized code (note, I don't mean final optimized code, but code that has been optimized to a level where you are pretty confident about the final performance).

Even though algorithmic complexity has its value, it's often overstated and preached by fresh university graduates. There can be dominating performance bottlenecks in code which has nothing to do with algorithmic complexity.

But in the end you have to know where to put your optimization efforts and how early in the project. I'm just saying that it's a totally wrong ideology to delay all optimizations to the late stage of development: "Premature optimization is the root of all evil", but also "Belated pessimization is the leaf of no good". Unfortunately Mr. Knuth's words have been taken out of context and people never talk about the "We should forget about small efficiencies, say about 97% of the time" part, and he wasn't exactly working on games where you have to work with content people which twist these numbers a bit (:

.oisyn
01-06-2009, 08:30 AM
Is -2 a power of 2? This function returns zero while the result should be true.
Uhm, since when is -2 a power of 2? I must have missed that groundbreaking discovery in recent news... ;)
And for the smart-asses: no, 1 + pi∙(ln 2)^-1∙i is not a natural number ;)

CeilPow2: Isn't this supposed to be the same thing as NextPowerOfTwo? Quite confusing.
Actually, intuitively, I would expect CeilPow2(2) to return 2, while I would expect NextPow2(2) to return 4. But maybe that's just me.

Nick
01-06-2009, 08:48 AM
I don't see what prototyping has to do with this discussion because you still have to optimize the feature to have a good idea how it performs.
My point is that performance estimation is a separate process. It doesn't justify premature optimization.

Note that I'm not saying optimization should be deferred to the very last day of development. But it's just pointless and a waste of time when doing it blindly like in the blog. A third of those functions are just bloat that will never be used, another third is mathematically wrong and will lead to bugs, and the last third is slower than a straightforward or standard implementation. Sorry v71, nothing personal, but I would throw it all away till you have at least a functional module that is fully tested and profiled before you start with low-level optimizations.
But in the end you have to know where to put your optimization efforts and how early in the project. I'm just saying that it's a totally wrong ideology to delay all optimizations to the late stage of development: "Premature optimization is the root of all evil", but also "Belated pessimization is the leaf of no good". Unfortunately Mr. Knuth's words have been taken out of context and people never talk about the "We should forget about small efficiencies, say about 97% of the time" part, and he wasn't exactly working on games where you have to work with content people which twist these numbers a bit (:
I agree that game development is different because it has higher performance demands. The problem is that this too quickly leads to people thinking 50% of code needs optimization. I have yet to see a case in game development where premature optimization has been overestimated. So I wouldn't say it's a totally wrong ideology...

JarkkoL
01-06-2009, 10:04 AM
It doesn't justify premature optimization.
Nothing justifies premature optimization of course because by definition it's premature (: My beef with people preaching about premature optimization is that they always talk as if all optimization is bad and that you should profile the project and then optimize the pieces which show up in the profiler. Nice theory, but in practice anyone who has actually done any optimization of a real project knows that the profiler only shows you the top few hot spots. It doesn't show when your entire code base is bloated with half-assed implementations of hundreds or thousands of functions whose cumulative effect results in bad performance of your application.

I agree that game development is different because it has higher performance demands
And more importantly there is large amount of people creating content, who need to have good idea about the final performance characteristics of the game early on the project (i.e. when entering production).

alphadog
01-06-2009, 10:23 AM
It doesn't show when your entire code base is bloated with half-assed implementations of hundreds or thousands of functions whose cumulative effect results in bad performance of your application.

I don't agree, although I think I understand what you are trying to say.

If your entire codebase is in such bad shape at the end, that kind of problem will not be solved by any up-front "optimization" activity because it is the poor skillset surrounding the project that led the codebase to this fate. The same people who code badly would not be able to "pre-optimize" in any productive way.

JarkkoL
01-06-2009, 11:34 AM
If your entire codebase is in such bad shape at the end, that kind of problem will not be solved by any up-front "optimization" activity because it is the poor skillset surrounding the project that led the codebase to this fate.
Isn't that a bit contradictory? If a code base ends up in a state where it's badly optimized, surely up-front optimization would help. It would likely mean that you would have less but better performing and higher quality functionality. It's not necessarily the poor skillset of programmers (even though it plays its role in the equation), but where management of the project focuses the team's efforts.

starstutter
01-06-2009, 01:12 PM
It would likely mean that you would have less but better performing and higher quality functionality.

I would like to add a random comment here:
As ridiculously expensive as Photoshop is (the actual Adobe version) and as relatively few tools as it includes (at least in what I've seen of it), it's fairly superior to other products. Why? Because the tools that it does give you are highly polished and very fluent to work with. You can give a program x1000 tools, but I believe that's called bloatware, and it makes it even worse when not one of them works worth a crap.

So in short, I agree :)

alphadog
01-06-2009, 01:27 PM
Isn't that a bit contradictory? If a code base ends up in a state where it's badly optimized, surely up-front optimization would help. It would likely mean that you would have less but better performing and higher quality functionality. It's not necessarily the poor skillset of programmers (even though it plays its role in the equation), but where management of the project focuses the team's efforts.

I think this thread has become a big, tangential semantic mess, so this may be my last post in here. (I guess that's why Starstutter started another thread. I'll go philosophize there...)

What is "premature optimization" and what is "design"? Should you worry about "premature optimization"?

My point is that premature optimization is just a subset of design. If you can't generally design properly, you sure as hell won't pre-optimize properly. If you can design properly, you should end up at minimum with a passable codebase, and so you should focus on optimizing bottlenecks later. This is a more efficient use of time and resources, and gets your deliverables out faster. It's basically Lean Development concepts, in a way...

KPBeast
01-06-2009, 01:28 PM
For some reason I feel somewhat responsible for this discussion about premature optimization. If I am the one you were responding to, JarkkoL, then I would recommend that you re-read my post.

Nick
01-06-2009, 11:32 PM
Uhm, since when is -2 a power of 2? I must have missed that groundbreaking discovery in recent news... ;)
Whoops. :whistle:
Actually, intuitively, I would expect CeilPow2(2) to return 2, while I would expect NextPow2(2) to return 4. But maybe that's just me.
Yes, that would have made sense, but that's not how it was implemented. What would be even clearer is to just write 2 * CeilPow2(x) explicitly and not have a confusing second function.
My beef with people preaching about premature optimization is that they always talk as if all optimization is bad and that you should profile the project and then optimize the pieces which show up in the profiler. Nice theory, but in practice anyone who has actually done any optimization of a real project knows that the profiler only shows you the top few hot spots. It doesn't show when your entire code base is bloated with half-assed implementations of hundreds or thousands of functions whose cumulative effect results in bad performance of your application.
I have to disagree. There are never more than a handful of hotspots. What you're concerned about is that 'tail' of remaining functions. Say it takes 20% of execution time, and half of the code is under your control. Then maybe after half a year you've doubled the speed of that part. Congratulations, you've just sped up your application by 5%. While if you concentrated on the top hotspots it's easy to gain a lot more in a matter of days.
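
The arithmetic behind that 5% can be sketched with Amdahl's law. The helper name `overall_speedup` is made up for this example, and the 40% hotspot used for comparison is an assumed figure, not a measurement.

```cpp
// Overall speedup when a fraction p of the run time is made s times faster
// (Amdahl's law): new_time = (1 - p) + p / s, relative to an old time of 1.
inline double overall_speedup(double p, double s)
{
    return 1.0 / ((1.0 - p) + p / s);
}

// The tail case from the post: 20% of the time is in the tail and half of it
// is under your control, so p = 0.10; doubling its speed gives s = 2.
inline double tail_case() { return overall_speedup(0.10, 2.0); }     // about 1.05

// Assumed comparison: a single hotspot taking 40% of the time, also doubled.
inline double hotspot_case() { return overall_speedup(0.40, 2.0); }  // 1.25
```

Doubling the speed of 10% of the run time saves 5% of it, so the application ends up roughly 5% faster, while the same effort on the assumed 40% hotspot would give 25%.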

So it's absolutely ridiculous to optimize something before you know whether it will be part of the top or the tail of the hotspots.

Of course, you all have to weigh this against the risks. Something is only a premature optimization if you risk code bloat, robustness or performance. So for instance choosing a faster STL implementation is not a premature optimization. It's not as hard as it looks to determine when it's mature or not. When in doubt, it's premature.

So back to the topic of v71's fast math functions, in my opinion it's all clearly premature. There are too many assumptions. It's assumed that these functions will be needed later, it's assumed they are correct and robust, it's assumed they are faster, etc. :sad:

JarkkoL
01-07-2009, 02:42 AM
I have to disagree. There are never more than a handful of hotspots.
Well, that's what I just said, didn't I? (:

If you have the mentality of writing badly performing code, it's the cumulative effect of that code that will bring down the performance of your entire application. You are saying to optimize your floor() function only if it shows up in the profiler as a hot spot. I say that's just bogus advice, because it will never show up in your profiler and you know it. It will take maybe only 0.1% of the entire execution time and never show as a "hot spot" in the profiler. BUT if you have the mentality of writing all those functions thinking "I'll optimize it if it shows up in profiler" you will have 1000 badly performing functions, each taking 0.1% of the execution time.

It's easy to focus on the hot spots and optimize those. And that's what you have to do as well, but it's not the only optimization you have to do. Having the mentality of constantly writing well performing code takes effort and it's up to you how much of your time you want to invest in that. 0%? 10%? 20%? 50%? You seem to think it's 0% and I totally disagree with that.

Nick
01-07-2009, 05:08 AM
Well, that's what I just said, didn't I? (:
Yes, but you consider that a problem, I consider that an asset...
BUT if you have the mentality of writing all those functions thinking "I'll optimize it if it shows up in profiler" you will have 1000 badly performing functions, each taking 0.1% of the execution time.
That never happens. It's so contrived that it makes your whole argument complete nonsense. It also contradicts that you say a profiler shows just the top few hotspots.

You always have some distribution. And if you concentrate your effort on the top hotspots you'll get better results in less time. Guaranteed. After every optimization the distribution might change and you have to concentrate on the new top. This way your time is spent most effectively. Note that you sometimes also have to look at a call graph profile to see whether a function is part of something bigger that can be optimized at a higher level, or it consists of multiple smaller parts of which the slower ones are easier to optimize than the whole. It's not like relevant optimization opportunities disappear over time, you just have to closely analyze your profile data.

Just blindly optimizing things in advance costs a lot of time and won't yield good results. It takes time not just to write the code, also to thoroughly test it, debug it, profile it, and it bloats your code so it's harder to manage the really important things.

Besides, not optimizing any low-level functions in advance doesn't mean their performance has to suck. Standard functions perform just fine unless you execute them millions of times, in which case they will show up in the profiler. Most straightforward implementations are also optimized fine by the compiler. The rest shows up in the profiler.
It's easy to focus on the hot spots and optimize those. And that's what you have to do as well, but it's not the only optimization you have to do. Having the mentality of constantly writing well performing code takes effort and it's up to you how much of your time you want to invest in that. 0%? 10%? 20%? 50%? You seem to think it's 0% and I totally disagree with that.
If you've finished implementing a feature or module and performance doesn't meet the design goal, then and only then it's justified to optimize in the middle of a project before continuing with the next thing. I'd even strongly recommend that, since it's easier to optimize and test code you've just written. But it's totally counter-productive to spend a single minute optimizing anything that isn't a direct design goal, while still implementing functionality. It distracts you from what really matters.

Actually you've beautifully described the false reasoning that leads to premature optimization and all its perils. If you assume that a thousand functions each take equal execution time (you've roughly spent the same amount of time coding them), you might be inclined to give them all equal attention for optimization. In reality, coding time and lines of code are totally unrelated to execution time. Without profiling finished code you're just flying blind.

.oisyn
01-07-2009, 05:18 AM
Yes, that would have made sense, but that's not how it was implemented. What would be even clearer is to just write 2 * CeilPow2(x) explicitly and not have a confusing second function.
Well, no, because for an input value like 3 both CeilPow2() and NextPow2() should return the same result: 4. If you write 2*CeilPow2(3), you would get 8, which makes no sense. Of course, you could define one function in terms of the other, assuming that the other function works as expected:
int CeilPow2(int x)
{
    return IsPow2(x) ? x : NextPow2(x);
}

// OR

int NextPow2(int x)
{
    return IsPow2(x) ? 2*x : CeilPow2(x);
}

Nick
01-07-2009, 05:40 AM
Well, no, because for an input value like 3 both CeilPow2() and NextPow2() should return the same result: 4. If you write 2*CeilPow2(3), you would get 8, which makes no sense.
Point, but clearly it's ambiguous either way. Googling around I found that everyone actually expects it to behave like CeilPow2. What would your definition be useful for anyway?

JarkkoL
01-07-2009, 05:43 AM
That never happens.
I don't know based on what game development experience you make that assertion, but yes it does happen. Think of it as a layer of goo over the entire code base that results from not considering performance while writing code ;) Surely there is more goo in some places that you can easily shovel off, but once you are done with that, the rest of the goo is smeared randomly around the code base and is not so easy to clean up. You seem to think that you can iteratively keep optimizing the code base, always focusing on the new top hot spots, but from my experience of working on several game engines, it doesn't quite work like that in practice. But maybe your experiences with game engines have been different, I don't know.

kusma
01-07-2009, 05:51 AM
Excuse me if I'm stupid here or something, but wouldn't NextPow2(x) simply be "CeilPow2(x+1);"? You move the threshold up by one - this should as far as I can understand give the correct result... (For positive numbers, that is)

alphadog
01-07-2009, 06:10 AM
For the record, I totally agree with Nick, except maybe in degree. (Optimization should be done at various steps, with quick passes in the middle of the project, and deeper passes towards the end, but never all the way up front.) I certainly would PO based on experience where, as Nick says, there is no doubt. Also, some effort can be put to consider/research up-front design/optimization, but otherwise, as they say in Lean methods, "decide as late as possible".

If you continuously PO, you risk never getting done. Or, certainly taking much longer than you could have.

For example, how many people use a string buffer instead of just concatenating strings in languages like Java?

Makes sense in a loop with lots of concatenations, but I've seen people do it outside of loops with the idea that they are "doing it for performance and those anti-PO guys are just too lazy to write the extra 3-5 lines it takes."

Now, most times, because of the times the code path is executed, this PO will result in no benefit to the deliverable. But, the PO mentality itself, when spread through each class, each helper function, each block, will result in a delayed deliverable.

alphadog
01-07-2009, 06:12 AM
BTW, this may be the longest thread in the Personal Announcements forum in recent history I've ever seen... :)

JarkkoL
01-07-2009, 06:19 AM
Ah, I finally found the article I was looking for: http://cowboyprogramming.com/2007/01/04/mature-optimization-2 also http://www.gamasutra.com/features/20051220/thomason_01.shtml

Cheers, Jarkko

.oisyn
01-07-2009, 06:27 AM
Excuse me if I'm stupid here or something, but wouldn't NextPow2(x) simply be "CeilPow2(x+1);"? You move the threshold up by one - this should as far as I can understand give the correct result... (For positive numbers, that is)
Very good point :). Even for negative numbers btw, assuming that CeilPow2(x) would return 1 for x<1 as it should. The only edge case is INT_MAX, but CeilPow2() can't return a valid result for INT_MAX anyway (unless you let it return an unsigned int)
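
A minimal C++ sketch of the identity discussed above, assuming CeilPow2(x) returns the smallest power of two that is >= x (and 1 for x < 1), and NextPow2(x) returns the smallest power of two strictly greater than x. The signatures, the 32-bit width and the unsigned return type are assumptions for illustration, not the code from the blog.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Smallest power of two >= x; returns 1 for x < 1.
// The return type is unsigned so that the result for inputs above 2^30
// (including INT32_MAX) is still representable.
inline std::uint32_t CeilPow2(std::int32_t x)
{
    if (x <= 1) return 1u;
    std::uint32_t v = static_cast<std::uint32_t>(x) - 1u;
    // Smear the highest set bit into every lower bit position.
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    return v + 1u;
}

// Smallest power of two strictly greater than x, written as CeilPow2(x + 1).
inline std::uint32_t NextPow2(std::int32_t x)
{
    // x + 1 would overflow a signed int for INT32_MAX, the edge case noted above.
    assert(x < std::numeric_limits<std::int32_t>::max());
    return CeilPow2(x + 1);
}
```

With these definitions CeilPow2(4) gives 4 while NextPow2(4) gives 8, and any negative input to NextPow2 gives 1.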

alphadog
01-07-2009, 06:30 AM
once you are done with that, the rest of the goo is smeared randomly around the code base and is not so easy to clean up. You seem to think that you can iteratively keep optimizing the code base, always focusing on the new top hot spots.

Assuming we have a really bad codebase, then we optimize the top two or three hot spots. If it still doesn't perform within reqs, we pick the next biggest offenders and optimize them. At some point, we're left with an even smear of low performance issues. But, just pick any and rinse, lather, repeat, until you are under specs.

This is an efficient way to use a team's resources.

Instead, what you advocate is to spend a lot of time to performance-optimize each and every part, and interaction thereof, and hopefully end up with a performant whole. This is usually counter-productive. You either a) optimize things you wouldn't have had to optimize (irrelevant to actual and/or perceived performance), b) paralyze your time-to-market, c) hand-tune a block that the compiler tunes anyway, or d) optimize something that is made irrelevant by changing technology.

Good codebases will not often exhibit your stipulated behavior, though. This is because good codebases will have been built from the ground up with basic principles (like TDD, etc) that would make it a) unlikely to happen, and b) easier to fix. (I don't buy the "harder to do later". If that's the case, you have bad code.)

PS: There is a BIG difference between proper methods that do not sacrifice resources and "cowboy coding". "Waterfall" people are always afraid of these. There's also a big difference between general knowledge and strategies ("avoid allocations" or "use a BSP tree instead of z-buffering for this app") and specific optimizations ("use a string buffer here instead of a String because we are looping 100K times").

PPS: As an aside, I remember a project where a guy spent lots of time on PO. Ended up with a pretty fast app out of the gates, but many of his techniques resulted in excessive memory use. He was sooo focused on performance, he forgot other aspects. He sub-optimized at the cost of the overall application.

PPPS: Link wars!
http://www.tantalon.com/pete/cppopt/general.htm#GoodOptimizationStrategies

JarkkoL
01-07-2009, 06:54 AM
This is an efficient way to use a team's resources.
I don't agree. It's ok to polish the top few hot spots, but once you hit that even smear of lower performance code to optimize in order to hit your performance goal, it starts to take a lot of time. What happens then? You start to cut/optimize content to ship in time, which is a total waste of the team's resources. I see that you and Nick observe this purely from a programmer's point of view, but you can't do that when you talk about an "efficient way to use a team's resources". I really suggest reading the Mick West article I posted about early optimization and why it's important.

Good codebases will not often exhibit your stipulated behavior.
... which is because good codebases are optimized early.

Edit: And continuing with this thread is a total waste of my team's time as well, so I will bow out. But it's nice to have a good argument once in a while, so thank you (:


Cheers, Jarkko

alphadog
01-07-2009, 07:08 AM
I really suggest reading the Mick West article I posted about early optimization and why it's important.

The root of the problem is you conflate general design concepts, strategies and methods, which good coders would know and use inherently, with specific optimizations, which are application dependent and can only surface with an actual codebase to test.

I disagree with your Mick West link in the same way I disagree with you. As a commenter says in his thread, not all parts of all applications *need* to avoid iterations in all places.

I prefer a more balanced approach. No optimization up front, with growing amounts as you get from late alphas to betas. You should be happy with your profiling results in your late betas, before your release candidates.

EDIT: Sorry, Jarkko. After I posted and refreshed, saw you wanted out of the discussion. Hopefully, I don't drag you back in. ;)

Nick
01-07-2009, 08:06 AM
I don't know what game development experience you base that assertion on, but yes, it does happen.
No it doesn't. Based on profiling 3DMark, Splinter Cell, Assassin’s Creed, Dirt... just to name a few. :sneaky:
Think of it as a layer of goo over the entire code base that results from not considering performance while writing code ;)
It doesn't matter. If floor() is called only once per frame I don't care how much faster it could be. If it's called millions of times, it will show up in the profile and I will optimize it.

By the way, note that even if it doesn't make the top ten that doesn't mean I won't consider it. Actually we should sort the functions by the gain / effort ratio. Experience obviously helps a lot with that. But if you have no idea yet what can be gained and how much effort it will take, clearly it's better to profile first and focus on the biggest hotspots than to optimize prematurely.
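
A rough C++ sketch of ranking candidates by gain / effort ratio, as described above. The Candidate struct, its field names and the gain formula are hypothetical; the frame-time share is assumed to come from a real profiler run, while the expected speedup and the effort are estimates.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record for one profiled function (all names are illustrative).
struct Candidate {
    std::string name;
    double frameShare;       // measured fraction of frame time spent in the function
    double expectedSpeedup;  // estimated speedup factor if optimized (>= 1)
    double effortDays;       // estimated work needed, in days (> 0)
};

// Estimated fraction of frame time saved: share * (1 - 1 / speedup).
inline double EstimatedGain(const Candidate& c)
{
    return c.frameShare * (1.0 - 1.0 / c.expectedSpeedup);
}

// Gain per day of effort; higher means a better use of time.
inline double GainPerEffort(const Candidate& c)
{
    assert(c.effortDays > 0.0);
    return EstimatedGain(c) / c.effortDays;
}

// Orders candidates so the best gain / effort ratio comes first.
inline void SortByGainPerEffort(std::vector<Candidate>& candidates)
{
    std::sort(candidates.begin(), candidates.end(),
              [](const Candidate& a, const Candidate& b) {
                  return GainPerEffort(a) > GainPerEffort(b);
              });
}
```

A function that is called once per frame ends up near the bottom of this list even if it could be made several times faster, because its share of the frame time is tiny.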
It's ok to polish the top few hot spots, but once you hit that even smear of lower performance code to optimize in order to hit your performance goal, it starts to take a lot of time.
Really? So it takes less time to blindly optimize your functions up front instead of optimizing just those that you've actually measured take up the bulk of execution time?

I've seen quite a few people make the mistake of wasting lots of time optimizing things prematurely and then finding there is not enough time left for proper profiling and optimization. Sound familiar?