Separate thread for game input?


bignobody
10-11-2005, 09:01 AM
Hello,

Currently my game runs in a single thread with the user input being collected once per game loop. This seems to work OK, provided that it is running at a reasonable frame rate.

Because the game controls can really make or break a game, I'm starting to wonder if I should move collecting player input to a separate thread.

Is this more trouble than it's worth?

.oisyn
10-11-2005, 09:18 AM
Assuming the input is queued (which usually is the case), it makes no sense to collect it in a different thread, because the game won't react to that input immediately; you only see the feedback when the input is processed and a new frame is drawn. Collecting the input immediately doesn't mean you'll get feedback as soon as you press a button.
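
As a minimal sketch of this point (Python for brevity; `run_frame`, the event names, and the `state` dictionary are made up for illustration and do not come from any real input API): queued input is simply drained once at the start of a frame, so it does not matter when during the previous frame each event arrived.

```python
from collections import deque

def run_frame(pending_events, state):
    """Drain every input event queued since the last frame, then return the state."""
    while pending_events:
        event = pending_events.popleft()
        if event == "left":
            state["x"] -= 1
        elif event == "right":
            state["x"] += 1
    # Rendering would happen here, after all queued input has been applied.
    return state

# Events that piled up in the queue while the previous frame was being drawn.
events = deque(["right", "right", "left"])
state = run_frame(events, {"x": 0})
print(state)
```

However early an event is collected, its effect only becomes visible when the frame that consumed it is drawn.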

Are you having troubles when the framerate drops?

bignobody
10-11-2005, 11:07 AM
Hi .oisyn, thanks for the reply.

I'm not having a lot of trouble, but I do notice a bit of a difference in the "feel" of the controls at lower framerates. If I'm running it on my laptop fullscreen at 640 x 480, the controls feel a little more responsive than when it's running at 1024 x 768 (and therefore at a lower framerate).

You do make a good point, since the input is not going to be processed until the game loop anyway. Am I really going to save that much time by not having to read all the devices, just process the buffered input? Hard to say. Perhaps I'll leave it as is. Cheers!

Reedbeta
10-11-2005, 11:25 AM
I'm pretty sure that most games I've played, including commercial ones, demonstrate something similar. The controls are just barely less responsive at a lower framerate. I'm not sure there is any way to get around this. The framerate controls the minimum amount of time that it takes to get any response to a control action; if your controls are implemented properly you'll see a bigger per-frame response at a lower framerate (so that the response is at a constant rate with respect to real time); but nothing can change the fact that there's a longer response time at lower framerates.
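
A small sketch of the "bigger per-frame response at a lower framerate" idea, assuming a made-up movement speed (`SPEED`) and hypothetical helper names (`step`, `simulate`): each frame moves the player by speed times the elapsed frame time, so the distance covered per second stays the same while the per-frame step and the wait for the first response both grow.

```python
SPEED = 120.0  # units per second; an arbitrary value chosen for illustration

def step(position, direction, dt):
    """Advance by an amount proportional to the time the frame took."""
    return position + direction * SPEED * dt

def simulate(fps, seconds=1.0):
    """Hold 'right' for the given time at a fixed framerate."""
    dt = 1.0 / fps
    position = 0.0
    for _ in range(int(round(fps * seconds))):
        position = step(position, 1.0, dt)
    return position, dt

pos_fast, dt_fast = simulate(60)
pos_slow, dt_slow = simulate(20)
print(pos_fast, pos_slow)  # roughly equal distances after one second
print(dt_fast, dt_slow)    # but the slow case waits three times longer per frame
```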

bignobody
10-11-2005, 12:06 PM
Thanks Reed! I think that clinches it then. Sure glad I made this post, you guys just saved me a lot of pointless work! :yes:

.oisyn
10-11-2005, 12:49 PM
Yes, I've experienced it with commercial games as well. Max Payne particularly comes to mind, but my total PC configuration probably had something to do with it as well. The thing mostly causing this problem is triple buffering together with vsync. In the worst case, it can take like 4 or 5 frames between the input and the actual output, and with a low framerate the game can be quite unresponsive.
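
To put rough numbers on that, here is a back-of-the-envelope sketch (`input_latency_ms` is a hypothetical helper; the framerates are example values, and real latency also depends on the driver and display): the delay is just the number of frames in flight times the frame time.

```python
def input_latency_ms(frames_of_delay, fps):
    """Time between input and visible output, given a delay measured in frames."""
    return frames_of_delay * 1000.0 / fps

for fps in (60, 30, 20):
    print(fps, "fps:", round(input_latency_ms(5, fps), 1), "ms for a 5-frame delay")
```

The same 5-frame delay that is tolerable at 60 fps becomes a quarter of a second at 20 fps.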

I once read an article about a new graphics vendor (Bitboys probably) presenting a technique to make multiple GPUs cooperate by giving each GPU its own frame. That effectively means you are giving commands for the second frame while the first is still rendering, which simply adds one more frame to the delay. I'll ask you this: WHAT THE F*CK WERE THEY THINKING?! But enough of that ;)

Reedbeta
10-11-2005, 12:58 PM
.oisyn: that sounds like nVidia SLI ;)

.oisyn
10-11-2005, 01:10 PM
Scan Line Interleaving? The acronym says otherwise ;). But no, it wasn't nVidia, but I don't know if they use the same technique. Why would they? They've bought the perfect algorithm from 3dfx (read: they bought 3dfx ;)). Or do they?

zavie
10-11-2005, 01:18 PM
I once read an article about a new graphics vendor (Bitboys probably) presenting a technique to make multiple GPUs cooperate by giving each GPU its own frame. That effectively means you are giving commands for the second frame while the first is still rendering, which simply adds one more frame to the delay. I'll ask you this: WHAT THE F*CK WERE THEY THINKING?! But enough of that ;)

What's the problem with that? If the rasterizing step turns out to be the weak link in the rendering process, being able to start a new frame while the previous one is still being rasterized is a good thing. And the delay for processing commands is no greater.

zavie
10-11-2005, 01:22 PM
But no, it wasn't nVidia, but I don't know if they use the same technique. Why would they? They've bought the perfect algorithm from 3dfx (read: they bought 3dfx ;)). Or do they?

They do.
http://en.wikipedia.org/wiki/3dfx

.oisyn
10-11-2005, 01:48 PM
In SLI mode, two Voodoo 2 boards were connected together, each drawing half the scanlines of the screen.
That's not like each GPU drawing its own complete frame.

fringe
10-11-2005, 02:14 PM
I don't know anything, but it comes to mind: why not put the controls and game logic in one thread and the renderer in a separate thread? Then the controls work with the game, and the renderer just updates for the current state of the game.

Of course this might lead to slightly strange things happening at low frame rates where you leap from too far left to too far right without passing through the middle. A thought though.
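
A deterministic sketch of that effect, without real threads (`logic_positions` and `rendered` are hypothetical helpers, and the bouncing object is an invented example): the logic advances every tick, while the renderer only samples the state every few ticks, so at a low frame rate it can see nothing but the extremes.

```python
def logic_positions(ticks):
    """Object bounces between x = -3 and x = +3, moving one unit per logic tick."""
    x, dx = 0, 1
    positions = []
    for _ in range(ticks):
        x += dx
        if abs(x) == 3:
            dx = -dx  # turn around at either edge
        positions.append(x)
    return positions

def rendered(positions, first_tick, ticks_per_frame):
    """States the renderer actually draws when it samples every ticks_per_frame ticks."""
    return positions[first_tick::ticks_per_frame]

positions = logic_positions(24)
print(rendered(positions, 0, 1))  # fast renderer: every intermediate position
print(rendered(positions, 2, 6))  # slow renderer: jumps between far right and far left
```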

corey
10-11-2005, 03:03 PM
I don't know anything, but it comes to mind: why not put the controls and game logic in one thread and the renderer in a separate thread? Then the controls work with the game, and the renderer just updates for the current state of the game.

Of course this might lead to slightly strange things happening at low frame rates where you leap from too far left to too far right without passing through the middle. A thought though.
Just a note to throw out with multi-threading. You automatically add the need to sync, which can possibly add unwanted delays (although not always), and it doesn't guarantee any sort of deterministic execution. What you get with a simple Input -> Logic -> Communication -> Render loop is simple, built-in timing control.

Depending on the UI that you have set up, threading doesn't always provide a faster solution either. Many times, you can just poll for the state you're testing on immediately. If you're using Windows messaging, you're using the window's thread anyway (unless you use a modified thread's message loop).

However, many networking and input thread solutions are possible and can work. This is especially helpful when waiting on a connection or some other blockable/long state.
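
A minimal sketch of that last case, using Python's standard `threading` and `queue` modules (`blocking_wait` and the sleep durations are stand-ins invented for illustration, not a real network call): a worker thread sits in the blocking wait while the main loop keeps running frames and checks for the result without blocking.

```python
import queue
import threading
import time

def blocking_wait(results):
    """Stand-in for a blocking call such as waiting on a connection."""
    time.sleep(0.05)  # pretend this is the long wait
    results.put("connected")

results = queue.Queue()
worker = threading.Thread(target=blocking_wait, args=(results,))
worker.start()

frames = 0
status = None
while status is None:
    frames += 1  # the main loop keeps running its frames
    try:
        status = results.get_nowait()  # non-blocking check, once per frame
    except queue.Empty:
        time.sleep(0.001)  # stand-in for the rest of the frame's work
worker.join()
print(status, "after", frames, "frames")
```

The queue does the synchronisation here, which is exactly the extra cost mentioned above; it buys a main loop that never stalls on the wait.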

Corey

m4x0r
10-11-2005, 03:48 PM
If you are displaying a mouse cursor then using the hardware cursor support can make a big difference in the apparent responsiveness of a game. Of course this doesn't apply to all games.

Max

bignobody
10-12-2005, 06:44 AM
fringe: Thanks for your thoughts. I'm still going to leave things as-is though, as doing what you suggest would require some major code changes which I'm trying to avoid at this point.

corey: Thanks for your input. Definitely sticking with the single thread.

m4x0r: My game doesn't use the mouse pointer, but it is a good suggestion nonetheless!

Cheers!

Reedbeta
10-12-2005, 10:18 AM
BTW, it's Scalable Link Interface now, not Scan Line Interleaving. There are two modes in which SLI can run, one of which interleaves frames rendered alternately by two (or more) GPUs, the other of which attempts to "split the work" of a single frame equally between the GPUs (it doesn't give any details as to how this splitting is accomplished, though possibly by scan-line interleaving).

http://en.wikipedia.org/wiki/Scalable_Link_Interface

bignobody
10-12-2005, 10:20 AM
Sounds like another case of TMA! (Too Many Acronyms) :lol:

.oisyn
10-13-2005, 02:16 AM
Reedbeta: fair enough, but my point was:
GPUs work on the same frame simultaneously, each performing roughly half of the calculations required to render the frame

Assigning different frames to different GPUs can be a serious crime, especially when using feedback effects where the contents of the previous frame are needed, or when doing hardware occlusion culling and you have to wait for the results.

zavie
10-13-2005, 05:59 AM
Come on! The people designing this are not that nuts. If a feedback request is made, the GPU has to wait for the resource to be available. It is exactly the same as programming on a single GPU. It is just faster (in the worst case, it's the same speed).
On a single GPU there are the same problems: texturing operations waiting for geometry ones to be finished, and so on. The programmer just has to be aware of that and take care of bottlenecks.

.oisyn
10-13-2005, 06:32 AM
You're obviously missing the point. The whole advantage of having multiple GPUs disappears as soon as one GPU is waiting for the other one to finish. You want parallel processing with little to no dependencies, not serial processing where each unit has to wait on its predecessor.

By giving each GPU a part of the same frame, your fillrate doubles and there are no stalls. By giving each GPU a different frame, fillrate is lost while one GPU is waiting for the other. I'm not saying the latter situation is actually slower than having a single GPU, I'm saying that the effectiveness of multiple GPUs is much less when using such a technique.

zavie
10-13-2005, 11:41 AM
You're obviously missing the point. The whole advantage of having multiple GPUs disappears as soon as one GPU is waiting for the other one to finish. You want parallel processing with little to no dependencies, not serial processing where each unit has to wait on its predecessor.
I agree with that. Real parallel processing like 3dfx's is much cleverer. But the case of one GPU waiting for the other just has to be avoided, just like bottlenecks in the rendering pipeline have to be avoided. This is all part of the game of optimizing rendering. ;-)

I'm not saying the latter situation is actually slower than having a single GPU, I'm saying that the effectiveness of multiple GPUs is much less when using such a technique.
Indeed.

geon
10-13-2005, 04:01 PM
Hmm. When will we see dual GPUs on a single graphics card?

corey
10-13-2005, 05:20 PM
Hmm. When will we see dual GPUs on a single graphics card?
http://www.ati.com/technology/crossfire/index.html is what is coming up. At least it provides some opportunity for parallel access.

corey

cm_rollo
10-13-2005, 07:00 PM
BTW, it's Scalable Link Interface now, not Scan Line Interleaving. There are two modes in which SLI can run, one of which interleaves frames rendered alternately by two (or more) GPUs, the other of which attempts to "split the work" of a single frame equally between the GPUs (it doesn't give any details as to how this splitting is accomplished, though possibly by scan-line interleaving).

It seems to be more of a "split the whole frame in the middle" approach, if I read nVidia's GPU Programming Guide (http://developer.nvidia.com/object/gpu_programming_guide.html) right. I'm guessing that deep pipelines and branching wouldn't like interleaved lines.

According to the guide, card 1 handles the top part of the frame and card 2 the bottom and the sizes are load balanced between frames, so if card 1 is idle for half the current frame it will get a larger part next time (this would make sense in most outdoor shooters - Heavy shaders on characters and ground and simple shaders for sky).
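
As a rough sketch of how such between-frame balancing could work (this is an assumed scheme for illustration only, not the driver's actual algorithm; `rebalance`, `gain`, and the timing values are invented, and it assumes the cost is uniform within each part of the frame):

```python
def rebalance(split, time_top, time_bottom, gain=0.5):
    """Return a new split: the fraction of the frame height given to the top GPU.

    split        current fraction handled by the top GPU (0 < split < 1)
    time_top     time the top GPU spent on its part of the last frame
    time_bottom  time the bottom GPU spent on its part of the last frame
    gain         how far to move towards the estimated ideal split (0..1)
    """
    cost_top = time_top / split              # estimated cost per unit of height, top
    cost_bottom = time_bottom / (1.0 - split)  # same estimate for the bottom part
    ideal = cost_bottom / (cost_top + cost_bottom)  # split where both times are equal
    new_split = split + gain * (ideal - split)
    return min(0.95, max(0.05, new_split))   # never starve either GPU completely

# Top half (mostly sky) took 2 ms, bottom half (characters, ground) took 6 ms,
# so the top GPU is given a larger share of the next frame.
print(rebalance(0.5, 2.0, 6.0))
```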

Rydinare
10-15-2005, 06:42 AM
Assuming the input is queued (which usually is the case), it makes no sense to collect it in a different thread, because the game won't react to that input immediately; you only see the feedback when the input is processed and a new frame is drawn. Collecting the input immediately doesn't mean you'll get feedback as soon as you press a button.

Are you having troubles when the framerate drops?

I think this is a design issue as well. One could argue that input won't be reacted to immediately, but consider this...

Your app runs at 60 frames a second. That's just the graphics. Often, other behind-the-scenes processes can run quite a bit faster than that. So, get a message, transport it to the game logic thread and it gets processed, let's say, 1/100 of a second later. This is undetectable to the user. Unless you're running at like 5 frames a second (in which case you have much bigger issues than this), I think putting input in a separate thread does no harm timing-wise.

Getting back to why this is a design issue, a common top-down approach is to approach a system and break it into subsystems. Input often falls into a common subsystem. To ensure efficiency, each subsystem is often made up of one or more threads, which communicate (via pipes, mailboxes, or direct method calls with semaphore protection). In my experience, a design like this usually is pretty efficient.
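
A small sketch of the "direct method calls with protection" variant, using a plain mutex (`threading.Lock`) rather than a semaphore; the `InputState` class, its method names, and the press counts are hypothetical and only illustrate the pattern:

```python
import threading

class InputState:
    """Shared between the input subsystem and the game logic subsystem."""

    def __init__(self):
        self._lock = threading.Lock()
        self._presses = 0

    def record_press(self):
        """Called from an input thread."""
        with self._lock:
            self._presses += 1

    def take_presses(self):
        """Called from the logic thread: fetch and reset the count atomically."""
        with self._lock:
            count, self._presses = self._presses, 0
            return count

def input_thread(state, count):
    for _ in range(count):
        state.record_press()

state = InputState()
threads = [threading.Thread(target=input_thread, args=(state, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = state.take_presses()
print(total)
```

The lock keeps the counter consistent however the threads interleave; the price is that every caller may briefly wait for the others.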

Now then, that all being said, I think it falls into the main issue of your architecture. If you have an approach similar to the one I mentioned above, I think a separate thread for input works very well. If you don't and you're not really using threads for most of the rest of your system, you may not want to go this route.

One more thing to remember. Multithreaded applications will be more efficient now and in the near future. Here's a good article on it:

http://www.gotw.ca/publications/concurrency-ddj.htm