Using hand tracking for a game menu pointer: best practices?
Hey everyone,
I'm adding some more stuff to my game and I'm kind of stuck on a problem with the menus. Before the gesture API, I used pointer intersection points and a hover timer (like the Kinect: when you hover over something for N seconds, the click is registered).
After that, I started using the palm position translation to move the pointer relatively, and the "keytap" gesture for clicks. Unfortunately, the results weren't very good: sometimes, the palm position moved when I tapped the index finger, and tracking didn't seem that great either.
So here's the question: how would you approach that problem? How would you track movement in a way that you could use gestures to click? Or would the hovering timer solution be better?
Thanks!
Recently, I've been playing with the sphereRadius to do this. So, 'click' is a 'closed hand'. You could either use the sphere center as the raw positioning, or use a pointer to position, then when the sphere radius goes below a certain value, the user has "closed" their hand.
I would throw in some value averaging, or a mean over a short time period, so that you don't get jitter.
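A rough sketch of the closed-hand click with averaging, in Python for brevity. It assumes you feed it one sphere radius per frame. The class name, the millimetre thresholds and the separate "open" threshold (hysteresis, so the click doesn't flicker around one value) are assumptions for illustration, not anything from the SDK.

```python
from collections import deque


class GrabDetector:
    """Treats a small, averaged sphere radius as a 'closed hand' click."""

    def __init__(self, close_below=45.0, open_above=60.0, window=5):
        # Thresholds are guesses in millimetres; tune them with real users.
        self.close_below = close_below
        self.open_above = open_above
        self.samples = deque(maxlen=window)  # short history for the mean
        self.closed = False

    def update(self, sphere_radius):
        """Feed one radius per frame; True only on the frame the hand closes."""
        self.samples.append(sphere_radius)
        mean = sum(self.samples) / len(self.samples)
        if not self.closed and mean < self.close_below:
            self.closed = True
            return True
        if self.closed and mean > self.open_above:
            self.closed = False  # hand opened again, ready for the next click
        return False
```

A longer window gives a steadier click but adds a few frames of delay before it registers.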
Another approach might be thumb trigger. Use your forefinger as a 'pointer' to get the cursor over the target (while your thumb is outstretched), then close your thumb to the finger to signal a 'click'.
The pointer/thumb trigger is probably more accurate when you have smaller icons, and the palm is probably easier to implement.
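A minimal sketch of the thumb trigger test, assuming you already know which two tips are the thumb and the forefinger (the hard part, not shown here). The function name and both distances are made up for illustration; measuring tip to tip is a simplification, since the thumb really closes against the side of the finger.

```python
import math


def thumb_trigger_pressed(thumb_tip, index_tip, was_pressed,
                          press_mm=25.0, release_mm=40.0):
    """Return True while the thumb counts as closed against the forefinger.

    Tips are (x, y, z) tuples in millimetres. Two thresholds are used so the
    state does not flicker when the gap hovers around a single value.
    """
    gap = math.dist(thumb_tip, index_tip)
    if was_pressed:
        return gap < release_mm  # stay pressed until the thumb clearly opens
    return gap < press_mm        # only press once the thumb is clearly closed
```

Call it every frame with the previous result as `was_pressed`, and treat the change from False to True as the click.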
As folks are starting to fine-tune their apps, the topic of menu selections has come up. We're currently working on some examples and best practices to share with the dev community. They'll have a variety of options: push a button, flip a switch, etc. Something really clear and simple for you to use.
Hoping to get those out to you quickly!
I'll wait a few more days for the new menu examples, but will look into the thumb trigger idea. Thanks a lot, William and Mizago!
If anyone else has any other approaches, feel free to share =)
The thumb trigger is something we're definitely aiming to use and distribute. You can also think about the thumb tapping towards the middle (curled up) finger. Easier than trying to tap the index finger!
My game uses the "scissors" gesture a lot (index and middle fingers up, the others curled).
Do you think using one finger's intersection point for movement (probably middle) and the other one for tapping (getting the index tap by the gestures API) could be a good idea?
For use as cursor-type movement, the "scissors" position might be tricky. I think people equate a single point on the screen with a single reference. Two fingers might not give people a sense that they're moving one thing.
For menus, we're exploring a bunch of options: buttons, switches, knobs... etc. The more "tactile" we can make the various selections, the more it feels like we're creating a space in your computer we can control. Hoping to share those next week.
The fallback option is the hover/timer solution... where hovering your pointer over a selection shows a quick counter (say for 2 seconds) and the selection is made based on that.
Our approach was to keep the menus as simple as possible, because they're something that the user needs to get through to get to the actual beef, the game itself. If the menu experience is poor, the user may give up even before getting to play. Things might have been different if we were building something where menu interaction was a bigger part of the game, but as it is, most users are likely to just get it over with as quickly as possible.
We came up with these guidelines:
- Menus should be intuitive enough so that using them doesn't require tutorials
- Menus should work for all users, even if it means slightly inconveniencing those who would be willing to learn advanced usage
- If the game doesn't require Locate Screen, menus shouldn't require it either
Based on these requirements, we ended up with the hover timer solution. It's intuitive, and the worst-case scenario is not too bad: a user who can't manage a hover selection probably won't be able to play the game either. 1.2 seconds seems to be enough for comfortable use without mis-selections. 0.8 seconds would be optimal for experienced users. If you have lots of menus, you might consider adding an option to toggle between "normal selection speed" and "fast selection speed".
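To make the hover timer concrete, here is a small Python sketch. It assumes your UI code can tell you which item (if any) is under the pointer each frame; the class and method names are hypothetical. The 1.2 second default is the figure mentioned above.

```python
class HoverTimer:
    """Selects an item once the pointer has stayed on it long enough."""

    def __init__(self, dwell_seconds=1.2):
        self.dwell_seconds = dwell_seconds
        self.target = None    # item currently under the pointer
        self.elapsed = 0.0    # seconds spent on that item
        self.fired = False    # True once the current hover has selected

    def update(self, hovered, dt):
        """hovered: item id or None. Returns the id on the frame it is selected."""
        if hovered != self.target:
            # Pointer moved to another item (or off everything): restart.
            self.target, self.elapsed, self.fired = hovered, 0.0, False
            return None
        if hovered is None or self.fired:
            return None
        self.elapsed += dt
        if self.elapsed >= self.dwell_seconds:
            self.fired = True  # do not re-select until the hover changes
            return hovered
        return None

    def progress(self):
        """0.0 to 1.0, handy for drawing the countdown indicator."""
        return min(self.elapsed / self.dwell_seconds, 1.0)
```

Switching between "normal" and "fast" selection speed is then just a matter of changing `dwell_seconds`.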
Our menus are 2D, and the depth coordinate isn't used at all. Instead of using Locate Screen data, we simply have a hard-coded translation to screen coordinates, because this works regardless of whether the screen has been located or not. We don't use pointable direction, either - the on-screen pointer's position is simply XY position of the controlling pointable's tip.
It's not the coolest or fastest way to use menus, granted. But frustration can be avoided by spending more effort on reducing menu depth (e.g. instead of going through level selection each time, have a "continue game" button in the main menu).
For scrolling lists, we have wide areas at the top and bottom of the list. As long as the pointer is hovering over an area, the list scrolls automatically. The drawback is that scrolling all the way to the end of a very long list can take a while, but the benefit is that it doesn't require gestures or any other specialized code. Again, menu design can counter some of this: for example, if you have a list of levels, have it start at the position of the last played level, instead of always at the top.
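The scroll areas can be sketched like this in Python. It assumes screen coordinates where y grows downwards and a list described by its top and bottom edges; the function name, zone height and scroll speed are placeholder values to tune.

```python
def auto_scroll(offset, pointer_y, dt, list_top, list_bottom,
                content_height, zone=80.0, speed=300.0):
    """Return the new scroll offset after one frame.

    While the pointer sits in the top zone the list scrolls up, in the bottom
    zone it scrolls down, and anywhere else it stays put. Multiplying by dt
    keeps the scroll speed the same at any frame rate.
    """
    max_offset = max(0.0, content_height - (list_bottom - list_top))
    if list_top <= pointer_y < list_top + zone:
        offset -= speed * dt
    elif list_bottom - zone < pointer_y <= list_bottom:
        offset += speed * dt
    return min(max(offset, 0.0), max_offset)  # never scroll past either end
```

Starting the list at the last played level is then just a different initial `offset`.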
Even this simple solution does have some unexpected challenges. Here are a few things we found that we needed to implement:
- After selecting a button / menu item, require the pointer to be moved a bit before allowing a new selection. Otherwise the user may not notice that they're already making a second selection after the first one is completed. Especially important when pressing buttons that change menus.
- Once the hover progress is full, it's good to perform a separate animation that shows that the button is getting pressed (or whatever your widget does upon interaction).
- If you have an option to use mouse control (we allow both), don't turn it off at the instant the Leap device sees something. Wait a bit instead. In very poor lighting conditions, the Leap device may sometimes find phantom pointables that are gone in the next frame.
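Two of the points above are easy to sketch in Python: requiring some pointer movement before the next selection, and waiting a moment before trusting that the Leap really sees something. Both class names and the default distance and time are assumptions for illustration.

```python
import math


class SelectionGuard:
    """Blocks new selections until the pointer has moved away a bit."""

    def __init__(self, rearm_distance=40.0):
        self.rearm_distance = rearm_distance
        self.last_select_pos = None

    def on_selected(self, pointer_pos):
        self.last_select_pos = pointer_pos  # remember where the click happened

    def can_select(self, pointer_pos):
        if self.last_select_pos is None:
            return True
        if math.dist(pointer_pos, self.last_select_pos) >= self.rearm_distance:
            self.last_select_pos = None  # moved far enough, re-arm
            return True
        return False


class LeapPresence:
    """Reports the device as active only after pointables persist for a while."""

    def __init__(self, hold_seconds=0.3):
        self.hold_seconds = hold_seconds
        self.seen_for = 0.0

    def update(self, pointable_count, dt):
        # A single frame without pointables resets the timer, so one-frame
        # phantoms never take control away from the mouse.
        self.seen_for = self.seen_for + dt if pointable_count > 0 else 0.0
        return self.seen_for >= self.hold_seconds
```

Only start or advance the hover progress while `can_select` returns True, and only switch away from the mouse while `update` returns True.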
Hope this gives some ideas.
Aki Kanerva
Designer-Programmer
Virtual Air Guitar Company Oy
How did you go about getting your pointer on screen to follow finger movements? Can you share any code to help get this done?
Thanks
For menus, our test users felt comfortable with the hover / timer solution. They recognise this pattern from other platforms (especially games), like you said (Kinect/Wiimote). But if the menu has a small number of options, maybe it would be fun to try distinct gestures to choose between them. In one of our tests here, we used the hands in the "\m/" (heavy metal) position to start the interaction. Everyone likes it. =)
This contribution gave me a lot of ideas! Thanks, Aki!
@Aki: that was a wonderful post - I was rushing to finish my demo and didn't get time to reply. I used the scrolling area at one moment too, but the overall design changed. In any case, the post has a ton of excellent insight, thanks a lot!
@nandico: \m/ is awesome - I'm doing a musical game, so I planned on doing a metal pack and switching one of the symbols for that hahah
I went away from the timer, and now some elements of my game are hover-enabled (when you hover a button, you get some info on the level etc), and then when you "click" it, it starts. But even though pointing and motion is better at the moment, the "click" is still kinda funky.
The main panel displays all the levels (like in Angry Birds or similar games), so I have to think about the best way of rolling back to that. If anyone has any suggestions, they're welcome!
@pixelboi: We use Unity and NGUI, so the pointer code is specific to that combination. It also depends on a lot of other things in our project, so it wouldn't be really useful as a generic library.
Basically, the pointer's position equals the TipPosition of a Leap Pointable. The implementation depends on your project - you could set the position of a pointer object in 3D or 2D space, or draw a pointer texture at the position.
You also need to figure out how to translate and scale Leap coordinates to match your UI's coordinates. Usually this means trying out different offsets and scales until you reach something that feels good. Or, you can use Leap's screen location system (can't give any more info on that since I haven't explored it myself).
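As an illustration of the offset-and-scale approach (in Python rather than Unity code), the mapping can be as small as the function below. The function name, screen size and millimetre ranges are placeholders to tweak until it feels right; they are not values from any real project. The depth coordinate is ignored.

```python
def leap_to_screen(tip_x, tip_y, screen_w=1280, screen_h=720,
                   x_range=(-120.0, 120.0), y_range=(100.0, 300.0)):
    """Map a tip position (mm above the device) to 2D screen pixels.

    x_range and y_range describe the box of space that should cover the
    whole screen. Positions outside the box are clamped to the screen edge.
    """
    nx = (tip_x - x_range[0]) / (x_range[1] - x_range[0])
    ny = (tip_y - y_range[0]) / (y_range[1] - y_range[0])
    nx = min(max(nx, 0.0), 1.0)
    ny = min(max(ny, 0.0), 1.0)
    # The hand moves up for larger y, but screen y usually grows downwards.
    return nx * screen_w, (1.0 - ny) * screen_h
```

A smaller box means less arm movement to cross the screen, at the cost of a twitchier pointer.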
@Yanko: There's one other thing I should mention, and that's the method of choosing which pointable to follow if more than one is found in a given frame. This applies regardless of whether you select items by timer, click, or some other method.
I don't know if it's been improved in the latest SDK, but when we developed our menus using 0.7.4, there didn't seem to be an unambiguous way of following a specific pointable across time. It is possible to have two consecutive frames that both have two pointables, but with different IDs in the first frame than in the second (e.g. 1 and 2 in the first frame, 13 and 27 in the second frame).
Keep in mind that pointables can be found even if the user isn't pointing with anything - we had several occasions where the user was leaning over Leap and their nose was detected as a pointable. Also, the tracking may sometimes detect phantom pointables from things such as fluorescent tube lights. Although this is rare, having the control stick to a phantom pointable is not good for the user.
Therefore, we had to implement a simple logic to choose which pointable is the one that's most likely to be the one that the user wants to control the menus with. In every frame, loop through all detected pointables that are valid, and choose the pointable whose position is closest to whatever pointable was chosen in the previous frame, ignoring IDs entirely.
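The selection logic described above fits in a few lines. This Python sketch assumes you have already filtered out invalid pointables and collected their tip positions as (x, y, z) tuples. What to do when there is no previous choice isn't specified above; picking the tip furthest forward (smallest z) is an assumption.

```python
import math


def choose_pointable(tip_positions, previous_tip):
    """Pick the tip closest to last frame's chosen tip, ignoring IDs."""
    if not tip_positions:
        return None
    if previous_tip is None:
        # Assumption: with no history, take the tip nearest the screen.
        return min(tip_positions, key=lambda tip: tip[2])
    return min(tip_positions, key=lambda tip: math.dist(tip, previous_tip))
```

Store the returned position and pass it back in as `previous_tip` on the next frame.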
Another great insight, Aki ;)
I had to add some height and depth checks to avoid getting foreheads detected as hands - and since my game deals a lot with fingers, I needed a sweetspot where I'd be almost sure the leap wouldn't miss any of them.
The thing I was just testing out is getting the average of all pointers. First I used the new 0.76 projection (which lets me get the pointer intersection according to a specific vector), but that made the corners of the screen hard to reach. Now I've added the common pointer intersection to the mix. It got better in some aspects, especially dealing with corners and such, and it makes the whole screen accessible with a posture that feels natural and lets keytaps be detected. It's jumpier though, so I'll try to remove thumbs or any other pointers that diverge too much and see if it gets any better.
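One possible way to drop diverging pointers before averaging, sketched in Python. It works on 2D screen points, uses the per-axis median as the reference, and the function name and 60-pixel tolerance are arbitrary assumptions.

```python
from statistics import median


def averaged_tip(points, max_deviation=60.0):
    """Average 2D points, ignoring any that sit far from the median."""
    if not points:
        return None
    mid_x = median([p[0] for p in points])
    mid_y = median([p[1] for p in points])
    kept = [p for p in points
            if abs(p[0] - mid_x) <= max_deviation
            and abs(p[1] - mid_y) <= max_deviation]
    if not kept:
        kept = points  # everything diverges: fall back to the plain average
    return (sum(p[0] for p in kept) / len(kept),
            sum(p[1] for p in kept) / len(kept))
```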
Also, I'm moving the cursor "lazily": every frame the spot calculated by projections and such is set and I move the cursor towards it, instead of setting the coordinates directly. That helped a lot with jittering and such.
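The "lazy" cursor boils down to moving a fraction of the remaining distance each frame. A minimal Python sketch, with a made-up function name and follow factor:

```python
def step_towards(cursor, target, follow=0.25):
    """Move the cursor part of the way to the target (both are (x, y))."""
    return (cursor[0] + (target[0] - cursor[0]) * follow,
            cursor[1] + (target[1] - cursor[1]) * follow)
```

A smaller `follow` is smoother but laggier. Note that a fixed fraction per frame behaves differently at different frame rates.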
I have been experimenting with a circular menu: a list in which the 'selected' item is always in the center. When the user scrolls up or down to change the selection, the entire contents of the menu shifts, with the last item wrapping around to the other side. I think of it as an infinite version of the iOS UIPickerView, more akin to The Price is Right wheel.
To me this style of menu solves two problems:
- Selection by tapping or hovering is difficult because the user's hand is not 'grounded' to any physical surface. Therefore it is easy to select the wrong item if the tap gesture occurs slightly outside of the region being pointed, or if the cursor drifts out of the region during the hover timeout interval.
- Traditional 'random access' menus need to be scrolled when there are too many entries to show in the given bounds. This requires an additional interaction to perform scrolling, which itself could be a difficult interaction to get the hang of.
I think the 'circular menu' approach resolves these issues by combining scrolling and selection into a single interaction.
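The wrap-around part of such a menu is mostly modulo arithmetic. Here is a small Python sketch of just the data side (no drawing or gesture handling); the class and method names are invented for the example.

```python
class CircularMenu:
    """A list whose selected item is always drawn in the centre slot."""

    def __init__(self, items, visible=5):
        self.items = list(items)
        self.visible = visible  # how many rows are shown at once (odd number)
        self.selected = 0

    def scroll(self, steps):
        """Positive steps move down the list, negative up, wrapping around."""
        self.selected = (self.selected + steps) % len(self.items)

    def window(self):
        """Items to draw, top to bottom, with the selection in the middle."""
        half = self.visible // 2
        return [self.items[(self.selected + k) % len(self.items)]
                for k in range(-half, half + 1)]
```

Scrolling is the only interaction: whatever ends up in the middle of `window()` is the current selection.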
Hi Adam,
The circular menu approach sounds like a great idea. Do you have any example code on how to implement it, preferably in Objective-C?
Thanks!
I've done work with various camera tracking systems, and filtering is something that usually enters the picture sooner or later. In order to retain usability, latency has to be sacrificed for smoothness. Even if there is no jitter, tracking FPS is usually lower than the app's FPS (30 is pretty typical), so it makes sense to do at least a little bit of filtering to interpolate the gaps where no new tracking data is available.
Leap, however, doesn't necessarily need any filtering out of the box. I'd guess that the data is already filtered at a low level. Usually forced filtering is a bad thing, because some developers may want to use their own filtering methods or can work with unfiltered data, but in Leap's case this isn't an issue, because both FPS and latency seem to be very good regardless of what happens under the hood.
Median filtering is a simple and useful method of eliminating jitter, especially when the filtered value is expected to remain fairly still.
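A minimal sliding-window median filter for one coordinate, sketched in Python. The class name and window size are arbitrary; run one instance per axis.

```python
from collections import deque
from statistics import median


class MedianFilter:
    """Returns the median of the last few samples of one value."""

    def __init__(self, window=5):
        self.samples = deque(maxlen=window)  # oldest sample drops out

    def update(self, value):
        self.samples.append(value)
        return median(self.samples)
```

A single-frame glitch never reaches the output, because one outlier cannot become the median of the window.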
A single-exponential filter is good for many UI purposes, particularly for eliminating large glitches. Increasing smoothness will naturally make the filtered position lag behind.
To gain a bit of speed at the cost of overshooting when the user stops moving or makes abrupt changes in direction, you can combine the filter with a filtered prediction vector, or use a double-exponential filter.
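For reference, a double-exponential filter can be sketched as below, in Python. This is the textbook Holt form with a smoothed level and a smoothed trend; the class name and default factors are assumptions, and the time step is taken as fixed to keep it short.

```python
class DoubleExponentialFilter:
    """Smooths a value while tracking its trend, so steady motion lags less."""

    def __init__(self, alpha=0.5, beta=0.5):
        self.alpha = alpha  # how quickly the level follows new samples
        self.beta = beta    # how quickly the trend estimate adapts
        self.level = None
        self.trend = 0.0

    def update(self, x):
        if self.level is None:
            self.level = float(x)  # first sample: nothing to smooth yet
            return self.level
        previous = self.level
        # Predict with the old trend, then blend in the new sample.
        self.level = self.alpha * x + (1.0 - self.alpha) * (previous + self.trend)
        self.trend = (self.beta * (self.level - previous)
                      + (1.0 - self.beta) * self.trend)
        return self.level
```

On steady movement the output catches up with the input instead of trailing behind it. The price is the overshoot mentioned above when the movement stops or changes direction.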
With all continuous time-based filters, it's important to remember that they must be independent of the application's frame rate. If you don't do this, users with greatly different frame rates will get different speeds. For example, if your development computer runs your app at 60fps, but a user's computer only at 30fps, all of the filters will move at half of the speed at which you designed them. And if the user has a very powerful computer, the FPS can go up to 100-200 for visually simple apps. Also, unless you turn vertical sync on, the FPS will have constant small fluctuations as well.
To make filters time-independent, you must take into account the delta time from the previous frame. To do this, pick a fixed reference delta time, and divide the actual delta time by it. For example, if your fixed delta time is 1/60 seconds (60 FPS), you can get the filter timestep by calculating timeStep = deltaTime / (1/60). The math for actually applying the timestep to a filter depends on the filter itself.
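For a single-exponential filter, one common way to apply the timestep is to raise the "keep" fraction to the power of timeStep. A Python sketch, with a made-up function name and smoothing factor:

```python
def smooth(previous, target, dt, smoothing_per_step=0.2, fixed_dt=1.0 / 60.0):
    """One frame-rate independent step of a single-exponential filter.

    smoothing_per_step is the fraction of the remaining distance covered in
    one reference step of fixed_dt seconds. For any other dt the fraction is
    rescaled so that the motion over real time stays the same.
    """
    time_step = dt / fixed_dt
    alpha = 1.0 - (1.0 - smoothing_per_step) ** time_step
    return previous + (target - previous) * alpha
```

With this form, two frames at 60 FPS land in the same place as one frame at 30 FPS, which a plain fixed fraction per frame does not.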
I'm thinking about it in a similar way... something similar to the GUI for Prezi Desktop's menu/rose =)