Kinect SDK Crash Course (In 12 slides or less) Elliot Babchick
What Do You Get?
Important! Kinect shines brightest when you use it in a wide, open space. Specifically, meters is the supported range. The sweet spot is in the middle (~2.5 m). If it can’t see your entire body, it can’t track you. Make sure your entire body is in the frame!
Setting it Up
We did this for you in the skeleton code, but quickly:
Import it into your project via References.
Include ‘using Microsoft.Research.Kinect.Nui’ (NUI = Natural User Interface).
The NUI is declared as a “Runtime” object. Pass it the sensors you want using RuntimeOptions values combined with pipes (‘|’). You must specify these up front (you can’t ask for more once it is initialized).

    nui = new Runtime();
    try
    {
        // Request every stream you plan to use in this one call.
        nui.Initialize(RuntimeOptions.UseDepthAndPlayerIndex |
                       RuntimeOptions.UseSkeletalTracking |
                       RuntimeOptions.UseColor);
    }
    catch (InvalidOperationException)
    {
        // Initialize threw, so the sensor is not usable; bail out.
        return 42;
    }
Event-driven data streams
Subscribe a handler to each stream you initialized:

    nui.DepthFrameReady += new EventHandler<ImageFrameReadyEventArgs>(nui_DepthFrameReady);
    nui.SkeletonFrameReady += new EventHandler<SkeletonFrameReadyEventArgs>(nui_SkeletonFrameReady);
    nui.VideoFrameReady += new EventHandler<ImageFrameReadyEventArgs>(nui_ColorFrameReady);

An example handler, taking the RGB video and putting it into a WPF element named “video” (really an Image):

    void nui_ColorFrameReady(object sender, ImageFrameReadyEventArgs e)
    {
        PlanarImage image = e.ImageFrame.Image;
        // Wrap the raw BGR32 bytes in a BitmapSource at 96 DPI.
        // The last argument is the stride: bytes per row of pixels.
        video.Source = BitmapSource.Create(
            image.Width, image.Height, 96, 96,
            PixelFormats.Bgr32, null,
            image.Bits, image.Width * image.BytesPerPixel);
    }
What’s In A... ImageFrame
We’ll cover this in more detail next week. For now, just know that you have access to the raw bytes (misnamed “Bits”) that make up the pixels.
What’s in a... DepthFrame Look familiar? It’s the same ImageFrame, but has a different Type field value (it’s a depth image, not a color image)
Making (quick) sense of a depth image
Raw data is in ImageFrame.Image.Bits, an array of bytes: public byte[] Bits;
2 bytes per pixel, moving left to right, then top to bottom.
Every 2 bytes tell how far away that particular pixel is (in millimeters).
But you can’t just read the bytes straight out... You need to bit-shift differently depending on whether you’re tracking depth and skeletons or just depth.
More on this next week. See the link for more detail if you need it sooner: ickstarts/Working-with-Depth-Data
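For anyone who needs the bit-shifting before next week, here is a minimal sketch of the two decodings. It assumes the beta SDK layout: in a depth-only stream the two bytes form a plain little-endian 16-bit distance, while in a depth-and-player-index stream the low 3 bits of the first byte hold the player index and the remaining 13 bits hold the distance. The class and method names (DepthDecoder, DepthOnly, and so on) are made up for this illustration and are not part of the SDK.

```csharp
using System;

// Hypothetical helper, not part of the Kinect SDK.
public static class DepthDecoder
{
    // Depth-only stream (assumed layout): 16-bit little-endian distance in mm.
    public static int DepthOnly(byte low, byte high)
    {
        return low | (high << 8);
    }

    // Depth + player index stream (assumed layout): drop the 3 player-index
    // bits from the low byte, then place the high byte above the remaining 5.
    public static int DepthWithPlayerIndex(byte low, byte high)
    {
        return (low >> 3) | (high << 5);
    }

    // Player index lives in the low 3 bits (0 means no player at this pixel).
    public static int PlayerIndex(byte low)
    {
        return low & 0x07;
    }

    // Distance in mm for pixel (x, y). Pixels run left to right, then top to
    // bottom, 2 bytes each, so the pixel starts at byte (y * width + x) * 2.
    public static int DepthAt(byte[] bits, int width, int x, int y, bool hasPlayerIndex)
    {
        int i = (y * width + x) * 2;
        return hasPlayerIndex
            ? DepthWithPlayerIndex(bits[i], bits[i + 1])
            : DepthOnly(bits[i], bits[i + 1]);
    }
}
```

In a depth handler you would call something like DepthDecoder.DepthAt(e.ImageFrame.Image.Bits, e.ImageFrame.Image.Width, x, y, true), passing true only if you initialized with UseDepthAndPlayerIndex.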
What’s In A... SkeletonFrame A collection of skeletons, each with a collection of joints
Skeleton Data In Detail
You get all the joints you see above, each with X, Y, and Z values.
Z values get larger as you move away from the sensor.
Moving right (your right) gives you larger X values.
Moving up is left to you as an exercise (get it?).
Units are in meters (note that raw depth was in millimeters).
Mapping coordinates to the UI
The Coding4Fun library extends the Joint object with: ScaleTo(int x, int y, float maxSkeletonX, float maxSkeletonY)
x and y describe the rectangular space of pixels you’d like to scale a joint to.
The second two arguments specify how far you need to move to traverse the scaled range.
For example, skeleton.Joints[JointID.HandRight].ScaleTo(640, 480, .5f, .5f) means that your right hand will only need to travel one meter (-.5 to .5) to cover the full 640-pixel-wide distance on screen.
It also has a convenient function for converting ImageFrame’s byte data to actual images: ImageFrame.ToBitmapSource()
Find Coding4Fun here (but it’s already in the starter code project):
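To make the scaling arithmetic concrete, here is a rough sketch of a linear mapping that matches the description above: a joint position between -maxSkeleton and +maxSkeleton is stretched across 0 to maxPixel, and anything outside that range is clamped to the edge. This is not the Coding4Fun source, only an illustration of the idea, and the names JointScaling, ScaleX, and ScaleY are hypothetical. The Y version flips the sign on the assumption that skeleton Y grows upward while screen Y grows downward.

```csharp
using System;

// Hypothetical illustration of the mapping; not the Coding4Fun implementation.
public static class JointScaling
{
    // Map position in [-maxSkeleton, +maxSkeleton] meters to [0, maxPixel].
    public static float Scale(int maxPixel, float maxSkeleton, float position)
    {
        // position / maxSkeleton is in [-1, 1]; shift and halve to get [0, 1].
        float value = (position / maxSkeleton + 1f) / 2f * maxPixel;
        // Clamp so joints outside the range stick to the screen edge.
        return Math.Max(0f, Math.Min(maxPixel, value));
    }

    public static float ScaleX(int width, float maxSkeletonX, float x)
    {
        return Scale(width, maxSkeletonX, x);
    }

    // Assumption: skeleton Y increases upward, screen Y increases downward.
    public static float ScaleY(int height, float maxSkeletonY, float y)
    {
        return Scale(height, maxSkeletonY, -y);
    }
}
```

With a 640-pixel width and maxSkeletonX of .5f, a hand at x = -.5 lands on pixel 0, x = 0 lands on 320, and x = .5 lands on 640, which is the one meter of travel from the example above.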
This is Slide #12 I had a slide to spare. Now let’s look at the skeleton code.