Natural User Interface with Kinect for Windows
Clemente Giorio & Paolo Patierno
Natural User Interface
Hardware Overview
Components:
–IR projector
–RGB camera
–Depth camera
–Tilt motor
–3-axis accelerometer
–Mic array
Hardware requirements:
–Windows 7, Windows 8, Windows Embedded Standard 7, or Windows Embedded POSReady 7
–x86 or x64 dual-core CPU, 2.66 GHz
–Dedicated USB 2.0 bus
–2 GB RAM
Inside Kinect
IR Projector
The projected pattern is a 3x3 grid of sub-patterns, each 211x165 dots, for a total of 633x495 dots. In each sub-pattern, one spot is much brighter than all the others.
Wavelength: 827 nm
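The dot counts quoted above can be checked with a few lines of arithmetic. This is only a sanity check of the figures on the slide; the constant and function names are illustrative, not part of any Kinect API.

```python
# Figures taken from the slide: a 3x3 grid of sub-patterns,
# each sub-pattern being 211x165 dots.
GRID_COLS, GRID_ROWS = 3, 3
SUB_WIDTH, SUB_HEIGHT = 211, 165


def full_pattern_size():
    """Return (width, height) of the whole projected pattern, in dots."""
    return GRID_COLS * SUB_WIDTH, GRID_ROWS * SUB_HEIGHT


width, height = full_pattern_size()
total_dots = width * height
# One spot per sub-pattern is much brighter than the others.
bright_spots = GRID_COLS * GRID_ROWS

print(f"{width}x{height} dots, {total_dots} in total, {bright_spots} bright spots")
```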
Depth Camera
CMOS sensor with an IR-pass filter, up to 640x480 pixels. Each pixel is encoded on 11 bits and can therefore represent 2048 levels of depth.
Mode     Physical Limits               Practical Limits
Near     0.4 to 3 m (1.3 to 9.8 ft)    0.8 to 2.5 m (2.6 to 8.2 ft)
Normal   0.8 to 4 m (2.6 to 13.1 ft)   1.2 to 3.5 m (4 to 11.5 ft)
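As a rough illustration of how the range table might be used, the sketch below classifies a distance against the physical and practical limits of each mode. The limits and the 11-bit figure come from the slide; the function name, the mode keys, and the three result labels are assumptions made for this example.

```python
DEPTH_BITS = 11
DEPTH_LEVELS = 2 ** DEPTH_BITS  # 2048 distinct depth values

# (min, max) in meters, copied from the table on the slide.
PHYSICAL_LIMITS_M = {"near": (0.4, 3.0), "normal": (0.8, 4.0)}
PRACTICAL_LIMITS_M = {"near": (0.8, 2.5), "normal": (1.2, 3.5)}


def classify_distance(distance_m, mode="normal"):
    """Say whether a distance is inside the practical range, only inside
    the physical range, or outside both, for the given depth mode."""
    low, high = PRACTICAL_LIMITS_M[mode]
    if low <= distance_m <= high:
        return "practical"
    low, high = PHYSICAL_LIMITS_M[mode]
    if low <= distance_m <= high:
        return "physical"
    return "out_of_range"


for d in (0.5, 0.9, 2.0, 3.8):
    print(d, classify_distance(d, "near"), classify_distance(d, "normal"))
```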
RGB Camera
CMOS sensor with 8 bits per channel, producing Bayer-filter output with an RGGBD pattern.
(Image: IR frame)
Tilt Motor & 3-axis Accelerometer
–3-axis accelerometer: configured for a 2g range (g is the acceleration due to gravity), with 1 to 3 degree accuracy
–Tilt motor
Mic Array
–4 microphones
–24-bit analog-to-digital converter
–The captured audio is encoded using Pulse-Code Modulation (PCM) with a sampling rate of 16 kHz and 16-bit depth
Advantages of multiple microphones:
–Enhanced noise suppression
–Acoustic echo cancellation
–Beamforming
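To give a feel for the audio format, the sketch below computes how many bytes a PCM buffer of a given duration would occupy at 16 kHz and 16 bits per sample. The single-channel default is an assumption for the example (the slide does not say how many channels the delivered stream has), and `pcm_bytes` is a hypothetical helper name.

```python
SAMPLE_RATE_HZ = 16_000  # from the slide
BITS_PER_SAMPLE = 16     # from the slide


def pcm_bytes(duration_s, channels=1):
    """Size in bytes of uncompressed PCM audio of the given duration.
    channels=1 (mono) is an assumption, not a figure from the slide."""
    bytes_per_sample = BITS_PER_SAMPLE // 8
    samples = int(round(duration_s * SAMPLE_RATE_HZ))
    return samples * channels * bytes_per_sample


print(pcm_bytes(1.0))   # bytes for one second of mono audio
print(pcm_bytes(0.05))  # bytes for a 50 ms buffer
```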
SDK Overview
Camera Data
Step 1: Register for the ColorFrameReady Event
    /// Active Kinect sensor
    private KinectSensor sensor;

    // Add an event handler to be called whenever there is new color frame data
    this.sensor.ColorFrameReady += this.SensorColorFrameReady;
    // Turn on the color stream to receive color frames
    this.sensor.ColorStream.Enable(ColorImageFormat.RgbResolution640x480Fps30);
    // Start the sensor!
    this.sensor.Start();
Step 2: Read the Stream
    /// Event handler for Kinect sensor's ColorFrameReady event
    private void SensorColorFrameReady(object sender, ColorImageFrameReadyEventArgs e)
    {
        using (ColorImageFrame colorFrame = e.OpenColorImageFrame())
        {
            if (colorFrame != null)
            {
                // Copy the pixel data from the image to a temporary array
                colorFrame.CopyPixelDataTo(this.colorPixels);
                // Write the pixel data into our bitmap
                this.colorBitmap.WritePixels(
                    new Int32Rect(0, 0, this.colorBitmap.PixelWidth, this.colorBitmap.PixelHeight),
                    this.colorPixels,
                    this.colorBitmap.PixelWidth * sizeof(int),
                    0);
            }
        }
    }
DepthFrameReady Event
    void sensor_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
    {
        using (DepthImageFrame depthFrame = e.OpenDepthImageFrame())
        {
            if (depthFrame != null)
            {
                // Copy the pixel data from the image to a temporary array
                depthFrame.CopyDepthImagePixelDataTo(this.depthPixels);
                // Convert the depth pixels to colored pixels
                ConvertDepthData2RGB(depthFrame.MinDepth, depthFrame.MaxDepth);
                this.depthBitmap.WritePixels(
                    new Int32Rect(0, 0, this.depthBitmap.PixelWidth, this.depthBitmap.PixelHeight),
                    this.colorDepthPixels,
                    this.depthBitmap.PixelWidth * sizeof(int),
                    0);
                UpdateFrameRate();
            }
        }
    }
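The body of ConvertDepthData2RGB is not shown on the slide. One plausible approach, sketched here in Python purely as an illustration and not as the authors' implementation, is to map each depth value to a gray level (nearer is brighter) and write it into a 4-bytes-per-pixel buffer. The function names and the choice of a grayscale mapping are assumptions.

```python
def depth_to_gray(depth_mm, min_depth, max_depth):
    """Map a depth in millimeters to a gray level 0..255 (nearer = brighter).
    Depths outside [min_depth, max_depth] are rendered black."""
    if depth_mm < min_depth or depth_mm > max_depth:
        return 0
    span = max_depth - min_depth
    if span <= 0:
        return 0
    return 255 - (depth_mm - min_depth) * 255 // span


def depth_to_bgr32(depths, min_depth, max_depth):
    """Build a 4-bytes-per-pixel buffer (blue, green, red, unused)
    from a flat sequence of depth values."""
    out = bytearray(len(depths) * 4)
    for i, depth in enumerate(depths):
        gray = depth_to_gray(depth, min_depth, max_depth)
        out[i * 4:i * 4 + 3] = bytes((gray, gray, gray))
    return out


pixels = depth_to_bgr32([800, 2400, 4000, 100], 800, 4000)
print(list(pixels))
```

The 4-byte layout matches the `PixelWidth * sizeof(int)` stride passed to WritePixels in the handler above.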
Depth Data
ImageFrame.Image.Bits is an array of bytes: public byte[] Bits;
The array:
–Starts at the top left of the image
–Moves left to right, then top to bottom
–Represents the distance for each pixel in millimeters
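Because the array runs left to right and then top to bottom, the position of a pixel's data can be computed from its coordinates. The sketch below assumes 2 bytes per pixel, as stated on the next slide; `pixel_offset` is a hypothetical helper name.

```python
BYTES_PER_PIXEL = 2  # each depth pixel occupies 16 bits


def pixel_offset(x, y, width):
    """Index of the first byte of pixel (x, y) in a row-major byte array
    that starts at the top-left corner of the image."""
    if not 0 <= x < width or y < 0:
        raise ValueError("pixel coordinates out of range")
    return (y * width + x) * BYTES_PER_PIXEL


print(pixel_offset(0, 0, 640))      # first pixel of the first row
print(pixel_offset(0, 1, 640))      # first pixel of the second row
print(pixel_offset(639, 479, 640))  # last pixel of a 640x480 frame
```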
Distance
2 bytes per pixel (16 bits)
Depth – distance per pixel
–Bit-shift the second byte left by 8
–Distance (0,0) = (int)(Bits[0] | Bits[1] << 8);
–VB: CInt(Bits(0)) Or (CInt(Bits(1)) << 8)
DepthAndPlayerIndex – includes the player index
–The lowest 3 bits of the first byte hold the player index: shift the first byte right by 3 and the second byte left by 5
–Distance (0,0) = (int)(Bits[0] >> 3 | Bits[1] << 5);
–VB: (CInt(Bits(0)) >> 3) Or (CInt(Bits(1)) << 5)
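The two byte-level formulas can be tried out on sample values. The sketch below mirrors them in Python and also extracts the player index from the lowest 3 bits; the function names and the sample bytes are illustrative only.

```python
PLAYER_INDEX_BITS = 3
PLAYER_INDEX_MASK = (1 << PLAYER_INDEX_BITS) - 1  # 0b111


def depth_only(first_byte, second_byte):
    """Depth format: the two bytes form a little-endian 16-bit distance."""
    return first_byte | (second_byte << 8)


def depth_with_player(first_byte, second_byte):
    """DepthAndPlayerIndex format: returns (distance, player_index).
    The distance uses the same shifts as the slide's formula."""
    distance = (first_byte >> 3) | (second_byte << 5)
    player_index = first_byte & PLAYER_INDEX_MASK
    return distance, player_index


# 1000 mm stored as plain depth: 0x03E8 -> bytes 0xE8, 0x03
print(depth_only(0xE8, 0x03))

# 1000 mm with player index 2: (1000 << 3) | 2 = 0x1F42 -> bytes 0x42, 0x1F
print(depth_with_player(0x42, 0x1F))
```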
Skeleton Tracking
Skeleton Data (figure: skeleton-space coordinate axes X, Y, Z)
Skeleton
–Default mode: 20 joints
–Seated mode: 10 joints
Skeleton API
Joint Data
–Maximum of two players tracked at once
–Six player proposals
–Each player has a set of joints, with positions in meters
–Each joint has an associated state: Tracked, Not tracked, or Inferred
–Inferred: occluded, clipped, or low-confidence joints
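A common way to use the per-joint state is to decide which joints to trust before drawing or measuring. The sketch below models this with plain Python types; `JointState`, `Joint`, and `usable_joints` are hypothetical stand-ins for illustration, not the SDK's classes, and the sample positions are made up.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class JointState(Enum):
    NOT_TRACKED = "not tracked"
    INFERRED = "inferred"
    TRACKED = "tracked"


@dataclass
class Joint:
    name: str
    position_m: Tuple[float, float, float]  # (x, y, z) in meters
    state: JointState


def usable_joints(joints: List[Joint], allow_inferred: bool = False) -> List[Joint]:
    """Keep tracked joints, optionally also the inferred ones."""
    accepted = {JointState.TRACKED}
    if allow_inferred:
        accepted.add(JointState.INFERRED)
    return [joint for joint in joints if joint.state in accepted]


sample = [
    Joint("HandLeft", (-0.30, 0.10, 1.80), JointState.TRACKED),
    Joint("ElbowLeft", (-0.25, 0.20, 1.90), JointState.INFERRED),
    Joint("FootLeft", (0.0, 0.0, 0.0), JointState.NOT_TRACKED),
]

print([joint.name for joint in usable_joints(sample)])
print([joint.name for joint in usable_joints(sample, allow_inferred=True)])
```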
Step 1: SkeletonFrameReady event
    // Turn on the skeleton stream to receive skeleton frames
    this.sensor.SkeletonStream.Enable();
    // Add an event handler to be called whenever there is new skeleton frame data
    this.sensor.SkeletonFrameReady += this.SensorSkeletonFrameReady;

    /// Event handler for Kinect sensor's SkeletonFrameReady event
    private void SensorSkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
    {
        Skeleton[] skeletons = new Skeleton[0];
        using (SkeletonFrame skeletonFrame = e.OpenSkeletonFrame())
        {
            if (skeletonFrame != null)
            {
                skeletons = new Skeleton[skeletonFrame.SkeletonArrayLength];
                skeletonFrame.CopySkeletonDataTo(skeletons);
            }
        }
        // (the handler body continues in Step 2)
Step 2: Read the skeleton data
        using (DrawingContext dc = this.drawingGroup.Open())
        {
            // Draw a transparent background to set the render size
            dc.DrawRectangle(Brushes.Black, null, new Rect(0.0, 0.0, RenderWidth, RenderHeight));
            if (skeletons.Length != 0)
            {
                foreach (Skeleton skel in skeletons)
                {
                    RenderClippedEdges(skel, dc);
                    if (skel.TrackingState == SkeletonTrackingState.Tracked)
                    {
                        this.DrawBonesAndJoints(skel, dc);
                    }
                    else if (skel.TrackingState == SkeletonTrackingState.PositionOnly)
                    {
                        dc.DrawEllipse(this.centerPointBrush, null, this.SkeletonPointToScreen(skel.Position), BodyCenterThickness, BodyCenterThickness);
                    }
                }
            }
            // Prevent drawing outside of our render area
            this.drawingGroup.ClipGeometry = new RectangleGeometry(new Rect(0.0, 0.0, RenderWidth, RenderHeight));
        }
    } // end of SensorSkeletonFrameReady
Step 3: Use the joint data
    // Left Arm
    this.DrawBone(skeleton, drawingContext, JointType.ShoulderLeft, JointType.ElbowLeft);
    this.DrawBone(skeleton, drawingContext, JointType.ElbowLeft, JointType.WristLeft);
    this.DrawBone(skeleton, drawingContext, JointType.WristLeft, JointType.HandLeft);
Step 4: Fine-tune
Audio
–As a microphone
–For speech recognition
Speech Recognition
Kinect grammar available to download
Grammar – what we are listening for
–Code – GrammarBuilder, Choices
–Speech Recognition Grammar Specification (SRGS)
Sample grammars: C:\Program Files (x86)\Microsoft Speech Platform SDK\Samples\Sample Grammars\
Grammar
–FORWARD: avanti ("forward"), vai avanti ("go forward"), avanza ("advance")
–BACKWARD: indietro ("back"), vai indietro ("go back"), indietreggia ("move back")
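Conceptually, the grammar maps several spoken phrases onto one command. The sketch below shows that mapping as a plain lookup table; it is not the Speech Platform's GrammarBuilder or SRGS, and `build_phrase_index` and `command_for` are hypothetical helpers for illustration.

```python
# Commands and phrases copied from the slide.
GRAMMAR = {
    "FORWARD": ["avanti", "vai avanti", "avanza"],
    "BACKWARD": ["indietro", "vai indietro", "indietreggia"],
}


def normalize(text):
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def build_phrase_index(grammar):
    """Invert the grammar: phrase -> command."""
    index = {}
    for command, phrases in grammar.items():
        for phrase in phrases:
            index[normalize(phrase)] = command
    return index


PHRASE_INDEX = build_phrase_index(GRAMMAR)


def command_for(recognized_text):
    """Return the command for a recognized phrase, or None if unknown."""
    return PHRASE_INDEX.get(normalize(recognized_text))


print(command_for("Vai  avanti"))
print(command_for("indietreggia"))
print(command_for("ciao"))
```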
Netduino Plus based robot
Structure:
–Magician chassis
–2 DC motors
–Motor driver
–WiFi bridge
–Netduino Plus
Demo (architecture diagram): MotionControlRemote, Connect & Commands, MotionClient, MotionServer, MotionControlTB6612FNG, TB6612FNG
Demo
DEMO
To watch a few moments recorded during the session:
Demo Gesture Recognition:
Demo Speech Recognition in Neapolitan:
Resources & Contact
Kinect for Windows:
MSDN:
Clemente Giorio:
Paolo Patierno:
Thanks to our sponsors!