
🖐️ HandPose – Detect Hands and Gestures in Scratch #

The HandPose extension brings real-time hand tracking into Scratch.
It lets your projects react to fingers, wrist movement, and simple gestures – right in your browser, with no setup required.
Perfect for classrooms, workshops, and creative coding builds. 🙌


🌟 Overview #

  • Detect up to 4 Hands: Track one or more hands simultaneously.
  • 21 Landmarks: Wrist, thumb joints, and each finger’s joints and tips.
  • Read Coordinates: Get X/Y positions of any hand landmark on the Scratch stage.
  • Measure: Calculate angles and distances between hand landmarks.
  • Camera Controls: Show, hide, mirror, and choose your camera device.
  • Choose Input: Analyze from the live camera or the stage image.

Key Features #

  • Multi-hand tracking (1–4 hands).
  • Friendly dropdown of joints and fingertips.
  • Adjustable “classify” intervals for smooth performance and CPU control.
  • Camera preview transparency and mirroring options.
  • Fully browser-based – no installation, private and secure.

🚀 How to Use #

  1. Go to pishi.ai/play.
  2. Open the Extensions section.
  3. Select HandPose from the list.
  4. Allow camera access when prompted and check that the preview appears.
  5. If no cameras are detected, the input will automatically switch to the stage image instead.
  6. Continuous detection starts by default at a smooth 100 ms interval.
  7. Use position and measurement blocks to make sprites react to finger gestures.

Tips:

  • Use good, even lighting for best tracking accuracy.
  • When multiple hands appear, use hand no: 1 – 4 in the block dropdown.
  • On slower computers, increase interval (e.g., 150–250 ms) to reduce CPU load.

🧱 Blocks and Functions #

📍 Position & Count #

x of keypoint no: [KEYPOINT] hand no: [HAND_NUMBER]
y of keypoint no: [KEYPOINT] hand no: [HAND_NUMBER]

Reports the X or Y position of a specific hand landmark on the stage.
[KEYPOINT]: choose from the dropdown list (wrist, joints, fingertips).
[HAND_NUMBER]: selects which hand to track (1–4). “1” = first detected hand.
Returns empty if no hand is detected.

 

hand count

Reports how many hands are currently detected (0 – 4).


📏 Measurements #

angle between keypoints: [KEYPOINT_1] and [KEYPOINT_2] object no: [HAND_NUMBER]

Reports the angle (in degrees) between two landmarks – ideal for detecting finger bend or wrist rotation.

distance between keypoints: [KEYPOINT_1] and [KEYPOINT_2] object no: [HAND_NUMBER]

Measures the distance in stage pixels between two landmarks – useful for pinch, expand, or spread gestures.

Notes:
Default keypoints: 0 (wrist) and 12 (middle fingertip).
Coordinates follow the Scratch stage center (X ≈ −240 to 240, Y ≈ −180 to 180).
When mirroring is on, X values flip to match the preview view.
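For intuition, the two measurement reporters can be modeled in a few lines of Python. This is a sketch, not the extension's code: the landmark positions are made up, and the angle convention (0° pointing right, counter-clockwise positive) is an assumption – the block's exact convention may differ.

```python
import math

# Hypothetical landmark positions in Scratch stage coordinates
# (X: -240..240, Y: -180..180, Y increases upward), as the
# x/y reporter blocks would return them.
wrist = (-20.0, -60.0)       # keypoint 0
middle_tip = (10.0, 90.0)    # keypoint 12

def distance(p1, p2):
    """Stage-pixel distance between two landmarks (like the distance block)."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])

def angle(p1, p2):
    """Angle in degrees from p1 to p2; 0 = right, counter-clockwise
    positive (an assumed convention - the block's may differ)."""
    return math.degrees(math.atan2(p2[1] - p1[1], p2[0] - p1[0]))

print(round(distance(wrist, middle_tip), 1))  # hand length in stage pixels
print(round(angle(wrist, middle_tip), 1))     # hand tilt
```

A shrinking wrist-to-fingertip distance, for example, suggests the hand is closing or moving away from the camera.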


⚙️ Classification Controls #

  • classify [INTERVAL] - Choose how often detection runs:
    • every time this block runs
    • continuous, without delay
    • continuous, every 50–2500 ms
  • turn classification [on/off] - start or stop continuous detection.
  • classification interval - reports the current interval in milliseconds.
  • continuous classification - reports whether continuous detection is “on” or “off”.
  • select input image [camera/stage] - choose camera or stage.
  • input image - reports the active input source.

🎥 Video Controls #

  • turn video [on/off/on-flipped] - show or hide the camera preview. “on” mirrors the view like a selfie; “on-flipped” shows the true left/right orientation.
  • set video transparency [TRANSPARENCY] - adjust how strongly the camera preview shows through the stage.
  • select camera [CAMERA] - choose which connected camera device to use.

🖐️ Common Keypoints (Handy Numbers) #

Use these shortcuts for common hand regions, or select any landmark from the dropdown.

0: wrist
1: base of thumb, 2–3: thumb joints, 4: thumb tip
5–8: index finger joints / tip
9–12: middle finger joints / tip
13–16: ring finger joints / tip
17–20: little finger joints / tip

The menu counts from 0–20 (like MediaPipe indices).
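In code, the same numbering can be written down as a small lookup table. The finger names below are illustrative labels (the dropdown's wording may differ); the indices follow the MediaPipe scheme described above.

```python
# MediaPipe-style hand landmark indices (0-20). Names are illustrative
# labels, not necessarily the exact dropdown wording.
WRIST = 0
FINGERS = {
    "thumb":  [1, 2, 3, 4],
    "index":  [5, 6, 7, 8],
    "middle": [9, 10, 11, 12],
    "ring":   [13, 14, 15, 16],
    "little": [17, 18, 19, 20],
}

def tip_of(finger):
    """The last index in each finger's chain is its tip."""
    return FINGERS[finger][-1]

print([tip_of(f) for f in FINGERS])  # [4, 8, 12, 16, 20]
```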


🎓 Educational Uses #

  • Explore computer vision by visualizing hand joints and movement.
  • Teach coordinate systems by mapping finger motion to sprite X/Y.
  • Apply geometry and math to calculate angles and distances.
  • Create gesture-based interactions like pinching, pointing, or thumbs-up triggers.

🎮 Example Projects #

  • Pinch to Click: Detect thumb–index distance to simulate a mouse click.
  • Finger Piano: Map fingertips to keys and play notes as you move.
  • Thumbs-Up Detector: Trigger actions when the thumb points upward.
  • Rock – Paper – Scissors: Recognize hand shapes using landmark distances.
  • Hand Controller: Move sprites with wrist X/Y and boost on finger spread.
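The “Pinch to Click” idea can be sketched as a tiny state machine: feed it the thumb–index distance (keypoints 4 and 8, from the distance block) each time detection runs, and it fires once per pinch. The thresholds below (press under 20 stage pixels, release over 30) are made-up values to tune per project; keeping the release threshold above the press threshold avoids jittery double clicks.

```python
# Sketch of a pinch-to-click detector. Thresholds are illustrative
# and should be tuned to your camera distance and hand size.
class PinchClicker:
    def __init__(self, press=20, release=30):
        self.press = press      # pinch when distance drops below this
        self.release = release  # un-pinch when distance rises above this
        self.down = False

    def update(self, dist):
        """Feed the latest thumb-index distance; return True on a new click."""
        if not self.down and dist < self.press:
            self.down = True
            return True
        if self.down and dist > self.release:
            self.down = False
        return False
```

In Scratch, the same logic would read the distance block every classify tick and broadcast a “click” message when the pinch first closes.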

🧩 Try it yourself: pishi.ai/play


🔧 Tips & Troubleshooting #

  • No camera?
    • Make sure your camera is connected and browser permission is allowed.
    • If the camera is blocked, enable it in your browser’s site settings and reload the page.
    • During extension load, if no cameras are detected, the input will automatically switch to the stage image so you can still test HandPose features.
  • No detection?
    continuous classification: Use this reporter to check whether classification is active.
    • If it is active, improve the lighting and keep your whole hand clearly in view.
    turn classification [on]: Use this block if classification is not active, then recheck the status with the reporter above.
    • In camera input mode, turning the camera off also stops classification - turn the video back on or switch the input to stage.
    • In stage input mode, the system classifies whatever is visible on the stage - backdrops, sprites, or images. You can turn the video off completely and still process stage images.
    • Stage mode is slower than camera input, so use a longer classification interval (e.g., 100–250 ms) for smoother results with this block: classify [INTERVAL]
    • In stage mode, “left” and “right” landmarks are swapped because the stage image is not mirrored - its coordinate space represents a real (non-mirrored) view.
    • Classification can also restart automatically when you use blocks such as:
    turn video [on] / classify [INTERVAL] / select camera [CAMERA] / select input image [camera/stage].
  • Flipped view?
    turn video [on-flipped]: Use this to show the camera without mirroring. “on” mirrors like a selfie; “on-flipped” shows real left/right orientation.
  • Laggy or slow?
    Use classification intervals between 100–250 ms or close other browser tabs to reduce processing load.
  • WebGL2 warning?
    Try Firefox, or a newer device that supports WebGL2 graphics acceleration.
  • Analyze stage instead of camera?
    select input image [stage]: Use this to analyze the Scratch stage image instead of a live camera feed.

🖐️ HandPose Specific Tips #

  • Hand not detected? Ensure your full hand – including the wrist – is visible in the camera. Spread your fingers slightly; closed fists or motion blur make detection harder.
  • Fingers confused? Keep fingers clearly separated and avoid overlapping for accurate tracking of individual fingertips.
  • Multiple hands? Use the hand no: 1–4 parameter of the block to choose which hand to track. Hand 1 is typically the largest or closest hand in view.
  • Tracking unstable? Keep your hand steady and evenly lit. Avoid strong shadows or very bright reflections on skin.
  • Detect pinch gesture? Measure the distance between keypoints 4 (thumb tip) and 8 (index fingertip). A smaller distance indicates a pinch.
  • Detect pointing? Check if keypoint 8 (index fingertip) has a greater Y value (higher on the stage) than keypoint 5 (index base) while the other fingers are bent down.
  • Count fingers up? Compare each fingertip’s Y position with its base – if the fingertip Y is higher, that finger is extended.
  • Thumbs-up detection? Verify if keypoint 4 (thumb tip) is higher than keypoint 2 (thumb base) while other fingers remain folded.
  • Hand orientation? Calculate the angle between keypoint 0 (wrist) and keypoint 12 (middle fingertip) to estimate hand rotation or tilt.
  • Using stage mode with photos? In stage mode, landmarks are not mirrored – left and right correspond to true anatomical positions.
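The finger-count and thumbs-up heuristics above translate directly to code. This sketch uses hypothetical coordinates in Scratch stage space (Y increases upward), with `landmarks[i]` holding the (x, y) pair the position reporter blocks return for keypoint `i`.

```python
# Tip/base keypoint pairs for the four non-thumb fingers.
TIP_BASE = [(8, 5), (12, 9), (16, 13), (20, 17)]

def fingers_up(landmarks):
    """Count extended fingers: tip higher on the stage than its base joint."""
    return sum(1 for tip, base in TIP_BASE if landmarks[tip][1] > landmarks[base][1])

def is_thumbs_up(landmarks):
    """Thumb tip (4) above thumb joint (2) while the other fingers stay folded."""
    return landmarks[4][1] > landmarks[2][1] and fingers_up(landmarks) == 0
```

The same comparisons map one-to-one onto Scratch: compare the y of keypoint reporters for a tip and its base inside an if block.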

🔒 Privacy & Safety #

  • Everything runs locally in your browser.
  • No images or video are uploaded anywhere.
  • Model files may download once for offline use.
  • Always ask a teacher or parent before using the camera.
  • You can safely turn video [off] at any time.

🧪 Technical Info #

  • Model: MediaPipe Hands (HandPose)
  • Framework: TensorFlow.js (latest) – runs fully in-browser with WebGL 2
  • Detection: Up to 4 hands / 21 landmarks (0 – 20)
  • Coordinate System: Scratch stage pixels (X right, Y up)
  • Mirroring: “on” = mirrored preview, “on-flipped” = true view
  • Input Modes: Camera or Stage canvas
  • Default Keypoints: 0 (wrist), 12 (middle fingertip)
  • Default Classify Interval: 100 ms

🔗 Related Extensions #

  • 😎 FaceMesh – detect face landmarks
  • 🕺 PoseNet – track body pose
  • 🖼️ Image Trainer – build custom AI models
  • 🏫 Google Teachable Machine – import your own TM models
