
Iron Reign Computer Vision

Learn about the use of Vuforia, OpenCV, and TensorFlow Lite in computer vision for FTC, including their pros and cons, and how these technologies can help with mineral recognition and autonomous navigation.

pmark


Presentation Transcript


  1. Iron Reign Computer Vision Rover Ruckus Season

  2. Outline Need outline completed by tonight - Tuesday evening • What is FTC (backgrounder for Computer Visionaries, not needed for DPRG) • ftc_app - basic description • Vuforia • OpenCV4Android • “TensorFlow Lite” - a black-box mineral recognition engine that all teams can use without really having to understand computer vision • Roll your own CNN in TensorFlow

  3. ftc_app • Android app framework published by ftctechnh on GitHub • Common framework for all FTC teams - this is our starting point • Robot controller - phone on the robot, connected to the underlying hardware controllers • Driver station - phone that operators use to: • Send remote joystick commands • Get telemetry on robot status • Change the active opmode • Restart the robot • Opmodes • Configurations

  4. ftc_app control system Robot Side Driver Side

  5. Robot Wiring

  6. The REV Expansion Hub REV Robotics started by Greg Needel, former DPRG member

  7. Game Elements Gold cubes Silver balls

  8. Why do we need Computer Vision in the first place? Main reason: Sampling is a part of the FTC challenge this year. It awards 30 points to a team that moves only the gold mineral, and not the silver minerals, during the autonomous portion of the match (i.e. no driver control).

  9. Vuforia • Augmented reality SDK for mobile devices that also includes image tracking • Developed by Qualcomm, now owned by PTC, developers of Creo and Mathcad • What we use it for • Localization against reference targets - 1 slide • Real-time target tracking - demo of cartbot • Initial frame capture for hand-off to downstream CV • Not using Vuforia target tracking in this year’s challenge • Demo - Cartbot tracking a target

  10. What is OpenCV? • Collection of open source Computer Vision algorithms • This is the “standard” computer vision library • Algorithms can be combined into powerful pipelines • Has been an FTC standard for many years • Very stable and well-tested library • Written in native code for maximum efficiency

  11. Pros and Cons of OpenCV Pros • Easy to use and design • Tools like GRIP make it even easier • Iron Reign has a good track record with OpenCV - using it for the last 3 years • As an added benefit, this means that OpenCV is already integrated into the build so we don’t have to do any fiddling around with Gradle Cons • We can only test pipelines on lighting conditions that we have images for • If a competition has different lighting conditions, our pipeline might fail • OpenCV lacks the “wow factor” associated with Machine Learning/AI during judging.

  12. OpenCV4Android • The name describes it • Java wrappers around OpenCV functionality, tuned for on-phone use • Native code is possible but a heavier lift • Not integrated with the shipping ftc_app project - teams have to do it themselves • High school teams struggle with learning the basic vision algorithms • Then coding accurately becomes an issue • We started with sample apps like ColorBlobDetector and adapted them to ftc_app • Manually coded hybrids of Vuforia and OpenCV • Now we use interactive vision pipeline explorers and code generators • RoboRealm - contributed licenses, closed source • GRIP - based on OpenCV - generates pipeline code in Java, C++ and Python

  13. But OpenCV looks too scary....

  14. Use GRIP to design your pipelines

  15. Our OpenCV Pipeline

  16. Our OpenCV Pipeline

  17. Our OpenCV Pipeline

  18. Our OpenCV Pipeline

  19. Our OpenCV Pipeline

  20. Handling anomalies

  21. Our final pipeline
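
The pipeline slides above survive only as titles, so the actual stages are not recoverable from this transcript. As a rough illustration of the kind of pipeline GRIP typically produces for a task like this (a color threshold followed by locating the result), here is a minimal sketch in plain Python. It is not Iron Reign's pipeline: the HSV ranges, the function names `gold_mask` and `centroid`, and the nested-list image format are all assumptions chosen to keep the example self-contained.

```python
import colorsys

def gold_mask(image, hue_range=(0.08, 0.20), min_sat=0.5, min_val=0.4):
    """Return a boolean mask marking pixels that look yellow/gold.

    `image` is a list of rows, each row a list of (r, g, b) tuples in 0..255.
    The threshold values are illustrative guesses, not tuned values.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            # colorsys works on floats in 0..1 and returns h, s, v in 0..1
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            in_range = hue_range[0] <= h <= hue_range[1]
            mask_row.append(in_range and s >= min_sat and v >= min_val)
        mask.append(mask_row)
    return mask

def centroid(mask):
    """Return the (x, y) centroid of the True pixels, or None if there are none."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, hit in enumerate(row):
            if hit:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)
```

A fixed threshold like this is exactly what makes such a pipeline sensitive to lighting: if the venue lighting shifts the apparent hue or saturation of the gold mineral outside the chosen ranges, the mask comes back empty.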

  22. TensorFlow Lite TensorFlow is Google’s open source machine learning library. It models neural networks as operations on “tensors,” which are multidimensional arrays of numbers that flow between the layers of the network. TensorFlow Lite is the solution for running ML models on mobile and embedded devices.

  23. Refresher on Neural Networks

  24. TensorFlow Object Detection TensorFlow’s Object Detection is implemented via a sliding window classifier. A basic sliding window classifier
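
To make the sliding window idea concrete, here is a minimal sketch in plain Python. It only enumerates windows and picks the best-scoring one; `score_fn` is a hypothetical stand-in for a trained classifier, and the window size and stride are arbitrary illustration values rather than anything taken from the slides.

```python
def sliding_windows(img_w, img_h, win_w, win_h, stride):
    """Yield (left, top, width, height) for every window that fits in the image."""
    for top in range(0, img_h - win_h + 1, stride):
        for left in range(0, img_w - win_w + 1, stride):
            yield (left, top, win_w, win_h)

def best_window(img_w, img_h, win_w, win_h, stride, score_fn):
    """Return the window with the highest score.

    `score_fn` takes a window tuple and returns a number; in a real detector
    it would crop the image to the window and run a classifier on the crop.
    """
    return max(sliding_windows(img_w, img_h, win_w, win_h, stride), key=score_fn)
```

The cost is visible in the loop structure: the classifier runs once per window, so halving the stride roughly quadruples the work per frame.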

  25. “TensorFlow” as bundled in ftc_app • Game-specific solution for targeting game elements • Easy-to-follow tutorial on getting it working • Gives a recognition confidence level and location of detected elements • Slow and sloppy on our phones (frame rate still to be measured)
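
Since the detector reports a label, a confidence, and a location for each element, the sampling decision reduces to comparing horizontal positions. The sketch below shows one way that could look in plain Python. The dictionary keys (`label`, `left`, `confidence`), the label strings, and the 0.5 confidence cutoff are hypothetical and do not reflect the ftc_app API.

```python
def gold_position(detections, min_confidence=0.5):
    """Decide whether the gold mineral is on the left, center, or right.

    `detections` is a list of dicts with hypothetical keys:
      "label"      - "gold" or "silver"
      "left"       - x coordinate of the left edge of the bounding box
      "confidence" - recognition confidence in 0..1
    Returns "left", "center", "right", or None if the scene is ambiguous.
    """
    kept = [d for d in detections if d["confidence"] >= min_confidence]
    golds = [d for d in kept if d["label"] == "gold"]
    silvers = [d for d in kept if d["label"] == "silver"]
    # Only trust the answer when exactly one gold and two silvers are seen.
    if len(golds) != 1 or len(silvers) != 2:
        return None
    gold_x = golds[0]["left"]
    silvers_to_left = sum(1 for s in silvers if s["left"] < gold_x)
    return ("left", "center", "right")[silvers_to_left]
```

Returning `None` for ambiguous scenes lets the autonomous routine fall back to a default action instead of acting on a low-confidence guess.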

  26. “TensorFlow” as bundled in ftc_app • Black box • Not sure what algorithms are involved, though likely a small CNN called a MobileNet • Probably a “re-trained” ImageNet model (transfer learning) • But without knowing, not sure how to improve recognition or speed • Trying to improve repeatability through on-bot lighting • Sketchy performance • We have had trouble getting it to work right • Higher accuracy with minerals on the left than on the right during testing • Our sister team said that “mounting their phone at a sideways angle helped improve accuracy” ¯\_(ツ)_/¯

  27. Pros and Cons of “TensorFlow” Pros • Easy to get started (sample code provided) • Already integrated into ftc_app build Cons • Black box - we don’t know what is happening • Very sketchy performance • Little to no “wow factor” as this is available to all teams

  28. Rolling our own Convolutional Neural Network Instead of using a black box model, why not write our own model? We actually had the idea to train our own CNN before TensorFlow Lite integration was even released to ftc_app.

  29. Step 0: Determine Training Objective of the model Given an image, we wanted the network to output two integers, x and y, giving the location of the gold mineral in the image. Our images are 320x240, so 0 < x < 320 and 0 < y < 240.
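
One common way to set up a coordinate-regression objective like this is to scale the label into the 0..1 range for training and scale the prediction back to pixels afterwards. The sketch below illustrates that idea in plain Python. The normalization, the clamping, and the use of mean squared error are assumptions made for illustration; the slides do not say how the labels were encoded or which loss was used.

```python
IMG_W, IMG_H = 320, 240  # image size stated on the slide

def encode_label(x, y):
    """Scale a pixel coordinate into the 0..1 range used as the training target."""
    return (x / IMG_W, y / IMG_H)

def decode_prediction(nx, ny):
    """Convert a network output back to integer pixel coordinates.

    Predictions slightly outside 0..1 are clamped to stay inside the image.
    """
    x = min(max(round(nx * IMG_W), 0), IMG_W - 1)
    y = min(max(round(ny * IMG_H), 0), IMG_H - 1)
    return (x, y)

def mse(pred, target):
    """Mean squared error between a predicted and a labeled coordinate pair."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)
```

Normalizing keeps both outputs on the same scale, so an error of 10 pixels in x is not penalized differently from the proportionally equivalent error in y.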

  30. Step 1: Capturing training data

  31. Step 2: Label training data We could label coordinates by hand, but this is too difficult. Instead, we wrote a program to help us label the data.

  32. Step 2.1: Write labeling program Available at github.com/arjvik/MineralLabler

  33. Step 2.2: Use labeling program to label images

  34. Step 3: Train model

  35. CNN Structure

  36. Step 3.1: Try again with Java/DL4J

  37. Our CNN Structure (in Java)

  38. Step 4: Continue adapting the model to improve it We are considering turning our model into a sliding window classifier, to potentially increase accuracy. We also need to capture and label more training data, to better fit our model.

  39. Step 5: ?????? Step 6: Profit Of course, if we can get this to work, it will be a great benefit, both during the robot game, and also during judging. For now, we will keep trying to improve it, but we haven’t forgotten about the other two vision methods either. Iron Reign believes in parallel development, and we will continue working on all three before evaluating which one we wish to use later on in the season.

  40. Summary

  41. Resources • GRIP pipeline generator: https://wpilib.screenstepslive.com/s/4485/m/24194/l/463566-introduction-to-grip • TensorFlow Lite on Android: https://medium.com/tensorflow/using-tensorflow-lite-on-android-9bbc9cb7d69d • TensorFlow for Poets codelab: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets • Find out more about Iron Reign’s vision algorithms: https://www.ironreignrobotics.com/tags/vision/index
