Project update 5 of 17
The Vision FPGA SoM campaign is now live!
Hello potential backers, thank you for signing up. This update covers the background and vision behind the Vision FPGA SoM. In case you missed it, do check out the AMA for the project and various other cool projects here.
Of all the human senses, vision is the one that supplies the most information about the world around us. Processing the huge data stream from our eyes (already compressed!) occupies the largest portion of the human brain, in terms of both size and energy. Vision is a key sense that machines are now using to see, as opposed to conventional cameras, whose primary job was to reproduce an image or video as faithfully as possible.
Vision is also the modality missing from most IoT devices due to its high cost, power, and complexity compared to other senses such as balance/motion (accelerometer/gyro) and hearing (hint: olfactory sensors are another area ripe for the picking!).
Machine vision is a term associated with the Cloud more often than with small, battery-powered devices. Machine vision may be defined as the ability not just to capture images but to extract meaning from them, rather than to accurately reproduce the scene: "dog" or "package near door" rather than just the image of a dog or the porch.
The Cloud, while easy to integrate with and offering extensive libraries/frameworks and nearly unlimited compute, requires compromises: the power spent sending images and receiving results, latency, and, perhaps most importantly (but least prioritized!), privacy. A large number of use cases open up once low-power CV runs on the device itself, e.g., toys that respond to their owners' faces, fire alarms that detect the number of people in the room, and so on.
This project aims to bring together the various components of machine vision: hardware, optics, algorithms, and software, into a common platform that will enable a significantly larger group of users to experiment with and incorporate CV into their devices. The Vision FPGA SoM supports makers with one-off experimentation while providing a path to higher-volume products they may come up with.
The ideal user for this module is one looking to incorporate vision into their device, e.g., a smart camera trigger that responds only when motion is detected in a particular region of the field of view.
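To make that use case concrete, here is a minimal sketch of a region-of-interest motion trigger: simple frame differencing over a watched rectangle. The function name, parameters, and thresholds are illustrative assumptions, not the SoM's actual API; on the module itself the equivalent logic would run in FPGA fabric rather than Python.

```python
def motion_in_roi(prev_frame, frame, roi, threshold=25, min_fraction=0.01):
    """Return True if enough pixels changed inside the region of interest.

    prev_frame, frame: 2-D lists of grayscale values (0-255), same shape.
    roi: (top, left, bottom, right) bounds of the watched region.
    threshold: per-pixel absolute difference that counts as "changed".
    min_fraction: fraction of ROI pixels that must change to trigger.
    """
    top, left, bottom, right = roi
    changed = total = 0
    for y in range(top, bottom):
        for x in range(left, right):
            total += 1
            if abs(frame[y][x] - prev_frame[y][x]) > threshold:
                changed += 1
    return total > 0 and changed / total > min_fraction
```

A trigger built this way fires only when the watched rectangle changes, so a waving tree outside the region is ignored while a person crossing it is not.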
As described above, CV for IoT/edge processing has been held back by high cost and the deep expertise required across many aspects of CV: hardware, supply chain, software, and algorithms. This module integrates all of these aspects and presents the user with a low-cost, volume-capable solution.
At my previous employer (Qualcomm Inc.), I founded and eventually led an amazing team to develop a low-power image processing chip/system from the ground up. While this was revolutionary for its time (and even today) in terms of its low power consumption and other features, I came to the realization that there was a large gap between companies building chips and end users creating end products. I saw the need (and opportunity) to create not just low-power/low-cost ASICs but also the infrastructure around them to enable vision.
Cloud-based vision is not only a crowded space but also reasonably well developed. My North Star is to enable vision on IoT devices, which tend to be highly constrained in power, cost, and size yet have potentially huge impact in the world. As Google researcher Pete Warden put it, "The future of ML is Tiny."
I invite you to participate with me in this quest by backing this project. Thank you!