Hello everyone, and thank you for signing up and backing this project!
We will briefly explore some applications and potential use cases of the Vision FPGA SoM in this update.
Disclaimer: Not all of these use cases have actually been prototyped on the Vision SoM, but all of them are technically possible.
The Vision SoM has an image sensor, a 6-DoF IMU, and a microphone. All three are connected to the FPGA and can be used to capture images, record sound, and/or read IMU data. The SoM's host interface is SPI, which virtually every microcontroller supports.
When paired with a microcontroller, the SoM can be used as a camera with sound recording. The IMU can be programmed to send an interrupt when motion is detected, triggering video and sound capture.
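As a rough illustration of the host-side logic, here is a minimal Python sketch of a motion-triggered recorder. It is not the SoM's firmware or API: the class name, the `frames_per_trigger` clip length, and the two callbacks are assumptions made up for this example, and the frames are plain placeholder values rather than real image data.

```python
from dataclasses import dataclass, field


@dataclass
class MotionTriggeredRecorder:
    """Hypothetical host-side state machine: after each motion interrupt,
    keep the next `frames_per_trigger` frames and ignore the rest."""

    frames_per_trigger: int = 30  # assumed clip length, in frames
    remaining: int = 0            # frames still to record in the current clip
    clip: list = field(default_factory=list)

    def on_motion_interrupt(self) -> None:
        # Would be called from the IMU interrupt handler; (re)starts the clip.
        self.remaining = self.frames_per_trigger

    def on_frame(self, frame) -> bool:
        """Store the frame if a recording is active. Returns True if stored."""
        if self.remaining <= 0:
            return False
        self.clip.append(frame)
        self.remaining -= 1
        return True
```

A new interrupt that arrives while a clip is being recorded simply restarts the countdown, so continuous motion keeps the recording going.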
This is a more complex use case. IMUs have inherent drift that causes them to lose accuracy over time. Using the camera, it is possible to correct for this drift by detecting a lack of motion in the visual data and initiating periodic self-calibration.
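One simple way to picture this is a "still frame" check followed by a gyroscope bias update. The sketch below is only an assumption about how such a correction could work, not the method used on the SoM: frames are flat lists of pixel intensities, the stillness test is a mean absolute difference with an arbitrary threshold, and the bias is tracked with a basic exponential moving average.

```python
from typing import Sequence, Tuple


def frames_are_still(prev: Sequence[int], curr: Sequence[int],
                     threshold: float = 2.0) -> bool:
    """Return True when the mean absolute pixel difference between two
    frames is below `threshold` (an assumed, untuned value)."""
    if len(prev) != len(curr) or len(prev) == 0:
        raise ValueError("frames must be non-empty and the same size")
    mean_abs_diff = sum(abs(a - b) for a, b in zip(prev, curr)) / len(prev)
    return mean_abs_diff < threshold


def update_gyro_bias(bias: Tuple[float, ...], gyro_sample: Tuple[float, ...],
                     still: bool, alpha: float = 0.1) -> Tuple[float, ...]:
    """While the scene is still, the true angular rate should be zero, so
    the gyro reading is treated as bias and low-pass filtered into the
    running estimate. When there is motion, the estimate is left unchanged."""
    if not still:
        return tuple(bias)
    return tuple((1 - alpha) * b + alpha * g for b, g in zip(bias, gyro_sample))
```

The estimated bias would then be subtracted from subsequent gyro readings. A real implementation would also need to handle lighting changes and moving objects in an otherwise static scene, which this sketch ignores.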
AR/VR applications require each image to be synchronized and tagged with the corresponding IMU reading. The IMU also needs to be as close as possible to the optical center of the image sensor to minimize error. The Vision SoM was designed with exactly this application in mind: the IMU is placed as close to the image sensor's optical center as possible.
Another way to use the SoM is to process the sensor data on the FPGA and extract information from it, rather than shipping the raw data out to the host. The Lattice SensAI core allows you to run a neural network on the SoM that can be trained to recognize objects such as people and cars, and that can also be trained on non-image data such as the microphone and IMU outputs. The details of this flow will be covered in a future update.
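To make the tagging step concrete, here is a small Python sketch that pairs a frame with the IMU sample closest to it in time. It assumes, purely for illustration, that frames and IMU samples carry timestamps from the same clock and that the IMU timestamps are sorted in ascending order; the function name is hypothetical and this is not how the FPGA performs the synchronization.

```python
from bisect import bisect_left
from typing import Sequence


def nearest_imu_index(imu_timestamps: Sequence[float], frame_ts: float) -> int:
    """Return the index of the IMU sample whose timestamp is closest to
    `frame_ts`. `imu_timestamps` must be sorted ascending. On an exact
    tie, the earlier sample is chosen."""
    if not imu_timestamps:
        raise ValueError("no IMU samples available")
    i = bisect_left(imu_timestamps, frame_ts)
    if i == 0:
        return 0                          # frame is before the first sample
    if i == len(imu_timestamps):
        return len(imu_timestamps) - 1    # frame is after the last sample
    before, after = imu_timestamps[i - 1], imu_timestamps[i]
    return i if (after - frame_ts) < (frame_ts - before) else i - 1
```

Because the lookup is a binary search, it stays cheap even when the IMU runs at a much higher rate than the camera.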
You can build a camera that starts recording when it detects people. The on-board qSPI SRAM allows for storage of a large amount of video that can be retrieved later.
How about a touch-less user interface driven by gestures?
Feeding the microphone data to the neural network makes it possible to detect spoken keywords. We have successfully trained the system to recognize four keywords (Forward, Backward, Left, Right), so you could drive a toy car by voice alone!
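On the host side, turning a recognized keyword into a drive command can be very simple. The sketch below is a hypothetical example, not code from the project: the wheel-speed values, the confidence threshold, and the idea that the network reports a confidence score alongside the keyword are all assumptions.

```python
from typing import Tuple

# Assumed (left, right) wheel speeds in the range -1.0 to 1.0 for each keyword.
COMMANDS = {
    "forward": (1.0, 1.0),
    "backward": (-1.0, -1.0),
    "left": (-0.5, 0.5),
    "right": (0.5, -0.5),
}

STOP = (0.0, 0.0)


def keyword_to_motor_speeds(keyword: str, confidence: float,
                            min_confidence: float = 0.8) -> Tuple[float, float]:
    """Map a detected keyword to wheel speeds. Unknown keywords and
    low-confidence detections stop the car instead of guessing."""
    if confidence < min_confidence:
        return STOP
    return COMMANDS.get(keyword.strip().lower(), STOP)
```

Stopping on anything uncertain is a deliberately conservative choice for a moving toy.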
Did you know that falls are a leading cause of death for people over 65? The Apple Watch and similar devices have introduced fall-detection features for this reason. Now you can try this yourself by training a neural network that runs on the SoM to detect a fall.
The project is not yet buttoned up! A lot of work remains in terms of software, documentation, marketing, training framework, hardware verification and so on. We invite you to collaborate using the Crowd Supply Discord channel and/or the tinyVision Discord channel.