8 Machine Vision Development

What is image recognition?

  1. Principle of image recognition

    • Image recognition is the use of computers to process, analyze, and understand images in order to identify targets and objects of various patterns. It is a practical application of deep learning algorithms.
  2. Application scenarios of image recognition

    • At present, image recognition technology is generally divided into face recognition and product recognition. Face recognition is mainly used in security checks, identity verification, and mobile payment. Product recognition is mainly used in the circulation of goods, especially in unmanned retail settings such as unmanned shelves and smart retail cabinets.
  3. Artificial intelligence application of image recognition

    • Image recognition is an important field of artificial intelligence. Various image recognition models have been proposed so that computer programs can simulate human image recognition. One example is the template matching model. This model holds that to recognize an image, one must hold a memory pattern of that image from past experience, known as a template. If the current stimulus matches the template in the brain, the image is recognized. For example, a letter "A" is recognized only if its size, orientation, and shape exactly match the template of "A" in the mind. The model is simple, straightforward, and easy to apply in practice. However, it requires the image to be completely consistent with the template before it can be recognized. In fact, people can also recognize images that do not completely match a template: they can identify not only one specific letter "A", but also printed, handwritten, misaligned, and differently sized letters "A". Moreover, people can recognize a very large number of images, and it is implausible that each one has a corresponding template stored in the brain.
    • To address the problems of the template matching model, Gestalt psychologists proposed the prototype matching model. According to this model, what is stored in long-term memory is not innumerable templates, but certain "similarities" abstracted from images. Such a similarity serves as a prototype against which the image to be recognized is tested. If a similar prototype is found, the image is identified. This model is more plausible than template matching, both neurologically and in terms of memory search, and it can also explain the recognition of images that are irregular but similar to the prototype in some respects. However, it does not explain how people distinguish and process similar stimuli, and it is difficult to implement in a computer program. A more complex model was therefore proposed: the "Pandemonium" recognition model.
    • In industrial applications, images are usually captured by industrial cameras and then processed by software that uses grayscale differences in the image to extract useful information. A representative vendor of image recognition software is Cognex.
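
As a minimal sketch of the difference between the template matching and prototype matching models described above, the Python below compares an exact pixel-for-pixel match with a similarity-based match on toy 5x5 binary glyphs. The glyph data, the function names, the pixel-agreement similarity measure, and the 0.8 threshold are all illustrative assumptions, not part of either psychological model.

```python
# Toy 5x5 binary glyphs: "1" is ink, "0" is background (illustrative data only).
PROTOTYPES = {
    "A": ("01110", "10001", "11111", "10001", "10001"),
    "L": ("10000", "10000", "10000", "10000", "11111"),
}

# The same "A" with one pixel flipped, standing in for a handwritten variant.
NOISY_A = ("01100", "10001", "11111", "10001", "10001")


def template_match(image, template):
    """Template matching: succeed only if every pixel is identical."""
    return tuple(image) == tuple(template)


def similarity(image, prototype):
    """Fraction of pixel positions on which two equally sized glyphs agree."""
    pairs = [(a, b) for row_i, row_p in zip(image, prototype)
             for a, b in zip(row_i, row_p)]
    return sum(a == b for a, b in pairs) / len(pairs)


def prototype_match(image, prototypes, threshold=0.8):
    """Prototype matching: label of the most similar prototype, or None
    if even the best score falls below the (assumed) threshold."""
    best_label = max(prototypes, key=lambda k: similarity(image, prototypes[k]))
    if similarity(image, prototypes[best_label]) >= threshold:
        return best_label
    return None


print(template_match(NOISY_A, PROTOTYPES["A"]))  # exact match fails: False
print(prototype_match(NOISY_A, PROTOTYPES))      # tolerant match succeeds: A
```

A single flipped pixel defeats the exact comparison, while the similarity-based comparison still finds "A" because 24 of the 25 pixels agree.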
  4. Development of image recognition

    • The development of image recognition has gone through three stages: character recognition, digital image processing and recognition, and object recognition. Research on character recognition began in 1950. It generally targets letters, numbers, and symbols, and its applications range from printed to handwritten character recognition.
    • Research on digital image processing and recognition began in 1965. Compared with analog images, digital images are easy to store, transmit, compress, and process, and they resist distortion during transmission, all of which gave strong impetus to the development of image recognition technology. Object recognition mainly refers to perceiving and understanding objects and environments in the three-dimensional world, and belongs to advanced computer vision. It builds on digital image processing and recognition, combined with artificial intelligence, systems science, and other disciplines, and its results have been widely used in industrial and exploration robots. One shortcoming of modern image recognition technology is poor adaptability: once the target image is corrupted by strong noise or has large defects, the recognition result is often unsatisfactory.
    • Mathematically, image recognition is a mapping from pattern space to category space. Three main recognition methods have emerged: statistical pattern recognition, structural pattern recognition, and fuzzy pattern recognition. Image segmentation is a key technology in image processing and has been studied intensively since the 1970s. Thousands of segmentation algorithms have been proposed, drawing on various theories, and research in this field remains active.
    • There are many image segmentation methods, including threshold segmentation, edge detection, region extraction, and methods combined with specific theoretical tools. By image type, they include grayscale image segmentation, color image segmentation, and texture image segmentation. An edge detection operator was proposed as early as 1965, and many classical edge detection algorithms followed. Over the past 20 years, with the rapid development of histogram-based and wavelet-based image segmentation, computing technology, and VLSI technology, research on image processing has made great progress. Image segmentation methods now draw on specific theories, methods, and tools, such as segmentation based on mathematical morphology, wavelet transforms, and genetic algorithms.
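
Two of the segmentation methods named above, threshold segmentation and edge detection, can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions: the 3x4 grayscale image, the threshold of 128, the step of 100, and the function names are all made up for the example, and the edge detector is a bare neighbour-difference test rather than any of the classical operators.

```python
def threshold_segment(gray, threshold):
    """Binary mask: 1 where the pixel is at least `threshold`, else 0."""
    return [[1 if p >= threshold else 0 for p in row] for row in gray]


def horizontal_edges(gray, min_step):
    """Mark positions where the grayscale difference between a pixel and
    its right-hand neighbour is at least `min_step`."""
    return [
        [1 if abs(row[x + 1] - row[x]) >= min_step else 0
         for x in range(len(row) - 1)]
        for row in gray
    ]


# Toy grayscale image: dark region on the left, bright region on the right.
IMAGE = [
    [10, 12, 200, 210],
    [11, 13, 205, 208],
    [9, 14, 198, 202],
]

mask = threshold_segment(IMAGE, 128)
edges = horizontal_edges(IMAGE, 100)
print(mask)   # [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1]]
print(edges)  # [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```

The mask separates the bright region from the dark one, and the edge map marks the single column where the grayscale value jumps.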

Using M5StickV + MaixPy IDE for image recognition development

Development Platform

  • MaixPy IDE

Development Environment

  • Windows
  • Linux

Developer Components

  • M5Stack M5StickV




M5StickV RISC-V AI Camera

M5Stack recently launched a new AIoT (AI + IoT) camera powered by the Kendryte K210, an edge computing system-on-chip (SoC) with a dual-core 64-bit RISC-V CPU and an advanced neural network processor.

The M5StickV AI camera has machine vision capabilities. It is equipped with an OmniVision OV7740 image sensor, which uses OmniPixel®3-HS technology for optimum low-light sensitivity, and it supports various visual recognition tasks (for example, real-time acquisition of the size, type, and coordinates of a detected target). In addition to the OV7740 sensor, the M5StickV provides further hardware resources such as a speaker with a built-in I2S Class-D DAC, an IPS screen, a 6-axis IMU, a 200mAh Li-Po battery, and more.

It can perform convolutional neural network calculations at low power consumption, which makes the M5StickV a good zero-threshold embedded machine vision solution. It supports MicroPython, which keeps your code concise when programming the M5StickV.

Product Features

  • Dual-Core 64-bit RISC-V RV64IMAFDC (RV64GC) CPU / 400MHz (Normal)
  • Dual Independent Double Precision FPU
  • Neural Network Processor (KPU) / 0.8 TOPS
  • Field-Programmable IO Array (FPIOA)
  • Dual hardware 512-point 16-bit Complex FFT
  • SPI, I2C, UART, I2S, RTC, PWM, Timer Support
  • AES, SHA256 Accelerator
  • Direct Memory Access Controller (DMAC)
  • MicroPython Support
  • Firmware encryption support
  • Case Material: PC + ABS


  • Face recognition/detection
  • Object detection/classification
  • Obtaining size and coordinates of the target in real-time
  • Obtaining the type of detected target in real-time
  • Shape recognition
  • Video/Display
  • Game simulator

USB driver problems

  • M5StickV may not work without a driver on some systems. Users can manually install the FTDI driver to fix this problem.

Resources                              Parameter
Kendryte K210                          Dual-core 64-bit RISC-V RV64IMAFDC (RV64GC) CPU / 400MHz (Normal)
Flash                                  16M
Power input                            5V @ 500mA
KPU parameter size of neural network
Port                                   Type-C x 1, GROVE (I2C + I/O + UART) x 1
Button                                 Custom button x 2
IPS screen                             1.14-inch TFT, 135 x 240, ST7789
Camera                                 OV7740 (0.3 megapixels)
FOV                                    55 deg
Battery                                200mAh
External storage                       TF-card (microSD)
Net weight                             23g
Gross weight                           82g
Product Size                           48 x 24 x 22mm
Package Size                           144 x 44 x 43mm
Case Material                          Plastic (PC)


  • M5StickV does not currently recognize all types of TF-card (microSD). We have tested some common TF-cards (microSD), with the following results.
Brand     Storage  Type  Class    Format  Test Results
Kingston  8G       HC    Class4   FAT32   ok
Kingston  16G      HC    Class10  FAT32   ok
Kingston  32G      HC    Class10  FAT32   no
Kingston  64G      XC    Class10  FAT     ok
SanDisk   16G      HC    Class10  FAT32   ok
SanDisk   32G      HC    Class10  FAT32   ok
SanDisk   64G      XC    Class10  /       no
SanDisk   128G     XC    Class10  /       no
XIAKE     16G      HC    Class10  FAT32   ok (purple)
XIAKE     32G      HC    Class10  FAT32   ok
XIAKE     64G      XC    Class10  /       no
XIAKE     32G      HC    Class10  /       no

Functional Description

Kendryte K210

The Kendryte K210 is a system-on-chip (SoC) that integrates machine vision. It is built on TSMC's ultra-low-power 28nm advanced process and has dual-core 64-bit processors for better power efficiency, stability, and reliability. The SoC strives for "zero threshold" development, so that it can be deployed in the user's products in the shortest possible time, giving the product artificial intelligence.

  • Machine Vision
  • Better low power vision processing speed and accuracy
  • KPU high performance Convolutional Neural Network (CNN) hardware accelerator
  • Advanced TSMC 28nm process, temperature range -40°C to 125°C
  • Firmware encryption support
  • Unique programmable IO array maximises design flexibility
  • Low voltage, reduced power consumption compared to other systems with the same processing power
  • 3.3V/1.8V dual voltage IO support eliminates need for level shifters


The chip contains a high-performance, low power RISC-V ISA-based dual core 64-bit CPU with the following features:

  • Core Count: Dual-core processor
  • Bit Width: 64-bit CPU
  • Frequency: 400MHz
  • ISA extensions: IMAFDC
  • FPU: Double Precision
  • Platform Interrupts: PLIC
  • Local Interrupts: CLINT
  • I-Cache: 32KiB x 2
  • D-Cache: 32KiB x 2
  • On-Chip SRAM: 8MiB


  • support for output formats: RAW RGB and YUV
  • support for image sizes: VGA, QVGA, CIF and any size smaller
  • support for black sun cancellation
  • support for internal and external frame synchronization
  • standard SCCB serial interface
  • digital video port (DVP) parallel output interface
  • embedded one-time programmable (OTP) memory
  • on-chip phase lock loop (PLL)
  • embedded 1.5 V regulator for core
  • array size: 656 x 488
  • power supply: core 1.5VDC ± 5%; analog 3.3V ± 5%; I/O 1.7 ~ 3.47V
  • temperature range: operating -30°C to 70°C; stable image 0°C to 50°C
  • output format: 8-/10-bit raw RGB data; 8-bit YUV
  • lens size: 1/5"
  • input clock frequency: 6 ~ 27 MHz
  • max image transfer rate: VGA (640 x 480) 60 fps; QVGA (320 x 240) 120 fps
  • sensitivity: 6800 mV/(Lux-sec)
  • maximum exposure interval: 502 x tROW
  • pixel size: 4.2 μm x 4.2 μm
  • image area: 2755.2 μm x 2049.6 μm
  • package/die dimensions: CSP3 4185 μm x 4345 μm; COB 4200 μm x 4360 μm
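
The array size, pixel size, and image area listed above are related by simple arithmetic, which the following sketch checks. The constant and function names are illustrative and do not come from any sensor driver.

```python
ARRAY_COLS, ARRAY_ROWS = 656, 488  # array size from the list above
PIXEL_PITCH_UM = 4.2               # pixel size in micrometres


def image_area_um(cols, rows, pitch_um):
    """Width and height of the pixel array in micrometres, rounded to 0.1 um."""
    return round(cols * pitch_um, 1), round(rows * pitch_um, 1)


def pixel_count(cols, rows):
    """Total number of pixels in the array."""
    return cols * rows


width_um, height_um = image_area_um(ARRAY_COLS, ARRAY_ROWS, PIXEL_PITCH_UM)
print(width_um, height_um)                  # 2755.2 2049.6
print(pixel_count(ARRAY_COLS, ARRAY_ROWS))  # 320128, roughly 0.3 megapixels
```

The computed width and height reproduce the listed image area of 2755.2 μm x 2049.6 μm.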


  • Single-Supply Operation (2.5V to 5.5V).
  • 3.2W Output Power into 4Ω at 5V
  • 2.4mA Quiescent Current
  • 92% Efficiency (RL = 8Ω, POUT = 1W)
  • 22.8µVRMS Output Noise (AV = 15dB)
  • Low 0.013% THD+N at 1kHz
  • No MCLK Required
  • Sample Rates of 8kHz to 96kHz
  • Supports Left, Right, or (Left/2 + Right/2) Output
  • Sophisticated Edge Rate Control Enables Filterless Class D Outputs
  • 77dB PSRR at 1kHz
  • Low RF Susceptibility Rejects TDMA Noise from GSM Radios
  • Extensive Click-and-Pop Reduction Circuitry


  • Operating voltage: 2.9V - 6.3V (AMR: -0.3V ~ 15V)
  • Configurable Intelligent Power Select system
  • Adaptive current and voltage limiting for USB or AC adapter input
  • Internal ideal diode resistance below 100mΩ


Gyroscope features

  • Digital-output X-, Y-, and Z-axis angular rate sensors (gyroscopes) with a user-programmable full-scale range of ±250 dps, ±500 dps, ±1000 dps, and ±2000 dps and integrated 16-bit ADCs
  • Digitally-programmable low-pass filter
  • Low-power gyroscope operation
  • Factory calibrated sensitivity scale factor
  • Self-test

Accelerometer features

  • Digital-output X-, Y-, and Z-axis accelerometer with a programmable full scale range of ±2g, ±4g, ±8g and ±16g and integrated 16-bit ADCs
  • User-programmable interrupts
  • Wake-on-motion interrupt for low power operation of applications processor
  • Self-test
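
Because both sensors use 16-bit ADCs with a programmable full-scale range, a raw sample can be scaled to physical units once the configured range is known. The sketch below assumes that samples are two's complement values delivered high byte first and that the ± full-scale range spans the whole signed 16-bit range; both are assumptions to verify against the sensor datasheet, and the function names are hypothetical.

```python
GYRO_RANGES_DPS = (250, 500, 1000, 2000)  # full-scale options listed above
ACCEL_RANGES_G = (2, 4, 8, 16)


def to_signed16(high_byte, low_byte):
    """Combine two register bytes into a signed 16-bit value.
    Assumes high byte first and two's complement encoding."""
    value = (high_byte << 8) | low_byte
    return value - 0x10000 if value & 0x8000 else value


def raw_to_physical(raw, full_scale):
    """Scale a signed 16-bit sample to physical units, assuming the
    +/- full_scale range maps onto 32768 counts per side."""
    return raw * full_scale / 32768


print(raw_to_physical(to_signed16(0x40, 0x00), 2))    # 1.0 (g at the ±2g range)
print(raw_to_physical(to_signed16(0x80, 0x00), 250))  # -250.0 (dps at ±250 dps)
```

Choosing a smaller full-scale range gives finer resolution per count, at the cost of clipping larger motions.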

SPI / I2C dual communication mode

Note: M5Stack has currently released two versions of the M5StickV. When programming, users need to configure the pins according to the version they have. The differences are as follows.

  • In the I2C single-mode (blue PCB) version of the M5StickV circuit design, the MPU6886 can only be configured to communicate over I2C, and its pin mapping is SCL-28, SDA-29.
  • In the SPI/I2C dual-mode (black PCB) version of the M5StickV circuit design, the MPU6886 can be configured to communicate over SPI or I2C, and its pin mapping is SCL-26, SDA-27. The CS pin level selects the mode: high (1) selects I2C mode, and low (0) selects SPI mode.
  • The specific pin mapping is shown below:
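
One way to keep the two board versions apart in code is a small lookup table built from the pin numbers in the note above. This is only a sketch: the dictionary layout, the "blue"/"black" keys, and the `imu_config` helper are hypothetical names, and the pin labels are copied as given in the text rather than taken from a schematic.

```python
# Pin numbers and supported modes as stated in the note above.
MPU6886_PINS = {
    "blue": {"modes": ("I2C",), "scl": 28, "sda": 29},
    "black": {"modes": ("SPI", "I2C"), "scl": 26, "sda": 27},
}

# CS pin level that selects each mode on the dual-mode (black PCB) version.
CS_LEVEL = {"I2C": 1, "SPI": 0}


def imu_config(pcb_color, mode="I2C"):
    """Return the pin settings for one board version and communication mode.
    Raises ValueError if that version does not support the mode."""
    board = MPU6886_PINS[pcb_color]
    if mode not in board["modes"]:
        raise ValueError(f"{pcb_color} PCB does not support {mode} mode")
    config = {"mode": mode, "scl": board["scl"], "sda": board["sda"]}
    if pcb_color == "black":
        config["cs_level"] = CS_LEVEL[mode]  # only the dual-mode board uses CS
    return config


print(imu_config("blue"))          # {'mode': 'I2C', 'scl': 28, 'sda': 29}
print(imu_config("black", "SPI"))  # {'mode': 'SPI', 'scl': 26, 'sda': 27, 'cs_level': 0}
```

The returned values can then be passed to whatever bus setup call the firmware provides.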








