Dissertation Project - Air Drumming System Design with LSTM Motion Detection

Posted on October 10, 2019 by Yanislav Donchev

Note: this is a brief demonstration of my dissertation project. For more details please check out my full dissertation here.

I proposed my dissertation project to Dr Jize Yan, who was impressed by my idea. The project was to implement an air drumming device - a business idea I have had for a while. To play the drums, a drummer only needs a pair of drumsticks, one or two foot sensors and a mobile phone, so the drums can be played anywhere. Before I continue with the description, I would like to show a short video of my project, which the University of Southampton decided to play on open days (to attract students to electronics and computer science).

Detecting Strikes

Two approaches were used for detecting strikes. The first one was a custom, manually tuned peak-detection algorithm which detects sharp peaks in acceleration (my research found that these correspond to strikes) while filtering out noise. This algorithm works in real time and has a one-sample delay between the occurrence of a peak and its detection.

A state diagram of the strike detection algorithm. $a$ is the linear drumstick acceleration and $\dot{a}$ and $\ddot{a}$ are its first and second time derivatives. $a_{th}$, $\ddot{a}_{th}$ and $t_{dth}$ are three constants. Their optimal values have been found by inspection to be $1.953\,\mathrm{m\,s^{-2}}$, $0.733\,\mathrm{m\,s^{-4}}$ and $20\,\mathrm{ms}$ respectively.
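The state diagram itself is not reproduced here, so the following is only a minimal Python sketch of how a detector built around these three constants could work, not the exact algorithm from the dissertation. It assumes that a strike is a local maximum of $a$ above $a_{th}$, that the peak must be sharp (second difference below $-\ddot{a}_{th}$), and that $t_{dth}$ acts as a dead time after each strike. The class name `PeakDetector`, the method `update` and the sample period `dt` are hypothetical.

```python
from dataclasses import dataclass, field

A_TH = 1.953    # m/s^2, acceleration threshold (value from the text)
DDA_TH = 0.733  # m/s^4, second-derivative threshold (value from the text)
T_DTH = 0.020   # s, time threshold (value from the text; used here as a dead time)


@dataclass
class PeakDetector:
    dt: float                 # sample period in seconds (assumed, not from the text)
    a_th: float = A_TH
    dda_th: float = DDA_TH
    t_dth: float = T_DTH
    _prev: list = field(default_factory=list)  # the two most recent samples
    _since_last: float = float("inf")          # time since the last reported strike

    def update(self, a: float) -> bool:
        """Feed one sample; return True if the previous sample was a strike peak."""
        self._since_last += self.dt
        strike = False
        if len(self._prev) == 2:
            a0, a1 = self._prev
            da_before = (a1 - a0) / self.dt           # slope leading into a1
            da_after = (a - a1) / self.dt             # slope leaving a1
            dda = (da_after - da_before) / self.dt    # second difference at a1
            is_peak = da_before > 0 and da_after <= 0
            if (is_peak and a1 > self.a_th and -dda > self.dda_th
                    and self._since_last >= self.t_dth):
                strike = True
                self._since_last = 0.0
        self._prev = (self._prev + [a])[-2:]
        return strike
```

Because a peak can only be confirmed once the sample after it has arrived, this sketch has the same one-sample delay as described above.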

The second approach was to implement a Long Short-Term Memory (LSTM) neural network on a microcontroller. This allowed for motion detection that could easily be extended to different types of motion by retraining. Below is the structure of the network I used.

A structure of the LSTM network. The LSTM layers consist of 64 and 32 units respectively, and the Dense layer before the output consists of 16 units.
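To get a feel for what has to fit on the microcontroller, the layer sizes above can be turned into a rough parameter count. This is a back-of-the-envelope sketch using the standard LSTM and Dense parameter formulas. The input size (three acceleration axes) and the number of output classes (two) are assumptions, not figures from the dissertation.

```python
def lstm_params(input_dim: int, units: int) -> int:
    # Four gates, each with input weights, recurrent weights and a bias vector.
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim: int, units: int) -> int:
    # Weight matrix plus bias vector.
    return input_dim * units + units


N_AXES = 3     # assumption: one input feature per acceleration axis
N_CLASSES = 2  # assumption: "strike" / "no strike"

layers = {
    "lstm_64": lstm_params(N_AXES, 64),
    "lstm_32": lstm_params(64, 32),
    "dense_16": dense_params(32, 16),
    "output": dense_params(16, N_CLASSES),
}
total = sum(layers.values())
kib_float32 = total * 4 / 1024  # 4 bytes per 32-bit float weight

print(layers)
print(total, round(kib_float32, 1))
```

Under these assumptions the network has about 30 thousand parameters, or roughly 119 KiB when stored as 32-bit floats.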

This network outputs the strike label for the whole duration of the strike peak. However, only a single spike is needed to trigger a drum sound. To solve this, a helper network (see below) was stacked on the strike output. It outputs 1 when 'Strike' transitions from 0 to 1, and 0 otherwise.

The helper network for spike generation. This network was trained to only output 1 when the strike signal rises.
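The behaviour the helper network was trained to reproduce is a rising-edge detector. Written as plain code rather than as a network, the target function looks like the sketch below. The function name `rising_edges` is hypothetical, and the project itself used a trained network for this step.

```python
def rising_edges(strike):
    """Return 1 where the strike signal goes from 0 to 1, and 0 elsewhere."""
    prev = 0
    out = []
    for s in strike:
        out.append(1 if (s == 1 and prev == 0) else 0)
        prev = s
    return out


# A strike label held high for several samples becomes a single spike.
print(rising_edges([0, 0, 1, 1, 1, 0, 1, 1, 0]))
```

A function like this could also be used to generate training targets for such a helper network from the labelled strike signal.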

Game Development for Automatic Data Labelling

One of the biggest problems that I faced was how to obtain a large dataset of labelled data. Drumming requires very specific motions, so there were no public datasets that I could use. Manual labelling would be hugely impractical and prone to errors. Writing an algorithm that can label the data automatically would eliminate the need for a neural network for generalised motion detection. Hitting conductive surfaces and looking at the voltage waveform would not work either, because the acceleration waveform of striking the air differs from that of striking a surface. These are just a few of the flawed ideas I considered.

One day I finally found a solution. What if I could develop a game in which the player uses the drumstick as a controller and is forced to perform the drumming motion in order to win? If the player passed the level, then I could be sure that the acceleration waveform that I logged corresponded to a drumming motion. Eureka!

Below are a few screenshots of the game with the acceleration waveform underneath them. The player has to stay inside the blue tunnel while the spacecraft (the white arrow) moves forwards at a constant speed. Hitting the walls of the blue tunnel causes the spacecraft to explode. The curves in the game were shaped so that, when flying through them, the player performs a drum strike motion with the drumstick. The width of the tunnel allows for variation in the strikes.

Plot of the logged acceleration data while passing certain sections of the game. The orange bounds represent the parts in which the acceleration data is labelled as “strike”; at any other time, the data is labelled as “no strike”. The distinctive acceleration peaks of air strikes are clearly visible at the curvy paths.
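The labelling rule described in the caption can be sketched in a few lines of Python. This assumes the game logs the time intervals during which the spacecraft is inside a curved section. The interval representation and the function name `label_samples` are hypothetical, not taken from the game's code.

```python
def label_samples(timestamps, strike_intervals):
    """Label each sample by whether its timestamp falls inside a curved section.

    timestamps: sample times in seconds.
    strike_intervals: list of (start, end) times of the curved sections (assumed format).
    """
    labels = []
    for t in timestamps:
        inside = any(start <= t < end for start, end in strike_intervals)
        labels.append("strike" if inside else "no strike")
    return labels


# Example: one curved section between 0.2 s and 0.4 s.
print(label_samples([0.0, 0.1, 0.2, 0.3, 0.4, 0.5], [(0.2, 0.4)]))
```

Only runs in which the player completes the level would be kept, since that is what guarantees the logged motion matches the labels.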

A total of 2 hours of data was collected from 4 different players (about 1100 labelled samples of acceleration data). The data was pre-processed as shown in the figure below. The trained network achieved 100% accuracy and performed as expected when ported to the drumstick's microcontroller.

A visualisation of the data formatting process. The plotted data is the acceleration from the three axes against time.
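The figure is not reproduced here, so the exact formatting steps are not known from this post. One common way to prepare three-axis acceleration data for an LSTM is to cut it into fixed-length, overlapping windows, and the sketch below illustrates that idea only. The window length, the stride, the choice to label each window by its last sample, and the function name `make_windows` are all assumptions.

```python
def make_windows(samples, labels, window, stride):
    """Cut a recording into overlapping windows for sequence models.

    samples: list of (ax, ay, az) tuples, one per time step.
    labels: one label per time step.
    Returns (X, y), where each X[i] is `window` consecutive samples and
    y[i] is the label of the last sample in that window (an assumption).
    """
    X, y = [], []
    for start in range(0, len(samples) - window + 1, stride):
        end = start + window
        X.append(samples[start:end])
        y.append(labels[end - 1])
    return X, y


# Example: six time steps, windows of three steps, moving two steps at a time.
samples = [(i, i, i) for i in range(6)]
labels = [0, 0, 1, 1, 0, 0]
X, y = make_windows(samples, labels, window=3, stride=2)
print(len(X), y)
```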

Note: this was just a brief demonstration of my dissertation project. For more details please check out my full dissertation here.

Project Highlights
  • A game that automatically labels motion data in a novel approach
  • Generalised motion detection with an LSTM neural network
  • Application specific motion detection for better performance (negative latency)
  • Porting an LSTM neural network from TensorFlow to an ARM Cortex-M4 microcontroller
  • Compact and modern plastic enclosure

Drumstick Features
  • Negative latency - predicting strikes 27.8 ms before occurrence
  • Over-the-Air (OTA) firmware update
  • USB Type-C charging (0% to 100% in 20 minutes)
  • 9 Degrees-of-Freedom (DoF) motion sensing
  • BLE connectivity to wireless MIDI applications
  • Red LED charging indicator
  • Four white LEDs for status indication
  • >20 hour battery life
  • Single button for power on/off, bootloader mode and application functionality