
Inertial Sensor Side Channel: Guessing Your Unlock Code

A few years back I started wondering how much information could be extracted from a phone’s inertial sensors to guess sensitive data such as PINs, unlock codes or other passwords. One of the first papers I came across was "(sp)iPhone: Decoding Vibrations From Nearby Keyboards Using Mobile Phone Accelerometers", in which the authors decoded keystrokes from a nearby keyboard using the phone’s inertial sensors.

It was not until a couple of months ago that I started working on this more seriously: I wrote a few applications to collect data, analysed it and came to some conclusions. In this post I describe the materials and methods I used, in case they are handy to someone else.

First Step: Retrieving Sensor Information

At this stage I needed to gather as much reliable sensor information as possible. I wrote a Python script for Pythonista that displays a screenshot of the unlock screen and records the sensors for a fixed amount of time. Each sensor reading is stored together with a flag saying whether the screen was being touched, which gives the exact moment of every touch.

This approach allowed me to visually correlate the touch and the sensor data and think about which features should be included in the classification algorithm.

from scene import *
import sound

class MyScene(Scene):
    def setup(self):
        # Called once before the first frame is drawn.
        self.touched = 0   # 1 while a finger is on the screen
        self.tt = ''       # CSV rows buffered for the current capture
        self.cnt = 0       # frames recorded in the current capture
        self.filecnt = 0   # index of the next output file

    def draw(self):
        # Called for every frame (typically 60 times per second).
        background(0, 0, 0)
        r = self.bounds
        image('_keypad_lock', 0, 0, r.w, r.h)
        if self.cnt < 500:
            v = gravity()
            row = '%s,%s,%s,%s' % (v.x, v.y, v.z, self.touched)
            print(row)
            self.tt += row + '\n'
            fill(1, 0, 0)
            for touch in self.touches.values():
                rect(touch.location.x - 13, touch.location.y - 20, 26, 40)
            if self.cnt == 150:
                # Cue the user to start tapping. The effect name was lost
                # from the original listing; 'Beep' is an assumption.
                sound.play_effect('Beep')
            self.cnt += 1
        else:
            # Capture finished: save it and start the next one.
            with open('../data/data-789-%03d.txt' % self.filecnt, 'w') as data:
                data.write(self.tt)
            self.filecnt += 1
            self.cnt = 0
            self.tt = ''

    def touch_began(self, touch):
        self.touched = 1

    def touch_moved(self, touch):
        pass

    def touch_ended(self, touch):
        self.touched = 0

    def pause(self):
        pass

    def stop(self):
        pass

run(MyScene())

The draw() method is called about 60 times per second, and the script records 500 frames per capture (a little over 8 seconds), storing the sensor values and whether the screen was touched in each frame. It plays a sound to tell the user when to start typing the code; this is convenient at this stage because it makes the captures easier to compare visually before going further with the data gathering.

I took samples of 3 digits at a time: for example, I tapped the digits 123 each time the sound played, over and over until I got tired, and then repeated the process with another set of digits, say 147.
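As a rough sketch of how one of these captures can be split into individual taps, the snippet below parses the `x,y,z,touched` rows and returns the frame ranges in which the touch flag is set. The function names and the sample rows are made up for illustration, and it is written in plain Python so it runs without NumPy.

```python
def parse_capture(text):
    """Parse 'x,y,z,touched' lines into (x, y, z, touched) tuples."""
    rows = []
    for line in text.splitlines():
        if line.strip():
            x, y, z, touched = line.split(',')
            rows.append((float(x), float(y), float(z), int(touched)))
    return rows


def touch_intervals(flags):
    """Return (start, end) frame indices of each run of 1s; end is exclusive."""
    intervals = []
    start = None
    for idx, flag in enumerate(flags):
        if flag == 1 and start is None:
            start = idx                      # finger just landed
        elif flag != 1 and start is not None:
            intervals.append((start, idx))   # finger just lifted
            start = None
    if start is not None:                    # capture ended mid-touch
        intervals.append((start, len(flags)))
    return intervals


# Tiny made-up capture containing two taps (illustrative values only).
sample = (
    "0.0,-0.7,-0.7,0\n"
    "0.01,-0.7,-0.7,1\n"
    "0.02,-0.71,-0.69,1\n"
    "0.0,-0.7,-0.7,0\n"
    "0.0,-0.7,-0.7,1\n"
)
rows = parse_capture(sample)
taps = touch_intervals([r[3] for r in rows])
print(taps)  # [(1, 3), (4, 5)]
```

A capture recorded as described should yield three intervals, one per digit; a different count is a cheap way to spot a botched sample.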

Second Step: Analysing Raw Data

To make something useful out of the raw sensor data I wrote another Python script, this time using matplotlib and numpy.

I focused on presenting the data so I could get a sense of what information could be extracted from the inertial sensors. I plotted the three sensor axes along with their first derivatives. Below are two samples, for the codes 147 and 123:

[Figure: Digits 147]
[Figure: Digits 123]

The left column shows the raw sensor data (x axis in the first row, y axis in the second and z axis in the last); the right column shows the first derivative of each signal, and the vertical red lines mark where the touch for each digit begins.

As can be seen, some common patterns for each digit emerge when the three axes are viewed together. The next task is to analyse more than one dataset and try to classify the data we have.

As a note, a first derivative of 0 in the right column means that the signal has reached the peak of the touch; this would be helpful when automating the whole process in a trojan demo.

Here is the script that plots the graphs above and also calculates an average for each digit. Its first argument is the digits contained in the files being processed; for each digit it appends a row to a file named after it (e.g. 4.txt) with the distance between the sensor values during the touch and their resting values.
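With sampled data the first derivative rarely lands on exactly 0, so one way to automate this is to look for the frame where the first difference changes sign. The snippet below is a minimal sketch of that idea in plain Python; the function names and the sample signal are hypothetical, and flat plateaus (where the difference is exactly 0) are deliberately ignored.

```python
def first_difference(signal):
    """Discrete first derivative: difference between consecutive samples."""
    return [b - a for a, b in zip(signal, signal[1:])]


def extrema(signal):
    """Indices where the first difference changes sign (peaks or troughs)."""
    diff = first_difference(signal)
    found = []
    for i in range(1, len(diff)):
        # diff[i - 1] is the slope arriving at sample i, diff[i] the slope leaving it.
        if diff[i - 1] * diff[i] < 0:
            found.append(i)
    return found


# Made-up signal: rises to a peak at index 2, dips to a trough at index 4.
signal = [0.0, 0.1, 0.3, 0.2, 0.0, 0.1]
print(extrema(signal))  # [2, 4]
```

On real sensor data some smoothing before differencing would probably be needed, since noise produces many spurious sign changes.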

import sys

import matplotlib.cm as cm
import matplotlib.pyplot as plt
import numpy as np
from scipy.signal import butter

# Usage: python <script> <digits> <capture file> [<capture file> ...]
digits = str(sys.argv[1])
captures = sys.argv[2:]
files = [d + ".txt" for d in digits]   # one output file per digit
colors = cm.rainbow(np.linspace(0, 1, len(captures)))
b, a = butter(3, 0.05)   # low-pass coefficients; never applied in this listing

# 3 rows (x, y, z) by 2 columns (raw signal, first derivative)
fig, axes = plt.subplots(3, 2)

for i, capture in enumerate(captures):
    data = np.genfromtxt(capture, delimiter=',')
    axis_data = [data[:, 0], data[:, 1], data[:, 2]]
    touch = data[:, 3]
    for row, signal in enumerate(axis_data):
        axes[row, 0].plot(signal, color=colors[i])
        axes[row, 1].plot(np.diff(signal), color=colors[i])
    # Resting value of each axis, taken from frames 50-90 of the capture.
    rest = [np.mean(signal[50:90]) for signal in axis_data]
    print(rest)
    # 100.0 is a sentinel for "no sample yet"; it is reset for every capture.
    avg = {name: [100.0, 0.0, 0.0] for name in files}
    number_i = 0   # index of the digit currently being tapped
    touched = 0
    pre_j = 0
    for cnt, j in enumerate(touch):
        if j != 1:
            touched = 0
            if pre_j == 1:
                number_i += 1   # finger lifted: move on to the next digit
        elif number_i < len(files):
            if touched == 0:
                # Mark the start of the touch on all six plots.
                for ax in axes.flat:
                    ax.axvline(x=cnt, color='r')
                touched = 1
            offset = [rest[k] - axis_data[k][cnt] for k in range(3)]
            current = avg[files[number_i]]
            if current[0] > 10.0:
                current[:] = offset
            else:
                # Running average that halves the weight of older samples.
                current[:] = [(current[k] + offset[k]) / 2.0 for k in range(3)]
        pre_j = j
    for name in files:
        if avg[name][0] <= 10.0:   # skip digits that were never tapped
            with open(name, "a") as datafile:
                datafile.write("%f,%f,%f\n" % tuple(avg[name]))

plt.show()

Third Step: Classification

Once all the files had been processed with the script above, I wrote yet another script to plot the data it generated. This time I chose a bar plot grouped by digit, showing for each digit one bar per axis with the mean and the corresponding standard deviation.

[Figure: Digit Classification]

This plot speaks for itself: different digits have distinguishable features, but the standard deviations show that some of them overlap, so there will be inputs that we cannot classify without more information.
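The overlap that the error bars show can also be checked numerically. The sketch below treats two digits as hard to separate when their mean ± standard deviation intervals overlap on all three axes; that criterion is an assumption made for illustration, not a rigorous test, and the feature rows are made up.

```python
from statistics import mean, pstdev


def interval(values):
    """Mean +/- one (population) standard deviation, as drawn by the error bars."""
    m, s = mean(values), pstdev(values)
    return (m - s, m + s)


def overlaps(a, b):
    """True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def confusable(features_a, features_b):
    """True if the intervals overlap on every axis (x, y and z)."""
    columns_a = zip(*features_a)   # rows of (x, y, z) -> one column per axis
    columns_b = zip(*features_b)
    return all(overlaps(interval(ca), interval(cb))
               for ca, cb in zip(columns_a, columns_b))


# Hypothetical per-tap offsets (x, y, z) for three digits.
digit_a = [(0.01, 0.01, 0.01), (0.03, 0.03, 0.03)]
digit_b = [(0.02, 0.02, 0.02), (0.04, 0.04, 0.04)]
digit_c = [(0.10, 0.02, 0.02), (0.12, 0.04, 0.04)]

print(confusable(digit_a, digit_b))  # True
print(confusable(digit_a, digit_c))  # False
```

`pstdev` is used because it matches the default behaviour of `np.std`, which divides by the number of samples.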

To test this hypothesis I built a simple backpropagation classification neural network with PyBrain, which classifies digits with an average error of 48%. This error is to be expected, since the standard deviations show overlap between different digits.

Below is the Python script that plots the bars and uses PyBrain for the classification network.
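To put the 48% figure in context, it helps to compare it with blind guessing. The sketch below computes a percentage error the usual way (share of wrong predictions) and the error expected from guessing uniformly at random. It assumes nine equally likely digits, which is what `nb_classes=9` in the listing suggests; the prediction lists are made up.

```python
def percent_error(predicted, actual):
    """Percentage of predictions that differ from the true labels."""
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return 100.0 * wrong / len(actual)


def chance_error(n_classes):
    """Expected error (in %) when guessing uniformly among n_classes."""
    return 100.0 * (1.0 - 1.0 / n_classes)


# Made-up predictions for eight taps; three of them are wrong.
predicted = [1, 4, 7, 2, 5, 8, 1, 2]
actual = [1, 4, 4, 2, 5, 9, 1, 3]

print(percent_error(predicted, actual))  # 37.5
print(round(chance_error(9), 1))         # 88.9
```

Under that assumption, random guessing would be wrong about 89% of the time, so a 48% error means the sensor features carry real information even though they do not separate every digit.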

import sys

import matplotlib.pyplot as plt
import numpy as np

### pyBrain
from pybrain.datasets            import ClassificationDataSet
from pybrain.utilities           import percentError
from pybrain.tools.shortcuts     import buildNetwork
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.structure.modules   import SoftmaxLayer

# Usage: python <script> 1.txt 2.txt ... (one feature file per digit)
files = sys.argv[1:]
data = {}
for i, name in enumerate(files):
    datafile = np.atleast_2d(np.genfromtxt(name, delimiter=','))
    data[name] = {
        "x": datafile[:, 0],
        "y": datafile[:, 1],
        "z": datafile[:, 2],
        "traindata": [tuple(row) for row in datafile],
        "class": [i],   # class index follows the order of the files
    }

means = np.array([[np.mean(data[name][k]) for k in "xyz"] for name in files])
stds = np.array([[np.std(data[name][k]) for k in "xyz"] for name in files])

# Bar plot grouped by digit: one bar per axis, error bars show the std.
ind = np.arange(len(files))
width = 0.25
fig, ax = plt.subplots()
# align='edge' keeps each group centred on ind + 1.5 * width.
ax.bar(ind, means[:, 0], width, color='lightgray', yerr=stds[:, 0], align='edge', label='x')
ax.bar(ind + width, means[:, 1], width, color='y', yerr=stds[:, 1], align='edge', label='y')
ax.bar(ind + 2 * width, means[:, 2], width, color='orange', yerr=stds[:, 2], align='edge', label='z')
ax.set_xticks(ind + 1.5 * width)
ax.set_xticklabels(files)
ax.legend()

###### Neural Training
ds = ClassificationDataSet(3, 1, nb_classes=len(files))
for name in files:
    for row in data[name]["traindata"]:
        ds.addSample(row, data[name]["class"])
trndata = ds
# One output unit per class, as the softmax layer expects; this also
# creates the 'class' field printed below.
trndata._convertToOneOfMany()
print("Number of training patterns: %d" % len(trndata))
print("Input and output dimensions: %d %d" % (trndata.indim, trndata.outdim))
print("First sample (input, target, class):")
print("%s %s %s" % (trndata['input'][0], trndata['target'][0], trndata['class'][0]))
fnn = buildNetwork(trndata.indim, 7, trndata.outdim, outclass=SoftmaxLayer)
trainer = BackpropTrainer(fnn, dataset=trndata, learningrate=0.01, momentum=0.1, verbose=True, weightdecay=0.01)
trainer.trainOnDataset(trndata, 100)
# Error on the training samples (this listing keeps no hold-out set).
print("Training error: %5.2f%%" % percentError(trainer.testOnClassData(), trndata['class']))

plt.show()


Although a 48% error seems high, it is enough to reduce the search space of an attack: even if it does not allow guessing a digit right away, this method narrows the most likely tapped digit down to a group of 3 or 4, making the guessing much more feasible.

After this result I started looking for new ways of extracting features from these signals and I came across a couple of papers that had already addressed this very same problem: "Practicality of Accelerometer Side Channels on Smartphones" and "TapLogger: Inferring User Inputs On Smartphone Touchscreens Using On-board Motion Sensors". The authors of the latter created a PoC Trojan to demonstrate this side channel.

Since what I had in mind had already been addressed in much greater depth, I won’t take this research any further, although it has been a lot of fun.
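The effect on the search space is easy to put into numbers. The sketch below assumes a 4-digit code on a ten-key pad and, optimistically, that the tapped digit always falls inside the candidate group; the classifier scores are made up, and `top_k` and `search_space` are hypothetical helper names.

```python
def top_k(scores, k):
    """Return the k labels with the highest scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def search_space(candidates_per_position):
    """Number of codes to try when each position has its own candidate list."""
    total = 1
    for candidates in candidates_per_position:
        total *= len(candidates)
    return total


# Made-up classifier scores for a single tap.
scores = {'1': 0.05, '2': 0.30, '3': 0.10, '4': 0.02, '5': 0.35,
          '6': 0.08, '7': 0.01, '8': 0.06, '9': 0.03}
best_three = top_k(scores, 3)
print(best_three)  # ['5', '2', '3']

pin_length = 4                          # assumed code length
all_keys = [str(d) for d in range(10)]  # assumed ten-key pad
print(search_space([all_keys] * pin_length))    # 10000
print(search_space([best_three] * pin_length))  # 81
```

With groups of 4 candidates the count is 4^4 = 256, still a small fraction of the 10,000 possible codes. In practice the true digit will sometimes fall outside the group, so these figures are a best case.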