Data Processing with Metawear


INTRODUCTION

One of the best things about Metawear is the ability to quickly collect data via the on-board accelerometer and send it to your phone or computer via the iOS or Android app. This lets you play around with the data in any environment of your choosing and see what kind of useful, real world information can be extracted.

This post will give you a quick example of how you can post-process a 30 second sample of walking, jogging and running data (from the accelerometer on the board) in Python to determine how active a person was over that period of time.

Data Collection

First, let’s take a look at 30 seconds of unfiltered walking data collected at 100 Hz using the iOS application. The settings used to take the sample are shown in this screenshot:

IMG_1195

Using the email functionality, I emailed the data log to myself and started playing with it in Python.

The unfiltered raw accelerometer data looks something like this:

-0.177617,0.664062,-0.093750,-0.449219
-0.179693,0.664062,-0.078125,-0.425781
-0.207603,0.656250,-0.082031,-0.414062
-0.209163,0.640625,-0.097656,-0.367188
...
...

The data format is: time elapsed, x-accel, y-accel, z-accel
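If you want to follow along, here is a minimal sketch of how a log in this format could be read into Python. The function name load_log is my own for this example, and I'm assuming the emailed log is plain comma-separated text with four columns per row, as in the sample above; adjust it if your export looks different.

```python
import csv
import io

def load_log(lines):
    """Parse rows of 'time, x, y, z' text into lists of floats."""
    rows = []
    for record in csv.reader(lines):
        # Skip blank lines and anything that is not a 4-column row.
        if len(record) != 4:
            continue
        try:
            rows.append([float(value) for value in record])
        except ValueError:
            # Skip rows that contain non-numeric text, such as a header.
            continue
    return rows

# The first two rows of the sample shown above, used here as stand-in input.
sample = io.StringIO(
    "-0.177617,0.664062,-0.093750,-0.449219\n"
    "-0.179693,0.664062,-0.078125,-0.425781\n"
)
raw_data = load_log(sample)
print(raw_data[0])
```

The same function should work on a real file if you pass it an open file object instead of the StringIO stand-in.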

But instead of looking at each of the three axes separately, I’m going to take the “Root Mean Square” (RMS) of each data sample. The RMS is closely related to what you would consider the “magnitude” of a vector (for a 3-axis sample it is the magnitude divided by \sqrt{3}) and is defined as follows:

x_{\mathrm{rms}} = \sqrt{ \frac{1}{n} \left( x_1^2 + x_2^2 + \cdots + x_n^2 \right) }

Here n = 3, and x_1, x_2, x_3 are the x, y and z accelerations of a single sample.

Here’s the code snippet to perform the RMS for the accelerometer data:

import pylab
import numpy as np
from numpy import mean, sqrt, arange

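# Calculates the Root Mean Square of the accelerometer x, y, z data.
# Returns two lists: one with the timestamps and one with the RMS values.
def calculateRMS(raw_data):
    rms_data = []
    time_interval = []

    for row in raw_data:
        # The timestamps in the sample log are negative, so store the absolute value
        time_interval.append(abs(np.float32(row[0])))
        # Build the array from a list so this also works on Python 3,
        # where map() returns an iterator instead of a list
        a = np.array([np.float32(value) for value in row[1:]])
        rms = sqrt(mean(a**2))
        rms_data.append(rms)
    return time_interval, rms_data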
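
As a cross-check on the loop above, the same calculation can be sketched in vectorized NumPy. The name rms_per_sample is my own for this example, and the two input rows are simply the first two rows of the sample log. The sketch also computes the ordinary vector magnitude so you can see how the two quantities relate: for three axes, the magnitude is the RMS multiplied by the square root of 3.

```python
import numpy as np

def rms_per_sample(raw_data):
    """Return (time, rms) arrays for rows of [time, x, y, z]."""
    data = np.asarray(raw_data, dtype=np.float64)
    time = np.abs(data[:, 0])      # timestamps in the sample log are negative
    xyz = data[:, 1:4]             # x, y, z acceleration columns
    rms = np.sqrt(np.mean(xyz ** 2, axis=1))
    return time, rms

rows = [
    [-0.177617, 0.664062, -0.093750, -0.449219],
    [-0.179693, 0.664062, -0.078125, -0.425781],
]
time, rms = rms_per_sample(rows)

# Euclidean length of each (x, y, z) vector, for comparison with the RMS.
magnitude = np.linalg.norm(np.asarray(rows)[:, 1:4], axis=1)
print(rms, magnitude)
```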

Data Visualization

Now that we have the RMS data over time, let’s visualize the data by graphing it:

Screen Shot 2014-11-01 at 6.21.36 PM

Looking at this graph, it is very clear that the unfiltered data is “centered” around 0.7 RMS. This is most likely because the accelerometer also picks up the constant acceleration due to gravity. It would be really nice if we could remove the gravity component from the data, because it’s a constant force from the Earth that says little about the person’s actual motion or activity (walking in this case).

Data Processing

Normally, I would implement a software “High Pass Filter” in Python to remove the gravity component from the acceleration data. Gravity is constant, so it sits at the very low-frequency end of the signal, which is exactly what a high pass filter blocks. But it just so happens that the accelerometer onboard Metawear comes with a hardware High Pass Filter which will do exactly what we need! All we have to do is turn it on and watch the magic happen!

Here are the settings I used in the iOS app for the new data with the high pass filter enabled:

IMG_1196

I chose the lowest frequency cutoff for the high pass filter in this case because I didn’t want the higher cutoffs to eliminate potentially useful data. If I need additional noise reduction, I can always add additional post-processing later.
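For comparison, here is a rough sketch of what a software high pass filter could look like if you wanted to do this step in Python instead. This is a generic first-order (RC-style) filter, not the filter the Metawear hardware uses, and the 0.5 Hz cutoff and the synthetic test signal are assumptions I picked purely for illustration. In practice you would run it on each axis separately before taking the RMS.

```python
import math

def high_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order RC-style high pass filter over a list of floats."""
    if not samples:
        return []
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + dt)
    # Start the output at zero so a constant offset is rejected immediately.
    filtered = [0.0]
    for i in range(1, len(samples)):
        # Each output follows changes in the input and slowly forgets the past.
        filtered.append(alpha * (filtered[-1] + samples[i] - samples[i - 1]))
    return filtered

# Synthetic single-axis input: a constant 1 g offset plus a 2 Hz wiggle,
# sampled at 100 Hz for 30 seconds. Not measured data.
rate = 100.0
signal = [1.0 + 0.3 * math.sin(2 * math.pi * 2.0 * n / rate) for n in range(3000)]
out = high_pass(signal, cutoff_hz=0.5, sample_rate_hz=rate)

# Average of the last 10 seconds: the constant offset should be gone.
mean_tail = sum(out[-1000:]) / 1000
print(round(mean_tail, 3))
```

The constant offset is removed while most of the 2 Hz motion passes through, which is the same idea as removing gravity while keeping the walking signal.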

Now let’s graph and compare the high pass filtered walking data to the unfiltered data:

Screen Shot 2014-11-01 at 6.36.50 PM

As you can see, the high pass filter on the lowest setting managed to isolate and eliminate a good portion of the effects of gravity. The data now looks to be “centered” around 0.2 RMS, with large spikes representing each of the walking “strides”. It’s still fairly noisy, but it’s pretty clear that there is a correlation between the steps and the “peaks” in the graph. This experiment was conducted with the Metawear attached to the back, in between the shoulders. In this position, we hoped to capture the movement of the entire body, not just the legs. Thus, we actually expect there to be quite a bit of noise in the data that doesn’t directly correlate to the walking motion.

In order to demonstrate a clear difference in the relationship between other movement and leg movement, we are going to take a look at a graph based on 30 seconds of high pass filtered jogging data.

Screen Shot 2014-11-01 at 7.47.04 PM

Because jogging motion is more intense than walking, it is easy to separate the jogging strides from the rest of the “body noise”. The graph above indicates that almost all other body noise is limited to < 0.8 RMS. Thus, all peaks above that can be safely considered motion data based on the jogging action.
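To make the threshold idea concrete, here is a small sketch that counts the peaks rising above a cutoff. The function name count_peaks and the example values are made up for illustration, and the 0.8 threshold is just the number read off the jogging graph above. On real, noisy data a single stride can produce several local maxima, so you would probably also want a minimum spacing between peaks.

```python
def count_peaks(rms_data, threshold):
    """Count local maxima in rms_data that rise above threshold."""
    peaks = 0
    for i in range(1, len(rms_data) - 1):
        # A local maximum is higher than the previous sample
        # and at least as high as the next one.
        is_local_max = rms_data[i - 1] < rms_data[i] >= rms_data[i + 1]
        if is_local_max and rms_data[i] > threshold:
            peaks += 1
    return peaks

# Made-up RMS values for illustration only, not measured data.
example = [0.2, 0.5, 1.4, 0.6, 0.3, 0.7, 0.4, 1.9, 1.1, 0.2]
print(count_peaks(example, threshold=0.8))
```

In this toy example the small bump at 0.7 is ignored as “body noise” and only the two larger peaks are counted.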

In general, as a particular motion gets more intense, it also becomes easier to track through the accelerometer data. If we look at some running data below, we see an even higher degree of separation between noise and peaks. The average running vector is in the range of 3-5 RMS, which is approximately 4 times higher than the average RMS of the noise.

Screen Shot 2014-11-01 at 7.41.45 PM

Data Analysis

Now that we’ve explored and visualized what the data from the accelerometer actually looks like and how it correlates to a particular activity, we can focus on a “real world” question: “What useful information can I actually conclude from the data?”

One important metric that we can look at is the relative “fitness activity” of walking vs. jogging vs. running. Looking at the running RMS vs. walking RMS gives us significant insight into the proportional acceleration our body undergoes during these activities. As seen below, the results are actually quite striking.

Screen Shot 2014-11-01 at 8.16.53 PM

Now that we know running demands much more acceleration of the body than walking, it would be nice to have a simple way to compare the fitness level of the two activities directly. In order to do this, we are going to calculate a single number to represent the “fitness level” of each activity over 30 seconds. A very simple way to perform this calculation is by adding up the RMS over time.

Here’s the code snippet:

# Calculates a running sum of the data for each sampling interval.
def calculateTotal(data):
    total_activity = [0]
    for row in data:
        total_activity.append(total_activity[-1]+row)
    total_activity = total_activity[1:]
    return total_activity
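
If you prefer the standard library, the same running sum can be sketched with itertools.accumulate. The name running_total and the example values are my own for this illustration; the last element of the result is the single “total activity” number.

```python
from itertools import accumulate

def running_total(data):
    """Running sum of data; the last element is the overall total."""
    return list(accumulate(data))

# Made-up RMS values for illustration only, not measured data.
example = [0.2, 0.5, 1.4, 0.6]
totals = running_total(example)
print(totals[-1])
```

Because this is a plain sum of samples, two recordings are only directly comparable when they share the same sample rate and duration.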

With this “total_activity” number, we can now directly compare the two activities and their relative intensities:

Screen Shot 2014-11-01 at 8.33.12 PM

Total activity for Running = 1670

Total activity for Walking = 550

Our basic analysis shows that running is about 3 times as intensive on body acceleration as walking (1670 / 550 ≈ 3.0)! Although acceleration data doesn’t necessarily correlate directly or linearly with energy expenditure, I hope you can see how easy it might be to make a fitness tracker application with this kind of data processing and Metawear!

Next time, I will show you how to calculate the number of calories burned and steps taken based on the 30 seconds of walking, jogging, and running from the datasets shown above... stay tuned!

P.S. I’ll post the code I used, including the snippets shown above, as Python files available for download.

You must rename data_blog and analysis_blog to data_blog.py and analysis_blog.py, since WordPress doesn’t allow Python file uploads. running1 is the sample data I used in this article.

analysis_blog

data_blog

running1