CMU Localization Dataset

This dataset contains a variety of sensor inputs collected by a smartphone as users move through indoor spaces. Its purpose is to provide a corpus of sensor data, along with ground truth, that can be used to study indoor localization. Each test environment was instrumented with a set of ALPS beacons [1] that provide BLE and ultrasonic ranging sources. The test environments also had WiFi access points that had previously been installed at various densities as part of each location's existing networking infrastructure. During data collection, users walked through the environment carrying a smartphone running a data collection application. The application continuously recorded a variety of sensors that are often used for indoor localization. The users also marked known waypoints throughout the space. Each reading has a timestamp that can be referenced against the waypoint timestamps taken at known locations. Each waypoint and beacon was precisely surveyed using a laser rangefinder.

The dataset contains the following sensor sources:
  1. WiFi Access Point Scans
  2. Bluetooth Low-Energy Readings
  3. Magnetometer Data
  4. Pedometer Data
  5. Accelerometer Data
  6. Gyroscope Data
  7. Inertial Measurement Heading Data
  8. Ultrasonic Time-of-Flight Values (including 48kHz Audio Recording)
  9. Ground Truth Waypoints

Data Collection Process

Each environment was instrumented with enough beacons that 3-4 were always within line-of-sight of any point at ceiling level (obstructions such as shopping aisles or cubicle walls might still block the signal at the height of the user's hand). The transmitters were outfitted with BLE radios running iBeacon at a 10Hz broadcast rate through a quarter-wavelength whip antenna. The beacons also transmitted time-synchronized ultrasonic chirps as described in [1]. To avoid the need to perform pseudo-ranging, the phone was synchronized with a nearby beacon at the start of each experiment. This synchronization can be ignored to simulate the impact of using TDOA as opposed to TOF on the receiver. Each beacon was mounted on a tripod near the ceiling as shown in Figure 1. The phone was carried by the user at a height of approximately 1.2 meters from the ground.

Figure 1: Beacons Placed in the environment
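
As an illustration of ignoring the synchronization, the ranges reported within one cycle can be differenced against a reference beacon, which cancels any offset common to all beacons and leaves TDOA-style range differences. This is a minimal sketch, not the ALPS processing chain: the function name tof_to_tdoa and the dictionary input format are assumptions made for this example.

```python
def tof_to_tdoa(ranges_m, reference_id=None):
    """Turn per-beacon TOF ranges from one cycle into range differences.

    ranges_m: dict mapping beacon ID -> distance in meters (hypothetical
    input format; build it from the rows of one cycle).
    reference_id: beacon to difference against; defaults to the lowest ID.
    """
    if len(ranges_m) < 2:
        raise ValueError("need at least two beacons to form a difference")
    if reference_id is None:
        reference_id = min(ranges_m)
    ref = ranges_m[reference_id]
    # Any offset shared by all ranges (e.g. an unknown clock bias expressed
    # in meters) cancels in the subtraction.
    return {b: d - ref for b, d in ranges_m.items() if b != reference_id}


if __name__ == "__main__":
    print(tof_to_tdoa({1: 5.0, 2: 7.5, 3: 4.0}))
```

A localization algorithm written for TDOA can then be run on these differences and compared against one that uses the synchronized TOF ranges directly.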

A user carried a mobile phone running the data collection application while walking along a predefined path. Waypoints were marked on the ground, and the user was told to pause briefly at each one, holding the phone, and select the "waypoint" button in the application. The application (shown below) would then display and log the timestamp at which the user reached each point so that it can be compared offline to other sensors or computed locations. The idea was to capture both ranging data and inertial data recorded during natural movement that can be used for tracking. In each environment, we typically collected data multiple times along multiple unique paths through the space. We tried to capture cases in which beacons were line-of-sight as well as cases in which they were obstructed. The number of beacons can be decreased in post-processing to simulate a more sparsely instrumented space.

iOS Interface

Figure 2: Waypoint Selection Mobile App
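
The beacon thinning mentioned above could be done by keeping only the readings from a chosen subset of beacons. This is a minimal sketch, assuming rows already parsed from an ALPS.csv file with the beacon ID in the third column (timestamp, cycle count, beacon ID, distance); the function name thin_beacons and its parameters are illustrative and not part of any dataset tooling.

```python
import random


def thin_beacons(rows, n_keep, seed=0, id_column=2):
    """Keep only the readings from a random subset of n_keep beacons.

    rows: list of parsed records; id_column is the index of the beacon ID
    (2 matches the ALPS.csv column order, an assumption of this sketch).
    Returns (filtered_rows, kept_ids).
    """
    ids = sorted({row[id_column] for row in rows})
    if not 0 < n_keep <= len(ids):
        raise ValueError(f"n_keep must be between 1 and {len(ids)}")
    # A seeded generator makes the chosen subset reproducible across runs.
    kept = set(random.Random(seed).sample(ids, n_keep))
    return [row for row in rows if row[id_column] in kept], kept


if __name__ == "__main__":
    demo = [(0.1, 1, 10, 3.2), (0.1, 1, 11, 4.0), (0.1, 1, 12, 6.5)]
    print(thin_beacons(demo, n_keep=2))
```

Choosing the subset at random is only one option; selecting beacons by surveyed position (for example, every other unit along a corridor) may give a more realistic sparse deployment.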

Data Organization

The data is stored in files with the following path structure:

CMU-Localization-Dataset/environments/[location name]/[path number]/[run number]_[sensor type]

[location name] - is used to describe the particular building or area used for data collection
[path number] - denotes the particular path through the space. For each location, multiple paths were recorded.
[run number] - each path was repeated a number of times
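
The path structure above can be split back into its components when indexing the dataset. The following is a minimal sketch: the RunFile type, the parse_run_file function, and the example location name are hypothetical, and the path and run numbers are kept as strings because their exact format is not specified here.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class RunFile(NamedTuple):
    location: str
    path: str
    run: str
    sensor: str


def parse_run_file(path):
    """Split '.../environments/[location]/[path]/[run]_[sensor]' into fields."""
    parts = PurePosixPath(path).parts
    if "environments" not in parts:
        raise ValueError(f"no 'environments' directory in {path!r}")
    tail = parts[parts.index("environments") + 1:]
    if len(tail) != 3:
        raise ValueError(f"expected location/path/file after 'environments': {path!r}")
    location, path_number, filename = tail
    # The run number is everything before the first underscore in the file name.
    run, sep, sensor = filename.partition("_")
    if not sep:
        raise ValueError(f"file name has no '_' separator: {filename!r}")
    return RunFile(location, path_number, run, sensor)


if __name__ == "__main__":
    # 'example-building' is a made-up location name used only for illustration.
    print(parse_run_file(
        "CMU-Localization-Dataset/environments/example-building/1/2_Waypoints.csv"))
```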

[sensor type]

The sensor file types are described with example data below.

  • Heading.csv
    • This file contains heading information returned by the smartphone's motion co-processor. It contains a time series with magnetic heading, geographic heading and an accuracy estimate.
    • Example Data:
      # Timestamp (seconds), Magnetic Heading in degrees, Geographic Heading in degrees, Estimated Accuracy in degrees
  • iBeacon.csv
    • This file contains Bluetooth Low-Energy iBeacon readings as reported by the smartphone. Each BLE beacon was running iBeacon at a 10Hz broadcast rate using an omni-directional whip antenna pointed directly down from a ceiling-mounted unit. All values computed by the phone OS were recorded, including its estimated distance, the RSSI and the proximity state of the device. iBeacons are identified by major and minor numbers.
    • Example Data:
      # Timestamp (seconds), Major Number, Minor Number, Distance (m), RSSI (dB), Proximity (0-unknown / 1-immediate / 2-near / 3-far)
  • Magnetometer.csv
    • Magnetometer readings were stored at a rate of approximately 10Hz. Research has shown that magnetic field strength can be used as a location fingerprinting mechanism.
    • Example Data:
      # Timestamp (seconds), Field Strength X, Field Strength Y, Field Strength Z
  • Pedometer.csv
    • Most modern smartphones contain a motion processor that is able to estimate number of steps and distance traveled while running in the background. These values update as available and are typically refreshed when the device detects that the user has moved. The values are reported across an interval which in most experiments was set between 0 and the time of the last update.
    • Example Data:
      # Timestamp Start (seconds), Timestamp End (seconds), Distance Traveled (m), Steps counted (steps) 
  • Waypoints.csv
    • Each time the user carrying the phone reached a known waypoint, they would press a button to record the timestamp. This information can be used as ground truth points for localization and tracking.
    • Example Data:
      # Timestamp (seconds), X location (m), Y location (m)
      5.425880, 2.21, 3.34
      6.675897, 2.21, 4.45
      7.909560, 2.21, 5.67
  • WiFi.csv
    • The device was continuously scanning for WiFi access points and recording a list approximately once every 2-3 seconds depending on the scanning rate. The system would record the access point name, MAC address and the signal strength of the last beacon.
    • Example Data:
      # Timestamp (seconds), SSID name, BSSID MAC, RSSI (dB)
  • ALPS.csv
    • ALPS uses ultrasonic time-of-flight to compute a distance between the mobile device and a transmitter. These files provide the processed output of the ALPS demodulator. The raw audio files are also available to test different demodulators or to extract other features.
    • Example Data:
      # Timestamp (s), cycle count, beacon ID, distance (m)
  • .wav
    • This is an uncompressed audio file recorded at 24-bit, 48kHz that can be used to extract ultrasonic beacon chirps before they are demodulated.
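
To compare a sensor reading against ground truth, the waypoint timestamps can be used to estimate where the user was at the time of the reading. The following is a minimal sketch using only the Python standard library. It assumes that the files are comma-separated with a header line starting with '#', as in the examples above, and that the user moved in a straight line at constant speed between consecutive waypoints, which is only an approximation since users paused at each waypoint. The function names read_rows and interpolate_position are illustrative.

```python
import bisect
import csv


def read_rows(text):
    """Parse comma-separated sensor text, skipping blank and '#' comment lines."""
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.lstrip().startswith("#")]
    return [[cell.strip() for cell in row] for row in csv.reader(lines)]


def interpolate_position(waypoints, t):
    """Linearly interpolate (x, y) at time t between surrounding waypoints.

    waypoints: list of (timestamp, x, y) tuples sorted by timestamp.
    Returns None when t lies outside the recorded waypoint interval.
    """
    times = [w[0] for w in waypoints]
    if not times or not times[0] <= t <= times[-1]:
        return None
    i = bisect.bisect_right(times, t)
    if i == len(times):  # t coincides with the final waypoint
        return waypoints[-1][1], waypoints[-1][2]
    t0, x0, y0 = waypoints[i - 1]
    t1, x1, y1 = waypoints[i]
    f = (t - t0) / (t1 - t0)
    return x0 + f * (x1 - x0), y0 + f * (y1 - y0)


if __name__ == "__main__":
    text = "# Timestamp (seconds), X location (m), Y location (m)\n5.0, 2.0, 3.0\n7.0, 2.0, 5.0\n"
    waypoints = [tuple(float(c) for c in row) for row in read_rows(text)]
    print(interpolate_position(waypoints, 6.0))
```

The same read_rows helper should work for the other numeric sensor files; WiFi.csv may need more care if SSID names contain commas.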

Access the Data

We have a browsable version of the data posted here:


[1] Patrick Lazik, Niranjini Rajagopal, Oliver Shih, Bruno Sinopoli, Anthony Rowe, "ALPS: A Bluetooth and Ultrasound Platform for Mapping and Localization", The 13th ACM Conference on Embedded Networked Sensor Systems (SenSys 2015), Seoul, South Korea
