Back in December of 2023 Google announced a change to Android’s Location History feature that got quite a bit of attention. Prior to the announcement, users could view their location history via the Google Maps application and any web browser where they were signed in. That meant the data was stored by and likely accessible to Google, which made anyone who was privacy-minded nervous. Google’s December announcement changed all of that. According to the announcement, Location History was getting a new name in the Google Maps UI (“Timeline”), the data it contained would be stored locally on the device instead of on Google’s servers, and the default retention time of location data would be shortened from 18 months to 3. The announcement went on to say these changes were being made in the name of privacy and control, and they would happen gradually in 2024.
This post is going to be one of my shorter ones. There is not too much to look at; it’s location data, after all. However, the trick comes in validating both its accuracy and reliability. Is the data an accurate representation of a device being in a specific place at a specific time, or is it cached, more generalized data that the device uses for other purposes? These are the questions examiners need to answer when evaluating the value of location data.
There were five devices used for testing: a Galaxy S22, Galaxy S23, Pixel 5a, Pixel 6a, and a Pixel 8a. Four of the devices had an account that had been migrated to this new location data paradigm. The Pixel 5a used the same account as the Galaxy S22 because I wanted to see how multiple devices on the same account may interact with each other.
Finally
Right after the new year I started keeping a lookout for when this data landed on my test Androids, and for months there was nothing that stood out as location data. As it turns out, Google was not exaggerating about that gradual rollout. It was not until May that I got the email for my first account.
I was already in the process of extracting test data from my devices due to a potential upcoming speaking engagement, and after each extraction I would look for this location data, and, initially, I wasn’t seeing anything obvious. The problem was I was looking in the wrong location in the file system. Because Timeline was accessed via Google Maps I had been looking in /data/data/com.google.android.apps.maps/. However, Timeline is a Google account-level feature. As it turns out, the data is in /data/data/com.google.android.gms/. The first file of interest tells an examiner if the account has migrated, and whether or not the feature is on or off. The file is ULR_USER_PREFS.xml, and it resides in the ~/shared_prefs folder. See Figure 2.
(Note: It’s refreshing to see non-Binary XML (ABX) for once. :-)) There are two entries in this file that are important. The first entry, in the blue box, lets you know if the account has migrated to storing location history on device. I suspect “Odlh” is “On-device Location History,” but that is purely speculation on my part. Here, the value is “true” as this account had already migrated to storing location data locally. I checked previous versions of this file from other extractions that pre-dated my receiving the email from Google, and this value was set to “false.” In those previous extractions, I still found location data in the artifacts discussed later in this post even though I had not yet received the email from Google.
The entry in the red box is important. The value associated with the XML tag “historyEnabled_Account” is based on whether or not a user has turned on Timeline for their Google account. Figure 2 showed how ULR_USER_PREFS.xml looks while Timeline is turned off. Figure 2-1 shows how it looks with it turned on.
To adjust the setting a user can go through a web browser or adjust it via Google Maps. Figures 3, 4, and 5 show how the Google Maps flow would look to a user.
Getting back to the red box in Figures 2 and 2-1, if historyEnabled_Account is set to false, an examiner can expect to find little or no data in the artifacts discussed in the rest of this article, depending on two factors. First, Timeline is not enabled by default, so if you are hunting for location data from Timeline, make sure you check this setting first because there is a good chance a user never turned it on. Second, if a user has Timeline on and later opts to turn it off, they are given two options: to turn Timeline off, or to turn Timeline off and delete any existing data. See Figure 6.
One of the XML tags in this file has a timestamp associated with it: “serverMillis_Account.” During testing I found that changing the Timeline on/off setting would cause this timestamp to update; however, I found other extractions where I had not adjusted this setting (or its predecessor, Location History) and the timestamp had still updated. So, just know that this timestamp is not indicative of when the setting was last changed.
A couple of notes with regard to the settings and multi-device accounts. First, if an account has more than one device associated with it, turning Timeline off on one device affects all devices on the account. Again, this is an account-level setting. Second, based on my testing, if a user turns Timeline off on one device and opts to delete the data, that only affects the device on which that happens. The historical data on the other devices (should it be there) remains. Again, this is how it was during testing, and Google could always change that.
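For examiners who want to pull these values out of ULR_USER_PREFS.xml programmatically, here is a minimal sketch. It assumes the file follows the usual plain-text Android shared_prefs layout (a `<map>` of `<boolean>` and `<long>` elements with `name` and `value` attributes) and that serverMillis is a Unix epoch value in milliseconds, which the name suggests but the testing above does not confirm. The sample XML is synthetic, and `read_prefs`, `find_keys`, and `millis_to_utc` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Synthetic stand-in for ULR_USER_PREFS.xml. The values are made up for
# illustration; a real file carries additional entries.
SAMPLE = """<?xml version='1.0' encoding='utf-8' standalone='yes' ?>
<map>
    <boolean name="historyEnabled_Account" value="true" />
    <long name="serverMillis_Account" value="1716200000000" />
</map>"""


def read_prefs(xml_text):
    """Return {name: value} for the boolean and long entries in the file."""
    prefs = {}
    for node in ET.fromstring(xml_text):
        name, value = node.get("name"), node.get("value")
        if node.tag == "boolean":
            prefs[name] = (value == "true")
        elif node.tag == "long":
            prefs[name] = int(value)
    return prefs


def find_keys(prefs, fragment):
    """Return the entries whose name contains fragment (case-insensitive)."""
    return {k: v for k, v in prefs.items() if fragment.lower() in k.lower()}


def millis_to_utc(ms):
    """Assumption: the value is Unix epoch time in milliseconds."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


prefs = read_prefs(SAMPLE)
print(find_keys(prefs, "historyEnabled"))
for name, value in find_keys(prefs, "serverMillis").items():
    print(name, millis_to_utc(value).isoformat())
```

Searching by fragment rather than by exact name is deliberate: calling `find_keys(prefs, "odlh")` on a real file should surface the migration entry without this sketch having to guess its full name. Keep in mind that the serverMillis value is not a reliable indicator of when the setting was last changed.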
The Good Stuff…?
Now to the location data I could find. As I mentioned earlier, while a user accesses Timeline via Google Maps, the data actually resides in the Google Mobile Services sandbox (/data/data/com.google.android.gms/). See Figure 7.
As the name suggests, the raw location data is found in app_semanticlocation_rawsignal_db, which is highlighted in red. Note, though, that this is a directory. See its contents in Figure 8.
This data is stored in a LevelDB. If you are not familiar with LevelDBs, I strongly recommend reading Alex Caithness’ article which you can find here. To make sense of the data, I used Mushy, a tool created by my colleague Ian Whiffin. Mushy was recently updated to support LevelDBs and Android Binary XML. To see Mushy in action with LevelDB, see Figures 9 and 10.
Figure 10 is an example of an entry in the database that has location data. The values highlighted in the red box are the latitude and longitude, respectively (top to bottom). The value highlighted in the green box is the horizontal accuracy. The value should be divided by 1,000 and then be read in meters (thanks to Ian for figuring this out). To see how this looks on a map, see Figure 11. Note that I converted the value to feet after dividing it by 1,000.
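The accuracy conversion is simple arithmetic, sketched below. The raw value of 15240 is a made-up example rather than a value from Figure 10, and the function names are hypothetical.

```python
METERS_PER_FOOT = 0.3048  # exact, by definition of the international foot


def accuracy_to_meters(raw_value):
    """Stored horizontal accuracy divided by 1,000 gives meters."""
    return raw_value / 1000


def meters_to_feet(meters):
    return meters / METERS_PER_FOOT


raw = 15240  # hypothetical stored value
meters = accuracy_to_meters(raw)
print(f"{meters:.2f} m is about {meters_to_feet(meters):.0f} ft")
```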
The last item is the timestamp, which is in the blue box in Figure 10. Note that there is a timestamp just above it that is not highlighted (at [1], [1], [6]). In my testing there were always two timestamps in these database entries, each within a few seconds of each other. I did notice the first timestamp may carry across other database entries, but the one highlighted in blue did not (it was unique). Thus, it is the one highlighted.
I was on vacation at the time represented by the timestamp and was visiting the waterfront area of Beaufort, North Carolina. More specifically, I was well within the horizontal accuracy at the time; I was walking on the sidewalk, which is approximately 50 feet (15.24 meters) from the map pin on the other side of the street. I will note that I have seen horizontal accuracy values as small as approximately 36 feet (11 meters) and substantially higher. As with all things location, it depends on the hardware capabilities of the device, topography, and environmental conditions. For an example of better accuracy, see Figures 13, 14, and 15.
The timestamp here is also correct. I was on my way home from the Beaufort, North Carolina area.
A few notes here. First, there are other entries in app_semanticlocation_rawsignal_db that I have not been able to decipher. If any further information can be gleaned from this database, I will update this post. Second, this location data is collected passively (i.e., without any user intervention). In both examples and in my overall testing, the phones were locked and in my back pocket and I was definitely not using Google Maps or any other navigation application. Third, the frequency at which the data is collected varies. Anecdotally, I will say that I observed more frequent location sampling while a device was actively mobile versus when it had been stationary for some time. More sampling in this fashion likely helps with determining routes, which can be helpful for users, examiners, and investigators. And finally, the retention time of data within this database varies, and while I suspect it may have something to do with the retention time set within Timeline, I have not been able to confirm that. The Galaxy S22 had location data in it going back over a month, while the Pixel 8a had about two weeks’ worth of data in it. So, your mileage may vary.
In instances where Timeline may have been turned on but subsequently turned off, app_semanticlocation_rawsignal_db looks a little different. See Figure 16.
Here the data looks different. Timeline had been turned on for the Pixel 8a and then I turned it off. I subsequently traveled to the other side of the city, stayed for most of the day, and then drove back. There is a substantial number of deleted entries appearing in more recent .ldb files (.ldb files with higher hexadecimal numbers in their file names compared to the others indicate they are more recent). This is further bolstered by the fact that the value seen in Figure 16 (with the GPS coordinates redacted) has an associated timestamp of one minute prior to me turning Timeline off. In my testing this behavior was indicative of the subsequent deactivation of Timeline after having previously been on.
The next item of likely interest is highlighted in the blue box in Figure 17.
The item, app_semanticlocation_placeindex_cache, is another LevelDB, and is also located in /data/data/com.google.android.gms/. The reason I said this item is likely of interest is that I have only been able to partially decode it. An example entry is seen in Figure 18.
Initially, this may look like nothing, but I have been able to make out some structure. To illustrate, see Figure 19.
The bytes in the red box are a timestamp when read little endian. The bytes in the green box represent the length of the remaining bytes of the entry (again, read little endian) starting at offset 0x12. This structure has been consistent in every entry in this database that I have examined. Beyond that, unfortunately, I have not been able to determine what the rest of the data is. I (and others) suspect it is encrypted, and a quick examination of the GMS .apk lends credence to that theory, but I have no hard data by which to confirm it. Further, cursory attempts at decryption all failed. If anyone figures this out, please contact me and I will add your findings here along with full credit.
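Reading those two fields little endian can be sketched as follows. Everything about the layout here is an assumption made for illustration: the text only establishes that the remaining data starts at offset 0x12, so the field offsets and widths used below (an 8-byte timestamp at offset 0 and a 4-byte length ending at 0x12) are hypothetical and should be replaced with what the hex view actually shows. The entry itself is synthetic.

```python
PAYLOAD_OFFSET = 0x12  # from the text: the remaining bytes start here


def read_le(data, offset, size):
    """Read an unsigned little-endian integer of `size` bytes at `offset`."""
    return int.from_bytes(data[offset:offset + size], "little")


def check_length(data, length_offset, length_size):
    """Return (declared length, whether it matches the bytes after 0x12)."""
    declared = read_le(data, length_offset, length_size)
    return declared, declared == len(data) - PAYLOAD_OFFSET


# Synthetic entry using the hypothetical layout described above.
timestamp = 1716200000000
payload = b"\xaa" * 5
entry = (timestamp.to_bytes(8, "little")
         + bytes(6)                         # bytes this sketch does not interpret
         + len(payload).to_bytes(4, "little")
         + payload)

print(read_le(entry, 0, 8))
print(check_length(entry, 0x0E, 4))
```

Checking the declared length against the bytes actually present is a cheap way to confirm the offsets were picked correctly before trusting the timestamp.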
An additional directory in the same location, app_semanticlocation_odlh_cache, contains two files. The location data is found in a file that has an alphanumeric file name and the extension .is. The file is protobuf, although none of the tools I used recognized it as such. (While I was writing this, Ian was nice enough to update Mushy to handle this file. The screenshots below depict how to do this manually using HxD and pre-updated Mushy.) In my Galaxy S22, for example, the file was named 2c9438c3530c5925.is. Each protobuf payload in the file is preceded by a length value of one or two bytes, depending on the length of the payload. An example of a payload preceded by one byte is shown in Figure 20.
Here, the first byte is read to determine the length of the protobuf payload. 0x5B is decimal 91 so the following 91 bytes are read as protobuf. That payload, when decoded, is seen in Figure 21.
Using the same color schema as before, the red box contains the latitude and longitude values (top to bottom), the green box contains the horizontal accuracy (divide by 1000 for meters) and the blue box contains the timestamp. As before, an examiner can expect to find multiple timestamps in each entry with GPS coordinates, all within a few seconds of each other. Figure 22 shows how it looks on the map.
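When a tool does not recognize a payload as protobuf, it can still be walked without a schema, because the protobuf wire format is self-describing at the field level. The sketch below is a generic wire-format walker, not a decoder for this particular file: it reports field numbers and raw values only, and which field numbers hold the latitude, longitude, accuracy, and timestamps has to be read off the output by comparison with the figures. The sample bytes are synthetic.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint at pos; return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def walk_fields(buf):
    """Return [(field number, wire type, raw value)] for one message."""
    pos, fields = 0, []
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:            # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:          # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:          # length-delimited (bytes, string, nested)
            size, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + size], pos + size
        elif wire_type == 5:          # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((number, wire_type, value))
    return fields


# Synthetic message: field 1 = varint 150, field 2 = b"hi", field 3 = fixed32.
sample = bytes.fromhex("089601120268691d01000000")
for field in walk_fields(sample):
    print(field)
```

Length-delimited values may themselves be nested messages, so calling `walk_fields` again on a wire type 2 value is the usual next step when the bytes do not look like text.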
Based on the timestamp from Figure 21, I know I was at my vacation rental home, and the S22 was in the bedroom at the time, which is where the red pin is in Figure 22. So, accurate.
Figure 23 shows an example of a protobuf payload preceded by two bytes.
Here, the two bytes that are in the red box are read as little endian base 128, which, in this case, is decimal 235. Starting at the 0x0A value at offset 0x426 and reading 235 bytes gets the protobuf payload. It is decoded in Figure 24 and mapped in Figure 25.
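The one-byte and two-byte cases are the same encoding: a base-128 varint, in which the high bit of each byte signals that another byte follows. A minimal sketch of splitting a buffer into payloads on that basis is below. It assumes the buffer handed in starts exactly at a length value; whether the real .is file has other bytes before the first one is not something this sketch knows, so `start` is left as a parameter. The stream is synthetic.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint at pos; return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def split_payloads(buf, start=0):
    """Split a run of varint-length-prefixed payloads into a list of bytes."""
    payloads, pos = [], start
    while pos < len(buf):
        size, pos = read_varint(buf, pos)
        if pos + size > len(buf):
            break  # truncated or misaligned; stop rather than guess
        payloads.append(buf[pos:pos + size])
        pos += size
    return payloads


# Synthetic stream: a 91-byte payload (prefix 0x5B) followed by a
# 235-byte payload (prefix 0xEB 0x01).
stream = b"\x5b" + bytes(91) + b"\xeb\x01" + bytes(235)
print([len(p) for p in split_payloads(stream)])
```

For 0xEB 0x01, the low seven bits of 0xEB give 107 and the second byte contributes 1 shifted left by seven bits (128), for a total of 235.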
While it may seem that there hasn’t been much movement, there was. Here, at this time, the phone had moved from the bedroom to my car, which was parked in the driveway at my vacation rental home (the phone was packed away in my car for my trip home). The driveway was just to the right of the map pin within two dozen feet (approximately 8 meters) or so, so well within the horizontal accuracy.
Two notes about this file. First, if a user has turned off Timeline and opts to not delete their data, the .is file will remain. If the user opts to turn Timeline off and delete their data, the .is file is not present in app_semanticlocation_odlh_cache. Second, because this is a cache, expect a limited amount of data in it. The S22 had location data in app_semanticlocation_rawsignal_db going back a full month but the .is file in its app_semanticlocation_odlh_cache directory had only a day’s worth of data in it. Again, because it is a cache, this type of retention behavior is expected.
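As a quick triage step, the artifacts covered in this post can be checked for in one pass over an extracted com.google.android.gms directory. This is a rough sketch assuming a full file system extraction unpacked to local disk; `inventory` is a hypothetical helper and the path passed in is wherever the examiner's tooling exported that directory.

```python
from pathlib import Path


def inventory(gms_root):
    """Report which Timeline-related artifacts exist under the gms directory."""
    root = Path(gms_root)
    odlh = root / "app_semanticlocation_odlh_cache"
    return {
        "ULR_USER_PREFS.xml": (root / "shared_prefs" / "ULR_USER_PREFS.xml").is_file(),
        "rawsignal_db": (root / "app_semanticlocation_rawsignal_db").is_dir(),
        "placeindex_cache": (root / "app_semanticlocation_placeindex_cache").is_dir(),
        # An empty list here, with Timeline off, is consistent with the user
        # having chosen to delete their data when turning it off.
        "odlh_is_files": sorted(p.name for p in odlh.glob("*.is")) if odlh.is_dir() else [],
    }


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        for name, found in inventory(sys.argv[1]).items():
            print(f"{name}: {found}")
```

Presence or absence is only a starting point; the contents still need to be validated against the settings in ULR_USER_PREFS.xml and against known device movement.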
Yay for (More) Location Data
The heading says it all. As examiners and investigators, we are always on the lookout for location data and an associated timestamp. That being said, we cannot blindly accept it and call it accurate. The onus is on us to evaluate its reliability and apply it to the examination/investigation at hand. Does it make sense within the context? Is it something I can test myself? These are questions examiners should be asking themselves.
Android is slowly catching up to iOS in terms of available location data, but it still has a ways to go. Fragmentation and default settings can still play a role in whether or not an examiner can find reliable location data on an Android handset.
Article Link: The Green Look Back. Android’s On-Device Location History – The Binary Hick