Every OS that supports BLE caches parts of your device's profile. It's a method to save power and time by caching values that don't change very often. But what happens when the values change? If you're not careful, you can end up rendering your product useless.
Between new features and our firmware update process, the GATT table on the Bean changes often. We had a couple of 'interesting' weeks figuring out how this works on iOS, Android, and OS X. Read on to hear how we deconstructed the caching process, discovered some bugs, and got a few more gray hairs in the process.
A few weeks ago, Steve, one of our software engineers, was testing the Over-The-Air Firmware Update feature in the Android Bean loader. He noted that the Bean was returning a strange set of characters for the Hardware Revision number. As a result, he could not detect whether it was a Bean or Bean+. Sounds like a firmware problem, right?
Well, the Bean firmware can only return a hardcoded string of bytes. Plus, the firmware was previously tested intensively by the iOS and OS X Bean loaders, so if it was a firmware bug, how could it exist for so long without us knowing about it?
Kianoosh after round 1 of our caching woes.
The classic Software vs Firmware showdown
Steve started logging the bytes he received across various versions of the firmware image. Lo and behold, Android was returning strange values for the latest released firmware.
My first thought was that the firmware had finally given up and I needed to put in my two weeks' notice. But I love my job, so my second thought was to look for a non-firmware culprit. I took my iPhone out and fired up the LightBlue Explorer app, as I often do to relax after a long day of work.
Sigh of relief. LightBlue Explorer returned the expected values and I was happy.
So we fired up the Bluetooth Packet Analyzer. Android was sending read requests with the wrong handles.
But why? Why the wrong handles?
Steve and I puffed on our pipes, pondering the problem. We noticed that Android was not "Walking the GATT", that is, discovering services, before sending the read requests. Steve connected to another Bean with the same firmware and the returned values were correct. So what was special about this Bean?
We programmed the Bean that was returning correct values with the firmware in the field. We repeated the Firmware Update with the new Bean, and this Bean contracted the same disease as the other Bean.
The handles that Android was using were the handles for the old firmware’s Device Information Service. Android had cached the attributes' information and there was no documentation to indicate this behavior.
This proved that Android had cached the handles for the attributes. But how and why?
Attribute Caching in BLE
Each attribute has a 16-bit identifier called a handle. Handles are used to identify the association of each Read, Write, Notification or Indication with a particular attribute.
Attribute handles are the primary key identifiers for the data transferred. When an application reads the data of a given characteristic, or receives a notification, the identifier for the data is the handle. Handles are typically abstracted away from developers, as the GATT server keeps track of them.
The BLE spec uses handles for several reasons. One of the primary ones is reducing packet size: the 16-bit handle stands in for the UUID, so you avoid transferring the long 128-bit UUID for every custom characteristic.
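To make the size argument concrete, here is a minimal sketch of how an ATT Read Request is laid out: a one-byte opcode (0x0A) followed by the two-byte handle in little-endian order. The handle value and the 128-bit UUID below are made up for illustration, and build_read_request is a hypothetical helper, not part of any BLE stack.

```python
import struct
import uuid

ATT_OP_READ_REQUEST = 0x0A  # ATT Read Request opcode


def build_read_request(handle: int) -> bytes:
    """Encode an ATT Read Request: 1-byte opcode + 2-byte little-endian handle."""
    if not 0x0001 <= handle <= 0xFFFF:
        raise ValueError("attribute handles range from 0x0001 to 0xFFFF")
    return struct.pack("<BH", ATT_OP_READ_REQUEST, handle)


# Hypothetical 128-bit UUID for a custom characteristic (made up for illustration).
custom_uuid = uuid.UUID("12345678-1234-5678-1234-56789abcdef0")

request = build_read_request(0x0012)  # 0x0012 is an arbitrary example handle
print(request.hex())                  # the full ATT payload of the request
print(len(request))                   # size of the request in bytes
print(len(custom_uuid.bytes))         # size of the UUID alone, for comparison
```

The whole request is three bytes, while the UUID alone would take sixteen. The catch is that the three-byte request only means something if both sides agree on what lives at that handle.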
In order to make sense of received packets, the data has to be associated with a handle. In our experience, iOS and OS X ignore the packets received with unidentified handles (as they should).
The process of mapping handles to the attributes they address is often called Discovering Services, and is executed upon connection. On Android, it is kicked off with BluetoothGatt.discoverServices(). On iOS and OS X, it is done with the discoverServices and discoverCharacteristics methods of CBPeripheral.
When LightBlue Explorer “Interrogates” the GATT server of a BLE device, it is essentially discovering its services and the associated attributes.
Attribute Caching is the caching of the attribute handles. It is often triggered by pairing and bonding. Attribute caching allows the client to skip the process of discovering services on each new connection. The savings can be significant, because the length of the discovery process grows with the number of attributes on the BLE server side.
Attribute caching's negative side effect is the mismatch between the cached information and the GATT server. This was the problem that plagued us.
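The mismatch can be reproduced with a toy model. This is only a sketch, assuming a simplified attribute table that maps each handle to a (UUID, value) pair. The handles, values, and the 0xFFF1 UUID are invented; 0x2A26 and 0x2A27 are the assigned UUIDs for Firmware Revision String and Hardware Revision String.

```python
HARDWARE_REVISION_UUID = 0x2A27  # Hardware Revision String (Device Information Service)


def discover(table):
    """Walk the server's table and map each UUID to its handle."""
    return {uuid: handle for handle, (uuid, _value) in table.items()}


def read(table, handle):
    """Read whatever attribute value currently lives at `handle`, if any."""
    entry = table.get(handle)
    return entry[1] if entry else None


# Hypothetical attribute tables: handle -> (UUID, value). All values are made up.
old_firmware = {
    0x000E: (0x2A26, b"fw-old"),
    0x0010: (HARDWARE_REVISION_UUID, b"hw-1"),
}
new_firmware = {
    0x000E: (0x2A26, b"fw-new"),
    0x0010: (0xFFF1, b"\x00\x7f\x03"),          # a new feature now occupies this handle
    0x0012: (HARDWARE_REVISION_UUID, b"hw-1"),  # the old attribute moved down the table
}

# The client cached the handles before the firmware update...
cache = discover(old_firmware)
stale_handle = cache[HARDWARE_REVISION_UUID]
stale_value = read(new_firmware, stale_handle)   # ...and now reads the wrong attribute.

# Rediscovering services against the new table fixes the lookup.
fresh_handle = discover(new_firmware)[HARDWARE_REVISION_UUID]
fresh_value = read(new_firmware, fresh_handle)

print(hex(stale_handle), stale_value)
print(hex(fresh_handle), fresh_value)
```

The cached handle still resolves to something, which is why the symptom looks like corrupted data rather than an outright error.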
Attribute Caching in Android
Based on what we observed, Android appears to cache the entire attribute table of a given BLE device if the Generic Attribute Service (horrible name) exists in the GATT server. BLE has the worst acronyms. Hey, at least we're past that whole Bluetooth Smart BS.
We did not find any documentation for this behavior; it is, however, discussed in the Bluetooth Core Specification.
Generic Attribute Service
The Generic Attribute Service includes only one characteristic, called Service Changed. The Service Changed characteristic is designed to enable a GATT server to change its structure during a BLE connection while maintaining synchronicity with the client.
Let’s say you want to add or remove a characteristic during a connection. You need the GATT client to discover the new services and characteristics when that happens. The GATT server can send a Service Changed indication to the GATT client, with a payload representing the range of handles that changed. This is a common occurrence on smartphones acting as the GATT server, as different applications open and close.
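The payload of a Service Changed indication is small: the start and end of the affected handle range, each as a 16-bit little-endian value. Here is a minimal sketch of encoding and decoding it. The function names are hypothetical; the UUIDs are the assigned numbers for the Generic Attribute Service (0x1801) and the Service Changed characteristic (0x2A05).

```python
import struct

GENERIC_ATTRIBUTE_SERVICE_UUID = 0x1801
SERVICE_CHANGED_UUID = 0x2A05


def encode_service_changed(start_handle: int, end_handle: int) -> bytes:
    """Pack the affected handle range as two little-endian uint16 values."""
    if not 0x0001 <= start_handle <= end_handle <= 0xFFFF:
        raise ValueError("expected 0x0001 <= start <= end <= 0xFFFF")
    return struct.pack("<HH", start_handle, end_handle)


def decode_service_changed(payload: bytes) -> tuple:
    """Unpack a 4-byte Service Changed value into (start_handle, end_handle)."""
    if len(payload) != 4:
        raise ValueError("Service Changed value must be exactly 4 bytes")
    return struct.unpack("<HH", payload)


# Covering the whole handle space tells the client to rediscover everything.
everything = encode_service_changed(0x0001, 0xFFFF)
print(everything.hex())
print(decode_service_changed(everything))
```

Sending the full range is the blunt option. A server that knows exactly which services moved can send a narrower range and save the client some rediscovery work.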
iOS and OS X's behavior with respect to Generic Attribute Service
iOS and OS X do not cache the attribute table of the connected BLE device if the Generic Attribute Service is present in its attribute table. If you send a Service Changed indication to an iOS device, it will rediscover the services, as it should. OS X should do the same, but there is currently a bug that prevents this from happening. We were told by Apple that this is to be fixed in macOS Sierra.
Follow these guidelines when dealing with caching on your BLE device:
- If the attribute table does not change dynamically, avoid using Generic Attribute Service. Simple.
- If the attribute table does change during the connection, add Generic Attribute Service to the attribute table. Not as simple as you think. Make sure the handle for the Service Changed characteristic stays constant across all firmware versions and attribute tables. We suggest this so that newer firmware can alert a newly connected central device that the attribute table has completely changed since the last firmware.
- If the attribute table was cached as a result of pairing and bonding and the newer firmware has changed the attribute table, re-initiate the pairing and bonding process to kick off a new "Discover Services" process from the client side.
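The second guideline can be turned into a check that runs when you prepare a firmware release. The sketch below is one possible approach, not how our firmware does it: it assumes each attribute table is simplified to a handle-to-UUID map, and the helper names and example tables are hypothetical. It verifies that the Service Changed handle did not move, then reports the span of handles that differ between two releases.

```python
SERVICE_CHANGED_UUID = 0x2A05


def service_changed_handle(table):
    """Return the handle of the Service Changed characteristic, or None."""
    for handle, uuid in table.items():
        if uuid == SERVICE_CHANGED_UUID:
            return handle
    return None


def changed_range(old_table, new_table):
    """Return (start, end) spanning every handle whose UUID differs, or None."""
    handles = set(old_table) | set(new_table)
    changed = sorted(h for h in handles if old_table.get(h) != new_table.get(h))
    return (changed[0], changed[-1]) if changed else None


def check_update(old_table, new_table):
    """Fail if the Service Changed handle moved; otherwise return the changed span."""
    old_handle = service_changed_handle(old_table)
    new_handle = service_changed_handle(new_table)
    if old_handle is None or old_handle != new_handle:
        raise ValueError("Service Changed handle must exist and stay constant")
    return changed_range(old_table, new_table)


# Hypothetical tables (handle -> UUID), simplified for illustration.
old_release = {0x0001: 0x1801, 0x0003: 0x2A05, 0x0010: 0x2A27}
new_release = {0x0001: 0x1801, 0x0003: 0x2A05, 0x0010: 0xFFF1, 0x0012: 0x2A27}

print(check_update(old_release, new_release))
```

If the check returns a range, the new firmware has something to put in a Service Changed indication. If it raises, a client with a cached table has no reliable way to hear that indication at all.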
Thankfully, our firmware in the field already had the Generic Attribute Service in the GATT table. The GATT table had changed in the latest firmware release to add the new features, so the handles no longer matched those of the previous firmware revisions.