
Experts at ESET, an international developer of antivirus software, have explained how applications that track the spread of COVID-19 work and what potential threats they pose to personal data.

"[M]ore than 30 countries have, or are planning to release, apps designed to contact trace or geo fence their users, for the purposes of limiting and managing the spread of COVID-19," the report says.

"The majority of apps available are government sponsored and use a variety of different methods to fulfill their purpose, such as Bluetooth vs. GPS, centralized vs. decentralized, and not all are sensitive to maintaining the privacy of the user," the authors report.


There are two main methods being used to glean the physical proximity of users. The first is the global positioning system (GPS): this uses satellite-based radio-navigation to approximate the individual’s location and the location of other app users. The second, more prominent, solution uses Bluetooth and signal strength to identify other app users’ proximity, allowing the devices to exchange handshakes rather than track actual location. There are some solutions that use a mix of both Bluetooth and GPS and some even use network-based location tracking, but these methods have significant location-tracking privacy issues and are fortunately limited to only a few developments.
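To illustrate how signal strength can stand in for distance, here is a minimal sketch using the log-distance path loss model, a common textbook approximation. The report does not say which model any particular app uses; the function name, the calibration value of -59 dBm at one meter, and the path loss exponent of 2.0 are all assumptions chosen for illustration.

```python
def estimate_distance_m(rssi_dbm, tx_power_dbm=-59.0, path_loss_exponent=2.0):
    """Rough distance estimate from a Bluetooth signal strength reading.

    rssi_dbm: measured signal strength in dBm.
    tx_power_dbm: assumed signal strength at 1 meter (hypothetical calibration).
    path_loss_exponent: assumed environment factor (2.0 approximates open space).
    """
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * path_loss_exponent))


# A weaker signal maps to a larger estimated distance.
near = estimate_distance_m(-59.0)
far = estimate_distance_m(-79.0)
```

In practice such estimates are noisy, since walls, pockets and bodies all weaken the signal, which is one reason apps combine distance with a time requirement.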

When a contact-tracing app comes into contact with another device running the same app there is a handshake and an exchange of keys. These keys are typically continually changing and are generated on a time basis and unique to the device. When device A meets device B, they share keys based on a predetermined distance and time requirement, for example within 2 meters for 15 minutes. The device either holds on to the keys or passes them to a central server; when users confirm they may be positive for infection then all the keys they have generated are added to a cloud system. All other devices will collect these on a frequent basis to see if there is a match with keys that have been collected or alternatively this match will be processed in the cloud. If there is a match, then those users are warned that they have been in contact with another device that is now reporting positive; they have no clue which device.
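The exchange described above can be sketched in a few lines for the decentralized case, where matching happens on the device. This is an illustration under stated assumptions, not the protocol of any specific app: the `Device` class, the 15-minute key rotation interval and the HMAC-based key derivation are hypothetical choices, while the 2-meter and 15-minute thresholds come from the example in the text.

```python
import hashlib
import hmac
import os

ROTATION_SECONDS = 15 * 60   # assumed key rotation interval
MAX_DISTANCE_M = 2.0         # example threshold from the text
MIN_DURATION_S = 15 * 60     # example threshold from the text


class Device:
    def __init__(self):
        self.secret = os.urandom(32)  # never leaves the device
        self.sent = {}                # time interval -> key broadcast then
        self.heard = set()            # keys collected from nearby devices

    def key_for(self, timestamp):
        """Derive the time-based key this device broadcasts at `timestamp`."""
        interval = int(timestamp // ROTATION_SECONDS)
        digest = hmac.new(self.secret, str(interval).encode(), hashlib.sha256)
        key = digest.digest()[:16]
        self.sent[interval] = key
        return key

    def record_contact(self, other_key, distance_m, duration_s):
        """Keep another device's key only if the contact was close and long."""
        if distance_m <= MAX_DISTANCE_M and duration_s >= MIN_DURATION_S:
            self.heard.add(other_key)

    def report_positive(self):
        """Keys a user uploads to the cloud after confirming infection."""
        return set(self.sent.values())

    def exposed(self, published_keys):
        """Local match: did this device hear any of the published keys?"""
        return not self.heard.isdisjoint(published_keys)


a, b, c = Device(), Device(), Device()
a.record_contact(b.key_for(1000), distance_m=1.5, duration_s=900)
c.record_contact(b.key_for(5000), distance_m=5.0, duration_s=900)  # too far
cloud = b.report_positive()
```

Here `a.exposed(cloud)` is true and `c.exposed(cloud)` is false, yet neither device learns anything about `b` beyond random-looking keys. In a centralized design, the `heard` sets would be uploaded and the same comparison would run on the server.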


If the user is identifiable and all data is held and processed centrally, then there is clearly a privacy issue; if, however, the user is not identifiable and the central cloud is only processing for matches, this could be more efficient than asking the local device to do this processing, especially if the end device is limited on resources … which could be the case in some areas of the world. This approach also gives the centralized system the ability to identify potential false positives, where some malicious users say they are infected, yet in reality they are not and are just attempting to create chaos for users, companies and society in general. Using complex algorithms to identify false positives in a decentralized approach is less realistic due to resource limitations.

A benefit of partial centralization is that the portion of centralized data that is being processed could be used to inform scientists how the population as a whole moves around and to quickly identify hotspots to enable medical resources to be allocated. If, for example, a ZIP code is requested at the time of installation, then data scientists may be able to predict disease spread. This is unlikely to enable the user to be identified, as a single ZIP code is used by hundreds or thousands of people; it does narrow the potential to identify an individual, but it may offer an acceptable compromise on privacy.
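As a rough sketch of the kind of aggregation this would allow, the snippet below counts positive reports per ZIP code and flags areas above a threshold. The record layout, the `find_hotspots` function, the sample ZIP codes and the threshold are all hypothetical; the point is only that the input carries a ZIP code and no device or user identifier.

```python
from collections import Counter


def find_hotspots(reports, threshold):
    """Return ZIP codes with at least `threshold` positive reports, sorted."""
    counts = Counter(report["zip"] for report in reports)
    return sorted(zip_code for zip_code, n in counts.items() if n >= threshold)


# Hypothetical positive reports: a ZIP code only, nothing that names a user.
reports = [
    {"zip": "10001"},
    {"zip": "10001"},
    {"zip": "10001"},
    {"zip": "94105"},
    {"zip": "60601"},
    {"zip": "60601"},
]

hotspots = find_hotspots(reports, threshold=2)
```

With this sample data, `hotspots` contains "10001" and "60601". The smaller the population behind a ZIP code, the weaker the anonymity this coarse grouping provides.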

Even the solutions that claim to be the most privacy sensitive are open to abuse: take the extreme scenario where video surveillance is used in conjunction with capturing the Bluetooth signals emitted from devices and the keys that are being exchanged. Combining facial recognition technology with the location of the device at a known time could make the user identifiable. While this may seem extreme, it demonstrates that no one system offers a privacy guarantee.