Ce billet est également disponible en Français
Dear Mr. Cédric O
Small aside for my regular readers: Cédric O is in charge of digital matters in the French government. Of France. No, I am not reducing his name to its initial, this is indeed his name. So while I’ve got your attention, allow me to state that when requested to add name validation that mandates a minimum number of characters, the only ethical solution for a software developer is to refuse to implement that. Back to the matter at hand.
First, thank you for your recent public pronouncements (of which iMore provides good English-language coverage, in my opinion), which allow me to give a theme to this post I intended to write down: that of the open letter.
Indeed, I want to react to the direction you want to steer the StopCovid project towards, which is that of direct standoff with Apple. Now I will readily admit to a bias in favor of Apple, as the coverage of this blog shows: a lot of it centers on Apple technologies, with a focus on iOS. But I hope to convince you anyway that entering a public standoff with Apple would be futile, even if you’d wished for Apple to be pilloried.
It was basically an excellent idea to start and develop the StopCovid project, don’t get me wrong, but I think it’s useful to go over exactly why. Indeed, some observers (some of which I have a lot of respect for, such as Bruce Schneier) have dismissed this kind of computer-based contact tracing infrastructure as being tech solutionism, and these accusations deserve closer scrutiny. I don’t feel the need to go over the principle and usefulness of contact tracing, not only to warn those a newly diagnosed individual could have infected without his knowledge, but also to attempt to figure out the individual who himself infected the newly diagnosed one, if not known already, and thereby potentially walk back the contamination chain.
However, I do feel it is useful to restate the main use case, which is containment of reemerging outbreaks, be it upon the easing of lockdown or imported from endemic areas. As a result, we can assume being a context of a breach that needs repair, and therefore (as an example) means that see little use in the observation phase, such as tests on asymptomatic people, can suddenly be deployed on a large scale as soon as a confirmed case is raised. One of the aims of the lockdown (along with easing the pressure on hospital care) is precisely to get back to a situation where repairing the breaches is feasible anew, as opposed to a situation where there are more breaches than digits available to plug them.
But as we saw when looking for the index case in the Oise (where I’m told it is still being searched for; more than two months after the fact, the expected benefit seems rather thin in my opinion), current techniques appear to be outmatched in the case of Covid-19. And research coming from Oxford university confirms that assessment, as it suggests the inferred epidemiological characteristics of Covid-19 squash any hope of efficient contact tracing by traditional techniques, validating the need for an automated contact recording solution within a more general effort.
That qualification is useful to make, as no application will be a panacea, but rather a link in a chain where all elements must be present. Such as widely available testing (as France appears to be ready to provide as May begins) that also has quick turnover: if the swab result does not come back until three days later, we reach the contacts, and the tests of those don’t come back until three days after that, it is obvious that even if the application allowed for instant contact tracing, the disease would nevertheless outpace us. As a result, the buildup of a waiting list for testing must be avoided at any cost (PCR itself taking only a few hours), and we must ensure the swabs can be quickly taken to PCR spots. And no application can ever replace actual tracing squads, if only in order to trace contacts not covered, those where one of the two is not equipped with a compatible mobile phone, or any mobile phone at all, for instance. That is why it makes sense to display tracing capabilities at the départemental scale, rather than at the regional scale.
How: the physical layer
All that having been said, we can now start going over the different methods for automatically establishing a contact has occurred. Geolocation can be dismissed outright: it is simultaneously not accurate enough for the need: as GPS is not available everywhere or instantly, smartphones often fall back to cell tower or Wifi hotspot geolocation, and too invasive, for obvious reasons. Bluetooth, on the other hand, is designed to allow direct transmission between digital devices, including peripherals, and its low energy variant has even been designed to support very constrained environments such as being integrated in a key fob, thus enabling it to be found by proximity detection. This notion of proximity is essential: for determining potential contamination, we don’t so much want to compute an exact distance as to reduce testing to a list of potential contacts, erring on the side of caution, rather than having to test everyone in a 100 miles radius in case of a confirmed case.
How: Data processing
OK, so we will retrospectively determine a contact by the fact the mobile phones of the parties in question have been able to exchange (or “exchange strongly enough” by measure of signal intensity among other means) data through Bluetooth. But how should this data be kept (and which data), and where should that computation be performed when the time comes?
Any mobile phone already broadcasts Bluetooth messages to any interested party for various reasons (that is what allows them to appear in the list of reachable devices on the selection interface of a personal computer, for instance). So a first idea would be to set up passive listening of broadcast Bluetooth messages and locally store the Bluetooth identifier of the emitter. But that quickly runs into some issues: for privacy reasons, as it happens, that identifier rotates at regular intervals for implementations of recent versions of the standard, such that being forgotten by is own emitter, it becomes useless; furthermore, many mobile phones will throttle if not stop broadcasting when they are not in active use so as to save battery life, which has never ceased to be an important design constraint on mobile phone design.
So it seems necessary to install a change of behavior on both sides of the contact, which shifts the problem space: now both sides have to be modified for even either side to benefit from the system. As a result, it’s kind of like masks: if the source of the contamination did not previously install the application, the contaminated will get no benefit from having installed the application, so a reaching a sufficient density of participants is paramount. That could lead to consider providing smart watches (which are harder to forget at home) to those who, as is their right, do not own a compatible mobile phone.
Now that we can freely design the software on both sides of the interaction, the design space is greatly expanded, too much to explore it here. However, one significant choice is that of the processing which determines whether a contact previously occurred with a newly diagnosed individual, processing which therefore needs access to any necessary information for that purpose: should it occur on the mobile phone (either one: the two now being “hermaphroditic”, what one can determine, symmetrically the other will be able to, as well), or on a central server?
In the second case, since the aim is to warn the bearer of the phone involved in the contact, that would by all appearances entail that the server has the means to contact any user of the system (directly or indirectly: if every user must regularly query the server by asking “I am the user with pseudonym x, is there anything new for me? Send the answer to IP address 192.0.2.42”, that is equivalent to being reachable).
In the first case, however, it is impossible to prevent the user from figuring out on which occasion he has been in contact with a newly diagnosed person (though still coming short of directly informing of his identity): even if the processing which determines contact were a difficult to reverse cryptographic function, it would suffice for him to run that function on subsets of his contact database, through dichotomy if need be, until the single event sufficient to make the function return a positive is found.
The Apple and Google system
That having been established, let us look at the choices made by Apple and Google, as well as the API provided, over which an application will be able to be built.
To begin with, “Exposure Notification” is a service as far as Bluetooth Low Energy is concerned, that is to say a protocol relying on BLE for over the air data exchange, as HTTP is a protocol relying on TCP for over the Internet (usually) data transmission. It is documented here as I type this; as such, the consortium managing Bluetooth has provided a protocol identifier specifically for its use. The format as it appears on the wire (so to speak) is simple: beyond the mandatory Bluetooth framing, it’s mostly the rotating proximity identifier, but it comes with a metadata field, whose only purpose so far (beyond versioning the protocol, allowing to confirm implementation compatibility) is the improve signal intensity loss computations.
As the name suggests, the rotating proximity identifier rotates at more or less regularly: more or less because if rotation was too regular, that would make it easier to trace people, and render these changes useless: rotation occurs at most every ten minutes, and at most every 20 minutes. All this is properly detailed in the crypto document, which describes how data sent by the protocol mentioned above is generated, how data sent and received is stored, and in particular how to determine a potentially contaminating contact has occurred.
The most important property of the “Exposure Notification” system is that this determination is performed locally. The system assumes the presence of at least one server, but the latter only broadcasts non-nominative data collected (with permission) from diagnosed individuals so as to enable the recipient to make this determination: nothing is uploaded for users that have not been positively diagnosed yet. Even the data that does get uploaded reveals little, to the extent that it amounts to revealing the randomly generated data that was used for sending the rotating proximity identifiers, without any personal data, least of all identifying data, being involved.
The system credibly claims other properties: for instance, it would appear to preclude the collection of information emitted by others, only to later make that be broadcast by passing it off as one’s own information (innocents won’t be the only people positively diagnosed with Covid-19, you have to assume adversaries will too, and as a result minimize their nuisance potential).
That being said, the system does not ensure by itself that only positively diagnosed individuals will be able to upload their alleged contamination information: I have a hard time seeing how it could provide any control in this area, therefore it relies on the health authorities for such a filter.
That is apparent in the third and last document, which describes the API an application (here for an Apple device running iOS) will have to use for interfacing with the service. The API manages in the background everything in relation with the Bluetooth service and data storage, but does not involve itself with network interactions, which is the responsibility of the application proper: parts of the API consists of explicitly taking this data, or providing data to be sent; this is more explicit in the API documentation for a Google device running Android, which otherwise describes the same API, apart from the use of the Java language, as required by Android.
Aside from that, the API in general is unsurprising: it is modern Objective-C, with blocks used for callbacks when data processing is complete for example. It seems to provide good coverage for the future applications usage scenarios: an
ENManager class for interaction with the system as a whole such as displaying the current status (allowed or not, mostly) and recover data to be uploaded in case of a positive test result, and a
ENExposureDetectionSession to be used when checking whether the latest batch of uploaded data would not trigger an exposure when combined with the internally stored data. The only surprise being Objective-C: we would have expected Swift instead, given how trendy the language is for Apple, but that does not affect the interface functionally, it is even likely that it can directly be used from Swift.
The API reveals one final intriguing functionality, which is the possibility to declare your risk level to the system as soon as any contamination is suspected, before even a formal positive test result, so as to recursively warn contacts; with a lower intensity, of course. That will nevertheless have to go through the health authorities, so it remains to see what they will use this for.
The Apple and Google system: the unsaid
As we see, while the documentation mentions the role played by the server, it remains silent as to how many there will be, how it or they will be managed, and by whom. Multiplying the servers would be unfortunate to the extent it would force the installation of as many apps as there would be applicable servers in order to be properly covered (as much to receive the information broadcast by each server, as to be able to send to each of these servers should it become necessary); that could be acceptable (though we would rather do without) in specific cases, such as that of commuters who cross a border, it is however completely unacceptable for the average user.
Both companies intend to severely restrict access to this API: quoting from their commen document gathering answers to frequently asked questions:
10.How will apps get approval to use this system?
Apps will receive approval based on a specific set of criteria designed to ensure they are only administered in conjunction with public health authorities, meet our privacy requirements, and protect user data.
The criteria are detailed separately in agreements that developers enter into to use the API, and are organized around the principles of functionality and user privacy. There will be restrictions on the data that apps can collect when using the API, including not being able to request access to location services, and restrictions on how data can be used.
Apple in particular has the means to enforce these restrictions: an application in general not only cannot be distributed (beyond a handful of copies) without their permission, but it is certain that a sandbox entitlement (the sandbox being the environment in which an iOS app runs, restricting their access to the operating system to the necessary services with some exceptions, only recognized with a signed waiver from Apple) will be necessary, with very few entities being able to claim such an entitlement (state actors, mostly); sorry for those who would like to play with it: com.apple.developer.exposure-notification will be the most restrictive entitlement ever available for the iPhone… It is a sure bet that Apple will not hesitate to invalidate an entitlement that would have leaked or become abused.
Given the arbitrator position at least Apple holds, I therefore wonder about the lack of any rule or even recommendation on the multiplication of servers. I can conceive that neither Apple nor Google want to expose themselves even more to charges that they are holding themselves as above states, but a confirmation that one server at most would be allowed per country (defined as an entity with diplomatic representations or equivalent) would be desirable, while still allowing each country to have multiple apps if needed, as long as they are all based on the national server. I of course hope that the EU will set up a common server that all member states will adopt, and ideally you would only need one per continent, but it would seem difficult designate for each continent the political entity that would be able to be put in charge of that (as for having a single global server, if that were viable politically as well as technically, Apple and Google would already have handled that).
Additionally, the documentation mentions a second phase where the system will be able to be activated out of the box through the operating system without requiring any app to be installed, with the suggestion to install one appearing once the system has determined a potentially contaminating contact; but if the system can determine that out of the box, that implies the ability to recover data from a server without any previously installed app, so which one is used?
And for that matter, there is no guidance on the role of health authorities to ensure the reliability of data that their servers would broadcast. I recall an incident that was reported to me while working in the telecommunication industry: wireless networks used by police, firefighters, EMT, etc. provide a “distress call” functionality these customers take seriously, you can understand why. In practice, it is a voice call, but flagged so as to trigger some specific processing (in relation to priority, in particular) and raises alarms at many levels. And when initially interconnected with the system covering a neighboring district, it did not go over exactly as planned. Indeed, even though the interconnection protocol did specify how to transmit the distress status of the call, it left a lot of leeway in how to react to such a status; and as it happens, at least one of the systems would consider that distress status to matter so much as to make it sticky for the channel in question, up until explicitly cleared by a supervisor. Which by itself can make sense, but in the case where the channel would belong to the other interconnected system, that one having a different configuration would as a result never send such a confirmation, such that the channel would perpetually be considered in distress after that, even for ordinary calls. Pretty quickly, all channels in this situation ended up flagged as in distress without any way to clear them, and when all calls are distress calls, well none are any more. They had to be supplied an emergency patch.
So it would appear risky to invest resources (engineering, quality assurance, etc.) setting up a system without derisking the possibility that it be for naught if it ends up giving back too many false positives to remain usable. I can’t imagine doing without rules to ensure the reliability of information broadcast by the server or servers.
Finally, still in the geographical matter a risk (raised by Moxie Marlinspike) exists where the database to download could become too heavy (in particular in case the epidemic flares back up) to be practical to download, such that it would become necessary to partition it according to location… thus reintroducing geolocation for this purpose, ruining part of the assurances offered. Similarly to server multiplication, I think this is a matter for which Apple and Google should state their intentions.
StopCovid, as with many other projects, was started before the Apple/Google initiative was made public; the project has followed different principles, a process which I respect; the protocol, called ROBERT, is documented. The choice was notably made of an architecture where contamination determination is centralized, with the benefits and drawbacks this entails, I won’t go over them again.
As for the matter of server multiplication, we could question already the necessity of protocol multiplication: will the average user need to install one application for each protocol? But that is not where the issue currently lies: since the ROBERT protocol relies on Bluetooth as expected, its implementation on iPhone meets well-established restrictions by Apple on the ability to use Bluetooth by an app which is not in active usage; these restrictions are documented by Apple, even if in vague terms; I have no doubt the project was able to assess them first hand. They aim at preserving battery life and protecting privacy. They drastically reduce the viability of the solution.
Technologically, it was important to try, if only in order to be credible and not depend on a partner that would have been chosen out of a lack of any alternative. The StopCovid project, notably through its experiments on (approximate) distance measurement using Bluetooth, has already accomplished advances which will be more generally useful, and as a project it could go forward within the framework of a more general protocol, with it adopting another protocol not being synonymous with the end of the project.
Because let’s be clear: I can hear media mention systems that could be “made compatible”, but the only way I know of to make two systems communicate is to have them implement the same protocol. It can consist of two different implementations, but they must be of the same protocol.
For when confronted with these longstanding restrictions, you have chosen the path of the standoff with Apple: you express surprise that Apple would not lift these restrictions on demand, and you insist that the project be delivered in production according to its current principles, invoking a matter of technological sovereignty.
Such a standoff is futile and even harmful, and for at least two reasons.
Futile for now
Beyond these restrictions, Apple has more generally asserted the principle that they control the software that can be distributed on the mobile devices showing their brand. As time went on this control sat less and less well with me, such that I have looked for alternatives, in particular through the web, to be able to distribute my personal projects. This is becoming viable for some applications, such as file processing, but when it comes to Bluetooth there is no alternative to the “native” SDK.
So even if I personally object to some of Apple policies, do you seriously believe that after 12 years of dictating their conditions as they saw fit for iPhone software, witout exception, except those they defined themselves (there are a few more, less well documented), you were going to be able to just walk in and convince them or force their hand just like that? That is delusional.
It is all the more delusional as in other situations where they were largely more exposed (I am referring in particular to a decryption request from the FBI), Apple did not cave in to pressure, and they were proud of it and they still are. That is a matter among others of professional pride, as much in relation to the preservation of battery life as it is of privacy protection. Do you really think big words will be enough to make them change their minds?
If at least you were to invoke some argumentation, such as on the potential advantages for privacy of a centralized solution like ROBERT when compared to their solution, that could make them think twice about it. But instead, only denunciation of their financial health (insolent for sure, but what is the relationship with the matter at hand?) or invocation of technological sovereignty is brought.
You could get the upper hand through legal constraints, but its is certain it will take time, a lot of time. So defending technological sovereignty of France could make sense… in other circumstances. Because, I don’t know if you noticed, but France is in the middle of a pandemic right now. And France will only be able to eventually get rid of it through herd immunity; and I don’t know about you, but I’d rather acquire it through a vaccine, or failing that as late as possible. But by the time you’d have forced Apple’s hand through legal means, I think a vaccine will have been found.
Therefore, your insistence on technological sovereignty tells me that attempting to enforce it in the immediate situation is more important for you than having an efficient contact tracing solution, able to save lives. These priorities are backwards. Technological sovereignty matters, but there will be opportunities to enforce it later, or in other ways.
Maybe it’s unfair for Apple to be dictating their terms in such a way. Maybe it ought to be otherwise. But in the meantime, in the here and now, they hold the keys to the system.
Futile in the long run
Let us now assume Apple has given you the keys. Your troubles are not over. As the reasons for which they have set up these restrictions in the first place have not gone away, especially that of battery preservation.
So what is to say your solution will not excessively drain the battery? In particular, you will have a hard time finding skilled developers on the market when it comes to reasonable usage of Bluetooth when unshackled from these restrictions: even those who are currently doing so on Android may come to a few surprises once on iPhone.
This is particularly important as some iPhones remain in active usage for far longer than Android devices, so often with a degraded battery. I personally still haven’t replaced my 6-year-old iPhone 5S which still works fine, except its autonomy is no longer what it once was, and I don’t think I am alone. The matter isn’t merely that I will have to pay more attention to my battery level once StopCovid will be installed; the matter is how will the general public react to an application that, once installed, will cause the battery to drain more quickly, leading it to become fully drained if not carefully watched? A fraction at least will disable the application. And did not we tell earlier how important sufficient penetration was for the system to function?
Once again, the established situation matters, and the established situation is that Apple keeps maintaining many devices in the wild that others would consider obsolete (but which do feature Bluetooth Low Energy), and any shift in this equilibrium will be noticed. For instance, do you really want to expose yourself to accusations of trying to drive premature device renewal, when the additional battery drain of StopCovid will be realized, thus potentially forcing some to renew hardware that was previously more functional?
You could respond that Apple itself will encounter the same challenges and risk increasing the battery drain when they will implement their own system. That may be the case, but I would rather trust them in that domain than I do the developers of StopCovid, no offense meant.
I refuse the false dichotomy brought by some commenters, who would reduce the choice to the entity to which I would have to confide myself. With the right protocol, the right architecture, we can avoid having to trust any particular party more than we are comfortable with, and reconcile seemingly opposite requirements.
While Germany initially had its own project, it has recently announced joining the Apple and Google system, but that will not prevent it from proposing its own application based on that system. What is to prevent France from following the same path? We will all benefit by avoiding balkanization of solutions, and France here is not in a position of strength.