Looking back on WWDC 2016

Now that the most important Apple release of WWDC has been dealt with, we can cover everything else. I haven’t followed as closely as in previous years (hence no keynote reactions on Twitter), but here is what stands out to me.

The Apple App Stores policy announcements

As seen at Daring Fireball for instance, Apple briefed the press on many current and coming improvements to the Apple App Stores (iOS, Mac, tvOS, watchOS). This actually happened ahead of WWDC, but is part of the package. There are a lot of good things, for instance the first acknowledgment that Apple isn’t entitled, over the whole lifetime of an app, to 30% of any purchase where the buying intent originated from the app: subscriptions will move from a 70/30 split to 85/15 after the first year. However, none of this solves the lack of free trials: if only subscription apps can have free trials, then thanks, but no thanks. I want to both try before I buy and avoid renting my software, and I don’t think subscriptions make sense for every app anyway, so improvements and clarifications to apps using non-recurring payment options (e.g. an indication of whether the app is “pay once and play”, “shareware” or “coin-op machine”) would be welcome (more on that in a later post). Also, while these changes apply to the Mac App Store as well, that store will need more specific improvements to regain credibility. I don’t have much of an opinion on the new search ad system.

The new Apple File System (APFS for short)

Apple announced a new filesystem, and to say that it has, over the years, accumulated a lot of pent-up expectations to fulfill would be the understatement of the year. I can’t speak for everyone, but each year N after the loss of ZFS my reaction was “Well, they did not announce anything this year, it’s likely because they only started in year N-1 and can’t announce it yet because they can’t develop such a piece of software in a yearly release cycle, so there is no use complaining about it as it could be already started, and will show up for year N+1.” Repeat every year. So while I can scarcely believe the news that development of APFS only started in 2014, at the same time I’m not really surprised by it.

I haven’t been able to try it out, unfortunately, but from published information these are the highlights. The comparison is with ZFS, because ZFS is the reference that the Mac community studied extensively back when Apple was working on a ZFS port in the open.

What we’ll get from APFS that we hoped to have with ZFS:

  • A modern, copy-on-write filesystem. By itself, this doesn’t do much, but this is the indispensable basis for everything else:
  • Snapshots, or if you prefer, read-only clones of the filesystem as a whole. Probably the most important feature: by itself, it would justify the investment in a new filesystem to replace HFS+.

    While the obvious use case is backups, particularly with Time Machine, it is not necessarily in the way you think. Currently, when Time Machine backs up a volume, it has to contend with it being in use, and potentially being modified, while it is being backed up; if it were required to freeze a volume while backing it up, you wouldn’t be able to use it during that time and, as a result, you would back up much less often and that would defeat most of the purpose of Time Machine. So Time Machine has no choice but to read a volume while it is being modified, and as a result may not capture a consistent view of the filesystem! Indeed, if two files are modified at the same time, but one was read by Time Machine before the modification and the other after, the saved filesystem on the backup will have one file without the modification and the other with it, which was never the state of the filesystem you intended to back up at any point in time. In fact, if you have to restore from that backup and neither file can work with the mismatched version of the other, this may mean the data is lost.

    Instead, with APFS the backup application will be able to create a snapshot, which is a constant-time operation (i.e. it does not depend on how much data the volume contains) and takes no additional space, at least initially. It can then copy from that snapshot while the filesystem is in use and being modified, and be confident that it is capturing a consistent view of the filesystem, regardless of where the data is being saved (it could be to an HFS+ drive!). Once the copy is over, the snapshot can be deleted to make sure no additional space is used beyond that needed by the live data. Of course, keeping multiple snapshots will also allow the backup application to determine more efficiently what changed since last time, and with APFS on the backup drive as well it will be able to save space there, in particular by not taking up space for redundancies the source APFS drive already knows about. But snapshots on the APFS source drive will mean that, after 10 years, Time Machine will finally be safe: this is a correctness improvement, not merely a performance one (faster backups and/or backups taking less space).

  • Real protection in the face of crashes and power loss events. HFS+ had some of that with its journal, but it only protected metadata and came with a number of costs. APFS will make sure its writes and other filesystem updates are “crash-safe”.
  • I/O prioritization. A filesystem does not exist merely as a layout of the data on disk, but also as a kernel module that has in-memory state (mostly cache) that processes filesystem requests, and the two are generally tied. I/O prioritization, some level of it at least, will allow some more urgent requests (to load data for an interactive action for instance) to “jump the queue” ahead of background actions (e.g. reads by a backup utility), all the while keeping the filesystem view consistent (e.g. a read after a write to the same file has to see the file as modified, so it can’t just naively jump over the corresponding write).
  • Multithreaded. In the same vein of improvements to the tied filesystem kernel module, this will allow it to better serve different processes or threads that read and write from independent parts of the filesystem, especially if multiple cores are involved. HFS+, having been designed at the time of single-processor, single-threaded machines, requires centralized, bottleneck locks and is inefficient for multithreaded use cases.
  • File and directory hierarchy clones. Contrary to snapshots, clones are writable and are copied to another place in the directory hierarchy (while snapshots are filesystem-wide and exist in a namespace above the filesystem root). The direct usefulness is less clear, but it could be massively useful as infrastructure used by specialized apps, version control notably (both for work areas and repositories).
  • Logical volume management. Apple calls this “space sharing”, but it’s really the possibility to make “super folders” by making them their own filesystem in the same partition, which allows such a super folder to have different backup behavior, for instance.
  • Sparse files. Might as well have that, too.
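
To make the snapshot argument above concrete, here is a minimal sketch in Python of a toy copy-on-write volume. Everything in it (the `ToyVolume` class, its methods, the file names) is hypothetical and only models the idea; it says nothing about how APFS is actually implemented. It contrasts a backup that reads the live volume while files change with a backup that reads from a snapshot.

```python
class ToyVolume:
    """Toy in-memory model of a copy-on-write volume (not APFS's real design)."""

    def __init__(self):
        self._live = {}        # current mapping: file name -> contents
        self._snapshots = {}   # snapshot id -> frozen mapping
        self._next_id = 0

    def write(self, name, data):
        # Copy-on-write: never mutate the current mapping in place, so any
        # snapshot still referencing it keeps seeing the old state.
        # (This toy copies the whole mapping; a real filesystem would only
        # copy the blocks and tree nodes that changed.)
        new_live = dict(self._live)
        new_live[name] = data
        self._live = new_live

    def read(self, name):
        return self._live[name]

    def snapshot(self):
        # Constant time and no extra space: just keep a reference.
        snap_id = self._next_id
        self._next_id += 1
        self._snapshots[snap_id] = self._live
        return snap_id

    def read_from_snapshot(self, snap_id, name):
        return self._snapshots[snap_id][name]

    def delete_snapshot(self, snap_id):
        # Once the backup is done, let go of the old state.
        del self._snapshots[snap_id]


def demo():
    vol = ToyVolume()
    vol.write("a.txt", "v1")
    vol.write("b.txt", "v1")

    # Backup of the live volume: both files change halfway through the copy.
    torn = {"a.txt": vol.read("a.txt")}
    vol.write("a.txt", "v2")
    vol.write("b.txt", "v2")
    torn["b.txt"] = vol.read("b.txt")

    # Backup from a snapshot: the same concurrent writes do not show up.
    snap = vol.snapshot()
    consistent = {"a.txt": vol.read_from_snapshot(snap, "a.txt")}
    vol.write("a.txt", "v3")
    vol.write("b.txt", "v3")
    consistent["b.txt"] = vol.read_from_snapshot(snap, "b.txt")
    vol.delete_snapshot(snap)

    return torn, consistent
```

Calling `demo()` returns a first backup that mixes an old `a.txt` with a new `b.txt`, a state the volume was never in, and a second backup where both files come from the same point in time.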

What APFS will provide beyond ZFS, btrfs, etc. features:

  • Encryption as a first class feature. Full disk and per-file encryption will be integrated in the filesystem and provided by a common encryption codebase, not as layers above or below the filesystem and with two separate implementations. This also means files that are encrypted per-file will be able to be cloned, snapshotted, etc. without distinction from their unencrypted brethren.
  • Scalability down to the watch. ZFS never scaled down very well, in particular when it comes to small RAM amounts.

What we hoped to have with ZFS, but won’t get from APFS:

  • Crazy ZFS-like scalability. For instance, APFS has 64-bit inode numbers, where ZFS is a 128-bit filesystem. This is probably not unreasonable on Apple’s part.
  • RAID integration as part of the filesystem. APFS can work atop a software or hardware RAID in traditional RAID configurations (RAID-0, RAID-1, RAID-10, RAID-5, etc.), but always as a separate layer. APFS does not provide anything like RAID-Z or any other solution to the RAID-5 write hole. That is worth a mention, though I have no idea whether this is a need Apple should fulfill.
  • Deduplication. This is more generally useful to save space than clones or sparse files, but is also probably only really useful for enterprise storage arrays.

What is unclear at this point, either from the current state or because Apple may or may not add it by the time it ships:

  • Whether APFS will checksum data, and thus guarantee end-to-end data integrity. Currently it seems it doesn’t, but it checksums metadata, and has extensible data structures such that the code could trivially be extended to checksum all data while remaining backwards compatible. I don’t know why Apple does not have that turned on, but I beg them to do so, given the ever-increasing amounts of data we store on disks and SSDs and their decreasing reliability (e.g. I have heard of TLC flash being used in Apple devices); we need to know when data becomes bad rather than blindly using it, which is the first step to try and improve storage reliability.
  • Whether APFS is completely transaction-based and always consistent on-disk. Copy-on-write filesystems generally are, but being copy-on-write is not sufficient by itself, and the existence of a fsck_apfs suggests that APFS isn’t always consistent on-disk, because otherwise it would not need a FileSystem Consistency checK. Apple claims writes and other filesystem updates will be “crash-safe”, but the guarantees may be lower than a fully transactional FS.
  • Whether APFS containers will be able to be extended after the fact with an additional partition (from another disk, typically), possibly even while the volumes in it are mounted. APFS support for JBOD, and the fact APFS lazily initializes its data structures (saving initialization time when formatting large disks), suggest it, and it would be undeniably useful, but it is still unknown at this time.
  • Whether APFS will be composition-preserving when it comes to file names. It will, certainly, be insensitive to composition differences in file names, like HFS+. However, HFS+ goes one step further and normalizes the composition of file names, so the file name byte string it returns can differ from the one provided at file creation. This subtly trips up some software like version control (via Eric Sink), and is probably the specific behavior that led Linux founder Linus Torvalds to proclaim that HFS+ was “complete and utter crap”; see also this (latter via the Accidental Tech Podcast guys, who had the same Unicode thoughts as I did). Won’t you make Linus happy now by at least preserving composition, Apple? This is your opportunity!
  • Whether APFS uses B+trees. I know, this is an implementation detail, but it’d be neat if Apple could claim to have continuously been using B-/+trees of either kind for their storage for the last 30 years and counting.
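
The composition question in the list above may be easier to see in code. The following Python sketch uses two hypothetical directory classes: one that normalizes names on creation, in the spirit of what is described for HFS+, and one that is insensitive to composition but hands back the name exactly as it was given. Standard NFD from `unicodedata` is used as a stand-in for the comparison form; HFS+ uses its own variant of decomposition, and nothing here reflects actual APFS behavior.

```python
import unicodedata


def comparison_key(name):
    # Assumption: standard NFD stands in for the filesystem's comparison form.
    return unicodedata.normalize("NFD", name)


class NormalizingDirectory:
    """Insensitive to composition, and stores the normalized name."""

    def __init__(self):
        self._names = {}

    def create(self, name):
        key = comparison_key(name)
        self._names[key] = key  # the original spelling is lost here

    def lookup(self, name):
        return self._names.get(comparison_key(name))


class PreservingDirectory:
    """Insensitive to composition, but remembers the name as provided."""

    def __init__(self):
        self._names = {}

    def create(self, name):
        self._names[comparison_key(name)] = name

    def lookup(self, name):
        return self._names.get(comparison_key(name))


composed = "caf\u00e9"     # "é" as a single code point
decomposed = "cafe\u0301"  # "e" followed by a combining acute accent

normalizing = NormalizingDirectory()
normalizing.create(composed)

preserving = PreservingDirectory()
preserving.create(composed)

# Both directories find the file under either spelling...
found_normalized = normalizing.lookup(decomposed)
found_preserved = preserving.lookup(decomposed)

# ...but only the preserving one returns the bytes the creator provided.
same_bytes_normalizing = found_normalized.encode("utf-8") == composed.encode("utf-8")
same_bytes_preserving = found_preserved.encode("utf-8") == composed.encode("utf-8")
```

A tool that creates a file and then lists the directory expecting to see the exact byte string it wrote is fine with the second class and confused by the first, which is the kind of problem version control software runs into.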

For a more in-depth look at what we know so far about APFS, the best source by all accounts is Adam Leventhal’s series of posts.

Apple Filing Protocol deprecation

Along with APFS, Apple announced that APFS volumes could not be shared over AFP, only over SMB (Windows file sharing), and that AFP was thus deprecated. This raises the question of whether SMB is at parity with AFP: last I checked (but it was some time ago), AFP was still superior when it came to:

  • metadata and
  • searching

But I have no doubt that, whatever feature gap is left between SMB and AFP (if there is even one left), Apple will make sure it is closed before APFS ships, just like Apple made sure Bonjour had feature parity with AppleTalk before stopping support for AppleTalk.

Playgrounds on iOS

I’m of two minds about this one. I’ve always found Swift playgrounds to be a great idea. To give you an idea, back in the day when the only computer in the house was an Apple ][e, I did not yet know how to code, but I knew enough syntax that my father had set up a program that would, in a loop, plot the result of an expression over a two-axis system, and I would only have to change the line containing the expression, with the input variable being conveniently x, and the output, y; e.g. to plot the result of squaring x, I would only have to enter¹:

60 y = x*x

run the program, and away I went. It was an interesting lesson when, due to my limited understanding of expressions, specifically that they are not equations, I once wrote:

60 2y = x+4

Which resulted in the same thing as I previously plotted, because this command actually modified line 602 (beyond the end of the loop)… good times.
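
For the curious, the mishap can be mimicked with a toy model of line entry in Python. The `enter_line` function below is hypothetical and is not the actual Applesoft parser; it simply assumes, as the anecdote implies, that spaces between the leading digits are ignored, so that all of them end up in the line number.

```python
DIGITS = "0123456789"


def enter_line(program, text):
    """Store one typed line in `program`, a dict of line number -> statement.

    Assumption: leading digits form the line number even when spaces
    separate them, which is what the anecdote suggests happened.
    """
    digits = []
    pos = 0
    while pos < len(text) and (text[pos] in DIGITS or text[pos] == " "):
        if text[pos] in DIGITS:
            digits.append(text[pos])
        pos += 1
    if not digits:
        raise ValueError("expected a line number")
    # Entering a line replaces whatever was stored under that number.
    program[int("".join(digits))] = text[pos:].strip()
    return program


program = {}
enter_line(program, "60 y = x*x")   # the line inside the plotting loop
enter_line(program, "60 2y = x+4")  # meant as an equation, parsed as line 602
```

After these two entries, line 60 still holds `y = x*x`, and a new line 602 holds `y = x+4`, out of reach of the loop.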

Anyway, Swift playgrounds, which automatically plot the outcome of expressions run multiple times in a loop for instance, and even more so on iPad where you have the draggable loop templates and other control structure templates, provide the necessary infrastructure program out of the box, and learners will be able to experiment and visualize what they are doing on their own.

These playgrounds can be shared, but when I hear some people compare this to the possibilities of Hypercard stacks, I don’t buy it. There is nothing for a user to do with these playgrounds, the graphic aspect is only a visualization (and why does it need to be so elaborate? This is basically Logo, you don’t need to make it look like a Monument Valley that would not even be minimalistic); even if the user can enter simple commands, the playground always has to start back from the beginning when you change the code (which is not a bad thing mind you, but shows even the command area isn’t an interactive interface). You can’t interact with these creations. Sharing these is like sharing elaborate Rube Goldberg constructions created in The Incredible Machine: it’s fun, and it’s not entirely closed as the recipient can try and improve on it, but apart from watching it play there is nothing for the recipient to do without first understanding how the machine works.

Contrast that with Hypercard, in which not only did you set up an actual interface, but what you coded was handlers for actions coming from the interface, and not a non-interactive automaton. This also means that it was much less of a jump to go from there to an actual app, especially one using Cocoa: it’s fundamentally just a bunch of handlers attached to a user interface. It’s a much bigger jump when all you’re familiar with is playgrounds or even command-line programs, because it’s far from obvious how to go from there to something interactive. Seriously, I’m completely done with teaching programming by starting with command-line apps. It needs to die. What I’d like to see Apple try on the iPad is something inspired by the old Currency Converter tutorial (unfortunately gone now), where you’d create a simple but functional app that anyone could interact with.

Stricter Gatekeeper

…speaking of sharing your programming creations. I’m hardly surprised. This shows web apps are definitely the future of tinkerer apps.

  1. In Apple II Basic, you’d enter a line number then a statement, and that would replace the line in the saved program by the one you just entered. Code editors have improved a bit since then.
