Distributed concern

When the earthquake and tsunami hit Japan in 2011, I was in Austin for my first in-person company meeting. I remember being in the hotel lobby with my teammates and many other coworkers who had just arrived for SXSW. As we learned the news, many of them contacted our colleagues living in Japan to make sure they were fine. I had never felt worried about something so distant before.

One of the overlooked aspects of working for a distributed company like Automattic is that your circle of friends and acquaintances suddenly explodes geographically.

Since then, every time there’s a tragedy in the news I have to ask myself if I know someone there, and sometimes I do. I knew someone at a mass shooting in a school in Colorado, and several coworkers living in Paris during the 2015 attacks. A colleague’s parents lived in Cairo during the 2011 revolution. I’ve received Facebook notifications about friends being OK after flooding in South Asia without even knowing they were there. And there were so many other times when I had to wonder again: do I know someone there?

You see tragedies in the news every day. It’s so common that it’s sometimes hard to feel connected. It’s hard, I think, because if we cared as deeply about the remote tragedies as we do about the local ones, we would be in pain all the time.

I’m glad to know all those people around the world because that question has been very powerful. Most of the time I don’t know someone there, or if I do, they’re fine. But then it’s easy to ask myself: what if I did? This isn’t just another segment in the news; this is real life, and real tragedy for many people.

A series of interiors

I’ve been reading Wanderlust: A History of Walking, and this paragraph made me pause:

Many people nowadays live in a series of interiors — home, car, gym, office, shops — disconnected from each other. On foot everything stays connected, for while walking one occupies the spaces between those interiors in the same way one occupies those interiors. One lives in the whole world rather than in interiors built up against it.

I’ve had a similar reflection before when traveling: that you could leave home and go from taxi to train to airport to the other side of the world and emerge from a subway exit in New York or Tokyo twenty-some hours later, without having set foot in an open space.

And a similar feeling is portrayed early on in Zen and the Art of Motorcycle Maintenance about riding a motorcycle instead of walking:

You see things vacationing on a motorcycle in a way that is completely different from any other. In a car you’re always in a compartment, and because you’re used to it you don’t realize that through that car window everything you see is just more TV. You’re a passive observer and it is all moving by you boringly in a frame.

On a cycle the frame is gone. You’re completely in contact with it all. You’re in the scene, not just watching it anymore, and the sense of presence is overwhelming. That concrete whizzing by five inches below your foot is the real thing, the same stuff you walk on, it’s right there, so blurred you can’t focus on it, yet you can put your foot down and touch it anytime, and the whole thing, the whole experience, is never removed from immediate consciousness.

Diving into Swift compiler performance

It all started with reading This Week in Swift, and the article The best hardware to build with Swift is not what you might think, written by the LinkedIn team about how their Mac Pros are apparently slower at building Swift than any other Mac.

I’ve spent so much time waiting for Xcode to compile over the past years that I’ve often toyed with the idea of getting an iMac or even a Mac Pro for the maximum possible performance, so this caught my attention. I’ve also been wondering if, instead of throwing money at the problem, there might be some easy improvements to either reduce build time or improve Swift performance.

Looking at the reported issue I discovered a couple Swift compiler flags that were new to me: -driver-time-compilation and -Xfrontend -debug-time-compilation, which will show something like this:

                               Swift compilation
  Total Execution Time: 10.1296 seconds (10.6736 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   3.9556 ( 99.9%)   6.1701 (100.0%)  10.1257 (100.0%)  10.6697 (100.0%)  Type checking / Semantic analysis
   0.0013 (  0.0%)   0.0002 (  0.0%)   0.0015 (  0.0%)   0.0015 (  0.0%)  LLVM output
   0.0011 (  0.0%)   0.0001 (  0.0%)   0.0013 (  0.0%)   0.0013 (  0.0%)  SILGen
   0.0005 (  0.0%)   0.0001 (  0.0%)   0.0006 (  0.0%)   0.0006 (  0.0%)  IRGen
   0.0003 (  0.0%)   0.0001 (  0.0%)   0.0003 (  0.0%)   0.0003 (  0.0%)  LLVM optimization
   0.0001 (  0.0%)   0.0001 (  0.0%)   0.0002 (  0.0%)   0.0002 (  0.0%)  Parsing
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  SIL optimization
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Name binding
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  AST verification
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  SIL verification (pre-optimization)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  SIL verification (post-optimization)
   3.9589 (100.0%)   6.1707 (100.0%)  10.1296 (100.0%)  10.6736 (100.0%)  Total

I started looking into the results for the WordPress app, and it looks like almost every bottleneck is in the type checking stage and, as I’d find later, mostly in type inference. I’ve only tested this on Debug builds, as that’s where improvements would help development the most. For Release, I’d suspect the optimization stage would take a noticeable amount of time.

Time to look at what’s the slowest thing, and why. Disclaimer: the next shell commands might look extra complicated. I’ve used those tools for over a decade but never managed to fully learn all their power, so I’d jump from grep to awk to sed to cut and back, just because I know how to do it that way. I’m sure there’s a better way, but this got me the results I wanted so ¯\_(ツ)_/¯.

Before you run these, close Xcode, and really everything else if you can, so you get more reliable results.

Do a clean build with all the debug flags and save the log. That way you can query it later without having to do another build.

xcodebuild -destination 'platform=iOS Simulator,name=iPhone 7' \
  -sdk iphonesimulator -workspace WordPress.xcworkspace \
  -scheme WordPress -configuration Debug \
  clean build \
  OTHER_SWIFT_FLAGS="-driver-time-compilation \
    -Xfrontend -debug-time-function-bodies \
    -Xfrontend -debug-time-compilation" |
tee profile.log

Print the compiled files sorted by build time:

awk '/Driver Time Compilation/,/Total$/ { print }' profile.log |
  grep compile |
  cut -c 55- |
  sed -e 's/^ *//;s/ (.*%)  compile / /;s/ [^ ]*Bridging-Header.h$//' |
  sed -e "s|$(pwd)/||" |
  sort -rn |
  tee slowest.log

Show the top 10 slowest files:

head -10 slowest.log
2.9555 WordPress/Classes/Extensions/Math.swift
2.8760 WordPress/Classes/Utility/PushAuthenticationManager.swift
2.8751 WordPress/Classes/ViewRelated/Post/AztecPostViewController.swift
2.8748 WordPress/Classes/ViewRelated/People/InvitePersonViewController.swift
2.8741 WordPress/Classes/ViewRelated/System/PagedViewController.swift
2.8699 WordPress/Classes/ViewRelated/Views/WPRichText/WPTextAttachmentManager.swift
2.8680 WordPress/Classes/ViewRelated/Views/PaddedLabel.swift
2.8678 WordPress/Classes/ViewRelated/NUX/WPStyleGuide+NUX.swift
2.8666 WordPress/Classes/Networking/Remote Objects/RemoteSharingButton.swift
2.8162 Pods/Gridicons/Gridicons/Gridicons/GridiconsGenerated.swift

Almost 3 seconds on Math.swift? That doesn’t make any sense. Thanks to the -debug-time-function-bodies flag, I can look into profile.log and see it’s all the round function. To make this easier, and since it doesn’t depend on anything else in the app, I extracted that to a separate file. In this case, the -Xfrontend -debug-time-expression-type-checking flag helped identify the line where the compiler was spending all the time:

return self + sign * (half - (abs(self) + half) % divisor)

When you look at it, it seems pretty obvious that those are all Ints, right? But what’s obvious to humans might not be to a compiler. I tried another flag, -Xfrontend -debug-constraints, which resulted in a 53MB log file. Trying to make sense of it, it became apparent that abs was generic, so the compiler had to guess, and +, -, *, and % also had multiple candidates each, so the type checker seems to go through every combination, rating them, before picking a winner. There is some good information on how the type checker works in the Swift repo, but I still have to read that completely.

A simple change (adding as Int) turns the 3 seconds into milliseconds:

return self + sign * (half - (abs(self) as Int + half) % divisor)

I’ve kept going through the list and in many cases I still can’t figure out what is slow, but there were some quick wins there. After 4 simple changes, build time was reduced by 18 seconds, a 12% reduction.
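For the files where it isn’t obvious what is slow, ranking the per-function timings in profile.log is a reasonable next step. Here is a minimal sketch of that, not a polished tool. It assumes the lines printed by -debug-time-function-bodies start with a time in milliseconds (like 2953.1ms) followed by a tab, the source location, another tab, and the declaration. The slowest_functions name is just something I made up for this example.

```shell
# Hypothetical helper: list the slowest function bodies found in a build log.
# Assumption: timing lines look like "<time>ms<TAB><file>:<line>:<col><TAB><declaration>".
# Usage: slowest_functions <log file> [how many lines to show, default 10]
slowest_functions() {
  # Keep only the lines that start with a millisecond timing
  grep -E '^[0-9]+\.[0-9]+ms' "$1" |
    # Slowest first; sort -n reads the number at the start of each line
    sort -rn |
    # The same function can be reported more than once, so drop exact repeats
    uniq |
    head -n "${2:-10}"
}
```

Running slowest_functions profile.log 20 after the clean build should print the 20 slowest function bodies along with their file and line. If the timings from -debug-time-expression-type-checking follow the same shape (I believe they do, but check your own log), the same function would work for those too.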