Last time on iftheshoefritz

As discussed last week, installing things on my new M1 machine was mostly straightforward. Unfortunately, installing older versions of Ruby (that I need for consulting) was not. This is a tale of fumbling in the dark with things that I know very little about.

Nonetheless, there were techniques I could apply.

It’s a long post, so a table of contents is in order:

TL;DR / Just give me the answer

If you’re just here for the answers to the “machine not recognised” error when building Ruby from source on an M1 machine, here’s the exact command I ended up running to make everything work:

☁  ~  export optflags="-Wno-error=implicit-function-declaration"; export LDFLAGS="-L/opt/homebrew/opt/libffi/lib"; export CPPFLAGS="-I/opt/homebrew/opt/libffi/include"; export PKG_CONFIG_PATH="/opt/homebrew/opt/libffi/lib/pkgconfig"; export CONFIGURE_OPTS="--build aarch64-apple-darwin20"; asdf install ruby 2.6.0

(your Ruby version and Ruby version manager may vary)

After this I had a segmentation fault when I ran Rails executables (rails c, rspec, etc), which I fixed by reinstalling Ruby-FFI with:

gem install ffi:1.14.2 -- --enable-libffi-alloc

(again, your version may vary)

And then bundle install again, and suddenly, everything works!

And now on to the main course!

Initial error: “machine not recognised”

The first version of Ruby that I tried to install myself through asdf was the start of a whole lot of trouble:

☁  ~  asdf install ruby 2.6.0
Installing ruby-2.6.0...

... SNIPPED...

BUILD FAILED (macOS 11.2.3 using ruby-build 20210420)

checking build system type... Invalid configuration `arm64-apple-darwin20.3.0': machine `arm64-apple' not recognized
configure: error: /bin/sh tool/config.sub arm64-apple-darwin20.3.0 failed
make: *** No targets specified and no makefile found.  Stop.

The long debugging process

What I’m about to discuss is not just a story of “how to install Ruby”, but rather a story of a debugging process. It may be that if I had just had the right version of something, I would never have run into this exact error. Hopefully though, the process gives something generalisable beyond my specific issue. In particular, I’ll try to keep the headings “issue agnostic”.

1. Take stock: what do I know about this context?

I know a little bit about Ruby installation failures, mostly because of discrepancies in OpenSSL versions between Homebrew and older versions of Ruby.

asdf vs rbenv vs ruby-build and building from source

I was trying to install Ruby using asdf, but I knew that asdf depended on ruby-build, which was also used by rbenv to install Ruby. Therefore help I found related to one would probably be helpful for the other.

Ruby-build downloads the Ruby source and compiles it on the machine, so information about the process of building Ruby from source would also be helpful.

Adding command line options to the installation command

As a result of broken Mac OpenSSL issues, I suspect a lot of Ruby developers on Macs share my experience of specifying weird command line options, like RUBY_CONFIGURE_OPTS=--with-openssl-dir=... in front of rbenv install. These seemingly magical incantations could make a broken install suddenly work.

Running commands in the install directory

In the process of learning that --with-openssl-dir trick, I also had the experience of looking at the errors to find the directory where the installation was building from source (the error would tell you where that was). Sometimes it was possible to fix an installation by CDing there and running ./configure or make install. It could also be useful to grep for more information about what had gone wrong.

Configure vs make while building from source

I know very little about how Ruby is installed, but I know that there are two steps in the process after the source is downloaded: “configure”, and “make”. At the start of this process I didn’t know much about the difference, but I knew there were different problems that affected each.

2. Read the error: what is the exact problem?

I spotted what seemed to be the most significant part of the error: Invalid configuration 'arm64-apple-darwin20.3.0': machine 'arm64-apple' not recognized. I was installing Ruby on a new CPU architecture (M1 / arm64-apple) so this error didn’t surprise me.

The line below also told me that there was a problem in the configure step: configure: error: /bin/sh tool/config.sub arm64-apple-darwin20.3.0 failed.

3. Shortcut: web search for the exact error

Early on in any debugging process, I just copy and paste the most significant part of the error into my browser search bar.

There’s pride in me that resists this, but if someone out there has had the exact problem that I have, there might be a simple thing I can do that will fix everything without further effort.

Good sources on the web

When I make my initial search, I’m looking for the sites that experience has taught me have good information.

Bug discussion for the relevant tools

There are places where programmers who know the tools well will visit when they have problems. I-can’t-do-my-job-without-this people, or even I-stand-a-chance-of-fixing-this people. For instance:

These sites have enormous lists of issues and unfamiliar (or poor) search tools, so I don’t search or scroll them directly; I use Google to find my way in.

Fixes take time, but until then, the discussion also involves workarounds, which are often good enough for me.

Ruby developer personal blog posts

Some Ruby developers will also write up posts about particularly difficult issues that they solved on a personal blog. These are often super quick “this was the error, here’s the answer” posts. A great resource if someone has had the inclination.

Stackoverflow

Failing that, stackoverflow.com, although in my experience its topics are more about how to use tools than about problems with the tools themselves.

Progress: good try, no cigar

This time, I found nothing related to my error. Some people had the error for certain gems with native dependencies. One blog post found the same problem with the system Ruby, but no useful solutions. Lots of other Ruby installation errors on M1, but not this one.

4. Virtual rubber ducking: raise an issue in github

Next up, I raised an issue on the github asdf-ruby project.

I’m installing this stuff on my own at odd hours of the day while my day job continues on my old machine. Thus the opportunities for help and rubber ducking are few.

The social pressure when raising an issue in an open source project is towards giving good information that will help someone else debug. I consider this a sort of virtual rubber ducking.

The start of this was to write out the exact error I was getting, and I noted the versions of Ruby I had tried to install (all with the same result).

Still no outside help

One thing I did not get from raising a github issue was any outside help. I got exactly one thumbs up on the issue I raised, and no comments. With hindsight this may be because I raised the issue in the wrong place: the problem was really with how certain versions of Ruby are built, and not with the asdf Ruby plugin.

Surprises are good: Ruby 3 installs perfectly

In trying to satisfy my expectations of what a good issue looks like, I tried several versions in different families (2.5.x-2.7.x, 3.0), and was shocked to find that Ruby 3 installed without any errors.

This is the exact result that I hope for while rubber ducking: the realisation that one of my assumptions (Ruby does not install on this machine) is too general. There’s something to learn from the exceptions.

Progress: hope and basis for comparison

I had gained hope that the problem was not only my machine. If Ruby 3 worked, maybe other versions of Ruby would work, too.

Further, I could now compare things between the working Ruby 3 build process and the broken Ruby 2.x (I later discovered that the latest versions of 2.x - released in late 2020 - also worked).

5. Isolate the error at a low level of abstraction

In the absence of outside help, I had to look for insight elsewhere.

Identify the pieces

One thing that bothers me is that I don’t know much about what asdf is doing when I run asdf install ruby .... There’s something it does related to switching between Ruby and other technologies, something about downloading source, and then there’s the actual building of Ruby.

Focus on the parts most likely to have an error

I knew that Ruby 3 worked and Ruby 2.x did not. It seemed unlikely that asdf was downloading or storing the different versions in different ways. It would be nice to skip everything except the part of the process with the actual error, and that error was most likely to be in the build step.

Smallest failing command: configure

Included in the asdf install ruby 2.6.0 output was the line:

Inspect or clean up the working tree at /var/folders/_m/8mld51pj7tn48mztkc9j3p3c0000gn/T/ruby-build.20210514080334.1535.zwqARp

In other words: go look at that folder, that’s where the installation was tried and there might be something useful there.
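
If the build output has been saved to a file, the path can be pulled out rather than copied by hand. This is a minimal sketch: the helper name working_tree_from_log and the throwaway log file are invented for illustration, and only the message text comes from the ruby-build output above.

```shell
# Hypothetical helper: print the path from ruby-build's
# "Inspect or clean up the working tree at <path>" line.
working_tree_from_log() {
  sed -n 's/^Inspect or clean up the working tree at //p' "$1" | tail -n 1
}

# Throwaway log file containing the line quoted in this post.
log_file=$(mktemp "${TMPDIR:-/tmp}/build-log.XXXXXX")
printf '%s\n' \
  'Installing ruby-2.6.0...' \
  'Inspect or clean up the working tree at /var/folders/_m/8mld51pj7tn48mztkc9j3p3c0000gn/T/ruby-build.20210514080334.1535.zwqARp' \
  > "$log_file"

build_dir=$(working_tree_from_log "$log_file")
echo "$build_dir"
rm -f "$log_file"
```

With a real log, cd "$build_dir" and look for the unpacked Ruby source inside it.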

I CDed into that directory and ran configure:

☁  ruby-2.6.0  ./configure
checking for ruby... /Users/frederickmeissner/.asdf/shims/ruby
tool/config.guess already exists
tool/config.sub already exists
checking build system type... Invalid configuration `arm64-apple-darwin20.3.0': machine `arm64-apple' not recognized
configure: error: /bin/sh tool/config.sub arm64-apple-darwin20.3.0 failed

The same error I had before, but I had a faster feedback loop now. No need to run asdf commands and wonder about what they do, no need to wait for files to download. I could just stay in this directory, try something, and run configure to see if it worked.
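
The configure error names the exact command that failed: /bin/sh tool/config.sub arm64-apple-darwin20.3.0. That suggests the loop could shrink further by running that one script directly. The sketch below shows the idea with a toy stand-in for config.sub (it is not the real script and only knows aarch64-*), so that it runs without a Ruby source tree. The helper name check_triplet is an assumption for illustration.

```shell
# Hypothetical helper: ask a config.sub-style script whether it accepts a
# triplet. Prints the canonical name on success, returns non-zero on failure.
check_triplet() {
  /bin/sh "$1" "$2"
}

# Toy stand-in for tool/config.sub (NOT the real script).
stub=$(mktemp "${TMPDIR:-/tmp}/config-sub.XXXXXX")
cat > "$stub" <<'EOF'
case "$1" in
  aarch64-*) echo "$1" ;;
  *) echo "toy config.sub: machine not recognized: $1" >&2; exit 1 ;;
esac
EOF

if check_triplet "$stub" arm64-apple-darwin20.3.0 2>/dev/null; then
  arm64_result=accepted
else
  arm64_result=rejected
fi
aarch64_result=$(check_triplet "$stub" aarch64-apple-darwin20.3.0)
echo "arm64: $arm64_result, aarch64: $aarch64_result"
rm -f "$stub"
```

Inside the real source directory the equivalent would be /bin/sh tool/config.sub arm64-apple-darwin20.3.0, exactly as the error message prints it.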

Progress: speed and confidence gained

At this point I had confirmed that the issue was occurring as part of the build step, and more specifically that configure was a problem. I now had a pretty good test case, and the error was consistent.

6. Broader web search with new knowledge

I now knew that the problem was not specific to my machine, so it was possible that it could occur in other settings. What would happen if I took the search, “Ruby install Invalid configuration machine arm64-apple not recognized”, and removed the text that was specific to M1 machines? I searched for, “Ruby install invalid configuration machine not recognized”.

It turns out that someone else had encountered this problem, for a pre-release version of Ruby 2.1 that they were building from source. In case you’re not keeping track of the history of every Ruby release, that was eight years ago (2013).

Luckily for me, they reported the problem on Ruby’s bug tracker, which keeps its issues for a long time.

The workaround they used to solve that problem was to copy a file, tool/config.sub, from elsewhere into the source folder where they were building Ruby.

Reviewing the build output:

configure: error: /bin/sh tool/config.sub arm64-apple-darwin20.3.0 failed

It mentioned this same file explicitly! There was definitely something here.

Progress: a possible solution

I found the same file in my own Ruby install directory. Obviously this was something that existed consistently in multiple versions of Ruby.

Maybe I could similarly find a replacement of that file that would work for the versions of Ruby that I wanted on my machine?

7. Test new theory

I’ve gone down many a rabbit-hole in my life where I chased an unreliable clue, assuming it to be true. After finding something that seems crucial, it’s necessary to prove that it is correct. Could I build Ruby from source using the config.sub file from a different version of Ruby?

Review the source

I scanned through config.sub and found that it mentioned many different manufacturers and architectures, some unfamiliar (when last did you encounter mips or pc532?) and some very familiar: x86 was all over the place.

config.sub did not have arm64 anywhere in it, however. This was solid evidence that the file was responsible for guessing things about my machine, but that it did not have what it needed to know about arm64-apple, just as the error I encountered indicated.
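
Scanning by eye works, but the same check can be scripted with grep. A minimal sketch, assuming a toy file in place of the real tool/config.sub (its three lines are illustrative only) and a made-up helper name mentions_arch:

```shell
# Hypothetical helper: say whether a file mentions an architecture name.
mentions_arch() {
  if grep -q -- "$2" "$1"; then echo yes; else echo no; fi
}

# Toy stand-in for tool/config.sub (contents are illustrative only).
sample=$(mktemp "${TMPDIR:-/tmp}/config-sub.XXXXXX")
printf '%s\n' 'x86_64-*)' 'mips-*)' 'aarch64-*)' > "$sample"

for arch in x86 mips arm64; do
  echo "$arch: $(mentions_arch "$sample" "$arch")"
done

has_x86=$(mentions_arch "$sample" x86)
has_arm64=$(mentions_arch "$sample" arm64)
rm -f "$sample"
```

Pointing mentions_arch at tool/config.sub in a real source tree gives a quick yes or no for each architecture name.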

Compare broken with known working equivalent

In step 4 I found that Ruby 3 installed correctly. What was the difference between config.sub in Ruby 3 and config.sub in Ruby 2.x?

☁  ruby-2.6.0  diff tool/config.sub ~/devtools/ruby/ruby-3.0.1/tool/config.sub | grep arm64
>       arm64-*)

Sure enough, the newer - working - version of config.sub mentioned arm64, and Ruby 2.6.0 did not!

This was enough to make me try copying config.sub from the Ruby 3 source and building with that:

☁  ruby-2.6.0  cp ~/devtools/ruby/ruby-3.0.1/tool/config.sub tool/config.sub
☁  ruby-2.6.0  ./configure
checking for ruby... /Users/frederickmeissner/.asdf/shims/ruby
tool/config.guess already exists
tool/config.sub already exists
checking build system type... aarch64-apple-darwin20.3.0
checking host system type... aarch64-apple-darwin20.3.0
checking target system type... aarch64-apple-darwin20.3.0
checking for clang... clang
LOTS MORE OUTPUT
GREAT SUCCESS

Previously configure broke instantly, now it succeeded!

Progress: initial error fixed! Now for the others?

This is fantastic news. Don’t be fooled, this result was spread out, bits at a time, over more than a week. Writing it down now made it feel easy to me, but this was hard won.

8. Attack the remaining errors with the fastest feedback loop

The absence of one error was progress, configure was working. Now I needed the rest of Ruby:

☁  ruby-2.6.0  make

Which didn’t work.

Apologies for leaving you in suspense here, but I’m going to skip a few iterations of:

  • run make
  • encounter an error
  • google and find the exact error in a github issue
  • run make with an additional option or environment variable (and repeat)

Because these were just “google and copy/paste”, I’ll just leave you with the github comment I found that eventually summarised everything (apart from my original machine not recognised problem):

https://github.com/rbenv/ruby-build/issues/1699#issuecomment-762122911

… which in my process left me with:

☁  export optflags="-Wno-error=implicit-function-declaration"; export LDFLAGS="-L/opt/homebrew/opt/libffi/lib"; export CPPFLAGS="-I/opt/homebrew/opt/libffi/include"; export PKG_CONFIG_PATH="/opt/homebrew/opt/libffi/lib/pkgconfig"
☁  asdf install ruby 2.6.0
# fails with machine not recognised problem
#
# cd to temp directory
# cp tool/config.sub from ruby3 source
☁  ./configure
☁  make
# GREAT SUCCESS

Progress: Ruby is built! Now how to use it?

So theoretically, Ruby is built on my system, and I can run Ruby code. But in order to use Ruby usefully, I need it on my path, and ideally not hanging around in an obscure temporary directory.

The easiest way to do that is to get it managed by asdf again. Up to this point, I have been messing around with copying config.sub, make and configure, which asdf knows nothing about.

9. Putting the abstractions back together

I’m now aiming for a repeatable, non-interactive combination of commands that will put Ruby in a place where I can switch easily between it and other versions using my version manager.

In other words, the commands that worked in steps 7 and 8, but using asdf install instead of changing into a directory, copying config.sub, running configure, make, adding something to the path, and so on.

Environment variables can be used by every layer

The environment variables (all the export calls) stick around beyond the lifetime of a single call. If I run them before one command (asdf) and that calls other commands (configure, make, etc.), the variables will apply to all of them.

Therefore, even though I don’t call configure explicitly, the following will still affect configure when asdf calls it:

export optflags="-Wno-error=implicit-function-declaration"
export LDFLAGS="-L/opt/homebrew/opt/libffi/lib"
export CPPFLAGS="-I/opt/homebrew/opt/libffi/include"
export PKG_CONFIG_PATH="/opt/homebrew/opt/libffi/lib/pkgconfig"
asdf install ruby 2.6.0 # calls configure and make, all the ENV vars still apply
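
The inheritance itself is easy to check without asdf. In this small sketch the variable names DEMO_OPTFLAGS and DEMO_LOCAL are made up for the demonstration, and the nested sh -c calls stand in for asdf calling ruby-build calling configure:

```shell
# An exported variable is inherited by child processes and by their children.
export DEMO_OPTFLAGS="-Wno-error=implicit-function-declaration"
grandchild_sees=$(sh -c 'sh -c "printenv DEMO_OPTFLAGS"')
echo "grandchild sees: $grandchild_sees"

# A plain (unexported) shell variable is NOT passed on to child processes.
DEMO_LOCAL="not exported"
child_sees_local=$(sh -c 'printenv DEMO_LOCAL' || true)
echo "child sees local: '${child_sees_local}'"
```

This is why the export lines matter: without export, the values would stay in the interactive shell and never reach configure.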

Copying config attempt one: passing a new file location variable

In addition to the config above, I need to make sure the right version of config.sub is in the right place.

Can I tell asdf to tell configure where to find the correct config.sub?

It turns out there is a variable that points to the location of this file, found by fellow thoughtbotter Mike Burns:

☁  ruby-2.6.0  grep config.sub configure
ac_config_sub="$SHELL $ac_aux_dir/config.sub"  # Please don't use this var.

If I could work out how to influence $ac_aux_dir, I’d be able to point it to a different tool directory than the one included in the source.

The please don't use this var made me nervous though, and there is a lot of other stuff in the tool directory that I don’t want to mess with.

Copying config attempt two: patch the file

Substantial googling led me to the possibility of creating a patchfile that would be passed to asdf to apply, after it downloaded the source, like this:

RUBY_APPLY_PATCHES=patchfile asdf install ruby 2.6.0

Patches are unfamiliar territory for me, so I backed out of this one after trying for a few hours to create a patch that could turn any given config.sub file into one that would have the right arm64-apple content.
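
For reference, a patch file is just the saved output of diff -u between an old and a new copy of a file. The sketch below shows the mechanics with two toy files; the old/ and new/ directory names and the file contents are assumptions for illustration, and whether the result would apply cleanly through RUBY_APPLY_PATCHES is untested here.

```shell
# Two toy source trees standing in for the Ruby 2.6.0 and 3.0.1 sources.
work=$(mktemp -d "${TMPDIR:-/tmp}/patch-demo.XXXXXX")
mkdir -p "$work/old/tool" "$work/new/tool"
printf '%s\n' 'x86_64-*)' 'aarch64-*)' > "$work/old/tool/config.sub"
printf '%s\n' 'x86_64-*)' 'aarch64-*)' 'arm64-*)' > "$work/new/tool/config.sub"

# diff exits with status 1 when the files differ, which is expected here.
(cd "$work" && diff -u old/tool/config.sub new/tool/config.sub > config-sub.patch) || true

# Count the added lines, ignoring the "+++" header line.
added_lines=$(grep -c '^+[^+]' "$work/config-sub.patch")
cat "$work/config-sub.patch"
rm -rf "$work"
```

A diff made this way describes the change from one specific old file, so it only applies cleanly to files close to that version. That is a large part of why one patch for any given config.sub is hard to produce.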

Progress: still stuck

I’ve solved all the errors, but I still don’t have a way to avoid copying config.sub half way through Ruby installation.

10. Turn the problem into an easier problem

Interfering with the source that asdf downloads is difficult! Time to think laterally: what if I can tell the build process to ignore the result of config.sub entirely?

Read the manual

I went deeper into what config.sub actually is, and discovered GNU Autoconf. It turns out that years ago, building software was about as difficult as writing software. Everyone had their own hacky way of compiling their project, and this changed for every new machine on which it was compiled (and run; not necessarily the same machine).

GNU Autoconf was a standard that emerged for this, so that every project didn’t have to reinvent the wheel. GNU Autoconf is responsible for config.sub. Our mystery file is part of a process called canonicalising: guessing the type of computer where the software will compile and run.

Override the canonical build alias

It turns out this process can be overridden with an argument to configure:

./configure --build aarch64-apple-darwin20
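
The value passed to --build does not have to be typed by hand. As a sketch, assuming the same shape as the value above (arm64 spelled as aarch64, and only the Darwin major version kept), it could be derived from the kind of strings that uname -m and uname -r print. The helper name darwin_build_triplet is hypothetical.

```shell
# Hypothetical helper: build a --build value from `uname -m` / `uname -r`
# style inputs, spelling arm64 as aarch64 and keeping the major version only.
darwin_build_triplet() {
  machine=$1
  release=$2
  case "$machine" in
    arm64) machine=aarch64 ;;
  esac
  printf '%s-apple-darwin%s\n' "$machine" "${release%%.*}"
}

# Inputs taken from the error message earlier in this post.
triplet=$(darwin_build_triplet arm64 20.3.0)
echo "./configure --build $triplet"
```

On the machine itself the call would be darwin_build_triplet "$(uname -m)" "$(uname -r)".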

Progress: inches away

I’m still CDing into a weird temporary directory half way through an install, but now I don’t need to copy a file from somewhere else into the source. What I need is to get asdf to tell configure about --build.

11. Turn low level config into high level config

Google was not forthcoming about how to pass arguments from asdf to configure, but asdf uses ruby-build, and ruby-build supports passing configure arguments as environment variables:

CONFIGURE_OPTS="--build aarch64-apple-darwin20" ruby-build ...

Since environment variables stay around, I can set the variable before calling asdf, and this option will still be in effect when asdf calls ruby-build.
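
The two ways of setting a variable behave differently, which matters when copying commands between examples. A small sketch, using a made-up variable name DEMO_CONFIGURE_OPTS and sh -c as a stand-in for ruby-build:

```shell
# One-shot form: the variable exists only for that single command
# (and anything that command starts).
one_shot=$(DEMO_CONFIGURE_OPTS="--build aarch64-apple-darwin20" sh -c 'printenv DEMO_CONFIGURE_OPTS')
after_one_shot=${DEMO_CONFIGURE_OPTS:-unset}

# Exported form: stays set for every later command in this shell session.
export DEMO_CONFIGURE_OPTS="--build aarch64-apple-darwin20"
exported=$(sh -c 'printenv DEMO_CONFIGURE_OPTS')

echo "one-shot child saw: $one_shot"
echo "afterwards in this shell: $after_one_shot"
echo "child after export saw: $exported"
```

Either form reaches the child process; the exported one also lingers for later installs in the same terminal, which may or may not be wanted.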

Putting it all together

I now had this command:

☁  ~  export optflags="-Wno-error=implicit-function-declaration"; export LDFLAGS="-L/opt/homebrew/opt/libffi/lib"; export CPPFLAGS="-I/opt/homebrew/opt/libffi/include"; export PKG_CONFIG_PATH="/opt/homebrew/opt/libffi/lib/pkgconfig"; export CONFIGURE_OPTS="--build aarch64-apple-darwin20"; asdf install ruby 2.6.0
Downloading ruby-2.6.0.tar.bz2...
-> https://cache.ruby-lang.org/pub/ruby/2.6/ruby-2.6.0.tar.bz2
Installing ruby-2.6.0...
ruby-build: using readline from homebrew
Installed ruby-2.6.0 to /Users/frederickmeissner/.asdf/installs/ruby/2.6.0

Progress: I AM INVINCIBLE!!!!!

This felt like a high point in my career. I announced to everyone in my team that I had finally solved the problem of installing Ruby, and I could now use my shiny new work laptop for, well, work.

No longer the guy who can’t even get his laptop to work! I was a tenacious master debugger who could solve things that no amount of googling could solve!

Old computer science joke

At university they told us that there’s an old computer science joke: “My code compiles, so it must work”.

A few days later I tried to run a Rails project for some real work:

☁  ~  bundle install
... lots of output

☁  ~  rails c
... BAD STUFF

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_ACCESS (SIGABRT)

... 

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib        	0x0000000189d20cec __pthread_kill + 8
1   libsystem_pthread.dylib       	0x0000000189d51c24 pthread_kill + 292
2   libsystem_c.dylib             	0x0000000189c99864 abort + 104
# A BILLION MORE LINES

I can’t think of a time I’ve ever had Ruby actually crash like this.

Progress: WHY DO I EVEN TRY???

So I was the butt of an old Computer Science joke. All of the above, and now this!?!?!?

12. Search for the things I understand

At this point I was back where I started: with an error I did not understand. While questioning how many more times I could handle this, I searched for some things that I knew were part of this problem: “Ruby crash m1 segmentation fault”

Useful result part one: other people have this problem

I found this issue in the Ruby FFI project that looked promising. FFI allows native C functions to be called from Ruby, and it is included in numerous popular Ruby gems.

In that issue, multiple people had encountered my exact problem, which was a far cry from the first error I was battling, which seemingly no-one in the world had ever encountered.

Unfortunately, the issue included commentary that the issue was solved as of ffi version 1.14.0, and I was on 1.14.2, still experiencing the problem.

There’s nothing quite like meeting people who had the same painful experience as you, and who tell you that the problem doesn’t exist anymore.

The very useful aspect of this issue, though, was that someone had created a 4 line Ruby example with which I could reproduce the problem without starting up all of Rails. Fast feedback loops ftw!

Useful result part two

With the fast feedback loop of 4 lines of Ruby that I could invoke to test whether the problem was solved, I could afford to try lots of possibilities, even if I didn’t understand them.

The answer I needed was in the Ruby-FFI issues, where someone used a command line switch --enable-libffi-alloc. It didn’t tell me how to use it, but it didn’t take me too many iterations to find how to install FFI and with it, fix the problem:

gem install ffi:1.14.2 -- --enable-libffi-alloc

After reinstalling FFI, I could reinstall the bundled gems and now, finally, my project worked.

Progress: 100%

This was an arduous journey but very satisfying in that, “there was probably an easier way but hot damn it feels good to have solved it”, kind of way.

Conclusion

I tried to organise this post such that the table of contents above gives some steps that will be applicable to many situations:

  1. Take stock: what do I know about this context?
  2. Read the error: what is the exact problem?
  3. Shortcut: web search for the exact error
  4. Virtual rubber ducking: raise an issue in github
  5. Isolate the error at a low level of abstraction
  6. Broader web search with new knowledge
  7. Test new theory
  8. Attack the remaining errors with the fastest feedback loop
  9. Putting the abstractions back together
  10. Turn the problem into an easier problem
  11. Turn low level config into high level config
  12. Search for the things I understand

Looking at this process you can see how debugging operated in a cyclical fashion: use one technique until it stops yielding results, try something else, then return to the first technique armed with new information. I wish I could say there was more sophistication, but a large part of this was “learning what to Google”.

If there were humans around at the odd hours I was trying to fix this, it would also have been, “learning what to ask my colleagues”.

Finally, although the information presented here has been organised over and over, there were definitely times when it felt like it might never end. I hope the length of the article at least conveys some amount of how debugging is often a struggle right up until the moment of success.

Knowing that all programming involves debugging: good luck with your own debugging, may it all be easier than this!