Feb 10, 2023

Project update 33 of 39

Xous 0.9.12: FIDO2.1 Resident Keys, Kernel Improvements, USB Mass Storage and More!

by bunnie

I was thinking we’d be light on new features due to the holidays, but I was wrong! Our development community has rallied, and we’ve got a nice list of compelling improvements in this release.

`vault` Upgrade to FIDO2.1

The vault application has been upgraded from FIDO2.0 to FIDO2.1. The most significant user-facing improvement is support for "resident keys" on Precursor. In other words, if your SSH client is up to date, you can store SSH private keys on Precursor instead of leaving them lying around in your .ssh/ folder. This lets you "carry your keys" with you and log in from a less-trusted computer without having to install a copy of your private keys on that machine.

If you’re running some flavor of Linux, you can generate such an SSH key by plugging in your Precursor and running

ssh-keygen -t ed25519-sk -O resident

If you haven’t already set a PIN on your device, you need to do that first, using the following command:

fido2-token -S /dev/hidraw0

Replace /dev/hidraw0 with the path to your device (consult dmesg if you are unsure).

If you’re using OpenSSH, you’ll need a version later than 8.3 or so for this to work. Note that many distros are still pushing out versions older than that, so if your SSH complains about -O resident being an unrecognized option, that’s probably why.

In addition to supporting resident keys, the vault application migrates your existing authenticator database to a new schema. The purpose of the migration is to make future upgrades of our OpenSK port much easier.

The migration should be seamless; if you encounter issues, you can always downgrade to an older version of Xous and your original database will be unscathed. Please open an issue if you do encounter a migration problem.

Kernel Improvements

Xobs has been very hard at work improving the performance and stability of the Xous kernel. A heroic effort to port the Rust std tests turned up a huge number of bugs, which have been fixed along with some very subtle performance regressions. Here’s a partial list of the changes:

emulation: the ticktimer is now correctly able to handle delays of more than 49 days
emulation: timer0 is now correctly modeled, and the system timer works correctly
kernel: thread selection is now massively improved, and should be faster
kernel: thread selection now correctly selects and parks threads
kernel: the main loop is now simpler, though there is more room for improvement
kernel: execution now immediately transfers to spawned threads
kernel: fixed a bug where a thread immediately exited and the parent joined it
kernel: when returning a message twice, the error DoubleFree is now returned instead of ProcessNotFound
kernel: all scalar return calls are now unified
kernel: servers get to use their full scheduled quantum
libxous: syscalls now use asm! rather than external object files
ticktimer: the condvar implementation has been completely overhauled
ticktimer: implemented FreeMutex and FreeCondition api calls
ticktimer: only respond to RecalculateSleep when sent internally
ticktimer: use new .pop_first() function on BTreeHeap
ticktimer: fix a potential panic in the interrupt handler
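
To make two of the items above concrete, here is a minimal sketch, not the actual ticktimer code. It assumes (the changelog does not say so) that the 49-day limit came from a 32-bit millisecond counter, which wraps after 2^32 ms, or about 49.7 days. It also shows how the standard library's BTreeMap, which provides pop_first(), can serve as a sleep queue ordered by deadline. The names SleepQueue, sleep, wake_next, and u32_ms_wrap_days are hypothetical.

```rust
use std::collections::BTreeMap;

/// Milliseconds in one day.
const MS_PER_DAY: u64 = 24 * 60 * 60 * 1000;

/// Hypothetical sleep queue: wake-up deadline (ms) -> id of the sleeping thread.
/// A BTreeMap keeps its keys sorted, so the earliest deadline is always first.
pub struct SleepQueue {
    sleepers: BTreeMap<u64, usize>,
}

impl SleepQueue {
    pub fn new() -> Self {
        SleepQueue { sleepers: BTreeMap::new() }
    }

    /// Register `thread_id` to wake `delay_ms` after `now_ms`.
    /// Using u64 deadlines avoids the wrap a 32-bit counter would hit.
    /// (Sketch only: two sleepers with the same deadline would collide.)
    pub fn sleep(&mut self, now_ms: u64, delay_ms: u64, thread_id: usize) {
        self.sleepers.insert(now_ms + delay_ms, thread_id);
    }

    /// Remove and return the earliest sleeper, but only if its deadline has passed.
    pub fn wake_next(&mut self, now_ms: u64) -> Option<usize> {
        let (&deadline, _) = self.sleepers.first_key_value()?;
        if deadline > now_ms {
            return None;
        }
        self.sleepers.pop_first().map(|(_, id)| id)
    }
}

/// Whole days that fit in a 32-bit millisecond counter before it wraps.
pub fn u32_ms_wrap_days() -> u64 {
    (u32::MAX as u64 + 1) / MS_PER_DAY
}
```

In this sketch a thread that asks to sleep for 50 days simply gets a deadline larger than 2^32 ms, and it stays queued behind shorter sleeps until that deadline arrives.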

Chasing down bugs is a never-ending task, and Xobs is still hard at work moving the furniture to find the bugs that scattered when the lights were shone on them. As Xobs quipped in the dev-chat channel: "There are two hard problems in computer science: Cache invaAnd concurrency.lidation".

We also added a chapter to the Xous Book about performance, aided by cycle-accurate, system-level simulations of the hardware.

I was pleasantly surprised to discover that Verilog simulators are performant enough to boot Xous in a cycle-accurate simulation. This means I can dig through saved waveforms from a single run, and learn things such as cache and TLB miss rates, or the overhead of kernel to userspace transitions. These sorts of stats are usually very hard to come by because any on-line performance logger will incur some penalty for creating trace messages. It’s also hard to dig into things like the state of the instruction decoder, branch predictor, or L1/L2 cache interactions using code-based profiling techniques. However, with a system-level Verilog simulation (it even models the overhead of fetching data out of our SPINOR flash chip!), we can drill down into every flip flop and gate inside the system, and pinpoint code problems, bus contention, and/or logic bugs in the RTL.

Above: an example analysis of the simulated RTL in a waveform browser. The diagram shows just a few signals out of thousands: here we have some external SRAM signals, a couple ranges of the program counter (rendered as an "analog" waveform), and the state of the CPU’s ASID register. Steps in the program counter "waveform" height correlate to jumps between various functions, some of which are annotated.

I feel like this is one of the stronger arguments for open RTL CPUs — being able to straddle the boundary between hardware and software with a simulator like this is extremely powerful for identifying regressions and bugs. Of course, a lot of this can be done with an emulator; but simulating with the RTL captures not only the intended behavior, but also the unintended bugs baked into the chip.

I think it’s an interesting enough topic that I’ll probably write a blog post dedicated to this in the coming months. For now, I’m polishing the technique a bit so that it’s more accessible to users and less of a pile of Rube Goldberg scripts that break when you look at them funny.

USB Mass Storage (`transientdisk`)

@gsora has spearheaded an initiative to build a USB mass storage driver into Precursor, and it is starting to bear fruit. At the moment, an app he wrote called transientdisk is available in the source tree. It’s in early days, so for now, you’ll have to make a custom build to play with it. When activated, the app causes Precursor to enumerate a 1.44MiB USB mass storage device to a connected host. All the data is stored in RAM. As its name indicates, the data is transient — once you leave the app for any reason, the disk is de-allocated, and if you reboot all memory is zeroized.

The app is currently more of a proof-of-concept that the USB mass storage layer is sound and performant, but even in this state it’s useful for sneakernetting sensitive encryption keys between machines. You could do that right now with a USB stick, but then you’d have to grind up the stick to ensure the data is not recoverable; with transientdisk, you can easily inspect the code and see that erased means erased, and not simply shuffled to a list of blocks to eventually be erased as part of some opaque wear-levelling algorithm.
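
As a rough illustration of the idea, and not the transientdisk source, a RAM-backed disk that wipes itself when it goes away could look like the sketch below. The geometry (2880 sectors of 512 bytes, the classic "1.44" floppy layout) and the names RamDisk, read_sector, write_sector, and wipe are assumptions made for this example.

```rust
/// Assumed geometry: 2880 sectors of 512 bytes, the classic "1.44" floppy layout.
/// The real transientdisk geometry may differ.
pub const SECTOR_SIZE: usize = 512;
pub const SECTOR_COUNT: usize = 2880;

/// Hypothetical RAM-backed block device; nothing is ever written to flash.
pub struct RamDisk {
    data: Vec<u8>,
}

impl RamDisk {
    pub fn new() -> Self {
        RamDisk { data: vec![0u8; SECTOR_SIZE * SECTOR_COUNT] }
    }

    pub fn capacity_bytes(&self) -> usize {
        self.data.len()
    }

    /// Copy one sector out; returns None if `lba` is out of range.
    pub fn read_sector(&self, lba: usize) -> Option<[u8; SECTOR_SIZE]> {
        let start = lba.checked_mul(SECTOR_SIZE)?;
        let end = start.checked_add(SECTOR_SIZE)?;
        let slice = self.data.get(start..end)?;
        let mut out = [0u8; SECTOR_SIZE];
        out.copy_from_slice(slice);
        Some(out)
    }

    /// Overwrite one sector; returns None if `lba` is out of range.
    pub fn write_sector(&mut self, lba: usize, buf: &[u8; SECTOR_SIZE]) -> Option<()> {
        let start = lba.checked_mul(SECTOR_SIZE)?;
        let end = start.checked_add(SECTOR_SIZE)?;
        self.data.get_mut(start..end)?.copy_from_slice(buf);
        Some(())
    }

    /// Zero the backing memory in place. A production version would need
    /// volatile writes (or similar) so the compiler cannot elide the wipe.
    pub fn wipe(&mut self) {
        for byte in self.data.iter_mut() {
            *byte = 0;
        }
    }
}

impl Drop for RamDisk {
    /// Leaving the app drops the disk, which wipes it before the memory is freed.
    fn drop(&mut self) {
        self.wipe();
    }
}
```

Because the whole disk is a single buffer with no wear-levelling layer underneath, wiping it is one pass over that buffer, which is what makes the "erased means erased" property easy to audit.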

Now that the USB mass storage driver has been proven, we’re exploring using the mass storage interface to present virtual filesystems to allow for easier backup/restore of key data, and possibly even OS updates.

Efuse/BBRAM Backup Key Burning Available in Stable

I’ve had at least one documented success of an unbiased third party going through the key burning flow without incident, so the feature is now available in the stable release train.

It is still marked as Beta and has a stern warning about the risks, but at least you don’t have to jump through special hoops to try it out. If you are successful at executing the flow, I’d be interested in knowing. And of course, if it fails — please file a detailed bug report.

Improved `mtxcli`

Thanks to significant efforts by @tmarble, mtxcli has received major bug fixes and stability improvements! It now auto-syncs messages from servers more reliably, and it has filters so if you have a lot of active chats, the device is not overwhelmed by a deluge of chat logs.

Other Bug Fixes and Improvements

Many other fixes and features are in this release, including:

The Basis priority order is displayed in the status bar (resolves issue #269). The left-most basis is the default basis. When no secret bases are open, no notification is displayed (the .System basis is assumed).
Fixed issue #109, where PDDB can panic after a memory cache prune due to missing keys.
Reduced kernel code size by about 737kiB (10%) by restoring lto=fat and pushing FFT test code onto the tester. Note that any users who wish to write code that relies on built-in floating point transcendental functions will have to restore lto=thin, at least until https://github.com/rust-lang/rust/issues/105734 is resolved.
bip-utils dependency removed from Python packages. This allows backalyzer and precursorupdater to run on older platforms that don't have the latest-greatest Python. A hand-rolled BIP-39 word-to-bits converter is used instead.
More optimizations to vault passwords path. Records are re-used instead of re-allocated if they don't change. This should speed up switching to vault passwords by about 2x after the very first time the records are loaded (the first time will take longer because the records have to be built up).
Extended watchdog reset time to ~30s from 7s, enabling easier guru meditation reporting.
PDDB fixes to edge cases in key deletion.
- Keys that were supposed to be deleted were being re-fetched from the RAM cache. This made them reappear in the UX, and attempting to delete such a key again triggered a double-free error.
- Small pool key packing was incorrectly using an old key offset when repacking keys, leading to data corruption/loss after deletion events.
Fixed some edge cases in the I2C driver; multiple atomic R/W transactions are now grouped together to form "molecular" operations that cannot be interrupted.
Improved sleep/resume stability: servers that are too busy to respond to sleep can now abort the sleep request, instead of being shut down ungracefully. This results in a user dialog box that tells the user to try again. Such an abort happens on roughly 2% of sleep attempts, and is about 98% likely to be resolved by just trying to sleep again immediately.
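
The "molecular" I2C grouping mentioned in the list above can be pictured with the following sketch. It is not the Xous I2C driver: the types Txn, MockBus, and I2cDriver are hypothetical, and a std Mutex stands in for whatever mechanism the real driver uses to keep other callers off the bus. The point is only that the whole group runs while one lock is held, so no other transaction can land in the middle.

```rust
use std::sync::Mutex;

/// One atomic I2C transaction (hypothetical type for this sketch).
pub enum Txn {
    Write { reg: u8, value: u8 },
    Read { reg: u8 },
}

/// Stand-in for the bus hardware: 256 byte-wide registers.
pub struct MockBus {
    regs: [u8; 256],
}

/// Hypothetical driver that owns the bus behind a lock.
pub struct I2cDriver {
    bus: Mutex<MockBus>,
}

impl I2cDriver {
    pub fn new() -> Self {
        I2cDriver { bus: Mutex::new(MockBus { regs: [0; 256] }) }
    }

    /// Run a group of transactions with the bus lock held for the whole group.
    /// Returns the values produced by the reads, in order.
    pub fn molecular(&self, txns: &[Txn]) -> Vec<u8> {
        // The guard lives until the end of this function, so other callers
        // wait until every transaction in the group has completed.
        let mut bus = self.bus.lock().unwrap();
        let mut reads = Vec::new();
        for txn in txns {
            match *txn {
                Txn::Write { reg, value } => bus.regs[reg as usize] = value,
                Txn::Read { reg } => reads.push(bus.regs[reg as usize]),
            }
        }
        reads
    }
}
```

A typical use would be a write to a device's pointer register followed by a read of the selected value, where an interleaved transaction from another thread would otherwise change what the read returns.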

By the time you read this update, I’ll hopefully have shepherded a new batch of Precursors through our factory in South Korea. Either that, or I’ll be cursing vendors for having sent me substandard parts. For the past couple years, the pain in the supply chain has been real. While there has been some recovery in the supply chain, it has been uneven; some parts are readily available, but some are still extremely hard to get.

Of course, you can’t build a system until every single part is on the line, so a partial recovery is still a headache for small-scale producers like us. I have managed to acquire enough parts to do another build only by twisting some arms fairly hard. The risk of arm-twisting, especially when you’re a small player, is that vendors exact revenge by sending substandard or incorrect parts. They can afford to piss you off much more than you can afford to piss them off (cue the "You’re not Apple" refrain). This is why I’m traveling to the factory to oversee the build and personally check the parts for irregularities.

Unfortunately, simply waiting for parts to arrive on their "natural course" was not an option — without constant pushing we would have sold out months ago, with no hope for a resupply in over a year. Such a protracted lack of availability risks shaking confidence in the project, sapping precious momentum from our nascent developer community. I’d personally much rather take a measurable supply chain risk, than an immeasurable risk of eroding community goodwill.
