MNT Reform

by MNT Research GmbH

The open source DIY laptop for hacking, customization, and privacy

Jun 14, 2020

Graphics Improvements - 4K HDMI, and Proper KiCAD Support

by Lukas H

4K HDMI 2.0a Video Output

The NXP i.MX8MQ system-on-chip used in Reform has an external display output that supports HDMI 2.0a (if you can tolerate a piece of binary firmware loaded into the HDMI controller). The maximum resolution is 4096 x 2160 (4K) at 60 Hz. While we still lack a display capable of this resolution in the lab, I connected Reform to the Ultra HD TV at home to at least validate that it could output to the 3840 x 2160 @ 30 Hz maximum of that TV, and it was no problem at all:

MNT Reform connected to 4K TV

We’ll obtain a display that can accept 4K @ 60 Hz over HDMI and report back. Needless to say, this resolution is great for working with lots of text or terminals on a big screen. The Hantro H.264 hardware decoder (now supported by the Linux kernel) should be able to decode 4K video in real time, but we still have to validate this. While the built-in GC7000L GPU has significantly more work to do to render to 4K compared to 1080p, it is possible to clock it up to 1 GHz to squeeze out some more performance.
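To put "significantly more work" into numbers, here is a small sketch that counts pixels per frame and per second for the modes mentioned above. It only counts pixels and says nothing about actual GPU cost, and the 60 Hz refresh rate for the 1080p baseline is an assumption made for the comparison.

```python
# Pixel counts for the display modes discussed in this update.
# The 1080p refresh rate is an assumed baseline, not a measured configuration.
MODES = {
    "1080p @ 60 Hz": (1920, 1080, 60),
    "UHD @ 30 Hz": (3840, 2160, 30),
    "4K @ 60 Hz": (4096, 2160, 60),
}


def pixels_per_frame(width, height):
    """Number of pixels the GPU has to fill for one frame."""
    return width * height


def pixels_per_second(width, height, refresh_hz):
    """Number of pixels scanned out per second at the given refresh rate."""
    return width * height * refresh_hz


if __name__ == "__main__":
    base = pixels_per_frame(1920, 1080)
    for name, (w, h, hz) in MODES.items():
        ratio = pixels_per_frame(w, h) / base
        print(f"{name}: {pixels_per_frame(w, h):,} px/frame "
              f"({ratio:.2f}x 1080p), {pixels_per_second(w, h, hz):,} px/s")
```

A 3840 x 2160 frame has exactly four times the pixels of a 1920 x 1080 frame, which is why a higher GPU clock is welcome.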

How to Fix Show-Stopper Graphics Bugs on Reform

For at least the last half year I’ve been unhappy about not being able to use KiCAD on Reform. KiCAD is the free and open source electronics design program that I used to create the circuit boards that make up Reform. While KiCAD compiles and runs on Reform, it wasn’t possible to enable the "Accelerated Graphics" mode that leverages the GPU for schematics and PCB rendering. With software-based rendering, KiCAD performs very slowly on Reform/ARM64 systems, especially for complex boards, so that wasn’t an option for me.

There were three roadblocks in the way to running KiCAD on Reform, two of which affected some other applications/games as well:

1. Number of GL Framebuffer Attachments

KiCAD implements an "overlay", which is a third OpenGL framebuffer (in addition to background and foreground graphics) for interactive operations like drawing new traces or artwork on top of existing, cached graphics. The etnaviv open source driver for GC7000L doesn’t support this feature, so I came up with a workaround patch for KiCAD that doesn’t use an overlay but renders to the foreground framebuffer instead. This cleared the way to being able to turn Accelerated Graphics mode on.

2. Early-Z Bug

GC7000L has a new architecture, internally called "HALTI 5", which introduced some differences from the older generations supported by the etnaviv drivers. As a result, some GPU features behave differently than expected or are activated by unknown registers or bit positions that have to be reverse engineered again. One such feature is disabling "Early-Z Reject".

To figure out the correct 3D order of the pixels the GPU has to paint every frame, it uses a so-called Depth Buffer to record the Z (depth) position in 3D space of every pixel it has rendered so far. When another pixel is scheduled to be painted on top, its Z position is first compared to the value that is already in the Depth Buffer at that X/Y coordinate. If there is already something that is logically in front of what we want to paint, we don’t paint over it. This way, it doesn’t matter in which order the objects (triangles) are painted. The Depth Buffer will make sure that pixels closer to the camera obscure pixels that are further away.
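The depth test described above can be sketched in a few lines of Python. This is a toy software model, not how the GPU or etnaviv implements it; the `render` function and the convention that a smaller Z means "closer to the camera" are assumptions made for illustration.

```python
import math


def render(fragments, width, height):
    """Paint fragments into a color buffer using a depth test.

    Each fragment is (x, y, z, color). Assumption: smaller z is closer.
    """
    depth = [[math.inf] * width for _ in range(height)]
    color = [[None] * width for _ in range(height)]
    for x, y, z, c in fragments:
        if z < depth[y][x]:  # nothing closer has been painted here yet
            depth[y][x] = z
            color[y][x] = c
    return color


near = (0, 0, 1.0, "red")
far = (0, 0, 5.0, "blue")

# The closer pixel wins regardless of the order in which objects are painted.
assert render([near, far], 1, 1) == render([far, near], 1, 1) == [["red"]]
```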

There is an optimization in modern GPUs called "Early-Z Reject" that sorts out the Depth Buffer before running all the expensive shaders that determine the actual texture and color of the pixels. Triangles that are determined to be fully obscured can be skipped altogether, saving rendering time.

A problem with this approach appears when using a shader function called "discard". The discard (sometimes called TEXKILL) instruction can be used to poke transparent holes into the currently drawn texture/triangle, so that the background would shine through instead. But this can only work if objects behind the current object have been painted, or something wrong will show up instead. The GPU drivers have to detect that "discard" is being used and disable the Early-Z optimization for the scene. In the case of etnaviv driving the GC7000L, this did not work.
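Here is a simplified single-pixel model of one way this interaction can go wrong: if the depth value is written before the shader runs, a fragment that is later discarded still blocks everything behind it. The function and its `early_z` switch are hypothetical, and the real hardware behaviour on GC7000L may differ in detail.

```python
import math

CLEAR = "clear"  # color of a pixel that nothing was painted on


def render_pixel(fragments, early_z):
    """Render the fragments covering one pixel, in submission order.

    Each fragment is (z, color, discarded); smaller z is closer (assumption).
    `discarded` stands in for a shader that executes `discard` for this pixel.
    """
    depth, color = math.inf, CLEAR
    for z, c, discarded in fragments:
        if early_z:
            # Depth test and depth write happen before the shader runs.
            if z >= depth:
                continue
            depth = z
            if discarded:
                continue  # hole punched, but the depth was already written
            color = c
        else:
            # The shader runs first, so a discarded fragment never touches depth.
            if discarded or z >= depth:
                continue
            depth, color = z, c
    return color


# A near fragment with a transparent hole is drawn before the far background.
scene = [(1.0, "trace", True), (5.0, "board", False)]
assert render_pixel(scene, early_z=False) == "board"  # background shines through
assert render_pixel(scene, early_z=True) == CLEAR     # background wrongly rejected
```

With the optimization disabled, the hole shows the object behind it. With it enabled, the pixel keeps whatever was there before.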

This affected the rendering of KiCAD’s traces, text and zones, which is best explained with a picture:

Screenshot of KiCAD showing Depth Bug

After much frustration, I decided to dig into Mesa’s etnaviv driver source code to see if I could figure out how to disable Early-Z rejection in GC7000L myself. Christian Gmeiner and Marek Vasut, both etnaviv contributors, helped me - each providing puzzle pieces of the toolset required for reverse engineering the GPU. In the end, I was able to find the GPU bits that need to be toggled to disable Early-Z rejection and fix all KiCAD rendering problems.

Here’s a quick walkthrough for anyone wanting to do more etnaviv (very welcome!) reverse engineering.

The main strategy of figuring out the correct way of doing things with Vivante GPUs is to watch what the proprietary blob, "GALcore", would do, and compare that to the operations etnaviv does. The difference between these behaviours often contains the key to unknown bits in registers and their meanings.

First, try to isolate the behaviour that you want to analyze and boil it down to a minimal test case. I did this when originally reporting the bug. You can find the test case sources on GitHub.

To obtain the command stream trace of my test case from the blob, I did the following:

  • Get an NXP ("vendor") Linux distribution with the blob installed. In my case, Boundary Devices provides a vendor distribution of Debian 9.5 with a 4.9 kernel for download.
  • Boot this image on a development board or on MNT Reform and execute modprobe vivante to insert the proprietary kernel module.
  • Marek helped me to run my test case "headless" by following the code patterns of kmscube. This helps with executing the test remotely without a full desktop running.
  • Clone libvivhook on the development board: "This library hooks into the galcore driver to provide logging functionality in userspace".
  • Follow the instructions of libvivhook to build the correct version for the ABI (application binary interface) matching the proprietary kernel driver.
  • In my case, the version is listed on Boundary's website as "Vivante 6.2.4p1.8". This version's headers were not included in the "galcore_headers" repository referenced by libvivhook, so I had to generate my own include folder imx_v6.2.4p1.8 by copying the header files from /usr/include/vivante plus gc_abi.h from the etnaviv project (it's in all the other imx_... folders). I could then export GCABI=imx_v6.2.4p1.8 and was good to build the library viv_interpose.so.
  • Launch the test case and inject the libvivhook interposer: LD_PRELOAD=/path/to/viv_interpose.so ETNAVIV_FDR=/tmp/trace.fdr ./test_case
  • The resulting file /tmp/trace.fdr contains the command stream trace in binary form. Copy this over to your workstation.
  • Clone the etna_viv tools repository on your workstation.
  • Decode the trace into a text file: ./tools/dump_cmdstream.py trace.fdr ./data/gcs_hal_interface_imx_v6.2.4.p1.8.json >blob_dump.txt
  • In my case, there was no matching JSON file for the GCABI, so I had to build one. This is a somewhat obscure process. You have to build a dummy binary called gcabi on the target device that references the GALcore API. This binary then contains all the debug information needed by the tool ./tools/build_json.sh in the etna_viv repository. Calling ./build_json.sh gcabi imx_v6.2.4p1.8 created the JSON files for me.
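The launch and decode steps from the list above can be wrapped in a small helper, for example to script repeated runs. This is only a sketch: the function names are hypothetical, and the paths and the two environment variables are copied from the commands quoted in the list.

```python
import os
import shlex


def blob_trace_command(test_case, interposer, trace_path, base_env=None):
    """Return (argv, env) for running a test case under the libvivhook interposer."""
    env = dict(os.environ if base_env is None else base_env)
    env["LD_PRELOAD"] = interposer    # inject viv_interpose.so
    env["ETNAVIV_FDR"] = trace_path   # where the binary trace is written
    return [test_case], env


def decode_command(trace_file, abi_json, dump_tool="./tools/dump_cmdstream.py"):
    """Return the argv that decodes a binary trace with the etna_viv tool."""
    return [dump_tool, trace_file, abi_json]


if __name__ == "__main__":
    argv, env = blob_trace_command(
        "./test_case", "/path/to/viv_interpose.so", "/tmp/trace.fdr")
    print("LD_PRELOAD=" + shlex.quote(env["LD_PRELOAD"]),
          "ETNAVIV_FDR=" + shlex.quote(env["ETNAVIV_FDR"]),
          shlex.join(argv))
    # On the target device:
    #     subprocess.run(argv, env=env, check=True)
    # On the workstation, with stdout redirected into blob_dump.txt:
    #     subprocess.run(decode_command("trace.fdr", abi_json), stdout=out, check=True)
```

Returning the argument list and environment instead of running the command directly keeps the helper usable both on the target device and for dry runs on a workstation.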

Quite a lot of work to set everything up, but once you have it, you can start feeding test cases to GALcore and analyze them.

With a similar, but slightly less complicated process, you can trace the command streams of etnaviv (the open source driver):

  • Build the dump branch of mesa by Christian Gmeiner with the meson option -Dtools=etnaviv and install it on your etnaviv-powered system, after first changing the hardcoded dump path to a folder of your choice.
  • Build your test case and launch it while preloading the dumper library: LD_PRELOAD=/path/to/libetnaviv_dump_cmdstream.so ./test_case
  • This will result in a lot of files in the folder you specified in the first step. The interesting ones end with _cmdstream and can be converted to text form like this (with another tool from the etna_viv repository): ./tools/dump_separate_cmdbuf.py -b submit_00000003_cmdstream >decoded_cmdstream.txt
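Since a single run produces many dump files, it can help to batch the decoding step. The sketch below collects every file whose name ends with _cmdstream and builds one decode command per file. The helper names and the ".txt" output naming are my own assumptions; only the tool path and the -b flag come from the command quoted above.

```python
from pathlib import Path


def cmdstream_files(dump_dir):
    """Return the dump files whose names end with `_cmdstream`, sorted by name."""
    return sorted(p for p in Path(dump_dir).iterdir()
                  if p.is_file() and p.name.endswith("_cmdstream"))


def decode_commands(dump_dir, tool="./tools/dump_separate_cmdbuf.py"):
    """Yield (argv, output_path) pairs, one per command stream file."""
    for path in cmdstream_files(dump_dir):
        # Hypothetical naming scheme: decoded text goes next to the dump file.
        yield [tool, "-b", str(path)], path.with_name(path.name + ".txt")


# Usage, run from the etna_viv checkout:
#     import subprocess
#     for argv, out_path in decode_commands("/path/to/dumps"):
#         with open(out_path, "w") as out:
#             subprocess.run(argv, stdout=out, check=True)
```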

Armed with these text files, you can compare the commands and values that etnaviv sends to the GPU versus the ones that the GALcore blob sends. My breakthrough, however, came after comparing two traces from the blob: one with depth testing enabled and one with depth testing disabled. I noticed that the blob would toggle not one, but three bits to disable Early-Z rejection, spread across two registers. One function had to be turned off while another function had to be turned on. My work-in-progress patch shows which ones.
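Once two traces have been reduced to the final value written to each register, finding the toggled bits is a matter of XOR-ing the values. The sketch below assumes that reduction has already been done by hand; the register names and values are made up purely to mirror the shape of the finding (three bits across two registers) and are not the real GC7000L registers.

```python
def changed_bits(before, after):
    """Map register name -> list of bit positions whose value differs.

    A register missing from one trace is treated as 0 (an assumption).
    """
    diff = {}
    for reg in sorted(set(before) | set(after)):
        delta = before.get(reg, 0) ^ after.get(reg, 0)
        bits = [bit for bit in range(delta.bit_length()) if (delta >> bit) & 1]
        if bits:
            diff[reg] = bits
    return diff


# Made-up example data, not taken from real traces.
depth_test_on = {"REG_A": 0x00000000, "REG_B": 0x00010000, "REG_C": 0x1234}
depth_test_off = {"REG_A": 0x00000011, "REG_B": 0x00000000, "REG_C": 0x1234}

assert changed_bits(depth_test_on, depth_test_off) == {"REG_A": [0, 4], "REG_B": [16]}
```

Unchanged registers drop out of the result, which keeps the output small even for long command streams.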

There is still no automatic way in etnaviv to recognize "discard" in shaders and turn off Early-Z rejection in response, but it can be set with an environment variable, ETNA_MESA_DEBUG. Normally, I set this to nir to enable the NIR shader compiler, but now I set it to nir,no_early_z to disable Early-Z rejection as well. I’ll continue work to make this switch happen automatically.
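For launching applications from a script, the comma-separated variable can be assembled without clobbering flags that are already set. This is a minimal sketch with a hypothetical helper name; note that the no_early_z flag is only understood by a Mesa build that includes the work-in-progress patch described above.

```python
import os


def add_debug_flags(env, *flags, var="ETNA_MESA_DEBUG"):
    """Return a copy of env with flags appended to a comma-separated variable."""
    current = [f for f in env.get(var, "").split(",") if f]
    for flag in flags:
        if flag not in current:  # keep existing flags, avoid duplicates
            current.append(flag)
    result = dict(env)
    result[var] = ",".join(current)
    return result


if __name__ == "__main__":
    env = add_debug_flags(os.environ, "nir", "no_early_z")
    print("ETNA_MESA_DEBUG=" + env["ETNA_MESA_DEBUG"])
    # To launch a program with it (the binary name here is an assumption):
    #     subprocess.run(["kicad"], env=env)
```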

For now, this patch completely fixes KiCAD accelerated rendering on MNT Reform/i.MX8MQ:

Screenshot of KiCAD on MNT Reform, fixed

As a bonus, this fixes some games and emulators as well, including (with a tiny shader patch) the rendering of plants and transparent objects in one of my favorites, Minetest, an open source voxel game engine with some fantastic mods such as a multiplayer Minecraft-like world.

Screenshot of Minetest on MNT Reform, fixed

3. Xwayland Compositing Synchronization Bug

There was one more rendering issue plaguing the toolbars in KiCAD and GUI elements of applications using legacy X11 toolkits through Xwayland (the X compatibility layer for Wayland compositors). This resulted in elements sometimes not being fully drawn and flickering in and out of existence. After some discussion with Daniel Stone, I decided to hunt for a patch for this problem as well. After a few days, I was lucky: placing a glFinish() call at the end of the glamor_composite_clipped_region() function in the X server code mitigates the problem in almost all cases. This also fixes the pre-GTK3 version of the GIMP running on Xwayland. GTK3, Qt and SDL applications are immune because they bring their own rendering and don’t rely on the X server’s drawing functions.

Hardware and Software, Together

In my opinion, shipping hardware is also about shipping working software. That’s why I try to catch as many problems in important applications as I can before shipping Reform. I also wanted to detail my approach to fixing certain problems directly on the system, because this has been a great and rewarding learning experience for me, and it can be for you, too. This can be intimidating and frustrating at first, but with every subsystem that you manage to take apart and solve a problem in, you gain a more intimate understanding of the hardware and software you rely on every day. And you can learn valuable programming and engineering skills on the way. This is a big part of what the MNT Reform project is all about.

About the Author

Lukas H

 Berlin, Germany

