Learning Zig and Zig Build by porting Piper's CMakeLists.txt

Cross Compiling Woes

I still use Rust a lot. Occasionally I like to build command line utility tools at my job. As much as I love the language features that Rust has, I usually write these tools in Go or Python. Why? Cross compiling. Most of my coworkers are on macOS, some are on Windows, and our CI build system is all Linux.

Cross compiling Rust from Linux to Windows isn't impossible, but cross compiling to macOS just about is: you need to package the gigantic Xcode SDK, run some scripts on it, and there's no guarantee that an Xcode update won't break everything. Go (and Zig) can cross compile to any supported platform using just the compiler, and Python scripts don't need any compilation. Apparently Go does this by making system calls directly rather than depending on the system's libc (at least on Linux), while Zig ships the libc headers and sources it needs for each target.

Cross compilation is also somewhat important to me since I have a few Raspberry Pis at home, and they run my homegrown tools and projects, so I always need to think about how exactly I'm going to build stuff for them. I could build directly on the Pi itself, but this is often slow and might need more RAM than the system has. Usually what I do is build everything in a Docker container on my desktop using Docker's --platform feature, which uses QEMU under the hood.

Getting Zig

Recently I learned that Zig can cross compile as easily as Go can. Not only that, but Zig is designed to work as a C and C++ toolchain (zig cc and zig c++), so it can slot into an existing C or C++ build. The final piece of the puzzle was discovering a project called cargo-zigbuild. cargo-zigbuild uses Zig's toolchain to build and link Rust projects, finally making it as easy to cross compile Rust projects as it is Go and Zig projects. I tested it out and it worked right out of the box with zero issues.

Now at long last I can build a portable macOS binary from our CI system for my Rust projects. This made me interested in learning Zig Build for C and C++ projects.

Zig has a declarative build system that you define in code, in a build.zig file at the root of a project. This is kind of like a CMakeLists.txt file, in that you tell it what kind of artifacts you are building and where their source files are, along with whatever other flags you want to define. Because Zig's toolchain also compiles C and C++, there is an easy way to add C and C++ source files to a Zig project, letting you mix and match.

From the official docs, a really basic build.zig file looks like this:

const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});
    const exe = b.addExecutable(.{
        .name = "hello",
        .root_source_file = b.path("hello.zig"),
        .target = target,
        .optimize = optimize,
    });

    b.installArtifact(exe);
}

The syntax took some getting used to for me, and the Zig Language Server helped a lot. But you can see what it's doing here: your build function gets a Build object, creates "target" and "optimize" values from the -Dtarget and -Doptimize flags passed to zig build (falling back to the host machine and a debug build), adds an executable compilation step that points to hello.zig, and installs the resulting artifact.

For integration with C files, the compile step returned by addExecutable (or addStaticLibrary) gives you the functions addCSourceFiles and addIncludePath, e.g.:

    exe.addIncludePath(b.path("espeak-ng/src/speechPlayer/include"));
    exe.addCSourceFiles(.{ .files = &.{ ... } });

You don't even need a Zig file; the whole thing can be C or C++.
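To make that concrete, here is a minimal sketch of a C-only build.zig, assuming Zig 0.12 or 0.13 (the API generation the snippets in this post use). The file names src/main.c and include, the -Wall flag, and the executable name hello_c are placeholders for illustration, not part of any real project.

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    // No .root_source_file here: the executable is built purely from C sources.
    const exe = b.addExecutable(.{
        .name = "hello_c", // hypothetical name
        .target = target,
        .optimize = optimize,
    });

    // Paths are relative to the directory containing build.zig.
    exe.addCSourceFiles(.{
        .files = &.{"src/main.c"}, // hypothetical source file
        .flags = &.{"-Wall"},
    });
    exe.addIncludePath(b.path("include")); // hypothetical header directory
    exe.linkLibC();

    b.installArtifact(exe);
}
```

With a file like this, running zig build -Dtarget=aarch64-linux-gnu from a desktop machine should produce a 64-bit ARM Linux binary (for example for a Raspberry Pi running a 64-bit OS), placed under zig-out/bin by default.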

Enter Piper TTS

Brief segue to talk about the amazing Piper TTS project by Rhasspy https://github.com/rhasspy/piper. I have tried coqui-tts, but since I have an Intel Arc A770 GPU and not an Nvidia card, none of the hardware acceleration is available to me, and boy is it slow. Also, like many machine learning Python projects, installation requires downloading gigabytes of very specifically versioned libraries. Piper TTS is extremely fast - on CPU, I can generate 10 minutes of speech in about 7 seconds using all cores. The quality is also great to my ear - it's not as good as the state of the art TTS models, but it's plenty good for my use case of turning web tutorials into audiobooks.

The Full Piper build.zig

Here is what I came up with for compiling a Piper executable, as long as it's run in a directory containing piper, piper-phonemize, espeak-ng, and spdlog, with onnxruntime available as a system library.

const std = @import("std");
const ArrayListu8 = std.ArrayList([]const u8);

/// Non-recursively iterate over a directory and append the path of every file
/// whose extension equals ext (e.g. ".c") to paths.
/// This isn't used anywhere, but I'm putting it here in case anyone finds it useful.
/// If you want it to be recursive, you basically change dir.iterate() to dir.walk(allocator)
/// and join entry.path instead of entry.name.
pub fn glob_sources(allocator: std.mem.Allocator, base: []const u8, ext: []const u8, paths: *ArrayListu8) !void {
    var dir = try std.fs.cwd().openDir(base, .{ .iterate = true });
    defer dir.close(); // release the directory handle when done
    var iterator = dir.iterate();

    while (try iterator.next()) |entry| {
        const path_ext = std.fs.path.extension(entry.name);
        if (std.mem.eql(u8, path_ext, ext)) {
            const path = try std.fs.path.join(allocator, &.{ base, entry.name });
            try paths.append(path);
        }
    }
}

pub fn build(b: *std.Build) !void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    // ----------------------------------------------------
    // Espeak-ng (todo, the espeak-ng data needs to be installed prior to this)
    // ----------------------------------------------------

    const espeakg_ng = b.addStaticLibrary(.{
        .name = "espeak_ng",
        .target = target,
        .optimize = optimize,
    });

    // b.path(...) matches the installHeaders calls below; the older
    // .{ .path = "..." } form was removed after Zig 0.12.
    espeakg_ng.addIncludePath(b.path("espeak-ng/src/include"));
    espeakg_ng.addIncludePath(b.path("espeak-ng/src/ucd-tools/src/include"));
    espeakg_ng.addIncludePath(b.path("espeak-ng/src/speechPlayer/include"));

    espeakg_ng.addCSourceFiles(.{ .files = &.{
        "espeak-ng/src/libespeak-ng/common.c",
        "espeak-ng/src/libespeak-ng/mnemonics.c",
        "espeak-ng/src/libespeak-ng/error.c",
        "espeak-ng/src/libespeak-ng/ieee80.c",
        "espeak-ng/src/libespeak-ng/compiledata.c",
        "espeak-ng/src/libespeak-ng/compiledict.c",
        "espeak-ng/src/libespeak-ng/dictionary.c",
        "espeak-ng/src/libespeak-ng/encoding.c",
        "espeak-ng/src/libespeak-ng/intonation.c",
        "espeak-ng/src/libespeak-ng/langopts.c",
        "espeak-ng/src/libespeak-ng/numbers.c",
        "espeak-ng/src/libespeak-ng/phoneme.c",
        "espeak-ng/src/libespeak-ng/phonemelist.c",
        "espeak-ng/src/libespeak-ng/readclause.c",
        "espeak-ng/src/libespeak-ng/setlengths.c",
        "espeak-ng/src/libespeak-ng/soundicon.c",
        "espeak-ng/src/libespeak-ng/spect.c",
        "espeak-ng/src/libespeak-ng/ssml.c",
        "espeak-ng/src/libespeak-ng/synthdata.c",
        "espeak-ng/src/libespeak-ng/synthesize.c",
        "espeak-ng/src/libespeak-ng/tr_languages.c",
        "espeak-ng/src/libespeak-ng/translate.c",
        "espeak-ng/src/libespeak-ng/translateword.c",
        "espeak-ng/src/libespeak-ng/voices.c",
        "espeak-ng/src/libespeak-ng/wavegen.c",
        "espeak-ng/src/libespeak-ng/speech.c",
        "espeak-ng/src/libespeak-ng/espeak_api.c",
    } });
    espeakg_ng.addCSourceFiles(.{ .files = &.{
        "espeak-ng/src/ucd-tools/src/case.c",
        "espeak-ng/src/ucd-tools/src/categories.c",
        "espeak-ng/src/ucd-tools/src/ctype.c",
        "espeak-ng/src/ucd-tools/src/proplist.c",
        "espeak-ng/src/ucd-tools/src/scripts.c",
        "espeak-ng/src/ucd-tools/src/tostring.c",
    } });

    espeakg_ng.installHeadersDirectory(b.path("espeak-ng/src/include/espeak"), "espeak", .{});
    espeakg_ng.installHeadersDirectory(b.path("espeak-ng/src/include/espeak-ng"), "espeak-ng", .{});
    espeakg_ng.linkSystemLibrary("c");
    espeakg_ng.linkSystemLibrary("c++");

    // Install the artifact, because certain include headers need to be present
    b.installArtifact(espeakg_ng);

    // ----------------------------------------------------
    // Piper-phonemize
    // ----------------------------------------------------

    const piper_phonemize = b.addStaticLibrary(.{ .name = "piper_phonemize", .target = target, .optimize = optimize });

    piper_phonemize.addCSourceFiles(.{ .files = &.{
        "piper-phonemize/src/phonemize.cpp",
        "piper-phonemize/src/phoneme_ids.cpp",
        "piper-phonemize/src/tashkeel.cpp",
        "piper-phonemize/src/shared.cpp",
    } });

    piper_phonemize.addIncludePath(b.path("piper-phonemize/src"));

    piper_phonemize.installHeader(b.path("piper-phonemize/src/phonemize.hpp"), "piper-phonemize/phonemize.hpp");
    piper_phonemize.installHeader(b.path("piper-phonemize/src/shared.hpp"), "piper-phonemize/shared.hpp");
    piper_phonemize.installHeader(b.path("piper-phonemize/src/phoneme_ids.hpp"), "piper-phonemize/phoneme_ids.hpp");
    piper_phonemize.installHeader(b.path("piper-phonemize/src/tashkeel.hpp"), "piper-phonemize/tashkeel.hpp");
    piper_phonemize.installHeader(b.path("piper-phonemize/src/json.hpp"), "piper-phonemize/json.hpp");

    piper_phonemize.linkSystemLibrary("c++");
    piper_phonemize.linkLibrary(espeakg_ng);

    b.installArtifact(piper_phonemize);

    // ----------------------------------------------------
    // Main piper executable
    // ----------------------------------------------------

    const piper = b.addExecutable(.{
        .name = "piper",
        .target = target,
        .optimize = optimize,
    });

    piper.addCSourceFiles(.{ .files = &.{
        "piper/src/cpp/main.cpp",
        "piper/src/cpp/piper.cpp",
    } });

    // Add spdlog to piper directly
    piper.addCSourceFiles(.{ .files = &.{
        "spdlog/src/spdlog.cpp",
        "spdlog/src/stdout_sinks.cpp",
        "spdlog/src/color_sinks.cpp",
        "spdlog/src/file_sinks.cpp",
        "spdlog/src/async.cpp",
        "spdlog/src/cfg.cpp",
        "spdlog/src/bundled_fmtlib_format.cpp",
    }, .flags = &.{"-DSPDLOG_COMPILED_LIB"} });

    piper.addIncludePath(b.path("spdlog/include"));

    piper.linkLibrary(piper_phonemize);

    piper.linkSystemLibrary("c++");
    piper.linkSystemLibrary("onnxruntime");

    b.installArtifact(piper);
}

This all works and produces a piper executable in Zig's output folder (zig-out/bin by default, or under whatever prefix you specify)!

Lessons Learned

Is all of this worth it? Maybe. I can't say this is easier to read or better than a CMakeLists.txt file. You have to roll your own glob function. It also doesn't handle project versions, so there's nothing keeping me from accidentally pointing to the wrong version of a dependency. Even so, it does appeal to me as a programmer, and I like that I can build a statically linked binary (except for onnxruntime) for any target platform and I know exactly what's in it.
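As an illustration of rolling your own glob, here is a sketch of a recursive variant built on dir.walk(), wired into a small library step. It assumes Zig 0.12 or 0.13 and assumes zig build is run from the directory that contains build.zig, since it resolves paths against the current working directory. The helper name globSourcesRecursive and the library name ucd are made up for this example, and it also assumes that every .c file under the given directory belongs in the library.

```zig
const std = @import("std");

/// Recursively collect every file under `base` whose extension equals `ext`
/// (for example ".c"). Hypothetical helper, not part of std.Build.
fn globSourcesRecursive(
    allocator: std.mem.Allocator,
    base: []const u8,
    ext: []const u8,
    paths: *std.ArrayList([]const u8),
) !void {
    var dir = try std.fs.cwd().openDir(base, .{ .iterate = true });
    defer dir.close();

    var walker = try dir.walk(allocator);
    defer walker.deinit();

    while (try walker.next()) |entry| {
        if (entry.kind != .file) continue;
        if (!std.mem.eql(u8, std.fs.path.extension(entry.basename), ext)) continue;
        // entry.path is relative to `base` and only valid until the next call
        // to next(), so join it into a freshly allocated string.
        try paths.append(try std.fs.path.join(allocator, &.{ base, entry.path }));
    }
}

pub fn build(b: *std.Build) !void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    const ucd = b.addStaticLibrary(.{
        .name = "ucd", // hypothetical library name
        .target = target,
        .optimize = optimize,
    });

    // The collected paths are allocated with the build's allocator and are
    // intentionally never freed; they live as long as the build graph.
    var sources = std.ArrayList([]const u8).init(b.allocator);
    try globSourcesRecursive(b.allocator, "espeak-ng/src/ucd-tools/src", ".c", &sources);

    ucd.addIncludePath(b.path("espeak-ng/src/ucd-tools/src/include"));
    ucd.addCSourceFiles(.{ .files = sources.items });
    ucd.linkLibC();

    b.installArtifact(ucd);
}
```

The trade-off is the usual one with globs: the file list no longer needs maintaining by hand, but a stray test or platform-specific source file in the tree will be compiled too, which is why the explicit lists above are arguably safer for a port.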

If I had a wishlist for zig build, it would be:

  1. Some kind of a blessed glob function for larger projects
  2. A way to reference external build.zig files as submodules - this might already be possible, I see a function called dependencyFromBuildZig in std.Build that might do the trick. But I also want to make sure that I can statically link with any dependencies, even ones that use b.addSharedLibrary(...). Often, compiling a project from source either means polluting /usr/local or some other prefix with very specific versions of libraries, and then you have to bundle those .so files with your binary if you want it to be portable. Having everything be static can make this portability a lot easier.
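On the second wish, the package-manager route in Zig 0.12 and 0.13 is to declare the dependency in build.zig.zon and fetch it with b.dependency. The sketch below assumes a hypothetical espeak-ng checkout that ships its own build.zig, accepts the standard target and optimize options, and installs a static library artifact named "espeak_ng"; none of that is confirmed for the real project. It only covers referencing another build.zig, not forcing a dependency that uses b.addSharedLibrary(...) to link statically.

```zig
// build.zig.zon (next to build.zig) would need an entry along these lines,
// where "espeak-ng" is a local directory containing its own build.zig:
//
//     .dependencies = .{
//         .espeak_ng = .{ .path = "espeak-ng" },
//     },

const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    // Runs the dependency's build.zig with the same target and optimize mode.
    const espeak_dep = b.dependency("espeak_ng", .{
        .target = target,
        .optimize = optimize,
    });

    const piper_phonemize = b.addStaticLibrary(.{
        .name = "piper_phonemize",
        .target = target,
        .optimize = optimize,
    });

    piper_phonemize.addCSourceFiles(.{ .files = &.{
        "piper-phonemize/src/phonemize.cpp",
        "piper-phonemize/src/phoneme_ids.cpp",
        "piper-phonemize/src/tashkeel.cpp",
        "piper-phonemize/src/shared.cpp",
    } });
    piper_phonemize.addIncludePath(b.path("piper-phonemize/src"));
    piper_phonemize.linkLibCpp();

    // "espeak_ng" is the artifact name the hypothetical dependency installs.
    piper_phonemize.linkLibrary(espeak_dep.artifact("espeak_ng"));

    b.installArtifact(piper_phonemize);
}
```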