kaashif's blog

Programming, with some mathematics on the side


Book Review: Children of Memory

2024-04-02

I recently read Children of Memory by Adrian Tchaikovsky. It's the third in the Children of Time series. Each book in the series shares a common backstory and general template:

Humanity sends out terraforming missions to the stars.
Earth collapses into war and disaster over hundreds of years.
Survivors send out ark ships to find and hopefully colonise the Edens waiting for them.
It doesn't go as planned: what's waiting for them is uplifted spiders, alien parasites, spacefaring octopuses, or reality-bending alien technology that does something weird.

I think this book was a bit disappointing and I'll explain why. Heavy spoilers, and this "review" makes no sense if you haven't read the book.

2024-01-03

I was running something in WSL, as you do, then I thought about it for a second. When I'm doing this in WSL:

$ clip.exe < file.txt

How does that actually work? It turns out this is done using /init which is two things:

PID 1, it's the init system, the parent of all processes in WSL.
An "interpreter" for Windows executables. When you run clip.exe, that's the actual Windows binary you're running directly. This works via the binfmt_misc mechanism of Linux, which allows you to register runners for any binary with specific magic bytes.

/init is a bit hard to get at since it's a closed source component of WSL. We can get some idea of how it might work by looking at (1) a Microsoft blog post describing how this works at a high level and (2) cbwin, an open source implementation of this.

We can also do fun things, like make Java jars directly executable without needing to run them with java -jar. But beware - if you have "fully executable" jars with scripts embedded at the start (like the ones Spring Boot makes), binfmt_misc can't tell that they're jars.

But java -jar still works on them! Weird. Here are the questions we want to answer:

What happens when you run a "normal" Linux executable? What about a shell script?
How does Linux tell that clip.exe is a Windows executable, and how does it run from inside Linux?
How can Java tell that a shell script with some binary junk at the bottom is really a jar, but the Linux kernel (via binfmt_misc) can't?!
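As a rough illustration of the last question, here is a minimal Python sketch. It is not how /init, the kernel, or the JVM are implemented; the classify_by_magic helper, the MAGIC table, and the sample launcher script are all made up for this example. The idea it shows: matching on magic bytes only looks near the start of a file, while zip readers (a jar is a zip archive) find the archive's directory by searching from the end, so a script glued onto the front hides the jar from the first approach but not the second.

```python
import io
import zipfile

# Leading bytes for a few well-known formats (illustrative table, not exhaustive).
MAGIC = {
    b"MZ": "windows-pe",
    b"\x7fELF": "linux-elf",
    b"#!": "script",
    b"PK\x03\x04": "zip-or-jar",
}


def classify_by_magic(data: bytes) -> str:
    """Guess a file type from its first bytes only, like a magic-byte matcher."""
    for magic, kind in MAGIC.items():
        if data.startswith(magic):
            return kind
    return "unknown"


def make_jar_bytes() -> bytes:
    """Build a tiny in-memory zip that stands in for a jar."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    return buf.getvalue()


plain_jar = make_jar_bytes()
# A hypothetical launcher script prepended to the same archive.
executable_jar = b'#!/bin/sh\nexec java -jar "$0" "$@"\n' + plain_jar

print(classify_by_magic(plain_jar))       # looks like a zip from the front
print(classify_by_magic(executable_jar))  # looks like a script from the front
# A zip reader still finds the archive, because it searches from the end.
print(zipfile.is_zipfile(io.BytesIO(executable_jar)))
```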

Answers are below.

2024-01-02

Why is the following change to a Rust struct backwards incompatible?

 struct S {
+    y: i8,
     x: i32,
 }

And why is the following change to a C++ struct backwards incompatible?

 struct S {
+    char y;
     int x;
 };

The answers are different and may surprise you. Rust provides fewer compile-time guarantees about structs by default and more guarantees in code interacting with structs than C++.

C++ is batshit crazy as always, providing guarantees no-one cares about while allowing you to invoke UB by accident.

Worth thinking about when deciding whether you need to bump the major version number of your Rust crate.
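One thing that visibly changes for the C-style struct is its memory layout. The sketch below uses Python's ctypes, which follows the platform's C layout rules, to compare the struct before and after the change. The class names SOld and SNew are made up, and this only illustrates the layout angle; it may not be the whole answer, and it says nothing about the Rust case.

```python
import ctypes


class SOld(ctypes.Structure):
    # struct S { int x; };
    _fields_ = [("x", ctypes.c_int)]


class SNew(ctypes.Structure):
    # struct S { char y; int x; };
    _fields_ = [("y", ctypes.c_char), ("x", ctypes.c_int)]


for struct in (SOld, SNew):
    # Code compiled against the old layout would read x from the old offset.
    print(struct.__name__, "size:", ctypes.sizeof(struct), "offset of x:", struct.x.offset)
```

Any code compiled against the old definition still assumes x sits at offset 0 and that the struct has the old size.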

2023-09-23

I just noticed that we're now past the 10th anniversary of my first blog post, which I made on 2013-08-11! Maybe I'll write a retrospective. Moving on.

Recently, a friend suggested I'd be interested in EVE Online. For those not familiar, it's an MMO with lots of stuff but in particular, it has a player driven market economy. Prices are driven by market forces. Players place buy and sell orders for various items, other players fill those orders, market prices move over time.

It's exciting! CCP (the developers of EVE) even employ economists to help manage the in game money supply and inflation. Very exciting!

With any market the question is: how can I make the most money as quickly as possible for the least effort?

In real life, the answer may be to get a job. In EVE, I thought the answer might be to find and exploit arbitrages: market mispricings where someone is selling for a low price and someone else is buying for a high price.

Websites like eve-trading.net exist but don't let us answer the following questions:

Given an initial investment and fixed cargo space, which opportunity has the highest return (%) per jump? How does that vary with available capital? Warren Buffett famously said that having a small amount of money to invest is the trick to making good returns. Having $100B in cash is a curse - most "good" opportunities are simply too small.
Has the size of arbitrage opportunities changed over time?
Historically, where (in which systems) are the best opportunities?

These questions can be answered by downloading these datasets:

Market data from https://data.everef.net/market-orders/
Static data (e.g. about jump routes and how much cargo space items take up) from https://wiki.eveuniversity.org/Static_Data_Export

And doing some analysis. That's what I do in this post.
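To make the first question concrete, here is a minimal sketch of "return (%) per jump" for a single opportunity. The Opportunity fields, the return_per_jump function, and the numbers are all hypothetical, and taxes, broker fees, and order book depth beyond a single order are ignored.

```python
from dataclasses import dataclass


@dataclass
class Opportunity:
    buy_price: float       # ISK per unit paid where the item is cheap
    sell_price: float      # ISK per unit received where the item is dear
    units_available: int   # units on offer in the cheap sell order
    unit_volume_m3: float  # cargo space taken by one unit
    jumps: int             # jumps between the two stations


def return_per_jump(opp: Opportunity, capital: float, cargo_m3: float) -> float:
    """Percentage return on capital per jump for one round of hauling."""
    affordable = int(capital // opp.buy_price)
    fits = int(cargo_m3 // opp.unit_volume_m3)
    units = min(affordable, fits, opp.units_available)
    profit = units * (opp.sell_price - opp.buy_price)
    return 100.0 * profit / capital / opp.jumps


opp = Opportunity(buy_price=100, sell_price=120, units_available=1000,
                  unit_volume_m3=1, jumps=5)
print(return_per_jump(opp, capital=10_000, cargo_m3=500))
print(return_per_jump(opp, capital=1_000_000, cargo_m3=500))
```

With the made-up numbers above, the small investor is limited by capital and the large one by cargo space, so the same opportunity yields a much lower percentage return on the larger bankroll.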

2023-08-08

(or: How I learnt to stop worrying and love opportunity cost)

I'm a fan of the board game Sidereal Confluence. It's a game where:

There are resources: small cubes, large cubes, ultratech (octagons)
You have converters that change sets of cubes into others (e.g. 2 white and one blue cube into 3 black cubes)
Your goal at the end of the game is to have the most victory points (VPs). Some converters make VPs, during the game you can research techs that give you VPs, and all resources you've accumulated convert to VPs at set rates.

The meat of the game is trading - players can trade anything, with any terms. The only rule is that trades are binding - you must honour your agreements. This leads to simple agreements like "I'll give you one white cube for one green cube", where the trade is obviously fair. You also sometimes get harder to value trades, like "This turn you give me one small white cube and next turn I'll give you one large black cube".

The rules have a guideline - three small cubes are worth two large cubes, which are worth the same as one ultratech. This helps with valuations.

One of my friends executed a strategy I found interesting:

He played the Eni Et, a race which has special converters. For example, a usual converter might take 3 small cubes and give 4 small cubes, for a ratio of 4/3, but an Eni Et converter might have a ratio of 2 or 3.
The catch is the Eni Et can't use their special converters and must trade them to other people who can.
He sold them permanently early on, valuing a 2 to 4 or 8 to 13 converter pretty highly. After all, they'll make dozens of cubes for you over the game, right?
He won. Did we pay too much for his converters?

Questions I want to answer quantitatively:

A simple one: if someone gives me a cube this turn in exchange for some number of cubes next turn, how many cubes should they demand?
A harder one: when buying an Eni Et converter permanently, how many cubes should I pay?
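One way to frame both questions is as a discounting problem. The sketch below assumes that any cube you hold could instead go through a "usual" 3-to-4 converter each turn, so a cube now is worth 4/3 cubes next turn. That assumption, the function names, and the timing convention (inputs paid at the start of a turn, outputs received at the start of the next) are mine and only meant as a starting point.

```python
from fractions import Fraction

# Assumed opportunity cost: a usual converter turns 3 cubes into 4 each turn.
BASELINE_RATIO = Fraction(4, 3)


def fair_repayment(cubes_lent, ratio=BASELINE_RATIO, turns=1):
    """Cubes a lender should ask for after `turns` turns to break even."""
    return cubes_lent * ratio ** turns


def converter_present_value(inputs, outputs, runs, ratio=BASELINE_RATIO):
    """Value today, in cubes, of running an inputs->outputs converter `runs` times.

    Each run costs `inputs` now and pays `outputs` one turn later, so one run
    is worth outputs / ratio - inputs at the moment it starts.
    """
    per_run = Fraction(outputs) / ratio - inputs
    return sum(per_run / ratio ** turn for turn in range(runs))


print(fair_repayment(1))                    # cubes owed next turn for 1 cube now
print(converter_present_value(2, 4, runs=1))
print(converter_present_value(2, 4, runs=5))
```

Under these assumptions, each extra run of a converter adds less to its price than the previous one, because the later gains are discounted more heavily.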

2023-07-13

alloca is a function provided by several C libraries (e.g. glibc) that lets you allocate a block of memory that will be freed when the calling function exits. It's usually done by allocating memory on the stack.

But here are a couple of questions:

No C standard or POSIX standard mentions alloca, so what "should" it really do?
Given that no C standard mentions the stack, is it even possible to implement alloca in C, or do you need assembly to move the stack pointer?
Given that compiling code with -fomit-frame-pointer usually results in addresses being expressed as relative to the stack pointer rather than the frame pointer, is it safe to move the stack pointer ourselves?

TL;DR: The answer is that you need special support from the compiler to implement alloca and you can't do it yourself, in C or assembly.

2023-06-19

386BSD 1.0 was released in 1994 on a CD in an issue of Dr Dobb's Journal. There are guides on the internet on how to boot 386BSD 1.0 in QEMU, like http://gunkies.org/wiki/Talk:Installing_386BSD_1.0_on_Qemu but I don't think there are any guides on how to boot it like someone in 1994 would've booted it, from a real MS-DOS installation.

Rather funnily, 386BSD is listed as "theoretically bootable" here: https://gunkies.org/wiki/386BSD_1.0. And there's a post on WinWorld saying "Personally I have no idea how to boot it (honestly don't ask)" with no elaboration: https://forum.winworldpc.com/discussion/13240/offer-386bsd-reference-cd-rom.

It's time to put theory into practice and work out how to boot this OS. There are a couple of things I want to try:

DOSBox - maybe it'll work?
QEMU with MS-DOS 6.22
The instructions from gunkies.org

You can download the CD image here: https://archive.org/details/386BSD1.0 and follow along.

Also, RIP Bill Jolitz.

2023-03-27

Java is a language missing a lot of features. One of those missing features is keyword arguments. By that, I mean something that lets you call functions like this:

my_function(x=1, y=2, z=3)

Or even:

my_function(z=3, x=1, y=2)

That is, arguments that are named, can be reordered, and are non-optional at compile time. You might quibble: Python doesn't have compile time. But you can run mypy to check types and if you're missing a required keyword argument, mypy will fail.
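For reference, this is roughly what the Python side looks like. The bare * makes every parameter keyword-only and required; the function body is just a placeholder. Python itself only rejects a missing argument when the call runs, which is why a static checker like mypy is needed to get the error earlier.

```python
def my_function(*, x: int, y: int, z: int) -> tuple[int, int, int]:
    """All three arguments must be passed by name, in any order."""
    return (x, y, z)


# Named and reorderable: both calls produce the same result.
assert my_function(x=1, y=2, z=3) == my_function(z=3, x=1, y=2)

try:
    my_function(x=1, y=2)  # z is missing
except TypeError as error:
    print("rejected at call time:", error)
```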

Let's limit our scope to constructors. The aim is that, given code like this:

package org.example;

@ReorderableStrictBuilder
public record MyBuiltClass(String first, String second, String third) {
}

We want to be able to construct an object something like this:

// Named arguments
var x = Builder.create().setFirst("1").setSecond("2").setThird("3").build();

// Reorderable
var x = Builder.create().setSecond("2").setThird("3").setFirst("1").build();

// Compile time error if you miss out any arguments - this shouldn't compile.
var x = Builder.create().setSecond("2").setThird("3").build();

Is that even possible? I tried to find out. First, let's look at some solutions I don't like.

2023-03-12

A few weeks ago, I was reading a Hacker News post about a clipboard manager. I can't remember which one exactly, but an example is gpaste - they let you have a clipboard history, view that history, persist things to disk if you want, and so on.

One comment caught my eye: it asked why clipboard managers didn't use the splice(2) syscall. After all, splice allows copying the contents of a file descriptor to a pipe without any copies between userspace and kernelspace.

Indeed, replacing a read-write combo with splice does yield massive performance gains, and we can benchmark that. That got me thinking: why don't other tools use splice too, like cat? What are the performance gains? Are there any edge cases where it doesn't work? How can we profile this?

There are blog posts from a while ago lamenting the lack of usage of splice, e.g. https://endler.dev/2018/fastcat/ and interestingly enough, things may have changed since 2018 (specifically, in 2021), giving us new reasons to avoid splice.

The conclusion is basically that splice isn't generic enough, but the details are pretty interesting.
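For a feel of what replacing a read-write combo with splice looks like, here is a small sketch using Python's os.splice wrapper (Python 3.10+ on Linux). The copy_to_pipe and _read_write helpers are hypothetical names, and the fallback path is there because splice is not available everywhere and can be rejected for some kinds of file descriptor.

```python
import os
import tempfile


def _read_write(src_fd: int, dst_fd: int, count: int) -> int:
    """Classic copy through a userspace buffer; returns bytes moved."""
    chunk = os.read(src_fd, min(count, 65536))
    view = memoryview(chunk)
    while view:
        view = view[os.write(dst_fd, view):]
    return len(chunk)


def copy_to_pipe(src_fd: int, pipe_write_fd: int, count: int) -> int:
    """Move up to `count` bytes into a pipe, preferring splice(2)."""
    copied = 0
    use_splice = hasattr(os, "splice")
    while copied < count:
        remaining = count - copied
        if use_splice:
            try:
                moved = os.splice(src_fd, pipe_write_fd, remaining)
            except OSError:
                # splice refused this descriptor; fall back to read + write.
                use_splice = False
                continue
        else:
            moved = _read_write(src_fd, pipe_write_fd, remaining)
        if moved == 0:  # end of input
            break
        copied += moved
    return copied


# Small payload so it fits in the pipe buffer without a concurrent reader.
payload = b"hello clipboard\n" * 4
with tempfile.TemporaryFile() as source:
    source.write(payload)
    source.seek(0)
    read_end, write_end = os.pipe()
    copied = copy_to_pipe(source.fileno(), write_end, len(payload))
    os.close(write_end)
    received = os.read(read_end, len(payload))
    os.close(read_end)

print(copied, received == payload)
```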

2023-02-26

A few weeks ago, I played the board game "The Search for Planet X". The premise is that you have a circular board divided into 12 sectors, each containing one object. That object could be an asteroid, a gas cloud, and so on, but most importantly, it could be Planet X. Which object is in each space is hidden at the start of the game and you're racing your opponents to discover Planet X by scanning sectors and deducing information using a set of rules like "an asteroid is always adjacent to another asteroid". The winner is the first player to correctly guess the location of Planet X and the two adjacent objects.

The full rules can be found here: https://foxtrotgames.com/planetx/.

I don't find these kinds of games very fun, but it did get me thinking: what's the best strategy? How many possible boards are there, and how hard is this game?

This meant I had to write a program to:

Generate all possible boards
Come up with various strategies to pick the best action
See how good those strategies are

The source code is here: https://github.com/kaashif/search-for-planet-x-solver.
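The first step, generating all boards, can be sketched on a toy version of the game. This is not taken from the linked solver: the six-sector board, the object counts, and the single rule checked here are simplifications chosen so that the enumeration stays tiny.

```python
from itertools import permutations

# Toy object set for a six-sector circular board (the real game has 12 sectors).
OBJECTS = ["asteroid", "asteroid", "planet_x", "empty", "empty", "empty"]


def asteroids_are_paired(board) -> bool:
    """Rule from the text: an asteroid is always adjacent to another asteroid."""
    n = len(board)
    return all(
        board[(i - 1) % n] == "asteroid" or board[(i + 1) % n] == "asteroid"
        for i, obj in enumerate(board)
        if obj == "asteroid"
    )


# A set removes duplicates caused by identical objects swapping places.
all_boards = set(permutations(OBJECTS))
valid_boards = [board for board in all_boards if asteroids_are_paired(board)]

print(len(all_boards), "arrangements,", len(valid_boards), "satisfy the rule")
```

Brute force like this stops scaling quickly as sectors and rules are added, which is where smarter generation and pruning come in.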

2023-01-23

Mockito is a pretty popular Java mocking library. It lets you write code like this:

MyClass mockObject = mock(MyClass.class);
when(mockObject.myMethod(1)).thenReturn("one");

Which is pretty cool, even if it's a bit magic. It's not really that magical, conceptually - Mockito simply intercepts method calls and keeps track of which methods have been called globally, and with what arguments. The call to .thenReturn effectively writes to global state, so that the next call to mockObject.myMethod(1) will have the right behaviour.

My question is simple: Mockito uses bytecode generation libraries (cglib or bytebuddy) to construct the proxies - why do we need to go to those lengths? Can't we get by with something more mundane, meaning either in the standard library or higher level (where I consider JVM bytecode to be low level)?
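The "intercept calls and write to global state" idea can be sketched in a few lines of Python, where intercepting method calls needs no bytecode generation. This is only a toy model of the mechanism described above, not Mockito's implementation; Mock, Stubbing, when, and then_return are names invented for the sketch.

```python
_last_call = None  # global record of the most recent call on any mock


class Mock:
    def __init__(self):
        self._stubs = {}

    def __getattr__(self, name):
        # Called for any attribute that doesn't exist, i.e. every "method".
        def method(*args):
            global _last_call
            _last_call = (self, name, args)
            return self._stubs.get((name, args))
        return method


class Stubbing:
    def __init__(self, call):
        self._call = call

    def then_return(self, value):
        mock, name, args = self._call
        mock._stubs[(name, args)] = value


def when(_ignored_result):
    # The argument is thrown away; what matters is the side effect of the
    # call that produced it, which was recorded in _last_call.
    return Stubbing(_last_call)


mock_object = Mock()
when(mock_object.my_method(1)).then_return("one")
print(mock_object.my_method(1))
print(mock_object.my_method(2))
```

Java has no direct equivalent of __getattr__ for concrete classes, which is part of why the question of proxies comes up at all.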

2022-10-23

This is a post from the perspective of a new Java programmer, so it is 100% likely that the concerns here are well-known and already addressed. Or at least discussed.

Java, as a language, doesn't get (understand) immutability and "delivers" it in a way that grants almost none of the benefits of immutability in other languages, like C++ or Rust. I picked those examples to show that the lesson was learnt a long time ago (C++) and the lesson is still valid and a good idea (Rust).

Java can, in some sense, be forgiven of its crimes because it's a pretty old language and is stuck with backwards compatibility. But that doesn't mean it doesn't commit those crimes.

The primary benefit of immutability is that the programmer knows that value cannot be changed, so they no longer need to think about what would happen if it did.

Java doesn't give you that and worse, it pretends that it does. Let's look at some examples of Java lies and deception.

2022-10-16

People write code that relies on all sorts of implicit or obfuscated knowledge. In the worst case, people write code that requires any caller to read through the entire source to work out how to use it or what it does.

What confuses me is that people often seem to do this intentionally, it's like they want to require omniscient knowledge of the codebase for anyone wanting to call or write tests for their code.

I can hear you saying that's ridiculous, and telling me to ask literally anyone whether they think they have the entire codebase in their heads: they'll definitely say no.

Everyone will say fitting a huge codebase into their mental working memory is impossible, but actions speak louder than words. Many people (I see it all the time) constantly choose programming patterns and idioms that only make sense if you think everyone coming after you will have read, digested, and memorised all of the code. There are a few really important ones:

Using nullability
Using mutable state
Using global (or static) variables

If you ever have to familiarise yourself with a codebase, or ever misremember anything, you should try to encourage authors to avoid these as much as possible. But people don't! They love to make things hard for reviewers and future generations of code readers.

In the same breath as complaining about something a coworker has written, people will go on to make the same faulty assumptions and obfuscate their code, perhaps in a slightly different but materially equivalent way.

Let's look at what the problems are. How to convince your coworkers to stop is left as an exercise for the reader.

2022-08-23

While in Venice, I picked up a book, If Venice Dies by Salvatore Settis. The main thrust of the book is that tourism is bad, Venice has died or is on the cusp of death, and changes need to be made to remedy the situation.

This is the kind of book that only works if you already agree with its premise, and read it as outrage porn rather than as a well-motivated, well-explained argument. That might seem a bit uncharitable, but the author makes a lot of claims that are only backed up by vibes and impassioned rhetorical questions like "Wouldn't that be a tragedy?" rather than any kind of reasoning.

There are two ways to look at each of the claims:

The author wants to enforce their vision of what Venice should be, and came up with various justifications post hoc to make it seem like it's in everyone's interests.
The author actually believes their own arguments, which largely amount to fluff, vibes, and vaguely authoritarian policy prescriptions.

We can go through the book and see if (1) the sinister explanation or (2) the naive explanation fits better.

There's of course the third explanation: that the author doesn't believe in any of it and just wrote the book for a quick buck, which we'll ignore. Settis is a lifelong archaeologist and art historian, so it's at least credible that he does believe in the conclusions of the book.

Anyway, let's look at some of the unmotivated fluff. I don't expect this blog post to be particularly entertaining or anything more than a rant.

2022-08-02

There are many mistakes we made when halfheartedly trying to get funding for our startup. The worst was that we didn't actually have a business at all - we had no users yet due to not having regulatory approval for our financial product, and no practical plan to get that other than sending applications and hoping. The second worst was that our funding applications were really bad in many ways.

Not having a business is obviously a bigger problem than anything else, but being unable to convince investors is a big deal too, especially if you're really bad at it.

Here's a list of mistakes we made in our YCombinator application.

2022-07-31

I recently read a couple of blog posts about deserving success, and I found them very interesting, mostly because of what they tell me about the people writing them.

I have some thoughts on these posts but reading them back, I think they're nonsense. I'll post them anyway. Here are the posts:

One from someone who hates the word "deserve" because they believe luck plays a much larger part in success than people want to believe: https://moontowermeta.com/my-personal-trigger/
A follow up from the same author on why talking about "deserving" things makes their skin crawl: https://moontowermeta.com/why-deserve-makes-my-skin-crawl/
And a series of blog posts where the author argues they don't deserve their success, full socialism wouldn't work, and that the way to make things better doesn't involve taxing him more: https://russroberts.medium.com/do-i-deserve-what-i-have-part-i-6553091dd85c

The first two are written by an options trader and the last by an economist, both wealthy. Both seem to feel it's obvious that they don't "deserve" their success, but neither seems to actually attempt to define whatever it is they're talking about.

I don't think it's possible to define "deserve" in a way that matches our intuitions but also means those two people don't deserve their success at all.

I'll try to define "deserve" but it'll probably go really badly.

2022-06-05

No-one reads this blog, so this is a safe place to make a public announcement I don't really want anyone to see.

I'll be moving to New York in a few months when my visa goes through.

There are many reasons I'm moving to New York, the main ones are:

More money
Looking to run away from my startup failure
And the main one, I'm looking for something cool to do because I'm bored.

Read on for some elaboration.

2022-02-10

A while ago, I soldered together a Z80 homebrew computer and ran some programs on it, then wrote a blog post about it.

I did end up running some toy programs and some BASIC games like Super Star Trek, but that got me thinking - how hard would it be to solve a real algorithmic problem on it?

It turns out it wasn't hard at all, thanks to the Z80 development kit (z88dk) and the hexload BASIC program that allows running arbitrary binaries from BASIC, without needing to reprogram the ROM. That's very handy since I don't have a ROM programmer!

Comparing compiled Z80 assembly to modern x86 assembly for the same C program is pretty cool - the Z80 is a slightly extended 8080 instruction set, so you might expect the programs to look pretty similar, but x86 is really a whole different beast.

2022-02-01

The conventional wisdom is that if you're starting a business you should do something you're good at or know something about. If you're a software engineer, you should probably be the guy running engineering. If you're in sales, you should probably be the sales guy. I mean, that makes sense.

This is going to sound stupid and obvious, but people have a strong bias towards doing things they enjoy and are good at. And when you're doing those things it feels good and even worse, it feels productive.

Why is that bad? Because feeling productive can have almost no correlation to being productive.

I quit my job to do things that felt more productive. Spoiler alert: they were dismal failures and produced nothing except some bitter lessons! Isn't that fun?

I'm going to dissect some of my failed businesses:

I tried to re-sell NordVPN activation keys, passing on some of the bulk discount. What could go wrong - people already buy NordVPN and this is just NordVPN but cheaper, right? Not quite, which was a rather painful lesson.
I co-founded a remittance service (think TransferWise, Western Union) that was faster, cheaper, and aimed mainly at transfers from the UK to under-served African countries. We had people lining up to use it, we had all of the tech in place to actually provide the service, it was 10x cheaper, 10x faster, worked on weekends, and the Financial Conduct Authority would've never approved it. Oh.

Why did I keep (or even start) working on them? Because writing software felt good. It felt productive. But it wasn't! I'll go through what I think the warning signs were and how you (and I) can avoid making the same mistakes.

2022-01-25

Since C++17, we C++ programmers have rejoiced in the fact that when we return something from a function, the standard now guarantees that the value won't be copied. This is known as return value optimization (RVO) or copy/move elision, and happens in cases like this:

MyType myfunc() {
    // ...
    return MyType{arg1, arg2};
}

That is, MyType is constructed once and never copied or moved.

But some compilers still don't perform RVO in some cases. It turns out this is because RVO refers only to when you return unnamed values. Named RVO is apparently not considered RVO by the standard's definition. Named means something like:

    MyType x{};
    x.do_something();
    return x;

And gcc (11.2) doesn't always perform NRVO, even if it "obviously" can. Why? Do other compilers do better? I tried to find out.

2020-06-03

There are a lot of ways to represent numerical values on a computer, you've got the various fixed-size integer types and floating point types, but you also have arbitrary precision arithmetic, where the size of the number is limited only by the memory of the machine.

To represent the real numbers, $\mathbb{R}$, programmers often choose floating point numbers. But of course, floating point is terrible: addition and multiplication aren't associative, you don't really have reciprocals, and so on. Useless for exact arithmetic of the kind you need when doing algebra.
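A quick way to see the associativity problem is to regroup a sum of three ordinary decimals. The snippet compares Python floats with exact rationals from the standard fractions module; the particular values are just a convenient example.

```python
from fractions import Fraction

a, b, c = 0.1, 0.2, 0.3
# With floats, regrouping the same sum changes the result.
print((a + b) + c == a + (b + c))

fa, fb, fc = Fraction(1, 10), Fraction(2, 10), Fraction(3, 10)
# With exact rationals, regrouping makes no difference.
print((fa + fb) + fc == fa + (fb + fc))
```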

This blog post is about a way to represent cyclotomic fields, which provide (among other things):

All rational numbers
$\sqrt{m}$ for any integer $m$
All of the field axioms: multiplication and addition with proper associativity, commutativity, and invertibility
Enough numbers to represent any finite group

They're still a bit tricky to represent on a computer though, as you'll see.
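As a small taste of the idea, here is a sketch of one specific cyclotomic field, the one generated by a primitive 8th root of unity $\zeta$, which satisfies $\zeta^4 = -1$. Elements are stored as four rational coefficients of $1, \zeta, \zeta^2, \zeta^3$. This representation and the mul function are assumptions made for the example, not necessarily the approach taken in the post. The check at the end uses the identity $\sqrt{2} = \zeta - \zeta^3$.

```python
from fractions import Fraction

N = 4  # degree of the field over the rationals; zeta**4 == -1


def mul(a, b):
    """Multiply two elements given as coefficient lists [a0, a1, a2, a3]."""
    result = [Fraction(0)] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            power, sign = i + j, 1
            if power >= N:
                # Reduce using zeta**4 == -1.
                power, sign = power - N, -1
            result[power] += sign * ai * bj
    return result


sqrt2 = [Fraction(0), Fraction(1), Fraction(0), Fraction(-1)]  # zeta - zeta**3
two = [Fraction(2), Fraction(0), Fraction(0), Fraction(0)]

print(mul(sqrt2, sqrt2) == two)
```

Every operation here is exact, so the square of this element is exactly 2 rather than 2.0000000000000004.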

2020-05-17

I was recently trying to get to grips with the GAP programming language. For those not familiar, it's the programming language for the GAP computational algebra system. It has tons of algorithms implemented for group theory, representation theory, algebraic number theory, and so on. I was thinking about implementing a TypeScript-style transpiler so I could program with some types, and the first step is to parse the syntax.

To get the most elegant parser, I went for a parser written in Haskell using Parsec, which is an elegant library for LL(1) parsers.

The first problem I ran into was that GAP supports several function call syntaxes:

f(x, y, z); # positional
g(p := 1, q := 2); # named-ish parameters
h(x, y, z : p := 1); # mixed

This is surprisingly non-trivial to parse in general! The path is fraught with infinite recursion, ambiguities, and backtracking.

2020-04-19

For my Master's degree, I (helped greatly by my supervisor) implemented some algorithms and even invented some new algorithms to decompose representations of finite groups. I wrote an extremely long (well, relative to other things I've written) and technical thesis about this, but I find myself increasingly unable to understand what any of it means or why I even have a degree.

I thought being forced into a short-form blog post would help me remember whatever it is I spent a few years studying to do. There are some foundational questions:

What is a group?
What is a representation?
What is a decomposition of a representation?

And some more interesting questions, involving some computational tricks relevant to a wider audience:

Why is this useful?
How do you get a computer to do it?
How do you get a computer to do it, quickly?

These are the questions I'll attempt to answer in this blog post. It'll be fun!

2020-04-15

I'm somehow one of the maintainers of rss2email, a popular Python program for reading RSS feeds and sending feed updates to your email address. I think I reached this point via an unusual route, so I thought I'd write a little about how it happened.

For those not familiar, back in 2004, Aaron Swartz (yes, that one, rest in peace) wrote a short Python script that would read an RSS feed and send you emails. It had a few options, but was fairly simple. It was a few hundred lines long.

After 16 years of features being added and being passed around various maintainers (see here for a complete list of contributors), rss2email is still around, with many more features, many more lines of code, and many (many) more bugs.

How did I get involved?

2019-10-29

Anyone who has played Halo and fired the Gauss cannon on the Warthog has experienced a strong desire to do it in real life. Usually this is cut down by the idea that Gauss guns aren't real. But of course, they are, and I've built a few. Feel free to replace all mentions of "coilgun" with "Gauss superaccelerator" or "magnetic spaceship cannon" if this will satisfy a childhood fantasy of yours.

There are two main types of coilgun I've built:

The "standard" kind, which involves turning an electromagnet on, attracting a ferromagnetic projectile down a barrel towards the coil, then turning the magnet off once it reaches the centre of the coil. This is a reluctance coilgun.
The other kind, which involves turning an electromagnet on and using the sudden change in magnetic field to induce eddy currents in a non-ferromagnetic projectile. The projectile must be an inductor of some kind (e.g. a shorted coil) so that the eddy currents form a magnetic field that repels the projectile away from the coil. This is an induction coilgun.

I thought an induction coilgun would be easier to time, since we don't have to quickly turn off the coil, but intractable problems led me to turn my coilgun into a reluctance coilgun.

2019-09-04

I was decommissioning a few of my older computers and servers, stripping them down for parts. The hard drives in them were mostly IDE drives which had long stopped working. There are almost no useful parts you can strip from a non-functional hard drive except maybe the IDE connector and of course, the extremely strong rare-earth magnets.

But what could I do with them? Here are some things I did and photographed:

Book light that holds on with magnets
Converting a Star Trek combadge from using pins to magnets
Sticking an amplifier to the bottom of my desk

2019-08-14

I have a Sun Ultra 45, the last and most powerful Sun SPARC workstation. Even though only one of its two CPU slots is filled, the fans are still really loud. Is there a way to control the fans? How could I slow them down?

I first tried to find a software solution on OpenBSD and later had to resort to soldering and adding some resistors to the fans to slow them down.

2019-08-06

Like everyone, I have a ton of old routers lying around. It pains me to see these very useful computers go to waste, so I made it my business to hack into all of mine and replace the firmware. Maybe the title is a bit dramatic, but it's technically accurate.

My first target was an old Sky router, a Sagemcom F@ST2504n.

2019-07-10

I've always wanted to design and build my own homebrew computer. By this, I mean buying some ICs and soldering together something like an Apple I or a ZX Spectrum.

This post isn't going to be about my struggles designing a homebrew PC, since I haven't done that. Instead, I bought a new Z80 homebrew kit, the RC 2014. That's right, there is still someone out there making these kits.

2019-06-07

Until recently, my transcript search web app was running (at https://transcripts.kaashif.co.uk, check it out) in a tmux session, with a PostgreSQL server running on the same machine in the usual way, as a daemon.

The web app knows nothing about its dependency on the database, this information is not recorded anywhere except in the code itself. And the database knows nothing about the web app. This isn't a huge problem except the database has a config of its own which isn't recorded anywhere in the source repo. If you try to get the web app to work with a misconfigured database, it won't work, of course.

Wouldn't it be nice if all of that configuration were in one place? And if the services all restarted themselves if they failed? And if you could migrate the entire blob of interconnected web apps and databases to a different machine with a single command?

That's where Docker comes in!

2019-04-18

Since I have access to a machine that has the PA-RISC architecture, I thought I'd compile some test programs and see what sort of assembly code is produced. Some highlights:

A neat way to manage the stack pointer (and one surprise)
Every instruction seems to be shorthand for or
Completers - a weird way of giving switches to your instructions

PA-RISC is considerably less popular than x86, MIPS, PowerPC, even SPARC. And being a RISC architecture means that humans hardly ever wrote assembly for it themselves. Most of the time, programmers probably never even gave (past tense since PA-RISC is dead) their binaries a second glance. Or really even any kind of look.

Well, that's about to change! The first program we'll look at is, of course, hello world.

2019-04-13

A while ago, I got my hands on a beast of a machine, a 7U HP L3000 (rp5470) PA-RISC server. These were released in the year 2000 and came with up to 16GB (whoa) of RAM and up to 4 CPUs.

The best site for information on PA-RISC machines is, no doubt, OpenPA.net, and they have a fantastic page on my machine.

This is the story of how I managed to install Gentoo GNU/Linux on this classic UNIX server.

2019-03-31

Remember my transcript search engine, https://transcripts.kaashif.co.uk?

I'm not a database expert, but even I realised that spending hours and hours trying to optimise my homebrew database transcript search engine was a waste of time. For no reason at all other than to try it out, I went with ElasticSearch. The astute reader will notice that ElasticSearch is meant for "big" data. My collection of transcripts tops out at a few dozen megabytes at most - this is most certainly not big (or even medium-sized) data, really.

So after getting some real-world experience with SQL databases (at a real company) and taking an in-depth database algorithms course at university, I decided to convert my web app to use a PostgreSQL database to store the transcripts, searching with bog-standard SQL queries.

There were a couple of neat tricks I used to speed things up, too.

2018-08-11

Everyone's studied x86 assembly (just objdump any program on your PC...) and maybe even some ARM or MIPS in a class somewhere, but there are a few features that exist in some CPUs that don't exist at all in any of these designs.

I'm talking about register windows! When you call a function on SPARC, the new function just magically gets its own registers neatly separated into input registers, output registers and local registers. You're allowed to mess up your local registers as much as you want and the CPU does all of the saving and swapping for you.

No more weird arbitrary calling conventions about r10 and r11 being caller-saved, rax being return, rqb being Cthulhu-saved, rpqwuqew being quantum entangled with r554 on Tuesdays...

2018-01-13

It's a little known fact that there is actually no way in C or C++ to do an unaligned access without invoking undefined behaviour. It's true! Read it yourself here:

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned [...] for the referenced type, the behavior is undefined.

C11 (n1570) 6.3.2.3 p7

Sadly, the authors of many programs ignore this and rely on it working. Which it does, on x86, with little performance impact in most cases. On some architectures, like MIPS and PowerPC, unaligned access instructions exist but are slow. But on SPARC...unaligned access is impossible and leads to this:

$ openjk.sparc64
Bus error (core dumped)

Solving these issues with OpenJK is very difficult, especially considering Jedi Knight was never meant to run on SPARC (or indeed OpenBSD, but that's less of an issue).
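
As a rough illustration (not code from OpenJK, and the names load_u32 and check_roundtrip are made up for this sketch), the usual way to sidestep the problem is to avoid the unaligned access entirely: rather than casting a byte pointer to a wider type and dereferencing it, copy the bytes into a properly aligned object with memcpy. Compilers typically turn this into a single load on architectures that tolerate unaligned access, and into byte-wise loads on ones like SPARC.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from any address, aligned or not.
 * memcpy copies byte by byte as far as the language is concerned, so no
 * misaligned uint32_t pointer is ever created or dereferenced. */
uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Self-check: store a value at offsets 0..3 of a byte buffer and read it
 * back. At most one of those four addresses can be 4-byte aligned, so the
 * others exercise the unaligned case. */
void check_roundtrip(void)
{
    unsigned char buf[8] = {0};
    const uint32_t in = 0xDEADBEEFu;

    for (size_t off = 0; off < 4; off++) {
        memcpy(buf + off, &in, sizeof in);
        /* The tempting version, *(uint32_t *)(buf + off), is the undefined
         * behaviour described above. */
        assert(load_u32(buf + off) == in);
    }
}
```

This only helps where the source can be changed to go through such a helper; it does nothing for existing code that already casts pointers all over the place.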

2017-12-03

A few weeks ago, I got my hands on a Sun T2000 server. It's got an UltraSparc T1 CPU, 32 threads, 32 GB of memory, a Sun XVR-300 GPU and what sounds like a huge jet engine mounted at the front.

It's a great machine (although maybe not as a workstation...), and there are a few things unique to SPARC that I've really been looking forward to playing around with. Mostly LDoms (logical domains - Sun's virtualization technology), OpenBSD on a beefy sparc64 (compared to my older UltraSparcs anyway) and Solaris (just as a curiosity).

2017-09-10

This server (the very one you are reading this post on), at the time of writing this post, runs OpenBSD 6.1-stable. It's fully patched and updated and everything, so it's a perfectly fine OS to run. But the VPS has limited memory and disk space, and the CPU isn't very fast, so compiling large projects on it, especially Haskell ones, is impractical.

This post describes a way to build fully-functional, dynamically linked (so you get all those security updates, super important for a public-facing web service), native Haskell binaries with cabal-install, Stack and...OpenBSD's (kind of) new native hypervisor, vmm.

2017-08-10

I recently got an old Sun Ultra 5 working. It wasn't too difficult, but I needed to dig up a few old serial cables...

It already had SunOS 5.8 installed, but I put OpenBSD 6.1 on it, since I need a modern OS to actually do anything with it.

2017-08-04

Migrating mail servers is a tricky business, especially when one server doesn't have IMAP set up. The easiest way is to download all the mail and reupload it to the new mail server.

This seems simple enough, but I ran into problems. After all, I was going from an IMAP server somewhere to a maildir (no IMAP sync tool supports mbox for some reason) to an mbox through procmail to a directory of mboxes. Not trivial.

2017-08-03

There I was, a loyal user of http://mail.zoho.com, when I decided to download all of my emails. For archival purposes, you know.

So I fired up mbsync, set everything up, let 'er rip, but after only about 10,000 emails downloaded, I got this error:

IMAP command 'AUTHENTICATE PLAIN <authdata>' returned an error: NO [ALERT] Your account is currently not accessible via IMAP due to excessive usage. Kindly try after some time.
*** IMAP ALERT *** Your account is currently not accessible via IMAP due to excessive usage. Kindly try after some time.

What sort of...anyway, this was unacceptable, so I decided to set up my web server as a mail server.

2017-08-02

Before I get my Sun Ultra 5 working and can write something about that, I thought I'd go through all the hardware I'm using right now and the OSes I'm running on them. Spoilers: it's all OpenBSD and Debian.

2017-06-28

You may have heard of the Star Trek script search tool at http://scriptsearch.dxdy.name. I'm writing a similar thing for Stargate. The difference, of course, is that my tool will be running on some crappy Amazon AWS t2.nano with no RAM.

The first prototype was written in Python, but the parsing code was always written in Haskell (Parsec is great). I decided to move everything into Haskell so there wouldn't be this redundancy of parsing in Haskell then serializing to disk then reading it from Python...one codebase for everything would be much simpler.

I wrote up a quick Haskell version, but there was one small problem when I tried to use it:

$ ./stargate-search
Setting phasers to stun... (port 5000) (ctrl-c to quit)
Killed

That's right, the OOM killer had to step in and put my app down like the sloppy wasteful piece of junk it was.

How could I fix this?

2017-06-22

So I was setting up a laptop I had just picked up, when it came to deciding what OS to install on it. Obviously, I'd probably end up installing some Linux and maybe also OpenBSD (hard drives are huge nowadays, I could fit hundreds of OSs on there). While I had tried out FreeBSD and NetBSD, DragonflyBSD had never been on my radar.

It still isn't, really, but I thought I'd try it out on an old laptop, just to see what it was like. It went pretty well, but there were a few oddities and one or two kind of weird design choices.

2017-03-20

Today, I decided to install Gentoo on a spare machine I had lying around, since I was bored. Obviously, the first issue I ran into was that emerge x11-base/xorg-server was taking a really long time to run since the Xorg server is a pretty bloated program. Then emerge firefox was taking forever too.

One solution (for Firefox anyway) was to use the provided binary packages, which meant firefox-bin for Firefox. But this means I abandon all of the nice features (for me, that means USE flags) that Gentoo offers. If I'm going to download a load of binaries that I can't customize, why not just install Debian?

So the solution is to speed up compilation. That means putting more CPU cores to work. But my poor old ThinkPad only has 2 cores! This is where distcc comes in.

2017-03-10

There are hundreds of blog posts around the internet about backing up your PGP keys. Most of them just say something like: put a passphrase on it, keep it on a USB stick, a CD, a floppy disk, or something like that. These are all very useful ways to back important stuff up - in fact, I just restored a backup of my GPG keys from a CD after deleting them by accident. These media are, however, not going to survive for many decades like paper can. And if you store to a writeable medium like a rewritable CD, DVD, USB stick or floppy, there is still the danger of you trying to restore then accidentally deleting your backup. Or, more likely, you write over it and only realise many months later that you wrote a Linux distro or movie to that DVD that had your only PGP key backup.

2016-05-05

I wanted to make a list of the websites of the people on the website http://nixers.net, and I decided to solve it not by asking people to tell me what their sites were called, but by scraping the forum.

I didn't scrape the whole forum, I just scraped one topic on the forum that I created a few years ago: https://nixers.net/showthread.php?tid=1547

2016-05-03

Some people reading this might be thinking: "hey, it's really easy to do this, why is he writing an article on this?". You are partially right, this should be really easy, but there are some weird things that happened while I set this up that I feel should have been written down somewhere, so my fear that I was completely borking my system would have been assuaged.

2016-04-29

What is frog?

Frog is a static website generator written in Racket. It does the same sort of thing as Jekyll, Hakyll and other software like that, some of which I've used in the past.

2015-06-28

Before a few weeks ago, I was always one of those people who said that Lisp isn't useful, it's not type-safe, it's not pure, Haskell is better etc etc ad nauseam. All of that may be true for writing some sorts of programs, but Lisp (well, Common Lisp anyway) provides something a lot more pervasive.

What does pervasive mean? Well, right now, I'm controlling my window

2015-06-18

Is it portable?

First off, the information in this post definitely doesn't apply to Linux (as it has a completely different way of doing things) and may or may not apply to other BSDs (I see that NetBSD and FreeBSD both have similar, maybe identical, kvm(3) interfaces). There certainly isn't anything in POSIX to make this standard. The only real reason

2015-04-19

I know that sometimes, I've been stuck in an airport or in a coffee shop without internet. That's annoying in and of itself, but it's even more annoying when there's a WiFi hotspot nearby, but it requires you to pay £4/hour or something crazy like that.

You can still connect to the network, it's just that whatever URL you

2015-04-11

You know something that really annoys me? When I'm writing some Racket, Clojure, or any other Lispy language, and my editor won't cooperate. Emacs is far, far, better than most other editors for this sort of thing, mostly due to paredit-mode and SLIME (and geiser-mode, and clojure-mode, and evil-mode, and...), but there's still one problem I hadn't solved until recently.

2015-03-30

There are a few Platform as a Service (PaaS) services out there, and the most famous is probably Heroku. I know I've had people come up to me and suggest using Heroku for the next big thing they're planning. There is a problem with Heroku: it's not free (libre). That didn't stop me from at least trying it out to see what all the fuss was about.

2014-12-26

Earlier, I was trying to find something I could talk about at my school's maths society. It had to be something exciting, useful, or at least beautiful in some way. I really wanted to do something on quaternions and vectors, because it seemed fun. The problem came when I realised I had to do something more substantial than stand there and explain something that boring. Then I saw this quote:

2014-12-06

The other day, I was trying to access http://sharelatex.com at school, and it didn't really work, probably due to a combination of Internet Explorer and possibly an overzealous filter that could have been blocking something. That's what I thought, anyway, until I tried it on Chrome and it still didn't work. Odd. The best solution was obviously to set up my own ShareLaTeX instance on my server.

2014-11-15

UPDATE: I now use Mercurial on the client side (i.e. everything I do locally involves hg) and Git on the server-side. It just makes it easier to mirror to Gitorious, GitLab, etc. You can view all the repos at http://kaashif.co.uk/code.

Earlier in the year, I was getting curious about version control

2014-10-11

Back in the day, I used to use Slackware. It was the best distro around and all the cool kids used it. Nowadays, it's rather different: a much lower proportion of people use Slackware. Despite the efforts of Eric and Patrick (and whoever else), Slackware isn't really all that popular. It's still a solid distro, though. There is one problem

2014-10-07

Earlier today, I was discussing operating systems and came onto the subject of ease of installation. Which OS had the easiest installer? The obvious answers would tend towards OSs with GUI installers, but is that really easy? Sure, it could be familiar, but there's a lot more you have to do with GUI installers compared to, say, OpenBSD's

2014-09-25

I've heard a lot of things about Racket (well really, things about many different Lisp dialects), and most of them were good. Recently, I decided to try to decide between Haskell and a Lisp once and for all. I wanted to go for a Lisp-1, since they keep functions and values in the same namespace, which is how it should be. Eventually, after

2014-09-18

I've seen many completely stupid articles where people furiously circlejerk over how Vim is the best and nothing will ever come close, but it's rare that I see anyone write much about Emacs, probably because fewer people use it (or maybe Emacs attracts a different demographic). It's rare I see an article like

2014-08-24

Recently, I've been trying to understand the ins and outs of CVS in order to be able to contribute to OpenBSD without messing up anything. I have sent a few patches to ports@, but anything complex was beyond my abilities until recently.

2014-08-22

Recently, I've been trying to get away from pre-packaged file sharing solutions (e.g. FreeNAS) and trying to set up the services they provide from scratch. While I obviously won't be able to write a web GUI or create a whole distro, that simply isn't necessary. What is necessary is setting up a file share and appropriate read/write permissions.

2014-07-25

People like to say that learning Haskell is hard, because a pure, lazy, functional language isn't something most people are used to. The simple fact of the matter is that most people (nowadays, at least) will be introduced to programming through Python, Javascript or maybe even Java or C#. The problem with learning dynamically typed languages that don't tell you

2014-06-08

Earlier, I saw this article claiming to describe how to share a file "simply" by running Python's web server module (with either Python 2 or 3). While that may be easy, it's not simple, and certainly not fast.

2014-05-31

The inspiration for this came from this blog post, where the author describes how he uses his computer. While he does use CRUX, a GNU/Linux distro and I use OpenBSD, our workflows are actually surprisingly similar, which can, in part, be attributed to the

2014-05-08

I'm sure lots of people (dozens, perhaps) have installed OpenBSD on ThinkPad T61s of some description, but with the recent release of OpenBSD 5.5, lots of documentation has become (or already was, and now is even more so) obsolete, like this article

2014-04-18

On all of my computers, I like being efficient. That means eliminating everything which uses all that precious CPU time and using applications which are very customisable and configurable. These sorts of applications tend to be text-based, which is, in my eyes, a good thing, since they'll show you the information you need with a minimum of

2014-04-18

There is always a lot of buzz around the idea of "learning to program". While I think it's very important that children learn logical thinking and problem solving, I also realise that the majority of children, and people in general, would probably not benefit from a very language specific, rote learning based, generally old-fashioned approach to

2014-04-13

UPDATE: I now use Hakyll. See http://kaashif.co.uk/about for more. Also, 100% of the information in this blog post is now wrong or outdated.

When I decided I wanted to write a blog, I had to come up with some way of writing posts (in a markup format which isn't HTML), and serving them somehow.

2014-04-12

After a few months of running a website, there are a few things I have realised about how I ran my server when it was first delivered, and how I run it now. The changes have been, for the most part, for the better. Needless to say, when I first started out, I was clueless, overeager and far too ambitious with my plans for "the next Facebook" or something

2014-04-03

UPDATE: The project died, it went nowhere.

On nixers.net, whose IRC channel I spend a bit of time in, there has been a bit of a stir as the community tried to decide on a project to commit to. The idea of creating a distro of some OS came up. A few people wanted a BSD-based distro, but it was decided that the Linux kernel was

2014-03-01

When I first started programming, I barely had any idea of what constituted a good text editor, or why I'd want to use some old, texty editor from the 90s which didn't even have most of the features I took for granted in the IDE I was using at the time. Maybe this had something to do with one of my first languages being Java, which is widely considered an IDE language, but I went through the

2014-02-09

Over the years, I've used a few different OSes and desktop environments, and the one I use currently is portable to many operating systems, mostly due to the efforts of the writers of i3 over at i3wm.org, but also the standards-compliance of POSIX, meaning that my shell scripts (which you can find here) work on all

2014-02-05

Whenever you hear something about Haskell, chances are it sounds arcane and involves lots of complicated and intimidating mathematical language. Well, the truth is that all this talk of 'endofunctors' and 'monoids' is really unnecessary, if the concept of functors is explained using a simple analogy.

2014-01-22

When running a website on a residential connection, a problem one might run into is the dynamic IP address usually assigned by one's ISP. There are a few dynamic DNS services which basically let you have a subdomain (e.g. mydomain.example.com) and let you update it to point to your IP address whenever it changes. At one time, your IP might be 10.0.0.1, and your domain correctly

2014-01-14

If you have spent time on programming or technology boards like /g/ or /r/programming, chances are you might have heard the word "monad" thrown around a lot. You may have even heard the oft misquoted phrase "A monad is a monoid in the class of endofunctors" intended to be a joke, or to scare programmers away from scary functional languages like Haskell. The truth is that monads aren't

2014-01-08

There are a few very important things that everyone involved in software (particularly free software) can do to help out. The most important is to file detailed and helpful bug reports, so the developers working on your favourite program can get the problem fixed. Since it is not very hard to write a bug report, and projects generally have their own bug report guidelines, I won't

2013-12-23

While I did have some old hardware lying around, I had never committed to actually getting that hardware usable. By that, I mean I had never tried to browse the web, read emails and that sort of day-to-day stuff on anything older than a few years. To see if it were really possible, I decided to buy an old ThinkPad (a 760EL from 1995) and see if I could get it working. Before I

2013-12-01

stow is a cool little Perl script which basically just creates and deletes symlinks. That sounds pointless, but let me explain with an example. Let's say you want to install a program using the usual make install, which probably installs into /usr/local, which means it's separated from the rest of your system, which is managed with a "real" package manager. Unless you're using a

2013-11-30

For a few years now, I have been using Vim to edit config files, program in C, Python, even Lisp (people apparently think that Vim isn't the best for programming in Lisp). This isn't because I took a side in the so-called "editor wars", it's just because it came preinstalled on the first GNU/Linux system I used, Debian. Over time,

2013-11-27

This tutorial is designed for those who have programmed before, perhaps in a higher level language like Python or Ruby. It's not too hard to understand for those who are completely inexperienced, but some knowledge of functions, data structures and pointers might help. Most of the low-level stuff will be new to high level programmers, however.

2013-11-12

Often, people look at me oddly when I suggest that they email me something. "Why can't I just send it to you on Facebook or Skype?" they say. Well, it doesn't have to be those two media of communication, but it's usually something like that. When I say often, I also mean that this has only happened on two occasions, so bear that in mind as I make things up about the types of people

2013-11-06

This article is not only about disk wiping, it will hopefully teach you something about using some GNU command line tools. This tutorial was written on my ThinkPad, which runs Debian, so the output should be pretty similar to what you'd get on Ubuntu, Mint or any other Debian- or Ubuntu-based systems. Basically, if you're using something

2013-10-28

Installing packages from source

Recently, I had SSHed into one of the Debian Stable virtual machines I was using as a file server. The main services I was running were FTP, an HTTP server with a directory listing and a Samba server, with shares set up for a few users on my

2013-09-26

You probably have not heard of LaTeX before now. If you have, then it is likely that you have no idea what LaTeX is, save for a vague feeling that it relates to documents in some way. By the end of this short post, you will not only know what LaTeX is, but be able to understand why people use it, and what its advantages are. While you probably won't switch to LaTeX immediately, you might

2013-09-17

Imagine you're a person on some sort of device, using the internet. You see all of these websites and what do you ask? "How can I set up a web server?", of course. If you did not ask that question, then this guide is not for you. Anyway, down to business. You will need:

2013-08-11

Why am I writing this?

I have looked up "how to use gpg" so many times, on so many websites, and have found every guide to be focused on something I don't use or worded in such a way that I get confused and revoke all of my keys (that hasn't actually happened...yet). I thought I'd whip up a quick guide

Book Review: Children of Memory

2024-04-02

binfmt_misc: The magic behind Linux/Windows interop

2024-01-03

Differences in backwards incompatibility between Rust and C++

2024-01-02

How large are the arbitrage opportunities in Eve Online?

2023-09-23

Valuing converters in Sidereal Confluence

2023-08-08

Is implementing alloca(3) in C really impossible?

2023-07-13

Booting the 1994 Dr Dobb's 386BSD 1.0 CD

2023-06-19

Adding keyword arguments to Java with annotation processing

2023-03-27

The problem with using splice(2) for a faster cat(1)

2023-03-12

Searching for Planet X with the Z3 solver

2023-02-26

Why does Mockito need JVM bytecode generation?

2023-01-23

Java doesn't really get immutability

2022-10-23

Don't hide things from people reading your code

2022-10-16

Why object to the death of Venice?

2022-08-23

We sent the worst YCombinator application possible

2022-08-02

What does it mean for someone to "deserve" success?

2022-07-31

I'm moving to New York, here's why

2022-06-05

LeetCode on a Z80 CPU from 1976

2022-02-10

Don't stick to what you're good at - my startup catastrophes

2022-02-01

Why doesn't GCC do this "easy" NRVO optimization?

2022-01-25

Representing complex numbers exactly on a computer

2020-06-03

Writing a parser for a function call is surprisingly hard

2020-05-17

Decomposing representations: a slice of computational group theory

2020-04-19

How to accidentally become a maintainer of a project

2020-04-15

Electromagnetic weaponry for fun and profit

2019-10-29

Finding uses for neodymium magnets

2019-09-04

Modding a Sun Ultra 45 fan module

2019-08-14

Hacking into a Sky router

2019-08-06

My first homebrew computer!

2019-07-10

Containerizing my transcript search app

2019-06-07

HP PA-RISC Assembly Crash Course

2019-04-18

Reviving an HP PA-RISC server

2019-04-13

Using PostgreSQL to search transcripts

2019-03-31

Register windows: a cool feature of SPARC

2018-08-11

Porting OpenJK to sparc64

2018-01-13

Playing with LDoms, OpenBSD and Solaris

2017-12-03

Using vmm(4) to target old OpenBSD releases

2017-09-10

Reviving a Sun Ultra 5 workstation

2017-08-10

Sorting a ton of mail

2017-08-04

Moving to my own email server

2017-08-03