From kafkatrap to honeytrap

http://esr.ibiblio.org/?p=6907

I received a disturbing warning today from a source I trust.

The short version is: if you are any kind of open-source leader or senior figure who is male, do not be alone with any female, ever, at a technical conference. Try to avoid even being alone, ever, because there is a chance that a “women in tech” advocacy group is going to try to collect your scalp.

IRC conversation, portions redacted to protect my informant, follows.


15:17:58 XXXXXXXXXXXX | I'm super careful about honey traps.  For a 
                      | while, that's how the Ada Initiative was    
                      | trying to pre-generate outrage and collect  
                      | scalps.                                     
15:18:12          esr | REALLY?                                    
15:18:22          esr | That's perverse.                           
15:18:42 XXXXXXXXXXXX | Yeah, because the upshot is, I no longer   
                      | can afford to mentor women who are already 
                      | in tech.                                    
15:18:54          esr | Right.                                     
15:19:01 XXXXXXXXXXXX | I can and do mentor ones who are not in    
                      | it, but are interested and able            
15:19:21 XXXXXXXXXXXX | but once one is already in...  nope        
15:20:08 XXXXXXXXXXXX | The MO was to get alone with the target,   
                      | and then immediately after cry "attempted  
                      | sexual assault".                            
15:23:27          esr | When the backlash comes it's going to be   
                      | vicious.  And women who were not part of   
                      | this bullshit will suffer for it.          
15:23:41 XXXXXXXXXXXX | I can only hope.                            
15:25:21          esr | Ah. On the "Pour encourager les autres"    
                      | principle?  I hadn't thought of that.      
                      | Still damned unfortunate, though.          
15:26:40 XXXXXXXXXXXX | Linus is never alone at any conference.    
                      | This is not because he lets fame go to his
                      | head and likes having a posse around.      
15:26:54 XXXXXXXXXXXX | They have made multiple runs at him.       
15:27:29          esr | Implied warning noted.                      
15:27:34            * | XXXXXXXXXXXX nods

An A&D regular who is not myself was present for this conversation, but I’ll let him choose whether to confirm his presence and the content.

“They have made multiple runs at him.” Just let the implications of that sink in for a bit. If my source is to be believed (and I have found him both well-informed and completely trustworthy in the past) this was not a series of misunderstandings, it was a deliberately planned and persistent campaign to frame Linus and feed him to an outrage mob.

I have to see it as an attempt to smear and de-legitimize the Linux community (and, by extension, the entire open-source community) in order to render it politically pliable.

Linus hasn’t spoken out about this; I can think of several plausible and good reasons for that. And the Ada Initiative shut down earlier this year. Nevertheless, this report is consistent with reports of SJW dezinformatsiya tactics from elsewhere and I think it would be safest to assume that they are being replicated by other women-in-tech groups.

(Don’t like that, ladies? Tough. You were just fine with collective guilt when the shoe was on the other foot. Enjoy your turn!)

I’m going to take my source’s implied advice. And view “sexual assault” claims fitting this MO with extreme skepticism in the future.

Hieratic documentation

http://esr.ibiblio.org/?p=6903

Here’s where I attempt to revive and popularize a fine old word in a new context.

hieratic, adj. Of or concerning priests; priestly. Often used of the ancient Egyptian writing system of abridged hieroglyphics used by priests.

Earlier today I was criticizing the waf build system in email. I wanted to say that its documentation exhibits a common flaw, which is that it reads not much like an explanation but as a memory aid for people who are already initiates of its inner mysteries. But this was not the main thrust of my argument; I wanted to observe it as an aside.

Here’s what I ended up writing:

waf notation itself relies on a lot of aspect-like side effects and spooky action at a distance. It has much poorer locality and compactness than plain Makefiles or SCons recipes. This is actually waf’s worst downside, other perhaps than the rather hieratic documentation.

I was using “hieratic” in a sense like this:

hieratic, adj. Of computer documentation, impenetrable because the author never sees outside his own intimate knowledge of the subject and is therefore unable to identify or meet the expository needs of newcomers. It might as well be written in hieroglyphics.

Hieratic documentation can be all of complete, correct, and nearly useless at the same time. I think we need this word to distinguish subtle disasters like the waf book – or most of the NTP documentation before I got at it – from the more obvious disasters of documentation that is incorrect, incomplete, or poorly written simply considered as expository prose.

I’m not going to ‘take everyone’s guns away’

http://esr.ibiblio.org/?p=6895

An expectation of casual, cynical lying has taken over American political culture. Seldom has this been more obviously displayed than Barack Obama’s address to police chiefs in Chicago two days ago.

Here is what everyone in the United States of America except possibly a handful of mental defectives heard:

Obama’s anti-gun-rights base: “I’m lying. I’m really about the Australian-style gun confiscation I and my media proxies were talking up last week, and you know it. But we have to pretend so the knuckle-dragging mouth-breathers in flyover country will go back to sleep, not put a Republican in the White House in 2016, and not scupper our chances of appointing another Supreme Court justice who’ll burn a hole in the Bill of Rights big enough to let us take away their eeeeevil guns. Eventually, if not next year.”

Gun owners: “I’m lying. And I think you’re so fucking stupid that you won’t notice. Go back to screwing your sisters and guzzling moonshine now, oh low-sloping foreheads, everything will be juuust fiiine.”

Everyone else: “This bullshit again?”

Of course, the mainstream media will gravely pretend to believe Obama, so that they can maintain their narrative that anyone who doesn’t is a knuckle-dragging, mouth-breathing, sister-fucking, Confederate-flag-waving RAAACIST.

NTPsec is not quite a full rewrite

http://esr.ibiblio.org/?p=6881

In the wake of the Ars Technica article on NTP vulnerabilities, and Slashdot coverage, there has been sharply increased public interest in the work NTPsec is doing.

A lot of people have gotten the idea that I’m engaged in a full rewrite of the code, however, and that’s not accurate. What’s actually going on is more like a really massive cleanup and hardening effort. To give you some idea how massive, I report that the codebase is now down to about 43% of the size we inherited – in absolute numbers, down from 227KLOC to 97KLOC.

Details, possibly interesting, follow. But this is more than a summary of work; I’m going to use it to talk about good software-engineering practice by example.

The codebase we inherited, what we call “NTP Classic”, was not horrible. When I was first asked to describe it, the first thought that leapt to my mind was that it looked like really good state-of-the-art Unix systems code – from 1995. That is, before API standardization and a lot of modern practices got rolling. And well before ubiquitous Internet made security hardening the concern it is today.

Dave Mills, the original designer and implementor of NTP, was an eccentric genius, an Internet pioneer, and a systems architect with vision and exceptional technical skills. The basic design he laid down is sound. But it was old code, with old-code problems that his successor (Harlan Stenn) never solved. Problems like being full of port shims for big-iron Unixes from the Late Cretaceous, and the biggest, nastiest autoconf hairball of a build system I’ve ever seen.

Any time you try to modify a codebase like that, you tend to find yourself up to your ass in alligators before you can even get a start on draining the swamp. Not the least of the problems is that a mess like that is almost forbiddingly hard to read. You may be able to tell there’s a monument to good design underneath all the accumulated cruft, but that’s not a lot of help if the cruft is overwhelming.

One thing the cruft was overwhelming was the effort to secure and harden NTP. This was a serious problem; by late last year (2014) NTP was routinely cracked and in use as a DDoS amplifier, with consequences Ars Technica covers pretty well.

I got hired (the details are complicated) because the people who brought me on believed me to be a good enough systems architect to solve the general problem of why this codebase had become such a leaky mess, even if they couldn’t project exactly how I’d do it. (Well, if they could, they wouldn’t need me, would they?)

The approach I chose was to start by simplifying. Chiseling away all the historical platform-specific cruft in favor of modern POSIX APIs, stripping the code to its running gears, tossing out as many superannuated features as I could, and systematically hardening the remainder.

To illustrate what I mean by ‘hardening’, I’ll quote the following paragraph from our hacking guide:

* strcpy, strncpy, strcat:  Use strlcpy and strlcat instead.
* sprintf, vsprintf: use snprintf and vsnprintf instead.
* In scanf and friends, the %s format without length limit is banned.
* strtok: use strtok_r() or unroll this into the obvious loop.
* gets: Use fgets instead. 
* gmtime(), localtime(), asctime(), ctime(): use the reentrant *_r variants.
* tmpnam() - use mkstemp() or tmpfile() instead.
* dirname() - the Linux version is re-entrant but this property is not portable.

This formalized an approach I’d used successfully on GPSD – instead of fixing defects and security holes after the fact, constrain your code so that it cannot have defects. The specific class of defects I was going after here was buffer overruns.

OK, you experienced C programmers out there are thinking “What about wild-pointer and wild-index problems?” And it’s true that the achtung verboten above will not solve those kinds of overruns. But another prong of my strategy was systematic use of static code analyzers like Coverity, which actually is pretty good at picking up the defects that cause that sort of thing. Not 100% perfect, C will always allow you to shoot yourself in the foot, but I knew from prior success with GPSD that the combination of careful coding with automatic defect scanning can reduce the hell out of your bug load.

Another form of hardening is making better use of the type system to express invariants. In one early change, I ran through the entire codebase looking for places where integer flag variables could be turned into C99 booleans. The compiler itself doesn’t do much with this information, but it gives static analyzers more traction.

Back to chiseling away code. When you do that, and simultaneously code-harden, and use static analyzers, you can find yourself in a virtuous cycle. Simplification enables better static checking. The code becomes more readable. You can remove more dead weight and make more changes with higher confidence. You’re not just flailing.

I’m really good at this game (see: 57% of the code removed). I’m stating that to make a methodological point; being good at it is not magic. I’m not sitting on a mountaintop having satori, I’m applying best practices. The method is replicable. It’s about knowing what best practices are, about being systematic and careful and ruthless. I do have an advantage because I’m very bright and can hold more complex state in my head than most people, but the best practices don’t depend on that personal advantage – its main effect is to make me faster at doing what I ought to be doing anyway.

A best practice I haven’t covered yet is to code strictly to standards. I’ve written before that one of our major early technical decisions was to assume up front that the entire POSIX.1-2001/C99 API would be available on all our target platforms and treat exceptions to that (like Mac OS X not having clock_gettime(2)) as specific defects that need to be isolated and worked around by emulating the standard calls.

This differs dramatically from the traditional Unix policy of leaving all porting shims back to the year zero in place because you never know when somebody might want to build your code on some remnant dinosaur workstation or minicomputer from the 1980s. That tradition is not harmless; the thicket of #ifdefs and odd code snippets that nobody has tested in Goddess knows how many years is a major drag on readability and maintainability. It rigidifies the code – you can wind up too frightened of breaking everything to change anything.

Little things also matter, like fixing all compiler warnings. I thought it was shockingly sloppy that the NTP Classic maintainers hadn’t done this. The pattern detectors behind those warnings are there because they often point at real defects. Also, voluminous warnings make it too easy to miss actual errors that break your build. And you never want to break your build, because later on that will make bisection testing more difficult.

Yet another important thing to do on an expedition like this is to get permission – or give yourself permission, or fscking take permission – to remove obsolete features in order to reduce code volume, complexity, and attack surface.

NTP Classic had two control programs for the main daemon, one called ntpq and one called ntpdc. ntpq used a textual (“mode 6”) packet protocol to talk to ntpd; ntpdc used a binary one (“mode 7”). Over the years it became clear that ntpd’s handler code for mode 7 messages was a major source of bugs and security vulnerabilities, and ntpq mode 6 was enhanced to match its capabilities. Then ntpdc was deprecated, but not removed – the NTP Classic team had developed a culture of never breaking backward compatibility with anything.

And me? I shot ntpdc through the head specifically to reduce our attack surface. We took the mode 7 handler code out of ntpd. About four days later Cisco sent us a notice of critical DoS vulnerability that wasn’t there for us precisely because we had removed that stuff.

This is why ripping out 130KLOC is actually an even bigger win than the raw numbers suggest. The cruft we removed – the portability shims, the obsolete features, the binary-protocol handling – is disproportionately likely to have maintainability problems, defects and security holes lurking in it and implied by it. It was ever thus.

I cannot pass by the gains from taking a poleaxe to the autotools-based build system. It’s no secret that I walked into this project despising autotools. But the 31KLOC monstrosity I found would have justified a far more intense loathing than I had felt before. Its tentacles were everywhere. A few days ago, when I audited the post-fork commit history of NTP Classic in order to forward-port their recent bug fixes, I could not avoid noticing that a disproportionately large percentage of their commits were just fighting the build system, to the point where the actual C changes looked rather crowded out.

We replaced autotools with waf. It could have been scons – I like scons – but one of our guys is a waf expert and I don’t have time to do everything. It turns out waf is a huge win, possibly a bigger one than scons would have been. I think it produces faster builds than scons – it automatically parallelizes build tasks – which is important.

It’s important because when you’re doing exploratory programming, or mechanical bug-isolation procedures like bisection runs, faster builds reduce your costs. They also have much less tendency to drop you out of a good flow state.

Equally importantly, the waf build recipe is far easier to understand and modify than what it replaced. I won’t deny that waf dependency declarations are a bit cryptic if you’re used to plain Makefiles or scons productions (scons has a pretty clear readability edge over waf, with better locality of information) but the brute fact is this: when your build recipe drops from 31KLOC to 1.1KLOC you are winning big no matter what the new build engine’s language looks like.

The discerning reader will have noticed that though I’ve talked about being a systems architect, none of this sounds much like what you might think systems architects are supposed to do. Big plans! Bold refactorings! Major new features!

I do actually have plans like that. I’ll blog about them in the future. But here is truth: when you inherit a mess like NTP Classic (and you often will), the first thing you need to do is get it to a sound, maintainable, and properly hardened state. The months I’ve spent on that are now drawing to a close. Consequently, we have an internal schedule for first release; I’m not going to announce a date, but think weeks rather than months.

The NTP Classic devs fell into investing increasing effort merely fighting the friction of their own limiting assumptions because they lacked something that Dave Mills had and I have and any systems architect necessarily must have – professional courage. It’s the same quality that a surgeon needs to cut into a patient – the confidence, bordering on arrogance, that you do have what it takes to go in and solve the problem even if there’s bound to be blood on the floor before you’re done.

What justifies that confidence? The kind of best practices I’ve been describing. You have to know what you’re doing, and know that you know what you’re doing. OK, and I fibbed a little earlier. Sometimes there is a kind of Zen to it, especially on your best days. But to get to that you have to draw water and chop wood – you have to do your practice right.

As with GPSD, one of my goals for NTPsec is that it should not only be good software, but a practice model for how to do right. This essay, in addition to being a progress report, was intended as a down payment on that promise.

To support my work on NTPsec, please pledge at my Patreon or GoFundMe pages.

Are tarballs obsolete?

http://esr.ibiblio.org/?p=6875

NTPsec is preparing for a release, which brought a question to the forefront of my mind. Are tarballs obsolete?

The center of the open-source software-release ritual used to be making a tarball, dropping it somewhere publicly accessible, and telling the world to download it.

But that was before two things happened: pervasive binary-package managers and pervasive git. Now I wonder if it doesn’t make more sense to just say “Here’s the name of the release tag; git clone and checkout”.

Pervasive binary package managers mean that, generally speaking, people no longer download source code unless they’re either (a) interested in modifying it, or (b) a distributor intending to binary-package it. A repository clone is certainly better for (a) and as good or better for (b).

(Yes, I know about source-based distributions, you can pipe down now. First, they’re too tiny a minority to affect my thinking. Secondly, it would be trivial for their build scripts to include a clone and pull.)

Pervasive git means clones are easy and fast even for projects with a back history as long as NTPsec’s. And we’ve long since passed the point where disk storage is an issue.

Here’s an advantage of the clone/pull distribution system; every clone is implicitly validated by its SHA1 hash chain. It would be much more difficult to insert malicious code in the back history of a repo than it is to bogotify a tarball, because people trying to push to the tip of a modified branch would notice sooner.

What use cases are tarballs still good for? Discuss…