My “wiggle” program, which applies a patch to a different version of the file, needs to compute the shortest edit path between two texts: the smallest set of additions and deletions that must be made to one file so as to produce the other. From this edit path it is easy to produce a list of sections that are the same or are different, and wiggle uses this to work out what changes need to be made, and where to make them.
When the two files being compared are large – on the order of tens of thousands of words – this process can start taking a long time. I recently started experiencing such a case fairly often, so it was clearly time to optimize the code. I managed to find two optimizations that don’t change the result at all, and two others that degrade the result on large files in exchange for substantial speed improvements.
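To make “shortest edit path” concrete, here is a minimal sketch in Python. It is not wiggle's code, and it is not the algorithm wiggle uses: it is the plain quadratic longest-common-subsequence table, chosen only because it is short. The function name `edit_path` and the `=`, `-`, `+` markers are my own inventions for illustration.

```python
def edit_path(a, b):
    """Return a shortest list of steps turning word list a into word list b.

    Each step is ('=', word) for a word kept, ('-', word) for a word
    deleted from a, or ('+', word) for a word added from b.
    """
    n, m = len(a), len(b)
    # lcs[i][j] is the length of the longest common subsequence
    # of a[i:] and b[j:], filled in from the ends backwards.
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    # Walk forward through the table, always moving in a direction
    # that keeps the remaining common subsequence as long as possible.
    path = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            path.append(("=", a[i]))
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            path.append(("-", a[i]))
            i += 1
        else:
            path.append(("+", b[j]))
            j += 1
    # Whatever is left over on either side is pure deletion or addition.
    path.extend(("-", w) for w in a[i:])
    path.extend(("+", w) for w in b[j:])
    return path


if __name__ == "__main__":
    old = "the quick brown fox".split()
    new = "the slow brown fox jumps".split()
    for op, word in edit_path(old, new):
        print(op, word)
```

Runs of `=` steps are the sections that are the same, and runs of `-` and `+` steps are the sections that differ. The table costs time and memory proportional to the product of the two lengths, which is exactly the sort of cost that starts to hurt at tens of thousands of words.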
One of the behaviours of emacs that I want to avoid as I build edlib (my answer to emacs) is modal dialogs. I recently saw mention of Oberon and this link in particular: https://pdfs.semanticscholar.org/d48b/ecdaf5c3d962e2778f804e8c64d292de408b.pdf. It also mentions an aversion to modal dialogs. I don’t find myself completely convinced by the example given, but I do support the idea. This is as good an excuse as any to clarify my thoughts on the topic.
I have a new topic to write about – macro-economics. My interest is partly due to the enormous economic upheaval caused by COVID-19, but is more specifically due to reading Stephanie Kelton’s “The Deficit Myth” which outlines MMT – Modern Monetary Theory. I find this theory to be interesting and valuable but not entirely convincing (sometimes because I don’t agree, sometimes because there are gaps in the presentation). As this is my first post in a long time I’ve decided to start small and explore one simple topic: the value of money.
It has been some years since I last wrote about parsing with indents and line-breaks. Some of that time I’ve been distracted by other things, some of it has involved working on other areas of language design, but some of it has seen me struggling with line breaks again.
In my last note I had made significant progress over my original design, and there is much in that note that is still valuable. But some of it I got wrong too.
Most particularly, the idea that NEWLINES should be separators rather than terminators was mostly wrong. I now believe they must be mostly thought of as terminators, though there is one area where we must blur our definitions a little.
While I was writing test code for my first toy language I particularly noticed the lack of interesting data structures: to write interesting loops you can do a lot more if you can build data structures while doing it. So my second toy will begin to introduce data types. It won’t be until a later iteration that we get structures though.
The first step to this is knowing how to declare variables, or local bindings. Then I’ll have something to attach types of data structures to. So this installment is about local variables and their scope.
I started writing this 3 years ago and am only publishing it now. Sometimes life works like that. I had to wait until the code worked and I managed to lose interest for a while and get distracted by other things. At the recent linux.conf.au I went to a talk about the language “Pony”. While I’m not thrilled with Pony (I don’t currently think that expressions and statements are interchangeable), the talk inspired me to get back to ocean… So what can we say about scopes?
One of my core goals in developing edlib is to allow the display to be programmatically controlled: the content of a document is formatted dynamically as it is displayed. This means it can respond to context (e.g. the location of “point” can modify appearance more than just by displaying a cursor), and the entire document doesn’t need to be rendered into a buffer, just the parts being displayed.
A natural (for me) extension to this idea was the possibility that the source of the display wasn’t just one single document – multiple documents could be blended. Various examples have occurred to me, though few have been implemented.
My recent efforts with edlib have been to get rid of some potential NULL pointer dereference issues. When I first started writing “commands” I assumed that the correct value would always be passed. Of course that isn’t very safe unless it is enforced, and is particularly unsafe if I’m going to let users write their own commands in extension languages like python. So more safety is required.
Auditing the code to ensure I’m always being careful enough is fairly boring and error prone, so I wrote a tool to help me. More accurately I extended some existing tools. You can read lots more details in my LWN.net article. At the time I wrote that I still had some unresolved issues. I’ve resolved enough of those now that I no longer have warnings. Sometimes that is because I used casts to hide things, but a lot of real issues have been addressed. The versions of “sparse” and “smatch” that I am using are in the “safe” branch of the copies of these trees on github.com. So this for smatch and this for sparse.
Doing this involved adding a lot of ‘safe’ annotations throughout edlib. Hopefully these aren’t too confusing. It is nice to have the documentation of intent, and nice to know that a whole class of errors is now impossible.
I had to fix a few bugs in smatch/sparse as well as add new functionality to smatch. I really should post those bug fixes upstream…
I’ve been very slack. Sorry. I keep thinking “I should write a blog post about that” when I make some progress with edlib. But I also think “or I could write some more code instead”. I do enjoy writing blog posts. But I seem to enjoy the code more.
Consequently, I’ve lost track of what has happened since my last post. Git tells me that I have made 379 commits since then. I’ve added “Alt-!” to run a shell command, with a history of recent commands stored in a buffer. I’ve provided a way for attributes on text to trigger call-backs to display panes, to enable highlighting of text; and used this to better display matches for the current search. I’ve broken the “Refresh” event into three separate events, one that updates the sizes of panes, one that can update the position of a document in a pane, and one that redraws the content. And I’ve fixed lots of bugs and cleaned up lots of code. But the big thing that I’ve been working on is a notmuch email client.
In the weeks leading up to linux.conf.au 2016 I approached edlib development as a sprint. I wanted to put together sufficient functionality so that I could present my LCA2016 talk using slides displayed by edlib. I achieved this goal but at some cost. A lot of the development that went into that sprint was fairly hackish and not well thought out. It achieved the immediate goal but wasn’t good long term development.
With that cost came benefits. Not just working code but a clearer perspective on what I needed to do and what problems would be faced by creating a slide-presentation mode. I don’t regret the sprint at all but it certainly didn’t result in completed work. I had to go over all of that development and turn the prototype into well designed code. This naturally took quite a bit longer but resulted in much more coherent designs and a stronger overall structure. This is done now and my LCA2016 presentation is now in my mainline of edlib development and works well.
There were a number of structural changes that I made while revising all this functionality; I’ll just address a few of them here.
I’ve been busy over the last couple of months. A number of family and personal things meant I had less time for edlib, but I had a lot to do for edlib too. I really wanted to use edlib to give my presentation at linux.conf.au 2016 in beautiful Geelong.