Commands to an editor – at least the sort of editor that edlib is designed for – usually involve single keystrokes or key combinations. Translating those keystrokes, as well as mouse actions, into commands is the topic of this note. Many editors also support commands that are written out with words. Such commands are often introduced with Meta-X in emacs or “colon” in vi. For the purposes of edlib, such interactions can be managed by creating a simple document in a small window somewhere and accepting keystroke commands to edit that document. Some keystrokes will “complete” the command, which will make the window disappear and perform the required action. So word-based commands are certainly possible, but they happen at a different level.
The complex part of managing keystrokes is in tracking how they can mean different things in different contexts. There are a number of different sorts of context that all need to be managed.
- In today’s world it makes sense to support both “vi” and “emacs” styles of key bindings. These are global modes which potentially affect everything – at least everything on the one display. They determine not only how keys are interpreted but also what subsidiary modes are available. These modes stay in effect until they are explicitly changed to something else.
- Within these there are long-lasting subsidiary modes. “vi” has insert mode and vim adds visual mode to that. emacs has a variety of major modes, some of which redefine only a few keys while others redefine almost all of them. These modes also stay in effect until canceled or explicitly changed, but they can be seen as subordinate to the global modes rather than as further global modes.
- Then there are short-term modes which only last for one or two more keystrokes before the prior major mode is reverted to. The vi commands “c” and “d” (change and delete) introduce such temporary modes. In these modes a movement command is expected, and the text which would have been passed over by that movement is deleted. In emacs C-x (control-X) enters a temporary mode where most keystrokes have an alternate meaning.
- Numeric prefixes can be seen as introducing new modes too. Typing ‘2’ in vi will switch to a mode where the next command will be executed twice where that is meaningful.
Search commands fit into the category of “word-based” commands though they might seem quite different. A keystroke that initiates a search, such as ‘/’ in vi or ‘control-S’ in emacs, will create or enable a small window somewhere that becomes the focus for keystrokes. As the search string is built up, incremental searching or possible-match highlighting might update the display of the target document. When “enter” is pressed the window will close and the originating window will receive some synthetic event describing the search. In vi you can type “d/string” and everything up to the start of “string” will be deleted. So exactly what happens with that synthetic event from the search box will depend on the state of the originating window.
Finding an abstraction that encompasses all of the above (and probably more) requires finding effective answers to a few questions:
- It is clear that some state needs to be stored to guide how each keystroke is handled. How much state exactly?
- State can be stored explicitly, such as a number to represent the repeat-count, or implicitly by selecting a different key-to-command mapping. To what extent should each of these be used?
- How is state changed? Should all changes be explicitly requested by a command, or should some be automatic, such as a prefix being cleared once it has been used?
- Assuming that key-to-command mappings are used, should there be separate mappings for different states, or should the mappings be from state-plus-key to command, with some part of the state included in each lookup?
- If there are multiple windows on multiple documents, all in the one display, then some elements of state will affect all keystrokes, and some will only affect keystrokes in a particular window. How is this distinction managed?
The choices that edlib makes are:
- There are four elements of state: a ‘current’ mode, a ‘next’ mode, a numeric prefix, and 32 bits available for extra information. The ‘mode’ is a small number which can represent both long-term and short-term modes. Example modes might be vi-command, vi-insert, vi-change/delete, emacs-normal, emacs-C-x, emacs-escape. To enter a long-term mode, both ‘current’ and ‘next’ mode are set. To enter a transient mode, only ‘current’ mode need be set.
- Each pane (panes can be nested arbitrarily deeply) can identify a single mode+key-to-command mapping. When a keystroke is received it is combined with the current mode, and the result serves as the lookup key for those mappings. Lookup starts in the current focus window, which is often a leaf, and continues toward the root until a command is found for the mode+key and that command accepts it (a command can return a status to say “keep looking”).
- The current mode, numeric prefix, and extra information are provided to the command and those values are reset (current mode being set to ‘next’ mode) before running the command. The command can make arbitrary changes, such as restoring any of the state with arbitrary modifications.
- A command can synthesize new requests which can then be re-submitted at the current focus pane. These requests do not affect the current state, though the caller can modify the state before and after initiating the request. This allows responsibility for some commands to be shared. For example the keystroke to “move forward one word” might be translated into a virtual keystroke which always means “move forward one word”. Different panes, which might contain different sorts of document, might then interpret “word” differently. The emacs command alt-D (delete word forward) might record the current location, submit that “move forward one word” virtual keystroke, and then submit a “replace” virtual keystroke which tells the underlying document to replace everything from one mark to another with the empty string.
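The state and lookup rules in the list above can be sketched in a few lines of Python. This is only an illustration of the scheme as described, not edlib's actual code: the names `KeyState`, `Pane`, `handle_key`, `FALLTHROUGH`, the mode numbers, and the command signature are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical mode numbers; the note only says a mode is a small number.
VI_COMMAND, VI_INSERT, VI_DELETE = 0, 1, 2
FALLTHROUGH = object()  # returned by a command to say "keep looking"


@dataclass
class KeyState:
    mode: int = VI_COMMAND         # 'current' mode
    next_mode: int = VI_COMMAND    # mode that 'current' reverts to
    numeric: Optional[int] = None  # numeric prefix, None when absent
    extra: int = 0                 # 32 bits of extra information


@dataclass
class Pane:
    parent: Optional["Pane"] = None
    keymap: dict = field(default_factory=dict)  # (mode, key) -> command


def handle_key(state, focus, key):
    """Dispatch one keystroke; return True if some command accepted it."""
    # Snapshot the state for the command, then reset it before the command runs.
    mode, numeric, extra = state.mode, state.numeric, state.extra
    state.mode, state.numeric, state.extra = state.next_mode, None, 0
    pane = focus
    while pane is not None:  # walk from the focus pane toward the root
        command = pane.keymap.get((mode, key))
        if command is not None:
            if command(state, key, mode, numeric, extra) is not FALLTHROUGH:
                return True
        pane = pane.parent
    return False


log = []  # records what the demo commands did


def digit(state, key, mode, numeric, extra):
    # Restore the mode and extend the prefix, undoing the automatic reset.
    state.mode = mode
    state.numeric = (numeric or 0) * 10 + int(key)


def start_delete(state, key, mode, numeric, extra):
    # Transient mode: only 'current' is set, so it lapses after one command.
    state.mode = VI_DELETE
    state.numeric = numeric  # carry the repeat count to the movement


def delete_word(state, key, mode, numeric, extra):
    log.append(("delete-word", numeric or 1))


def decline(state, key, mode, numeric, extra):
    return FALLTHROUGH  # this pane has a binding but passes the key on


root = Pane(keymap={
    (VI_COMMAND, "2"): digit,
    (VI_COMMAND, "d"): start_delete,
    (VI_DELETE, "w"): delete_word,
})
leaf = Pane(parent=root, keymap={(VI_COMMAND, "d"): decline})

state = KeyState()
for key in "2dw":
    handle_key(state, leaf, key)
```

After the three keystrokes “2dw”, `log` holds a single delete-word entry with a count of 2 and the state has reverted to the vi-command mode with no prefix. The digit command shows how a command can restore state that was reset on its behalf, and the leaf pane's `decline` binding shows lookup continuing toward the root.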
When a search command, as discussed earlier, opens a search window, it might store the context that came with the search command, particularly the current mode. When the search completes and a virtual ‘search’ keystroke is synthesized, it can include the saved search context so that, for example, a vi-like editor knows whether to move to, delete to, or change to the search destination.
Mouse events are handled much the same as keystroke events, except that they are directed to the leaf-most pane which covers the location of the mouse event. State is passed in and changed in much the same way as for keystrokes.
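As a rough sketch of that idea, the search box below remembers the mode that was current when it was opened and attaches it to the synthetic event it produces. Everything here is hypothetical: `SearchBox`, `apply_search`, the tuple used for the event, and the mode numbers are invented for the example, and the document is modeled as a plain string.

```python
from dataclasses import dataclass

# Hypothetical mode numbers, for illustration only.
VI_COMMAND, VI_INSERT, VI_DELETE, VI_CHANGE = 0, 1, 2, 3


@dataclass
class SearchBox:
    saved_mode: int  # mode in effect when the search was started
    text: str = ""

    def add_char(self, ch):
        self.text += ch

    def enter(self):
        # Synthesize a virtual 'search' keystroke carrying the saved context.
        return ("search", self.saved_mode, self.text)


def apply_search(doc, point, event):
    """Return (doc, point, mode) after handling a synthetic search event."""
    _, mode, pattern = event
    target = doc.find(pattern, point)
    if target < 0:
        return doc, point, VI_COMMAND  # no match: nothing changes
    if mode == VI_COMMAND:
        return doc, target, VI_COMMAND  # plain search just moves
    # Delete and change both remove the text up to the start of the match.
    new_doc = doc[:point] + doc[target:]
    new_mode = VI_INSERT if mode == VI_CHANGE else VI_COMMAND
    return new_doc, point, new_mode


# "d/three" followed by enter: the box was opened from the delete mode.
box = SearchBox(saved_mode=VI_DELETE)
for ch in "three":
    box.add_char(ch)
deleted = apply_search("one two three", 0, box.enter())

# "/two" followed by enter: the box was opened from the command mode.
box2 = SearchBox(saved_mode=VI_COMMAND)
for ch in "two":
    box2.add_char(ch)
moved = apply_search("one two three", 0, box2.enter())
```

The same search string therefore produces a movement, a deletion, or a change depending only on the context saved when the box was opened.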
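Finding the leaf-most pane for a mouse event might look something like the following. This is a sketch under stated assumptions, not a description of edlib: the `Pane` fields, absolute coordinates, and the rule that later children sit on top of earlier ones are all choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Pane:
    name: str
    x: int  # left edge, in absolute display coordinates (an assumption)
    y: int  # top edge
    w: int  # width
    h: int  # height
    children: list = field(default_factory=list)

    def covers(self, px, py):
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


def leafmost(pane, px, py):
    """Return the deepest pane covering (px, py), or None if none does."""
    if not pane.covers(px, py):
        return None
    # Later children are assumed to be stacked above earlier ones.
    for child in reversed(pane.children):
        hit = leafmost(child, px, py)
        if hit is not None:
            return hit
    return pane


popup = Pane("popup", 50, 5, 10, 3)
left = Pane("left", 0, 0, 40, 24)
right = Pane("right", 40, 0, 40, 24, children=[popup])
root = Pane("root", 0, 0, 80, 24, children=[left, right])

target = leafmost(root, 55, 6)  # a click inside the popup
```

Once the target pane is known, the event can be looked up from that pane toward the root in the same way as a keystroke is looked up from the focus pane.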
Key-up events are not tracked or supported. Mouse-button-up events will be, but the details are not yet clear. Some sort of ‘grab’ will be required so that the mouse-up goes to the same pane as the mouse-down. When an event is synthesized, it can be directed to a specific location so that, for example, drag-and-drop functionality could be implemented by the mouse-up handler sending a ‘drop’ virtual keystroke to the appropriate co-ordinates.
So far, this model seems sufficient for all of my needs. If it turns out to be insufficient in some way, it will doubtless be revised.