Nifty fractal

Immediate Mode GUI Is a Mirage

Periodically someone rediscovers immediate-mode GUIs: hey, instead of creating this tree of complex objects, why not just call DrawWidget() in the draw function? So much simpler! (It’s genius!) Then a bunch of people chime in “I used Dear ImGUI in a project and it was fantastic! Oh, and uses it, too!” I, too, was excited about immediate-mode GUI when I first heard about it. I have also used Dear ImGUI, which is the canonical immediate-mode GUI, and widely used in games development.

You think about drawing your UI, especially if you are doing game development, and you think labels, checkboxes, buttons, and of course, it’s so brilliant, you don’t need a heavyweight retained-mode UI, just call a function! In fact, that’s what Unity did in the 2013 time-frame. There are two major problems, however: most widgets have a fair amount of internal state, and figuring out the rectangle to draw the widget tends to require information about the sizes of widgets that have not been drawn yet.

Widgets have a lot of internal state that somehow has to be passed to the draw function. It seems manageable for a button or checkbox: just a boolean of whether it is down/on. Assuming that mouse information was passed to the BeginGUI() call, that is sufficient for a no-frills button. However, buttons generally click on mouse release, so that you can move off the button to cancel if you clicked by mistake, so now you need to know which widget the previous mouse-down happened in (if the mouse is currently down or was just released). If you want tooltips, you need to know how long the mouse has been unmoved, whether this widget is currently showing the tooltip, and whether it has already shown a tooltip (which needs to be reset when the mouse moves out of the button). List boxes need to keep track of selected rows, the selection mode (none, select one, select many), and the scroll position. Text editing needs the selection and the scroll position, and if it supports CJK editing or compose characters it also needs to know the not-yet-converted text. If you want to support undo, then you need to keep track of the undo history since the last text commit. All this state adds up, and a true immediate-mode GUI pushes this state onto the caller. Once you start getting a lot of elements this state is probably going to be either a hard-coded, structs-in-structs UI tree, or a dynamic tree of dynamically allocated chunks. Once you start dynamically allocating things, though, you are “retain”ing that state, and this is starting to be a retained-mode GUI, except that the caller has to manage the internals. (A hard-coded structs-in-structs approach is retained, too, although not in the sense of malloc/new that is usually meant.)


// Naive immediate-mode API exports all its internals. A more sophisticated
// API could provide a struct for the internal state, which would improve
// the situation a little.
struct {
  ...
  struct {
    Rect frame;
    std::vector<std::string> entries = { "Mandelbrot", "Julia" };
    int selectedIdx = 0;      // widget value
    bool menuIsOpen = false;  // internal state
  } typeCombo;
  ...
  struct {
    Rect frame;
    int value = 0;               // widget value
    int minValue = 10;
    int maxValue = 512;
    int step = 1;
    int selectionStartIdx = -1;  // internal state
    int selectionEndIdx = -1;    // internal state
    bool hasTextFocus = false;   // internal state
  } maxIterEdit;
  ...
  // More sophisticated version
  struct {
    Rect frame;
    float value;
    float minValue = -100.0;
    float maxValue = 100.0;
    float step = 0.01;
    BetterFloatEditState state;
  } x0Edit;
} ui;
// Hmm, how do we compute the values for the frames?
BeginGUI(...);
  ...
DrawComboBox("Type", &ui.typeCombo.frame, ui.typeCombo.entries,
             &ui.typeCombo.selectedIdx, &ui.typeCombo.menuIsOpen);
  ...
DrawIntEdit("Max iterations", &ui.maxIterEdit.frame, &ui.maxIterEdit.value,
            ui.maxIterEdit.minValue, ui.maxIterEdit.maxValue, ui.maxIterEdit.step,
            &ui.maxIterEdit.selectionStartIdx, &ui.maxIterEdit.selectionEndIdx,
            &ui.maxIterEdit.hasTextFocus);
  ...
DrawBetterFloatEdit("x0", &ui.x0Edit.frame, &ui.x0Edit.value,
                    ui.x0Edit.minValue, ui.x0Edit.maxValue, ui.x0Edit.step, &ui.x0Edit.state);
  ...
EndGUI();

Widgets also need to be positioned, and widget layout is messy. Even something like right-justifying Ok | Cancel buttons requires some calculation, because you need to know the width of each button to know what the starting x-coordinate is. But knowing the width requires that you know the details of how the button is drawn: not only the text and icon (if applicable), but also the internal padding. Things get more difficult if you have a property panel, because the width of the property panel is the width of the text column plus the width of the property widget column. But to get the width of the widget column you need to know the width of the text column, since the width of the widget column is the width of the panel minus the width of the text column (and that is not accounting for internal padding, but at least that is generally a fixed amount). This is kind of a mess, and we have not even got to a complicated layout yet.

[Figure: a property panel with a label column and a widget column]
  Type            Mandelbrot
  Palette         Rainbow
  Max iterations  256
  x0              -2.20
  y0              -1.50
  x1              0.80
  y1              1.50
The minimum width of the property panel depends on the longest label (“Max iterations”) and the longest combobox text (“Mandelbrot”).
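
To make the simpler Ok | Cancel case concrete, here is a minimal sketch of right-justifying a row of buttons. Everything in it is an assumption for illustration: the fixed-width font, the padding constants, and the names MeasureButtonWidth and LayoutRightJustified are hypothetical, not taken from any real toolkit. The point is only that every button has to be measured before the first one can be placed.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Rect { float x, y, w, h; };

// Hypothetical metrics (assumptions for this sketch): a fixed-width font and
// fixed padding stand in for a real text-measurement API.
constexpr float kGlyphWidth = 8.0f;     // width of every glyph
constexpr float kPadding = 12.0f;       // internal padding on each side of the text
constexpr float kSpacing = 6.0f;        // gap between adjacent buttons
constexpr float kButtonHeight = 24.0f;

float MeasureButtonWidth(const std::string& label) {
  return kGlyphWidth * static_cast<float>(label.size()) + 2.0f * kPadding;
}

// Returns one frame per label, left to right, with the last button's right
// edge at rightEdge.
std::vector<Rect> LayoutRightJustified(const std::vector<std::string>& labels,
                                       float rightEdge, float y) {
  assert(!labels.empty());

  // Pass 1: the total width must be known before the first button is placed.
  float total = kSpacing * static_cast<float>(labels.size() - 1);
  for (const std::string& label : labels) total += MeasureButtonWidth(label);

  // Pass 2: assign frames, starting from the computed x-coordinate.
  std::vector<Rect> frames;
  float x = rightEdge - total;
  for (const std::string& label : labels) {
    const float w = MeasureButtonWidth(label);
    frames.push_back(Rect{x, y, w, kButtonHeight});
    x += w + kSpacing;
  }
  return frames;
}
```

Calling LayoutRightJustified({"Ok", "Cancel"}, 200.0f, 10.0f) places Ok at x = 82 and Cancel at x = 128 under these made-up metrics. Note that the frame for Ok depends on the width of Cancel, which an immediate-mode draw call for Ok has not seen yet.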

Unity’s old UI handled state data by requiring the caller to manage it. It handled layout and other details by calling your draw function multiple times. In the draw pass it actually drew everything, but in all the other passes (e.g. the layout pass) the draw commands were ignored. This makes sense, since it is how an immediate-mode GUI needs to work, but it was a pain to use. Managing the internal state of all those elements got unwieldy after three or four of them, and having your draw function called multiple times was … unexpected, although you got used to it. So everything was clunky and nobody liked it, and they eventually replaced it. (Several times, I think, but fortunately I was no longer using Unity at that point. I fairly quickly came to the conclusion that all the API decisions [not just UI] that Unity had made were rational, and probably what I would have done in their place, but in actual fact, were the wrong decisions.)
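
The multiple-pass idea can be sketched roughly as follows. This is not Unity's API: Pass, Gui, PropertyRow, and RunFrame are hypothetical names, and a fixed-width font stands in for real text measurement. The same user draw function runs twice per frame; the layout pass only measures, and the draw pass uses what was measured.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

enum class Pass { Layout, Draw };

struct DrawCommand { std::string text; float x; };

struct Gui {
  Pass pass = Pass::Layout;
  float labelColumnWidth = 0.0f;      // accumulated in Layout, consumed in Draw
  std::vector<DrawCommand> commands;  // stands in for real rendering
};

constexpr float kGlyphWidth = 8.0f;   // assumption: fixed-width font
constexpr float kColumnGap = 8.0f;    // assumption: gap between the columns

void PropertyRow(Gui& gui, const std::string& label, const std::string& value) {
  if (gui.pass == Pass::Layout) {
    // Layout pass: measure only; nothing is drawn.
    const float w = kGlyphWidth * static_cast<float>(label.size());
    gui.labelColumnWidth = std::max(gui.labelColumnWidth, w);
    return;
  }
  // Draw pass: the width of the label column is now known for every row.
  gui.commands.push_back({label, 0.0f});
  gui.commands.push_back({value, gui.labelColumnWidth + kColumnGap});
}

// The user's draw function. It gets called once per pass.
void DrawPanel(Gui& gui) {
  PropertyRow(gui, "Type", "Mandelbrot");
  PropertyRow(gui, "Max iterations", "256");
}

void RunFrame(Gui& gui, void (*draw)(Gui&)) {
  assert(draw != nullptr);
  gui.labelColumnWidth = 0.0f;
  gui.commands.clear();
  gui.pass = Pass::Layout;
  draw(gui);
  gui.pass = Pass::Draw;
  draw(gui);
}
```

After RunFrame(gui, DrawPanel), both values are positioned at x = 120, past the widest label. The cost is that any side effects in DrawPanel also happen twice, which is the part that takes getting used to.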

But, you say, Dear ImGUI is immediate mode and does not have these problems. Ah, but no, it has exactly these problems but it hides them from you. It keeps a retained-mode copy of the widget state internally. When you draw the widget, it uses the text of the widget (or an id string, particularly for widgets like list boxes) to go find the widget’s state. This is why if you forget and have two buttons named the same, the second button does not work properly. The string search (or hash, I’m not sure) is wastefully slow, but if it passed back a pointer to the state it would reveal that it is retained-mode under the hood, and you’d lose the convenience of not having to keep track of the pointer. All in all, it is probably the right decision, but it is hard to call Dear ImGUI immediate mode if it has retained-mode state hidden from you down in the crypt.
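
As a rough illustration of hidden retained state, here is a minimal sketch of a button whose click-on-release state lives in a map keyed by a hash of its label. It is not Dear ImGUI's actual implementation; Context, ButtonState, and this Button function are hypothetical, and BeginGUI just mirrors the name used in the listing earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

struct Rect { float x, y, w, h; };

struct ButtonState { bool pressedInside = false; };  // retained between frames

struct Context {
  float mouseX = 0.0f, mouseY = 0.0f;
  bool mouseDown = false;
  bool mouseWasDown = false;  // mouse button state on the previous frame
  // The hidden retained-mode part: per-widget state keyed by label hash.
  // (A real library would also have to cope with hash collisions.)
  std::unordered_map<std::size_t, ButtonState> states;
};

void BeginGUI(Context& ctx, float mouseX, float mouseY, bool mouseDown) {
  ctx.mouseWasDown = ctx.mouseDown;
  ctx.mouseX = mouseX;
  ctx.mouseY = mouseY;
  ctx.mouseDown = mouseDown;
}

// Returns true on the frame where the mouse is released over the button,
// provided the press also started over it.
bool Button(Context& ctx, const std::string& label, const Rect& frame) {
  assert(frame.w >= 0.0f && frame.h >= 0.0f);
  ButtonState& state = ctx.states[std::hash<std::string>{}(label)];
  const bool hovered = ctx.mouseX >= frame.x && ctx.mouseX < frame.x + frame.w &&
                       ctx.mouseY >= frame.y && ctx.mouseY < frame.y + frame.h;
  const bool pressed = ctx.mouseDown && !ctx.mouseWasDown;
  const bool released = !ctx.mouseDown && ctx.mouseWasDown;

  if (pressed) state.pressedInside = hovered;
  bool clicked = false;
  if (released) {
    clicked = state.pressedInside && hovered;
    state.pressedInside = false;
  }
  return clicked;
}
```

The caller never sees ButtonState, which is the convenience. The flip side shows up as soon as two buttons share a label: they share one map entry, so in this sketch the second call overwrites what the first recorded and a click on the first button is lost.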

Dear ImGUI deals with the problem of layout by relying on the caller drawing at 30 fps (which is reasonable, since it was intended for game debugging UIs). When a layout is required it either does not draw certain elements for a frame, or for one frame it draws their text with glyphs that are rectangles shaped roughly like the real glyphs. That frame figures out the proper widths so that the next frame is drawn correctly. Since there is only one in-process frame it all works seamlessly. Well, it does unless you are not writing a game and do not want to be redrawing all the time and wasting the user’s battery; then you have to figure out when you need to redraw twice.
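
The one-frame-lag approach, and the "when do I redraw twice" question that comes with it, can be sketched like this. Again, this is an illustration under assumptions, not how Dear ImGUI is actually written: Gui, LabelRow, and needsAnotherFrame are hypothetical names, and the font is fixed-width.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

constexpr float kGlyphWidth = 8.0f;  // assumption: fixed-width font

struct Gui {
  float labelWidthInUse = 0.0f;     // measured last frame, used for positioning now
  float labelWidthMeasured = 0.0f;  // being measured during the current frame
  bool needsAnotherFrame = false;   // true if this frame used stale widths
};

void BeginFrame(Gui& gui) {
  gui.labelWidthMeasured = 0.0f;
  gui.needsAnotherFrame = false;
}

// Records the label's width for the next frame and returns the x-coordinate
// at which this row's widget is placed in the current frame.
float LabelRow(Gui& gui, const std::string& label) {
  assert(!label.empty());
  const float w = kGlyphWidth * static_cast<float>(label.size());
  gui.labelWidthMeasured = std::max(gui.labelWidthMeasured, w);
  return gui.labelWidthInUse;
}

void EndFrame(Gui& gui) {
  if (gui.labelWidthMeasured != gui.labelWidthInUse) {
    // The layout changed, so what was just drawn is wrong: draw once more.
    gui.labelWidthInUse = gui.labelWidthMeasured;
    gui.needsAnotherFrame = true;
  }
}
```

A game loop can ignore needsAnotherFrame because the next frame is coming anyway. An application that only redraws on input has to check it after every frame and schedule the second draw itself.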

Now Dear ImGUI is a convenient API, and compared to Unity, it clearly made the right decisions on these two points. It is a nice API, but the fact that the most notable immediate-mode GUI is actually implemented as retained-mode proves the point that immediate-mode GUI is not a feasible solution.