The Document Model and The Machinery

Since my last few blog posts have been rummaging around at a pretty low level, let’s try something different this week and lift our gaze high, high — almost all the way to the top, to something that we refer to as the Document Model.

I’m not sure if anyone else is using this term too, or if there is another, more commonly used word for this thing. I just remember having these (somewhat heated) discussions a couple of years ago. We were clearly arguing about something, but it didn’t seem to have a good name, so we just started calling it the Document Model.

By the Document Model we mean the way documents, save/revert, undo/redo and window layout interoperate to provide an editing experience for the user. Note that we are not interested in how anything is implemented (that’s for later), just how things hang together conceptually. Any application that allows the user to create and edit things has some kind of Document Model.

This may seem suuuper abstract at this point, so let’s look at some concrete examples:

Microsoft Word has a pretty typical old-school document model. You have a file on disk. When you double-click it, it opens in a new window. In that window, you can edit it, undo/redo, save it back to disk or revert it. The undo stack is unlimited (or at least really big), but it is not saved with the document, so if you close your window you lose the possibility to undo.

For something radically different, consider a To-Do list app for your phone. Here, there is no concept of opening a document, you are just immediately editing (creating tasks and checking them off). Since there are no “documents” there is also no concept of “saving” or “reverting” — you just expect the data to somehow be automatically persisted for you and be there the next time you open the app. There is also (typically) no undo or redo functionality.

And between these two extremes there is a plethora of other possibilities.

In the document model we also count the way windows and tabs behave. Are there different kinds of windows — like regular windows and floating “tool” windows? What can be docked where? When do we open new windows? Etc. Version control is another important part of the document model — for any project that involves multiple people.

The Document Model for something like a game editor can be pretty complex, because game editors deal with a lot of stuff. They typically have an Asset Browser or similar that lets the user navigate between thousands of different resources with different editors and tools. How do we keep all that consistent and usable?

In the low-level world, where I like to spend a lot of my time, you can often find answers that are right, or at least right-ish. After all, you can always benchmark your code to see what runs faster. But up here in Document Model Land, things are a lot more loosey-goosey.

So it’s not that one document model is right or wrong; rather, we have choices. And the choices have consequences. In this post I’ll try to map out some of these choices and their consequences as they apply to our world of 3D scene/simulation/game editing. I’ll also talk a bit about some of the choices we’ve made for The Machinery, though a lot remains unanswered.

Let’s begin!

Saving: Projects or Individual Assets

A 3D world is typically built up from a collection of distinct, but interconnected, assets: models, textures, particle effects, scenes, etc. When designing a saving model, you can either focus on the distinctness or the interconnectedness. There are three basic approaches you can take:

  1. Assets are saved/reverted individually. I.e. you edit a specific scene/texture/model and when you are done, you save that particular thing.

  2. Saves are done on the project level. I.e. you can’t save changes to an individual model or texture, you can only save or revert the entire project.

  3. There are no explicit save or revert operations. Changes are always permanent (persisted automatically for the user).

Visual Studio is an example of model 1. Source code files are saved and reverted individually, even though they’re all a part of the same project. PowerPoint/Keynote could be considered as model 2. Even though a presentation has multiple individual parts (images, film clips, etc.) it’s all saved as a single thing. Finally, Google Docs and the Calendar app are using model 3. Your changes are saved automatically without any explicit save step.

Going from 1 → 2 → 3 we are trading explicit control for convenience and “not having to think about it”. Modern phone and web apps tend to increasingly favor model 3. Why should the user have to worry about when things are saved when it can be taken care of automatically?

Many desktop programs also secretly implement model 3 behind the scenes, even if they seem to be using model 1 on the surface. You can see this when you restart the program after a crash and it asks you if you want to “recover” the files you were working on. It was saving them all along!

Are there any advantages at all to explicit saving? I can see two:

  1. It explicitly indicates that the file is ready for sharing with other programs. I.e. you can attach it to an email or check it in to source control. When working in a large project together with other people, it is important to know exactly what has changed, when and why. Source control can do this too — you see what files you’ve changed when you check in, but it’s nice to be explicit about this. Furthermore, since source control works on a file-by-file basis, it maps a bit better to model 1.

  2. It gives you a “checkpoint” that you can go back to (by reverting or discarding changes). This allows you to play around a bit more without worrying that you will destroy your work, because you can always go back to the “checkpoint”. Model 1 allows you to checkpoint individual assets, while model 2 only allows you to checkpoint the project as a whole. Source control can do this too, but in a more complicated way.

It is possible to implement checkpoints in model 3 too. We could have a menu option that allows the user to create one (or more) checkpoints that they can revert to later. But since users aren’t that familiar with this new “checkpoint” concept we just invented, maybe we should just call the operation “Save” instead of “Create Checkpoint”. Now we’re back to implementing model 3, but disguising it as model 2.
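Here is a minimal Python sketch of that disguise. Everything in it (the `AutoSavedProject` class, the in-memory `state` dict standing in for persistent storage) is made up for illustration and says nothing about how any real application stores its data.

```python
import copy


class AutoSavedProject:
    """Model 3 storage with an explicit checkpoint layered on top."""

    def __init__(self):
        self.state = {}         # stands in for data that is persisted automatically
        self._checkpoint = {}   # the state that "Revert" goes back to

    def edit(self, key, value):
        # In model 3 every edit is persisted right away. In this sketch
        # that just means updating the in-memory state.
        self.state[key] = value

    def save(self):
        # The menu says "Save", but this is really "Create Checkpoint".
        self._checkpoint = copy.deepcopy(self.state)

    def revert(self):
        # Go back to the last checkpoint, discarding later edits.
        self.state = copy.deepcopy(self._checkpoint)

    @property
    def dirty(self):
        # True when there are edits since the last checkpoint.
        return self.state != self._checkpoint


project = AutoSavedProject()
project.edit("title", "Level 1")
project.save()
project.edit("title", "Level 1 (experiment)")
project.revert()  # the title is "Level 1" again
```

From the user’s point of view this behaves like model 2, even though nothing is ever lost if the application crashes between saves.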

Model 1 gives us individual checkpoints for each asset, while model 2 only gives us a checkpoint for the entire project. Individual asset checkpoints can be useful, but when the assets are interdependent they can also create inconsistencies.

For example, suppose that you add a bone to a model asset, and then an animation for that bone to an animation asset. Then you revert the model, but not the animation. Now you have an animation for a bone that doesn’t exist — an inconsistency.

Inconsistencies like these are kind of common in applications that use model 1. For example, in Visual Studio (our model 1 example) they can show up as compile errors, linker errors or missing project files. If you are building a model 1 application where there are dependencies between the assets you will face similar issues. You have to make sure your application has a way of displaying inconsistencies, such as missing files or references, and a way of resolving them.
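As a rough illustration of what such a check could look like, here is a small Python sketch for the bone/animation example. The asset layout (a dict of bone names per model, and animation tracks that name a model and a bone) is invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    """One animation track targeting a bone in a model (hypothetical layout)."""
    animation: str
    model: str
    bone: str


def find_dangling_tracks(models, tracks):
    """Return the tracks whose model or bone no longer exists.

    `models` maps a model name to the set of bone names it contains.
    """
    dangling = []
    for track in tracks:
        bones = models.get(track.model)
        if bones is None or track.bone not in bones:
            dangling.append(track)
    return dangling


# The model was reverted, so the "tail" bone is gone, but the animation
# that drives it was kept.
models = {"dragon": {"root", "spine"}}
tracks = [
    Track("walk", "dragon", "spine"),
    Track("wag", "dragon", "tail"),
]

for track in find_dangling_tracks(models, tracks):
    # prints "wag: missing bone dragon/tail"
    print(f"{track.animation}: missing bone {track.model}/{track.bone}")
```

A real editor would surface the result in the UI and offer a way to fix it (delete the track, or retarget it to another bone) rather than just printing it.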

All this extra error handling can be a lot of work, so we might be tempted to prevent it by using model 2. It would seem that with model 2, the application is always in control and can prevent inconsistencies from happening.

Unfortunately, if you want to use version control to collaborate with others, the application is not in control and inconsistencies can still happen. Even if you use model 2 and save all files at the same time, the version control software can still update, commit, merge and revert files individually. This can cause the exact same inconsistencies that you can get with model 1 and the revert operation. It might even be a bit trickier, since the saves are now less obvious. So you can’t get away from the inconsistencies; your program will just have to deal with them.

A side note about collaboration: AFAIK, the latest versions of Keynote will actually break down your slides into multiple individual pieces when they’re saved to iCloud, so that multiple people can work in the same document without causing conflicts (as long as they work on different pieces). So in a way, Keynote is a model 1 program, posing as model 2!

Oh, and one more thing. Whenever you are faced with a tricky UI decision like which save model to use, there will always be someone who says “why don’t we implement both and leave it up to the user as a configuration setting”. I think that’s a cop-out and hardly ever the right decision. It’s hard enough to make one workflow run smoothly. If you have thousands of different possible workflows based on different combinations of settings, none of them will be great. It also fragments the user base and makes it harder to follow tutorials (because the tutorial might use different settings than you). Don’t be afraid to take a stance. Applications should be opinionated! At least that’s my opinion.

Undo and Redo

The choice between a Projects viewpoint and an Assets viewpoint pops up again when you start looking at Undo and Redo. The Project approach is to have a single undo stack for the entire application. Any operation you perform ends up on that stack. The Individual Asset approach is to give each document its own undo stack. When you undo, you undo changes to that particular document.

Typically the Undo model follows the Save model, so if assets are saved individually they also have individual undo stacks. You can see this for example in Visual Studio. You can make some changes in a.cpp, make some other changes in b.cpp, then go back to a.cpp and undo the changes you did there without affecting b.cpp. With a shared undo stack, you would have to undo all the changes to b.cpp before you could get to the earlier changes you made to a.cpp.

I sometimes take advantage of these multiple undo stacks — i.e., go back and undo an older change in a different file — but it is a pretty rare occasion. I don’t know if that’s just me though. Do you ever use this feature, or do you mostly just undo your most recent changes?

Having multiple separate undo stacks can be problematic. With separate undo stacks per asset, any operation that affects multiple assets cannot be undone as a single operation. You can see this in Visual Studio with Find & Replace. Suppose you do a find-and-replace that replaces foo with bar in 200 files. Since there is no global undo stack, that operation has to go into the undo stack of each one of those individual documents. And if you want to undo it, you have to go into each document and undo the change there. That gets pretty tedious.

Another example: renaming a file is not an undoable action in Visual Studio. This makes sense, because it is not clear what undo stack that rename operation should go into. If we had a single project-wide undo stack, on the other hand, both Find & Replace and Rename could be undoable actions.
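To illustrate the Find & Replace case, here is a minimal Python sketch of a project-wide undo stack where one operation that touches many documents becomes a single undo entry. The `Project` class and its snapshot-based undo entries are assumptions made for this example, not a description of Visual Studio or The Machinery.

```python
class Project:
    """Documents plus one project-wide undo stack (illustrative only)."""

    def __init__(self, documents):
        self.documents = dict(documents)  # document name -> text
        self._undo = []                   # each entry: {name: text before the change}

    def replace_everywhere(self, old, new):
        """Replace `old` with `new` in every document as ONE undo entry.

        Returns the number of documents that were changed.
        """
        before = {}
        for name, text in self.documents.items():
            if old in text:
                before[name] = text
                self.documents[name] = text.replace(old, new)
        if before:
            self._undo.append(before)
        return len(before)

    def undo(self):
        """Undo the most recent operation, however many documents it touched."""
        if not self._undo:
            return False
        self.documents.update(self._undo.pop())
        return True


project = Project({"a.cpp": "int foo;", "b.cpp": "foo();", "c.cpp": "int x;"})
project.replace_everywhere("foo", "bar")  # changes a.cpp and b.cpp
project.undo()                            # restores both with a single undo
```

Storing whole-document snapshots keeps the sketch short. A real editor would more likely store smaller deltas per operation.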

When files have interdependencies, multiple undo stacks can lead to the same inconsistencies as the revert operation we talked about earlier. We can use the same example as before. Suppose you create a bone and an animation and undo the creation of the bone, but leave the animation there. Now there is an animation for a bone that doesn’t exist.

If there is only a single undo stack, undoing will always take you back to a previous state of the project, so weird surprises like this can’t happen.

A single undo stack can be surprising in other ways though. Suppose you open a material asset, work on that for a bit, save the project and close the material window. Then you open a scene asset and start working on that. Now if you press undo too many times you will start undoing the changes you made in the material file. Moreover, since you’ve closed the material window there is no on-screen indication of what is happening. (Unless you consider the opening and closing of windows undoable operations too, in which case windows would pop up and disappear as you kept undoing. This is weird in a different way.)

Side note: If you are into collaborative editing (which, full disclosure, I am) even if you go with a unified undo stack, multiple undo stacks will creep back into your life again, because each user needs their own undo stack. Otherwise, a user couldn’t undo their changes without risking undoing changes made by other users, which would be super weird. This mirrors what we saw for save and revert. There, we thought we got rid of the inconsistencies, but they crept back through collaboration (via version control software). Here, online collaboration puts us at risk for undo inconsistencies.

Tabs & Windows

Another important part of the Document Model is how tabs and windows work. Again, this is something that may not seem like a big deal, but in a game editor things can quickly get out of hand with different editors, tool panels, properties, message logs, etc, etc:

[Figure: a typical editor window.]

The questions you face when designing the window system are things like:

  • Are there different kinds of windows? Is the Animation Editor a different kind of window than the Scene Editor?
  • Is the Asset Browser a window, a tab, or both?
  • Can you drag tabs from one kind of window to another?
  • Where do you put menu bars? In the tabs or in the windows? Or only in the main window? Is there a main window, by the way? What makes the main window different from other windows? If you close the main window, does the application shut down or does another window become the main window?
  • Are there tabs and child tabs? Is the Properties tab a child of the Scene tab since it displays the properties of objects in the Scene tab? Does this mean the Properties tab should be closed if the Scene tab is closed? Should the Properties tab in fact be docked inside the Scene tab so we have nested levels of tabs? Do we need multiple levels of nesting?
  • When we open a scene for editing does it open a new window with multiple tabs, or just a new tab, or do we reuse an existing tab?

There are a lot of different possible approaches. For Stingray we designed a solution with “big tabs” and “small tabs”. “Big tabs” were full-sized editors with menu bars that could be docked in windows. “Small tabs” were smaller auxiliary tabs that were docked inside the big tabs. It ended up being pretty complicated and was never fully implemented.

For The Machinery we wanted a simpler approach, so we started with a few basic rules:

  • Everything is a tab.
  • All the tabs are on an equal footing — there are no child tabs and parent tabs, big tabs and small tabs.
  • The user can create as many tabs as they want of any particular type. There can be multiple Properties tabs, multiple Asset Browser tabs, etc.
  • Tabs can be freely moved around, docked and arranged in windows any way the user likes.
  • Windows don’t have any meaning by themselves — they are defined by the tabs that are docked there. You can create a window out of any combination of tabs.
  • All the windows are on an equal footing — there is no “main” window. Every window has a menu bar. The content of the menu bar is defined by the tabs that are docked in the window.

So for example, if the Profiler tab is docked in a window, that window’s menu bar will get a Profiler menu. The tabs “carry” their menus with them and “dock” them in the window where they are docked.
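A small Python sketch of that rule, with hypothetical `Tab` and `Window` classes invented for this example: the window has no menus of its own and simply asks its docked tabs for theirs.

```python
from dataclasses import dataclass, field


@dataclass
class Tab:
    title: str
    menus: list = field(default_factory=list)  # menus this tab carries with it


@dataclass
class Window:
    tabs: list = field(default_factory=list)

    def menu_bar(self):
        """Collect the menus of all docked tabs, in dock order, without duplicates."""
        bar = []
        for tab in self.tabs:
            for menu in tab.menus:
                if menu not in bar:
                    bar.append(menu)
        return bar


scene = Tab("Scene", ["Edit", "Entity"])
profiler = Tab("Profiler", ["Profiler"])
window = Window([scene, profiler])
print(window.menu_bar())  # prints ['Edit', 'Entity', 'Profiler']
```

Because the menu bar is recomputed from the docked tabs, dragging the Profiler tab to another window moves its menu along with it, with no extra bookkeeping.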

An important piece of the puzzle is how we handle dependencies between tabs. For example, the Asset Preview tab depends on the Asset Browser tab, because it should show a preview of the asset that is currently selected in the browser. How do we handle this without introducing a “child tab” concept when there can be multiple Asset Previews and Asset Browsers open?

Our solution is to define this relationship implicitly. Instead of permanently storing and keeping track of the parent-child relationship between tabs we determine it at runtime, when a tab is rendered. (This is where it helps to have an IMGUI based system.)

The way it works is this: When it’s time for the Asset Preview to render it checks the window it is currently docked in to find an Asset Browser there. If it finds one, it shows a preview of the currently selected asset in that browser. If there are multiple Asset Browsers docked in the window, it shows the preview of whichever one the user last interacted with. We assume that this is the one the user wants to see.

If there is no Asset Browser docked in the same window, we look for the last Asset Browser interacted with in any window. If we don’t find any Asset Browser at all, the preview is blank. Note that this means that if you move the Asset Preview from one window with an Asset Browser to another, the Asset Preview will change to show the selected asset in the other window.
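The lookup described above can be sketched in a few lines of Python. The `AssetBrowser` and `Window` classes and the `last_interaction` counter are assumptions made for this example; the actual data structures in The Machinery are not shown in this post.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AssetBrowser:
    selected: Optional[str] = None
    last_interaction: int = 0  # hypothetical counter, bumped on user input


@dataclass
class Window:
    tabs: list = field(default_factory=list)


def find_browser(preview_window, all_windows):
    """Pick the Asset Browser a preview should follow, or None."""
    def browsers(window):
        return [t for t in window.tabs if isinstance(t, AssetBrowser)]

    # 1. Prefer browsers docked in the same window as the preview.
    candidates = browsers(preview_window)
    # 2. Otherwise fall back to browsers in any window.
    if not candidates:
        candidates = [b for w in all_windows for b in browsers(w)]
    # 3. Among the candidates, take the one most recently interacted with.
    return max(candidates, key=lambda b: b.last_interaction, default=None)


def preview_asset(preview_window, all_windows):
    """Return the asset to preview, or None for a blank preview.

    Called every time the preview renders, so nothing is stored
    between frames.
    """
    browser = find_browser(preview_window, all_windows)
    return browser.selected if browser else None
```

Since the relationship is recomputed on every render, there is no parent-child link to keep in sync when tabs are moved, closed or created.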

The Properties tab works the same way, but instead of just looking for the Scene tab we look for any tab that wants to display properties of objects. This way we can reuse the same Properties tab for multiple things. For example, when an object is selected in the Asset Browser, the Properties tab can show the properties of that asset. This helps keep the number of windows down, which reduces the tab/window management that the user has to deal with. Other helper windows, such as the Tree or Help, can also be reused in the same way.
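One way to express “any tab that wants to display properties” is to look for a capability rather than a tab type. The Python sketch below does this with a hypothetical `selected_properties()` method; the method name and the tab classes are invented for illustration.

```python
class SceneTab:
    def selected_properties(self):
        # Properties of the object currently selected in the scene.
        return {"position": (0, 0, 0)}


class AssetBrowserTab:
    def selected_properties(self):
        # Properties of the asset currently selected in the browser.
        return {"path": "rock.mesh"}


class LogTab:
    """A tab that has nothing to show in the Properties tab."""


def properties_to_show(tabs_by_recency):
    """Return the properties of the first tab that offers any.

    `tabs_by_recency` lists candidate tabs, most recently used first.
    """
    for tab in tabs_by_recency:
        provider = getattr(tab, "selected_properties", None)
        if callable(provider):
            return provider()
    return {}


# The log was used last, but it offers no properties, so the
# Asset Browser's selection is shown instead.
shown = properties_to_show([LogTab(), AssetBrowserTab(), SceneTab()])
```

A new kind of editor tab can then feed the Properties tab just by providing that one method.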

In the future we’ll probably implement some way of “locking” and “unlocking” the connection between the tabs, so that you can keep the Asset Preview focused on one Asset Browser regardless of how you move it around, if you want to.

There is a bit of interaction between the window model and the save model. In save model 1, open windows are typically used to represent unsaved files and when you close the window you will be asked if you want to save the changes. Having this one-to-one correspondence between changed documents and open windows means that if you do something that changes a bunch of documents you also have to open windows corresponding to all those documents. This is why in Visual Studio, if you do a search-and-replace across multiple files, all those files will be opened.

There are situations where this model works really badly. For example, suppose that one of our assets is an Entity Prototype, and suppose we have an operation in the Scene editor that allows us to place a prototype in the scene, make some changes to it and then save them back to the prototype, so that they apply to all instances of the prototype. If we have the rule that every modified document must correspond to an open window, that “save back” operation has to open a window for the prototype asset, which feels pretty weird.

Another situation where this doesn’t work so well is if you allow the Properties editor to work directly on assets. I.e., if you can select one (or more) assets in the asset browser, see some properties for them in the Properties window and modify them. Now we would have to open editing windows for all those assets.

Of course, it is possible to use save model 1 without requiring every modified asset to have an open window. We could just keep all the modified assets in a list somewhere. This means that changes will not actually be lost when a window is closed. So should we still ask the user to save the changes?
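For reference, such a list could be as simple as the following Python sketch. The `AssetStore` class is hypothetical; it only shows that tracking unsaved changes does not have to be tied to open windows.

```python
class AssetStore:
    """Tracks modified assets independently of any open window (illustrative)."""

    def __init__(self):
        self.saved = {}     # asset name -> last saved value
        self.modified = {}  # asset name -> unsaved value

    def edit(self, name, value):
        # Any tab can call this, whether or not the asset has a window open.
        self.modified[name] = value

    def get(self, name):
        # Unsaved changes take precedence over the saved value.
        return self.modified.get(name, self.saved.get(name))

    def save(self, name):
        if name in self.modified:
            self.saved[name] = self.modified.pop(name)

    def revert(self, name):
        self.modified.pop(name, None)

    def unsaved(self):
        """Names of all assets with unsaved changes, for display in the UI."""
        return sorted(self.modified)


store = AssetStore()
store.edit("rock.material", {"roughness": 0.8})
store.edit("tree.material", {"roughness": 0.5})
store.save("rock.material")
print(store.unsaved())  # prints ['tree.material']
```

Closing a window does not touch `modified`, which is exactly why the question of when to prompt the user becomes less obvious.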

Conclusions

At this point we’re pretty happy with the way windows and tabs work, but we are still weighing the pros and cons of the different save and undo models. Is being more explicit about exactly what files are being changed and when worth the risk of inconsistencies and non-undoable actions? Stay tuned, and let me know your opinions at @niklasfrykholm.