VisualEditor/Tutorial

Tutorial introduction

What is VisualEditor?

VisualEditor is an in-browser rich text editor for HTML documents. It is most widely used as the Wikipedia editor. However, the core implementation is a standalone JavaScript library that does not depend on the MediaWiki platform.

Who is this tutorial for?

This tutorial focuses mainly on the fundamentals of the VisualEditor core implementation. It is aimed at newcomers joining the Editing team and volunteers who wish to hack on internals, but it should be interesting to anyone who wants to understand how an in-browser rich text editor works.

Some of the fundamentals depend heavily on algorithms, so a computer science background would be very helpful, but pointers will be provided so the reader can fill in necessary background learning.

A word about MediaWiki integration

VisualEditor works on HTML documents; it doesn’t know wikitext. However, MediaWiki’s native storage format is wikitext. The two can work together because MediaWiki’s parser (Parsoid) can automatically translate to/from HTML+RDFa format, and a lot of work has gone into ensuring diffs round-trip cleanly, so that source editors who use raw wikitext can work side-by-side with rich-text editors who use VisualEditor.

VisualEditor never needs to parse wikitext directly; from its point of view, it just sees MediaWiki loading/saving HTML+RDFa documents.

Tutorial 1: Dive right in

Let’s dive right in, without even bothering to install a development environment on your computer. Open https://simple.wikipedia.org/w/index.php?title=Oolong&oldid=9187227&veaction=edit in your browser.

Some notes.

We’re going to try things out on a live Wikipedia page, loaded into VisualEditor. So of course don’t click Publish! (Though if you did ever publish a test change by accident, you could just revert it, or someone else would.)

Working on a live Wikipedia page means you don’t need a local development environment identical to the instance, which can actually be hard to achieve. There are over 300 versions of Wikipedia (in different languages) and each may have different templates and extensions installed, not to mention a data set that can be very large.

We’re using Simple English Wikipedia, which is written in easy-to-understand English and has a (much) smaller data set than full English Wikipedia.

In the URL, oldid=9187227 means we’re opening a specific revision of the page, not necessarily the most current. This is helpful for a tutorial because it means the data will be exactly as the tutorial expects. In general, if you edit an old revision then click Publish, you’ll potentially undo the changes made in subsequent revisions — as an editor you’re responsible for handling this manually. This doesn’t matter for the tutorial, because you won’t ever publish.

In the URL, veaction=edit means we’re jumping straight into VisualEditor. Normally, you’d open an article in read mode, and then click to edit the page.

Q1

Now open the developer tools. In Firefox and Chromium, you can press Ctrl+Shift+I to do this. Click on the Console tab, and type:

s = ve.init.target.surface.model
d = s.documentModel

Then d is the document model (i.e. the abstract data representation of the document we're editing). s is the surface model, which additionally represents the current selection.

We will use s and d so defined throughout this tutorial.

Now try:

d.getData()

The array returned is a dump of the linear model.

Q1a. How long is the array?

A1a. The array has length 779.

Q1b. What section within the array represents the word “traditional” in the first paragraph?

A1b. Offsets 46-56 contain:

[
	//…
	"t",
	"r",
	"a",
	"d",
	"i",
	"t",
	"i",
	"o",
	"n",
	"a",
	"l",
	//…
]

Q1c. What section within the array represents the word “Oolong” in the first paragraph?

A1c. Offsets 1-6 contain:

[
	//…
	["O", ["hed5e5a34cf0f5c5b"]],
	["o", ["hed5e5a34cf0f5c5b"]],
	["l", ["hed5e5a34cf0f5c5b"]],
	["o", ["hed5e5a34cf0f5c5b"]],
	["n", ["hed5e5a34cf0f5c5b"]],
	["g", ["hed5e5a34cf0f5c5b"]],
	//…
]

Q1d. What difference do you notice between the representation of “traditional” vs “Oolong”?

A1d. Each letter of “Oolong” is represented as an array, the first element being the letter and the second element being ["hed5e5a34cf0f5c5b"]. But it is not immediately obvious what hed5e5a34cf0f5c5b means.
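
The two item shapes can be made concrete with a small sketch. The data below is a hand-written imitation of what was seen above, not a dump from the live page (it omits the objects that represent open and close tags), and getCharacter and getAnnotationHashes are hypothetical helpers, not VisualEditor methods.

```javascript
// Hand-written sample imitating the two item shapes seen in d.getData():
// a plain string for an unannotated character, and a [ character, hashes ]
// pair for an annotated one.
const sampleData = [
	[ 'O', [ 'hed5e5a34cf0f5c5b' ] ],
	[ 'o', [ 'hed5e5a34cf0f5c5b' ] ],
	' ',
	't',
	'r'
];

// Hypothetical helper: return the character stored in an item of either shape.
function getCharacter( item ) {
	return Array.isArray( item ) ? item[ 0 ] : item;
}

// Hypothetical helper: return the annotation hashes of an item (empty if plain).
function getAnnotationHashes( item ) {
	return Array.isArray( item ) ? item[ 1 ] : [];
}

const text = sampleData.map( getCharacter ).join( '' );
const annotatedCount = sampleData.filter(
	( item ) => getAnnotationHashes( item ).length > 0
).length;

console.log( text, annotatedCount );
```

Either shape yields its character the same way, so code that only cares about text can ignore the second element entirely.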

LEARNING GOALS: We learned how to access the live VisualEditor instance running within an editing session on a Wikipedia page, and how to query the document model to see the abstract representation of the content.

Q2

Now we’ll take a step back and learn about how the document model fits into VisualEditor as a whole.

Q2. Read over VisualEditor/Design/Software_overview#Architecture then fill in the blanks:

“The three primary components of VisualEditor are ve.__, ve.__ and ve.__ .”
“The linear model is optimized for ___ editing. It is similar to an ___ token stream, but with ___ ___ composed onto each character. This allows arbitrary ___ of content to be simple and efficient.”

LEARNING GOALS: We learned about the VisualEditor architecture, and in particular what the linear model is. Now we’ll return to the live editing session, and learn about the transactions system.

Q3

In the developer tools console tab (see above), type:

d.completeHistory

and find the .transactions array. If you haven’t edited anything, it should contain exactly one transaction. (If you have edited something, you may need to open this afresh in a private browser tab, to defeat VisualEditor’s autosave feature which may preserve your edits even if you close and reopen the page). That transaction should consist of a single .operation.

Q3a. What is the .type of the operation? What does the operation do?

A3a. The .type of the operation is 'retain'. It is essentially a no-op, keeping content unchanged.

Q3b. The .length property is 779. Where have you seen that number before? Why do you think it appears here?

A3b. 779 was the length of the linear model data. So retaining 779 items means keeping the entire document unchanged.

Q4

Now click on the document text, press Ctrl+A (to select all), then press Backspace (to delete the entire selection). Look again at .getData().

Q4a. How long is the array now?

A4a. The array now has length 348.

Q4b. What are the items at offsets 2 and 3?

A4b. The items at offsets 2 and 3 are open and close tags for an mwCategory item.

Q4c. Can you find content in the editing interface that corresponds to the items? Can you do something to delete that content and make those items disappear?

A4c. They correspond to the Category: Tea tag at the bottom of the page. Clicking on the tag, removing the category Tea, then clicking “Apply changes” removes the tag from the page and the items from the linear model data (so calling d.getData() will show the length becomes 346).

Q5

Now try:

d.getDocumentRange()

Then try it again after pressing Ctrl+Z repeatedly to undo all changes.

Q5a. What is the response? Comparing to the linear model, what do you think “document range” might mean?

A5a. After doing Select All + Delete, the document range is 0-2. For the original document state, the document range is 0-435. So “document range” in some sense represents the entire document. But it is not immediately obvious why some content lies beyond the document range.

Q5b. Look closely at the linear model data again. What tag contains all the content outside the document range? What is this content?

A5b. All the content outside the document range lies inside a single internalList tag pair. It appears to contain content that appears in references.

Q5c. Why do you think such content is stored separately?

A5c. Each reference can be cited more than once, so there may be a need to have a handle on them in an object store that’s separate from the main document content.

Q5d. It feels like such content should have disappeared when you deleted everything. But it didn’t. Can you explain why that is not actually a bug?

A5d. Uncited references just take up unnecessary memory in the internalList store. This is not really a problem because a VisualEditor edit session has a relatively short lifetime. Once the editor publishes, the edit session will end and the memory will be freed.

Q6

In the linear model, find the words “Oolong” and “traditional” from the first paragraph again. Recall there’s an interesting difference, and expand the items to see it completely.

Q6. What do you think hed5e5a34cf0f5c5b might mean?

Walkthrough of Tutorial 1 steps

A6. The word “Oolong” is bolded and each letter is represented with the hed5e5a34cf0f5c5b code, whereas the word “traditional” is not bolded and the letters do not have the hed5e5a34cf0f5c5b code. So it looks like something to do with the bold.

Q7

Take a look into:

d.store

Look at the .hashStore attribute and find hed5e5a34cf0f5c5b.

Q7. Does this confirm your suspicions about hed5e5a34cf0f5c5b?

A7. Yes, hed5e5a34cf0f5c5b is the key for VeDmBoldAnnotation.
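
As a rough sketch of the lookup you just did by hand: the hashStore below is a plain object standing in for d.store.hashStore, the stored value is a simplified placeholder rather than the real annotation object, and describeItem is a hypothetical helper.

```javascript
// Hypothetical stand-in for d.store.hashStore: a plain object keyed by hash.
// The stored value is a simplified placeholder, not the real annotation object.
const hashStore = {
	hed5e5a34cf0f5c5b: { name: 'VeDmBoldAnnotation' }
};

// Hypothetical helper: map an item's hashes to the names of the stored values.
function describeItem( item, store ) {
	if ( !Array.isArray( item ) ) {
		// A plain character carries no annotation hashes
		return [];
	}
	return item[ 1 ].map(
		( hash ) => ( store[ hash ] ? store[ hash ].name : '(unknown)' )
	);
}

const boldNames = describeItem( [ 'O', [ 'hed5e5a34cf0f5c5b' ] ], hashStore );
const plainNames = describeItem( 't', hashStore );

console.log( boldNames, plainNames );
```

One plausible reading of this design is that each character only needs to carry a short key, while the annotation itself is stored once and shared by every character that uses it.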

Tutorial 2: Transactions

Check out https://gerrit.wikimedia.org/r/VisualEditor/VisualEditor.git and host the files in a local webserver (e.g. python -m http.server).

Browse to http://localhost:8000/demos/ve/desktop.html#!h1

In the browser console, do:

s = ve.init.target.surface.model
d = s.documentModel
tx1 = ve.dm.TransactionBuilder.static.newFromRemoval( d, new ve.Range( 3, 5 ) )

Q1

Q1. Look inside tx1.operations and guess at the meaning of everything.

A1. tx1.operations is a diff representing how the document shall be changed. Conceptually, it is applied by putting a pointer at the start of the linear model and then working through the operations:

the “retain” operation of length 3 means the next three items shall remain unchanged;

the “replace” operation removes [ 'c', 'd' ] and inserts [] (i.e. it is a pure removal);

finally the “retain” operation of length 6 means the rest of the document shall remain unchanged.

Notice that the “replace” operation is symmetrical: it specifies the content to remove as well as the content to insert. This is useful for creating the reverse (“undo”) transaction:

s.change( tx1 )
s.change( tx1.reversed() )
s.change( tx1 ) // error, because tx1.applied === true
tx2 = ve.dm.TransactionBuilder.static.newFromInsertion( d, 6, [ 'h', 'e', 'l', 'l', 'o' ] )
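
The pointer walk from A1 and the reversal of a “replace” operation can be sketched in a few lines. This is a simplified model under stated assumptions, not VisualEditor's implementation: the tags are placeholder strings (the real linear model uses objects for open and close tags), and applyOperations and reverseOperations are hypothetical helpers.

```javascript
// Placeholder items standing in for the demo document's linear data.
const doc = [
	'<h1>', 'a', 'b', 'c', 'd', 'e', 'f', 'g', '</h1>',
	'<internalList>', '</internalList>'
];

// Same shape as the operations described in A1.
const operations = [
	{ type: 'retain', length: 3 },
	{ type: 'replace', remove: [ 'c', 'd' ], insert: [] },
	{ type: 'retain', length: 6 }
];

// Hypothetical helper: walk a pointer through the data, applying each operation.
function applyOperations( data, ops ) {
	const result = [];
	let pointer = 0;
	for ( const op of ops ) {
		if ( op.type === 'retain' ) {
			result.push( ...data.slice( pointer, pointer + op.length ) );
			pointer += op.length;
		} else if ( op.type === 'replace' ) {
			// The content to remove must match what is actually at the pointer
			const actual = data.slice( pointer, pointer + op.remove.length );
			if ( actual.join() !== op.remove.join() ) {
				throw new Error( 'Removal does not match the data' );
			}
			result.push( ...op.insert );
			pointer += op.remove.length;
		}
	}
	if ( pointer !== data.length ) {
		throw new Error( 'Operations do not cover the whole document' );
	}
	return result;
}

// Hypothetical helper: swap remove and insert in every replace operation.
function reverseOperations( ops ) {
	return ops.map( ( op ) => ( op.type === 'replace' ?
		{ type: 'replace', remove: op.insert, insert: op.remove } :
		op
	) );
}

const afterRemoval = applyOperations( doc, operations );
const afterUndo = applyOperations( afterRemoval, reverseOperations( operations ) );

console.log( afterRemoval.join( '' ) );
console.log( afterUndo.join( '' ) );
```

Because the “replace” operation records what it removed, the reversed operations need no access to the original document to restore it.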

Q2

Q2. Select ‘ef’ with the mouse. What do you expect will happen to the selection when you apply tx2?

A2. Since the insertion will happen in the interior of the selection, the inserted text will grow the selection.
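
A minimal sketch of how a selection might be carried across an insertion follows. The helpers translateOffset and translateRange are hypothetical, and the rule for an offset sitting exactly at the insertion point is an assumption made for this sketch.

```javascript
// Hypothetical helper: where does an offset end up after `length` items are
// inserted at `insertAt`? Assumption: an offset exactly at the insertion
// point stays where it is; only strictly later offsets move.
function translateOffset( offset, insertAt, length ) {
	return offset > insertAt ? offset + length : offset;
}

// Hypothetical helper: translate both ends of a selection range.
function translateRange( range, insertAt, length ) {
	return {
		start: translateOffset( range.start, insertAt, length ),
		end: translateOffset( range.end, insertAt, length )
	};
}

// 'ef' occupies offsets 5 to 7; tx2 inserts five characters at offset 6.
const selectionBefore = { start: 5, end: 7 };
const selectionAfter = translateRange( selectionBefore, 6, 5 );

console.log( selectionAfter );
```

The start of the selection lies before the insertion and stays at 5, while the end lies after it and moves from 7 to 12, so the selection grows by the length of the inserted text.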

Q3

s.change( tx2 )
s.change( tx2.reversed() )

What do you think will happen if you apply tx1 then tx2?

tx1 = tx1.clone() // so the .applied flag will be false
tx2 = tx2.clone() // so the .applied flag will be false
s.change( tx1 )
s.change( tx2 ) // error

Q3. What’s wrong with tx2? How would you change tx2 so it would make sense on top of tx1?

[ tx1on2, tx2on1 ] = ve.dm.Change.static.rebaseTransactions( tx1, tx2 )
s.change( tx2on1 )
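
To see what rebasing has to achieve in this particular case, here is a deliberately narrow sketch that handles only one removal and one insertion. It is not how rebaseTransactions works internally; the data is simplified (placeholder tags, no internalList) and all three helpers are hypothetical.

```javascript
// Simplified linear data for the heading 'abcdefg'.
const original = [ '<h1>', 'a', 'b', 'c', 'd', 'e', 'f', 'g', '</h1>' ];

// tx1 removes offsets 3 to 5; tx2 inserts 'hello' at offset 6.
const removal = { start: 3, end: 5 };
const insertion = { offset: 6, data: [ 'h', 'e', 'l', 'l', 'o' ] };

// Hypothetical helper: shift the insertion offset to account for the removal.
function rebaseInsertionOverRemoval( ins, rem ) {
	const removedLength = rem.end - rem.start;
	let offset = ins.offset;
	if ( offset >= rem.end ) {
		// The insertion point lies after the removed content
		offset -= removedLength;
	} else if ( offset > rem.start ) {
		// The insertion point was inside the removed content
		offset = rem.start;
	}
	return { offset: offset, data: ins.data };
}

function applyRemoval( data, rem ) {
	return [ ...data.slice( 0, rem.start ), ...data.slice( rem.end ) ];
}

function applyInsertion( data, ins ) {
	return [ ...data.slice( 0, ins.offset ), ...ins.data, ...data.slice( ins.offset ) ];
}

const rebased = rebaseInsertionOverRemoval( insertion, removal );
const result = applyInsertion( applyRemoval( original, removal ), rebased );

console.log( rebased.offset, result.join( '' ) );
```

The insertion still lands between ‘e’ and ‘f’, which is what tx2 meant in the first place; only its offset had to change from 6 to 4.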

Q4

Q4. Reload the page, set up s= and d= and tx1= again. Set a breakpoint inside s.change. Apply tx1 and step into everything interesting.

An interesting place to explore: Step into changeInternal and see that it commits each transaction via ve.dm.Document.commit. Step inside again. Notice that each commit creates a new ve.dm.TransactionProcessor and calls its process method, which then calls ve.dm.TreeModifier’s process method. Step into ve.dm.TreeModifier.static.applyTreeOperations. From here we arrive at the ve.dm.TreeModifier.static.applyTreeOperation method that we’ll learn about in Tutorial 3.

Tutorial 3: Synchronous tree updates

First, disable ve.freeze with an unconditional return before line 8, and force-reload.

Open http://localhost:8000/demos/ve/desktop.html#!h1 and set a breakpoint inside ve.dm.TreeModifier.static.applyTreeOperation at “case ‘removeText’” on the line that calls spliceLinear.

Put the cursor at ‘abc|defg’ and backspace the c. The breakpoint should trigger.

In the console, check d.data.data. Notice the ‘c’ is still present.

Now step over once (i.e. over the ‘spliceLinear’ line). Check d.data.data again. Notice the ‘c’ has disappeared — the linear model has been updated.

Step over the checkEqualData line (which is just verification), then step into ve.dm.Node#adjustLength.

Q1

Q1. What are the parameters to this function?

A1. adjustment = amount to adjust length by (-1 in the case of removing one char)

Q2

Step in again, into ve.dm.Node#setLength. This method changes the length of this DM node, and makes the corresponding change to all ancestor nodes (recursively), and uses “emit( ‘update’ )” to notify listeners there has been a change.

Q2. Read this method very carefully, and try to state in what order the following things happen: updating this node’s length, updating ancestor nodes’ lengths, notifying listeners of these changes.

A2.
Nitty gritty details:

The node we’re updating is type “text”, its parent is type “heading”, and grandparent is type “document”.

Updates this node’s length (7 to 6)
Updates parent’s length (7 to 6)
Updates grandparent’s length (11 to 10)
Emits ‘lengthChange’ and ‘update’ from grandparent’s setLength

Emits ‘lengthChange’ and ‘update’ from parent’s setLength

Emits ‘lengthChange’ and ‘update’ from node’s setLength

Takeaway:
Recursively updates all lengths, starting at the current node. When it hits the end of the recursion, it emits ‘lengthChange’ and ‘update’ from each node, all the way back to the starting node. All lengths must be adjusted before emitting update; the LM and DM tree must be in sync.
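
The ordering above can be reproduced with a toy class. SketchNode is a hypothetical stand-in that keeps only a length, a parent and a log; it is meant to show the shape of the recursion, not the real ve.dm.Node code.

```javascript
const log = [];

// Hypothetical stand-in for a DM node: only a type, a length and a parent.
class SketchNode {
	constructor( type, length, parent ) {
		this.type = type;
		this.length = length;
		this.parent = parent || null;
	}

	adjustLength( adjustment ) {
		this.setLength( this.length + adjustment );
	}

	setLength( length ) {
		const diff = length - this.length;
		// 1. Update this node's own length
		this.length = length;
		log.push( 'set ' + this.type + ' length to ' + length );
		// 2. Recurse into the parent before notifying anyone
		if ( this.parent ) {
			this.parent.adjustLength( diff );
		}
		// 3. Notify only after every ancestor has been updated
		log.push( 'emit from ' + this.type );
	}
}

const documentNode = new SketchNode( 'document', 11 );
const headingNode = new SketchNode( 'heading', 7, documentNode );
const textNode = new SketchNode( 'text', 7, headingNode );

textNode.adjustLength( -1 );
console.log( log.join( '\n' ) );
```

The log lists the three length updates from the text node upwards, followed by the three notifications from the document node back down, because each notification sits after the recursive call.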

Q3

Q3. Make yourself 100% sure how the recursion plays out and the ordering of these changes … you’ll need this knowledge to continue.

Q4

Step into the emit( ‘update’ ) line. This will pass into OO.EventEmitter#emit; you want to step into the method.apply line, which will pass into ve.ce.BranchNode#onModelUpdate. Notice we’re now in a completely different part of the codebase: the listener lives in the CE.

Q4. Is this listener running synchronously or asynchronously, with respect to the emit call? How do you know?

A4. Synchronously. The emit call blocks until all the listener functions for it have completed their execution.

Q5

Notice we’re processing operations one by one. Each operation modifies the linear model, then we update the DM tree correspondingly, then each node change in the DM tree emits an ‘update’ event which the CE node uses to update itself correspondingly.

Q5. How is this even possible? A single linear operation, in isolation, does not necessarily preserve tree validity. It can leave the linear data in a state that does not even represent a tree. For instance <heading>...</paragraph>. So how does VE update the tree incrementally?

A5. There are two different types of operations here: linear operations and tree operations. ve.dm.TreeModifier calculates tree operations from the linear ones, and each tree operation is guaranteed to leave the tree in a valid state. Step into ve.dm.TreeModifier.calculateTreeOperations to see how tree operations are made.

Next time: synchronous updates originating in the model.

Tutorial 4: Updates initiated in the model vs the view

Open http://localhost:8000/demos/ve/desktop.html#!h1 and set a breakpoint at the start of ve.ce.ContentBranchNode#renderContents.

Create and apply a transaction to remove the ‘c’ and ‘d’ programmatically:

tx1 = ve.dm.TransactionBuilder.static.newFromRemoval( d, new ve.Range( 3, 5 ) )
s.change( tx1 )

The breakpoint in renderContents should trigger.

Q1

Q1. Look at the call stack. How did renderContents (which is CE code) get called from DM code (which isn’t supposed to know or care whether there’s a CE listening)? Is this call synchronous (=happens while the DM is applying a transaction) or asynchronous (=happens after the DM has finished applying a transaction)?

A1. renderContents is called from the event emitter that we went over in the previous section. The call is synchronous; it happens while the DM is applying a transaction (to be precise, after the current tree operation has been processed but before the next tree operation is processed)

Q2

Q2. Step carefully through renderContents. When does the update reach the DOM?

A2. Child nodes are detached from $this.element and then changes are made. The changes reach the DOM when the nodes are reattached to $this.element with appendRenderedContents.

Q3

Now undo the transaction:

s.change( tx1.reversed() )

Now apply the same change but do it by editing the contentEditable DOM directly: select the letters ‘cde’ and press ‘e’ (so the net effect will be to remove the ‘c’ and ‘d’). The breakpoint in renderContents should trigger again.

Q3. Look at the call stack this time. Can you see where the following things happened?

ve.ce.SurfaceObserver detected that the content changed
ve.ce.Surface built a ve.dm.Transaction
ve.ce.Surface added a render lock then applied the transaction
ve.ce.ContentBranchNode saw the render lock and so did not try to update its contents

A3.
ve.ce.SurfaceObserver detects that the content has changed in pollOnceInternal

ve.ce.Surface builds a ve.dm.Transaction in handleObservedChanges; specifically, it calls ve.ce.TextState.getChangeTransaction to build the transaction from the observed change (this call is not seen in the call stack because it has already returned)

ve.ce.Surface adds a render lock in handleObservedChanges and applies the transaction in changeModel

ve.ce.ContentBranchNode checks for the render lock in renderContents (first if statement returns false)

Q4

Q4. Describe briefly the difference in control flow between the first example (where the update was initiated in the model) and the second example (where the update was initiated in the view).

A4.
(A) Update initiated in model, (B) Update initiated in view
Differences:

In (A), the DM initiates the transaction; in (B), the ve.ce.Surface initiates it
More specifically, in (A), we build a transaction manually and then call ve.dm.Surface.change on it (though in other model-initiated changes it could come from a keydown handler and be built programmatically)

Whereas in (B), ve.ce.Surface observes a change that already happened to ContentEditable, builds a transaction from the observed changes, and then calls ve.dm.Surface.change

In both (A) and (B), DM is then updated through the TransactionProcessor

In (A), the view is updated in renderContents; in (B), renderContents does nothing
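
The difference between (A) and (B) can be sketched with two toy classes. SketchModel and SketchView are hypothetical stand-ins, and the render lock is reduced to a single boolean; the real control flow passes through the classes named in A3.

```javascript
const renderLog = [];

// Hypothetical stand-in for the model: applies a change, then notifies
// its listeners synchronously.
class SketchModel {
	constructor() {
		this.text = 'abcdefg';
		this.listeners = [];
	}

	change( newText ) {
		this.text = newText;
		this.listeners.forEach( ( listener ) => listener() );
	}
}

// Hypothetical stand-in for the view, with a boolean render lock.
class SketchView {
	constructor( model ) {
		this.model = model;
		this.renderLocked = false;
		this.domText = model.text;
		model.listeners.push( () => this.renderContents() );
	}

	renderContents() {
		if ( this.renderLocked ) {
			// The DOM already shows this change, so leave it alone
			renderLog.push( 'skipped' );
			return;
		}
		this.domText = this.model.text;
		renderLog.push( 'rendered' );
	}

	// (B): the user typed, so the DOM changed before the model did
	handleObservedChange( observedText ) {
		this.domText = observedText;
		this.renderLocked = true;
		try {
			this.model.change( observedText );
		} finally {
			this.renderLocked = false;
		}
	}
}

const model = new SketchModel();
const view = new SketchView( model );

model.change( 'abefg' ); // (A) update initiated in the model
view.handleObservedChange( 'abefgx' ); // (B) update initiated in the view

console.log( renderLog, view.domText, model.text );
```

In both cases the model ends up holding the new text; the lock only decides whether the view rewrites content that the browser has already changed.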

Tutorial 5: Annotation nails

Open data:text/html,<h1 contenteditable>abc <i>def</i> ghi</h1> in Chromium and inspect the elements.

Q1

Q1. Guess, and then test, what formatting will appear if you type text after placing the cursor:

between the space and the ‘d’
between the ‘f’ and the space

Notice that the cursor positions above are visually ambiguous: it’s not clear whether they lie inside the italic tags or outside. Chromium normalizes ambiguous cursor positions towards the left, or more precisely, towards the document start (since it applies in right-to-left scripts too).

Q2

Using the console, try placing the cursor programmatically after the f and outside the italic tag:

sel = window.getSelection();
textNode = document.body.firstChild.firstChild.nextSibling.nextSibling;
r = document.createRange();
r.setStart( textNode, 0 );
sel.removeAllRanges();
sel.addRange( r );

Q2. Then close the inspector and start typing. What happens?

Notice that the text is italicized anyway.

Q3

Q3. Try the same experiments from Q1-2 in Firefox. Does the result depend on whether you click on the cursor position or move the cursor there with the arrow keys?

Notice that Firefox does NOT normalize ambiguous cursor positions. When moving the cursor with left/right arrow keys, it moves lazily (choosing the nearest of the ambiguous cursor positions to the prior position).

Q4

Q4. Repeat Q1-2 in Chromium but with the italic tags replaced by <a href=xxx>...</a>. What is different?

Notice that Chromium has special rules whereby typing at the end boundary of a link never extends the link.

Q5

Q5. Open http://localhost:8000/demos/ve/desktop.html#!h1 and paste abc <a href=xxx>def</a> ghi inside. Cursor very slowly across the content. Do you notice interesting behaviour?

Notice that VE adds an extra cursor step to step into/out of a link, whereby you can type text that extends the link or not, depending on your wishes. Can you think of how this might have been implemented? Bear in mind you’ve just seen Chromium’s native behaviour won’t let you extend a link by typing text at its end.

Fixup as you type, to add link annotation? No, we used to do that but it breaks IMEs. In general we can’t fixup text if it might be part of uncommitted IME candidate text, and there’s no easy way to detect whether text is part of uncommitted IME candidate text. This massively constrains what fixups we can apply.

IME = Input Method Engine, a software component for typing languages with complex scripts, such as Chinese or Japanese. An IME treats a combination of keystrokes as a composite character. This might look like a dropdown of candidate text that the user can choose from as they type. On mobile browsers, the mobile keyboard is an IME and so imposes these same constraints.

Change the link styling to inline-block or block? No, the latter can actually solve this problem, but it has major side effects (e.g. breaks word wrapping).

Inspect the link to see how we achieve this behaviour. We call the extra <img> elements “annotation nails”.

Q6

Q6. Can you explain why we need two at each end of the link, and not just one?

A6. (Depending on the browser) Without the second nail on either side, the browser might always treat the text next to the link as “not a link”, because of how the browser sees img tags.
For example, a single nail at each end would actually be enough in Chromium, since Chromium doesn’t normalize across an img tag. However, this wouldn’t work in Firefox.
It’s less messy to have two images, as it’s more predictable across browsers and responds better to potential future browser implementation changes.

Q7

Click the “Input debugging” button in VE standalone, so the link nails become visible. Now cursor carefully across the entire content again.

Q7. Can you see a point where the cursor jumps two nails in one step? Why do we want this?

A7. The nails create an extra cursor position that sits outside the link.

Q8

Now select “Disable JavaScript” in the Chromium inspector settings and cursor again. Notice the cursor does not jump two nails in one step.

Q8. Guess how we used JavaScript to fix this behaviour.

A8. When the cursor key is pressed, we see if we’re about to jump over a nail and if so we jump over two.

Q9

Go to he.wikipedia.org and copy a single word of Hebrew (a language written right to left). Paste it over the ‘def’. Now try cursoring right across the h1.

Q9. Is your browser doing visual bidi cursoring or logical bidi cursoring? (Search for these terms if you’re not sure what they mean)

A9. Bidi = bidirectional; combines LTR and RTL scripts
With bidi text, cursor movement/selection is handled in two ways:

Visual = Cursor moves to the next visually adjacent character, regardless of text’s directionality
If you press the left arrow, the cursor moves left, regardless of the direction of the text at the cursor position

Logical = Cursor decides what “before” means based on what’s in memory, the data model, regardless of how it’s rendered

https://codemirror.net/examples/bidi/
Note that when extending a selection (e.g. with Shift held down), any app uses logical cursoring because the selection has to be contiguous.

Q10

Q10. Bearing in mind that some browsers do visual cursoring and others do logical cursoring, and there’s no easy way for our code to know which will happen, do you need to improve on your answer to Q8?

A10. We have to wait and see whether we did jump over a nail, and then fix the jump to cross the other nail too.

Q11

Q11. What do you think “prepare-observe-fixup” means with respect to how we handle cursoring across link boundaries? Can you think of other cases where this would be a useful pattern?

A11. Another case is different scripts and how they may affect cursoring.
In bidirectional text, we cannot necessarily tell whether pressing Left will move the cursor towards the logical start of the document or towards the logical end.

In text with complex grapheme clusters, we cannot necessarily tell how many logical offsets the cursor will skip.

Therefore we need to let native cursor movement happen — and then potentially need to fix up where the cursor landed, for example if it lands inside content that should be read-only (such as template output).
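
The “observe, then fix up” half of the pattern can be sketched for the nail case. The flat items array is a simplification of the real DOM, the native move is simulated by passing in the cursor offsets before and after, and fixupCursor is a hypothetical helper.

```javascript
const NAIL = { nail: true };

// Flat, simplified picture of the content around a link:
// c <nail><nail> d e f <nail><nail> space
const items = [ 'c', NAIL, NAIL, 'd', 'e', 'f', NAIL, NAIL, ' ' ];

// Hypothetical helper. `before` and `after` are cursor offsets between items,
// recorded before and after the browser moved the cursor by one step.
function fixupCursor( data, before, after ) {
	if ( Math.abs( after - before ) !== 1 ) {
		return after;
	}
	// The direction is observed from the result, never predicted from the key
	const direction = after > before ? 1 : -1;
	const crossed = data[ Math.min( before, after ) ];
	const next = direction === 1 ? data[ after ] : data[ after - 1 ];
	if ( crossed === NAIL && next === NAIL ) {
		// We crossed one nail of a pair: cross its partner too
		return after + direction;
	}
	return after;
}

console.log( fixupCursor( items, 1, 2 ) ); // moved right across the first nail
console.log( fixupCursor( items, 3, 2 ) ); // moved left across the second nail
console.log( fixupCursor( items, 3, 4 ) ); // moved across an ordinary character
```

Because the direction is derived from where the cursor actually landed, the same code works whether the browser does visual or logical cursoring.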

Tutorial 6: Debugging

Put “throw new Error( 'foo' )” at the top of ve.ce.ClipboardHandler#afterPaste. Then open http://localhost:8000/demos/ve/desktop.html#!h1 and try to paste something with the console open. Note how the browser gives an “uncaught error” warning in the console.

Next, move the “throw new Error( 'foo' )” into the .then callback at the bottom of that method. And try the paste again. Note how there is NO warning now. This is because the .then callback is called through the jQuery promise system, and there’s no good way for it to know whether the promise error is uncaught. (See https://phabricator.wikimedia.org/T233480 ).

Now instead try “Promise.resolve().then( function () { throw new Error( 'foo' ); } )” Note this time you do get an “uncaught error” warning — because the native promise system does know whether the native promise error is uncaught.

Q1

Q1. Suppose you suspect an uncaught error is happening in a .then callback within some new code, but you can’t see exactly where. How might you temporarily use native promises to help you debug this?

A1. You could wrap the suspect code inside a native promise callback to quickly see where the uncaught error might be.

Q2

Read https://developer.mozilla.org/en-US/docs/Web/API/HTML_DOM_API/Microtask_guide/In_depth to learn about microtasks.

Takeaways:
Microtasks were added to JS as a way to escape the limitations of a single-threaded language (which JS is)

Agents

Runtime engine maintains set of agents in which to execute JS code

Agents are made up of
Set of execution contexts

Execution context stack

Main thread

Set for any additional threads that may be created to handle workers

Task queue

Microtask queue

Each component of an agent is unique to that agent (except the main thread)

Event loops
Each agent is driven by an event loop

Each iteration of an event loop:
Runs at most one pending JS task

Runs any pending microtasks

Performs any needed rendering and painting before looping again

Task = anything scheduled to be run by the standard mechanisms such as initially starting to execute a script or async dispatching an event
A task can be enqueued by using events, setTimeout(), setInterval(), etc

Microtasks vs tasks
Only one task executes per event-loop iteration; microtasks run after each task finishes, before the next task begins (including any microtasks scheduled by those microtasks)

Q2. VisualEditor’s jQuery promises are created using the helper method ve.createDeferred. It wouldn’t be a huge code change to reimplement this to use native promises instead. Why might this create subtle timing issues? (Hint: native promises use microtasks but jQuery promises don’t)

A2. Because native promises use microtasks, they execute immediately after a task finishes; once the current stack finishes, all pending promise callbacks run. jQuery’s promise callbacks run later in the event loop, after the browser has processed other events. However, we might not want callbacks to run any earlier, as we need the information that the browser is processing. More generally, the slight change in the processing moment might introduce some very subtle timing bugs that are infeasible to debug, e.g. in interactions with IMEs.
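
The timing gap can be observed without jQuery. In the sketch below, setTimeout stands in for “a callback that runs in a later task”; that is an assumption made for illustration, not a statement about how jQuery schedules its callbacks. runOrderingDemo is a hypothetical helper.

```javascript
// Hypothetical helper: record the order in which three pieces of code run,
// and resolve with the log once the later task has run.
function runOrderingDemo() {
	const log = [];
	return new Promise( ( resolve ) => {
		// Scheduled as a task: runs in a later event-loop iteration
		setTimeout( () => {
			log.push( 'task' );
			resolve( log );
		}, 0 );
		// Scheduled as a microtask: runs as soon as the current code finishes
		Promise.resolve().then( () => {
			log.push( 'microtask' );
		} );
		// Runs immediately
		log.push( 'synchronous' );
	} );
}

runOrderingDemo().then( ( log ) => {
	console.log( log.join( ' -> ' ) );
} );
```

The microtask runs before the timer callback even though the timer was scheduled first. Code that relied on “later task” timing would see its callbacks move to the earlier slot if it switched to native promises.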

Q3

Reload http://localhost:8000/demos/ve/desktop.html#!h1 and place the cursor between ‘c’ and ‘d’. Click Filibuster, press Enter, then click Filibuster again. You should see a call tree appended below the document, with each function call numbered sequentially.

Q3. Can you see where the JavaScript started handling the ‘Enter’ keypress? And how that triggered one of the two processes you learned in tutorial 4? Is it a model-initiated change or a view-initiated change?

A3. Model-initiated change.
It started handling the ‘Enter’ keypress in 62 (902.00ms-902.00ms) VeCeKeyDownHandlerFactory.lookupHandlersForKey(13, "linear")--->["(function VeCeLinearEnterKeyDownHandler)"]
The KeydownHandler triggers a transaction being built programmatically - from keyboard input and not from any observed changes in the CE.
Tricky. We pressed a key, so why did this not result in the ContentEditable being updated (...and then the surface observing a change in the CE, and then all the steps of a view-initiated change)?
Because in ve.ce.LinearEnterKeyDownHandler, we override the Enter key behavior with preventDefault, so ContentEditable is not actually updated when you would expect it to be.

Q4

Now reload and place the cursor between ‘c’ and ‘d’ again. Click Filibuster, press ‘x’, then click Filibuster again, to get a new call tree.

Q4. Is this a model-initiated change or a view-initiated change?

A4. View-initiated change.
ve.ce.Surface observes changes to ContentEditable:
181 (871.00ms-886.00ms) VeCeSurface.handleObservedChanges("(VeCeRangeState)", "(VeCeRangeState)")
…then builds a transaction from the observed changes:

189 (871.00ms-872.00ms) VeCeTextState.getChangeTransaction("(VeCeTextState)", "(VeDmDocument)", 0, null)

Q5

Now reload and set a breakpoint at the start of ve.ce.Surface#onDocumentInput. Again place the cursor between ‘c’ and ‘d’, and press ‘x’. You should be able to resume as normal.

Install a Japanese romaji input method on your operating system, activate it, and learn how to enter the Kanji ‘日本’ (“Japan”), probably by typing ‘nippon’ (or ‘nihon’) and selecting from a list.

Try to use the Chromium debugger to put a breakpoint at the start of ve.ce.Surface#onDocumentBeforeInput. Now type ‘日本’ in Japanese.

Q5. Does the debugger close the input method? Why do you think this might happen?

“Closing the input method” means prematurely committing the candidate text.

Exactly how an input method interacts with JavaScript is highly platform-specific. It depends on platform combinations: OS, browser, language, input method software, and even software version. Different input methods can send radically different sequences of events, even if they look like they’re doing exactly the same thing. For instance, as of 2023, pressing Enter in Android Gboard does completely different things depending on whether the language is English or Cantonese.

Q6

Now reload, and place the cursor between ‘c’ and ‘d’ again. Click Filibuster, type ‘日本’ in Japanese, then click Filibuster again, to get a call tree.

Q6. Look carefully through the call tree. Can you list the JavaScript events which VisualEditor observes from the input method? For many platform combinations, you’ll see interesting changes of selection and content as the input method software builds up candidate text and then commits it. Or at the other extreme, you may only see a single ‘input’ event where ‘日本’ is inserted.

Filibuster works by wrapping every method in ve.ce.*, ve.ui.* and ve.dm.* with a proxy that logs the call and its return value. It slows down execution greatly and the logs are too vast to be useful for complex edit sessions. Its main use is for debugging input method behaviour — where you can’t set a breakpoint because it will disturb the IME (see Q5) — and for this purpose a few keystrokes on a document of a few words usually suffices.

Tutorial 7: Miscellaneous

Open http://localhost:8000/demos/ve/desktop.html#!simple and use the debugger's element tree to select the first h2.

In the debugger, do

n = $.data( $0 ).view

to get the CE ContentBranchNode. Do

r1 = n.getRange()

to get the range object. Observe it has .start / .end properties and also .from / .to properties.

Q1

Q1. How do start/end relate to from/to?

A1.
Start = whichever of ( from, to ) is earlier in the document
End = whichever of ( from, to ) is later in the document
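
The relationship can be written down directly. SketchRange is a hypothetical stand-in that mirrors the four properties observed on the range object; it is not the ve.Range source.

```javascript
// Hypothetical stand-in mirroring the four properties seen on the range object.
class SketchRange {
	constructor( from, to ) {
		// Where the selection was anchored, and where it was extended to
		this.from = from;
		this.to = to;
		// The same two offsets, ordered by document position
		this.start = Math.min( from, to );
		this.end = Math.max( from, to );
	}

	isBackwards() {
		return this.from > this.to;
	}
}

const forwards = new SketchRange( 5, 12 );
const backwards = new SketchRange( 12, 5 );

console.log( forwards.start, forwards.end, forwards.isBackwards() );
console.log( backwards.start, backwards.end, backwards.isBackwards() );
```

Both ranges cover the same content, so start and end agree; only from and to record the direction in which the selection was made.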

Q2

Now use the cursor and shift keys in VE to select the entire H2, starting from the end and moving left (so that your cursor ends at the beginning of the node). Try

r2 = s.getSelection().getRange()

Q2. How do r1 and r2 differ?

A2. ‘from’ and ‘to’ are swapped for r1 and r2. These two variables track selection.

Q3

Move the selection to the start of the document, and press ‘x’ to insert a character.

Q3. Did the properties r1 or r2 change value? What are the consequences of this answer?

A3. No, they stay the same. The range is static.

Q4

Now undo the ‘x’ insertion, select the entire H2 again using cursor and shift keys, and do

sel = s.getSelection()
sf = new ve.dm.SurfaceFragment( s, sel )

and look at the property values returned by sel.getRange() and sf.getSelection().getRange().

Q4. If you insert text at the start of the document, does calling sf.getSelection().getRange() again now give a result with different property values? What are the consequences of this answer?