Developing a custom collaborative editor for journalists at Contexte

At Contexte, a fast-growing B2B online media outlet covering politics, founded in 2013 and based in Paris and Brussels, email is our main information product. Daily briefings result from collaborative work involving many roles, from journalists to proofreaders to production editors. After years of working with Google Docs, its limitations led us to develop our own collaborative editor: Echo. I’m going to tell you how we did it.

This article was originally published on Contexte's blog.

It all started a year ago when a New York Times article was shared on our tech team’s Slack channel. It explains how the newspaper designed a custom collaborative editor that allows several people to work simultaneously on the same article.

They had achieved what we’d been dreaming of. We quickly asked ourselves how we could adapt this solution to our situation and technical environment with our available human and financial resources.

“We don’t need roads”

From the outset, Contexte has published a daily newsletter with a handful of news briefs informing subscribers about public policy developments. Over time, the editorial team grew. Today, 50 journalists based in Paris and Brussels cover nine publications – that’s around one hundred news briefs drafted, proofed and corrected daily.

The tech team’s role is to make sure that this grand orchestra can keep pace by providing editing tools fit for each stage in the briefing process:

During the afternoon, when journalists are writing news briefs.
During the evening, when proofreaders correct the briefings.
At dawn, when production editors implement last-minute changes.


Contexte edits briefings with Django, a web framework well known among Python developers for making back-office development easy. However, its back office has a limitation: only one person can work on an article at a time. To make up for this shortfall, we introduced Google Docs at the start of the chain: journalists would write the bulk of the article in a collaborative document formatted using a special markup language, which would then be exported to Django. Done and dusted!

On paper, this tool combination worked and we wrote more than 1,750 articles with it in 2023. But in practice, the imbalance of tools between the afternoon team (pre-export) and the evening team (post-export) caused several issues. On the Google Docs side, errors in the markup could freeze the export process completely; on the Django side, the editor’s inflexibility meant that only one proofreader could work on an article at a time. Overall, this fragmented approach needed a lot of maintenance and kept us from changing the briefing format.

During the summer of 2023, when we got serious about tackling the tool situation, we opted for a hands-on approach: a developer (yours truly) was put in charge of working as closely as possible with the editorial team, collecting feedback, discovering what they needed, and then providing an overview of the technical solutions available to overcome their limitations. 🐬 Echo was born.

I know what you did last summer

It may be counter-intuitive, but in the age of TikTok and electric scooters, developing software that allows several people to work on the same document simultaneously is still a tricky problem to solve. Fortunately, there are plenty of open-source and pay-as-you-go solutions to get some pointers. Two really caught my eye during my review last year.

The first text editor solution is ProseMirror, which stands out for its positioning. Its role is not to provide the most text formatting options possible (bold, italics, links, etc.) but rather to be easily extensible. This editor ensures that HTML content is always valid by checking each change against a predefined schema. So, if I copy and paste some text in Comic Sans MS, green and size 48 into a document that does not support fonts or colours, the text will be preserved but the formatting will be obliterated. This robust, extensible approach has made ProseMirror a benchmark in the editing world, and it is the foundation of the aforementioned New York Times editor.
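To make the idea concrete, here is a minimal Python sketch of schema-based filtering. It is not ProseMirror's API (ProseMirror is a JavaScript library with a much richer schema model); the `ALLOWED_MARKS` set, the span format and the `sanitize` function are all made up for illustration.

```python
# Hypothetical schema: the only inline marks this toy document accepts.
ALLOWED_MARKS = {"bold", "italic", "link"}


def sanitize(spans, allowed_marks=ALLOWED_MARKS):
    """Keep every span's text, drop any mark the schema does not know."""
    return [
        {
            "text": span["text"],
            "marks": [mark for mark in span["marks"] if mark in allowed_marks],
        }
        for span in spans
    ]


# Pasted content carrying fonts, colours and sizes the schema does not support.
pasted = [
    {"text": "Breaking: ", "marks": ["bold", "font:Comic Sans MS"]},
    {"text": "vote delayed", "marks": ["color:green", "size:48"]},
]
clean = sanitize(pasted)
print(clean)
```

The text survives untouched, the bold mark is kept because the schema allows it, and the font, colour and size are silently dropped.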

The second solution is Tiptap, which entered the text editor game with a bang thanks to a novel marriage “between tradition and modernity”. Built on top of ProseMirror, its strength lies in the fact that it has managed to assemble all of ProseMirror’s concepts in a system of Extensions that define the editor’s formatting and behavior. Its creators offer a range of extensions that the open-source community can contribute to.

Tiptap is also appealing for its modernity. “Reactive” frameworks marked a turning point in the field of web development. These frameworks – React and Vue.js are the best known – lend apps wings by letting the interface update itself if underlying data are updated. Tiptap is natively compatible with these frameworks, saving precious time when developing a reactive app. Moreover, it is compatible with Yjs, a collaborative text-editing building block that particularly got my attention.

CRDT or the prospect of a decentralized, conflict-free world

The Yjs building block is an implementation of a CRDT (“Conflict-free Replicated Data Type”). Behind the name lies an algorithm based on academic research which allows a piece of data to be replicated in several places, edited separately in each, and then merged without conflicts.

What’s exceptional about this algorithm is that it’s “conflict-free”. We’re all familiar with data replication. Who hasn’t uploaded an Excel spreadsheet to Dropbox, got home, turned on the computer and discovered they were working on an old version of the document and now have two versions of the file? Replicating data in Dropbox is considered “naive”: if two changes are made at the same time, Dropbox is not capable of managing conflicts and asks us to choose.

With Yjs, that’s ancient history. The algorithm has in-depth knowledge of the content, and each addition or deletion is represented as a Transaction. When several transactions are created concurrently, they are applied in a way that takes each editor’s intention into account and minimises data loss. Broadly speaking, whoever adds text has priority over whoever deletes text. These transactions can be exchanged directly between peers or via a central server like Hocuspocus, developed by Tiptap’s creators.

In practice, the second a text is loaded into the browser, journalists can work on it, even offline. When the Wi-Fi connection is re-established, the two versions of the text are merged.
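To get a feel for how two offline copies can merge without asking anyone to choose, here is a toy replicated text in Python. It is a deliberately simplified sketch and not Yjs's actual algorithm: the `ToyText` class and its methods are invented for this example. Each character gets a unique id and remembers its left neighbour, deletions are kept as tombstones, and merging is a plain union of both states.

```python
class ToyText:
    """Toy replicated text: each character is anchored to its left neighbour."""

    def __init__(self, client_id):
        self.client_id = client_id
        self.clock = 0         # Lamport clock, used to build unique ids
        self.items = {}        # id -> (id of left neighbour or None, character)
        self.deleted = set()   # tombstones: ids of deleted characters

    def _ordered_ids(self):
        children = {}
        for item_id, (parent, _char) in self.items.items():
            children.setdefault(parent, []).append(item_id)
        ordered, stack = [], [None]
        while stack:
            current = stack.pop()
            if current is not None:
                ordered.append(current)
            # Push oldest first so the newest sibling is visited first.
            stack.extend(sorted(children.get(current, [])))
        return ordered

    def _visible_ids(self):
        return [i for i in self._ordered_ids() if i not in self.deleted]

    def text(self):
        return "".join(self.items[i][1] for i in self._visible_ids())

    def insert(self, index, char):
        visible = self._visible_ids()
        parent = visible[index - 1] if index > 0 else None
        self.clock += 1
        self.items[(self.clock, self.client_id)] = (parent, char)

    def delete(self, index):
        self.deleted.add(self._visible_ids()[index])

    def merge(self, other):
        """Union of both states, so the order of merges does not matter."""
        self.items.update(other.items)
        self.deleted |= other.deleted
        self.clock = max(self.clock, other.clock)


alice, bob = ToyText("alice"), ToyText("bob")
for i, ch in enumerate("Hello"):
    alice.insert(i, ch)
bob.merge(alice)        # both replicas now read "Hello"

alice.insert(5, "!")    # Alice, offline: "Hello!"
bob.delete(0)           # Bob, offline: "ello"
bob.insert(0, "J")      # Bob, offline: "Jello"

alice.merge(bob)        # back online: exchange states in any order
bob.merge(alice)
print(alice.text(), bob.text())  # both print "Jello!"
```

Because the merged state is a set union and the rendering is a deterministic function of that set, both replicas end up with the same text regardless of who syncs first.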


Tiptap and its Yjs building block gave us our proof of concept. Creating our own collaborative editor could be possible! But there was still one big mystery: how do you integrate a collaborative building block into a system that is not collaborative?

One API to rule them all

Contexte’s briefing editor is based on Django. It’s the engine behind our website contexte.com (which got a makeover for our 10th birthday!), and it also sends the briefings via email at eight o’clock sharp every morning. It’s also the engine for Scan, our legislative monitoring tool (whose technical architecture was presented at DjangoCon 2022 in case you’re interested).

To avoid the “big bang” effect when completely changing a tool, we opted for the API approach: what if Django exposed a web service for editing the briefing and the collaborative editor was just another client?

When a user connects to the editor, the intermediate server loads the briefing via the API and creates a Yjs document from it. This document is replicated to all users and automatically synced. Whenever the server detects modifications, it saves the briefing via the API at regular intervals. Technically, we implemented the API with Django REST framework, the intermediate server is a Node server, and we chose the Vue.js framework for the application because we were already using it for our pretty data visualisations and knew it was easy to use.
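The "save regularly" part is worth a quick illustration, since saving on every keystroke would flood the API. Below is a minimal Python sketch of one way to coalesce edits; it is not Echo's actual code (the real intermediate server runs on Node), and `BriefingSaver`, its `save` callback and the five-second interval are assumptions made for the example.

```python
import time


class BriefingSaver:
    """Coalesce frequent edits into at most one save per `min_interval` seconds."""

    def __init__(self, save, min_interval=5.0, now=time.monotonic):
        self.save = save                  # callable that persists a snapshot
        self.min_interval = min_interval
        self.now = now                    # injectable clock, handy for tests
        self.pending = None               # latest unsaved snapshot, if any
        self.last_saved = None            # time of the last save, if any

    def on_change(self, snapshot):
        """Called on every document update; only the latest state is kept."""
        self.pending = snapshot
        self.flush_if_due()

    def flush_if_due(self):
        """Save the pending snapshot if the interval has elapsed."""
        if self.pending is None:
            return False
        too_soon = (
            self.last_saved is not None
            and self.now() - self.last_saved < self.min_interval
        )
        if too_soon:
            return False
        self.save(self.pending)
        self.pending, self.last_saved = None, self.now()
        return True


# Simulated session with a fake clock; `saved` stands in for the API.
saved, clock = [], [0.0]
saver = BriefingSaver(saved.append, min_interval=5.0, now=lambda: clock[0])
saver.on_change("v1")   # first change: saved immediately
clock[0] = 1.0
saver.on_change("v2")   # too soon: kept pending
clock[0] = 2.0
saver.on_change("v3")   # replaces v2 as the pending snapshot
clock[0] = 6.0
saver.flush_if_due()    # interval elapsed: v3 is saved
print(saved)
```

In a real server a timer would call `flush_if_due` periodically, so the last pending snapshot is never left unsaved.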

This approach allowed us to develop the new editor as a separate building block without impacting the existing tool. The best bit: during the first real-life tests, we partially deployed it to some of the team, keeping the option of rolling back should a problem arise. We learnt a lot about how collaborative editing works during these tests. And of course, things didn’t go to plan.

Syncing is never smooth sailing

The main technical complexity of editing a briefing is that, unlike an article, it is not a long text but rather a series of news briefs, each with a title, a body of text and several annotations.

To model an entire briefing, we had to take Yjs beyond its original role, making it responsible not only for blocks of text but also for the briefing’s structure. We could do this because Yjs offers shared data structures, like Y.Map and Y.Array, which we used to represent everything that was not a piece of text. A document therefore contains several fragments. For example, a Monday-morning briefing includes more than 170 fragments, of which half correspond to blocks of text and the other half to metadata.

The issue? The way Yjs resolves metadata conflicts wasn’t exactly what we were looking for. For example, if a journalist deletes a news brief at the exact same time that someone else rearranges the news briefs, the new order (which includes the deleted news brief) takes precedence over the deletion, causing the news brief to reappear. Oops!

Alice: “A!”
Bob: “B!”
Yjs: “AB?”

The solution would be to include a validation process in the chain, but there’s a snag: in the world of Yjs, the server is not capable of arbitration and is a peer like the others. We therefore gave it an additional role: each time the briefing is modified, Django (our source of truth) reports any anomalies to the server, which then rectifies the problems. Introducing this mechanism allowed us to resolve the last of the briefing-related glitches.
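As a rough illustration of this kind of check, here is a Python sketch. It assumes a simplified data shape that is not necessarily Echo's: the order of the briefing is a list of brief ids, the briefs themselves live in a map, and a deletion removes the brief from the map. The function names `find_anomalies` and `repair` are hypothetical.

```python
def find_anomalies(order, briefs):
    """Compare the shared order with the briefs that actually exist."""
    return {
        # Ids still listed in the order although the brief was deleted.
        "dangling": [b for b in order if b not in briefs],
        # Briefs that exist but no longer appear in the order.
        "orphaned": [b for b in briefs if b not in order],
    }


def repair(order, briefs):
    """Return an order that lists each existing brief exactly once."""
    seen, fixed = set(), []
    for brief_id in order:
        if brief_id in briefs and brief_id not in seen:
            seen.add(brief_id)
            fixed.append(brief_id)
    # Briefs missing from the order are appended at the end.
    fixed.extend(b for b in briefs if b not in seen)
    return fixed


# Alice deletes brief "b" while Bob concurrently reorders all three briefs.
briefs = {"a": "Budget vote", "c": "Energy bill"}   # "b" is gone
order = ["c", "b", "a"]                             # Bob's order won the merge

anomalies = find_anomalies(order, briefs)
fixed_order = repair(order, briefs)
print(anomalies, fixed_order)
```

Here the deletion is treated as authoritative: the dangling id is dropped and Bob's new order is otherwise preserved.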

More on the topic: we then discovered liveblocks.io, a French-made solution inspired by Yjs, which deserves attention from all real-time tool developers.

Ain’t no mountain high enough

Trekkers are familiar with the feeling of pushing themselves to the limit to climb a mountain, only to find it’s the lowest peak in the range. We faced many such mountains during the Echo project: how could we help an entire newsroom transition to a tool we’d developed from scratch? How could we make it easier for different roles (journalists, proofreaders, production editors) to work on the same document? How could we ensure that instant editing did not create more errors than it resolved?

During the first months of product design, we employed a highly iterative approach: we prioritised the biggest challenges with internal prototyping phases and demos for the tech team and editorial staff in charge of monitoring the project. We regularly conducted user test sessions with a panel of journalists to check how user-friendly the prototypes were in real-life conditions.

For issues requiring conflict management, seeking advice proved the most effective method: the parts of the solutions that had the most impact on the editorial processes were devised collectively through brainstorming workshops, inviting the people involved to take part.

Lastly, in addition to the user testing and workshops, we ran a beta phase with a pilot team of journalists and created a communication channel to collect their feedback. We meticulously analysed and compiled the feedback on Linear, our tech team’s project management solution.

This design phase was facilitated by the streamlined development methodology adopted by Contexte, which places developers at the heart of design (inspired by Shape Up, from the creators of Basecamp), as well as by the editorial teams’ availability and willingness to cooperate and share their thoughts on the tool being developed.

Echo, the new collaborative briefing editor

We designed the Echo interface to be minimalist and content focused. We simplified the transition from Google Docs by using keyboard shortcuts (did you know that Cmd+Shift+8 inserts a bulleted list?). Input help is integrated into text fields to automatically apply French typographical rules.
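The exact rules Echo applies are not listed here, but a minimal Python sketch of this kind of input help could look like the following. It assumes one common French convention (a narrow no-break space before `;`, `!` and `?`, a no-break space before `:` and inside guillemets); the `french_typography` function is hypothetical.

```python
import re

NBSP = "\u00a0"    # no-break space
NNBSP = "\u202f"   # narrow no-break space


def french_typography(text):
    """Apply a few French spacing rules to already-typed text."""
    # Straight double quotes become guillemets padded with no-break spaces.
    text = re.sub(r'"([^"]*)"', lambda m: f"«{NBSP}{m.group(1)}{NBSP}»", text)
    # An ordinary space before ; ! ? becomes a narrow no-break space.
    text = re.sub(r" ([;!?])", NNBSP + r"\1", text)
    # An ordinary space before a colon becomes a no-break space.
    text = re.sub(r" :", NBSP + ":", text)
    return text


sample = french_typography('Il a dit "oui" !')
print(repr(sample))
```

Only punctuation that is already preceded by an ordinary space is touched, so URLs and times such as 10:30 are left alone.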

To avoid any catastrophic situations like “my cat sat on my keyboard and overwrote my news brief”, we provide a history of the changes so all users can go back and track any given news brief’s progress.

We didn’t leave Stabilo© enthusiasts out either, as they can make a note on any part of the briefing by leaving a comment. Journalists and copy editors can discuss and perfect news briefs’ content without having to leave the tool. We set up a two-way integration with Slack: when a comment is posted about a news brief, the author receives a notification and can respond directly in Slack, and the reply then appears in Echo.

---

It’s 24 January 2024, D-day, and we’re rolling the solution out to the whole editorial team. Our morning coffee tastes a bit different; the monitoring dashboard is set at full screen. The day’s briefings are gradually imported to Echo and the editorial team is asked to migrate to the new tool. The switchover goes smoothly. Initial feedback is enthusiastic, from “It’s the bomb!” to “Echo has given me an extra two or three years’ life expectancy”.

Echo marks a major step forward in the editorial experience, but the adventure doesn’t end there! The strong ties Echo forged between the tech and editorial teams are now fueling a long list of ideas to improve the tool, from usability to robustness. We plan to extend Echo to article creation in the near future and are continuing to invest in editorial tools to make our publishing processes as smooth and enjoyable as possible.