
Conveyor Architecture Series: Data Syncing

We recently launched Conveyor, the newest product in the Wildbit family. Conveyor is a three-in-one solution that includes a web app, Git hosting, and a desktop client. It fuses project management and version control into one seamless workflow. By taking a more integrated approach, you and your team can focus on the things that matter and let Conveyor handle more of the plumbing. That means less overhead and less juggling of separate apps to commit code and update tasks.

We’ve been so busy working on Conveyor that we didn’t set aside time to share any details about how we’re building it, what makes it special, and the technical challenges we had to solve along the way. This post is the first of a series that will give you a look behind the scenes—a tour of the sausage factory if you will.

A bit of backstory

When we originally started working on Conveyor, we quickly realized that for our vision to work we had to have a desktop client. We needed to unify and bridge what’s local and what’s remote (your computer and our servers) to make the experience more seamless. And we really wanted to make a native macOS app because we believed it would provide the best user experience.

Screenshot of the native macOS version of Conveyor's client.
The initial version of Conveyor was a native macOS client.

I’ve already written a blog post describing our experience with building the native app, so I’m not going to go into details here. But long story short: after two years of development we had to scrap it and go back to the drawing board. We then spent the first half of 2018 rewriting the Conveyor desktop client using technologies that are near and dear to our hearts: the web stack.

A screenshot of the Electron version of the Conveyor app with a section highlighted and a window for viewing the source of the app.
Conveyor currently has an Electron-based desktop app for managing development work.

After some research, we decided on the following stack for our desktop app: Electron, TypeScript, and React. For the data store, we use Git for version control in conjunction with MySQL, a central CouchDB instance, and individual instances of PouchDB for the desktop app.

It has been an enormous challenge, and we explored quite a few interesting solutions that we think are worth sharing. Before we get into the nitty-gritty details, a quick disclaimer is in order: a lot of the solutions we’ve implemented so far remain mostly unproven. Our system seems to be humming along nicely with the arrival of our first users, but your mileage may vary.

A quick summary of Conveyor’s architecture

As I mentioned earlier, Conveyor is a product that consists of three major parts: the web app, the Git service, and the desktop app. All three parts must work together seamlessly in order for your local development workflow, task management, and hosted version control to feel like a unified system.

The web app is primarily where users manage their workspaces and projects, as well as the majority of project management tasks. It’s perfect for collaboration and for members of the team who don’t usually make commits (managers, team leads, owners, etc.).

Our Git service is coupled tightly with the desktop app to store and securely distribute Git repositories. Having our own Git service allowed us to provide zero-configuration access to repositories. In Conveyor we increasingly treat Git as an implementation detail: a powerful engine that lives in the basement and that you don’t need to interact with directly. That’s why the operation of our Git service is transparent to the end user. You don’t have to set up your SSH keys, manage the Git config on your computer, make clones, set up remotes, and so on. It just works out of the box.

In terms of implementation, both the web app and the Git service were familiar territory for us. We’ve done that before with Beanstalk and DeployBot.

The desktop client, however, was nothing like what we’ve built before. It’s responsible for providing a quick and easy way of accessing and using repositories on your computer: cloning a repository, creating a branch, checking out a branch, staging files, making commits, stashing changes. It also provides a natural interface for managing your tasks alongside your version control work.

With the desktop client installed on thousands of computers all over the world, all of the instances would need to talk to each other, as well as to the web app and the Git service, in order to exchange data.

Consider also that some desktop clients would have excellent internet connectivity, some would have subpar connectivity (a busy coffee shop), and some none at all (an airplane). In all three cases you should be able to do basic work on your projects: create tasks and make commits. We quickly realized that we needed a robust, fault-tolerant system that allowed every member of the system to exchange data with the others. How hard can that be?

The centralized approach

Our first take on the issue was with the original native client. We wanted to keep things simple, so we opted for a basic centralized system: an API service that every desktop client would connect to, using various endpoints to receive and submit data. This was good because we had a single place to control what happened to the data and how it happened. It ensured consistent behavior between the desktop app and the web app.
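
To give you an idea of what that looked like in practice, here’s roughly the shape of that flow. The endpoint and names below are illustrative, not our actual client API: the client asked dedicated endpoints for anything that changed since its last sync, and had to cope on its own when the request failed.

```typescript
// Hypothetical sketch of the old centralized flow, not the real Conveyor API.
async function pullUpdates(lastSyncedAt: string): Promise<unknown[]> {
  const response = await fetch(
    `https://api.example.com/v1/updates?since=${encodeURIComponent(lastSyncedAt)}`
  );
  if (!response.ok) {
    // Offline or server trouble: the app had to detect this and degrade gracefully.
    throw new Error(`Sync failed with status ${response.status}`);
  }
  return response.json();
}
```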

A diagram illustrating how the initial desktop client communicated directly with a Rails API backed by MySQL
The centralized approach required Conveyor to be online to connect with the Rails-based client API.

But it wasn’t all peachy. If you lost your internet connection, your client suddenly could no longer talk to the API, and your user experience went down the drain. We had to add a lot of conditions in the app to adjust its behavior depending on whether it was offline or online (or somewhere in between). This increased the complexity of our code quite a bit.

And when the client came back online, we needed a way to quickly catch up on all the updates that happened while it was disconnected (imagine your computer being asleep for 7 days). We spent quite a lot of time optimizing this syncing process, but we never got it to a point where it was really fast. It created a noticeable delay when launching the app after it had been offline for an extended period. Moreover, we had to maintain dozens of API endpoints on the web app to keep the centralized approach working.

The distributed approach

Ideally, what we needed was a distributed system, so that our desktop clients could operate independently even if our API was unreachable or you were having connectivity problems. After some searching and experimenting, we decided to leave the heavy lifting of data syncing to a proven solution that already works: CouchDB and its replication mechanism. Why reinvent the wheel?

A diagram illustrating desktop apps backed by individual PouchDB instances tied to the primary CouchDB service which relies on a custom Sync Service with MySQL.
While we'd eventually like to entirely remove MySQL from the architecture, for now, we've created a custom Sync Service that monitors MySQL and the primary CouchDB instance to keep everything current.

We decided that each desktop client would have its own PouchDB instance (PouchDB uses the same replication protocol as CouchDB but is easier to distribute). That local Pouch would store all the data necessary for the client to work, so it wouldn’t need to connect to anything else. Every little Pouch would then connect to the central CouchDB cluster and replicate data with it. This way we have a mechanism for syncing data between the central cluster and every single desktop client. Out of the box. Practically for free!
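
Here’s a minimal sketch of the idea. The database names and URLs are illustrative, not Conveyor’s actual code: the client only ever reads and writes its local PouchDB, and replication takes care of moving that data to and from the central CouchDB.

```typescript
import PouchDB from 'pouchdb';

// The client only ever talks to its local database...
const localDb = new PouchDB('conveyor-workspace');

// ...and replication moves data to and from the central CouchDB cluster.
const centralDb = new PouchDB('https://couch.example.com/workspace-123');

async function example() {
  // Local reads and writes work with or without a network connection.
  await localDb.put({ _id: 'task:42', title: 'Fix the flux capacitor', status: 'open' });

  // One round of bidirectional replication with the central cluster.
  await localDb.sync(centralDb);
}
```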

And let me tell you, CouchDB replication is fast. If your computer was asleep or offline for several days, as soon as you came online your local PouchDB would receive all of the updated data from the central CouchDB almost instantly.

There is a little caveat with the web app, though. The web app was built with the centralized system in mind. It’s a simple Rails app with MySQL as its primary datastore, which was supposed to be the source of truth for the entire system. With the new distributed approach, and CouchDB serving as the source of truth for the desktop clients, we needed a way to make the web app play nicely with that.

After migrating the desktop app from native to Electron, we couldn’t afford to also rewrite our entire web app, so we decided to leave MySQL and Rails in place. Instead, we implemented a special service called SyncService, which is responsible for moving data between MySQL and the central CouchDB. When the Rails app writes something new to MySQL, the SyncService automatically exports it to CouchDB. And whenever something new appears in CouchDB, the SyncService gets notified and imports that change into MySQL.
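
Here’s a rough sketch of those two directions for a hypothetical task document. The libraries (nano for CouchDB, mysql2 for MySQL), table, and field names below are illustrative, not the real SyncService.

```typescript
import nano from 'nano';
import mysql from 'mysql2/promise';

const couch = nano('http://couch.example.com:5984');
const workspaceDb = couch.use('workspace-123'); // one CouchDB database per workspace
const pool = mysql.createPool({ host: 'localhost', user: 'sync', database: 'conveyor' });

// MySQL -> CouchDB: called after the Rails app writes a new row.
export async function exportTask(task: { id: string; title: string }) {
  // A real implementation would fetch the current _rev to update existing documents.
  await workspaceDb.insert({ _id: `task:${task.id}`, title: task.title });
}

// CouchDB -> MySQL: called when Octopus reports that a document changed.
export async function importTask(docId: string) {
  const doc = await workspaceDb.get(docId);
  await pool.execute(
    'INSERT INTO tasks (id, title) VALUES (?, ?) ON DUPLICATE KEY UPDATE title = VALUES(title)',
    [docId.replace('task:', ''), (doc as any).title],
  );
}
```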

A diagram illustrating our custom service, Octopus, listening for changes and notifying our custom Sync Service to propagate changes from CouchDB to our MySQL instance.
We have a custom listening service named Octopus that monitors our CouchDB instance for changes and notifies Sync Service to propagate the changes to MySQL.

SyncService gets its notifications about CouchDB changes from a service we’ve built called Octopus. It’s a Node.js app that watches for changes across all of our CouchDB databases (we have a separate database for each workspace) and then sends an HTTP hook to the SyncService when the time comes to suck those updates into MySQL.
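
The pattern looks roughly like this (a simplified sketch rather than Octopus itself, with made-up URLs and hook path): follow a database’s _changes feed and ping the SyncService over HTTP whenever a document changes.

```typescript
import PouchDB from 'pouchdb';

// Point PouchDB at the remote CouchDB database to follow its _changes feed.
const workspaceDb = new PouchDB('https://couch.example.com/workspace-123');

workspaceDb
  .changes({ since: 'now', live: true })
  .on('change', async (change) => {
    // Tell the SyncService which document changed so it can import it into MySQL.
    // Uses the global fetch available in Node 18+.
    await fetch('https://sync.example.com/hooks/couchdb-change', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ db: 'workspace-123', id: change.id, seq: change.seq }),
    });
  })
  .on('error', (err) => console.error('changes feed error', err));
```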

The SyncService is one of the pieces of our system that I’m not entirely happy with right now, as it adds quite a bit of complexity. Eventually we’re planning to make CouchDB our primary store for most of the data and use MySQL for a smaller subset of data that’s only needed by the web app.

Offline-by-default mentality

The fact that each desktop client has its own local PouchDB means that it never has to connect to a remote server to get its data. From the perspective of the desktop client, it is always operating as if there is no internet; it simply doesn’t care. PouchDB will replicate with the central server when the internet is available, moving the data back and forth automatically. And when you go offline, the replication will just pause, and your desktop client will continue working as usual because your local PouchDB is alive and well. When the internet comes back, PouchDB reconnects to the central server automatically and resumes replication. Works like a charm!
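
In PouchDB terms, that behavior mostly falls out of live replication with retries enabled. A minimal sketch, again with illustrative names rather than our actual code:

```typescript
import PouchDB from 'pouchdb';

const localDb = new PouchDB('conveyor-workspace');
const centralDb = new PouchDB('https://couch.example.com/workspace-123');

localDb
  .sync(centralDb, { live: true, retry: true }) // retry keeps attempting to reconnect when offline
  .on('paused', (err) => {
    // Replication pauses when it's caught up or when the network drops;
    // either way the app keeps working against the local database.
    console.log(err ? 'offline, replication paused' : 'caught up and idle');
  })
  .on('active', () => console.log('back online, replicating changes'));
```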

This means that for you as a user, going offline is not a scary experience where half of Conveyor switches into emergency mode and stops working. You can pretty much continue working as usual: create a new task, add a Markdown description for it, make some commits, post some comments. When Conveyor senses that your internet is back on, it will synchronize all of your changes with the rest of the clients.

Finally, with CouchDB taking care of data syncing for us, we no longer needed the majority of our API. All endpoints dedicated to sending or receiving data were removed. Now we only have a handful of actions that absolutely have to happen on the server side while the client is connected (e.g. authentication).

To be continued

In this post we looked at Conveyor’s history: our original native macOS client and its centralized approach to data syncing, as well as the new Electron client and its distributed approach. In the next post we’re going to take a deeper dive into our Electron app and talk about how we handle the data layer there with React, MobX, mobx-state-tree, and other nifty libraries. Stay tuned!

Reach out if you have any questions; we’re always happy to chat. And sign up for a free Conveyor account; we’re curious to hear your opinion.
