ミツモア Tech blog

「ミツモア」を運営する株式会社ミツモアの技術ブログです

Redesigning Kanban Board Sorting for Better Performance and Lower Costs

Storing and updating a list of ordered items is a common requirement, though one that can be more complicated than first impressions might suggest. As is often the case, solutions at scale are where problems arise, and this was true for MeetsMore's Kanban board feature in its ProOne product.

As a first implementation, we simply stored an array of Kanban item identifiers in the database record for each Kanban column. The order of identifiers indicated the display order; when a user moved an item to a new position in a Kanban column, we updated the corresponding array and persisted it to the database. However, as the number of items in a Kanban column increased (sometimes in the tens of thousands), we encountered issues:

  • The amount of data being transferred between server and database was large, and frequent
  • Only one insert or re-order operation could happen at a time, leading to long transaction blocking and ultimately timeouts
  • Referential integrity was not enforced, leading to situations where Kanban data was out of sync with actual item state

We needed a better solution that would scale the number of Kanban items "infinitely" with minimal impact to performance, such that:

  • It is quick to read the sort order
  • It is quick to insert a new item in any position
  • It allows concurrent re-order operations
  • Ideally, the sort data does not require any additional periodic maintenance (e.g. redistributing values)

Some Internet research led us to an excellent article by Nick McCleery: "A robust mechanism for Kanban board column indexing". We recommend reading the original article for a more in-depth view, but ultimately we chose the "Lexicographic Indexing" approach, where item positions are represented as character strings that use alphabetic sorting to determine their order. For example:

AAB AAC AAD

This allowed us to store the Kanban position of each item independently in the database (rather than in a single array), providing cheap database-side sorting, fast insertion/update of positions, and referential integrity.

The challenge we now faced was to address some of the limitations of the Lexicographic Indexing solution for our specific usage scenarios; namely:

  1. Prevent rapid growth of position value length, in a system where the expected number of items is not known and can grow substantially
  2. Prevent duplicate position values, which would cause non-deterministic sort order and make insertion between values impossible (or overly complex)

Prevent rapid growth of position value length

The basic principle of "infinite" position values is to append a new character whenever the incrementing character reaches the end of the alphabet; for example:

B...Z->ZB...ZZ->ZZB...ZZZ

NOTE: This is also true for decrementing, when reaching the beginning of the alphabet (Z...B->AZ...AB->AAZ).

Let's call this "increasing precision". While this does work, you can see that each increase in "precision" only affords 25 additional position values before needing to increase precision again. In the example above, incrementally increasing precision to 3 provides a total of 75 unique position values. This results in very lengthy position values in cases with more than a few hundred items.

To exponentially increase available position values at the same precision, we instead use "fixed precision". In this approach, each position value has a "fixed" (or implied) precision, essentially pre-allocating a block of position values. For example, at a fixed precision of 3:

AAB...AAZ->ABB...ABZ->ACB...ZZZ

With this approach, we now have roughly 16900 (25*(26^(precision-1))) unique position values before needing to increase precision incrementally. Since using a fixed precision provides an exponential increase, it's relatively cheap to choose a precision that far exceeds the expected maximum number of items, to avoid ever needing to increase precision incrementally. While the fixed-length position value strings will consume more storage space initially for a small number of items, it will more than recoup that loss as the number of items grows.

Prevent duplicate position values

In order to have deterministic ordering and insertion between existing position values, we must guarantee that no two positions can have the same value. From a database perspective, that's easy: simply add a unique constraint. However, ensuring that the system doesn't generate two identical position values - or being able to handle it when it does occur - is perhaps more challenging than expected.

When adding a new item to the top or bottom of the Kanban column, we need to retrieve the current first or last position in order to generate the new position value - decrementing or incrementing it accordingly. But when serving concurrent requests, this becomes a race condition. Essentially this problem is analogous to implementing a custom auto-incrementing primary key.

Initially we opted for optimistic concurrency, where we allowed duplicate position values to be generated if a race condition occurred; one database operation would succeed, while the constraint violations would be caught and retried. However, testing revealed a higher number of collisions than expected, and selectively retrying transactions proved too complex.

Instead, pessimistic concurrency was implemented via locking of the Kanban column during position generation. While this isn't ideal given our goal of avoiding transaction blocking, we were able to narrow its scope such that no negative impact was observed.

Outcome

The Lexicographic Indexing with fixed precision and position value generation via pessimistic concurrency was a success:

  • Insertion & re-ordering operations on a Kanban column containing a million items takes less than 200 milliseconds
  • Transaction deadlocks and timeouts have been eliminated
  • Data transfer has been significantly reduced
  • Referential integrity of Kanban items is maintained
  • General database health has been improved

We now have a Kanban solution that provides a much better user experience and is scalable for the future. And as a bonus, the reduced load on our database has resulted in a general reduction of resource consumption, saving the company a not-insignificant amount on our monthly operational invoice.

Acknowledgements

We'd like to thank Nick McCleery for generously sharing his knowledge, and we hope that this article will provide useful information for the next person working on this topic.

Further reading

To learn more about the exciting work being done at MeetsMore, check out this story about how AI is changing the way MeetsMore develops software!