Learning from challenges

I was recently asked to write about a challenging software project I was involved in and what I learned from it. The following story is not the hardest coding problem I have ever worked on, but it is nonetheless an exciting project and, beyond the coding itself, gives you a good insight into my understanding of an agile and iterative approach to solving problems.

One sentence of context upfront: the company I worked for is the biggest German platform for physicians (160k users). It is both an expert network where colleagues can ask each other for help and a resource for medical knowledge and pharma information.

The community management of my company wanted to launch a regular newsletter that would keep users informed about the most recent activities on the platform (new articles, questions, news, …). The twist was that the newsletter content should be personalized for each user, based on the areas of interest they had picked in their settings.

In order to gather early feedback on this product idea, we aimed for a simple yet sufficiently viable solution: we built an endpoint on top of our main database and made use of existing algorithms to fetch the user-specific content. A worker app on our messaging system called this endpoint to assemble each newsletter, and another worker handed the emails over to our mail service provider.
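The MVP pipeline above can be sketched in a few lines. This is a minimal illustration, not the actual production code: the function names, the payload shapes, and the in-memory stand-in for the database endpoint are all assumptions.

```python
# Hypothetical sketch of the MVP newsletter worker. In production, the
# content would come from an HTTP endpoint on the main database and the
# result would go to a mail service provider; here everything is in memory.

def fetch_recommendations(user, content_by_area):
    """Stand-in for the endpoint: return items matching the areas of
    interest the user picked in their settings."""
    items = []
    for area in user["areas_of_interest"]:
        items.extend(content_by_area.get(area, []))
    return items

def assemble_newsletter(user, items):
    """Build the personalized email body from the recommended items."""
    lines = [f"Hello {user['name']}, new on the platform:"]
    lines += [f"- {item['title']}" for item in items]
    return "\n".join(lines)

# Example data standing in for the endpoint's response.
content = {
    "cardiology": [{"title": "New ESC guideline discussion"}],
    "oncology": [{"title": "Question: adjuvant therapy case"}],
}
user = {"name": "Dr. Example", "areas_of_interest": ["cardiology"]}

newsletter = assemble_newsletter(user, fetch_recommendations(user, content))
```

The point of the design is that the worker knows nothing about *how* recommendations are computed; it only calls an endpoint and formats the result, which is what made the later migrations possible.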

The MVP was a great success: we processed a huge number of newsletters per day, which yielded a lot of user traffic. But we were facing two major problems:

  1. We wanted to split all the editorial content off from the user-generated content in order to migrate it to a separate CMS. After that, the content would no longer live in a single database; instead we would have two distinct sources of data, with additional content sources already planned.
  2. We wanted to improve the recommendation algorithms, but our main (relational) database was not well suited to this specific purpose.

In order to solve the first problem, we decoupled the recommendation system from the main application and duplicated all content items into a standalone microservice. Whenever content was published or edited anywhere on the platform, an event was emitted to the messaging layer; a worker transformed the data and pushed it into the recommendation service. That approach was far more resilient, and we also saw a huge performance boost. In terms of technology, we chose to start with a document-based datastore and reimplemented the data processing logic on the new server.
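The event-driven sync described above can be sketched as follows. All names and event shapes are assumptions for illustration; in reality the worker would consume from the messaging layer and write to the recommendation service over the network.

```python
# Hedged sketch of the decoupled sync path: an event emitted on every
# publish or edit is transformed by a worker and upserted into the
# standalone recommendation store (here a plain dict stands in for it).

recommendation_store = {}  # stand-in for the document-based datastore

def transform(event):
    """Normalize a platform event into the recommendation service's schema."""
    return {
        "id": event["content_id"],
        "title": event["payload"]["title"],
        "tags": event["payload"].get("tags", []),
    }

def handle_content_event(event):
    """Worker callback: upsert the transformed item, so edits overwrite
    the earlier copy instead of creating duplicates."""
    item = transform(event)
    recommendation_store[item["id"]] = item

# A publish event, followed by an edit of the same item.
handle_content_event({"content_id": 42, "payload": {"title": "Draft"}})
handle_content_event({"content_id": 42, "payload": {"title": "Final", "tags": ["news"]}})
```

Because every source of content only has to emit events, adding a new content source (such as the separate CMS) does not require touching the recommendation service itself.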

Next, we worked on the recommendation algorithms: together with product owners and stakeholders, we thought of more sophisticated ways to tag the content. As upcoming requirements, the system should be capable of promoting particular content items and should take tracking data into account (e.g. the popularity of an item). It became clear that our content structure was best modeled as a graph, so we decided to replace the document store with a graph database, which allowed us to link each item to all kinds of metadata. We had little hands-on experience with this type of database, but we considered it the right choice and were given the chance to learn it.

Although both the software and the product are still work-in-progress, this is a good point to conclude the story. There is no finish line; my description is a snapshot of something that is ever changing and continuously transforming. The moral is not the solution; it is the approach.

Telling this story lets me point out three things that I take away from it. These are my learnings:
