The New York Times, one of the world's major news organizations, published its first issue on September 18, 1851. In the 164 years since, we have published approximately 16 million articles, and for the past 20 years we have also published online. We are now building a new publication pipeline around Kafka. This pipeline serves as the source of truth for all published content and decouples the producers of content from its consumers. It is implemented as an immutable log: all content is published to the log, and all the back-end systems driving the different online experiences access content by consuming that log. I want to explain how this publishing pipeline works, how it interacts with other systems, and what our experience has been so far.
Boerge Svingen, Director of Engineering, The New York Times
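To make the log-based design in the introduction concrete, here is a minimal in-memory sketch of the idea: producers append to a log that is never modified, and each consumer reads it independently from its own position. It is a toy illustration only. The names `AppendOnlyLog`, `Consumer`, `publish`, and `poll` are hypothetical and are not Kafka's API or the pipeline's actual code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Record:
    """One immutable entry in the log (hypothetical shape)."""
    offset: int
    payload: str


class AppendOnlyLog:
    """Toy stand-in for a log: records are only ever appended."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def publish(self, payload: str) -> int:
        # The offset is simply the record's position in the log.
        offset = len(self._records)
        self._records.append(Record(offset, payload))
        return offset

    def read_from(self, offset: int) -> Tuple[Record, ...]:
        # Return a read-only snapshot of everything at or after `offset`.
        return tuple(self._records[offset:])


class Consumer:
    """Tracks its own position, so consumers never affect each other."""

    def __init__(self, log: AppendOnlyLog) -> None:
        self._log = log
        self.position = 0
        self.seen: List[str] = []

    def poll(self) -> int:
        # Read everything published since the last poll.
        records = self._log.read_from(self.position)
        for record in records:
            self.seen.append(record.payload)
        self.position += len(records)
        return len(records)


log = AppendOnlyLog()
log.publish("article-1")
log.publish("article-2")

search = Consumer(log)    # e.g. a search indexer
search.poll()

log.publish("article-3")

website = Consumer(log)   # joins later and replays the log from the start
website.poll()
search.poll()
```

Because the log is the only shared state, the producer needs no knowledge of `search` or `website`, and a consumer added later can rebuild its view by replaying from offset 0.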