We identified messages from a particular external data source that were malformed in a way that dramatically slowed down our processing but did not cause visible errors. Having identified those messages, we've disabled our integration with that data source for now and deleted all the malformed messages from our queues. We are currently re-enabling some monitoring we turned off while investigating (the monitoring itself was also impacted by the bad messages). If we don't identify any other places where the messages were stored, we expect to be fully recovered within 30 minutes.
The external data source in question only affects one customer. We'll be in touch with that customer tomorrow to investigate the cause of the malformed messages and determine appropriate next steps.
Posted Jan 22, 2020 - 22:34 EST
We are seeing some workers in our data processing pipeline performing very sluggishly, delaying some data updates.