Platform outage
Incident Report for Malomo
Resolved
We are now considering this incident “Resolved” following Friday’s restoration of orders placed from Nov. 1 up to the outage. We are continuing to restore all historical data for orders placed prior to Nov. 1, and will share an update once this is completed.

Our dashboard and reporting features are operational, but are limited to the order data currently available in our system. Please see our previous status update for more details regarding expected behavior for orders placed beginning Nov. 1.

You may notice some older orders appear in our system before the full restoration is complete. This will happen if we receive an order update from Shopify for orders placed prior to Nov. 1. Our system will create a new order and trigger a ShipmentCreated event to be sent to integrated apps, regardless of when the order was fulfilled. If you are using the Malomo: ShipmentCreated metric to send Shipment Confirmation emails in Klaviyo, we recommend adding a new trigger split to your flows to check for the latest carrier status. If the status is “in-transit”, “out for delivery” or “delivered”, do not send the email.
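
The trigger-split check recommended above can be sketched in code. This is an illustration only: Klaviyo trigger splits are configured in the Klaviyo flow editor rather than written in Python, and the function name and the case-insensitive normalization are assumptions. The three status values are the ones quoted above.

```python
# Illustrative sketch only; the names here are hypothetical and this is
# not Klaviyo configuration or Malomo code.

# Carrier statuses quoted in the update above. A shipment in any of these
# states is already underway, so a Shipment Confirmation email is skipped.
SKIP_STATUSES = {"in-transit", "out for delivery", "delivered"}


def should_send_shipment_confirmation(carrier_status: str) -> bool:
    """Return False when the latest carrier status is one of the skip statuses."""
    # Assumption: compare case-insensitively and ignore surrounding spaces.
    normalized = carrier_status.strip().lower()
    return normalized not in SKIP_STATUSES
```

With this rule, a latest status of "delivered" suppresses the email, while any status outside the three listed values lets the email send.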

We are working on an incident report and root cause analysis in partnership with a third-party consultant. This report will detail the corrective and preventive measures we have either already implemented or plan to implement. We will share this report when it is completed.
Posted Dec 12, 2023 - 18:03 EST
Update
We have finished importing data for all orders placed beginning Nov. 1 leading up to the outage.

During import, we identified a scenario affecting a small number of orders where a shipment was not attached to the order. At the moment, we do not plan to re-register shipments for these orders because doing so might trigger unwanted events and messages in connected apps which may confuse those affected customers. We will investigate potential workarounds.

For the majority of orders, excluding those missing shipments, tracking pages will update as new shipment events are received by our system. New order and shipment events received by our system after the time of import will be processed and sent to integrated apps as expected. Any order or shipment events received by our system prior to the time of import will not be sent to integrated apps.
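
The cutoff rule described above can be sketched as follows. This is a minimal illustration, not Malomo's code: the function name is hypothetical, and treating an event received at exactly the import time as not forwarded is an assumption, since the update does not address that boundary.

```python
from datetime import datetime


def should_forward_event(received_at: datetime, imported_at: datetime) -> bool:
    """Decide whether an event is sent on to integrated apps.

    Events received after the order's import time are forwarded; events
    received before it are not. The exact-equality case is an assumption.
    """
    return received_at > imported_at
```

Both arguments should use the same time zone convention (both naive or both timezone-aware), otherwise Python raises a TypeError on comparison.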

Orders placed during the outage up until last Friday’s partial database restoration have been fully restored and will continue to receive updates from Shopify and send events to integrated apps. Orders placed after the partial database restoration on Friday, 12/1 will continue to work as expected.

As you review the Malomo dashboard, you may notice that some November orders are duplicated on the Orders Beta page. One record is the newly imported order, which includes all data as expected on the Order Details page. The other record is leftover from the outage and will display an “Order Not Found” error when clicked. This duplicate record has no impact on your Malomo data or integrated apps. Our team is currently working to remove all duplicate records from the Orders Beta page. There will be no downtime from this process.

Now that priority data has been restored to accounts, our team will begin importing historical data prior to November 1st.

For more information on a smooth transition back to Malomo-powered notifications, please view our Merchant Action Plan here: https://drive.google.com/file/d/1djmD0ztpP9rDlJOSmkVJPO-pSkU_WGKD/view
Posted Dec 08, 2023 - 09:11 EST
Update
We are continuing to import data into our platform for all orders placed in November leading up to the outage. At this point, 78% of November orders have been imported.

Once this first import is completed, tracking pages will update with the most recent shipment status provided by the carrier. New events received after the time of import will be processed and sent to Klaviyo as expected. Shipping update events received prior to the time of the import will not be sent to integrated apps.

As a reminder, you can access our Merchant Action Plan here: https://drive.google.com/file/d/1djmD0ztpP9rDlJOSmkVJPO-pSkU_WGKD/view
Posted Dec 07, 2023 - 10:28 EST
Update
We have released an update to resolve an issue with our Postscript integration that resulted in duplicate events being sent. The Postscript integration has been re-enabled, and the majority of duplicate events were removed from our system. Events received by our system while the integration was paused have been processed and sent, although a small number of customers may notice duplicate events sent from this period. New events received by our system after the integration was re-enabled are now being processed and sent as expected. The Postscript integration was paused between approx. 4:10 pm - 8:11 pm EST.
Posted Dec 06, 2023 - 20:38 EST
Update
We have released an update to resolve an issue with our Attentive integration that resulted in duplicate events being sent. The Attentive integration has been re-enabled, and duplicate events were removed from our system. Events received by our system while the integration was paused have been processed and sent. New events received by our system after the integration was re-enabled are now being processed and sent as expected. The Attentive integration was paused between approx. 4:10 pm - 7:01 pm EST.

We are continuing to work on resolving the same issue for the Postscript integration.
Posted Dec 06, 2023 - 19:41 EST
Update
We are now importing data into our platform for all orders placed in November leading up to the outage. Once this first import is completed, tracking pages will update with the most recent shipment status provided by the carrier. New events received after the time of import will be processed and sent to Klaviyo as expected. Shipping update events received prior to the time of the import will not be sent to integrated apps.

Note for Klaviyo customers:
If you are using our event metric “Malomo: ShipmentCreated” to trigger shipping confirmation emails, we recommend adding a flow filter to your Klaviyo flows to check whether an order has been delivered, and if so, filter customers out from the flow. As we import orders, this will prevent a unique situation we’ve identified that might trigger an email to customers if the order was placed prior to the outage yet fulfilled in the past week.

Note for Attentive and Postscript customers:
We are also working to resolve an issue with our Attentive and Postscript integrations that is resulting in duplicate events being sent. We temporarily paused all outgoing events to these integrations at approximately 4:10 pm EST in order to troubleshoot the issue.

We are working on an action plan that you can use to transition your post-purchase experience back to Malomo. We will share this plan shortly.
Posted Dec 06, 2023 - 17:10 EST
Update
Our database has been fully restored from backup. We are now preparing to import orders from the backup, prioritizing those placed beginning Nov. 1 up until the outage occurred.
Posted Dec 06, 2023 - 10:24 EST
Update
Our team is continuing to monitor the restoration process. We do not have any new information to share at this time.

Please see our previous updates for more details on our recovery strategy.
Posted Dec 05, 2023 - 21:46 EST
Update
Our team is finishing a full database restoration and once complete, will begin importing orders, prioritizing the last 2 weeks of data prior to the outage. Once those orders have been imported, tracking pages will update with the most recent shipment status provided by the carrier. New events received after the time of import will be processed and sent to integrated apps as expected. Any events received prior to the time of import will not be sent to integrated apps. Once we complete the initial import of the most recent orders, we will begin to import all prior historical data.
Posted Dec 05, 2023 - 17:30 EST
Update
Our team is finishing a full database restoration and once complete, will begin importing orders, prioritizing the last 2 weeks of data prior to the outage. Once those orders have been imported, tracking pages will update with the most recent shipment status provided by the carrier. New events received after the time of import will be processed and sent to integrated apps as expected. Any events received prior to the time of import will not be sent to integrated apps. Once we complete the initial import of the most recent orders, we will begin to import all prior historical data.
Posted Dec 05, 2023 - 13:56 EST
Update
Our team is finishing a full database restoration and once complete, will begin importing orders, prioritizing the last 2 weeks of data prior to the outage. Once those orders have been imported into our platform, tracking pages will be restored for those orders and we should start to receive events for new carrier updates.
Posted Dec 05, 2023 - 10:45 EST
Update
Our engineering team is continuing to restore missing data for orders placed prior to the outage, which began on Nov. 30 at approximately 3:00 am EST. At this time, our platform does not yet have order and shipment data for orders placed prior to the outage. Tracking pages for orders fulfilled prior to the outage have not yet been restored and we are not sending carrier events for those shipments.

Our team is finishing a full database restoration and, once it is complete, will prioritize importing the last 2 weeks of data prior to the outage. Our team is targeting the import to begin today, Monday 12/4. Once those orders have been imported into our platform, tracking pages will be restored for those orders and we should start to receive events for new carrier updates.

Orders placed during the outage up until Friday’s partial database restoration have been fully restored and will continue to receive updates from Shopify and send events to integrated apps. Orders placed after the partial database restoration on Friday, 12/1 will continue to work as expected. Please see our previous status updates for more details.
Posted Dec 04, 2023 - 12:12 EST
Monitoring
Incident status has been updated to Monitoring while we continue to restore our database.
Posted Dec 01, 2023 - 21:44 EST
Update
Incident severity has been downgraded to a Partial Outage.
Posted Dec 01, 2023 - 21:35 EST
Update
Our engineering team is actively working to restore missing data for orders placed prior to the outage, which began on Nov. 30 at approximately 3:00 am EST.

Orders placed during the outage up until today’s partial database restoration have been imported into the database, and should continue to receive updates from Shopify and send events to integrated apps. Orders placed after today’s partial database restoration should continue to work as expected. Please see our previous status update for more details.

We do not yet have a clear timeline for the full restoration of our database but we have increasing confidence that our restoration process is working. We will continue to post updates here throughout the weekend and until all issues are fully resolved.
Posted Dec 01, 2023 - 21:21 EST
Update
Our database has been partially restored from backup. We are actively working to restore missing data for any orders placed prior to the outage, which began on Nov. 30 at approximately 3:00 am EST.

The Malomo dashboard is accessible with limited data until the full database restore is complete. Orders placed during the outage have been imported and can be viewed in the dashboard via the Orders page, but not the Orders Beta page. Please note that the “Order Placed” timestamps displayed in our dashboard correspond to the time of import rather than the time the order was placed. Corresponding events sent to Klaviyo include the correct timestamp.

Tracking pages for orders placed during the outage are beginning to show shipment updates as expected. Tracking pages for orders placed prior to the outage will continue to experience issues until the full database restore is complete.

Our Events Processor is now operational, and you will begin to see events flowing back into your integrated apps. Events will be processed and sent in the order in which they were received, so new events triggered after restoration will continue to experience some delays until the system is fully caught up.
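
The in-order processing described above explains why new events are delayed: they wait behind the backlog. A minimal sketch of that first-in, first-out behavior, using a hypothetical function name and plain strings as stand-ins for events (this is not the Events Processor's actual code):

```python
from collections import deque


def drain_in_arrival_order(backlog, new_events):
    """Return events in the order they would be sent under FIFO processing."""
    queue = deque(backlog)    # events that accumulated during the outage
    queue.extend(new_events)  # newer events join the back of the queue
    sent = []
    while queue:
        # Always take the oldest event first, so newer ones wait their turn.
        sent.append(queue.popleft())
    return sent
```

Under this model, an event triggered after restoration is only sent once every earlier event in the queue has been handled.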

If you placed your Klaviyo flows in manual mode, you can begin working through your queue in the Needs Review tab to manually send messages. If you did not make any changes to your Klaviyo flows, they will begin triggering as events are received by Klaviyo.

For more information on manually sending flow messages in Klaviyo, as well as turning messages from manual to live, please visit the Klaviyo Help Center (https://help.klaviyo.com/hc/en-us/articles/115002779331).

We will continue to post updates here throughout the day and until all issues are fully resolved.

We appreciate everyone’s patience, understanding and kind words as we continue to push to full resolution.
Posted Dec 01, 2023 - 17:51 EST
Update
Our engineering team continues to work on our approach to restoring our systems, and we feel very confident that our current approach is working. At this time, however, we do not yet have a clear ETA for resolution. We will continue to post updates here throughout the day and until all issues are fully resolved.
Posted Dec 01, 2023 - 15:15 EST
Update
Our engineering team continues to work on our approach to restoring our systems, and we feel very confident that our current approach is working. At this time, however, we do not yet have a clear ETA for resolution. We will continue to post updates here throughout the day and until all issues are fully resolved.
Posted Dec 01, 2023 - 12:26 EST
Update
Our engineering team continues to work on our approach to restoring our systems, and we feel very confident that our current approach is working. At this time, however, we do not yet have a clear ETA for resolution. We will continue to post updates here throughout the day and until all issues are fully resolved.
Posted Dec 01, 2023 - 09:47 EST
Update
Our engineering team continues to work around the clock on our approach to restoring our systems, and we feel very confident that our current approach is working. At this time, however, we do not yet have a clear ETA for resolution. We will continue to post updates here throughout the day and until all issues are fully resolved.
Posted Dec 01, 2023 - 05:04 EST
Update
Our engineering team continues to work diligently on restoring the database. Until it is restored, you will experience issues accessing our dashboard, viewing tracking pages and receiving events in integrated apps. The size of our database makes this a time-intensive process since we are transferring huge files and loading them into a fresh database. Once the database is restored, we’ll begin bringing the Malomo application back up. While the engineering team has made material progress on the issue, we do not yet have an ETA.

We will share an action plan for merchants to follow once our application is online.
Posted Nov 30, 2023 - 21:23 EST
Update
We are continuing to work on restoring the database, but do not yet have an ETA.

Important note for Klaviyo customers who have switched email notifications back to Shopify:

We recommend keeping your Klaviyo flows in Manual mode. This will prevent emails from automatically sending once our platform is restored. It will also allow you to bulk send any missed notification emails as our system catches up and sends events from the past 15 hours to Klaviyo. Important: Before turning your flows to Manual mode, please make sure to delete any old notifications that collected in the “Needs Review” section prior to today.

For more information on manually sending flow messages in Klaviyo, please visit the Klaviyo Help Center (https://help.klaviyo.com/hc/en-us/articles/115002779331).
Posted Nov 30, 2023 - 18:39 EST
Update
We are actively working on restoring the database, but do not have an ETA at this time. We are working on multiple approaches to speed up the restore.

In the meantime, we recommend that merchants temporarily switch back to Shopify emails for their order and shipping notifications. Please see our Knowledge Base article (https://help.gomalomo.com/csc/how-to-restore-shopify-email-notifications) for full instructions.

If you need help with this, our support team is here to assist you in implementing this workaround at help@gomalomo.com.
Posted Nov 30, 2023 - 16:23 EST
Update
We truly apologize for the inconvenience and frustration this issue has caused to you and your customers. We want to provide you with the most recent information we have, what we’re doing to address the outage, and what you can do in the meantime.

This update will be fairly technical in nature. In the spirit of transparency, we plan to over-communicate rather than under-communicate at this time. But first, here’s what you can do right now:

Workaround
----------------
For immediate relief, we recommend that merchants temporarily switch back to Shopify emails for their order and shipping notifications. Please see our Knowledge Base article (https://help.gomalomo.com/csc/how-to-restore-shopify-email-notifications) for full instructions.

If you need help with this, our support team is here to assist you in implementing this workaround at help@gomalomo.com.

What To Expect
-----------------
We will continue to update you on this page every 1-3 hours until all issues are resolved.

Full Incident Details
---------------------
There are a few issues occurring, and we are actively and urgently working on addressing them.

- The first issue, related to notifications not being delivered, has been occurring intermittently since Nov. 28, and our team has been working on resolution around the clock.
  - Impact:
    - All types of outgoing events from Malomo to integrated apps are affected.
    - Some events are delayed, but are still being sent to integrated apps.
    - Some events may have been lost, and are not being sent to integrated apps.
    - We previously thought this only affected order confirmation events, but have confirmed that all events are affected.
  - We have been investigating the issue since Nov. 28 and have implemented some manual fixes.
  - While investigating the issue, we discovered that the application that processes all events in Malomo (the “Events Processor”) experienced a memory leak, which caused the Events Processor to crash.
    - Impact:
      - When the Events Processor crashes, we lose events in the processing queue.
      - The Events Processor has experienced intermittent crashes since Nov. 28.
      - The majority of events were processed, but some events were lost when crashes occurred.
  - We profiled the system and isolated the memory leak to the process that handles outgoing webhooks to our integrations.
  - In order to address the webhook issue, we increased processing capacity, split processing loads and optimized memory utilization of the underlying code.
  - At that point things appeared to stabilize.
  - Simultaneously, we started working on a long-term fix that will pull all event queues out of memory and into a more resilient storage system that can survive system crashes.

- The second issue is caused by a database crash that occurred at approximately 3:00 am EST on Thursday, Nov. 30.
  - Impacts:
    - Malomo platform outage
    - No events/notifications are being sent to integrated apps
    - Tracking pages are not working
  - We are still investigating the cause of the crash, but are currently restoring the database from backup and awaiting that restore to complete. This is a time-consuming process, and unfortunately the restore script does not provide any ETA for the restore.
  - Once the restore is complete, we anticipate being able to restore full functionality to the platform while we continue investigating the fix for the first issue.
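
The long-term fix mentioned under the first issue, moving event queues out of memory and into storage that survives a crash, can be illustrated with a small disk-backed queue. This is a sketch under stated assumptions, not Malomo's design: the class name, table layout and choice of SQLite are all hypothetical, picked only because SQLite ships with Python.

```python
import sqlite3


class DurableEventQueue:
    """Minimal disk-backed FIFO queue (hypothetical illustration)."""

    def __init__(self, path: str) -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
        )
        self._conn.commit()

    def enqueue(self, payload: str) -> None:
        # The connection context manager commits on success, so the event
        # is written to the database file before this method returns.
        with self._conn:
            self._conn.execute(
                "INSERT INTO events (payload) VALUES (?)", (payload,)
            )

    def peek(self):
        """Return (id, payload) for the oldest pending event, or None."""
        return self._conn.execute(
            "SELECT id, payload FROM events ORDER BY id LIMIT 1"
        ).fetchone()

    def ack(self, event_id: int) -> None:
        # Remove an event only after it has been delivered successfully;
        # if the process dies before ack, the event is still on disk.
        with self._conn:
            self._conn.execute("DELETE FROM events WHERE id = ?", (event_id,))

    def close(self) -> None:
        self._conn.close()
```

The design point is the peek-then-ack pattern: an event leaves the queue only after delivery succeeds, so a crash mid-delivery leaves it in place to be retried instead of losing it, at the cost of possible duplicate sends that the receiver has to tolerate.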

Please rest assured that we are taking this, and every outage, very seriously and are investing all of our resources to get to a full and speedy resolution.
Posted Nov 30, 2023 - 15:05 EST
Identified
The issue has been identified and we are currently in the process of restoring the platform.
Posted Nov 30, 2023 - 11:07 EST
Investigating
We are currently investigating an issue. You may experience issues accessing our dashboard, viewing tracking pages and receiving events to our integrations during this time.
Posted Nov 30, 2023 - 10:03 EST
This incident affected: Malomo - Platform (Dashboard, API).