How Replicate transformed their webhook infrastructure using Svix

Replicate is an AI and machine learning platform that enables people to run machine learning models in the cloud. Replicate customers can run and train models on the Replicate platform for anything from development and research all the way to powering their production applications.

Replicate supports all available AI workloads, including generative AI workloads like image generation (e.g. Stable Diffusion) as well as large language models (LLMs).

Replicate is one of the most exciting AI infrastructure companies around, with more than 30,000 paying customers, including some of the world's leading AI companies and brands such as BuzzFeed, Character.ai, and Unsplash.

Svix's platform is reliable, fast, and makes our lives easier by handling the complexities of webhooks, allowing us to deliver a better experience for our users.

Morgan Fainberg, Infrastructure Engineer

Replicate and its customers make heavy use of webhooks

Many AI workloads are inherently asynchronous: you trigger an inference or training run, and you want to be notified when it's queued, when it succeeds or fails, and on other such lifecycle updates. Once a run is done, you want to take actions like sending the result to the next step in your AI data pipeline.

Webhooks enable Replicate and their customers to interact with complex AI models like they would interact with any other API, making Replicate a core part of their workflows without necessarily having deep AI knowledge.

One common example is fine-tuning (e.g. LoRA). Let's say you want to create a service that lets customers upload a set of images to set the stylistic tone, and then transform other images to that same style. We'll call that service Painter for the sake of example.

The Painter UI will let a customer upload a set of images with the desired style, and the image they would like to transform. In the background, Painter will then send Replicate the set of images to fine-tune the model. Once done, Replicate will trigger a webhook to let Painter know, and at that point Painter will send the image to transform to Replicate to run through the model. Once that's done, Replicate will send another webhook to let Painter know, and Painter can then notify the customer.
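
The two-step Painter flow above can be sketched as a small webhook-driven state machine. This is a minimal illustration only: the Painter class, its method names, and the payload fields ("type", "status", "id", "output") are all hypothetical and are not Replicate's or Svix's actual API or payload format. Side effects are recorded in a list rather than performed, so the control flow is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class Painter:
    """Toy stand-in for the Painter backend; every name here is hypothetical."""
    pending_images: dict = field(default_factory=dict)  # training id -> image to transform
    actions: list = field(default_factory=list)         # log of side effects, for illustration

    def start_training(self, training_id, style_images, image):
        # A real service would call the provider's API here to start fine-tuning.
        self.pending_images[training_id] = image
        self.actions.append(("train", training_id, len(style_images)))

    def handle_webhook(self, event):
        """Advance the workflow one step based on an incoming webhook payload."""
        kind, status, job_id = event["type"], event["status"], event["id"]
        if status not in ("succeeded", "failed"):
            return  # non-terminal lifecycle update (e.g. "queued"): nothing to do yet
        if status == "failed":
            self.pending_images.pop(job_id, None)
            self.actions.append(("notify_failure", job_id))
        elif kind == "training":
            # Fine-tuning finished: run the customer's image through the new model.
            image = self.pending_images.pop(job_id)
            self.actions.append(("predict", job_id, image))
        elif kind == "prediction":
            # Transformation finished: tell the customer the result is ready.
            self.actions.append(("notify_customer", job_id, event["output"]))
```

Each webhook moves the job exactly one step forward, so Painter never has to poll for completion; the same handler shape extends to longer pipelines by adding more event kinds.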

Essentially, webhooks enable Replicate and its customers to build single and multi-step AI workflows that operate in real-time with minimal code and operational complexity.

Their existing webhook system wasn't serving them well

Before discovering Svix, Replicate had a different webhook sending system that they had built in-house. They kept running into the many known limitations of their internal webhook sender, and they felt that the customer experience could be much better.

Webhooks are not an easy thing to build. You can do an OK or even decent job, but doing webhooks great is a real challenge.

Morgan Fainberg, Infrastructure Engineer

Replicate was initially planning to invest time and resources into fixing some of the issues they were experiencing in their webhook system. As they put it, though: "if you build your own webhook system you may be able to send webhooks, but getting actionable data back to customers in terms of webhooks observability is another challenge," referring to the Svix webhook management and observability utilities.

Being such a large and flexible platform, they were also facing challenges with noisy neighbors: when one customer sent too much data or started failing too often, webhook deliveries for all of their customers would start to suffer.
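
To make the noisy-neighbor problem concrete, here is a minimal sketch of one common mitigation: keeping a separate queue per customer and serving the queues round-robin, so a burst from one customer cannot starve the others. This is a generic illustration and not a description of how Svix or Replicate implement delivery; the FairDispatcher class and its method names are hypothetical.

```python
from collections import OrderedDict, deque


class FairDispatcher:
    """Round-robin dispatcher with one FIFO queue per customer (hypothetical)."""

    def __init__(self):
        self._queues = OrderedDict()  # customer_id -> deque of pending messages

    def enqueue(self, customer_id, message):
        # New customers join at the back of the rotation.
        self._queues.setdefault(customer_id, deque()).append(message)

    def next_message(self):
        """Return (customer_id, message) for the next delivery, or None if idle."""
        if not self._queues:
            return None
        customer_id, queue = next(iter(self._queues.items()))
        message = queue.popleft()
        del self._queues[customer_id]
        if queue:
            # Re-insert at the back so other customers get their turn first.
            self._queues[customer_id] = queue
        return customer_id, message
```

With a single shared FIFO queue, a customer who enqueues thousands of messages delays everyone behind them. With per-customer queues, each customer with pending work gets one delivery per rotation, regardless of how much another customer has queued.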

Replicate transformed their webhooks with Svix

Replicate eventually started evaluating the available options and concluded: "Svix was the best platform out there. It had a rich SDK, a very nice API and platform, you could see all of the additional benefits that we could get, and the added observability is a great addition." They further added: "With just a little bit of additional work we also had custom components in our UI that present information about webhooks to our customers. We now have graphs, logs and everything that customers love. They can also diagnose and solve issues on their own."

In addition to increasing the reliability, delivery rates, and durability of their platform, Replicate also gained deeper visibility into the webhooks they were sending. They could now catch potential mistakes as they happened and fix them, an area they didn't have much visibility into in the past.

We now felt confident with our webhooks system, and could trivially start sending more types of webhooks.

Morgan Fainberg, Infrastructure Engineer

Another unexpected benefit they noticed is that webhook delivery is now significantly faster and has more consistent latency. Webhooks are not only delivered faster, but the variability in delivery time is also much lower, making for a better experience for their customers and their workflows.

Switching to Svix was relatively low effort for them. It took them two weeks to go from deciding to use Svix to going to production, with very little of it spent coding, and most of it spent on verification and end-to-end testing to make sure there were no differences compared to their previous implementation.

We have customers that absolutely rely on webhooks, and they were unhappy with the current system. We had to fix it, and with Svix it was a relatively low effort to fix a very visible piece of poor experience in our platform.

Morgan Fainberg, Infrastructure Engineer

Svix is a partner they can rely on

One of the main things that Replicate was hoping to achieve by switching to Svix was, in their words: "once it's out of our system, it's going to be delivered." They have customers that rely on their webhooks, so reliability is key. Svix has delivered that and more.

We would sometimes have some suggestions for improvements to the product to make things even more ergonomic for us. We would suggest them to Svix in our shared Slack channel, and most of the time it would magically show in the product the next day, or at most the next week. It's an amazing experience.

Morgan Fainberg, Infrastructure Engineer

Svix also lets its customers consume a stream of OpenTelemetry events in their observability platform of choice. Replicate uses that stream to build tight observability and alerting on top of Svix, making Svix feel like a tightly integrated part of their core infrastructure.

Happier customers and better service

Replicate's customers were very happy with the more reliable system, but another great addition is that both customers and internal Replicate support teams now have better insight into exactly what is going on with the webhooks.

One benefit of Svix is that it improved the business by having happier customers, and happier customers make for a whole lot of better things.

Morgan Fainberg, Infrastructure Engineer

They can see how much volume is going through and what's failing and why, and they can resolve issues on their own. Svix supports creating customer portal magic links, which can be used to log support agents directly into a customer environment to investigate errors and performance issues. "These are the type of things our larger customers really cared about."

After seeing how much their large customers were enjoying the additional visibility, they used the Svix libraries to offer all of these insights directly in their product, available to all Replicate customers. Now all of their customers enjoy this new visibility, and it took their product engineer only a few days to go from a vague idea to a fully custom UI up and running.

The end result

We asked Replicate how they quantified switching to Svix, and they said: "we ran the numbers, and using Svix was cheaper than having our own system." Though there's really no one number to quantify it, because the biggest value is that webhooks just work. It's easy to forget that it was ever a challenge because of how much easier things currently are.

Svix is absolutely fantastic. It's fast, it's reliable, and we no longer have to think about webhooks.

Morgan Fainberg, Infrastructure Engineer