Spin off your data pipeline with cloud functions
What is this article about?
Earlier this week, I worked on a business requirement that called for a not-so-sophisticated data pipeline. The conventional way of deploying a typical pipeline requires a solid understanding of the upstream and downstream data and of the transformations applied to it. Generally, a data pipeline consists of three main entities:
A source, a message queue, and a set of functions that transform the data, for example by aggregating all clickstream events from a client application and sending them to the backend for storage. Right off the bat, the first preference would be a streaming framework such as Spark Streaming or Kafka Streams. Managing these services on-premises can become a scalability problem, because we don't know in advance when spikes and sudden load will hit. In such cases, meeting your SLOs and SLAs is like finding a needle in a haystack. With cloud functions, you can achieve the power of serverless computing through function as a service.
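To make the transformation step concrete, here is a minimal sketch of what such a function might look like. It is not tied to any particular cloud provider: the names `handle_message` and `aggregate_clicks`, the event fields `user_id` and `event_type`, and the message envelope (a `data` field holding base64-encoded JSON) are all assumptions made for illustration.

```python
import base64
import json
from collections import Counter


def aggregate_clicks(events):
    """Count clickstream events per (user_id, event_type) pair."""
    counts = Counter((e["user_id"], e["event_type"]) for e in events)
    return [
        {"user_id": user, "event_type": kind, "count": n}
        for (user, kind), n in sorted(counts.items())
    ]


def handle_message(message):
    """Hypothetical entry point that a FaaS platform would call per queue message.

    Assumes the envelope {"data": <base64-encoded JSON list of events>}.
    """
    events = json.loads(base64.b64decode(message["data"]).decode("utf-8"))
    summary = aggregate_clicks(events)
    # A real deployment would write `summary` to backend storage here.
    return summary
```

Keeping the aggregation in a plain function, separate from the entry point, means it can be unit-tested locally without deploying anything.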
These functions scale out in the blink of an eye and help you handle unpredictable traffic at peak hours. Because the pricing is pay-as-you-go, deploying a pipeline this way can be a great move. At Montaigne, we as developers focus more on sophistication, and cloud functions helped us do just that. Our CI/CD pipelines have become smoother than ever, and error logging has never been easier. Follow the link to read more.