Logs as Code: Building a Serverless ETL Security Pipeline
Security engineering often runs into this painful truth: your tools don’t talk to each other — especially when you’re working with emerging platforms and SIEMs.
That’s exactly what happened when I tried integrating Vicarius VRx (a modern vuln management tool) with our SIEM stack. Waiting for a native integration? Not an option.
We needed visibility — and we needed it now.
This is the first post in a planned series, and it covers the design phase; I’ll get into the technical details in later posts.
- Logs as Code: Building a Serverless ETL Security Pipeline
- The Integration Gap: A Common Security Challenge
- Figuring out a solution
- Enter Project Iris: A Serverless Security Pipeline
- Technical Deep Dive: The Power of Stateless Processing
- The Architecture: Simple Yet Powerful
- Robust Logging: Because Security Tools Need Security
- And so I started…
- Lessons Learned
- Looking Forward: Evolution of Security Integration
The Integration Gap: A Common Security Challenge
Emerging security vendors often lack established integrations with major SIEM platforms. Vicarius VRx, while offering advanced patch and vulnerability management capabilities, was no exception. This limitation threatened to create a critical gap in our security monitoring - one we couldn't afford to have.
Figuring out a solution
The project was clear - building “logs as code”, or an engineering solution to “glue” things together.
Get logs from point A to point B: use Python to retrieve logs from the VRx API and drop them into a GCP bucket. Very easy.
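In my head it looked roughly like this (a naive sketch only - the endpoint, auth header, and bucket name are placeholders, not the real VRx API):

```python
# Naive "point A to point B" sketch. The endpoint, auth header and bucket name are
# placeholders - not the real VRx API or its auth scheme.
import json

import requests
from google.cloud import storage

VRX_API_URL = "https://example-vrx-host/api/events"  # placeholder endpoint
VRX_API_KEY = "from-secret-manager"                  # loaded from Secret Manager in the real job
BUCKET_NAME = "example-siem-ingest-bucket"           # placeholder bucket

# Point A: pull events from the API.
events = requests.get(
    VRX_API_URL, headers={"Authorization": f"Bearer {VRX_API_KEY}"}, timeout=30
).json()

# Point B: drop them into the bucket as NDJSON (one JSON object per line).
ndjson = "\n".join(json.dumps(event) for event in events)
storage.Client().bucket(BUCKET_NAME).blob("vrx/events.ndjson").upload_from_string(
    ndjson, content_type="application/x-ndjson"
)
```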
The idea was simple and elegant, but…
- But I need to paginate the API, and handle pagination for the bucket writes too - because both the API and the SIEM have some limitations…
- But I need to track time across executions - because “last 4 hours” isn’t a thing when your API wants nanoseconds since epoch.
- But I need to convert those nanoseconds into human time - so the SIEM events actually make sense (there’s a small sketch of this right after the list).
- But I need to handle secrets securely.
- But I need to log everything - because I need to know what happened, when, and get good feedback when (not if) something goes wrong.
- But I need to send Slack alerts - so I know if it’s working, or more importantly, when it’s not. Easily.
- But I need to avoid duplicates - because confusing a monitoring team is exactly the opposite of my mission.
- But I need to modularize it (eventually), or make it as generic as possible - because today it’s VRx, tomorrow it’s XYZ Corp and their weird GraphQL event feed.
- But I need it to be serverless - because I’m not spinning up a VM just to run code for 5 seconds every 4 hours.
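For the time tracking and the nanosecond headaches, the gist is roughly this (a minimal sketch; the real job also persists the window between executions, covered later):

```python
# Rough sketch of the time handling: "last 4 hours" as nanoseconds since epoch,
# and nanoseconds back into RFC 3339 so the SIEM events make sense.
from datetime import datetime, timedelta, timezone

NS_PER_SECOND = 1_000_000_000

def window_last_4_hours_ns() -> tuple[int, int]:
    """Return (start, end) of the last 4 hours as nanoseconds since epoch."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=4)
    return int(start.timestamp() * NS_PER_SECOND), int(end.timestamp() * NS_PER_SECOND)

def ns_to_rfc3339(ns: int) -> str:
    """Convert nanoseconds since epoch into an RFC 3339 timestamp string."""
    return datetime.fromtimestamp(ns / NS_PER_SECOND, tz=timezone.utc).isoformat()

start_ns, end_ns = window_last_4_hours_ns()
print(ns_to_rfc3339(start_ns))  # e.g. 2025-01-01T08:00:00+00:00
```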
Me, thinking about this the whole weekend before actually starting this project:
Enter Project Iris: A Serverless Security Pipeline
Rather than deploying a server to achieve this, I built Iris - a serverless security data pipeline leveraging Google Cloud Platform's Cloud Run service. Here's why this approach made sense:
- Cost-effective: Pay only for actual execution time instead of maintaining 24/7 infrastructure. Each run (every 4 hours) takes no more than about 5 seconds: grab logs from the API iteratively, push them to a GCP bucket.
- Resource-efficient: Process 500 events in milliseconds with minimal overhead
- Highly reliable: Leverages GCP's robust platform and built-in redundancy
- Secure by design: Managed identity and secrets handling through Cloud Secret Manager (see the sketch after this list)
- Weapon of choice: Python (Of course)
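The secrets piece is the standard Secret Manager access pattern - roughly this (project ID and secret names are placeholders):

```python
# Minimal sketch of pulling a secret at runtime. Project ID and secret names are
# placeholders; the job's service account needs Secret Manager accessor rights.
from google.cloud import secretmanager

def get_secret(project_id: str, secret_id: str, version: str = "latest") -> str:
    client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_id}/versions/{version}"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("UTF-8")

slack_webhook_url = get_secret("my-project", "slack-webhook-url")
vrx_api_key = get_secret("my-project", "vrx-api-key")
```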
Technical Deep Dive: The Power of Stateless Processing
The Architecture: Simple Yet Powerful
Or, in a nice Mermaid diagram (I’m loving Mermaid these days, so here it is):
OVERALL EXECUTION FLOW
1. Google Cloud Scheduler triggers the Iris Job every 4 hours.
2. The Iris Job pulls its secrets from Google Cloud Secret Manager:
   - Slack webhook URL,
   - Vicarius VRx API key.
3. The Iris Job pulls the last 4 hours of data from Vicarius VRx in batches of 500 events, with the following pagination conditions (see the sketch below):
   - If exactly 500 events are pulled, it transforms them into NDJSON, jumps to step 5 to save them to the required Google Cloud Storage bucket, then starts again at step 3 with the next 500 events, iterating until one of the conditions below is met.
   - If fewer than 500 events are pulled, it transforms them into NDJSON, jumps to step 5 to save them to the bucket, and continues the flow.
   - If 0 events are pulled, it jumps to step 4 and finishes the execution.
4. Slack notifications are sent in only 3 cases: SUCCESS, CRITICAL ERROR, INFO (see the Slack notifications under the logging section below):
   - On successful completion of the flow, a success notification is sent.
   - On zero events, an informational message is sent.
   - On a critical, unrecoverable error in any part of this flow, a critical alert is sent and the process aborts immediately.
5. The Iris Job saves the events as NDJSON to the required Google Cloud Storage bucket.
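In code, the pagination logic of step 3 looks roughly like this (a sketch under assumptions: the auth header and the query parameters `from`, `to`, `pageSize` and `page` are illustrative, not the actual VRx API contract):

```python
# Sketch of step 3: pull the last 4 hours in batches of 500 until a short (or empty)
# batch comes back. Parameter names are illustrative placeholders, not the real VRx API.
import requests

BATCH_SIZE = 500

def pull_events(api_url: str, api_key: str, start_ns: int, end_ns: int):
    page = 0
    while True:
        resp = requests.get(
            api_url,
            headers={"Authorization": f"Bearer {api_key}"},  # auth scheme is illustrative
            params={"from": start_ns, "to": end_ns, "pageSize": BATCH_SIZE, "page": page},
            timeout=30,
        )
        resp.raise_for_status()
        events = resp.json()

        if not events:                  # 0 events: nothing to save, hand off to step 4
            return
        yield events                    # caller transforms to NDJSON and saves (step 5)
        if len(events) < BATCH_SIZE:    # short batch: that was the last page
            return
        page += 1                       # full batch: keep iterating (back to step 3)
```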
Iris processes security events in batches of 500, using intelligent pagination and timestamp tracking to ensure no event is missed or duplicated. The pipeline includes:
- API ingestion with built-in rate limiting and error handling
- Data transformation from raw API format to SIEM-compatible NDJSON
- Timestamp conversion from nanoseconds to RFC 3339 format
- State management through Cloud Storage for consistent processing
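A rough sketch of the NDJSON write and the state piece, assuming the only state Iris needs is the end of the last successfully processed window (bucket and object names are placeholders):

```python
# Sketch of the NDJSON write plus the small bit of state kept in the bucket:
# the end of the last processed window, so runs don't miss or duplicate events.
# Bucket and object names are placeholders.
import json
from google.cloud import storage

STATE_OBJECT = "iris/state/last_run_ns.txt"

def write_ndjson(bucket_name: str, object_name: str, events: list[dict]) -> None:
    """Write a batch of events as NDJSON (one JSON object per line) to GCS."""
    ndjson = "\n".join(json.dumps(event) for event in events)
    bucket = storage.Client().bucket(bucket_name)
    bucket.blob(object_name).upload_from_string(ndjson, content_type="application/x-ndjson")

def load_last_run_ns(bucket_name: str) -> int | None:
    """Read the end of the last processed window (ns since epoch), if a previous run saved one."""
    blob = storage.Client().bucket(bucket_name).blob(STATE_OBJECT)
    return int(blob.download_as_text()) if blob.exists() else None

def save_last_run_ns(bucket_name: str, end_ns: int) -> None:
    """Persist the end of this run's window so the next execution starts where this one ended."""
    storage.Client().bucket(bucket_name).blob(STATE_OBJECT).upload_from_string(str(end_ns))
```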
Robust Logging: Because Security Tools Need Security
A security pipeline needs comprehensive logging for accountability and troubleshooting. The logging strategy was twofold (sketched below):
- Detailed stdout logs in Cloud Run for complete execution tracing (via Python’s logging package)
- Real-time Slack notifications for critical status updates:
  - Success confirmations
  - Processing errors
  - Zero-event notifications
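Boiled down, the dual strategy looks something like this (a minimal sketch; the message text is illustrative):

```python
# Sketch of the dual strategy: stdout/stderr logs (picked up by Cloud Run / Cloud Logging)
# plus a Slack webhook call for the three notification cases. Message text is illustrative.
import logging
import requests

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("iris")

def notify_slack(webhook_url: str, text: str) -> None:
    """Send a simple text notification to a Slack incoming webhook."""
    requests.post(webhook_url, json={"text": text}, timeout=10).raise_for_status()

# logger.info(...) everywhere for the detailed trace, then one of:
#   notify_slack(url, "Iris: SUCCESS - pushed events to the bucket")         # success confirmation
#   notify_slack(url, "Iris: INFO - zero events in the last 4 hours")        # zero-event notification
#   notify_slack(url, "Iris: CRITICAL - unrecoverable error, aborting run")  # processing error
```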
And so I started…
On a personal note, I had never coded blind for that long. There was no chance to test the code fully - only “microtests”: sandbox or scratchpad files to try things out. “Will this secret load? Will this timestamp convert? Will GCP accept this blob?”
If the final deploy failed, cleanup would be a pain. Success had to be one-shot clean.
Me several hours into this solution:
Lessons Learned
Overall, it was a very enriching experience. Building this pipeline taught me valuable lessons about modern security integration:
- Serverless isn't just for web applications - it's perfect for scheduled security tasks
- State management in serverless requires careful design but enables reliable processing
- About AI: AI was definitely a factor in getting this done in 2 days instead of a week (not counting the design and “mental modeling” phase). It also reinforced my feeling that we humans are now the slow part of working with LLMs: my typing speed is way behind my thinking speed, so I end up typing dozens or hundreds of words to prompt GPT-4 o1, which is sometimes miles behind my thinking - and that’s very annoying.
- I need to improve my Vim workflow - for this kind of development, Vim felt clunky and kicked me out of flow state several times, so I dropped it and used VSCode the whole time. There are some remaps to set up before I can breeze through this kind of workflow in Vim.
Looking Forward: Evolution of Security Integration
While vendors will eventually provide native integrations, the ability to build secure, efficient data pipelines is becoming a crucial skill in security engineering. The approach with Iris demonstrates that with modern cloud services, we can bridge integration gaps without compromising on security or scalability.