From spreadsheet chaos to a Python serverless pipeline

A staged path from manual CSV to validated transforms, idempotent writes, and AWS Lambda + SQS—with examples and AWS docs.

Spreadsheets are a great exploration surface; they are a fragile system of record when multiple teams depend on weekly exports. This post outlines a staged migration to Python + serverless components, with pointers to official AWS documentation.

1. Document the workflow before code

Capture:

  • Trigger: schedule (cron), file upload (S3 event), or message (queue).
  • Inputs: CSV columns, encodings, header row, allowed nulls.
  • Outputs: database tables, files, or API calls.
  • SLA: max runtime, who gets paged on failure.

If you cannot describe it in one page, split it into phase 1 (read + validate + email report) and phase 2 (write + notify).
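
One lightweight way to keep that one-page description honest is to encode it as data next to the job. The field names below are illustrative, not a standard schema:

```python
from dataclasses import dataclass, field

@dataclass
class JobSpec:
    """A one-page workflow description, versioned alongside the code."""
    trigger: str                  # "cron", "s3_event", or "queue"
    input_columns: list[str]      # expected CSV header
    encoding: str = "utf-8-sig"   # Excel exports often carry a BOM
    outputs: list[str] = field(default_factory=list)  # tables, files, APIs
    max_runtime_s: int = 900      # SLA budget; also a sane Lambda timeout
    pager: str = "data-oncall"    # who gets paged on failure
```

Reviewing a diff to this dataclass is easier than reverse-engineering a spreadsheet's assumptions later.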

2. A thin Python core with testable transforms

Keep I/O at the edges; pure functions in the middle.

# shapes only — adapt types to your stack
from dataclasses import dataclass
from typing import Iterator

@dataclass
class Row:
    customer_id: str
    amount: str  # parse/validate downstream

def parse_rows(raw_lines: Iterator[str]) -> Iterator[Row]:
    ...

def validate_business_rules(row: Row) -> None:
    """Raise ValueError with a stable error code for DLQ messaging."""
    ...

Unit-test validate_business_rules with table-driven tests (pytest @pytest.mark.parametrize). For messy CSVs, Python’s csv module plus explicit encoding (utf-8-sig for Excel exports) prevents silent mojibake.

3. Idempotency: design for reruns

Jobs will retry after timeouts. Strategies:

  • Natural keys: upsert on (source_system, external_id).
  • Job id: store job_id on every row written; reject duplicates.
  • S3 + DynamoDB: conditional writes or transactions for small metadata.

This mirrors webhook handling, where the same idempotency and retry strategies apply.
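
As a sketch of the natural-key strategy: SQLite and Postgres both support an ON CONFLICT upsert, shown here with SQLite so the example is self-contained (table and column names are illustrative):

```python
import sqlite3

def upsert_payment(conn: sqlite3.Connection, source_system: str,
                   external_id: str, amount: str) -> None:
    # Rerunning the same job overwrites the existing row instead of
    # inserting a duplicate; (source_system, external_id) must be UNIQUE.
    conn.execute(
        """
        INSERT INTO payments (source_system, external_id, amount)
        VALUES (?, ?, ?)
        ON CONFLICT (source_system, external_id)
        DO UPDATE SET amount = excluded.amount
        """,
        (source_system, external_id, amount),
    )
```

Because the write is an upsert, a retried batch converges to the same final state as a single successful run.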

4. When Lambda + SQS fit

AWS Lambda works well when:

  • Duration stays within Lambda's 15-minute maximum (configure the timeout explicitly; watch memory—allocated memory also scales CPU).
  • Concurrency is bursty, not 24/7 pinned at high utilization.

Pattern:

  1. An S3 ObjectCreated event triggers a Lambda that validates the file and enqueues a message (payload = S3 key + checksum).
  2. An SQS-triggered Lambda worker processes batches; failures go to a DLQ after N attempts (the redrive policy's maxReceiveCount)—see SQS dead-letter queues.
[S3 upload] --> [Lambda: validate + enqueue] --> [SQS] --> [Lambda: transform + write]
                                                      |
                                                   [DLQ on poison messages]
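
A worker sketch for step 2, assuming the event source mapping has ReportBatchItemFailures enabled so only failed records return to the queue (and eventually reach the DLQ). The process body is hypothetical; the payload shape follows step 1 above:

```python
import json

def process(body: dict) -> None:
    # hypothetical transform + write; raise to flag a poison message
    if "s3_key" not in body:
        raise ValueError("ERR_MISSING_KEY")

def handler(event, context):
    failures = []
    for record in event["Records"]:
        try:
            process(json.loads(record["body"]))
        except Exception:
            # Only these messages are retried; the rest are acknowledged.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Without the partial-batch response, one bad record would force the whole batch back onto the queue and reprocess the good messages.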

For long-running ETL, consider AWS Batch or containers instead of stretching Lambda duration.

5. Cold starts and packaging

Cold start matters if the trigger is user-facing. Mitigations:

  • Provisioned concurrency (a cost tradeoff)—see Lambda scaling.
  • Slim dependencies (avoid shipping unused scientific stacks).
  • Initialize heavy clients outside the handler where possible.
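
The last point in sketch form: module scope runs once per execution environment, so expensive setup there is paid only on cold start. In a real function the cached object would be something like a boto3 client; a plain object stands in here to keep the example dependency-free:

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def get_client():
    # Expensive construction (TLS session, config parsing) happens once
    # per execution environment, not once per invocation.
    return object()  # stand-in for e.g. boto3.client("s3")

def handler(event, context):
    client = get_client()  # warm invocations reuse the cached client
    return {"same_client": client is get_client()}
```

A lazy cached factory also keeps unit tests fast, since nothing heavy is built at import time.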

6. Observability and alerts

  • Structured JSON logs with job_id, record_count, duration_ms, outcome.
  • CloudWatch alarms on DLQ depth and error rate—see Lambda monitoring.
  • One on-call channel (PagerDuty, Slack) beats silent failure.
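
A minimal helper emitting those fields as one JSON object per line, which CloudWatch Logs ingests as structured records (field names follow the bullet list; returning the line is just to ease testing):

```python
import json
import sys
import time

def log_outcome(job_id: str, record_count: int,
                started: float, outcome: str) -> str:
    """Emit one structured JSON log line for the finished job."""
    line = json.dumps({
        "job_id": job_id,
        "record_count": record_count,
        "duration_ms": int((time.monotonic() - started) * 1000),
        "outcome": outcome,
    })
    print(line, file=sys.stdout)
    return line
```

Consistent field names make CloudWatch Logs Insights queries (and alarm filters) trivial to write later.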

References

Related posts: webhook reliability, LLM cost controls.