Platform Architecture
The Deeper Truth: Platform Complexity
In a recent article, I described my preference for a cloud-native ELT architecture centered around BigQuery, dbt, Airflow, and lightweight execution layers such as Cloud Run or Cloud Functions.
The core argument was straightforward:
Modern data platforms are often more operationally complex than they need to be.
That remains my default architectural posture.
But there is a deeper truth worth discussing:
Complexity is not inherently bad.
Complexity is sometimes earned.
The challenge for architects and engineering organizations is distinguishing between:
- necessary complexity
- premature complexity
- accidental complexity
That distinction matters enormously.
Simplicity Is a Strategy, Not a Religion
When I argue for simpler architectures, I am not arguing that distributed systems, Spark, Databricks, Snowflake, Kubernetes, or lakehouse architectures are unnecessary technologies.
Many are exceptional technologies.
The question is not whether they are powerful.
The question is:
Under what conditions does their additional complexity become economically and operationally justified?
That is a much more interesting engineering discussion.
Architectural maturity is not about avoiding complexity at all costs.
It is about introducing complexity deliberately, proportionally, and with clear justification.
Workload Characteristics Matter
One reason architectural discussions often become polarized is that people generalize from their own workloads.
But workloads vary enormously.
A SQL-centric ELT pipeline performing relational transformations against structured business data has very different requirements from:
- real-time telemetry systems
- AI feature engineering pipelines
- graph analytics
- scientific computing
- recommendation engines
- stateful streaming applications
In many organizations, BigQuery or another cloud-native warehouse can absorb the overwhelming majority of transformation workloads efficiently and elegantly.
But there are legitimate cases where additional distributed compute layers become appropriate.
Examples include:
- iterative machine learning workflows
- large-scale Python or Scala processing
- graph traversal algorithms
- stateful streaming joins
- GPU-oriented processing
- workloads that do not naturally map to declarative SQL
At that point, systems like Spark begin solving real problems rather than hypothetical ones.
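One way to make that judgment explicit is to write the checklist down. The sketch below is illustrative only: `Workload` and `suggest_execution_layer` are hypothetical names, and the rules simply restate the list above rather than any established sizing method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical workload description; every field name is an assumption."""
    expressible_in_sql: bool = True
    iterative_ml: bool = False
    graph_traversal: bool = False
    stateful_streaming: bool = False
    needs_gpu: bool = False


def suggest_execution_layer(w: Workload) -> str:
    """Return a coarse starting point for discussion, not a verdict."""
    # Collect every characteristic that a SQL warehouse handles poorly.
    reasons = [
        name
        for name, flag in (
            ("iterative ML", w.iterative_ml),
            ("graph traversal", w.graph_traversal),
            ("stateful streaming", w.stateful_streaming),
            ("GPU processing", w.needs_gpu),
            ("not expressible in SQL", not w.expressible_in_sql),
        )
        if flag
    ]
    if not reasons:
        return "warehouse"
    return "distributed compute (" + ", ".join(reasons) + ")"


print(suggest_execution_layer(Workload()))
print(suggest_execution_layer(Workload(graph_traversal=True)))
```

The useful property is that the function must name its reasons. If the list of reasons is empty, the default stays the warehouse.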
Streaming Changes the Conversation
Streaming architectures are one of the clearest examples of justified complexity.
Batch-oriented ELT pipelines and continuously operating event-processing systems have fundamentally different characteristics.
Once an organization begins dealing with:
- sub-second latency requirements
- event-time semantics
- watermarking
- stateful stream processing
- massive continuous ingestion
the architecture often changes substantially.
The engineering tradeoffs shift: operational complexity increases, but so does the value of specialized streaming infrastructure.
In those environments, additional distributed processing frameworks may become entirely reasonable.
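To give a sense of why event-time semantics and watermarking add real state to a system, here is a deliberately simplified sketch in plain Python. The class name, the fixed allowed-lateness rule, and the decision to drop late events are all assumptions made for illustration. Production frameworks handle lateness, window state, and cleanup in more sophisticated ways.

```python
from collections import defaultdict


class TumblingWindowCounter:
    """Counts events per event-time window and drops events behind the watermark."""

    def __init__(self, window_seconds: int, allowed_lateness_seconds: int):
        self.window_seconds = window_seconds
        self.allowed_lateness_seconds = allowed_lateness_seconds
        self.max_event_time = None       # highest event time seen so far
        self.counts = defaultdict(int)   # window start -> event count
        self.dropped = 0                 # events that arrived too late

    @property
    def watermark(self):
        """Event time before which no further events are expected."""
        if self.max_event_time is None:
            return None
        return self.max_event_time - self.allowed_lateness_seconds

    def process(self, event_time: int) -> None:
        """Handle one event, identified by its event time in epoch seconds."""
        wm = self.watermark
        if wm is not None and event_time < wm:
            self.dropped += 1            # behind the watermark: treat as too late
            return
        window_start = event_time - (event_time % self.window_seconds)
        self.counts[window_start] += 1
        if self.max_event_time is None or event_time > self.max_event_time:
            self.max_event_time = event_time


counter = TumblingWindowCounter(window_seconds=60, allowed_lateness_seconds=30)
for t in [100, 130, 110, 200, 150]:      # arrival order differs from event-time order
    counter.process(t)

print(dict(counter.counts), counter.dropped, counter.watermark)
```

Even this toy version has to track per-window state, a moving watermark, and a policy for late data. A batch pipeline needs none of that, which is where the extra infrastructure starts to earn its place.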
Organizational Scale Matters Too
Technology choices are rarely driven solely by technical characteristics.
Organizations themselves impose constraints.
Large enterprises may require:
- strict workload isolation
- chargeback models
- dedicated compute domains
- regulatory segmentation
- multi-tenant governance
- independently managed engineering environments
In those situations, platforms that provide explicit compute segmentation and resource governance may become attractive even when the underlying workload itself is not especially exotic.
Architecture exists within organizations, not in isolation from them.
This is one reason engineering discussions that focus exclusively on technical purity are often incomplete.
Operational governance is itself a technical requirement.
The Operational Cost of Complexity
At the same time, complexity always carries a price.
Every additional distributed subsystem introduces:
- additional monitoring
- additional IAM configuration
- additional deployment processes
- additional upgrade paths
- additional operational expertise
- additional troubleshooting surface area
- additional organizational coupling
These costs are frequently underestimated because architectural diagrams tend to emphasize capability rather than the operational burden carried over a system's lifetime.
This matters more than many teams initially realize.
Operational simplicity improves:
- reliability
- onboarding
- debugging
- security posture
- maintainability
- long-term adaptability
A platform that only a few engineers fully understand may be more technically sophisticated and, at the same time, more organizationally fragile.
Open Formats and Data Locality
There are also legitimate strategic reasons organizations move toward lakehouse-oriented ecosystems.
Open table formats such as Iceberg, Delta Lake, and Hudi provide attractive properties:
- decoupled storage and compute
- cross-engine interoperability
- long-term data portability
- flexible processing models
Similarly, some organizations reach data scales where moving large datasets repeatedly into centralized warehouse environments becomes economically inefficient.
Processing data closer to object storage may then become the rational choice.
Again, the point is not that one model is universally superior.
It is that workload economics eventually shape architecture.
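The data-movement argument can be made concrete with back-of-the-envelope arithmetic. The sketch below assumes a very simple linear cost model. The function names are hypothetical, and every figure is a placeholder rather than a real price from any vendor.

```python
def monthly_movement_cost(dataset_tb: float, loads_per_month: int,
                          cost_per_tb_moved: float) -> float:
    """Cost of repeatedly copying a dataset into a central warehouse."""
    return dataset_tb * loads_per_month * cost_per_tb_moved


def break_even_loads(fixed_monthly_cost_in_place: float, dataset_tb: float,
                     cost_per_tb_moved: float) -> float:
    """Loads per month above which processing in place becomes cheaper.

    Assumes processing next to object storage has a fixed monthly cost
    (extra platform and operations), which is a simplification.
    """
    return fixed_monthly_cost_in_place / (dataset_tb * cost_per_tb_moved)


# Placeholder inputs: 50 TB dataset, 30 reloads a month, 20 currency units per TB moved.
print(monthly_movement_cost(50, 30, 20.0))
# Placeholder fixed cost of 10,000 per month for running compute near the data.
print(break_even_loads(10000.0, 50, 20.0))
```

With these placeholder inputs, repeated movement dominates well before daily reloads. With a smaller dataset or fewer loads, the same arithmetic favors the warehouse.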
Architectural Defaults Still Matter
Even after acknowledging all of this, my own default posture remains largely unchanged.
I still believe many organizations prematurely adopt:
- distributed compute layers
- Kubernetes-based data infrastructure
- complex streaming systems
- heavily fragmented platform architectures
before their actual workload characteristics justify doing so.
In many cases:
- simpler orchestration
- declarative transformations
- managed cloud-native compute
- warehouse-centric execution
remain entirely sufficient.
The existence of edge cases does not invalidate the value of simplicity as a starting principle.
Quite the opposite.
A simpler architecture creates a clearer baseline from which additional complexity can later be justified.
The Real Goal
Ultimately, the goal of architecture is not simplicity.
Nor is it sophistication.
The goal is fitness.
A good architecture:
- matches workload characteristics
- matches organizational maturity
- minimizes unnecessary operational burden
- evolves proportionally to demonstrated need
That evolution should ideally be intentional rather than fashionable.
The deeper truth is that complexity is neither virtue nor failure.
It is a tool.
And like all tools, it should be used carefully, deliberately, and with full awareness of its cost.