How do you guys effectively switch to a smaller buffer size (assuming 32 KB or less) and then switch back to 4 MB for the actual data read? Did you modify the Spark code?

Ultimately, I guess, you may still need Iceberg on top of S3.

Advocate best practice of big data technologies. Challenge the conventional wisdom. Peel off the flashy promise in architecture and scalability.
