Insights · About 6 min read
Parquet vs CSV: when analysts still reach for flat files
Columnar storage wins in warehouses; CSV remains the handoff format to humans, Excel, and legacy tools.
Published March 21, 2025 · Table
Parquet stores data column by column with an explicit schema, so it compresses well and preserves types for Spark, DuckDB, and cloud warehouses. CSV remains the lowest common denominator for email attachments, regulatory submissions, and quick human review.
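The typing difference is easy to see with Python's standard library alone. The sketch below writes one record to CSV and reads it back; the record and its field names are made up for illustration. Every value returns as a string, which is the information a typed format like Parquet would have kept.

```python
import csv
import io
from datetime import date

# Hypothetical record with an integer, a float, and a date.
record = {"order_id": 42, "amount": 19.99, "shipped": date(2025, 3, 21)}

# Write the record to an in-memory CSV file.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(record))
writer.writeheader()
writer.writerow(record)

# Read it back: CSV carries no schema, so every field is now a str.
buf.seek(0)
round_tripped = next(csv.DictReader(buf))

for name, value in round_tripped.items():
    print(name, repr(value), type(value).__name__)
```

Whoever consumes the CSV has to re-infer or re-declare the types, which is one reason to keep the canonical copy in a typed format and treat CSV as an export.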
Split the workflow
- Store canonical tables in Parquet/Iceberg inside the lake.
- Emit bounded CSV slices for stakeholders who will not write SQL.
- Use a viewer for those slices instead of re-importing them into Sheets.
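One way to read "bounded CSV slice" is an export capped in both rows and columns. The following is a minimal sketch under that assumption, using only the standard library. The function name `write_bounded_csv`, its parameters, and the sample rows are hypothetical; in practice the rows would come from whatever reader sits on top of the Parquet or Iceberg table.

```python
import csv
import io
from itertools import islice
from typing import Iterable, Mapping, Sequence, TextIO


def write_bounded_csv(
    rows: Iterable[Mapping[str, object]],
    columns: Sequence[str],
    out: TextIO,
    max_rows: int = 1000,
) -> int:
    """Write at most max_rows rows, limited to columns, as CSV.

    Returns the number of data rows written (header excluded).
    """
    # extrasaction="ignore" drops any field not listed in columns,
    # so internal-only fields never reach the exported file.
    writer = csv.DictWriter(out, fieldnames=list(columns), extrasaction="ignore")
    writer.writeheader()
    written = 0
    # islice stops early, so a large source is never fully materialized.
    for row in islice(rows, max_rows):
        writer.writerow(row)
        written += 1
    return written


# Hypothetical rows standing in for a query result from the lake.
sample_rows = [
    {"order_id": i, "region": "EMEA", "amount": 10.5 * i, "internal_note": "x"}
    for i in range(5)
]
buf = io.StringIO()
count = write_bounded_csv(sample_rows, ["order_id", "region", "amount"], buf, max_rows=3)
print(count)
print(buf.getvalue())
```

Capping the row count and naming the columns explicitly keeps the attachment small and predictable, while the full table stays in the lake for anyone who does need to query it.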