Apache Parquet
Open jobs
145
Companies looking for Parquet
141
Back in the day, when data lakes were threatening to drown us all in unstructured chaos, along came Apache Parquet. The promise? Columnar storage to rescue our queries from the abysmal depths of full table scans. It’s essentially a way of organizing data on disk column by column, so a query reads only the columns it needs, not every field of every row. Quite clever, really, though one does wonder if everyone truly understood the implications for schema evolution.
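To make the "only read what you need" idea concrete, here is a toy sketch in plain Python. It is not the Parquet format: real Parquet files use row groups, column chunks, pages, and binary footer metadata, while this sketch uses JSON and an in-memory index purely for illustration. The function names (`write_row_oriented`, `write_column_oriented`, `read_column`) and the sample data are made up for this example.

```python
import json


def write_row_oriented(rows):
    """Store whole records one after another, like JSON Lines or CSV."""
    return "\n".join(json.dumps(row) for row in rows).encode("utf-8")


def write_column_oriented(rows):
    """Store each column as one contiguous chunk and remember where it sits."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)

    body = b""
    index = {}  # column name -> (offset, length); a stand-in for footer metadata
    for name, values in columns.items():
        chunk = json.dumps(values).encode("utf-8")
        index[name] = (len(body), len(chunk))
        body += chunk
    return body, index


def read_column(body, index, name):
    """Return one column's values and the number of bytes touched to get them."""
    offset, length = index[name]
    return json.loads(body[offset:offset + length]), length


# Hypothetical sample data: 100 records with three fields each.
rows = [{"id": i, "name": f"user{i}", "amount": i * 1.5} for i in range(100)]

row_bytes = write_row_oriented(rows)
col_bytes, col_index = write_column_oriented(rows)

# Summing one field: the row layout forces a scan of every record,
# while the column layout only touches the "amount" chunk.
amounts, bytes_read = read_column(col_bytes, col_index, "amount")

print(f"row layout: must scan {len(row_bytes)} bytes to total the amounts")
print(f"column layout: read {bytes_read} of {len(col_bytes)} bytes")
print("total:", sum(amounts))
```

The gap between the two byte counts grows with the number of columns a query ignores, which is why wide analytical tables tend to benefit the most from a columnar layout.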
The reality? It's become a de facto standard for anything touching Spark, Hive, or Presto. Forget CSV; Parquet’s where the cool kids hang out. It's not a silver bullet, mind you: lots of small files can be a nightmare, and it’s not ideal for every workload, frequent row-level updates being the obvious example. But compared to row-oriented formats or even older columnar solutions, it offers a compelling balance of compression, performance, and ecosystem support. Today, if you're not using it, you're probably doing it wrong, or at least making your data engineers weep softly into their lattes.
Jobs using Apache Parquet
data engineer
Junior Data Engineer @ burson
GB | 2025-12-26
A Junior Data Engineer role at Burson involves supporting data pipelines and AI models in a hybrid London setting. The position emphasizes Python scripting, Azure cloud infrastructure, and...
Skills: AI/ML, DevOps, Python, Azure, Computer Science, BI, Agile/Scrum, Git, API, Power BI, Azure DevOps, Java, R, DAX, NoSQL, SQL, MongoDB, Parquet

data engineer
Senior Data Engineer - (Genetics) Maternity Cover - 12 months FTC @ our-future-health-uk
GB | 2025-12-22
This role is for a Senior Data Engineer specializing in genetic data processing, with responsibilities involving building and maintaining robust pipelines for data storage and release. The...
Skills: Data Engineering, CI/CD, Agile/Scrum, Cloud Computing, Python, Unix, Azure, Parquet, Delta, Docker, Kubernetes, Spark, Databricks, Git, GitHub