Apache Parquet
Open jobs: 145
Companies looking for Parquet: 141
Back in the day, when data lakes were threatening to drown us all in unstructured chaos, came Apache Parquet. The promise? Columnar storage to rescue our queries from the abysmal depths of full table scans. It’s essentially a way of organizing data on disk column by column, so a query only reads the columns it needs, not everything. Quite clever, really, though one does wonder if everyone truly understood the implications for schema evolution.
The reality? It's become a de facto standard for anything touching Spark, Hive, or Presto. Forget CSV; Parquet’s where the cool kids hang out. It's not a silver bullet, mind you: lots of small files can be a nightmare, and it’s not ideal for every workload (row-at-a-time lookups and frequent in-place updates, for instance). But compared to row-oriented formats or even older columnar solutions, it offers a compelling balance of compression, performance, and ecosystem support. Today, if you're not using it for analytics, you're probably doing it wrong, or at least making your data engineers weep softly into their lattes.
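For the curious, the "only read what you need" idea can be sketched in a few lines of plain Python. This is a toy model, not Parquet's actual on-disk format (which adds row groups, encodings, compression, and footer metadata). The sample records and the helper names `to_columnar`, `sum_row_oriented`, and `sum_columnar` are made up for illustration.

```python
# Toy illustration of row-oriented vs. columnar access.
# Assumption: the data and helper names are hypothetical; real Parquet
# files are far more elaborate than a dict of lists.

rows = [
    {"user_id": 1, "country": "US", "amount": 12.5},
    {"user_id": 2, "country": "DE", "amount": 7.0},
    {"user_id": 3, "country": "US", "amount": 30.25},
]


def to_columnar(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_row_oriented(records, field):
    """Row layout: each whole record is visited, so every field is touched."""
    total = 0.0
    values_touched = 0
    for record in records:
        values_touched += len(record)
        total += record[field]
    return total, values_touched


def sum_columnar(columns, field):
    """Columnar layout: only the requested column is visited."""
    column = columns[field]
    return sum(column), len(column)


columns = to_columnar(rows)
row_total, row_touched = sum_row_oriented(rows, "amount")
col_total, col_touched = sum_columnar(columns, "amount")

print(f"row-oriented: total={row_total}, values touched={row_touched}")
print(f"columnar:     total={col_total}, values touched={col_touched}")
```

Both paths compute the same total, but the columnar one touches a third of the values here, and the gap widens with every extra column. With real files, libraries such as PyArrow expose the same idea through a column selection argument, e.g. `pyarrow.parquet.read_table(path, columns=[...])`.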
Jobs using Apache Parquet

chief
Assistant Director - Analytics & Modeling @ moodys-corporation
US | 2025-12-28 | USD 139900 - 202750 / year
Moody’s is hiring into the Model Certification Products team to prototype and certify complex catastrophe, climate, and cyber models, with an emphasis on terabyte-scale streams and batch data. The...
AI/ML, R, Python, C, C++, Rust, Arrow, Parquet, SQL, Amazon RDS, LLM, Data Science, Computer Science, API, Analytics

data engineer
Data Engineer, Active Grid Response @ gridwareinc
US | 2025-12-23
Gridware seeks a data engineer to develop ETL/ELT pipelines for its Active Grid Response platform, emphasizing high-precision sensor data integration and real-time processing, with a notable focus...
Management, Data Lakehouse, Analytics, ETL/ELT, Data Lake, Python, SQL, Databricks, Data Quality, Data Science, Cloud Computing, Big Data, Spark, Airflow, Dagster, Prefect, Data Streaming, Kafka, Kinesis, Data Modelling, IoT, Protobuf, Avro, Parquet, Grafana

data engineer
Senior Data Engineer @ autodesk
US | 2025-12-23 | USD 130600 - 211200 / year
Seeking a Principal Data Engineer to lead data infrastructure design supporting ML, personalization, and search systems. A key differentiator is the opportunity to influence strategic initiatives...
AI/ML, Data Science, RAG, Data Engineering, Agile/Scrum, Analytics, Kafka, Flink, SQL, NoSQL, Vector DB, Python, Java, Big Data, Spark, Parquet, Iceberg, Delta, ETL/ELT, Cloud Computing, AWS, Azure, GCP, DWH, Snowflake, Redshift, Data Modelling, Computer Science, PhD, Pinecone, ELK, Data Streaming, MLOps