Open jobs (this month): 145

Companies looking for Parquet: 141

Back in the day, when data lakes were threatening to drown us all in unstructured chaos, along came Apache Parquet. The promise? Columnar storage to rescue our queries from the abyssal depths of full table scans. It’s essentially a way of organizing data on disk by column rather than by row, so a query reads only the columns it needs, not everything. Quite clever, really, though one does wonder if everyone truly understood the implications for schema evolution.
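To make the "only read what you need" idea concrete, here is a toy sketch in plain Python. It is not the real Parquet format or any library's API: the sample rows and the helper names (`to_columns`, `sum_row_layout`, `sum_column_layout`) are made up for illustration, and "values touched" is just a stand-in for bytes read from disk.

```python
# Toy model of row-oriented vs. column-oriented layout.
# Assumption: this only illustrates the idea; real Parquet adds row groups,
# pages, encodings, and metadata on top of it.

rows = [
    {"id": 1, "city": "Oslo", "amount": 10.0},
    {"id": 2, "city": "Lima", "amount": 2.5},
    {"id": 3, "city": "Oslo", "amount": 7.5},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_row_layout(records, field):
    """Row layout: each whole row is read just to reach one field."""
    total = 0.0
    values_touched = 0
    for record in records:
        values_touched += len(record)  # every field in the row is read
        total += record[field]
    return total, values_touched


def sum_column_layout(columns, field):
    """Column layout: only the requested column is read."""
    column = columns[field]
    return sum(column), len(column)


columns = to_columns(rows)
row_total, row_touched = sum_row_layout(rows, "amount")
col_total, col_touched = sum_column_layout(columns, "amount")

print(row_total, row_touched)  # 20.0 9
print(col_total, col_touched)  # 20.0 3
```

Same answer, a third of the values touched. With three columns that is a curiosity; with three hundred it is the difference between a query and a coffee break.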

The reality? It’s become a de facto standard for anything touching Spark, Hive, or Presto. Forget CSV; Parquet’s where the cool kids hang out. It’s not a silver bullet, mind you: piles of small files can be a nightmare, and it’s not ideal for every workload. But compared to row-oriented formats or even older columnar solutions, it offers a compelling balance of compression, performance, and ecosystem support. Today, if you’re not using it, you’re probably doing it wrong, or at least making your data engineers weep softly into their lattes.
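Part of that compression story comes from keeping similar values next to each other. Parquet supports dictionary encoding, among other encodings; the snippet below is a simplified sketch of the idea only, with hypothetical helper names (`dictionary_encode`, `dictionary_decode`) and made-up data, not Parquet's actual on-disk representation.

```python
# Simplified sketch of dictionary encoding on a single column.
# Assumption: illustrative only; the real format packs the indices further
# (for example with run-length and bit-packing) and stores them in pages.


def dictionary_encode(values):
    """Replace each value with an index into a list of distinct values."""
    dictionary = []   # distinct values, in order of first appearance
    positions = {}    # value -> its index in `dictionary`
    indices = []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def dictionary_decode(dictionary, indices):
    """Rebuild the original column from the dictionary and the indices."""
    return [dictionary[i] for i in indices]


city_column = ["Oslo", "Lima", "Oslo", "Oslo", "Lima", "Oslo"]
dictionary, indices = dictionary_encode(city_column)

print(dictionary)  # ['Oslo', 'Lima']
print(indices)     # [0, 1, 0, 0, 1, 0]
```

A column with few distinct values shrinks to a short dictionary plus small integers, which is much harder to pull off when every row interleaves ids, names, and amounts.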

Jobs using Apache Parquet

chief

Assistant Director - Analytics & Modeling @ moodys-corporation

US | 2025-12-28 | USD 139900 - 202750 / year
Moody’s is hiring into the Model Certification Products team to prototype and certify complex catastrophe, climate, and cyber models, with an emphasis on terabyte-scale streams and batch data. The...
AI/ML, R, Python, C, C++, Rust, Arrow, Parquet, SQL, Amazon RDS, LLM, Data Science, Computer Science, API, Analytics
data engineer

Data Engineer, Active Grid Response @ gridwareinc

US | 2025-12-23
Gridware seeks a data engineer to develop ETL/ELT pipelines for its Active Grid Response platform, emphasizing high-precision sensor data integration and real-time processing, with a notable focus...
Management, Data Lakehouse, Analytics, ETL/ELT, Data Lake, Python, SQL, Databricks, Data Quality, Data Science, Cloud Computing, Big Data, Spark, Airflow, Dagster, Prefect, Data Streaming, Kafka, Kinesis, Data Modelling, IoT, Protobuf, Avro, Parquet, Grafana
data engineer

Senior Data Engineer @ autodesk

US | 2025-12-23 | USD 130600 - 211200 / year
Seeking a Principal Data Engineer to lead data infrastructure design supporting ML, personalization, and search systems. A key differentiator is the opportunity to influence strategic initiatives...
AI/ML, Data Science, RAG, Data Engineering, Agile/Scrum, Analytics, Kafka, Flink, SQL, NoSQL, Vector DB, Python, Java, Big Data, Spark, Parquet, Iceberg, Delta, ETL/ELT, Cloud Computing, AWS, Azure, GCP, DWH, Snowflake, Redshift, Data Modelling, Computer Science, PhD, Pinecone, ELK, Data Streaming, MLOps