Apache Spark
Open jobs: 3184
Companies looking for Spark: 2819
Apache Spark is an in-memory distributed computing framework for large-scale data processing. Built for complex analytics, machine learning, and streaming workloads, it typically outperforms disk-based MapReduce on iterative and interactive jobs because intermediate results can stay in memory. Its unified analytics engine supports Scala, Python, Java, and R and integrates with a wide range of data sources. Resilient distributed datasets (RDDs) and the DataFrame abstraction let users express complex transformations without writing the distribution logic themselves. Spark is powerful, but it demands significant computational resources and expertise to tune. Its machine learning library (MLlib) and streaming capabilities make it a common choice for enterprises processing petabyte-scale datasets across distributed environments.
Jobs using Apache Spark

Snowflake Data Engineer @ morgan-stanley
data engineer | US | 2025-12-27 | USD 90000 - 150000 / year
A Data Engineer role at Morgan Stanley requires developing Scala/Spark data pipelines on Databricks and managing datasets in Snowflake, with a focus on regulatory reporting. The position's...
Tags: Data Engineering, Scala, Spark, Databricks, Snowflake, Python, Agile/Scrum, Computer Science, Cloud Computing, SQL, Unix, Linux, GenAI

Data Engineer II - QuantumBlack, AI by McKinsey (Critical Industries) @ quantumblack
data engineer | US | 2025-12-27
Data Engineer II role promises to design, build, and optimize modern data platforms powering analytics and AI, with a tour through streaming and batch pipelines across aerospace, utilities, and...
Tags: Analytics, AI/ML, R, Data Streaming, Vector DB, RAG, Agile/Scrum, C, Computer Science, Data Engineering, Python, Scala, Java, SQL, Cloud Computing, AWS, GCP, Azure, Oracle, Snowflake, BigQuery, Redshift, Delta, Databricks, AWS Glue, dbt, Spark, Flink, Kafka, Kinesis, Airflow, Dagster, Prefect, CI/CD, Terraform, CloudFormation, DataOps, Datadog, Prometheus, Amazon SageMaker, MLOps, GenAI, LLM, Management

Data Engineer @ fanduel
data engineer | US | 2025-12-26 | USD 116000 - 152250 / year
FanDuel's data engineer role reads like a standard pipeline gig: build and maintain batch and streaming jobs, code in Python and SQL, and keep Databricks, Spark, Airflow, and dbt humming. It...
Tags: Data Engineering, Analytics, AI/ML, Data Streaming, Python, SQL, Spark, Agile/Scrum, Data Quality, Analytics Engineering, Java, Scala, Databricks, Airflow, dbt, Kafka, Data Modelling, DWH, ETL/ELT, Cloud Computing, AWS, GCP, Azure, BI, Data Science, Git, CI/CD, Data Governance

Senior Engineer - Data Analytics @ geico
data engineer | US | 2025-12-25 | USD 100000 - 230000 / year
GEICO is seeking a Senior Engineer to develop high-performance data pipelines and models, with a focus on building resilient distributed systems and supporting internal analytics. The role's...
Tags: Data Analytics, Analytics Engineering, Data Modelling, Python, SQL, NoSQL, Spark, dbt, Docker, Kubernetes, Azure, Power BI, Superset, Big Data, Kafka, Git, Snowflake, dimensional modeling, Analytics, BI, Iceberg, Airflow, ETL/ELT, CI/CD, DevOps, Azure DevOps, Management, API, Data Quality, Marketing, AWS, GCP, Cloud Computing, Databricks, Computer Science, Data Science

Data Engineer @ orveonglobal
data engineer | US | 2025-12-24 | $80500 - 100500 / year
Orveon seeks a Data Engineer to craft scalable data pipelines using Microsoft Fabric, collaborating with Power BI developers and business analysts. A key differentiator is the focus on integrating...
Tags: Fabric, Power BI, Analytics, Data Lakehouse, Data Quality, Computer Science, Data Science, Spark, Python, SQL, CI/CD, Data Modelling, Git, Azure DevOps, Microsoft, Azure, Synapse

Data Engineer -601/602 @ ptrglobal
data engineer | US | 2025-12-24 | USD 65 - 70 / hour
A data engineer role focused on AWS, Python, and Spark within a bank's consumer division; requires designing data pipelines and models, with a differentiator being expertise in cloud data lake...
Tags: AWS, Python, Spark, Data Lake, Agile/Scrum, Data Collection, Analytics, SQL, NoSQL, Data Quality, Data Governance, Data Engineering, PySpark, GenAI, API, Cloud Computing, Data Lakehouse, Databricks, Hadoop, PostgreSQL, Oracle, Cassandra, DynamoDB, MongoDB, Snowflake, Redshift, Airflow, Unix, Avro, Protobuf, Parquet, Iceberg, Data Streaming, Data Modelling, Data Vault, dimensional modeling, CI/CD

Quantitative Developer - Risk & Data Platform @ balyasny-asset-management-l.p.
data engineer | US | 2025-12-23
Longaeva seeks a Quantitative Developer for its Data Platform team, anchoring risk and research workflows in a New York hedge fund with a global mandate. The role is hands-on and highly visible:...
Tags: Management, DWH, Dashboard, Python, SQL, Snowflake, Spark, dbt, Iceberg, AWS, API

Data Engineer II - QuantumBlack, AI by McKinsey @ quantumblack
data engineer | US | 2025-12-22
Data Engineer II role promises high impact across clients and a culture of continuous learning, but the job description reads like a curriculum vitae for a consulting fortress: heavy emphasis on...
Tags: Analytics, AI/ML, R, Data Streaming, Vector DB, RAG, Agile/Scrum, C, Computer Science, Data Engineering, Python, Scala, Java, SQL, PySpark, Cloud Computing, AWS, GCP, Azure, Oracle, Snowflake, BigQuery, Redshift, Delta, Databricks, AWS Glue, dbt, Spark, Flink, Kafka, Kinesis, Airflow, Dagster, Prefect, CI/CD, Terraform, CloudFormation, DataOps, Datadog, Prometheus, Amazon SageMaker, MLOps, GenAI, LLM, Management