
Spark developer (San Jose/CA)

IBM

Software and Technology Jobs

Full-time · Posted: Dec 12, 2025

Job Description

📋 Job Overview

Join IBM as a Spark Scala Developer in San Jose, CA, to develop and optimize big data applications using Apache Spark and Scala. You will architect scalable data pipelines, collaborate with data teams, and ensure data quality and compliance. This role offers growth opportunities within IBM's innovative software environment.

📍 Location: San Jose, US (Remote/Hybrid)

💼 Career Level: Professional

🎯 Key Responsibilities

  • Design, develop, and optimize big data applications using Apache Spark and Scala
  • Architect and implement scalable data pipelines for both batch and real-time processing
  • Collaborate with data engineers, analysts, and architects to define data strategies
  • Optimize Spark jobs for performance and cost-effectiveness on distributed clusters
  • Build and maintain reusable code and libraries for future use
  • Work with various data storage systems like HDFS, Hive, HBase, Cassandra, Kafka, and Parquet
  • Implement data quality checks, logging, monitoring, and alerting for ETL jobs
  • Mentor junior developers and lead code reviews to ensure best practices
  • Ensure security, governance, and compliance standards are adhered to in all data processes
  • Troubleshoot and resolve performance issues and bugs in big data solutions
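The "data quality checks, logging, monitoring, and alerting for ETL jobs" responsibility above can be sketched in miniature. This is a hypothetical illustration using plain Scala collections (the `Txn` case class, field names, and validation rules are invented for the example, not taken from the posting); in a real Spark job the same predicate would run per-partition over a DataFrame or Dataset:

```scala
// Hypothetical sketch of a record-level quality gate an ETL job might apply
// before loading. Names and rules are illustrative only.
case class Txn(id: String, amount: Double)

object QualityCheck {
  // Returns (valid, rejected). Rejected rows would typically be logged and
  // routed to a quarantine location to drive monitoring and alerting.
  def partitionValid(rows: Seq[Txn]): (Seq[Txn], Seq[Txn]) =
    rows.partition(r => r.id.nonEmpty && r.amount > 0)
}

object QualityCheckDemo extends App {
  val rows = Seq(Txn("a1", 10.0), Txn("", 5.0), Txn("a3", -2.0))
  val (ok, bad) = QualityCheck.partitionValid(rows)
  println(s"valid=${ok.size} rejected=${bad.size}") // prints valid=1 rejected=2
}
```

Keeping the predicate a pure function makes it easy to unit-test outside the cluster, which is what the automated-testing requirement later in the posting points toward.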

✅ Required Qualifications

  • 12+ years of total software development experience
  • 5+ years of hands-on experience with Apache Spark and Scala
  • Strong experience with distributed computing, parallel data processing, and cluster computing frameworks
  • Proficiency in Scala with deep knowledge of functional programming
  • Solid understanding of Spark tuning, partitions, joins, broadcast variables, and performance optimization techniques
  • Experience with cloud platforms such as AWS, Azure, or GCP (especially EMR, Databricks, or HDInsight)
  • Hands-on experience with Kafka, Hive, HBase, NoSQL databases, and data lake architectures
  • Familiarity with CI/CD pipelines, Git, Jenkins, and automated testing
  • Strong problem-solving skills and the ability to work independently or as part of a team
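The broadcast-variable and join-tuning knowledge asked for above rests on one idea: when one side of a join is small, ship it to every task as an in-memory lookup instead of shuffling both sides. A minimal sketch of that idea, modeled with plain Scala collections standing in for Spark's `broadcast()` (the table contents and names here are illustrative assumptions):

```scala
// Hedged sketch of the logic behind a broadcast (map-side) join. In Spark
// this corresponds to broadcast(smallDf) or sc.broadcast(map); here a plain
// Map plays the broadcast variable.
object BroadcastJoinSketch {
  // Small dimension table, conceptually broadcast to all workers.
  val dims: Map[Int, String] = Map(1 -> "US", 2 -> "IN")

  // Each "partition" of the large fact table joins locally: no shuffle,
  // rows with no matching key are dropped (inner-join semantics).
  def joinPartition(facts: Seq[(Int, Double)]): Seq[(String, Double)] =
    facts.flatMap { case (key, value) =>
      dims.get(key).map(name => (name, value))
    }
}
```

For example, `joinPartition(Seq((1, 9.5), (3, 1.0)))` keeps only the row whose key appears in the broadcast map. The trade-off an interviewer would probe: broadcasting avoids a shuffle but only works while the small side fits in executor memory.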

⭐ Preferred Qualifications

  • Exposure to machine learning pipelines using Spark MLlib or integration with ML frameworks
  • Experience with data governance tools (e.g., Apache Atlas, Collibra)
  • Contributions to open-source big data projects

🛠️ Required Skills

  • Apache Spark
  • Scala
  • Distributed computing
  • Parallel data processing
  • Cluster computing frameworks
  • Functional programming
  • Spark tuning
  • Partitions
  • Joins
  • Broadcast variables
  • Performance optimization
  • AWS
  • Azure
  • GCP
  • EMR
  • Databricks
  • HDInsight
  • Kafka
  • Hive
  • HBase
  • NoSQL databases
  • Data lake architectures
  • CI/CD pipelines
  • Git
  • Jenkins
  • Automated testing
  • Problem-solving
  • Teamwork
  • Spark MLlib
  • Machine learning
  • Data governance
  • Apache Atlas
  • Collibra
  • Open-source big data projects
  • HDFS
  • Cassandra
  • Parquet
  • ETL
  • Mentoring
  • Code reviews
  • Security
  • Governance
  • Compliance
  • Troubleshooting

🎁 Benefits & Perks

  • Healthcare benefits including medical & prescription drug coverage, dental, vision, and mental health & well-being
  • Financial programs such as 401(k), cash balance pension plan, the IBM Employee Stock Purchase Plan, financial counseling, life insurance, short- & long-term disability coverage, and opportunities for performance-based salary incentive programs
  • Generous paid time off including 12 holidays, minimum 56 hours sick time, 120 hours vacation, 12 weeks parental bonding leave in accordance with IBM Policy, and other Paid Care Leave programs
  • Training and educational resources on our personalized, AI-driven learning platform where IBMers can grow skills and obtain industry-recognized certifications to achieve their career goals
  • Diverse and inclusive employee resource groups, giving & volunteer opportunities, and discounts on retail products, services & experiences

Locations

  • San Jose, US, India (Remote)

Salary

Estimated Salary Range (medium confidence)

2,500,000 – 4,200,000 INR / year

Source: AI-estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.



Tags & Categories

Software Engineering

