
Spark developer (San Jose/CA)

IBM

Software and Technology Jobs

Full-time · Posted: Dec 12, 2025

Job Description

📋 Job Overview

Join IBM as a Spark Scala Developer in San Jose, CA, to develop and optimize big data applications using Apache Spark and Scala. You will architect scalable data pipelines, collaborate with data teams, and ensure data quality and compliance. This role offers growth opportunities within IBM's innovative software environment.

📍 Location: San Jose, US (Remote/Hybrid)

💼 Career Level: Professional

🎯 Key Responsibilities

  • Design, develop, and optimize big data applications using Apache Spark and Scala
  • Architect and implement scalable data pipelines for both batch and real-time processing
  • Collaborate with data engineers, analysts, and architects to define data strategies
  • Optimize Spark jobs for performance and cost-effectiveness on distributed clusters
  • Build and maintain reusable code and libraries for future use
  • Work with various data storage systems like HDFS, Hive, HBase, Cassandra, Kafka, and Parquet
  • Implement data quality checks, logging, monitoring, and alerting for ETL jobs
  • Mentor junior developers and lead code reviews to ensure best practices
  • Ensure security, governance, and compliance standards are adhered to in all data processes
  • Troubleshoot and resolve performance issues and bugs in big data solutions
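The "data quality checks, logging, monitoring, and alerting for ETL jobs" responsibility above can be sketched in miniature. This is a hypothetical illustration using plain Scala collections (the `Txn` case class, field names, and validation rules are invented for the example, not taken from the posting); in a real Spark job the same predicate would run per-partition over a DataFrame or Dataset:

```scala
// Hypothetical sketch of a record-level quality gate an ETL job might apply
// before loading. Names and rules are illustrative only.
case class Txn(id: String, amount: Double)

object QualityCheck {
  // Returns (valid, rejected). Rejected rows would typically be logged and
  // routed to a quarantine location to drive monitoring and alerting.
  def partitionValid(rows: Seq[Txn]): (Seq[Txn], Seq[Txn]) =
    rows.partition(r => r.id.nonEmpty && r.amount > 0)
}

object QualityCheckDemo extends App {
  val rows = Seq(Txn("a1", 10.0), Txn("", 5.0), Txn("a3", -2.0))
  val (ok, bad) = QualityCheck.partitionValid(rows)
  println(s"valid=${ok.size} rejected=${bad.size}") // prints valid=1 rejected=2
}
```

Keeping the predicate a pure function makes it easy to unit-test outside the cluster, which is what the automated-testing requirement later in the posting points toward.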

✅ Required Qualifications

  • 12+ years of total software development experience
  • 5+ years of hands-on experience with Apache Spark and Scala
  • Strong experience with distributed computing, parallel data processing, and cluster computing frameworks
  • Proficiency in Scala with deep knowledge of functional programming
  • Solid understanding of Spark tuning, partitions, joins, broadcast variables, and performance optimization techniques
  • Experience with cloud platforms such as AWS, Azure, or GCP (especially EMR, Databricks, or HDInsight)
  • Hands-on experience with Kafka, Hive, HBase, NoSQL databases, and data lake architectures
  • Familiarity with CI/CD pipelines, Git, Jenkins, and automated testing
  • Strong problem-solving skills and the ability to work independently or as part of a team
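The broadcast-variable and join-tuning knowledge asked for above rests on one idea: when one side of a join is small, ship it to every task as an in-memory lookup instead of shuffling both sides. A minimal sketch of that idea, modeled with plain Scala collections standing in for Spark's `broadcast()` (the table contents and names here are illustrative assumptions):

```scala
// Hedged sketch of the logic behind a broadcast (map-side) join. In Spark
// this corresponds to broadcast(smallDf) or sc.broadcast(map); here a plain
// Map plays the broadcast variable.
object BroadcastJoinSketch {
  // Small dimension table, conceptually broadcast to all workers.
  val dims: Map[Int, String] = Map(1 -> "US", 2 -> "IN")

  // Each "partition" of the large fact table joins locally: no shuffle,
  // rows with no matching key are dropped (inner-join semantics).
  def joinPartition(facts: Seq[(Int, Double)]): Seq[(String, Double)] =
    facts.flatMap { case (key, value) =>
      dims.get(key).map(name => (name, value))
    }
}
```

For example, `joinPartition(Seq((1, 9.5), (3, 1.0)))` keeps only the row whose key appears in the broadcast map. The trade-off an interviewer would probe: broadcasting avoids a shuffle but only works while the small side fits in executor memory.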

⭐ Preferred Qualifications

  • Exposure to machine learning pipelines using Spark MLlib or integration with ML frameworks
  • Experience with data governance tools (e.g., Apache Atlas, Collibra)
  • Contributions to open-source big data projects

🛠️ Required Skills

  • Apache Spark
  • Scala
  • Distributed computing
  • Parallel data processing
  • Cluster computing frameworks
  • Functional programming
  • Spark tuning
  • Partitions
  • Joins
  • Broadcast variables
  • Performance optimization
  • AWS
  • Azure
  • GCP
  • EMR
  • Databricks
  • HDInsight
  • Kafka
  • Hive
  • HBase
  • NoSQL databases
  • Data lake architectures
  • CI/CD pipelines
  • Git
  • Jenkins
  • Automated testing
  • Problem-solving
  • Teamwork
  • Spark MLlib
  • Machine learning
  • Data governance
  • Apache Atlas
  • Collibra
  • Open-source big data projects
  • HDFS
  • Cassandra
  • Parquet
  • ETL
  • Mentoring
  • Code reviews
  • Security
  • Governance
  • Compliance
  • Troubleshooting

🎁 Benefits & Perks

  • Healthcare benefits including medical & prescription drug coverage, dental, vision, and mental health & well-being
  • Financial programs such as 401(k), cash balance pension plan, the IBM Employee Stock Purchase Plan, financial counseling, life insurance, short- & long-term disability coverage, and opportunities for performance-based salary incentive programs
  • Generous paid time off including 12 holidays, minimum 56 hours sick time, 120 hours vacation, 12 weeks parental bonding leave in accordance with IBM Policy, and other Paid Care Leave programs
  • Training and educational resources on our personalized, AI-driven learning platform where IBMers can grow skills and obtain industry-recognized certifications to achieve their career goals
  • Diverse and inclusive employee resource groups, giving & volunteer opportunities, and discounts on retail products, services & experiences

Locations

  • San Jose, US, India (Remote)

Salary

Estimated Salary Range (medium confidence)

2,500,000 – 4,200,000 INR / year

Source: AI-estimated

* This is an estimated range based on market data and may vary based on experience and qualifications.



Tags & Categories

Software Engineering

