Staff Data Engineer
at Demandbase
Remote
JOB TITLE: Staff Data Engineer (Full time)
JOB DUTIES: The Staff Data Engineer will define and drive software architecture development across different engineering teams, re-designing data-pipeline software for over 200 million records daily to increase efficiency and responsiveness to user needs, ensuring scalable, high-performance, and maintainable software products. The Staff Data Engineer will be primarily focused on the following duties:
- Drive technical direction for data engineering on product teams, ensuring alignment with business goals, and fostering best practices in software development. This includes:
  - Develop and maintain the data engineering technical strategy and roadmap for key Demandbase software products, aligning technical strategy with the goal of improving data quality by 20% and reducing process latency by 20%.
  - Lead the integration of data pipelines and workflows, delivering business outcomes autonomously.
  - Work with engineering managers, peer engineers, and product managers to ensure seamless execution of technical initiatives.
  - Introduce and advocate for best software engineering practices, including software design principles, code quality, security, and cloud scalability.
  - Act as a mentor and role model, helping to grow and develop engineering talent within the organization.
  - Work closely with product managers to break down product initiatives into deliverable iterations while balancing technical and business needs.
  - Contribute to code reviews, proof of concepts, and complex system designs when needed.
- Lead design and implementation of robust data solutions and microservices to meet real-time and batch requirements. This includes:
  - Lead the development and optimization of large-scale, distributed microservices and APIs for real-time and batch system needs using Scala/Java/Python.
  - Lead the development and scaling of data pipelines using Spark, incorporating NoSQL and relational databases.
  - Lead data aggregation, warehousing, and processing of at least 200 million records daily.
  - Consume and produce data using event-driven systems like Pulsar and Kafka.
- Lead automation and streamlining of deployments, ensuring efficient and secure cloud-based workflows. This includes:
  - Lead the maintenance and creation of GitLab pipelines to automate builds, tests, and deployments on AWS Cloud using GitLab CI/CD.
  - Lead orchestration of data pipelines, including scheduling, monitoring, and managing high-volume data workflows using Astronomer deployed via CI/CD.
  - Use Docker for containerization.
- Lead development and refinement of data models to maximize performance and meet business objectives. This includes:
  - Create, maintain, and review data models to suit business requirements while ensuring efficient solutions.
100% remote. May be located anywhere in continental United States. Reports to HQ at 222 2nd St, Fl 24, San Francisco, CA 94105.
No travel required.
JOB REQUIREMENTS: Bachelor’s degree (or foreign equivalent) in Computer Science or Computer Engineering and 60 months (5 years) of progressive, post-bachelor’s experience as a software engineer or in any occupation in which the required experience was gained. Requires:
- 5 years of experience developing and optimizing large-scale, distributed microservices and APIs for real-time and batch requirements using Scala, Java, and/or Python.
- 5 years of experience using Spark (or similar tools) to develop and scale data pipelines which incorporate NoSQL and relational databases.
- 5 years of experience setting up automated build, test, and deployment pipelines with CI/CD using GitHub, GitLab, and/or Jenkins.
- 5 years of experience working with Big Data/Cloud technologies.
- 5 years of experience performing data modeling.
- 5 years of experience working on the orchestration of data pipelines, including scheduling, monitoring, and managing high-volume data workflows using tools like Astronomer, Airflow, or similar tools.
- Experience using Docker for containerization.
- Experience with data aggregation, warehousing, and processing of at least 100 million records daily.
SALARY OFFERED: From $258,000.00 per year
JOB LOCATION: 100% remote. May be located anywhere in continental United States. Reports to HQ at 222 2nd St, Fl 24, San Francisco, CA 94105.