Data Engineer
Summary
As a Data Engineer on the Credit Platform Data team, you will build and enhance data processing tools that enable internal business units to use data efficiently. You will design, build, and maintain data pipelines and ETL processes, collaborate with stakeholders to understand their data requirements, and deliver solutions that keep our data systems robust, scalable, and reliable so the organization can make data-driven decisions.
Responsibilities
- Data Pipeline Development: Design, build, and maintain robust ETL processes to ingest, transform, and load data from various sources into the data warehouse, ensuring data is readily available for analysis.
- Collaboration: Work closely with product managers, analysts, and other stakeholders to gather data requirements and develop solutions that accommodate large volumes of data.
- System Reliability and Scalability: Monitor the performance of data systems, ensuring their reliability, availability, and scalability. Optimize systems as needed to handle increasing data loads.
- Data Quality Assurance: Implement automated data quality checks and validation processes to ensure the integrity and accuracy of data at scale.
- Troubleshooting: Quickly identify and resolve data-related issues by determining root causes and implementing effective solutions.
- Documentation: Create design documents and maintain documentation for data pipelines, systems, and processes to ensure clarity and continuity.
- Code and Design Reviews: Actively participate in design and code reviews to uphold high-quality standards and best practices.
- Continuous Learning: Stay current with emerging technologies in the field of data engineering to recommend improvements that support scalability and growth.
Requirements
Must-Have Skills
- SQL: Proficiency in SQL is essential for querying and managing relational databases, enabling effective data manipulation and retrieval.
- ETL Processes: Strong experience in designing and implementing ETL processes to ensure efficient data flow from source systems to data warehouses.
- Scripting Languages (Python): Expertise in Python for data manipulation, automation, and integration tasks, allowing for efficient data processing and pipeline management.
- PySpark: Proficiency in PySpark for distributed data processing, enabling the handling of large datasets effectively.
- Data Processing Libraries: Familiarity with libraries such as Pandas for data manipulation and analysis, enhancing data processing capabilities.
- Data Warehousing Tools: Experience with tools like BigQuery or Snowflake, including performance optimization for large-scale data processing.
- Data Modeling and Schema Design: Strong understanding of data modeling, schema design, and optimization techniques to ensure scalability and performance.
- Automation Testing: Experience in creating automation test cases to validate data processes and ensure data quality.
- Unix/Linux Operating Systems: Proficiency in Unix/Linux environments and shell scripting for managing data workflows and system operations.
- Analytical and Problem-Solving Skills: Strong analytical skills to troubleshoot complex data issues and optimize data processing pipelines for scale.
- Communication and Collaboration: Excellent communication skills to work effectively in a team environment and collaborate with various stakeholders.
- Self-Motivated and Proactive: A passion for continuous learning and professional development, demonstrating initiative in improving data processes.
Nice-to-Have Skills
- Advanced Data Warehousing: Deeper familiarity with data warehousing concepts and best practices, enhancing the ability to design effective data storage solutions.
- Automation Testing Tools: Experience with dedicated automation testing frameworks to streamline testing processes and improve data quality assurance.
- Additional Cloud Platforms: Knowledge of other cloud data warehousing solutions, such as AWS Redshift or Google Cloud offerings, contributing to cloud-based data architecture.
- Advanced Data Modeling: Additional experience in advanced data modeling techniques to support complex data structures and relationships.
This role is an exciting opportunity for a Data Engineer to contribute to the growth and efficiency of our data systems, ensuring that our organization can leverage data effectively for strategic decision-making. If you are passionate about data engineering and eager to work in a collaborative environment, we encourage you to apply.
Target Start Date: ASAP
Engagement Length: Long term
Time Zone: Flexible between CST and PST.