SP – Infrastructure Engineer | Support Level 3

Job summary

We are seeking an Infrastructure Engineer with near-architect-level expertise to design and implement infrastructure solutions, communicate effectively with clients, and make informed decisions. The ideal candidate will have a strong understanding of networking, including secure service communication (e.g., HTTPS, certificates), and experience with on-premises infrastructure, OpenShift, and the three major cloud providers: AWS, Azure, and GCP. Proficiency in application deployment, database integration, and assessing deployment strategies is essential. The candidate should also be skilled in Docker and Kubernetes and able to explain what they do and when to use them. This role emphasizes infrastructure and deployment over automation and requires a versatile professional capable of handling diverse components and environments.

Core Skills and Requirements

1. Cloud Platforms

  • AWS, GCP, Azure: Experience provisioning and configuring services like compute, storage, networking, and managed databases.
  • On-Prem Deployments: Proficiency in deploying applications on-premises, including managing custom configurations and integrations.
  • Container Management: Knowledge of Kubernetes (especially OpenShift’s specific implementations) for container orchestration and scaling.

2. Containerization & Orchestration

  • Kubernetes: Deep understanding of managing multi-node clusters, setting up namespaces, services, and networking (Load Balancers, Ingress).
  • Docker: Familiarity with building, managing, and troubleshooting containerized applications.
  • Ingress Controllers: Skills in configuring Ingress for secure and efficient traffic management, especially in cloud and OpenShift environments.

3. Data Storage and Messaging Systems

  • Kafka: Knowledge of Kafka clusters, topic management, consumer groups, and replication for high availability.
  • MongoDB: Experience in managing MongoDB databases, especially in clustered setups with replica sets for resilience.
  • ZooKeeper: Understanding of ZooKeeper’s role in managing distributed systems, particularly for coordinating Kafka and Flink jobs.
  • Persistent Volumes: Proficiency in configuring Persistent Volumes, including limitations and access modes like ReadWriteMany and ReadWriteOnce (especially on cloud-specific storage like Azure StorageClasses).

4. Security and Access Control

  • ABAC and RBAC: Experience implementing Attribute-Based and Role-Based Access Controls to manage complex permissions across resources. Hands-on experience is preferred, but this can be taught on the job.
  • Data Encryption: Knowledge of encryption mechanisms for data at rest and in transit, particularly as required by applicable compliance standards.
  • Secrets Management: Familiarity with tools like HashiCorp Vault or Kubernetes Secrets for secure storage and access to sensitive information.

5. DevOps Toolchain and Infrastructure as Code (if needed)

  • Infrastructure as Code (IaC): Proficiency with Terraform, Helm, or Ansible for consistent provisioning and configuration across multiple environments.
  • CI/CD Pipelines: Familiarity with Jenkins, GitLab CI, or similar tools for automating deployment and testing processes, especially with complex multi-cloud setups.
  • Monitoring and Logging: Experience with monitoring (Prometheus, Grafana) and logging (ELK/EFK stacks) for proactive issue resolution and system observability.

6. Big Data and Stream Processing

  • Flink: Skills in deploying, managing, and monitoring Flink jobs, especially within a Kubernetes environment, to ensure efficient data processing.

7. Troubleshooting and Maintenance

  • Issue Diagnosis: Strong troubleshooting skills, especially with Kubernetes resources, storage issues (e.g., PVC access limitations), and certificate configurations.
  • System Health and Performance Monitoring: Capability to identify bottlenecks and optimize performance for distributed components like Kafka, Flink, and MongoDB.
  • Networking: Knowledge of networking concepts, including load balancing, DNS, and network security, is critical for handling traffic routing and managing ingress/egress securely.

8. Soft Skills

  • Collaboration and Communication: Ability to work effectively in cross-functional teams, manage stakeholder expectations, and document configurations and deployment processes.
  • Adaptability: Willingness to learn new tools and adapt to the latest technologies in the fast-evolving cloud and DevOps landscape.

The successful candidate should be well-equipped to handle the deployment, scaling, and security requirements of a complex, multi-cloud ABAC/RBAC-enabled platform.

Employment type: Remote
Job location: LATAM
