SAS Principal Engineer
- Work with the Architecture Team and the Development Teams to translate company needs into infrastructure solutions that meet the requirements for performance, resource usage, scalability, resilience and observability. The proposed solutions may include on-premises virtualised/bare-metal, cloud or hybrid architectures and must ensure the use of Continuous Integration and Continuous Delivery, Infrastructure as Code and GitOps approaches.
- Work with the operations team to design solutions based on pipelines, scripts and playbooks to continuously reduce the human tasks required to operate the production services (toil).
- Based on the high-level design (HLD) provided by the Architecture Team, work on the low-level design (LLD) for the required solutions/projects. The LLD should describe how the system will be implemented, the detailed specification for each component and how the components will be connected to form the system in order to meet the system’s functional and non-functional requirements. The LLD must also take into account the requirements of the other operations teams.
- Understand the functional and non-functional requirements of the systems, including performance, scalability, availability and security requirements.
- Identify the hardware, software and networking components that will be required to support the systems. This may include servers, storage devices, networking devices, operating systems and middleware components.
- Design the details of the overall architecture of the system, including the placement of components and the interfaces between them. This should take into account the HLD, the requirements of the system, and any constraints or limitations of the environment.
- Collaborate in the definition of the configuration of the hardware, software and networking components to meet the requirements of the system.
- Define the tests for the systems to ensure that they meet the functional and non-functional requirements. In some cases this may involve high availability testing, performance testing, scalability testing, security testing and other types of validation.
- Document the LLD, including the architecture, configuration, and testing results. This documentation will be used by operators to manage and maintain the system over time.
- Keep abreast of the latest advancements in technologies and broader industry trends. Stay updated on the latest developments in data center infrastructure, cloud computing, virtualization environments, hardware and storage devices while also maintaining awareness of evolving technologies in related areas such as networking, protocols, operating systems, databases, software-defined architectures and security measures. Incorporate this knowledge into the LLD process to drive innovative and comprehensive solutions that align with industry standards and leverage cross-functional synergies.
- Carry out provisioning, operational tasks (performance, scaling, organization, routine patching, security…) and decommissioning of infrastructure services.
- Support on-call members when necessary.
- Wide experience with Unix/Linux systems (Canonical Ubuntu and Red Hat/CentOS Linux) in large-scale, distributed production set-ups, and equivalent experience with Windows-based set-ups.
- Experience working with the OpenStack platform (COA certification is a plus).
- Experience in Kubernetes deployment and administration (CKA certification is a plus).
- Experience in centralized management systems (Puppet, Canonical Landscape).
- Experience with centralized logging management tools (Splunk, ELK, Fluentd).
- Experience in using Terraform to apply Infrastructure as Code.
- Experience in automating configuration management using Ansible.
- Experience in writing scripts for automating infrastructure tasks (Python, shell script…).
- Experience designing, operating, maintaining and troubleshooting Ceph clusters is a plus.
- Experience in writing automation pipelines (Argo Workflows, GitHub Actions…) is a plus.
- Understanding of Continuous Integration and Continuous Deployment tools (Jenkins, Bamboo, ArgoCD, …) and practices (deployment strategies, microservice patterns, …).
- Understanding of enterprise-level virtualisation (VMware, KVM).
- Advanced knowledge of internet services and networking (DNS, email with Postfix, HAProxy, …).
- Ability to think critically and systematically about the system’s requirements and how they can be met.
- Ability to create detailed and well-structured LLD documentation.
- Experience using different design methodologies (e.g. SD, OOD, SOA, DDD, CBD, MDD).
- Knowledge of design frameworks (e.g. SDN, microservices, service-oriented, containerization or event-driven architectures) and the ability to apply best practices to create efficient and scalable designs.
- Strong documentation skills, including the ability to use diagrams and technical writing.
- Strong analytical and problem-solving skills. Ability to assess trade-offs, anticipate potential issues and recommend mitigation strategies.
- Strong attention to detail, ensuring all essential components, configurations and specifications are accurately covered. Able to identify potential risks or oversights and ensure that the design is complete and comprehensive.
- Strong collaboration, communication and interpersonal skills, and the ability to work with cross-functional teams, stakeholders and subject matter experts. Ability to actively participate in discussions, understand requirements and communicate design decisions effectively.
- Strong time management skills in handling multiple design projects simultaneously. Able to prioritize tasks effectively, meet deadlines and manage the workload efficiently.
- Experience guiding and mentoring junior engineers, conducting knowledge-sharing sessions or providing technical leadership in design-related initiatives.
- Experience working in an ITIL-based organization.
Job Category: IT Operations
Job Type: Full Time
Job Location: Malaga, Madrid