
Oversee the execution of disaster recovery drills and exercises and simulate incidents to diagnose and resolve escalated data centre-related incidents. Develop disaster recovery plans for data centre operations. Oversee resolution of data centre-related incidents involving vendors and analyse incidents to determine patterns and propose recommendations to prevent future occurrences. Optimise the interfaces between the IT equipment and the data centre. Identify best practices in data centre operations and management for adoption and recommend enhancements to improve availability and performance. Conduct short- and long-term planning to meet the organisation's requirements and business needs. Conduct technical feasibility studies to determine viability, cost, time required and compatibility with organisational needs and requirements. Manage the development of service-level objectives and targets. Monitor service-level objectives to ensure that requirements are met or exceeded.
Monitor and manage the health and performance of core telecom infrastructure (servers, switches, routers, firewalls, load balancers, etc.). Ensure 24/7 uptime of infrastructure components through proactive monitoring and incident management. Perform regular maintenance, upgrades, and patching of hardware and software systems. Collaborate with network engineers, system admins, and vendors to troubleshoot and resolve issues. Maintain detailed documentation of infrastructure configurations, standard operating procedures, and incident reports. Implement automation and scripting to improve operational efficiency and reduce manual tasks. Conduct capacity planning, system audits, and root cause analysis for recurring incidents. Manage physical and virtual infrastructure in on-premises data centers and/or cloud environments.
Conduct scheduled tests on systems and monitor performance to ensure stable operations. Oversee monitoring activities to maintain system stability and resolve downtime or malfunctions. Develop and monitor service-level objectives to ensure they meet or exceed requirements. Create client satisfaction metrics and service procedures, proposing recommendations for improvement. Install software and hardware equipment for users, conducting user acceptance tests on new setups. Carry out feasibility studies for implementing new solutions and ensuring integration compatibility. Evaluate past incidents, prepare reports and document findings for senior stakeholders. Classify incidents for escalation, provide support and recommendations to affected teams. Analyse technical incidents and provide third-line support to resolve issues effectively. Ensure seamless integration and continued operation of systems to minimise disruptions.
Design, develop, and implement IT and network systems, including servers, networks, cloud-based platforms, and communication tools. Manage the installation and configuration of network/IT systems, ensuring proper integration with existing infrastructure and alignment with organizational requirements. Deploy software, hardware, and networking components, coordinating with cross-functional teams to ensure successful implementation. Continuously monitor network/IT systems for performance, reliability, and capacity, identifying potential issues and proactively resolving them. Implement and configure network monitoring tools to track system health, bandwidth utilization, response times, and error rates. Provide technical support for network/IT systems, responding to and resolving issues related to system performance, availability, and functionality.
Undertake complex projects related to system provisioning, installations, configurations, as well as monitoring and maintenance of systems. Apply highly developed specialist knowledge and skills in systems administration and work toward continuous optimisation of system performance. Implement system improvements and instruct other IT staff in the resolution of the most complex issues. Maintain and troubleshoot communication infrastructure, servers, and application environments. Document support activities, system changes, and troubleshooting procedures.
