Senior DevOps Engineer
Anthology offers the largest EdTech ecosystem on a global scale, supporting over 150 million users in 80 countries. Our mission is to provide dynamic, data-informed experiences to the global education community so that learners and educators can achieve their goals.
We believe in the power of a truly diverse and inclusive workforce. As we expand globally, we are committed to making diversity, inclusion, and belonging a foundational part of not only our hiring practices but who we are as a company.
For more information about Anthology and our career opportunities, please visit www.anthology.com.
As a member of the Site Reliability Engineering team, you will combine software and systems engineering to help build and run large-scale, distributed, and fault-tolerant systems. This is a driven, creative, and energetic team that works in a flexible and agile fashion to deliver world-class products to the education market. You will become a core contributing member of the Site Reliability Engineering team, delivering eLearning services to over a thousand clients, comprising almost four million users worldwide.
Specific responsibilities will include:
- Researching, designing, developing, and documenting solutions; implementing solutions for fault tolerance, performance, capacity, and configuration management for various data center operations
- Overseeing the implementation of these solutions under the direction of an SRE Manager and providing technical input as required
- Advancing enterprise security through customization of systems, automation of processes, and collaboration with product teams
- Providing research, evaluation, and expert judgement for product security planning
- Being accountable for the stability and robustness of the platform and all production deployments
- Identifying production stability concerns via break point, vulnerability scanning, and impact analysis, and designing, developing, and implementing remediation plans to address these concerns
- Designing, developing, and implementing documentation or tools to facilitate Technical Support team responsibilities
- Mentoring the Site Reliability Engineering team on technical aspects of supported applications
- Designing, developing, and implementing documentation or tools appropriate to the maintenance of application platforms
- Engaging with development teams on the design, deployment, capacity needs, and operations of microservices, and supporting them as they transition to production
- Monitoring the availability, performance, and health of production systems in support of meeting service level objectives
- Using automation and tooling to continuously improve the reliability, scalability, and velocity of services deployed on AWS
- Providing support for issues escalated from the Client Engagement Support team and interfacing with development teams to hand off application issues
- Participating in emergency incident response on-call rosters; practicing blameless postmortems that lead to improvements in resiliency and reductions in pager fatigue
- Identifying opportunities for further automation of activities; looking for synergies within the production environment from the perspective of complexity, cost, and maintenance activity reduction
- Executing and maintaining internal SLAs developed with business stakeholders
Qualifications:
- Experience in the fields of Computer Science, Software Engineering, or a related field
- Experience researching, deploying, and supporting information security systems
- Experience consulting with internal teams on the technical implementation and support of information security controls
- Experience engineering cloud-based information security systems
- Expertise in analyzing and troubleshooting large-scale, multi-region deployments in a public cloud (e.g., AWS)
- Experience with cloud deployment and management tools (e.g. Terraform, Chef)
- Ability to solve complex problems, optimize code, and automate routine tasks
- BS in Computer Science or related field, or equivalent work experience
- Experience with Kubernetes
- Demonstrable scripting experience, preferably in PHP or Ruby
- Experience with network and/or application security
- Prior experience within the education industry and/or with e-learning technologies
We have an office in one of the biggest cultural, economic, and educational centers in South India: Chennai.
- Located on OMR, the IT corridor of South Chennai
- Easy access to Velachery, Thiruvanmiyur Railway station and bus stop
- Very close to Tidel Park, Ascendas, and SRP Tools – Holiday Inn
- Office provides lunch and snacks on all working days
- Office is situated behind Hotel Turyaa on the 5th floor of Rayala Techno Park
- Fun Committee, Happy Fete Team, Food Committee, and Sports Committee ensure fun at work
- ISR Team actively engages employees in contributing to various local charities
This job description is not designed to contain a comprehensive listing of activities, duties, or responsibilities that are required. Nothing in this job description restricts management's right to assign or reassign duties and responsibilities at any time.
Anthology is an equal employment opportunity/affirmative action employer and considers qualified applicants for employment without regard to race, gender, age, color, religion, national origin, marital status, disability, sexual orientation, gender identity/expression, protected military/veteran status, or any other legally protected factor.
Req ID: 497