Site Reliability Engineer

Guadalajara, Mexico

We are seeking an experienced and highly motivated Site Reliability Engineer (SRE) to join our team. The ideal candidate will have a strong background in cloud infrastructure management, automation, and monitoring, with a specific focus on tools like Grafana and Terraform. In this role, you will collaborate with software engineering teams to build and maintain reliable, scalable, and secure infrastructure that supports our growing user base. This is an excellent opportunity to work on cutting-edge technologies while ensuring optimal system performance and reliability.

Essential functions

  • Design, implement, and maintain automated infrastructure using Terraform to manage cloud resources, ensuring scalability and reliability across multiple environments.

  • Build and maintain real-time monitoring dashboards and alerting systems using Grafana, ensuring that key metrics (e.g., uptime, latency, error rates) are tracked and reported across services.

  • Lead incident response efforts, troubleshoot issues, and perform root cause analysis (RCA) to minimize service downtime and improve system reliability.

  • Leverage automation tools to optimize infrastructure provisioning and management, reducing manual interventions and operational overhead.

  • Work closely with engineering teams to ensure that infrastructure supports application needs, such as performance, scalability, and high availability.

  • Continuously monitor, analyze, and improve system performance across production environments, ensuring minimal latency and optimal resource utilization.

  • Ensure infrastructure meets security best practices and compliance requirements.

  • Maintain clear and comprehensive documentation related to infrastructure, automation scripts, and system configurations.

Qualifications

  • Strong experience using Terraform for infrastructure provisioning and automation in cloud environments (AWS, Azure, GCP).

  • Proven experience using Grafana for creating monitoring dashboards, setting up alerts, and providing actionable insights into system health.

  • Strong background in automating operational workflows, especially using tools like Terraform and scripting languages (e.g., Python or Bash).

  • Familiarity with continuous integration and continuous delivery (CI/CD) pipelines and tools.

  • Proficiency in monitoring tools, especially Grafana, for creating and maintaining system observability solutions.

Would be a plus

  • Experience with Java or Python for scripting, automation, or software integration tasks.

  • Solid understanding of SQL and experience with performance tuning and managing databases (MySQL, PostgreSQL, or similar).

  • Familiarity with Quantum Metric for user session analytics and performance insights.

  • Experience with Splunk for log management and analysis.

  • Familiarity with Jira for issue tracking and project management.

  • Previous experience in an SRE or DevOps role within a fast-paced production environment.

  • Familiarity with New Relic or similar APM tools for monitoring application

We offer

  • 100% payroll scheme, benefits by law (IMSS, INFONAVIT, 12+ vacation days)
  • Benefits beyond the legal minimum: 50% vacation premium, 5 PTO days, 3 sick days, 10 guaranteed public holidays per year
  • Major medical insurance, plus dental and vision plans, for the employee and direct family members
  • Minor medical insurance (Multiservicios Médicos Santander) for the employee and direct family members
  • Life Insurance and funeral expenses
  • 5% savings fund, uncapped (matched by the company at the end of the year)
  • Grocery cards/vouchers (Vales de Despensa)
  • 30 days End of the Year Bonus (Aguinaldo)
  • Opportunity to work on bleeding-edge projects with a highly motivated and dedicated team all over the world
  • Individual career development plan and support from the best experts
  • Professional development opportunities (LinkedIn Learning, cloud certification programs, access to a corporate LMS integrated with other learning platforms)
  • Well-equipped office in a business area of Guadalajara (quiet room, games room, air hockey, PS5, Nintendo Switch and Xbox Series X, pool table, ping pong, snacks, smoothies, and much more)
  • Corporate social events (yoga, massages, sport tournaments, discussion panels, technical talks, lunch & learns)
  • Flexible working hours
  • Opportunity to relocate to another country where the company has offices

About us

Grid Dynamics (NASDAQ: GDYN) is a leading provider of technology consulting, platform and product engineering, AI, and advanced analytics services. Fusing technical vision with business acumen, we solve the most pressing technical challenges and enable positive business outcomes for enterprise companies undergoing business transformation. A key differentiator for Grid Dynamics is our 8 years of experience and leadership in enterprise AI, supported by profound expertise and ongoing investment in data, analytics, cloud & DevOps, application modernization and customer experience. Founded in 2006, Grid Dynamics is headquartered in Silicon Valley with offices across the Americas, Europe, and India.

Grid Dynamics is an equal opportunity employer. We are committed to creating an inclusive environment for all employees during their employment and for all candidates during the application process.

All qualified applicants will receive consideration for employment without regard to, and will not be discriminated against based on, age, race, gender, color, religion, national origin, sexual orientation, gender identity, veteran status, disability or any other protected category. All employment is decided on the basis of qualifications, merit, and business need.
