Jose Roche
Proactive and goal-oriented Global Operations and Cloud Technology leader with extensive experience running end-to-end Operations and Technology departments and production systems in global, highly demanding markets. Manage diverse customers and support cross-functional teams in charge of company strategy, daily operations, support, troubleshooting and maintenance across areas such as Cloud Operations, Security, Technology and Service Delivery. Plan and lead operational and technical strategies, shaping and sharing best practices across all teams and promoting a continuous improvement mindset. Experienced in managing business compliance with internal and external regulations and commitments in customer-focused organizations, delivering high-quality services to external and internal customers. Establish control processes to ensure all SLAs and KPIs are consistently met.
Experience
Technical Account Manager
Customer advocate in charge of providing technical and operational guidance to help customers develop best-in-class cloud strategies to plan, migrate, operate and reinvent in AWS. I serve as a subject matter expert driving discussions across the organization’s teams (Developers, Operations, Security and Senior Leadership executives, among others) to guide and help them plan, deploy and build cloud-native solutions that are highly available, scalable and reliable using best practices, while proactively helping them keep their environments secure and in good health.
- I am an experienced migration expert who has successfully guided multiple customers through developing strategies and migrating to AWS, using a variety of techniques, including:
- Rehosting with tools like AWS Migration Services and AWS DMS
- Replatforming with services such as RDS, EKS, and EFS
- Re-architecting with tools and services like Kubernetes, API GW, Lambda, EventBridge, ECS, and Fargate.
- My efforts have resulted in the migration of:
- Over 7,000 servers
- 500 applications
- Sunsetting of 8 data centers.
- Helped my customers reduce their AWS monthly recurring spend by over $100,000.
- I have extensive experience designing, deploying and operating various services, including Kubernetes-based container applications, serverless event-driven systems, and AWS cloud infrastructure, using the AWS SDK, CDK and CI/CD pipelines and following Infrastructure as Code (IaC) principles. I use this experience to show customers how to build and maintain well-architected, highly resilient workloads on AWS.
- I’m a subject matter expert in the resilience area of depth, where I hold workshops and other engagements to help AWS customers design, test and ensure that their applications running on AWS are reliable, resilient and highly available, and that the teams operating them have developed the right processes, playbooks and experience to avoid or operate through disaster scenarios.
- I’m currently leading the expansion of this offering to the LATAM region by creating the processes, training and onboarding practitioners and customers from the region. This is highly visible work recently highlighted in a company-wide all-hands.
- As a Certified Kubernetes Administrator (CKA) and experienced infrastructure operations engineer, I help internal teams and customers’ teams follow best practices by holding workshops and architecture and operational reviews, participating in correction of errors discussions, and leading other engagements on deploying Kubernetes/EKS and managing infrastructure as code using the AWS Cloud Development Kit (CDK) and the AWS Python SDK (boto3).
- As an experienced operations engineer, I regularly help customers troubleshoot issues impacting their cloud operations so they can identify, resolve and learn from them and avoid future impact. I help them create post-mortems, runbooks and playbooks, design tests, and identify code changes that improve their architecture and operational excellence, encouraging a continuous learning mindset.
- Lead internal discussions within AWS on the adoption of Infrastructure as Code using the AWS CDK, and actively help others troubleshoot issues when writing and deploying infrastructure with the CDK.
- I’m experienced in cost optimization strategies in AWS and have guided my customers to save over $100,000 in Monthly Recurring Costs by implementing best practices in cloud financial management as well as following cloud design best practices.
Service Delivery Leader
I led various global operations with over $7 million in annual budgets. This included leading over 10 teams of more than 50 technology and non-technology members across multiple countries, responsible for the day-to-day execution, production, operations, deployment, service delivery and performance of the Operations areas that produce clients’ deliverables.
- I created and owned one of the fastest-growing internal applications, deployed in over 15 countries to help our global operations team observe and manage our reference data systems. The application manages thousands of critical reference systems across countries in LATAM, Europe and APAC. It was considered a gold standard and was advocated by executives as part of the onboarding process for every new operation.
- Led the redesign of the monolithic application following a service-oriented architecture (SOA) and migrated to containers to be deployed to Kubernetes clusters to create a scalable architecture capable of being expanded to additional countries.
- I mentored and coached direct reports, and other members of the international operations teams, working with them to identify, engage and develop their careers through check-ins, talent/performance reviews, action planning and annual summaries. I was directly involved in the approval, promotion and relocation of dozens of members of our international operations teams.
- Implemented and managed quality metrics and quality controls, taking remedial action to correct any problem that could impact quality or Service Level Agreements (SLAs), and leading and encouraging teams to engage in a continuous process of automation, testing and optimization of all our production workloads and processes.
- I led the creation of regional change control boards (COBs) staffed by a rotating governing body of operations team members who reviewed each other's plans for changes and new deployments, similar to pull request reviews.
- This helped us reduce the number of P0 incidents caused by errors introduced in changes by 80%.
- I led the recovery of the Puerto Rico (PR) operation after Hurricane Maria in 2017, one of the biggest natural disasters to impact one of our global operations.
- The PR operation received high praise from our customers and leadership for the design I created and implemented. Thanks to techniques such as fault isolation, cell-based architecture, shuffle sharding and multi-site deployment, our data centers and operations never went offline, even though the island was without power for a prolonged period and our data center was without power for over two (2) months.
- I created and implemented the strategy to recover our sample panel of homes and provided our customers and leadership with continuous updates that helped them understand the state of the sample and the recovery of the island, earning praise and trust from our customers and the market.
- Met regularly with clients at all levels, including executives (VPs, C-level), across the industry to provide updates on our execution and operational health as well as future plans and projects.
- Supported commercial teams across different countries in growing our businesses by providing operational support during client negotiations.
- Presented at industry events such as conferences, product launches and seminars, as needed, to support the business as a subject matter expert in Nielsen Operations.
Skills
- Amazon Web Services (AWS) Cloud
- Infrastructure as Code: AWS CDK, Boto3 and CloudFormation
- Kubernetes
- Linux
- Networking
- Cloud Migration, Design, Automation and Operations
- Cloud Cost Optimization
- Cloud Automation with Python (Boto3) and Node.js
- CI/CD: GitHub, GitHub Actions, Jenkins, AWS CodePipeline
- Serverless Technologies: AWS Lambda, DynamoDB, EventBridge, SNS, SQS, AWS Step Functions
- Automating the creation of S3 Incomplete Multipart Upload Lifecycle Rules to Optimize Cost in AWS
- Automating the creation of S3 Incomplete Multipart Upload Lifecycle Rules to Optimize Cost Using an Event-driven Architecture
- Saving Multiple Values within a single Parameter String in AWS Systems Manager Parameter Store