[SVLUG-Jobs] linux sys admin at well funded start-up in redwood city

Marshall Choi marshall.choi at zuora.com
Mon Nov 8 12:49:50 PST 2010


*Position: Site Reliability Engineer*

*Location: Redwood City, CA*

The Site Reliability Engineering (SRE) team is responsible for the ongoing
management of Zuora’s elastic compute infrastructure in a highly secure
environment. The team maintains service levels for uptime and response time,
as well as production escalation procedures. It also maintains capacity plans
and triggers actions to meet anticipated and spontaneous compute demands. The
successful candidate will have a strong background in at least three of the
following: UNIX systems administration, network administration, storage
systems, and database administration.

*Responsibilities*

   - Take ultimate responsibility for maintaining service level agreements
   on uptime and response time, mobilizing resources as needed.
   - Understand how each software component, system design, and
   configuration links together to form an end-to-end solution.
   - Serve as the technical escalation point for all production issues, and
   communicate in a timely manner to keep all stakeholders informed during
   and after escalation.
   - Maintain capacity plans and trigger procedures to meet anticipated or
   spontaneous processing demands while keeping to service level agreements.
   - Implement monitoring, alerts, and escalation management.
   - Communicate directly with customer counterparts to resolve technical
   issues.
   - Serve as liaison to other Operations teams and Engineering to resolve
   outstanding issues.
   - Drive or participate in technical design reviews and operational
   acceptance exercises for new and existing services.
   - Ensure operational readiness of all data centers, including the
   disaster recovery site.

*Requirements*

   - BS or MS degree in Computer Science, Engineering, or a related
   technical discipline.
   - Minimum of 6 years of experience administering Linux systems, storage
   systems, networks, and databases in a large-scale production environment,
   especially in a Software-as-a-Service environment.
   - Advanced knowledge of Linux, TCP/IP, web services, high-availability
   configurations, and load balancing.
   - Deep understanding of building and managing large-scale systems and
   application architectures.
   - Experience with virtualization and storage systems such as SAN, NAS,
   and distributed storage on commodity hardware.
   - Experience coding in one of the following languages: Shell, Ruby,
   Python, or Perl.
   - Proficiency in one or more of the following network management systems
   and monitoring tools: Nagios, Ganglia, Cacti, Splunk, Zabbix.
   - Experience with high-availability and large-scale MySQL administration.
   - Experience with an application stack based on Java, Tomcat or other app
   servers, and MySQL.
   - Strong organization and multi-tasking abilities.
   - Solid verbal and written communication skills.
   - Proven ability to quickly learn and implement unfamiliar technologies.



Local applicants preferred.



To apply, email your resume to marshall.choi at zuora.com and include the job
title in the subject line.



Zuora is an Equal Opportunity Employer.



No third-party applications accepted.


