Location: Palo Alto, CA, USA
Facebook is seeking talented operations engineers to join the Site Reliability Engineering team. The ideal candidate will have
strong communication skills, a passion for tinkering with Linux, and an almost insane fondness for fast-paced, seat-of-your-pants
troubleshooting and crisis management. The position is full-time and is based in our main office in downtown Palo Alto.
Responsibilities
• Monitor the stability and performance of the website
• Remotely troubleshoot and diagnose hardware problems
• Debug issues with Linux software, applications and network
• Resolve technical challenges encountered in LAMP technologies
• Develop and maintain monitoring tools and automation systems
• Predict and respond to utilization variances across multiple datacenters
• Identify and triage all outage-related events
• Facilitate communication, coordinate escalation, and work with subject matter experts to implement critical fixes
• Automate and streamline processes
Requirements
• 2-3+ years of Linux support/sysadmin experience in an Internet operations environment
• BA/BS in Computer Science or a related field, or equivalent experience
• Working knowledge of Linux, TCP/IP, Apache and MySQL
• Experience working with network management systems and monitoring tools, such as Nagios, Ganglia and Cacti
• Competency in Shell, PHP, Perl or Python. C is a plus
• Solid understanding of web services architecture and commonly employed technologies
• A sense of urgency in responding to and resolving critical issues that relate to the performance of the site and/or core
infrastructure
• Excellent verbal and written communication skills
• Participation in a shifted coverage schedule, including occasional evening shifts
Relocation assistance is available.
To apply, submit your Resume/CV to marshallchoi@facebook.com
Facebook is an Equal Opportunity Employer.
No third-party applications accepted.