Sr. Site Reliability Engineer
Company: NBCUniversal
Location: Beverly Hills, CA
Posted on: May 29, 2023
Job Description:
NBCUniversal owns and operates over 20 different businesses
across 30 countries including a valuable portfolio of news and
entertainment television networks, a premier motion picture
company, significant television production operations, a leading
television stations group, world-renowned theme parks and a premium
ad-supported streaming service.
Here you can be your authentic self. As a company uniquely
positioned to educate, entertain and empower through our platforms,
Comcast NBCUniversal stands for including everyone. We strive to
foster a diverse and inclusive culture where our employees feel
supported, embraced and heard. We believe that our workforce should
represent the communities we live in, so that together, we can
continue to create and deliver content that reflects the current
and ever-changing face of the world. Visit
https://corporate.comcast.com/values/diversity-equity-inclusion/our-impact
to learn more about Comcast NBCUniversal's commitment and how we
are making an impact.
Create solutions to improve performance, scalability, and
reliability (15%).
Develop scripts for monitoring tools, maintenance, and code
deployment on servers.
Make design recommendations for internal software and hardware
projects.
Attend technical meetings and execute strategic and tactical
plans.
Fine-tune applications and databases so that they run efficiently
and perform well.
Contribute to developing deployment tools that are smart enough to
understand the difference between old and new code and can stop
the services on servers gracefully before changes are made on the
servers. This is achieved with a combination of Python and Ansible
working together.
Work closely with development teams to support continuous
integration and deployment (15%).
Integrate automation tools such as Foreman with homegrown tools to
automate the server provisioning process.
Make sure the correct software version is deployed on all servers.
We rely on the software version report received from monitoring
tools and act on any discrepancies found in it.
Work closely with developers on resolution if any new changes cause
performance degradation.
Solve problems quickly and automate processes (15%).
Develop and maintain containers with the help of Mesos, develop
Docker images, and maintain a private Docker registry to store
them.
Monitor site stability and performance (15%).
Monitoring tools play a significant role in the job, as we rely on
them to detect service issues and alert us in a timely manner.
Work on production-impacting alerts received from various
monitoring tools.
Test and integrate new tools such as Nagios, PagerDuty, Grafana,
ELK, and Splunk into the environment to monitor and detect issues
quickly.
Perform patching on production servers and containers to keep the
operating system and applications safe and up to date.
Work on external monitoring services such as Dynatrace and Website
Pulse to detect customer-impacting issues and submit fixes
swiftly.
Provide analytical ad hoc support (15%).
Work with various third parties to integrate their services with
Vudu, which requires understanding how their services work and
making appropriate changes on our end.
Deploy the latest software on production servers to push out new
features and promotions.
Work on securing services by deploying/renewing SSL certs on
servers/services.
Manage user/group activity and provide limited, appropriate access
to users/groups with the help of Active Directory.
Work on identifying the root cause of issues observed in monitoring
tools and submitting fixes for them.
24x7 weekly on-call rotation (25%).
Work closely with other teams such as development and QA to
resolve issues noticed in production and to make sure proper
solutions and monitors are in place to prevent recurrence.
Create new monitor plugins in Nagios; adjust monitor thresholds to
avoid false positive alerts; and create new alerts, reports, and
dashboards in Splunk as we bring up new services/features in Vudu.
Catch database issues, set up replication among DB clusters, and
fine-tune them for better performance. As part of the disaster
recovery (DR) plan, support the DR site by creating replication
between the active and standby data centers.
Distribute traffic across web servers and database servers by
configuring load balancers, creating virtual IPs, and binding them
to the respective servers.
Perform various activities on web servers to ensure our main
service is secure and available to our customers. This is achieved
by administering and maintaining Apache and Nginx, including
installing/updating SSL certs, enabling certain TLS (Transport
Layer Security) versions, loading modules, and limiting request
size.
This position is eligible for company sponsored benefits, including
medical, dental and vision insurance, 401(k), paid leave, tuition
reimbursement, and a variety of other discounts and perks. Learn
more about the benefits offered by NBCUniversal by visiting the
Benefits page of the Careers website.
Salary Range: $155,000-$165,000
Bachelor's degree in Computer Science, Information Systems/IT,
Software Engineering or closely related technical field (or foreign
equivalent) and 5 (five) years of experience in the job offered or
closely related occupation, OR a Master's degree in Computer
Science, Information Systems/IT, Software Engineering or closely
related technical field (or foreign equivalent) and 2 (two) years
of experience in the job offered or closely related occupation.
Special Requirements:
Must possess expertise/knowledge sufficient to adequately perform
the duties of the job being offered. Expertise/knowledge may be
gained through employment experience or education. Such
expertise/knowledge cannot be "quantified" by "time."
Required expertise/knowledge includes:
Experience with Continuous Integration and Continuous Delivery
practices.
Understanding of DevOps principles, experience with operational
tools (Ansible, Puppet, Terraform) and best practices for
infrastructure and software deployment.
Experience with Docker containerization.
Operational experience at large-scale web sites.
Scripting skills in at least two languages (Bash, Python, Ruby,
Perl, etc.).
Knowledge of monitoring and metrics.
Background with relational databases and database administration,
with MySQL and Postgres.
Working knowledge of NoSQL data stores (MongoDB, Cassandra,
DynamoDB, Couchbase, Postgres, etc.).
Systems engineering experience on Linux and/or Windows platforms.
Experience using AWS in a production environment.
JOB LOCATION:
Fandango Media, LLC
407 N Maple Dr., Beverly Hills, CA 90210
40 hours per week / If offered employment, must have legal right to
work in U.S.
*Telecommuting/Remote work permitted 100% of the time from anywhere
in the United States
CONTACT:
Qualified applicants please send resume to Elsbeth Velasco at
elsbeth.velasco@nbcuni.com
**Must reference JOB CODE# AV23PN when applying.
This notice is being provided as a result of the filing of a
permanent alien labor certification application for this job
opportunity. Any person may provide documentary evidence bearing on
the application to the Certifying Officer of the Department of
Labor at:
U.S. Department of Labor
Employment and Training Administration
Office of Foreign Labor Certification
200 Constitution Avenue, NW
Room N-5311
Washington, DC 20210
Phone: 202-693-8200
NBCUniversal's policy is to provide equal employment opportunities
to all applicants and employees without regard to race, color,
religion, creed, gender, gender identity or expression, age,
national origin or ancestry, citizenship, disability, sexual
orientation, marital status, pregnancy, veteran status, membership
in the uniformed services, genetic information, or any other basis
protected by applicable law. NBCUniversal will consider for
employment qualified applicants with criminal histories in a manner
consistent with relevant legal requirements, including the City of
Los Angeles Fair Chance Initiative For Hiring Ordinance, where
applicable.
If you are a qualified individual with a disability or a disabled
veteran, you have the right to request a reasonable accommodation
if you are unable or limited in your ability to use or access
nbcunicareers.com as a result of your disability. You can request
reasonable accommodations in the US by calling 1-818-777-4107 and
in the UK by calling +44 2036185726.