Statistical Genetics Platform Engineer
Company: Eli Lilly and Company
Location: Boston
Posted on: February 27, 2026
Job Description:
At Lilly, we unite caring with discovery to make life better for
people around the world. We are a global healthcare leader
headquartered in Indianapolis, Indiana. Our employees around the
world work to discover and bring life-changing medicines to those
who need them, improve the understanding and management of disease,
and give back to our communities through philanthropy and
volunteerism. We give our best effort to our work, and we put
people first. We’re looking for people who are determined to make
life better for people around the world.

The Lilly research environment is evolving to centralize the access
and analysis of human genetic data. This new initiative will work to
define data, tools and processes to provide the therapy area teams
with key evidence for target evaluation and target discovery.

Many different therapy
areas across Eli Lilly focus on new therapeutic approaches for the
treatment of many different diseases. Starting from an idea, we
work with partners across Lilly to discover and develop novel
biologic, small molecule and nucleic acid-based therapeutics. Our
focus is the patient: by understanding the biology and
pathophysiology underlying disease states, we aim to address the
root cause of disease and develop breakthrough therapies. We have
one of the strongest pipelines in the industry and a track record
of delivering impactful medicines that improve people’s lives.

In this hands-on role, the Statistical Genetics Platform Engineer will
join a team that enables statistical geneticists to derive
scientific insights from internal and external human genetic data,
with the ultimate purpose of enabling data-driven decision-making
within the organization. The successful candidate will collaborate
with team members and also with data engineers and platform
architects across the Lilly research environment. The goals of the
collaboration will include identifying genetically-based disease
targets, finding potential expanded clinical indications for
existing assets, classifying and validating patient subpopulations,
and understanding disease mechanisms. The role will support these
goals by developing robust computational pipelines that leverage
harmonized clinical datasets.

This role is a great opportunity to
be at the forefront of scientific exploration in a dynamic research
field. Interested in working on an innovative team focusing on
providing clear evidence for therapeutic targets? Apply today!

Key Responsibilities:
- Design and implement robust, scalable computational pipelines for
  statistical genetics analyses, including workflows for GWAS,
  polygenic risk scores, fine-mapping, colocalization and variant
  annotation
- Develop and maintain platform tools and APIs that enable
  researchers to efficiently process genomic data at scale (biobanks,
  population cohorts, multi-omics datasets)
- Build infrastructure for reproducible research, including
  containerization, workflow orchestration, and version control for
  analytical pipelines
- Optimize computational performance of statistical genetics
  algorithms and implement distributed computing solutions for
  large-scale analyses
- Collaborate with statistical geneticists and computational
  biologists to translate methodological innovations into
  production-ready software
- Establish best practices for data access, quality control,
  validation, and documentation across genomic analysis pipelines
- Maintain and improve existing codebases, ensuring code quality,
  testing coverage, and comprehensive documentation
- Monitor platform performance, solve issues, and implement
  improvements based on user feedback and evolving research needs
- Support the integration of AI-based tools and required MLOps
  infrastructure

Basic Requirements:
- Master’s in Computer Science, Statistical Genetics, Bioinformatics
  or related field and 6 years post-Master’s experience (in industry
  or large-scale non-academic institutions, e.g. Broad, NIH), OR
- PhD in Computer Science, Statistical Genetics, Bioinformatics or
  related field and 3 years post-PhD experience (in industry or
  large-scale non-academic institutions, e.g. Broad, NIH)

Key Requirements:
- Strong programming skills in languages commonly used in genomics
  research (Python, R)
- Demonstrable understanding of statistical genetics concepts
  including GWAS, heritability estimation, genetic correlation, rare
  variant analysis, and population structure
- Experience using standard tools and formats for genetic data (VCF,
  BGEN, PLINK, BAM/CRAM) and genomic databases
- Proficiency with workflow management systems (Nextflow,
  Cromwell/WDL) and containerization technologies
- Experience with high-performance computing environments, cloud
  platforms (AWS, GCP, Azure), or distributed computing frameworks
- Strong problem-solving abilities and attention to detail in
  handling complex biological datasets
- Ability to prioritize and manage multiple competing priorities
  within a fast-paced environment

Additional Skills/Preferences:
- Demonstrated track record performing end-to-end analysis of human
  genetic data
- Familiarity with operationalizing statistical genetics tools like
  plink, ADMIXTURE, regenie, VEP, LDSC, FINEMAP, SuSiE, coloc, METAL,
  LDpred2, MAGMA, rvtest (rare variants), SNP-int GPU (epistasis)
- Experience performing complex analyses in cloud-based environments
  required; prior experience with DNAnexus and/or Databricks is
  preferred
- Experience with large-scale biobanks and their trusted research
  environments is preferred
- Experience querying data for analysis through SQL (e.g.
  PostgreSQL), NoSQL (e.g. Elasticsearch), data stores (e.g. Hail),
  graph databases (e.g. Neo4j) and file storage (e.g. S3)
- Experience working with additional data formats, including RNA-seq,
  metabolomic, and proteomic data
- Experience with protein language models and/or sequence language
  models
- Experience working with clinical data
- Experience working with electronic health record data
- Knowledge of ADaM and OMOP formats
- Strongly team-oriented with a customer-focused design thinking
  approach
- Knowledge of the drug development process and how genomics data is
  used to impact these areas

Lilly is dedicated to helping individuals with disabilities
to actively engage in the workforce, ensuring equal opportunities
when vying for positions. If you require accommodation to submit a
resume for a position at Lilly, please complete the accommodation
request form
(https://careers.lilly.com/us/en/workplace-accommodation) for
further assistance. Please note this is for individuals to request
an accommodation as part of the application process and any other
correspondence will not receive a response. Lilly is proud to be an
EEO Employer and does not discriminate on the basis of age, race,
color, religion, gender identity, sex, gender expression, sexual
orientation, genetic information, ancestry, national origin,
protected veteran status, disability, or any other legally
protected status. Our employee resource groups (ERGs) offer strong
support networks for their members and are open to all employees.
Our current groups include: Africa, Middle East, Central Asia
Network, Black Employees at Lilly, Chinese Culture Network,
Japanese International Leadership Network (JILN), Lilly India
Network, Organization of Latinx at Lilly (OLA), PRIDE (LGBTQ
Allies), Veterans Leadership Network (VLN), Women’s Initiative for
Leading at Lilly (WILL), enAble (for people with disabilities).
Learn more about all of our groups. Actual compensation will depend
on a candidate’s education, experience, skills, and geographic
location. The anticipated wage for this position is $166,500 -
$266,200. Full-time equivalent employees also will be eligible for a
company bonus (depending, in part, on company and individual
performance). In addition, Lilly offers a comprehensive benefit
program to eligible employees, including eligibility to participate
in a company-sponsored 401(k); pension; vacation benefits;
eligibility for medical, dental, vision and prescription drug
benefits; flexible benefits (e.g., healthcare and/or dependent day
care flexible spending accounts); life insurance and death
benefits; certain time off and leave of absence benefits; and
well-being benefits (e.g., employee assistance program, fitness
benefits, and employee clubs and activities). Lilly reserves the
right to amend, modify, or terminate its compensation and benefit
programs in its sole discretion, and Lilly’s compensation practices
and guidelines will apply regarding the details of any promotion or
transfer of Lilly employees.
Keywords: Eli Lilly and Company, Boston, Statistical Genetics Platform Engineer, Science, Research & Development, Boston, Massachusetts