Principal Software Engineer, Data Infrastructure
Company: The New York Times
Location: New York City
Posted on: April 2, 2026
Job Description:
The mission of The New York Times is to seek the truth and help
people understand the world. That means independent journalism is
at the heart of all we do as a company. It’s why we have a
world-renowned newsroom that sends journalists to report on the
ground from nearly 160 countries. It’s why we focus deeply on how
our readers will experience our journalism, from print to audio to
a world-class digital and app destination. And it’s why our
business strategy centers on making journalism so good that it’s
worth paying for.

About the Role
We are seeking a Principal
Software Engineer to lead the architecture and evolution of our
data and machine learning infrastructure. This role will shape the
foundation on which data-driven products, analytics, and AI
applications are built. You will design systems that enable
large-scale data processing, reliable pipelines, and efficient
machine learning development—from feature engineering to real-time
model serving. As a principal engineer, you will partner with
product, data science, and platform teams to set technical
direction, drive adoption of reusable frameworks, and mentor
engineers across the organization. You will ensure that both data
and ML platforms are scalable, reliable, cost-efficient, and
compliant with privacy and governance standards.

The core of the Data Platform is a data lake on AWS S3 with Apache
Iceberg as the table format to ensure reliability. Data ingestion
is standardized through Confluent Kafka for real-time streaming and
Fivetran for ingestion of files and change-data. The transformation
layer is decoupled from storage, using Apache Flink for stream
processing, AWS Glue (Spark) for core ETL, and dbt/Athena for
building analytical data models. The platform serves data through
fit-for-purpose data stores, including Amazon DynamoDB for
low-latency applications and Google BigQuery as the primary engine
for analytics and BI.

This is a hybrid role based in our New York City headquarters,
reporting to the Sr. Director of Engineering. You can typically
expect to come into the office 2 days per week.

Responsibilities:
- Architect & Build Platform: Design and evolve infrastructure for
  data ingestion, storage, batch and streaming pipelines, and
  machine learning workflows
- Enable ML at Scale: Build systems for training, deploying,
  monitoring, and governing models, including feature stores,
  registries, and inference platforms
- Reliability & Observability: Ensure end-to-end system
  reliability, monitoring, and cost transparency across data and ML
  workloads
- Self-Service Platforms: Deliver frameworks and APIs that enable
  engineers, analysts, and ML scientists to build and operate
  solutions independently
- Innovation & Standards: Evaluate and introduce emerging
  technologies (vector databases, distributed training,
  orchestration frameworks, LLM stacks) and establish adoption
  guidelines
- Cross-Functional Leadership: Partner with platform, product, and
  engineering and ML science leaders to align on strategy and
  accelerate delivery
- Mentorship & Influence: Guide senior and staff engineers, lead
  architecture reviews, and raise the technical bar across data and
  ML domains
- Demonstrate support and understanding of our value of
  journalistic independence and a strong commitment to our mission
  to seek the truth and help people understand the world

Basic Qualifications:
- 10 years of software engineering experience with a focus on
  distributed systems, data platforms, and ML infrastructure or
  equivalent
- Proven ability to influence technical direction across multiple
  teams and mentor senior/staff engineers
- Proven expertise in data processing frameworks and table formats
  (e.g. Spark, Flink, Iceberg) and orchestration tools (e.g.
  Airflow, Kubeflow)
- Deep knowledge of ML infrastructure: model training pipelines,
  feature stores, registries, serving, and monitoring
- Strong programming skills in Python and at least one compiled
  language like Java or Go
- Experience designing systems with scalability, reliability, and
  cost-efficiency as first-class concerns
- Cloud platform experience (AWS, GCP), familiarity with Kubernetes
  and modern data platform architectures

Preferred Qualifications:
- Familiarity with compliance and governance in data/ML systems
  (auditability, privacy, explainability)
- Familiarity with the data lakehouse paradigm and medallion
  architecture

This role requires limited on-call hours. An
on-call schedule will be determined when you join, taking into
account team size and other variables.

LI-Hybrid REQ-018968

The annual base pay range for this role is $198,000 - $220,000 USD.

For roles in the U.S., dependent on your role, you may be
eligible for variable pay, such as an annual bonus and restricted
stock. Benefits may include medical, dental and vision benefits,
Flexible Spending Accounts (F.S.A.s), a company-matching 401(k)
plan, paid vacation, paid sick days, paid parental leave, tuition
reimbursement and professional development programs. For roles
outside of the U.S., information on benefits will be provided
during the interview process.

The New York Times Company is
committed to being the world’s best source of independent, reliable
and quality journalism. To do so, we embrace a diverse workforce
that has a broad range of backgrounds and experiences across our
ranks, at all levels of the organization. We encourage people from
all backgrounds to apply. We are an Equal Opportunity Employer and
do not discriminate on the basis of an individual's sex, age, race,
color, creed, national origin, alienage, religion, marital status,
pregnancy, sexual orientation or affectional preference, gender
identity and expression, disability, genetic trait or
predisposition, carrier status, citizenship, veteran or military
status and other personal characteristics protected by law. All
applications will receive consideration for employment without
regard to legally protected characteristics.

The U.S. Equal Employment Opportunity Commission (EEOC)’s Know Your
Rights Poster is available on the EEOC website.

The New York Times Company will provide
reasonable accommodations as required by applicable federal, state,
and/or local laws. Individuals seeking an accommodation for the
application or interview process should email
reasonable.accommodations@nytimes.com. Emails sent for unrelated
issues, such as following up on an application, will not receive a
response.

The Company encourages those with criminal histories to
apply, and will consider their applications in a manner consistent
with applicable "Fair Chance" laws, including but not limited to
the NYC Fair Chance Act, the Los Angeles Fair Chance Initiative for
Hiring Ordinance, the San Francisco Fair Chance Ordinance, the Los
Angeles County Fair Chance Ordinance for Employers, and the
California Fair Chance Act.

For information about The New York Times' privacy practices for job
applicants, click here.

Please
beware of fraudulent job postings. Scammers may post fraudulent job
opportunities, and they may even make fraudulent employment offers.
This is done by bad actors to collect personal information and
money from victims. All legitimate job opportunities from The New
York Times will be accessible through The New York Times careers
site. The New York Times will not ask job applicants for financial
information or for payment, and will not refer you to a third party
to do so. You should never send money to anyone who suggests they
can provide employment with The New York Times. If you see a fake
or fraudulent job posting, or if you suspect you have received a
fraudulent offer, you can report it to The New York Times at
NYTapplicants@nytimes.com. You can also file a report with the
Federal Trade Commission or your state attorney general.