Work & Research — Linda Petrini

Publications

Academic publications

Peer-reviewed papers and contributing authorships. Full list on Google Scholar.

Machine Learning · Conference

Locality and Compositionality in Zero-Shot Learning

First author. Published at ICLR 2020. Studied how locality and compositionality relate to generalisation in zero-shot learning.

ICLR 2020

Simulation · Conference

gradSim: Differentiable simulation for system identification and visuomotor control

Contributing author. Differentiable physics simulation framework. Published at ICLR 2021.

ICLR 2021

Ethics · Computer Vision

Ethics and Creativity in Computer Vision

Contributing author. Retrospective on ethics in creative applications of computer vision, drawing on a series of CVPR/ECCV/ICCV workshops I co-organised. Presented at the NeurIPS 2021 workshop on Machine Learning for Creativity and Design.

arXiv (2021)

Research communication contributions

Publications where I provided research support — editing, feedback, writing, and figures — rather than leading the underlying research.

AI Safety · Alignment

Alignment faking in large language models

Shows that frontier language models can strategically fake alignment during training. 264+ citations.

arXiv (2024)

AI Safety · Evaluations

SHADE-Arena: Evaluating sabotage and monitoring in LLM agents

Framework for evaluating sabotage behaviors and monitoring effectiveness in agentic LLM settings.

arXiv (2025)

AI Safety · Scaling

Inverse scaling in test-time compute

Analysis of cases where more test-time compute leads to worse model performance.

arXiv (2025)

AI Safety · Training

Enhancing model safety through pretraining data filtering

Methods for improving model safety by filtering pretraining data.

Anthropic (2025)

AI Safety · Classifiers

Cost-effective constitutional classifiers via representation re-use

Efficient safety classifiers for content filtering that cut cost by re-using model representations.

Anthropic (2025)

AI Safety · Elicitation

Unsupervised elicitation of language models

Methods for eliciting model capabilities without supervised examples.

arXiv (2025)

Reports & Analysis

Technical reports

Independent research and co-authored technical reports on AI safety, policy, and applications.

AI Safety · Policy

AI Pathways Report

An analysis of potential development trajectories for advanced AI systems and their implications for safety and governance.

Read the report

AI Safety · Foresight

Hyper Entities

A report exploring hyper-entities — novel organisational structures and agents emerging from advanced AI — produced for Foresight Institute's Existential Hope programme.

Read the report

AI Safety · Technology Mapping

Secure AI Tech Tree

A comprehensive map of the technical landscape for secure AI development, covering alignment, interpretability, and robustness. Produced for Foresight Institute.

Read the report

AI & Climate

AI & Climate Report (Bezos Earth Fund)

An extensive technical report on the intersection of AI and climate science, examining how machine learning can accelerate environmental research and action.

Read the report

Research support

Organisations I've worked with

Embedded research support, technical writing, and analysis for teams working on some of the hardest problems in AI.

Palisade Research · Research support

Foresight Institute · Technical writing

Research support

Bezos Earth Fund · Research & writing

In the 10 years I have been involved in hiring contractors for various technical writing at Foresight, Linda has been the best writer I've worked with, both in terms of the quality of result she delivers and in terms of working style. She brings structure to projects whose scope is rather unclear, before launching a diligent research process that often uncovers new information that shapes the trajectory of the project. She hits deadlines, is kind, patient, reliable and a great communicator. Feel free to contact me for more info.

Allison Duettmann

President, Foresight Institute

New projects

Open to new research collaborations

I take on a limited number of research and writing projects each year. If you're working at the intersection of AI safety, policy, and governance and need research support or a skilled technical writer, get in touch.
