Schneier on Security's Journal
 

Friday, July 25th, 2025

    1:15p
    Subliminal Learning in AIs

    Today’s freaky LLM behavior:

    We study subliminal learning, a surprising phenomenon where language models learn traits from model-generated data that is semantically unrelated to those traits. For example, a “student” model learns to prefer owls when trained on sequences of numbers generated by a “teacher” model that prefers owls. This same phenomenon can transmit misalignment through data that appears completely benign. This effect only occurs when the teacher and student share the same base model.

    Interesting security implications.

    I am more convinced than ever that we need serious research into AI integrity if we are ever going to have trustworthy AI.

    11:33p
    Friday Squid Blogging: Stable Quasi-Isodynamic Designs

    Yet another SQUID acronym: “Stable Quasi-Isodynamic Design.” It’s a stellarator for a fusion nuclear power plant.

