Justin's Linklog
The following are the titles of recent articles syndicated from Justin's Linklog
LJ.Rossia.org makes no claim to the content supplied through this journal account. Articles are retrieved via a public feed supplied by the site for this purpose.
| Thursday, January 15th, 2026 |
| 1:41 pm |
Why people believe misinformation even when they’re told the facts
"Factchecking is seen as a go-to method for tackling the spread of false information. But it is notoriously difficult to correct misinformation. Evidence shows readers trust journalists less when they debunk, rather than confirm, claims.
The work of media scholar Alice Marwick can help explain why factchecking often fails when used in isolation. Her research suggests that misinformation is not just a content problem, but an emotional and structural one:
[Marwick] argues that it thrives through three mutually reinforcing pillars: the content of the message, the personal context of those sharing it, and the technological infrastructure that amplifies it:
- People find it cognitively easier to accept information than to reject it, which helps explain why misleading content spreads so readily;
- When fabricated claims align with a person’s existing values, beliefs and ideologies, they can quickly harden into a kind of “knowledge”. This makes them difficult to debunk;
- [When social media platforms] prioritise content likely to be shared, making sharing effortless, every like, comment or forward feeds the [misinformation] system. The platforms themselves act as a multiplier."
Tags: misinformation disinformation alice-marwick research psychology social-media fake-news information debunking facts factchecking
| 9:56 am |
A better way to limit Claude Code (and other coding agents!) access to Secrets
Bubblewrap, a Linux CLI tool which uses namespaces to sandbox a specific command (and its subprocesses):
Bubblewrap lets you run untrusted or semi-trusted code without risking your host system. We’re not trying to build a reproducible deployment artifact. We’re creating a jail where coding agents can work on your project while being unable to touch ~/.aws, your browser profiles, your ~/Photos library or anything else sensitive.
Very nice, I hadn't heard of this tool before. The rest of the blog post details how to use it to isolate Claude Code specifically.
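As a rough illustration of the idea (not the linked post's actual setup), here is a minimal Python sketch that assembles a `bwrap` command line. The helper name `build_bwrap_command`, the choice of mounts, and the `claude` command are assumptions for illustration; the flags themselves (`--unshare-all`, `--share-net`, `--ro-bind`, `--bind`, `--tmpfs`, `--symlink`, `--proc`, `--dev`, `--chdir`, `--die-with-parent`) are standard bubblewrap options. The symlinks assume a merged-/usr distro layout, and a real setup will likely need more mounts than this.

```python
import os


def build_bwrap_command(project_dir, agent_command):
    """Return an argv list that runs agent_command inside a bubblewrap jail.

    Hypothetical helper: only project_dir is writable, and the home
    directory is replaced by an empty tmpfs so ~/.aws and friends are hidden.
    """
    project_dir = os.path.abspath(project_dir)
    home = os.path.expanduser("~")
    argv = [
        "bwrap",
        "--unshare-all",                 # fresh namespaces for the sandbox
        "--share-net",                   # keep the network (agents call APIs)
        "--die-with-parent",             # tear down if the launcher exits
        "--ro-bind", "/usr", "/usr",     # system files, read-only
        "--ro-bind", "/etc", "/etc",
        "--symlink", "usr/bin", "/bin",  # assumption: merged-/usr layout
        "--symlink", "usr/lib", "/lib",
        "--symlink", "usr/lib64", "/lib64",
        "--proc", "/proc",
        "--dev", "/dev",
        "--tmpfs", "/tmp",
        "--tmpfs", home,                 # empty home: no ~/.aws, no profiles
        "--bind", project_dir, project_dir,  # the one writable real path
        "--chdir", project_dir,
    ]
    return argv + list(agent_command)
```

The returned list can be passed to `subprocess.run(...)`. Mounts are applied in order, so the project bind comes after the home tmpfs and still works when the project lives under the home directory.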
Tags: claude llms sandboxing linux cli namespaces security infosec trust unix
| Wednesday, January 14th, 2026 |
| 10:49 am |
Russian Propaganda Infects AI Chatbots
CEPA: "A Moscow-based global “news” network is leveraging Western artificial intelligence tools to devastating effect":
This form of data poisoning is deliberately designed to corrupt the information environments on which AI systems depend. Large language models do not possess an internal understanding of truth. They operate by assessing credibility based on statistical signals, including repetition, apparent consensus, and cross-referencing posts from across the web. Unfortunately, this approach to truth-seeking means an unexpected but structural vulnerability that hostile states have learned to exploit. [...]
The West has failed to recognize that it is under sustained information warfare. The United States dismantled the US Information Agency years ago, has steadily weakened Voice of America and Radio Free Europe, and recently scaled back the Foreign Malign Influence Center, even as Russia, China, and Iran made information warfare a core instrument of state power.
As AI systems increasingly function as arbiters of fact, this vulnerability becomes a national security danger. It is no longer sufficient for technology companies to disclaim responsibility by reminding users that models can make mistakes. Information security needs to be treated as a core requirement.
Tags: propaganda russia misinformation disinformation ai llms web truth
| Thursday, January 8th, 2026 |
| 11:59 am |
Today in “Google broke email” (https://www.jwz.org/blog/2025/12/today-in-google-broke-email-2/#comment-265285)
Update on the POP3pocalypse: it appears that the most likely thing to work in the future will be to use SMTP forwarding to Gmail, with ARC headers added. This is a comment thread detailing the rather complex Postfix/OpenARC setup that may do the job. It looks frankly unpleasant.
Tags: email smtp pop3 gmail arc forwarding postfix openarc
| Tuesday, January 6th, 2026 |
| 12:13 pm |
This system can sort real pictures from AI fakes — why aren’t platforms using it?
| Monday, January 5th, 2026 |
| 11:05 am |
Pi Reliability: Reduce writes to your SD card
| 11:05 am |
Solid state drive – ArchWiki
| 11:05 am |
Understanding EV Battery Life
Ireland's SEAI have published a decent blog post with some real world facts about EV battery lifespans:
In 2020 GeoTab, a telematics solution provider, published real world battery data of 6,000 EVs (BEV & PHEV) over millions of days to produce 2 free to use tools that provide invaluable insight into the impact of temperature and SoH of EV batteries in the long term.
This real-world data showed the average EV battery lost around 2.3% capacity per year. In other words, a 300km range EV today will have lost 34km in 5yrs. Data also showed that heat & fast-charging (DC charging) is responsible for more battery degradation than age or mileage, so high levels of use i.e. driving or mileage does not appear to be a concern.
GeoTab's real world data along with other reports of EVs far surpassing their warranty by multiples of distance, cases of high level of use are plentiful. For example a 2017 Renault Zoe 52kWh, that's in use as a taxi in (hot) Turkey with 345,000Kms on the clock and a near perfect 96% SoH after driving further than an average Irish car's life expectancy.
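The "34km in 5yrs" figure can be sanity-checked with a couple of lines of arithmetic. This is a sketch under the assumption that the quoted 2.3% is a flat annual rate; the function names are hypothetical, and real degradation curves are not this regular.

```python
def range_lost_linear(initial_km, annual_loss, years):
    """Range lost if the same fraction of the original capacity goes each year."""
    return initial_km * annual_loss * years


def range_lost_compound(initial_km, annual_loss, years):
    """Range lost if each year removes a fraction of the remaining capacity."""
    return initial_km * (1 - (1 - annual_loss) ** years)


linear = range_lost_linear(300, 0.023, 5)      # about 34.5 km
compound = range_lost_compound(300, 0.023, 5)  # about 32.9 km
```

Either reading lands within a kilometre or two of the quoted 34km, so the post's figure looks consistent with a simple linear estimate.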
Tags: seai ev batteries cars driving bev
| Thursday, December 18th, 2025 |
| 10:48 am |
_Cheap science, real harm: the cost of replacing human participation with synthetic data_ [pdf]
A new paper from the inimitable Abeba Birhane, on the increasingly common practice of generating synthetic data using LLMs:
Driven by the goals of augmenting diversity, increasing speed, reducing cost, the use of synthetic data as a replacement for human participants is gaining traction in AI research and product development. This talk critically examines the claim that synthetic data can “augment diversity,” arguing that this notion is empirically unsubstantiated, conceptually flawed, and epistemically harmful. While speed and cost-efficiency may be achievable, they often come at the expense of rigour, insight, and robust science. Drawing on research from dataset audits, model evaluations, Black feminist scholarship, and complexity science, I argue that replacing human participants with synthetic data risks producing both real-world and epistemic harms at worst and superficial knowledge and cheap science at best.
"Synthetic data: stereotypes compressed" is absolutely spot on. This doesn't give insights into human behaviour and beliefs, just into stereotypes. It is increasingly common in social science fields, under the names of "digital twins" and "silicon samples".
Tags: data surveys abeba-birhane papers ai synthetic-data digital-twins simulation testing social-science silicon-samples
| Tuesday, December 16th, 2025 |
| 10:52 am |
Boost for artists in AI copyright battle as only 3% back UK active opt-out plan
Wow, this is an absolute bollocking for the Labour plan:
95% of the more than 10,000 people who had their say over how music, novels, films and other works should be protected [in the UK] from copyright infringements by tech companies called for copyright to be strengthened and a requirement for licensing in all cases or no change to copyright law.
By contrast, only 3% of people backed the UK government’s initial preferred tech company-friendly option, which was to require artists and copyright holders to actively opt out of having their material fed into data-hungry AI systems.
Tags: ai training data copyright law uk uk-politics llms
| 11:06 am |
Chafa: Terminal Graphics for the 21st Century
| Monday, December 15th, 2025 |
| 12:00 pm |
Avoid UUID Version 4 Primary Keys | Software Engineer, Author, High Performance PostgreSQL for Rails
A well-researched article suggesting that random UUIDs do not make a good primary key for database tables; I would tend to agree (for cases where performance is important).
- UUID v4s increase latency for lookups, as they can’t take advantage of fast ordered lookups in B-Tree indexes
- For new databases, don’t use gen_random_uuid() for primary key types, which generates random UUID v4 values
- UUIDs consume twice the space of bigint
- UUID v4 values are not meant to be secure per the UUID RFC
- UUID v4s are random. For good performance, the whole index must be in buffer cache for index scans, which is increasingly unlikely for bigger data.
- UUID v4s cause more page splits, which increase IO for writes with increased fragmentation, and increased size of WAL logs
- For non-guessable, obfuscated pseudo-random codes, we can generate those from integers, which could be an alternative to using UUIDs
- If you must use UUIDs, use time-orderable UUIDs like UUID v7
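To make the "time-orderable" point concrete, here is a minimal Python sketch of a UUID v7 generator following the RFC 9562 bit layout (48-bit Unix millisecond timestamp, version nibble, then random bits). The `uuid7` helper is written out by hand as an illustration; it is not from the linked article, and a production system would more likely use a database or library implementation.

```python
import os
import time
import uuid


def uuid7(unix_ms=None):
    """Build a UUID v7: a 48-bit millisecond timestamp followed by random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand_a = int.from_bytes(os.urandom(2), "big") & 0x0FFF           # 12 random bits
    rand_b = int.from_bytes(os.urandom(8), "big") & ((1 << 62) - 1)  # 62 random bits
    value = (unix_ms & ((1 << 48) - 1)) << 80  # timestamp in the top 48 bits
    value |= 0x7 << 76                         # version field = 7
    value |= rand_a << 64
    value |= 0b10 << 62                        # RFC variant bits
    value |= rand_b
    return uuid.UUID(int=value)


# IDs created at increasing timestamps sort in creation order, so inserts
# land at the right-hand edge of a B-Tree index instead of on random pages.
v7_ids = [uuid7(1_700_000_000_000 + i) for i in range(5)]
v4_ids = [uuid.uuid4() for _ in range(5)]  # random: no relation to creation order
```

Because the timestamp occupies the most significant bits, comparing two v7 values compares their creation times first, which is the property the article recommends over fully random v4 values.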
Tags: postgres rails databases sql mysql uuids indexing primary-keys keys lookup storage random
| Tuesday, December 9th, 2025 |
| 10:04 am |
‘Pig Butchering’ Scams May Have Spurred Thailand-Cambodia War
Via TJ McIntyre -- indications that the Thailand-Cambodia war is being driven by the "pig butchering" scammer compounds operating in the border area:
Cambodia’s 2019 census put O’Smach’s population just over 9,850, but that doesn’t include the prison-like, office-dormitory compounds that have appeared here over the past five years, with the capacity to house 10,000 more.
Around 50 sites like these now line the Cambodia-Thailand border, designed to house a slice of the trillion-dollar cybercrime industry—primarily teams running investment scams, dubbed “pig butchering” for the way they fatten their targets up; sextortion scams that blackmail victims, including children, by threatening to make sexual images public; scams that impersonate police to gain account access; and fraudulent online gambling sites. Once aimed largely at the Chinese public, these now target victims worldwide and rake in tens of billions of dollars a year in Cambodia alone.
The compounds evolved from a casino industry that caters mostly to Chinese tourists and Thai day-trippers and has been linked to human trafficking, drug smuggling, and the endangered wildlife trade. From 2016, physical casinos were dwarfed by the online gambling industry (outlawed by Cambodia in 2019), which progressed to illegal sites and outright scams. Operators rent space in casinos and purpose-built compounds controlled by Chinese criminals, Myanmar warlords, and the Cambodian political elite.
Scam companies rely heavily on forced and trafficked labor from Asia, Africa, and Latin America to chat with targets, pose as romantic interests and employees at fake investment platforms, and persuade them to make deposits. Survivors tell us that torture, rape, and beatings are common. As the fighting raged in July, some trafficking victims reached out for help, saying they were locked in their dorms by their bosses. Videos shot from inside these sites show missiles flying overhead, explosions thundering outside, some workers appearing to break out and run, and damage from shelling in the grounds.
Tags: scams phishing pig-butchering war grim-meathook-future thailand cambodia scammers
| Monday, December 8th, 2025 |
| 11:37 am |
Year in Review 2025: Hari Kunzru on AI slop and censorship
| 11:00 am |
Multiplying our way out of division — Matt Godbolt’s blog
A very silly optimisation for the “binary to decimal” conversion problem:
The compiler has turned division by a constant ten into a multiply and a shift. There’s a magic constant 0xcccccccd and a shift right of 35! Shifting right by 35 is the same as dividing by 2**35 - what’s going on? [..]
What’s happening is that 0xcccccccd / 2**35 is very close to 1/10 (around 0.10000000000582077). By multiplying our input value by this constant first, then shifting right, we’re doing fixed-point multiplication by 1/10 - which is division by ten. The compiler knows that for all possible unsigned integer values, this trick will always give the right answer.
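The trick is easy to check outside a compiler. Below is a small Python sketch (the `div10` name is just for illustration) that applies the same magic constant and shift to unsigned 32-bit inputs; Python integers are arbitrary precision, so the intermediate 64-bit product cannot overflow here.

```python
MAGIC = 0xCCCCCCCD  # ceil(2**35 / 10)
SHIFT = 35


def div10(n):
    """Divide an unsigned 32-bit integer by ten using multiply-and-shift."""
    if not 0 <= n < 2**32:
        raise ValueError("n must fit in an unsigned 32-bit integer")
    return (n * MAGIC) >> SHIFT


# Spot-check edge cases plus a sparse sweep across the whole 32-bit range.
samples = [0, 1, 9, 10, 11, 99, 100, 2**31, 2**32 - 1]
samples += list(range(0, 2**32, 999_983))
mismatches = [n for n in samples if div10(n) != n // 10]
```

The constant slightly overestimates 1/10, and the overestimate is small enough that the error never reaches the next integer for any 32-bit input, which is why the shift result matches true integer division.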
Tags: hacks optimization bit-hacking binary decimal fixed-point arithmetric tricks
| Thursday, December 4th, 2025 |
| 6:26 pm |
Large Language Models As The Tales That Are Sung
A thought-provoking read on LLMs, poetry, the oral tradition, and Gene Wolfe:
"Even if LLMs are made out of poetry, they are incapable of producing poems. Or in Wolfe’s language, both the epic form and LLMs are story, but are incapable of telling stories. That requires the marriage of structure and intention that human mediation provides. LLMs are a kind of composite of the singing of tales, but are not singers, even if we sometimes misconstrue them as such."
Tags: llms text poetry words language gene-wolfe ascians storytelling structure culture
| 1:08 pm |
‘Unauthorized’ Edit to Ukraine’s Frontline Maps Point to Polymarket’s War Betting
| Wednesday, December 3rd, 2025 |
| 3:19 pm |
| 12:56 pm |
Building a Medallion architecture with ClickHouse
Walkthrough of the "Medallion" architecture concept, which comprises three layers (or stages), each serving distinct purposes in the data pipeline:
- Bronze layer - This layer acts as the landing area for raw, unprocessed data directly from the source system: simply put a "staging area". This data is stored in its original structure with minimal transformations and additional metadata. This layer is optimized for fast ingestion, and can provide an historical archive of source data that is always available for reprocessing or debugging. Whether the bronze layer should store all data is a point of contention, with some users preferring to filter the data and apply transformations, e.g., flattening JSON, renaming fields, or filtering out poorly formed data. We're not overly opinionated here but recommend optimizing the storage for consumption by the silver layer only - not other consumers.
- Silver layer - Here, data is cleansed, deduplicated, and conformed to a unified schema, with raw data from the previous Bronze layer being enriched and transformed to provide a more accurate and consistent view. This data can be consistent and usable for enterprise-wide use cases such as machine learning and analytics. The data model should emerge at this layer with a focus placed on ensuring primary and foreign keys are consistent to simplify future joins. While not common, applications and downstream consumers can read from this layer. These are typically business-wide applications that need the entire cleansed dataset, e.g., ML workflows. Importantly, data quality will not improve after this stage, only the ease with which it can be queried efficiently.
- Gold layer - This layer aims to have fully curated, business-ready, and project-specific datasets that make the data more accessible (and performant) to consumers. These datasets are often denormalized, or pre-aggregated, for optimal read performance and may have been composed of multiple tables from the previous silver stage. The focus here is on applying final transformations and ensuring the highest data quality for consumption by end-users or applications, such as reporting and user-facing dashboards.
This layered approach to data pipelines aims to efficiently address challenges like data quality, duplication and schema inconsistencies. By transforming raw data incrementally, the Medallion architecture aims to ensure a clear lineage and progressively refined datasets that are ready for analysis or operational use.
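The three layers can be sketched in a few lines of plain Python. This is only a toy illustration of the concept, not ClickHouse code and not the linked walkthrough's schema: the `bronze`, `silver` and `gold` functions and the order/customer fields are all hypothetical.

```python
from collections import defaultdict


def bronze(raw_rows, source):
    """Land raw rows untouched, adding only ingestion metadata."""
    return [{"source": source, "payload": row} for row in raw_rows]


def silver(bronze_rows):
    """Cleanse, deduplicate and conform rows to one schema."""
    seen, clean = set(), []
    for row in bronze_rows:
        payload = row["payload"]
        order_id = payload.get("id")
        if order_id is None or order_id in seen:
            continue  # drop rows without a key, and duplicates
        try:
            amount = float(payload["amount"])
        except (KeyError, TypeError, ValueError):
            continue  # drop poorly formed rows
        seen.add(order_id)
        customer = str(payload.get("customer", "")).strip().lower()
        clean.append({"order_id": order_id, "customer": customer, "amount": amount})
    return clean


def gold(silver_rows):
    """Pre-aggregate into a read-optimised, project-specific dataset."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


raw = [
    {"id": 1, "customer": " Alice ", "amount": "10.5"},
    {"id": 1, "customer": " Alice ", "amount": "10.5"},  # duplicate
    {"id": 2, "customer": "bob", "amount": "oops"},      # malformed
    {"id": 3, "customer": "alice", "amount": 2},
]
report = gold(silver(bronze(raw, source="orders-feed")))
```

Note that all quality decisions happen in `silver`; `gold` only reshapes already-clean data, which matches the description that data quality does not improve after the silver stage.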
Tags: medallion-architecture data architecture pipelines clickhouse
| Friday, November 28th, 2025 |
| 1:04 pm |