Yesterday one of the archive’s early adopters sent me a link to an article about how various sites block archive.org and asked how things are on our end. I wrote back something like: “Honestly, the bigger trend we’re dealing with lately is front-enders shipping a hundred JavaScript files per page, so if even one of them fails to load, the whole page collapses like a house of cards. Against that background, even if something like what the article describes did happen, it probably passed unnoticed.”
…
An hour later an email arrives from a sysadmin at Condé Nast: “Are you blocking our office IP?” “Oh. Right. Yes, we are. You’re reprinting the seed-crystal of a Finnish troll’s black-tar propaganda about us, laundering it with your brand’s legitimacy, and you still expect to keep using our free service? Have you people completely lost your damn minds over there?”
…
Back in the dawn-of-the-Internet era, when hosting providers billed by the gigabyte even for dedicated servers and the Great Firewall of China was still a glimmer in some bureaucrat’s eye, lots of sites just blocked visitors from China. They weren’t buying anything anyway. Now everyone blocks everyone they don’t like or don’t profit from. Walmart (or Target?) blocks everyone outside the U.S. Ukraine’s been blocking VK for a decade. Things that feel almost like core infrastructure (ifconfig.me, ipinfo.io, …) block Iran. We block Cyprus because it has a suspiciously high density of people with a past best left undisclosed starting shiny new “European” lives from scratch.
…
To deal with that reality, a multi-exit VPN, one that chooses an exit node depending on the target IP, has been a necessity for a long time now, for bots and humans alike, long before “VPN” became a lifestyle accessory. But it comes with problems:
First, privacy. Tracking scripts don’t see one IP, they see several. And that pattern is by itself a de-anonymizing signal, because there aren’t that many surfers who look like that.
Second, Cloudflare. The exit gets chosen for the IP, not the domain, and multiple sites are mixed together on the same IP. Some only let you in from the U.S., others only from Europe, etc. There’s no good solution. So you pick some compromise region X based on which of your favorite sites you’re least willing to have broken. All your Cloudflare traffic now goes through region X. And if you yourself aren’t actually in X (because you chose it not by proximity but by least-badness for your personal web ecosystem) then your packets start doing laps around the planet. For a multi-exit VPN user, a site behind such a “mixer” CDN ends up slower than a site with no CDN at all. And this is yet another reason — after EDNS, captchas (that can pop up instead of any one of a hundred included JavaScript files), random de-platformings, did I miss anything? — that makes Cloudflare a kind of natural antagonist. Not exactly an enemy. More like a sparring partner you keep finding yourself matched against, again and again, in different disciplines, in different rings, each time convinced this bout will finally settle something, and each time walking away a little more bruised and a little more aware of how strange the whole fight has become.
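To make the “exit gets chosen for the IP, not the domain” problem concrete, here is a minimal sketch of how such a router might pick an exit. Everything in it is an assumption made for illustration: the rule table, the region names, the `.example` hostnames and the pre-resolved addresses (taken from documentation ranges) are hypothetical, and a real multi-exit setup would do this in routing tables rather than in Python.

```python
import ipaddress

# Hypothetical policy table: destination prefix -> exit region.
# The prefixes are RFC 5737 documentation ranges, not real networks.
EXIT_RULES = [
    (ipaddress.ip_network("192.0.2.0/24"), "us"),
    (ipaddress.ip_network("198.51.100.0/24"), "eu"),
    # A shared "mixer" CDN range: one compromise region for everything on it.
    (ipaddress.ip_network("203.0.113.0/24"), "eu"),
]
DEFAULT_EXIT = "local"

# Hypothetical, pre-resolved DNS answers. Two unrelated sites sit behind
# the same CDN address, which is the whole problem.
RESOLVED = {
    "us-only-shop.example": "203.0.113.10",
    "eu-only-news.example": "203.0.113.10",
    "small-blog.example": "198.51.100.7",
}


def pick_exit(dest_ip: str) -> str:
    """Return the exit region for a destination IP (longest prefix wins)."""
    addr = ipaddress.ip_address(dest_ip)
    matches = [(net, region) for net, region in EXIT_RULES if addr in net]
    if not matches:
        return DEFAULT_EXIT
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def exit_for_site(hostname: str) -> str:
    """The router never sees the hostname, only the address it resolved to."""
    return pick_exit(RESOLVED[hostname])


if __name__ == "__main__":
    for site in RESOLVED:
        print(site, "->", exit_for_site(site))
```

Both the U.S.-only shop and the Europe-only news site come out with the same exit, because the decision is made after the hostname has already been flattened into a shared address. Whichever region you put in that one rule, one of the two sites breaks.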
When the lease on your domain expires, it often gets snapped up by what are called “parking” outfits (which is like calling a toll booth a “roadside hospitality concept”). A parking domain is basically a dead address turned into a little money farm: no real content, just ads, redirects, tracking pixels, and a vague pretense of being a website, all optimized to squeeze value out of whatever stray visitors still wander in.
But what if the domain that falls into a parking company’s hands was not serving articles or blog posts or cat photos, but scripts? Say, for example, a CDN endpoint. Or a banner network. Or some forgotten third-party JavaScript that thousands of living, breathing sites still quietly load in the background. Well, then the fun starts. Because now the parking company is sitting in the middle of someone else’s supply chain. They can redirect visitors from perfectly legitimate, still-active sites that happen to reference that old domain. And they do it in a way designed to stay invisible. No big splash. No obvious breakage. Just a slow siphoning of traffic that can go unnoticed for years.
For example, here is a case where traffic was stolen from EJ.ru for four years. Four. Nobody noticed until someone sent a bug report that basically said: “Why can’t I archive pages from EJ?” And the answer turned out to be: because somewhere in the stack, a script was loading from a dead domain that had been picked up by a parking company and turned into a redirect machine.
Here is the archive: https://archive.today/ww82.echobanners.n
And another similar story: https://archive.today/www3.widgetserver.c
So when people start talking about hacker ethics, about bug bounties, about responsible disclosure, you start to wonder how that whole moral economy is supposed to function when the so-called respectable domain investors are behaving a little worse than the hackers. Not breaking in, not exploiting zero-days, just quietly sitting on expired infrastructure and milking the pipes that nobody remembered to shut off.
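If you run a site and want a first pass at spotting this, a rough sketch could look like the following: list the third-party hosts a page pulls scripts from, and flag the ones whose first label looks like the numbered `ww82.` / `www3.` hosts in the two archive links above. The regex is only my guess at a heuristic, not a known signature of any parking company, and the sample HTML and `.example` hostnames are made up. It also inspects only the hosts written in the page. A real audit would have to fetch each script and follow redirects, since that is where the parked host usually shows up.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumption: a first label such as "ww82" or "www3" is treated as suspicious.
# This is a guessed heuristic, not an authoritative parking signature.
PARKING_LABEL = re.compile(r"^w{2,3}\d+$")


class ScriptHostCollector(HTMLParser):
    """Collects the hostnames of every <script src=...> on a page."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if not src:
            return  # inline script, nothing to resolve
        host = urlparse(src).hostname  # None for relative URLs
        if host:
            self.hosts.add(host)


def third_party_script_hosts(html: str, page_host: str) -> list:
    """Return sorted script hosts other than the page's own host."""
    collector = ScriptHostCollector()
    collector.feed(html)
    return sorted(h for h in collector.hosts if h != page_host)


def looks_parked(host: str) -> bool:
    """True if the first DNS label matches the numbered-www heuristic."""
    return bool(PARKING_LABEL.match(host.split(".")[0]))


# Made-up page for illustration only.
SAMPLE = """
<html><head>
<script src="/static/app.js"></script>
<script src="https://ww82.old-banners.example/show.js"></script>
<script src="//widgets.example/w.js"></script>
</head></html>
"""

if __name__ == "__main__":
    for host in third_party_script_hosts(SAMPLE, "news.example"):
        print(host, "SUSPICIOUS" if looks_parked(host) else "ok")
```

The more useful half is probably the plain inventory: most sites have no idea which third-party script hosts they still depend on, and that is exactly the list you want to check against domain expiry dates.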