19 May 2026

Mini Shai-Hulud, blocked: a live capture against the real payload

On 19 May 2026, between 01:39 and 02:06 UTC, the atool npm publisher account pushed 637 malicious versions across 317 packages, including some with eight-figure monthly download counts: size-sensor, echarts-for-react, @antv/scale, dozens more in the @antv scope. The payload is Mini Shai-Hulud, the same toolkit family behind the SAP compromise three weeks earlier. SafeDep published the discovery; StepSecurity reverse-engineered the payload. Both writeups are good; we are not going to repeat that work.

What we are going to do is run the real payload through a Leitwacht-enforced container and paste the agent’s raw output. The capture is the post.

We also picked up three other findings on the way: npm’s yank cadence, npmmirror’s auto-redirect, and an optionalDependencies GitHub-commit vector worth a SAST rule. All three go below.

Layered defences, in the order Mini Shai-Hulud encounters them

Trying to obtain the real size-sensor@1.2.4 was the first interesting finding, because we had to defeat three other defences before we got to ours. In order:

1. npm primary registry: versions yanked within hours. Of the twelve @antv-family packages we surveyed, every wave-2 version (approximately 02:06 UTC, per SafeDep’s timeline) returned 404 when we tried npm pack. The _npmUser field on the still-pullable wave-1 publishes is atool <i@hust.cc>, confirming the compromised maintainer. Some packument responses still listed the yanked versions by name for a while after the version-specific endpoints went dead, suggesting registry-cache propagation lag rather than incomplete removal. By ~16 hours post-wave, every malicious sample we wanted from npm was gone.

2. Alibaba’s npmmirror: “antv-block” auto-redirect. When we pointed npm install size-sensor@1.2.4 at registry.npmmirror.com, the install completed in 7 seconds and delivered a clean package. npm printed a deprecation warning we didn’t expect:

npm warn deprecated size-sensor@1.2.4: [WARNING] Use 1.0.3 instead of 1.2.4,
  reason: antv-block list, redirect to last clean version before 2026-05-19

npmmirror has an active block list for the AntV-family Mini Shai-Hulud versions and automatically substitutes the pre-attack release via cnpm/bug-versions, a config-driven block list consumed by cnpmcore and npminstall. The antv-block addition was merged at 05:57 UTC on 2026-05-19. At the time of our capture (~24h post-wave, when the manifest redirect was already live) the malicious tarball was still cached at npmmirror’s CDN, which is how we obtained it. Both registries now return 404 for the direct tarball URL; only npmmirror’s manifest-level redirect remains as a visible artifact.

3. The package’s optionalDependencies vector: worth a SAST flag. Inspecting the size-sensor@1.2.4 manifest revealed an unusual entry beyond the documented payload:

"optionalDependencies": {
  "@antv/setup": "github:antvis/G2#1916faa365f2788b6e193514872d51a242876569"
}

A pin to a specific commit on antvis/G2, not a tagged release. GitHub had already deleted the referenced commit when we tried; codeload.github.com/antvis/G2/tar.gz/<sha> returned 404. The optional flag means npm tolerates the failure and proceeds, but the shape is interesting: an optionalDependencies entry pointing at a specific malicious GitHub commit is a clean way to ship a payload through a path that doesn’t show up in the regular dependencies block of npm view. Pinning to a git SHA isn’t always malicious (vendored deps and pre-release testing do it legitimately), but a SAST rule that flags github:org/repo#<sha> in an optionalDependencies slot would have surfaced this one.

4. CI tooling pinning via lockfile. A project with a package-lock.json pinning size-sensor to an integrity hash from before 19 May would refuse to install 1.2.4 because the integrity hash wouldn’t match. Lockfiles do their job here.

5. Leitwacht’s default-deny egress. The thing we’re here to test. Even if all four defences above missed, the malicious code still needs to phone home, and a kernel-level allowlist on per-cgroup egress doesn’t care whether the package was tag-published, range-resolved, or downloaded out of band. We caught it.
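The integrity comparison behind the lockfile defence (item 4) can be sketched in a few lines. This is a minimal illustration, not npm's implementation: it assumes the lockfile entry carries a single sha512 Subresource Integrity string, and the function names and the stand-in byte strings are hypothetical.

```python
import base64
import hashlib


def sri_sha512(tarball: bytes) -> str:
    """Return a sha512 Subresource Integrity string for the given tarball bytes."""
    digest = hashlib.sha512(tarball).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")


def matches_lockfile(tarball: bytes, locked_integrity: str) -> bool:
    """True only if the tarball bytes hash to the value pinned in the lockfile."""
    return sri_sha512(tarball) == locked_integrity


# Stand-in byte strings (assumption); real input would be the .tgz from the registry.
clean_tarball = b"contents of the release that was locked"
swapped_tarball = b"contents of a different tarball served under the same name"

# What package-lock.json would have recorded when the clean release was locked.
locked = sri_sha512(clean_tarball)

print(matches_lockfile(clean_tarball, locked))    # True
print(matches_lockfile(swapped_tarball, locked))  # False
```

The check only protects content that was already locked. A fresh resolution of a poisoned version records the poisoned hash, which is why the lockfile is one layer among several here.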

The capture

The setup: a Docker container, hardened (read-only rootfs, all caps dropped, non-root, no host volumes), labelled leitwacht.scope=canary, running bun run index.js against the real size-sensor@1.2.4 payload. The Leitwacht agent attaches a cgroup_skb/egress BPF program to this container’s cgroup with a tiny allowlist (registry.npmmirror.com, registry.npmjs.org, and a couple of CDN domains, no exfil destinations). Everything else is dropped at the kernel.

A note on the hardening. The read-only rootfs, capability drops, non-root user, and host-volume removal are defense-in-depth, not the load-bearing layer of this test. None of them constrain outbound TCP: the container’s network stack is fully functional, bun run opens sockets freely, and every exfil attempt the payload makes reaches the kernel. The only thing dropping the 21 SYNs is cgroup_skb/egress, which runs at the kernel egress hook ahead of any container-level isolation. An unhardened container with the same agent attached produces the same trace; an unhardened container without the agent lets every connection complete. We hardened the lab container to remove adjacent kill-chain branches (host-filesystem write, nsenter, docker.sock abuse) so what the capture measures is the network primitive in isolation, not an incidental interaction with another layer.

After the payload run, the agent’s “enforcement detached” event summarises the session:

"msg":"enforcement detached"
"violations":21
"dns_entries":3
"bytes_sent":0

Zero bytes left the container during the 2+ minute run. The agent’s proxy blocked three distinct DNS lookups; the BPF program dropped 21 TCP SYNs.

DNS layer: what the payload tried to resolve

Three distinct lookups from the bun runtime (two of them repeated, per query_count), all blocked by the agent’s DNS proxy before any real IP was returned to the application:

{
  "domain": "t.m-kosche.com",
  "query_type": "AAAA",
  "action": "dns_blocked",
  "comm": "bun",
  "binary_path": "/usr/local/bin/bun"
},
{
  "domain": "t.m-kosche.com",
  "query_type": "A",
  "query_count": 2,
  "resolved_ips": ["192.0.2.1"],
  "action": "dns_blocked",
  "comm": "bun",
  "binary_path": "/usr/local/bin/bun"
},
{
  "domain": "api.github.com",
  "query_type": "A",
  "query_count": 2,
  "resolved_ips": ["192.0.2.1"],
  "action": "dns_blocked",
  "comm": "Bun Pool 0",
  "binary_path": "/usr/local/bin/bun"
}

Both C2 domains caught at the DNS layer. t[.]m-kosche[.]com is the primary OpenTelemetry-disguised exfil endpoint documented in SafeDep’s writeup (the path /api/public/otel/v1/traces makes outbound POSTs look like ordinary observability traffic). The payload tried IPv6 first, then IPv4. Both dns_blocked. api.github.com is the payload’s fallback C2: when the primary OTel POST to t[.]m-kosche[.]com is unreachable, it pivots to pushing encrypted data to a dead-drop branch on a public GitHub repository via the Git Data API, authenticated with any GitHub PAT it has scavenged from the runner. The connect attempt fires regardless of whether a usable PAT was found; only the request body would have been weaponised. The agent’s DNS proxy returns the RFC 5737 TEST-NET-1 address 192.0.2.1 instead of the real IP, so any subsequent connect attempt lands on the sinkhole instead of the real destination.
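The blocking behaviour described above can be approximated with a short policy function. This is a sketch under stated assumptions, not the agent's code: the function names and the "dns_allowed" action are hypothetical, matching is exact-name only, and returning a sinkhole address for A queries but nothing for AAAA is inferred from the capture rather than documented.

```python
SINKHOLE_IPV4 = "192.0.2.1"  # RFC 5737 TEST-NET-1, the address seen in the capture


def is_allowed(domain: str, allowlist: set[str]) -> bool:
    """Exact-name match, case-insensitive, ignoring a trailing root dot."""
    return domain.rstrip(".").lower() in allowlist


def answer_query(domain: str, query_type: str, allowlist: set[str]) -> dict:
    """Decide what a filtering DNS proxy would report for one query."""
    if is_allowed(domain, allowlist):
        # "dns_allowed" is a made-up label for this sketch.
        return {"domain": domain, "query_type": query_type, "action": "dns_allowed"}
    event = {"domain": domain, "query_type": query_type, "action": "dns_blocked"}
    if query_type == "A":
        # Blocked IPv4 lookups get the sinkhole address instead of the real one.
        event["resolved_ips"] = [SINKHOLE_IPV4]
    return event


allowlist = {"registry.npmjs.org", "registry.npmmirror.com"}
for name, qtype in [("t.m-kosche.com", "AAAA"),
                    ("api.github.com", "A"),
                    ("registry.npmjs.org", "A")]:
    print(answer_query(name, qtype, allowlist))
```

Handing back a sinkhole address rather than an error means a later connect attempt targets a known address, so it can be attributed to the blocked lookup.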

Connection layer: what the kernel BPF program dropped

The payload did not give up when DNS failed. It retried the connect to the sinkhole IP for over two minutes, and the agent dropped every attempt (the ts values below are wall-clock UTC):

{ "ts": "21:35:29.715", "ip": "192.0.2.1", "port": 443, "comm": "HTTP Client",
  "binary_path": "/usr/local/bin/bun", "action": "sinkhole_hit",
  "domain": "api.github.com", "domain_source": "agent_dns" },
{ "ts": "21:35:30.747", "ip": "192.0.2.1", "port": 443, "comm": "HTTP Client", "action": "sinkhole_hit" },
{ "ts": "21:35:31.770", "ip": "192.0.2.1", "port": 443, "comm": "HTTP Client", "action": "sinkhole_hit" },
{ "ts": "21:35:32.794", "ip": "192.0.2.1", "port": 443, "comm": "HTTP Client", "action": "sinkhole_hit" },
... 17 more identical entries spanning 21:35:33 to 21:37:55 ...

Total: 21 connect attempts, all to the sinkhole IP, all dropped at the kernel by cgroup_skb/egress. The action value "sinkhole_hit" is the agent’s tag for “the application ignored DNS policy and is trying to connect to the address the DNS proxy returned for a blocked query”. It is a distinct signal from a normal kernel drop, and the agent uses it to flag aggressive malware.

comm: "HTTP Client" is bun’s HTTP-pool worker; binary_path: /usr/local/bin/bun confirms the executing binary. The retry interval starts at ~1 s and backs off. The payload’s exfil loop is patient; it would have kept going indefinitely if we hadn’t bounded the test.
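The initial retry cadence can be read straight off the excerpt. The sketch below computes the gaps between the four timestamps shown above; the remaining 17 entries are elided in the capture, so the later backoff is not reproduced here. The function name is hypothetical.

```python
from datetime import datetime

# The four timestamps visible in the excerpt; the other 17 are elided there.
timestamps = ["21:35:29.715", "21:35:30.747", "21:35:31.770", "21:35:32.794"]


def retry_intervals(stamps: list[str]) -> list[float]:
    """Seconds between consecutive connect attempts, rounded to milliseconds."""
    parsed = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [round((b - a).total_seconds(), 3) for a, b in zip(parsed, parsed[1:])]


print(retry_intervals(timestamps))  # [1.032, 1.023, 1.024]
```

The parser assumes all timestamps fall on the same day, which holds for this capture.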

What didn’t get attempted in this run

The original Mini Shai-Hulud capability list per SafeDep includes EC2 IMDS (169.254.169.254) and ECS metadata (169.254.170.2). We didn’t see those in this run. Two reasons:

1. The container had no ~/.aws or ~/.config/gcloud-shaped credentials, so the payload’s filesystem scanner found nothing valuable to exfiltrate. The metadata-IP probes (which use stolen tokens to enumerate further cloud surface) likely got short-circuited. A later run supports this: when fake credential files were placed on the container’s tmpfs, the payload went for them immediately (see part two below).
2. The container started with no CI runner secrets in scope; from the payload’s perspective our environment looked empty.

In a real victim CI box with cloud credentials present in env or ~/.aws, you’d see those metadata hits too. The agent has the AWS IMDS IP (169.254.169.254) in the BPF hardcoded-deny path either way: it is checked before the allow map so that even a misconfigured policy cannot permit it. ECS metadata (169.254.170.2) can be blocked via the operator-configurable deny_ips policy.
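The evaluation order matters more than the individual lists, so here is a userspace sketch of it. It is an illustration only: the real check runs in BPF, the verdict labels are hypothetical, and placing deny_ips ahead of the allow map is an assumption (the text above only states that the hardcoded deny comes first). The allowed address is a documentation-range stand-in.

```python
import ipaddress

# Checked first, before any operator policy is consulted.
HARDCODED_DENY = {ipaddress.ip_address("169.254.169.254")}  # AWS IMDS


def egress_verdict(dst: str, allow_ips: set, deny_ips: set) -> str:
    """Hardcoded deny, then operator deny, then allow map, then default deny."""
    ip = ipaddress.ip_address(dst)
    if ip in HARDCODED_DENY:
        return "drop:hardcoded"
    if ip in deny_ips:
        return "drop:deny_ips"
    if ip in allow_ips:
        return "allow"
    return "drop:default"


# A misconfigured policy that (wrongly) allows the IMDS address.
allow_ips = {ipaddress.ip_address("198.51.100.10"),   # stand-in registry address
             ipaddress.ip_address("169.254.169.254")}
deny_ips = {ipaddress.ip_address("169.254.170.2")}    # ECS metadata, operator-configured

for dst in ["169.254.169.254", "169.254.170.2", "198.51.100.10", "192.0.2.1"]:
    print(dst, egress_verdict(dst, allow_ips, deny_ips))
```

Because the hardcoded set is consulted before the allow map, the IMDS address is dropped even though the policy lists it as allowed.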

What this means for you, if you run npm in CI

In order of effort:

1. Default-deny egress on every job that touches npm install. The free agent is MPL-2.0; the deployment is a Helm chart. Whatever else you do, that single layer would have stopped Mini Shai-Hulud from exfiltrating from your runners today.
2. Pin every dependency by integrity hash via package-lock.json / npm shrinkwrap. Lockfiles refuse to install different content for the same name@version. That doesn’t help once you bump a range and resolve a poisoned version, but it freezes the past.
3. Separate the npm install step from the credential-bearing step. Run the install in a job with no AWS / GitHub / Vault credentials in scope, then run the actual release step in a job that has them. Anything the install does runs against an empty credential set. In GitLab CI, the mechanism is environment scoping: scope your secrets to a named environment (e.g. npm-registry), and have only the deploy job declare that environment. The install job has no environment: keyword, so it never receives the scoped variable. variables: {} in the install job clears YAML-defined pipeline variables, but environment-scoped variables are the primary defense: they are simply not injected into jobs that don’t match the scope.

# Install job: zero secrets
install_and_test:
  stage: build
  variables: {}
  # no "environment:" → no environment-scoped secrets injected
  script:
    - npm ci
    - npm test
  artifacts:
    paths:
      - dist/

# Publish job: carries NPM_TOKEN
publish_npm:
  stage: deploy
  environment:
    name: npm-registry
    action: prepare
  rules:
    - if: $CI_COMMIT_TAG
  script:
    - echo "//registry.npmjs.org/:_authToken=${NPM_TOKEN}" >> .npmrc
    - npm publish dist/

Set NPM_TOKEN in Settings > CI/CD > Variables with environment scope npm-registry (not *), protect + mask. The install job never sees it.

Two gotchas with the above pattern: stage: deploy alone does not gate variable access. The environment: keyword is what matters. And “protected” is not the same as “scoped”: a protected variable without an environment scope still leaks into every job on a protected branch, including test and lint. You need both.

4. Watch for optionalDependencies that pin a specific GitHub commit. That’s the third vector this campaign used and the one not currently flagged by most SAST scanners. Legitimate uses exist (vendored or pre-release deps), but github:org/repo#<sha> in an optional slot is worth a flag because it ships code through a path that doesn’t show in npm view’s dependencies block.
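A check for the optionalDependencies pattern can be as small as the sketch below. It is one possible rule, not an existing scanner's: the function name is hypothetical, it only matches the github:owner/repo#<hex> shorthand with a 7 to 40 character SHA, and other git spec forms (for example git+https URLs) would need their own patterns. The manifest is the fragment quoted earlier, wrapped with the package name and version.

```python
import json
import re

# "github:owner/repo#<hex sha>", short (7+) or full (40) commit hashes.
GITHUB_SHA_PIN = re.compile(r"github:[\w.-]+/[\w.-]+#[0-9a-f]{7,40}", re.IGNORECASE)


def flag_optional_github_pins(manifest_text: str) -> list[tuple[str, str]]:
    """Return (name, spec) pairs in optionalDependencies pinned to a GitHub commit."""
    manifest = json.loads(manifest_text)
    optional = manifest.get("optionalDependencies", {})
    return [
        (name, spec)
        for name, spec in optional.items()
        if isinstance(spec, str) and GITHUB_SHA_PIN.fullmatch(spec)
    ]


manifest = """
{
  "name": "size-sensor",
  "version": "1.2.4",
  "optionalDependencies": {
    "@antv/setup": "github:antvis/G2#1916faa365f2788b6e193514872d51a242876569"
  }
}
"""

for name, spec in flag_optional_github_pins(manifest):
    print(f"review: optional dependency {name} pinned to {spec}")
```

Since legitimate projects also pin commits, a finding from a rule like this is better treated as a review prompt than as a hard failure.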

Three of those are general advice; the first is what the agent does for you, kernel-level, on Linux 5.8+.

What default-deny alone does not solve

The capture above caught the GitHub fallback C2 only because the canary’s allowlist was deliberately narrow: npm registries and a couple of CDNs. In a typical allowlist that permits api.github.com (clone, releases, status API, Actions analogues), the fallback C2 would have succeeded. That is exactly why the attacker chose GitHub as the fallback: it hides inside traffic most CI environments can’t switch off.

Leitwacht targets GitLab Runner shops. On a pure GitLab pipeline, api.github.com is not ambient-trusted the way it is on GitHub Actions: most GitLab jobs never touch the GitHub API, so default-deny catches the fallback without special handling. What is invisible on a GitHub Actions runner is just another exotic destination on a GitLab Runner with kernel-enforced egress.

Even in GitLab shops, GitHub egress isn’t zero: optionalDependencies pointing at GitHub commits (vector #3 above), tools that fetch their own releases from GitHub, vendored deps cloned at install time. So api.github.com ends up on a non-trivial fraction of real GitLab allowlists. If yours is one of them, the GitHub fallback C2 needs the credential-isolation layers because default-deny alone won’t catch it. This is exactly what recommendation #3 (separate the install step from the credential-bearing step) solves, and why it ranks above a simple allowlist.

Layered defences, part two: what happened when we let the C2 through

After the initial capture, we ran a second test: allow api.github.com on 443, place fake credential files on the container’s tmpfs, and run the same payload. The question was whether the payload would reach the credential-harvesting phase when its fallback C2 was reachable, and whether kernel-level LSM hooks would stop it there. (The primary C2, t[.]m-kosche[.]com, remained blocked by default-deny.)

The answer came in under 3 seconds:

{
  "kind": "cred_file_open",
  "message": "credential file access detected: pid=1417662 comm=\"Bun Pool 0\" path=\"/home/runner/.aws/credentials\"",
  "action_taken": "killed"
}

The payload’s Bun thread pool attempted multiple credential files; the credwatch BPF LSM (Linux Security Module hook accessible via eBPF, requires CONFIG_BPF_LSM=y) returned -EPERM on the first access shown above (/home/runner/.aws/credentials) and killed the container before any file content was read. bytes_sent: 0.

This is defense-in-depth at the kernel level: even when the network allowlist permits a C2 channel, the LSM blocks credential theft at the open() syscall. The credential reads and C2 connections fire in parallel, but the LSM always wins this race because open() is synchronous at the VFS layer while DNS resolution is asynchronous. In repeated runs, the LSM killed the container in ~130ms from the first credential file access, faster than any network-based kill could operate.

The proc_mem LSM (not triggered in this variant)

The third LSM hook, proc_mem, blocks open() on /proc/<pid>/mem. It was not triggered by the size-sensor variant. The Runner.Worker memory-scrape path documented in StepSecurity’s writeup uses python3 to read a sibling process’s memory, which does not exist in our container. This attack path is specific to GitHub Actions runners where Runner.Worker holds secrets in memory. The proc_mem LSM is architecturally proven (unit-tested with live BPF LSM attach, real /proc/self/mem EPERM, ring-buffer event delivery) and would fire if a python3 or ptrace process attempted the same read against a GitLab shell-executor runner where the gitlab-runner process is a PID-namespace sibling.

What we shipped along with this post

The site you’re reading this on launched today. The recent-attacks panel on the home page now includes Mini Shai-Hulud beside Trivy, axios, and Bitwarden CLI. The data-plane writeup that explains how this enforcement works is at Why CI runners are the soft target; the free CE agent is at leitwacht/leitwacht-agent.

Sources we built on: