When the warehouse has no lock
Crypto exchanges spend fortunes on the front door. Hardware security modules for the hot wallets. Bug bounties on the matching engine. WAFs, rate limits, mandatory 2FA, a KYC vendor on a six-figure contract. Then they copy the entire business into an analytics database so the dashboards load fast, and nobody sets a password on it.
That's what we found on a recent assessment of a centralized exchange. The analytics layer was a column-oriented OLAP warehouse, the kind teams stand up to feed BI dashboards and AML monitoring. It answered on a public port. It had four built-in accounts, and every one had a blank password, including the one with full administrative rights.
That single gap was worse than almost any flaw the trading engine could have shipped. A bug in the matching engine gets you an edge on one market. A blank password on the warehouse gets you every customer.
Severity: CVSS 9.8 (Critical) · CWE-1188, CWE-306
Exposed: the full user table (~14M rows), KYC document images, TOTP seeds and recovery codes, withdrawal address books, backoffice admin hashes and OAuth client secrets, and 1M+ AML monitoring rows, all readable without authentication.
What was behind the door
The warehouse is where a company denormalizes everything for fast reads. So "everything" is exactly what was sitting there, already joined and indexed for convenience:
- The full user table: emails, password hashes, registration IP, country, and the TOTP seed used to generate 2FA codes.
- Thousands of 2FA recovery codes and stored WebAuthn passkey records.
- KYC review data, down to scanned passport images by URL.
- Withdrawal address books and deposit addresses for the customer base.
- Spot and derivatives balances per account.
- Backoffice admin accounts with their own password hashes, and OAuth client secrets for internal services.
- Over a million AML transaction monitoring rows.
Look hard at that 2FA line. A TOTP seed is the secret that derives every future code, so whoever holds it can mint the second factor forever. An attacker who cracks a password hash offline and holds the matching TOTP seed doesn't need to phish anyone. They reconstruct the second factor themselves.
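To make that concrete, here is a minimal sketch of standard TOTP derivation (RFC 6238, HMAC-SHA1, 30-second step), using only the Python standard library. It assumes the exchange used the common defaults, which the assessment does not confirm. The seed shown is the published RFC test value, not anything from the engagement.

```python
import base64
import hashlib
import hmac
import struct


def totp(seed_b32: str, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Derive the RFC 6238 TOTP code for a given moment from a base32 seed."""
    key = base64.b32decode(seed_b32, casefold=True)
    counter = unix_time // step  # number of elapsed time steps
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as defined in RFC 4226
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# RFC 6238 test seed (ASCII "12345678901234567890"), base32-encoded.
RFC_TEST_SEED = base64.b32encode(b"12345678901234567890").decode()

if __name__ == "__main__":
    # Any timestamp works, past or future: the seed is the only secret input.
    for t in (59, 1_111_111_109, 2_000_000_000):
        print(t, totp(RFC_TEST_SEED, t))
```

Nothing in the function depends on the user's device or on the exchange's servers. That is why a stored seed is a standing credential rather than a one-time leak, and why rotation means re-enrolling every affected user.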
Why this keeps happening
Nobody decides to expose the warehouse. It happens in three quiet steps.
First, the analytics database gets provisioned during a sprint, often by a data team rather than the platform team, with whatever defaults the installer ships. Many of these systems install with a root account and no password and trust the operator to fix it later.
Second, "later" collides with a deadline. The dashboard works, the ticket closes, the empty password stays.
Third, a security group rule that was meant to be internal gets widened so a contractor or a cloud function can reach it, and the host ends up answering the public internet. No single person did anything obviously reckless. The exposure is the sum of three reasonable-looking decisions.
The security program never looked here because the threat model was built around the trading product. The warehouse counted as plumbing, and plumbing rarely gets audited. This plumbing happened to hold every customer's identity and balance.
How an attacker finds it in minutes
There is nothing clever about this exploit, and that's what should worry you. The steps are short:
- Enumerate the organization's address space and look for the management ports of common data infrastructure, not just web ports. Analytics engines, message queues, search clusters, and cache servers all have their own listeners.
- Try the documented default credentials for whatever responds. For a surprising share of data tooling that means a known username and an empty password.
- Once connected, run the inventory query the database hands you for free: list the schemas, list the tables, read the row counts. The schema names tell you where the money and the identities live.
# the warehouse answered on a public port with a blank admin password
$ clickhouse-client --host <redacted> --user default --password ''
:) SELECT database, name, total_rows FROM system.tables
ORDER BY total_rows DESC LIMIT 5
┌─database─┬─name──────────┬─total_rows─┐
│ core     │ users         │ 14,200,331 │
│ core     │ kyc_documents │  9,512,084 │
│ core     │ aml_events    │  1,002,455 │
└──────────┴───────────────┴────────────┘
:) SELECT email, totp_seed, withdrawal_address FROM core.users LIMIT 1
-- login, the 2FA seed that mints every future code, and the payout address
The whole chain is unauthenticated and uses the database's own intended features. There's no payload, no memory corruption, nothing a signature-based tool flags. It's just a door that was supposed to be locked.
The blast radius
Frame it the way a defender has to. With this one foothold an attacker can:
- Pull the entire identity graph for credential stuffing and for SIM swap targeting, since the phone numbers and KYC records are right there.
- Defeat 2FA at scale using the stored TOTP seeds and recovery codes, turning "we require 2FA" into theater.
- Build a target list ranked by balance and by withdrawal address, which turns generic phishing into precision fraud and, in the worst case, physical coercion of the wealthiest customers.
- Move laterally using the leaked OAuth client secrets and admin hashes.
No wallet was drained during the test, and that distinction matters for an honest report. But the data alone is a regulatory event and a complete prelude to account takeover. The exchange's own controls became the attacker's toolkit.
The fix, in order of urgency
- Set passwords on every data service account today, then rotate them. Treat a blank or default credential as an active incident, not a finding to schedule.
- Put the warehouse behind the network boundary it was always supposed to be behind. It should never answer an unauthenticated request from outside the VPC.
- Stop replicating raw secrets into analytics. TOTP seeds, recovery codes, passkey material, and full KYC images do not belong in a reporting database. Tokenize or hash what the dashboards genuinely need, and leave the rest in the systems that guard it.
- Add credential and exposure checks for data infrastructure to your monitoring, with the same seriousness you give the public web tier.
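The "tokenize or hash" step could look something like the sketch below: a filter applied to each row before it is replicated into the warehouse. The field names, the `_token` suffix, and the split between dropped and tokenized fields are all assumptions for illustration, not the exchange's schema. The key is assumed to live in a secrets manager that the warehouse cannot read.

```python
import hashlib
import hmac

# Hypothetical field names; adjust to the real schema.
SECRET_FIELDS = {"totp_seed", "recovery_codes", "passkey_credential",
                 "kyc_image_url", "password_hash"}
TOKENIZED_FIELDS = {"email", "withdrawal_address"}


def tokenize(value: str, key: bytes) -> str:
    """Keyed one-way token: stable for joins and counts, and not
    recomputable from guessed inputs without the key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def to_analytics_row(row: dict, key: bytes) -> dict:
    """Return the copy of a row that is allowed to reach the warehouse."""
    out = {}
    for field, value in row.items():
        if field in SECRET_FIELDS:
            continue  # secrets are never replicated
        if field in TOKENIZED_FIELDS and value is not None:
            out[field + "_token"] = tokenize(str(value), key)
        else:
            out[field] = value  # plain reporting data passes through
    return out
```

A keyed HMAC is used rather than a bare hash because emails and addresses are guessable. Without the key, an attacker who dumps the warehouse cannot confirm a guess by hashing it.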
Prevent it in CI/CD: secret- and exposure-scanning that fails the pipeline on any datastore with a default or blank credential, or a security-group rule open to the internet; an IaC policy that data services are private-subnet-only; and a scheduled external port-and-credential sweep of the whole data estate, owned as seriously as the web tier.
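As a rough sketch of that pipeline gate, the check below takes a declared inventory of datastores and returns a non-zero exit code on a blank or default credential, an ingress rule open to the whole internet, or a public-subnet placement. The `Datastore` shape and the short default-password list are hypothetical. A real pipeline would build this inventory from its own IaC plan output and would compare against secret references rather than plaintext passwords.

```python
from dataclasses import dataclass, field
from ipaddress import ip_network

DEFAULT_PASSWORDS = {"", "password", "admin", "changeme"}  # illustrative, not exhaustive


@dataclass
class Datastore:
    name: str
    accounts: dict[str, str]  # username -> password as declared in config
    ingress_cidrs: list[str] = field(default_factory=list)
    public_subnet: bool = False


def violations(store: Datastore) -> list[str]:
    """List every policy violation for one declared datastore."""
    found = []
    for user, password in store.accounts.items():
        if password in DEFAULT_PASSWORDS:
            found.append(f"{store.name}: account '{user}' has a blank or default password")
    for cidr in store.ingress_cidrs:
        if ip_network(cidr, strict=False).prefixlen == 0:  # 0.0.0.0/0 or ::/0
            found.append(f"{store.name}: ingress open to the internet ({cidr})")
    if store.public_subnet:
        found.append(f"{store.name}: deployed in a public subnet")
    return found


def check(stores: list[Datastore]) -> int:
    """Print each violation and return a process exit code (0 = pass)."""
    problems = [v for s in stores for v in violations(s)]
    for p in problems:
        print("FAIL", p)
    return 1 if problems else 0
```

Wired into a pipeline step as `sys.exit(check(stores))`, this turns the blank password from a finding someone has to notice into a build that cannot ship.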
What to check on your own estate
Three things to run down:
- List every datastore your org runs, not just the production database. Include the analytics warehouse, the queues, the search and cache layers, and anything a data team stood up on its own.
- For each one, answer two questions. Can it be reached from outside the trusted network, and does it still carry any default or empty credential?
- Then the harder one. What sensitive fields have you copied into it for convenience, and which of those are secrets that should never have left their home system?
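Two of those questions lend themselves to a quick script. The sketch below assumes you run it against hosts you own, from a vantage point outside the trusted network, and that you can export each datastore's column names. The regex is a hypothetical starting list of secret-looking column names, not a complete one. The default-credential question is left out on purpose, since how to test it depends on each product's client.

```python
import re
import socket

# Hypothetical patterns for columns that hold secrets rather than reporting data.
SECRET_COLUMN = re.compile(
    r"(totp|otp_seed|recovery_code|passkey|webauthn|client_secret|password|kyc_image)",
    re.IGNORECASE,
)


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port succeeds from where this runs."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def secret_columns(schema: dict[str, list[str]]) -> list[str]:
    """Return 'table.column' for every column whose name looks like a secret."""
    return [
        f"{table}.{column}"
        for table, columns in schema.items()
        for column in columns
        if SECRET_COLUMN.search(column)
    ]
```

For a ClickHouse host the ports to try would typically be 9000 (native protocol) and 8123 (HTTP). Other engines have their own defaults, so drive the port list from your inventory rather than from this example.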
This one cost the attacker nothing: no exploit, no payload, just a connection string and a blank password field. The analytics warehouse is part of your attack surface whether or not anyone put it on the list, and it's holding the same data your front door spends a fortune protecting.