1M+ Data Points Survey Architecture
Designed and deployed the survey infrastructure behind ADEK's largest-ever data collection — psychometrician-validated instruments across multiple stakeholder groups, with analysis pipelines built to surface what the data actually means, not just what it says.
Most large-scale surveys fail in a way that looks like success. They hit their response targets, they generate a tidy report, and the findings sit somewhere between obvious and unfalsifiable. The failure isn’t in execution — it’s upstream. Surveys at scale tend to collect what’s easy to collect rather than what answers the question that motivated the work in the first place. The instruments aren’t validated. The aggregated results hide more signal than they reveal. And by the time anyone notices, the data has already shaped a decision.
The work — embedded inside ADEK across 150+ schools and multiple stakeholder groups — was to refuse that pattern from instrument design forward.
Psychometricians as a design constraint, not a quality check
The first commitment was bringing psychometricians in at the start, not the end. Every item drafted for every instrument had to clear three questions before it could ship. What construct is this item measuring? Is it actually measuring that construct, or is it measuring something adjacent like social desirability or response style? And does the response scale produce usable variance across the population we’re sampling, or does everyone cluster on the same option?
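The third question, usable variance, lends itself to a concrete pre-launch check. Below is a minimal sketch that flags items where respondents cluster on a single response option; the function name, the 80% dominance threshold, and the data shape are illustrative assumptions, not a description of the actual tooling:

```python
from collections import Counter

def flag_low_variance_items(responses, dominance_threshold=0.80):
    """Flag items where respondents cluster on one response option.

    responses: dict mapping item_id -> list of Likert codes (e.g. 1-5).
    An item whose modal option captures more than dominance_threshold
    of responses produces too little variance to be analytically useful,
    so it is a candidate for rewording or removal before launch.
    """
    flagged = {}
    for item_id, values in responses.items():
        if not values:
            continue  # no pilot data for this item; nothing to measure
        counts = Counter(values)
        mode_share = max(counts.values()) / len(values)
        if mode_share > dominance_threshold:
            flagged[item_id] = round(mode_share, 2)
    return flagged
```

In practice a check like this would run on pilot data, which is exactly why the psychometric review has to happen before deployment rather than after: an item that fails it can still be rewritten.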
That changes the cost curve of the project in a way that’s worth being honest about — it adds time at the front. It also removes the much larger cost that hits at the analysis stage when half your items turn out to be unusable. The trade is correct, and it’s almost never the one large surveys make.
The scale challenge
Once the instruments held up, the harder problem was deploying them across 150+ schools and four stakeholder groups — students, teachers, parents, administrators — without the integrity of the data drifting between contexts. A teacher answering on a school computer, a parent answering on a phone in a second language, a student answering during a homeroom block: each of those contexts introduces its own noise. The infrastructure had to enforce instrument consistency, response integrity, and accessibility simultaneously. None of those are individually hard. Holding them together at this scale, in this many languages, across this many concurrent waves, is the actual engineering.
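One way to enforce instrument consistency across that many concurrent deployments is to fingerprint the canonical instrument definition, so any two sites reporting the same fingerprint are provably serving the same items, wording, and response scales. This is a hedged sketch of the general technique, with a hypothetical schema, not the actual infrastructure:

```python
import hashlib
import json

def instrument_fingerprint(instrument):
    """Deterministic fingerprint of an instrument definition.

    instrument: dict of item ids -> item definitions (text, scale,
    translations, ...). Serialising with sorted keys makes the hash
    stable regardless of key order, so the same instrument always
    yields the same fingerprint and any edit to wording or scales
    yields a different one.
    """
    canonical = json.dumps(instrument, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Each deployment context (school computer, phone, homeroom session) can then log the fingerprint alongside every response batch, turning "did everyone answer the same survey?" into a mechanical check rather than an audit.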
Pipelines that surface meaning, not just numbers
Analysis on a dataset this size is not “run the numbers.” Run the wrong numbers on a million data points and you can confidently mislead an entire agency. The pipeline was built around three principles. Anomaly detection at the school and cohort level — so that data quality issues surfaced before they were averaged away. Cross-cohort comparison built in by default — because a finding without a reference class is rarely actionable. And explicit weighting for sample bias — because non-response is never random and pretending otherwise corrupts every downstream conclusion.
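The weighting principle can be sketched as simple post-stratification: respondents in under-represented strata are weighted up until the achieved sample matches known population shares. The helper and the strata below are illustrative assumptions, not the production pipeline:

```python
def poststratification_weights(sample_counts, population_shares):
    """Per-respondent weights that align the achieved sample with
    known population shares per stratum (e.g. grade level, school type).

    sample_counts: stratum -> number of respondents obtained.
    population_shares: stratum -> known population share (sums to 1).
    Under-represented strata get weights above 1, over-represented
    strata get weights below 1.
    """
    total = sum(sample_counts.values())
    weights = {}
    for stratum, count in sample_counts.items():
        if count == 0:
            continue  # an empty stratum cannot be reweighted, only flagged
        sample_share = count / total
        weights[stratum] = population_shares[stratum] / sample_share
    return weights
```

The empty-stratum case is the honest edge: no weight can conjure a missing group, which is why non-response has to surface as an anomaly rather than be silently averaged over.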
The output is not a deck of charts. It’s a system that lets the people who own the policy questions interrogate the data themselves, with the construct definitions and the methodological assumptions visible in line.
What changed
The visible outcome is 1M+ data points that are actually defensible — that hold up to a psychometrician reviewing the instrument, a methodologist reviewing the pipeline, and a policy lead reviewing the conclusion. The less visible, more durable outcome is that the infrastructure is reusable. ADEK no longer runs surveys as one-off projects with a fresh vendor each cycle. The capability lives inside the agency: instruments, pipelines, analysis patterns, and the institutional memory of why each was designed the way it was. That is the difference between a survey and a research capability, and it’s the part that compounds across cycles.