Employer interview questions to ask candidates: a structured playbook
Behavioural prompts, anchored scorecards, and a panel that scores the same way - the structure that makes employer interview questions predict performance.
Why most employer interview questions miss
The trouble with most employer interview questions is not the questions. It is the lack of structure around them. A clever question asked three different ways by three interviewers, and scored on three private hunches, predicts performance far less reliably than the panel assumes. Most hiring panels notice this only after the fifth disappointing first quarter, by which point the candidate is a colleague and the panel is a problem.
The numbers are unkind. Wiesner and Cronshaw (1988), pulling together 150 validity coefficients across more than 50,000 interviews, found structured interviews reached a corrected mean validity of r = 0.62 against r = 0.31 for unstructured ones. Structure roughly doubles predictive validity, and the residual variance for unstructured interviews is almost entirely explained by statistical artefacts - which leaves very little for "interviewer skill" to rescue.
McDaniel and colleagues (1994) reached the same conclusion at much greater scale, working from 245 coefficients and 86,311 individuals. Structured beats unstructured across content types, and the gap holds whether the questions are situational, job-related, or behavioural. The familiar reassurance - "we just know a good one when we see them" - sits badly with those numbers, largely because the unstructured interview conflates two very different questions ("would I work with them?" and "would they do the job?") and nothing in the format checks the habit.
The rest of this article is the structure that fixes it: what the format actually requires, which question types earn their keep, how to build a scorecard, and how to install the lot inside a small business without a consultancy invoice.
What "structured interview" actually means
A structured interview is not a script. It is the same script, asked in the same order, scored the same way, by interviewers trained the same way, every time. Standardisation runs across four things at once: the questions you ask, the order you ask them in, the scale you score them on, and the way the panel is briefed and trained. Drop one of those and the structure leaks.
Campion, Palmer and Campion (1997) catalogue 15 distinct components of a structured interview, split between content and evaluation. Content covers the question set itself: questions drawn from a job analysis, identical wording across candidates, no improvised follow-ups that send one candidate down a corridor the next candidate never sees. Evaluation covers what happens once an answer arrives: anchored rating scales, written notes, multiple interviewers, rater training. Most teams who say "we use a structured interview" mean the first half. The second half is where validity actually comes from.

Levashina et al. (2013) pulled the receipts on this. In their review of field studies, the average "structured" interview used only about six of the 15 components - anchored rating scales, identical questioning, and rater training were the three most often retained. Useful, then, as a minimum. The reader recipe falls out of this neatly enough: pick five to seven competencies, write two main questions per competency, score each answer on an anchored 1-5 scale, and put every interviewer through the same short training before they ever meet a candidate. Four moving parts, all of which a small team can install without a consultant.
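The recipe above can be sketched as a small checklist in code. This is a minimal illustration, not a prescribed tool: the class names, the `check_plan` function, the interviewer names, and two of the competency names ("prioritisation" and "collaboration") are hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Competency:
    name: str
    questions: list[str]  # the main questions asked of every candidate


@dataclass
class InterviewPlan:
    competencies: list[Competency]
    panel: list[str]  # everyone who will meet candidates
    trained: set[str] = field(default_factory=set)  # who has finished the training
    scale: tuple[int, int] = (1, 5)  # lowest and highest anchored score


def check_plan(plan: InterviewPlan) -> list[str]:
    """Return a list of problems; an empty list means the plan follows the recipe."""
    problems = []
    if not 5 <= len(plan.competencies) <= 7:
        problems.append(f"expected 5-7 competencies, found {len(plan.competencies)}")
    for competency in plan.competencies:
        if len(competency.questions) != 2:
            problems.append(
                f"{competency.name}: expected 2 main questions, "
                f"found {len(competency.questions)}"
            )
    if plan.scale != (1, 5):
        problems.append("scores should use the anchored 1-5 scale")
    for interviewer in plan.panel:
        if interviewer not in plan.trained:
            problems.append(f"{interviewer} has not completed interviewer training")
    return problems


# Hypothetical plan: competency names, prompts, and interviewers are placeholders.
names = ["ownership", "judgement", "communication", "prioritisation", "collaboration"]
plan = InterviewPlan(
    competencies=[
        Competency(n, [f"{n}: behavioural prompt", f"{n}: situational prompt"])
        for n in names
    ],
    panel=["Asha", "Ben"],
    trained={"Asha"},
)
for problem in check_plan(plan):
    print(problem)
```

In this example the only problem reported is that one panel member has not yet been trained, which is the kind of gap worth catching before the first candidate arrives.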
The question types worth asking
Once the structure is in place, the question types do the heavy lifting. Two earn their seat at the table: behavioural and situational. A behavioural interview asks the candidate to describe a real moment from their past that resembles the demands of the role - the working assumption being that what someone has actually done is the cleanest available signal for what they will do next. Situational questions invert the angle and put a plausible hypothetical on the table instead. Both are structured and job-related, and there is a sensible case for combining them in the rotation rather than choosing between them.
The numbers back the choice. McDaniel et al. (1994), pooling 245 coefficients across 86,311 individuals, put situational interviews at r = 0.50 and job-related interviews (which include behavioural questions) at r = 0.39. Both clear unstructured formats by a comfortable margin, and both belong in any sensible bank of recruiting questions for interview panels who want their decisions to mean something a fortnight later.

Worked examples sharpen the difference. A behavioural prompt sounds like: "Tell me about a time you missed a deadline you had committed to. What did you do next?" A situational prompt for the same competency sounds like: "A direct report misses a deadline the day before launch. What is your first move?" The first probes ownership through evidence; the second probes judgement under fresh pressure. For experienced candidates, including those interviewing for a role that is a clear step up, behavioural prompts tend to produce the more decision-useful answers, because there is more past to interrogate.
This is not just laboratory work. McClelland (1998) reported that a food and beverage company nicknamed "Tastyfood" replaced its traditional hiring with a Behavioural Event Interview-driven competency algorithm and watched executive turnover fall from 49% to 6.3%, an estimated saving of 3.5 million dollars. One case is one case, but it points the same way as the meta-analyses: ask candidates to describe what they have actually done, score it against anchored criteria, and the decisions start holding up.
Building the scorecard
A scorecard is the bridge from questions to decisions. Without it, two interviewers can hear the same answer and walk into the debrief with completely different impressions, both of them sincere. The scorecard is what stops that. One row per competency, one column per interviewer, an anchored 1-5 scale, and a small box of written evidence beside each rating. Nothing fancy; the discipline is the whole point.
Anchors do most of the work. Each level - what a 1, a 3, and a 5 actually look like - is described in plain English so two reasonably observant adults grade the same answer the same way. Levashina et al. (2013) classified anchored rating scales firmly inside the evaluation step of a structured interview rather than question design, and the distinction matters: questions test the candidate, scoring tests the panel. Keep the two stages separate or you end up rewriting the rubric to fit whoever you liked best.
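The scorecard described above can be sketched in a few lines of code, assuming nothing more than a list of ratings. Everything named here is illustrative: the anchor wording, the `Rating` class, the interviewer names, and the rule that a gap of more than one point between interviewers deserves discussion in the debrief are assumptions, not part of any cited framework.

```python
from dataclasses import dataclass

# Hypothetical plain-English anchors for one competency: what a 1, 3, and 5 look like.
ANCHORS = {
    "ownership": {
        1: "Blames others; describes no corrective action.",
        3: "Accepts responsibility and fixes the immediate problem.",
        5: "Accepts responsibility, tells stakeholders early, changes the process.",
    }
}


@dataclass(frozen=True)
class Rating:
    competency: str
    interviewer: str
    score: int
    evidence: str  # what the candidate actually said or did

    def __post_init__(self):
        if not 1 <= self.score <= 5:
            raise ValueError("score must be on the 1-5 scale")
        if not self.evidence.strip():
            raise ValueError("every rating needs written evidence")


def build_scorecard(ratings):
    """Return {competency: {interviewer: score}}: one row per competency,
    one column per interviewer."""
    card = {}
    for rating in ratings:
        card.setdefault(rating.competency, {})[rating.interviewer] = rating.score
    return card


def rows_to_discuss(card, max_spread=1):
    """List competencies where interviewers differ by more than max_spread points
    (the threshold is an assumption chosen for this sketch)."""
    return [
        competency
        for competency, scores in card.items()
        if max(scores.values()) - min(scores.values()) > max_spread
    ]


ratings = [
    Rating("ownership", "Asha", 4, "Told the client about the missed deadline the same day."),
    Rating("ownership", "Ben", 2, "Described the delay as the supplier's fault."),
    Rating("judgement", "Asha", 3, "Chose to delay launch; reasoning was partly explained."),
    Rating("judgement", "Ben", 3, "Weighed two options but did not consult the team."),
]
card = build_scorecard(ratings)
print(rows_to_discuss(card))
```

Refusing a rating without evidence is the design choice that matters here: it forces the written note that lets two interviewers compare what they heard rather than how they felt.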

Anchored scoring also dampens the demographic noise that unstructured panels quietly generate. Bragger et al. (2002) showed that bias against pregnant candidates shrank under structured conditions compared with unstructured ones. The mechanism is the same one Campion, Palmer and Campion (1997) describe in their 15-component framework: when the rating scale is fixed and the evidence is written down, the room has less space to drift toward whoever felt most familiar.
None of this needs a software purchase. Greenhouse's interview-questions piece treats templates as a baseline that small teams can run in shared docs, and that is honest advice. A shared spreadsheet with anchored descriptors and an evidence column is enough to start. Better tooling can come later; the scorecard is the practice.
Installing structured hiring without consultants
By this point the picture is reasonably settled. Structured interviews predict performance, behavioural and situational questions earn their keep, and an anchored scorecard keeps a panel honest. The remaining question is who builds it. For most small businesses and scale-ups, the honest answer is: you do, with the right kit. Hiring transformation has historically been sold as a consultancy engagement, partly because it is genuinely involved and partly because the people selling it preferred it that way. It does not have to be.
HireSchool is a self-guided digital programme called the Structured Hiring Method. Customers buy access and install it themselves, supported by video content and a learning management system that lets them onboard the team and track everyone's progress through the curriculum. The underlying standard is First Past the Post: the bar is set in advance, the decision follows, and strong candidates do not drift while the panel keeps options open in case someone better turns up.
The components that map most cleanly onto this article are the ones a reader will actually use the morning after they finish reading. Behavioural interviewing training covers the technique itself, built on the principle that past behaviour in similar situations is the best available predictor of future behaviour - the same idea that has held up across the meta-analyses cited earlier. Codified Performance Assessment is the anchored scorecard: consistent criteria, written evidence, and the same yardstick across interviewers, so two reasonable people in the same panel produce ratings that are at least talking about the same thing. Role Hiring Process Flow standardises the steps, so every candidate and every hiring manager knows the shape of the process before it starts, and Hiring Manager Blueprint gives the lead interviewer a structure for planning each loop and coordinating the panel. Decision Management is the codified protocol for reaching a defensible decision the panel can still explain the next morning. For businesses that can support it, an optional Quality Assurance module sets up the kind of independent interviewer function Google and Amazon run - useful, but not the right starting point for everyone.
HireSchool is not a recruiting agency, not an applicant tracking system, and not a one-off course. There are no HireSchool consultants sitting inside your hiring process. The method is the product; your team applies it.
If the case made above is one you would rather act on than admire, the next step is to look at the curriculum directly. You can explore the Structured Hiring Method programme and see how the modules, the LMS, and the templates fit together - the same kit that turns the principles in this article into your team's standard.
Handling the "our roles are too unique" objection
"We can't structure interviews - our roles are too unique" is the most common reason small teams give for keeping the process loose. It is also, on the evidence, the wrong conclusion drawn from a fair observation. Unusual roles are real. The leap from "this role is unusual" to "therefore standardised questions and anchored scoring won't work" is the bit that doesn't survive contact with the data.
Wiesner and Cronshaw (1988) ran the relevant comparison directly. Structured interviews built on a formal job analysis reached a corrected validity of r = 0.87, against r = 0.59 for "armchair" structured interviews where someone had simply guessed the questions worth asking. The uniqueness of a role is an argument for more job analysis, not less structure. The harder the role is to define, the more the interview needs to be pinned to whatever has been defined.
Concede the contrarian point, though. Structure raises the floor, but it is not the ceiling: work samples and ability tests still tend to predict performance more strongly, and there is a fair case that unstructured interviews are barely better than chance. Most small businesses cannot run hour-long work samples for every candidate, though, which makes structured interviews the highest-leverage upgrade most can actually install.
The practical answer for unique roles is to keep five to seven competencies fixed across the company, and swap the role-specific behavioural prompts beneath them. Ownership, judgement, and communication look much the same whether the hire is a logistics lead or a research engineer. The structure stays; the surface adapts.
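As a rough sketch of that arrangement, the competencies can live in one fixed list while the prompts sit in a per-role lookup. The prompt wording and the function name below are hypothetical, and only the three competencies named above are shown; a real set would hold five to seven.

```python
# Company-wide competencies stay fixed, in a fixed order.
FIXED_COMPETENCIES = ("ownership", "judgement", "communication")

# Role-specific behavioural prompts (hypothetical wording), keyed by role, then competency.
ROLE_PROMPTS = {
    "logistics lead": {
        "ownership": "Tell me about a time a shipment you were responsible for went wrong.",
        "judgement": "Tell me about a time you had to trade cost against a delivery date.",
        "communication": "Tell me about a time you had to give a customer bad news.",
    },
    "research engineer": {
        "ownership": "Tell me about a time an experiment you ran produced a wrong result.",
        "judgement": "Tell me about a time you chose to stop pursuing an approach.",
        "communication": "Tell me about a time you explained a finding to a non-specialist.",
    },
}


def interview_questions(role: str) -> list[tuple[str, str]]:
    """Return (competency, prompt) pairs in the fixed company order.

    Raises ValueError if the role is missing a prompt for any fixed competency,
    so a new role cannot quietly skip one.
    """
    prompts = ROLE_PROMPTS[role]
    missing = [c for c in FIXED_COMPETENCIES if c not in prompts]
    if missing:
        raise ValueError(f"{role} has no prompt for: {', '.join(missing)}")
    return [(c, prompts[c]) for c in FIXED_COMPETENCIES]


for competency, prompt in interview_questions("research engineer"):
    print(f"{competency}: {prompt}")
```

Adding a new role then means writing prompts for it, not renegotiating what the company scores.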
Red flags as signals to test, not vibes to react to
The phrase "red flag" does a lot of quiet damage in interviews. It is often a feeling that travels around the panel over the course of the day, picks up agreement on the way, and ends with a candidate quietly cut for reasons no one can describe the next morning. The more useful move is treating interview red flags as signals worth testing rather than vibes worth acting on. A flag is a hypothesis. Hypotheses get one more well-designed question, not a quiet veto.
Take the candidate who keeps shifting blame onto former managers. That is a hypothesis about ownership, and ownership is testable. Ask for a time the candidate held the bag for a decision that did not work. Listen for what they did next, who they told, and what they changed. The follow-up is the work; the impression on its own is not.

Tie each flag back to the scorecard. Every concern becomes a row to score with anchored evidence, not a feeling shared in the corridor afterwards. That is the point of the whole exercise. Structured employer interview questions, behavioural prompts, and an anchored scorecard turn the interview from theatre into evidence, and that is something a small team can install itself.