Skip to main content
QATraining
Back to curriculum
Chapter 8 of 10

Robustness, Security, Adversarial Testing, and AI-Specific Threats

Teach practical testing for AI-specific robustness and security threats, including poisoning, evasion, extraction, prompt injection, and confidentiality attacks.

45 min guide5 reference questions folded into the guide material
Guided briefing

Robustness, Security, Adversarial Testing, and AI-Specific Threats video briefing

A focused explanation of chapter 8, turning the AI testing theory into concrete validation checks.

Briefing focus

Module opening

This is a structured lesson briefing. Real video/audio can be added later as a media source.

Estimated time

9 min

  1. 1Module opening
  2. 2Learning objectives
  3. 3Mind map
  4. 4Scenario evidence breakdown

Transcript brief

Teach practical testing for AI-specific robustness and security threats, including poisoning, evasion, extraction, prompt injection, and confidentiality attacks. The briefing explains why the topic matters, walks through a failure scenario, and identifies the artefacts a tester should produce for evidence and auditability.

Key takeaways

  • Connect the AI risk to a measurable test or monitor.
  • Document the evidence needed for reproducibility and audit.
  • Use the lab or scenario to practise the validation workflow.

Module opening

Teach practical testing for AI-specific robustness and security threats, including poisoning, evasion, extraction, prompt injection, and confidentiality attacks.

Audience. QA, security-minded testers, and test leads responsible for resilient AI systems.

Why this matters. AI systems introduce new attack surfaces and fragility. Testers need to think about accidental variation and deliberate abuse.

ISTQB CT-AI mapping. CT-AI 7.6, 9.1, 10.1

Trainer note

Start with the scenario before the theory. Ask learners what evidence would make them confident, then use the module to build that evidence step by step.

Learning objectives

  • Explain the core quality risk in robustness, security, adversarial testing, and ai-specific threats.
  • Select practical test evidence that supports an AI release decision.
  • Apply the module concepts to a realistic QA scenario.
  • Produce a portfolio artifact that can be reused in a professional AI testing context.

Mind map

Robustness, Security, Adversarial Testing, and AI-Specific Threats mind map

Real-life scenario · Logistics automation

The image classifier fooled by a small sticker

Situation. A vision model classifies parcel labels and handling requirements. A small visual perturbation caused fragile predictions for hazardous handling labels.

Lesson. AI testing is strongest when risks, examples, evidence, and release decisions are connected.

Scenario evidence breakdown

Scenario elementDetail
Product/SystemParcel sorting system
AI featureA vision model classifies parcel labels and handling requirements.
Failure or riskA small visual perturbation caused fragile predictions for hazardous handling labels.
Testing challengeStandard clean-image accuracy did not reveal robustness under realistic noise, damage, camera angle, or adversarial input.
Tester responseThe tester built a threat model, perturbation suite, attack success metric, fallback workflow, and monitoring for abnormal confidence patterns.
Evidence requiredThreat model, robustness test report, adversarial examples, mitigation backlog, and incident playbook.
Business decisionApprove only for lanes where fallback scanning and manual review controls reduce harm.

Visual flow

Robustness, Security, Adversarial Testing, and AI-Specific Threats scenario flow

Learning path

  1. Start Here

    5 min

    Outcome, CT-AI exam relevance, and the parcel classifier scenario.

  2. Learn

    24 min

    Robustness, threat modelling, evasion, poisoning, prompt injection, and resilience controls.

  3. See It

    10 min

    Attack success and fallback evidence for hazardous labels.

  4. Try It

    18 min

    Build a threat model and robustness report.

  5. Recall and Apply

    10 min

    Exam traps, active recall, and the portfolio artifact.

Robustness is release evidence

Robustness testing checks whether AI behaviour remains acceptable under realistic variation and plausible abuse, not just clean lab inputs.

Example

A small sticker or damaged label caused fragile predictions for hazardous parcel handling.

Mistake

Reporting clean-image accuracy without perturbation, abuse-case, fallback, or residual-risk evidence.

Evidence

Threat model, perturbation suite, attack success rate, fallback workflow, control matrix, monitoring alerts, and incident playbook.

Worked example: Limiting release after robustness failure

Scenario. A parcel classifier works on clean images but misclassifies hazardous labels after realistic smudges, stickers, and camera-angle changes.

Reasoning. The risk is high-impact and operational. Release can only be considered where fallback scanning, confidence thresholds, and manual review reduce harm.

Model answer. Approve only for constrained lanes with tested fallback controls; block broader release until robustness thresholds and mitigation evidence pass.

Try it: Build the threat model and robustness report

Prompt. Use the parcel classifier scenario to define threats, perturbations, controls, and release conditions.

Learner action. Name attacker or variation source, target asset, access path, test cases, success metric, mitigation, monitoring, owner, and residual risk.

Expected output. `ai-threat-model-and-robustness-report.md` with abuse cases, robustness results, controls, incident playbook, and release recommendation.

Exam trap

Objective

CT-AI 7.6, 9.1, 10.1

Common trap

Treating robustness and security as separate from model quality or data lifecycle.

Wording clue

Look for answers that link attack path, test evidence, mitigation, residual risk, and release action.

Portfolio checkpoint

Create the module portfolio deliverable and use it to support your release decision.

Artifact structure

ai-threat-model-and-robustness-report.md

ContextThreatsPerturbationsAttack resultsControlsMonitoringResidual riskRecommendation

Recall check

What is an evasion attack?
Manipulating runtime inputs to cause an incorrect model output.
Why test realistic perturbations?
Clean lab inputs can hide failures under noise, damage, spelling errors, lighting, or angle changes.
What makes a robustness finding release-relevant?
It has severity, attack success or degradation evidence, mitigation, owner, and release action.
What portfolio artifact does this module produce?
ai-threat-model-and-robustness-report.md, a threat and robustness evidence report.

Topic-by-topic teaching guide

1. Robustness

Robustness is stable behaviour under expected variation such as noise, missing fields, spelling errors, or lighting changes.

Teaching lensPractical detail
Real QA exampleA support classifier should still understand common typos and formatting differences.
What can go wrongTesting only perfect lab inputs.
How a tester should thinkPerturb realistic inputs and measure stability.
Evidence to collectRobustness suite and degradation thresholds.

2. Threat Modelling

AI threat modelling names the attacker, goal, access, target, and control points.

Teaching lensPractical detail
Real QA exampleA competitor may query an API to infer model behaviour, while a user may attempt prompt injection.
What can go wrongListing threats without capability or testable scenario.
How a tester should thinkTurn threats into executable tests and controls.
Evidence to collectThreat model and abuse case catalogue.

3. Evasion and Poisoning

Evasion manipulates inputs at inference time; poisoning corrupts training data or feedback loops.

Teaching lensPractical detail
Real QA exampleA malicious review campaign can shift a recommendation system if feedback is trusted blindly.
What can go wrongTreating security as unrelated to data and model lifecycle.
How a tester should thinkTest both runtime and training-time attack paths.
Evidence to collectAttack simulation report and data control checks.

4. Prompt Injection and LLM Threats

LLM applications can be attacked through user text, retrieved documents, tool outputs, or hidden instructions.

Teaching lensPractical detail
Real QA exampleA retrieved page tells the assistant to ignore policy and reveal private data.
What can go wrongOnly testing friendly prompts.
How a tester should thinkRed-team instructions, retrieval content, and tool boundaries.
Evidence to collectPrompt injection tests and tool permission evidence.

5. Resilience Controls

Controls include validation, rate limits, human fallback, monitoring, isolation, and rollback.

Teaching lensPractical detail
Real QA exampleLow-confidence hazardous parcel predictions go to manual review.
What can go wrongFinding vulnerabilities without defining release action.
How a tester should thinkLink each risk to a control and residual risk decision.
Evidence to collectControl matrix, monitoring alerts, and incident playbook.

Practical QA workflow

  • Start from the user or business decision affected by the AI system.
  • Name the AI asset under test: data, feature pipeline, model, prompt, retrieval index, tool, or full workflow.
  • Convert the main risk into observable quality signals and release gates.
  • Choose the right oracle: deterministic assertion, metric threshold, metamorphic relation, reviewer rubric, comparison, or production monitor.
  • Test important slices, edge cases, misuse cases, and change scenarios.
  • Record versions, data sources, thresholds, reviewer notes, and decision rationale.

Test design checklist

  • What harm could happen if this AI behaviour is wrong?
  • Which users, groups, products, regions, or workflows need separate evidence?
  • Which metric or observation would reveal the failure early?
  • What is the minimum evidence needed for release, shadow mode, rollback, or rejection?
  • Who owns the evidence after the model, prompt, or data changes?

Worked QA example

A tester receives a release request for the module scenario. Instead of asking only whether tests pass, the tester writes three release questions: what changed, who could be harmed, and what evidence proves the change is controlled. The answer becomes a small evidence pack: one risk table, one set of representative examples, one automated or reviewable check, and one release recommendation.

Common mistakes

  • Treating AI output as a normal deterministic response when the real risk is behavioural.
  • Reporting one impressive metric without slices, uncertainty, or business context.
  • Forgetting that data, prompts, model versions, and monitoring are part of the test surface.
  • Writing governance language that cannot be checked by a tester.

Guided exercise

Use the scenario above and create a one-page evidence plan. Include the decision being influenced, the main risk, the test oracle, the data or examples required, the release gate, and the owner.

Discussion prompt

Which is more likely in your domain: accidental messy input, malicious input, poisoned feedback, or prompt injection?

Hands-on lab mapping

  • Lab: CourseMaterials/AI-Testing/labs/05_adversarial_attacks_art.ipynb
  • Task: Run a simple adversarial robustness experiment and document attack success and mitigation options.
  • Why this lab matters: it turns the module theory into visible evidence that a release approver can inspect.

Decision simulation

Robustness drops sharply under realistic noisy inputs. Decide whether to release with fallback controls or block for model improvement.

Key terms

  • Evasion attack: Manipulating inputs against a deployed model.
  • Data poisoning: Corrupting training data or feedback to influence future behaviour.
  • Prompt injection: Text designed to override or bypass intended LLM instructions.
  • Attack success rate: Proportion of attack attempts that achieve the attacker goal.

Revision prompts

  • Explain the module scenario in two minutes to a product owner.
  • Name three pieces of evidence you would require before release.
  • Identify one automated check and one human-review check.
  • Describe how this topic changes after deployment.