According to EU GMP Annex 1 (2022 revision), when performing Aseptic Process Simulation (media fills) for conventional aseptic filling, what is the maximum acceptable number of contaminated units when filling fewer than 5,000 units?
Knowledge QA, EU Annex 1, hard
Cross-Model Comparison
| Model | Score | Latency | Tokens In | Tokens Out |
|---|---|---|---|---|
| Gemini 3.1 Flash-Lite | 100.0% | 1.8s | 105 | 163 |
| GPT-5.4 mini | 100.0% | 0.959s | 106 | 86 |
| DeepSeek-V3.2 | 100.0% | 27.7s | 106 | 280 |
| DeepSeek-R1 | 100.0% | 11.4s | 112 | 368 |
| Mistral Large 3 675B | 100.0% | 5.1s | 110 | 298 |
| Gemini 3 Flash | 100.0% | 2.8s | 104 | 252 |
| GPT-5.4 | 100.0% | 1.7s | 106 | 75 |
| Claude Haiku 4.5 | 100.0% | 3.7s | 121 | 323 |
| Claude Sonnet 4.6 | 100.0% | 8.9s | 121 | 356 |
| Claude Opus 4.6 | 100.0% | 12.1s | 121 | 387 |
| Llama 3.3 70B Instruct | 100.0% | 2.1s | 113 | 154 |
| Llama 4 Maverick | 100.0% | 4.4s | 112 | 473 |
| DeepSeek-R1-Distill-Qwen-32B | 100.0% | 30.1s | 109 | 636 |
| Mistral Small 2603 | 100.0% | 1.6s | 122 | 232 |
| GPT-5.4 nano | 0.0% | 1.0s | 106 | 94 |
| Qwen3.5-35B-A3B | 0.0% | 74.3s | 116 | 10,326 |
| Qwen3.5-397B-A17B | 0.0% | 145.1s | 116 | 8,754 |
| Gemini 3.1 Pro | 0.0% | 23.1s | 104 | 1,361 |
| Llama 4 Scout | 0.0% | 2.8s | 111 | 223 |
Tags
media_fills, aseptic_process_simulation, contamination