LLMShield Text Inference

Test the DistilBERT intent classifier: type a prompt or select a sample, then click Classify to have the model label it as safe or harmful.

LLMShield: Layered Adversarial Defense for LLMs

Enoch Kwateh Dongbo · Student ID: 202324100003

Model: DistilBERT (66M) · ONNX

Sample prompts: Benign · Harmful (direct) · Adversarial bypass · Medical adversarial

Enter a prompt and click Classify to see results

The DistilBERT intent classifier runs server-side via ONNX Runtime.
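As a rough illustration of what the server-side step might look like, the sketch below turns a prompt into a safe/harmful label. It is not the project's actual code. The label order in `LABELS`, the input names `input_ids` and `attention_mask`, and the helper names `label_from_logits` and `classify` are all assumptions. `session` is expected to behave like an `onnxruntime.InferenceSession` and `tokenizer` like a Hugging Face tokenizer; both are passed in so the sketch does not depend on a particular model file.

```python
import math
from typing import Sequence, Tuple

# Assumed label order; the real model's id-to-label mapping is not stated on this page.
LABELS = ("safe", "harmful")


def softmax(logits: Sequence[float]) -> list:
    """Numerically stable softmax over a flat sequence of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_from_logits(logits: Sequence[float],
                      labels: Sequence[str] = LABELS) -> Tuple[str, float]:
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def classify(text: str, session, tokenizer) -> Tuple[str, float]:
    """Classify one prompt with an ONNX Runtime-style session.

    The feed names below are assumptions about how the model was exported.
    """
    enc = tokenizer(text, return_tensors="np", truncation=True)
    feeds = {
        "input_ids": enc["input_ids"],
        "attention_mask": enc["attention_mask"],
    }
    # session.run(None, feeds) returns every model output; take the first
    # output (logits) and the first row of the batch.
    logits = session.run(None, feeds)[0][0]
    return label_from_logits([float(x) for x in logits])
```

In a real deployment the session would typically be created once at startup (for example with `onnxruntime.InferenceSession` pointed at the exported model file) and reused across requests, since loading the model is far more expensive than a single inference call.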