Unclassified

Instructions

  • This app evaluates an LLM's ability to produce correct pilot jargon
  • The brevity codes are from SD Brevity 2025 (approved for public release)
  • The model is Claude Haiku 4.5 with brevity codes in its context window
  • Use this app in one of two ways:
    • Either provide a plain text example scenario to the model (without brevity codes), then rate the model's ability to produce a correct brevity code response
    • Or provide an example scenario written exclusively in brevity codes, then rate the model's ability to produce a correct plain text description
  • Rate the model's response from 1 to 5 stars and, in the feedback box, explain what the correct response should be
  • Do not enter CUI data
  • Click Record Data to save your evaluation to the SQL database for analysis
Rate the LLM's output:
Unclassified