Formula used
Each criterion is rated from 0 to 5 and scaled by its weight: the final score is the sum over criteria of (rating ÷ 5 × weight), which yields a result from 0 to 100 because the weights sum to 100.
- Ratings: your slider values per criterion.
- Weights: importance multipliers that sum to 100.
- Tier: qualitative label based on the final score.
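The arithmetic above can be sketched in a few lines. This is a minimal illustration, assuming only the stated rules (0–5 ratings, weights summing to 100); the criterion names and weight values are hypothetical, not the calculator's actual configuration:

```python
# Sketch of the completeness score: each 0-5 rating contributes its
# weight's share, so a prompt rated 5 on every criterion scores 100.
# Criterion names and weights below are illustrative only.
def completeness_score(ratings: dict[str, float], weights: dict[str, float]) -> float:
    assert abs(sum(weights.values()) - 100) < 1e-9, "weights must sum to 100"
    return sum(ratings[c] / 5 * weights[c] for c in weights)

ratings = {"objective": 5, "format": 4, "context": 3}
weights = {"objective": 50, "format": 30, "context": 20}
print(round(completeness_score(ratings, weights), 1))  # 86.0
```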
How to use this calculator
- Paste your prompt into the text box.
- Optionally click Auto-fill from prompt to set sliders.
- Review sliders and adjust any criterion you care about.
- Press Calculate Score to see results above the form.
- Download CSV for tracking, or PDF for sharing.
Example data table
| Prompt snippet | What is specified | Expected score range |
|---|---|---|
| “Summarize this article.” | Goal only; missing format, constraints, and audience. | 15–35 |
| “Summarize for executives in five bullets, max 90 words.” | Goal + audience + format + constraint; still lacks evaluation and examples. | 55–75 |
| “Act as an analyst. Use the table below. Output JSON with fields… Include edge cases…” | Role, inputs, structured output, constraints, and exceptions are defined. | 80–95 |
Professional notes
Why prompt completeness matters
Incomplete prompts increase rework because models must guess missing goals, context, or output structure. A completeness score provides a repeatable signal of specification quality, helping teams compare prompts across use cases. Higher completeness typically reduces hallucination risk, improves consistency, and shortens iteration cycles by making expectations explicit.
Core dimensions behind the score
This calculator rates ten criteria from 0–5, then applies weights that sum to 100. Objective clarity and output format carry higher influence because they anchor what to do and how to return it. Context, constraints, and inputs/data ensure the model has the necessary facts and boundaries. Examples, tone, audience, evaluation criteria, and edge cases refine behavior in realistic situations.
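One way to picture the weighting scheme is a simple table. The ten criterion names come from the paragraph above; the weight values are illustrative assumptions that merely follow the stated emphasis (objective clarity and output format weighted highest) and sum to 100:

```python
# Hypothetical weight split across the calculator's ten criteria.
# Values are assumptions, not the tool's real weights; they only
# honor the constraint that all weights sum to 100.
EXAMPLE_WEIGHTS = {
    "objective_clarity": 18,
    "output_format": 15,
    "context": 12,
    "constraints": 11,
    "inputs_data": 11,
    "evaluation_criteria": 7,
    "examples": 8,
    "tone": 6,
    "audience": 6,
    "edge_cases": 6,
}
assert sum(EXAMPLE_WEIGHTS.values()) == 100
```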
Interpreting tiers and tradeoffs
Scores near 85–100 indicate strong specification with minimal ambiguity, while 70–84 suggests a usable prompt missing a few tightening details. Mid‑range scores often show gaps in constraints or evaluation criteria, leading to variable outputs. Very low scores usually lack a clear deliverable or format. Remember that “complete” is not “long”; concise prompts can score well when structured.
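The tier bands described above map naturally to a small lookup. The two upper cutoffs (85 and 70) come from the text; the lower bands and all label strings are illustrative assumptions:

```python
# Map a 0-100 completeness score to a qualitative tier.
# Cutoffs 85 and 70 follow the interpretation above; the 40 cutoff
# and the label wording are assumptions for illustration.
def tier(score: float) -> str:
    if score >= 85:
        return "strong specification"
    if score >= 70:
        return "usable, needs tightening"
    if score >= 40:
        return "notable gaps"
    return "missing deliverable or format"

print(tier(86))  # strong specification
print(tier(72))  # usable, needs tightening
```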
Improvement tactics that raise scores
Start by stating the task, the success definition, and the exact output schema. Add constraints such as length limits, prohibited content, and required citations or sources. Provide the essential inputs, including tables or assumptions, and name any tools the model may use. Include a small example pair when the format is complex. Finally, call out edge cases like missing values or conflicting requirements.
Operationalizing scoring in teams
Use the score as a pre‑review gate before prompts enter production. Track scores over time with the CSV export and attach PDF reports in reviews. Set minimum targets by prompt class, for example 70 for internal drafts and 80 for customer‑facing automation. When scores drop, inspect the “weak areas” list and update the prompt template to prevent regressions. In workshops, score several prompts against the same rubric to calibrate ratings. If auto-detect is enabled, keyword signals can gently boost ratings, but human review should confirm intent. In practice, pair scoring with A/B evaluation metrics such as task success rate, defect counts, and latency.
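A minimum-target gate over the CSV export can be sketched as below. The per-class targets (70 and 80) come from the text; the CSV column names and sample rows are hypothetical, since the export's exact layout is not specified here:

```python
# Flag prompt versions whose score falls below a per-class minimum.
# Targets follow the example in the text; column names are assumptions.
import csv
import io

MIN_TARGET = {"internal_draft": 70, "customer_facing": 80}

rows = csv.DictReader(io.StringIO(
    "prompt_id,class,version,score\n"
    "summarizer,customer_facing,3,78\n"
    "summarizer,customer_facing,4,83\n"
))
flagged = [
    f"{row['prompt_id']} v{row['version']} below target: {row['score']}"
    for row in rows
    if float(row["score"]) < MIN_TARGET[row["class"]]
]
print(flagged)  # ['summarizer v3 below target: 78']
```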
FAQs
1) What does a high completeness score indicate?
It indicates your prompt clearly defines the objective, provides sufficient context and constraints, and specifies the output format. Higher scores usually correlate with more consistent responses and fewer clarification questions from the model.
2) Should every prompt target a perfect score?
No. Some tasks benefit from exploration. Aim for the minimum completeness that produces stable results: clear goal and output, plus constraints that matter. Over-specifying can reduce creativity or add unnecessary maintenance.
3) How should I rate the Examples criterion?
Rate higher when you include at least one representative input and the exact expected output shape. Give extra credit when the example covers formatting details, edge conditions, or common mistakes the model should avoid.
4) My prompt is short. Can it still score well?
Yes. Brevity is fine if you include structure: a clear task statement, required output format, and key constraints. Short prompts often score lower only when they omit context or success criteria entirely.
5) Does auto-detect replace manual review?
No. Auto-detect only looks for keyword signals and provides gentle boosts. You should still verify intent, data requirements, and edge cases, especially for production workflows where small ambiguities can create large failures.
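The "gentle boost" behavior can be pictured with a small sketch. Everything here is an assumption for illustration (the keyword lists, the 0.5 boost size, the criteria shown); only the ideas that signals are keyword-based, boosts are gentle, and ratings stay on the 0–5 scale come from the text:

```python
# Minimal sketch of keyword-based auto-detect: signal words nudge a
# rating upward but stay capped at 5, so human review still decides intent.
# Keyword lists and boost size are assumptions, not the tool's actual rules.
SIGNALS = {
    "output_format": ["json", "bullet", "table", "markdown"],
    "constraints": ["max", "limit", "must not", "exactly"],
}

def boosted_rating(base: float, criterion: str, prompt: str) -> float:
    hits = sum(kw in prompt.lower() for kw in SIGNALS.get(criterion, []))
    return min(5.0, base + 0.5 * hits)  # gentle, capped boost

print(boosted_rating(2.0, "output_format", "Output JSON as a table."))  # 3.0
```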
6) How can I use the CSV and PDF exports effectively?
Use CSV to track scores across versions, owners, and use cases, then spot trends. Use PDF for reviews and approvals, because it packages the score, tier, ratings, and weak-area list into one shareable snapshot.