Every score is earned, not assumed. Here is exactly how each metric is calculated — every deduction is traceable to a specific browser action, screenshot, or bug found during your test run.
Measures whether each test step actually did something. A step that leaves the page unchanged is not a test — it's observation.
| Condition | Deduction |
|---|---|
| Before/after screenshots identical on an interactive step | −20 per step |
| Step marked PASS but only passive inspection occurred | −15 per step |
| Navigation action failed or timed out | −25 per step |
| Step description contains vague language | −10 per step |
**Full score:** Clicked "Sign Up", page navigated to /dashboard, screenshots differ, URL changed.

**Low score:** 2 of 4 steps had identical screenshots. Nothing visibly changed after interaction.
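The deduction rules above can be sketched as simple subtractions from a starting score of 100. The step fields below (`screenshots_identical`, `navigation_failed`, and so on) are illustrative assumptions, not the tool's actual data model:

```python
# Sketch of the interaction-score deductions, assuming a per-step dict.
def flow_score(steps):
    """Start at 100, subtract per-step deductions, floor at 0."""
    score = 100
    for step in steps:
        if step.get("interactive") and step.get("screenshots_identical"):
            score -= 20  # no visible change on an interactive step
        if step.get("status") == "PASS" and step.get("passive_only"):
            score -= 15  # marked PASS but only passive inspection occurred
        if step.get("navigation_failed"):
            score -= 25  # navigation action failed or timed out
        if step.get("vague_description"):
            score -= 10  # vague step description
    return max(score, 0)

# The low-score example above: 2 of 4 interactive steps left the page unchanged.
steps = [
    {"interactive": True, "screenshots_identical": True},
    {"interactive": True, "screenshots_identical": True},
    {"interactive": True, "screenshots_identical": False},
    {"interactive": True, "screenshots_identical": False},
]
print(flow_score(steps))  # 100 - 2*20 = 60
```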
Counts real bugs found: broken forms, missing elements, crashes, 404s, and validation failures. A zero-bug run earns 100 only if the interactions were genuinely real.
| Condition | Deduction |
|---|---|
| Critical bug (broken form, 404, crash, hard fail) | −30 per bug |
| High bug (broken feature, missing required element) | −20 per bug |
| Medium bug (validation not working, missing feedback) | −15 per bug |
| Low/minor bug (UI glitch, slow response) | −10 per bug |
| Zero bugs but no real interactions occurred | Capped at 70 |
**Full score:** Zero bugs found. All interactions were real with confirmed assertions.

**Low score:** 3 critical bugs found: form crash, 404 on submit, missing required field.
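The severity deductions and the zero-bug cap can be sketched as follows. The severity labels and function shape are illustrative assumptions, not the tool's real API:

```python
# Hypothetical severity labels mapped to the deductions in the table above.
SEVERITY_DEDUCTION = {"critical": 30, "high": 20, "medium": 15, "low": 10}

def bug_score(bugs, real_interactions):
    """Subtract per-bug deductions; cap at 70 if zero bugs came from a passive run."""
    score = 100 - sum(SEVERITY_DEDUCTION[b] for b in bugs)
    if not bugs and not real_interactions:
        score = min(score, 70)  # zero bugs means little if nothing was exercised
    return max(score, 0)

print(bug_score(["critical", "critical", "critical"], True))  # 100 - 3*30 = 10
print(bug_score([], False))  # capped at 70: no bugs, but no real interactions
```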
Detects browser-level UI problems: console errors, elements that could not be found, and elements that were blocked or non-interactable.
| Condition | Deduction |
|---|---|
| Console error detected during step execution | −10 per error |
| Expected element not found on the page | −15 per element |
| Element found but not interactable (disabled, overlay) | −10 per element |
**Full score:** Zero console errors. All elements found and clickable on first attempt.

**Low score:** 3 console errors (−30), 1 missing element (−15). Score: 55.
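A minimal sketch of the UI-health arithmetic, with counts passed in directly (the parameter names are assumptions for illustration):

```python
def ui_health_score(console_errors, missing_elements, blocked_elements):
    """Apply the per-item deductions from the table above; floor at 0."""
    score = 100
    score -= 10 * console_errors     # console error during step execution
    score -= 15 * missing_elements   # expected element not found
    score -= 10 * blocked_elements   # found but disabled or covered by an overlay
    return max(score, 0)

# The example above: 3 console errors and 1 missing element.
print(ui_health_score(3, 1, 0))  # 100 - 30 - 15 = 55
```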
The percentage of steps that genuinely passed — not just steps that did not error out. Three conditions must all be true for a step to count as a real pass.
| A step is a genuine PASS only if ALL are true | Requirement |
|---|---|
| Real interaction occurred (click, fill, submit) | Required |
| Before/after screenshots are visually different | Required |
| Step status is PASS (no error returned) | Required |
**Full score:** All 5 steps: clicked element, page changed, status PASS. 5/5 genuine passes.

**Low score:** 4 steps all PASS but screenshots identical every time. Zero genuine passes.
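The three-condition check can be sketched as a filter over the step list. The field names are illustrative assumptions:

```python
def genuine_pass_rate(steps):
    """Percentage of steps where all three pass conditions hold."""
    genuine = [
        s for s in steps
        if s["real_interaction"]          # a click, fill, or submit happened
        and s["screenshots_differ"]       # the page visibly changed
        and s["status"] == "PASS"         # no error was returned
    ]
    return 100 * len(genuine) // len(steps) if steps else 0

# The zero-pass example above: 4 steps report PASS, but nothing ever changed.
steps = [{"real_interaction": True, "screenshots_differ": False,
          "status": "PASS"}] * 4
print(genuine_pass_rate(steps))  # 0: PASS status alone is not enough
```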
Tracks whether forms were fully exercised end-to-end: filled with real data, submitted, and the response confirmed. Skipped if the page has no forms.
| Condition | Score |
|---|---|
| Form filled + submitted + success/error confirmed | 100 |
| Submitted but response not checked | 70 |
| Filled but never submitted | 20 |
| Form page detected but never interacted with | 0 |
| No form on this page | N/A |
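The tiers above form a simple ladder; each level requires everything below it. The `form` dict and its keys are illustrative assumptions, not the tool's real schema:

```python
def form_reliability(form):
    """Map form-exercise depth to the tiered scores above."""
    if form is None:
        return None  # N/A: page has no form; this metric's weight is redistributed
    if not form["filled"]:
        return 0     # form page detected but never interacted with
    if not form["submitted"]:
        return 20    # filled but never submitted
    if not form["response_confirmed"]:
        return 70    # submitted, but the success/error response was not checked
    return 100       # filled, submitted, and response confirmed

print(form_reliability(
    {"filled": True, "submitted": True, "response_confirmed": False}
))  # 70
```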
Measures how smoothly the automation engine ran: timeouts, missing elements, fallback recoveries, and retries all indicate a harder-to-test page.
| Condition | Deduction |
|---|---|
| Step timed out waiting for element or page load | −20 per step |
| Target element could not be found on the page | −20 per step |
| Fallback recovery strategy was used instead of primary action | −15 per step |
| Retry was required to complete a step | −10 per retry |
**Full score:** Every element found on first try, no timeouts, no retries needed.

**Low score:** 2 timeouts (−40), 1 fallback used (−15), 1 retry needed (−10). Score: 35.
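The stability arithmetic reduces to counting each kind of recovery event (parameter names are assumptions for illustration):

```python
def stability_score(timeouts, not_found, fallbacks, retries):
    """Apply the per-event deductions from the table above; floor at 0."""
    score = (100
             - 20 * timeouts    # waited out an element or page load
             - 20 * not_found   # target element never located
             - 15 * fallbacks   # fallback strategy replaced the primary action
             - 10 * retries)    # step needed another attempt
    return max(score, 0)

# The example above: 2 timeouts, 1 fallback, 1 retry.
print(stability_score(2, 0, 1, 1))  # 100 - 40 - 15 - 10 = 35
```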
Each metric contributes a fixed percentage of the overall score. The weights reflect how much each metric matters for real-world website quality.
If Form Reliability is N/A (no form on the tested page), its 10% is redistributed equally — 5% to Flow Reliability and 5% to Success Rate.
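The redistribution rule can be sketched as below. Only Form Reliability's 10% and the 5%/5% split are taken from the text; every other weight is a placeholder assumption, as are the metric key names:

```python
def effective_weights(weights, form_is_na):
    """Drop Form Reliability when N/A and split its weight as described above."""
    w = dict(weights)
    if form_is_na:
        freed = w.pop("form_reliability")
        w["flow_reliability"] += freed / 2  # 5% to Flow Reliability
        w["success_rate"] += freed / 2      # 5% to Success Rate
    return w

weights = {
    "flow_reliability": 0.20,   # placeholder
    "bug_detection": 0.25,      # placeholder
    "ui_health": 0.15,          # placeholder
    "success_rate": 0.20,       # placeholder
    "form_reliability": 0.10,   # stated in the text
    "stability": 0.10,          # placeholder
}
print(effective_weights(weights, form_is_na=True))
```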
Every step had a real interaction, a confirmed assertion, and visually different screenshots. Zero deductions across all metrics.
Minor issues only. Most steps interacted genuinely. A few retries or low-severity bugs were found.
Several steps produced no visible change, or medium-severity bugs were found. The site works but has meaningful quality gaps.
Critical bugs, repeated timeouts, broken navigation, or the test never produced genuine interactions. Immediate attention needed.