fix: catch pydantic ValidationError in VectorStoreQueryOutputParser #20450

majiayu000 · 2026-01-05T13:51:35Z

Description

Wrap pydantic ValidationError as OutputParserException in VectorStoreQueryOutputParser.parse() to ensure consistent exception handling in VectorIndexAutoRetriever._parse_generated_spec.

Previously, when the LLM returned JSON that parsed successfully but failed Pydantic validation, an uncaught ValidationError was raised instead of the expected OutputParserException.

Fixes #19410
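For readers skimming the thread, the pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the code in the PR: `ValidationError`, `OutputParserException`, and `QuerySpec` below are hypothetical stand-ins for the pydantic and llama-index classes, and wrapping `json.JSONDecodeError` the same way is an assumption made only to keep the sketch self-contained.

```python
import json
from dataclasses import dataclass
from typing import Any


class ValidationError(ValueError):
    """Stand-in for pydantic's ValidationError (hypothetical, stdlib only)."""


class OutputParserException(Exception):
    """Stand-in for the library's OutputParserException."""


@dataclass
class QuerySpec:
    """Hypothetical, simplified query spec: a query string plus a filter list."""

    query: str
    filters: list

    @classmethod
    def validate(cls, data: Any) -> "QuerySpec":
        # Mimic schema validation: shape first, then required fields, then types.
        if not isinstance(data, dict):
            raise ValidationError("expected a JSON object")
        for name in ("query", "filters"):
            if name not in data:
                raise ValidationError(f"missing required field: {name}")
        if not isinstance(data["query"], str):
            raise ValidationError("query must be a string")
        if not isinstance(data["filters"], list):
            raise ValidationError("filters must be a list")
        return cls(query=data["query"], filters=data["filters"])


def parse(output: str) -> QuerySpec:
    """Parse LLM output, surfacing every failure as OutputParserException."""
    try:
        json_dict = json.loads(output)
    except json.JSONDecodeError as e:
        # Assumption for this sketch: JSON errors are wrapped the same way.
        raise OutputParserException(f"Failed to parse JSON. Error: {e}") from e
    try:
        return QuerySpec.validate(json_dict)
    except ValidationError as e:
        # Wrap the schema failure and keep the original error as __cause__.
        raise OutputParserException(
            f"Failed to validate query spec. Error: {e}. Got JSON dict: {json_dict}"
        ) from e
```

With this shape, a payload such as `{"filters": []}` (missing field) or `{"query": "cats", "filters": "nope"}` (wrong type) raises `OutputParserException`, and the underlying validation error stays reachable through `__cause__` because of `raise ... from e`.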

New Package?

No

Version Bump?

No

Type of Change

Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

I added new unit tests to cover this change

Added two test cases:

test_output_parser_invalid_schema_raises_output_parser_exception: Tests that missing required fields raise OutputParserException
test_output_parser_invalid_type_raises_output_parser_exception: Tests that invalid types raise OutputParserException

Suggested Checklist:

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I ran uv run make format; uv run make lint to appease the lint gods

AstraBert · 2026-01-05T15:23:42Z

...-index-core/llama_index/core/indices/vector_store/retrievers/auto_retriever/output_parser.py

+        except ValidationError as e:
+            raise OutputParserException(
+                f"Failed to validate query spec. Error: {e}. Got JSON dict: {json_dict}"
+            ) from e


Did you notice a better UX here by throwing the OutputParserException instead of the ValidationError that would be normally thrown without this extra step?
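One way to make the UX question concrete is to look at it from the caller's side. The sketch below assumes the caller catches only OutputParserException and falls back to a default spec; that behaviour is an assumption for illustration and is not shown in this diff. All names here (`parse_with_fallback`, `wrapping_parser`, `leaky_parser`) are hypothetical, and `ValueError` stands in for pydantic's ValidationError.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


class OutputParserException(Exception):
    """Stand-in for the library's OutputParserException."""


def parse_with_fallback(
    parse: Callable[[str], dict], output: str, query_str: str
) -> dict:
    """Hypothetical caller: use a default spec when the parser reports failure."""
    try:
        return parse(output)
    except OutputParserException:
        logger.warning("Failed to parse query spec, using defaults as fallback.")
        return {"query": query_str, "filters": [], "top_k": None}


def wrapping_parser(output: str) -> dict:
    # Simulates a parser that wraps schema failures, as this PR proposes.
    try:
        raise ValueError("schema validation failed")
    except ValueError as e:
        raise OutputParserException(
            f"Failed to validate query spec. Error: {e}"
        ) from e


def leaky_parser(output: str) -> dict:
    # Simulates a parser that lets the raw validation error escape.
    raise ValueError("schema validation failed")
```

Under that assumption, `parse_with_fallback(wrapping_parser, ...)` returns the default spec, while `parse_with_fallback(leaky_parser, ...)` propagates the raw error past the fallback and up to the user.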

logan-markewich · 2026-01-05T16:24:45Z

llama-index-core/llama_index/core/schema.py

Looks like these changes are unrelated

majiayu000 · 2026-01-08T14:52:01Z

@AstraBert Thanks for the suggestion. The parser now wraps the ValidationError and raises OutputParserException with context (see llama-index-core/llama_index/core/indices/vector_store/retrievers/auto_retriever/output_parser.py).

@logan-markewich The schema changes are part of the MediaResource hash fix; happy to split if you prefer.

AstraBert · 2026-01-13T11:36:08Z

@majiayu000 I see you have another PR open for the MediaResource hashing changes. We can follow up on those there, and I suggest you remove them from here.

majiayu000 · 2026-01-13T14:45:09Z

Thanks @AstraBert! You are right.

I have updated this PR and removed the MediaResource hashing changes. Those changes are indeed handled in PR #20451.

This PR is now focused solely on catching the ValidationError in VectorStoreQueryOutputParser.

fix: catch pydantic ValidationError in VectorStoreQueryOutputParser

900d799

Wrap pydantic ValidationError as OutputParserException to ensure consistent exception handling in VectorIndexAutoRetriever. Fixes run-llama#19410 Signed-off-by: majiayu000 <[email protected]>

dosubot bot added the size:M label (this PR changes 30-99 lines, ignoring generated files) on Jan 5, 2026

AstraBert reviewed Jan 5, 2026

Fix MediaResource hash surrogate encoding

fd2dae1

Signed-off-by: majiayu000 <[email protected]>

dosubot bot added the size:L label (this PR changes 100-499 lines, ignoring generated files) and removed the size:M label on Jan 5, 2026

Fix hash tests and formatting

1d81f3c

Signed-off-by: majiayu000 <[email protected]>

logan-markewich reviewed Jan 5, 2026

majiayu000 added 2 commits January 8, 2026 21:03

Update docling node parser hash expectations

f8e13e1

Signed-off-by: majiayu000 <[email protected]>

Merge branch 'main' into fix/vector-auto-retriever-validation-error

e7d7412

fix: revert MediaResource hashing changes (moved to designated PR)

9f0049a

dosubot bot added the size:M label (this PR changes 30-99 lines, ignoring generated files) and removed the size:L label on Jan 13, 2026
