Skip to content

Conversation

@majiayu000
Copy link
Contributor

Description

Wrap pydantic ValidationError as OutputParserException in VectorStoreQueryOutputParser.parse() to ensure consistent exception handling in VectorIndexAutoRetriever._parse_generated_spec.

Previously, when the LLM returned malformed JSON that passed JSON parsing but failed Pydantic validation, an uncaught ValidationError would be raised instead of the expected OutputParserException.

Fixes #19410

New Package?

  • No

Version Bump?

  • No

Type of Change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

  • I added new unit tests to cover this change

Added two test cases:

  • test_output_parser_invalid_schema_raises_output_parser_exception: Tests that missing required fields raise OutputParserException
  • test_output_parser_invalid_type_raises_output_parser_exception: Tests that invalid types raise OutputParserException

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I ran uv run make format; uv run make lint to appease the lint gods

Wrap pydantic ValidationError as OutputParserException to ensure
consistent exception handling in VectorIndexAutoRetriever.

Fixes run-llama#19410

Signed-off-by: majiayu000 <[email protected]>
@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Jan 5, 2026
Comment on lines +16 to +19
except ValidationError as e:
raise OutputParserException(
f"Failed to validate query spec. Error: {e}. Got JSON dict: {json_dict}"
) from e
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you notice a better UX here by throwing the OutputParserException instead of the ValidationError that would be normally thrown without this extra step?

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Jan 5, 2026
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like these changes are unrelated

@majiayu000
Copy link
Contributor Author

@AstraBert Thanks for the suggestion. The parser now wraps the ValidationError and raises OutputParserException with context (see llama-index-core/llama_index/core/indices/vector_store/retrievers/auto_retriever/output_parser.py).

@logan-markewich The schema changes are part of the MediaResource hash fix; happy to split if you prefer.

@AstraBert
Copy link
Member

@majiayu000 I see you have another PR open for the MediaResource hashing changes, we can follow up there with those and I suggest you remove them from here

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Jan 13, 2026
@majiayu000
Copy link
Contributor Author

Thanks @AstraBert! You are right.

I have updated this PR and removed the MediaResource hashing changes. Those changes are indeed handled in PR #20451.

This PR is now focused solely on catching the ValidationError in VectorStoreQueryOutputParser.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: VectorIndexAutoRetriever._parse_generated_spec may raise uncaught pydantic.ValidationError

3 participants