You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
sdk/python/tests/integration/online_store/test_universal_online.py::test_retrieve_online_documents_v2 is the canonical integration test for retrieve_online_documents_v2. Today it's gated to:
Five online stores in sdk/python/feast/infra/online_stores/ implement retrieve_online_documents_v2: pgvector, elasticsearch, sqlite (with sqlite-vec + fts5/BM25), milvus, and remote (passthrough). The universal test covers two of the four eligible stores.
This gap surfaced while working on #6275 (bug fixes in MilvusOnlineStore) — one of the bugs (TypeError: fromhex argument must be str, not None on an empty composite-key path) would have been caught by the universal v2 test if Milvus were part of its parametrization. #6299 adds targeted integration coverage inside the existing Milvus-only test, but the structural question remains.
Why "just add Milvus and SQLite" doesn't work
test_retrieve_online_documents_v2 asserts four capability classes in one test body:
Dense vector search (L2 and cosine)
Full-text search with a text_rank contract — float in [0, 1], sorted descending
Hybrid (vector + text) with combined scoring
Empty-match behavior for full-text
Capability
pgvector
elasticsearch
sqlite
milvus
Dense vector search
✓
✓
✓
✓
L2 and cosine
✓
✓
✓
✓
BM25 full-text w/ text_rank
✓
✓
✓ (fts5)
✗ (LIKE filter only)
Hybrid dense + sparse
✓
✓
varies
✓ (native)
Milvus's full-text path is a LIKE '%query_string%' filter (milvus.py:610, 664), not BM25, and cannot satisfy the text_rank contract the universal test asserts. SQLite with fts5 plausibly can, but has never been added to the parametrization.
Proposals
A. Per-capability test functions (minimal change)
Split test_retrieve_online_documents_v2 into smaller functions, each with its own only=[...]:
Pros: no new abstractions; additive; easy to review. Cons: store lists drift as new stores land; each new vector store means N only-list updates.
B. Store-declared capabilities on the test creator (richer)
Extend OnlineStoreCreator with a capabilities() -> set[Capability] classmethod. The universal test harness filters stores per test based on the capability each test needs:
Pros: each store self-describes; adding a new store = declaring its capabilities once. Cons: introduces a new abstraction; needs agreement on the capability enum; may uncover existing mismatches during adoption.
C. Store-declared capabilities on OnlineStore itself
Move capabilities() onto the base class, usable at runtime for fallback logic (not just tests).
Pros: most principled. Cons: cross-cutting change to every store; probably scope creep relative to the testing pain we're trying to solve.
Questions for maintainers
Should all stores with v2 retrieval be covered by a universal test matrix, or is the current split (universal for pg+elasticsearch, per-store for milvus/sqlite) intentional?
Preference between A (minimal) and B (capability-declared)? C likely deferrable.
If B: where should the Capability enum live — new feast.infra.online_stores.capabilities module, or alongside vector_store.py?
Should query_string-based search with non-BM25 semantics (Milvus's LIKE filter) be kept, dropped, or formalized as a distinct capability (KEYWORD_FILTER_SEARCH vs BM25_TEXT_SEARCH)?
Next steps if we agree on direction
Cheap independent win: add SQLite to only=[...] in the existing v2 test. Likely works today.
Implement A or B for the remaining stores.
Once in place, adding Milvus to relevant subsets becomes one-line.
reacted with thumbs up emoji reacted with thumbs down emoji reacted with laugh emoji reacted with hooray emoji reacted with confused emoji reacted with heart emoji reacted with rocket emoji reacted with eyes emoji
Uh oh!
There was an error while loading. Please reload this page.
-
Context
sdk/python/tests/integration/online_store/test_universal_online.py::test_retrieve_online_documents_v2is the canonical integration test forretrieve_online_documents_v2. Today it's gated to:@pytest.mark.universal_online_stores(only=["pgvector", "elasticsearch"])Five online stores in
sdk/python/feast/infra/online_stores/implementretrieve_online_documents_v2: pgvector, elasticsearch, sqlite (with sqlite-vec + fts5/BM25), milvus, and remote (passthrough). The universal test covers two of the four eligible stores.This gap surfaced while working on #6275 (bug fixes in MilvusOnlineStore) — one of the bugs (
TypeError: fromhex argument must be str, not Noneon an empty composite-key path) would have been caught by the universal v2 test if Milvus were part of its parametrization. #6299 adds targeted integration coverage inside the existing Milvus-only test, but the structural question remains.Why "just add Milvus and SQLite" doesn't work
test_retrieve_online_documents_v2asserts four capability classes in one test body:text_rankcontract — float in[0, 1], sorted descendingtext_rankMilvus's full-text path is a
LIKE '%query_string%'filter (milvus.py:610, 664), not BM25, and cannot satisfy thetext_rankcontract the universal test asserts. SQLite with fts5 plausibly can, but has never been added to the parametrization.Proposals
A. Per-capability test functions (minimal change)
Split
test_retrieve_online_documents_v2into smaller functions, each with its ownonly=[...]:test_v2_vector_search→only=["pgvector", "elasticsearch", "sqlite", "milvus"]test_v2_text_search_bm25→only=["pgvector", "elasticsearch", "sqlite"]test_v2_hybrid→ per-store supporttest_v2_empty_result→ allPros: no new abstractions; additive; easy to review.
Cons: store lists drift as new stores land; each new vector store means N
only-list updates.B. Store-declared capabilities on the test creator (richer)
Extend
OnlineStoreCreatorwith acapabilities() -> set[Capability]classmethod. The universal test harness filters stores per test based on the capability each test needs:Pros: each store self-describes; adding a new store = declaring its capabilities once.
Cons: introduces a new abstraction; needs agreement on the capability enum; may uncover existing mismatches during adoption.
C. Store-declared capabilities on
OnlineStoreitselfMove
capabilities()onto the base class, usable at runtime for fallback logic (not just tests).Pros: most principled.
Cons: cross-cutting change to every store; probably scope creep relative to the testing pain we're trying to solve.
Questions for maintainers
Capabilityenum live — newfeast.infra.online_stores.capabilitiesmodule, or alongsidevector_store.py?query_string-based search with non-BM25 semantics (Milvus's LIKE filter) be kept, dropped, or formalized as a distinct capability (KEYWORD_FILTER_SEARCHvsBM25_TEXT_SEARCH)?Next steps if we agree on direction
only=[...]in the existing v2 test. Likely works today.Beta Was this translation helpful? Give feedback.
All reactions