fix: Check duplicate names for feature view across types #5999
Prathap-P wants to merge 5 commits into feast-dev:master
Signed-off-by: Prathap P <436prathap@gmail.com>
f23ff7b to 77cc24c
77cc24c to c299a71
    # Check StreamFeatureView before FeatureView since StreamFeatureView is a subclass
    # Note: All getters raise FeatureViewNotFoundException (not type-specific exceptions)
    if isinstance(feature_view, StreamFeatureView):
Does this mean it will retrieve the feature view many times? Can we optimize it?
If I understand correctly, you're concerned that when creating 100 feature views, the apply_feature_view function calls the getters for the other two types each time. That results in 2 database calls per feature view — 200 total — and you’re wondering if we should optimize this part.
@HaoXuAI
I have a plan. We need to validate in two places:
- In the incoming request — ensure the new feature view name doesn’t exist in the other two types within the request (and vice versa).
- In the database — check against existing records (for the SQL registry case).
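The first check (within the incoming request) could look roughly like the sketch below. It is only an illustration: `find_cross_type_duplicates` and the dictionary shape are hypothetical and are not part of Feast or of this PR.

```python
from collections import defaultdict


def find_cross_type_duplicates(views_by_type):
    """Return names that appear under more than one feature view type.

    `views_by_type` is a hypothetical mapping from a type label
    (e.g. "FeatureView") to the names submitted for that type in one request.
    """
    types_by_name = defaultdict(set)
    for type_label, names in views_by_type.items():
        for name in names:
            types_by_name[name].add(type_label)
    # Keep only names claimed by two or more types.
    return {
        name: sorted(types)
        for name, types in types_by_name.items()
        if len(types) > 1
    }


# Hypothetical request: "driver_stats" is submitted under two types.
request = {
    "FeatureView": ["driver_stats", "customer_profile"],
    "StreamFeatureView": ["driver_stats"],
    "OnDemandFeatureView": ["trip_score"],
}
duplicates = find_cross_type_duplicates(request)
print(duplicates)  # {'driver_stats': ['FeatureView', 'StreamFeatureView']}
```

This pass needs no registry access, so it can run before any DB call is made.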
Currently, the cache util doesn’t have an exists function to check if a name is already in the registry. Even if it did, calling it 100 times wouldn’t be efficient.
So I’m planning to create a util function per feature view type that takes a list of names and a project, and checks that none of the names exist in the DB.
For example, if 50 feature views come in:
- I’ll call the new util once with those 50 names against the stream and on-demand tables.
- Do the same vice versa for the other types.
This way, instead of calling per feature view, we make only two DB calls per type.
That’s my current idea. Do you see any better optimization?
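A minimal sketch of the batched lookup, using an in-memory SQLite database purely for illustration. The helper name `find_existing_names`, the table names, and the column names are assumptions made for this example and may not match the SQL registry schema.

```python
import sqlite3


def find_existing_names(conn, table, name_column, project, names):
    """Return the subset of `names` already present in `table` for `project`.

    Runs one SELECT ... IN (...) query instead of one query per name.
    `table` and `name_column` are interpolated into the SQL text, so they
    must come from trusted code and never from user input.
    """
    if not names:
        return set()
    placeholders = ", ".join("?" for _ in names)
    query = (
        f"SELECT {name_column} FROM {table} "
        f"WHERE project_id = ? AND {name_column} IN ({placeholders})"
    )
    rows = conn.execute(query, [project, *names]).fetchall()
    return {row[0] for row in rows}


# Hypothetical schema and data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stream_feature_views (feature_view_name TEXT, project_id TEXT)")
conn.execute("CREATE TABLE on_demand_feature_views (feature_view_name TEXT, project_id TEXT)")
conn.execute("INSERT INTO stream_feature_views VALUES ('driver_stats', 'demo')")
conn.execute("INSERT INTO on_demand_feature_views VALUES ('trip_score', 'demo')")

# Names of incoming FeatureViews, checked once against each other type's table.
incoming = ["driver_stats", "customer_profile", "trip_score"]
conflicts = {
    table: find_existing_names(conn, table, "feature_view_name", "demo", incoming)
    for table in ("stream_feature_views", "on_demand_feature_views")
}
```

With this shape the number of queries depends on the number of feature view types, not on the number of feature views in the request. Very large name lists may need to be chunked, since databases limit the number of bound parameters per statement.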
What this PR does / why we need it:
get_online_features resolved feature view names via get_any_feature_view, which checks registry tables in a fixed order: FeatureView → StreamFeatureView → OnDemandFeatureView.
Which issue(s) this PR fixes:
Fixes #5995
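To illustrate why a fixed lookup order is a problem when names collide across types, here is a toy model. It is not Feast's `get_any_feature_view`; the `resolve_first_match` function and the dictionary-based registry are hypothetical stand-ins.

```python
# Lookup order as described in the PR description.
LOOKUP_ORDER = ("FeatureView", "StreamFeatureView", "OnDemandFeatureView")


def resolve_first_match(registry, name):
    """Return (type_label, definition) for the first type that knows `name`."""
    for type_label in LOOKUP_ORDER:
        if name in registry.get(type_label, {}):
            return type_label, registry[type_label][name]
    raise KeyError(name)


# Hypothetical registry where the same name exists under two types.
registry = {
    "FeatureView": {"driver_stats": "batch definition"},
    "OnDemandFeatureView": {"driver_stats": "on-demand definition"},
}
resolved = resolve_first_match(registry, "driver_stats")
print(resolved)  # ('FeatureView', 'batch definition')
```

In this model the on-demand definition can never be reached by name, which is the kind of silent shadowing that a cross-type uniqueness check would prevent.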
Misc