Feast (Feature Store) is an open source feature store for machine learning that manages and serves features for production AI/ML systems. This page introduces Feast's purpose, architecture, and key capabilities. For detailed architectural patterns, see System Architecture. For CLI usage, see Getting Started and CLI. For subsystem details, see Core Concepts through Advanced Topics.
Feast provides a unified interface for managing features across model training (batch) and inference (online) workloads. It abstracts feature storage from feature retrieval, enabling ML platform teams to productionize analytic data using existing infrastructure.
Primary Interface: The FeatureStore class sdk/python/feast/feature_store.py105-220 orchestrates all operations, coordinating between:
- Registry (BaseRegistry) - metadata catalog for feature definitions

The system is highly modular: components can be used independently, mixed with managed cloud services, or extended with custom implementations.
Sources: README.md26-38 sdk/python/feast/feature_store.py105-220 docs/README.md1-33
Feast addresses three fundamental challenges in ML feature management:
| Challenge | Solution | Implementation |
|---|---|---|
| Training-Serving Skew | Point-in-time correct feature joins prevent future data leakage during training | get_historical_features() method in offline stores sdk/python/feast/feature_store.py1682-1845 |
| Feature Availability | Pre-computed features served at low latency (<10ms) from online stores | get_online_features() method sdk/python/feast/feature_store.py1847-2061 |
| Infrastructure Coupling | Pluggable offline/online stores enable portability across data systems | Provider abstraction sdk/python/feast/infra/provider.py49-531 |
The system ensures consistency through:
Sources: README.md32-36 docs/getting-started/quickstart.md43-59 sdk/python/feast/feature_store.py105-115
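The point-in-time join named in the table can be illustrated with a minimal, standard-library sketch. It is not Feast's implementation: the feature rows, the label timestamps, and the helper `point_in_time_value` are hypothetical, and the sketch ignores details such as TTL windows. It only shows the core rule that a training row may see the latest feature value at or before its own timestamp, never a later one.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical feature rows for one entity, sorted by event timestamp.
feature_rows = [
    (datetime(2024, 1, 1), 0.50),
    (datetime(2024, 1, 5), 0.70),
    (datetime(2024, 1, 9), 0.90),
]


def point_in_time_value(rows, as_of):
    """Return the latest feature value observed at or before `as_of`, else None."""
    timestamps = [ts for ts, _ in rows]
    idx = bisect_right(timestamps, as_of)  # number of rows with ts <= as_of
    return rows[idx - 1][1] if idx else None


# Hypothetical label timestamps (one per training example).
labels = [datetime(2023, 12, 31), datetime(2024, 1, 5), datetime(2024, 1, 7)]
training_values = [point_in_time_value(feature_rows, ts) for ts in labels]
```

The first label predates every feature row and so gets no value, while the label on 2024-01-07 receives the 2024-01-05 value rather than the later 2024-01-09 one.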
Component Architecture with Key Classes:
Architecture Patterns:
| Pattern | Implementation | Location |
|---|---|---|
| Facade | FeatureStore coordinates Registry, Provider, storage | sdk/python/feast/feature_store.py105-220 |
| Strategy | Provider delegates to pluggable OfflineStore/OnlineStore | sdk/python/feast/infra/provider.py49-67 |
| Registry | BaseRegistry manages feature metadata with multiple backends | sdk/python/feast/infra/registry/base_registry.py |
| Template Method | OfflineStore.get_historical_features() defined per backend | sdk/python/feast/infra/offline_stores/offline_store.py235-281 |
Key Design Characteristics:
- FeatureStore properties (registry, provider) load on first access sdk/python/feast/feature_store.py205-254
- PassthroughProvider delegates all operations to stores sdk/python/feast/infra/passthrough_provider.py58-90

Sources: sdk/python/feast/feature_store.py105-254 sdk/python/feast/infra/provider.py49-67 sdk/python/feast/repo_config.py68-107 sdk/python/feast/infra/passthrough_provider.py58-90 sdk/python/feast/infra/registry/base_registry.py sdk/python/feast/infra/offline_stores/offline_store.py73-281 sdk/python/feast/infra/online_stores/online_store.py35-252
The FeatureStore class is the primary interface for all Feast operations. It lazily initializes its dependencies:
Location: sdk/python/feast/feature_store.py105-220
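Lazy initialization of this kind can be sketched with `functools.cached_property`. This is an illustration of the pattern only: `LazyStore` and `FakeRegistry` are hypothetical names, and the real FeatureStore may implement its properties differently.

```python
from functools import cached_property


class FakeRegistry:
    """Hypothetical stand-in for a registry backend; counts constructions."""

    instances = 0

    def __init__(self, path):
        FakeRegistry.instances += 1
        self.path = path


class LazyStore:
    """Facade that defers building its registry until first access."""

    def __init__(self, registry_path):
        # Construction is cheap: only the configuration is stored.
        self._registry_path = registry_path

    @cached_property
    def registry(self):
        # Runs once, on first access; the result is cached on the instance.
        return FakeRegistry(self._registry_path)


store = LazyStore("data/registry.db")
created_before_access = FakeRegistry.instances
first = store.registry
second = store.registry
```

Deferring construction keeps object creation fast and avoids touching a backend that a given call path never uses.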
Key Methods:
- apply() - Registers feature definitions sdk/python/feast/feature_store.py944-1123
- materialize() - Loads features into online store sdk/python/feast/feature_store.py1124-1252
- get_historical_features() - Retrieves training data sdk/python/feast/feature_store.py1682-1845
- get_online_features() - Retrieves inference features sdk/python/feast/feature_store.py1847-2061

The registry stores metadata about feature definitions. Multiple implementations exist:
| Type | Class | Storage Backend | Use Case |
|---|---|---|---|
| File | Registry | Local/S3/GCS | Development |
| SQL | SqlRegistry | PostgreSQL/MySQL | Production |
| Snowflake | SnowflakeRegistry | Snowflake | Snowflake-native |
| Remote | RemoteRegistry | gRPC server | Distributed teams |
Configuration: sdk/python/feast/repo_config.py136-184
Sources: sdk/python/feast/infra/registry/registry.py sdk/python/feast/feature_store.py206-243
Offline stores provide historical feature values for training and batch scoring. The base interface:
Interface: sdk/python/feast/infra/offline_stores/offline_store.py
Implementations: 9+ stores including BigQuery, Snowflake, Redshift, Spark, DuckDB sdk/python/feast/repo_config.py91-107
Online stores provide low-latency feature retrieval for inference:
Interface: sdk/python/feast/infra/online_stores/online_store.py35-252
Implementations: 9+ stores including Redis, DynamoDB, SQLite, Cassandra, Milvus (vector DB) sdk/python/feast/repo_config.py68-89
Example - SQLite: sdk/python/feast/infra/online_stores/sqlite.py117-431
Sources: sdk/python/feast/infra/online_stores/online_store.py35-71 sdk/python/feast/infra/online_stores/sqlite.py103-116
The Provider interface orchestrates offline stores, online stores, and materialization:
The PassthroughProvider is the default implementation that delegates to configured stores sdk/python/feast/infra/passthrough_provider.py58-128
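The delegation idea can be sketched as a small strategy pattern. The class and method names below (`OnlineStoreLike`, `InMemoryOnlineStore`, `PassthroughLikeProvider`, `online_write`, `online_read`) are hypothetical and much simpler than the actual Provider interface; the sketch only shows a provider that forwards calls to whichever store it was configured with.

```python
from typing import Dict, Protocol


class OnlineStoreLike(Protocol):
    """Minimal, hypothetical store contract used by this sketch."""

    def write(self, key: str, values: Dict[str, float]) -> None: ...

    def read(self, key: str) -> Dict[str, float]: ...


class InMemoryOnlineStore:
    """Toy backend that keeps feature values in a dict."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, float]] = {}

    def write(self, key: str, values: Dict[str, float]) -> None:
        self._rows[key] = dict(values)

    def read(self, key: str) -> Dict[str, float]:
        return self._rows.get(key, {})


class PassthroughLikeProvider:
    """Holds no storage logic of its own; forwards every call to the store."""

    def __init__(self, online_store: OnlineStoreLike) -> None:
        self._online_store = online_store

    def online_write(self, key: str, values: Dict[str, float]) -> None:
        self._online_store.write(key, values)

    def online_read(self, key: str) -> Dict[str, float]:
        return self._online_store.read(key)


provider = PassthroughLikeProvider(InMemoryOnlineStore())
provider.online_write("driver:1001", {"conv_rate": 0.42})
row = provider.online_read("driver:1001")
missing = provider.online_read("driver:9999")
```

Because the provider depends only on the store contract, swapping the backend means passing a different object to the constructor.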
Sources: sdk/python/feast/infra/provider.py49-105 sdk/python/feast/infra/passthrough_provider.py58-90
The following diagram traces the complete lifecycle from Python feature definitions to runtime serving, showing actual method names and code paths:
Critical Code Paths:
| Step | Function | File Location | Purpose |
|---|---|---|---|
| Parse | parse_repo() | sdk/python/feast/repo_operations.py114-221 | Scans Python files for Feast objects |
| Infer | update_feature_views_with_inferred_features_and_entities() | sdk/python/feast/inference.py | Infers schema from data sources |
| Validate | _validate_all_feature_views() | sdk/python/feast/feature_store.py644-663 | Validates feature view definitions |
| Apply | FeatureStore.apply() | sdk/python/feast/feature_store.py944-1123 | Registers objects to registry |
| Plan | FeatureStore.plan() | sdk/python/feast/feature_store.py795-880 | Dry-run diff preview |
Registry Implementations: The registry can be backed by file storage (development), SQL databases (production), or Snowflake tables (Snowflake-native). All implementations serialize to RegistryProto protos/feast/core/Registry.proto24-34 for wire format consistency.
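The dry-run diff that a plan step produces can be pictured as a comparison between what is registered and what the repository now defines. The sketch below is an assumption about the general idea, not the logic of FeatureStore.plan(): the function `plan_diff` and the dict-based object representation are hypothetical.

```python
from typing import Dict, List


def plan_diff(registered: Dict[str, dict], desired: Dict[str, dict]) -> Dict[str, List[str]]:
    """Classify object names as added, removed, or updated without applying anything."""
    shared = registered.keys() & desired.keys()
    return {
        "add": sorted(desired.keys() - registered.keys()),
        "remove": sorted(registered.keys() - desired.keys()),
        "update": sorted(name for name in shared if registered[name] != desired[name]),
    }


# Hypothetical registry contents and repository definitions.
registered = {
    "driver_stats": {"ttl_days": 1},
    "old_view": {"ttl_days": 7},
}
desired = {
    "driver_stats": {"ttl_days": 2},
    "customer_stats": {"ttl_days": 1},
}

diff = plan_diff(registered, desired)
```

Nothing is written in this sketch; an apply step would act on the same three buckets.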
Sources: sdk/python/feast/repo_operations.py114-221 sdk/python/feast/feature_store.py644-880 sdk/python/feast/feature_store.py944-1123 sdk/python/feast/inference.py protos/feast/core/Registry.proto24-34
Materialization transfers feature values from offline (historical) to online (serving) stores, showing the complete data flow with actual method calls:
Materialization Entry Points:
| Method | Location | Behavior |
|---|---|---|
| materialize(start, end, fvs) | sdk/python/feast/feature_store.py1124-1252 | Time-range materialization with explicit start/end |
| materialize_incremental(end, fvs) | sdk/python/feast/feature_store.py1253-1320 | Uses last_materialized_time from registry as start |
| materialize_single_feature_view() | sdk/python/feast/infra/passthrough_provider.py222-246 | Provider implementation, one feature view at a time |
Implementation Details:
- tqdm progress bars for user feedback sdk/python/feast/feature_store.py1232-1252
- _convert_arrow_to_proto() serializes feature values sdk/python/feast/utils.py

Compute Engine Configuration:
Supported Compute Engines: sdk/python/feast/repo_config.py46-53
| Engine | Use Case | Implementation |
|---|---|---|
| local | Default in-process | feast/infra/compute_engines/local/compute.py |
| snowflake.engine | Snowflake stored procedures | feast/infra/compute_engines/snowflake/snowflake_engine.py |
| lambda | AWS Lambda serverless | feast/infra/compute_engines/aws_lambda/lambda_engine.py |
| k8s | Kubernetes jobs | feast/infra/compute_engines/kubernetes/k8s_engine.py |
| spark.engine | Distributed Spark | feast/infra/compute_engines/spark/compute.py |
| ray.engine | Ray distributed Python | feast/infra/compute_engines/ray/compute.py |
Sources: sdk/python/feast/feature_store.py1124-1320 sdk/python/feast/infra/passthrough_provider.py222-246 sdk/python/feast/repo_config.py46-53 sdk/python/feast/infra/key_encoding_utils.py sdk/python/feast/utils.py
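The difference between the explicit and incremental entry points comes down to how the start of the time window is chosen. The sketch below illustrates that choice only; `incremental_window`, the seven-day fallback, and the bookkeeping dict are assumptions made for the example, not Feast's actual behavior or defaults.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def incremental_window(
    last_materialized: Optional[datetime],
    end: datetime,
    fallback_lookback: timedelta = timedelta(days=7),  # assumed default, for illustration
) -> Tuple[datetime, datetime]:
    """Start from the previous run's end; fall back to a lookback on the first run."""
    start = last_materialized if last_materialized is not None else end - fallback_lookback
    if start >= end:
        raise ValueError("nothing to materialize: start is not before end")
    return start, end


# Hypothetical per-feature-view bookkeeping, standing in for registry metadata.
last_materialized_by_view = {"driver_stats": None}

first_run = incremental_window(last_materialized_by_view["driver_stats"], datetime(2024, 1, 8))
last_materialized_by_view["driver_stats"] = first_run[1]  # record where this run ended

second_run = incremental_window(last_materialized_by_view["driver_stats"], datetime(2024, 1, 9))
```

Each run picks up exactly where the previous one stopped, so no interval is loaded twice or skipped.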
Feast is configured via feature_store.yaml:
The RepoConfig class sdk/python/feast/repo_config.py253-318 parses this configuration and provides properties for:
- registry - Registry configuration sdk/python/feast/repo_config.py365-383
- offline_store - Offline store configuration sdk/python/feast/repo_config.py385-398
- online_store - Online store configuration sdk/python/feast/repo_config.py431-443
- batch_engine - Materialization engine configuration sdk/python/feast/repo_config.py445-459

Sources: sdk/python/feast/repo_config.py253-460 docs/getting-started/quickstart.md106-117
Feast supports multiple serving deployment patterns with different performance and operational characteristics:
Use Case: Python applications, Jupyter notebooks, model training scripts
Implementation: sdk/python/feast/feature_store.py1847-2061
Endpoints:
- POST /get-online-features - Feature retrieval sdk/python/feast/feature_server.py323-351
- POST /push - Stream ingestion sdk/python/feast/feature_server.py388-480
- POST /materialize - Trigger materialization sdk/python/feast/feature_server.py530-568
- POST /retrieve-online-documents - Vector search (alpha) sdk/python/feast/feature_server.py353-386

Key Features:
Use Case: Non-Python clients, microservices, high-throughput serving
Implementation: sdk/python/feast/feature_server.py211-699
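A non-Python client would reach the feature retrieval endpoint with an ordinary HTTP POST. The sketch below builds such a request with the standard library without sending it. The host, port, feature reference, and entity values are placeholders, and the JSON body shape (a `features` list plus an `entities` mapping) is an assumption that should be checked against the feature server reference for the installed version.

```python
import json
from urllib.request import Request

# Placeholder address; adjust to wherever the feature server is listening.
SERVER_URL = "http://localhost:6566"

# Assumed body shape: feature references plus entity keys to look up.
payload = {
    "features": ["driver_hourly_stats:conv_rate"],
    "entities": {"driver_id": [1001, 1002]},
}

request = Request(
    f"{SERVER_URL}/get-online-features",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Sending is left out so the sketch runs without a server:
# with urllib.request.urlopen(request) as response:
#     features = json.load(response)
```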
Features:
- FeatureStore CRD for GitOps workflows

Use Case: Enterprise JVM environments, Kubernetes-native deployments
Location: infra/charts/feast/
Use Case: High-performance requirements, edge computing, cost optimization
Documentation: docs/reference/feature-servers/go-feature-server.md
Features:
Use Case: Feature discovery, data governance, team collaboration
Implementation: sdk/python/feast/ui_server.py ui/
Sources: README.md136-164 docs/getting-started/quickstart.md136-157 sdk/python/feast/feature_server.py211-699 infra/feast-operator/README.md1-13 sdk/python/feast/ui_server.py14-110 docs/reference/feature-servers/python-feature-server.md1-38
Feast's architecture is designed for extensibility through well-defined interfaces:
| Component | Interface | Custom Implementation Guide |
|---|---|---|
| Offline Store | OfflineStore | See Adding a new offline store |
| Online Store | OnlineStore | See Adding a new online store |
| Registry | BaseRegistry | See Registry documentation |
| Provider | Provider | See Creating a custom provider |
| Batch Engine | ComputeEngine | See Custom materialization engine |
Class Loading: Custom implementations are loaded via the importer module sdk/python/feast/importer.py using fully qualified class names in configuration.
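Loading a class from a fully qualified name generally follows the pattern sketched here with `importlib`. This is a generic illustration, not the contents of the importer module: `import_class` is a hypothetical helper, and a standard-library class stands in for a custom store implementation.

```python
import importlib


def import_class(qualified_name: str) -> type:
    """Resolve 'package.module.ClassName' to the class object it names."""
    module_name, _, class_name = qualified_name.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.ClassName', got {qualified_name!r}")
    module = importlib.import_module(module_name)
    obj = getattr(module, class_name)
    if not isinstance(obj, type):
        raise TypeError(f"{qualified_name!r} does not name a class")
    return obj


# A standard-library class stands in for a custom implementation here.
store_cls = import_class("collections.OrderedDict")
instance = store_cls(a=1)
```

Resolving the class at runtime is what lets a configuration string select an implementation that the core package never imports directly.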
Sources: sdk/python/feast/repo_config.py39-121 sdk/python/feast/infra/provider.py533-542 docs/getting-started/third-party-integrations.md1-30
Feast supports multiple deployment patterns:
The architecture diagram at README.md38-43 illustrates the minimal deployment with SQLite and local files.
Sources: README.md38-68 docs/how-to-guides/running-feast-in-production.md1-20 docs/how-to-guides/feast-on-kubernetes.md1-8
Feast uses a custom type system that maps between Python types, Pandas types, database types, and Protocol Buffer types:
- PrimitiveFeastType, Array, and complex types

This type system ensures consistency across offline stores (databases), online stores (key-value stores), and the feature server API (protobuf).
For details, see Type System.
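The idea of mapping native values onto a shared set of logical types can be sketched as a lookup table plus a rule for lists. The table `PYTHON_TO_LOGICAL`, the type labels, and `infer_logical_type` are hypothetical and far simpler than the mappings in types.py and type_map.py; they are meant only to show the shape of such a mapping.

```python
# Hypothetical logical type labels keyed by exact Python type.
PYTHON_TO_LOGICAL = {
    bool: "BOOL",
    int: "INT64",
    float: "DOUBLE",
    str: "STRING",
    bytes: "BYTES",
}


def infer_logical_type(value):
    """Map a Python value to a logical type label; lists must be homogeneous."""
    if isinstance(value, list):
        if not value:
            raise ValueError("cannot infer the element type of an empty list")
        element_types = {infer_logical_type(item) for item in value}
        if len(element_types) != 1:
            raise ValueError(f"mixed element types: {sorted(element_types)}")
        return f"{element_types.pop()}_LIST"
    try:
        # Exact type lookup keeps bool from being treated as int.
        return PYTHON_TO_LOGICAL[type(value)]
    except KeyError:
        raise TypeError(f"unsupported type: {type(value).__name__}") from None


inferred = [infer_logical_type(v) for v in (True, 3, 2.5, "a", [1.0, 2.5])]
```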
Sources: sdk/python/feast/types.py sdk/python/feast/type_map.py protos/feast/types/Value.proto