feat: add map function #1187
docarray/utils/apply.py
Outdated
    func: Callable[[BaseDocument], BaseDocument],
    num_worker: Optional[int] = None,
    pool: Optional['Pool'] = None,
    show_progress: bool = False,
Should we have a backend option to switch between multiprocessing and multithreading?
As far as I understood, we only wanted to keep multiprocessing, not multithreading, @samsja.
@JoanFM may know more, but to me only multiprocessing makes sense. Can multithreading really improve performance here?
Just to keep the PR discussion up to date with what we discussed on Discord: there will be multithreading, since it makes sense for IO-bound ops and tf/np/torch stuff.
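To make the backend question concrete, here is a minimal sketch of how a switch between the two could look, using only the standard library. The name `map_items` and the `backend` parameter are hypothetical and are not taken from this PR; `num_worker` just mirrors the parameter name in the diff above.

```python
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar('T')
R = TypeVar('R')


def map_items(
    items: Iterable[T],
    func: Callable[[T], R],
    backend: str = 'thread',
    num_worker: Optional[int] = None,
) -> Iterator[R]:
    """Hypothetical helper: map `func` over `items` with a worker pool.

    `backend` is an assumed parameter name: 'thread' selects a thread pool
    (useful for IO-bound work), 'process' selects a process pool.
    """
    if backend == 'thread':
        pool_cls = ThreadPool
    elif backend == 'process':
        pool_cls = Pool
    else:
        raise ValueError(f"backend must be 'thread' or 'process', got {backend!r}")

    # Both pool classes share the same interface; imap yields results
    # in input order as they become available.
    with pool_cls(processes=num_worker) as pool:
        yield from pool.imap(func, items)
```

With `backend='process'`, `func` and the items have to be picklable, so `func` should be a module-level function rather than a lambda. With `backend='thread'` there is no such restriction.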
Map is already included, but private right now. Not sure if we want to expose it again, or only one of the two.
Why should we keep only one, @samsja? I think both make sense: the user might already have an in-place or pure function that they want to use without rewriting it.
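To illustrate the in-place vs. pure distinction, here is a rough sketch of one way both kinds of function could be accepted without the user rewriting them. `Doc`, `upper_pure`, `upper_in_place`, and `apply_any` are all hypothetical names for illustration, not part of docarray.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a document type."""
    text: str


def upper_pure(doc: Doc) -> Doc:
    # Pure: leaves the input untouched and returns a new document.
    return replace(doc, text=doc.text.upper())


def upper_in_place(doc: Doc) -> None:
    # In-place: mutates the input and returns nothing.
    doc.text = doc.text.upper()


def apply_any(docs: List[Doc], func: Callable[[Doc], Optional[Doc]]) -> List[Doc]:
    """Accept either style: keep the returned document if there is one,
    otherwise assume `func` modified the input in place."""
    out = []
    for doc in docs:
        result = func(doc)
        out.append(doc if result is None else result)
    return out
```

Treating a `None` return as "modified in place" is an assumption made for this sketch only. Note that with a process pool, in-place changes happen on a copy in the worker and are not visible to the caller unless the document is returned.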
Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>
📝 Docs are deployed on https://ft-feat-map-apply--jina-docs.netlify.app 🎉
Goals:
Add map_docs function and map_batch: This will be different from doing
Leverage multiprocessing; benchmark in the tests and check that using 2 CPUs is faster than using 1
map_docs()
map_docs_batch()
benchmarking tests
check and update documentation, if required. See guide
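As a rough sketch of the batching and benchmarking goals above, the snippet below splits items into batches, maps a function over the batches with a worker pool, and times runs with different worker counts. The helper names (`iter_batches`, `map_batches`, `timed_run`) are hypothetical. To keep the example self-contained it uses a thread pool and a sleep-based stand-in for IO-bound work; a benchmark for the "2 CPUs faster than 1" goal would presumably use a process pool and a CPU-bound function instead.

```python
import time
from multiprocessing.pool import ThreadPool
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar('T')
R = TypeVar('R')


def iter_batches(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive batches; the last one may be shorter."""
    if batch_size < 1:
        raise ValueError('batch_size must be >= 1')
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def map_batches(
    items: Sequence[T],
    func: Callable[[List[T]], R],
    batch_size: int,
    num_worker: int,
) -> Iterator[R]:
    """Apply `func` to each batch using a thread pool, preserving batch order."""
    with ThreadPool(processes=num_worker) as pool:
        yield from pool.imap(func, iter_batches(items, batch_size))


def timed_run(num_worker: int) -> float:
    """Time a run over 8 batches whose processing sleeps 50 ms each."""

    def slow_len(batch: List[int]) -> int:
        time.sleep(0.05)  # stand-in for IO-bound work (assumption)
        return len(batch)

    start = time.perf_counter()
    results = list(map_batches(list(range(16)), slow_len, batch_size=2, num_worker=num_worker))
    assert results == [2] * 8
    return time.perf_counter() - start
```

A benchmarking test could then assert `timed_run(2) < timed_run(1)`. Timing assertions like this can be flaky on loaded CI machines, so a real test would likely want a generous margin.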