feat(v2): add audio url and predefined document by anna-charlotte · Pull Request #940 · docarray/docarray

anna-charlotte · 2022-12-14T10:03:58Z

Add support for audio files to docarray v2

Goals:

AudioUrl
AudioTensor with AudioNdarray and AudioTorchTensor
Audio predefined doc (with optional attrs Audio.tensor, Audio.url and Audio.embedding)
tests
check and update documentation, if required. See guide
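
To illustrate the third goal, here is a minimal sketch of a document whose `url`, `tensor` and `embedding` attributes are all optional. It is not the PR's implementation: `AudioSketch` is a hypothetical stand-in built on a plain dataclass, the field types are simplified to built-in types, and `"hello.wav"` is a made-up path.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AudioSketch:
    """Hypothetical stand-in for the predefined Audio document."""

    url: Optional[str] = None                # the PR would use AudioUrl here
    tensor: Optional[List[float]] = None     # the PR would use AudioTensor here
    embedding: Optional[List[float]] = None  # left untyped beyond a float list


# All three attributes are optional, so a document can start from a URL alone.
doc = AudioSketch(url="hello.wav")
assert doc.tensor is None and doc.embedding is None

# The tensor can be filled in later, for example after loading the file.
doc.tensor = [0.0, 0.5, -0.5]
```

Keeping every attribute optional lets the same document type represent audio that is only referenced, already loaded, or already embedded.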

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>


anna-charlotte · 2022-12-29T15:41:55Z

@JoanFM @alaeddine-13 I checked all the comments and made corresponding changes. It's ready for re-review now.

docarray/predefined_document/audio.py

tests/integrations/predefined_document/test_audio.py

docarray/typing/tensor/abstract_tensor.py


anna-charlotte · 2023-01-02T09:19:17Z

docarray/typing/url/audio_url.py

+            )
+        return cls(str(url), scheme=None)
+
+    def load(self: T) -> np.ndarray:


@samsja Should I return a np.ndarray here, or rather AudioNdArray or AudioTorchTensor?

I would say AudioNdArray.
Normally we just load the np.ndarray, because the framework-specific conversion can happen when putting it back into the document, and having it as np.ndarray makes the fewest assumptions about the surrounding code.
In this case, since there are actual audio features that come with the array, I would go with AudioNdArray. It can still be treated like a normal np.ndarray, but brings the aforementioned features.

But could you verify in a test that setting an AudioNdArray to a field with type AudioTorchTensor actually works without issue?
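
The point about returning an audio-specific array that still behaves like a plain np.ndarray can be sketched as follows. This is only an illustration under stated assumptions, not the code from the PR: `AudioNdArraySketch`, `load_wav` and `duration_seconds` are hypothetical names, and the loader assumes 16-bit PCM WAV input read with the standard-library `wave` module.

```python
import io
import wave

import numpy as np


class AudioNdArraySketch(np.ndarray):
    """Hypothetical stand-in for AudioNdArray: an ndarray with audio helpers."""

    def duration_seconds(self, frame_rate: int) -> float:
        # Hypothetical helper, not taken from the PR: frames / frame rate.
        return self.shape[0] / frame_rate


def load_wav(source, dtype=np.float32) -> AudioNdArraySketch:
    """Read 16-bit PCM WAV data and scale the samples into [-1, 1)."""
    with wave.open(source, "rb") as wav_file:
        raw = wav_file.readframes(wav_file.getnframes())
        channels = wav_file.getnchannels()
    # WAV stores 16-bit samples as little-endian signed integers.
    samples = np.frombuffer(raw, dtype="<i2").astype(dtype) / 2**15
    if channels > 1:
        samples = samples.reshape(-1, channels)
    # view() reinterprets the same data as the subclass without copying.
    return samples.view(AudioNdArraySketch)


# Build a tiny mono WAV in memory so the example needs no file on disk.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav_file:
    wav_file.setnchannels(1)
    wav_file.setsampwidth(2)
    wav_file.setframerate(8000)
    wav_file.writeframes(np.array([0, 16384, -16384], dtype="<i2").tobytes())
buffer.seek(0)

audio = load_wav(buffer)
# The result passes both checks: it is a plain ndarray and the audio subclass.
assert isinstance(audio, np.ndarray)
assert isinstance(audio, AudioNdArraySketch)
```

Because the subclass is created with `view`, existing NumPy code keeps working on the result while the audio-specific helpers remain available.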

But just verify in a test that setting an AudioNdArray to a field with type AudioTorchTensor actually works without issue

Do you mean like this?

def test_load_audio_url_to_audio_torch_tensor(file_url):
    class MyAudioDoc(Document):
        audio_url: AudioUrl
        tensor: Optional[AudioTorchTensor]

    doc = MyAudioDoc(audio_url=file_url)
    doc.tensor = doc.audio_url.load()
    assert isinstance(doc.tensor, np.ndarray)
    assert isinstance(doc.tensor, AudioNdArray)

Ok I moved it to the computational backends

great I like it better like this

samsja

This PR looks really nice. I like how the idea of having different tensor types for the different modalities works :) I added some comments.

docarray/typing/tensor/audio/audio_tensor.py

docarray/typing/tensor/abstract_tensor.py

tests/units/typing/tensor/test_audio_tensor.py

JohannesMessner

Great PR, love the audio tensor stuff. This is a good pattern to use moving forward. Just some things to consider.

docarray/typing/tensor/abstract_tensor.py

docarray/typing/tensor/audio/audio_torch_tensor.py

docarray/typing/tensor/abstract_tensor.py

docarray/typing/tensor/audio/audio_ndarray.py

JohannesMessner · 2023-01-02T10:24:32Z

docarray/typing/url/audio_url.py

+            )
+        return cls(str(url), scheme=None)
+
+    def load(self: T) -> np.ndarray:


I would say AudioNdArray.
Normally we just load the np.ndarray, because the framework-specific conversion can happen when putting it back into the document, and having it as np.ndarray makes the fewest assumptions about the surrounding code.
In this case, since there are actual audio features that come with the array, I would go with AudioNdArray. It can still be treated like a normal np.ndarray, but brings the aforementioned features.

But could you verify in a test that setting an AudioNdArray to a field with type AudioTorchTensor actually works without issue?

docarray/typing/url/audio_url.py


JohannesMessner

Just resolve the conflicts and we're good to go, great work!


github-actions · 2023-01-03T09:41:08Z

📝 Docs are deployed on https://ft-feat-add-audio-v2--jina-docs.netlify.app 🎉

anna-charlotte linked an issue Dec 14, 2022 that may be closed by this pull request

Add audio predefined document to v2 #914

Closed

github-actions bot added size/s area/core area/typing labels Dec 14, 2022

anna-charlotte mentioned this pull request Dec 14, 2022

Meta: DocArray v2 Roadmap #780

Closed

47 tasks

anna-charlotte changed the title ~~feat: add audio url and predefined document~~ feat(v2): add audio url and predefined document Dec 14, 2022

anna-charlotte added 2 commits December 14, 2022 15:25

feat: add audio url class

bebc9d4

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

fix: typos

6025c2f

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

anna-charlotte force-pushed the feat-add-audio-v2 branch from 1285a1c to 6025c2f Compare December 14, 2022 14:26

anna-charlotte added 2 commits December 15, 2022 11:34

test: add tests for audio and audio url

9a599e5

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

feat: add audio url and audio predefined class

04abdae

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

github-actions bot added size/m area/entrypoint area/testing component/proto and removed size/s labels Dec 15, 2022

anna-charlotte added 3 commits December 21, 2022 21:58

Merge remote-tracking branch 'origin/feat-rewrite-v2' into feat-add-audio-v2

f8d700d

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

chore: add types-request

d58f804

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

feat: add audio tensors torch and ndarray

bdf8e88

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

github-actions bot added the area/setup label Dec 22, 2022

anna-charlotte added 2 commits December 22, 2022 10:07

fix: mypy type hints

6572df8

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

test: empty test file

9cd4baa

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

github-actions bot added size/l and removed size/m labels Dec 22, 2022

anna-charlotte added 6 commits December 28, 2022 09:09

test: add more unit and integration tests

b3c1948

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

fix: update audio tensors and audio url

7774181

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

fix: remove print statements

af840d4

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

docs: add documentation

797f488

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

refactor: rename test audio py to test audio tensor py

8b48a77

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

fix: typo in torch tensor py

e135438

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

anna-charlotte added 6 commits December 29, 2022 15:03

fix: revert ndim in abstract tensor and torch tensor and ndarray

83ef649

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

fix: mypy checks

eecca41

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

docs: add docstring to n dim

4762c3c

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

refactor: move n dim to abstract tensor and subclasses

6948122

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

refactor: make to protobuf abstract, change node to protobuf signature

d174087

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

fix: remove not needed methods

3a52303

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

alaeddine-13 reviewed Dec 30, 2022

docarray/predefined_document/audio.py (outdated)

tests/integrations/predefined_document/test_audio.py (outdated)

docarray/typing/tensor/abstract_tensor.py (outdated)

anna-charlotte added 3 commits December 30, 2022 10:12

fix: change remote audio file to file from github

a0be12e

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

fix: raw content from remote file

9623d29

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

fix: path to github remote file

6efdcf2

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

anna-charlotte commented Jan 2, 2023


samsja reviewed Jan 2, 2023


JohannesMessner requested changes Jan 2, 2023


anna-charlotte added 6 commits January 2, 2023 14:44

refactor: tensor field name to proto field name

5026543

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

test: remove redundant test in test audio tensor

703de43

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

fix: load audio url to audio ndarray instead of np ndarray

83ece31

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

refactor: move n dim to computational backend

de079e2

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

docs: update docstrings for audio tensors

2ef1350

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

feat: make dtype in audiourl load optional

d51d38e

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

JohannesMessner approved these changes Jan 3, 2023


samsja approved these changes Jan 3, 2023


anna-charlotte added 3 commits January 3, 2023 10:10

Merge branch 'feat-rewrite-v2' into feat-add-audio-v2

3901cfa

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

test: fix document refactor and ndarray import

a571898

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

fix: fix mypy check

71af630

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>

anna-charlotte requested a review from JoanFM January 3, 2023 09:43

JoanFM approved these changes Jan 3, 2023


JoanFM merged commit da3b7f0 into feat-rewrite-v2 Jan 3, 2023

JoanFM deleted the feat-add-audio-v2 branch January 3, 2023 09:45


5 participants
