Conversation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📜 Recent review details⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
📝 WalkthroughWalkthroughAdds COCO annotation import and export: importer accepts Changes
Sequence DiagramsequenceDiagram
participant User
participant QueryResult
participant FieldResolver as _resolve_annotation_field
participant AnnotationStore as AnnotationsCollector
participant RepoAPI
participant ImageDownloader
participant COCOWriter as export_to_coco_file
participant FS as FileSystem
User->>QueryResult: export_as_coco(download_dir, annotation_field, ...)
QueryResult->>FieldResolver: resolve annotation_field
FieldResolver-->>QueryResult: resolved_field
QueryResult->>AnnotationStore: collect annotations from resolved_field
AnnotationStore-->>QueryResult: annotations list
alt annotations exist
QueryResult->>RepoAPI: request image files (prepend source_prefix)
RepoAPI-->>ImageDownloader: stream/download images
ImageDownloader-->>QueryResult: local image paths
QueryResult->>COCOWriter: build COCO JSON (images, annotations, categories)
COCOWriter->>FS: write output JSON (output_filename)
FS-->>COCOWriter: write confirmation
COCOWriter-->>QueryResult: output Path
QueryResult-->>User: return Path
else no annotations
QueryResult-->>User: raise RuntimeError
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
dagshub/data_engine/annotation/importer.py (1)
83-92:⚠️ Potential issue | 🟡 MinorAdd else clause to handle unsupported annotation types.
If
annotations_typeis not one of the supported values,annotation_dictremains unbound, causing anUnboundLocalErroron line 92.Proposed fix
elif self.annotations_type == "coco": annotation_dict, _ = load_coco_from_file(annotations_file) + else: + raise ValueError(f"Unsupported annotation type: {self.annotations_type}") return annotation_dict
♻️ Duplicate comments (1)
dagshub/__init__.py (1)
1-1:⚠️ Potential issue | 🔴 CriticalVersion changes should not be included in PRs.
The version bump has been flagged previously as a blocker issue. Version changes should be handled separately from feature PRs, typically during the release process.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 13ca3252-8076-4a28-a78a-8200f0235dc6
📒 Files selected for processing (9)
dagshub/__init__.pydagshub/auth/token_auth.pydagshub/data_engine/annotation/importer.pydagshub/data_engine/annotation/metadata.pydagshub/data_engine/model/query_result.pysetup.pytests/data_engine/annotation_import/test_coco.pytests/data_engine/conftest.pytests/mocks/repo_api.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: build (3.11)
- GitHub Check: build (3.10)
- GitHub Check: build (3.9)
- GitHub Check: build (3.13)
- GitHub Check: build (3.12)
🔇 Additional comments (11)
dagshub/auth/token_auth.py (1)
40-40: Readability improvement with no behavior change.Line 40 is an equivalent condition and reads more idiomatically in Python. No concerns here.
dagshub/data_engine/annotation/metadata.py (2)
284-284: Move import to module level.The import should be at the top of the file with other imports, not inside the function.
Proposed fix
Add to imports at top of file:
from dagshub_annotation_converter.converters.coco import load_coco_from_json_stringThen remove line 284.
274-294: LGTM on method implementation.The method correctly:
- Parses COCO JSON via the converter
- Rewrites all annotation filenames to the datapoint's path (expected behavior when adding annotations to a specific datapoint)
- Extends the existing annotations list
- Persists via
_update_datapoint()The filename rewriting is intentional here since you're associating external annotations with a specific datapoint.
tests/data_engine/annotation_import/test_coco.py (2)
20-23: Consider moving mock to shared fixture.This
mock_source_prefixfixture should be integrated into_create_mock_datasource()inconftest.pyso it works for all datasource fixture users, not just this test module.
1-218: Good test coverage for COCO functionality.The tests comprehensively cover:
- Import from disk and error handling for missing files
- Label Studio task conversion
add_coco_annotationfilename rewriting_resolve_annotation_fieldedge cases (explicit, auto, no fields, alphabetical ordering)- Export with various configurations (bbox coordinates, empty annotations error, custom classes, custom filename, multiple datapoints)
The helper functions (
_make_coco_json,_write_coco,_make_image_bbox,_make_qr) are well-structured and reusable.tests/mocks/repo_api.py (1)
116-118: LGTM!The mock
idproperty correctly mirrors the realRepoAPI.idsignature with a constant return value appropriate for test scenarios.tests/data_engine/conftest.py (1)
8-8: LGTM!Explicitly setting
source_type = DatasourceType.REPOSITORYensures the mocked datasource state correctly exercises REPOSITORY-specific code paths (e.g.,path_parts()revision extraction), which is necessary for the new COCO annotation tests.Also applies to: 29-29
dagshub/data_engine/model/query_result.py (3)
783-791: LGTM!Good refactoring to centralize annotation field resolution. The alphabetical sorting provides deterministic selection when multiple annotation fields exist, and the error message is clear when no fields are found.
871-917: LGTM on overall export_as_coco structure.The method follows the established pattern from
export_as_yolo: resolves annotation field, collects annotations, prefixes filenames with source prefix, downloads images, and exports to the target format. Error handling for empty annotations is appropriate.
901-903: No changes required. The code correctly assignsdict(classes)tocontext.categories, which matches the documented COCO format specification of{id: name}. The importedCategoriesclass is from the YOLO module and should not be used withCocoContext, as it is format-specific to YOLO, not COCO.> Likely an incorrect or invalid review comment.dagshub/data_engine/annotation/importer.py (1)
160-163: Defensive handling for None filenames.This change prevents
TypeErrorwhenremap_funcis called withNone. The fallback tonew_filenameis reasonable since the annotation belongs to the image keyed by that filename in the dictionary.However, a
Nonefilename from the converter could indicate a bug upstream. Consider logging when this fallback is triggered to aid debugging.
Based on changes to dagshub-annotation-converter to the same effect: DagsHub/dagshub-annotation-converter#26
Implementation Details
The PR extends annotation support through a multi-layered approach:
Importer Module Enhancements (
importer.py):AnnotationTypeliteral to include "coco",Metadata Annotations Support (
metadata.py):add_coco_annotationmethod to load COCO-format JSON and integrate annotations into metadataQuery Result Export API (
query_result.py):export_as_coco()- generates COCO JSON with optional class mapping and custom output filename_resolve_annotation_fieldto auto-select available annotation field with alphabetical fallbackTest Coverage:
test_coco.py(218 lines): Import/export flows, filename rewriting, field resolution, class mapping, output customizationCode Quality and Testing Infrastructure:
ruff.tomllinting configuration with comprehensive rule set including:__init__.pyandconftest.pyDatasourceType.REPOSITORYassignment in test fixtures (tests/data_engine/conftest.py)idproperty toMockRepoAPItest utilitytoken_auth.py(no behavioral change - logical equivalence in boolean expression)Documentation Note: Per PR comments, the supported formats list at dagshub.com/docs/use_cases/annotation/#supported-formats-for-annotation_imports requires manual update to reflect COCO BBOX, and COCO Segmentation support.