Skip to content

#330 - Enable deep copy#331

Open
dmnc-grdnr wants to merge 19 commits intodkpro:mainfrom
dmnc-grdnr:feature/330-enable-deep-copy
Open

#330 - Enable deep copy#331
dmnc-grdnr wants to merge 19 commits intodkpro:mainfrom
dmnc-grdnr:feature/330-enable-deep-copy

Conversation

@dmnc-grdnr
Copy link
Contributor

This PR adds the deep copy functionality to the CAS object.

Tests for the function was added to test_cas.py

I have verified the functionality on different CAS files, but due to the complexity of the data structure, there might be something I have overlooked.

Therefore feedback is highly appreciated!

@reckart reckart changed the title feature/330-enable-deep-copy #330 - Enable deep copy Nov 10, 2025
@reckart reckart requested a review from Copilot November 10, 2025 19:36
@reckart reckart added this to the 0.11.0 milestone Nov 10, 2025
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds deep copy functionality to the CAS (Common Analysis Structure) object, allowing users to create independent copies of CAS instances with or without duplicating the associated typesystem.

Key changes:

  • Added a deep_copy() method to the Cas class with optional typesystem copying
  • Implemented comprehensive test coverage for both copying modes (with and without typesystem)

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File Description
cassis/cas.py Implements the deep_copy() method to create deep copies of CAS objects, handling sofas, views, feature structures, and their references
tests/test_cas.py Adds two test cases verifying deep copy functionality with and without typesystem copying

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@reckart
Copy link
Member

reckart commented Nov 10, 2025

@dmnc-grdnr Could you please consider the Copilot review. If you believe and of its review remarks are wrong, please comment them.

@reckart
Copy link
Member

reckart commented Nov 11, 2025

Oh, and thank you again for another PR :)

@dmnc-grdnr
Copy link
Contributor Author

I reviewed the Copilot suggestions agreed with most, but found that a few lines are actually obsolete and should be removed instead of modified. This is my first pull request and I am unsure how to make the changes without breaking things. Could you please point me to what to do next?

Copy link
Member

@reckart reckart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much. To continue, it would be great if you could simply address the comments by updating the code and then committing the changes into this branch. Once the changes are in, the review will be repeated. Please do not use the "amend" function of git, just add another commit. Avoid force pushes, then you can't really break anything. Thanks!

@dmnc-grdnr
Copy link
Contributor Author

I was finally able to iron out all of the issues that appeared after including the random CAS tests. They were extremely helpful! I think, I made all the changes that you and Copilot required. Could you please have another look? Thanks!

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 12 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@reckart
Copy link
Member

reckart commented Jan 27, 2026

@dmnc-grdnr coming back to this, I re-ran copilot just for good measure before I would review the code again and possible merge it. Unfortunately, I have to say it found a few points that appear to be quite critical and which I may have quite easily missed in a manual review. Do you still have the capacity and interest to address those issues?

@dmnc-grdnr
Copy link
Contributor Author

Hello @reckart,

it took me some time, but I have not lost interest :)
I have read through all of the suggestions and commented wherever I was sure about the answer. The suggestions left uncommented (in my opinion) boil down to questions of defensive programming vs plausibility. What I mean is, should the code be able to copy a structurally damaged CAS or should it fail instead:
Should it copy a feature structure that does not have a name attribute?
Should it copy an FSArray, that contains None-elements or an FSArray which has an elements-attribute set to the value None?
Should the code expect there to be feature structures that are not returned by cas._find_all_fs but somehow their xmIDs are referenced in other feature structures?

Don't get me wrong, I don't mind applying these safety measuers at all. It just seems that they might be superflous. Then again my analysis might superficial instead ;)

(I'd be happy to have call about it too, if you like)

@reckart
Copy link
Member

reckart commented Feb 15, 2026

@dmnc-grdnr Best write me a mail.

@dmnc-grdnr
Copy link
Contributor Author

@reckart That's fine. But I have already posted my questions in my previous comment.

reckart added 4 commits March 17, 2026 15:21
* main:
  dkpro#328 - Remove annotations in a specified range
  dkpro#328 - Remove annotations in a specified range
  dkpro#328 - Remove annotations in a specified range
  dkpro#328 - Remove annotations in a specified range
  dkpro#328 - Remove annotations in a specified range
  dkpro#328 - Remove annotations in a specified range
  dkpro#328 - Remove annotations in a specified range
  dkpro#328 - Remove annotations in a specified range
  Issue dkpro#342: Improve cas_to_comparable_text
  dkpro#328 - Remove annotations in a specified range
  dkpro#328 - Remove annotations in a specified range

% Conflicts:
%	tests/test_cas.py
- ensure original and copy are fully decoupled
- Preserve None entries in FSArray copies and avoid warnings for None non-primitive features.
- Avoid creating a document annotation when checking for the language of the source CAS
- Support FSList and FSArray in cloning
- properly handle collection features that have None value
@reckart reckart force-pushed the feature/330-enable-deep-copy branch from 21989c9 to 3dca51e Compare March 17, 2026 20:44
- Avoid shared collection objects between original and clone
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

reckart added 4 commits March 17, 2026 22:23
- Fix handling of primitive lists
- Fix handling of inlined vs. reusable arrays
- Fix handling of inlined vs. reusable primitive arrays
- Added more testing
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

reckart added 2 commits March 18, 2026 18:57
- Fix handling of inlined vs. reusable primitive arrays
- Fix remembering which structures should be in the index and in which index and which should not be indexed
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

reckart added 3 commits March 18, 2026 19:26
- Remove print calls from tests
- Better tracking of which FS belongs to which view(s)
- No need to pass language and mime-type at CAS construction
@reckart reckart requested a review from Copilot March 18, 2026 19:20
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

- Also properly duplicate sofaArray if set
@reckart reckart requested a review from Copilot March 18, 2026 19:48
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

- Preserve active view
@reckart reckart requested a review from Copilot March 18, 2026 20:05
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines +1044 to +1053
for item in fs.elements:
if item is None:
referenced_list.append(None)
elif hasattr(item, "xmiID") and item.xmiID is not None:
referenced_list.append(item.xmiID)
else:
warnings.warn(
f"Standalone FSArray {fs.xmiID} contains an unidentifiable item; preserving as None in copy."
)
referenced_list.append(None)
Comment on lines +1113 to +1120
referenced_list.append(None)
elif hasattr(item, "xmiID") and item.xmiID is not None:
referenced_list.append(item.xmiID)
else:
warnings.warn(
f"Array feature '{feature.name}' of FS {fs.xmiID} contains an unidentifiable item; preserving as None in copy."
)
referenced_list.append(None)
Comment on lines +1044 to +1053
for item in fs.elements:
if item is None:
referenced_list.append(None)
elif hasattr(item, "xmiID") and item.xmiID is not None:
referenced_list.append(item.xmiID)
else:
warnings.warn(
f"Standalone FSArray {fs.xmiID} contains an unidentifiable item; preserving as None in copy."
)
referenced_list.append(None)
Comment on lines +966 to +973
head = current.head
if head is None:
referenced_list.append(None)
elif hasattr(head, "xmiID") and head.xmiID is not None:
referenced_list.append(head.xmiID)
else:
warnings.warn("FSList item without xmiID encountered during deep copy; preserving as None in copy.")
referenced_list.append(None)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants