AUTO: update migrate.py schema_version 5.2.3->5.3.0 #1292

Open: wants to merge 11 commits into base: main

github-actions bot (Contributor) commented Mar 11, 2025

This is an automated PR to update migrate.py from schema_version 5.2.3->5.3.0

@ejmolinelli requested a review from joyceyan on March 11, 2025, 14:24
@ejmolinelli changed the title from "AUTO: update migrate.py schema_version 5.2.3->6.0.0" to "AUTO: update migrate.py schema_version 5.2.3->5.3.0" on Mar 11, 2025
Adding Public and Private specific dataset updates, as well as CSR matrix checking according to single-cell-curation issue 1023.
Non_csr_list contains dataset_ids of datasets that have at least one non-csr matrix.
Adding Non_CSR_matrix check for checking sparsity of non csr matrices in migrate.py

# fmt: off
# ONTOLOGY TERMS TO UPDATE ACROSS ALL DATASETS IN CORPUS
# Initialization is AUTOMATED for newly deprecated terms that have 'Replaced By' terms in their ontology files
Collaborator

Let's keep these types of comments so this file can remain as a template for future migrations

Contributor

@jahilton I'll see if I can update the generator to keep the comments.

import anndata as ad

import pandas as pd
Collaborator

Can avoid pandas (below)

},
"development_stage": DEV_STAGE_AUTO_MIGRATE_MAP,
Collaborator

Keep as empty dict like the other fields for future use

]

# Dictionary for CURATOR-DEFINED remapping of deprecated feature IDs, if any, to new feature IDs.
GENCODE_MAPPER = {}
Collaborator

Keep


# Dictionary for CURATOR-DEFINED remapping of deprecated feature IDs, if any, to new feature IDs.
GENCODE_MAPPER = {}
df = pd.read_csv('migrate_files/non_csr_list.csv')
Collaborator

No need for a CSV since it's a single column. Remove the column header and you can read the file straight into a list without going through pandas.
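A minimal sketch of what that could look like, assuming the file holds one dataset_id per line with the header row removed. The helper name `load_non_csr_ids` and the `NON_CSR_LIST` constant are hypothetical and not part of migrate.py:

```python
from pathlib import Path


def load_non_csr_ids(path):
    """Read one dataset_id per line into a list, skipping blank lines."""
    with Path(path).open() as f:
        # Strip trailing newlines and surrounding whitespace; drop empty lines.
        return [line.strip() for line in f if line.strip()]


# Hypothetical usage; assumes the header row has already been removed:
# NON_CSR_LIST = load_non_csr_ids("migrate_files/non_csr_list.csv")
```

This drops the pandas import and leaves a plain list of strings that supports membership checks such as `dataset_id in NON_CSR_LIST`.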

# utils.replace_ontology_term(df, <ontology_name>, {"term_to_replace": "replacement_term", ...})
# elif collection_id == "<collection_2_id>":
# <custom transformation logic beyond scope of replace_ontology_term>
# ...
Collaborator

keep for future


dataset.var.drop(columns="feature_type", inplace=True)

if GENCODE_MAPPER:
Collaborator

keep for future

dataset = utils.remap_deprecated_features(adata=dataset, remapped_features=GENCODE_MAPPER)

# AUTOMATED, DO NOT CHANGE -- IF GENCODE UPDATED, DEPRECATED FEATURE FILTERING ALGORITHM WILL GO HERE.
if DEPRECATED_FEATURE_IDS:
Collaborator

keep for future

4 participants