deid.dicom package
Subpackages
Submodules
deid.dicom.fields module
-
class deid.dicom.fields.DicomField(element, name, uid, is_filemeta=False)
Bases: object
A dicom field.
A dicom field holds the element, and a string that represents the entire nested structure (e.g., SequenceName__CodeValue).
-
name_contains(expression)
Determine if a name contains a pattern or expression.
Use re to search a field for a regular expression. The search covers the name, the (nested) keyword, and the string tag:
- name.lower: includes nested keywords (e.g., Sequence__Child)
- self.tag: the string version of the tag
- self.element.name: the human-friendly name, "Sequence Child"
- self.element.keyword: the name without nesting, "Child"
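The search described above can be sketched with the standard library alone. This is an illustration, not the deid implementation: the standalone function, its argument names, and the case-insensitive matching are all assumptions, and the sample field values are hypothetical.

```python
import re


def name_contains(expression, name, tag, element_name, keyword):
    """Return True if the expression is found in any representation of a field.

    Assumption: matching is case-insensitive; the real method may differ.
    """
    candidates = [name.lower(), tag, element_name, keyword]
    return any(re.search(expression, c, re.IGNORECASE) for c in candidates)


# Hypothetical field nested inside a sequence
field = dict(
    name="ReferencedStudySequence__ReferencedSOPClassUID",
    tag="(0008, 1150)",
    element_name="Referenced SOP Class UID",
    keyword="ReferencedSOPClassUID",
)

found = name_contains("sopclass", **field)
not_found = name_contains("PatientName", **field)
```

Searching several representations at once means a recipe can refer to a field by whichever form is most convenient.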
-
property stripped_tag
Return the stripped element tag.
-
property tag
Return a string of the element tag.
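As a rough illustration of the difference between the two properties, the sketch below assumes that "stripped" means removing the parentheses, comma, and space from the string tag. The helper name strip_tag is hypothetical and the actual property may be implemented differently.

```python
import re


def strip_tag(tag):
    """Remove parentheses, commas, and spaces from a string tag (assumed behavior)."""
    return re.sub(r"[(), ]", "", tag)


stripped = strip_tag("(0010, 0020)")
```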
-
deid.dicom.fields.expand_field_expression(field, dicom, contenders=None)
Get a list of fields based on an expression.
If no expression is found, return a single field. Options for fields include:
- endswith: filter to fields that end with the expression
- startswith: filter to fields that start with the expression
- contains: filter to fields that contain the expression
- select: filter based on DICOM element properties
- allfields: include all fields
- exceptfields: filter to all fields except those listed (| separated)
Returns: a list of DicomField objects
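The expanders listed above can be sketched over a plain list of field names. This is a simplified stand-in, not the library code: the function name expand_names, the "expander:argument" syntax, case-insensitive comparison, and substring matching for contains are assumptions, and the select expander is left out.

```python
def expand_names(expression, names):
    """Expand an 'expander:argument' expression into matching field names."""
    expander, _, argument = expression.partition(":")
    expander = expander.lower()
    arg = argument.lower()

    if expander == "allfields":
        return list(names)
    if expander == "exceptfields":
        excluded = {a.strip().lower() for a in argument.split("|")}
        return [n for n in names if n.lower() not in excluded]
    if expander == "endswith":
        return [n for n in names if n.lower().endswith(arg)]
    if expander == "startswith":
        return [n for n in names if n.lower().startswith(arg)]
    if expander == "contains":
        return [n for n in names if arg in n.lower()]

    # No expander found: treat the expression as a single field name
    return [expression]


names = ["PatientID", "PatientName", "StudyDate", "SeriesDate", "StudyTime"]
dates = expand_names("endswith:Date", names)
patient = expand_names("startswith:Patient", names)
rest = expand_names("exceptfields:PatientID|StudyTime", names)
single = expand_names("PatientID", names)
```

The real function returns DicomField objects rather than strings; plain names are used here only to keep the sketch self-contained.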
-
deid.dicom.fields.extract_item(item, prefix=None, entry=None)
Extract values from a dicom sequence depending on the type.
A helper function used when extracting a sequence.
Parameters: item (an item from a sequence.) –
-
deid.dicom.fields.extract_sequence(sequence, prefix=None)
Extract a sequence recursively.
Recursively flatten a pydicom.sequence.Sequence into a flat list of items, each keyed by its nested name. For example, a FieldB nested under FieldA would be returned as:
{'FieldA__FieldB': '111111'}
Parameters: - sequence (the sequence to extract, should be pydicom.sequence.Sequence) –
- prefix (the parent name) –
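The naming scheme for nested items can be illustrated without pydicom. In the sketch below, a list of dictionaries stands in for a pydicom.sequence.Sequence of datasets. The function flatten is hypothetical and only mirrors the double-underscore convention described above.

```python
def flatten(sequence, prefix=None):
    """Flatten a list of dicts (a stand-in for a DICOM sequence) recursively."""
    flat = {}
    for item in sequence:
        for keyword, value in item.items():
            # Join the parent name and the child keyword with a double underscore
            name = f"{prefix}__{keyword}" if prefix else keyword
            if isinstance(value, list):
                # A nested sequence: recurse with the current name as the prefix
                flat.update(flatten(value, prefix=name))
            else:
                flat[name] = value
    return flat


nested = [{"FieldA": [{"FieldB": "111111"}]}]
flat = flatten(nested)
with_prefix = flatten([{"FieldB": "111111"}], prefix="FieldA")
```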
deid.dicom.filter module
-
deid.dicom.filter.apply_filter(dicom, field, filter_name, value)
Essentially a switch statement to apply a filter to a dicom file.
Parameters: - dicom (the pydicom.dataset Dataset (pydicom.read_file)) –
- field (the name of the field to apply the filter to) –
- filter_name (the name of the filter to apply (e.g., contains)) –
- value (the value to set, if filter_name is valid) –
-
deid.dicom.filter.compareBase(self, field, expression, func, ignore_case=True)
Search a field for an expression.
compareBase takes either re.search (for contains) or re.match (for matches) and returns True if the given regular expression is contained or matched.
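The distinction between re.search and re.match is what separates contains from matches. A minimal sketch, operating on a plain string value rather than a dicom field (the name compare_base and its argument order are assumptions):

```python
import re


def compare_base(value, expression, func, ignore_case=True):
    """Apply re.search or re.match to a value and report whether it succeeded."""
    flags = re.IGNORECASE if ignore_case else 0
    return func(expression, str(value), flags) is not None


value = "Siemens CT"
contains_ct = compare_base(value, "ct", re.search)      # found anywhere
matches_ct = compare_base(value, "ct", re.match)        # must match at the start
matches_siemens = compare_base(value, "siemens", re.match)
case_sensitive = compare_base(value, "ct", re.search, ignore_case=False)
```

re.match anchors at the beginning of the string, so "ct" is contained in the value but does not match it.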
-
deid.dicom.filter.contains(self, field, expression)
Determine if a field value contains an expression.
contains returns true if the value of the identifier contains the string argument anywhere within it; otherwise, it returns false.
-
deid.dicom.filter.empty(self, field)
Determine if the value is empty.
empty returns True if the value is found to be "". If the field is not present in the dicom, it returns False (missing != empty).
-
deid.dicom.filter.endsWith(self, field, term)
Determine if a field value ends with an expression.
endsWith returns true if the value of the identifier ends with the string argument; otherwise, it returns false.
-
deid.dicom.filter.equals(self, field, term)
Returns true if the value of the identifier exactly equals the string argument; otherwise, it returns false.
-
deid.dicom.filter.equalsBase(self, field, term, ignore_case=True, not_equals=False)
Base of equals, with a variable for ignoring case (default True).
-
deid.dicom.filter.matches(self, field, expression)
Determine if a field value matches an expression.
matches returns true if the value of the identifier matches the regular expression specified in the string argument; otherwise, it returns false.
-
deid.dicom.filter.missing(self, field)
Determine if the dicom is missing a field.
missing returns True if the dicom is missing the field entirely. This means that the entire field is None.
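The difference between empty and missing is easy to confuse, so here is a small sketch using a plain dictionary as a stand-in for a dicom header. The two functions are simplified illustrations and take the header explicitly, unlike the filters above.

```python
def empty(header, field):
    """True only if the field is present and its value is an empty string."""
    if field not in header:
        return False  # missing != empty
    return header[field] == ""


def missing(header, field):
    """True if the field is not present at all (looking it up yields None)."""
    return header.get(field) is None


# Hypothetical header: PatientID is present but blank, PatientName is absent
header = {"PatientID": "", "StudyDate": "20200101"}
```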
deid.dicom.groups module
-
deid.dicom.groups.extract_fields_list(dicom, actions, fields=None)
Given a list of actions for a named group, extract values from the dicom based on the actions provided. This function always returns a list, intended to update a lookup that is used to further process the dicom.
-
deid.dicom.groups.extract_values_list(dicom, actions, fields=None)
Given a list of actions for a named group, extract values from the dicom based on the actions provided. This function always returns a list, intended to update a lookup that is used to further process the dicom.
deid.dicom.header module
-
deid.dicom.header.get_identifiers(dicom_files, force=True, config=None, strip_sequences=False, remove_private=False, disable_skip=False, expand_sequences=True)
Extract all identifiers from a dicom image.
This function returns a lookup by file name, where each value indexed includes a dictionary of nested fields (indexed by nested tag).
Parameters: - dicom_files (the dicom file(s) to extract from) –
- force (force reading the file (default True)) –
- config (if None, uses default in provided module folder) –
- strip_sequences (if True, remove all sequences) –
- remove_private (remove private tags) –
- disable_skip (do not skip over protected fields) –
- expand_sequences (if True, expand sequences; otherwise, skip them) –
-
deid.dicom.header.remove_private_identifiers(dicom_files, save=True, overwrite=False, output_folder=None, force=True)
Remove private identifiers.
remove_private_identifiers is a wrapper for the simple call to dicom.remove_private_tags; it reads in the files for the user and saves them accordingly.
-
deid.dicom.header.replace_identifiers(dicom_files, ids=None, deid=None, save=False, overwrite=False, output_folder=None, force=True, config=None, strip_sequences=False, remove_private=False, disable_skip=False)
Replace identifiers.
Replace identifiers using pydicom. This can be slow when writing and saving new files. To replace sequences, they need to be extracted with get_identifiers with expand_sequences set to True.
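The functions in this module are typically chained: collect files, extract identifiers, then replace them. The sketch below uses only the signatures listed on this page. The wrapper name deidentify is hypothetical, the imports are deferred so the snippet loads without deid installed, and the return value is simply whatever replace_identifiers returns.

```python
def deidentify(paths, output_folder):
    """Extract identifiers from dicom files and write cleaned copies.

    Assumes deid is installed; with config and deid left as None, the
    defaults described in this reference are used.
    """
    # Deferred imports: only needed when the function is actually called
    from deid.dicom.header import get_identifiers, replace_identifiers
    from deid.dicom.utils import get_files

    dicom_files = list(get_files(paths))

    # expand_sequences=True is needed if sequences should be replaced later
    ids = get_identifiers(dicom_files, expand_sequences=True)

    return replace_identifiers(
        dicom_files,
        ids=ids,
        save=True,
        overwrite=False,
        output_folder=output_folder,
    )
```

Keeping overwrite=False and writing to a separate output folder leaves the original files untouched.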
deid.dicom.parser module
-
class deid.dicom.parser.DicomParser(dicom_file, recipe=None, config=None, force=True, disable_skip=False)
Bases: object
Parse a dicom, performing one or more actions on fields.
A dicom parser serves as a cache to read in all fields from a dicom file. For each, we store the element and child elements.
-
add_field(field, value)
Add a field to the dicom.
If it’s already present, update the value.
-
define(name, value)
Add a function or variable to the lookup for later usage.
This can be used for functions, lists, or variables.
-
delete_field(field)
Delete a field from the dicom.
We do this by way of parsing all nested levels of a tag into actual tags, and deleting the child node.
-
property excluded_from_deletion
Return a once-evaluated list of fields that are not removed by REMOVE ALL or REMOVE SomeField because they later have to be changed by REPLACE or JITTER. This allows whitelisting fields from REMOVE ALL/SomeField so they can be changed if needed (i.e., obfuscation).
-
find_by_name(name)
Find fields by name.
Given a string, find all field objects that contain the name. Name can correspond to:
- a string of the tag, with or without the parens and comma/space
- a keyword
- a field name
-
find_by_values(values)
Find fields by values.
Given a list of values, find fields in the dicom that contain any of those values, as determined by a regular expression search.
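A value search of this kind can be sketched over a dictionary of field names and values. This is a simplified illustration, not the parser's method: it takes the fields explicitly, and it escapes each value so it is matched literally, which is an assumption.

```python
import re


def find_by_values(fields, values):
    """Return the names of fields whose value contains any of the given values."""
    if not values:
        return []
    # One alternation pattern; values are escaped so they match literally
    pattern = "|".join(re.escape(str(v)) for v in values)
    return [name for name, value in fields.items() if re.search(pattern, str(value))]


# Hypothetical header values
fields = {
    "PatientID": "ABC-123",
    "AccessionNumber": "X-ABC-123-Y",
    "StudyDate": "20200101",
}
hits = find_by_values(fields, ["ABC-123"])
```

Because the search is a containment check, a known identifier is also found when it is embedded inside another field's value.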
-
get_fields(expand_sequences=True)
Expand all dicom fields into a list, where each entry is a DicomField. If we find a sequence, we unwrap it and represent the location with the name (e.g., Sequence__Child).
-
get_nested_field(field, return_parent=False)
Retrieve a nested field.
Based on a DicomField, return the one referenced in self.dicom. If a delete is needed, then the parent should be returned as well.
-
property keep
Return a list of fields to keep in their original form, as defined by all KEEP actions in the recipe. Those fields are not impacted by REPLACE/JITTER actions.
-
load(dicom_file, force=True)
Load the dicom file.
Ensure that the dicom file exists, and use full path. Here we load the file, and save the dicom, dicom_file, and dicom_name.
-
parse(strip_sequences=False, remove_private=False)
Parse the dicom.
The parse action corresponds to iterating through fields, and for each one, saving a data structure with the full element, the string (with nested representation of the keywords) and the tag. We want to save all three in a flat list that is easy to search over, and also build up actions for the lookup on the first parsing.
-
perform_action(field, value, action, filemeta=False)
Perform an action on a field.
perform_action takes an action (dictionary with field, action, value) and performs the action on the loaded dicom.
Parameters: - field (a field for expand) –
- value (field value) –
- action (the action from the parsed deid to take) – a dictionary with:
"field" (e.g., PatientID): the header field to process
"action" (e.g., REPLACE): what to do with the field
"value": if needed, the field from the response to replace with
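The shape of an action dictionary can be illustrated with a small dispatcher over a plain dictionary header. This sketch is not the parser's logic: the function apply_action is hypothetical, only ADD, REPLACE, and REMOVE are handled, and REPLACE is treated like ADD, following the note on replace_field in this reference.

```python
def apply_action(header, action):
    """Apply one action dictionary to a dict standing in for a dicom header."""
    field = action["field"]
    name = action["action"]
    value = action.get("value")

    if name in ("ADD", "REPLACE"):
        # REPLACE uses the same path as ADD; the field usually already exists
        header[field] = value
    elif name == "REMOVE":
        header.pop(field, None)
    else:
        raise ValueError(f"Unsupported action in this sketch: {name}")
    return header


header = {"PatientID": "ABC-123", "PatientName": "Doe^Jane"}
apply_action(header, {"field": "PatientID", "action": "REPLACE", "value": "anon-1"})
apply_action(header, {"field": "PatientName", "action": "REMOVE"})
```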
-
replace_field(field, value)
Replace a value in a field.
This uses the same function as ADD, but likely the dicom has the value.
-
property skip
Return a list of fields to skip, as defined in self.config.
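Putting the class together, a typical session constructs the parser, parses the file, and then queries the cached fields. The sketch below uses only the methods and signatures listed above. The wrapper name fields_named is hypothetical, and the import is deferred so the snippet loads without deid installed.

```python
def fields_named(dicom_file, name, recipe=None):
    """Parse a dicom file and return the fields that match a name.

    Assumes deid is installed. The name can be a string tag, a keyword,
    or a field name, as described for find_by_name.
    """
    # Deferred import: only needed when the function is actually called
    from deid.dicom.parser import DicomParser

    parser = DicomParser(dicom_file, recipe=recipe)
    parser.parse(strip_sequences=False, remove_private=False)
    return parser.find_by_name(name)
```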
deid.dicom.tags module
Add tag will take a string for a tag (e.g., ) and define a new tag for it. By default, we give the type “Short Text.”
find_tag will search over tags in the DicomDictionary and return the tags found to match some term.
get private tags
Parameters: dicom (the pydicom.dataset Dataset (pydicom.read_file)) –
get_tag will return a dictionary with tag indexed by field. For each entry, a dictionary lookup is included with VR.
Parameters: field (the keyword to get tag for, eg "PatientIdentityRemoved") –
has_private will return True if the header has private tags
Parameters: dicom (the pydicom.dataset Dataset (pydicom.read_file)) –
Remove sequences from a dicom by removing the associated tag. We use dicom.iterall() to get all nested sequences.
Parameters: dicom (the loaded dicom to remove sequences) –
update_tag will update a value in the header if it exists; if not, nothing is added. This check is the only difference between this function and change_tag. If the user wants to add a value (that might not exist), the function add_tag should be used with a private identifier as a string.
Parameters: - dicom (the pydicom.dataset Dataset (pydicom.read_file)) –
- field (the name of the field to update) –
- value (the value to set, if name is a valid tag) –
deid.dicom.utils module
-
deid.dicom.utils.get_files(contenders, check=True, pattern=None, force=False, tempdir=None)
Get a generator for files.
get_files will take a list of single dicom files or directories, and return a generator that yields complete paths to all files.
Parameters: - contenders (a list of files or directories (contenders!)) –
- check (boolean to indicate if we should validate dicoms (default True)) –
- pattern (A pattern to use with fnmatch. If None, * is used) –
- force (force reading of the files, even if some headers are invalid) – Not recommended, as many non-dicom files will come through
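The file discovery step can be approximated with os.walk and fnmatch. This is a sketch of the idea only: the name iter_files is hypothetical, dicom validation (check and force) is not performed, and applying the pattern to single files as well as directory contents is an assumption.

```python
import fnmatch
import os


def iter_files(contenders, pattern=None):
    """Yield full paths for files given directly or found under directories."""
    pattern = pattern or "*"
    for contender in contenders:
        if os.path.isdir(contender):
            for root, _, filenames in os.walk(contender):
                for filename in sorted(filenames):
                    if fnmatch.fnmatch(filename, pattern):
                        yield os.path.abspath(os.path.join(root, filename))
        elif os.path.isfile(contender):
            if fnmatch.fnmatch(os.path.basename(contender), pattern):
                yield os.path.abspath(contender)
```

Returning a generator keeps memory use flat when a directory holds many files; wrap the call in list() when the paths are needed more than once.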
-
deid.dicom.utils.save_dicom(dicom, dicom_file, output_folder=None, overwrite=False)
Save a dicom file to an output folder.
We make sure not to overwrite unless the user has enforced it.
Parameters: - dicom (the pydicom Dataset to save) –
- dicom_file (the path to the dicom file to save (we only use basename)) –
- output_folder (the folder to save the file to) –
- overwrite (overwrite any existing file? (default is False)) –
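The path handling described for save_dicom (only the basename is used, and existing files are protected) can be sketched as follows. The helper resolve_output_path is hypothetical. Raising FileExistsError is an assumption about how a refused overwrite might be signalled, and the sketch requires an output folder even though the real parameter defaults to None.

```python
import os


def resolve_output_path(dicom_file, output_folder, overwrite=False):
    """Build the destination path from the basename and guard against overwrites."""
    destination = os.path.join(output_folder, os.path.basename(dicom_file))
    if os.path.exists(destination) and not overwrite:
        raise FileExistsError(f"{destination} exists; pass overwrite=True to replace it")
    return destination
```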