Configuration#

A number of different thresholds and paths can be defined in the configuration file. An example configuration, config.yaml, is provided in the repository. Additionally, configuration files for the gene-specific recommendations for the hereditary breast and ovarian cancer risk genes ATM (v.1.3.0), BRCA1 (v.1.1.0), BRCA2 (v.1.1.0), CDH1 (v.3.1.0), PALB2 (v.1.1.0), PTEN (v.3.1.0), and TP53 (v.1.4.0) are provided (see the gene_specific folder).

For further information on the gene-specific guidelines, please see the documentation on ClinGen.

Configuration name#

At the top of each configuration file a name and a version can be defined. The version and name of the configuration are returned together with the final classification result.
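As a minimal sketch, the top of a configuration file might look as follows. The key names name and version follow the description above, but their exact spelling and the values shown are assumptions and are not copied from the repository.

```yaml
# Hypothetical header of a configuration file; both values are placeholders.
name: ACMG example configuration
version: "1.0.0"   # quoted so that YAML always reads the version as a string
```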

Rules#

In the rules section of the configuration file, the rules to be applied can be defined as a list chosen from the implemented rules. The processing of the rules is not case sensitive.
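A rules section could be sketched as below. The section key rules is an assumption based on the heading. The rule names are ones mentioned elsewhere in this documentation, and the selection is purely illustrative.

```yaml
# Assumed layout of the rules section; the selection of rules is illustrative only.
rules:
  - BA1_with_absolute
  - bs1_with_absolute          # lower case works too, since rule processing is not case sensitive
  - BS2_supporting
  - PP3_protein_mult_strength
  - BP4_protein_mult_strength
```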

Disease relevant transcript#

In this section of the configuration file a disease relevant transcript or a list of disease relevant transcripts can be defined. If no disease relevant transcript is defined, all transcripts will be analysed and returned, with the results for the MANE transcript being returned as the main result.

The name needs to be the Ensembl transcript ID without the version extension. Under nmd_threshold the last nucleotide position that may cause nonsense-mediated decay can be documented. pos_last_known_patho_ptc allows for the definition of the last known pathogenic premature termination codon. This transcript-specific information is needed for the application of PVS1, especially in the gene-specific recommendations.
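The following sketch shows how one transcript entry might be written. The keys name, nmd_threshold, and pos_last_known_patho_ptc are the ones described above. The section key, the list layout, the transcript ID, and both positions are placeholders chosen for illustration.

```yaml
# Section key name and list layout are assumptions; all values are placeholders.
disease_relevant_transcripts:
  - name: ENST00000000001            # placeholder Ensembl transcript ID, without version extension
    nmd_threshold: 1000              # placeholder: last position that may cause nonsense-mediated decay
    pos_last_known_patho_ptc: 1200   # placeholder: last known pathogenic premature termination codon
```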

Thresholds prediction tools#

In this section thresholds for pathogenicity and splicing prediction tools can be defined. HerediClassify is designed to use only one prediction tool for pathogenicity prediction and one prediction tool for splicing prediction. Thresholds for pathogenicity prediction are defined under pathogenicity_prediction, and thresholds for splicing prediction under splicing_prediction. Both entries have the same structure. Under name the name of the prediction tool is set. The prediction tool should be the same as the one specified in the variant import JSON. Under benign and pathogenic the thresholds for benign and pathogenic evidence are set. Using direction, the direction of the comparison between the threshold and the prediction value is defined. The following directions are possible:

  • less

  • less_than_or_equal

  • greater

  • greater_than_or_equal

To define a threshold, use the form evidence strength: threshold, e.g. very_strong: 0.003.
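Put together, the two prediction entries might look like the sketch below. The keys pathogenicity_prediction, splicing_prediction, name, benign, pathogenic, and direction come from the description above. The enclosing section key, the placement of direction inside benign and pathogenic, the tool names, and all numeric values are assumptions for illustration only.

```yaml
# Enclosing key and the nesting of direction are assumptions; thresholds are placeholders.
prediction_tool_thresholds:
  pathogenicity_prediction:
    name: REVEL                        # example tool; must match the tool in the variant import JSON
    benign:
      direction: less_than_or_equal    # benign evidence if score <= threshold
      supporting: 0.25
    pathogenic:
      direction: greater_than_or_equal # pathogenic evidence if score >= threshold
      supporting: 0.75
  splicing_prediction:
    name: SpliceAI                     # example tool
    benign:
      direction: less_than_or_equal
      supporting: 0.1
    pathogenic:
      direction: greater_than_or_equal
      supporting: 0.2
```

Note that YAML requires a space after the colon for a key and value pair to be read as a mapping.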

Thresholds for splicing and pathogenicity prediction, at least at supporting level, are required for all implementations of BP4 and PP3. Furthermore, the threshold for splicing prediction is required for the application of PVS1 and BP7. Additionally, the threshold for splicing prediction is required for PM5, PM5_PTEN, PM5_TP53, BP1_annotation_cold_spot_strong, PM1_TP53, PS1_protein_spliceai, PS1_protein_enigma, PS1_splicing, PS1_splicing_clingen, PS1_protein_TP53, PS1_splicing_TP53, and PS1_splicing_pten.

When using the rules PP3_protein_mult_strength or BP4_protein_mult_strength, thresholds for higher evidence strengths can be defined using moderate, strong, and very_strong. For an example, see the configuration files under gene_specific/pejaver_mult_strength. When another implementation of PP3 or BP4 is used that does not apply an evidence strength higher than supporting, thresholds for higher evidence strengths will be ignored during the assessment of the criterion.
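For the multiple-strength rules, a pathogenicity entry with several evidence strengths might be sketched as follows. The strength keys supporting, moderate, strong, and very_strong are those named in the text. The nesting, the tool name, and the numeric values are placeholders, with the exception of very_strong: 0.003, which repeats the example given above. They are not the values from the files in the repository.

```yaml
# Hypothetical multi-strength thresholds; the nesting of direction is an assumption.
pathogenicity_prediction:
  name: REVEL                          # example tool
  benign:
    direction: less_than_or_equal
    supporting: 0.3                    # placeholder
    moderate: 0.2                      # placeholder
    strong: 0.02                       # placeholder
    very_strong: 0.003                 # value from the example in the text
  pathogenic:
    direction: greater_than_or_equal
    supporting: 0.65                   # placeholder
    moderate: 0.8                      # placeholder
    strong: 0.95                       # placeholder
```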

Likelihood thresholds#

Under likelihood thresholds the thresholds for all evidence strengths can be set. These thresholds are accessed by BS4, PP1, PP4_enigma, and BP5_enigma.

Allele frequency thresholds#

Here allele frequency thresholds can be defined as decimal figures. A threshold of 10% would therefore be set as 0.1.

  • threshold_ba1

    Threshold for BA1, required for all BA1 implementations.

  • threshold_ba1_absolute

    Threshold for minimum absolute allele count in subpopulation for BA1 to apply, required for BA1_with_absolute.

  • threshold_bs1

    Threshold for BS1, required for all BS1 implementations.

  • threshold_bs1_supporting

    Threshold for BS1 with supporting evidence, required for rule BS1_supporting.

  • threshold_bs1_absolute

    Threshold for minimum absolute allele count in subpopulation for BS1 to apply, required for BS1_with_absolute.

  • threshold_bs2

    Threshold for BS2, required for all BS2 implementations.

  • threshold_bs2_supporting

    Threshold for BS2 with supporting evidence, required for rule BS2_supporting.

  • threshold_pm2

    Threshold for PM2, required for all PM2 implementations.

  • threshold_cancerhotspots_ac

    Threshold for the application of mutational hotspot criterion, required for PM1_TP53.
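The keys listed above could be combined as in the following sketch. The section key is an assumption, and every value is a placeholder rather than a recommended threshold. Writing the absolute thresholds and threshold_cancerhotspots_ac as integer counts is also an assumption based on their descriptions and names.

```yaml
# Section key name is an assumption; all values are placeholders, not recommendations.
allele_frequency_thresholds:
  threshold_ba1: 0.05                # 5% written as a decimal
  threshold_ba1_absolute: 10         # minimum absolute allele count (assumed integer)
  threshold_bs1: 0.01
  threshold_bs1_supporting: 0.005
  threshold_bs1_absolute: 10         # minimum absolute allele count (assumed integer)
  threshold_bs2: 0.01
  threshold_bs2_supporting: 0.005
  threshold_pm2: 0.0001
  threshold_cancerhotspots_ac: 10    # assumed to be a count; only needed for PM1_TP53
```

Only the thresholds required by the rules listed in the rules section need to be present.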

Functional thresholds#

Under functional thresholds the threshold for the difference in protein length in percent can be defined. This is used for the implementation of PVS1 and PM4. Please use a decimal point: for a threshold of 10%, threshold_diff_len_prot_percent should be set to 0.1.
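As a short sketch, with the section key name being an assumption, the 10% example from the text would be written like this:

```yaml
# Section key name is an assumption; 0.1 corresponds to the 10% example in the text.
functional_thresholds:
  threshold_diff_len_prot_percent: 0.1
```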

Annotation files#

All required annotation files can be downloaded using the download scripts provided, which are explained in the installation instructions.

Gene-specific configurations#

Under this header the location of the gene-specific configurations is defined. Under root the folder in which the gene-specific configurations are located is defined. Underneath, every gene name can be mapped to the specific file that contains the gene-specific configuration. After loading the variant and the configuration, HerediClassify checks whether a gene-specific configuration is available for the gene and, if so, uses it. When no gene-specific configurations are defined, HerediClassify will use the parameters of the general configuration file for all variants. The gene-specific configurations only need to be defined in the general configuration file.
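A gene-specific section might be sketched as follows. The key root is taken from the description above. The section key, the folder path, and the file names are hypothetical and only illustrate the mapping from gene name to file.

```yaml
# Section key, path, and file names are hypothetical placeholders.
gene_specific_configs:
  root: /path/to/gene_specific       # folder containing the gene-specific configuration files
  atm: config_atm.yaml               # gene name mapped to a file inside root
  brca1: config_brca1.yaml
  brca2: config_brca2.yaml
```

In this sketch a variant in a gene without an entry, for example one outside the three listed, would be classified with the parameters of the general configuration file.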