In Python 3.8, I'm trying to mock up a validation JSON schema for the structure below:
{
    # some other key/value pairs
    "data_checks": {
        "check_name": {
            "sql": "SELECT col FROM blah",
            "expectations": {
                "expect_column_values_to_be_unique": {
                    "column": "col",
                },
                # additional items as required
            }
        },
        # additional items as required
    }
}
The requirements I'm trying to enforce include:
- At least one item in data_checksthat can have a dynamic name. Item keys should be unique.
- sqland- expectationskeys must be present
- sqlshould be a text string
- At least one item in expectations. Item keys should be unique.
- Within expectations, item keys must be equal to available methods provided bydir(class_name)
More advanced capability would include:
- Enforcing expectationsmethod items to only includekwargsfor that method
I currently have the following JSON schema for the data_checks portion:
"data_checks": {
    "description": "Data quality checks against provided sources.",
    "minProperties": 1,
    "type": "object",
    "patternProperties": {
        ".+": {
            "required": ["expectations", "sql"],
            "sql": {
                "description": "SQL for data quality check.",
                "minLength": 1,
                "type": "string",
            },
            "expectations": {
                "description": "Great Expectations function name.",
                "minProperties": 1,
                "type": "object",
                "anyOf": [
                    {
                        "type": "string",
                        "minLength": 1,
                        "pattern": [e for e in dir(SqlAlchemyDataset) if e.startswith("expect_")],
                    }
                ],
            },
        },
    },
},
This JSON schema does not enforce expectations to have at least one item nor does it enforce valid method names for the nested keys as expected from [e for e in dir(SqlAlchemyDataset) if e.startswith("expect_")]. I haven't really looked into enforcing kwargs for the selected method (is that even possible?).
I don't know if this is related to things being nested, but how would I enforce the proper validation requirements?
Thanks!
