Schema for Statistics (statistics.json file)
Note
After careful consideration, we have made the decision to close new customer access to Amazon Sagemaker Model Monitor, effective 7/30/26. Existing customers can continue to use the service as normal. AWS continues to invest in security and availability improvements for Model Monitor, but we do not plan to introduce new features. For more information, see Amazon SageMaker Model Monitor availability change.
The schema defined in the statistics.json file
specifies the statistical parameters to be calculated for the baseline
and data that is captured. It also configures the bucket to be used by
KLL
{ "version": 0, # dataset level stats "dataset": { "item_count": number }, # feature level stats "features": [ { "name": "feature-name", "inferred_type": "Fractional" | "Integral", "numerical_statistics": { "common": { "num_present": number, "num_missing": number }, "mean": number, "sum": number, "std_dev": number, "min": number, "max": number, "distribution": { "kll": { "buckets": [ { "lower_bound": number, "upper_bound": number, "count": number } ], "sketch": { "parameters": { "c": number, "k": number }, "data": [ [ num, num, num, num ], [ num, num ][ num, num ] ] }#sketch }#KLL }#distribution }#num_stats }, { "name": "feature-name", "inferred_type": "String", "string_statistics": { "common": { "num_present": number, "num_missing": number }, "distinct_count": number, "distribution": { "categorical": { "buckets": [ { "value": "string", "count": number } ] } } }, #provision for custom stats } ] }
Notes
-
The specified metrics are recognized by SageMaker AI in later visualization changes. The container can emit more metrics if required.
-
KLL sketch
is the recognized sketch. Custom containers can write their own representation, but it won’t be recognized by SageMaker AI in visualizations. -
By default, the distribution is materialized in 10 buckets. You can't change this.