
Check that the samples in a model output data tbl contain the appropriate number of samples for each compound idx.

Usage

check_tbl_spl_n(tbl, round_id, file_path, hub_path)

Arguments

tbl

a tibble/data.frame of the contents of the file being validated. Column types must all be character.

round_id

character string. The round identifier.

file_path

character string. Path to the file being validated relative to the hub's model-output directory.

hub_path

Either a character string path to a local Modeling Hub directory or an object of class <SubTreeFileSystem> created using functions s3_bucket() or gs_bucket() by providing a string S3 or GCS bucket name or path to a Modeling Hub directory stored in the cloud. For more details, consult the "Using cloud storage (S3, GCS)" article in the arrow package documentation. The hub must be fully configured with valid admin.json and tasks.json files within the hub-config directory.
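Because tbl must have all-character columns, a data frame read with typed columns needs coercing before it is passed in. The following is a minimal base R sketch using an invented data frame; the column names and values are illustrative assumptions, not taken from any hub.

```r
# Invented example data; the column names are assumptions for illustration.
tbl <- data.frame(
  location = c("US", "US"),
  horizon = c(1L, 2L),
  output_type = c("sample", "sample"),
  output_type_id = c(1L, 2L),
  value = c(10.5, 12.1)
)

# Coerce every column to character while keeping the data.frame structure.
tbl[] <- lapply(tbl, as.character)

# Confirm that all columns are now character.
all_chr <- all(vapply(tbl, is.character, logical(1)))
print(all_chr)
```

Assigning into tbl[] rather than tbl keeps the object a data frame instead of turning it into a plain list.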

Value

Depending on whether validation has succeeded, one of:

  • <message/check_success> condition class object.

  • <warning/check_failure> condition class object.

The returned object also inherits from class <hub_check>.
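Since the outcome is signalled through the class of the returned object, calling code can branch on it with inherits(). The sketch below uses a hand-built mock object rather than a real return value; its class vector and message are assumptions based only on the class names listed above, and describe_check() is a hypothetical helper.

```r
# Mock return value: built by hand, NOT produced by check_tbl_spl_n().
# The class vector is an assumption based on the classes named above.
res <- structure(
  list(message = "mock failure message", errors = list()),
  class = c("check_failure", "hub_check", "warning", "condition")
)

# Hypothetical helper that maps a check object to a short status label.
describe_check <- function(x) {
  stopifnot(inherits(x, "hub_check"))
  if (inherits(x, "check_success")) {
    "passed"
  } else if (inherits(x, "check_failure")) {
    "failed"
  } else {
    "other"
  }
}

status <- describe_check(res)
print(status)
```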

Details

Output of the check includes an errors element, a list of items, one for each compound_idx failing validation, with the following structure:

  • compound_idx: the compound idx that failed validation of number of samples.

  • n: the number of samples counted for the compound idx.

  • min_samples_per_task: the minimum number of samples required for the compound idx.

  • max_samples_per_task: the maximum number of samples allowed for the compound idx.

  • compound_idx_tbl: a tibble of the expected structure for samples belonging to the compound idx. See hubverse documentation on samples for more details.
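The structure described above lends itself to a quick per-compound-idx summary. The following base R sketch works on a mock errors list; every value in it is invented, and the types used (for example, compound_idx as a character string) are assumptions rather than a guarantee of what the function returns.

```r
# Hypothetical stand-in for the `errors` element of a failed check.
# All values and column names are invented for illustration.
errors <- list(
  list(
    compound_idx = "1",
    n = 2L,
    min_samples_per_task = 5L,
    max_samples_per_task = 10L,
    compound_idx_tbl = data.frame(location = "US", horizon = "1")
  ),
  list(
    compound_idx = "4",
    n = 12L,
    min_samples_per_task = 5L,
    max_samples_per_task = 10L,
    compound_idx_tbl = data.frame(location = "CA", horizon = "2")
  )
)

# Flatten the scalar fields of each item into one row per compound idx
# and label whether the count fell below or above the permitted range.
summary_tbl <- do.call(rbind, lapply(errors, function(e) {
  data.frame(
    compound_idx = e$compound_idx,
    n = e$n,
    min_samples_per_task = e$min_samples_per_task,
    max_samples_per_task = e$max_samples_per_task,
    problem = if (e$n < e$min_samples_per_task) "too few" else "too many"
  )
}))

print(summary_tbl)
```

The compound_idx_tbl element is left out of the summary because it is itself a table; it can be inspected separately for each failing item, for example with errors[[1]]$compound_idx_tbl.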