
Conference Paper

Data Frequency Coverage Impact on AI Performance


Published: April 15, 2025

Author(s)

Erin Lanus (Virginia Tech), Brian Lee (Virginia Tech), Jaganmohan Chandrasekaran (Virginia Tech), Laura Freeman (Virginia Tech), M S Raunak (NIST), Raghu Kacker (NIST), Richard Kuhn (NIST)

Conference

Name: 2025 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)
Dates: 03/31/2025 - 04/04/2025
Location: Naples, Italy
Citation: 2025 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), pp. 258-267

Abstract

Artificial Intelligence (AI) models use statistical learning over data to solve complex problems for which straightforward rules or algorithms may be difficult or impossible to design; however, a side effect is that models that are complex enough to sufficiently represent the function may be uninterpretable. Combinatorial testing, a black-box approach arising from software testing, has been applied to test AI models. A key differentiator between traditional software and AI is that many traditional software faults are deterministic, requiring a failure-inducing combination of inputs to appear only once in the test set for it to be discovered. On the other hand, AI models learn statistically by reinforcing weights through repeated appearances in the training dataset, and the frequency of input combinations plays a significant role in influencing the model’s behavior. Thus, a single occurrence of a combination of feature values may not be sufficient to influence the model’s behavior. Consequently, measures like simple combinatorial coverage that are applicable to software testing do not capture the frequency with which interactions are covered in the AI model’s input space. This work develops methods to characterize the data frequency coverage of feature interactions in training datasets and analyze the impact of imbalance, or skew, in the combinatorial frequency coverage of the training data on model performance. We demonstrate our methods with experiments on an open-source dataset using several classical machine learning algorithms. This pilot study makes three observations: performance may increase or decrease with data skew, feature importance methods do not predict skew impact, and adding more data may not mitigate skew effects.


Keywords

testing AI; combinatorial coverage; combinatorial frequency

Documentation

Publication:
https://doi.org/10.1109/ICSTW64639.2025.10962464
Preprint (pdf)

Supplemental Material:
None available

Document History:
04/15/25: Conference Paper (Final)

Topics

Security and Privacy

modeling, testing & validation

Technologies

artificial intelligence, combinatorial testing
