Concept class
In computational learning theory in mathematics, a concept over a domain X is a total Boolean function over X. A concept class is a class of concepts. Concept classes are a subject of computational learning theory.
Concept class terminology frequently appears in model theory associated with probably approximately correct (PAC) learning.[1] In this setting, if one takes a set Y as a set of (classifier output) labels, and X is a set of examples, the map {\displaystyle c:X\to Y}, i.e. from examples to classifier labels (where {\displaystyle Y=\{0,1\}} and where c is a subset of X), c is then said to be a concept. A concept class {\displaystyle C} is then a collection of such concepts.
Given a class of concepts C, a subclass D is reachable if there exists a sample s such that D contains exactly those concepts in C that are extensions to s.[2] Not every subclass is reachable.[2] [why? ]
Background
[edit ]A sample {\displaystyle s} is a partial function from {\displaystyle X}[clarification needed ] to {\displaystyle \{0,1\}}.[2] Identifying a concept with its characteristic function mapping {\displaystyle X} to {\displaystyle \{0,1\}}, it is a special case of a sample.[2]
Two samples are consistent if they agree on the intersection of their domains.[2] A sample {\displaystyle s'} extends another sample {\displaystyle s} if the two are consistent and the domain of {\displaystyle s} is contained in the domain of {\displaystyle s'}.[2]
Examples
[edit ]Suppose that {\displaystyle C=S^{+}(X)}. Then:
- the subclass {\displaystyle \{\{x\}\}} is reachable with the sample {\displaystyle s=\{(x,1)\}};[2] [why? ]
- the subclass {\displaystyle S^{+}(Y)} for {\displaystyle Y\subseteq X} are reachable with a sample that maps the elements of {\displaystyle X-Y} to zero;[2] [why? ]
- the subclass {\displaystyle S(X)}, which consists of the singleton sets, is not reachable.[2] [why? ]
Applications
[edit ]Let {\displaystyle C} be some concept class. For any concept {\displaystyle c\in C}, we call this concept {\displaystyle 1/d}-good for a positive integer {\displaystyle d} if, for all {\displaystyle x\in X}, at least {\displaystyle 1/d} of the concepts in {\displaystyle C} agree with {\displaystyle c} on the classification of {\displaystyle x}.[2] The fingerprint dimension {\displaystyle FD(C)} of the entire concept class {\displaystyle C} is the least positive integer {\displaystyle d} such that every reachable subclass {\displaystyle C'\subseteq C} contains a concept that is {\displaystyle 1/d}-good for it.[2] This quantity can be used to bound the minimum number of equivalence queries[clarification needed ] needed to learn a class of concepts according to the following inequality:{\textstyle FD(C)-1\leq \#EQ(C)\leq \lceil FD(C)\ln(|C|)\rceil }.[2]