Matthieu Vergne's Homepage

Last update: 19/11/2017 11:16:50

How to formalise generic/specific?

Context

In our hyperconnected world, there is a strong tendency towards sharing and pooling resources, whether for optimisation purposes or simply to build communities around shared interests. Sometimes we share too little, with obvious redundancy suggesting that we can optimise further, and sometimes too much, with people struggling to use or adapt the shared elements in their own context. The proper balance is achieved when what is shared among everyone is actually put in common, while subgroups keep control over what is particular to them. Said another way, there are generic things which can be centralised for optimisation purposes, and specific things which should remain decentralised to avoid overnormalisation, which can lead to conflicts and rejection. Of course, we do not speak here about policies which enforce (de)centralisation, which can be motivated by various reasons; we speak about the objective observation that, in a given context, some things are shared and others are not.

This is particularly true in programming, where reinventing the wheel is usually considered a waste of time (unless done for training purposes), leading to a natural incentive towards centralising generic parts into libraries and reusing them in more specific projects. This incentive towards building generic pieces for reuse comes from a more global need to produce independent, simple modules which can be combined to do more complex tasks, a modular aspect supported by many programming languages today. Building generic components for reuse in specific contexts is one aspect of this modularity: generic components are useless if taken alone, but gain in interest with the many specific contexts in which we can use them. This interest is so strong that some people, like Goguen or Musser and Stepanov, have worked hard to identify how to systematically build generic components, a practice that is usually called Generic Programming (GP), although this term is subject to different interpretations. As a Java programmer, I am particularly seduced by the abstraction of Java, which hides the specific aspects of the machine through its JVM, and abstracting further to design generic programs has long been a personal tendency. A series of posts is planned on that matter, in order to investigate GP in Java and describe some techniques to help make generic programs. The current post should be of particular help for this series.

This separation between generic and specific is also a perspective that we can find in Artificial Intelligence (AI), where many subfields deal with specific goals, like Computer Vision, Expert Systems, Machine Learning, or Machine Translation, while the subfield of Artificial General Intelligence (AGI) focuses on the generic aspects of intelligence that should be common to all the others. From our perspective, the complementarity observed in AI has led us to think deeper about how AGI focuses on obtaining high performance from machines in a domain-generic manner, while other subfields deal with domain-relevant tasks, which we relate respectively to intelligence and expertise, two complementary ways to achieve high performance. In order to better grasp this notion and go towards proper implementations, such informal definitions are however not enough, and proper formalisation should be provided. This is the main incentive of this post: to formalise what we mean by generic and specific, two terms that we consider complementary, a step which should be completed with the formalisation of other concepts, like relevance and domain.

Consequently, our aim is to clarify the meaning that we assign to the notions of genericness and specificness, but at the same time we want this meaning to be the most natural one, not a mere personal definition for the sake of our isolated work. Formalising natural language depends primarily on how we understand it, which is why we need to focus on a specific interpretation. We choose here to rely on dictionaries because we assume that their definitions are the most natural to expect, first because they build on previous and current usage, and second because our experience shows that they are often accepted as relevant references in discussions where people have different understandings of a term. Then, from the definitions which best fit the notions we want to cover, we want to design a formal version that can be reused for computation purposes, in particular in AI and GP. Going further, in addition to considering both genericness and specificness, we also need to consider two variants of these notions: absolute and relative. If we take the example of genericness, by absolute we mean that something is generic, like a piece of code which is shared by a group of projects, while by relative we mean that something is more generic than something else, like a piece of code reused more often than another. In the case of specificness, we would say that something is specific with the absolute stance, and that something is more specific than something else with the relative stance.

Question

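Our main question is, consequently, about the formalisation of these two notions of genericness and specificness from two perspectives, absolute and relative, which leads us to consider 4 questions:

What does it mean for a property to be generic?
What does it mean for a property to be specific?
What does it mean for a property to be more generic than another?
What does it mean for a property to be more specific than another?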

Although we speak here about properties, a piece of code X shared by several projects can be described as the property "using the piece of code X" shared by all these projects. Similarly, a task X can be described as the property "executing the task X", or a skill X as the property "having the skill X". Speaking about properties thus simply keeps us at a high level of abstraction, in order to cover as many cases as possible.

Method

To answer these questions, we first need to get a better grasp of the usual meanings behind these terms, which is why we first look at established English dictionaries. Although dictionary definitions are important, they are still natural language definitions, and we will see below that they can fall short of providing a precise definition. So we also check the etymology of the words to better grasp their differences, and exploit these aspects to design additional concepts that help us to better represent the meanings we want to cover. Finally, once the informal definitions are precise enough, we go through the formalisation process.

Dictionary Definitions

If we ignore definitions about brands and biology, Merriam-Webster defines something generic as "relating to or characteristic of a whole group or class" or "having no particularly distinctive quality or application", in other words something which appears to be always present within a group of elements. We can highlight that it is not about mere redundancy, or even being common in the group, but about something so frequent that every member of the group is concerned. The Cambridge dictionary remains on similar definitions, while the Oxford dictionary also defines it as "not specific", which gives us proper support for opposing the two terms.

This opposition gives us the opportunity to look also for the definition of specific. Starting from the Cambridge dictionary, which is the simplest, we say that something is specific when "relating to one thing and not others". We ignore the alternative definition which makes it a synonym of exact, which opposes the notion of imprecision more than genericness. Although the first definition is rather absolute by speaking about one thing only, the given examples mainly illustrate specificness applied to a group of things, like "specific cells" or "specific purposes" (in plural forms). Moreover, it highlights the presence of two kinds of elements: the ones which have the property and the ones which do not, a perspective which differs from the previous notion of genericness, which is expressed only in terms of the elements having the property. In other words, something specific appears, like something generic, to pertain to a group of elements; but unlike generic, which is shared by all the elements we may speak about, specific clearly entails the idea that there are other elements which do not pertain to this group. Looking at the Oxford dictionary, we retrieve the absolute definition of "belonging or relating uniquely to a particular subject" as well as the synonym of exact. We also have a definition closer to the meaning we identified in the Cambridge dictionary, with "relating to species or a species", but as a biological interpretation only. Other and more complex definitions are also given, but again without a clear idea of how they would oppose the notion of genericness defined previously, which is surprising if we consider that this dictionary is the one defining generic as "not specific". It is finally Merriam-Webster which provides the best description of the notion of specificness we want to represent. Indeed, although we retrieve the idea of specific as a synonym of exact, we also find several definitions of interest, like "constituting or falling into a specifiable category", which we understand as something which can be expressed as a shared property for the elements of the category, a meaning better expressed by "sharing or being those properties of something that allow it to be referred to a particular category". We also retrieve the idea of a unique individual with "restricted to a particular individual, situation, relation, or effect", but again illustrated with a group (i.e. "a disease specific to horses"). We retrieve other definitions provided by the Oxford dictionary, but again they do not show a clear opposition to generic, thus we ignore them.

All these definitions, however, show a paradox: although we may interpret generic and specific as opposed notions, the definitions mainly highlight their similarity, which is that both generic and specific can be used to speak about properties shared by a group of elements. There is nevertheless a subtle detail which makes the difference: generic implies that a property is shared by all the elements of the group, while specific also implies that at least some elements (out of the group) do not share this property. Indeed, if we say that "formalising genericness" is specific to this post, we mainly mean that the other posts on this blog speak about different topics (we will avoid making any claim regarding the rest of the Web). Conversely, "being written in English" is a generic property of this post, because everything here is written in English, but it does not say anything regarding the other posts. Once we have understood this difference, can we really consider that generic and specific oppose each other? If not, what is the opposite of a property being generic? And the opposite of a property being specific? To better grasp the different characteristics and properly choose our wording, let's go back to the roots of these words.

A Bit of Etymology for Additional Concepts

Generic comes from the Latin word genus, which means kind, type, species, or group. Specific comes from the Latin word specificus, which is the composition of speciēs (kind) and faciō (make). We see that both build on the notion of kind, but with different terms: genus on one side and speciēs on the other. Speciēs is itself a derivation of speciō, which means observing or watching. With speciēs, we then speak about a view or appearance, thus a kind in the sense of an observable property, which leads us to interpret specificus as the ability to identify this property. To identify a property, one should at least be able to discern when it applies and when it does not (in a world where everything is red, what is the point of saying that something in particular is red?), so it makes sense to speak about something specific when we are able to identify both elements having the property and elements not having it. On the other side, once we have established what is part of the group and what is not, we can describe the part they all share, their shared gene, as a generic property of the group, independently of other elements out of it.

Once the etymology of the words is clarified, we can then think about words which could express the opposite meanings that we are looking for. In particular, if a generic property is a genus shared in the whole group, how to express that such a genus is partially or totally absent from the group? We may speak about common and uncommon properties, but these notions translate more the idea of majority: if a property is possessed by most of the elements of the group, it is common, while uncommon means that most of them do not have this property. Some may think about using relationships used in meronymy, such that a property is a member of a group, but it would be a mistake: the members of the group are elements having properties, not properties directly. We can say that a property is in (resp. out of) the group to avoid heavy sentences, but it should be interpreted as a property possessed by at least one element in (resp. out of) the group, not a direct membership relation between the property and the group. Consequently, a property can be possessed by elements in the group only, elements out of the group only, or shared by elements in the group and out of it, a case that meronymy does not cover (one cannot be member and not member of a group). Due to our inability to find existing words to properly convey these opposite meanings, we look at different prefixes and apply them to the notion of genericness:

a–: without, not
anti–: against
dis–: not, apart

Building on the prefixes above, we can interpret ageneric as a property without genericness in the group, such that it also covers properties which can be found in the group but for which we know that at least one element of the group does not have them. We can interpret antigeneric as a property against genericness in the group, or more formally a property which negates a generic property of the group, such that it can be possessed only by elements out of the group. We can interpret disgeneric as a property apart from genericness in the group, which seems more ambiguous to us and could be interpreted in both of the previous ways (i.e. apart from the ability of being generic or apart from elements having generic properties), so we restrict ourselves to the first two prefixes in the following. With these words, we coin neologisms, hopefully not producing a mere idiolect, but we are not linguists, so we are open to suggestions for improvements. Of course we know that natural languages are subject to individual interpretations, and this is also true for prefixes, as Lester M. Prindle reminds us, but we want to use words which best support the meaning we intend them to represent, and the most objective way to achieve this goal seems to be to engineer terms precisely for this purpose with the most suitable pieces we can find. In summary, as shown in the figure below, we start from all the possible properties (biggest circle), remove the generic properties to obtain the ageneric ones, and remove the properties still possessed by elements in the group to obtain the antigeneric ones.

Figure: Types of properties (nested circles, from the biggest to the smallest: All, Ageneric, Antigeneric).

Informal Summary Before Proper Formalisation

Now, several points can be clarified to better grasp how these concepts relate to elements in and out of the group, because they do not split the properties between properties only found in the group and properties only found out of it, as illustrated in the next figure. Indeed, if we consider all properties, there are properties which can be possessed by elements in the group and elements out of it, like the property "being an element", which is shared by all the elements. An ageneric property adds the constraint of having at least one element of the group not possessing it, but without constraining elements out of the group, thus it can again be possessed by both types of elements. An antigeneric property, on the contrary, negates a generic property of the group, which means that elements in the group necessarily do not possess this property (because they possess the generic one), thus restricting antigeneric properties to elements out of the group only. Several questions may then arise regarding how antigeneric properties and elements out of the group relate to each other. A first question is whether or not any element possessing an antigeneric property is out of the group, which by definition is true because an antigeneric property can only be possessed by elements out of the group; what makes it interesting is the answer to the next question. Indeed, we can also wonder whether or not any element out of the group possesses an antigeneric property, which can be answered positively because any group has a generic property "being in the group", negated by the antigeneric property "being out of the group", a property which is common to all the elements out of the group. From these two answers, we can already say that being out of the group is equivalent to having an antigeneric property, which draws a strong tie between antigeneric properties and elements out of the group. However, despite this equivalence, an element being out of the group does not make all its properties antigeneric: any element possesses the property "being an element", which is thus a generic property for any non-empty group, so any element has at least one generic property. Not only does this example show that being out of the group does not mean having only antigeneric properties but also that, given a group, no element has only antigeneric properties. These are important aspects that should not be contradicted when formalising our concepts, including the case of the empty group described next.

Figure: Properties in/out of the group (All and Ageneric: in or out of the group; Antigeneric: out of the group).

The empty group is a particular case and should be dealt with carefully. First, we consider whether or not any property possessed only by elements out of the group is antigeneric, which can be answered positively if we assume the law of excluded middle, which states that, for any element, a property is either possessed or not possessed by it. With this assumption, a property possessed only by elements out of the group is negated by a property which is necessarily possessed by all the other elements, including all the elements in the group. This negation is thus a generic property for the group, leading the negated property to be, by definition, an antigeneric property. The empty group, however, breaks the logic: if the group is empty, then the negation is a generic property for the (empty) group, but any element having this generic property is itself out of the group, which makes this generic property also antigeneric, which is absurd (an element cannot be in and out of the group). The case of the empty group thus leads to an undefined genericness, such that one cannot say whether a property is generic, ageneric, antigeneric, or, as we see next, specific.

As we can see so far, all these terms apply depending on the group considered, so a property can be generic for a given group but ageneric or antigeneric for another one. Before saying whether a property is of a given kind, one should first specify the group of elements considered, which means that our formalisation cannot define these terms without including a dependence on this group either. This is another requirement for us to comply with in our formalisation.
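To make this dependence on the group concrete, here is a minimal sketch in Python. The helper names `generic` and `antigeneric`, the property `is_even`, and the three groups are assumptions chosen for illustration only, and the groups are assumed non-empty.

```python
def generic(p, group):
    """True when every element of the (non-empty) group has property p."""
    return all(p(s) for s in group)


def antigeneric(p, group):
    """True when no element of the (non-empty) group has property p."""
    return not any(p(s) for s in group)


def is_even(n):
    """Hypothetical property used for the illustration."""
    return n % 2 == 0


evens = {2, 4, 6}
odds = {1, 3, 5}
mixed = {1, 2, 3}

# The same property changes kind depending on the group considered.
print(generic(is_even, evens))     # True: generic for this group
print(antigeneric(is_even, odds))  # True: antigeneric for this one
print(generic(is_even, mixed), antigeneric(is_even, mixed))  # False False
```

The last line shows a group for which the property is neither generic nor antigeneric, which is the remaining case discussed next.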

We can now close our analysis by summarising our interpretation of genericness and specificness based on the defined concepts, as illustrated in the figure below. One may speak about a generic property for a group of elements, which is a property that every element of the group has, while all the other properties are ageneric, which means that at least one element of the group does not have it. Antigeneric properties are properties negating these generic properties, so only elements out of the group can have them. Specific properties are the remaining ones: properties which are neither generic nor antigeneric, or equivalently ageneric properties which are not antigeneric, so not only are they not shared by all the elements of the group, but they are not totally absent from it either.

Figure: Generic and specific properties (labels: Generic, Specific, Antigeneric; In the group, Out of the group; Generic, Ageneric).

Formal Concepts to Build on

For our formalisation, we build on the mathematical notion of sets, such that a set \(S\) is a group of elements \(\{s_0, s_1, s_2, ...\}\). We can express the cardinality of this set through \(|S| \in \mathbb{N}\) in order to know how many elements it contains. We also need to speak about the properties possessed by these elements, which we model through predicates \(p: S \rightarrow \{\top, \bot\}\), so if \(p(s) = \top\) then \(s\) has the property \(p\), otherwise it does not have it or, equivalently, it has the opposite property \(\neg p\). Besides the choice of using sets and predicates, other basic characteristics are worth recalling, especially because they provide some support for the proofs of the next subsections. We describe them below, show what they mean in our context from a formal perspective, and derive what we need for later reuse.

One of them is the law of noncontradiction, which means that an element cannot both have a property and not have it, and from which we can infer that a set built on that assumption is necessarily empty:

\[\begin{align} \nonumber & \forall s \in S, p(s) \wedge \neg p(s) = \bot\\ \label{eq:noncontradiction} \Rightarrow& \{s \in S | p(s) \wedge \neg p(s) \} = \emptyset\\ \end{align}\]

The second one is the law of excluded middle, which assumes that an element either has a property or does not have it, nothing else (e.g. no uncertainty, no unknown state). Such an assumption implies that a set which combines elements having a property \(p\) and elements not having it simply contains every possible element:

\[\begin{align} \nonumber & \forall s \in S, p(s) \vee \neg p(s) = \top\\ \label{eq:excludedMiddle} \Rightarrow& \{s \in S | p(s) \vee \neg p(s) \} = S\\ \end{align}\]

The third property we use relates to the cardinality of the union of sets, which should not count twice the elements that we find in both sets:

\[\begin{align} \label{eq:unionCardinality} |A \cup B| = |A| + |B| - |A \cap B|\\ \end{align}\]

By union of sets, we mean that a set \(S_1 = \{s \in S | p_1(s)\}\) and a set \(S_2 = \{s \in S | p_2(s)\}\) form a union \(S_1 \cup S_2 = \{s \in S | p_1(s) \vee p_2(s)\}\), while their intersection corresponds to \(S_1 \cap S_2 = \{s \in S | p_1(s) \wedge p_2(s)\}\). From all these characteristics, we can infer that the cardinality of a set \(S\) can be expressed as the sum of the cardinalities of two sets: the set of the elements having a property \(p\) and the set of elements not having it, which we formalise below:

\[\begin{align} \nonumber |S| &= |\{ s \in S | p(s) \vee \neg p(s)\}| &(\text{Eq \eqref{eq:excludedMiddle}})\\ \nonumber &= |\{ s \in S | p(s) \}| + |\{ s \in S | \neg p(s)\}| - |\{ s \in S | p(s) \wedge \neg p(s)\}| &(\text{Eq \eqref{eq:unionCardinality}})\\ \nonumber &= |\{ s \in S | p(s) \}| + |\{ s \in S | \neg p(s)\}|&(\text{Eq \eqref{eq:noncontradiction}})\\ \end{align}\]

The last line can also be rewritten as:

\[\begin{align} \label{eq:complementCardinal} |S| - |\{ s \in S | p(s) \}| &= |\{ s \in S | \neg p(s)\}|\\ \end{align}\]
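As a sanity check, the characteristics above can be verified on a small finite set. The following Python sketch is only an illustration: the set `S` and the property `p` ("is a multiple of 3") are arbitrary assumptions, and Python sets stand in for the mathematical sets.

```python
S = set(range(10))          # {0, 1, ..., 9}


def p(s):
    """Hypothetical property: being a multiple of 3."""
    return s % 3 == 0


having = {s for s in S if p(s)}
not_having = {s for s in S if not p(s)}

# Law of noncontradiction: no element both has and does not have p.
assert having & not_having == set()
# Law of excluded middle: every element either has p or does not.
assert having | not_having == S
# Cardinality of the union: elements in both sets are not counted twice.
assert len(having | not_having) == len(having) + len(not_having) - len(having & not_having)
# Resulting identity: |S| - |{s | p(s)}| = |{s | not p(s)}|.
assert len(S) - len(having) == len(not_having)

print(len(S), len(having), len(not_having))  # 10 4 6
```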

Formalisation of Absolute Genericness and Specificness

Once the preliminary characteristics are identified, we can properly formalise our concepts of genericness and specificness based on the previous analysis of the terms. First, we define genericness as a property \(p\) shared by all the elements of a set \(S\):

\[\begin{align} \label{eq:formalGeneric} generic(p, S) &= \forall s \in S, p(s) \\ \end{align}\]

Particular attention should be paid to this definition, because it is subject to vacuous truth in the case of the empty set. Indeed, if the set is empty, we cannot find an element in it for which the property holds, but neither can we find an element for which it does not hold, so it is equally true to say one or the other. As we discussed in the analysis above, the empty set breaks the logic we want to formalise, so we interpret vacuous truth as an undefined state. Once we have formalised a generic property, we can define the ageneric property which opposes it:

\[\begin{align} \label{eq:formalAgeneric} ageneric(p, S) &= \neg generic(p, S) \\ \nonumber &= \neg (\forall s \in S, p(s)) \\ \nonumber &= \exists s \in S, \neg p(s) &(S \neq \emptyset)\\ \end{align}\]

Notice the condition of the last line, which is due to the empty set leading to a vacuous truth: because both a property and its negation can be considered as true, there is no reason to infer the last line, which would be false in the case of an empty set rather than undefined. Going further, if a property is totally absent from the set (a definition also subject to vacuous truth), it is antigeneric:

\[\begin{align} \label{eq:formalAntigeneric} antigeneric(p, S) &= \forall s \in S, \neg p(s) \\ \end{align}\]

Thus, we see our two ways of negating genericness: ageneric is about negating the whole relation (\(generic\) vs. \(\neg generic\)) while antigeneric is about negating the generic property (\(p\) vs. \(\neg p\)). From these definitions, we can then formally define specificness, stating like Fig. 3 that specific properties are neither generic nor antigeneric:

\[\begin{align} \label{eq:formalSpecific} specific(p, S) &= \neg generic(p,S) \wedge \neg antigeneric(p, S) \\ \nonumber &= \neg (\forall s \in S, p(s)) \wedge \neg (\forall s \in S, \neg p(s)) \\ \nonumber &= (\exists s \in S, \neg p(s)) \wedge (\exists s \in S, p(s)) &(S \neq \emptyset)\\ \label{eq:formalSpecificDetailed} &= \exists s_1, s_2 \in S, \neg p(s_1) \wedge p(s_2) \\ \end{align}\]

This formalisation properly describes that a specific property is a property which applies to at least some elements in the set but not all of them, making the property specific to some elements in the set. It is important not to read \(specific(p, S)\) as \(p\) being specific to the set \(S\), but as being specific in the set, such that we can find a subset to which it is specific. In other words, there is a set \(S_p \subset S\), which can be called the reduction of \(S\) through \(p\), such that \(0 < |S_p| < |S|\) (it is neither the empty set nor \(S\) itself) and \(generic(p, S_p)\). In contrast, the reduction of the set through a generic property leads to the set itself, while a reduction through an antigeneric property leads to the empty set. This notion of reduction is formally described in the next section, where we use it extensively to establish an order between properties, but before speaking about orders, we should close this section, which has not yet considered the extreme cases and how they impact our definitions.
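The four definitions translate almost literally into code. The sketch below is one possible Python transcription, assuming a non-empty finite set; the example set and the properties `is_positive`, `is_negative` and `is_even` are illustrative assumptions, not part of the formalisation.

```python
def generic(p, S):
    """Every element of S has property p."""
    return all(p(s) for s in S)


def ageneric(p, S):
    """Negation of the whole relation: p is not generic in S."""
    return not generic(p, S)


def antigeneric(p, S):
    """Negation of the property: every element of S has (not p)."""
    return all(not p(s) for s in S)


def specific(p, S):
    """Neither generic nor antigeneric."""
    return not generic(p, S) and not antigeneric(p, S)


S = {1, 2, 3, 4, 5, 6}


def is_positive(s):
    return s > 0


def is_negative(s):
    return s < 0


def is_even(s):
    return s % 2 == 0


print(generic(is_positive, S))      # True
print(antigeneric(is_negative, S))  # True
print(specific(is_even, S))         # True
print(ageneric(is_even, S), ageneric(is_negative, S))  # True True

# The subset in which the specific property becomes generic.
S_p = {s for s in S if is_even(s)}
print(0 < len(S_p) < len(S), generic(is_even, S_p))    # True True
```

Note that both the specific and the antigeneric properties are ageneric here, in line with specific properties being the ageneric ones which are not antigeneric.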

Although reductions can lead to an empty set, we can also wonder what happens when \(S\) itself is empty: are there generic or specific properties with an empty set? We already saw that our formalisations of generic and antigeneric are subject to vacuous truth in the case of the empty set, and we can see that it is also the case for specific, which requires, for the same reasons, the set to be non-empty in order to derive Equation \eqref{eq:formalSpecificDetailed} from Equation \eqref{eq:formalSpecific}. But specific is more fundamentally impacted by vacuous truth, which leads us to consider that a property both holds and does not hold, so we can say that this property is generic and antigeneric, inferring from Equation \eqref{eq:formalSpecific} that it is not specific. But similarly, because it both does not hold and holds, we can also say that it is neither generic nor antigeneric, such that we can infer from the same equation that it is, indeed, specific. Thus, just as we interpret this kind of paradox as an undefined state for generic and antigeneric, we also consider it as an undefined state for specific (and for ageneric, which directly depends on generic), which allows us to remain consistent with the requirements established in section 3.3.
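In code, this undefined state has to be handled explicitly, because a naive transcription silently returns the vacuous truth. The following Python sketch shows one possible way to do so, under the assumption (ours, for illustration) that the undefined state is represented by `None`; the function names mirror the formal definitions.

```python
from typing import Callable, Collection, Optional

# Python's built-in all() is vacuously true on an empty collection.
print(all([]))  # True


def generic(p: Callable, S: Collection) -> Optional[bool]:
    """True/False for a non-empty S, None (undefined) for the empty set."""
    if len(S) == 0:
        return None
    return all(p(s) for s in S)


def antigeneric(p: Callable, S: Collection) -> Optional[bool]:
    """True/False for a non-empty S, None (undefined) for the empty set."""
    if len(S) == 0:
        return None
    return all(not p(s) for s in S)


def ageneric(p: Callable, S: Collection) -> Optional[bool]:
    """Undefined whenever generic is undefined."""
    g = generic(p, S)
    return None if g is None else not g


def specific(p: Callable, S: Collection) -> Optional[bool]:
    """Undefined whenever generic or antigeneric is undefined."""
    g, a = generic(p, S), antigeneric(p, S)
    if g is None or a is None:
        return None
    return not g and not a


def is_even(s):
    return s % 2 == 0


print(generic(is_even, set()), specific(is_even, set()))         # None None
print(generic(is_even, {2, 4}), specific(is_even, {1, 2}))       # True True
```

Raising an exception for the empty set would be an equally reasonable design; returning `None` simply keeps the three-valued outcome visible to the caller.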

Another extreme case is when the set has a single element. Indeed, although one can say whether a property is generic or antigeneric depending on whether the single element has the property or not, specific requires 2 elements to be established. More precisely, Equation \eqref{eq:formalSpecificDetailed} requires at least 2 elements, because a single element cannot both have and not have the property, which means that if one of the two is missing, Equation \eqref{eq:formalSpecificDetailed} returns false. This is confirmed by the initial definition, provided by Equation \eqref{eq:formalSpecific}, which clearly states that a property is specific when it is neither generic nor antigeneric. Indeed, a set of one element admits only two cases: either the property is possessed by this element and it is a generic property, or it is not and it is antigeneric, which means that in any case the property cannot be specific.
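The single-element case can be checked directly with a small Python sketch. The function `specific` below follows the detailed form with two existential conditions; the sets and the property `is_even` are arbitrary assumptions for the example.

```python
def specific(p, S):
    """Some element of S has p and some element of S does not."""
    return any(p(s) for s in S) and any(not p(s) for s in S)


def is_even(s):
    return s % 2 == 0


# Single element having the property: generic, hence not specific.
print(specific(is_even, {2}))     # False
# Single element not having the property: antigeneric, hence not specific.
print(specific(is_even, {1}))     # False
# Two elements are the minimum for a property to be specific.
print(specific(is_even, {1, 2}))  # True
```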

Formalisation of Relative Genericness and Specificness

So far, we have focused on the absolute evaluation of genericness and specificness, which allows us to say whether or not a property is generic/specific, and we now need to formalise the relative perspective, where a property is more generic/specific than another. This leads to the need to describe degrees or levels of genericness and specificness, which we will call genericity and specificity to differentiate them from their absolute counterparts generic and specific. To express degrees, we need to establish an order between properties based on some measures but, although a generic property can be considered as having more genericity than a specific one, we cannot order two specific properties based solely on the previous definitions. To fix that, we will build on the previous observation that a property \(p\) specific in a set \(S\) can reduce the set to a subset \(S_p\) where this property is generic. Thus, we need to formally define this reduction process, and we do so as follows:

\[\begin{align} \label{eq:formalReduction} reduction(p, S) &= \{s \in S | p(s)\} \\ \end{align}\]

This reduction is nothing more than a filter function which uses the predicate \(p\) to filter out irrelevant elements from \(S\), which allows us to infer several things. First, if a property is generic to a set, then all the elements of the set have the property, such that no element is irrelevant, or more formally \(generic(p_1, S) \Rightarrow reduction(p_1, S) = S\). Conversely, if a property is antigeneric to a set, then no element of the set has the property, such that all elements are irrelevant, or more formally \(antigeneric(p_2, S) \Rightarrow reduction(p_2, S) = \emptyset\). In between, we can speak about specific properties, which are neither generic nor antigeneric, which means that their reductions are at the same time smaller than the set and non-empty, or more formally \(specific(p_3, S) \Rightarrow reduction(p_3, S) \in 2^S \setminus \{S, \emptyset\}\). From this, we can establish an order between the reductions of the properties, such that \(reduction(p_2, S) \subset reduction(p_3, S) \subset reduction(p_1, S)\). We can also see that this kind of order also applies between specific properties if the reduction of one specific property is included in the reduction of another specific property.
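The reduction and the resulting inclusion order can be sketched in a few lines of Python. The set `S` and the properties `p1` to `p4` are assumptions chosen so that `p1` is generic, `p2` antigeneric, and `p3` and `p4` specific, with the reduction of `p4` included in the one of `p3`.

```python
def reduction(p, S):
    """Filter S, keeping only the elements having property p."""
    return {s for s in S if p(s)}


S = set(range(1, 11))  # {1, ..., 10}


def p1(s):
    return s > 0        # generic in S


def p2(s):
    return s > 10       # antigeneric in S


def p3(s):
    return s % 2 == 0   # specific in S


def p4(s):
    return s % 4 == 0   # specific in S, implies p3


r1, r2, r3, r4 = (reduction(p, S) for p in (p1, p2, p3, p4))

print(r1 == S)       # True: a generic property keeps the whole set
print(r2 == set())   # True: an antigeneric property leaves nothing
print(sorted(r3))    # [2, 4, 6, 8, 10]
print(sorted(r4))    # [4, 8]

# On Python sets, '<' is the strict subset relation.
print(r2 < r4 < r3 < r1)  # True
```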

Although this subset ordering provides nice support to express degrees of genericness and specificness, it is still incomplete, because we cannot order two specific properties if their reductions are disjoint or only partially overlap. We may consider them as equal, but if we take a set of 10 elements and split it into two disjoint subsets of 9 elements and 1 element, it seems hard to argue that the subset of 9 elements has as much specificity as the subset of 1 element. Consequently, instead of considering an order based on the reductions themselves, we consider an order based on their cardinality \(|reduction(p, S)|\), which allows us to properly compare any reductions. In the following, when we speak about the cardinality of a property, we speak about the cardinality of the reduction of the set based on this property.

If we consider the three properties we used before, the generic property \(p_1\) has a cardinality of \(|S|\), the antigeneric \(p_2\) has a cardinality of 0, and the cardinality of a specific property \(p_3\) is strictly between 0 and \(|S|\). Thus, we obtain the same order as before, but based on which value is lower than the other rather than on which reduction is included in the other. If the reduction of a specific property is strictly included in the reduction of another specific property, the cardinality of the former is also lower than the cardinality of the latter, giving again the same order as before. But additionally, if two reductions are disjoint or overlapping, their cardinality is equal only if they have the same number of elements, while a bigger (resp. smaller) reduction gives a higher (resp. lower) cardinality to its property.
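The example of the set of 10 elements split into subsets of 9 and 1 elements can be sketched as follows. The names `p_big`, `p_small` and `cardinality` are assumptions made for the illustration.

```python
def reduction(p, s):
    """Keep only the elements of s which satisfy the predicate p."""
    return {x for x in s if p(x)}


def cardinality(p, s):
    """Cardinality of a property: size of the reduction of s based on p."""
    return len(reduction(p, s))


S = set(range(10))              # illustrative set of 10 elements
p_big = lambda x: x < 9         # reduces S to 9 elements
p_small = lambda x: x == 9      # reduces S to the single remaining element

r_big, r_small = reduction(p_big, S), reduction(p_small, S)

# The reductions are disjoint, so inclusion cannot order them...
comparable_by_inclusion = r_big <= r_small or r_small <= r_big
print(comparable_by_inclusion)

# ...while their cardinalities can always be compared.
print(cardinality(p_small, S) < cardinality(p_big, S))
```

Inclusion leaves the two properties unordered, whereas the cardinalities 1 and 9 order them as expected.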

With this cardinality computation, we obtain the order that we want, but we can go a step further to properly relate the formalisation of the relative perspective with the one of the absolute perspective. Indeed, if we consider the values we have computed so far, it is hard to tell which kind of property we have, except for antigeneric properties, which are the only ones having a cardinality of 0. If the cardinality of a property is 3, we need to know how many elements are in the set to know whether the property is generic (i.e. set of 3 elements) or not (i.e. more than 3 elements), and to understand to which extent it is generic (e.g. 3 in a set of 4 is close to generic, 3 in 300 is not). To simplify the interpretation, we can normalise the value based on the cardinality of the set, such that we obtain a proportion (a value in [0;1]), with 0 for antigeneric properties, 1 for generic ones, and values in between for specific properties, depending on how close they are to being generic. This normalisation also allows us to use the convenient representation of percentages by multiplying by 100: generic properties have a genericity of 100%, antigeneric properties have a genericity of 0%, and specific properties are in between. In summary, we define the genericity of a property in a given set as:

\[\begin{align} \label{eq:formalGenericity} genericity(p, S) &= \frac{|reduction(p, S)|}{|S|} \\ \end{align}\]
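A possible sketch of Equation \eqref{eq:formalGenericity} in Python is given below. Using `fractions.Fraction` to keep exact proportions is a choice of this illustration, and the sets and predicates are assumptions picked to reproduce the "3 in 4" and "3 in 300" cases mentioned above.

```python
from fractions import Fraction


def reduction(p, s):
    """Keep only the elements of s which satisfy the predicate p."""
    return {x for x in s if p(x)}


def genericity(p, s):
    """Proportion of elements of s which have the property p.

    Raises ZeroDivisionError for an empty set, where the value is undefined.
    """
    return Fraction(len(reduction(p, s)), len(s))


at_most_3 = lambda x: x <= 3                 # illustrative property

small_set = set(range(1, 5))                 # {1, 2, 3, 4}
large_set = set(range(1, 301))               # {1, ..., 300}

print(genericity(at_most_3, small_set))      # 3 elements out of 4
print(genericity(at_most_3, large_set))      # 3 elements out of 300
print(float(genericity(at_most_3, small_set)) * 100, "%")
```

The same cardinality of 3 gives a genericity of 3/4 in the first set and 1/100 in the second, which is the interpretation gap the normalisation is meant to close.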

The next figure shows how the genericity evolves based on the size of the reduction, showing especially the linear evolution from antigeneric properties at 0 to generic properties at 1. Obviously, the line is not continuous because we deal with sets of discrete elements, but it is an important aspect that we will exploit later, so it is a good thing to highlight it now. It is even more important to notice that a property is more generic than another only relative to a given set \(S\): comparing \(genericity(p_1, S_1)\) and \(genericity(p_2, S_2)\) does not allow us to tell which property is more generic than the other, because they are computed in different contexts. In particular, one could choose the right sets to have one property more generic than another, and another pair of sets to reverse their order.

Figure: Genericity function. Horizontal axis: \(|reduction(p, S)|\), from 0 to \(|S|\); vertical axis: \(genericity(p, S)\), from 0 to 1.
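The dependence on the context can be sketched with two properties evaluated on different pairs of sets. The predicates and the four sets below are assumptions chosen only to make the reversal visible.

```python
from fractions import Fraction


def genericity(p, s):
    """Proportion of elements of the non-empty set s which have the property p."""
    return Fraction(sum(1 for x in s if p(x)), len(s))


is_even = lambda x: x % 2 == 0     # illustrative property p_1
is_small = lambda x: x < 5         # illustrative property p_2

# First pair of contexts: p_1 looks more generic than p_2.
first = (genericity(is_even, {2, 4, 6}), genericity(is_small, {7, 8, 9}))

# Second pair of contexts: the order is reversed.
second = (genericity(is_even, {1, 3, 5}), genericity(is_small, {1, 2, 3}))

print(first[0] > first[1])
print(second[0] < second[1])
```

Since both comparisons hold, nothing can be concluded about the two properties themselves: only a comparison within the same set is meaningful.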

Now that we know how to compute a proportion of genericness through genericity, we may design a proportion of specificness through specificity. However, we saw in our preliminary analysis that generic and specific are not formally negating each other, in the sense that a generic property is negated by an ageneric property, while a specific property is not only ageneric but also not antigeneric. Similarly, it seems unreliable to define specificity simply as opposed to genericity because a specificity of 1 would mean that the property is antigeneric, which conflicts with our definition of specific given in Equation \eqref{eq:formalSpecific}. Moreover, generic properties correspond to a genericity of 100%, while ageneric properties correspond to any other value, thus it seems hard to motivate the usefulness of an agenericity measure, unless we count the elements out of the group, which is not our intent. Rather, we choose to define the antigenericity to measure this negation:

\[\begin{align} \label{eq:formalAntigenericity} antigenericity(p, S) &= 1 - genericity(p, S) \\ \nonumber &= 1 - \frac{|reduction(p, S)|}{|S|} &(\text{Eq. \eqref{eq:formalGenericity}})\\ \nonumber &= \frac{|S|}{|S|} - \frac{|reduction(p, S)|}{|S|} \\ \nonumber &= \frac{|S| - |reduction(p, S)|}{|S|} \\ \nonumber &= \frac{|S| - |\{s \in S | p(s)\}|}{|S|} &(\text{Eq. \eqref{eq:formalReduction}})\\ \nonumber &= \frac{|\{s \in S | \neg p(s)\}|}{|S|} &(\text{Eq. \eqref{eq:complementCardinal}})\\ \label{eq:formalAntigenericityDetailed} &= \frac{|reduction(\neg p, S)|}{|S|} &(\text{Eq. \eqref{eq:formalReduction}})\\ \end{align}\]
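The derivation above can be checked numerically on a small case. This is only a sketch: the set `S` and the property `p` are illustrative assumptions, and the check compares the two ends of the derivation, Equation \eqref{eq:formalAntigenericity} and Equation \eqref{eq:formalAntigenericityDetailed}.

```python
from fractions import Fraction


def reduction(p, s):
    """Keep only the elements of s which satisfy the predicate p."""
    return {x for x in s if p(x)}


def genericity(p, s):
    return Fraction(len(reduction(p, s)), len(s))


def antigenericity(p, s):
    """First line of the derivation: 1 - genericity(p, S)."""
    return 1 - genericity(p, s)


def antigenericity_detailed(p, s):
    """Last line of the derivation: |reduction(not p, S)| / |S|."""
    return Fraction(len(reduction(lambda x: not p(x), s)), len(s))


S = set(range(1, 11))              # illustrative set {1, ..., 10}
p = lambda x: x <= 3               # illustrative property, true for 3 elements

print(antigenericity(p, S), antigenericity_detailed(p, S))
```

Both expressions give 7/10 here, since 7 of the 10 elements do not have the property.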

Then, as illustrated in the next figure, an antigenericity of 0 means that the property is generic, an antigenericity of 1 (or 100%) means that the property is antigeneric, and a value in between means that the property is specific. We obtain again an order, like with genericity, but reversed such that \(antigenericity(p_2, S) > antigenericity(p_3, S) > antigenericity(p_1, S)\). Of course, like for genericity, this order also applies to specific properties depending on the size of their reductions, and is thus defined whether or not they overlap.

Figure: Antigenericity function. Vertical axis: \(antigenericity(p, S)\).

Finally, what about specificity? Should we consider that antigenericity is the formal definition we build on when saying that a property is more specific than another? If we focus on properties which are not antigeneric, the order is indeed consistent, because properties which apply to a smaller subset of elements are considered as more specific. However, this order fails by giving an antigeneric property a higher value than any specific property, while we defined a specific property as not antigeneric. To remain consistent, antigeneric properties should have a lower value than any specific property, in other words at most a value of 0 like generic properties, while only specific properties should have a higher value. Moreover, the highest value should be reached by the most specific properties, which we assume to be the properties possessed by a single element in the set. As summarised in the next figure, we need in fact a function which is similar to the antigenericity, except that it reaches 1 just before the reduction loses all its elements, and steps back to 0 when reaching the empty reduction. Such a function, however, clearly takes on different semantics than the previous ones: while (anti)genericity builds on reaching 1 only for properties which are (anti)generic, specificity builds on reaching 0 only for properties which are not specific.

Figure: Specificity function. Vertical axis: \(specificity(p, S)\), with a maximum of 1.

To design such a function, we first focus on the non-empty reductions, such that we obtain the proper line, which means that we want to build a linear function of the shape \(y = a \cdot x + b\), with \(y\) our specificity and \(x\) the size of the reduction. To do so, we consider 2 points: the case of a generic property, called \(p \rightarrow S\), which should return a specificity of 0, and a property reducing to 1 element, called \(p \rightarrow 1\), which should return a specificity of 1. Then, we need to identify the slope and intercept of this function based on these points, such that:

\[\begin{align} \nonumber specificity(p, S) &= slope(p, S) |reduction(p, S)| + intercept(p, S) \\ \end{align}\]

The slope is the ratio between the difference of specificities (\(\Delta_y\)) and the difference of reductions (\(\Delta_x\)):

\[\begin{align} \nonumber slope(p, S) &= \frac{specificity(p \rightarrow S, S) - specificity(p \rightarrow 1, S)}{|reduction(p \rightarrow S, S)| - |reduction(p \rightarrow 1, S)|} \\ \nonumber &= \frac{0 - 1}{|S| - 1} \\ \nonumber &= \frac{-1}{|S|-1} \\ \end{align}\]

The intercept is the specificity value (\(y\)) reached when the reduction is empty (\(x = 0\)), which can be computed based on one of our points and the slope we just calculated:

\[\begin{align} \nonumber intercept(p, S) &= specificity(p \rightarrow S, S) - slope(p, S) |reduction(p \rightarrow S, S)| \\ \nonumber &= 0 - \frac{-1}{|S|-1}|S| \\ \nonumber &= \frac{|S|}{|S|-1} \\ \end{align}\]
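The slope and intercept can be computed from the two reference points for a concrete set size, as a sanity check of the two derivations. The function name `line_through` and the set size of 5 are assumptions of this sketch, which only makes sense for sets of at least 2 elements.

```python
from fractions import Fraction


def line_through(set_size):
    """Slope and intercept of the specificity line for a set of set_size >= 2 elements."""
    # Reference points, written as (size of the reduction, specificity).
    generic_point = (set_size, 0)   # p -> S: generic property
    single_point = (1, 1)           # p -> 1: property kept by a single element
    slope = Fraction(generic_point[1] - single_point[1],
                     generic_point[0] - single_point[0])
    intercept = generic_point[1] - slope * generic_point[0]
    return slope, intercept


slope, intercept = line_through(5)
print(slope, intercept)             # -1/4 5/4

# The line goes back through both reference points.
print(slope * 5 + intercept, slope * 1 + intercept)
```

For a set of 5 elements, this gives a slope of -1/4 and an intercept of 5/4, in agreement with \(\frac{-1}{|S|-1}\) and \(\frac{|S|}{|S|-1}\).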

By putting everything together, we obtain the expression of the specificity for the properties which are not antigeneric:

\[\begin{align} \nonumber specificity(p, S) &= \frac{-1}{|S|-1} |reduction(p, S)| + \frac{|S|}{|S|-1} \\ \nonumber &= \frac{|S|-|reduction(p, S)|}{|S|-1} \\ \nonumber &= \frac{|S|-|\{s \in S | p(s)\}|}{|S|-1} &(\text{Eq. \eqref{eq:formalReduction}})\\ \nonumber &= \frac{|\{s \in S | \neg p(s)\}|}{|S|-1} &(\text{Eq. \eqref{eq:complementCardinal}})\\ \label{eq:formalSpecificityPartial} &= \frac{|reduction(\neg p, S)|}{|S|-1} &(\text{Eq. \eqref{eq:formalReduction}})\\ \end{align}\]

This expression is almost equal to the antigenericity expression because it builds on the same shape, except that the normalisation factor is based on \(|S|-1\) elements instead of \(|S|\). Now, this expression does not fit antigeneric properties, because the value reached when the reduction is empty is \(\frac{|S|}{|S|-1}\) (the value of the intercept) instead of 0. To replace this value, we may split the definition of the function, such that different ranges have different definitions, but because we deal with discrete values, we can actually keep everything together while applying an operation which changes the value only when the reduction is empty. This operation is the modulo, which is usually defined as an application of a divisor (\(d\)) to a dividend (\(x\)) to obtain a remainder (\(x ~ mod ~ d\)), and which can be used as a periodic reset function. Indeed, instead of using directly the cardinality of the reduction of \(\neg p\), we can use its modulo such that, when this reduction reaches a size of \(|S|\), which is reached only when the property \(p\) is antigeneric (its own reduction being empty), it resets to 0. In other words, we slightly refine the previous specificity function in the following way:

\[\begin{align} \label{eq:formalSpecificity} specificity(p, S) &= \frac{|reduction(\neg p, S)| ~ mod ~ |S|}{|S|-1} \\ \end{align}\]
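Equation \eqref{eq:formalSpecificity} can be sketched as follows, assuming a set of at least 2 elements. The helper `keeps` and the set of 5 elements are illustrative assumptions used to walk through every possible reduction size.

```python
from fractions import Fraction


def specificity(p, s):
    """Specificity of p in s, with the modulo resetting the antigeneric case to 0."""
    excluded = sum(1 for x in s if not p(x))      # |reduction(not p, S)|
    return Fraction(excluded % len(s), len(s) - 1)


def keeps(k):
    """Hypothetical helper: property kept by the k smallest elements of S."""
    return lambda x: x <= k


S = {1, 2, 3, 4, 5}                               # illustrative set

for k in range(len(S) + 1):
    print(k, specificity(keeps(k), S))
```

The values go from 0 for the empty reduction (antigeneric), jump to 1 for a single element, then decrease linearly through 3/4, 1/2 and 1/4 down to 0 for the whole set (generic).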

In the case where the line would have been continuous, the modulo would have moved only the very first point of the curve, which means that we would have values with a specificity higher than 1 for reductions having a size between 0 and 1. But because we work with discrete values, such that no size exists between 0 and 1, using the modulo is enough to obtain the function shown in Fig. 6.

Once again, we can analyse how our expressions are impacted by the set \(S\) in extreme cases. When the set is empty, the genericity computed with Equation \eqref{eq:formalGenericity} is undefined because of the denominator, while antigenericity is undefined by definition with Equation \eqref{eq:formalAntigenericity} and because of the denominator in Equation \eqref{eq:formalAntigenericityDetailed}. Specificity is also undefined with Equation \eqref{eq:formalSpecificity}, because the modulo is undefined for a null divisor: computing a modulo implies dividing by the divisor, thus it is again undefined because of a null denominator. All these undefined values comply with the requirements established in section 3.3. The other extreme case is when the set has only one element, such that the property either applies, leading to a genericity of 1 and an antigenericity of 0, or not, leading to a genericity of 0 and an antigenericity of 1. For the specificity computation, Equation \eqref{eq:formalSpecificity} gives us an undefined value because of the denominator \(|S|-1 = 0\), which is in disagreement with the absolute case, which states that a property is necessarily not specific if the set has only one element. This is a limitation which may lead to a revision of our equation of specificity, but so far this is the best expression we could establish (besides splitting the definition in two).
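These extreme cases can be observed with a direct transcription of the three equations. In this sketch, an undefined value is assumed to show up as Python's `ZeroDivisionError`, and the helper `is_undefined` as well as the predicates `always` and `never` are illustrative names.

```python
from fractions import Fraction


def genericity(p, s):
    return Fraction(sum(1 for x in s if p(x)), len(s))


def antigenericity(p, s):
    return 1 - genericity(p, s)


def specificity(p, s):
    excluded = sum(1 for x in s if not p(x))      # |reduction(not p, S)|
    return Fraction(excluded % len(s), len(s) - 1)


def is_undefined(measure, p, s):
    """Hypothetical helper: True if the measure divides by zero for (p, s)."""
    try:
        measure(p, s)
        return False
    except ZeroDivisionError:
        return True


always = lambda x: True
never = lambda x: False

# Empty set: the three measures are undefined.
print([is_undefined(m, always, set())
       for m in (genericity, antigenericity, specificity)])

# Singleton: (anti)genericity is defined, specificity is not.
print(genericity(always, {1}), antigenericity(always, {1}))
print(is_undefined(specificity, always, {1}), is_undefined(specificity, never, {1}))
```

On the singleton, the specificity fails for both predicates, which is the disagreement with the absolute case discussed above.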

Answer

We asked several questions, thus we provide an answer for each of them:

Bibliography