Abstract. This paper discusses the differences between two major schools in the philosophy of criminal law, retributivism and consequentialism, with regard to the risk of (unintentionally) punishing the innocent. It argues that the main point of departure between these two camps in this respect lies in their attitude towards the high evidentiary threshold in a criminal trial: while retributivism seems to strongly support setting this standard high, consequentialists may find it desirable to relax it in some cases. This discussion is set in the context of proxy criminalization, i.e. a situation in which some suspicious behaviour (i.e. behaviour that is merely correlated with wrongful conduct, while not being substantially wrongful in itself) is criminalized. Since proxy criminalization may be understood as an effective lowering of the evidentiary threshold, its employment is justifiable from the consequentialist perspective, while being highly problematic for retributivists.

1. Introduction ^[1]

According to the standard retributivist approach to criminal law, punishment should be imposed on the guilty, and only on the guilty, according to what they deserve. From that, it follows that no innocent individual should be punished, nor should any guilty individual be punished more than she deserves. On the other hand, consequentialism traditionally argues that criminal law should distribute punishment in a way that maximizes some desirable consequences, such as social welfare.

As has been argued many times by proponents of retributivism, a major point of departure between these two approaches, which dominate the philosophy of criminal law, lies in their attitude towards the possibility of knowingly punishing the innocent. While retributivism unconditionally prohibits knowingly imposing punishment on an innocent individual, consequentialism is supposed to lack such a safeguard, which puts it at odds with common moral intuitions. A typical retributivist argument of this kind involves an invented case in which a law enforcer faces a choice between framing an innocent individual and risking a lynching in which many people are likely to be killed. ^[2] A utilitarian law enforcer, retributivists claim, would choose to frame the innocent individual whenever the harm caused by the wrongful conviction is smaller than the expected detrimental consequences of the riot. If true, this argument would be most damaging to the moral legitimacy of consequentialism in criminal law: a movement that was historically born as a protest against the abuse of discretion in the infliction of punishment would turn out to allow one to intentionally punish the innocent, an outcome morally unacceptable to most consequentialists themselves. ^[3]

In reply, consequentialists use one of two standard counterarguments. First, a law enforcer who chose to frame the innocent individual in such a case would be, at best, a short-sighted consequentialist. A more thorough consequentialist analysis would easily show that, in the long run, the bad consequences of allowing the punishment of innocents (such as the possibility of abuse of power, or a decrease in the legitimacy of the legal system, resulting in a reduced willingness to comply with the law and cooperate with law enforcers) would be larger than any possible short-term benefits. ^[4] Secondly, and more fundamentally, consequentialists argue that cases of this kind are excessively stylised and fanciful, and are not likely ever to happen in the actual world. ^[5] Thus, even if the theoretical possibility of consequentialism recommending the conviction of the innocent is of some interest for academic moral philosophy, it lacks any relevance in discussions of real-world criminal law policy.

However, even if we agree that consequentialism is highly unlikely to recommend that one knowingly convict the innocent, this does not end the discussion. It may still be argued that retributivism and consequentialism give different recommendations when it comes to managing the risk of unintentional wrongful convictions. Due to the inevitable errors in the operations of the police and the judicial system, this risk can hardly be eliminated. ^[6] In modern legal systems, this risk is mitigated by the introduction of a high evidentiary threshold in criminal trials. Typically, the criminal law requires that, in order to convict the defendant, the court has to be almost certain of his guilt (or, as it is phrased in Anglo-American law, the guilt has to be proved ‘beyond a reasonable doubt’). Thus, modern law usually strikes the trade-off between wrongful convictions and wrongful acquittals by putting much more weight on minimizing the former, ^[7] i.e. setting a high evidentiary threshold for conviction. ^[8]

As has recently been argued in the literature (and as will be presented in this paper), retributivism, because of its strong aversion towards punishing the innocent, may provide unequivocal support for a uniformly high evidentiary threshold in criminal law. In contrast, consequentialism can support some relaxation of this high standard of proof in cases where it may be expected to generate overall better consequences. ^[9]

This difference between retributivism and consequentialism with regard to the desirable evidentiary threshold has some practically important consequences. The one to be examined in detail in this paper is the assessment of proxy criminalization, i.e. the criminalization of conduct that is assumed to be suspicious (that is, to be in a significant correlation with some other wrongful behaviour) even if it is not particularly wrongful in itself. As it will be pointed out in this paper, proxy criminalization is effectively a tool for lowering the evidentiary threshold and thus it can be, in principle, justified within the consequentialist framework while remaining possibly inconsistent with retributivist views. Since proxy criminalization is quite persistent in contemporary legal systems, such a conclusion would imply that consequentialism may be more descriptively accurate with regard to the actual operation of modern systems of criminal justice.

Proxy criminalization, despite its seemingly tremendous practical importance, has not yet received much attention in the philosophical literature. This paper is the first, to my knowledge, to aim at assessing the moral status of proxy crimes from both retributivist and consequentialist perspectives. As will be argued, proxy crimes are inconsistent with what appears to be the most plausible interpretation of retributivism, which may explain some controversies (including those in case law) surrounding actual instances of proxy crimes. I will suggest that this outcome may result from the fact that many retributivists focus on minimizing the ‘particular’ risk of punishing the innocent while ignoring the ‘global’ risk. Even though in this paper I do not side with either of the two analysed schools of thought, I would find such an implication of retributivism to be a bit cumbersome.

2. Retributivism, consequentialism, and the evidentiary threshold in a criminal trial

Whether retributivism provides unequivocal support for the high evidentiary threshold is, however, a somewhat more difficult question than suggested in the previous section. Let us once again note that the retributivist philosophy of punishment is generally based on the statement that the punishment should be imposed on the guilty (and only on the guilty) according to what they deserve. ^[10] The notion of desert (‘what they deserve’) is notoriously hard to define and its interpretation varies largely across retributivist theories. However, many retributivists would agree that the punishment should be proportional to the offender’s ‘blameworthiness’, which in turn is determined by the ‘wrongfulness’ of the act and the degree of ‘blame’ attributable to the offender.

Basic retributivism as presented here gives two recommendations: the state should (1) refrain from punishing innocents and (2) punish the guilty as they deserve. ^[11] Setting a high evidentiary threshold facilitates the achievement of goal (1), at the cost, however, of goal (2). Therefore, basic retributivism is not able to provide us with the desirable ratio of false convictions to false acquittals, at least as long as we have not specified a desirable trade-off between goals (1) and (2).

Since basic retributivism so-understood seems to be unable to give any justification for the high evidentiary threshold, we need to consider more nuanced retributivist theories. The most promising option is negative retributivism, which claims only that the state must neither punish innocents nor punish the guilty more than they deserve; but it does not have any moral obligation to punish all offenders. ^[12] Negative retributivism is shared mostly by philosophers who believe that, in principle, the punishment should be distributed in a way that facilitates the achievement of some consequentialist aims (first of all, limiting the level of crime) but who, at the same time, are afraid that pure consequentialism may lead to morally unacceptable results. ^[13] Thus, negative retributivism is supposed to work as a side-constraint: the punishment is to be applied in accordance with some consequentialist criteria as long as these criteria do not lead us to punishing innocents or punishing the guilty too harshly.

Since it seems simply to put more weight on avoiding wrongful convictions than wrongful acquittals, negative retributivism appears to provide a better argument in favour of a high evidentiary threshold. However, before we agree with that, we have to deal with yet another problem: is negative retributivism, understood as prohibiting the application of punishment to innocents under any circumstances, tenable? Ultimately, any criminal law system (bar the hardly conceivable one in which the standard of proof requires absolute certainty) unavoidably leads to some wrongful convictions, as long as it is not possible to eliminate factual mistakes entirely. ^[14] Thus, in this respect there is nothing qualitatively specific about a legal system with an evidentiary threshold relaxed below the ‘beyond a reasonable doubt’ (BARD) standard: it would fail to satisfy the criterion of negative retributivism exactly as any feasible criminal law system does.

However, this is not that much of a problem if we notice that negative retributivists seem to understand retributivist side-constraints to be of agent-relative nature. ^[15] Thus, what constitutes a fundamental moral wrong is not the fact that the criminal justice system in general produces some wrongful convictions but rather a particular setting in which the court convicts the defendant despite some non-negligible uncertainty with regard to her guilt. The negative retributivist should not worry much that the operation of criminal justice in general may cause some harm to innocents, just as he does not worry much that virtually any human activity or social institution may cause some harm to third parties. But, on the other hand, being punished means being condemned for committing a wrong, negative retributivists say, thus there is an agent-relative norm prohibiting imposing punishment without having knowledge that the defendant is guilty. ^[16]

It is important not to misread retributivist accounts as simply putting more weight on avoiding false convictions. In contrast to the consequentialists, who basically see the evidentiary threshold as a tool for striking a socially-desirable trade-off between Type 1/Type 2 errors, ^[17] retributivists tend to treat the BARD standard as a ‘directly morally-grounded principle’, ^[18] having an intrinsic moral value independent of any instrumental considerations. According to Patrick Tomlin, the high evidentiary threshold may be directly grounded by affirming two moral principles: the objective one (“punishment is only appropriately directed toward those who have performed punishment-worthy wrongs and have yet to receive the appropriate punishment”) and the subjective one (“punishment should only be directed toward those who we are sure beyond any reasonable doubt fall into the category described in [the objective principle]”). While the first principle seems to be acceptable for anybody who is not a thorough consequentialist, acknowledging the second, subjective, principle (and, subsequently, treating the BARD standard as a directly morally grounded principle) depends on the way we decide to deal with our epistemic limitations. ^[19] As should be visible from what we have discussed above, negative retributivists have no problems with acknowledging the subjective principle, because of their strong aversion to striking any explicit trade-off between the minimization of the risk of punishing the particular innocent person and other values (which Tomlin calls the ‘overriding approach’).

Tomlin claims that there is another path to acknowledging the subjective principle: any person that believes that the avoidance of wrongful convictions vastly outweighs the avoidance of wrongful acquittals should embrace this principle (the ‘outweighing approach’). This means that at least some consequentialists could embrace the subjective principle as well. This does not seem correct to me. As we will see shortly, there exists a possibility, at least theoretically, that relaxing the evidentiary standard may lead to a decrease in the global number of wrongful convictions. This possibility has been neglected in the literature on the philosophy of criminal law. However, as long as it is not dealt with, it remains questionable whether the ‘outweighing approach’ may lead to the direct moral grounding of the BARD standard. Thus, it appears to me that only the retributivists can treat the BARD standard as directly morally grounded without any qualifications.

When it comes to proponents of consequentialism, ^[20] we can see that until recently many of them would find some arguments in favour of the high evidentiary threshold in a criminal trial. Such consequentialists argued that the ratio of wrongful convictions to wrongful acquittals should be set in a way that minimizes the total social costs of legal errors. ^[21] Since, as it has been argued for many centuries, the social cost of wrongful conviction is, on average, significantly larger than the cost of wrongful acquittal (because a false conviction generates a huge deadweight loss due to loss of freedom and social stigma), society should be more interested in avoiding wrongful convictions than in preventing actual criminals from getting off the hook. Thus, the high evidentiary threshold in criminal law was supposed to be consistent with this ‘error-cost minimization’ framework.

The error-cost minimization framework has recently come under fire. The main objection to it is the fact that, focusing on the ex post costs of adjudication, it totally neglects the effects that the evidentiary threshold has ex ante on the behaviour of both potential criminals and innocents. The higher the standard of proof is set, the lower, ceteris paribus, is the likelihood of conviction (and, subsequently, the expected punishment), so the lower is the deterrence effect, and vice versa. However, the deterrence effect means here both deterring socially wasteful activities and chilling benign behaviour. Therefore, the social-welfare maximizing evidentiary threshold is supposed to maximize the difference between the benefit gained due to deterring wrongful activity and the cost of chilling benign activity.

Louis Kaplow, who developed the most elaborate welfarist theory of burden of proof to date, ^[22] uses exactly this finding, claiming that the optimal threshold occurs when the marginal benefit of deterring wrongful behaviour equals the marginal cost of chilling benign activity (i.e., respectively, both sides of the following equation):

The first variable on both sides of the equation, i.e. the expected sanction, is determined by three factors: the sanction, the probability of being charged, and the probability of being convicted conditional on being charged. ^[23] The last of these factors is determined by the evidentiary threshold, so we can see that marginal changes in the evidentiary threshold enter the equation via the first variable on both sides. ^[24] The second variable on both sides depicts how many acts are concentrated at the margin, so that multiplying first two variables gives us the number of acts deterred by a marginal change in the evidentiary threshold. Finally, the last variables, expressing the gain per deterred act, differ somewhat. For wrongful acts, it is calculated as the difference between the social cost of an act and whatever private benefit a criminal enjoyed because of it. For benign activity, we assume that they do not generate any external cost (nor, for simplicity, any external benefit), so we are interested only in the private gain of an innocent.
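Read together, the three factors described above suggest the following schematic form of the optimality condition. This is only a sketch assembled from the verbal description in this section: all symbols are assumptions introduced here for illustration and should not be taken as Kaplow's own notation.

```latex
% Schematic optimality condition (illustrative notation, not Kaplow's own).
% Subscript H: harmful (wrongful) acts; subscript B: benign acts.
\[
\underbrace{\Delta e_{H}}_{\text{change in expected sanction}}
\times
\underbrace{f_{H}}_{\text{harmful acts at the margin}}
\times
\underbrace{(h - b_{H})}_{\text{social cost minus private benefit}}
\;=\;
\Delta e_{B} \times f_{B} \times b_{B}
\]
% Expected sanction: sanction times probability of being charged
% times probability of conviction conditional on being charged.
\[
e = s \cdot p_{\text{charge}} \cdot p_{\text{conv} \mid \text{charge}}
\]
```

On this reading, the left-hand side is the marginal benefit of deterring wrongful acts and the right-hand side is the marginal cost of chilling benign acts. The evidentiary threshold enters only through the conditional probability of conviction, and hence through the change in the expected sanction on each side.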

Kaplow’s analysis, as presented so far, is applicable to any model of adjudication aimed at deterring wrongful behaviour. However, as we remember from the earlier discussion, criminal law is somewhat unique because criminal sanctions are socially costly, so the standard of proof in a criminal trial is supposed to be higher than in other areas of law (because it is socially preferable to put a greater weight on preventing wrongful convictions). So, would the inclusion of the social costs of wrongful convictions substantially change Kaplow’s model? Perhaps surprisingly, the answer is: not much. A quite shocking consequence of the model is the observation that increasing the standard of proof does not necessarily lead to a reduction in the number of wrongful convictions (actually, it may increase this number). To see this, note that, according to the model, increasing the standard of proof leads to an increase in the number of benign acts, because fewer of them are chilled. But that means that more innocents will be brought before the court. Increasing the standard implies that a smaller proportion of them will be convicted but, if their number increases enough, it is possible that, in absolute numbers, more of them will be punished. Therefore, without knowing parameters such as the concentration of marginal benign acts, it is impossible to predict whether a high evidentiary threshold will succeed in reducing the number of wrongful convictions.
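The possibility that a higher standard of proof produces more wrongful convictions in absolute terms can be illustrated with simple arithmetic. The sketch below is not part of Kaplow's model: the function name and all figures are hypothetical, chosen only to show that a smaller conviction rate applied to a larger pool of benign acts can yield a larger total.

```python
def expected_wrongful_convictions(benign_acts, p_charged, p_convicted_if_charged):
    """Expected number of innocents convicted: acts x P(charged) x P(convicted | charged)."""
    return benign_acts * p_charged * p_convicted_if_charged


# Hypothetical figures, for illustration only.
# Lower threshold: many benign acts are chilled, but charged innocents
# are convicted relatively often.
low_threshold = expected_wrongful_convictions(
    benign_acts=1000, p_charged=0.5, p_convicted_if_charged=0.10
)

# Higher threshold: fewer benign acts are chilled (so more are performed),
# and a smaller share of charged innocents is convicted.
high_threshold = expected_wrongful_convictions(
    benign_acts=3000, p_charged=0.5, p_convicted_if_charged=0.04
)

print(low_threshold, high_threshold)
```

With these assumed numbers, the conviction rate for charged innocents falls from 10% to 4%, yet the expected number of wrongful convictions rises from about 50 to about 60, because the number of benign acts triples. Whether anything like this happens for a real category of crime depends on empirical parameters that the example simply stipulates.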

The basic lesson from Kaplow’s model is that the socially optimal evidentiary threshold depends on some empirical parameters which are likely to be significantly different for different categories of crimes. Thus, it is unreasonable to believe that the uniformly high evidentiary threshold in criminal law maximizes social welfare. In the welfarist framework, the threshold should be more diversified and, at least for some categories of crimes, it may be desirable to lower it below the current BARD standard. Another lesson is that, at least in theory, some relaxation of the evidentiary standard may decrease the number of wrongful convictions.

3. Proxy crimes

The analysis presented in this paper thus far assumes that the high evidentiary threshold (the BARD standard) is actually in place in the majority of modern criminal law systems. This assumption seems to be prima facie true, since in virtually all modern jurisdictions the high evidentiary threshold is explicitly stated in procedural criminal law. However, the real picture may be somewhat more complicated. As it has been recently suggested, ^[25] the actual evidential threshold may be effectively lowered (in comparison with the one specified in procedural law) if some institutions are introduced into substantive criminal law. An example of such an institution to be analysed in detail in this paper is a proxy crime.

Although the notion of a proxy crime is quite recent in criminal law literature, the basic idea behind it can be traced as far back as the legal writings of Jeremy Bentham. ^[26]

The fourth class [of accessory offences] is composed of presumed offences [i.e. proxy crimes in our terminology], that is, of acts which are considered as proofs of an offence. They may be called evidentiary offences; acts injurious or otherwise in themselves, but furnishing a presumption of an offence committed. ^[27]

This definition is almost identical to the one provided almost 200 years later by Richard H. McAdams: proxy crimes prohibit behaviour that, “while not inherently risking harm, stands in for behaviour that does risk harm.” ^[28]

McAdams’ definition highlights two main characteristics of proxy crimes as they are understood in this paper. First, while criminalization is usually justified by the fact that criminalized behaviour causes harm or generates an unacceptable level of risk of harm, ^[29] this is not the case with proxy crimes, simply because they do not generate any substantial risk of harm. ^[30] Second, proxy crimes are supposed to be significantly correlated with some (supposedly prohibited) harmful behaviour. ^[31] Therefore, from the fact that an individual exhibits the suspicious behaviour described by a proxy crime, we can infer, with some substantial likelihood, that she also commits the underlying crime.

The third characteristic of proxy crimes, not explicitly stated in the definitions provided above but commonly assumed in the literature, is the justification for their introduction: evidentiary problems. We encounter proxy crimes in situations in which the statute prohibiting an underlying crime contains elements that may be hard to prove in some circumstances. A corresponding proxy crime lacks these ‘hard-to-prove’ elements, which often makes it possible to convict the defendant of a proxy crime in situations in which committing the underlying crime is hard to prove beyond a reasonable doubt. ^[32]

Table 1. Proxy crimes of type AB
Proxy crime	Underlying crime	Omitted element
Illegal gratuities	Bribery	Quid pro quo
Statutory rape	Rape	Lack of consent
Carrying a weapon inside an airport	Attempted terrorist attack	Intent to carry out a terrorist attack

Following William J. Stuntz, ^[33] we can think about two general ways in which a proxy crime may be designed. Let us start with an underlying crime whose description contains elements ABC, out of which at least one (C) may be hard to prove. Proxy crimes of the first type (AB) are created just by removing the troubling element from the description of the underlying crime. The criminalization of accepting gifts by public officials is a good example of this type. The underlying crime (bribery) usually requires two elements to be proven: (A) a public official accepts some items of value, (B) this act is intended to influence the actions of the official. However, even in situations in which element A is easily observable, it may be prohibitively hard to prove element B (quid pro quo). Therefore, in many jurisdictions the standard prohibition of bribery is supplemented with a proxy crime prohibiting public officials from accepting any gifts, irrespective of the presence or lack of the quid pro quo element. In other words, it is assumed that the mere acceptance of a gift is already suspicious enough (correlating with bribery to a level which is high enough) that it should be punishable even when it is impossible to prove the quid pro quo element (however, as with other proxy crimes, acceptance of gifts tends to be punished more leniently than proper bribery).

Table 2. Proxy crimes of type DEF
Proxy crime	Underlying crime
Possession of drugs over the specified quantity	Drug trafficking
Driving with an open container of alcohol inside the car	Driving under the influence
Bulk cash smuggling	Money laundering

The other type of proxy crime results when the legislature creates a completely new crime DEF, which denotes behaviour that is assumed to correlate with ABC, while being typically easier to prove. Harsh penalties for the mere possession of illegal drugs in quantities exceeding the statutory threshold can serve as an example of this type. Possession of drugs is not in any necessary way an indicator of participation in drug dealing schemes; nevertheless, in many jurisdictions it is assumed that the possession of drugs over some threshold does correlate with drug dealing to a significant degree. Thus, an individual caught with over 28 grams of cocaine should be punished harshly, even if there is nothing else to indicate that she is a drug dealer rather than just a person possessing drugs for personal use. ^[34]

As we saw, Jeremy Bentham in his legal writings used the label ‘presumed’ or ‘evidentiary offence’ to denote what we call here a proxy crime. The wording chosen by the English jurist is quite fortunate because it points out the role the proxy crimes played in his theory: they were nothing more than statutory presumptions, aiming at correcting possible shortcomings of the evidentiary inferences taking place at a criminal trial. ^[35]

The English legislature fearing that juries, too prone to lenity, would not see in these presumptions a certain proof of guilt, has thought fit to erect the Act which furnishes the presumption into a second offence, an offence distinct from every other. In those countries in which a perfect confidence is placed in the tribunals, these Acts may be arranged under their proper head, and be considered merely as presumptions, from which the court is to draw such inferences as the circumstances warrant. ^[36]

So, according to Bentham, proxy crimes tend to be introduced when (1) there exists some pattern of suspicious behaviour that indicates with a very large degree of likelihood that a given individual has committed a given underlying crime but (2) there is uncertainty whether the court would see this pattern of behaviour as sufficient for meeting the BARD standard of proof, or, in other words, as evidence sufficient for conviction. Therefore, to avoid problems with meeting the standard, (3) the lawmaker decides to criminalize the suspicious pattern of behaviour itself, so that an individual exhibiting the suspicious behaviour may be convicted of the proxy crime even if otherwise she would not be convicted of the underlying crime because of insufficient evidence.

This is enough to see that the main objective and actual result of proxy criminalization is to circumvent the high evidentiary threshold necessary for convicting the defendant of the underlying crime: ^[37] if the defendant cannot be convicted of the underlying crime because of insufficient evidence but has exhibited the requisite suspicious behaviour, he can still be ‘proxy-convicted’.

Therefore, from the perspective of a lawmaker willing to lower the standard of proof for a given crime, the introduction of a proxy crime is a substitute for an explicit or implicit change of procedural rules of evidence. To use an example: let us assume that the legislator wants to introduce a presumption that anybody possessing more than 28 grams of cocaine is involved in drug trafficking (even if there is no other evidence indicating that the person in question is involved in selling, distribution, or generally trading in cocaine). In this case there are two general ways to achieve this goal. First, it is possible to alter the rules of evidence so that the mere possession of large quantities of narcotics is sufficient for convicting a person of drug trafficking. Alternatively, the legislator may introduce a new proxy crime, so that possessing more than 28 grams of cocaine is a crime on its own.

4. Desirability of proxy criminalization

As mentioned before, consequentialists may find a relaxation of the evidentiary threshold desirable in some situations; thus, proxy criminalization (as a way of lowering the evidentiary threshold) may in principle be justified from the consequentialist perspective. ^[38] However, one may still wonder why the evidentiary threshold should be lowered by crafting new proxy crimes instead of explicitly lowering the threshold stated in procedural law, which seems to be a more natural way to do so. Some arguments showing why proxy criminalization may be a preferable solution have been proposed in the literature.

The first argument stems from the observation that the correlation between suspicious behaviour and the respective underlying crime is hardly ever perfect, so that there are usually some individuals who exhibit the suspicious behaviour for a legitimate reason, without any intent to commit the underlying crime. Society seems to have an interest in incentivizing such ‘innocents’ to refrain from the suspicious behaviour (first, because of the potential deadweight loss resulting from punishing them as if they had committed the underlying crime and, secondly, because by exhibiting the suspicious behaviour innocents may trigger unnecessary, socially wasteful actions on the part of law enforcement). Because innocents are obviously more likely to refrain from the suspicious behaviour if they are well informed about the possibility of being convicted, it seems likely that proxy criminalization tends to be superior precisely because it facilitates the acquisition of information by innocents.

To show this, let us once again use the cocaine example, in which the lawmaker wants to effectively prosecute drug trafficking. Imagine two extreme legal regulations: in the first, the mere possession of more than 28 grams of cocaine is proxy-criminalized; in the other, case law allows the court to infer drug trafficking from the possession of a ‘large quantity of illegal drugs’. Further, assume that there are some ‘innocent’ (not involved in drug trafficking) consumers of drugs who tend to store large quantities of cocaine for personal use. These innocents seem to be more likely to signal their innocence (by refraining from the suspicious behaviour, i.e. by limiting the quantity of cocaine in their possession) in the first situation. This seems to be the case because statutory substantive law is more easily available than case law and tends to use less ambiguous distinctions than standard-like judge-made law, so it can be more easily absorbed by laypeople. ^[39] This is an overlooked advantage of proxy crimes: when the underlying crime’s definition is vague and standard-like, supplementing it with a rule-like proxy crime may actually increase legal certainty and provide innocents with better information on how they can avoid the risk of getting involved in criminalized conduct.

The second argument refers to the need to calibrate sanctions in accordance with the strength of the evidence. ^[40] Since the correlation between suspicious behaviour and the underlying crime is usually not perfect, lowering the standard of proof may lead to chilling the benign behaviour of innocents. To mitigate the chilling risk, punishment for suspicious behaviour should be lower than that applied to individuals whose guilt for committing the underlying crime has been proven beyond a reasonable doubt. Assuming the legislator is able to assess the strength of the correlation and the value lost because of the chilling effect, he can set the punishment for committing a proxy crime appropriately lower than the sanction for the underlying crime. Since sanctions for proxy crimes actually tend to be lower than sanctions for the underlying crime, we can assume that this goal is to some extent achieved in reality.

Yet another argument points at the fact that proxy criminalization is socially preferable to the explicit relaxation of evidentiary rules because it fits common moral beliefs and the expressive function of criminal law better than an explicit relaxation of procedural rules. ^[41] The explicit incorporation of evidential uncertainty (like in a hypothetical ruling saying that “the defendant was found 70%-guilty”) would undermine the belief that criminal law is an ultimate tool to be used only when there is knowledge of the guilt of a defendant. Luckily, under proxy criminalization we are able to prove the guilt of an individual exhibiting a suspicious behaviour beyond a reasonable doubt (though it is the guilt of committing the proxy, not the underlying, crime), so that the useful fiction of criminal law as a realm of certainty can be retained.

Finally, it may be argued that proxy criminalization is preferable in cases in which proving the underlying crime would require making use of circumstantial, as opposed to direct, evidence. ^[42] There is literature documenting the propensity of legal decision-makers (both jurors and professional judges) to act under the influence of unreliable heuristics and cognitive biases when dealing with statistical evidence. ^[43] However, recent experimental studies have discovered an even more fundamental and troublesome phenomenon: anti-inference bias, i.e. aversion towards basing liability on inferences from any circumstantial evidence. ^[44] Thus, in cases where a proxy crime may be described in a clear-cut way, its introduction may allow legal decision-makers to determine their verdicts based on direct evidence and to avoid possible mistakes stemming from suboptimal assessment of circumstantial, especially statistical, evidence.

The attitude of retributivists towards proxy criminalization turns out to be somewhat more complex. Retributivism, as understood in this paper, would definitely object to proxy criminalization if it resulted in punishing the innocent. However, it is far from uncontroversial whom we may consider innocent.

On the one hand, it might seem that even a retributivist who sees the BARD standard as a directly morally grounded principle does not need to have any specific objections against proxy crimes. As long as the defendant is only convicted of a proxy crime if it has been proved beyond a reasonable doubt that he has committed an act fitting the statutory definition of a given proxy crime, there seems to be no interference with the BARD standard. Indeed, we would expect such a reaction from criminal law scholars advocating the procedural reading of the presumption of innocence principle and the BARD standard. ^[45] In their opinion, the BARD standard belongs only to the realm of rules governing the criminal trial and it cannot be used to assess any actions taking place outside the courtroom. In particular, it does not set any limit on the scope of legitimate criminalization. Even if the state effectively makes it easier to convict individuals by relaxing the statutory definitions of offences, it does not, by itself, interfere with the BARD standard. The defendant whose guilt of committing a proxy crime has been proven beyond a reasonable doubt cannot be considered ‘innocent’ in any relevant meaning of this word, thus proxy criminalization does not lead to the unacceptable punishment of innocents.

On the other hand, the situation appears to be radically different if we adopt a more substantive reading of the presumption of innocence (and the BARD standard), ^[46] claiming that the state is free to criminalize conduct only if there is a high degree of certainty that it will not lead to the punishment of innocents (here understood as individuals whose acts, even if criminalized, were not punishment-worthy ^[47]). Of course, proxy crimes become potentially illegitimate under this reading. As long as the correlation between the underlying crime and the suspicious behaviour is not perfect, proxy criminalization would lead to the punishment of so-understood innocents.

Whether the substantive reading of the presumption of innocence is justified remains a hotly debated issue in contemporary philosophy of criminal law. Nevertheless, it appears to me that this interpretation is more consistent with negative retributivism. As was argued before, ^[48] the importance of the presumption of innocence (and the BARD standard) within the retributivist framework results from treating it as a principle that is directly morally grounded. The presumption of innocence is directly justified by the evil that would result from punishing the innocent. But then, if we agree that moral considerations of this kind justify a strict decision criterion during the criminal trial, it would be misguided not to apply a similar criterion while deciding on the scope of criminalization. ^[49] This is so because a decision with regard to the scope of criminalization is no less likely to result in the punishment of individuals who do not deserve to be punished than decisions made during the trial. Just as the court cannot impose the punishment unless it has been proven beyond a reasonable doubt that the defendant is indeed guilty, so the legislator is not allowed to criminalize a given conduct unless it is certain beyond a reasonable doubt that engaging in this conduct is punishment-worthy. ^[50]

There is still one important issue missing in our discussion on retributivism and proxy criminalization: proxy crimes seem to fit the traditional definition of mala prohibita well (i.e. offenses that are “wrong only because prohibited by legislation” ^[51]) and there exists a substantial (even if inconclusive) literature aiming at reconciling the existence of mala prohibita with retributivism. Perhaps, then, proxy crimes can be easily justified by some of the proposed retributivist justifications for mala prohibita (or at least it can be concluded that proxy crimes are no more inconsistent with retributivism than any other type of mala prohibita)? This seems somewhat dubious to me since there exists one significant difference between proxy crimes and typical examples of mala prohibita: the motivation behind their introduction. Let us recall that the basic reason behind any proxy crime is the fact that proving the underlying crime beyond a reasonable doubt may be problematic. A typical proxy crime would denote a conduct that is perfectly unwrongful if not for the fact that it suspiciously resembles some other criminalized act. Thus, a typical proxy offence is criminal only because some other act has been criminalized before.

Most retributivist attempts at justifying mala prohibita aim, in my opinion, to show that, even if a given act would not be morally wrongful absent any legal regulation, after the legal regulation is in place individuals have independent (other than avoiding criminal punishment) moral reasons not to engage in it. ^[52] That is not the case with proxy crimes, since basically the only reason individuals have not to engage in a proxy-criminalized conduct is not to raise suspicions of the criminal law enforcers, which is the same as saying that they do not have any moral reasons independent from the criminal law not to engage in this behaviour. ^[53]

Negative retributivism accompanied by the substantive reading of the presumption of innocence is inconsistent with proxy criminalization as long as the latter is overinclusive ^[54] (i.e. as long as it leads to the punishment of individuals who are not punishment-worthy). As I have tried to show in the preceding paragraphs, such an interpretation of negative retributivism appears to be most plausible and consistent. However, what is equally important for this paper is that such an interpretation of negative retributivism may serve as an explanation for existing court judgments that put limits on the legitimate scope of proxy criminalization. Indeed, court rulings in which proxy crimes are treated with utmost suspicion are numerous and may be found in many jurisdictions. ^[55] In this paper, I will discuss in more detail just one of them, namely the U.S. Supreme Court case of Bajakajian.

In this case, ^[56] the Supreme Court’s majority found that the punishment of full forfeiture for the offence of bulk cash smuggling (understood here as a proxy crime supplementing the underlying crime of money laundering) was “grossly disproportional” to the gravity of the offence.

In Bajakajian, the defendant was apprehended while trying to board a trans-Atlantic flight with over 350,000 US dollars in cash hidden in a false-bottomed suitcase, which he had not disclosed in an appropriate declaration. It had not been proved that the funds were of any illegal origin (Bajakajian claimed he had carried the money to repay a legitimate loan from his relatives in Syria), even if the behaviour of the defendant was generally suspicious and inconsistent. Still, in accordance with the criminal statute, the customs authorities declared the whole amount subject to full forfeiture.

The Supreme Court held that the harsh punishment of full forfeiture for the offence of bulk cash smuggling was aimed at actual money launderers. Since it had not been proved that Bajakajian was a money launderer, his act had been only a minor administrative offence, ^[57] for which the harsh punishment would be grossly disproportional. ^[58]

This ruling is based on an obvious observation that proxy criminalization is overinclusive: a proxy crime is intended to be addressed against individuals that have committed the underlying crime but in fact it includes all individuals that have engaged in the suspicious behaviour, including those who have not committed the underlying crime (‘innocents,’ as I call them here). So, the criminalization of bulk cash smuggling is addressed against money launderers (because it seems reasonable to assume that almost everybody who smuggles cash is a money launderer) but it includes also innocents, i.e. people who move cash for legitimate reasons and just forgot to fill out the required declaration.

If taken literally, the majority opinion in Bajakajian seems to say: the harsh punishment for the crime that Bajakajian has committed (bulk cash smuggling) could be imposed only if it had been proven beyond a reasonable doubt that he was a money launderer. But notice that had it been proved that he was a money launderer, the conviction of bulk cash smuggling would have been superfluous. Bulk cash smuggling, like any proxy crime, is employed exactly when it cannot be proved beyond a reasonable doubt that the defendant has committed the underlying crime. And in the case of money laundering, a crime that is both lucrative and hard to detect, the introduction of proxy criminalization seems, at least intuitively, justifiable. However, a consequentialist analysis of this kind is missing from the majority opinion. ^[59] It looks like the majority’s willingness to avoid any risk of punishing the innocent, no matter how small, clearly overrode any possible benefits associated with proxy-criminalizing money launderers.

Yet another feature of this ruling points to its retributivist basis. The majority did not attempt in any way to assess how likely it was for an ‘innocent’ (i.e. a person not involved in money laundering) to commit the offence in question, or, in other words, how likely it was that somebody would fail to declare a movement of cash simply because she did not know she was supposed to or thought of it as a bureaucratic requirement of secondary importance. Thus, the majority did not seem to be concerned with the question of whether proxy criminalization was empirically overinclusive in this case. For the majority, it was enough that the scope of criminalization was conceptually overinclusive, i.e. that there was no analytically necessary link between the definition of the proxy crime (bulk-cash smuggling) and the wrong that would justify the harsh punishment (i.e. money laundering). This requirement of an analytical correspondence between the description of the criminalized conduct and the wrong that justifies the criminalization was very much stressed by the retributivist proponents of the substantive reading of the presumption of innocence, whose views we discussed earlier. ^[60]

5. Conclusions

As has been argued in this paper, proxy crimes, while in principle justifiable from the consequentialist perspective, are much more troublesome for retributivism (or at least for what I find to be the most plausible interpretation of retributivism). As I attempted to demonstrate in the preceding section using the example of Bajakajian, judicial rulings that set limits on the scope of proxy criminalization are best understood as an expression of these retributivist views.

However, such an outcome may be troubling for a number of reasons. First, while proxy criminalization remains controversial (often rightly so) in newly developed areas of criminal law (such as anti-money laundering and anti-terrorist regulations), remember that many examples of proxy crimes, including some referred to in this paper, have been present in criminal codes for many years and do not seem to be particularly frowned upon by retributivists. The question why these more traditional proxy crimes do not trigger that much controversy is yet to be answered satisfactorily.

Another issue is whether retributivist arguments in favour of the uniformly high evidentiary threshold should bother us at all. As we established above, if retributivism generates an argument in favour of the high threshold, it is because it endorses an agent-relative norm, requiring knowledge of guilt in order to impose the punishment in particular cases, while not caring that much about the overall number of wrongful convictions the criminal law system generates. Of course, for consequentialists such an attitude is absurd: if there is something morally relevant, it is exactly the overall number of wrongful convictions. As we saw while analysing Kaplow’s model, there is at least a theoretical possibility that relaxing the evidentiary threshold may decrease the number of wrongful convictions, ^[61] but for negative retributivists this would not suffice as an argument in favour of lowering the threshold. To refer to an intuitively compelling class of cases: when the underlying crime is vague and standard-like (thus possibly causing confusion among individuals), supplementing it with a more rule-like proxy crime may increase the certainty of law and decrease the number of wrongful convictions. However, even if that were the case, many retributivists would object to such proxy crimes precisely on the grounds of protecting the innocent.

In any case, the deontological asymmetry between extreme prudence in dealing with the ‘particular’ risk of punishing the innocent and disregarding the task of mitigating the ‘global’ risk seems somewhat troubling, even though it is an important part of negative retributivist theories of criminal law. It is the most likely reason for the general support that the high evidentiary standard in criminal law enjoys. However, as was argued in this paper, the existence of proxy crimes is one example which shows that the effective evidentiary threshold in contemporary legal systems is oftentimes lower than the one officially specified in the procedural law. Thus, the dispute between retributivism and consequentialism in the context of dealing with the risk of punishing the innocent is of more practical relevance than is usually acknowledged.

Footnotes

The research was supported by a grant no. 2015/17/B/HS1/02279 funded by the National Science Centre, Poland. ↑
See Kaplow, Shavell (2002) for a review and critical discussion. See also Carritt (1947) and Smilansky (1990). ↑
Rawls (1955). ↑
Kaplow, Shavell (2002). ↑
Smilansky (1990). ↑
Alexander (1983). ↑
It is often expressed by the maxim saying that it is better that n guilty persons escape than that one innocent suffer, where n ≥ 1. However, there has been no consensus on what the right value of n is, with proposed values historically ranging from 1 (Voltaire) to 10 (Blackstone) to 1000 (Moses Maimonides). See Epps (2015) for an overview and discussion. ↑
From a more retributivist perspective, one may say that by setting the evidentiary threshold the society decides what ratio of false convictions is the maximal price we are willing to pay for the operation of the criminal justice system. ↑
Smilansky (1990). ↑
Robinson (2008); Duff, Hoskins (2017). ↑
Duff, Hoskins (2017). ↑
Ibidem. Notice that negative retributivism, so defined, is a position with regard to the distribution of punishment, not with regard to the justification of punishment, which may remain purely retributivist. ↑
See e.g. Hart (1968). ↑
Alexander (1983). ↑
Moore (1997): 155: “[t]he ‘deontological’ or ‘agent-relative’ retributivist regards the act of punishing the guilty as categorically demanded on each occasion, considered separately.” ↑
This is to say that, according to many retributivists, ‘knowledge’ (understood as ‘justified or well-founded true belief’) of the defendant’s guilt is a necessary condition for imposing condemnation associated with criminal punishment, see Duff et al. (2007): 89–91 (the issue of whether the epistemological definition of knowledge as ‘justified true belief’ can be applied in this context without any qualifications has not been analysed in the literature. I find it somewhat problematic but I will not elaborate on it here). In the notable words of Lawrence Tribe (1970), “guilt beyond reasonable doubt represents not a lawyer’s fumbling substitute for a specific percentage, but a standard that seeks to come as close to certainty as human knowledge allows – one that refuses to take a deliberate risk of punishing any innocent man.” A somewhat different account is given by Tadros (2006): “conviction is warranted only if knowledge that the defendant perpetrated the offence is demonstrated, and, we might add, demonstrated publicly. One reason why this might be so is that a criminal conviction will only achieve the kind of closure that the criminal trial aims at if no reasonable doubt remains about the guilt of the defendant.” ↑
See below in this section. ↑
Tomlin (2013). Technically, Tomlin refers at this point to the presumption of innocence, treating the high evidentiary threshold as a part of this principle. Since in this paper I focus mostly on the evidentiary standard, I will speak of the BARD standard instead. ↑
Ibidem. ↑
Due to space constraints, in the remaining part of this paper, under the head of ‘consequentialism’ I understand mostly normative economic (or welfarist) analysis of law. I believe this approach predominating in contemporary legal academia is representative for other branches of consequentialism in most respects relevant for this paper. That means that important ideas about the evidentiary threshold expressed by consequentialists who do not belong to the economic school, in particular Larry Laudan (2012), unfortunately will not be dealt with in this paper. ↑
Posner (2011). ↑
Kaplow (2012). ↑
Therefore raising the sanction or the probability of apprehension (being charged) are substitutes for lowering the evidentiary threshold. Kaplow’s model describes the interaction between these three parameters and their optimal levels. Due to space constraints, I will not deal with this issue here and will take the two former parameters as fixed. ↑
This is equivalent to what we have already said about the reverse relation between changes in the evidentiary threshold and the deterrence effect. ↑
See e.g. Tadros, Tierney (2004); Teichman (forthcoming). ↑
Schauer (2003). ↑
Bentham (1864): 425. ↑
McAdams (2006). ↑
Becker (1968); Polinsky, Shavell (2000); Duff et al. (2007). ↑
Some authors use the term ‘proxy crime’ to denote basically any crime crafted on the basis of a statistical generalization (see e.g. Alexander, Kessler Ferzan (2013); so understood, the term ‘proxy crime’ seems closely related to the notion of a ‘hybrid offence’ discussed by Duff (2007)). In this meaning, the prohibition of driving over 90 km per hour is a proxy offence because it denotes conduct which usually is unacceptably risky but is sometimes not. The meaning used here is much narrower and it only contains offences prohibiting behaviour that is always substantially harmless unless committed with the intent to commit also the underlying crime. ↑
Hereafter I will denote this harmful behaviour as ‘underlying crime’. ↑
A commonly known example (and frequently exploited in popular culture) of such a situation may be found in the history of the famous gangster Al Capone and his conviction for tax evasion. Of course, this example is not perfect (because tax evasion is generally harmful and deserves prosecution on its own) but in this particular case tax evasion obviously served only as a proxy for the core of Capone’s criminal activity. ↑
Stuntz (2001). ↑
For the sake of argument, I am assuming here that personal use of drugs is not punishment-worthy. ↑
For a good analysis of Bentham’s ideas in this regard, see Schauer (2003). ↑
Bentham (1864): 426. ↑
However, what is important from the viewpoint of traditional legal theory is the fact that the lowering of the standard of proof is achieved through manipulating substantive, not procedural, rules. ↑
Of course, the question whether actual instances of proxy crimes are optimal from the perspective of, for example, Kaplow’s welfarist model, remains to be answered empirically. ↑
Ehrlich, Posner (1974); Kaplow (1992, 1999). ↑
Teichman (forthcoming). ↑
Ibidem. ↑
Zamir, Harlev, Ritov (2016/2017). ↑
See e.g. Teichman, Zamir (2015): sections 7 and 8 for a review of literature. ↑
Zamir, Ritov, Teichman (2014). ↑
Roberts (2005); Lippke (2016). ↑
See e.g. Tadros, Tierney (2004); Tomlin (2013). ↑
The term punishment-worthiness, as used by Tomlin (2013), is open to at least two interpretations. The moralistic interpretation [e.g. Duff (2005)] would imply that the legislator may criminalize only a conduct that is actually immoral. The other interpretation, advocated by Tadros (2006), implies that the legislator is free to criminalize a conduct that they consider a public wrong (even if it is not actually immoral); what the legislator is not allowed to do, however, is to draft an offence in such a way that would lead to the punishment of individuals who have not committed what the legislator considers the public wrong in question (in other words, the legislator is not allowed to draft offences whose definitions are overinclusive). The differences between these two interpretations are by no means trivial; however, most proxy crimes (or at least most of the examples of proxy crimes presented in this paper) would be problematic under both interpretations, so I will not elaborate on this issue any further. ↑
See Section 2 of this paper. ↑
Tomlin (2013). ↑
Such a conclusion is a result of what Tomlin (ibidem) calls Equivalence Thesis 1: “it can be as bad or worse to punish someone for something that they should not, in fact, be punished for (and did do), as it is to punish someone for something that they did not, in fact, do (but that is, in principle, punishment-worthy).” Another, somewhat more practically-oriented argument in favour of the substantive reading is offered by Tadros and Tierney (2004): if the presumption of innocence is supposed to serve as an individual’s shield against the state, then its protection would become totally illusory if the state were free to relax the evidentiary standard by enacting overinclusive offences. ↑
See Duff (2005): chapter 4.4, and literature discussed there. ↑
Be it: keeping a promise, solving coordination problems, or respecting rules of fair play; see Husak (2005) for a critical overview. ↑
For completeness, we should mention legalistic (as opposed to moralistic) retributivism: a claim that an individual engaging in a conduct prohibited by a duly enacted criminal statute deserves to be punished, simply because he has been given notice, irrespective of the content of this statute. Following Husak (2005), I would argue that unrestricted legalistic retributivism cannot constitute a tenable moral criterion, simply because it is impossible to imagine a duly enacted criminal statute that would not be justifiable under unrestricted legalistic retributivism. ↑
Schauer (1991); Alexander, Kessler Ferzan (2013). ↑
E.g. the New York Court of Appeals case of Bunis (9 N.Y.2d 1, 210 N.Y.S.2d 505, 172 N.E.2d 273 1961) in which the criminalization of selling magazines or books without a front cover was found unconstitutional, or a House of Lords ruling in Sheldrake v DPP ([2004] UKHL 43), in which two proxy crimes (being drunk in charge of a car and being a member of a proscribed terrorist organization) were interpreted as imposing only an evidentiary burden on the defendant, or a European Court of Human Rights ruling in Salabiaku v. France (1988 13 EHRR 379), in which the Court set the limit on the possibility to convict the defendant of “possession of prohibited goods when passing through customs” in a situation in which the prosecution failed to prove that the defendant was guilty of drug trafficking. ↑
United States v. Bajakajian, 524 U.S. 321 (1998). ↑
It may be argued that Bajakajian is not a good example to be used in an analysis of proxy crimes, since the failure to report would be an offence anyway and the question to be decided by the Court was only whether the punishment was excessively harsh or not. But let us recall that in this paper I understand proxy criminalization as referring to two somewhat different situations: 1) a totally harmless conduct is criminalized; 2) a conduct that is harmful (or punishment-worthy) only to a small extent is sanctioned with a disproportional penalty. In the latter case, the harsh punishment is imposed not because of the intrinsic wrongfulness of the conduct but only because this conduct is suspicious from the viewpoint of some serious underlying crime. I agree that the second setting is a much less obvious example of proxy criminalization and may seem to be more readily justifiable from the retributivist perspective. However, let us remember that the main thesis of negative retributivism states both that the innocent should not be punished and that the guilty should not be punished more than they deserve. There is no reason to treat the second constraint as less important, quite the opposite. In the words of Tomlin (2013), “[o]verpunishment mistakes can be just as serious, if not more so, than wrongful conviction mistakes”. This is clearly visible in those cases of proxy criminalization in which minor regulatory offences, which we would expect to be punished leniently, are punished as the serious crimes with which they are assumed to correlate. To refer to the current example: undeclared movement of cash looks like a minor administrative offence (so a person violating this law would expect a moderate penalty, comparable to penalties for similar offences consisting in failure to report some information; this is exactly the case in most contemporary jurisdictions that criminalize bulk-cash smuggling) but in the US the punishment turns out to be unexpectedly harsh. ↑
Notice that the reasoning here is based on a (reasonable) assumption that bulk cash smuggling in itself is a rather harmless act. It is harmful only if it is a part of a money laundering (or tax evasion) scheme. ↑
However, consequentialist arguments appear in a dissent by Justice Kennedy. ↑
Tomlin (2013); Tadros, Tierney (2004). ↑
Of course, consequentialists have to take other (beneficial or detrimental) effects of lowering the threshold into account before making any recommendation. ↑

Diametros 53 (2017): 26–49
doi: 10.13153/diam.53.0.1099

Retributivism, Consequentialism, and the Risk of Punishing the Innocent: The Troublesome Case of Proxy Crimes

Piotr Bystranowski

1. Introduction ^{^[1]}

2. Retributivism, consequentialism, and the evidentiary threshold in a criminal trial

3. Proxy crimes

4. Desirability of proxy criminalization

5. Conclusions

Footnotes

References