Friday, March 24, 2017

Lexical items and "mere" morphology-2

This continues the saga from the previous post.

The LK theory had a good run. However, two particularly thorny problems strongly urged revision.  Here we briefly outline the main problems with LK approaches. We then discuss the properties of the GB Binding Theory (BT) that came to replace it with particular emphasis on how it addressed these problems.

There are two problems with LK approaches. First, there is the question of how to analyze sentences like (5a). On analogy with examples like John kissed himself (derived from underlying John kissed John), they should have underlying structures like (5b). However, (5a) does not mean what (5b) does, and this serves to gut the basic LK approach to reflexivization.[1] Analogous examples in (6) argue against the LK analysis of pronominalization.

(5)       a. Everyone kissed himself
            b. Everyone kissed everyone
(6)       a. Everyone thinks that he is tall
            b. Everyone thinks that everyone is tall

It is worth considering how these examples pose a problem for the LK analysis. First, given LK’s background assumptions, it appears that the anaphoric morphemes are semantically significant.  Specifically, it is plausible that (5a)/(6a) differ in meaning from (5b)/(6b) precisely because the semantic contributions of himself and he are different from that of everyone.  Adverting to the semantic contributions of the specific anaphors requires reneging on the assumption that binding dependencies are morpheme blind, and suggests that a morpheme centered analysis of binding is the right way to proceed. In other words, rather than being grammatical afterthoughts, the licensing requirements of anaphoric morphemes drive the grammar. Reflexives and bound pronouns are elements that require grammatical validation and grammatical processes cater to their licensing requirements.

Second, once binding theory becomes morpheme centered, economy is no longer required (or desired). Reflexivization and Pronominalization must be ordered with (1) before (2) because Pronominalization applies to the same inputs (aka Structural Descriptions (SDs)) as Reflexivization. Were they unordered, the grammar would generate illicit bound pronoun constructions.  However, ordering (1) before (2) is nugatory if they have different SDs. Requiring reference to specific reflexive and pronoun morphemes in the statement of the operations (as the data in (5) and (6) suggest is necessary) automatically results in rules with different SDs, and this obviates strictly ordering reflexivization before pronominalization. In other words, once binding theory aims to license reflexive morphemes and bound pronoun morphemes, the principles that do so can apply without reference to the applicability of other rules, i.e. the binding principles can be stated unconditionally rather than as conditions whose applicability is relative to that of others.

In addition to the problems posed by (5) and (6), there is a second problem with the LK account, forcefully noted in Lasnik 1976. Say it is correct that rules like (1) and (2) prevent bound pronouns from appearing where reflexives do.  This still leaves open the possibility that referential pronouns might appear with the requisite interpretation. Recall that LK does not regulate (co-)reference, merely binding. Thus, what is wrong with (7a), where the pronoun is not the product of any grammatical process?  This kind of base generated deictic pronoun is what we find in sentences like (7b). So what prevents generating such a pronoun in (7a) with the same reference as John? As Lasnik 1976 notes, (7a) with the co-referential interpretation of the pronoun is rather unacceptable and this is left unexplained so long as the co-reference is not the result of binding but is “accidental.” We can patch up the LK account by adding another disjointness rule, something to the effect that pronouns cannot have local binders (along the lines of Principle B) but, and this is the critical point, this patch seems to render the Pronominalization rule superfluous as the anti-binding Principle B alone suffices to account for the distribution of both bound and unbound pronouns. In effect, they can appear anywhere they are not prohibited.

(7)       a. John loves him
            b. Mary saw him

Lasnik’s (1976) proposal has an interesting theoretical effect: it reinterprets what the grammar tracks. For LK, the grammar licenses antecedence and establishes antecedent-anaphor dependencies. However, the disjointness rule does not establish antecedence but anti-antecedence. This results in a somewhat schizophrenic grammatical theory of bound anaphora.[2] Reflexives are treated more or less as in LK with the grammar requiring that a reflexive have an appropriate local antecedent. In effect, all reflexives are anaphoric and require antecedents and the grammar functions to ensure this result.  Pronouns, in contrast, do not require antecedents and what the grammar does is ensure that if there is an antecedent it is not too local.  Importantly, this collapses bound and non-bound pronouns into one class. Indeed, it’s this move that allows Lasnik (1976) to address the puzzle he identified. For LK, grammars code for anaphoric dependency. After Lasnik (1976), grammars regulate the distribution of classes of morphemes regardless of interpretation, with some later extra-grammatical process (we might now say interface processes) determining which pronouns are understood as bound and which not.

So with this as background, let’s consider the general features of the GB Binding Theory (BT).  It consists of three principles:

(8)       A. An anaphor (e.g. a reflexive) must be bound in its domain.
            B. A pronoun cannot be bound (must be free) in its domain.
            C. An R-expression cannot be bound.

Domains have been variously characterized, but for the nonce we can assume that a domain is the minimal (finite) clause containing the anaphor/pronoun/R-expression. There are several things to observe about these principles.[3]

First, they are morpheme centered. The inputs to the rules are expressions that fall into one of three classes, anaphors (+a,-p), pronominals (+p,-a) and R-expressions (-a,-p).[4] Furthermore, the rules state licensing requirements for these expressions. In other words, differing binding requirements restrict the distribution (and interpretation) of lexical items with differing feature constitutions. Note that this effectively rejects the LK distinction between lexical and grammatical formatives.  Reflexives and bound pronouns are just as lexical as cat and run, though their features call forth grammatical licensing in ways that more “ordinary” lexical items do not.

Second, the principles are unconditional in the sense that neither A nor B is ordered or needs to be ordered with respect to the other.  What accounts for the complementarity of reflexives and pronouns is the fact that both elements meet inverse licensing requirements in the same local domain. Where anaphors must be bound, pronouns must not be. There is no sense within the standard GB version of BT that pronouns and anaphors or their licensing requirements are in an economy relation or that one kind of dependency is preferred to the other.  Both apply where they can and the presence or absence of either has no grammatical effect on the other. 

A thought experiment makes the contrast with LK evident. Imagine that an epidemic struck, wiping out the rule of Reflexivization. In an LK grammar, Pronominalization would then apply to yield sentences like John likes him with a bound interpretation.  Given the same epidemic in a GB grammar, this time wiping out A, such sentences would still be prohibited, for pronominals would still be subject to B.
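
The thought experiment can be run on a toy model. What follows is only a sketch under loud assumptions: linear precedence stands in for c-command, a "domain" is just a clause number, and the Nominal class and the violations function are hypothetical names introduced for illustration, not part of any GB formalization.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Nominal:
    form: str
    kind: str    # "anaphor", "pronoun" or "r-expression"
    index: int   # syntactic index (no fixed semantic interpretation)
    clause: int  # minimal clause containing the nominal (its domain)

def binders(nominals, pos):
    """Coindexed nominals preceding position pos.

    Assumption: linear precedence stands in for c-command.
    """
    target = nominals[pos]
    return [n for n in nominals[:pos] if n.index == target.index]

def violations(nominals, active=("A", "B", "C")):
    """List (principle, form) pairs for every violated active principle."""
    found = []
    for pos, n in enumerate(nominals):
        all_binders = binders(nominals, pos)
        local = [b for b in all_binders if b.clause == n.clause]
        if "A" in active and n.kind == "anaphor" and not local:
            found.append(("A", n.form))
        if "B" in active and n.kind == "pronoun" and local:
            found.append(("B", n.form))
        if "C" in active and n.kind == "r-expression" and all_binders:
            found.append(("C", n.form))
    return found

john_likes_him = [
    Nominal("John", "r-expression", 1, 0),
    Nominal("him", "pronoun", 1, 0),
]

print(violations(john_likes_him))                       # B is violated
print(violations(john_likes_him, active=("B", "C")))    # A wiped out: B still violated
```

Because each principle is checked on its own, removing A from the active set changes nothing about how B treats the pronoun, which is the point of the contrast with LK.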

Third, the principles apply regardless of how the morphemes are semantically interpreted. This is clearest in the case of B. The grammar does not single out any particular antecedence relation. B states that pronouns cannot appear in certain configurations; it does not specify or distinguish any DP as antecedent.  In fact, the co-indexing that is part of the system has no univocal semantic interpretation. This is what allows B to accommodate Lasnik’s (1976) worries. The indexing in (8) violates B regardless of whether it is interpreted as binding or co-reference.

(8) *John1 likes him1 

This has the effect of enriching the interpretive systems, as some indexations are interpreted as binding dependencies and some are not. Which are which falls beyond the purview of the grammar despite the fact that they have consequences for acceptability that do not seem particularly “semantic.” Consider one example. Weak Crossover Effects (WCO) are restricted to bound pronouns, e.g. ones where the antecedent is a quantified DP. Thus there is a contrast in (9a,b), where his can be interpreted as referring to John but not as bound by everyone.

(9)       a. his1 photos distressed John1
            b. *his1 photos distressed everyone1

The standard description of this is that WCO constrains binding relations but not (co-) reference. The indexing in (9a) is interpreted as co-reference while the one in (9b) cannot be so interpreted as everyone is not a referring expression.  However, as binding is illicit in WCO configurations, it cannot receive a binding interpretation so the indexing yields no good interpretation and the structure yields unacceptability.[5] If this is correct, then BT requires supplementation to distinguish those indexings that will be interpreted as bindings from those that will not. Why? Because the indexing above is purely syntactic and it tracks a mixed bag of interpretive possibilities.  This is just another way of saying that BT per se accounts for the distribution of certain morphemes, not the dependencies that they enter into.

Another consequence of the GB re-invention of the binding theory is Principle C. As noted above, LK had no corresponding principle. Rather, for LK Principle C effects are by-products of how Reflexivization and Pronominalization are stated.  Once one adopts the idea that reflexives and pronouns are real inputs to semantic interpretation and not mere morphological dress-up, the LK strategy for dealing with Principle C effects is no longer viable and an explicit statement is required.

As has long been noted, Principle C is somewhat odd. First, it is not bounded in any way, unlike A and B, which apply within circumscribed domains.  Second, at least initially, it was taken to apply to what on the surface appear to be entirely different kinds of cases. The examples in (10) do not violate Principles A or B.  Consequently, if these are to be excluded an additional binding principle is required. Furthermore, whereas (10c,d) plausibly pertain to the structural conditions imposed on antecedents and their dependent anaphors, (10a,b) do not appear to involve anaphoric elements at all.  To collect all these cases under the same principle it is critical to add another category of expressions, R-expressions, to the inventory of elements regulated by the Binding Theory.[6]  Like Principle B, Principle C does not specify licit anaphoric dependencies but blocks illicit ones.  It is a negative, rather than a positive, rule of grammar.[7]

(10)     a. *John1 likes John1
            b. *John1 thinks that John1 is tall
            c. *John1 expects himself1 to like John1
            d. *He1 thinks that John1 is tall

Ok, let’s take a step back now and consider these two approaches. As should be evident, the theoretical intuitions behind BT are very different than those behind LK.  They differ not only technologically (LK uses transformations, is derivational in spirit and has morpheme rewrite rules while BT has indexing algorithms and is representational in spirit and uses filters stated at various grammatical levels) but in what they take the subject matter of binding to be.  They differ along three critical dimensions:

1.     Do binding principles have a natural semantic interpretation?
2.     Are binding principles economy principles (or absolute)?
3.     Are binding principles morpheme (or dependency) centered?

LK answers yes to (1) and (2) and no to (3). GB answers no to (1) and (2) and yes to (3). What of minimalist approaches? I don’t know, but my hunch is that LK approaches have some features that are worth reconsidering in an MP context. How so?

Well, first, the main problem with LK accounts noted above disappears in the context of theories that distinguish copies due to I-merge from those that result from multiple selections of the same expression from the lexicon. The LK theory did not (and could not) distinguish these two possibilities, and so the fact that everyone loves himself does not mean the same as everyone loves everyone was sufficient to sink the LK approach. However, as you know, this does not hold for MP accounts so long as binding tracks movement. If it does, then we can distinguish the two cases above syntactically.

As noted, this requires endorsing a movement theory of binding. The main obstacle to this in earlier GG theories was D(eep) Structure. Given DS, there could be no movement between “theta” positions and so the technology MP provides to distinguish everyone1 loves everyone2 from everyone1 loves everyone1 could not be applied. However, if we eschew DS (as MP stories do) then it is in principle possible to move between “theta” positions and so distinguish these two kinds of chains. We can then restrict LK reasoning to the second kind of chain without empirical hazard. So, the elimination of DS restrictions is a pre-requisite for revivifying the LK approach, and this is precisely what MP theories allow.
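
The difference between one selection with two copies and two separate selections can be made concrete with a toy spell-out routine. This is a sketch under stated assumptions, not a worked-out MP analysis: the Occurrence class, the token field (which lexical selection an occurrence copies), the REFLEXIVE and PRONOUN tables and the spell_out function are all hypothetical names, and case morphology is ignored.

```python
from dataclasses import dataclass

# Hypothetical spell-out tables for lower copies of a chain.
REFLEXIVE = {"everyone": "himself"}
PRONOUN = {"everyone": "he"}

@dataclass(frozen=True)
class Occurrence:
    token: int   # which selection from the lexicon this occurrence copies
    form: str    # form of the selected expression
    clause: int  # clause containing this occurrence

def spell_out(items):
    """Pronounce the head of each chain in full and reshape lower copies."""
    chain_head = {}  # token id -> highest occurrence seen so far
    words = []
    for item in items:
        if isinstance(item, str):
            words.append(item)
        elif item.token not in chain_head:
            chain_head[item.token] = item          # head of a new chain
            words.append(item.form)
        elif chain_head[item.token].clause == item.clause:
            words.append(REFLEXIVE[item.form])     # local lower copy
        else:
            words.append(PRONOUN[item.form])       # non-local lower copy
    return " ".join(words)

one_selection = [Occurrence(1, "everyone", 0), "kissed", Occurrence(1, "everyone", 0)]
two_selections = [Occurrence(1, "everyone", 0), "kissed", Occurrence(2, "everyone", 0)]

print(spell_out(one_selection))    # everyone kissed himself
print(spell_out(two_selections))   # everyone kissed everyone
```

The two inputs look identical word for word; only the token field tells them apart, which is exactly the information the LK theory had no way to represent.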

Last, we must allow Gs that differentially spell out copies. The LK theory, recall, treats the morphological differences between reflexives and bound pronouns as syntactically very superficial. What counts are the underlying chains/dependencies, not the morphemes that express them. Imo, this is likely a good thing. Let me explain why.

The aim of MP is to explain why we have the UG principles we do. In other words, the aim is to explain the structure of FL/UG. Morpheme centered Gs cannot support this kind of project. Why not? Because they effectively stipulate G requirements by packing them into the idiosyncratic feature make-ups of specific lexical items. Why must anaphors be locally bound? Because they have features that require that they be locally bound. Why do they have these features? Well, because they are reflexives and reflexives inherently have such features by stipulation. This explanation is circular in the worst sense: the circle is very very tight.

Note that such stipulations are particularly problematic for those with minimalist fish to fry. They are not, for example, worrisome for Plato Problem kinds of issues. So, if A-anaphors are innately part of any lexicon then their binding requirements need not be learned.[8] However, if one wants to go beyond explanatory adequacy, then such stipulations stink. They defeat the MP project from the get-go, as they stipulate what we want to have explained.[9] In other words, from an MP perspective the problem with classical binding theory is that it is based on a series of morphological stipulations concerning specific lexemes, and stipulations like these prevent explanation. So, if your goal is to explain why the binding theory looks the way it does, then you don’t want morpheme centered accounts of binding like the ones we find in GB. Of course, this goal might be unattainable and morpheme based accounts might be the best we can do, but…

Ok, basta! This has gone on far too long. Let me just suggest that earlier GG theories had some properties that are worth re-examining, most particularly the idea that some morphemes are just by-products of grammatical processes rather than being the causal engines behind them. This is not a new idea, but MP has given it, imo, a new lease on life and part of this lease implies rethinking the idea that all formatives are created equal and have an equal purchase in interpretation at the interfaces.



[1] Recall that in LK accounts the pre-transformational phrase marker was the sole input to semantic interpretation. However, even later approaches which allowed information from several grammatical levels to contribute to semantic interpretation would not have been able to incorporate an LK analysis which sensibly captured the meaning of examples like (5a) and (6a).
[2] This is curiously mirrored in Lasnik’s paper where the appendix deals with bound pronouns and the body of the paper with co-reference.
[3] There is a third reason for moving from an LK approach to a morpheme centered one. In the mid 1970s there was a well motivated theoretical move to dramatically simplify Structural Descriptions (SDs) and Structural Changes (SCs). This made rules that included the insertion of specific morphemes less natural/desirable. The main problem was that such morpho-phonological intruders complicated the move to simple rules like Move alpha anywhere. Flash forward 40 years and the analogue of LK rules finds a natural home: it results from processes that spell out copies/occurrences. See below.
[4] PRO was taken to be (+a,+p).  We ignore PRO here.
[5] Strong Crossover effects yield a similar problem. Variables are the semantically quintessential anaphors.  They are not referring expressions and require binders.  As such, one might think that they would qualify as anaphoric (+a,-p) elements. In fact the earliest versions of trace theory categorized wh-traces/variables as anaphors. However, so categorizing variables leads to a big empirical problem. Sentences like (i) are wrongly expected to be interpretable as (ii).
(i)            Who1 does he1 think t1 is intelligent
(ii)          Who1 t1 thinks he1 is intelligent
This conclusion can be finessed by cataloging residues of movement, t1, as an R-expression subject to principle C.  This works, but it also highlights the fact that ‘R’ does not mean “referential” in the naïve sense of the term.
[6] One of the authors suspects that the attractiveness of the GB theory of PRO was in part the result of filling an available cell in the required feature matrix. If anaphors are +a,-p and pronouns are -a,+p, and R-expressions are -a,-p then there should be something that is +a,+p. PRO was the proposed missing link.
[7] There are some problems with this way of dealing with the data in (10). First, it is not clear that (10a,b) are really as unacceptable as (10c,d). Indeed there are languages where analogous sentences seem perfectly well formed and express anaphoric dependencies (c.f. Boeckx, Hornstein and Nunes 200x for some discussion). Second, it is not particularly clear what an R-expression is. Among the elements that fell under the category are traces (interpreted as bound variables), names, definite descriptions, demonstratives etc.  What all these expressions have in common is unclear. Variables are prototypical anaphors. Names are prototypical non-anaphors.  Definite descriptions can be used anaphorically or not and semantically and grammatically they share many properties with pronouns. Nonetheless, they too are categorized as R-expressions.  The category seems to be a catch-all with entry requirements to the fraternity driven entirely by empirical necessity. This results in negligible explanatory force.
[8] These kinds of theories, however, do require some non-trivial explications of how morpho-phonological agreement implicates semantic dependency. Just because two expressions have the same features need imply nothing about whether/how these features have semantic significance.
[9] Incidentally, this is why PRO based accounts of control should also be MP suspect. Assuming PRO with its special licensing requirements allows one to track control facts but not to explain why control exists.

Monday, March 20, 2017

Lexical items and "mere" morphology-1

This was intended to be short. I failed. It is long and will get longer. Here is part 1. Part 2 sometime later this week or next.

As I mentioned in an earlier post, I am in the process of co-editing a volume of commentary essays on Syntactic Structures (SS). The volume is scheduled to be out just in time for the holidays and will, I am sure, make a great gift.  Nothing like an anniversary copy of SS with a compendium of essays elaborating its nuances to while away the holidays. I mention this because the project has got me thinking about how our theories of grammar have changed over time. And this brings me to the topic of today’s question: are all morphemes created equal?

Interestingly, GG theories answer this question differently. SS and Aspects sharply distinguish, at least theoretically, between two kinds of morphemes: those that enter derivations via lexical insertion, and those that enter transformationally. In this way, these theories make a principled distinction between grammatical vs non-grammatical formatives and track their grammatical differences to different G etiologies.

Later theories (take GB as the poster child) distinguish lexical vs functional morphemes, but, and this is important, there appears to be no principled distinction here. The latter more closely track important G features, but both types of formatives enter derivations in the same way (via lexical insertion or heads of X’ projections) and are manipulated by the same kinds of rules. The main difference (which I return to) is that some lexical items require specific grammatical licensing conditions (e.g. reflexives, pronouns, wh-elements) while others don’t (there is no grammatical licensing condition for ‘cat’ or ‘husband’). Functional elements are also often designated “closed class” items, but this classification carries no obvious theoretical import, at least within the theory of grammar. Rather, the designation is descriptive and adverts to the statistical frequency of these elements. Grammatically speaking, it is unclear what makes an expression “functional” beyond the fact that we designate it as such.

Minimalist accounts fall roughly on the GB side of these issues. This, I believe, is unfortunate, for the earlier distinction between lexical and grammatical formatives is, IMO, worth a modern investigation. Before saying a few words about why I believe this, let me indulge my penchant for Whig History and illustrate the distinction by contrasting the older Lees-Klima binding theory with the more modern GB view. Readers be warned, this will not be a short excursus.

Let’s start with the Lees-Klima (LK) (1963) account.  The theory invokes the following two rules.  They must apply when they can and they are ordered so that (1) applies before (2).
(1)       Reflexivization:
            X-NP1-Y-NP2-Z --> X-NP1-Y-pronoun+self-Z
            (where NP1 = NP2, pronoun has the φ-features of NP2, and NP1 and NP2 are in the same simplex sentence)
(2)       Pronominalization:
            X-NP1-Y-NP2-Z --> X-NP1-Y-pronoun-Z
            (where NP1 = NP2 and pronoun has the φ-features of NP2)

As is evident, the two rules have very similar forms. Both apply to identical NPs and morphologically convert one to a reflexive or pronoun. (1), however, only applies to nominals in the same simplex clause, while (2) is not similarly restricted. As (1) obligatorily applies before (2), reflexivization will bleed the environment for the application of pronominalization by changing NP2 to a reflexive (thereby rendering the two NPs non-identical).  The rule ordering effectively derives the complementary distribution of bound pronouns and reflexives. 

An illustration will help make things clear. Consider the derivation of (3a).  It has the underlying structure in (3b). We can factor this as in (3c) as per the reflexivization rule (1). This results in converting (3c) to (3d) with the surface output (3e) carrying a reflexive interpretation.
(3)       a. John1 washed himself/*him
            b. John washed John
            c. X-John-Y-John-Z
            d. X-John-Y-him+self-Z
            e. John washed himself
What blocks John washed him with a similar reflexive reading? To get this structure requires that Pronominalization apply to (3c).  However, it cannot as (1) is ordered to obligatorily apply before (2).  Once (1) applies we get (3d) and this no longer has a structural description amenable to (2). Thus, the application of (1) bleeds that of (2) and John washed him with a bound reading cannot be derived.

This changes in (4). Reflexivization cannot apply to (4c) as the two Johns are not in the same clause. As (1) cannot apply, (2) can (indeed, must) as it is not similarly restricted to apply to clausemates. In sum, the inability to apply (1) allows the application of (2). Thus does the LK theory derive the complementary distribution of reflexives and bound pronouns.
(4)       a. John believes that Mary washed *himself/him
            b. John believes that Mary washed John
            c. X-John-Y-John-Z
            d. X-John-Y-him-Z
            e. John believes that Mary washed him
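
The two derivations can be mimicked with a few lines of code. This is only a minimal sketch, assuming that NP identity is plain identity of form (exactly the assumption that later causes trouble with quantifiers) and that clause membership is a number attached to each NP; the NP class, the PRONOUN table and the function names are hypothetical illustrations, not Lees and Klima's notation.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon: the pronoun sharing each name's phi-features.
PRONOUN = {"John": "him", "Mary": "her"}

@dataclass(frozen=True)
class NP:
    form: str    # surface form of the nominal
    clause: int  # index of the simplex clause that contains it

def rewrite(sentence, clausemates_only, suffix):
    """Turn the second of two identical NPs into pronoun + suffix."""
    out = list(sentence)
    for i, first in enumerate(out):
        if not isinstance(first, NP) or first.form not in PRONOUN:
            continue
        for j in range(i + 1, len(out)):
            second = out[j]
            if not isinstance(second, NP) or second.form != first.form:
                continue
            if clausemates_only and second.clause != first.clause:
                continue
            out[j] = NP(PRONOUN[first.form] + suffix, second.clause)
    return out

def reflexivize(sentence):      # rule (1): clause-mates only
    return rewrite(sentence, True, "self")

def pronominalize(sentence):    # rule (2): no locality restriction
    return rewrite(sentence, False, "")

def derive(sentence):
    # Obligatory ordering: (1) applies before (2).
    return pronominalize(reflexivize(sentence))

def surface(sentence):
    return " ".join(w.form if isinstance(w, NP) else w for w in sentence)

local = [NP("John", 0), "washed", NP("John", 0)]
distant = [NP("John", 0), "believes", "that", NP("Mary", 1), "washed", NP("John", 1)]

print(surface(derive(local)))                      # John washed himself
print(surface(derive(distant)))                    # John believes that Mary washed him
print(surface(reflexivize(pronominalize(local))))  # John washed him (reversed order)
```

The last line shows what the ordering buys: if (2) is allowed to apply first, the locally bound pronoun is generated and (1) no longer finds two identical NPs to operate on.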
There are other features of note:

·      LK grammars code for antecedence: Anaphoric dependency is grammatically specified. In other words, just as the antecedent of a reflexive is determined by (1), the antecedent of an anaphoric pronoun is determined by (2). If one understands “NP1 = NP2” to mean that the two nominals must (at least) have the same semantic value (i.e. that NP1 semantically binds NP2) then what the equality expresses is the idea that the grammar codes semantic binding and semantic antecedence.[1]  This has two consequences. First, the grammar codes binding dependencies, not (co-)referential dependencies.[2] Second, there is no analogue of GB’s Condition B, which grammatically states an anti-binding restriction. (1) and (2) together determine the class of anaphoric dependencies. There is no specific coding for disjoint reference or anti-anaphora.[3]
·      Some operations have priority over others. A key feature of the LK approach is that reflexivization obligatorily applies before pronominalization.  Were the operations either freely ordered or not obligatory then John hugged him would support the bound reading of the pronoun.  In effect, the LK account embodies an economy conception wherein reflexivization is preferred to (is obligatorily ordered before) pronominalization. Absent this preference, locally bound pronouns would be grammatically generated.  This point is made evident by considering a slight alternative version of the Pronominalization rule. Assume that we added the following rider to (2): NP1 and NP2 are not contained in the same simplex clause.  This codicil is analogous to the restriction in (1), where Reflexivization is limited to clause-mates.  Interestingly, this amendment allows (1) and (2) to be freely ordered. The clause-mate condition in (1) restricts application to clause-mated nominals and the one in (2) to non-clause-mated NPs. This suffices to prevent the illicit pronouns and reflexives in (3a)/(4a).[4]
·      The LK approach is dependency centered not morpheme centered. (1) and (2) primarily code antecedence relations not morpheme distributions. A by-product of the dependency (in English) is the insertion of reflexive and pronominal morphemes. These are clearly surface morpho-phonological by-products of the established dependency and can be expected to differ across languages.[5] Stated more baldly, one can have reflexive and bound pronoun constructions without reflexives or bound pronouns.  This gives the LK theory two distinctive characteristics when viewed with a modern eye. First, it distinguishes between morphemes that enter derivations from the lexicon and those that do not.  Second, it endows this distinction between morpheme types with semantic significance. In the context of the Standard Theory, the LK background theory, bound pronouns and reflexives are semantically inert. Here Deep Structure exclusively determines semantic interpretation. Consequently, as reflexive and bound pronoun morphemes are not in Deep Structure but are introduced in the course of the syntactic derivation they must be interpretively impotent.  There is one more interesting consequence: the LK conception rejects a central feature of later accounts, namely that morphological identity is a good guide to syntactic or semantic categorization.  In other words, for LK theorists, the mere fact that bound pronouns and deictic pronouns have the same morpho-phonological shape in many languages is at best a weak prima facie reason for treating them as a unified syntactic class.
·      The binding rules in (1) and (2) also effectively derive a class of principle C effects given the background assumption that reflexives and pronouns morphologically obscure an underlying copy of the antecedent.[6] The derivation, however, is not particularly deep.  By stipulation, the rules retain the higher copies and morphologically reshape the lower ones into pronouns and reflexives. This has the effect of blocking the derivation of sentences like Himself kissed Bill, He thinks that John is tall, and (if the rules are ordered before WH-movement, aka Question Formation) Who1 did he say t1 left. There are two noteworthy features of this account of principle C effects. First, as noted, it is not deep, for there is no reason why the rules could not have been stated so that the higher copy (optionally) gets morphologically transmogrified. Were this possible all the indicated unacceptable sentences would be fully grammatically generated.  Second, this version of principle C effects only holds for bound anaphors. It does not extend to co-referential dependencies, which fall outside the purview of this version of the binding theory.  This is not, in itself, a bad thing. As has been noted, there are well known “exceptions” to principle C where co-reference is tolerated. On the LK account, this is to be expected.[7]
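
The remark above that a non-clause-mate rider on (2) makes rule ordering unnecessary can be checked mechanically. The following is a sketch under simplifying assumptions: sentences are reduced to lists of (form, clause) pairs, identity is identity of form, and make_rule, the PRONOUN table and the other names are hypothetical.

```python
from itertools import permutations

PRONOUN = {"John": "him"}  # hypothetical toy lexicon

def make_rule(suffix, clausemates):
    """Build a rule that rewrites the second of two identical NPs.

    clausemates=True restricts the rule to NPs in the same clause;
    clausemates=False restricts it to NPs in different clauses (the rider).
    """
    def rule(nps):
        out = list(nps)
        for i, (form1, clause1) in enumerate(out):
            if form1 not in PRONOUN:
                continue
            for j in range(i + 1, len(out)):
                form2, clause2 = out[j]
                if form2 == form1 and (clause1 == clause2) == clausemates:
                    out[j] = (PRONOUN[form1] + suffix, clause2)
        return out
    return rule

reflexivization = make_rule("self", clausemates=True)
pronominalization_with_rider = make_rule("", clausemates=False)

def outputs(nps):
    """Surface forms produced under every ordering of the two rules."""
    rules = [reflexivization, pronominalization_with_rider]
    return {
        tuple(form for form, _ in second(first(nps)))
        for first, second in permutations(rules)
    }

print(outputs([("John", 0), ("John", 0)]))  # clause-mates
print(outputs([("John", 0), ("John", 1)]))  # different clauses
```

Each call returns a set with a single member, so both orderings converge on the same output: once the structural descriptions are disjoint, the order of application has nothing left to decide.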

In sum: for LK the syntax outputs antecedent-anaphor dependencies. This is explicitly and directly coded in the relevant binding rules.  The proposal has two central features: an economy condition in the guise of the preference for reflexivization over pronominalization and a distinction between formatives that enter derivations via rules like Lexical Insertion (e.g. words like cat, dog, the, this, deictic pronouns, etc.) and those that are the morphological by-products of rules of grammar (e.g. words like himself and certain bound hims that are morphological residues of established anaphoric dependencies).



[1] It must code more than this, however, for otherwise (2) could apply to the output of (1). To block this, it would suffice to assume that some kind of syntactic identity is also required, e.g. that the two be tokens of the same type. For further discussion c.f. Hornstein 2001 and note 2.
[2] Figuring out how to make this clear led to problems with the original LK account. For example, how exactly to code (i)?  It does not semantically express (ii).

(i)            Everyone hugged himself
(ii)          Everyone hugged everyone
Interestingly this problem for the LK theory has an answer in contemporary minimalist approaches if we take binding to be a chain relation.  In effect, the difference between the underlying form of (i) vs (ii) is that the latter has two selections of everyone in the numeration while the former has one. In other words, if we treat (1) and (2) as morphological spell out rules defined over chains, this problem disappears.  C.f. Drummond 2011, Idsardi and Lidz 1997 and Hornstein 2001 for discussion. We return to this point again later on.
[3] Lasnik’s 1976 proposal for an anti-co-reference rule is built around the problems regarding “accidental” co-reference that this fact entails.  Contemporary attempts to return to the LK vision have roughly followed Reinhart  in assuming that the possibility of grammatical binding restricts extra-grammatical co-reference options.
[4] We must still assume that they are obligatory, but this is to block principle C effects (e.g. John saw John, and John said that Mary likes John) rather than assure the complementarity of reflexives and bound pronouns.
[5] This is very much a Distributed Morphology conception, though in earlier theoretical guise.
[6] Recall that this assumption creates problems for quantified NP antecedents as remarked in footnote 2.
[7] C.f. Evans and Reinhart among others.  Note, in addition, that there are virtually no extant cases of inverse binding, i.e. where a pronoun is anaphorically dependent on an antecedent it c-commands.  Furthermore, even WCO configurations would seem to be underivable given the actual rules proposed.  Nice as this is, it is worth recalling that this empirical success arises from codifying the stipulation that it is the higher/leftmost copy that is retained and the lower/rightmost copy that gets morphologically altered.