MPL pattern set ‘max’

This is the pattern set ‘max’ described in Chapter 4 of my thesis.

# Match rules.
# These are regular expressions which allow variables to match individual pattern elements (or parts thereof).
# 'match' means the element must match the expression; '!match' means it must match anything but the expression.





# @ENTITY, @AGENT and @TARGET can be anything containing an entity:

match @ENTITY = Entity[a-z]{1,2}
match @AGENT = Entity[a-z]{1,2}
match @TARGET = Entity[a-z]{1,2}

# 'Which' and 'that' are fairly interchangeable:

match @WHICH = ^(which|that)$
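
# A minimal Python sketch of what the anchors change, assuming each rule is tested with search semantics (the expression may match anywhere in the element unless it is anchored). That reading fits "anything containing an entity" above, but it is an assumption, and the helper name matches() is hypothetical:

```python
import re

# Rule expressions copied from the match rules above.
ENTITY = r"Entity[a-z]{1,2}"
WHICH = r"^(which|that)$"

def matches(rule, element):
    """Hypothetical helper: True if the rule's regex is found anywhere in the element."""
    return re.search(rule, element) is not None

# Unanchored: any element containing an entity token matches.
print(matches(ENTITY, "Entityab"))       # True
print(matches(ENTITY, "Entityo_genes"))  # True
# Anchored: the whole element must be exactly 'which' or 'that'.
print(matches(WHICH, "that"))            # True
print(matches(WHICH, "thatch"))          # False
```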

# @NOUN is an unusual one as it matches POS tag nodes rather than words.
# For now, it's set to the same list as the default chunkable tags in NounPhraseChunker:

match @NOUN = ^(NN|NNS|NNP|NNPS|DT|CD|JJ|JJR|JJS|VBG|FW)$

# Same for @VERB which matches any verb POS:

match @VERB = ^VB.?$
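
# A quick Python check of which Penn Treebank tags these two expressions accept (a sketch only; the sample of tags is chosen for illustration). VBG satisfies both, which seems to be what the 'VBG, @NOUN' comments further down refer to:

```python
import re

# Expressions copied from the @NOUN and @VERB rules above.
NOUN = r"^(NN|NNS|NNP|NNPS|DT|CD|JJ|JJR|JJS|VBG|FW)$"
VERB = r"^VB.?$"

# A small sample of Penn Treebank POS tags.
tags = ["VB", "VBD", "VBG", "VBN", "VBP", "VBZ", "NN", "NNS", "JJ", "MD", "IN"]

verb_tags = [t for t in tags if re.search(VERB, t)]
noun_tags = [t for t in tags if re.search(NOUN, t)]

print(verb_tags)  # ['VB', 'VBD', 'VBG', 'VBN', 'VBP', 'VBZ']
print(noun_tags)  # ['VBG', 'NN', 'NNS', 'JJ']
```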

# Let's allow these noun placeholders to be anything for now, and we can narrow them down later if we need to:

# !match @ATTRIBUTE = ^Entity[a-z]{1,2}$
# !match @FAMILY = ^Entity[a-z]{1,2}$
# !match @LOCATION = ^Entity[a-z]{1,2}$
# !match @PROCESS = ^Entity[a-z]{1,2}$
# !match @ROLE = ^Entity[a-z]{1,2}$
# !match @SUBSTANCE = ^Entity[a-z]{1,2}$

match @ATTRIBUTE = .*
match @FAMILY = .*
match @LOCATION = .*
match @PROCESS = .*
match @ROLE = .*
match @SUBSTANCE = .*
match @REGULATOR = .*
match @MODIFICATION = .*
match @CHANGE = .*
match @EFFECT = .*

# @REGULON can be singular or plural:

# match @REGULON = regulon




# Here are the interaction keywords that we're considering.
# Add replacement rule later to append co- to these!

# @AFFECT etc. for more-or-less-specific direct-object "A affects B" type relations. Note that some have quite different biological semantics (compare 'determines' and 'downregulates') but they are grouped by similar syntactic requirements:

# VB, VBP
match @AFFECT = ^(activate|affect|antagonise|antagonize|attenuate|bind|block|cause|control|convert|dephosphorylate|determine|direct|down-regulate|downregulate|express|drive|govern|inactivate|increase|induce|inhibit|initiate|limit|mediate|modify|modulate|perturb|phosphorylate|precede|prevent|produce|recognise|recognize|reduce|regulate|repress|stimulate|target|terminate|transcribe|up-regulate|upregulate)$

# VBD, VBN
match @AFFECTED = ^(activated|affected|antagonised|antagonized|attenuated|bound|blocked|caused|controlled|converted|dephosphorylated|determined|directed|down-regulated|downregulated|expressed|driven|governed|inactivated|increased|induced|inhibited|initiated|limited|mediated|modified|modulated|perturbed|phosphorylated|preceded|prevented|produced|recognised|recognized|reduced|regulated|repressed|stimulated|targeted|terminated|transcribed|up-regulated|upregulated)$

# VBZ
match @AFFECTS = ^(activates|affects|antagonises|antagonizes|attenuates|binds|blocks|causes|controls|converts|dephosphorylates|determines|directs|down-regulates|downregulates|expresses|drives|governs|inactivates|increases|induces|inhibits|initiates|limits|mediates|modifies|modulates|perturbs|phosphorylates|precedes|prevents|produces|recognises|recognizes|reduces|regulates|represses|stimulates|targets|terminates|transcribes|up-regulates|upregulates)$

# VBG, @NOUN
match @AFFECTING = ^(activating|affecting|antagonising|antagonizing|attenuating|binding|causing|blocking|controlling|converting|dephosphorylating|determining|directing|down-regulating|downregulating|expressing|driving|governing|inactivating|increasing|inducing|inhibiting|initiating|limiting|mediating|modifying|modulating|perturbing|phosphorylating|preceding|preventing|producing|recognising|recognizing|reducing|regulating|repressing|stimulating|targeting|terminating|transcribing|up-regulating|upregulating)$

# @REQUIRE etc. for 'backward' verbal relationships like "A requires B":

# VB, VBP
match @REQUIRE = ^(require|need)$

# VBD, VBN
match @REQUIRED = ^(required|needed)$

# VBZ
match @REQUIRES = ^(requires|needs)$

# VBG, @NOUN
match @REQUIRING = ^(requiring|needing)$

# @RESPONSIBLE for 'forward' adjectival relationships like "A is responsible for B":

match @RESPONSIBLE = ^(responsible|essential|necessary|sufficient|required|important|vital)$

# @DEPENDENT for 'backward' adjectival relationships like "A is dependent on B":

match @DEPENDENT = ^(dependent|reliant|driven)$

# @SIMILAR for sequence/structure similarity:

match @SIMILAR = ^(similar|homologous|orthologous|paralogous|identical)$

# @SENSITIVE for, well, 'sensitive' -- there must be other adjectives that take a to-complement?

match @SENSITIVE = sensitive

# @DEPEND etc. for verbed versions of the latter like "A depends on B"

# VB, VBP
match @DEPEND = ^(depend|rely)$

# VBD, VBN
match @DEPENDED = ^(depended|relied)$

# VBZ
match @DEPENDS = ^(depends|relies)$

# VBG, @NOUN
match @DEPENDING = ^(depending|relying)$

# @CONTRIBUTE etc. for verbs that take a to-object like "A contributes to B". Note that some have quite different biological semantics (compare 'contribute to' and 'bind to') but they are grouped by similar syntactic requirements:

# VB, VBP
match @CONTRIBUTE = ^(contribute|bind)$

# VBD, VBN
match @CONTRIBUTED = ^(contributed|bound)$

# VBZ
match @CONTRIBUTES = ^(contributes|binds)$

# VBG, @NOUN
match @CONTRIBUTING = ^(contributing|binding)$

# @SHOW etc. for rhetorical verbs that take a "that"-complement. Any others??

# VB, VBP
match @SHOW = ^(show|demonstrate|prove|determine|suggest|find)$

# VBD, VBN
match @SHOWED = ^(showed|shown|demonstrated|proved|determined|suggested|found)$

# VBZ
match @SHOWS = ^(shows|demonstrates|proves|determines|suggests|finds)$

# VBG, @NOUN
match @SHOWING = ^(showing|demonstrating|proving|determining|suggesting|finding)$

# @APPEAR for rhetorical verbs that take an infinitive complement.

# VB, VBP
match @APPEAR = ^(appear|seem|think|believe)$

# VBD, VBN
match @APPEARED = ^(appeared|seemed|thought|believed)$

# VBZ
match @APPEARS = ^(appears|seems|thinks|believes)$

# VBG, @NOUN
match @APPEARING = ^(appearing|seeming|thinking|believing)$

# @PLAY for... "play". Are there any others in this class?

# VB, VBP
match @PLAY = ^(play)$

# VBD, VBN
match @PLAYED = ^(played)$

# VBZ
match @PLAYS = ^(plays)$

# VBG, @NOUN
match @PLAYING = ^(playing)$

# @INCLUDE for... "include". Are there any others in this class?

# VB, VBP
match @INCLUDE = ^(include)$

# VBD, VBN
match @INCLUDED = ^(included)$

# VBZ
match @INCLUDES = ^(includes)$

# VBG, @NOUN
match @INCLUDING = ^(including)$

# @RESULT for... "result" (verb). Are there any others in this class?

# VB, VBP
match @RESULT = ^(result)$

# VBD, VBN
match @RESULTED = ^(resulted)$

# VBZ
match @RESULTS = ^(results)$

# VBG, @NOUN
match @RESULTING = ^(resulting)$

# @ACT for verbs that take an "as" or "like" complement. Any others?

# VB, VBP
match @ACT = ^(act|work|behave)$

# VBD, VBN
match @ACTED = ^(acted|worked|behaved)$

# VBZ
match @ACTS = ^(acts|works|behaves)$

# VBG, @NOUN
match @ACTING = ^(acting|working|behaving)$



# Some dependency type variables

# Allowable equivalents for prep_by:

match @PREP_BY = ^(prep_by|prep_through|prep_via)$

# And for prep_in:

match @PREP_IN = ^(prep_in|prep_within)$

# Match any preposition, just in case:

match @PREP = ^prep_.*$

# And for nsubj. prep_like is a hack to deal with things like "A, like B, affects...", and prep_than covers things like "more sensitive than":

match @NSUBJ = ^(nsubj|xsubj|prep_like|prep_than)$

# Also for nsubjpass (prep_like as above):

match @NSUBJPASS = ^(nsubjpass|prep_like)$






# Special variable to match anything:

match @@ = .*

# Special variable to match any sequence of characters not containing an underscore (useful in composites):

match @CHARS = [^_]+







# Replacement rules.

# These are applied to each pattern to generate variations, once the patterns are all read.

# The string on the left of = is replaced with the string on the right. Any part of a pattern may be edited with a replacement rule; this is performed at the character level on the pattern's string definition, before it is compiled.

# For each pattern, we apply each replacement rule IN THE ORDER PRESENTED HERE. If n new patterns P{1...n} are created by the application of a rule R to an existing pattern O, the rest of the rules after R are applied in order to each new P. The rules before R are not applied. This avoids recursion.

# It is perfectly permissible to repeat a rule so that it is applied (or may potentially be applied) more than once. This may indeed be necessary in order to model certain linguistic phenomena.

# By convention, an @ sign is used to mark variables which may be replaced via one of these rules. Variables marked with a # sign are used wherever replacement is disallowed, e.g. in very specific constructions or inside composite elements.
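
# The ordering described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual implementation: the function name expand() is hypothetical, rules are plain (left, right) string pairs, and one new pattern is generated per occurrence of the left-hand string.

```python
def expand(pattern, rules, start=0):
    """Return the pattern followed by every variant that rules[start:] generate from it."""
    results = [pattern]
    for i in range(start, len(rules)):
        old, new = rules[i]
        pos = pattern.find(old)
        while pos != -1:
            # Assumption: one new pattern per occurrence of the left-hand string.
            variant = pattern[:pos] + new + pattern[pos + len(old):]
            # Only the rules after this one are applied to the new pattern.
            results.extend(expand(variant, rules, i + 1))
            pos = pattern.find(old, pos + 1)
    return results

# Three rules taken from the list below, in the order they appear there.
rules = [
    ("@AGENT", "@SUBSTANCE ( nn @NOUN~~@AGENT )"),
    ("VB~~@AFFECT", "VBZ~~@AFFECTS"),
    ("VB~~@AFFECT", "VBD~~@AFFECTED"),
]
pattern = "VB~~@AFFECT ( @NSUBJ @NOUN~~@AGENT ) ( dobj @NOUN~~@TARGET )"

for p in expand(pattern, rules):
    print(p)
```

# Under these assumptions the loop prints six patterns: the original, its @SUBSTANCE variant, two tense variants of that variant, and two tense variants of the original. The first rule is never re-applied to its own output, even though that output still contains @AGENT.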

replace @AGENT = @SUBSTANCE ( dep VBD~~called ( dep @NOUN~~@AGENT ) )

# Candidate for commenting out to save memory (currently active):
replace @TARGET = @SUBSTANCE ( dep VBD~~called ( dep @NOUN~~@TARGET ) )

replace @TARGET = @SUBSTANCE ( dep @NOUN~~@TARGET )

# Keeping this commented for now to save memory
# replace @AGENT = @SUBSTANCE ( dep @NOUN~~@AGENT ) # 2 FPs

# Added these to deal with cases that should have been chunked

replace @TARGET = @SUBSTANCE ( nn @NOUN~~@TARGET )

# Candidate for commenting out to save memory (currently active):
replace @AGENT = @SUBSTANCE ( nn @NOUN~~@AGENT )

replace @AGENT = @SUBSTANCE ( prep_with @NOUN~~@AGENT )

replace @TARGET = @REGULATOR ( prep_for @NOUN~~@TARGET )

replace @TARGET = @CHANGE ( @PREP_IN @NOUN~~@TARGET )

replace @AGENT = @PROCESS ( prep_of @NOUN~~@AGENT )
replace @TARGET = @PROCESS ( prep_of @NOUN~~@TARGET )

replace @PROCESS ( prep_of @NOUN~~@TARGET ) = @PROCESS ( @PREP_IN @NOUN~~@PROCESS ( prep_of @NOUN~~@TARGET ) )

replace VBZ~~is = VBP~~are
replace VBZ~~is = VBD~~were
replace VBZ~~is = VBD~~was

replace VB~~have = VBP~~have
replace VB~~have = VBZ~~has
replace VB~~have = VBD~~had
replace VB~~have = VBG~~having

replace VB~~belong = VBP~~belong
replace VB~~belong = VBZ~~belongs
replace VB~~belong = VBD~~belonged
replace VB~~belong = VBG~~belonging

replace VB~~@AFFECT = VBP~~@AFFECT
replace VB~~@AFFECT = VBZ~~@AFFECTS
replace VB~~@AFFECT = VBD~~@AFFECTED
replace VB~~@AFFECT = VBG~~@AFFECTING

replace VB~~@REQUIRE = VBP~~@REQUIRE
replace VB~~@REQUIRE = VBZ~~@REQUIRES
replace VB~~@REQUIRE = VBD~~@REQUIRED
replace VB~~@REQUIRE = VBG~~@REQUIRING

replace VB~~@DEPEND = VBP~~@DEPEND
replace VB~~@DEPEND = VBZ~~@DEPENDS
replace VB~~@DEPEND = VBD~~@DEPENDED
replace VB~~@DEPEND = VBG~~@DEPENDING

replace VB~~@PLAY = VBP~~@PLAY
replace VB~~@PLAY = VBZ~~@PLAYS
replace VB~~@PLAY = VBD~~@PLAYED
replace VB~~@PLAY = VBG~~@PLAYING

replace VB~~@INCLUDE = VBP~~@INCLUDE
replace VB~~@INCLUDE = VBZ~~@INCLUDES
replace VB~~@INCLUDE = VBD~~@INCLUDED
replace VB~~@INCLUDE = VBG~~@INCLUDING

replace VB~~@RESULT = VBP~~@RESULT
replace VB~~@RESULT = VBZ~~@RESULTS
replace VB~~@RESULT = VBD~~@RESULTED
replace VB~~@RESULT = VBG~~@RESULTING

replace VB~~@APPEAR = VBP~~@APPEAR
replace VB~~@APPEAR = VBZ~~@APPEARS
replace VB~~@APPEAR = VBD~~@APPEARED
replace VB~~@APPEAR = VBG~~@APPEARING

replace VB~~@ACT = VBP~~@ACT
replace VB~~@ACT = VBZ~~@ACTS
replace VB~~@ACT = VBD~~@ACTED
replace VB~~@ACT = VBG~~@ACTING

replace VB~~@CONTRIBUTE = VBP~~@CONTRIBUTE
replace VB~~@CONTRIBUTE = VBZ~~@CONTRIBUTES
replace VB~~@CONTRIBUTE = VBD~~@CONTRIBUTED
replace VB~~@CONTRIBUTE = VBG~~@CONTRIBUTING

# An exception to the usual convention about not replacing #-variables

replace #AGENT_regulon = #AGENT_and_#ENTITY_regulon

# Added these in to cope with some co-reference/ellipsis phenomena:

# replace @AGENT = @SUBSTANCE ( @@ VBD~~called ( @@ @NOUN~~@AGENT ) )
# replace @TARGET = @SUBSTANCE ( @@ VBD~~called ( @@ @NOUN~~@TARGET ) )



# Why does this replacement rule stop the correct interaction being extracted from sentence 13 (10476035-10)? Surely, any replacement operation must leave the original pattern intact? Does this indicate a bug somewhere? Investigate this phenomenon later.

# replace @TARGET = @PROCESS ( prep_from @NOUN~~@TARGET )

# Similar things happen with these ones:

# replace @AGENT = @SUBSTANCE ( dep @NOUN~~@AGENT )
# replace @TARGET = @SUBSTANCE ( dep @NOUN~~@TARGET )



# Adjective-driven patterns

# "A is responsible for B":

pattern
JJ~~@RESPONSIBLE
	( @NSUBJ @NOUN~~@AGENT )
	( prep_for @NOUN~~@TARGET )
end

# "A is dependent on B":

pattern
JJ~~@DEPENDENT
	( prep_on @NOUN~~@AGENT )
	( @NSUBJ @NOUN~~@TARGET )
end

# "A is sensitive to B":

pattern
JJ~~@SENSITIVE
	( prep_to @NOUN~~@AGENT )
	( @NSUBJ @NOUN~~@TARGET )
end





# Simple verb-driven patterns

# "A [can/may/will/etc.] affect[s/ed/ing] B":

pattern
VB~~@AFFECT
	( @NSUBJ @NOUN~~@AGENT )
	( dobj @NOUN~~@TARGET )
end
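
# A rough Python sketch of how a pattern like the one above could be tested against a dependency graph. Everything here is an assumption made for illustration: the Node class, the fits() and match() helpers, the abridged @AFFECT list and the example sentence are all hypothetical, and the real matcher may differ (for instance in whether two pattern branches may bind the same graph node).

```python
import re
from dataclasses import dataclass, field

# Rule expressions copied from the match rules above (@AFFECT is abridged).
RULES = {
    "@AFFECT": r"^(activate|affect|inhibit|regulate)$",
    "@NOUN": r"^(NN|NNS|NNP|NNPS|DT|CD|JJ|JJR|JJS|VBG|FW)$",
    "@NSUBJ": r"^(nsubj|xsubj|prep_like|prep_than)$",
    "@AGENT": r"Entity[a-z]{1,2}",
    "@TARGET": r"Entity[a-z]{1,2}",
}

@dataclass
class Node:
    tag: str    # POS tag, or a variable such as @NOUN in a pattern
    word: str   # word, or a variable such as @AGENT in a pattern
    children: list = field(default_factory=list)  # (dependency, Node) pairs

def fits(spec, text):
    """A variable is looked up as a regex; anything else must be equal literally."""
    if spec in RULES:
        return re.search(RULES[spec], text) is not None
    return spec == text

def match(pat, node):
    """Return the variable bindings if the pattern matches at this node, else None."""
    if not (fits(pat.tag, node.tag) and fits(pat.word, node.word)):
        return None
    found = {pat.word: node.word} if pat.word in ("@AGENT", "@TARGET") else {}
    for p_dep, p_child in pat.children:
        for dep, child in node.children:
            sub = match(p_child, child) if fits(p_dep, dep) else None
            if sub is not None:
                found.update(sub)
                break
        else:
            return None  # no graph child satisfied this pattern branch
    return found

# Pattern: VB~~@AFFECT ( @NSUBJ @NOUN~~@AGENT ) ( dobj @NOUN~~@TARGET )
pattern = Node("VB", "@AFFECT", [
    ("@NSUBJ", Node("@NOUN", "@AGENT")),
    ("dobj", Node("@NOUN", "@TARGET")),
])

# Made-up graph for "Entityab can activate Entitycd".
graph = Node("VB", "activate", [
    ("nsubj", Node("NNP", "Entityab")),
    ("aux", Node("MD", "can")),
    ("dobj", Node("NNP", "Entitycd")),
])

print(match(pattern, graph))  # {'@AGENT': 'Entityab', '@TARGET': 'Entitycd'}
```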

# The passive equivalent

# "A [is/was/etc.] affected by B":

pattern
VBN~~@AFFECTED
	( agent @NOUN~~@AGENT )
	( @NSUBJPASS @NOUN~~@TARGET )
end

# The 'rhetorical-passive' equivalent

# "A [is/was/etc.] believed to be affected by B":

pattern
VBN~~@APPEARED
	( xcomp VBN~~@AFFECTED ( agent @NOUN~~@AGENT ) )
	( @NSUBJPASS @NOUN~~@TARGET )
end

# "A contributes to B" etc.:

pattern
VB~~@CONTRIBUTE
	( @NSUBJ @NOUN~~@AGENT )
	( prep_to @NOUN~~@TARGET )
end

# "A binds to a site on B" etc.:

pattern
VB~~@CONTRIBUTE
	( @NSUBJ @NOUN~~@AGENT )
	( prep_to @NOUN~~@LOCATION ( prep_on @NOUN~~@TARGET ) )
end




# Reverse verb-driven patterns

# "A [can/may/will] require B":

pattern
VB~~@REQUIRE
	( dobj @NOUN~~@AGENT )
	( @NSUBJ @NOUN~~@TARGET )
end

# "A [can/may/will] require [process] of B":

pattern
VB~~@REQUIRE
	( dobj @NOUN~~@PROCESS ( prep_of @NOUN~~@AGENT ) )
	( @NSUBJ @NOUN~~@TARGET )
end

# The passive equivalent

# "A [is/was/etc.] required by B":

pattern
VBN~~@REQUIRED
	( @NSUBJPASS @NOUN~~@AGENT )
	( agent @NOUN~~@TARGET )
end

# "A [can/will/etc.] depend on B":

pattern
VB~~@DEPEND
	( prep_on @NOUN~~@AGENT )
	( @NSUBJ @NOUN~~@TARGET )
end

# "A results from B"

pattern
VB~~@RESULT
	( prep_from @NOUN~~@AGENT )
	( @NSUBJ @NOUN~~@TARGET )
end





# Copular patterns

# "A is under the control of B":

pattern
VBZ~~is
	( prep_under @NOUN~~@AGENT )
	( @NSUBJ @NOUN~~@TARGET )
end




# "acts as" type patterns

# "A acts as B":

pattern
VB~~@ACT
	( @NSUBJ @NOUN~~@AGENT )
	( prep_as @NOUN~~@TARGET )
end





# Membership of group patterns

# "A belongs to an increasing number of B-dependent genes" (quite specific):

pattern
VB~~belong
	( prep_to @NOUN~~@ROLE ( prep_of @NOUN~~{#AGENT_@DEPENDENT_@@} ) )
	( @NSUBJ @NOUN~~@TARGET )
end




# Nominal patterns not using composites

# "modification of A by B":

pattern
@NOUN~~@MODIFICATION
	( prep_of @NOUN~~@TARGET )
	( @PREP_BY @NOUN~~@AGENT )
end

# As above, but with prepositional attachment error!

pattern
@NOUN~~@MODIFICATION
	( prep_of @NOUN~~@TARGET ( @PREP_BY @NOUN~~@AGENT ) )
end




# More complex phrasal patterns

# "A plays a role in B" etc. @PROCESS stands in for "role" expressions:

pattern
VB~~@PLAY
	( @NSUBJ @NOUN~~@AGENT )
	( dobj @NOUN~~@PROCESS ( @PREP_IN @NOUN~~@TARGET ) )
end

# "In addition to B, A appears..." @PROCESS stands in for "addition" but we might need to tighten this up later:

pattern
VB~~@APPEAR
	( @NSUBJ @NOUN~~@AGENT )
	( @PREP_IN @NOUN~~@PROCESS ( prep_to @NOUN~~@TARGET ) )
end

# Object as agent of other interaction -- "[X blocks] the capacity of A to activate B" etc. The #-form of the variable is used to stop AFFECT being switched for AFFECTED (etc.), as this only works with infinitives:

pattern
VB~~@AFFECT
	( dobj @NOUN~~@AGENT )
	( xcomp VB~~#AFFECT ( dobj @NOUN~~@TARGET ) )
end

# "A [affects X] by affecting B":

pattern
VB~~@AFFECT
	( @NSUBJ @NOUN~~@AGENT )
	( @PREP_BY VBG~~@AFFECTING ( dobj @NOUN~~@TARGET ) )
end

# "A [controls X] thereby affecting B":

pattern
VB~~@AFFECT
	( @NSUBJ @NOUN~~@AGENT )
	( xcomp VBG~~@AFFECTING ( dobj @NOUN~~@TARGET ) )
end

# "A was affected in [X] in a [B] dependent manner" -- might need to tighten this up:

pattern
VBN~~@AFFECTED
	( @PREP_IN @NOUN~~@LOCATION ( @PREP_IN @NOUN~~@AGENT ) )
	( @NSUBJ @NOUN~~@TARGET )
end

# "A was affected via B" -- actually this models, in general, pseudo-passives or cases where a passive has been graphed wrongly:

pattern
VBN~~@AFFECTED
	( @PREP_BY @NOUN~~@AGENT )
	( @NSUBJ @NOUN~~@TARGET )
	( auxpass @@~~@@ )
end

# "A [binds to X] and binds to B" -- this reflects one drawback of this approach: the difficulty of automatically generating such conjunctions. Should probably anticipate more cases like this!!!

pattern
VB~~@CONTRIBUTE
	( @NSUBJ @NOUN~~@AGENT )
	( conj_and VB~~@CONTRIBUTE ( prep_to @NOUN~~@TARGET ) )
end

# "A binds near B, to act as a repressor":

pattern
VB~~@CONTRIBUTE
	( xcomp VB~~@ACT
		( xsubj @NOUN~~@AGENT )
		( prep_as @NOUN~~@ROLE ) )
	( prep_near @NOUN~~@TARGET )
end

# "A is sufficient to repress B" etc. There's a bit of a hack here; 'responsible' doesn't work in this case (it needs to take a to-complement) but we've put 'sufficient', 'essential' etc. in the same variable as 'responsible' so never mind.

pattern
JJ~~@RESPONSIBLE
	( @NSUBJ @NOUN~~@AGENT )
	( dep VB~~@AFFECT ( dobj @NOUN~~@TARGET ) )
end

# "A is regulated by relying on B" etc.:

pattern
VBN~~@AFFECTED
	( prep IN~~by ( dep VBG~~@DEPENDING ( prep_on @NOUN~~@AGENT ) ) )
	( nsubjpass @NOUN~~@TARGET )
end

# "A is regulated by acting at B" etc.:

pattern
VBN~~@AFFECTED
	( prep IN~~by ( dep VBG~~@ACTING ( prep_at @NOUN~~@AGENT ) ) )
	( nsubjpass @NOUN~~@TARGET )
end

# "A is expressed during X from Y" where Y is attached to X:

pattern
VBN~~@AFFECTED
	( prep_during @NOUN~~@PROCESS ( prep_from @NOUN~~@AGENT ) )
	( nsubjpass @NOUN~~@TARGET )
end







# Patterns involving composites and other nodes

# "The A regulon includes B" etc.:

pattern
VB~~@INCLUDE
	( @NSUBJ @NOUN~~{#AGENT_regulon} )
	( dobj @NOUN~~@TARGET )
end

# "A-dependent expression of B" etc.:

pattern
@NOUN~~{#AGENT_@DEPENDENT_@PROCESS}
	( prep_of @NOUN~~@TARGET )
end

# "-35 sequences of A- and X- dependent promoters of B" etc. (phew):

pattern
@NOUN~~@LOCATION
	( prep_of @NOUN~~@AGENT )
	( conj_and @NOUN~~{#ENTITY_@DEPENDENT_@PROCESS} ( prep_of @NOUN~~@TARGET ) )
end

# "A being a member of the B regulon" etc.:

pattern
@NOUN~~@TARGET
	( dep VBG~~being ( dobj @NOUN~~@ROLE ( prep_of @NOUN~~{#AGENT_regulon} ) ) )
end

# "A is a member of the B regulon" etc.:

pattern
@NOUN~~@ROLE
	( prep_of @NOUN~~{#AGENT_regulon} )
	( @NSUBJ @NOUN~~@TARGET )
end

# "A is similar to X which is a member of the C regulon" -- not sure this is really a correct relationship!

pattern
JJ~~@SIMILAR
	( prep_to @NOUN~~@ENTITY ( rcmod @NOUN~~@ROLE ( prep_of @NOUN~~{#AGENT_regulon} ) ) )
	( @NSUBJ @NOUN~~@TARGET )
end

# "X is a Y-dependent gene" etc.:

pattern
@NOUN~~{#AGENT_dependent}
	( @NSUBJ @NOUN~~@TARGET )
end







# Patterns involving just composites

# "A-dependent B":

pattern
@NOUN~~{#AGENT_@DEPENDENT_#TARGET}
end

# "A-dependent ... B" -- maybe we can figure out a more elegant way to do this later!

pattern
@NOUN~~{#AGENT_@DEPENDENT_@CHARS_#TARGET}
end

pattern
@NOUN~~{#AGENT_@DEPENDENT_@CHARS_@CHARS_#TARGET}
end

pattern
@NOUN~~{#AGENT_@DEPENDENT_@CHARS_@CHARS_@CHARS_#TARGET}
end

pattern
@NOUN~~{#AGENT_@DEPENDENT_@CHARS_@CHARS_@CHARS_@CHARS_#TARGET}
end

pattern
@NOUN~~{#AGENT_@DEPENDENT_@CHARS_@CHARS_@CHARS_@CHARS_@CHARS_#TARGET}
end

pattern
@NOUN~~{#AGENT_@DEPENDENT_@CHARS_@CHARS_@CHARS_@CHARS_@CHARS_@CHARS_#TARGET}
end

pattern
@NOUN~~{#AGENT_@DEPENDENT_@CHARS_@CHARS_@CHARS_@CHARS_@CHARS_@CHARS_@CHARS_#TARGET}
end
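
# The eight patterns above differ only in how many @CHARS elements sit between the dependency adjective and the target (zero to seven). A Python sketch of the node texts they cover between them, assuming composite elements are joined by single underscores and each element is tested as a regex. The regex construction, the helper names and the example node texts are made up for illustration; this is not how MPL compiles composites.

```python
import re

# Element expressions based on the @ENTITY, @DEPENDENT and @CHARS rules above.
ENTITY = r"Entity[a-z]{1,2}"
DEPENDENT = r"(?:dependent|reliant|driven)"
CHARS = r"[^_]+"

def composite(n_chars):
    """Regex for agent, @DEPENDENT, n_chars x @CHARS, target, joined by '_'."""
    parts = [f"({ENTITY})", DEPENDENT] + [CHARS] * n_chars + [f"({ENTITY})"]
    return "^" + "_".join(parts) + "$"

# One regex per pattern above: zero to seven intervening elements.
patterns = [re.compile(composite(n)) for n in range(8)]

def agent_target(text):
    """Return (agent, target) from the first composite that matches, else None."""
    for p in patterns:
        m = p.match(text)
        if m:
            return m.group(1), m.group(2)
    return None

print(agent_target("Entitybw_dependent_Entitycc"))                  # ('Entitybw', 'Entitycc')
print(agent_target("Entitybw_dependent_operon_promoter_Entitycc"))  # ('Entitybw', 'Entitycc')
```

# In a regex the same coverage could be written as one bounded repetition, (?:[^_]+_){0,7}, which might be a starting point for the more elegant approach if the pattern language were extended to allow it.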






# Awkward sentence-specific ones (many of these are probably errors)


# Sentence 2, 11064201-3

# Entitybc -> Entitycc

pattern
@NOUN~~@TARGET
	( dep VBZ~~@DEPENDS ( prep_on @NOUN~~@AGENT ) )
end

# Entitybw -> Entitycc

pattern
VBN~~@AFFECTED
	( @PREP_IN @NOUN~~@AGENT )
	( @PREP_IN @NOUN~~@PROCESS ( prep_of @NOUN~~@TARGET ) )
end


# Sentence 6, 10767540-2

# Entitybk -> Entitydm and Entitybj -> Entitydm:

pattern
VBD~~@SHOWED
	( ccomp VBN~~@AFFECTED ( agent @NOUN~~@AGENT ) )
	( @NSUBJ @NOUN~~@PROCESS ( prep_of @NOUN~~@PROCESS ( prep_of @NOUN~~@TARGET ) ) )
end


# Sentence 14, 10468601-1

# "A is transcribed by... and requires for expression B" (Entityba -> Entityaz):

pattern
VBN~~@AFFECTED
	( conj_and VB~~@REQUIRE ( prep_for @NOUN~~@AGENT ) )
	( @NSUBJPASS @NOUN~~@TARGET )
end


# Sentence 18, 10383978-4

# "... does affect A, which ... is an activator of B" (Entitybw -> Entitybh):

pattern
@NOUN~~@TARGET
	( ccomp VB~~@AFFECT ( dobj @NOUN~~@AGENT ) )
end


# Sentence 23, 10200961-2

# "A, which is activated through B" (Entitycb -> Entitybj):

pattern
@NOUN~~@TARGET
	( dep VBZ~~is ( ccomp VBN~~@AFFECTED ( @PREP_BY @NOUN~~@AGENT ) ) )
end


# Sentence 26, 10075739-8

# "A acts as a repressor to limit B" (Entityad -> Entitybm):

pattern
VB~~@ACT
	( @NSUBJ @NOUN~~@AGENT )
	( prep_as @NOUN~~@ROLE ( dep VB~~@AFFECT ( dobj @NOUN~~@TARGET ) ) )
end


# Sentence 27, 10075739-11

# "[X activated] transcription of A by B" but with "by" attached to "activated" instead of "transcription" (Entitybm -> Entityn):

pattern
VB~~@AFFECT
	( dobj @NOUN~~@TARGET )
	( @PREP_BY @NOUN~~@AGENT )
end

# This one is based on the same fragment as before, but I don't think it implies the relationship extracted by the LLL curators. Decide for yourself. The construction is "A activated [transcription of X] by B" and the relationship is A -> B (Entityad -> Entitybm). Of course the attachment error still applies...

pattern
VB~~@AFFECT
	( @NSUBJ @NOUN~~@AGENT )
	( @PREP_BY @NOUN~~@TARGET )
end


# Sentence 30, 9696775-12

# "[the production of] A that inhibits B" (Entityz -> Entitybg) -- there's an attachment error here somewhere:

pattern
@NOUN~~@PROCESS
	( prep_of @NOUN~~@AGENT )
	( dep VB~~@AFFECT ( dobj @NOUN~~@TARGET ) )
end


# Sentence 32, 9636707-8

# "A, a transferase controlled by B" (Entitybe -> Entityp) -- shouldn't this be an appos?

pattern
@NOUN~~@TARGET
	( dep @NOUN~~@SUBSTANCE ( dep VBN~~@AFFECTED ( agent @NOUN~~@AGENT ) ) )
end


# Sentence 33, 9271869-4

# "A is reduced ... or after starvation for glucose in a B-dependent manner" (Entitybe -> Entityat) -- bad co-ordination:

pattern
VBZ~~is
	( dep VBN~~@AFFECTED ( conj_or IN~~after ( pobj @NOUN~~@PROCESS ( prep_for @NOUN~~@SUBSTANCE ( @PREP_IN @NOUN~~@AGENT ) ) ) ) )
	( @NSUBJ @NOUN~~@TARGET )
end


# Sentence 35, 10747015-5

# This is cos of a chunking error (Entityab -> Entitycb):

pattern
JJ~~@DEPENDENT
	( dobj @NOUN~~@AGENT )
	( @NSUBJ @NOUN~~@TARGET )
end


# Sentence 36, 10094682-7

# Awkward passive conjunction thing (Entitybp -> Entityf):

pattern
VBN~~@AFFECTED
	( conj_and VBD~~@DEPENDED ( prep_upon @NOUN~~@AGENT ) )
	( @NSUBJPASS @NOUN~~@TARGET )
end


# Sentence 41, 10075739-9

# Not sure if this is an error or just really complex

pattern
VB~~@CONTRIBUTE
	( @NSUBJ @NOUN~~@AGENT )
	( prep_to @NOUN~~@LOCATION ( dep VBP~~span ( xcomp @NOUN~~@LOCATION ( nsubj @NOUN~~@TARGET ) ) ) )
end


# Sentence 42, 9846747-4

# Definitely a parse error (co-ordination mistake)

pattern
VB~~@AFFECT
	( @NSUBJ @NOUN~~@AGENT )
	( ccomp VBN~~@AFFECTED ( nsubjpass @NOUN~~@TARGET ) )
end


# Sentence 47, 9401028-7

# This is unusual because the target is "expression from the A promoter". Not a common way of phrasing things. Also it's one of those passives that hasn't been correctly graphed.

pattern
VBZ~~is
	( ccomp VBN~~@AFFECTED ( @PREP_BY @NOUN~~@AGENT ) )
	( @NSUBJ @NOUN~~@PROCESS ( prep_from @NOUN~~@TARGET ) )
end


# Sentence 48, 10400595-1

# Long distance dependency: "Entityad stimulates transcription from several promoters that are used by RNA polymerase containing Entitybm":

pattern
VB~~@AFFECT
	( @NSUBJ @NOUN~~@AGENT )
	( dobj @NOUN~~@PROCESS ( prep_from @NOUN~~@LOCATION ( dep VBN~~used ( agent @NOUN~~@SUBSTANCE ( dep VBG~~containing ( dobj @NOUN~~@TARGET ) ) ) ) ) )
end


# Sentence 57, 10788508-2

# There is a chunking error here that has detached the target (Entityl_,_Entitym) from the parent node (Entityo_genes) -- wonder why?

pattern
@NOUN~~@PROCESS
	( @PREP_BY @NOUN~~@AGENT )
	( prep_of @NOUN~~@SUBSTANCE ( dep @NOUN~~@TARGET ) )
end

# The same error applies because the same three target genes are also targets of another gene (Entityad):

pattern
VBZ~~is
	( ccomp VBN~~@AFFECTED ( @PREP_BY @NOUN~~@AGENT ) )
	( nsubj @NOUN~~@PROCESS ( prep_of @NOUN~~@SUBSTANCE ( dep @NOUN~~@TARGET ) ) )
end

# And here's the non-broken version:

pattern
VBZ~~is
	( ccomp VBN~~@AFFECTED ( @PREP_BY @NOUN~~@AGENT ) )
	( nsubj @NOUN~~@PROCESS ( prep_of @NOUN~~@TARGET ) )
end


# Sentence 59, 10788508-8

# Unfortunately a VBD here has been tagged as a VBN which means we have to re-write this basic pattern (if we just allow either, we get false positives):

pattern
VBN~~@AFFECTED
	( @NSUBJ @NOUN~~@AGENT )
	( dobj @NOUN~~@TARGET )
end

# And then there's some trickier ones because we're really describing a three-way relationship here...

pattern
VBN~~@AFFECTED
	( @PREP_BY @NOUN~~@AGENT )
	( dobj @NOUN~~@TARGET )
end

pattern
VBN~~@AFFECTED
	( @NSUBJ @NOUN~~@AGENT )
	( @PREP_BY @NOUN~~@TARGET )
	( dobj @@~~@@ )
end

# And finally a nasty ellipsis where the subject is in a completely different clause:

pattern
VBN~~@AFFECTED
	( @NSUBJ @NOUN~~@AGENT )
	( advcl VBN~~@@ ( xcomp VB~~@AFFECT ( dobj @NOUN~~@TARGET ) ) )
end


# Sentence 60, 10788508-9

# Another tricky ellipsis:

pattern
VB~~@AFFECT
	( @NSUBJ @NOUN~~@AGENT )
	( advcl VB~~@AFFECT ( dobj @NOUN~~@TARGET ) )
end

# And another, this time with an interesting "had little effect on" construction:

pattern
VB~~@AFFECT
	( @NSUBJ @NOUN~~@AGENT )
	( advcl VB~~have ( dobj @NOUN~~@EFFECT ( prep_on @NOUN~~@TARGET ) ) )
end


# Sentence 62, 10476035-3

# Looks like a mis-graph (verb depends on subject via dep) but actually appears elsewhere too. NB This causes a FP in sentence 70, 9891799-3, but that one's completely messed up anyway:

pattern
@NOUN~~@AGENT
	( dep VB~~@AFFECT ( dobj @NOUN~~@TARGET ) )
end


# Sentence 63, 10468601-2

# "Unlike the case for other X-dependent genes, Y is..." -- sheesh!

pattern
@VERB~~@@
	( @PREP @NOUN~~@ROLE ( @PREP @NOUN~~{other_#AGENT_dependent} ) )
	( nsubj @NOUN~~@TARGET )
end


# Sentence 64, 10411757-2

# "X is regulated by [something2] that controls [something1], Y", but something1 and Y have been chunked together by the chunker:

pattern
VBZ~~is
	( dep VBN~~@AFFECTED ( @PREP_BY @NOUN~~@SUBSTANCE ( dep VB~~@AFFECT ( dobj @NOUN~~@AGENT ) ) ) )
	( nsubjpass @NOUN~~@TARGET )
end


# Sentence 67, 10323866-2

# "X activates A by phosphorylating B" -- this causes three FPs so we might take this out later!

pattern
VB~~@AFFECT
	( @PREP_BY VBG~~@AFFECTING ( dobj @NOUN~~@AGENT ) )
	( dobj @NOUN~~@TARGET )
	( @NSUBJ @NOUN~~@ENTITY )
end


# Sentence 70, 9891799-3, is generally shafted...

# ... this one (Entityt -> Entitybq) might be okay though:

pattern
@NOUN~~@AGENT
	( rcmod VB~~@AFFECT ( dobj @NOUN~~@TARGET ) )
end

# Entityay and Entitybe are in such different regions of the graph that any attempt to join them would be a bit pointless...

# Same with Entityz and Entitybj.


# Sentence 73, 9852014-2

# "A is induced ... in addition to the typical B-dependent pattern":

pattern
VBN~~@AFFECTED
	( prep_in @NOUN~~addition ( prep_to @NOUN~~@PROCESS ( amod @NOUN~~{#AGENT_dependent} ) ) )
	( nsubjpass @NOUN~~@TARGET )
end


# Sentence 75, 10333516-3

# A chunking error has made this more difficult than it should have been:

pattern
VBN~~@AFFECTED
	( agent @NOUN~~@SUBSTANCE ( prep_for @NOUN~~@AGENT ) )
	( nsubjpass @NOUN~~@TARGET )
end


# Sentence 76, 9852015-9

# Not entirely sure what's going on here:

pattern
VB~~@ACT
	( @NSUBJ @NOUN~~@AGENT )
	( prep_as @NOUN~~@ROLE ( dep @NOUN~~@ROLE ( prep_of @NOUN~~@TARGET ) ) )
end

pattern
VB~~@ACT
	( prep_by VBG~~@AFFECTING ( prep_from @NOUN~~@AGENT ) )
	( prep_as @NOUN~~@ROLE ( dep @NOUN~~@ROLE ( prep_of @NOUN~~@TARGET ) ) )
end






# Check tense substitutions throughout.

# Also look at case sensitivity.

# Build correct versions of the patterns which allow for parse/graph errors -- we might not see the correct version now, but we could later.

# Sentence 38 (10788508-10) contains an annotation error. A construction like "A does not bind specifically to B" is marked up as an A -> B interaction. This is not picked up by my system, currently, as the negation filter spots the 'not' and disallows the interaction. I believe this behaviour is correct.

# Sentence 49 (10801786-5) contains two similar errors. The fragment is "the mutant was unable to stimulate transcription by final Entitybc-RNA polymerase from the Entitybw-dependent Entitycc operon promoter." My system picks up the bw -> cc relationship encoded in the composite okay. However, the training set also specifies bw -> bc and bc -> cc relationships which I do not believe are supported by this sentence.