Hi there,

 

Thanks for these notes! Very interesting!

 

In many ways GPT-4 is useful, also for data formatting and annotation, or sample generation. You can give it a data format, then raw data, and ask it to reformat the data accordingly. It can also give you more samples of particular constructions if you need more material to train and test models.
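To make that concrete, here is a minimal sketch of how one might script the reformatting step (assuming the openai Python package with its 2023 ChatCompletion interface; the target format and the raw data are placeholders):

    # Sketch: ask GPT-4 to reformat raw data into a given target format.
    # Assumes the openai package (2023 ChatCompletion API) and a valid key.
    import openai

    openai.api_key = "YOUR-KEY"  # placeholder

    target_format = "word <TAB> lemma <TAB> POS tag, one token per line"
    raw_data = "Peter kissed Mary in Berlin."

    prompt = ("Here is a data format:\n" + target_format
              + "\n\nReformat the following raw data accordingly:\n" + raw_data)

    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep the output format stable
    )
    print(response["choices"][0]["message"]["content"])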

 

However, I just came back from Berlin, from the GGS, where I briefly discussed some of these points related to ChatGPT. Not only are all the major parsers that I tested (constituent and dependency) problematic, being basically useless for complex constructions (only XLE for English was impressive! 😊); ChatGPT is also not capable of dealing with specific constructions and ambiguities:

 

I am working on ellipsis, all kinds of gapping and sluicing constructions, and temporal scope in complex predicates, on labeling for factivity and truth value, as well as on generating presuppositions and implicatures for arguments (claims and evidence). ChatGPT systematically fails on ellipsis in coordination, deeply embedded pronoun binding, and temporal or even general reasoning. But this is fine, since I would only use it as a front end to user interfaces. What you can do is have it query a linguistic and knowledge backend (say SPARQL, or a knowledge-graph language like Cypher), generate some formal results, and have it articulate a response. I also play with the free and open versions, e.g. GPT4All or HuggingGPT; GPT4All in particular I can program with directly and even integrate into a parser.
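To sketch what such a front-end pipeline could look like (assuming the SPARQLWrapper and gpt4all Python packages; the endpoint URL, model file, and prompts below are purely illustrative):

    # Sketch: the LLM is only the front end; a SPARQL knowledge backend
    # does the actual work. Endpoint and model names are placeholders.
    from SPARQLWrapper import SPARQLWrapper, JSON
    from gpt4all import GPT4All

    model = GPT4All("ggml-gpt4all-j-v1.3-groovy")  # any local model file

    def llm(prompt):
        return model.generate(prompt, max_tokens=400)

    def answer(question):
        # 1. The LLM translates the user's question into a formal query.
        query = llm("Translate into SPARQL over our lexicon schema:\n" + question)
        # 2. The knowledge backend computes the formal results.
        endpoint = SPARQLWrapper("http://localhost:3030/lexicon/sparql")
        endpoint.setQuery(query)
        endpoint.setReturnFormat(JSON)
        results = endpoint.query().convert()
        # 3. The LLM articulates the formal results as a fluent response.
        return llm("Formulate a short answer from these results:\n" + str(results))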

 

But, to give you some impression, when I said:

 

I stole a book and Peter kisses from Mary.

 

It gave me a moral lecture about why stealing is really bad and instructed me to return everything; in particular, I should return:

 

“Peter’s kisses from Mary”

 

This shows you that it took the coordination to be not of predicates but of NPs, [NP a book ] & [NP Peter kisses from Mary ], and it did not consider that the PP modifies both predicates; that is, it does not get scope. And so on…

 

It is a good generator, though. If you provide a sample data format and tell it how to spell things out, it can be your eloquent text generator.
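For instance (the feature structure here is just an invented illustration), you can hand it something like:

Data format: pred(PRED), subj(SUBJ), obj(OBJ), tense(TENSE)
Data: pred(support), subj(Bill), obj(NAFTA), tense(past)
Spell the data out as one fluent English sentence.

and expect something like "Bill supported NAFTA." back, plus more eloquent variants on request.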

 

It does a surprisingly good job of generating presuppositions and implicatures. Just tell it something like this:

 

Miriam just benched 200 pounds in the gym. What are the implicatures of that?

 

Miriam just benched 200 pounds in the gym. What are the presuppositions of that?

 

But, since it just gives you probability-based word-sequence output, do not expect any real understanding. It is a dumb statistical model, not a smart parser and analyzer, although we do not know all the details behind the front end. We do know that they use two models in combination.

 

I would still bet on a good LFG parser: add probabilities to it, link it to professional knowledge systems, and let it interact with an LLM. Even though some people like Piantadosi like to tell us that linguistics and knowledge-based engineering approaches have failed, that is simply not true. I will just continue beating the drum… 😊

 

But LLMs are important and very useful components. Instead of wasting my lifetime studying "query engineering", I would still say that real engineering and knowledge-driven approaches are winning; LLMs will need them in the backend. It is like Kahneman's model in Thinking, Fast and Slow: I see it as one fast, probabilistic model and one slow, knowledge-based model that interact with each other. ChatGPT-4 is the first attempt to get to a Fast Thinking module.

 

All the best

 

DC

 

 

 

From: LFG-list <lfg-list-bounces@mailman.uni-konstanz.de> On Behalf Of Vijay Saraswat
Sent: Monday, May 1, 2023 4:08 AM
To: Miriam Butt <miriam.butt@uni-konstanz.de>
Cc: lfg-list@mailman.uni-konstanz.de; Mary Dalrymple <mary.dalrymple@ling-phil.ox.ac.uk>; Ken Kahn <toontalk@gmail.com>
Subject: Re: [LFG-list] Your message to LFG awaits moderator approval

 

Still haven’t seen my original message relayed by the list, so here is yet another attempt…

 

I thought this group might find the following interesting.

 

For the last several years I have been leading an R&D group in AI (essentially applied NLP) at a major Wall Street Bank. Since 2020, I have been struck by the stunning progress made by Large Language Models (LLMs; GPT-3, ChatGPT, GPT-4, …). Essentially it seems that a lot of the work we were doing  in computational linguistics (at least some of us), particularly in terms of applications to real-world question answering problems, has now been swamped by LLM advances. 

 

I have a separate talk on the kind of highly-specific text analyses GPT-4 can do directly for (financial) professionals with domain-specific text, without having to go through parsing / semantic representation steps. Here I thought it might be illuminating to provide a log of a conversation with GPT-4 about representing meanings of simple sentences in IL.

 

As you can see it doesn’t always get things right. But some discussion helps it set things right (in itself, this is another startling capability of GPT-4). Such conversations can lead to the development of an accurate labeled data set, and thence to a fine-tuned end-point which should end up being highly accurate. (GPT-4 is not yet available to be fine-tuned, but GPT-3 is, and I would hope -4 will be, soon…)
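(To make the fine-tuning step concrete, a minimal sketch against the 2023 OpenAI fine-tuning interface; the file name and model choice are illustrative, and the two toy pairs are taken from the conversation below:)

    # Sketch: turn corrected IL annotations into a fine-tuning data set and job.
    # 2023-era OpenAI API; file name and model choice are illustrative.
    import json
    import openai

    openai.api_key = "YOUR-KEY"  # placeholder

    examples = [
        {"prompt": "Sentence: Bill kissed Hillary.\nIL formula:",
         "completion": " k(b, h)"},
        {"prompt": "Sentence: Bill obviously kissed Hillary.\nIL formula:",
         "completion": " o(k(b, h))"},
    ]
    with open("il_pairs.jsonl", "w") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")

    uploaded = openai.File.create(file=open("il_pairs.jsonl", "rb"),
                                  purpose="fine-tune")
    openai.FineTune.create(training_file=uploaded["id"], model="davinci")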

 

There are tons of things to discuss here, of course, if there is interest. Just putting this out as a conversation starter…

 

 

Best,

Vijay

 

 

 

VS: Parse the following sentence and generate its meaning in Montague's Intensional Logic. -- Sentence Bill kissed Hillary. -- Meaning formula

 

 

GPT4: To represent the given sentence "Bill kissed Hillary" in Montague's Intensional Logic, we will first parse the sentence into its components:

  1. Subject: Bill
  2. Verb: kissed
  3. Object: Hillary

Montague's Intensional Logic uses lambda calculus to represent meanings. We will use the following variables:

  1. b for Bill
  2. k for the verb 'kiss'
  3. h for Hillary

Now, we can represent the meaning formula as follows:

k(b, h)

This formula states that the kissing action (k) occurs with Bill (b) as the subject and Hillary (h) as the object. The given sentence can be translated into Montague's Intensional Logic as the application of the binary relation 'k' to the individuals 'b' and 'h'.

VS: Good. Here is another one.

-- Sentence Bill obviously kissed Hillary.



GPT4: To represent the given sentence "Bill obviously kissed Hillary" in Montague's Intensional Logic, we need to consider the adverb "obviously" as a modifier of the verb "kissed". Let's denote the adverb with the letter 'o' and use the previously defined variables:

  1. b for Bill
  2. k for the verb 'kiss'
  3. h for Hillary
  4. o for the adverb 'obviously'

Now, we can represent the meaning formula as follows:

o(k)(b, h)

This formula states that the kissing action (k) occurs with Bill (b) as the subject and Hillary (h) as the object, and that this action is obviously true (o). The given sentence can be translated into Montague's Intensional Logic as the application of the binary relation 'k' to the individuals 'b' and 'h', modified by the adverb 'o'.

VS: Hmm.. shouldn't this be o(k(b,h))?

GPT4: Apologies for the confusion. You are correct. The correct representation of the sentence "Bill obviously kissed Hillary" in Montague's Intensional Logic should be:

o(k(b, h))

This formula states that the kissing action (k) occurs with Bill (b) as the subject and Hillary (h) as the object, and that this action is obviously true (o). The given sentence can be translated into Montague's Intensional Logic as the adverb 'o' modifying the result of applying the binary relation 'k' to the individuals 'b' and 'h'.

VS: Good. Now let's do:

-- Sentence Hillary let Bill write a letter -- Meaning formula



GPT4: To represent the given sentence "Hillary let Bill write a letter" in Montague's Intensional Logic, we will first parse the sentence into its components:

  1. Subject: Hillary
  2. Verb: let
  3. Object: Bill
  4. Infinitive verb: write
  5. Object of the infinitive verb: a letter

We will use the following variables:

  1. h for Hillary
  2. l for the verb 'let'
  3. b for Bill
  4. w for the verb 'write'
  5. a for a letter

Now, we can represent the meaning formula as follows:

l(h, λx.w(x, a))(b)

This formula states that the letting action (l) occurs with Hillary (h) as the subject and a function representing Bill writing a letter (λx.w(x, a)) as the object. Then, the function is applied to the individual Bill (b).

The given sentence can be translated into Montague's Intensional Logic as the application of the ternary relation 'l' to the individuals 'h' and 'b', and the action of writing a letter (w) with 'a' as the object.

VS: Good. Next:

-- Sentence Bill supported, and Hillary opposed, NAFTA.



GPT4: To represent the given sentence "Bill supported, and Hillary opposed, NAFTA" in Montague's Intensional Logic, we need to parse the sentence into its components and consider the conjunction "and" that connects two separate actions:

  1. Subject 1: Bill
  2. Verb 1: supported
  3. Subject 2: Hillary
  4. Verb 2: opposed
  5. Object: NAFTA

We will use the following variables:

  1. b for Bill
  2. s for the verb 'support'
  3. h for Hillary
  4. o for the verb 'oppose'
  5. n for NAFTA
  6. ∧ for the conjunction 'and'

Now, we can represent the meaning formula as follows:

s(b, n) ∧ o(h, n)

This formula states that Bill (b) supports NAFTA (n), and Hillary (h) opposes NAFTA (n). The given sentence can be translated into Montague's Intensional Logic as two separate actions, the support action (s) with Bill (b) as the subject and NAFTA (n) as the object, and the oppose action (o) with Hillary (h) as the subject and NAFTA (n) as the object, connected by the conjunction 'and' (∧).

VS: OK, next:

-- Sentence Bill supported, and Hillary opposed, two trade bills.



GPT4: To represent the given sentence "Bill supported, and Hillary opposed, two trade bills" in Montague's Intensional Logic, we need to parse the sentence into its components and consider the conjunction "and" that connects two separate actions:

  1. Subject 1: Bill
  2. Verb 1: supported
  3. Subject 2: Hillary
  4. Verb 2: opposed
  5. Quantifier: two
  6. Object: trade bills

We will use the following variables:

  1. b for Bill
  2. s for the verb 'support'
  3. h for Hillary
  4. o for the verb 'oppose'
  5. tb for trade bills
  6. ∧ for the conjunction 'and'
  7. t for two

Since we are dealing with a quantifier, we will use lambda notation to represent the two trade bills. Let x represent the first trade bill and y the second:

Now, we can represent the meaning formula as follows:

∃x∃y(tb(x) ∧ tb(y) ∧ t(x, y) ∧ s(b, x) ∧ s(b, y) ∧ o(h, x) ∧ o(h, y))

This formula states that there exist two trade bills (x and y) such that Bill (b) supports both (s(b, x) and s(b, y)), and Hillary (h) opposes both (o(h, x) and o(h, y)). The given sentence can be translated into Montague's Intensional Logic as two separate actions, the support action (s) with Bill (b) as the subject and the two trade bills (x and y) as the objects, and the oppose action (o) with Hillary (h) as the subject and the two trade bills (x and y) as the objects, connected by the conjunction 'and' (∧).

VS: Next: -- Sentence Hillary wanted, found, and supported two candidates.

GPT4: To represent the given sentence "Hillary wanted, found, and supported two candidates" in Montague's Intensional Logic, we need to parse the sentence into its components and consider the conjunctions "and" that connect three separate actions:

  1. Subject: Hillary
  2. Verb 1: wanted
  3. Verb 2: found
  4. Verb 3: supported
  5. Quantifier: two
  6. Object: candidates

We will use the following variables:

  1. h for Hillary
  2. w for the verb 'want'
  3. f for the verb 'find'
  4. s for the verb 'support'
  5. c for candidates
  6. ∧ for the conjunction 'and'
  7. t for two

Since we are dealing with a quantifier, we will use lambda notation to represent the two candidates. Let x represent the first candidate and y the second:

Now, we can represent the meaning formula as follows:

∃x∃y(c(x) ∧ c(y) ∧ t(x, y) ∧ w(h, x) ∧ w(h, y) ∧ f(h, x) ∧ f(h, y) ∧ s(h, x) ∧ s(h, y))

This formula states that there exist two candidates (x and y) such that Hillary (h) wanted both (w(h, x) and w(h, y)), found both (f(h, x) and f(h, y)), and supported both (s(h, x) and s(h, y)). The given sentence can be translated into Montague's Intensional Logic as three separate actions, the want action (w), find action (f), and support action (s) with Hillary (h) as the subject and the two candidates (x and y) as the objects, connected by the conjunctions 'and' (∧).

VS: It would seem to me that the meaning should be the conjunction of two propositions, one expressing the wanting, and the second that the finding-and-supporting was successfully accomplished

GPT4: I apologize for my previous responses, and I understand now what you are looking for. To represent the given sentence "Hillary wanted, found, and supported two candidates" in Montague's Intensional Logic as the conjunction of two propositions, we can do the following:

  1. Subject: Hillary
  2. Intensional verb: wanted
  3. Extensional verb 1: found
  4. Extensional verb 2: supported
  5. Quantifier: two
  6. Object: candidates

We will use the following variables:

  1. h for Hillary
  2. w for the verb 'want'
  3. f for the verb 'find'
  4. s for the verb 'support'
  5. c for candidates
  6. t for two
  7. ∧ for the conjunction 'and'

Now, we can represent the meaning formula as follows:

w(h, λx.t(c, λy.(f(h, y) ∧ s(h, y)), x)) ∧ ∃x∃y(c(x) ∧ c(y) ∧ t(x, y) ∧ f(h, x) ∧ f(h, y) ∧ s(h, x) ∧ s(h, y))

In this formula, we have two separate propositions joined by a conjunction. The first proposition represents the wanting, with the intensional verb 'want' as the function w(h, λx...), taking Hillary (h) as the subject and a property λx representing the existence of two candidates that Hillary found and supported as its argument. The second proposition represents the successful finding and supporting of two candidates, with the quantifier 'two' applied to the candidates, and for each candidate x and y, the extensional verbs 'found' and 'supported' are applied.

This representation treats the intensional verb 'wanted' separately from the extensional verbs 'found' and 'supported', expressing the conjunction of two propositions: one for the wanting, and the second for the successful finding and supporting of two candidates.

 

 

Sent from my iPhone



On May 1, 2023, at 3:59 AM, Miriam Butt <miriam.butt@uni-konstanz.de> wrote:

Hi,

really interesting post, Ken! It seems it's like with the text generation: very fluent posting and explanation of f-structures, but with features we wouldn't use in LFG (REL-CLAUSE, SUBORD-CLAUSE). I'd really like to know what the underlying technology for this is (beyond the basics).

I’ve started looking into prompt engineering — any pointers on this you might know?

Cheers,

Miriam


On 28. Apr 2023, at 10:53, Ken Kahn <toontalk@gmail.com> wrote:

 

Hi

 

I did get this notification of moderation 4 days ago.

 

Here's my message again:

 

Vijay's message got me wondering what else GPT-4 can do.

 

https://shareg.pt/VcJBEYK

 

Mary thought the result was at an undergraduate level.

 

Best,

 

-ken

 

---------- Forwarded message ---------

From: <lfg-owner@listserv.linguistlist.org>

Date: Mon, 24 Apr 2023 at 16:23

Subject: Your message to LFG awaits moderator approval

To: <toontalk@gmail.com>

 

 

Your mail to 'LFG' with the subject

 

   GPT-4 producing f-structures

 

Is being held until the list moderator can review it for approval.

 

The reason it is being held:

 

   Post by non-member to a members-only list

 

Either the message will get posted to the list, or you will receive

notification of the moderator's decision.  If you would like to cancel

this posting, please visit the following URL:

 

   https://listserv.linguistlist.org/cgi-bin/mailman/confirm/lfg/7ecd6420d43b3b2becaeaab4d5e360996c0288cd

 


****************************************************************
Miriam Butt
Department of Linguistics
University of Konstanz
Fach 184                    Tel: +49 7531 88 5109
78457 Konstanz        Fax: +49 7531 88 4865
Germany                        +49 7531 88 5115

miriam.butt@uni-konstanz.de
https://www.ling.uni-konstanz.de/butt/


'I would suppose, sir,' he said cautiously, 'that Modesty Blaise might
be a person awfully difficult for us---er---actually to get.' He
blinked towards the big, grey-haired man who stood by the window,
looking down at the night traffic, hurrying along Whitehall.

'For a moment,' Tarrant said, turning from the window, 'I hoped you
might split that infinitive, Fraser.'

Peter O'Donnell, Modesty Blaise (p. 1)
****************************************************************