Regarding "Multi-modal data input is essential to figure out that carrots are typically orange, for example": I just asked ChatGPT 3.5 to "Rank the most common colors for carrots".

Here's the reply:

The most common colors for carrots are:
  1. Orange
  2. Purple
  3. White
  4. Yellow
  5. Red
Orange is by far the most common color for carrots. However, purple, white, yellow, and red carrots are also grown and eaten, although they are less common than orange carrots. Purple carrots, for example, are a good source of anthocyanins, which are antioxidants that may help protect against certain diseases. White carrots, on the other hand, are lower in beta-carotene than orange carrots, but they are still a good source of dietary fiber and other nutrients. Yellow and red carrots also contain various beneficial nutrients and are often used in culinary applications for their unique flavors and colors.

And text-davinci-003 doesn't seem to even consider other colors:
[screenshot of the text-davinci-003 reply attached]

I used these models because GPT-4 is multi-modal and is able to discuss images.
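
For what it's worth, here is a minimal sketch of how the two queries above could be reproduced programmatically. It assumes the pre-1.0 openai Python package; the model names are the ones mentioned above, and the replies will of course vary between runs:

import openai

openai.api_key = "..."  # insert your own API key here

# The same question to ChatGPT 3.5 via the chat endpoint.
chat = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user",
               "content": "Rank the most common colors for carrots"}],
)
print(chat.choices[0].message.content)

# And to text-davinci-003 via the completions endpoint.
completion = openai.Completion.create(
    model="text-davinci-003",
    prompt="Rank the most common colors for carrots",
    max_tokens=128,
)
print(completion.choices[0].text)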

While multi-modal input is helpful, many studies are showing it isn't necessary. It can draw a unicorn in TikZ (LaTeX). It can answer questions about which colors are close to other colors. It can reason about directions.

It seems pretty grounded despite being trained on text only. Predicting the next word well involves having good models of the world and people.

Best,

-ken

On Tue, 2 May 2023 at 06:00, Cavar, Damir <dcavar@indiana.edu> wrote:
Joan,

You make an important point!

GPT-3 was trained on text only. The new models now include image information as well. Multi-modal data input is essential to figure out that carrots are typically orange, for example. The issue is that when we talk online about things, we emphasize new and exciting facts. You might find "purple carrot" mentioned more frequently than "orange carrot" in text-based communication online, just because of pragmatic or conversational principles. Known and presupposed facts we do not talk about. Images of carrots would show orange as the default or dominant color. Bringing text and image together is essential.

Also, without the image, if I talked about "my new car" you might conclude that it could be a 2023 Porsche or such. Only when I show you the picture would it be clear that it is not the car that is new, since it is a 1965 Ford Mustang.

So, it is good that they are now training the models on text and images.

But nobody - as far as I can tell - is currently training on phonetic, prosodic, and textual language data together, to potentially be able to catch intonational cues. Most likely because that kind of data does not exist at the scale needed for such models.

In the previous post I was testing gapping with semantic mismatches of the coordinated predicates, and mismatches of phi-features. Most informants get it. Without such mismatches (semantic or agreement), these constructions should be pretty easy for native speakers to process; they are in fact common in conversation. They are not easy to parse for GPT, though, nor for Stanford CoreNLP, Stanza, spaCy, Benepar, etc. Our state-of-the-art NLP parsers fail on anything that involves silent elements, which are common in real conversational behavior or in social media posts. Parsers are rarely used in real-world applications, it turns out, and there are good reasons for that.
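
Just to make that concrete, here is a minimal sketch of how one might probe one of these off-the-shelf parsers on a gapping example; it assumes spaCy v3 with the benepar plugin and the benepar_en3 constituency model installed, and uses the sentence I give to ChatGPT further down:

import benepar
import spacy

benepar.download("benepar_en3")  # one-time model download

nlp = spacy.load("en_core_web_md")
nlp.add_pipe("benepar", config={"model": "benepar_en3"})

doc = nlp("Peter likes to think about algorithms and Mary about cats")
for sent in doc.sents:
    # Bracketed constituency parse; check whether the gapped second
    # conjunct ("Mary about cats") receives a plausible analysis.
    print(sent._.parse_string)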


I am not opposing LLMs, and I see good application domains; I am just exploring the limits. The models are certainly useful, but also over-hyped.


All the best

DC


--
Damir Cavar
Associate Professor
Indiana University at Bloomington
NLP-Lab



From: LFG-list <lfg-list-bounces@mailman.uni-konstanz.de> on behalf of Joan Bresnan <bresnan@stanford.edu>
Sent: Monday, May 1, 2023, 3:58 PM
To: Ron Kaplan <ron.kaplan@post.harvard.edu>
Cc: lfg-list@mailman.uni-konstanz.de <lfg-list@mailman.uni-konstanz.de>
Subject: Re: [LFG-list] ChatGPT discussion

John Maxwell's remark cited by Ron is a variant of a well-known quotation illustrating historical prejudice:

As recorded by Boswell on July 31, 1763, Johnson said, "Sir, a woman's preaching is like a dog's walking on his hinder legs. It is not done well; but you are surprised to find it done at all."

As a woman, I would like to see a more specific analysis of the problems of the software.  I believe most
people would struggle to parse some of the sentences offered without the appropriate intonation.

signed,

Mary about cats


On Mon, May 1, 2023 at 12:42 PM Ron Kaplan <ron.kaplan@post.harvard.edu> wrote:
I’m reminded of a comment that John Maxwell once made, about some piece of software:

It’s like a talking dog. The fact that it talks at all is incredibly impressive—what it says, not so much.

On May 1, 2023, at 9:47 AM, Cavar, Damir <dcavar@indiana.edu> wrote:

Hi there,

yes, powerful tools or toys, certainly, but not The Solution that replaces powerful engineers and renders smart linguists useless (or forces them into the hamburger wrapping or pizza delivery business).

My personal slogans are now more and more: Every time I fire an LLM expert, my systems run better. And, all statistical and neural parsers leak.

This after engineering them and running detailed evaluations and working with them for decades. XLE and the English grammar are still winning after 20 years of dependency corpora and training the smartest probabilistic/neural algorithms, or after piping almost the entire internet through an LLM like GPT 4.

(You all know that I love hyperbole, right?)


Now, it is funny:

I just tried again, saying "Give me the phrase structure tree for 'I stole a book and Peter kisses from Mary'", and it returned:

ChatGPT
I'm sorry, but the sentence "I stole a book and Peter kisses from Mary" is not a grammatically correct sentence. It appears to have some syntax errors, which make it difficult to construct a meaningful phrase structure tree. Could you please provide a revised sentence or clarify the intended meaning?


Now it also has introspection and can provide grammaticality judgements, it seems. Very interesting. So, I ask:


Me again:
On a scale of 1 to 10, how ungrammatical is the sentence?

ChatGPT:
The sentence "I stole a book and Peter kisses from Mary" is highly ungrammatical, as it contains multiple errors in syntax and structure. It seems to mix up different verb tenses and noun phrases, making it difficult to determine the intended meaning. I would rate its ungrammaticality as a 9 or 10 on a scale of 1 to 10, with 10 being the most ungrammatical.


Then I asked again, adding a comma:
Draw me a phrase structure tree for "I stole a book, and Peter kisses from Mary"

ChatGPT:
                S
          _____|_____
         |           |
        NP          VP
         |      ____|___
         |     |        |
    Pronoun  V|P      ConjP
       |      |   _____|_____
       I   stole |  |         |
            |    |  NP       VP
           NP   |  |        _|_
         ___|___  |       |   |
        |      |  |       |  VP
        Det   Noun |    Verb Pronoun
         |     |   |       |    |
         a    book and    Peter kisses

In this tree, S represents the sentence as a whole, which is divided into two main branches: NP (noun phrase) and VP (verb phrase). The NP consists of a pronoun "I" and an NP headed by a determiner "a" and a noun "book". The VP is divided into two conjuncts by the coordinating conjunction "and". The first conjunct consists of a verb "stole" and a preposition "from" followed by an NP headed by the proper noun "Mary". The second conjunct consists of a proper noun "Peter" and a verb "kisses", forming a separate VP.


OK, so just one comma made all the difference in the rating; now the sentence is good, and the resulting analysis is quite interesting. But let us try a simpler construction:


Me:
Draw me a phrase structure tree for "Peter likes to think about algorithms and Mary about cats"

ChatGPT:
                       S
          ____________|____________
         |                         |
     NP(Peter)                    S(Mary)
         |                __________|___________
      VP(likes)          |                       |
         |              VP(to think)         PP(about)
    S(to) and            |         ____________|________
     NP(algorithms)     NP(Mary)           NP(cats)
                      (about)

In this tree, S represents a sentence, NP represents a noun phrase, VP represents a verb phrase, and PP represents a prepositional phrase. The conjunction "and" is represented by S(to) and, where to is the conjunction, and S(Mary) represents the clause for Mary's action.

-------------------

Yes, yes, very good... (for me) :-)  I won't even go into more complex constructions... And, well, XLE-Web with English does not get it either, but it gets much more right, much better, with or without the comma.

Some people argue that we should adjust our language so that AIs, LLMs, and some tools can process it. Others claim that ellipsis or gapping are weird constructions that only some crazy linguists can come up with. A colleague from a big US bank asked me why we put these weird sentences into ChatGPT, and why we don't just speak plain English to it... :-)

Do we really want to study "prompts for different versions of GPT" and engineer around them, or do we want to engineer professional systems around our insights that we can falsify and test? I'll continue working on my LFG parser, quantum NLP, CGs, and real working tech, until ChatGPT figures that out.  :-)  I'd use LLMs for innocent problems where they cannot do harm or create issues by generating problematic nonsense. They are a useful toy, not more than that (except for the carbon emissions...).


CU all!

DC


PS: I don't know about you, but we got students submitting essays on different topics that start with:

"As an artificial intelligence language model, I do not have opinions on (...)"

Some folks just copy and paste; they do not even read what ChatGPT gives them. At least that is easy to grade...





On Mon, 2023-05-01 at 11:53 -0400, Vijay Saraswat wrote:
Hi Damir — everything depends on how the problem is formulated, and on which model you are using — the right prompts for GPT-4 are not necessarily right for GPT-3.5 or GPT-3. Anyway, I will give your examples a shot later today or tomorrow.

An important point to note (and I said this in my message): one has the possibility of further training these models on labeled data, e.g. human-corrected LFG parses. I believe one of the LFG teams trained a statistical parser in this way 10-15 years ago. But with language model (LM) technology now having progressed substantially, we should be able to use LMs to develop really powerful tools for computational linguists.
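
Purely as an illustration (the model choice, toy data, and settings below are hypothetical, not anything an LFG team has actually used), here is a sketch of what such further training could look like with today's tooling: fine-tuning a small seq2seq model on sentence/parse pairs with the Hugging Face transformers and datasets libraries.

from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

# Toy placeholder pairs; real use would need many hand-corrected parses.
pairs = [
    {"sentence": "Peter likes cats",
     "parse": "(S (NP Peter) (VP (V likes) (NP cats)))"},
    {"sentence": "Mary reads a book",
     "parse": "(S (NP Mary) (VP (V reads) (NP (Det a) (N book))))"},
]

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def preprocess(example):
    # Tokenize the input sentence and the target parse string.
    inputs = tokenizer(example["sentence"], truncation=True)
    inputs["labels"] = tokenizer(text_target=example["parse"],
                                 truncation=True)["input_ids"]
    return inputs

train_data = Dataset.from_list(pairs).map(
    preprocess, remove_columns=["sentence", "parse"])

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="parse-t5", num_train_epochs=1),
    train_dataset=train_data,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()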





--
Joan Bresnan
Stanford University
http://web.stanford.edu/~bresnan/