Wednesday, December 29, 2021

What Does It Mean for AI to Understand?

It’s simple enough for AI to seem to comprehend data, but devising a true test of a machine’s knowledge has proved difficult.

https://www.quantamagazine.org/what-does-it-mean-for-ai-to-understand-20211216/?utm_source=Quanta+Magazine&utm_campaign=fffe95a7f6-RSS_Daily_Computer_Science&utm_medium=email&utm_term=0_f0cb61321c-fffe95a7f6-389846569&mc_cid=fffe95a7f6&mc_eid=61275b7d81

Melanie Mitchell

Contributing Columnist

December 16, 2021

[[More grist for AI skepticism.]]

Remember IBM’s Watson, the AI Jeopardy! champion? A 2010 promotion proclaimed, “Watson understands natural language with all its ambiguity and complexity.” However, as we saw when Watson subsequently failed spectacularly in its quest to “revolutionize medicine with artificial intelligence,” a veneer of linguistic facility is not the same as actually comprehending human language.

Natural language understanding has long been a major goal of AI research. At first, researchers tried to manually program everything a machine would need to make sense of news stories, fiction or anything else humans might write. This approach, as Watson showed, was futile — it’s impossible to write down all the unwritten facts, rules and assumptions required for understanding text. More recently, a new paradigm has been established: Instead of building in explicit knowledge, we let machines learn to understand language on their own, simply by ingesting vast amounts of written text and learning to predict words. The result is what researchers call a language model. When based on large neural networks, like OpenAI’s GPT-3, such models can generate uncannily humanlike prose (and poetry!) and seemingly perform sophisticated linguistic reasoning.

But has GPT-3 — trained on text from thousands of websites, books and encyclopedias — transcended Watson’s veneer? Does it really understand the language it generates and ostensibly reasons about? This is a topic of stark disagreement in the AI research community. Such discussions used to be the purview of philosophers, but in the past decade AI has burst out of its academic bubble into the real world, and its lack of understanding of that world can have real and sometimes devastating consequences. In one study, IBM’s Watson was found to propose “multiple examples of unsafe and incorrect treatment recommendations.” Another study showed that Google’s machine translation system made significant errors when used to translate medical instructions for non-English-speaking patients.

How can we determine in practice whether a machine can understand? In 1950, the computing pioneer Alan Turing tried to answer this question with his famous “imitation game,” now called the Turing test. A machine and a human, both hidden from view, would compete to convince a human judge of their humanness using only conversation. If the judge couldn’t tell which one was the human, then, Turing asserted, we should consider the machine to be thinking — and, in effect, understanding.

Unfortunately, Turing underestimated the propensity of humans to be fooled by machines. Even simple chatbots, such as Joseph Weizenbaum’s 1960s ersatz psychotherapist Eliza, have fooled people into believing they were conversing with an understanding being, even when they knew that their conversation partner was a machine.

In a 2012 paper, the computer scientists Hector Levesque, Ernest Davis and Leora Morgenstern proposed a more objective test, which they called the Winograd schema challenge. This test has since been adopted in the AI language community as one way, and perhaps the best way, to assess machine understanding — though as we’ll see, it is not perfect. A Winograd schema, named for the language researcher Terry Winograd, consists of a pair of sentences, differing by exactly one word, each followed by a question. Here are two examples:

Sentence 1: I poured water from the bottle into the cup until it was full.

Question: What was full, the bottle or the cup?

Sentence 2: I poured water from the bottle into the cup until it was empty.

Question: What was empty, the bottle or the cup?

Sentence 1: Joe’s uncle can still beat him at tennis, even though he is 30 years older.

Question: Who is older, Joe or Joe’s uncle?

Sentence 2: Joe’s uncle can still beat him at tennis, even though he is 30 years younger.

Question: Who is younger, Joe or Joe’s uncle?

In each sentence pair, the one-word difference can change which thing or person a pronoun refers to. Answering these questions correctly seems to require commonsense understanding. Winograd schemas are designed precisely to test this kind of understanding, alleviating the Turing test’s vulnerability to unreliable human judges or chatbot tricks. In particular, the authors designed a few hundred schemas that they believed were “Google-proof”: A machine shouldn’t be able to use a Google search (or anything like it) to answer the questions correctly.

These schemas were the subject of a competition held in 2016 in which the winning program was correct on only 58% of the sentences — hardly a better result than if it had guessed. Oren Etzioni, a leading AI researcher, quipped, “When AI can’t determine what ‘it’ refers to in a sentence, it’s hard to believe that it will take over the world.”

However, the ability of AI programs to solve Winograd schemas rose quickly due to the advent of large neural network language models. A 2020 paper from OpenAI reported that GPT-3 was correct on nearly 90% of the sentences in a benchmark set of Winograd schemas. Other language models have performed even better after training specifically on these tasks. At the time of this writing, neural network language models have achieved about 97% accuracy on a particular set of Winograd schemas that are part of an AI language-understanding competition known as SuperGLUE. This accuracy roughly equals human performance. Does this mean that neural network language models have attained humanlike understanding?

Not necessarily. Despite the creators’ best efforts, those Winograd schemas were not actually Google-proof. These challenges, like many other current tests of AI language understanding, sometimes permit shortcuts that allow neural networks to perform well without understanding. For example, consider the sentences “The sports car passed the mail truck because it was going faster” and “The sports car passed the mail truck because it was going slower.” A language model trained on a huge corpus of English sentences will have absorbed the correlation between “sports car” and “fast,” and between “mail truck” and “slow,” and so it can answer correctly based on those correlations alone rather than by drawing on any understanding. It turns out that many of the Winograd schemas in the SuperGLUE competition allow for these kinds of statistical correlations.

Rather than give up on the Winograd schemas as a test of understanding, a group of researchers from the Allen Institute for Artificial Intelligence decided instead to try to fix some of their problems. In 2019 they created WinoGrande, a much larger set of Winograd schemas. Instead of several hundred examples, WinoGrande contains a whopping 44,000 sentences. To obtain that many examples, the researchers turned to Amazon Mechanical Turk, a popular platform for crowdsourcing work. Each (human) worker was asked to write several sentence pairs, with some constraints to ensure that the collection would contain diverse topics, though now the sentences in each pair could differ by more than one word.

The researchers then attempted to eliminate sentences that could allow statistical shortcuts by applying a relatively unsophisticated AI method to each sentence and discarding any that were too easily solved. As expected, the remaining sentences presented a much harder challenge for machines than the original Winograd schema collection. While humans still scored very high, neural network language models that had matched human performance on the original set scored much lower on the WinoGrande set. This new challenge seemed to redeem Winograd schemas as a test for commonsense understanding — as long as the sentences were carefully screened to ensure that they were Google-proof.
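The screening idea described above can be illustrated with a toy sketch. Everything in it is hypothetical: the association scores are made up, and the function names and the way sentences are encoded are invented for illustration. It is not the method or the data the WinoGrande researchers actually used. It only shows the general idea of running a cheap, correlation-based guesser over each sentence and discarding the ones it solves.

```python
# Toy filter: drop examples that a word-association shortcut can solve.
# ASSOCIATION holds invented scores, not statistics from any real corpus.
ASSOCIATION = {
    ("sports car", "faster"): 0.9,
    ("mail truck", "faster"): 0.1,
    ("sports car", "slower"): 0.1,
    ("mail truck", "slower"): 0.9,
    ("bottle", "full"): 0.5,
    ("cup", "full"): 0.5,
    ("bottle", "empty"): 0.5,
    ("cup", "empty"): 0.5,
}


def shortcut_guess(candidates, cue):
    """Pick the candidate most associated with the cue word; None on a tie."""
    first, second = candidates
    score_first = ASSOCIATION.get((first, cue), 0.0)
    score_second = ASSOCIATION.get((second, cue), 0.0)
    if score_first == score_second:
        return None  # the shortcut gives no signal for this sentence
    return first if score_first > score_second else second


def filter_examples(examples):
    """Keep only the examples the shortcut guesser does NOT answer correctly."""
    return [
        ex for ex in examples
        if shortcut_guess(ex["candidates"], ex["cue"]) != ex["answer"]
    ]


# Hypothetical encoding of the sentences quoted in the article.
EXAMPLES = [
    {"candidates": ("sports car", "mail truck"), "cue": "faster", "answer": "sports car"},
    {"candidates": ("sports car", "mail truck"), "cue": "slower", "answer": "mail truck"},
    {"candidates": ("bottle", "cup"), "cue": "full", "answer": "cup"},
    {"candidates": ("bottle", "cup"), "cue": "empty", "answer": "bottle"},
]

kept = filter_examples(EXAMPLES)
for ex in kept:
    print(ex["candidates"], ex["cue"])
```

With these invented scores, the two sports car sentences are discarded because association alone answers them, while the bottle and cup sentences survive. A weak guesser like this one can only catch the shortcuts it knows about.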

However, another surprise was in store. In the almost two years since the WinoGrande collection was published, neural network language models have grown ever larger, and the larger they get, the better they seem to score on this new challenge. At the time of this writing, the current best programs, which have been trained on terabytes of text and then further trained on thousands of WinoGrande examples, get close to 90% correct (humans get about 94% correct). This increase in performance is due almost entirely to the increased size of the neural network language models.

Have these ever larger networks finally attained humanlike commonsense understanding? Again, it’s not likely. The WinoGrande results come with some important caveats. For example, because the sentences relied on Amazon Mechanical Turk workers, the quality and coherence of the writing is quite uneven. Also, the “unsophisticated” AI method used to weed out “non-Google-proof” sentences may have been too unsophisticated to spot all possible statistical shortcuts available to a huge neural network, and it only applied to individual sentences, so some of the remaining sentences ended up losing their “twin.” One follow-up study showed that neural network language models tested on twin sentences only — and required to be correct on both — are much less accurate than humans, showing that the earlier 90% result is less significant than it seemed.
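The difference between scoring sentences one at a time and scoring twins together is easy to see in a small sketch. The results below are invented for illustration, and the field names `pair_id` and `correct` are assumptions, not the format of any real benchmark. The sketch only shows why requiring both twins to be right can lower a score.

```python
from collections import defaultdict


def sentence_accuracy(results):
    """Fraction of individual sentences answered correctly."""
    return sum(r["correct"] for r in results) / len(results)


def pair_accuracy(results):
    """Fraction of complete twin pairs in which BOTH sentences are correct."""
    pairs = defaultdict(list)
    for r in results:
        pairs[r["pair_id"]].append(r["correct"])
    # Ignore sentences that lost their twin; only complete pairs are scored.
    complete = [outcomes for outcomes in pairs.values() if len(outcomes) == 2]
    return sum(all(outcomes) for outcomes in complete) / len(complete)


# Invented outcomes for four twin pairs (eight sentences).
RESULTS = [
    {"pair_id": 1, "correct": True},
    {"pair_id": 1, "correct": True},
    {"pair_id": 2, "correct": True},
    {"pair_id": 2, "correct": False},
    {"pair_id": 3, "correct": True},
    {"pair_id": 3, "correct": True},
    {"pair_id": 4, "correct": False},
    {"pair_id": 4, "correct": True},
]

print(sentence_accuracy(RESULTS))  # 0.75
print(pair_accuracy(RESULTS))      # 0.5
```

In this made-up case the system gets six of eight sentences right but only two of four pairs, so the stricter measure falls from 75% to 50%.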

So, what to make of the Winograd saga? The main lesson is that it is often hard to determine from their performance on a given challenge if AI systems truly understand the language (or other data) that they process. We now know that neural networks often use statistical shortcuts — instead of actually demonstrating humanlike understanding — to obtain high performance on the Winograd schemas as well as many of the most popular “general language understanding” benchmarks.

The crux of the problem, in my view, is that understanding language requires understanding the world, and a machine exposed only to language cannot gain such an understanding. Consider what it means to understand “The sports car passed the mail truck because it was going slower.” You need to know what sports cars and mail trucks are, that cars can “pass” one another, and, at an even more basic level, that vehicles are objects that exist and interact in the world, driven by humans with their own agendas.

All this is knowledge that we humans take for granted, but it’s not built into machines or likely to be explicitly written down in any of a language model’s training text. Some cognitive scientists have argued that humans rely on innate, pre-linguistic core knowledge of space, time and many other essential properties of the world in order to learn and understand language. If we want machines to similarly master human language, we will need to first endow them with the primordial principles humans are born with. And to assess machines’ understanding, we should start by assessing their grasp of these principles, which one might call “infant metaphysics.”

Training and evaluating machines for baby-level intelligence may seem like a giant step backward compared to the prodigious feats of AI systems like Watson and GPT-3. But if true and trustworthy understanding is the goal, this may be the only path to machines that can genuinely comprehend what “it” refers to in a sentence, and everything else that understanding “it” entails.

Wednesday, December 15, 2021

More on the Zohar

https://en.wikipedia.org/wiki/Zohar

Within Orthodox Judaism the traditional view that Shimon bar Yochai was the author is prevalent. R' Menachem Mendel Kasher in a 1958 article in the periodical Sinai argues against the claims of Gershom Scholem that the Zohar was written in the 13th Century by R' Moses de León.^[29] He writes:

Many statements in the works of the Rishonim (medieval commentators who preceded de León) refer to Medrashim that we are not aware of. He writes that these are in fact references to the Zohar. This has also been pointed out by R' David Luria in his work "Kadmus Sefer Ha'Zohar".
The Zohar's major opponent Elijah Delmedigo refers to the Zohar as having existed for "only" 300 years. Even he agrees that it was extant at the time of R' Moses de León.
He cites a document from R' Yitchok M' Acco who was sent by the Ramban to investigate the Zohar. The document brings witnesses that attest to the existence of the manuscript.
It is impossible to accept that R' Moses de León managed to forge a work within the scope of the Zohar (1700 pages) within a period of six years as Scholem claims.
A comparison between the Zohar and de León's other works shows major stylistic differences. Although he made use of his manuscript of the Zohar, many ideas presented in his works contradict or ignore ideas mentioned in the Zohar. Luria also points this out.
Many of the Midrashic works achieved their final redaction in the Geonic period. Some of the anachronistic terminologies of the Zohar may date from that time.
Out of the thousands of words used in the Zohar, Scholem finds two anachronistic terms and nine cases of ungrammatical usage of words. This proves that the majority of the Zohar was written within the accepted time frame and only a small amount was added later (in the Geonic period as mentioned).
Some hard to understand terms may be attributed to acronyms or codes. He finds corollaries to such a practice in other ancient manuscripts.
The "borrowings" from medieval commentaries may be explained in a simple manner. It is not unheard of that a note written on the side of a text should on later copying be added to the main part of the text. The Talmud itself has Geonic additions from such a cause. Certainly, this would apply to the Zohar, for which no other manuscripts existed to compare it with.
He cites an ancient manuscript that refers to a book Sod Gadol that seems to in fact be the Zohar.

Thursday, December 9, 2021

THE AUTHENTICITY OF THE ZOHAR

Some people are worried that the Zohar has been proved not to stem from R' Shimon bar Yochai. I very strongly recommend reading the series of posts starting here

https://www.chabad.org/kabbalah/article_cdo/aid/380410/jewish/The-Zohars-Mysterious-Origins.htm?gclid=CjwKCAiA78aNBhAlEiwA7B76pxx6BsNATgWtrz5rhxU__H8BWMbLG8JeAQUK3ICh80UPRJ6MgvpsPBoCA7kQAvD_BwE

It seems to me that Rabbi Miller has done an excellent job of refuting these academic critics. I would appreciate comments.

Sunday, October 17, 2021

The search for NCC - weak state of the art

Issue Section:

OPINION PAPER

[[Here are some selections from the paper - it is worth reading the whole.]]

In our ASSC20 symposium, “Does unconscious perception really exist?”, the four of us asked some difficult questions about the purported phenomenon of unconscious perception, disagreeing on a number of points. This disagreement reflected the objective of the symposium: not only to come together to discuss a single topic of keen interest to the ASSC community, but to do so in a way that would fairly and comprehensively represent the heterogeneity of ideas, opinions, and evidence that exists concerning this contentious topic. The crux of this controversy rests in no small part on disagreement about what is meant by the terms of the debate and how to determine empirically whether a state is unconscious or not.

These are issues that directly concern all of us who study consciousness, so it seems it would be in our best interest to strive for consensus. Given the conversation at ASSC20, we are pleased to have the opportunity to address some of the nuanced topics that arose more formally, and share some of the thinking we have done since the meeting. To reflect the heterogeneity of ideas and opinions surrounding this topic, we have organized this discussion into four distinct contributions.

—M.A.K.P. and I.P.

Practical and theoretical considerations in seeking the neural correlates of consciousness

Megan A. K. Peters

Psychology Department, University of California, Los Angeles, Los Angeles, CA 90095, USA. E-mail: meganakpeters@ucla.edu

As empirical scientists studying consciousness, we should be concerned with one question above all others: How can we design an experiment that will isolate the “conscious” processing of something from the “unconscious” processing of it, so that we can study the neural processing that underlies awareness – the neural correlates of consciousness (NCCs) – without inadvertently including a number of other confounds? This is the foundation of the scientific method.

Of course, this has always been the goal of studies seeking the NCCs, for example via comparing brain activity in “conscious” and “unconscious” conditions (Baars 1993). But a number of confounds continue to plague our experiments. My goal here, therefore, is to briefly enumerate the current practical concerns in experiments seeking to identify the NCCs, and to discuss how a newly developed paradigm can directly address these practical issues (Peters and Lau 2015).

......

Do these results mean that we can never achieve matched performance in “conscious” versus “unconscious” conditions, rendering the requirements for experiments seeking NCCs impossible to meet? Not necessarily. [[But maybe yes!!! Think about that.]] All we can infer from these results, for now, is that unconscious perception of the type we require seems to be harder to induce than the field may have realized. (We haven’t yet tried all the possible masking or neuromodulation techniques in existence.) Nevertheless, these experimental findings should make us think critically about what has actually been found in studies that do not control for task performance, may be susceptible to the criterion problem, or use masking or other manipulations to render a stimulus “unconscious”.

What we need to think about when we think about unconscious perception

Ian Phillips

St. Anne’s College, University of Oxford, Oxford OX2 6HS, UK. E-mail: ian.phillips@st-annes.ox.ac.uk

Theoretical discussions of unconscious perception typically focus on how consciousness should be operationally defined (Lau 2008; Seth et al. 2008; Irvine 2013). However, a compelling case of unconscious perception requires both evidence that consciousness is absent and that perception is present. Consequently, theorists must also consider how perception should be operationally defined, and assess alleged cases of unconscious perception accordingly.

Traditionally, it was assumed that to be perceived a stimulus must contribute to a subject’s conscious perspective (Moore 1925). To allow for the possibility of unconscious perception, Kanwisher suggests instead using “perception” to refer to “the extraction and/or representation of perceptual information from a stimulus, without any assumption that such information is necessarily experienced consciously” (2001, 90). Kanwisher’s proposal needs refinement. It risks counting distinctive allergic reactions as instances of perception (Dretske 2006). It also fails to secure the idea that perception is an individual-level phenomenon, not merely an occurrence in an individual’s visual system or brain (Burge 2010, 368 ff.).

By way of refinement, Burge proposes that perception is constitutively a matter of objective sensory representation by the individual. This means that perceptual states do not merely carry information but represent features of the physical environment as opposed to “idiosyncratic, proximal or subjective features of the individual” (2010, 397). According to Burge, such contents are attributable just when perceptual constancies are exercised. “Perception requires perceptual constancies.” (399).

An alternative approach focuses on the “role”, as opposed to “content”, of perceptual states. Thus, Dretske (2006) proposes that the information which perceptual states carry must be directly available for the control and guidance of action. Similarly, Prinz (2015) stipulates that perception involves the transduction of “useable” sensory information. Content and role approaches are not exclusive. Milner and Goodale understand perception to “refer to a process which [subserves] … the recognition and identification of [external] objects and events and their spatial and temporal relations” (1995/2006, 2). Here both content and role requirements are in play.

In line with contemporary orthodoxy, all the authors just mentioned claim that perception howsoever defined occurs unconsciously. [See also Block (2016) and Block in Block and Phillips (2016).] Here, I discuss four cases commonly invoked in support of this contention. Thinking about whether perception is genuinely present in these cases demonstrates that matters are much less clear cut than standardly supposed.

Conclusion

Proponents of unconscious perception face the challenge of providing an adequately justified operational definition of individual-level perception. Assessed in the light of extant proposals, many apparently clear cases of unconscious perception no longer appear so clear cut. Moreover, an obvious concern lurks in wait. One possible operational test for perception (closely associated with Dretske’s role-based proposal above) requires that the information carried by a perceptual state must be exploitable by a subject to make a discriminatory response. Yet this test for perception is equivalent to so-called “objective” measures of consciousness (i.e., above chance discriminative sensitivity) (Green and Swets 1966). As a result, no putative cases of unconscious perception can hope to avoid the familiar concern that they simply involve weak conscious awareness unreported due to a conservative response criterion (Eriksen 1960; Holender 1986; Phillips 2016; Peters, this symposium).

Monday, September 13, 2021

Pseudogenes Aren’t Nonfunctional Relics that Refute Intelligent Design

Casey Luskin

September 9, 2021, 6:36 AM

https://evolutionnews.org/2021/09/pseudogenes-arent-nonfunctional-relics-that-refute-intelligent-design/

[[If you need an example of the tone and accuracy of the debate about evolution, here is a good example. I highly recommend Dr. Luskin’s work in general.]]

We’ve been discussing a video in which Richard Dawkins claims that the evidence for common ancestry refutes intelligent design (see here, here, and here). We first saw that contrary to Dawkins, the genetic data does not yield “a perfect hierarchy” or “perfect family tree.” Then we saw that a treelike data structure does not necessarily refute intelligent design. But Dawkins isn’t done. At the end of his answer in the video, Dawkins raises the issue of “pseudogenes,” which he claims “don’t do anything but are vestigial relicts of genes that once did something.” Dawkins says elsewhere that pseudogenes “are never transcribed or translated. They might as well not exist, as far as the animal’s welfare is concerned.” These claims represent a classic but false “junk DNA” argument against intelligent design.

Functions of Pseudogenes

Pseudogenes can yield functional RNA transcripts, functional proteins, or perform a function without producing any transcript. A 2012 paper in Science Signaling noted that although “pseudogenes have long been dismissed as junk DNA,” recent advances have established that “the DNA of a pseudogene, the RNA transcribed from a pseudogene, or the protein translated from a pseudogene can have multiple, diverse functions and that these functions can affect not only their parental genes but also unrelated genes.” The paper concludes that “pseudogenes have emerged as a previously unappreciated class of sophisticated modulators of gene expression.”

A 2011 paper in the journal RNA concurs:

Pseudogenes have long been labeled as ‘junk’ DNA, failed copies of genes that arise during the evolution of genomes. However, recent results are challenging this moniker; indeed, some pseudogenes appear to harbor the potential to regulate their protein-coding cousins.

Likewise, a 2012 paper in RNA Biology states that “pseudogenes were long considered as junk genomic DNA” but “pseudogene regulation is widespread in eukaryotes.” Because pseudogenes may only function in specific tissues and/or only during particular stages of development, their true functions may be difficult to detect. The RNA Biology paper concludes that “the study of functional pseudogenes is just at the beginning” and predicts “more and more functional pseudogenes will be discovered as novel biological technologies are developed in the future.”

When we do carefully study pseudogenes, we often find function. One paper in Annual Review of Genetics observed: “pseudogenes that have been suitably investigated often exhibit functional roles.” A 2020 paper in Nature Reviews Genetics cautioned that pseudogene function is “Prematurely Dismissed” due to “dogma.” The paper cautions that there are many instances where DNA that was dismissed as pseudogene junk was later found to be functional: “with a growing number of instances of pseudogene-annotated regions later found to exhibit biological function, there is an emerging risk that these regions of the genome are prematurely dismissed as pseudogenic and therefore regarded as void of function.” Indeed, the literature is full of papers reporting function in what have been wrongly labeled “pseudogenes.”

Fingers in Ears?

At the end of the video, Dawkins says: “I find it extremely hard to imagine how any creationist who actually bothered to listen to that could possibly doubt the fact of evolution. But they don’t listen…they simply stick their fingers in their ear and say la la la.” It’s safe to say that Dawkins was wrong about many things in this video, but I’m not here to make any accusations about fingers and ears. I will say that the best resolution to these kinds of questions is to listen to the data, keep an open mind, and think critically. When we’re willing to do this, a lot of exciting new scientific possibilities open up: ones that don’t necessarily include traditional neo-Darwinian views of common ancestry or a “perfect hierarchy” in the tree of life, and ones that readily point toward intelligent design.

Rabbi Dr. Dovid Gottlieb
