Beyond Myopic Assessments: An Accessible Yet Incisive AI Critique Anchored in Technical Foundations
Mitchell, M. (2019). Artificial Intelligence: A Guide for Thinking Humans
(1st ed.). Farrar, Straus and Giroux.
General Overview of the Book
- The book traces the history of AI through symbolic approaches, neural
networks, machine learning, and deep learning. It discusses strengths and
limitations of different techniques over time.
- A key thesis is that current AI systems, despite impressive capabilities on
narrow tasks, lack true understanding and meaning that even young children
display through intuitive physics, psychology etc.
- Understanding is linked to forming explanatory mental models, running
simulations about likely outcomes, making predictions and generalizations -
things lacking in today's AI.
- Abstraction, analogies, creativity, commonsense and metacognition remain
extremely hard for AI systems and central to general intelligence displayed by
humans.
- The book makes the case that today's AI may be more fragile, unreliable and
opaque than commonly portrayed - facing issues like bias, adversarial attacks,
sensitivity to edge cases etc.
- Claims of near human-level performance on narrow tasks underestimate the
limitations of today's systems when dealing with nuanced real-world situations.
- There are uncertainties about the potential societal impacts of AI in domains
like jobs, privacy violations through technologies like face-recognition etc.
- While AI promises major benefits, we need to thoughtfully address risks
around over-trusting unreliable autonomy and use for malicious goals.
- The quest for AI has also deepened our appreciation of the staggering
complexity behind non-conscious aspects of our cognition.
- We are still in early days of understanding intelligence - which reflects
embodied sensorimotor processes, embedded emotions, social cooperation over
evolution.
- General human-level intelligence needs common sense at the level of a
5-year-old child - still out of reach for modern algorithms and computational
paradigms.
- Insights from other fields like neuroscience, psychology, cognitive
linguistics, and philosophy will be needed to make progress on key hurdles
related to meaning and understanding.
Book Review
The book “Artificial Intelligence: A Guide for
Thinking Humans” provides a broad overview of the history of AI along with an
insightful critique situating the promises and pitfalls of present advances.
Published in 2019, it serves as an approachable yet intellectually grounded
reference for the interested layperson while unpacking misconceptions regarding
achievements in narrow AI. The author Melanie Mitchell brings an authoritative
lens as a computer scientist and professor specializing in machine learning and
complex systems. Its overall objective is to illustrate the enduring
challenges around meaning and generalization that persist despite AI’s towering
ambitions.
The book charts AI’s trajectory from early symbolic
logic-based approaches focusing on human-encoded rules, knowledge, and
reasoning to the contemporary dominance of data-driven deep neural networks. A
persistent theme contrasts the alluring narratives about AI with the reality
that its innovations remain bounded to specific functional capabilities. For
instance, Part I highlights cycles of optimism about seamless problem-solving
giving way to the constraints of brittleness and inflexibility, periods
characterized as “AI Winters” (Mitchell, 2019, p. 32). Breakthroughs on tasks
like game-playing and computer vision have not settled whether further scale
alone can close fundamental gaps.
Parts II and III detail, respectively, how convolutional neural networks
leveraged big data and modern hardware to advance computer vision, and how
reinforcement-learning game-playing systems like AlphaGo reached superhuman
levels on narrow metrics. However, the critiques resurface regarding the lack
of transparency and explainability and the tendency toward unpredictable,
unreliable behavior, which reveal limitations in meaning and conceptual depth.
Parts IV and V extend assessments to natural language domains involving
promising applications from real-time speech transcription to automated
translation while underscoring enduring fragilities. Simple adversarial attacks
expose brittleness, despite claims regarding human parity on focused
benchmarks. Throughout, the book foregrounds profound differences from flexible
human cognition and intelligence tightly integrated across modalities.
A salient strength lies in effectively using deep
technical insights from the AI sub-fields discussed to inform a balanced, big
picture perspective. This lifts the discourse beyond reactionary forecasts of
dystopia or utopia to substantiate measured analysis. Concrete examples provide
essential grounding, such as failures in edge cases that are individually rare
but become cumulatively probable under ubiquitous deployment. Even
remarkable accomplishments on games like Go are situated as programming tours
de force rather than signs that the AI field is unlocking general
intelligence. The text further helps map interdisciplinary connections between
AI and philosophy of mind, cognitive science, linguistics and other fields to
position enduring challenges requiring transformative conceptual
innovation.
The breadth of coverage also incurs significant
trade-offs. Several specialized topics like evolutionary algorithms, logical AI
and expert systems receive relatively cursory attention, for instance. While
substantial advances in deep learning and big data catalyze present excitement
and funding for AI, they form but particular strands in a vast, complex domain
filled with alternative paradigms, ideologies, progress markers and debates on
intelligence itself. Parts I and V partially tackle this multiplicity but leave
open substantial room for expansion in a future edition. From a pedagogical
lens, more explicit graphical overviews summarizing relationships between
different concepts could aid comprehension for non-expert readers navigating
the intersections of technical terminology across areas. Nonetheless, within
its intended aim of orienting thinking humans rather than producing an AI
textbook, the work achieves admirable depth alongside accessibility.
By naming humans, rather than machines or technical
practitioners, as its nominal target readership, the book underscores a crucial
point: AI consists of technologies actively developed to amplify human
capabilities and progress. Re-centering ethical considerations surrounding
accountability, transparency, and bias thus becomes imperative rather than
peripheral. Policymakers shaping sociotechnical ecosystems for maximizing
benefits and minimizing harm represent one audience that could immensely gain from
deeper engagement with these issues. Lay readers as voting citizens in
democracies ultimately mobilize collective priorities through their
participation, awareness, or acquiescence. The text proves sufficiently
multidimensional to offer each group layered insights tailored to their
inclinations on social, pragmatic or purely intellectual dimensions without
alienating or overwhelming.
In classrooms, it could enrich technology-focused
computer science curricula with philosophical perspectives and stand as a
complement to more technically specialized AI courses. The starting points
provided to delve deeper into different areas can support students undertaking
integrative projects or researchers from adjacent disciplines seeking
foundational references. As AI increasingly permeates everyday domains from
finance to healthcare, such cross-cutting contributions help anchor elite
technical advancements in societal considerations of robustness, safety, and
ethics. Rather than leaving capabilities to reflect only institutional
priorities or access to resources, the book compellingly invites more
participative co-shaping of emerging realities.
Overall, “Artificial Intelligence: A Guide for
Thinking Humans” proves an erudite yet accessible distillation of the
frontiers, fault-lines, and future trajectories for AI, anchored in the
author’s multi-decadal immersion at the leading edge. It foregrounds persistent
lacunae posed by tasks intrinsically requiring forms of flexible understanding,
abstraction and common sense that allow humans to smoothly navigate pervasive
uncertainty. While recognizing remarkable contemporary achievements within
circumscribed domains, the analysis suggests genuinely mimicking multifaceted,
context-responsive aspects of cognition remains an open front. The closing
reflections affirm how the unfinished journey continues illuminating
intricacies of biological and machine intelligence alike through intertwined
endeavors of science and engineering. By substantiating such nuances through
historical and technical grounding of concepts for wider readerships, the work
makes valuable contributions to furthering informed, ethical, and wise
co-evolution of humanity alongside its increasingly pervasive algorithmic
creations over the 21st century and beyond.
Chapter Summaries
Part I: Background
Part I offers a historical backdrop situating the
evolution of AI from early symbolic approaches based on rules and logic in the
1950s to the rise of machine learning and neural networks. It traces pioneering
efforts ranging from the General Problem Solver to perceptrons and expert systems
along with their limitations. The current dominance of deep learning is
discussed as the latest "AI spring", powered by big data and modern
hardware. However, fundamental distinctions remain between narrow AI focused on
specific tasks and elusive general intelligence. Despite impressive
achievements, deep learning also faces issues related to brittleness, bias and
a lack of transparency. Thus, the book makes a case for not equating AI solely
with deep learning or overestimating such data-driven techniques. The
introductory section sets the stage for discussing core questions around
machines exhibiting meaningful understanding.
Chapter 1: The Roots of Artificial Intelligence (AI)
- Intelligence Complexity:
Intelligence is not singular but multidimensional, encompassing various aspects
like emotional, verbal, spatial, logical, artistic, and social intelligence.
- Deep Learning as Dominant Paradigm:
Deep learning, a subset of machine learning, has become synonymous with AI in
popular media, despite AI encompassing a broader range of approaches.
- Symbolic AI:
Symbolic AI uses human-understandable symbols and rules for processing to
perform tasks, dominating AI research initially through approaches like expert
systems.
- Subsymbolic AI and Perceptrons:
Subsymbolic AI, inspired by neuroscience, focuses on learning from data and
includes technologies like perceptrons, which are basic neural networks
inspired by brain neurons.
- Supervised Learning:
Supervised learning involves training systems with labeled examples to learn
from data, requiring both a training set for learning and a test set for
evaluation.
- Challenges and AI Winter:
The initial overpromising and subsequent failures in AI led to periods of
reduced funding and interest, known as "AI winters."
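As a rough illustration of the perceptron and supervised-learning points above, the sketch below trains a single perceptron on labeled examples. The toy data (the logical AND function), the learning rate, and all function names are assumptions chosen for illustration; none of them come from the book. With only four possible inputs, the same examples serve for training and checking, whereas a realistic setup would hold out a separate test set.

```python
# Toy labeled data for logical AND: ((x1, x2), label). Illustrative only.
DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def predict(weights, bias, inputs):
    """Fire (output 1) only if the weighted sum of inputs exceeds zero."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(data, epochs=10, learning_rate=1):
    """Classic perceptron rule: nudge weights only when a prediction is wrong."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, label in data:
            error = label - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


weights, bias = train_perceptron(DATA)
print([predict(weights, bias, x) for x, _ in DATA])  # prints [0, 0, 0, 1]
```

AND is linearly separable, so a single perceptron can learn it; a function like XOR is not, which is one of the limitations that motivated multilayer networks.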
Chapter 2: Neural Networks and the Ascent of Machine Learning (ML)
- Neural Networks Structure:
Neural networks consist of layers of simulated neurons, with multilayer
networks (deep networks) capable of recognizing more abstract features.
- Back-propagation:
A crucial algorithm for training neural networks, back-propagation adjusts
weights within the network to minimize output errors across training examples.
- Limitations of Symbolic AI:
Symbolic AI, while adept at tasks requiring explicit knowledge and reasoning,
proved brittle and unable to generalize well beyond specific scenarios.
- Connectionism:
Emphasizes the importance of computational architecture inspired by the brain
and the system's ability to learn from data or experience.
- Debate Over Symbolic vs. Subsymbolic Approaches: The field has oscillated between
preferences for symbolic (rule-based) and subsymbolic (data-driven) approaches,
reflecting ongoing debates about the best path toward artificial intelligence.
- Machine Learning's Rise:
Machine learning emerged as a distinct subdiscipline, focusing on learning from
data and rejecting the premises of symbolic AI, sometimes referred to as
"good old-fashioned AI" (GOFAI).
Chapter 3: AI Spring
- Google's AI Experiment:
In 2012, Google's AI team developed a multilayer neural network with over a
billion weights, trained on YouTube videos, that could recognize cats, marking
a significant achievement in deep learning and capturing public attention.
- Historical AI Milestones:
IBM's Deep Blue and Watson were pivotal in demonstrating AI's capabilities,
defeating the world chess champion and winning Jeopardy!, respectively,
showcasing the progression towards more advanced AI systems.
- Narrow vs. General AI:
Current AI advancements, despite being impressive, remain examples of
"narrow" or "weak" AI, specialized in specific tasks,
contrasting with the concept of "general" or "strong" AI,
which remains a distant goal.
- Human-Level AI Debate:
The AI research community debates the criteria for achieving human-level AI,
including whether such a system requires consciousness or the ability to think
and understand like humans.
- Philosophical Views on Machine Thought: The possibility of machines truly
thinking has been contested, with figures like Alan Turing advocating for the
potential of machine thought, while others like John Searle argue against the
possibility of machines having minds.
- The Singularity Concept:
Ray Kurzweil's Singularity predicts a future where technological change, led by
AI surpassing human intelligence, will fundamentally transform human life, a
vision supported by the exponential progress in computing power.
- Moore's Law and AI Progress:
Moore's law, noting the doubling of computer chip components roughly every two
years, underpins predictions of exponential growth in computing capabilities,
potentially leading to human-level AI through reverse engineering the brain.
- Reverse Engineering the Brain:
Kurzweil argues that advancements in computation, neuroscience, and
nanotechnology will enable the reverse engineering of the brain, allowing AI to
rapidly expand its knowledge and skills by accessing human literature and the
internet.
- Underlying Human Intelligence Abilities: To achieve human-level AI, a deeper
understanding of human intelligence, including perception, language,
decision-making, common sense reasoning, and learning, is necessary.
Part II: Looking and Seeing
Part II dives into the domain of computer vision,
which involves subtleties that challenged AI for decades before recent
breakthroughs using deep learning. It focuses specifically on convolutional
neural networks (ConvNets) and how they drew inspiration from the hierarchical
organization of the visual cortex. These models now power many real-world
applications from image classification to self-driving cars by learning from massive,
labeled datasets. However, the book also highlights their limitations - lack of
human-like robustness and vulnerabilities to adversarial examples. Current
systems recognize objects remarkably well, but scene understanding involving
relationships between entities remains poor. The critiques aim to inject nuance
into claims of human parity, while tracing conceptual gaps around explaining
decisions or imagining counterfactuals. Core issues revolve around
embodied knowledge and meaning, which evade even the highest-performing models
on narrow metrics.
Chapter 4: Who, What, When, Where, Why
- Complexity of Vision:
Vision, encompassing both the act of looking and the process of seeing and
understanding, is highlighted as a significant challenge in AI, with computer
vision grappling with numerous difficulties since the 1950s.
- Deep Learning Revolution:
Advances in deep learning, particularly through training deep neural networks
with multiple hidden layers, have dramatically improved machines' ability to
recognize objects in images and videos in the 2010s.
- Inspiration from Biological Vision:
The architecture of convolutional neural networks (ConvNets), a cornerstone of
modern computer vision, is inspired by the hierarchical organization of the
visual system in cats and primates, as discovered by Nobel laureates David
Hubel and Torsten Wiesel.
- ConvNets and the Brain's Visual System: ConvNets mimic the brain's visual processing with
layers of simulated neurons that detect increasingly complex features through a
hierarchical structure, starting from simple edges in lower layers to more
complex patterns in higher layers.
- Convolutional Process:
ConvNets perform convolutions, a process where each value in a neuron's
receptive field is multiplied by corresponding weights and summed, to create
activation maps for specific visual features across the image.
- Hierarchical Feature Detection:
ConvNets learn to detect features hierarchically, with detectors in higher
layers sensitive to more complex features, mirroring the brain's visual
system's processing from simple to complex stimuli.
- Classification in ConvNets:
ConvNets classify images by transforming them into a set of activation maps,
which are then fed into a traditional neural network that outputs confidence
percentages for known object categories.
- Training ConvNets:
ConvNets are trained using the back-propagation algorithm, learning from
labeled examples to detect features at each layer and adjust weights in the
classification module for accurate object recognition.
- Epochs in Training:
Training involves multiple epochs, where the network processes each image
repeatedly, gradually improving at the task until it converges on a set of
weights that allows for accurate recognition of objects like dogs and cats.
- Learning Feature Detectors:
Despite not being programmed to detect specific features, ConvNets trained on
real-world photographs naturally learn a hierarchy of feature detectors akin to
those found in the brain's visual system.
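The convolutional process summarized in this chapter can be sketched in a few lines. The example below slides a small filter over an image, multiplies each value in the receptive field by the matching weight, and sums the results into an activation map. The 3x4 image, the hand-written vertical-edge filter, and the function name are assumptions for illustration; in a real ConvNet the filter weights are learned by back-propagation rather than written by hand. As in most ConvNet code, the filter is applied without flipping it.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the activation map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    activation_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Multiply each value in the receptive field by its weight and sum.
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        activation_map.append(row)
    return activation_map


# A tiny image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

print(convolve2d(image, vertical_edge))  # prints [[0, 2, 0], [0, 2, 0]]
```

The activation map is highest exactly where the filter's receptive field straddles the edge, which is the sense in which a filter "detects" a feature at a location.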
Chapter 5: ConvNets and ImageNet
- LeNet's Success:
Yann LeCun's early work on ConvNets led to the development of LeNet, which was
successfully applied in practical applications like zip code recognition for
the USPS and digit reading on checks.
- ImageNet's Role:
The creation of the ImageNet dataset, spearheaded by Fei-Fei Li, provided a
vast amount of labeled images that were crucial for training and advancing
ConvNet technologies.
- ImageNet Challenge:
The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) significantly
pushed the field forward, with ConvNets demonstrating superiority over other
algorithms in 2012.
- Importance of Data and Hardware:
The success of deep learning has been attributed to the availability of large
datasets like ImageNet and advances in computing hardware, particularly GPUs.
- Applications of ConvNets:
With the training made possible by these datasets and hardware, ConvNets have
been deployed in various applications, including mobile apps for real-time
object and face recognition.
- Human vs. Machine Recognition:
While ConvNets have made impressive strides in object recognition, comparisons
to human capabilities should be approached cautiously due to differences in
learning processes and the robustness of the learned recognition.
- Beyond Object Recognition:
True visual intelligence in machines would require understanding relationships
between objects, their interactions, and contexts—areas where human vision and
general intelligence intertwine.
Chapter 6: A Closer Look at Machines That Learn
- Learning Process of ConvNets:
Contrary to popular belief, the learning process of ConvNets is not humanlike.
Children can learn from a few examples, whereas ConvNets need many labeled
examples and extensive human effort in setup and tuning.
- Human Effort in ConvNet Learning:
ConvNets require a significant amount of human intervention to set up,
including choosing hyperparameters and designing the network's architecture.
- Big Data and AI:
The success of deep learning hinges on big data. Tech companies gather vast
amounts of data from users, which is crucial for training AI programs.
- The Long Tail Problem:
Supervised learning faces challenges with the long tail of low probability,
unexpected situations, highlighting the limitations of relying solely on
labeled data for AI training.
- Common Sense and AI:
AI systems lack the common sense that humans use subconsciously, a significant
hurdle to achieving reliable and fully autonomous AI in complex real-world
scenarios.
- Bias in AI:
AI systems can perpetuate societal biases present in training data, leading to
errors and inaccuracies, particularly in face-recognition systems.
- Explainable AI:
There's a growing demand for AI systems, especially deep neural networks, to
explain their decisions in human-understandable terms, a field known as
explainable AI.
- Vulnerability to Attacks:
Deep neural networks can be easily fooled by subtly modified inputs, raising
concerns about their trustworthiness and the fundamental nature of what they
learn.
- AI's Understanding Problem:
Despite the success of AI in object recognition, the lack of deep understanding
compared to human cognition makes AI systems fragile and prone to unexpected
failures.
Chapter 7: On Trustworthy and Ethical AI
- Potential of Self-Driving Cars:
Machine learning, particularly deep learning, is critical for the success of
self-driving cars, especially in computer vision and decision-making. These
cars could significantly reduce auto accidents, improve energy efficiency, and
provide mobility for those unable to drive, contingent on public trust.
- Benefits of AI:
AI technologies already contribute positively to society, offering services
such as speech transcription, GPS navigation, email spam filtering, and more.
The potential for AI to take over undesirable jobs could greatly enhance human
well-being.
- The Great AI Trade-Off:
The widespread integration of AI into devices, akin to electricity, presents a
dilemma due to AI's unpredictable behaviors, biases, and vulnerabilities. The
debate centers on whether the benefits of AI technologies outweigh the risks
associated with their implementation.
- Ethical and Privacy Concerns:
The accuracy of AI, especially in applications like face recognition, while
beneficial, also raises concerns regarding privacy and the potential for
misuse. Reliability remains a significant issue due to the potential for errors
in recognition systems.
- Regulation of AI:
There's a consensus among AI practitioners on the need for regulation, but the
responsibility shouldn't rest solely with researchers and companies. Addressing
AI's ethical, social, and political challenges requires a diverse and inclusive
dialogue. Efforts are underway at various levels, but there's no consensus on
priorities for regulation and ethics.
- Moral Decision-Making by Machines:
The discussion around imbuing machines with the ability to make ethical
decisions autonomously is ongoing. While some suggest machines should learn
moral values through observation of human behavior, this approach carries the
inherent limitations of machine learning. The ultimate goal is to develop
machines that genuinely understand the contexts of their actions.
Part III: Learning to Play
Part III expands the purview to reinforcement learning
and its remarkable recent success in game-playing domains through algorithms
like deep Q-learning. It traces innovations from AlphaGo to AlphaZero that beat
human world champions at chess and Go via self-play to accumulate knowledge.
However, despite the surface appearance of intuition and creativity, these
systems lack transfer learning abilities that allow humans to seamlessly
generalize across tasks. Each game needs training from scratch, revealing limitations
around abstraction and common sense. The real world, unlike games, lacks
cleanly defined states and rewards. Issues like long-tail risk events become
debilitating in open environments. Thus, inflated expectations that
human-level intelligence is imminent run into the obstacle of meaning.
Without fundamental progress on standing challenges related to explanation,
representation and generalization, impressive game-playing prowess has
uncertain applicability to messy practical settings. Significant breakthroughs
are still needed.
Chapter 8: Rewards for Robots
- Reinforcement Learning (RL):
Inspired by operant conditioning in psychology, RL is a machine-learning
approach where an agent learns from actions in an environment through rewards
without labeled examples. It contrasts with supervised learning by not
requiring pre-labeled data.
- Historical Context:
While RL has been a part of AI for decades, it gained significant attention
with its application in developing a program that outperformed humans in Go in
2016, highlighting its potential beyond traditional neural networks and
supervised methods.
- Learning Mechanism:
In RL, the agent learns optimal actions through trial and error, guided by
rewards from the environment, aiming to maximize long-term benefits. This
process involves understanding the value of actions in given states to predict
future rewards.
- Q-Learning:
A specific method within RL where a Q-table tracks all possible states and
actions, allowing the agent to learn action values over time. Q-learning
focuses on updating these values to improve task performance.
- Exploration vs. Exploitation:
A key challenge in RL is balancing between exploring new actions (exploration)
and optimizing known actions for rewards (exploitation). Achieving the right
balance is critical for effective learning.
- Application Challenges:
Real-world applications of RL face obstacles, including defining a manageable
set of states in complex environments like driving, which often leads to using
neural networks instead of traditional Q-tables to generalize across states.
- Use of Simulations:
Due to the impracticality of real-world training for complex tasks, RL often
relies on simulations to train agents. However, transferring learned behaviors
from simulations to real-world situations remains a significant challenge.
- Domain Successes:
RL's most notable successes have been in simulated environments where variables
are controllable and predictable, rather than direct applications in
unpredictable real-world settings.
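To make the Q-table and the exploration-versus-exploitation trade-off described above more concrete, here is a minimal sketch of tabular Q-learning. The environment (a five-cell corridor with a reward only at the right end), the hyperparameter values, and every name in the code are assumptions made for illustration, not details from the book.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (assumed toy world)
ACTIONS = (-1, +1)    # move left or move right
GOAL = N_STATES - 1


def step(state, action):
    """Move within the corridor; reward 1.0 only on reaching the goal."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q_table[state])
    best_actions = [a for a, q in enumerate(q_table[state]) if q == best]
    return rng.choice(best_actions)  # break ties at random


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q_table = [[0.0, 0.0] for _ in range(N_STATES)]  # one row per state
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            a = choose_action(q_table, state, epsilon, rng)
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action.
            target = reward + gamma * max(q_table[next_state])
            q_table[state][a] += alpha * (target - q_table[state][a])
            state = next_state
    return q_table


def greedy_policy(q_table):
    """Best-valued action for each non-goal state."""
    return [ACTIONS[row.index(max(row))] for row in q_table[:GOAL]]


print(greedy_policy(train()))  # prints [1, 1, 1, 1], i.e. always move right
```

Five states fit comfortably in a table. The chapter's point about application challenges is that realistic tasks have far too many states for this, which is why a neural network is used in place of the table.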
Chapter 9: Game On
- Historical Fascination with Games:
AI pioneers like Alan Turing and Claude Shannon wrote chess-playing programs in
the 1940s. Demis Hassabis founded DeepMind Technologies in 2010, aiming to
build brain-inspired AI, focusing initially on mastering Atari video games
through reinforcement learning.
- Deep Q-Learning:
DeepMind's approach, deep Q-learning, merges Q-learning with convolutional
neural networks (ConvNets), replacing the traditional Q-table with a ConvNet to
adapt to the complexity of video games like Breakout.
- Temporal Difference Learning:
This method updates the neural network's weights to reduce discrepancies
between predictions in sequential iterations, enabling the system to improve
its performance without human-labeled data, relying on its estimation of
rewards.
- Checkers and Chess as Milestones:
Early AI efforts in games include Arthur Samuel's checker-playing program and
IBM's Deep Blue, which defeated world chess champion Garry Kasparov in 1997.
These programs utilized game tree searches and were foundational in advancing
AI gaming strategies.
- AlphaGo's Evolution:
AlphaGo, developed by DeepMind, marked a significant milestone by beating Lee
Sedol in Go using deep Q-learning and Monte Carlo tree search. AlphaGo Zero, an
advanced version, started with no prior Go knowledge and learned through
self-play, demonstrating superior performance.
- Monte Carlo Tree Search:
A critical component of AlphaGo, this algorithm uses randomness to solve
complex problems, simulating numerous game scenarios to statistically determine
the best moves.
- Significance:
The progression from traditional game-playing AI to systems like AlphaGo
reflects significant advancements in reinforcement learning, showcasing AI's
growing capability to tackle problems with emergent complexity and strategic
depth.
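The idea of using randomness to evaluate moves statistically can be sketched with plain Monte Carlo rollouts. This is a deliberately simplified stand-in and not the full Monte Carlo tree search used in AlphaGo: there is no search tree and no neural network guiding the simulations. The game (a small version of Nim in which players remove 1 to 3 sticks and whoever takes the last stick wins), the rollout count, and all names are assumptions for illustration.

```python
import random


def random_playout(sticks, my_turn, rng):
    """Play random moves to the end; return True if 'we' take the last stick."""
    while True:
        take = rng.randint(1, min(3, sticks))
        sticks -= take
        if sticks == 0:
            return my_turn
        my_turn = not my_turn


def best_move(sticks, rollouts=2000, seed=0):
    """Estimate each legal move's win rate by random playouts; pick the highest."""
    rng = random.Random(seed)
    win_rates = {}
    for take in range(1, min(3, sticks) + 1):
        remaining = sticks - take
        if remaining == 0:
            win_rates[take] = 1.0  # taking the last stick wins outright
            continue
        # After our move it is the opponent's turn, hence my_turn=False.
        wins = sum(random_playout(remaining, False, rng) for _ in range(rollouts))
        win_rates[take] = wins / rollouts
    return max(win_rates, key=win_rates.get), win_rates


move, rates = best_move(5)
print(move)   # prints 1: leaving four sticks is the strongest reply
print(rates)  # estimated win rate for each candidate move
```

Full Monte Carlo tree search improves on this by concentrating simulations on the most promising branches instead of sampling every move equally.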
Chapter 10: Beyond Games
- Reinforcement Learning's Rise:
Reinforcement learning (RL) has evolved from an obscure branch to a central,
exciting approach in AI, particularly notable for its achievements in gaming
domains.
- Generality and Transfer Learning:
Current AI systems, including those excelling in games through deep Q-learning,
lack the ability to transfer knowledge from one task to another, a capability
inherent to human learning. This limitation highlights a significant gap
between AI and human intelligence in terms of abstraction, domain
generalization, and the flexible application of learned knowledge.
- The Promise of Autonomous Learning:
While RL suggests a pathway toward systems that can learn independently by
interacting with their environments, these achievements are confined to
specific tasks. The systems do not demonstrate humanlike understanding or the
ability to generalize across domains.
- Vulnerability to Adversarial Examples:
Like supervised learning systems, RL-based systems are susceptible to
adversarial attacks, suggesting a fundamental difference in how these AI
systems and humans conceptualize and understand their environments.
- Limitations in Applying Game Learning to Real World: The success of methods used in AlphaGo
and other game-playing AI does not readily translate to broader, real-world
applications due to the lack of transfer learning. Each new task requires the
system to start learning from scratch, contrasting with human intelligence's
flexible and generalizable nature.
- Challenges in Real-World Application:
Extending the success of RL from games to real-world situations faces
significant obstacles, including the complexity and unpredictability of
real-world environments. Despite its potential, RL's application outside of
controlled settings remains a challenging frontier, emphasizing the need for
advancements in transfer learning and domain generalization.
Part IV: AI Meets Natural Language
Part IV explores exciting progress on natural language
processing - from speech recognition to machine translation - catalyzed by the
rise of deep learning. Powerful encoding-decoding neural network architectures
can now transcribe spoken audio or translate between languages with remarkable
accuracy on focused metrics. However, fundamental gaps persist in aspects like
true comprehension or nuanced expression. Claims of human parity on
narrow tasks underestimate systemic vulnerabilities to carefully designed
adversarial attacks, which reveal brittleness. In addition, narrow evaluations
sidestep the challenges of realistically complex dialogue, passage-level
inference, and symbolic understanding of meaning. Core limitations revolve around contrived
training regimes that eschew grounded, commonsense, or social knowledge. The
gulf in generalizable, trustworthy intelligence persists despite circumscribed
successes on pattern recognition problems. Addressing it involves infusing
greater capacity for explanation, abstraction, and modeling ambiguity.
Chapter 11: Words, and the Company They Keep
- Scope of NLP:
Natural Language Processing (NLP) encompasses a wide range of applications,
including speech recognition, web search, automated question answering, and
machine translation, with deep learning propelling recent advancements.
- Challenges in NLP:
The complexity of human language, characterized by its ambiguity,
context-dependence, and reliance on shared background knowledge, has
historically posed significant challenges to AI, rendering rule-based
approaches insufficient.
- Speech Recognition:
Deep learning has notably advanced speech recognition, enabling near-perfect
transcription under some conditions without understanding the speech's meaning,
though dealing with ambiguity and context sensitivity remains a complex task.
- Sentiment Analysis:
Automated systems strive to classify the sentiment of texts, a task that
requires understanding the semantic context of words within sentences, not just
isolated word analysis. Deep learning networks have been applied to sentiment
analysis, learning from examples labeled with sentiments.
- Recurrent Neural Networks (RNNs):
Unlike ConvNets used in image classification, RNNs are designed to handle
sequences, such as sentences, by processing words over time steps, maintaining
context through recurrent connections among hidden units.
- Encoding Words as Numbers:
The challenge of representing words for neural network inputs has led to the
development of schemes like one-hot encoding and more sophisticated methods
like word2vec, which captures semantic relationships by representing words as
vectors in a geometric space.
- Semantic Space and Word Vectors:
Word2vec and similar techniques represent words in a semantic space, capturing
meanings based on the company words keep. This approach has become fundamental
in NLP, enabling systems to process words in a manner that reflects their
semantic relationships.
- Biases in Word Vectors:
Research has shown that word vectors can inadvertently capture societal biases
present in the language data they are trained on, reflecting existing
prejudices in language use.
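The contrast between one-hot encoding and vectors built from "the company words keep" can be illustrated with a minimal sketch. This is not word2vec: it uses raw co-occurrence counts as a much simpler stand-in for learned word vectors. The five-sentence corpus is invented for illustration, and the names `CORPUS`, `one_hot`, `cooccurrence_vectors`, and `cosine` are hypothetical.

```python
from collections import Counter, defaultdict
import math

# Toy corpus, invented purely for illustration.
CORPUS = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the cat",
    "the car needs fuel",
]

def one_hot(word, vocab):
    """Return a vector with a single 1 at the word's position in vocab."""
    return [1 if w == word else 0 for w in vocab]

def cooccurrence_vectors(sentences, window=2):
    """For each word, count the words seen within `window` positions of it."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

vocab = sorted({w for s in CORPUS for w in s.split()})
vectors = cooccurrence_vectors(CORPUS)

# One-hot vectors of two different words never overlap, so their dot
# product is 0: the encoding carries no notion of similarity.
onehot_sim = sum(x * y for x, y in zip(one_hot("cat", vocab), one_hot("dog", vocab)))

# Context-based vectors place words with similar neighbours close together.
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_fuel = cosine(vectors["cat"], vectors["fuel"])

print(onehot_sim, round(sim_cat_dog, 3), sim_cat_fuel)
```

On this toy corpus, "cat" and "dog" share most of their neighbours and end up far more similar than "cat" and "fuel", which share none. The same mechanism explains the bias point above: whatever associations appear in the training text are copied into the vectors.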
Chapter 12: Translation as Encoding and Decoding
- Neural Machine Translation (NMT):
Google's launch of a neural machine translation system in 2016 marked a
significant advancement in machine translation, claiming substantial
improvements over previous methods. Despite these advancements, NMT still falls
short of the capabilities of skilled human translators.
- Early Efforts and Evolution:
Machine translation, spurred by the Cold War's need for English-Russian
translation tools, initially relied on rule-based approaches. These early
attempts struggled with the complexities of language, leading to brittle
systems. By the 1990s, statistical machine translation, leveraging large data
sets of sentence pairs for training, began to dominate, moving away from
rule-based methods.
- Google Translate's Transformation:
From its inception in 2006 until 2016, Google Translate utilized statistical
methods. The shift to neural machine translation represented a leap forward,
using deep learning techniques to improve translation quality.
- Encoder-Decoder Networks:
Modern NMT systems employ encoder-decoder networks, with the encoder using
recurrent neural networks (RNNs) to process input language sentences into a
compressed representation, and the decoder generating translated sentences from
this representation. Despite their sophistication, these systems struggle with
language's inherent complexities and ambiguities.
- Long Short-Term Memory (LSTM) Units:
Proposed by a research group in Switzerland in the late 1990s, LSTM units in RNNs address
the challenge of retaining information over long sequences, essential for
processing entire sentences and improving translation accuracy.
- Training NMT Systems:
State-of-the-art NMT systems are trained on massive datasets comprising
millions of human-translated sentence pairs, leveraging LSTM units within deep
recurrent neural networks to master the intricacies of language translation.
- Evaluating Machine Translation:
The evaluation of machine translation quality is complex, given the
multiplicity of correct translation possibilities. Automated methods like BLEU
and manual bilingual human evaluations are standard, though both approaches
have significant limitations. Critics argue that evaluations often overlook the
nuanced understanding required for accurate translation, focusing instead on
isolated sentences from relatively straightforward texts.
- Limitations of NMT:
Despite notable successes, NMT systems lack a true understanding of the texts
they process, leading to translations that may miss subtle nuances or
misinterpret complex expressions. This limitation underscores the gap between
current AI capabilities and the depth of human linguistic comprehension.
- Image to Sentence Translation:
Google's Show and Tell system exemplifies the extension of NMT principles to
visual data, encoding images and decoding them into descriptive sentences.
While promising, these systems, like their textual counterparts, are prone to
errors, highlighting the challenges in achieving reliable AI interpretation
across different media.
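To make the evaluation point concrete, here is a minimal sketch of the clipped n-gram precision that BLEU is built on. It is deliberately simplified and is not the full metric: real BLEU combines these precisions over several n-gram lengths and applies a brevity penalty. The sentences are invented, and the function names `ngrams` and `clipped_precision` are hypothetical.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(candidate, reference, n):
    """Fraction of candidate n-grams that also occur in the reference.

    Each n-gram's count is clipped to its count in the reference, so
    repeating a matching word cannot inflate the score.
    """
    cand_counts = Counter(ngrams(candidate.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return matched / total

reference = "the cat is on the mat"    # single human reference translation
literal = "the cat is on the mat"      # identical wording
paraphrase = "a cat sits on the rug"   # acceptable meaning, different words
degenerate = "the the the the the the" # nonsense that reuses a common word

p1_literal = clipped_precision(literal, reference, 1)        # 1.0
p1_paraphrase = clipped_precision(paraphrase, reference, 1)  # 0.5
p2_paraphrase = clipped_precision(paraphrase, reference, 2)  # 0.2
p1_degenerate = clipped_precision(degenerate, reference, 1)  # 2 of 6 match

print(p1_literal, p1_paraphrase, p2_paraphrase, round(p1_degenerate, 3))
```

The reasonable paraphrase scores only 0.5 on single words and 0.2 on word pairs because it is compared against one fixed reference wording. This shows in miniature why overlap-based scores struggle with the many valid ways to translate a sentence.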
Chapter 13: Ask Me Anything
- Virtual Assistants' Limitations:
Current AI-powered virtual assistants like Siri, Alexa, Cortana, and Google
Now, despite being able to transcribe speech and respond with smooth voices, do
not truly understand the meaning of the queries posed to them. This gap
highlights the challenge of achieving genuine language comprehension in AI
systems.
- IBM's Watson:
Watson's victory on Jeopardy! in 2011 showcased the potential for AI in
understanding and responding to complex language queries. However, the AI
community remains divided over whether Watson represents a genuine advancement
in AI or if it's more of a sophisticated stunt, given its lack of true language
comprehension.
- SQuAD and Reading Comprehension:
The Stanford Question Answering Dataset (SQuAD) became a popular benchmark for
machine "reading comprehension." Despite its name, SQuAD primarily
tests an AI's ability to extract answers from a given text rather than
understand or reason about the content, highlighting the limitations of current
AI in true comprehension.
- Winograd Schemas:
Winograd schemas are designed to test language understanding by requiring the
resolution of pronoun references in sentences. These tests, which are
straightforward for humans, prove challenging for AI, illustrating the
difficulty of equipping AI with human-like language understanding.
- Adversarial Attacks on NLP Systems:
Similar to their counterparts in computer vision, NLP systems are susceptible
to adversarial examples. These attacks subtly alter texts in ways that do not
affect human interpretation but lead AI systems to produce incorrect answers,
exposing a significant vulnerability in current AI technologies.
- Common Sense and Language Understanding:
The author argues that true language
understanding in AI, encompassing tasks like translation and reading
comprehension, is unlikely without the AI possessing human-like common sense.
This perspective suggests that current approaches to AI, focused on learning
from data, are insufficient for achieving genuine language comprehension.
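The difference between extracting an answer and understanding a passage, and the way an adversarial edit exploits it, can be seen in a toy sketch. The code below is a naive word-overlap baseline written for illustration only. It is not how any actual SQuAD system works, and the passage, the question, and the function names `tokens` and `answer_sentence` are all hypothetical.

```python
import re

def tokens(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def answer_sentence(passage, question):
    """Return the passage sentence that shares the most words with the question."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]
    q = tokens(question)
    return max(sentences, key=lambda s: len(tokens(s) & q))

# Invented passage and question.
passage = "Ada was born in Lyon in 1990. She moved to Paris in 2015."
question = "In what city was Ada born?"

# A distractor sentence that does not change the correct answer for a
# human reader but overlaps heavily with the question's wording.
distractor = "The cousin of Ada was born in the city of Turin."
attacked = passage + " " + distractor

clean_answer = answer_sentence(passage, question)
attacked_answer = answer_sentence(attacked, question)

print(clean_answer)
print(attacked_answer)
```

On the clean passage the baseline picks the sentence about Lyon. After the distractor is appended it switches to the sentence about Turin, although a person would still answer "Lyon". Modern systems are far more sophisticated than this baseline, but the sketch shows the general failure mode: surface matching without a model of who the text is about.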
Part V: Barrier of Meaning
The concluding part ties together the book's overarching themes on
the enduring challenge of meaning and understanding in AI. It contrasts narrow
intelligence focused on specific tasks with general capability that displays
flexible abstraction, conceptualization, and common sense, as exhibited by even
young children. Understanding integrates mental simulations, predictions, and
causal explanations, which current systems lack, as their brittleness
reveals. It also builds on learning embedded in physical
and social realities rather than in artificial, contrived settings. Progress requires going
beyond static pattern recognition toward more humanlike, dynamic, and
interactive reasoning. Long-standing challenges include analogy, metaphor, intuition,
and self-supervised learning more broadly. The path ahead remains uncertain but
will likely involve building greater capacities for representation, explanation,
and generalization. The book ends with a sobering reminder that despite
towering ambitions, we are still in the foothills when it comes to
understanding intelligence itself.
Chapter 14: On Understanding
- The Gap in Understanding Between AI and Humans:
Despite AI's advancements, no system yet
possesses the deep, essential understanding that humans have, evident in AI's
un-humanlike errors, difficulty with abstraction and transfer learning, lack of
common sense, and vulnerability to adversarial attacks.
- Core Human Knowledge:
From infancy, humans acquire intuitive knowledge in physics, biology, and
psychology—fundamental understandings of how objects, living things, and social
interactions work. This core knowledge underpins cognitive development,
enabling humans to learn new concepts from minimal examples, generalize these
concepts, and make quick, sensible decisions.
- Mental Models and Predictions:
Humans use mental models—based on knowledge of physical laws, biological facts,
cause and effect, and human behavior—to simulate and predict outcomes in
various situations. These models allow for the understanding of concepts and
the anticipation of future events through mental simulation.
- Understanding as Simulation:
Lawrence Barsalou proposes that understanding situations involves
subconsciously performing mental simulations, even for abstract concepts,
through the simulation of specific situations where these concepts apply.
- The Role of Metaphors:
George Lakoff and Mark Johnson argue that our understanding of abstract
concepts is heavily based on metaphors derived from physical experiences. This
notion supports Barsalou's theory by highlighting how abstract thinking is
grounded in concrete physical knowledge.
- Physical Basis of Abstract Concepts:
Research suggests a link between physical experiences and abstract concepts,
such as the connection between physical and social warmth. This interaction
demonstrates the "strange loop" of consciousness, where symbolic and
physical levels influence each other.
- Abstraction and Analogy:
Human cognition relies on abstraction (the ability to recognize general
categories from specific instances) and analogy (the perception of common
essence between two things). These fundamental capabilities underlie the
construction of mental models, concept formation, and our understanding of the
world.
Chapter 15: Knowledge, Abstraction, and Analogy in Artificial Intelligence
- Early AI and the Cyc Project:
Before machine learning and neural networks came to dominate, AI research
focused on manually encoding rules and knowledge into systems, with Douglas
Lenat's Cyc project being the most ambitious attempt. Cyc aimed to capture
commonsense knowledge through a symbolic AI system of over fifteen million
logic-based assertions, though Lenat estimated this to be only 5% of the
knowledge required to achieve human-level intelligence.
- Subconscious Knowledge and AI:
The chapter discusses the challenge of encoding subconscious commonsense
knowledge into AI systems. This knowledge, gained in infancy and childhood,
forms the foundation of human concepts but is not explicitly understood or
easily articulated, making it difficult for AI to replicate.
- Limitations of Current AI Systems:
Despite advancements, current AI systems struggle with generalization beyond
narrow domains, abstraction, and understanding cause-and-effect relationships.
Their failures and vulnerabilities to adversarial attacks highlight a lack of
true understanding of concepts they are trained on.
- Abstraction and Analogy-Making Challenges:
The challenge of enabling machines to
form humanlike conceptual abstractions remains largely unsolved. Bongard's
puzzles from 1967 are presented as examples that require humanlike abstraction
and analogy-making abilities, which current AI systems are unable to replicate
in any general sense.
- Active Symbols and Analogy Making in AI:
The text discusses the subtlety of
"sameness" and analogy-making in AI, using examples from so-called
microworlds. These simplified domains are used to develop and test ideas, but
AI systems still struggle with identifying subtle similarities and differences.
- Metacognition and AI:
The chapter highlights metacognition (reflecting on one's own thinking) as an
essential aspect of human intelligence that is not present in current AI systems.
This lack prevents AI from recognizing and correcting its own errors or
ineffective problem-solving approaches.
- Recognizing Visual Situations:
The text points out the difficulty AI systems face in recognizing complex
visual situations involving multiple entities and their relationships, a task
that humans perform easily but remains a significant challenge for AI.
Chapter 16: Questions, Answers, and Speculations
- Self-driving Cars and Long-tail Situations:
Self-driving cars face significant challenges with long-tail situations, or
edge cases not covered in their training, underscoring a lack of the core
intuitive knowledge of physics, biology, and psychology that humans use to
understand and predict the actions
of others on the road. Achieving full autonomy requires overcoming these
obstacles to ensure reliability in all circumstances.
- AI and Employment:
The impact of AI on employment is uncertain, with a 2016 U.S. Council of
Economic Advisers report highlighting the potential for both unforeseen job
creation and displacement. The author suggests that true creativity in AI,
which includes understanding and judging creations, remains a challenge beyond
mere generative capabilities.
- General Human-level AI:
The complexity and subtlety of human intelligence, which includes our ability
to learn, think, and adapt, remain unmatched by AI. The nuances of human
cognition, such as emotions, biases, and social interactions, contribute to our
general intelligence and are far from being replicated in AI systems.
- Concerns about AI:
The author expresses concerns about over-trusting AI systems without fully
understanding their limitations. The potential for massive job losses, the
misuse of AI (particularly for generating fake media), and the vulnerability
of AI systems to attack pose significant risks. Anthropomorphizing AI systems
can lead to overestimating their capabilities and reliability.
- Unsolved Problems in AI:
The author asserts that nearly all exciting problems in AI remain unsolved,
highlighting the field's infancy and the vast scope for research and
development to address the myriad challenges that lie ahead.