Table of Contents

Preface
15
Objective of This Book
15
Target Audience
16
Prerequisites: What You Should Already Know
16
Structure of This Book
17
How to Use This Book Effectively
20
Downloadable Code and Additional Materials
21
System Setup
21
Python Installation
22
IDE Installation
22
Git Installation
23
Getting the Source Material
24
Setting up Your Local Environment
25
Acknowledgments
27
Conventions Used in This Book
28
1 Introduction to Generative AI
29
1.1 Introduction to Artificial Intelligence
30
1.2 Pillars of Generative AI Advancement
35
1.2.1 Computational Power
35
1.2.2 Model and Data Size
36
1.2.3 Investments
37
1.2.4 Algorithmic Improvements
37
1.3 Deep Learning
38
1.4 Narrow AI and General AI
40
1.5 Natural Language Processing Models
42
1.5.1 NLP Tasks
42
1.5.2 Architectures
45
1.6 Large Language Models
47
1.6.1 Training
47
1.6.2 Use Cases
48
1.6.3 Limitations
50
1.7 Large Multimodal Models
51
1.8 Generative AI Applications
52
1.8.1 Consumer
53
1.8.2 Business
53
1.8.3 Prosumer
54
1.9 Summary
54
2 Pretrained Models
57
2.1 Hugging Face
58
2.2 Coding: Text Summarization
60
2.3 Exercise: Translation
62
2.3.1 Task
63
2.3.2 Solution
63
2.4 Coding: Zero-Shot Classification
64
2.5 Coding: Fill-Mask
67
2.6 Coding: Question Answering
68
2.7 Coding: Named Entity Recognition
70
2.8 Coding: Text-to-Image
71
2.9 Exercise: Text-to-Audio
72
2.9.1 Task
73
2.9.2 Solution
73
2.10 Capstone Project: Customer Feedback Analysis
74
2.11 Summary
77
3 Large Language Models
79
3.1 Brief History of Language Models
80
3.2 Simple Use of LLMs via Python
81
3.2.1 Coding: Using OpenAI
81
3.2.2 Coding: Using Groq
84
3.2.3 Coding: Large Multimodal Models
87
3.2.4 Coding: Running Local LLMs
90
3.3 Model Parameters
93
3.3.1 Model Temperature
93
3.3.2 Top-p and Top-k
95
3.4 Model Selection
96
3.4.1 Performance
97
3.4.2 Knowledge Cutoff Date
98
3.4.3 On-Premise versus Cloud-Based Hosting
98
3.4.4 Open-Source, Open-Weight, and Proprietary Models
98
3.4.5 Price
99
3.4.6 Context Window
99
3.4.7 Latency
99
3.5 Messages
99
3.5.1 User
100
3.5.2 System
100
3.5.3 Assistant
100
3.6 Prompt Templates
101
3.6.1 Coding: ChatPromptTemplates
101
3.6.2 Coding: Improve Prompts with LangChain Hub
102
3.7 Chains
104
3.7.1 Coding: Simple Sequential Chain
105
3.7.2 Coding: Parallel Chain
106
3.7.3 Coding: Router Chain
109
3.7.4 Coding: Chain with Memory
113
3.8 Safety and Security
117
3.8.1 Security
118
3.8.2 Safety
118
3.8.3 Coding: Implementing LLM Safety and Security
119
3.9 Model Improvements
124
3.10 New Trends
125
3.10.1 Reasoning Models
126
3.10.2 Small Language Models
127
3.10.3 Test-Time Computation
128
3.11 Summary
130
4 Prompt Engineering
133
4.1 Prompt Basics
134
4.1.1 Prompt Process
134
4.1.2 Prompt Components
135
4.1.3 Basic Principles
136
4.2 Coding: Few-Shot Prompting
142
4.3 Coding: Chain-of-Thought
144
4.4 Coding: Self-Consistency Chain-of-Thought
145
4.5 Coding: Prompt Chaining
149
4.6 Coding: Self-Feedback
151
4.7 Summary
155
5 Vector Databases
157
5.1 Introduction
157
5.2 Data Ingestion Process
159
5.3 Loading Documents
160
5.3.1 High-Level Overview
161
5.3.2 Coding: Load a Single Text File
161
5.3.3 Coding: Load Multiple Text Files
163
5.3.4 Exercise: Load Multiple Wikipedia Articles
164
5.3.5 Exercise: Loading Project Gutenberg Book
166
5.4 Splitting Documents
167
5.4.1 Coding: Fixed-Size Chunking
169
5.4.2 Coding: Structure-Based Chunking
173
5.4.3 Coding: Semantic Chunking
176
5.4.4 Coding: Custom Chunking
178
5.5 Embeddings
182
5.5.1 Overview
182
5.5.2 Coding: Word Embeddings
184
5.5.3 Coding: Sentence Embeddings
190
5.5.4 Coding: Create Embeddings with LangChain
193
5.6 Storing Data
195
5.6.1 Selection of a Vector Database
196
5.6.2 Coding: File-Based Storage with a Chroma Database
196
5.6.3 Coding: Web-Based Storage with Pinecone
198
5.7 Retrieving Data
202
5.7.1 Similarity Calculation
202
5.7.2 Coding: Retrieve Data from Chroma Database
204
5.7.3 Coding: Retrieve Data from Pinecone
205
5.8 Capstone Project
207
5.8.1 Features
208
5.8.2 Dataset
209
5.8.3 Preparing the Vector Database
209
5.8.4 Exercise: Get All Genres from the Vector Database
213
5.8.5 App Development
214
5.9 Summary
218
6 Retrieval-Augmented Generation
221
6.1 Introduction
222
6.2 Coding: Simple Retrieval-Augmented Generation
225
6.2.1 Knowledge Source Setup
225
6.2.2 Retrieval
227
6.2.3 Augmentation
228
6.2.4 Generation
229
6.2.5 RAG Function Creation
230
6.3 Advanced Techniques
232
6.3.1 Advanced Preretrieval Techniques
232
6.3.2 Advanced Retrieval Techniques
234
6.3.3 Advanced Postretrieval Techniques
250
6.4 Coding: Prompt Caching
250
6.5 Evaluation
256
6.5.1 Challenges in RAG Evaluation
256
6.5.2 Metrics
257
6.5.3 Coding: Metrics
259
6.6 Summary
261
7 Agentic Systems
263
7.1 Introduction to AI Agents
264
7.2 Available Frameworks
265
7.3 Simple Agent
267
7.3.1 Agentic RAG
267
7.3.2 ReAct
271
7.4 Agentic Framework: LangGraph
275
7.4.1 Simple Graph: Assistant
275
7.4.2 Router Graph
279
7.4.3 Graph with Tools
284
7.5 Agentic Framework: AG2
289
7.5.1 Two Agent Conversations
290
7.5.2 Human in the Loop
293
7.5.3 Agents Using Tools
299
7.6 Agentic Framework: CrewAI
303
7.6.1 Introduction
303
7.6.2 First Crew: News Analysis Crew
304
7.6.3 Exercise: AI Security Crew
319
7.7 Agentic Framework: OpenAI Agents
328
7.7.1 Getting Started with a Single Agent
328
7.7.2 Working with Multiple Agents
329
7.7.3 Agent with Search and Retrieval Functionality
332
7.8 Agentic Framework: Pydantic AI
333
7.9 Monitoring Agentic Systems
336
7.9.1 AgentOps
336
7.9.2 Logfire
340
7.10 Summary
342
8 Deployment
345
8.1 Deployment Architecture
345
8.2 Deployment Strategy
347
8.2.1 REST API Development
347
8.2.2 Deployment Priorities
348
8.2.3 Coding: Local Deployment
350
8.3 Self-Contained App Development
355
8.4 Deployment to Heroku
361
8.4.1 Create a New App
361
8.4.2 Download and Configure CLI
362
8.4.3 Create app.py File
363
8.4.4 Procfile Setup
365
8.4.5 Environment Variables
365
8.4.6 Python Environment
366
8.4.7 Check the Result Locally
366
8.4.8 Deployment to Heroku
367
8.4.9 Stop Your App
368
8.5 Deployment to Streamlit
369
8.5.1 GitHub Repository
369
8.5.2 Creating a New App
370
8.6 Deployment with Render
372
8.7 Summary
374
9 Outlook
375
9.1 Advances in Model Architecture
375
9.2 Limitations and Issues of LLMs
376
9.2.1 Hallucinations
376
9.2.2 Biases
377
9.2.3 Misinformation
378
9.2.4 Intellectual Property
379
9.2.5 Interpretability and Transparency
379
9.2.6 Jailbreaking LLMs
379
9.3 Regulatory Developments
381
9.4 Artificial General Intelligence and Artificial Superintelligence
381
9.5 AI Systems in the Near Term
382
9.6 Useful Resources
384
9.7 Summary
384
The Author
387
Index
389