Preface

Objective of This Book

Target Audience

Prerequisites: What You Should Already Know

Structure of This Book

How to Use This Book Effectively

Downloadable Code and Additional Materials

System Setup

Python Installation

IDE Installation

Git Installation

Getting the Source Material

Setting up Your Local Environment

Acknowledgments

Conventions Used in This Book

1 Introduction to Generative AI

1.1 Introduction to Artificial Intelligence

1.2 Pillars of Generative AI Advancement

1.2.1 Computational Power

1.2.2 Model and Data Size

1.2.3 Investments

1.2.4 Algorithmic Improvements

1.3 Deep Learning

1.4 Narrow AI and General AI

1.5 Natural Language Processing Models

1.5.1 NLP Tasks

1.5.2 Architectures

1.6 Large Language Models

1.6.1 Training

1.6.2 Use Cases

1.6.3 Limitations

1.7 Large Multimodal Models

1.8 Generative AI Applications

1.8.1 Consumer

1.8.2 Business

1.8.3 Prosumer

1.9 Summary

2 Pretrained Models

2.1 Hugging Face

2.2 Coding: Text Summarization

2.3 Exercise: Translation

2.3.1 Task

2.3.2 Solution

2.4 Coding: Zero-Shot Classification

2.5 Coding: Fill-Mask

2.6 Coding: Question Answering

2.7 Coding: Named Entity Recognition

2.8 Coding: Text-to-Image

2.9 Exercise: Text-to-Audio

2.9.1 Task

2.9.2 Solution

2.10 Capstone Project: Customer Feedback Analysis

2.11 Summary

3 Large Language Models

3.1 Brief History of Language Models

3.2 Simple Use of LLMs via Python

3.2.1 Coding: Using OpenAI

3.2.2 Coding: Using Groq

3.2.3 Coding: Large Multimodal Models

3.2.4 Coding: Running Local LLMs

3.3 Model Parameters

3.3.1 Model Temperature

3.3.2 Top-p and Top-k

3.4 Model Selection

3.4.1 Performance

3.4.2 Knowledge Cutoff Date

3.4.3 On-Premise versus Cloud-Based Hosting

3.4.4 Open-Source, Open-Weight, and Proprietary Models

3.4.5 Price

3.4.6 Context Window

3.4.7 Latency

3.5 Messages

3.5.1 User

3.5.2 System

3.5.3 Assistant

3.6 Prompt Templates

3.6.1 Coding: ChatPromptTemplates

3.6.2 Coding: Improve Prompts with LangChain Hub

3.7 Chains

3.7.1 Coding: Simple Sequential Chain

3.7.2 Coding: Parallel Chain

3.7.3 Coding: Router Chain

3.7.4 Coding: Chain with Memory

3.8 Safety and Security

3.8.1 Security

3.8.2 Safety

3.8.3 Coding: Implementing LLM Safety and Security

3.9 Model Improvements

3.10 New Trends

3.10.1 Reasoning Models

3.10.2 Small Language Models

3.10.3 Test-Time Computation

3.11 Summary

4 Prompt Engineering

4.1 Prompt Basics

4.1.1 Prompt Process

4.1.2 Prompt Components

4.1.3 Basic Principles

4.2 Coding: Few-Shot Prompting

4.3 Coding: Chain-of-Thought

4.4 Coding: Self-Consistency Chain-of-Thought

4.5 Coding: Prompt Chaining

4.6 Coding: Self-Feedback

4.7 Summary

5 Vector Databases

5.1 Introduction

5.2 Data Ingestion Process

5.3 Loading Documents

5.3.1 High-Level Overview

5.3.2 Coding: Load a Single Text File

5.3.3 Coding: Load Multiple Text Files

5.3.4 Exercise: Load Multiple Wikipedia Articles

5.3.5 Exercise: Loading Project Gutenberg Book

5.4 Splitting Documents

5.4.1 Coding: Fixed-Size Chunking

5.4.2 Coding: Structure-Based Chunking

5.4.3 Coding: Semantic Chunking

5.4.4 Coding: Custom Chunking

5.5 Embeddings

5.5.1 Overview

5.5.2 Coding: Word Embeddings

5.5.3 Coding: Sentence Embeddings

5.5.4 Coding: Create Embeddings with LangChain

5.6 Storing Data

5.6.1 Selection of a Vector Database

5.6.2 Coding: File-Based Storage with a Chroma Database

5.6.3 Coding: Web-Based Storage with Pinecone

5.7 Retrieving Data

5.7.1 Similarity Calculation

5.7.2 Coding: Retrieve Data from Chroma Database

5.7.3 Coding: Retrieve Data from Pinecone

5.8 Capstone Project

5.8.1 Features

5.8.2 Dataset

5.8.3 Preparing the Vector Database

5.8.4 Exercise: Get All Genres from the Vector Database

5.8.5 App Development

5.9 Summary

6 Retrieval-Augmented Generation

6.1 Introduction

6.2 Coding: Simple Retrieval-Augmented Generation

6.2.1 Knowledge Source Setup

6.2.2 Retrieval

6.2.3 Augmentation

6.2.4 Generation

6.2.5 RAG Function Creation

6.3 Advanced Techniques

6.3.1 Advanced Preretrieval Techniques

6.3.2 Advanced Retrieval Techniques

6.3.3 Advanced Postretrieval Techniques

6.4 Coding: Prompt Caching

6.5 Evaluation

6.5.1 Challenges in RAG Evaluation

6.5.2 Metrics

6.5.3 Coding: Metrics

6.6 Summary

7 Agentic Systems

7.1 Introduction to AI Agents

7.2 Available Frameworks

7.3 Simple Agent

7.3.1 Agentic RAG

7.3.2 ReAct

7.4 Agentic Framework: LangGraph

7.4.1 Simple Graph: Assistant

7.4.2 Router Graph

7.4.3 Graph with Tools

7.5 Agentic Framework: AG2

7.5.1 Two Agent Conversations

7.5.2 Human in the Loop

7.5.3 Agents Using Tools

7.6 Agentic Framework: CrewAI

7.6.1 Introduction

7.6.2 First Crew: News Analysis Crew

7.6.3 Exercise: AI Security Crew

7.7 Agentic Framework: OpenAI Agents

7.7.1 Getting Started with a Single Agent

7.7.2 Working with Multiple Agents

7.7.3 Agent with Search and Retrieval Functionality

7.8 Agentic Framework: Pydantic AI

7.9 Monitoring Agentic Systems

7.9.1 AgentOps

7.9.2 Logfire

7.10 Summary

8 Deployment

8.1 Deployment Architecture

8.2 Deployment Strategy

8.2.1 REST API Development

8.2.2 Deployment Priorities

8.2.3 Coding: Local Deployment

8.3 Self-Contained App Development

8.4 Deployment to Heroku

8.4.1 Create a New App

8.4.2 Download and Configure CLI

8.4.3 Create app.py File

8.4.4 Procfile Setup

8.4.5 Environment Variables

8.4.6 Python Environment

8.4.7 Check the Result Locally

8.4.8 Deployment to Heroku

8.4.9 Stop Your App

8.5 Deployment to Streamlit

8.5.1 GitHub Repository

8.5.2 Creating a New App

8.6 Deployment with Render

8.7 Summary

9 Outlook

9.1 Advances in Model Architecture

9.2 Limitations and Issues of LLMs

9.2.1 Hallucinations

9.2.2 Biases

9.2.3 Misinformation

9.2.4 Intellectual Property

9.2.5 Interpretability and Transparency

9.2.6 Jailbreaking LLMs

9.3 Regulatory Developments

9.4 Artificial General Intelligence and Artificial Superintelligence

9.5 AI Systems in the Near Term

9.6 Useful Resources

9.7 Summary

The Author

Index

Table of Contents