Table of Contents

Preface
17
Objective
17
Target Audience
18
Structure of This Book
19
I Getting Started
21
1 An Introduction to Predictive Analytics
23
1.1 The Importance of Predictive Analysis
24
1.2 Predictive Analysis: Prescriptive and Exploratory
26
1.2.1 The Fundamental Idea
26
1.2.2 Prescriptive Analytics
28
1.2.3 Exploratory Analytics
29
1.2.4 Prescriptive vs. Exploratory Analytics
30
1.3 Preparing for a Successful Predictive Analysis Project
32
1.3.1 Stakeholders
32
1.3.2 Business Case: Objectives and Benefits
34
1.3.3 Requirements
35
1.3.4 Execution and Lifecycle Management
36
1.4 Industry Use Cases
37
1.5 Summary
39
2 What Is SAP Predictive Analytics?
41
2.1 Building Predictive Models with SAP Predictive Analytics
41
2.2 Automated Analytics and Expert Analytics
42
2.3 Mass Production of Predictive Models with the Predictive Factory
43
2.3.1 Result Production
43
2.3.2 Model Control
43
2.3.3 Model Retraining
44
2.3.4 Mass Production of Predictive Models
45
2.4 Data Preparation
45
2.5 Additional SAP Predictive Analytics Capabilities
47
2.6 Summary
48
3 Installing SAP Predictive Analytics
49
3.1 Recommended Deployments
50
3.2 Installing the SAP Predictive Analytics Server
51
3.2.1 Downloading the SAP Predictive Analytics Server
51
3.2.2 System Requirements
52
3.2.3 Installing the SAP Predictive Analytics Server
53
3.2.4 Post-Installation Steps
54
3.3 Installing the SAP Predictive Analytics Client
57
3.3.1 Downloading the SAP Predictive Analytics Client
58
3.3.2 System Requirements
58
3.3.3 Installation Steps
59
3.3.4 Checking the Installation
60
3.3.5 Starting the Client on Linux Operating Systems
60
3.4 Installing the Predictive Factory
61
3.4.1 Downloading the Predictive Factory
61
3.4.2 System Requirements
61
3.4.3 Installation Steps
62
3.4.4 Post-Installation Steps
63
3.5 Installing SAP Predictive Analytics Desktop
71
3.5.1 System Requirements
72
3.5.2 Installation Steps
72
3.5.3 Post-Installation Steps
73
3.6 SAP HANA Installation Steps
74
3.7 Summary
75
4 Planning a Predictive Analytics Project
77
4.1 Introduction to the CRISP-DM Methodology
78
4.2 Running a Project
81
4.2.1 Business Understanding
81
4.2.2 Data Understanding
85
4.2.3 Data Preparation
87
4.2.4 Modeling
93
4.2.5 Evaluation
96
4.2.6 Deployment
97
4.3 Summary
100
II The Predictive Factory
101
5 Predictive Factory
103
5.1 Predictive Factory: End-to-End Modeling
103
5.2 Creating a Project
105
5.3 External Executables
107
5.4 Variable Statistics
110
5.5 Summary
110
6 Automated Predictive Classification Models
111
6.1 Introducing Classification Models
112
6.1.1 The Classification Technique
112
6.1.2 Step-by-Step Classification Example
114
6.2 Creating an Automated Classification Model
115
6.2.1 Prerequisites
115
6.2.2 Creating the Data Connections and the Project
116
6.2.3 Creating the Model
117
6.3 Understanding and Improving an Automated Classification Model
123
6.3.1 Understanding an Automated Classification Model
123
6.3.2 Improving an Automated Classification Model
135
6.4 Applying an Automated Classification Model
139
6.4.1 Prerequisites
139
6.4.2 Applying a Classification Model
143
6.5 The Data Science behind Automated Predictive Classification Models
149
6.5.1 Foundations of Automated Analytics
150
6.5.2 Automated Data Preparation
160
6.5.3 Automated Data Encoding
164
6.6 Summary
172
7 Automated Predictive Regression Models
173
7.1 Introducing Regression Models
173
7.2 Creating an Automated Regression Model
174
7.3 Understanding and Improving an Automated Regression Model
179
7.3.1 Understanding an Automated Regression Model
179
7.3.2 Improving an Automated Regression Model
183
7.4 Applying an Automated Regression Model
184
7.4.1 Safely Applying an Automated Regression Model
184
7.4.2 Applying an Automated Regression Model
186
7.5 Summary
191
8 Automated Predictive Time Series Forecasting Models
193
8.1 Creating and Understanding Time Series Forecast Models
194
8.1.1 Creating and Training Models
195
8.1.2 Understanding Models
202
8.1.3 Saving Time Series Forecasts
206
8.1.4 Increasing Model Accuracy
210
8.2 Mass Producing Time Series Forecasts
214
8.3 Productizing the Forecast Model
221
8.4 The Data Science behind Automated Time Series Forecasting Models
227
8.4.1 Data Split
227
8.4.2 De-trending
228
8.4.3 De-cycling
229
8.4.4 Fluctuations
230
8.4.5 Smoothing
230
8.4.6 Model Quality
230
8.5 Summary
231
9 Massive Predictive Analytics
233
9.1 Deploying Predictive Models in Batch Mode
234
9.1.1 Deploying Time Series Forecasting Models
234
9.1.2 Deploying Classification/Regression Models
237
9.2 Model Quality and Deviation
241
9.2.1 Model Deviation Test Task Parameters
242
9.2.2 Model Deviation Test Task Outputs
242
9.3 Automatically Retraining Models
244
9.3.1 Defining a Model Retraining Task
244
9.3.2 Model Retraining Task Outputs
247
9.4 Scheduling and Combining Massive Tasks
250
9.4.1 Scheduling Tasks Independently
251
9.4.2 Event-Based Scheduling
253
9.5 Deploying Expert Analytics Models
253
9.6 Summary
256
III Automated Analytics
257
10 Automated Analytics User Interface
259
10.1 When to Use Automated Analytics
259
10.2 Navigating the User Interface
260
10.3 Exploring the Automated Analytics Modules
264
10.3.1 Data Manager
264
10.3.2 Data Modeler
265
10.3.3 Toolkit
266
10.4 Summary
269
11 Automated Predictive Clustering Models
271
11.1 The Clustering Approach of Automated Analytics
272
11.2 Creating a Clustering Model
273
11.2.1 Starting the Clustering Module and Importing the Data
274
11.2.2 Analyzing the Dataset’s Content
279
11.2.3 Choosing the Variables and Setting the Model Properties
283
11.2.4 Generating the Model and Analyzing the Cluster Profiles
287
11.2.5 Applying the Model to a Dataset
293
11.2.6 Saving and Exporting the Model
295
11.3 Supervised and Unsupervised Clustering
297
11.4 The Data Science behind Automated Clustering Models
299
11.5 Summary
300
12 Social Network Analysis
301
12.1 Terminology of Social Network Analysis
302
12.2 Automated Functionalities of Social Network Analysis
303
12.2.1 Node Pairing
304
12.2.2 Communities and Roles Detection
304
12.2.3 Social Graph Comparison
305
12.2.4 Bipartite Graphs Derivation and Recommendations
305
12.2.5 Proximity
305
12.2.6 Path Analysis
306
12.3 Creating a Social Network Analysis Model
306
12.3.1 Starting the Module and Importing the Dataset
308
12.3.2 Defining the Graph Model to Build
310
12.3.3 Adding More Graphs
314
12.3.4 Setting Community, Mega-hub, and Node Pairing Detection
315
12.3.5 Providing Descriptions of Nodes
320
12.4 Navigating and Understanding the Social Network Analysis Output
323
12.4.1 Understanding Model Quality from the Model Overview
323
12.4.2 Navigating into the Social Network
328
12.4.3 Applying the Model
336
12.5 Colocation and Path Analysis Overview
340
12.5.1 Colocation Analysis
340
12.5.2 Frequent Path Analysis
344
12.6 Conclusion
346
13 Automated Predictive Recommendation Models
347
13.1 Introduction
348
13.1.1 Basic Concepts
348
13.1.2 Datasets
349
13.1.3 Recommended Approaches
351
13.2 Using the Social Network Analysis Module
353
13.2.1 Creating the Model
353
13.2.2 Understanding the Model
358
13.2.3 Applying the Model
365
13.3 Using the Recommendation Module
368
13.3.1 Creating the Model
368
13.3.2 Understanding the Model
369
13.3.3 Applying the Model
372
13.4 Using the Automated Predictive Library
374
13.5 Summary
375
14 Advanced Data Preparation Techniques with the Data Manager
377
14.1 Data Preparation for SAP Predictive Analytics
377
14.2 Building Datasets for SAP Predictive Analytics
379
14.2.1 Datasets
379
14.2.2 Methodology
380
14.3 Creating a Dataset using the Data Manager
381
14.3.1 Creating Data Manager Objects
383
14.3.2 Merging Tables
391
14.3.3 Defining Temporal Aggregates
394
14.3.4 Using the Formula Editor
398
14.4 Additional Functionalities
400
14.4.1 Visibility and Value
400
14.4.2 Domains
401
14.4.3 Documentation
402
14.4.4 Data Preview and Statistics
403
14.4.5 Generated SQL
406
14.4.6 Prompts
406
14.5 Using Data Manager Objects in the Modeling Phase
408
14.6 Managing Metadata
408
14.7 SQL Settings
411
14.8 Summary
412
IV Advanced Workflows
413
15 Expert Analytics
415
15.1 When to Use Expert Analytics
415
15.2 Navigating the Expert Analytics Interface
416
15.3 Understanding a Typical Project Workflow
417
15.3.1 Selecting Data Source
418
15.3.2 Data Explorations
420
15.3.3 Graphical Components
422
15.3.4 Applying a Trained Model
424
15.3.5 Productionize
425
15.4 Creating an Expert Analytics Predictive Model
427
15.4.1 Load Data into SAP HANA
428
15.4.2 Create a Predictive Model
430
15.4.3 Model Deployment
436
15.5 Exploring the Available Algorithms
438
15.5.1 Connected to SAP HANA
438
15.5.2 Other Connectivity Types
440
15.6 Extending Functionality with R
442
15.6.1 Developing in an R Editor
442
15.6.2 Developing an R Function
444
15.6.3 Creating an R Extension
445
15.7 Summary
448
16 Integration into SAP and Third-Party Applications
451
16.1 Exporting Models as Third-Party Code
451
16.2 In-Database Integration
455
16.3 Scripting
456
16.3.1 Getting Started
456
16.3.2 KxShell Script
458
16.3.3 Executing Scripts
464
16.3.4 APIs
465
16.4 Summary
465
17 Hints, Tips, and Best Practices
467
17.1 Improving Predictive Model Quality
467
17.1.1 Data Quality
467
17.1.2 Data Description
468
17.1.3 Adding Predictors
468
17.1.4 Composite Variables
469
17.1.5 Adding Rows of Data
474
17.1.6 Reducing the Historical Period
474
17.1.7 Uniform Target Encoding
475
17.1.8 Segmented Modeling
475
17.1.9 Changing the Modeling Approach
475
17.2 Additional Resources
476
17.2.1 SAP Community and Newsletter
476
17.2.2 Product Documentation and Product Availability Matrix
476
17.2.3 Tutorials and Webinars
477
17.2.4 SAP Notes
478
17.3 Summary
478
18 Conclusion
479
18.1 Lessons Learned
479
18.2 The Future of SAP Predictive Analytics
479
18.3 Next Steps
480
Appendices
481
The Authors
481
Index
483