Book: SQL Server 2017 Integration Services Cookbook.pdf

Free download of the book SQL Server 2017 Integration Services Cookbook.pdf

ETL techniques to load and transform data from various sources using SQL Server 2017 Integration Services

Christian Cote, Matija Lah, Dejan Sarka

Copyright © 2017 Packt Publishing

Download link for the book SQL Server 2017 Integration Services Cookbook.pdf

 

Contents

 

Chapter 1: SSIS Setup

Introduction
SQL Server 2016 download
Getting ready
How to do it...
Installing JRE for PolyBase
Getting ready
How to do it...
How it works...
Installing SQL Server 2016
Getting ready
How to do it...
SQL Server Management Studio installation
Getting ready
How to do it...
SQL Server Data Tools installation
Getting ready
How to do it...
Testing SQL Server connectivity
Getting ready
How to do it...

 

Chapter 2: What Is New in SSIS 2016

Introduction
Creating SSIS Catalog
Getting ready
How to do it...
Custom logging
Getting ready
How to do it...
How it works...
There's more...
Create a database
Create a simple project
Testing the custom logging level
See also
Azure tasks and transforms
Getting ready
How to do it...
See also
Incremental package deployment
Getting ready
How to do it...
There's more...
Multiple version support
Getting ready
How to do it...
There's more...
Error column name
Getting ready
How to do it...
Control Flow templates
Getting ready
How to do it...

 

Chapter 3: Key Components of a Modern ETL Solution

Introduction
Installing the sample solution
Getting ready
How to do it...
There's more...
Deploying the source database with its data
Getting ready
How to do it...
There's more...
Deploying the target database
Getting ready
How to do it...
SSIS projects
Getting ready
How to do it...
Framework calls in EP_Staging.dtsx
Getting ready
How to do it...
There's more...

 

Chapter 4: Data Warehouse Loading Techniques

Introduction
Designing patterns to load dimensions of a data warehouse
Getting ready
How to do it...
There's more...
Loading the data warehouse using the framework
Getting ready
How to do it...
Near real-time and on-demand loads
Getting ready
How to do it...
There's more...
Using parallelism
Getting ready
How to do it...
There's more...

 

Chapter 5: Dealing with Data Quality

Introduction
Profiling data with SSIS
Getting ready
How to do it...
Creating a DQS knowledge base
Getting ready
How to do it...
Data cleansing with DQS
Getting ready
How to do it...
Creating a MDS model
Getting ready
How to do it...
Matching with DQS
Getting ready
How to do it...
Using SSIS fuzzy components
Getting ready
How to do it...

 

 

Chapter 6: SSIS Performance and Scalability

Introduction
Using SQL Server Management Studio to execute an SSIS package
Getting ready
How to do it...
How it works...
Using T-SQL to execute an SSIS package
How to do it...
How it works...
Using the DTExec command-line utility to execute an SSIS package
How to do it...
How it works...
There's more...
Scheduling an SSIS package execution
Getting ready
How to do it...
How it works...
Using the cascading lookup pattern
How to do it...
How it works...
Using the lookup cache
How to do it...
How it works...
Using lookup expressions
How to do it...
How it works...
Determining the maximum number of worker threads in a data flow
How to do it...
How it works...
Using the master package concept
How to do it...
How it works...
Requesting an execution tree in SSDT
How to do it...
How it works...
Monitoring SSIS performance
Establishing a performance monitor session
How to do it...
How it works...
Configuring a performance monitor data collector set
How to do it...
How it works...

 

Chapter 7: Unleash the Power of SSIS Script Task and Component

Introduction
Using variables in SSIS Script task
Getting ready
How to do it...
Execute complex filesystem operations with the Script task
Getting ready
How to do it...
Reading data profiling XML results with the Script task
Getting ready
How to do it...
Correcting data with the Script component
Getting ready
How to do it...
Validating data using regular expressions in a Script component
Getting ready
How to do it...
Using the Script component as a source
How to do it...
How it works...
Using the Script component as a destination
Getting ready
How to do it...
How it works...

 

Chapter 8: SSIS and Advanced Analytics

Introduction
Splitting a dataset into a training and test set
Getting ready
How to do it...
Testing the randomness of the split with a SSAS decision trees model
Getting ready
How to do it...
Preparing a Naive Bayes SSAS data mining model
Getting ready
How to do it...
Querying the SSAS data mining model with the data mining query transformation
Getting ready
How to do it...
Creating an R data mining model
Getting ready
How to do it...
Using the R data mining model in SSIS
Getting ready
How to do it...
Text mining with term extraction and term lookup transformations
Getting ready
How to do it...

 

Chapter 9: On-Premises and Azure Big Data Integration

Introduction
Azure Blob storage data management
Getting ready
How to do it...
Installing a Hortonworks cluster
Getting ready
How to do it...
Copying data to an on-premises cluster
Getting ready
How to do it...
Using Hive – creating a database
Getting ready
How to do it...
There's more...
Transforming the data with Hive
Getting ready
How to do it...
There's more...
Transferring data between Hadoop and Azure
Getting ready
How to do it...
Leveraging a HDInsight big data cluster
Getting ready
How to do it...
There's more...
Managing data with Pig Latin
Getting ready
How to do it...
There's more...
Importing Azure Blob storage data
Getting ready
How to do it...
There's more...
Azure Data Factory and SSIS

 

Chapter 10: Extending SSIS Custom Tasks and Transformations

Introduction
Designing a custom task
Getting ready
How to do it...
How it works...
Designing a custom transformation
How to do it...
How it works...
Managing custom component versions
Getting ready
How to do it...
How it works...

 

Chapter 11: Scale Out with SSIS 2017

Introduction
SQL Server 2017 download and setup
Getting ready
How to do it...
There's more...
SQL Server client tools setup
Getting ready
How to do it...
Configuring SSIS for scale out executions
Getting ready
How to do it...
There's more...
Executing a package using scale out functionality
Getting ready
How to do it...

Index

 

 

Preface
SQL Server Integration Services (SSIS) is a tool that facilitates data extraction, consolidation, and loading (ETL), SQL Server coding enhancements, data warehousing, and customizations. With the help of the recipes in this book, you'll gain hands-on experience of SSIS 2017 as well as the new 2016 features, including design and development improvements such as slowly changing dimensions (SCD), tuning, and customizations. At the start, you'll learn to install and set up SSIS as well as other SQL Server resources to make optimal use of this business intelligence tool. We'll begin by taking you through the new features in SSIS 2016/2017 and implementing the features necessary to get a modern, scalable ETL solution that fits the modern data warehouse.

Through the course of the book, you will learn how to design and build SSIS data warehouse packages using SQL Server Data Tools. Additionally, you'll learn how to develop SSIS packages designed to maintain a data warehouse using the data flow and other control flow tasks. You'll also go through many recipes on cleansing data and on getting the end result after applying different transformations. The book also covers some real-world scenarios and shows how to handle various issues that you might face when designing your packages. At the end of this book, you'll know all the key concepts needed to perform data integration and transformation. You'll have explored on-premises big data integration processes to create a classic data warehouse, and will know how to extend the toolbox with custom tasks and transforms.

 

What this book covers
Chapter 1, SSIS Setup, contains recipes describing the step-by-step setup of SQL Server 2016 to get the features that are used in the book.
Chapter 2, What Is New in SSIS 2016, contains recipes that talk about the evolution of SSIS over time and what's new in SSIS 2016. This chapter is a detailed overview of Integration Services.
Chapter 3, Key Components of a Modern ETL Solution, explains how ETL has evolved over the past few years and which components are necessary to get a modern, scalable ETL solution that fits the modern data warehouse. This chapter also describes what each catalog view provides and helps you learn how to use some of them to archive SSIS execution statistics.
Chapter 4, Data Warehouse Loading Techniques, describes many patterns used to load a data warehouse or an operational data store (ODS). You will learn how to effectively load a data warehouse and process a tabular model, maintain data partitions, and achieve modern data refresh rates.
Chapter 5, Dealing with Data Quality, focuses on how SSIS can be leveraged to validate and load data. You will learn how to identify invalid data, cleanse data and load valid data to the data warehouse.
Chapter 6, SSIS Performance and Scalability, covers how to monitor SSIS package execution. It also provides solutions for scaling out processes by using parallelism. You will learn how to identify bottlenecks and how to resolve them using various techniques.
Chapter 7, Unleash the Power of SSIS Script Task and Component, covers how to use scripting with SSIS. You will learn how script tasks and script components are very valuable in many situations to overcome the limitations of stock toolbox tasks and transforms.
Chapter 8, SSIS and Advanced Analytics, talks about how SSIS can be used to prepare the data you need for further analysis. Here, you will learn how you can make use of SQL Server Analysis Services (SSAS) and R models in the SSIS data flow.
Chapter 9, On-Premises and Azure Big Data Integration, describes the Azure feature pack that allows SSIS to integrate Azure data from blob storage and HDInsight clusters. You will learn how to use Azure feature pack components to add flexibility to your SSIS solution architecture and how on-premises Big Data can be manipulated via SSIS.
Chapter 10, Extending SSIS Custom Tasks and Transformations, talks about extending and customizing the toolbox using custom-developed tasks and transforms and security features. You will learn the pros and cons of creating custom tasks to extend the SSIS toolbox and secure your deployment.
Chapter 11, Scale Out with SSIS 2017, talks about scaling out SSIS package executions on multiple servers. You will learn how SSIS 2017 can scale out to multiple workers to enhance execution scalability.

 

 

What you need for this book
This book was written using SQL Server 2016, and all the examples and functions should work with it. Other tools you may need are Visual Studio 2015, SQL Server Data Tools 16 or higher, and SQL Server Management Studio 17 or later.
In addition to that, you will need the Hortonworks Sandbox, Docker for Windows, and a Microsoft Azure account.
The last chapter of this book has been written using SQL Server 2017.

 

Who this book is for
This book is ideal for software engineers, DW/ETL architects, and ETL developers who need to create a new, or enhance an existing, ETL implementation with SQL Server 2017 Integration Services. This book would also be good for individuals who develop ETL solutions that use SSIS and are keen to learn the new features and capabilities in SSIS 2017.

 

Chapter 1: SSIS Setup
In this chapter, we will cover the following recipes:
- SQL Server 2016 download
- Installing JRE for PolyBase
- Installing SQL Server 2016
- SQL Server Management Studio installation
- SQL Server Data Tools installation
- Testing SQL Server connectivity

 

 

Introduction
This chapter covers the basics of installing SQL Server 2016 so that you can work through the examples in this book. The version of SQL Server used throughout this book is the Developer edition of SQL Server 2016. It's available for free as long as you subscribe to Visual Studio Dev Essentials.

 

SQL Server 2016 download
Following are the steps to download and install SQL Server 2016.

 
