HP Vertica - Architecture and SQL [1 ed.] 9781940540344

One of the most exciting new database inventions is columnar technology. HP has built one of the best columnar databases


Language: English. Pages: 766. Year: 2015.





The Tera-Tom Video Series

Lessons with Tera-Tom
Teradata Architecture and SQL Video Series
These exciting videos make learning and certification much easier.

Three ways to view them:
1. Safari (look up Coffing Studios)
2. CoffingDW.com (sign up on our website)
3. Your company can buy them all for everyone to see (contact [email protected])

The Tera-Tom Genius Series

The Tera-Tom Genius Series consists of ten books. Each book is designed for a specific audience, and Teradata is explained to the level best suited for that audience. The books take a building-block approach: they always start out simple, and each page builds upon the previous point. Order them all at www.CoffingDW.com.

Tera-Tom: Author of over 50 Books

Tera-Tom books have been the primary source of Teradata learning for over 20 years. They have helped to teach millions of people all aspects of Teradata. What people love the most about the Tera-Tom books is how easy they are to understand. They are so easy that a seven-year-old boy (raised by wolves) can understand them!

The Best Query Tool Works on all Systems

When you possess a tool like Nexus, you have access to every system in your enterprise! The Nexus Query Chameleon is the only tool that works on all systems. Its Super Join Builder allows for the ERwin Logical Model to be loaded, and then Nexus shows tables and views visually. It then guides users to show what joins to what. As users choose the tables and columns they want in their report, Nexus builds the SQL for them with each click of the mouse. Nexus was designed for Teradata and Hadoop, but works on all platforms. Nexus even converts table structures between vendors, so querying and managing multi-vendor platforms is transparent. Even if you only work with one system, you will find that the Nexus is the best query tool you have ever used. If you work with multiple systems, you will be even more amazed. Download a free trial at www.CoffingDW.com.

Trademarks and Copyrights

Microsoft Windows, Windows 2003 Server, SQL Server 2012, SQL Server Compact Edition, .NET, PDW, SQL Server, T-SQL, Azure SQL Data Warehouse and Azure Cloud are trademarks of Microsoft. Teradata, NCR, BYNET and SQL Assistant are registered trademarks of Teradata Corporation, Dayton, Ohio, U.S.A. IBM, DB2 and Netezza are registered trademarks of IBM Corporation. ANSI is a registered trademark of the American National Standards Institute. Ethernet is a trademark of Xerox. UNIX is a trademark of The Open Group. Linux is a trademark of Linus Torvalds. Java and Oracle are trademarks of Oracle. ParAccel is a trademark of ParAccel. Kognitio is a trademark of Kognitio. Greenplum is a trademark of EMC and Dell Corporation. Vertica is a trademark of HP Corporation. Nexus Query Chameleon is a trademark of Coffing Data Warehousing.

Coffing Data Warehousing shall have neither liability nor responsibility to any person or entity with respect to any loss or damages arising from the information contained in this book or from the use of programs or program segments that are included. The manual is not a publication of HP Corporation, nor was it produced in conjunction with HP Corporation.

Copyright © December 2015 by Coffing Publishing
ISBN 978-1-940540-34-4

All rights reserved. No part of this book shall be reproduced, stored in a retrieval system, or transmitted by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission from the publisher. No patent liability is assumed with respect to the use of information contained herein. Although every precaution has been taken in the preparation of this book, the publisher and author assume no responsibility for errors or omissions, neither is any liability assumed for damages resulting from the use of information contained herein.

About Tom Coffing

Tom Coffing, better known as Tera-Tom, is the founder of Coffing Data Warehousing, where he has been CEO for the past 20 years. Tom has written over 50 books on all aspects of Teradata, Netezza, Kognitio, Redshift, ParAccel, Vertica, SQL Server, and Greenplum. Tom has taught over 1,000 Teradata classes in places such as India, Africa, Europe, China, Malaysia, and throughout North America.

Tom is also the owner and designer of the Nexus Query Chameleon, the most sophisticated enterprise query tool in the industry. The Nexus works on all platforms, including Hadoop, converts table structures between all systems, and allows companies to load their ERwin logical model inside Nexus. The Nexus guides users like a GPS system. Users point and click on any table or view from any system, and they are guided to what joins to what. As users choose the columns they want on their report, the SQL is built automatically.

In high school, Tom was the first athlete from his school to ever place at state. He was selected by his school to represent them at Buckeye Boys State, and Tom was inducted into the first class of the Lakota High School Hall of Fame. At the University of Arizona and University of Nevada Las Vegas, Tom was a two-time All-American wrestler, Sophomore Athlete of the Year, and a two-time winner of the 1980 Olympic wrestling trials. Tom graduated with a Bachelor’s degree in Speech Communications. After college, Tom became a state and national champion speech winner for Toastmasters and won two orchid awards as an actor.

Tom is the proud father of three wonderful children and has been married for the past 32 years. You can contact Tom at 513 300-0341 or at [email protected].

About Leslie Nolander

Leslie Nolander has been the Chief Operating Officer (COO) at CoffingDW for the past ten years. She is responsible for running the business and has done a brilliant job of orchestrating the procedures and standards at CoffingDW. Leslie has personally negotiated hundreds of contracts both internationally and domestically and has overseen every financial transaction. Leslie is the author of multiple books and a key asset in the design and implementation of the Nexus Query Chameleon. Leslie has a Bachelor's Degree from the University of Arizona in Child Development and Family Relations and completed a fifth-year Teacher Certification Program. Leslie was a teacher for the first ten years of her career before joining the team at CoffingDW. Leslie resides in Phoenix, Arizona with her husband Jason, and they have two daughters.

Table of Contents

Chapter 1 – What is Columnar?
  What is Parallel Processing?
  Nothing Happens on Disk
  Data in Memory is fast as Lightning
  Parallel Processing Of Data
  The Problem with Row-Based Data
  Columnar Data Can Store Each Column in Their Own Block
  Why Columnar?
  Row Based Blocks vs. Columnar Based Blocks
  Visualize the Data – Rows vs. Columns
  The Architecture of Vertica
  Vertica Architecture Terms
  Vertica has Linear Scalability
Chapter 2 – Vertica Data Distribution
  Distribution Strategy 1 - Segmented By Hash
  Distribution Strategy 2 - Unsegmented
  Sorting the Data in a Table CREATE Statement
  Even Distribution
  Uneven Distribution Where the Data is Non-Unique
  Matching Distribution Keys for Co-Location of Joins
  Big Table / Small Table Joins
  Fact and Dimension Table Distribution Key Designs
  Why a Sort Key Improves Performance
  Sort Keys Help Group By, Order By and Window Functions

Chapter 3 – Clever Features of Vertica
  Super Projections
  Vertica Projections
  The Five Advantages of Projections
  Creating a Projection
  Read-Optimized Store (ROS)/Write-Optimized Store (WOS)
  Write-Optimized Store (WOS) is Memory Resident
  Updates are collected in Time-Based Buckets called Epochs
  Vertica Does Not Support In-Place Updates
  K-Safety
  K-Safety of 2
  The Five Data Isolation Modes
  Import/Export between Multiple Vertica Systems
  Roles
  Compression
  Runlength encoding
  LZO Encoding
  Delta Encoding
  Block Based Dictionary Encoding for Character Data
Chapter 4 – Nexus
  Nexus is Available on the Cloud
  Nexus Queries Every Major System
  How to Use Nexus
  Why is Nexus Special? Visualization and Automatic SQL
  Why is Nexus Special? Cross-System Joins
  Why is Nexus Special? The Amazing Hub System
  Why is Nexus Special? Save Answer Sets as Tables
  Why is Nexus Special? Automated Data Movement

  Why is Nexus Special? Nexus makes the Servers Talk Directly
  What Makes Nexus Special? The Garden of Analysis
  The Garden of Analysis Grouping Sets Tab
  The Garden of Analysis - Grouping Sets Answer Sets
  The Garden of Analysis – Join Tab (1 of 4)
  The Garden of Analysis – Join Tab (2 of 4)
  The Garden of Analysis – Join Tab (3 of 4)
  The Garden of Analysis – Join Tab (4 of 4)
  The Garden of Analysis – Charts/Graphs Tab (1 of 4)
  The Garden of Analysis – Charts/Graphs Tab (2 of 4)
  The Garden of Analysis – Charts/Graphs Tab (3 of 4)
  The Garden of Analysis – Charts/Graphs Tab (4 of 4)
  The Garden of Analysis – Dynamic Charts Tab (1 of 4)
  The Garden of Analysis – Dynamic Charts Tab (2 of 4)
  The Garden of Analysis – Dynamic Charts Tab (3 of 4)
  The Garden of Analysis – Dynamic Charts Tab (4 of 4)
  The Garden of Analysis – Dashboard Tab (1 of 5)
  The Garden of Analysis – Dynamic Charts Tab (2 of 5)
  The Garden of Analysis – Dynamic Charts Tab (3 of 5)
  The Garden of Analysis – Dynamic Charts Tab (4 of 5)
  The Garden of Analysis – Dynamic Charts Tab (5 of 5)
  Getting to the Super Join Builder
  The Super Join Builder is the First Entry in the Menu
  The Super Join Builder Shows Tables Visually
  Using the Add Join Button
  What to Do When No Tables are Joinable?
  Drag a Joinable Object into the Super Join Builder
  You will see the Add Custom Join Window
  Defining the Join Columns

  Your Tables Will Appear Together
  Select the Columns You Want on the Report
  Check out the SQL Tab to See the SQL that has been built
  SQL Tab
  Hit Execute to get the Report inside the Super Join Builder
  The Report is delivered inside the Super Join Builder
  Let's Join Two Tables Again (1 of 6)
  Let's Join Two Tables Again (2 of 6)
  Let's Join Two Tables Again (3 of 6)
  Let's Join Two Tables Again (4 of 6)
  Let's Join Two Tables Again (5 of 6)
  Let's Join Two Tables Again (6 of 6)
  The Tabs of the Super Join Builder Philosophy – One Query
  The Tabs of the Super Join Builder – Objects Tab
  The Tabs of the Super Join Builder – Columns Tab
  The Tabs of the Super Join Builder – Sorting Tab
  The Tabs of the Super Join Builder – Joins Tab
  The Tabs of the Super Join Builder – SQL Tab
  The Tabs of the Super Join Builder – Metadata Tab
  The Tabs of the Super Join Builder – Analytics Tab
  The Tabs of the SJB – Analytics Tab – OLAP Screen
  Getting a Simple CSUM in the Analytics Tab – OLAP
  Getting a Simple CSUM – The SQL Automatically Generated
  The Answer Set of the CSUM
  Getting all of the OLAP functions in the Analytics Tab
  A Five Table Join Using the Menu
  The First Table is placed in the Super Join Builder
  Using the Add Join Cascading Menu
  All Five Tables Are In the Super Join Builder

  A Five Table Join Two Steps (Cube)
  Choose Cube with Columns from the Left Top of the Table
  All Tables are Cubed (Joined Together Instantly)
  Choose Cube and then Choose Your Columns
  Create Cube - Tables Are Joined Without Columns Selected
  Create Cube – Select the Columns You Want on the Report
  How to join Vertica, Oracle and SQL Server Tables
  The Vertica Table is now in the Super Join Builder
  Drag the Joining Oracle Table to the Super Join Builder
  Defining the Join Columns
  Choose the Columns You Want on Your Report
  Let's Add a SQL Server Table to our Vertica and Oracle Join
  Defining the Join Columns
  All Three Tables are now in the Super Join Builder
  Change the Hub and Run the Join on Oracle
  Change the Hub and Run the Join on SQL Server
  Simply Amazing - Change the Hub to the Garden of Analysis
  Have the Answer Set Saved Automatically to Any System
  Saving the Answer Set to an Oracle or SQL Server System
  Saving the Answer Set to a Vertica System
  Saving the Answer Set to a Teradata System
Chapter 5 – The Basics of SQL
  Introduction
  Setting your Path
  Setting Your Default Database
  SELECT * (All Columns) in a Table
  Fully Qualifying a Database, Schema and Table
  SELECT Specific Columns in a Table

  Commas in the Front or Back?
  Place your Commas in front for better Debugging Capabilities
  Sort the Data with the ORDER BY Keyword
  ORDER BY Defaults to Ascending
  Use the Name or the Number in your ORDER BY Statement
  Two Examples of ORDER BY using Different Techniques
  Changing the ORDER BY to Descending Order
  NULL Values sort First in Ascending Mode (Default)
  NULL Values sort Last in Descending Mode (DESC)
  Major Sort vs. Minor Sorts
  Multiple Sort Keys using Names vs. Numbers
  Sorts are Alphabetical, NOT Logical
  Using A CASE Statement to Sort Logically
  How to ALIAS a Column Name
  A Missing Comma can by Mistake become an Alias
  Aliasing a Column Name with Spaces or Reserved Words
  Comments using Double Dashes are Single Line Comments
  Comments for Multi-Lines
  Comments for Multi-Lines as Double Dashes per Line
  Formatting Number
  Formatting Number Examples
  Formatting Dates
  Formatting Date Example
Chapter 6 – The WHERE Clause
  The WHERE Clause limits Returning Rows
  Double Quoted Aliases are for Reserved Words and Spaces
  Character Data needs Single Quotes in the WHERE Clause
  Character Data needs Single Quotes, but Numbers Don’t

Comparisons against a Null Value
NULL means UNKNOWN DATA so Equal (=) won’t Work
Use IS NULL or IS NOT NULL when dealing with NULLs
NULL is UNKNOWN DATA so NOT Equal won’t Work
Use IS NULL or IS NOT NULL when dealing with NULLs
Using Greater Than or Equal To (>=)
AND in the WHERE Clause
Troubleshooting AND
OR in the WHERE Clause
Troubleshooting Or
Troubleshooting Character Data
Using Different Columns in an AND Statement
Quiz – How many rows will return?
Answer to Quiz – How many rows will return?
What is the Order of Precedence?
Using Parentheses to change the Order of Precedence
Using an IN List in place of OR
The IN List is an Excellent Technique
IN List vs. OR brings the same Results
The IN List Can Use Character Data
Using a NOT IN List
Null Values in a NOT IN List Bring Back No Rows
A Technique for Handling Nulls with a NOT IN List
BETWEEN is Inclusive
NOT BETWEEN is Also Inclusive
LIKE uses Wildcards Percent ‘%’ and Underscore ‘_’
LIKE command Underscore is Wildcard for one Character
LIKE Command Works Differently on Char Vs Varchar
LIKE Command on Character Data Auto Trims

Quiz – What Data is Left Justified and what is Right?
Numbers are Right Justified and Character Data is Left
Answer – What Data is Left Justified and what is Right?
An Example of Data with Left and Right Justification
A Visual of CHARACTER Data vs. VARCHAR Data
Use the TRIM command to remove spaces on CHAR Data
Escape Character in the LIKE Command changes Wildcards
Escape Characters Turn off Wildcards in the LIKE Command
Quiz – Turn off that Wildcard
ANSWER – To Find that Wildcard
The Distinct Command
Distinct vs. GROUP BY
Quiz – How many rows come back from the Distinct?
Answer – How many rows come back from the Distinct?

Chapter 7 – Aggregation
Quiz – You calculate the Answer Set in your own Mind
Answer – You calculate the Answer Set in your own Mind
Quiz – You calculate the Answer Set in your own Mind
Answer – You calculate the Answer Set in your own Mind
The 3 Rules of Aggregation
There are Five Aggregates
Quiz – How many rows come back?
Answer – How many rows come back?
Troubleshooting Aggregates
GROUP BY when Aggregates and Normal Columns Mix
GROUP BY delivers one row per Group
GROUP BY Dept_No or GROUP BY 1 the same thing
Limiting Rows and Improving Performance with WHERE

WHERE Clause in Aggregation limits unneeded Calculations
Keyword HAVING tests Aggregates after they are totaled
Keyword HAVING is like an Extra WHERE Clause for totals
Keyword HAVING tests Aggregates after they are totaled
Getting the Average Values per Column
GROUP BY Rollup
GROUP BY Rollup Result Set

Chapter 8 – Join Functions
A Two-Table Join Using Traditional Syntax
A two-table join using Non-ANSI Syntax with Table Alias
You Can Fully Qualify All Columns
A two-table join using ANSI Syntax
Both Queries have the same Results and Performance
Quiz – Can You Finish the Join Syntax?
Answer to Quiz – Can You Finish the Join Syntax?
Quiz – Can You Find the Error?
Answer to Quiz – Can You Find the Error?
Super Quiz – Can You Find the Difficult Error?
Answer to Super Quiz – Can You Find the Difficult Error?
Quiz – Which rows from both tables won’t return?
Answer to Quiz – Which rows from both tables Won’t Return?
LEFT OUTER JOIN
LEFT OUTER JOIN Results
RIGHT OUTER JOIN
RIGHT OUTER JOIN Example and Results
FULL OUTER JOIN
FULL OUTER JOIN Results
Which Tables are the Left and which Tables are Right?

Answer - Which Tables are the Left and which are the Right?
INNER JOIN with Additional AND Clause
ANSI INNER JOIN with Additional AND Clause
ANSI INNER JOIN with Additional WHERE Clause
OUTER JOIN with Additional WHERE Clause
OUTER JOIN with Additional AND Clause
OUTER JOIN with Additional AND Clause Results
Quiz – Why is this considered an INNER JOIN?
Evaluation Order for Outer Queries
The DREADED Product Join
The DREADED Product Join Results
The Horrifying Cartesian Product Join
The ANSI Cartesian Join will ERROR
Quiz – Do these Joins Return the Same Answer Set?
Answer – Do these Joins Return the Same Answer Set?
The CROSS JOIN
The CROSS JOIN Answer Set
The Self Join
The Self Join with ANSI Syntax
Quiz – Will both queries bring back the same Answer Set?
Answer – Will both queries bring back the same Answer Set?
Quiz – Will both queries bring back the same Answer Set?
Answer – Will both queries bring back the same Answer Set?
How would you join these two tables?
An Associative Table is a Bridge that Joins Two Tables
Quiz – Can you write the 3-Table Join?
Answer to quiz – Can you Write the 3-Table Join?
Quiz – Can you write the 3-Table Join to ANSI Syntax?
Answer – Can you write the 3-Table Join to ANSI Syntax?

Quiz – Can you Place the ON Clauses at the End?
Answer – Can you Place the ON Clauses at the End?
The 5-Table Join – Logical Insurance Model
Quiz - Write a Five Table Join Using ANSI Syntax
Answer - Write a Five Table Join Using ANSI Syntax
Quiz - Write a Five Table Join Using Non-ANSI Syntax
Answer - Write a Five Table Join Using Non-ANSI Syntax
Quiz – Re-Write this putting the ON clauses at the END
Answer – Re-Write this putting the ON clauses at the END

Chapter 9 – Date Functions
Current_Date
Current_Date, Current_Time and Current_Timestamp
Timestamp Differences
Getdate
Date and Time Keywords
Using CAST in Literal Values
Add or Subtract Days from a date
Formatting Dates
Formatting Date Example
A Summary of Math Operations on Dates
The ADD_MONTHS Command
Using the ADD_MONTHS Command to Add 1 Year
Using the ADD_MONTHS Command to Add 1 Year
Using the ADD_MONTHS Command to Add 5 Years
The EXTRACT Command
YEAR, MONTH, and DAY Functions
A Better Technique for YEAR, MONTH, and DAY Functions
Another Version of the EXTRACT Command

EXTRACT from DATES and TIME
Why EXTRACT is a Better Form
EXTRACT with DATE and TIME Literals
EXTRACT of the Month on Aggregate Queries
Intervals for Date, Time and Timestamp
Interval Data Types and the Bytes to Store Them
Using Intervals
How a Simple Interval Handles Leap Year
Interval Arithmetic Results
A Time Interval Example
A DATE Interval Example Going Back in Time
A Complex Time Interval Example using CAST
A Complex Time Interval Example using CAST
The OVERLAPS Command
An OVERLAPS Example that Returns No Rows
The OVERLAPS Command using TIME

Chapter 10 – OLAP Functions
The Row_Number Command
Quiz – How did the Row_Number Reset?
Quiz – How did the Row_Number Reset?
Using a Derived Table and Row_Number
Finding the First Occurrence using a WITH Derived Table
Finding the Last Occurrence using a WITH Derived Table

Ordered Analytics OVER
RANK and DENSE RANK
RANK Defaults to Ascending Order
Getting RANK to Sort in DESC Order
RANK OVER and PARTITION BY
PERCENT_RANK OVER
PERCENT_RANK OVER with 14 rows in Calculation
PERCENT_RANK OVER with 21 rows in Calculation
Quiz – What Causes the Product_ID to Reset?
Answer to Quiz – What Cause the Product_ID to Reset?
Finding Gaps between Dates
CSUM – Rows Unbounded Preceding Explained
CSUM – Making Sense of the Data
CSUM – Making Even More Sense of the Data
CSUM – The Major and Minor Sort Key(s)
The ANSI CSUM – Getting a Sequential Number
Troubleshooting the ANSI OLAP on a GROUP BY
Reset with a PARTITION BY Statement
PARTITION BY only Resets a Single OLAP not ALL of them
CURRENT ROW AND UNBOUNDED FOLLOWING
Different Windowing Options
Moving Sum has a Moving Window
How ANSI Moving SUM Handles the Sort
Quiz – How is that Total Calculated?
Answer to Quiz – How is that Total Calculated?
Moving SUM every 3-rows Vs a Continuous Average
PARTITION BY Resets an ANSI OLAP
The Moving Window is Current Row and Preceding
How Moving Average Handles the Sort

Moving Average
Moving Average
Quiz – How is that Total Calculated?
Answer to Quiz – How is that Total Calculated?
Quiz – How is that 4th Row Calculated?
Answer to Quiz – How is that 4th Row Calculated?
Moving Average every 3-rows vs a Continuous Average
PARTITION BY Resets an ANSI OLAP
Moving Difference using ANSI Syntax
Moving Difference using ANSI Syntax with Partition By
COUNT OVER for a Sequential Number
COUNT OVER without Rows Unbounded Preceding
Quiz – What caused the COUNT OVER to Reset?
Answer to Quiz – What caused the COUNT OVER to Reset?
The MAX OVER Command
MAX OVER with PARTITION BY Reset
MAX OVER without Rows Unbounded Preceding
The MIN OVER Command
MIN OVER without Rows Unbounded Preceding
Finding a Value of a Column in the Next Row with MIN
The CSUM for Each Product_Id and the Next Start Date
Quiz – Fill in the Blank
Answer – Fill in the Blank
How Ntile Works
Ntile
Ntile Continued
Ntile Percentile
Another Ntile Example
Using Tertiles (Partitions of Four)

NTILE
NTILE Using a Value of 10
NTILE with a Partition
Using FIRST_VALUE
FIRST_VALUE
FIRST_VALUE after Sorting by the Highest Value
FIRST_VALUE with Partitioning
Using LAST_VALUE
LAST_VALUE
Using LAG and LEAD
Using LEAD
Using LEAD With an Offset of 2
LEAD
LEAD With Partitioning
Using LAG
Using LAG with an Offset of 2
LAG
LAG with Partitioning
MEDIAN with Partitioning
CUME_DIST
CUME_DIST with a Partition
SUM (SUM (n))

Chapter 11 – Temporary Tables
There are three types of Temporary Tables
CREATING A Derived Table
Naming the Derived Table
Aliasing the Column Names in The Derived Table
Multiple Ways to Alias the Columns in a Derived Table
CREATING a Derived Table using the WITH Command
The Same Derived Query shown Three Different Ways
Most Derived Tables Are Used To Join To Other Tables
The Three Components of a Derived Table
Visualize This Derived Table
Our Join Example with a Different Column Aliasing Style
Column Aliasing Can Default for Normal Columns
A Derived example Using the WITH Syntax
Quiz - Answer the Questions
Answer to Quiz - Answer the Questions
Clever Tricks on Aliasing Columns in a Derived Table
A Derived Table lives only for the lifetime of a single query
An Example of Two Derived Tables in a Single Query
Example of Two Derived Tables in a Single WITH Statement
Finding the First Occurrence of a Row using WITH
Finding the Last Occurrence of a Row using WITH
Syntax for Temporary Tables
Temporary Tables Explained
Key Temporary Table Terms
Creating and Populating a Local Temporary Table
Using a Local Temporary Table
Creating and Populating a Global Temporary Table
Creating and Populating a Global Temporary Table
Some Great Examples of Creating a Temporary Table Quickly
Creating a Temporary Table That is sorted
A Temp Table That Populates some of the Rows
A Temporary Table with Some of the Columns

Chapter 12 – Sub-query Functions
An IN List is much like a Subquery
An IN List Never has Duplicates – Just like a Subquery
The Subquery
The Three Steps of How a Basic Subquery Works
These are Equivalent Queries
The Final Answer Set from the Subquery
Quiz- Answer the Difficult Question
Answer to Quiz- Answer the Difficult Question
Should you use a Subquery or a Join?
Quiz- Write the Subquery
Answer to Quiz- Write the Subquery
Quiz- Write the More Difficult Subquery
Answer to Quiz- Write the More Difficult Subquery
Quiz – Write the Extreme Subquery
Answer to Quiz- Write the Extreme Subquery
Quiz- Write the Subquery with an Aggregate
Answer to Quiz- Write the Subquery with an Aggregate
Quiz- Write the Correlated Subquery
Answer to Quiz- Write the Correlated Subquery
The Basics of a Correlated Subquery
The Top Query always runs first in a Correlated Subquery
Correlated Subquery Example vs. a Join with a Derived Table
Quiz- A Second Chance to Write a Correlated Subquery
Answer - A Second Chance to Write a Correlated Subquery
Quiz- A Third Chance to Write a Correlated Subquery
Answer - A Third Chance to Write a Correlated Subquery
Quiz- Last Chance to Write a Correlated Subquery
Answer – Last Chance to Write a Correlated Subquery
Quiz – Write the Extreme Correlated Subquery
Answer To Quiz – Write the Extreme Correlated Subquery
Quiz- Write the NOT Subquery
Answer to Quiz- Write the NOT Subquery
Quiz- Write the Subquery using a WHERE Clause
Answer - Write the Subquery using a WHERE Clause
Quiz- Write the Subquery with Two Parameters
Answer to Quiz- Write the Subquery with Two Parameters
How the Double Parameter Subquery Works
More on how the Double Parameter Subquery Works
Quiz – Write the Triple Subquery
Answer to Quiz – Write the Triple Subquery
Quiz – How many rows return on a NOT IN with a NULL?
Answer – How many rows return on a NOT IN with a NULL?
How to handle a NOT IN with Potential NULL Values
IN is equivalent to =ANY
Using a Correlated Exists
How a Correlated Exists matches up
The Correlated NOT Exists
The Correlated NOT Exists Answer Set
Quiz – How many rows come back from this NOT Exists?
Answer – How many rows come back from this NOT Exists?

Chapter 13 – Strings
The LENGTH Command Counts Characters
The LENGTH Command – Spaces can Count too
The LENGTH Command and Character Data
LENGTH and CHARACTER_LENGTH Are Equivalent
OCTET_LENGTH
UPPER and LOWER Commands
Using the LOWER Command
A LOWER Command Example
Using the UPPER Command
An UPPER Command Example
Non-Letters are Unaffected by UPPER and LOWER
The TRIM Command trims both Leading and Trailing Spaces
Trim Combined with the CHARACTERS Command
How to TRIM only the Trailing Spaces
A Visual of the TRIM Command Using Concatenation
Trim and Trailing is Case Sensitive
How to TRIM Trailing Letters
The SUBSTRING Command
SUBSTRING and SUBSTR are equal, but use different syntax
How SUBSTRING Works with NO ENDING POSITION
Using SUBSTRING to move backwards
How SUBSTRING Works with a Starting Position of -1
How SUBSTRING Works with an Ending Position of 0
An Example using SUBSTRING, TRIM and CHAR Together
The POSITION Command finds a Letter's Position
Quiz – Find that SUBSTRING Starting Position
Answer to Quiz – Find that SUBSTRING Starting Position
Using the SUBSTRING to Find the Second Word On
Quiz – Why did only one Row Return
Answer to Quiz – Why Did only one Row Return
Concatenation
Concatenation and SUBSTRING
Four Concatenations Together
Troubleshooting Concatenation

Chapter 14 – Interrogating the Data
Numeric Manipulation Functions
Finding the Cube Root
Ceiling Gets the Smallest Integer Not Smaller Than X
Floor Finds the Largest Integer Not Greater Than X
The Round Function and Precision
Quiz – What would the Answer be?
Answer to Quiz – What would the Answer be?
The NULLIFZERO Command
The NULLIFZERO vs. Zeroes
Quiz – Fill in the Blank Values in the Answer Set
Answer to Quiz – Fill in the Blank Values in the Answer Set
Quiz – Fill in the Answers for the NULLIF Command
Answer – Fill in the Answers for the NULLIF Command
The ZEROIFNULL Command
Answer to the ZEROIFNULL Question
The COALESCE Command
The COALESCE Answer Set
The Coalesce Quiz
Answer – The Coalesce Quiz
The COALESCE Command – Fill In the Answers
The COALESCE Answer Set
COALESCE is Equivalent to This CASE Statement
Some Great CAST (Convert and Store) Examples
Some Great CAST (Convert and Store) Examples
A Rounding Example
Some Great CAST (Convert and Store) Examples
Quiz - The Basics of the CASE Statements
Answer to Quiz - The Basics of the CASE Statements
Using an ELSE in the Case Statement
Using an ELSE as a Safety Net
Rules for a Valued Case Statement
Rules for a Searched Case Statement
The Basics of the CASE Statements
The Basics of the CASE Statement
Valued Case vs. a Searched Case
Quiz - Valued Case Statement
Answer - Valued Case Statement
Quiz - Searched Case Statement
Answer - Searched Case Statement
Quiz - When NO ELSE is present in CASE Statement
Answer - When NO ELSE is present in CASE Statement
When an ELSE is present in CASE Statement
Answer - When an ELSE is present in CASE Statement
The CASE Challenge
The CASE Challenge Answer
Combining Searched Case and Valued Case
A Trick for getting a Horizontal Case
Nested Case
Put a CASE in the ORDER BY

Chapter 15 – View Functions
The Fundamentals of Views
Creating a Simple View to Restrict Sensitive Columns
You SELECT From a View
Creating a Simple View to Restrict Rows
A View Provides Security for Columns and Rows
Basic Rules for Views
How to Modify a View
An Exception to the ORDER BY Rule inside a View
Views Are Sometimes CREATED for Formatting
Creating a View to Join Tables Together
How to Alias Columns in a View CREATE
The Standard Way Most Aliasing is done
What Happens When Both Aliasing Options Are Present
Resolving Aliasing Problems in a View CREATE
Answer to Resolving Aliasing Problems in a View CREATE
Aggregates on View Aggregates
Altering A Table After a View Has Been Created
A View that Errors after An ALTER

Chapter 16 – Set Operators Functions
Rules of Set Operators
INTERSECT Explained Logically
INTERSECT Explained Logically
UNION Explained Logically
UNION Explained Logically
UNION ALL Explained Logically
UNION ALL Explained Logically
EXCEPT Explained Logically
EXCEPT Explained Logically
Minus Explained Logically
Minus Explained Logically
Testing Your Knowledge
Answer - Testing Your Knowledge
Testing Your Knowledge
Answer - Testing Your Knowledge

Table of Contents An Equal Amount of Columns in both SELECT List ........................................................................................... 663 Columns in the SELECT list should be from the same Domain ........................................................................... 664 The Top Query handles all Aliases ........................................................................................................................ 665 The Bottom Query does the ORDER BY (a Number) .......................................................................................... 666 Great Trick: Place your Set Operator in a Derived Table..................................................................................... 667 UNION Vs UNION ALL ....................................................................................................................................... 668 Using UNION ALL and Literals ........................................................................................................................... 669 A Great Example of how EXCEPT works ............................................................................................................ 670 USING Multiple SET Operators in a Single Request............................................................................................ 671 Changing the Order of Precedence with Parentheses ............................................................................................ 672 Using UNION ALL for speed in Merging Data Sets ............................................................................................ 673 Chapter 17 – Table Create and Data Types .............................................................................................................. 675 Distribution Strategy 1 - Segmented By Hash ....................................................................................................... 
676 Distribution Strategy 2 - Unsegmented.................................................................................................................. 677 Sorting the Data in a Table CREATE Statement ................................................................................................... 678 Even Distribution ................................................................................................................................................... 679 Uneven Distribution Where the Data is Non-Unique ............................................................................................ 680 Matching Distribution Keys for Co-Location of Joins .......................................................................................... 681 Big Table / Small Table Joins ................................................................................................................................ 682 Fact and Dimension Table Distribution Key Designs ........................................................................................... 683 Why a Sort Key Improves Performance ................................................................................................................ 684 Sort Keys Help GROUP BY, ORDER BY and Window Functions ..................................................................... 685 Syntax for Temporary Tables ................................................................................................................................ 686 Temporary Tables Explained ................................................................................................................................. 687 Key Temporary Table Terms ................................................................................................................................. 688 Creating and Populating a Local Temporary Table ............................................................................................... 
689 Using a Local Temporary Table ............................................................................................................................ 690 Creating and Populating a Global Temporary Table ............................................................................................. 691

Table of Contents Creating and Populating a Global Temporary Table ............................................................................................. 692 Some Great Examples of Creating a Temporary Table Quickly ........................................................................... 693 Creating a Temporary Table That is sorted ........................................................................................................... 694 A Temp Table That Populates Some of the Rows ................................................................................................. 695 A Temporary Table with Some of the Columns .................................................................................................... 696 Chapter 18 – Data Manipulation Language (DML) ................................................................................................. 698 INSERT Syntax # 1 ................................................................................................................................................ 699 INSERT example with Syntax 1 ............................................................................................................................ 700 INSERT Syntax # 2 ................................................................................................................................................ 701 INSERT example with Syntax 2 ............................................................................................................................ 702 INSERT/SELECT Command ................................................................................................................................ 703 INSERT/SELECT example using All Columns (*) .............................................................................................. 704 INSERT/SELECT example with Less Columns ................................................................................................... 
705 Two UPDATE Examples ....................................................................................................................................... 706 Subquery UPDATE Command Syntax .................................................................................................................. 707 Example of Subquery UPDATE Command .......................................................................................................... 708 Join UPDATE Command Syntax .......................................................................................................................... 709 Example of an UPDATE Join Command .............................................................................................................. 710 Fast UPDATE ........................................................................................................................................................ 711 Example of Subquery DELETE Command ........................................................................................................... 712 Chapter 19 – Statistical Aggregate Functions........................................................................................................... 714 The Stats Table ....................................................................................................................................................... 715 The STDDEV_POP Function ................................................................................................................................ 716 A STDDEV_POP Example ................................................................................................................................... 717 The STDDEV_SAMP Function............................................................................................................................. 
718 A STDDEV_SAMP Example ................................................................................................................................ 719 The VAR_POP Function ....................................................................................................................................... 720

Table of Contents A VAR_POP Example ........................................................................................................................................... 721 The VAR_SAMP Function .................................................................................................................................... 722 A VAR_SAMP Example ....................................................................................................................................... 723 The VARIANCE Function..................................................................................................................................... 724 A VARIANCE Example ........................................................................................................................................ 725 The CORR Function .............................................................................................................................................. 726 A CORR Example .................................................................................................................................................. 727 Another CORR Example so you can compare ...................................................................................................... 728 The COVAR_POP Function .................................................................................................................................. 729 A COVAR_POP Example ..................................................................................................................................... 730 Another COVAR_POP Example so you can compare .......................................................................................... 731 The COVAR_SAMP Function .............................................................................................................................. 
732 A COVAR_SAMP Example .................................................................................................................................. 733 Another COVAR_SAMP Example so you can compare ...................................................................................... 734 The REGR_INTERCEPT Function ....................................................................................................................... 735 A REGR_INTERCEPT Example .......................................................................................................................... 736 Another REGR_INTERCEPT Example so you can compare ............................................................................... 737 The REGR_SLOPE Function ................................................................................................................................ 738 REGR_SLOPE Example ........................................................................................................................................ 739 Another REGR_SLOPE Example so you can compare ........................................................................................ 740 The REGR_AVGX Function ............................................................................................................................... 741 A REGR_AVGX Example .................................................................................................................................. 742 Another REGR_AVGX Example so you can compare ......................................................................................... 743 The REGR_AVGY Function ............................................................................................................................... 744 A REGR_AVGY Example .................................................................................................................................... 
745 Another REGR_AVGY Example so you can compare ......................................................................................... 746 The REGR_COUNT Function ............................................................................................................................. 747 A REGR_COUNT Example .................................................................................................................................. 748 The REGR_R2 Function ........................................................................................................................................ 749

Table of Contents A REGR_R2 Example ........................................................................................................................................... 750 The REGR_SXX Function..................................................................................................................................... 751 A REGR_SXX Example ........................................................................................................................................ 752 The REGR_SXY Function..................................................................................................................................... 753 A REGR_SXY Example ........................................................................................................................................ 754 The REGR_SYY Function..................................................................................................................................... 755 A REGR_SYY Example ........................................................................................................................................ 756 Using GROUP BY ................................................................................................................................................. 757

Chapter 1 – What is Columnar?

“When you go into court you, are putting your fate into the hands of twelve people who weren’t smart enough to get out of jury duty.” – Norm Crosby


What is Parallel Processing?

"After enlightenment, the laundry." - Zen Proverb

Tera-Tom's Parallel Processing Wash and Dry

"After parallel processing the laundry, enlightenment!" -Matrix Zen Proverb

Two guys were having fun on a Saturday night when one said, "I've got to go and do my laundry." The other said, "What!?" The first man explained that if he went to the laundromat the next morning, he would be lucky to get one machine and would be there all day. But if he went on Saturday night, he could get all the machines. Then, he could do all his wash and dry in two hours. Now that's parallel processing mixed in with a little dry humor!


Nothing Happens on Disk

[Figure: a CPU and Memory sit above a disk that holds the Orders table. The question is asked, "How are we doing on orders today?" The disk replies, "How would I know? I'm just a disk. I need to transfer the block of data to the memory, and that is a slow process."]

Orders (on disk)
Order_No  Customer_No  Order_Date  Order_Total
________  ___________  __________  ___________
100       21345679     01/01/2013  12347.53
200       32456733     01/01/2013  8005.91
300       31323134     01/01/2013  5111.47
400       87323456     01/01/2013  15231.62

“When you are courting a nice girl, an hour seems like a second. When you sit on a red-hot cinder, a second seems like an hour. That’s relativity.” – Albert Einstein

Data on disk does absolutely nothing. When data is requested, the computer moves the data one block at a time from disk into memory. Once the data is in memory, it is processed by the CPU at lightning speed. All computers work this way. The "Achilles Heel" of every computer is the slow process of moving data from disk to memory. The real theory of relativity is to find out how to get blocks of data from the disk into memory faster!


Data in Memory is fast as Lightning

[Figure: the Orders block below appears twice, once on disk and once as a copy in Memory next to the CPU.]

Orders
Order_No  Customer_No  Order_Date  Order_Total
________  ___________  __________  ___________
100       21345679     01/01/2013  12347.53
200       32456733     01/01/2013  8005.91
300       31323134     01/01/2013  5111.47
400       87323456     01/01/2013  15231.62

“You can observe a lot by watching.” – Yogi Berra

Once the data block is moved off of the disk and into memory, the processing of that block happens as fast as lightning. It is the movement of the block from disk into memory that slows down every computer. Data being processed in memory is so fast that even Yogi Berra couldn't catch it!


Parallel Processing Of Data

[Figure: four Parallel Processes. Each has its own Memory and its own disk holding four rows of the Orders table, and each copies its four rows into its Memory.]

Orders rows held by Parallel Process 1
Cust_No   Order_Date  Order_Total
________  __________  ___________
21345679  01/01/2013  12347.53
32456733  01/01/2013  8005.91
31323134  01/01/2013  5111.47
87323456  01/01/2013  15231.62

Orders rows held by Parallel Process 2
Cust_No   Order_Date  Order_Total
________  __________  ___________
34345699  01/01/2013  13347.51
41456543  01/01/2013  13005.91
51323154  01/01/2013  7611.57
67823486  01/01/2013  11671.92

Orders rows held by Parallel Process 3
Cust_No   Order_Date  Order_Total
________  __________  ___________
87945679  01/01/2013  8347.53
98756733  01/01/2013  17005.91
35623134  01/01/2013  3451.47
97873456  01/01/2013  19871.62

Orders rows held by Parallel Process 4
Cust_No   Order_Date  Order_Total
________  __________  ___________
44445679  01/01/2013  12447.53
32547733  01/01/2013  8055.66
57497134  01/01/2013  5651.47
87768956  01/01/2013  231.62

"If the facts don't fit the theory, change the facts." - Albert Einstein

Big Data is all about parallel processing. Parallel processing is all about taking the rows of a table and spreading them among many parallel processing units. Above, we can see a table called Orders. There are 16 rows in the table. Each parallel processor holds four rows. Now they can process the data in parallel and be four times as fast. What Albert Einstein meant to say was, “If the theory doesn't fit the dimension table, change it to a fact."
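The idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Vertica is implemented: the four lists stand in for the four parallel processing units on the slide, the names `UNITS`, `partial_sum`, and `parallel_average` are made up for this sketch, and Python threads are used purely to show the "each unit works on its own rows, then the results are combined" pattern.

```python
from concurrent.futures import ThreadPoolExecutor

# Order_Total values from the slide, four rows per parallel processing unit.
UNITS = [
    [12347.53, 8005.91, 5111.47, 15231.62],
    [13347.51, 13005.91, 7611.57, 11671.92],
    [8347.53, 17005.91, 3451.47, 19871.62],
    [12447.53, 8055.66, 5651.47, 231.62],
]


def partial_sum(rows):
    """Work done independently by one unit: sum and count of its own rows."""
    return sum(rows), len(rows)


def parallel_average(units):
    """Each unit aggregates its own rows; the partial results are then combined."""
    with ThreadPoolExecutor(max_workers=len(units)) as pool:
        partials = list(pool.map(partial_sum, units))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


if __name__ == "__main__":
    print(f"Average Order_Total: {parallel_average(UNITS):.2f}")
```

Each unit touches only its own four rows, so no unit ever has to look at all 16.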


The Problem with Row-Based Data

The entire block must be placed into memory just to calculate one column

[Figure: a Parallel Process whose Memory holds a full copy of the row-based block stored on its disk.]

Cust_No   Customer_Name        Phone     Order_No  Order_Date  Order_Total
________  ___________________  ________  ________  __________  ___________
31323134  ACE Consulting       555-1212  123552    10/01/1999  5111.47
57896883  XYZ Plumbing         347-8954  123777    09/09/1999  23454.84
11111111  Billy's Best Choice  555-1234  123456    05/04/1998  12347.53
11111111  Billy's Best Choice  555-1234  123512    01/01/1999  8005.91
87323456  Databases N-U        322-1012  123585    10/10/1999  15231.62

SELECT AVG(Order_Total) FROM Row_Based_Table;

Nothing happens on disk. For data to be processed, the block of disk data must be copied and moved into memory. The problem with row-based data is that the entire block must be moved into memory even when the query only needs to analyze a single column. When queries only need a few columns, moving the entire block is a lot of wasted energy.


Columnar Data Can Store Each Column in Their Own Block

Columnar data is designed to only move the columns needed to satisfy the query

[Figure: a Parallel Process whose disk stores each column of the table below in its own block. Only the Order_Total block (5111.47, 23454.84, 12347.53, 8005.91, 15231.62) is copied into Memory.]

Cust_No   Customer_Name        Phone     Order_No  Order_Date  Order_Total
________  ___________________  ________  ________  __________  ___________
31323134  ACE Consulting       555-1212  123552    10/01/1999  5111.47
57896883  XYZ Plumbing         347-8954  123777    09/09/1999  23454.84
11111111  Billy's Best Choice  555-1234  123456    05/04/1998  12347.53
11111111  Billy's Best Choice  555-1234  123512    01/01/1999  8005.91
87323456  Databases N-U        322-1012  123585    10/10/1999  15231.62

SELECT AVG(Order_Total) FROM Row_Based_Table;

Columnar systems can store each column of a table in its own individual block. This is extremely efficient when a query needs relatively few columns from the table. Our query above only needs the Order_Total column to get the average Order_Total. Only one small block on each parallel process is moved into memory. Wow, that was fast!
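To make the difference concrete, here is a small Python sketch that answers the same AVG(Order_Total) question against a row-based layout and a columnar layout and counts how many values each one has to move. It is a toy model under stated assumptions: a "block" is just a Python list, "values moved" is a simple count rather than bytes, and the function names are invented for this illustration.

```python
# The five rows from the slide, stored row by row (a row-based block).
COLUMNS = ["Cust_No", "Customer_Name", "Phone", "Order_No", "Order_Date", "Order_Total"]
ROWS = [
    (31323134, "ACE Consulting", "555-1212", 123552, "10/01/1999", 5111.47),
    (57896883, "XYZ Plumbing", "347-8954", 123777, "09/09/1999", 23454.84),
    (11111111, "Billy's Best Choice", "555-1234", 123456, "05/04/1998", 12347.53),
    (11111111, "Billy's Best Choice", "555-1234", 123512, "01/01/1999", 8005.91),
    (87323456, "Databases N-U", "322-1012", 123585, "10/10/1999", 15231.62),
]


def to_column_blocks(rows, names):
    """Pivot the rows into one block per column, keeping the row order aligned."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def avg_row_based(rows, names, column):
    """Row store: every value of every row is moved to reach one column."""
    idx = names.index(column)
    values_moved = sum(len(row) for row in rows)
    wanted = [row[idx] for row in rows]
    return sum(wanted) / len(wanted), values_moved


def avg_columnar(blocks, column):
    """Column store: only the block for the requested column is moved."""
    block = blocks[column]
    return sum(block) / len(block), len(block)


if __name__ == "__main__":
    blocks = to_column_blocks(ROWS, COLUMNS)
    row_avg, row_moved = avg_row_based(ROWS, COLUMNS, "Order_Total")
    col_avg, col_moved = avg_columnar(blocks, "Order_Total")
    print(f"row-based: avg={row_avg:.2f}, values moved={row_moved}")
    print(f"columnar:  avg={col_avg:.2f}, values moved={col_moved}")
```

Both layouts give the same answer. The row-based version touches all 30 values, while the columnar version touches only the 5 values in the Order_Total block.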


Why Columnar?

“Everyone is kneaded out of the same dough but not baked in the same oven.” – Yiddish Proverb

[Figure: each column below is stored in its own data block.]

Emp_No  Dept_No  First_Name  Last_Name  Salary
______  _______  __________  _________  ______
1001    100      Rafael      Minal      90000
1002    200      Maria       Gomez      80000
1003    300      Charl       Kertzel    70000
1004    400      Kyle        Stover     60000
1005    400      Rob         Rivers     50000
1006    300      Inna        Kinski     50000
1007    200      Sushma      Davis      50000
1008    100      Mo          Khan       60000
1009    300      Mo          Swartz     70000

Each data block holds a single column. The row can be rebuilt because everything is aligned perfectly. If someone runs a query that returns the average salary, then only one small data block is moved into memory. The salary block moves into memory, where it is processed as fast as lightning. Because only one of the five column blocks moves, we just cut down on moving data by 80%! Why columnar? Because like our Yiddish Proverb says, "All data is not kneaded on every query, so that is why it costs so much dough."


Row Based Blocks vs. Columnar Based Blocks

“Two roads diverged in a wood and I took the one less traveled by, and that has made all the difference.” – Robert Frost

[Figure: the same table shown twice, once as Row based blocks and once as a Columnar Design.]

Both designs have the same amount of data. Both take up just as much space. In this example, both have 9 rows and five columns. If a query needs to analyze all of the rows or return most of the columns, then the row based design is faster and more efficient. However, if the query only needs to analyze a few rows or merely a few columns, then the columnar design is much lighter because not all of the data is moved into memory. Just one or two columns move. Take the road less traveled.


Visualize the Data – Rows vs. Columns

[Figure, top: 24 rows (five columns) stored in 6 blocks in this row-based system]

[Figure, bottom: 24 rows (five columns) stored in 15 blocks (each column is its own block)]

Both examples above have the same data and the same amount of data. If your applications tend to analyze the majority of columns or read the entire table, then a row-based system (top example) moves more useful data into memory with each block. Columnar tables are advantageous when only a few columns need to be read. This is just one of the reasons why analytics goes with columnar like bread goes with butter. A row-based system must move the entire block into memory even if it only needs to read one row or even a single column. If a user above needed to analyze only the Salary column, the columnar system would move 80% less block mass, because Salary is one of five columns.


The Architecture of Vertica

[Figure: Compute Node 1 through Compute Node n, each containing multiple Segments.]

“Be the change that you want to see in the world.” - Mahatma Gandhi

Vertica is a shared nothing architecture, designed as a collection of Linux cluster nodes connected by a TCP/IP network. Storage can be directly attached to each node, or SAN-based. This technology is relatively inexpensive. It might not "be the change you want to see in the world", but it will help your company "keep the change" because costs are low.


Vertica Architecture Terms

Host - A server with a 64-bit processor, memory, hard disk and TCP/IP network interface. Hosts share neither disk space nor main memory with each other. Vertica is a shared-nothing MPP architecture.

Instance - An instance is a node running the Vertica process and disk storage (catalog and data) on a host. Only one instance of HP Vertica can be running on each host. Multiple instances make up a cluster.

Node - A host that is configured to run an instance of Vertica. It is a member of a database cluster, which consists of one or more nodes working together in parallel. HP Vertica provides recoverability across multiple nodes (minimum 3, recommended 4) so that a database can recover from a node failure.

Cluster - A collection of nodes bound to a database.

Database - A cluster of nodes that perform distributed data storage and SQL statement execution as a single unit.

Above are the key terms for the Vertica architecture.


Vertica has Linear Scalability

[Figure: Vertica Segments connected by an Ethernet Network.]

"A Journey of a thousand miles begins with a single step." - Lao Tzu

Vertica was born to be parallel. With each query, a single step is performed in parallel by each Segment. A Vertica system consists of a series of nodes that will work in parallel to store and process your data. This design allows you to start small and grow infinitely. If your Vertica system provides you with an excellent Return on Investment (ROI), then continue to invest by purchasing more nodes (which adds Segments). Most companies start small, but after seeing what Vertica can do, they continue to grow their ROI from the single step of implementing a Vertica system to millions of dollars in profits. Double the Segments and double the speed... forever. Vertica actually provides a journey of a thousand smiles!

Chapter 2 – Vertica Data Distribution

“Fall seven times, stand up eight.” – Japanese Proverb


Distribution Strategy 1 - Segmented By Hash

CREATE TABLE Employee_Table
(
  Employee_No integer NOT NULL,
  Dept_No     integer,
  Last_Name   char(20),
  First_Name  varchar(12)
)
SEGMENTED BY HASH(Employee_No) ALL NODES;

Rows to be loaded
Emp_No  Dept_No  First_Name  Last_Name
______  _______  __________  _________
1001    100      Rafael      Minal
1002    200      Maria       Gomez
1003    300      Charl       Kertzel
1004    400      Kyle        Stover
1005    400      Rob         Rivers
1006    300      Inna        Kinski
1007    200      Sushma      Davis
1008    100      Mo          Khan
1009    300      Mo          Swartz

[Figure: the Hash Key of each row decides which of the three segments receives it.]

Segment 1
1001    100      Rafael      Minal
1008    100      Mo          Khan
1009    300      Mo          Swartz

Segment 2
1002    200      Maria       Gomez
1007    200      Sushma      Davis
1005    400      Rob         Rivers

Segment 3
1003    300      Charl       Kertzel
1006    300      Inna        Kinski
1004    400      Kyle        Stover

The entire row of a table is on a segment, but each column in the row is in a separate block. Vertica spreads the rows of a table evenly across the nodes. A good Distribution Key is the key to good distribution!
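The routing step can be sketched in Python as follows. This is a minimal illustration, not Vertica's actual hash function: `zlib.crc32` is swapped in as a stand-in hash, the segment count of three comes from the slide, and `segment_for` and `distribute` are hypothetical helper names. Because the stand-in hash differs from Vertica's, the rows will not necessarily land on the same segments as in the figure.

```python
import zlib

# (Emp_No, Dept_No, First_Name, Last_Name) rows from the slide.
EMPLOYEES = [
    (1001, 100, "Rafael", "Minal"),
    (1002, 200, "Maria", "Gomez"),
    (1003, 300, "Charl", "Kertzel"),
    (1004, 400, "Kyle", "Stover"),
    (1005, 400, "Rob", "Rivers"),
    (1006, 300, "Inna", "Kinski"),
    (1007, 200, "Sushma", "Davis"),
    (1008, 100, "Mo", "Khan"),
    (1009, 300, "Mo", "Swartz"),
]


def segment_for(key, segment_count):
    """Stand-in hash (an assumption): CRC32 of the key text, folded onto the segments."""
    return zlib.crc32(str(key).encode("utf-8")) % segment_count


def distribute(rows, key_index, segment_count):
    """Send every row to exactly one segment, chosen by hashing its key column."""
    segments = [[] for _ in range(segment_count)]
    for row in rows:
        segments[segment_for(row[key_index], segment_count)].append(row)
    return segments


if __name__ == "__main__":
    for number, segment in enumerate(distribute(EMPLOYEES, 0, 3), start=1):
        print(f"Segment {number}: {[row[0] for row in segment]}")
```

The same key always hashes to the same segment, which is what lets the system find a row again without searching every segment.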


Distribution Strategy 2 - Unsegmented

CREATE TABLE Emp_Table
(
  Emp_No     INTEGER,
  Dept_No    INTEGER,
  First_name VARCHAR(12),
  Last_name  CHAR(20)
)
UNSEGMENTED ALL NODES;

Unsegmented means the table is copied in its entirety to all segments.

[Figure: Segment 1 through Segment n each hold the same Unsegmented (Replicated) copy of the table below.]

Emp_No  Dept_No  First_name  Last_name
______  _______  __________  _________
1001    100      Rafael      Minal
1002    200      Maria       Gomez
1007    200      Sushma      Davis
1004    400      Kyle        Stover
1005    400      Rob         Rivers
1008    100      Mo          Khan
1003    300      Charl       Kertzel
1006    300      Inna        Kinski
1009    300      Mo          Swartz

When Unsegmented is chosen for distribution, the entire table is copied to each segment. This is often termed replicated. The general idea is to Segment by Hash all large tables and to use Unsegmented on smaller tables.


Sorting the Data in a Table CREATE Statement

CREATE TABLE Order_Table
(
  Order_No    INTEGER NOT NULL,
  Cust_No     INTEGER,
  Order_Date  DATE,
  Order_Total DECIMAL(8,2)
)
ORDER BY Order_Date
SEGMENTED BY HASH(Order_No) ALL NODES;

Rows to be loaded
Order_No  Cust_No  Order_Date  Order_Total
________  _______  __________  ___________
1         100      1-1-2015    1000
2         200      1-1-2015    2000
3         300      1-2-2015    3000
4         400      1-3-2015    1000
5         400      1-3-2015    3000
6         300      1-4-2015    4000
7         200      1-5-2015    1000
8         100      1-6-2015    2000
9         300      1-6-2015    1000

[Figure: the Hash of Order_No sends each row to one of three segments. Within each segment the rows are stored in sortkey (Order_Date) order.]

Segment (sorted by sortkey Order_Date)
1         100      1-1-2015    1000
8         100      1-6-2015    2000
9         300      1-6-2015    1000

Segment (sorted by sortkey Order_Date)
2         200      1-1-2015    2000
5         400      1-3-2015    3000
7         200      1-5-2015    1000

Segment (sorted by sortkey Order_Date)
3         300      1-2-2015    3000
4         400      1-3-2015    1000
6         300      1-4-2015    4000

We have chosen the Order_Date column as the sort key and the Order_No column as the Hash Key.

Even Distribution

CREATE TABLE Order_Table3
(Order_No INTEGER NOT NULL,
 Cust_No INTEGER,
 Order_Date DATE,
 Order_Total DECIMAL(8,2))
SEGMENTED BY HASH(Order_No) ALL NODES

Source rows:

Order_No  Cust_No  Order_Date  Order_Total
1         100      1-1-2015    1000
2         200      1-1-2015    2000
3         300      1-2-2015    3000
4         400      1-3-2015    1000
5         400      1-3-2015    3000
6         300      1-4-2015    4000
7         200      1-5-2015    1000
8         100      1-6-2015    2000
9         300      1-6-2015    1000

Figure: the nine rows are hashed on Order_No to three segments, three rows per segment.

Segment  Order_No  Cust_No  Order_Date  Order_Total
1        1         100      1-1-2015    1000
1        8         100      1-6-2015    2000
1        9         300      1-6-2015    1000
2        2         200      1-1-2015    2000
2        5         400      1-3-2015    3000
2        7         200      1-5-2015    1000
3        3         300      1-2-2015    3000
3        4         400      1-3-2015    1000
3        6         300      1-4-2015    4000

The data has spread evenly among the segments for this table. Do you know why? The Hash Key is Order_No and it is a unique value. Hashing unique values results in near perfect distribution every single time.
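The effect of hashing a unique key can be sketched in a few lines of Python. This is only an illustration: MD5 stands in for Vertica's internal hash function, and the helper names `segment_for` and `rows_per_segment` are made up for this example.

```python
import hashlib
from collections import Counter


def segment_for(key, num_segments):
    """Map a key to a segment with a stable hash (MD5 is a stand-in, not Vertica's HASH)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_segments


def rows_per_segment(keys, num_segments=3):
    """Count how many rows land on each segment."""
    counts = Counter(segment_for(key, num_segments) for key in keys)
    return [counts.get(segment, 0) for segment in range(num_segments)]


# 9,000 distinct Order_No values spread across three segments.
unique_order_numbers = range(1, 9001)
print(rows_per_segment(unique_order_numbers))
```

Because every key is different, each row is an independent draw across the segments, so the counts come out close to one third each.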

Uneven Distribution Where the Data is Non-Unique

CREATE TABLE Order_Table4
(Order_No INTEGER NOT NULL,
 Cust_No INTEGER,
 Order_Date DATE,
 Order_Total DECIMAL(8,2))
SEGMENTED BY HASH(Cust_No) ALL NODES

Source rows:

Order_No  Cust_No  Order_Date  Order_Total
1         100      1-1-2015    1000
2         200      1-1-2015    2000
3         300      1-2-2015    3000
4         400      1-3-2015    1000
5         400      1-3-2015    3000
6         300      1-4-2015    4000
7         200      1-5-2015    1000
8         100      1-6-2015    2000
9         300      1-6-2015    1000

Figure: the rows are hashed on Cust_No to three segments. Rows that share a Cust_No value (for example orders 1 and 8 for customer 100, or orders 3, 6 and 9 for customer 300) are stored together on the same segment, so the segments hold different numbers of rows.

The data did not spread evenly among the segments for this table. Do you know why? The Hash Key is Cust_No. All like values went to the same Node. This distribution isn't perfect, but it is reasonable, so it is an acceptable practice.
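A small Python sketch shows why a non-unique hash key produces skew. It uses the nine (Order_No, Cust_No) pairs from the slide; the MD5-based `segment_for` is again a stand-in for Vertica's hash, so the exact segment numbers are illustrative only.

```python
import hashlib
from collections import defaultdict

# (Order_No, Cust_No) pairs taken from the slide above.
ORDERS = [
    (1, 100), (2, 200), (3, 300), (4, 400), (5, 400),
    (6, 300), (7, 200), (8, 100), (9, 300),
]


def segment_for(key, num_segments=3):
    """Stable stand-in hash; not Vertica's actual HASH function."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_segments


def place_rows(rows, key_index, num_segments=3):
    """Group rows by the segment that their hash-key column maps to."""
    segments = defaultdict(list)
    for row in rows:
        segments[segment_for(row[key_index], num_segments)].append(row)
    return dict(segments)


by_customer = place_rows(ORDERS, key_index=1)
for segment, rows in sorted(by_customer.items()):
    print(segment, rows)
```

With four distinct customers and three segments, at least two customers must share a segment, so one segment always ends up with four or more of the nine rows.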

Matching Distribution Keys for Co-Location of Joins

CREATE TABLE Employee_table
(Emp_No INTEGER NULL,
 Dept_No INTEGER NULL,
 Last_name CHAR(20) NULL,
 First_name VARCHAR(12) NULL)
SEGMENTED BY HASH(Dept_No) ALL NODES;

CREATE TABLE Department_table
(Dept_No INTEGER NULL,
 Dept_Name CHAR(20) NULL,
 Mgr_No INTEGER,
 Budget DECIMAL(10,2))
SEGMENTED BY HASH(Dept_No) ALL NODES;

Figure: both tables are hashed on Dept_No, so each segment holds the employees and the department rows for the same department numbers.

Segment  Employee_Table rows                                    Department_Table rows
1        1001 100 Rafael Minal; 1008 100 Mo Khan                100 Fin 1008 90000
2        1002 200 Maria Gomez; 1007 200 Sushma Davis;           200 HR 1002 500000;
         1004 400 Kyle Stover; 1005 400 Rob Rivers              400 IT 1005 600000
3        1003 300 Charl Kertzel; 1006 300 Inna Kinski;          300 Mrkt 1006 500000
         1009 300 Mo Swartz

Notice that both tables are distributed by Hash on the column Dept_No. When these two tables are joined on Employee_table.Dept_No = Department_table.Dept_No, the rows with matching department numbers are already on the same segment. This is called co-location. No rows have to move between nodes, which makes the join efficient and fast.
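To make co-location concrete, here is a minimal Python sketch that distributes both tables on Dept_No and joins each segment locally. The modulo function is an assumed stand-in for HASH(Dept_No), and only the key columns from the slide are kept.

```python
from collections import defaultdict

NUM_SEGMENTS = 3

# (Emp_No, Dept_No) and (Dept_No, Dept_Name) values from the slide.
EMPLOYEES = [(1001, 100), (1008, 100), (1002, 200), (1007, 200), (1004, 400),
             (1005, 400), (1003, 300), (1006, 300), (1009, 300)]
DEPARTMENTS = [(100, "Fin"), (200, "HR"), (300, "Mrkt"), (400, "IT")]


def segment_for(dept_no):
    """Stand-in for HASH(Dept_No); any deterministic function works for the sketch."""
    return dept_no % NUM_SEGMENTS


def distribute(rows, key_of):
    """Place each row on the segment chosen by its distribution key."""
    segments = defaultdict(list)
    for row in rows:
        segments[segment_for(key_of(row))].append(row)
    return segments


emp_segments = distribute(EMPLOYEES, key_of=lambda row: row[1])
dept_segments = distribute(DEPARTMENTS, key_of=lambda row: row[0])


def local_join(segment_id):
    """Join using only the rows that live on one segment."""
    names = dict(dept_segments.get(segment_id, []))
    return [(emp_no, dept_no, names[dept_no])
            for emp_no, dept_no in emp_segments.get(segment_id, [])
            if dept_no in names]


joined = [row for s in range(NUM_SEGMENTS) for row in local_join(s)]
print(sorted(joined))
```

Every employee finds its department without looking at another segment, because both tables used the same key and the same hash.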

Big Table / Small Table Joins

Figure: Employee_Table is segmented by hash, so its nine rows are spread across the three segments. Department_Table is replicated, so every segment holds all four of its rows:

Dept_No  Dept_Name  Mgr_No  Budget
100      Fin        1008    90000
200      HR         1002    500000
300      Mrkt       1006    500000
400      IT         1005    600000

Notice that the Department_Table has only four rows. Those four rows are copied to every segment because the table is distributed as UNSEGMENTED. Now, the Department_Table can be joined to the Employee_Table with a guarantee that matching rows are co-located. They are co-located because the smaller table has copied ALL of its rows to each Node. When a join involves one large table (fact table) and one small table (dimension table), use the UNSEGMENTED keyword to distribute the smaller table. This technique is also called a "Big Table / Small Table Join".
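The same idea can be sketched for a replicated small table. In this illustrative Python example the large table is segmented on Emp_No (modulo stands in for the hash) and every segment receives a full copy of the department lookup; all names are made up for the sketch.

```python
NUM_SEGMENTS = 3

# (Emp_No, Dept_No) values from the slide.
EMPLOYEES = [(1001, 100), (1008, 100), (1002, 200), (1007, 200), (1004, 400),
             (1005, 400), (1003, 300), (1006, 300), (1009, 300)]
DEPARTMENTS = {100: "Fin", 200: "HR", 300: "Mrkt", 400: "IT"}

# Large table: segmented on Emp_No (modulo is a stand-in for the hash).
emp_segments = {
    s: [row for row in EMPLOYEES if row[0] % NUM_SEGMENTS == s]
    for s in range(NUM_SEGMENTS)
}

# Small table: unsegmented, so each segment keeps its own full copy.
dept_copies = {s: dict(DEPARTMENTS) for s in range(NUM_SEGMENTS)}


def local_join(segment_id):
    """Join one segment's employees against that segment's local department copy."""
    local_departments = dept_copies[segment_id]
    return [(emp_no, dept_no, local_departments[dept_no])
            for emp_no, dept_no in emp_segments[segment_id]]


joined = [row for s in range(NUM_SEGMENTS) for row in local_join(s)]
print(sorted(joined))
```

The large table can be segmented on any column, since the lookup never leaves the local segment.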

Fact and Dimension Table Distribution Key Designs

Figure: a star schema. Make the Part_Key the Distribution Key for the two largest tables.

Line_Order_Fact_Table (distributed on LO_Part_Key): LO_Order_Key, LO_Line_Number, LO_Cust_Key, LO_Part_Key, LO_Ship_Priority, LO_Quantity, LO_Extended_Price, LO_Supp_Key, LO_Order_Total_Price, LO_Discount, LO_Tax, LO_Order_Date, LO_Supply_Cost, LO_Revenue, LO_Ship_Mode

Part_Table (distributed on P_Part_Key)

Customer_Table (C_Cust_Key) - REPLICATED
Supplier_Table (S_Supp_Key) - REPLICATED
Date_Table (D_Date_Key) - REPLICATED

The fact table (Line_Order_Fact_Table) is the largest table, and the Part_Table is the largest dimension table. That is why you make Part_Key the distribution key for both tables. Now, when these two tables are joined together, the matching Part_Key rows are on the same Node. You can then distribute the other dimension tables as UNSEGMENTED, which replicates them to each node, so each of those tables has all of its rows on every Node. Now, everything that joins to the fact table is co-located!

Why a Sort Key Improves Performance

CREATE TABLE Order_Table
(Order_No INTEGER NOT NULL,
 Cust_No INTEGER,
 Order_Date DATE,
 Order_Total DECIMAL(8,2))
ORDER BY Order_Date
SEGMENTED BY HASH(Order_No) ALL NODES

Figure: Segment 1, Segment 2 and Segment 3 each store their Order_Table rows sorted by Order_Date, in blocks covering JAN-FEB, MAR-APR and MAY-JUN.

There are three basic reasons to specify a sort order (the ORDER BY clause) when creating a table.
1) If recent data is queried most frequently, specify the timestamp or date column as the leading column of the sort key.
2) If you do frequent range filtering or equality filtering on one column, specify that column as the sort key.
3) If you frequently join a (dimension) table, specify the join column as the sort key.
Above, you can see we have made the Order_Date column our sort key. Look how the data is sorted!
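Why sorted data helps range filters can be shown with a short Python sketch. It is an analogy rather than Vertica's actual storage code: once the dates on a segment are sorted, a binary search finds the matching slice without scanning every row. The function name `range_positions` is hypothetical.

```python
from bisect import bisect_left, bisect_right
from datetime import date

# Order_Date values from the slide data, already sorted by the sort key.
order_dates = [
    date(2015, 1, 1), date(2015, 1, 1), date(2015, 1, 2),
    date(2015, 1, 3), date(2015, 1, 3), date(2015, 1, 4),
    date(2015, 1, 5), date(2015, 1, 6), date(2015, 1, 6),
]


def range_positions(sorted_dates, start, end):
    """Return (lo, hi) so that sorted_dates[lo:hi] holds every date in [start, end]."""
    lo = bisect_left(sorted_dates, start)
    hi = bisect_right(sorted_dates, end)
    return lo, hi


lo, hi = range_positions(order_dates, date(2015, 1, 3), date(2015, 1, 5))
print(order_dates[lo:hi])
```

The two binary searches touch only a handful of values, and everything between the two positions is known to qualify.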

Sort Keys Help Group By, Order By and Window Functions

CREATE TABLE Order_Table
(Order_Number INTEGER NOT NULL,
 Customer_Number INTEGER,
 Order_Date DATE,
 Order_Total DECIMAL(8,2))
ORDER BY Customer_Number, Order_Date
SEGMENTED BY HASH(Order_Number) ALL NODES

SELECT Customer_Number
 ,SUM(Order_Total) AS "Order Sum"
 ,AVG(Order_Total) AS "Avg Order"
FROM Order_Table
GROUP BY Customer_Number
ORDER BY Customer_Number ;

SELECT Customer_Number
 ,Order_Date
 ,Order_Total
 ,SUM(Order_Total) OVER (PARTITION BY Customer_Number
      ORDER BY Customer_Number, Order_Date
      ROWS UNBOUNDED PRECEDING) AS "Cumulative Sum"
FROM Order_Table

When data is sorted on a strategic column, it improves aggregation and sorting (GROUP BY and ORDER BY operations) and window functions (PARTITION BY and ORDER BY operations), and it also helps optimize compression. As new rows are incrementally loaded, they are sorted, but they reside temporarily in separate storage containers on disk. Vertica's Tuple Mover merges these containers in the background (mergeout), so no VACUUM command is needed to keep the data sorted. To keep the optimizer's statistics current after large loads, run ANALYZE_STATISTICS.
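The benefit for GROUP BY and cumulative sums can be illustrated in Python. When rows arrive already sorted by customer and date, one streaming pass is enough, with no hash table or extra sort. This is an analogy using the slide's data, not Vertica's execution engine, and the function name is made up.

```python
from itertools import accumulate, groupby
from operator import itemgetter

# (Customer_Number, Order_Date, Order_Total), pre-sorted by customer, then date.
ROWS = [
    (100, "2015-01-01", 1000), (100, "2015-01-06", 2000),
    (200, "2015-01-01", 2000), (200, "2015-01-05", 1000),
    (300, "2015-01-02", 3000), (300, "2015-01-04", 4000), (300, "2015-01-06", 1000),
    (400, "2015-01-03", 1000), (400, "2015-01-03", 3000),
]


def sums_and_running_totals(sorted_rows):
    """One pass over sorted rows: per-customer SUM plus the cumulative sum."""
    result = {}
    for customer, group in groupby(sorted_rows, key=itemgetter(0)):
        totals = [row[2] for row in group]
        result[customer] = (sum(totals), list(accumulate(totals)))
    return result


for customer, (order_sum, running) in sums_and_running_totals(ROWS).items():
    print(customer, order_sum, running)
```

Note that `groupby` only merges adjacent equal keys, which is exactly why the input has to be sorted first.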

Chapter 3 – Clever Features of Vertica

“Always remember that you are unique just like everyone else.” - Anonymous

Super Projections

Customer_Table

Customer_Number  Customer_Name
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting

Order_Table

Order_Number  Customer_Number  Order_Total
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        31323134         15231.62

Figure: Node 1 through Node n.

Customer_Table superprojection (Unsegmented): replicated on all nodes, so every node holds all three customer rows.

Order_Table superprojection (Segmented): the rows are hashed to different nodes. Node 1 holds orders 123456 and 123512; Node n holds orders 123552 and 123585.

A superprojection contains all columns of a single table by default.

Vertica creates a default superprojection for each table in the database so that all SQL queries can be answered. A superprojection consists of all columns in the table and this is done by default when the data is first loaded or inserted. Notice that both superprojections above are either replicated across all nodes or the rows are hashed to different nodes.

Vertica Projections

Original Data

Stu_ID  First_Name  Last_Name  Class_Code  Grade_Pt
1       Thomas      Wendy      FR          4.00
2       Smith       Andy       SO          2.00
3       McRoberts   Richard    JR          1.90
4       Phillips    Martin     SR          3.00

Physically Stored as Columns

Stu_ID: 1, 2, 3, 4
First_Name: Thomas, Smith, McRoberts, Phillips
Last_Name: Wendy, Andy, Richard, Martin
Class_Code: FR, SO, JR, SR
Grade_Pt: 4.00, 2.00, 1.90, 3.00

Split into Multiple Projections

Student_Names: Stu_ID, First_Name, Last_Name
Student_Grades: Stu_ID, Class_Code, Grade_Pt

Vertica stores data physically in views called projections. Each projection contains a subset of the columns, but each subset can be sorted differently. Projections can even contain columns from multiple tables, like a materialized view. Every data element in a table will appear in at least one projection. Tables occupy no physical storage! It is only the projections that are stored. This allows Vertica to group columns used most often together right next to each other in the physical storage.

The Five Advantages of Projections

1. The Vertica query optimizer automatically picks the best projections to use for any query, so no user interaction is required.

2. Projections compress and encode data to greatly reduce the space required for storing data.

3. Vertica operates on the encoded data when it can in order to avoid the cost of decoding.

4. Because Vertica uses a combination of both compression and encoding, it keeps disk space as small as possible and still maximizes query performance.

5. Projections also provide high availability and recovery by duplicating table columns on at least K+1 nodes within the cluster. If a node fails, the database continues to operate by using the duplicate data on a buddy node (or nodes).

Vertica projections store data in an encoded format designed for automatic performance tuning. Think of projections as similar to Join Indexes in Teradata or materialized views in Oracle. Projections are really result sets that are stored on disk. Instead of being recomputed each time, these stored results can be used directly by each query. Projections are automatically refreshed whenever data values are inserted, deleted, updated or copied.

Creating a Projection

CREATE PROJECTION Coffing.SQL_Class.Order_Projection
( Order_Number_P ENCODING RLE
 ,Customer_Number_P ENCODING RLE
 ,Order_Date_P
 ,Order_Total_P )
AS SELECT Order_Number
 ,Customer_Number
 ,Order_Date
 ,Order_Total
FROM Coffing.SQL_Class.Order_Table
ORDER BY Order_Date                   -- how the data is sorted on each node
SEGMENTED BY HASH(Customer_Number)
ALL NODES OFFSET 1;                   -- how the data will be distributed to the nodes

In the name Coffing.SQL_Class.Order_Projection, Coffing is the database, SQL_Class is the schema and Order_Projection is the projection name. The parenthesized list that follows names the projection columns.

Above, we created a projection using all columns in the table, but we determined how we wanted the data sorted and distributed. The SELECT list feeds the projection columns by position, so it must list the table columns in the same order as the projection column list.

Read-Optimized Store (ROS)/Write-Optimized Store (WOS)

Figure: one node. In the memory cache, the Write-Optimized Store (WOS) holds two epochs marked "Ready for Transfer" plus the Current Epoch. On the disks for permanent storage sits the Read-Optimized Store (ROS). Both can be queried.

Periodically, the Tuple Mover migrates the recent updates that are "Ready for Transfer" to the permanent storage in the Read-Optimized Store (ROS).

Vertica caches all updates to a main memory called the Write-Optimized Store (WOS), which by the way is queryable. The WOS puts the data into projections in collection buckets that are uncompressed and unsorted, but are in update order. The Tuple Mover then migrates the recent updates during certain periods to the permanent disk storage in the Read-Optimized Store (ROS). The data in the ROS is sorted, compressed and packed into variable length disk blocks.
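The division of labor between the two stores can be sketched with a toy Python class. This is a simplified model under stated assumptions: plain lists stand in for memory and disk, a single sorted list stands in for a projection, and `moveout` is a stand-in for the Tuple Mover's migration step.

```python
import heapq


class ToyStore:
    """Toy model of a WOS (unsorted, in memory) feeding a ROS (sorted, on disk)."""

    def __init__(self):
        self.wos = []  # rows in arrival order, unsorted and uncompressed
        self.ros = []  # rows kept in sorted order

    def insert(self, value):
        """Loads are cheap: just append to the in-memory buffer."""
        self.wos.append(value)

    def moveout(self):
        """Tuple Mover stand-in: sort the buffered batch and merge it into the ROS."""
        batch = sorted(self.wos)
        self.ros = list(heapq.merge(self.ros, batch))
        self.wos = []

    def query(self):
        """Both stores are queryable, so a read combines ROS and WOS rows."""
        return sorted(self.ros + self.wos)


store = ToyStore()
for value in (5, 1, 3):
    store.insert(value)
print(store.query())   # rows are visible before they reach the ROS
store.moveout()
print(store.ros, store.wos)
```

Writes stay fast because sorting is deferred to the background step, while reads still see every row.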

Write-Optimized Store (WOS) is Memory Resident

Figure: the Write-Optimized Store (WOS) holds two epochs marked "Ready for Transfer" plus the Current Epoch, above the Read-Optimized Store (ROS).

Vertica's Write-Optimized Store (WOS) is always memory-resident, and it is a buffer for INSERT, UPDATE, DELETE, and COPY operations. To support very fast data load speeds, the WOS stores records without data compression or indexing. A projection in the WOS is sorted only when it is queried. It remains sorted as long as no further data is inserted into it. The WOS organizes data by epoch and holds both committed and uncommitted transaction data. Both the Read-Optimized Store (ROS) and the Write-Optimized Store (WOS) are arranged by projections. This technique allows for continuous loading throughout the day without having a major impact on read queries.

Updates are collected in Time-Based Buckets called Epochs

Figure: the Write-Optimized Store (WOS) holds two epochs marked "Ready for Transfer" plus the Current Epoch, above the Read-Optimized Store (ROS).

Vertica caches all updates in main memory in the Write-Optimized Store (WOS), which, by the way, is queryable. The WOS is designed so that updates can be collected in time-based buckets. At fixed intervals, Vertica closes the current epoch and begins a new epoch. The non-current epochs are queryable and are marked for migration by the Tuple Mover to the permanent disk storage called the Read-Optimized Store (ROS). This design gives the majority of users, who only need to read data, an open gateway, and it also allows for near real-time data warehouses with high append data volumes.

Vertica Does Not Support In-Place Updates

Figure: one node. The Write-Optimized Store (WOS) holds two epochs marked "Ready for Transfer" plus the Current Epoch. The Tuple Mover updates the Read-Optimized Store (ROS) by deleting and re-inserting rows.

Vertica's Tuple Mover updates by deleting and re-inserting rows. Appended data is added to the end of a columnstore block and updated data (in the middle) of a block is deleted and re-inserted. Vertica does not support in-place updates.

K-Safety

Figure: a five-node cluster (Node 1 through Node 5).

The K in K-Safety is the number of duplicate copies of the data that are stored. In this example, K = 1. This example is not designed to represent mirroring, but in effect each node has a buddy node that keeps a backup copy of its data in case of a failure. Node 1 holds the backup for node 2, node 2 holds the backup for node 3, and so on.

You can view a list of critical nodes in your database by running the query below from your Nexus Chameleon.

SELECT * FROM v_monitor.critical_nodes;

Any one of the nodes in the cluster example above could fail, and the database would still be able to continue to perform. The performance would be lower because one node would have to handle its own workload and the workload of the failed node.
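The buddy arrangement described above can be modeled in a few lines of Python. This is a simplified sketch, not Vertica's actual placement logic: it assumes a plain ring in which the backups for a node sit on the K nodes before it, and it only checks whether every node's data is still reachable.

```python
def buddy_map(num_nodes, k=1):
    """For each node (numbered from 1), list the k nodes holding its backup copies.

    Following the text, node 1 backs up node 2, node 2 backs up node 3, and so on,
    wrapping around the ring. Using the k preceding nodes is an assumption.
    """
    return {
        node: [((node - 1 - j) % num_nodes) + 1 for j in range(1, k + 1)]
        for node in range(1, num_nodes + 1)
    }


def survives(failed, num_nodes, k=1):
    """True if every node's data is available on the node itself or on a live buddy."""
    holders = buddy_map(num_nodes, k)
    return all(
        node not in failed or any(buddy not in failed for buddy in holders[node])
        for node in range(1, num_nodes + 1)
    )


print(buddy_map(5, k=1))
print(survives({3}, num_nodes=5, k=1))      # one failure with K = 1
print(survives({2, 3}, num_nodes=5, k=1))   # a node and its buddy both down
print(survives({2, 3}, num_nodes=5, k=2))   # the same failure with K = 2
```

In this model a single failure is always survivable with K = 1, while losing a node together with its buddy is not; raising K to 2 covers that case.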

K-Safety of 2

Figure: a five-node cluster (Node 1 through Node 5) in which each node is connected to its neighbors.

To see the K-Safety number, just run the query below from your Nexus Chameleon.

SELECT current_fault_tolerance FROM system ;

Any two nodes in the cluster example above could fail, and the database would still be able to continue to perform. Each node is a buddy to the nodes before and after it.

The Five Data Isolation Modes

1. Snapshot – Queries and updates do not interfere with each other, so read-only queries do not require locking.

2. Serializable – Transactions run in serial order. Locks are acquired for both read and write operations, which ensures that any successive SELECT commands within a single transaction always produce the same results.

3. Repeatable Read – Auto-converts to SERIALIZABLE.

4. Read Committed – A SELECT query sees a snapshot of the committed data at the start of the transaction and any results of updates run within its transaction, even if they have not been committed.

5. Read Uncommitted (Read Without Integrity) – Auto-converts to READ COMMITTED.

By default, Vertica uses the READ COMMITTED isolation level.

Vertica supports all types of database isolation. Database isolation refers to how the concurrent users of data affect each other as they read and change data in the database. The key question comes down to integrity of data vs. concurrency. Although the optimizer understands all five standard SQL isolation levels, internally Vertica uses only two isolation levels. They are "Read Committed" and "Serializable". So, you may not get the other isolations you request. Vertica automatically translates "Read Uncommitted" to "Read Committed" and "Repeatable Read" to "Serializable".

Import/Export between Multiple Vertica Systems

Figure: Vertica System 1 (Node 1 through Node 4) exchanging data with Vertica System 2 (Node 1 through Node 3, on SAN storage).

Entire databases, or certain portions of databases, can be moved from one Vertica system to another by using a simple SQL statement. Notice that the instances do not need to be the same size or have the same storage requirements. The data is automatically re-segmented to match the new configuration, and projections are resorted based on queries being run.

Roles

Figure: 1,000 users have been given the Mrkt_User_Role. The role Mrkt_User_Role is granted the SELECT (read) privilege on all tables in the database Mrkt: Customers, Products, Orders and Sales.

Roles simplify database administration by assigning access rights to tables and other objects, and then groups of people with similar job functions (or roles) can access these objects. It is as simple as creating different roles for different job functions and responsibilities, and then granting specific privileges (access rights) on database objects to these roles, and then granting a role or roles to users who share the same privileges. Vertica database security supports roles conforming to SQL 2008 specifications. This type of security is essential for management of data access across large organizations.

Compression

Vertica compresses data in order to save space. Here are the facts:

• Vertica can utilize over twelve different compression options.

• The compression depends on the data.

• Vertica will choose which compression option to apply.

• NULLs take up no space on Vertica because they are compressed.

• Vertica will compress data by 70% on average.

• HP Vertica queries data in encoded form.

• When similar data is grouped, you have even more options.

One of the key advantages of columnar storage is the ability to compress column data. When column stores are compressed, they can store more data, provide more projections and use less hardware. This can allow up to 50% more historical data to be stored and queried. The following pages show just some of the compression techniques utilized. Encoding is the process of converting data into a standard format. Encoded data can be processed directly; compressed data cannot. Vertica operates on encoded data when it can to avoid the heavy cost of decoding.

Runlength encoding

Encoding Type: Run-length
Encoding Keyword: RLE
Data Types: All

Original Data  Original Size (bytes)  Compressed Value  Compressed Size (bytes)
Ohio           4                      4, Ohio           5
Ohio           4                                        0
Ohio           4                                        0
Ohio           4                                        0
California     10                     3, California     11
California     10                                       0
California     10                                       0
Michigan       8                      2, Michigan       9
Michigan       8                                        0

Runlength encoding replaces a value that is repeated consecutively with a token that consists of the value and a count of the number of consecutive occurrences (the length of the run). This is where the name Runlength comes into play. A separate dictionary of unique values is created for each block of column values on disk. This encoding is best suited to a table in which data values are often repeated consecutively, for example, when the table is sorted by those values.
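A minimal Python sketch of run-length encoding, using the state names from the slide. It shows the idea of (count, value) tokens only; it says nothing about Vertica's on-disk format, and the function names are made up.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of consecutive equal values into a (count, value) pair."""
    return [(len(list(run)), value) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (count, value) pairs back into the original sequence."""
    return [value for count, value in pairs for _ in range(count)]


states = ["Ohio"] * 4 + ["California"] * 3 + ["Michigan"] * 2
encoded = rle_encode(states)
print(encoded)
print(rle_decode(encoded) == states)
```

Nine values shrink to three tokens here. If the same values were shuffled, the runs would be short and the encoding would save little, which is why sorting on the column matters.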

LZO Encoding

Encoding Type: LZO
Encoding Keyword: LZO
Data Types: All except BOOLEAN, REAL, and DOUBLE PRECISION

• Designed to work best with Char and Varchar data that store long character strings

• Is a portable lossless data compression library written in ANSI C

• Offers fast compression and extremely fast decompression

• Includes slower compression levels achieving a quite competitive compression ratio while still decompressing at this very high speed

• Often implemented with a tool called LZOP

Lempel–Ziv–Oberhumer (LZO) is a lossless data compression algorithm that is focused on decompression speed. LZO encoding provides a high compression ratio with good performance. LZO encoding is designed to work well with character data. It is especially good for CHAR and VARCHAR columns that store very long character strings, especially free form text such as product descriptions, user comments or JSON strings.

Delta Encoding

Encoding Type  Encoding Keyword  Data Types
Delta          DELTA             SMALLINT, INT, BIGINT, DATE, TIMESTAMP, DECIMAL
Delta          DELTA32K          INT, BIGINT, DATE, TIMESTAMP, DECIMAL

Uncompressed Data (4-byte integers): 1 2 3 4 5 6 7 8

Delta Encoding Compression: 0001 1 1 1 1 1 1 1

The first row is a 4-byte integer (plus one flag byte). Each following row is one byte with the number 1, because each value is 1 greater than the previous value.

The Delta encodings are very useful for date and time columns. Delta encoding compresses data by recording the difference between values that follow each other in the column. These differences are recorded in a separate dictionary for each block of column values on disk. If the column contains 10 integers in sequence from 1 to 10, the first will be stored as a 4-byte integer (plus a 1-byte flag), and the next 9 will each be stored as a byte with the value 1, indicating that it is one greater than the previous value. Delta encoding comes in two variations. DELTA records the differences as 1-byte values (8-bit integers), and DELTA32K records differences as 2-byte values (16-bit integers).
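The arithmetic can be sketched in Python. The example keeps the first value and then stores only differences, and it estimates the size using the byte counts quoted in the text (4 bytes plus a 1-byte flag for the first value, 1 byte per difference). The size function is a rough illustration that assumes every difference fits in one byte.

```python
from itertools import accumulate


def delta_encode(values):
    """Keep the first value, then store each value as the difference from its predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(encoded):
    """Rebuild the original values with a running sum of the differences."""
    return list(accumulate(encoded))


def estimated_bytes(encoded):
    """Rough size: 4-byte first value + 1 flag byte + 1 byte per difference (assumed)."""
    return 0 if not encoded else 4 + 1 + (len(encoded) - 1)


sequence = list(range(1, 11))           # the integers 1 through 10
encoded = delta_encode(sequence)
print(encoded)
print(estimated_bytes(encoded), "bytes versus", 4 * len(sequence), "bytes raw")
```

For the ten integers in the text the estimate is 14 bytes instead of 40.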

Block Based Dictionary Encoding for Character Data

Encoding Type: Block Based dictionary
Encoding Keyword: Block Based
Data Types: Character

Uncompressed Data  Compressed Data
Ohio               1
California         2
Minnesota          3
Alaska             4
Oregon             5
Ohio               1
California         2
Minnesota          3
Alaska             4

Dictionary

1 - Ohio
2 - California
3 - Minnesota
4 - Alaska
5 - Oregon

Block Based Dictionary Encoding utilizes a separate dictionary of unique values for each block of column values on disk. Remember, each Vertica disk block occupies 1 MB. The dictionary contains up to 256 one-byte values that are stored as indexes to the original data values. If more than 256 values are stored in a single block, the extra values are written into the block in raw, uncompressed form. The process repeats for each disk block. This encoding is very effective when a column contains a limited number of unique values, and it works best when there are fewer than 256 unique values.
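A short Python sketch of the per-block dictionary described above. It is illustrative only: indexes start at 0 here (the slide numbers them from 1), the ("idx", n) and ("raw", value) tuples are an invented representation, and the 256-entry limit is passed as a parameter so the overflow case is easy to try.

```python
def dict_encode_block(values, max_entries=256):
    """Encode one block: known values become dictionary indexes, overflow stays raw."""
    dictionary = []   # index -> original value
    index_of = {}     # original value -> index
    encoded = []
    for value in values:
        if value not in index_of:
            if len(dictionary) >= max_entries:
                # Dictionary is full: write the value uncompressed.
                encoded.append(("raw", value))
                continue
            index_of[value] = len(dictionary)
            dictionary.append(value)
        encoded.append(("idx", index_of[value]))
    return dictionary, encoded


def dict_decode_block(dictionary, encoded):
    """Reverse the encoding by looking indexes up in the block's dictionary."""
    return [dictionary[item] if kind == "idx" else item for kind, item in encoded]


states = ["Ohio", "California", "Minnesota", "Alaska", "Oregon",
          "Ohio", "California", "Minnesota", "Alaska"]
dictionary, encoded = dict_encode_block(states)
print(dictionary)
print([item for _, item in encoded])
```

Each repeated state name costs a single small index instead of the full string, which is where the savings come from when a column has few distinct values.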

Chapter 4 - Nexus

“We envisioned the cloud a decade before its arrival and that is why the Nexus is the most dominating tool in the industry today” - Tera-Tom Coffing

Nexus is Available on the Cloud

Why the Nexus Chameleon should be your query tool of choice:
1) Queries every major system
2) Provides visualization and automatically writes the SQL
3) Can perform cross-system joins with a few clicks of the mouse
4) Converts table structures and moves the table and data between systems
5) Compares and synchronizes databases
6) Can move an entire database of tables or views between systems
7) Has the "Garden of Analysis" to re-query answer sets inside your PC
8) Provides a dashboard of graphs and charts for answer sets

Download the Nexus for a free trial at www.CoffingDW.com and use Nexus in-house or on the cloud. Nexus is on the Amazon (AWS) cloud, the Microsoft Azure cloud and the Century Link cloud.

Nexus Queries Every Major System

Figure: Nexus Chameleon main window (System: Vertica, Database: SQL Class). The Systems tree lists Aster Data, Azure Cloud, DB2, Excel, Greenplum, Hadoop, Kognitio, Netezza, Oracle, Matrix, Redshift, SQL Server, Sybase, Teradata and Vertica. The query window runs:

SELECT * FROM Employee_table WHERE Dept_No = 400 ;

Result 1:

Employee_No  Dept_No  Last_Name   First_Name  Salary
1256349      400      Harrison    Herbert     54500.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00

Nexus is designed to work with every system in your enterprise, whether on-premises or in the cloud. And you can query them all simultaneously.

Nexus is designed to work with every system in your enterprise, on-premises systems and cloud systems. Nexus works with traditional systems, such as DB2, Oracle, SQL Server and Teradata. Nexus also works with newer systems, such as Netezza, Greenplum, Kognitio, Hana, Matrix, Aster Data and Vertica. Nexus also works with your top cloud systems, such as Amazon Redshift, Microsoft Azure SQL Data Warehouse and Hadoop.

How to Use Nexus

Figure: Nexus Chameleon main window (menus File, Edit, View, Query, Tools, Help, Web, Windows; System: Vertica; Database: SQL Class; buttons History, Sandbox, EXECUTE, ? and New Query). The callouts read:

• Your query history

• The Vertica system you're connected to. Click on the plus sign and see all the databases and objects for this particular system

• Your current database

• Runs a query, or press F5

• Clears the SQL from your screen

• Provides the EXPLAIN plan, or hit F6

• Hit F2 to see SQL syntax

• Opens up another query window for any system you have in your systems tree

The important buttons and function keys you need to know about are listed above.

Why is Nexus Special? Visualization and Automatic SQL

Figure: the Super Join Builder in Nexus Chameleon (tabs Columns, Sorting, Joins, WHERE, SQL, Metadata, Analytics; Join Hub System: Teradata). Two Vertica tables are shown, each with an "Add Join" menu and a "Select *" checkbox:

Customer_Table: Customer_Number Integer, Customer_Name Varchar(20), Phone_Number Char(8)

Order_Table: Order_Number Integer, Customer_Number Integer, Order_Date Date, Order_Total Decimal(10,2)

Right click on any table or view in your Nexus system tree and choose "Super Join Builder". Your table, or view, will be shown visually, along with its columns and their data types. Press the "Add Join" drop down menu and see what other tables or views can be joined. Click on the columns you want on your report and a checkmark appears. Nexus automatically writes the SQL for you. You now have the ability to develop at record speeds.

Why is Nexus Special? Cross-System Joins

Figure: the Super Join Builder with Join Hub System set to Teradata. Three tables from three different systems are joined:

ADDRESSES (Teradata): Street Varchar(30), City Varchar(20), State Char(2), Zip Integer, AreaCode Smallint, Phone Char(15), Subscriber_No Integer

SUBSCRIBERS (Oracle): Last_Name Varchar(20), First_Name Varchar(20), Gender Char(1), SSN Integer, Member_No Smallint, Subscriber_No Integer

CLAIMS (Vertica): Claim_Id Integer, Claim_Date Date, Subscriber_No Integer, Member_No Smallint, Claim_Amt Decimal(12,2), Provider_No Integer, Claim_Service Integer

Did you ever even imagine that you would be able to join tables from different systems? Above, we are joining a Teradata table to an Oracle table to a Vertica table. Just checkmark the columns you want on your report and Nexus handles everything behind the scenes. Nexus builds the SQL, converts the table structures and moves the tables to the Hub system. The report comes flying back with no intervention from the user.

Why is Nexus Special? The Amazing Hub System

Figure: the same Super Join Builder join of ADDRESSES (Teradata), SUBSCRIBERS (Oracle) and CLAIMS (Vertica), with the same columns as on the previous page, but with Join Hub System changed to Vertica.

Nexus allows the user to select which system they want to process the data. We just changed the Hub to Vertica (above). Now, the tables from the three systems will be converted, moved and joined on the Vertica system. Nexus allows you to join tables from any system in your enterprise (both on-premises and cloud), but then tops it off by allowing you to process the joins on any system in your enterprise. Need extra processing power? Spin up a server on the cloud and make it the hub. Simply amazing!

Why is Nexus Special? Save Answer Sets as Tables

Figure: the Super Join Builder joining ADDRESSES, SUBSCRIBERS and CLAIMS, with Join Hub System set to SQL Server and the "Create Table" button available above the Columns, Sorting, Joins, WHERE, SQL, Metadata and Analytics tabs.

Nexus allows you to run a join or any query in the Super Join Builder and save the answer set as a table on any system in your enterprise (on-premises or cloud). This is a fantastic way to create a data mart or to save data to your sandbox. It is also great for building a test system. And Nexus makes it so easy. Just click on the "Create Table" button and tell Nexus the system and the table name you desire, and that new table and its data will be there when the query is finished.

Why is Nexus Special? Automated Data Movement

[Screenshot: the Database Movement window with Table Movement, Options and Log tabs and an Execute button. Source Tables: System Teradata V15, Database SQL_Class. Target Tables: System Vertica, Database SQL_Sandbox. Both sides list Addresses, Claims, Course_Table, Customer_Table, Employee_Table, Order_Table, Providers and Sales_Table; on the target side Addresses, Claims and Course_Table are marked OK.]

Do you know how long it normally takes to convert the table structures and move data between systems? Sometimes this is a month-long project, but not with Nexus. Just right-click on any database in your systems tree and choose "Move Data". Pick the target system to which you want the tables and data to move, and then checkmark the tables you want to move. Press the blue arrow and the tables with a checkmark move to the Target side. Press Execute and watch the tables light up in green with every successful move. If there is a problem, the table(s) will light up in red.
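The move-and-report loop described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Nexus implementation: both "systems" are in-memory SQLite databases, so no cross-vendor data type conversion is shown, and the function name move_tables and the sample rows are hypothetical. Each table ends up with a status, mirroring the green (OK) and red (failed) indicators.

```python
import sqlite3


def move_tables(source, target, tables):
    """Copy each table's structure and rows from source to target.

    Returns a dict mapping table name to "OK" or "FAILED: <reason>".
    """
    status = {}
    for name in tables:
        try:
            # SQLite keeps the original CREATE TABLE text in sqlite_master.
            row = source.execute(
                "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
                (name,),
            ).fetchone()
            if row is None:
                raise LookupError(f"no such table: {name}")
            target.execute(row[0])  # recreate the structure
            rows = source.execute(f'SELECT * FROM "{name}"').fetchall()
            if rows:
                marks = ",".join("?" * len(rows[0]))
                target.executemany(f'INSERT INTO "{name}" VALUES ({marks})', rows)
            target.commit()
            status[name] = "OK"
        except Exception as exc:  # report the failure and keep going
            target.rollback()
            status[name] = f"FAILED: {exc}"
    return status


source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE Addresses (Street TEXT, City TEXT);
    CREATE TABLE Claims (Claim_Id INTEGER, Claim_Amt REAL);
    -- Hypothetical sample rows, made up for this sketch.
    INSERT INTO Addresses VALUES ('1 Main St', 'Dayton');
    INSERT INTO Claims VALUES (1, 10.5), (2, 20.0);
""")
result = move_tables(source, target, ["Addresses", "Claims", "Missing_Table"])
for table, outcome in result.items():
    print(table, outcome)
```

A real move between different vendors would also need a mapping between their data types; that step is deliberately left out here.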

Why is Nexus Special? Nexus makes the Servers Talk Directly

[Diagram: Vertica, Teradata, Hadoop, Oracle and Redshift connected directly to one another.]

Whenever Nexus converts and moves data from one system to another, the data never lands in a landing zone. When Nexus performs cross-system joins, the data moves directly from server to server. Whether you install Nexus on your desktop, laptop or Citrix server, or access Nexus via a remote desktop on the cloud, no data passes through Nexus itself. All data movement, conversions and cross-system joins are done directly between the big systems. Nexus uses Progress DataDirect drivers to establish a connection between systems, and this allows the big systems to talk directly. This is the secret sauce that allows Nexus to move billions of rows from one system to another!

What Makes Nexus Special? The Garden of Analysis

[Screenshot: the Garden of Analysis with the tabs Welcome, Aggregate, OLAP, Rank, Grouping Sets, Quantiles, Top, Sort, Join, Charts/Graphs, Dynamic Charts and Dashboard, and four answer sets (Result 1 to Result 4). Result 4 is shown:]

Product_ID | Sale_Date  | Daily_Sales
3000       | 09/28/2000 | 61301.77
1000       | 09/28/2000 | 48850.40
3000       | 09/29/2000 | 34509.13
1000       | 09/29/2000 | 54500.22
3000       | 09/30/2000 | 43868.86
1000       | 09/30/2000 | 36000.07

Nexus can query all of the systems in your enterprise, and each answer set is automatically placed in the Garden of Analysis. This is a unique concept because any and all answer sets can be re-queried by Nexus inside your desktop. Above, we have four answer sets. We right-clicked on answer set 4 (Result 4 – in red). Now, we can click on any tab above and get Aggregates, OLAP, Rank, Grouping Sets, Quantiles, Top, Sort, Join, Charts/Graphs, Dynamic Charts or a Dashboard. All you have to do is press on a table and drag columns in the answer set to the templates and you immediately see the new data.

The Garden of Analysis – Grouping Sets Tab

[Screenshot: the Grouping Sets tab with numbered callouts 1 to 5. Drop downs: Products (choose the Product_ID from the drop down), Date Column (Optional) (choose the date column from the drop down) and Sum (choose the column you want to sum from the drop down); each lists Product_Id, Sale_Date and Daily_Sales. Options: Grouping Sets, Rollup, Cube. Date Extraction: Year, Month. Button: Create. Below, the Result 4 tab (right click on it and choose Set as active Result Set) shows the columns Product_ID, Sale_Date and Daily_Sales.]

In five easy steps we got three new reports. 1) We made Result 4 the active result set by right-clicking on the Result 4 tab and choosing "Set as active Result Set". We then chose the Grouping Sets tab at the top (red circle). 2) We then clicked on the Products drop down menu and chose the column Product_Id. 3) We then clicked on the Date Column drop down and chose Sale_Date. 4) We clicked on the Sum drop down menu and chose Daily_Sales. Our options already had Grouping Sets, Rollup and Cube pre-selected. 5) We hit the CREATE button. Watch what happens on the next page!

The Garden of Analysis - Grouping Sets Answer Sets

[Screenshot: three new result tabs (Grouping sets, Cube, Rollup) next to Result 1 to Result 4. This report depicts the actual results of the Grouping Sets:]

Product_ID | MTH | YR   | Sum_Daily_Sales
3000       | ?   | ?    | 224587.82
2000       | ?   | ?    | 306611.81
1000       | ?   | ?    | 331204.72
?          | 10  | ?    | 443634.99
?          | 9   | ?    | 418769.36
?          | ?   | 2000 | 862404.35

3 new reports were created with the PC doing all the work

We instantly received three more answer sets (in yellow and pink) for Grouping Sets, Group by Cube and Group by Rollup. What is truly intelligent is that we re-queried Result 4 to get the three new reports and did so inside Nexus. The data warehouse was not re-queried, but instead the Nexus used the processor and memory inside the PC to calculate the analytics. All answer sets are saved to the Garden of Analysis so users can get additional reports by merely clicking on the appropriate tab and selecting the columns they want in the varying templates. Why not have your own data warehouse inside Nexus?
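To make the three reports concrete, here is a minimal sketch of how Grouping Sets, Rollup and Cube can be computed locally from an answer set that is already on the PC, without re-querying the warehouse. It is an illustration only and not Nexus code: the function and variable names are assumptions, and it uses just the six Result 4 rows shown earlier, so its totals are smaller than the sums in the report above, which come from the full answer set.

```python
from collections import defaultdict
from datetime import datetime
from decimal import Decimal
from itertools import combinations

# The six Result 4 rows shown earlier: Product_ID, Sale_Date, Daily_Sales.
rows = [
    (3000, "09/28/2000", "61301.77"),
    (1000, "09/28/2000", "48850.40"),
    (3000, "09/29/2000", "34509.13"),
    (1000, "09/29/2000", "54500.22"),
    (3000, "09/30/2000", "43868.86"),
    (1000, "09/30/2000", "36000.07"),
]

COLUMNS = ("Product_ID", "MTH", "YR")


def to_record(product_id, sale_date, daily_sales):
    """Extract month and year from the date, as the Date Extraction option does."""
    d = datetime.strptime(sale_date, "%m/%d/%Y")
    return {"Product_ID": product_id, "MTH": d.month, "YR": d.year,
            "Daily_Sales": Decimal(daily_sales)}


records = [to_record(*r) for r in rows]


def aggregate(records, grouping_sets):
    """Sum Daily_Sales once per grouping set.

    Columns that are not part of the current grouping set are reported as
    None (the report above shows them as ?).
    """
    out = []
    for gset in grouping_sets:
        totals = defaultdict(Decimal)
        for rec in records:
            key = tuple(rec[c] if c in gset else None for c in COLUMNS)
            totals[key] += rec["Daily_Sales"]
        for key, total in sorted(totals.items(), key=lambda kv: str(kv[0])):
            out.append((*key, total))
    return out


# GROUPING SETS: one group per single column.
grouping_sets = [(c,) for c in COLUMNS]
# ROLLUP: (Product_ID, MTH, YR), (Product_ID, MTH), (Product_ID), ().
rollup = [COLUMNS[:n] for n in range(len(COLUMNS), -1, -1)]
# CUBE: every subset of the three columns.
cube = [s for n in range(len(COLUMNS), -1, -1) for s in combinations(COLUMNS, n)]

gs_report = aggregate(records, grouping_sets)
rollup_report = aggregate(records, rollup)
cube_report = aggregate(records, cube)

for line in gs_report:
    print(line)
```

Decimal is used instead of float so that the currency sums stay exact.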

The Garden of Analysis – Join Tab (1 of 4)

[Screenshot: the Garden of Analysis with the Toggle Garden Docking button and two answer sets. Left click on the Join tab. If you hit the blue button you will leave the Garden and be back to your main Nexus screen.]

Result 1:

Order_Number | Customer_Number | Order_Date | Order_Total
123512       | 11111111        | 01/01/1999 | 8005.91
123456       | 11111111        | 05/04/1998 | 12347.53
123552       | 31323134        | 10/01/1999 | 5111.47
123777       | 57896883        | 09/09/1999 | 23454.84
123585       | 87323456        | 10/10/1999 | 15231.62

Result 2:

Customer_Number | Customer_Name       | Phone_Number
11111111        | Billy's Best Choice | 555-1234
31313131        | Acme Products       | 555-1111
31323134        | ACE Consulting      | 555-1212
57896883        | XYZ Plumbing        | 347-8954
87323456        | Databases N-U       | 322-1012

Joining answer sets together starts with a single step. 1) Click on the Join tab (at the top). Notice that we have two answer sets that are named Result 1 and Result 2. Turn the page and see what happens next.

The Garden of Analysis – Join Tab (2 of 4)

[Screenshot: the Join tab with numbered callouts 1 to 7: the Join tab itself, the Left Table and Right Table drop downs, the Left Table Join Column(s) and Right Table Join Column(s) drop downs, the Join Type options (Inner Join, Left (Outer) Join, Right (Outer) Join, Full Join), and the Clear and Create buttons.]

In seven easy steps you can join two answer sets. 1) Choose the Join tab at the top (red circle). 2) Choose the Left Table from the Left Table drop down (one of the answer sets). 3) Choose the Right Table from the Right Table drop down (one of the answer sets). 4) Choose the Left Table Join Column from the menu. 5) Choose the Right Table Join Column from the menu. 6) Pick the join type you want from the Options menu. 7) Hit the Create button. Turn the page to see the actual choices.

The Garden of Analysis – Join Tab (3 of 4)

[Screenshot: the Join tab filled in. Left Table: Result 1. Right Table: Result 2. Left Table Join Column(s): Customer_Number. Right Table Join Column(s): Customer_Number. Join Type options: Inner Join, Left (Outer) Join, Right (Outer) Join, Full Join. Buttons: Clear, Create.]

We chose the Result 1 answer set for the Left Table from the Left Table drop down menu. We chose the Result 2 answer set for the Right Table from the Right Table drop down menu. We chose the Customer_Number column for the Left Table Join Column from the Left Table Join Columns drop down menu. We chose the Customer_Number column for the Right Table Join Column from the Right Table Join Columns drop down menu. We kept the Join Type of Inner Join. We hit the Create button. Turn the page to see the results.

The Garden of Analysis – Join Tab (4 of 4)

[Screenshot: a new Join tab next to Result 1 to Result 4, showing the joined answer set:]

Order_Number | Customer_Number | Order_Date | Order_Total | Customer_Number2 | Customer_Name       | Phone_Number
123552       | 31323134        | 10/01/1999 | 5111.47     | 31323134         | ACE Consulting      | 555-1212
123777       | 57896883        | 09/09/1999 | 23454.84    | 57896883         | XYZ Plumbing        | 347-8954
123512       | 11111111        | 01/01/1999 | 8005.91     | 11111111         | Billy's Best Choice | 555-1234
123456       | 11111111        | 05/04/1998 | 12347.53    | 11111111         | Billy's Best Choice | 555-1234
123585       | 87323456        | 10/10/1999 | 15231.62    | 87323456         | Databases N-U       | 322-1012

The two answer sets are now joined together. You can join answer sets from different systems just as easily.
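Joining two answer sets that already sit on the PC is a small in-memory operation. The sketch below shows one common way to do it (a hash join) using the Result 1 and Result 2 rows from the preceding pages; it is an assumption-laden illustration, not the method Nexus uses, and the function name join_answer_sets is hypothetical.

```python
# Result 1: Order_Number, Customer_Number, Order_Date, Order_Total
orders = [
    (123512, 11111111, "01/01/1999", 8005.91),
    (123456, 11111111, "05/04/1998", 12347.53),
    (123552, 31323134, "10/01/1999", 5111.47),
    (123777, 57896883, "09/09/1999", 23454.84),
    (123585, 87323456, "10/10/1999", 15231.62),
]

# Result 2: Customer_Number, Customer_Name, Phone_Number
customers = [
    (11111111, "Billy's Best Choice", "555-1234"),
    (31313131, "Acme Products", "555-1111"),
    (31323134, "ACE Consulting", "555-1212"),
    (57896883, "XYZ Plumbing", "347-8954"),
    (87323456, "Databases N-U", "322-1012"),
]


def join_answer_sets(left, right, left_col, right_col, how="inner"):
    """Join two lists of row tuples on the given column positions.

    how="inner" keeps only matching rows; how="left" also keeps unmatched
    left rows, padded with None for the right-hand columns.
    """
    # Build a lookup from join key to the right-hand rows that carry it.
    index = {}
    for row in right:
        index.setdefault(row[right_col], []).append(row)

    right_width = len(right[0]) if right else 0
    joined = []
    for row in left:
        matches = index.get(row[left_col], [])
        if matches:
            joined.extend(row + match for match in matches)
        elif how == "left":
            joined.append(row + (None,) * right_width)
    return joined


inner = join_answer_sets(orders, customers, left_col=1, right_col=0)
left_outer = join_answer_sets(customers, orders, left_col=0, right_col=1, how="left")

for row in inner:
    print(row)
```

With an inner join, Acme Products drops out because it has no order, which matches the five-row result shown above. The left outer variant keeps it, with None in the order columns.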

The Garden of Analysis – Charts/Graphs Tab (1 of 4)

[Screenshot: the Garden of Analysis with Result 1 to Result 4; the active result set has the columns Product_ID, Sale_Date and Daily_Sales. Callout 1: right click on any Result tab you want to work with and choose "Set as Active Result Set". Callout 2: left click on the Charts/Graphs tab. The right-click menu offers Set as Active Result Set, Rename Tab, Print Result Set, Save Result Set to Garden, Remove Result Set and Export Result Set. If you hit the blue button you will leave the Garden and be back to your main Nexus screen.]

To get charts and graphs for an answer set do the following two steps. 1) Right click on the answer set you want to work with and choose "Set as Active Result Set". That answer set is now ready to be placed into graphs and charts. 2) Click on the Charts/Graphs Tab (at the top). Turn the page and see what happens next.

The Garden of Analysis – Charts/Graphs Tab (2 of 4)

[Screenshot: the Charts/Graphs tab with numbered callouts 1 to 8. Drop downs: Values (Y Axis), Labels (X axis), Sort By (Optional), Partition By (Optional). Options as shown: Basic Chart Types, Advanced, Pie Chart, Line Chart, Column Chart, All Types; Visual Type: Flat (2D), 3D; Basic Chart Values: Sum. Buttons: Clear, Create.]

In eight easy steps you can graph and chart an answer set with 35 different graphs and charts. 1) Choose the Charts/Graphs tab at the top (red circle). 2) Choose the column you want as the Y axis from the Values (Y Axis) drop down menu. 3) Choose the column you want as the X axis from the Labels (X axis) drop down menu. 4) Choose the column you want to sort by from the Sort By (Optional) drop down menu. 5) Choose the optional Partition By column from the drop down menu. 6) Pick the type of chart from the Basic Chart Types options. 7) Pick the Visual Type of chart you want. 8) Hit the Create button. Turn the page to see the actual choices.

The Garden of Analysis – Charts/Graphs Tab (3 of 4)

[Screenshot: the Charts/Graphs tab filled in. Selections shown across the Values (Y Axis), Labels (X axis), Sort By (Optional) and Partition By (Optional) drop downs: Daily_Sales, Sale_Date, Sale_Date (ASC), Product_ID (ASC). All Types and All Charts are selected; Visual Type: Flat (2D), 3D; Basic Chart Values: Sum. Buttons: Clear, Create.]

You can see the columns above that we chose. Because we chose All Types and All Charts in the options tab we got 35 charts and each was placed in the Dashboard. Turn the page to see one of the many charts.

The Garden of Analysis – Charts/Graphs Tab (4 of 4)

[Chart: values from 15675.3 to 64300.0 on the Y axis, dates from 9/28/2000 to 10/4/2000 on the X axis, with legend entries 1000 and 2000.]

The chart above represents the answer set. This chart has been placed in the dashboard along with the other charts depicting the same answer set and parameters.

The Garden of Analysis – Dynamic Charts Tab (1 of 4)

[Screenshot: the Garden of Analysis with Result 1 to Result 4; the active result set has the columns Product_ID, Sale_Date and Daily_Sales. Callout 1: right click on any Result tab you want to work with and choose "Set as Active Result Set". Callout 2: left click on the Dynamic Charts tab. The right-click menu offers Set as Active Result Set, Rename Tab, Print Result Set, Save Result Set to Garden, Remove Result Set and Export Result Set.]

To get Dynamic Charts for an answer set do the following two steps. 1) Right click on the answer set you want to work with and choose "Set as Active Result Set". That answer set is now ready to be placed in a Dynamic Chart. 2) Click on the Dynamic Charts Tab (at the top). Turn the page and see what happens next.

The Garden of Analysis – Dynamic Charts Tab (2 of 4)

[Screenshot: the Dynamic Charts tab with the Load Visualization button.]

You need to take two steps. 1) Make sure you have selected the Dynamic Charts tab at the top (red circle). 2) Hit the Load Visualization button. Turn the page to see what happens next.

The Garden of Analysis – Dynamic Charts Tab (3 of 4)

[Screenshot: the Dynamic Charts workspace with the menus File, Edit, View, Data, Layout and Help. Attributes: Sale_Date. Measures: Product_ID, Daily_Sales, Values. Drop areas: Pages, Rows, Columns, Filters, Level of Detail. Encodings include Label, Color and Size.]

Dynamic Charts allow you to drag and drop columns from the Attributes and Measures areas. As you drag and drop, the charts change dynamically. Above, you can see the Sale_Date (in red), the Product_ID (in blue) and the Daily_Sales column (in pink). The next slide will show how we drag and drop these Attributes and Measures to get our chart.

The Garden of Analysis – Dynamic Charts Tab (4 of 4)

[Screenshot: the Dynamic Charts workspace after Sale_Date, Product_ID and Daily_Sales have been dragged onto the Rows, Columns, Filters and Level of Detail areas. The result is a bar chart of Daily_Sales (Y axis, 10K to 160K) by Sale_Date (X axis, 9/28/2000 to 10/4/2000).]

Notice how we dragged the attributes and measures to the varying parts of the application. Because we put the Daily_Sales column in the level of detail, the actual Daily_Sales will appear if you hover your mouse over any of the bars.

The Garden of Analysis – Dashboard Tab (1 of 5)

[Screenshot: the Garden of Analysis tab bar. Left click on the Dashboard tab.]

Once you create and save graphs and charts to the dashboard you can view your graphs and charts in varying ways. Just left click on the Dashboard Tab at the top. Turn the page and see what happens next.

The Garden of Analysis – Dashboard Tab (2 of 5)

[Screenshot: the Dashboard tab with the views Slideshow, Thumbnails, Scroll and Compare. The Slideshow view shows one chart at a time, with Seconds to Display set to 1 and a Pause button. Every 1 second another slide displays, until you hit Pause.]

The slideshow will display the many graphs one at a time in intervals of seconds. Hit the pause button to stop and examine any particular slide.

The Garden of Analysis – Dashboard Tab (3 of 5)

[Screenshot: the Dashboard tab in the Scroll view, with a Pause button. The graphs and charts will scroll across the screen.]

The scroll will scroll the graphs and charts across the screen from right to left. Hit the pause button to stop and examine any particular graph or chart. Hit the speed bar to speed up or slow down the scrolling.

The Garden of Analysis – Dashboard Tab (4 of 5)

[Screenshot: the Dashboard tab in the Compare view. Two drop downs select the charts to compare: Result 6 Column Chart with Result 6 Pie Chart (3D). The two charts are shown side by side.]

The Compare allows you to compare two different charts. The drop down menus are there so you can pick chart 1 vs chart 2.

The Garden of Analysis – Dashboard Tab (5 of 5)

[Screenshot: the Dashboard tab in the Thumbnails view, showing small versions of the charts. Menu at the top: Send Selected Graphs to Garden Tabs, Compare Selected Graphs, Delete Selected Graphs.]

The thumbnails show all of the graphs and charts in your dashboard. This gives you a broad view that allows you to double click on any thumbnail and see it in actual size. You can also use the menu (at the top) to send selected graphs to Garden Tabs, Compare Selected Graphs, or Delete Selected Graphs.

Getting to the Super Join Builder

[Screenshot: Nexus Chameleon (System: Vertica) with the systems tree open at Vertica, SQL_Class, Tables: Addresses, Claims, Customer_Table, Employee_Table, Department_Table, Order_Table, Providers, Sales_Table, Services. Right click on any table and choose Super Join Builder.]

Right click on any table in your systems tree and choose Super Join Builder. You will be placed inside the Super Join Builder.

The Super Join Builder is the First Entry in the Menu

[Screenshot: the right-click menu on a table in the systems tree. Menu items: Super Join Builder, Quick Select, View DDL, Move data to Oracle, Move data to SQL Server, Move data to Teradata, Move data to Azure SQL Data Warehouse, SmartScript, Hound Dog, Compression, Compare/Sync Data. Choose Super Join Builder from the menu.]

Right click on any table in your systems tree and a menu will appear. Choose Super Join Builder (top menu item) and the table you selected will be placed inside the Super Join Builder.

The Super Join Builder Shows Tables Visually

[Screenshot: the Super Join Builder with the Execute, Create Table and Preview SQL in Nexus buttons and the tabs Objects, Columns, Sorting, Joins, WHERE, SQL, Metadata and Analytics. The Objects tab shows Customer_Table with a Vertica (V) icon, an Add Join drop down and its columns: Customer_Number Integer, Customer_Name Varchar(20), Phone_Number Char(8).]

This is exactly what you will see when you first enter the Super Join Builder. You will see your table, its columns and the data types of each column. Notice the table name (Customer_Table) and the Vertica icon of V. Turn the page for more.

Using the Add Join Button

[Screenshot: the Super Join Builder showing Customer_Table (Customer_Number Integer, Customer_Name Varchar(20), Phone_Number Char(8)). Press the Add Join button to see what this table can join to.]

One of the greatest features of the Super Join Builder is the Add Join drop down menu. Press the drop down and the menu will show you what other tables or views can be joined to this table.

What to Do When No Tables are Joinable?

[Screenshot: the Super Join Builder showing Customer_Table with the Add Join drop down open. The menu reads "Could not identify any joins to this object." When no joins have been defined yet, you will get the above message in the menu.]

You might find that when you click on the Add Join drop down menu, you receive a message that says, "Could not identify any joins to this object". This means that you haven't yet told Nexus what joins to this table. We are about to fix that. The next page shows you how to define joins so that the menu will recognize them.

Drag a Joinable Object into the Super Join Builder

[Screenshot: the Super Join Builder showing Customer_Table, next to the systems tree. Left click on the table you want to join and drag it into the Super Join Builder.]

If you want to define how a table joins to another table in the Super Join Builder, you merely left click on the table in the tree and drag it into the Super Join Builder area.

You will see the Add Custom Join Window

[Screenshot: the Add Custom Join window (Join Type: Inner) opened over the Super Join Builder. Under Existing Tables it shows SQL_Class.Customer_Table (Customer_Number Integer, Customer_Name Varchar(20), Phone_Number Char(8)) next to SQL_Class.Order_Table (Order_Number Integer, Customer_Number Integer, Order_Date date, Order_Total Decimal(10,2)). The join text so far reads "SQL_Class.Customer_Table cus INNER JOIN SQL_Class.Order_Table ord". Buttons: Reset, Add Join.]

When you drag a second table into the Super Join Builder the Add Custom Join window appears. This is the window where you will define the join conditions. Turn the page and watch how we define the columns that join the tables together.

Defining the Join Columns

1) Left click the join column on the first table
2) Left click the join column on the second table
3) Left click the blue arrow to establish the join condition
4) Hit the Add Join button

[Screenshot: the Add Custom Join window (Join Type: Inner) with callouts 1 to 4. Customer_Number is selected in SQL_Class.Customer_Table and in SQL_Class.Order_Table. The join text now reads "SQL_Class.Customer_Table cus INNER JOIN SQL_Class.Order_Table ord ON cus.Customer_Number = Ord.Customer_Number". Buttons: Reset, Add Join.]

In four easy steps you can define the join conditions. Nexus will remember this table next time.
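One way to picture how a tool can "remember" a join is a small registry keyed by table name, which the Add Join menu then consults. The sketch below is purely illustrative: how Nexus actually stores join definitions is not described here, and the names join_registry, define_join and add_join_menu are hypothetical.

```python
# Hypothetical in-memory registry: table name -> list of known joins.
join_registry = {}

NO_JOINS_MESSAGE = "Could not identify any joins to this object."


def define_join(left_table, left_col, right_table, right_col):
    """Remember an inner join in both directions so either table can offer it."""
    join_registry.setdefault(left_table, []).append((right_table, left_col, right_col))
    join_registry.setdefault(right_table, []).append((left_table, right_col, left_col))


def add_join_menu(table):
    """Return the entries an Add Join drop down could show for a table."""
    joins = join_registry.get(table)
    if not joins:
        return [NO_JOINS_MESSAGE]
    return [
        f"INNER JOIN {other} ON {table}.{own_col} = {other}.{other_col}"
        for other, own_col, other_col in joins
    ]


# Before any join is defined, the menu has nothing to offer.
before = add_join_menu("SQL_Class.Customer_Table")

# The four steps above boil down to recording this one definition.
define_join("SQL_Class.Customer_Table", "Customer_Number",
            "SQL_Class.Order_Table", "Customer_Number")

after = add_join_menu("SQL_Class.Customer_Table")
from_orders = add_join_menu("SQL_Class.Order_Table")
print(before)
print(after)
```

Storing the definition under both table names is a design choice made for this sketch, so that starting from either table offers the same join.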

Your Tables Will Appear Together

[Screenshot: the Super Join Builder Objects tab showing two tables connected by a line between their Customer_Number columns. Customer_Table: Customer_Number Integer, Customer_Name Varchar(20), Phone_Number Char(8). Order_Table: Order_Number Integer, Customer_Number Integer, Order_Date Date, Order_Total Decimal(10,2).]

Now that you have defined your tables and the join conditions the tables will appear together in the Super Join Builder. Notice the line connecting the tables points to the columns from both tables that are their respective join conditions.

Select the Columns You Want on the Report

[Screenshot: the Super Join Builder Objects tab showing Customer_Table (Customer_Number Integer, Customer_Name Varchar(20), Phone_Number Char(8)) and Order_Table (Order_Number Integer, Customer_Number Integer, Order_Date Date, Order_Total Decimal(10,2)), each with a checkbox per column. Click on the columns you want on the report and Nexus will build the SQL automatically.]

Put a checkmark in the column boxes for all columns you want on the report and Nexus will build the SQL automatically. We have checked the Customer_Number, Customer_Name, Order_Date and Order_Total columns.

Check out the SQL Tab to See the SQL that has been built

[Screenshot: the Super Join Builder showing Customer_Table and Order_Table with their columns. You are currently in the Objects tab. Click on the SQL tab to see the SQL.]

When you see your tables in the Super Join Builder you are in the Objects Tab. Now that you have put a checkmark on the columns you want on your report, click on the SQL tab to see the SQL that Nexus has automatically built for you.

SQL Tab

[Screenshot: the Super Join Builder with the SQL tab selected, showing the generated SQL:]

SELECT cus.Customer_Number, cus.Customer_Name, ord.Order_Date, ord.Order_Total
FROM SQL_Class.customer_table cus
INNER JOIN SQL_Class.order_table ord
ON cus.Customer_Number = Ord.Customer_Number ;

The SQL has automatically been built for you because you put a checkmark on the columns you wanted to see on the report. You can hit the Execute button (above the Objects tab) or you can Preview SQL in Nexus. We will show you both options next.
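The step from checked columns to finished SQL can be sketched as simple string assembly. This is a hedged illustration of the idea rather than the Nexus generator: the function name build_select and its parameters are assumptions, and the inputs are the columns and join definition from the preceding pages.

```python
def build_select(checked_columns, left, right, join_on, join_type="INNER"):
    """Assemble a two-table join query.

    checked_columns: list of (alias, column) pairs in report order
    left, right:     (table_name, alias) pairs
    join_on:         (left_column, right_column) pair
    """
    # With nothing checked, fall back to SELECT *.
    select_list = ", ".join(f"{alias}.{col}" for alias, col in checked_columns) or "*"
    (left_table, left_alias), (right_table, right_alias) = left, right
    left_col, right_col = join_on
    return (
        f"SELECT {select_list} "
        f"FROM {left_table} {left_alias} "
        f"{join_type} JOIN {right_table} {right_alias} "
        f"ON {left_alias}.{left_col} = {right_alias}.{right_col};"
    )


sql = build_select(
    checked_columns=[
        ("cus", "Customer_Number"),
        ("cus", "Customer_Name"),
        ("ord", "Order_Date"),
        ("ord", "Order_Total"),
    ],
    left=("SQL_Class.customer_table", "cus"),
    right=("SQL_Class.order_table", "ord"),
    join_on=("Customer_Number", "Customer_Number"),
)
print(sql)
```

Each extra checkmark only adds one entry to checked_columns, which is why the SQL can be rebuilt after every click.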

Hit Execute to get the Report inside the Super Join Builder

[Screenshot: the Super Join Builder with the SQL tab selected and the Execute button above the tabs. The SQL shown:]

SELECT cus.Customer_Number, cus.Customer_Name, ord.Order_Date, ord.Order_Total
FROM SQL_Class.customer_table cus
INNER JOIN SQL_Class.order_table ord
ON cus.Customer_Number = Ord.Customer_Number ;

If you click the Execute button (above the Objects tab) the report will come back inside the Super Join Builder. Turn to the next page and see the report.

The Report is delivered inside the Super Join Builder

[Screenshot: the Super Join Builder with the generated SQL at the top and the answer set in the Result 1 tab below it:]

Customer_Number | Customer_Name       | Order_Date | Order_Total
31323134        | ACE Consulting      | 10/01/1999 | 5111.47
57896883        | XYZ Plumbing        | 09/09/1999 | 23454.84
11111111        | Billy's Best Choice | 05/04/1998 | 12347.53
11111111        | Billy's Best Choice | 01/01/1999 | 8005.91
87323456        | Databases N-U       | 10/10/1999 | 15231.62

The report is delivered.

Let's Join Two Tables Again (1 of 6)

[Screenshot: Nexus Chameleon (System: Vertica) with the systems tree open at Vertica, SQL_Class, Tables: Addresses, Claims, Customer_Table, Employee_Table, Department_Table, Order_Table, Providers, Sales_Table, Services. Right click on any table and choose Super Join Builder.]

Right click on the table you previously defined in your systems tree and choose Super Join Builder. You will be placed inside the Super Join Builder. The next page will show the Right Click menu. Do you remember where the Super Join Builder is?

Let's Join Two Tables Again (2 of 6)

[Screenshot: the right-click menu on a table in the systems tree. Menu items: Super Join Builder, Quick Select, View DDL, Move data to Oracle, Move data to SQL Server, Move data to Teradata, Move data to Azure SQL Data Warehouse, SmartScript, Hound Dog, Compression, Compare/Sync Data. Choose Super Join Builder from the menu.]

The Super Join Builder is always the top menu item. Select that and you will be placed inside the Super Join Builder.

Let's Join Two Tables Again (3 of 6)

[Screenshot: the Objects tab of the Super Join Builder showing Customer_Table with its Add Join menu, a Select * check box and the columns Customer_Number Integer, Customer_Name Varchar(20) and Phone_Number Char(8).]

You once again see your table, its columns and the data types of each column. Watch what happens when we hit the Add Join drop down menu now.

Let's Join Two Tables Again (4 of 6)

[Screenshot: the Objects tab showing Customer_Table with its columns Customer_Number Integer, Customer_Name Varchar(20) and Phone_Number Char(8), with the Add Join button highlighted.]

Press the Add Join button to see what this table can join to.

What do you think will be in the Add Join drop down menu this time? I bet you already guessed it. Turn the page.

Let's Join Two Tables Again (5 of 6)

[Screenshot: the Add Join drop down menu on Customer_Table, now listing SQL_Class.Order_Table.]

You now have the Order_Table in the menu. Choose it.

Since you previously defined the relationship between the Customer_Table and the Order_Table, the Add Join drop down menu will list these tables as joinable. Just use the menu to select the Order_Table and both tables will appear side by side in the Super Join Builder. Turn the page and see for yourself. You can then look at the dashboard tab (top right) and see the compression savings for every table.

Let's Join Two Tables Again (6 of 6)

[Screenshot: the Objects tab showing both tables side by side, connected by a line between their Customer_Number columns.]

Customer_Table: Customer_Number Integer, Customer_Name Varchar(20), Phone_Number Char(8)
Order_Table: Order_Number Integer, Customer_Number Integer, Order_Date Date, Order_Total Decimal(10,2)

Now that you have defined your tables and the join conditions, the tables will appear together in the Super Join Builder. Notice that the line connecting the tables points to the columns used in their join condition.

The Tabs of the Super Join Builder Philosophy – One Query

[Screenshot: the Super Join Builder toolbar (Execute, Create Table, Preview SQL in Nexus, Join Hub System: Vertica) and its tabs: Objects, Columns, Sorting, Joins, WHERE, SQL, Metadata and Analytics.]

The tabs above work as a team on a single query. Each tab is designed for a different purpose. The next series of slides will explain how to use each tab effectively, in order to build a single query quickly and efficiently. Each time you change something in a tab, the SQL being built is changed to build the query as you desire.

The Super Join Builder is one of the most intricate pieces of commercial software ever built. Each tab above performs a different function so that you can quickly build the most efficient query possible.

The Tabs of the Super Join Builder – Objects Tab

[Screenshot: the Objects tab showing Customer_Table and Order_Table, each with an Add Join menu and a Select * check box. The Cube drop down menu on Customer_Table offers Create cube and Create cube w/ columns.]

Customer_Table: Customer_Number Integer, Customer_Name Varchar(20), Phone_Number Char(8)
Order_Table: Order_Number Integer, Customer_Number Integer, Order_Date Date, Order_Total Decimal(10,2)

The Objects tab shows your objects (tables and views) and provides a menu of what other objects are joinable.

The Objects tab is the first screen you will always see when you right click on a table and choose "Super Join Builder". It shows the table you right clicked on in your systems tree (visually). You can then select from the Add Join drop down to see a menu of what other objects can be joined. If you click on an object in the Add Join drop down it will be joined. You can also click the Cube drop down menu to automatically select all objects that are joinable (instantly). Each time you checkmark a column in an object the SQL is automatically built and that column will be on your report. Of course you can always check all columns in an object by putting a checkmark in the Select * check box of an object. Above, we are joining the Customer_Table to the Order_Table and have put a checkmark on the Customer_Name, Order_Date and Order_Total.

The Tabs of the Super Join Builder – Columns Tab

[Screenshot: the Columns tab. A Trashcan area at the top reads "Drag and drop objects here to remove them from the report."]

Report Columns: Customer_Name, Order_Date, Order_Total
Additional Columns: Customer_Number, Phone_Number, Order_Number, Customer_Number

The Columns tab allows you to see the columns on your report, and to change their order. It shows the columns you have selected on the report (at the top), because you placed a checkmark in them from the Objects tab. It also shows you the columns you did not checkmark (at the bottom). You can move the columns around, throw them in the trash, bring up columns that were not selected, and the SQL will change to reflect exactly what you want your report columns to look like.

The Columns tab allows you to rearrange columns on your report. You can drag and drop columns to change their order. Notice that the columns are color coded to match the color of the table (in the Objects tab) that they came from. Columns at the top are the ones you selected. The columns at the bottom are the ones you did not select. You can throw columns from the top into the trash can and they will no longer be on your report; they will reappear at the bottom. You can even move columns from the bottom up to the top and they will then be on your report.
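To make the idea concrete, here is a minimal sketch of how a column list could drive the SELECT list. It is not how Nexus is implemented; the function name build_select and the way the FROM clause is passed in are assumptions made purely for illustration.

```python
def build_select(report_columns, from_clause):
    """Return a SELECT statement listing report_columns in the given order."""
    if not report_columns:
        raise ValueError("at least one report column is required")
    select_list = ", ".join(report_columns)
    return f"SELECT {select_list} FROM {from_clause};"


# The join from the earlier slides, written once and reused.
FROM_CLAUSE = (
    "SQL_Class.Customer_Table Cus "
    "INNER JOIN SQL_Class.Order_Table ORD "
    "ON Cus.Customer_Number = ORD.Customer_Number"
)

# Columns as originally checked in the Objects tab.
report = ["Cus.Customer_Name", "ORD.Order_Date", "ORD.Order_Total"]
original_sql = build_select(report, FROM_CLAUSE)

# Drag Order_Total to the front, then drop Order_Date in the trashcan.
report.insert(0, report.pop(report.index("ORD.Order_Total")))
report.remove("ORD.Order_Date")
reordered_sql = build_select(report, FROM_CLAUSE)

print(original_sql)
print(reordered_sql)
```

Each change to the list simply regenerates the statement, which mirrors the way the SQL tab is described as changing after every drag and drop.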

The Tabs of the Super Join Builder – Sorting Tab

[Screenshot: the Sorting tab. The sort list at the top shows Column Name Order_Date with Sort Type ASC, next to a Trashcan area for removing sort keys.]

Report Columns: Customer_Name, Order_Date, Order_Total
Additional Columns: Customer_Number, Phone_Number, Order_Number, Customer_Number

The Sorting tab is for the ORDER BY clause. Double click on any column below and it will be used to sort the report.

The Sorting tab allows you to double-click on any column that you want to use in the ORDER BY clause of the SQL. This essentially sorts the report. The Report Columns list shows the columns you selected to be on your report. The Additional Columns list shows the columns you did not select to be on your report. You can, however, choose any of these columns to be the sort key. Double click on a column, or click-and-drag it up, and it will be a sort key. You can have multiple sort keys and you can always choose either ASC or DESC mode.

The Tabs of the Super Join Builder – Joins Tab

[Screenshot: the Joins tab. The Join Type drop down menu is set to INNER and offers INNER, LEFT, RIGHT and FULL.]

Join Objects: SQL_Class.Customer_Table Cus joins to SQL_Class.Order_Table ORD ON Cus.Customer_Number = ORD.Customer_Number

Customer_Table
INNER JOIN Order_Table
ON Cus.Customer_Number = ORD.Customer_Number

The Joins tab allows you to change from inner joins to outer joins. The Joins tab also gives you a visual of the joining tables and the columns in the ON clause.

Use the Joins tab if you want to change your joins from Inner to Outer joins. The drop down menu (red arrow) allows you to easily adjust your SQL to utilize the outer join of your choice. It is also designed to show you the tables being joined and the join column conditions in the ON CLAUSE. Above, we have decided to keep the default INNER JOIN.
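If you have not worked with outer joins before, the following sketch shows what changing the join type does to the answer set. It uses Python's built-in sqlite3 module as a stand-in for Vertica, and the two customers and their orders are made-up sample rows, not data from SQL_Class. Only INNER and LEFT are shown because older SQLite builds do not support RIGHT and FULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT);
    CREATE TABLE Order_Table (Order_Number INTEGER, Customer_Number INTEGER,
                              Order_Total REAL);
    -- Hypothetical rows: one customer with two orders, one with none.
    INSERT INTO Customer_Table VALUES (1, 'Has Orders'), (2, 'No Orders');
    INSERT INTO Order_Table VALUES (100, 1, 50.0), (101, 1, 75.0);
""")


def run_join(join_type):
    """Run the same two-table join with the requested join type."""
    sql = f"""
        SELECT Cus.Customer_Name, ORD.Order_Total
        FROM Customer_Table Cus
        {join_type} JOIN Order_Table ORD
        ON Cus.Customer_Number = ORD.Customer_Number
        ORDER BY Cus.Customer_Number, ORD.Order_Number
    """
    return conn.execute(sql).fetchall()


inner_rows = run_join("INNER")   # only customers that have matching orders
left_rows = run_join("LEFT")     # every customer, NULL totals where no order

print(inner_rows)
print(left_rows)
```

The inner join drops the customer without orders, while the left outer join keeps that customer and returns a NULL (None in Python) for the order columns.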

The Tabs of the Super Join Builder – SQL Tab

SELECT Cus.Customer_Name, ORD.Order_Date, ORD.Order_Total
FROM SQL_Class.Customer_Table Cus
INNER JOIN SQL_Class.Order_Table ORD
ON Cus.Customer_Number = ORD.Customer_Number
ORDER BY ORD.Order_Date ASC ;

The SQL tab shows you the SQL that Nexus has automatically generated. The SQL begins being built the first time you checkmark a column in the objects tab, and changes with each change you request from any of the other tabs. We originally requested Customer_Name, Order_Date and Order_Total in the Objects tab. We sorted by Order_Date ASC in the Sorting Tab. Nexus always generates the SQL perfectly for all systems.
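As a rough check of what this statement returns, the sketch below runs it against an in-memory SQLite database standing in for Vertica. The ATTACH statement is only a trick to make the SQL_Class schema name resolve in SQLite. The customer and order values are taken from the report shown earlier in this chapter, but the order numbers are invented and the dates are rewritten in ISO format so that they sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database so that SQL_Class.<table> resolves.
conn.execute("ATTACH DATABASE ':memory:' AS SQL_Class")
conn.executescript("""
    CREATE TABLE SQL_Class.Customer_Table (
        Customer_Number INTEGER, Customer_Name VARCHAR(20), Phone_Number CHAR(8));
    CREATE TABLE SQL_Class.Order_Table (
        Order_Number INTEGER, Customer_Number INTEGER,
        Order_Date DATE, Order_Total DECIMAL(10,2));
""")
conn.executemany(
    "INSERT INTO SQL_Class.Customer_Table VALUES (?, ?, NULL)",
    [(31323134, "ACE Consulting"), (57896883, "XYZ Plumbing"),
     (11111111, "Billy's Best Choice"), (87323456, "Databases N-U")],
)
conn.executemany(
    "INSERT INTO SQL_Class.Order_Table VALUES (?, ?, ?, ?)",
    [(1, 31323134, "1999-10-01", 5111.47),    # order numbers are hypothetical
     (2, 57896883, "1999-09-09", 23454.84),
     (3, 11111111, "1998-05-04", 12347.53),
     (4, 11111111, "1999-01-01", 8005.91),
     (5, 87323456, "1999-10-10", 15231.62)],
)

GENERATED_SQL = """
    SELECT Cus.Customer_Name, ORD.Order_Date, ORD.Order_Total
    FROM SQL_Class.Customer_Table Cus
    INNER JOIN SQL_Class.Order_Table ORD
    ON Cus.Customer_Number = ORD.Customer_Number
    ORDER BY ORD.Order_Date ASC ;
"""
rows = conn.execute(GENERATED_SQL).fetchall()
for row in rows:
    print(row)
```

The rows come back oldest order first, which is the effect of the sort key chosen in the Sorting tab.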

The Tabs of the Super Join Builder – Metadata Tab

[Screenshot: the Metadata tab with a Table Metadata panel and an Explain panel.]

Customer_Table: Table Size: 5 KB, Row Count: 6
Order_Table: Table Size: 5 KB, Row Count: 6

The Metadata tab shows you the size of each table in your join. This becomes a strategic asset for Cross-System joins.

This request is eligible for incremental planning and execution (IPE) but does not meet cost thresholds. The following is the static plan for the request. 1) First, we lock a distinct SQL_CLASS."pseudo table" for read on a RowHash to prevent global deadlock for SQL_CLASS.ORD. 2) Next, we lock SQL_CLASS.ORD for read. 3) We do an all-AMPs RETRIEVE step from SQL_CLASS.ORD by way of an all-rows scan with a condition of ("SQL_CLASS.ORD.Customer_Number = 11111111") into Spool 2 (one-amp), which is redistributed by the hash code of (11111111) to all AMPs. Then we do a SORT to order Spool 2 by row hash. The size of Spool 2 is estimated with low confidence to be 2 rows (60 bytes). The estimated time for this step is 0.01 seconds.

The Metadata tab will show you the size of your table, including row counts. This will be extremely important when you begin performing cross-system joins. Nexus always thinks about performance tuning first because Nexus has been used (in production) by many of the largest companies in the world. The Metadata tab will also show you the optimizer's plan, which is often called the Explain plan, if you request it by clicking on the magnifying glass (red circle above).

The Tabs of the Super Join Builder – Analytics Tab

[Screenshot: the Objects tab showing Sales_Table with its Add Join menu, a Select * check box and the columns Product_ID Integer, Sale_Date Date and Daily_Sales Char(8).]

Start the Analytics by bringing in a table to the Super Join Builder, but make sure you checkmark the columns you will be using in the Analytics tab.

The Analytics tab will allow a user to quickly build the SQL needed for Ordered Analytics (OLAP), Rank, and Grouping Sets. These analytics can also be done in the Garden of Analysis after an answer set returns, but on extremely large data sets it can be advantageous to have analytics performed by the data warehouse. The Analytics tab will build the SQL for the user so the user can submit that SQL to the data warehouse to receive an answer set. Start by right clicking on a table in the system tree and choosing Super Join Builder from the right click menu. Then, checkmark the columns you will want on your report, and then go to the Analytics tab.

The Tabs of the SJB – Analytics Tab – OLAP Screen

[Screenshot: the Analytics tab with its sub-tabs OLAP, Rank and Grouping Sets. The OLAP sub-tab has three drop windows (OLAP, Sorting and Partitioning), a Moving Window setting of 6, and the Report Columns Product_ID, Sale_Date and Daily_Sales, which can be dragged and dropped into the windows.]

OLAP Function check boxes: Select *, CSUM, MSUM, MAVG, MDIFF, COUNT, MAX, MIN
With Partitioning check boxes: Select *, and the same functions marked with (P)

Place a checkmark on the OLAPs you want.

This is usually the exact screen you see by default when you enter the Analytics tab. We are in the OLAP subtab. You will drag and drop the columns below to the appropriate window, change the moving window parameter and then check the OLAP functions you desire. You can select all of the OLAP functions by checking the Select * checkbox, and you can select the With Partitioning OLAP functions as well. The With Partitioning functions reset the calculations each time the value of the Partitioning column changes.

Getting a Simple CSUM in the Analytics Tab – OLAP

[Screenshot: the OLAP sub-tab of the Analytics tab. The OLAP window holds Daily_Sales, the Sorting window holds Product_ID and Sale_Date, the Partitioning window is empty and the Moving Window is 6. The OLAP Function and With Partitioning check boxes are listed below.]

In the example above, we dragged the Daily_Sales column to the OLAP box because that is the column we want to perform the calculations on. We dragged the Product_ID column first to the sorting tab and then we dragged the Sale_Date column there also. This means we will first sort the data by Product_ID, Sale_Date. We also checked the CSUM OLAP function. We didn't touch the Partitioning or Moving Window information. The next page shows the SQL tab and the SQL generated.

Getting a Simple CSUM – The SQL Automatically Generated

SELECT Sal.Product_ID, Sal.Sale_Date, Sal.Daily_Sales,
       SUM(Sal.Daily_Sales) OVER (ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC
                                  ROWS UNBOUNDED PRECEDING) AS CSUM_Sale_Date_Sal
FROM SQL_CLASS.Sales_table Sal ;

The SQL above was automatically generated by the previous OLAP screen and was done so the second we checked the CSUM checkbox. The above query will OLAP the column Daily_Sales, but only after first sorting the data by Product_ID, Sale_Date. The Rows Unbounded Preceding will generate a Cumulative Sum. Let's check out the report on the next slide.

The Answer Set of the CSUM

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID ASC, Sale_Date ASC
                              ROWS UNBOUNDED PRECEDING) AS CSUM_Sale_Date_Sal
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  CSUM_Sale_Date_Sal
----------  ----------  -----------  ------------------
1000        2000-09-28     48850.40            48850.40
1000        2000-09-29     54500.22           103350.62
1000        2000-09-30     36000.07           139350.69
1000        2000-10-01     40200.43           179551.12
1000        2000-10-02     32800.50           212351.62
1000        2000-10-03     64300.00           276651.62
1000        2000-10-04     54553.10           331204.72
2000        2000-09-28     41888.88           373093.60
2000        2000-09-29     48000.00           421093.60
2000        2000-09-30     49850.03           470943.63
2000        2000-10-01     54850.29           525793.92

Not all rows are displayed in this answer set.

The Sales_Table was first sorted by Product_ID, Sale_Date. Then, the Cumulative Sum (CSUM) began. We made 48850.40 on the first row, so 48850.40 is our first CSUM value. Then, we made 54500.22, so 48850.40 + 54500.22 equals 103350.62. The value from each Daily_Sales entry was continually added to the running total.
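If you want to verify the running total by hand, the short sketch below reproduces it outside the database. It only illustrates the arithmetic that SUM ... OVER (... ROWS UNBOUNDED PRECEDING) performs and is not how Vertica computes it. Decimal is used so that the cents add up exactly.

```python
from decimal import Decimal
from itertools import accumulate

# The eleven rows displayed in the answer set: (Product_ID, Sale_Date, Daily_Sales).
sales = [
    (2000, "2000-09-28", Decimal("41888.88")),
    (2000, "2000-09-29", Decimal("48000.00")),
    (2000, "2000-09-30", Decimal("49850.03")),
    (2000, "2000-10-01", Decimal("54850.29")),
    (1000, "2000-09-28", Decimal("48850.40")),
    (1000, "2000-09-29", Decimal("54500.22")),
    (1000, "2000-09-30", Decimal("36000.07")),
    (1000, "2000-10-01", Decimal("40200.43")),
    (1000, "2000-10-02", Decimal("32800.50")),
    (1000, "2000-10-03", Decimal("64300.00")),
    (1000, "2000-10-04", Decimal("54553.10")),
]

# Step 1: sort by Product_ID, then Sale_Date (the ORDER BY inside OVER).
ordered = sorted(sales, key=lambda row: (row[0], row[1]))

# Step 2: keep a running total from the first row to the current row.
csum = list(accumulate(row[2] for row in ordered))

for (product_id, sale_date, daily_sales), running in zip(ordered, csum):
    print(product_id, sale_date, daily_sales, running)
```

Notice that the running total does not restart when Product_ID changes from 1000 to 2000, because no partitioning column was chosen.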

Getting all of the OLAP functions in the Analytics Tab

[Screenshot: the OLAP sub-tab of the Analytics tab. The OLAP window holds Daily_Sales, the Sorting window holds Product_ID and Sale_Date, the Partitioning window holds Product_ID and the Moving Window is 3. The OLAP Function and With Partitioning check boxes are listed below.]

In the example above, we will OLAP the Daily_Sales column after first sorting by Product_ID, Sale_Date. We dragged the Product_ID column to the partitioning window. We changed the moving window to 3 and checked all of the OLAP and OLAP with partitioning boxes by merely clicking on SELECT *. The next page shows the SQL tab and the SQL generated.
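The sketch below shows, in plain Python, what partitioning and a moving window of 3 do to the calculations. It assumes a moving window of 3 means the current row plus the two rows before it within the same partition; the function name partitioned_olap is made up for this illustration, and only a few of the Sales_Table rows are used.

```python
from collections import defaultdict, deque
from decimal import Decimal


def partitioned_olap(rows, window):
    """rows: (partition_key, sort_key, value) tuples.

    Returns (partition_key, sort_key, value, csum, msum) tuples where both
    the cumulative sum and the moving sum restart for each partition.
    """
    running = defaultdict(Decimal)                      # cumulative sum per partition
    recent = defaultdict(lambda: deque(maxlen=window))  # last `window` values per partition
    out = []
    for key, sort_key, value in sorted(rows, key=lambda r: (r[0], r[1])):
        running[key] += value
        recent[key].append(value)
        out.append((key, sort_key, value, running[key], sum(recent[key])))
    return out


sales = [
    (1000, "2000-09-28", Decimal("48850.40")),
    (1000, "2000-09-29", Decimal("54500.22")),
    (1000, "2000-09-30", Decimal("36000.07")),
    (1000, "2000-10-01", Decimal("40200.43")),
    (2000, "2000-09-28", Decimal("41888.88")),
    (2000, "2000-09-29", Decimal("48000.00")),
]

result = partitioned_olap(sales, window=3)
for row in result:
    print(row)
```

On the fourth row the moving sum covers only the last three days, and on the fifth row both totals start over because Product_ID changed to 2000.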

A Five Table Join Using the Menu

[Screenshot: the Nexus systems tree for the Vertica system, database SQL_Class, with the right-click menu open on a table. Super Join Builder is the first menu item.]

Right click and choose Super Join Builder from the menu.

You are about to see how the menu system of the Super Join Builder is designed to work. We will first use the menu of the Super Join Builder and then show you an even quicker way (using the Cube method).

The First Table is placed in the Super Join Builder

[Screenshot: the Objects tab showing the Addresses table with the V (Vertica) icon, an Add Join menu and a Select * check box.]

Addresses: Street Varchar(30), City Varchar(20), State Char(2), Zip Integer, AreaCode Smallint, Phone Integer, Subscriber_No Integer

Get ready to select from the Add Join drop down menu.

This is exactly what you will see when you first enter the Super Join Builder. You will see your table, its columns and the data types of each column. Notice the table name (Addresses) and the V for the Vertica icon. Turn the page for more.

Using the Add Join Cascading Menu

[Screenshot: the Add Join menu on the Addresses table, cascading from Subscribers to Claims to Providers to Services.]

Left Click on the final table (Services).

The Add Join menu drop down shows that the Addresses table joins to the Subscribers table. Keep cascading down the menu and you see that the Subscribers table joins to the Claims table. Keep cascading all the way until you get to the final table, which is the Services table. Left click on the Services table and all five tables will be in the Super Join Builder.

All Five Tables Are In the Super Join Builder

[Screenshot: the Objects tab showing all five tables, each with the V (Vertica) icon, an Add Join menu and a Select * check box.]

Addresses: Street Varchar(30), City Varchar(20), State Char(2), Zip Integer, AreaCode Smallint, Phone Integer, Subscriber_No Integer
Subscribers: Last_Name Varchar(20), First_Name Varchar(20), Gender Char(1), SSN Integer, Member_No Smallint, Subscriber_No Integer
Claims: Claim_Id Integer, Claim_Date DATE, Subscriber_No Integer, Member_No Smallint, Claim_Amt Decimal(9,2)
Services: Service_Code Integer, Service_Desc Varchar(20), Service_Pay Decimal(7,2)
Providers: Provider_Code Integer, Prov_Name Varchar(20), Error_Rate Decimal(4,2)

Now that all five tables are present in the Super Join Builder, all you have to do is checkmark the columns you want on the report. The SQL is built automatically with each mouse click. When you are done selecting the columns, just hit Execute.

A Five Table Join Two Steps (Cube)

[Screenshot: the Nexus systems tree for the Vertica system, database SQL_Class, with the right-click menu open on a table. Super Join Builder is the first menu item.]

Right click and choose Super Join Builder from the menu.

Be prepared to be amazed. We are about to do the two-step! These two steps will allow a user to join many tables in an instant. Watch the two steps that it takes to join a table to everything possible. Pick any table in the join of many tables.

Choose Cube with Columns from the Left Top of the Table

[Screenshot: the Objects tab showing the Addresses table. The Cube drop down menu at the top left of the table is open and offers Create Cube and Create Cube with Columns.]

Choose Create Cube with Columns from the Cube drop down menu.

On the left side of the table is the Cube drop down menu. Choose the Create Cube with Columns option (highlighted above). The Nexus will join every table possible in the entire lineage instantly, and choose all of the columns. Turn the page!

All Tables are Cubed (Joined Together Instantly)

[Screenshot: the Objects tab showing Addresses, Subscribers, Claims, Services and Providers joined together, each with its column list and every column selected.]

There were five total tables that were joinable and the Create Cube with Columns choice instantly joined them together, including all of the columns. The SQL has been built automatically (in 2 seconds) and you can hit Execute to get your report.
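One way to picture what the Cube option does is as a walk over the defined join relationships, collecting every table that can be reached from the starting table. The sketch below is only a guess at the idea, not the Nexus implementation. The JOINABLE map follows the cascade shown in the Add Join menu (Addresses, Subscribers, Claims, Providers, Services) plus the Customer_Table and Order_Table relationship defined earlier, and the function name create_cube is hypothetical.

```python
from collections import deque

# Assumed join relationships, taken from the Add Join menus in this chapter.
JOINABLE = {
    "Addresses": ["Subscribers"],
    "Subscribers": ["Addresses", "Claims"],
    "Claims": ["Subscribers", "Providers"],
    "Providers": ["Claims", "Services"],
    "Services": ["Providers"],
    "Customer_Table": ["Order_Table"],
    "Order_Table": ["Customer_Table"],
}


def create_cube(start):
    """Return every table reachable from start through join relationships."""
    seen = [start]
    queue = deque([start])
    while queue:
        table = queue.popleft()
        for neighbor in JOINABLE.get(table, []):
            if neighbor not in seen:
                seen.append(neighbor)
                queue.append(neighbor)
    return seen


cube = create_cube("Addresses")
print(cube)
```

Starting from any of the five tables returns the same five tables, which matches the advice to pick any table in the join. Customer_Table and Order_Table are left out because no relationship connects them to this group.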

Choose Cube and then Choose Your Columns

[Screenshot: the Objects tab showing the Addresses table. The Cube drop down menu at the top left of the table is open and offers Create Cube and Create Cube with Columns.]

Choose Create Cube from the Cube drop down menu.

On the left side of the table is the Cube drop down menu. Choose the Create Cube option (highlighted above). The Nexus will join every table possible in the entire lineage instantly, but you can decide what columns you want on the report.

Create Cube - Tables Are Joined Without Columns Selected

[Screenshot: the Objects tab showing Addresses, Subscribers, Claims, Services and Providers joined together, each with its column list and no columns selected.]

All of the joinable tables are present in the Super Join Builder, but none of the columns are selected. You can now select the columns you want on the report. There will also be an X on the top of the tables, so you can delete any table you don't need.

Create Cube – Select the Columns You Want on the Report

[Screenshot: the Objects tab showing Addresses, Subscribers, Claims, Services and Providers joined together, each with its column list and only some columns selected.]

Notice that we have checked the columns we want on the report, but that not all columns were selected. The SQL is built automatically with each check or uncheck of a column box. When you are finished choosing the columns you want on the report, please hit Execute (above) or Preview SQL in Nexus, where you can hit Execute in the main Nexus screen.

How to join Vertica, Oracle and SQL Server Tables

[Screenshot: the Nexus systems tree with three systems: Oracle, SQL Server and Vertica. The Vertica tree is expanded to database SQL_Class and its tables, and the right-click menu is open with Super Join Builder as the first item.]

Choose Super Join Builder from the menu.

We are about to do a three-table join, but the incredible part is that one table is from Vertica, another from Oracle and the third table is from SQL Server. We will start with the Vertica table. We will right click on the Addresses table from the Vertica systems tree and a menu will appear. We choose the Super Join Builder (top menu item) and begin the process.

The Vertica Table is now in the Super Join Builder

[Screenshot: the Objects tab showing the Addresses table with an Add Join menu and a Select * check box. The V icon on the table stands for Vertica.]

Addresses: Subscriber_No Integer, Street Varchar(30), City Varchar(20), State Char(2), Zip Integer, AreaCode Smallint, Phone Integer

The Addresses Table (from the Vertica system) is now in the Super Join Builder. You will see the table, its columns and the data types of each column. Notice that the table name (Addresses) has an icon of V for Vertica in the upper right corner. Now is the time to open up our Oracle system tree. We will do that on the next slide.

Drag the Joining Oracle Table to the Super Join Builder

[Screenshot: the Oracle systems tree expanded to database SQL_Class and its tables, next to the Super Join Builder, which already holds the Vertica Addresses table.]

Open up your Oracle Systems Tree and left click on the Oracle table you want to join and drag it into the Super Join Builder.

Page 149

Defining the Join Columns

[Screenshot: the Add Custom Join dialog with Join Type set to Inner. The existing table is SQL_Class.Addresses (V icon) and the new table is SQL_Sandbox.Subscribers931827O (O icon). Use the Settings icon if you want to change the data movement options.]

1) Left click the join column on the first table (it highlights in blue).
2) Left click the join column on the second table.
3) Left click the blue arrow to establish the join condition.
4) Hit the Add Join button.

Addresses: Subscriber_No Integer, Street Varchar(30), City Varchar(20), State Char(2), Zip Integer, AreaCode Smallint, Phone Integer
Subscribers931827O: SSN Number(38,0), Gender Char(1), First_Name Varchar(20), Last_Name Char(20), Member_No Number(38,0), Subscriber_No Number(38,0)

Join condition shown in the dialog:

SQL_Class.Addresses Add INNER JOIN SQL_Sandbox.Subscribers931827O SUB ON Add.Subscriber_No = SUB.Subscriber_No

In four easy steps you can define the join conditions. Notice two things about the Subscribers Table from Oracle. First, notice the Oracle icon in the table's right hand corner. Second, notice the number behind the Subscribers name (pink). In the above example, the name is Subscribers931827O. The table will be moved to Vertica temporarily for the life of the join.

Page 150

Choose the Columns You Want on Your Report

[Screenshot: the Super Join Builder showing the Vertica table Addresses (V icon) and the Oracle table Subscribers (O icon), each with a checkbox per column. Join Hub System: Vertica.]

Addresses: Subscriber_No Integer, Street Varchar(30), City Varchar(20), State Char(2), Zip Integer, AreaCode Smallint, Phone Integer
Subscribers: Last_Name Varchar(20), First_Name Varchar(20), Gender Char(1), SSN Integer, Member_No Smallint, Subscriber_No Integer

The Vertica and Oracle tables are in the Super Join Builder, the relationships have been defined and so have the data movement strategies. All you need to do now is to checkmark the columns you want from both tables on the report. We have placed a checkmark on the Subscriber_No, State, Last_Name and First_Name columns. The SQL has already been built (automatically), and if you hit the Execute button, the report will return.

Page 151

Let's Add a SQL Server Table to our Vertica and Oracle Join

[Screenshot: the SQL Server SQL_Class tree is expanded (dbo.Addresses, dbo.Claims, dbo.Customer_Table, dbo.Employee_Table, dbo.Department_Table, dbo.Order_Table, dbo.Providers, dbo.Services, dbo.Subscribers). The Addresses (V) and Subscribers (O) tables are already in the Super Join Builder. Callout: Left click on the SQL Server table you want to join and drag it into the Super Join Builder.]

Open up your SQL Server Systems Tree and left click on the table you want to join, and drag it into the Super Join Builder.

Page 152

Defining the Join Columns

[Screenshot: the Add Custom Join dialog with Join Type set to Inner. The existing table is SQL_Sandbox.Subscribers931827O (O icon) and the new table is SQL_Sandbox.Claims222583S (S icon). Use the Settings icon if you want to change the data movement or working database options.]

Subscribers931827O: SSN Number(38,0), Gender Char(1), First_Name Varchar(20), Last_Name Char(20), Member_No Number(38,0), Subscriber_No Number(38,0)
Claims222583S: Claim_Id Integer, Claim_Date Date, Claim_Service Smallint, Subscriber_No Integer, Member_No Smallint, Claim_Amt Decimal(12,2), Provider_No Smallint

Join condition shown in the dialog:

SQL_Sandbox.Subscribers931827O SUB INNER JOIN SQL_Sandbox.Claims222583S CLA ON SUB.Subscriber_No = CLA.Subscriber_No

1) Choose the correct table that the new table joins with from the table drop down menu.
2) Choose the joining column(s) from the left table.
3) Choose the joining column(s) from the new table on the right.
4) Hit the blue arrow to actually define the join conditions (the SQL will change below to reflect the join).
5) Hit the Add Join button.

Page 153

All Three Tables are now in the Super Join Builder

[Screenshot: the Super Join Builder showing Addresses (V icon), Subscribers (O icon) and Claims, each with a checkbox per column. Join Hub System: Vertica.]

Addresses: Subscriber_No Integer, Street Varchar(30), City Varchar(20), State Char(2), Zip Integer, AreaCode Smallint, Phone Integer
Subscribers: Last_Name Varchar(20), First_Name Varchar(20), Gender Char(1), SSN Integer, Member_No Smallint, Subscriber_No Integer
Claims: Claim_Id Integer, Claim_Date DATE, Subscriber_No Integer, Member_No Smallint, Claim_Amt Decimal(12,2)

You now have three tables (from different systems) in the Super Join Builder. You have already defined the joining columns and your data movement strategies. Just click on the columns you want on the report and hit Execute. The SQL has already been built and the Oracle and SQL Server tables will be moved temporarily to the Vertica system, where they will be joined. Because we only selected Last_Name and First_Name from the Oracle table and Claim_Date and Claim_Amt from the SQL Server table, only those columns (plus the join condition column – Subscriber_No) will be moved to Vertica.
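The idea of moving only the needed columns to the hub can be sketched in a few lines. This is not how Nexus is implemented; it is a minimal illustration that uses three in-memory SQLite databases as stand-ins for Vertica, Oracle and SQL Server. The helper names (make_db, move_columns) and the sample rows are hypothetical.

```python
import sqlite3

def make_db(ddl, insert, rows):
    """Create an in-memory SQLite database standing in for one source system."""
    con = sqlite3.connect(":memory:")
    con.execute(ddl)
    con.executemany(insert, rows)
    return con

# Hypothetical sample rows; the real tables live on three different systems.
vertica = make_db(
    "CREATE TABLE Addresses (Subscriber_No INTEGER, Street TEXT, State TEXT)",
    "INSERT INTO Addresses VALUES (?, ?, ?)",
    [(1, "12 Elm St", "OH"), (2, "9 Oak Ave", "TX")],
)
oracle = make_db(
    "CREATE TABLE Subscribers (Subscriber_No INTEGER, Last_Name TEXT, First_Name TEXT, SSN INTEGER)",
    "INSERT INTO Subscribers VALUES (?, ?, ?, ?)",
    [(1, "Smith", "Ann", 111), (2, "Jones", "Bob", 222)],
)
sqlserver = make_db(
    "CREATE TABLE Claims (Subscriber_No INTEGER, Claim_Date TEXT, Claim_Amt REAL, Provider_No INTEGER)",
    "INSERT INTO Claims VALUES (?, ?, ?, ?)",
    [(1, "2015-01-05", 120.5, 7), (2, "2015-02-10", 80.0, 9)],
)

def move_columns(source, hub, table, columns):
    """Copy only the listed columns of a source table into a temp table on the hub."""
    col_list = ", ".join(columns)
    rows = source.execute(f"SELECT {col_list} FROM {table}").fetchall()
    hub.execute(f"CREATE TEMP TABLE {table} ({col_list})")
    marks = ", ".join("?" for _ in columns)
    hub.executemany(f"INSERT INTO {table} VALUES ({marks})", rows)

# The hub is the Vertica stand-in; only checked columns plus the join column move.
hub = vertica
move_columns(oracle, hub, "Subscribers", ["Subscriber_No", "Last_Name", "First_Name"])
move_columns(sqlserver, hub, "Claims", ["Subscriber_No", "Claim_Date", "Claim_Amt"])

report = hub.execute(
    """
    SELECT a.Subscriber_No, a.State, s.Last_Name, s.First_Name, c.Claim_Date, c.Claim_Amt
    FROM Addresses a
    INNER JOIN Subscribers s ON a.Subscriber_No = s.Subscriber_No
    INNER JOIN Claims c ON s.Subscriber_No = c.Subscriber_No
    ORDER BY a.Subscriber_No
    """
).fetchall()
moved = [d[0] for d in hub.execute("SELECT * FROM Subscribers").description]
print(report)
print(moved)
```

Notice that SSN and Provider_No never leave their home systems, because they were not selected and are not part of the join condition.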

Page 154

Change the Hub and Run the Join on Oracle

From the Join Hub System drop down menu choose Oracle.

[Screenshot: the Super Join Builder with Addresses (V icon), Subscribers (O icon) and Claims loaded, as on the previous slide. Join Hub System: Oracle.]

Nexus allows you to determine on which system the processing will take place. We call this the Hub. Above, we have changed the Join Hub System to Oracle. This means that the Vertica and SQL Server tables will be moved to Oracle, where all three tables will be joined. You can actually change the Hub to any system in your enterprise.

Page 155

Change the Hub and Run the Join on SQL Server

From the Join Hub System drop down menu choose SQL Server.

[Screenshot: the Super Join Builder with Addresses (V icon), Subscribers (O icon) and Claims loaded. Join Hub System: SQL Server.]

Nexus allows you to determine on which system the processing will take place. We call this the Hub. Above, we have changed the Join Hub System to SQL Server. This means that the Vertica and Oracle tables will be moved to the SQL Server system, where all three tables will be joined. You can actually change the Hub to any system in your enterprise.

Page 156

Simply Amazing - Change the Hub to the Garden of Analysis

From the Join Hub System drop down menu choose Garden of Analysis.

[Screenshot: the Super Join Builder with Addresses (V icon), Subscribers (O icon) and Claims loaded. Join Hub System: Garden of Analysis.]

Nexus allows you to determine on which system the processing will take place. We call this the Hub. Above, we have changed the Join Hub System to the Garden of Analysis. This should be done when the joining tables are not huge. Now, all of the tables will be queried separately, and then joined transparently inside the user's PC. It is as fast as lightning! Brilliant!
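To picture what a join "inside the user's PC" means, here is a minimal sketch of a client-side hash join over two answer sets that have already been fetched. It is an illustration only, not the Garden of Analysis code; the function name hash_join and the sample rows are hypothetical.

```python
def hash_join(left_rows, right_rows, key):
    """Inner-join two lists of dict rows on `key` using an in-memory hash table."""
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left_rows:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined

# Hypothetical answer sets, as if each system had been queried separately.
addresses = [
    {"Subscriber_No": 1, "State": "OH"},
    {"Subscriber_No": 2, "State": "TX"},
]
subscribers = [
    {"Subscriber_No": 1, "Last_Name": "Smith"},
    {"Subscriber_No": 3, "Last_Name": "Lee"},
]

result = hash_join(addresses, subscribers, "Subscriber_No")
print(result)
```

Both answer sets must fit in the PC's memory for this to work, which is why this hub is suggested only when the joining tables are not huge.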

Page 157

Have the Answer Set Saved Automatically to Any System

[Screenshot: the Super Join Builder with Addresses (V icon), Subscribers (O icon) and Claims loaded. Join Hub System: Vertica. Callout on the Create Table button next to Execute: Choose the Create Table option.]

Nexus allows you to create a table on any system in your enterprise with an answer set from the Super Join Builder. Once the Create Table option (above) is selected, you will be asked on which system you want the answer set saved, which database or schema and the table name. The answer set won't return to your screen, but instead will be saved as a table to the system you have chosen.

Page 158

Saving the Answer Set to an Oracle or SQL Server System

Create Table (Oracle, on the left)
System: Oracle Cloud
Database: SQL_Sandbox
Table Name: Addresses_SJB1
Buttons: Create Table, Cancel

Create Table (SQL Server, on the right)
System: SQL Server Test System
Database: SQL_Sandbox
Schema: dbo
Table Name: Addresses_SJB_Test
Buttons: Create Table, Cancel

Above are the screens you will have to fill in if you want to save your answer set to Oracle (on the left) or SQL Server (on the right). Once you hit the Create Table button you will be returned to the Super Join Builder. When you hit Execute, the answer set will be created as a table using the above details.

Page 159

Saving the Answer Set to a Vertica System

Create Table
System: Vertica
Database: SQL_Sandbox
Table Name: Addresses_SJB_Test
Distribution Key (choose from): SER.Service_Code, SER.Service_Desc, SER.Service_Pay
Buttons: Create Table, Cancel

Above is the screen you will have to fill in if you want to save your answer set to a Vertica system. Once you hit the Create Table button you will be returned to the Super Join Builder. When you hit Execute, the answer set will be created as a table using the above details.

Page 160

Saving the Answer Set to a Teradata System

Create Table
System: Teradata V15
Database: SQL_Sandbox
Table Name: Addresses_SJB_Test
Table Type: MultiSet or Set (Multiset tables allow duplicate rows. A Set table kicks out duplicate rows.)
Index Type: Non-Unique Primary Index (The Primary Index is the Distribution Key.)
Index columns (choose from): SER.Service_Code, SER.Service_Desc, SER.Service_Pay
Buttons: Create Table, Cancel

Above is the screen you will have to fill in if you want to save your answer set to a Teradata system. Once you hit the Create Table button you will be returned to the Super Join Builder. When you hit Execute, the answer set will be created as a table using the above details. In Teradata, you will have either a Unique Primary Index, a Non-Unique Primary Index or No Primary Index (NoPI). Above, we have a Non-Unique Primary Index on SER.Service_Code.

Page 161

Chapter 5

Page 162

The Basics of SQL


Chapter 5 – The Basics of SQL

“As I would not be a slave, so I would not be a master.” - Abraham Lincoln

Page 163

Introduction

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

The Student_Table above will be used in our early SQL Examples

This is a pictorial of the Student_Table which we will use to present some basic examples of SQL and get some hands-on experience with querying this table. This book attempts to show you the table, show you the query and show you the result set.

Page 164

Setting your Path

[Screenshot: Nexus Chameleon connected to System: Vertica, Database: SQL Class. The Systems tree lists Aster Data, Azure Cloud, DB2, Excel, Greenplum, Hadoop, Kognitio, Netezza, Oracle, Matrix, Redshift, SQL Server, Sybase, Teradata and Vertica. The query window contains the statement below.]

set search_path to SQL_Class

Result 1: Database = SQL_CLASS

The example above shows you how to set your path to include a database where your tables can be queried directly.

Page 165

Setting Your Default Database

set search_path to SQL_Class

We have set our default schema to be SQL_Class.

SELECT * FROM Student_Table ;

The schema is assumed to be SQL_Class.

SELECT * FROM SQL_Class.Student_Table ;

We have specified our schema to be SQL_Class.

Vertica allows you to set your default schema through the search path. Above, we have set our default schema to SQL_Class. If we run a query without specifying the schema, then Vertica will assume the schema is SQL_Class.

Page 166

SELECT * (All Columns) in a Table

SELECT * FROM Student_Table ;

An asterisk (*) means you want to see ALL columns in the table on your report.

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
125634      Hanson      Henry       FR          2.88
280023      McRoberts   Richard     JR          1.90
260000      Johnson     Stanley     ?           ?
231222      Wilson      Susie       SO          3.80
234121      Thomas      Wendy       FR          4.00
324652      Delaney     Danny       SR          3.35
123250      Phillips    Martin      SR          3.00
322133      Bond        Jimmy       JR          3.95
333450      Smith       Andy        SO          2.00

Almost every SQL query will consist of a SELECT and a FROM. You SELECT the columns you want to see on your report, and an asterisk (*) means you want to see all columns in the table on the returning answer set!

Page 167

Fully Qualifying a Database, Schema and Table

Database.Schema.TableName

CREATE TABLE Coffing.SQL_Class.DEPT
(DEPT_NO SMALLINT,
 DEPARTMENT_NAME CHARACTER(30),
 MGR_NO INT,
 BUDGET Decimal (10,2) );

We just created a table called Dept inside the Coffing database in the SQL_Class schema.

SELECT * FROM DEPT;
SELECT * FROM Dept;

If you are in the Coffing database with SQL_Class as your default schema, then these statements are both valid because Vertica is NOT case sensitive.

SELECT * FROM Coffing.SQL_Class.DEPT ;

This is fully qualified.

A fully qualified name uses three-level naming, which consists of the database, the schema and the object (table, view, etc.). The last example (Coffing.SQL_Class.DEPT) is fully qualified. When you leave off the database and schema, as in the first two SELECT statements, the system supplies them from your current database and search path.

Page 168

SELECT Specific Columns in a Table

SELECT First_Name
      ,Last_Name
      ,Class_Code
      ,Grade_Pt
FROM Student_Table ;

First_Name  Last_Name   Class_Code  Grade_Pt
__________  __________  __________  ________
Michael     Larkins     FR          0.00
Henry       Hanson      FR          2.88
Richard     McRoberts   JR          1.90
Stanley     Johnson     ?           ?
Susie       Wilson      SO          3.80
Wendy       Thomas      FR          4.00
Danny       Delaney     SR          3.35
Martin      Phillips    SR          3.00
Jimmy       Bond        JR          3.95
Andy        Smith       SO          2.00

This is a great way to show the columns you are selecting from the Table_Name.

Page 169

Commas in the Front or Back?

Example 1 (commas in front):

SELECT First_Name
      ,Last_Name
      ,Class_Code
      ,Grade_Pt
FROM Student_Table ;

Example 2 (commas in back):

SELECT First_Name,
       Last_Name,
       Class_Code,
       Grade_Pt
FROM Student_Table ;

First_Name  Last_Name   Class_Code  Grade_Pt
__________  __________  __________  ________
Michael     Larkins     FR          0.00
Henry       Hanson      FR          2.88
Richard     McRoberts   JR          1.90
Stanley     Johnson     ?           ?
Susie       Wilson      SO          3.80
Wendy       Thomas      FR          4.00
Danny       Delaney     SR          3.35
Martin      Phillips    SR          3.00
Jimmy       Bond        JR          3.95
Andy        Smith       SO          2.00

Why is the first example better even though the two are functionally equivalent? Errors are easier to spot, and commenting out a column is less likely to cause an error.

Page 170

Place your Commas in front for better Debugging Capabilities

SELECT First_Name,
       Last_Name,
       Class_Code,
       Grade_Pt,
FROM Student_Table ;

Error! Sometimes if you Add or Remove a COLUMN you can overlook an ending Comma!

SELECT First_Name
      ,Last_Name
      ,Class_Code
      ,Grade_Pt
FROM Student_Table ;

Successful

"A life filled with love may have some thorns, but a life empty of love will have no roses." - Anonymous

Having commas in front to separate column names makes it easier to debug. Remember our quote above. "A query filled with commas at the end just might fill you with thorns, but a query filled with commas in the front will allow you to always come up smelling like roses."

Page 171
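The trailing-comma mistake is easy to reproduce. The sketch below uses Python's built-in sqlite3 module as a stand-in for Vertica (an assumption made only so the example runs anywhere; error messages will differ on Vertica), and the helper name parses is hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE Student_Table "
    "(First_Name TEXT, Last_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)

def parses(sql):
    """Return True if the engine accepts the statement, False on a syntax error."""
    try:
        con.execute(sql)
        return True
    except sqlite3.OperationalError:
        return False

# Commas in back: a leftover comma before FROM breaks the query.
trailing_comma = """
SELECT First_Name,
       Last_Name,
       Class_Code,
       Grade_Pt,
FROM Student_Table
"""

# Commas in front: a middle column is commented out and the query still runs.
leading_commas = """
SELECT First_Name
      ,Last_Name
      --,Class_Code
      ,Grade_Pt
FROM Student_Table
"""

print(parses(trailing_comma))
print(parses(leading_commas))
```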

Sort the Data with the ORDER BY Keyword

SELECT * FROM Student_Table
ORDER BY Last_Name ;

Sorts the Answer Set in Ascending order by default.

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
322133      Bond        Jimmy       JR          3.95
324652      Delaney     Danny       SR          3.35
125634      Hanson      Henry       FR          2.88
260000      Johnson     Stanley     ?           ?
423400      Larkins     Michael     FR          0.00
280023      McRoberts   Richard     JR          1.90
123250      Phillips    Martin      SR          3.00
333450      Smith       Andy        SO          2.00
234121      Thomas      Wendy       FR          4.00
231222      Wilson      Susie       SO          3.80

Rows typically come back to the report in random order. To order the result set, you must use an ORDER BY. When you order by a column, it will order in ASCENDING order. This is called the Major Sort!

Page 172

ORDER BY Defaults to Ascending

SELECT * FROM Student_Table
ORDER BY Last_Name ;

Sorts the Answer Set In Ascending Order By Last_Name.

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
322133      Bond        Jimmy       JR          3.95
324652      Delaney     Danny       SR          3.35
125634      Hanson      Henry       FR          2.88
260000      Johnson     Stanley     ?           ?
423400      Larkins     Michael     FR          0.00
280023      McRoberts   Richard     JR          1.90
123250      Phillips    Martin      SR          3.00
333450      Smith       Andy        SO          2.00
234121      Thomas      Wendy       FR          4.00
231222      Wilson      Susie       SO          3.80

Rows typically come back to the report in random order, but we decided to use the ORDER BY statement. Now, the data comes back ordered by Last_Name.

Page 173

Use the Name or the Number in your ORDER BY Statement

SELECT * FROM Student_Table
ORDER BY 2 ;

Sorts the Answer Set by Column 2, which is Last_Name (the 2nd column coming back on the report).

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
322133      Bond        Jimmy       JR          3.95
324652      Delaney     Danny       SR          3.35
125634      Hanson      Henry       FR          2.88
260000      Johnson     Stanley     ?           ?
423400      Larkins     Michael     FR          0.00
280023      McRoberts   Richard     JR          1.90
123250      Phillips    Martin      SR          3.00
333450      Smith       Andy        SO          2.00
234121      Thomas      Wendy       FR          4.00
231222      Wilson      Susie       SO          3.80

The ORDER BY can use a number to represent the sort column. The number 2 represents the second column on the report.

Page 174

Two Examples of ORDER BY using Different Techniques

SELECT * FROM Student_Table
ORDER BY 5 ;

Same Query

SELECT * FROM Student_Table
ORDER BY Grade_Pt ;

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
260000      Johnson     Stanley     ?           ?
423400      Larkins     Michael     FR          0.00
280023      McRoberts   Richard     JR          1.90
333450      Smith       Andy        SO          2.00
125634      Hanson      Henry       FR          2.88
123250      Phillips    Martin      SR          3.00
324652      Delaney     Danny       SR          3.35
231222      Wilson      Susie       SO          3.80
322133      Bond        Jimmy       JR          3.95
234121      Thomas      Wendy       FR          4.00

Notice that the answer set is sorted in ascending order based on the column Grade_Pt. Also, notice that Grade_Pt is the fifth column coming back on the report. That is why the SQL in both statements is ordering by Grade_Pt. Did you notice that the null value came back first? Nulls sort first in ascending order and last in descending order.
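You can check that the column name and the column number sort identically with a small experiment. The sketch below runs against SQLite through Python's sqlite3 module, which is only a stand-in for Vertica, and it loads just four of the Student_Table rows. SQLite happens to place nulls first in ascending order, matching the report above; null placement can vary by engine and data type, so confirm it on your own system.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE Student_Table (Student_ID INTEGER, Last_Name TEXT, "
    "First_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
# A subset of the Student_Table rows; None is stored as a NULL.
con.executemany(
    "INSERT INTO Student_Table VALUES (?, ?, ?, ?, ?)",
    [
        (423400, "Larkins", "Michael", "FR", 0.00),
        (231222, "Wilson", "Susie", "SO", 3.80),
        (260000, "Johnson", "Stanley", None, None),
        (234121, "Thomas", "Wendy", "FR", 4.00),
    ],
)

by_name = con.execute("SELECT * FROM Student_Table ORDER BY Grade_Pt").fetchall()
by_number = con.execute("SELECT * FROM Student_Table ORDER BY 5").fetchall()
descending = con.execute("SELECT * FROM Student_Table ORDER BY 5 DESC").fetchall()

print([row[1] for row in by_name])
print([row[1] for row in descending])
```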

Page 175

Changing the ORDER BY to Descending Order

SELECT * FROM Student_Table
ORDER BY Last_Name DESC;

Sorts the Answer Set In DESC Order By Last_Name.

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
231222      Wilson      Susie       SO          3.80
234121      Thomas      Wendy       FR          4.00
333450      Smith       Andy        SO          2.00
123250      Phillips    Martin      SR          3.00
280023      McRoberts   Richard     JR          1.90
423400      Larkins     Michael     FR          0.00
260000      Johnson     Stanley     ?           ?
125634      Hanson      Henry       FR          2.88
324652      Delaney     Danny       SR          3.35
322133      Bond        Jimmy       JR          3.95

Notice that the answer set is sorted in descending order based on the column Last_Name. Also, notice that Last_Name is the second column coming back on the report. We could have done an Order By 2. If you spell out the word DESCENDING the query will fail, so you must remember to just use DESC.

Page 176

NULL Values sort First in Ascending Mode (Default)

SELECT * FROM Student_Table
ORDER BY 5 ;

SELECT * FROM Student_Table
ORDER BY Grade_Pt ;

Nulls sort first in ASC Order.

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
260000      Johnson     Stanley     ?           ?
423400      Larkins     Michael     FR          0.00
280023      McRoberts   Richard     JR          1.90
333450      Smith       Andy        SO          2.00
125634      Hanson      Henry       FR          2.88
123250      Phillips    Martin      SR          3.00
324652      Delaney     Danny       SR          3.35
231222      Wilson      Susie       SO          3.80
322133      Bond        Jimmy       JR          3.95
234121      Thomas      Wendy       FR          4.00

Did you notice that the null value came back first? Nulls sort first in ascending order and last in descending order.

Page 177

NULL Values sort Last in Descending Mode (DESC)

SELECT * FROM Student_Table
ORDER BY 5 DESC ;

SELECT * FROM Student_Table
ORDER BY Grade_Pt DESC ;

Nulls sort Last in DESC Order.

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
234121      Thomas      Wendy       FR          4.00
322133      Bond        Jimmy       JR          3.95
231222      Wilson      Susie       SO          3.80
324652      Delaney     Danny       SR          3.35
123250      Phillips    Martin      SR          3.00
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
280023      McRoberts   Richard     JR          1.90
423400      Larkins     Michael     FR          0.00
260000      Johnson     Stanley     ?           ?

You can ORDER BY in descending order by putting a DESC after the column name or its corresponding number. Null Values will sort Last in DESC order.

Page 178

Major Sort vs. Minor Sorts

SELECT * FROM Student_Table
ORDER BY Class_Code DESC, Grade_Pt ASC;

Major sorts first (Class_Code descending). Minor sorts on ties (Grade_Pt ascending).

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
123250      Phillips    Martin      SR          3.00
324652      Delaney     Danny       SR          3.35
333450      Smith       Andy        SO          2.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
423400      Larkins     Michael     FR          0.00
125634      Hanson      Henry       FR          2.88
234121      Thomas      Wendy       FR          4.00
260000      Johnson     Stanley     ?           ?

Major sort is the first sort. There can only be one major sort. A minor sort kicks in if there are Major Sort ties. There can be zero or more minor sorts.
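Here is a small way to watch the minor sort break the ties left by the major sort. As an assumption for portability, the sketch runs on SQLite through Python's sqlite3 module rather than on Vertica, and it loads only six of the Student_Table rows, in a deliberately scrambled order.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Student_Table (Last_Name TEXT, Class_Code TEXT, Grade_Pt REAL)")
# Six of the Student_Table rows, inserted out of order on purpose.
con.executemany(
    "INSERT INTO Student_Table VALUES (?, ?, ?)",
    [
        ("Thomas", "FR", 4.00),
        ("Delaney", "SR", 3.35),
        ("Wilson", "SO", 3.80),
        ("Larkins", "FR", 0.00),
        ("Phillips", "SR", 3.00),
        ("Smith", "SO", 2.00),
    ],
)

# Class_Code DESC is the major sort; Grade_Pt ASC only decides ties within a class.
rows = con.execute(
    "SELECT Last_Name, Class_Code, Grade_Pt FROM Student_Table "
    "ORDER BY Class_Code DESC, Grade_Pt ASC"
).fetchall()

for row in rows:
    print(row)
```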

Page 179

Multiple Sort Keys using Names vs. Numbers

SELECT * FROM Employee_Table
ORDER BY Dept_No DESC
        ,Salary ASC
        ,Last_Name ASC;

SELECT * FROM Employee_Table
ORDER BY 2 DESC, 5, 3 ASC ;

These queries sort identically.

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00
1121334      400      Strickling  Cletus      54500.00
2312225      300      Larkins     Loraine     40200.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
1232578      100      Chambers    Mandee      48850.00
1000234      10       Smythe      Richard     64300.00
2000000      ?        Jones       Squiggy     32800.50

In the example above, the Dept_No is the major sort and we have two minor sorts. The minor sorts are on the Salary and the Last_Name columns. Both Queries above have an equivalent Order by statement and sort exactly the same.

Page 180

Sorts are Alphabetical, NOT Logical

SELECT * FROM Student_Table
ORDER BY Class_Code ;

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
125634      Hanson      Henry       FR          2.88
423400      Larkins     Michael     FR          0.00
322133      Bond        Jimmy       JR          3.95
280023      McRoberts   Richard     JR          1.90
231222      Wilson      Susie       SO          3.80
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
123250      Phillips    Martin      SR          3.00

This sorts alphabetically. Can you change the sort so the Freshmen come first, followed by the Sophomores, Juniors, Seniors and then the Null?

Can you change the query to ORDER BY Class_Code logically (FR, SO, JR, SR, ?)?

Page 181

Using A CASE Statement to Sort Logically

SELECT * FROM Student_Table
ORDER BY CASE Class_Code
            WHEN 'FR' THEN 1
            WHEN 'SO' THEN 2
            WHEN 'JR' THEN 3
            WHEN 'SR' THEN 4
            ELSE 5
         END;

CASE in the ORDER BY Statement

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
234121      Thomas      Wendy       FR          4.00
125634      Hanson      Henry       FR          2.88
423400      Larkins     Michael     FR          0.00
333450      Smith       Andy        SO          2.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
123250      Phillips    Martin      SR          3.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?

This is the way the pros do it.

Page 182
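To try the CASE technique yourself, the sketch below runs the same ORDER BY against SQLite through Python's sqlite3 module. SQLite is only a stand-in for Vertica here, and just one Student_Table row per class (plus the null row) is loaded, so there are no ties to worry about.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Student_Table (Last_Name TEXT, Class_Code TEXT)")
# One row per class plus the row with a NULL Class_Code.
con.executemany(
    "INSERT INTO Student_Table VALUES (?, ?)",
    [
        ("Johnson", None),
        ("Delaney", "SR"),
        ("Bond", "JR"),
        ("Smith", "SO"),
        ("Larkins", "FR"),
    ],
)

# The CASE expression maps each class to its logical rank; NULL falls to ELSE 5.
logical = con.execute(
    """
    SELECT Last_Name, Class_Code FROM Student_Table
    ORDER BY CASE Class_Code
                WHEN 'FR' THEN 1
                WHEN 'SO' THEN 2
                WHEN 'JR' THEN 3
                WHEN 'SR' THEN 4
                ELSE 5
             END
    """
).fetchall()

print(logical)
```

The null row lands last because a NULL matches none of the WHEN values, so it receives the ELSE value of 5.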

How to ALIAS a Column Name

[Screenshot: Nexus Chameleon connected to System: Vertica, Database: SQL Class, with the query below in the Query 1 tab and its answer set in Result 1.]

SELECT First_Name, Last_Name
      ,Class_Code "class code"
      ,Grade_Pt AS "AVG"
      ,Student_ID AS STU_ID
FROM Student_Table
WHERE Class_Code = 'JR'

Double quotes are used because of spaces or reserved words. The keyword AS is optional. You need single quotes in a WHERE clause for character data, but you cannot use single quotes to alias.

first_name  last_name   class code  avg   stu_id
__________  __________  __________  ____  ______
Richard     McRoberts   JR          1.90  280023
Jimmy       Bond        JR          3.95  322133

When you ALIAS a column, you give it a new name for the report header. You can reference the ALIAS later in the query, such as in the ORDER BY, but notice that the WHERE clause above still uses the original column name (Class_Code). Double quotes go around an ALIAS that contains spaces or is a reserved word; single quotes go around character data.

Page 183

A Missing Comma can by Mistake become an Alias

SELECT First_Name, Last_Name, Class_Code Grade_Pt
FROM Student_Table ;

Missing a Comma (between Class_Code and Grade_Pt)

First_Name  Last_Name   Grade_Pt
__________  __________  ________
Michael     Larkins     FR
Susie       Wilson      SO
Richard     McRoberts   JR
Jimmy       Bond        JR
Henry       Hanson      FR
Andy        Smith       SO
Danny       Delaney     SR
Stanley     Johnson     ?
Wendy       Thomas      FR
Martin      Phillips    SR

Aliased as Grade_Pt

Column names must be separated by commas. Notice in this example, there is a comma missing between Class_Code and Grade_Pt. The result is that only three columns appear on your report, and the Class_Code values come back under the wrong heading because the column has been aliased as Grade_Pt.
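You can see the accidental alias by looking at the column headers the database hands back. This sketch uses Python's sqlite3 module as a stand-in for Vertica (an assumption for portability) and loads two of the Student_Table rows.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE Student_Table "
    "(First_Name TEXT, Last_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
con.executemany(
    "INSERT INTO Student_Table VALUES (?, ?, ?, ?)",
    [("Michael", "Larkins", "FR", 0.00), ("Susie", "Wilson", "SO", 3.80)],
)

# The comma between Class_Code and Grade_Pt is missing on purpose,
# so Grade_Pt is read as an alias for Class_Code.
cur = con.execute(
    "SELECT First_Name, Last_Name, Class_Code Grade_Pt FROM Student_Table"
)
headers = [column[0] for column in cur.description]
rows = cur.fetchall()

print(headers)
print(rows)
```

The query runs without any error, which is exactly why this mistake is so easy to miss.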

Page 184

Aliasing a Column Name with Spaces or Reserved Words

SELECT Employee_No AS Emp_Number
      ,Dept_No Dept_Number
      ,Last_name "Last Name"
      ,First_name AS 'First Name'
      ,Salary AS "Employee Pay"
FROM Employee_table

For an Alias with a space or one that is also a reserved word use either:
1. Double quotes
2. Single quotes

Emp_Number  Dept_Number  Last Name   First_Name  Employee Pay
__________  ___________  __________  __________  ____________
1000234     10           Smythe      Richard     64300.00
1121334     400          Strickling  Cletus      54500.00
1232578     100          Chambers    Mandee      48850.00
1236548     400          Mays        Mary        50000.00
1256349     400          Harrison    Herbert     54500.00
1324657     200          Coffing     Billy       41888.88
1333454     200          Smith       John        48000.00
2000000     ?            Jones       Squiggy     32800.50
2312225     300          Larkins     Loraine     40200.00
2341218     400          Reilly      William     36000.00

When you ALIAS a column, you give it a new name for the report header. If your alias is a reserved word or has a space in it you can still use it, but you must use either double quotes or single quotes.

Page 185

Comments using Double Dashes are Single Line Comments

[Screenshot: Nexus Chameleon connected to System: Vertica, Database: SQL Class, with the query below in the Query 1 tab and its answer set in Result 1.]

-- This is how you can make a comment
SELECT * FROM Employee_Table
WHERE Dept_No = 400 -- We only want Department 400 rows

Two single dashes comment out the rest of the line.

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
1256349      400      Harrison    Herbert     55000.00
1121334      400      Strickling  Cletus      55000.00
2341218      400      Reilly      William     72000.00

Double dashes make a single line comment that will be ignored by the system.

Comments for Multi-Lines

/* This is how you can make multi-line comments
   to express what is going on in the code. */
SELECT *
FROM Employee_Table
WHERE Dept_No = 400

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
1256349      400      Harrison    Herbert     55000.00
1121334      400      Strickling  Cletus      55000.00
2341218      400      Reilly      William     72000.00

Slash Asterisk starts a multi-line comment and Asterisk Slash ends the comment.

Comments for Multi-Lines as Double Dashes per Line

-- This is how you can make multi-line comments
-- to express what is going on in the code.
SELECT *
FROM Employee_Table
WHERE Dept_No = 400

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
1256349      400      Harrison    Herbert     55000.00
1121334      400      Strickling  Cletus      55000.00
2341218      400      Reilly      William     72000.00

Double dashes in front of both lines comment both lines out, and they are ignored.
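All three comment styles can live in the same statement. Here is a small sketch that runs one; it assumes SQLite (via Python's sqlite3 module) in place of Vertica and a three-row, two-column stand-in for Employee_Table, so treat it as an illustration rather than Vertica output.

```python
import sqlite3

# Assumption: SQLite stands in for Vertica; Employee_Table is cut down to two columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER)")
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?)",
    [(1256349, 400), (1121334, 400), (1324657, 200)],
)

query = """
/* Multi-line comment:
   everything up to the closing marker is ignored. */
-- Single-line comment: ignored to the end of this line
SELECT Employee_No
FROM Employee_Table
WHERE Dept_No = 400  -- trailing comment after the condition
ORDER BY Employee_No
"""

# The comments do not change the result: only Department 400 rows return.
dept_400 = [row[0] for row in conn.execute(query)]
print(dept_400)
```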

Formatting Number

9 - Value with the specified number of digits
0 - Value with leading zeros
. (period) - Decimal point
, (comma) - Group (thousand) separator
PR - Negative value in angle brackets
S - Sign anchored to number (uses locale)
L - Currency symbol (uses locale)
D - Decimal point (uses locale)
G - Group separator (uses locale)
MI - Minus sign in specified position (if number < 0)
PL - Plus sign in specified position (if number > 0)
SG - Plus/minus sign in specified position
RN - Roman numeral (input between 1 and 3999)
TH or th - Ordinal number suffix
V - Shift specified number of digits (see notes)

Vertica gives you many options for formatting numbers. The next page will show an example.

Formatting Number Examples

SELECT Salary
      ,TO_CHAR(Salary , 'L999,999.99') AS Dollarsign
      ,TO_CHAR(Salary , 'SL999999') AS Anchored
      ,TO_CHAR(Salary , '0000000') AS LeadingZ
      ,TO_CHAR(Salary , '99999999999.99') AS Float9
      ,TO_CHAR(Salary , '999,999,999.99') AS Commas
FROM Employee_Table
WHERE Dept_No = 200 ;

Salary    Dollarsign   Anchored  LeadingZ  Float9    Commas
________  ___________  ________  ________  ________  _________
41888.88  $ 41,888.88  $ +41889  0041889   41888.88  41,888.88
48000.00  $ 48,000.00  $ +48000  0048000   48000.00  48,000.00

Above you can see an example of formatted numbers using the TO_CHAR function.

Formatting Dates

HH - Hour of day (00-23)
HH12 - Hour of day (01-12)
HH24 - Hour of day (00-23)
MI - Minute (00-59)
SS - Second (00-59)
MS - Millisecond (000-999)
US - Microsecond (000000-999999)
SSSS - Seconds past midnight (0-86399)
AM or A.M. or PM or P.M. - (uppercase)
am or a.m. or pm or p.m. - (lowercase)
Y,YYY - Year with comma
YYYY - Year (4 and more digits)
YYY - Last 3 digits of year
YY - Last 2 digits of year
Y - Last digit of year
IYYY - ISO year (4 and more digits)
IYY - Last 3 digits of ISO year
IY - Last 2 digits of ISO year
I - Last digit of ISO year
BC or B.C. or AD or A.D. - (uppercase)
bc or b.c. or ad or a.d. - (lowercase)
MONTH - Full uppercase month name
Month - Full mixed-case month name
month - Full lowercase month name
MON - Uppercase month (3 chars)
Mon - Mixed-case month (3 chars)
mon - Lowercase month (3 chars)
MM - Month number (01-12)
DAY - Full uppercase day
Day - Full mixed-case day
day - Full lowercase day
DY - Uppercase day (3 chars)
Dy - Mixed-case day (3 chars)
dy - Lowercase day (3 chars)
DDD - Day of year (001-366)
DD - Day of month (01-31) for TIMESTAMP
D - Day of week (1-7) Sunday = 1
W - Week of month (1-5)
WW - Week number of year (1-53)
IW - ISO week number of year
CC - Century (2 digits)
J - Julian Day (days since Jan 1, 4712 BC)
Q - Quarter
RM - Month in Roman numerals
rm - Month in Roman numerals (lowercase)
TZ - Time-zone name (uppercase)
tz - Time-zone name (lowercase)

Vertica gives you many options for formatting dates. The next page will show an example.

Formatting Date Example

SELECT Order_Date
      ,TO_CHAR(Order_Date , 'YY-MM-DD') AS YMD
      ,TO_CHAR(Order_Date , 'MON, DD, YYYY') AS Month
      ,TO_CHAR(Order_Date , 'D, Mon DD, YY') AS DayofWeek
      ,Current_Time as Time
      ,TO_CHAR(Current_Time , 'HH24:MI:SS:MS') AS Micro
FROM Order_Table
WHERE EXTRACT(Year from Order_Date) = 1998 ;

Order_Date  YMD       Month          DayofWeek      Time      Micro
__________  ________  _____________  _____________  ________  ____________
05/04/1998  98-05-04  MAY, 04, 1998  2, May 04, 98  10:09:58  10:09:58:109

Above you can see an example of formatted dates using the TO_CHAR function.

Chapter 6 – The WHERE Clause

“I saw the angel in the marble and carved until I set him free.” - Michelangelo

The WHERE Clause limits Returning Rows

SELECT *
FROM Employee_table
WHERE Dept_No = 400 ;

The WHERE clause limits the rows

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
1256349      400      Harrison    Herbert     55000.00
1121334      400      Strickling  Cletus      55000.00
2341218      400      Reilly      William     72000.00

The WHERE Clause here filters how many ROWS are coming back. In this example, I am asking for the report to return only rows WHERE the Dept_No is 400.

Double Quoted Aliases are for Reserved Words and Spaces

SELECT First_Name AS Fname
      ,Last_Name Lname
      ,Class_Code "Class Code"
      ,Grade_Pt AS "AVG"
      ,Student_ID
FROM Student_Table
ORDER BY "AVG" ;

The AS keyword is always optional.
If spaces are in the Alias, you must use double quotes.
If Double Quotes are used, then use the Double Quotes throughout the SQL.

“Write a wise saying and your name will live forever.” - Anonymous

When you ALIAS a column you give it a new name for the report header, but a good rule of thumb is to refer to the column by the alias throughout the query. Whoever wrote the above quote was way off. "Write a wise alias and it will live until the query ends – bummer".

Character Data needs Single Quotes in the WHERE Clause

SELECT *
FROM Student_table
WHERE First_Name = 'Henry' ;

Character data needs single quotes

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
125634      Hanson     Henry       FR          2.88

In the WHERE clause, if you search for Character data such as a first name, you need single quotes around it. You don’t single-quote numbers.

Character Data needs Single Quotes, but Numbers Don’t

SELECT *
FROM Student_table
WHERE Grade_Pt = 0.00 ;

Numeric data never needs quotes

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
423400      Larkins    Michael     FR          0.00

Character data (letters) needs single quotes, but you need NO single quotes for numbers. Remember you never use double quotes except for aliasing.


Comparisons against a Null Value Col_A 1 2 3 4 5 6 7 8 9 10 NULL NULL NULL NULL NULL NULL

Operator + / * > >=
= =)

SELECT *
FROM Student_table
WHERE Grade_PT >= 3.0;

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
123250      Phillips   Martin      SR          3.00
231222      Wilson     Susie       SO          3.80
234121      Thomas     Wendy       FR          4.00
322133      Bond       Jimmy       JR          3.95
324652      Delaney    Danny       SR          3.35

The WHERE Clause doesn’t just deal with ‘Equals’. You can look for things that are GREATER THAN or LESS THAN a value, and you can also ask for things that are GREATER THAN or EQUAL TO, or LESS THAN or EQUAL TO, a value.

AND in the WHERE Clause

SELECT *
FROM Student_table
WHERE Class_Code = 'FR'
AND First_Name = 'Henry' ;

Both conditions must be met

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
125634      Hanson     Henry       FR          2.88

Notice the WHERE clause and the word AND. In this example, qualifying rows must have a Class_Code = ‘FR’ and also must have a First_Name of ‘Henry’. Notice how the WHERE and the AND are each on their own line. Good practice!

Troubleshooting AND

SELECT *
FROM Student_Table
WHERE Grade_Pt = 3.0
AND Grade_Pt = 4.0 ;

Both conditions must be met

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________

No rows are returned because no student can have two different Grade_Pt values.

What is going wrong here? You are using an AND to check the same column. What you are basically asking with this syntax is to see the rows that have BOTH a Grade_Pt of 3.0 and a Grade_Pt of 4.0. No rows will be returned.

OR in the WHERE Clause

SELECT *
FROM Student_Table
WHERE Grade_Pt = 3.0
OR Grade_Pt = 4.0 ;

Either condition can be met

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
123250      Phillips   Martin      SR          3.00
234121      Thomas     Wendy       FR          4.00

Notice above in the WHERE Clause we use OR. OR allows for either of the conditions to be TRUE in order for the data to qualify and return.
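To see AND and OR side by side on the same column, the sketch below runs both queries against a copy of the Student_Table data. It assumes SQLite (through Python's sqlite3 module) as a stand-in for Vertica and keeps only the two columns the queries need.

```python
import sqlite3

# Assumption: SQLite stands in for Vertica; only Last_Name and Grade_Pt are loaded.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Last_Name TEXT, Grade_Pt REAL)")
conn.executemany(
    "INSERT INTO Student_Table VALUES (?, ?)",
    [("Larkins", 0.00), ("Wilson", 3.80), ("McRoberts", 1.90), ("Bond", 3.95),
     ("Hanson", 2.88), ("Smith", 2.00), ("Delaney", 3.35), ("Johnson", None),
     ("Thomas", 4.00), ("Phillips", 3.00)],
)

# AND on the same column: no single row can hold both values.
and_rows = [r[0] for r in conn.execute(
    "SELECT Last_Name FROM Student_Table "
    "WHERE Grade_Pt = 3.0 AND Grade_Pt = 4.0 ORDER BY Last_Name")]

# OR on the same column: a row qualifies if either condition is true.
or_rows = [r[0] for r in conn.execute(
    "SELECT Last_Name FROM Student_Table "
    "WHERE Grade_Pt = 3.0 OR Grade_Pt = 4.0 ORDER BY Last_Name")]

print(and_rows)
print(or_rows)
```

The AND query comes back empty, while the OR query returns Phillips and Thomas.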

Troubleshooting Or

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

Error:
SELECT * FROM Student_Table
WHERE Grade_Pt = 3.0 OR 4.0;

Perfect:
SELECT * FROM Student_Table
WHERE Grade_Pt = 3.0 OR Grade_Pt = 4.0;

Notice above in the WHERE Clause we use OR. OR allows for either of the conditions to be TRUE in order for the data to qualify and return. The first example errors because the column name is not repeated after the OR, which is a common mistake. The second example is perfect.

Troubleshooting Character Data

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

SELECT * FROM Student_Table
WHERE Grade_Pt = 3.0
AND Class_Code = SR ;

Error!!! Why?

This query errors! What is WRONG with this syntax? There are no single quotes around SR.

Using Different Columns in an AND Statement

SELECT *
FROM Student_table
WHERE Grade_Pt = 3.0
AND Class_Code = 'SR' ;

Character data needs single quotes

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
123250      Phillips   Martin      SR          3.00

Notice that AND separates two different columns, and the data will come back if both are TRUE.

Quiz – How many rows will return?

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

SELECT * FROM Student_Table
WHERE Grade_Pt = 4.0 OR Grade_Pt = 3.0
AND Class_Code = 'SR' ;

Which Seniors have a 3.0 or a 4.0 Grade_Pt average? How many rows will return?

A) 2
B) 1
C) Error
D) 3

Answer to Quiz – How many rows will return?

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

SELECT * FROM Student_Table
WHERE Grade_Pt = 4.0 OR Grade_Pt = 3.0
AND Class_Code = 'SR' ;

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
234121      Thomas     Wendy       FR          4.00
123250      Phillips   Martin      SR          3.00

We had two rows return! Isn’t that a mystery? Why?

What is the Order of Precedence?

1  ()
2  NOT
3  AND
4  OR

SELECT * FROM Student_Table
WHERE Grade_Pt = 4.0 OR Grade_Pt = 3.0
AND Class_Code = 'SR' ;

SQL syntax has an ORDER OF PRECEDENCE. It evaluates anything with parentheses around it first. Then it evaluates the NOT conditions, next the AND conditions, and FINALLY the OR conditions. This is why the last query came out odd: the AND was applied first, so the query was read as Grade_Pt = 4.0 OR (Grade_Pt = 3.0 AND Class_Code = 'SR'), and Wendy Thomas, a freshman with a 4.0, qualified. Let’s fix it and bring back the right answer set.

Using Parentheses to change the Order of Precedence

SELECT *
FROM Student_Table
WHERE (Grade_Pt = 3.0 OR Grade_Pt = 4.0)
AND Class_Code = 'SR' ;

Parentheses are evaluated first

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
123250      Phillips   Martin      SR          3.00

This is the proper way of looking for rows that have a Grade_Pt of either 3.0 or 4.0 AND also have a Class_Code of ‘SR’. Only ONE row comes back. Parentheses are evaluated first, so this allows you to direct exactly what you want to work first.
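The difference the parentheses make can be checked by running the quiz query and the corrected query back to back. This is a sketch only: it assumes SQLite (via Python's sqlite3 module) in place of Vertica and loads just the three columns involved. AND binding tighter than OR is standard SQL behavior, so the row counts should match the pages above.

```python
import sqlite3

# Assumption: SQLite stands in for Vertica; only the columns used here are loaded.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student_Table (Last_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany(
    "INSERT INTO Student_Table VALUES (?, ?, ?)",
    [("Larkins", "FR", 0.00), ("Wilson", "SO", 3.80), ("McRoberts", "JR", 1.90),
     ("Bond", "JR", 3.95), ("Hanson", "FR", 2.88), ("Smith", "SO", 2.00),
     ("Delaney", "SR", 3.35), ("Johnson", None, None), ("Thomas", "FR", 4.00),
     ("Phillips", "SR", 3.00)],
)

# No parentheses: AND is applied before OR, so any 4.0 student qualifies.
no_parens = [r[0] for r in conn.execute(
    "SELECT Last_Name FROM Student_Table "
    "WHERE Grade_Pt = 4.0 OR Grade_Pt = 3.0 AND Class_Code = 'SR' "
    "ORDER BY Last_Name")]

# Parentheses force the OR to be evaluated first, then the senior check.
with_parens = [r[0] for r in conn.execute(
    "SELECT Last_Name FROM Student_Table "
    "WHERE (Grade_Pt = 3.0 OR Grade_Pt = 4.0) AND Class_Code = 'SR' "
    "ORDER BY Last_Name")]

print(no_parens)
print(with_parens)
```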

Using an IN List in place of OR

SELECT *
FROM Student_Table
WHERE Grade_Pt IN (3.0, 4.0)
AND Class_Code = 'SR' ;

This IN list means any Grade_Pt with a 3.0 or 4.0

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
123250      Phillips   Martin      SR          3.00

Using an IN List is a great way of looking for rows that have a Grade_Pt of either 3.0 or 4.0 AND also have a Class_Code of ‘SR’. Only ONE row comes back.

The IN List is an Excellent Technique

SELECT *
FROM Student_Table
WHERE Grade_Pt IN (2.0, 3.0, 4.0) ;

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
123250      Phillips   Martin      SR          3.00
234121      Thomas     Wendy       FR          4.00
333450      Smith      Andy        SO          2.00

The IN Statement avoids retyping the same column name separated by an OR. The IN allows you to search the same column for a list of values. The IN list is a nice way to keep things easy and organized.

IN List vs. OR brings the same Results

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

An IN list is a better technique:
SELECT * FROM Student_Table
WHERE Grade_Pt IN (2.0, 3.0, 4.0) ;

Both examples produce the same results

SELECT *
FROM Student_Table
WHERE Grade_Pt = 2.0
OR Grade_Pt = 3.0
OR Grade_Pt = 4.0 ;

The IN Statement avoids retyping the same column name separated by an OR. The IN allows you to search the same column for a list of values. Both queries above are equal, but the IN list is a nice way to keep things easy and organized.
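A quick way to convince yourself the two forms are equal is to run both and compare the answer sets. The sketch below does that, assuming SQLite (via Python's sqlite3 module) as a stand-in for Vertica and loading only Last_Name and Grade_Pt.

```python
import sqlite3

# Assumption: SQLite stands in for Vertica; only Last_Name and Grade_Pt are loaded.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Last_Name TEXT, Grade_Pt REAL)")
conn.executemany(
    "INSERT INTO Student_Table VALUES (?, ?)",
    [("Larkins", 0.00), ("Wilson", 3.80), ("McRoberts", 1.90), ("Bond", 3.95),
     ("Hanson", 2.88), ("Smith", 2.00), ("Delaney", 3.35), ("Johnson", None),
     ("Thomas", 4.00), ("Phillips", 3.00)],
)

# The IN list form.
in_rows = [r[0] for r in conn.execute(
    "SELECT Last_Name FROM Student_Table "
    "WHERE Grade_Pt IN (2.0, 3.0, 4.0) ORDER BY Last_Name")]

# The same request written as a chain of ORs.
or_rows = [r[0] for r in conn.execute(
    "SELECT Last_Name FROM Student_Table "
    "WHERE Grade_Pt = 2.0 OR Grade_Pt = 3.0 OR Grade_Pt = 4.0 "
    "ORDER BY Last_Name")]

print(in_rows)
print(in_rows == or_rows)
```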

The IN List Can Use Character Data

SELECT *
FROM Student_Table
WHERE Last_Name IN ('Larkins', 'Bond') ;

Single quotes are used for character data

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
322133      Bond       Jimmy       JR          3.95
423400      Larkins    Michael     FR          0.00

The IN Statement avoids retyping the same column name separated by an OR. The IN allows you to search the same column for a list of values. This works with character data as long as you use single quotes.

Using a NOT IN List

SELECT *
FROM Student_Table
WHERE Grade_Pt NOT IN (2.0, 3.0, 4.0) ;

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
125634      Hanson     Henry       FR          2.88
231222      Wilson     Susie       SO          3.80
280023      McRoberts  Richard     JR          1.90
322133      Bond       Jimmy       JR          3.95
324652      Delaney    Danny       SR          3.35
423400      Larkins    Michael     FR          0.00

“First you imitate, then you innovate.” - Miles Davis

You can also ask to see the results that ARE NOT IN your parameter list. That requires the column name and a NOT IN. Neither the IN nor NOT IN can search for NULLs! Miles Davis got this IT quote all wrong. First you innovate, and then you sue anyone who imitates. Please make a note of it!

Null Values in a NOT IN List Bring Back No Rows

SELECT *
FROM Student_Table
WHERE Grade_Pt NOT IN (2.0, 3.0, 4.0, NULL) ;

Warning: When a NOT IN statement encounters a NULL, NO DATA RETURNS

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________

Few people know that when a NOT IN is used and a null value is encountered in the list, no data returns. This is because a comparison against a null value is never TRUE or FALSE; it is unknown. The system can never confirm that a Grade_Pt is different from the NULL in the list, so no row can qualify.

A Technique for Handling Nulls with a NOT IN List

SELECT *
FROM Student_Table
WHERE Grade_Pt NOT IN (2.0, 3.0, 4.0)
OR Grade_Pt IS NULL ;

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
423400      Larkins    Michael     FR          0.00
231222      Wilson     Susie       SO          3.80
280023      McRoberts  Richard     JR          1.90
322133      Bond       Jimmy       JR          3.95
125634      Hanson     Henry       FR          2.88
324652      Delaney    Danny       SR          3.35
260000      Johnson    Stanley     ?           ?

The null row now comes back

This is a great technique to look for a NULL when using a NOT IN List.
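The three NOT IN variations from the last few pages can be compared by counting the rows each one returns. The sketch below assumes SQLite (through Python's sqlite3 module) as a stand-in for Vertica, with Python's None playing the part of the null Grade_Pt for Stanley Johnson. The three-valued logic behind the result is standard SQL, but this is an illustration, not Vertica output.

```python
import sqlite3

# Assumption: SQLite stands in for Vertica; None is stored as NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Last_Name TEXT, Grade_Pt REAL)")
conn.executemany(
    "INSERT INTO Student_Table VALUES (?, ?)",
    [("Larkins", 0.00), ("Wilson", 3.80), ("McRoberts", 1.90), ("Bond", 3.95),
     ("Hanson", 2.88), ("Smith", 2.00), ("Delaney", 3.35), ("Johnson", None),
     ("Thomas", 4.00), ("Phillips", 3.00)],
)

def names(where_clause):
    """Return the Last_Name values that satisfy the given WHERE clause."""
    sql = f"SELECT Last_Name FROM Student_Table WHERE {where_clause} ORDER BY Last_Name"
    return [r[0] for r in conn.execute(sql)]

# A NULL inside the NOT IN list: the test is never TRUE, so nothing returns.
with_null = names("Grade_Pt NOT IN (2.0, 3.0, 4.0, NULL)")

# Plain NOT IN: the six non-matching grades return, but not the null row.
plain = names("Grade_Pt NOT IN (2.0, 3.0, 4.0)")

# NOT IN plus IS NULL: the null row is picked up as well.
plus_is_null = names("Grade_Pt NOT IN (2.0, 3.0, 4.0) OR Grade_Pt IS NULL")

print(len(with_null), len(plain), len(plus_is_null))
```

Johnson only shows up in the last query, which is the reason for adding the IS NULL test.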

BETWEEN is Inclusive

SELECT *
FROM Student_Table
WHERE Grade_Pt BETWEEN 2.0 AND 4.0 ;

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
125634      Hanson     Henry       FR          2.88
231222      Wilson     Susie       SO          3.80
324652      Delaney    Danny       SR          3.35
322133      Bond       Jimmy       JR          3.95
234121      Thomas     Wendy       FR          4.00
333450      Smith      Andy        SO          2.00
123250      Phillips   Martin      SR          3.00

2.0 and 4.0 come back in the answer set. The BETWEEN statement is therefore inclusive.

This is a BETWEEN. What this allows you to do is see if a column falls in a range. It is inclusive, meaning that in our example we will also be getting the rows that have a 2.0 or a 4.0 in their column!

NOT BETWEEN is Also Inclusive

SELECT *
FROM Student_Table
WHERE Grade_Pt NOT BETWEEN 2.0 AND 4.0 ;

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
280023      McRoberts  Richard     JR          1.90
423400      Larkins    Michael     FR          0.00

NOT BETWEEN is also inclusive

"The difference between genius and stupidity is that genius has its limits." - Albert Einstein

This is a NOT BETWEEN example. What this allows you to do is see if a column does not fall in a range. The range it rules out is inclusive, meaning that in our example we will be getting no rows where the Grade_Pt is between 2.0 and 4.0. The 2.0 and the 4.0 will also not return.
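Here is a small sketch that runs BETWEEN and NOT BETWEEN on the same range so the endpoints can be checked directly. It assumes SQLite (via Python's sqlite3 module) in place of Vertica and loads only the Grade_Pt values from Student_Table, including the one null.

```python
import sqlite3

# Assumption: SQLite stands in for Vertica; only Grade_Pt is loaded.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Grade_Pt REAL)")
conn.executemany(
    "INSERT INTO Student_Table VALUES (?)",
    [(0.00,), (3.80,), (1.90,), (3.95,), (2.88,),
     (2.00,), (3.35,), (None,), (4.00,), (3.00,)],
)

# BETWEEN keeps the endpoints 2.0 and 4.0.
between = [r[0] for r in conn.execute(
    "SELECT Grade_Pt FROM Student_Table "
    "WHERE Grade_Pt BETWEEN 2.0 AND 4.0 ORDER BY Grade_Pt")]

# NOT BETWEEN drops the whole range, endpoints included.
not_between = [r[0] for r in conn.execute(
    "SELECT Grade_Pt FROM Student_Table "
    "WHERE Grade_Pt NOT BETWEEN 2.0 AND 4.0 ORDER BY Grade_Pt")]

print(between)
print(not_between)
```

Notice that the null Grade_Pt appears in neither list, which matches the nine rows accounted for on the two pages above.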

LIKE uses Wildcards Percent ‘%’ and Underscore ‘_’

SELECT *
FROM Student_Table
WHERE Last_Name LIKE 'Sm%' ;

% is a wildcard for any number of characters when used with the LIKE command

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
333450      Smith      Andy        SO          2.00

Any Last_Name that starts with 'Sm' returns

The percent sign (%) is a wildcard for any number of characters. We are looking for anyone whose last name starts with 'Sm'! In this example, the only row that comes back is ‘Smith’. The next page will show an example of underscore.

LIKE command Underscore is Wildcard for one Character

SELECT *
FROM Student_Table
WHERE Last_Name LIKE '_a%' ;

Any Last_Name that has an 'a' in the second character qualifies
An Underscore _ is a wildcard for a single character when used with the LIKE command

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
125634      Hanson     Henry       FR          2.88
423400      Larkins    Michael     FR          0.00

The underscore sign (_) is a wildcard for any single character. We are looking for anyone who has an 'a' as the second letter of their last name.
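Both wildcards can be tried on the Last_Name column with the sketch below. It assumes SQLite (via Python's sqlite3 module) as a stand-in for Vertica. One difference to keep in mind is that SQLite's LIKE ignores case for ASCII letters by default, which does not affect these particular patterns and names.

```python
import sqlite3

# Assumption: SQLite stands in for Vertica; only Last_Name is loaded.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Last_Name TEXT)")
conn.executemany(
    "INSERT INTO Student_Table VALUES (?)",
    [("Larkins",), ("Wilson",), ("McRoberts",), ("Bond",), ("Hanson",),
     ("Smith",), ("Delaney",), ("Johnson",), ("Thomas",), ("Phillips",)],
)

# % matches any number of characters: names starting with 'Sm'.
starts_with_sm = [r[0] for r in conn.execute(
    "SELECT Last_Name FROM Student_Table "
    "WHERE Last_Name LIKE 'Sm%' ORDER BY Last_Name")]

# _ matches exactly one character: names with 'a' in the second position.
second_letter_a = [r[0] for r in conn.execute(
    "SELECT Last_Name FROM Student_Table "
    "WHERE Last_Name LIKE '_a%' ORDER BY Last_Name")]

print(starts_with_sm)
print(second_letter_a)
```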

LIKE Command Works Differently on Char Vs Varchar

SELECT *
FROM Student_Table
WHERE First_Name LIKE '%y' ;

Any First_Name that ends in 'y'

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
125634      Hanson     Henry       FR          2.88
234121      Thomas     Wendy       FR          4.00
260000      Johnson    Stanley     ?           ?
322133      Bond       Jimmy       JR          3.95
324652      Delaney    Danny       SR          3.35
333450      Smith      Andy        SO          2.00

It is important that you know the data type of the column you are using with your LIKE command. VARCHAR and CHAR data differ slightly, but Vertica handles them both wisely.

LIKE Command on Character Data Auto Trims

SELECT *
FROM Student_Table
WHERE Last_Name LIKE '%n' ;

Any Last_Name that ends in 'n'

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
125634      Hanson     Henry       FR          2.88
231222      Wilson     Susie       SO          3.80
260000      Johnson    Stanley     ?           ?

Last_Name is a CHAR (20) data type. That usually means that any value under 20 characters is padded with spaces behind it until it reaches 20 characters. You would normally not get any rows back from this example because technically no value ends in an ‘n’, but instead ends in a space. Vertica handles this for you, so the rows come back.

Quiz – What Data is Left Justified and what is Right?

SELECT *
FROM Sample_Table
WHERE Column1 IS NULL
AND Column2 IS NULL ;

Answer Set

Column1  Column2
_______  _______
      ?  ?

Column1 is Right Justified. Integers are Right Justified!
Column2 is Left Justified. Character Data is Left Justified!

Which Column from the Answer Set could have a DATA TYPE of INTEGER, and which could have Character Data?

Numbers are Right Justified and Character Data is Left

SELECT *
FROM Sample_Table
WHERE Column1 IS NULL
AND Column2 IS NULL ;

Answer Set

Column1  Column2
_______  _______
      ?  ?

Column1 is Right Justified. Integers are Right Justified!
Column2 is Left Justified. Character Data is Left Justified!

All Integers will start from the right and move left. Thus, Column1 was defined during the table create statement to hold an INTEGER. The next page shows a clear example.

Answer – What Data is Left Justified and what is Right?

SELECT Employee_No, First_Name
FROM Employee_Table
WHERE Employee_No = 2000000;

Answer Set

Employee_No  First_Name
___________  __________
    2000000  Squiggy

Integers are Right justified!
Characters are Left justified!

All Integers will start from the right and move left. All Character data will start from the left and move to the right.

An Example of Data with Left and Right Justification

SELECT Student_ID, Last_Name
FROM Student_Table ;

Student_ID  Last_Name
__________  __________
    423400  Larkins
    125634  Hanson
    280023  McRoberts
    260000  Johnson
    231222  Wilson
    234121  Thomas
    324652  Delaney
    123250  Phillips
    322133  Bond
    333450  Smith

Integers are Right justified!
Characters are Left justified!

This is how a standard result set will look. Notice that the integer type in Student_ID starts from the right and goes left. Character data type in Last_Name moves left to right like we are used to seeing when reading English.

A Visual of CHARACTER Data vs. VARCHAR Data

Character Data on Disk
Last_Name as a Char(20)

Jones _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Hanson _ _ _ _ _ _ _ _ _ _ _ _ _ _
McRoberts _ _ _ _ _ _ _ _ _ _ _
Johnson _ _ _ _ _ _ _ _ _ _ _ _ _

Spaces padded at the end

Varchar Data on Disk
Last_Name as a Varchar(20)
2-byte VLI (Variable Length Indicator)

0 5  Jones
0 6  Hanson
0 9  McRoberts
0 7  Johnson

No Spaces

Character data is padded with spaces to the right; Varchar stores a 2-byte VLI in front of each value instead.

Use the TRIM command to remove spaces on CHAR Data

Character Data on Disk: Last_Name as a Char(20), spaces padded at the end

Jones _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Hanson _ _ _ _ _ _ _ _ _ _ _ _ _ _
Wilson _ _ _ _ _ _ _ _ _ _ _ _ _ _
Johnson _ _ _ _ _ _ _ _ _ _ _ _ _

SELECT Last_Name
FROM Student_Table
WHERE TRIM (Last_Name) LIKE '%n' ;

Trim removes spaces at the front and back.

Last_Name
__________
Hanson
Wilson
Johnson

Last_Name has a Data Type of CHAR (20).

By using the TRIM command on the Last_Name column you are able to trim off any spaces from the end. Once we use the TRIM on Last_Name we have eliminated any spaces at the end, so now we are set to bring back anyone with a Last_Name that truly ends in ‘n’!
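The effect of the padding can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in engine, which is an assumption of this example and not Vertica itself. SQLite does not pad CHAR columns, so the names are padded by hand to imitate a CHAR(20).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Last_Name CHAR(20))")
names = ["Larkins", "Hanson", "McRoberts", "Johnson", "Wilson",
         "Thomas", "Delaney", "Phillips", "Bond", "Smith"]
# SQLite does not pad CHAR columns, so pad by hand to imitate CHAR(20).
conn.executemany("INSERT INTO Student_Table VALUES (?)",
                 [(n.ljust(20),) for n in names])

# Without TRIM the last character of every value is a space, never 'n'.
untrimmed = conn.execute(
    "SELECT Last_Name FROM Student_Table WHERE Last_Name LIKE '%n'").fetchall()

# With TRIM the padding is gone, so names that truly end in 'n' match.
trimmed = [row[0] for row in conn.execute(
    "SELECT TRIM(Last_Name) FROM Student_Table "
    "WHERE TRIM(Last_Name) LIKE '%n' ORDER BY 1")]

print(untrimmed)
print(trimmed)
```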

Escape Character in the LIKE Command changes Wildcards

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
125634      Hanson      Henry       FR          2.88
280023      McRoberts   Richard     JR          1.90
260000      Johnson     Stanley     ?           ?
231222      Wilson      Susie       SO          3.80
234121      Thomas      Wendy       FR          4.00
324652      Delaney     Danny       SR          3.35
123250      Phillips    Martin      SR          3.00
322133      Bond        Jimmy       JR          3.95
333450      Smith       Andy        SO          2.00
999999      T_          S%          FR          1.90

/* We just pretended to add a new row to the Student_Table */

/* Can you use the LIKE command to find S% above? */

Here you will have to utilize a Wildcard Escape Character. Turn the page for more.

Escape Characters Turn off Wildcards in the LIKE Command

Can you use the LIKE command to find S% in the Student_Table above?

SELECT *
FROM Student_Table
WHERE First_Name LIKE 'S@%' Escape '@';

We can pick our Escape character and we have chosen a @ sign character. This turns the wildcard off for 1 character, so we find ‘S%’ without bringing back Stanley or Susie.
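Here is a small runnable sketch of the same idea. It uses Python's built-in sqlite3 module as a stand-in engine (an assumption of this example, not Vertica), and only a handful of the First_Name values are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (First_Name VARCHAR(12))")
conn.executemany("INSERT INTO Student_Table VALUES (?)",
                 [("Stanley",), ("Susie",), ("Michael",), ("S%",)])

# Without ESCAPE, % is a wildcard: every name starting with S matches.
wildcard = sorted(r[0] for r in conn.execute(
    "SELECT First_Name FROM Student_Table WHERE First_Name LIKE 'S%'"))

# With ESCAPE '@', the % that follows @ is a literal percent sign.
literal = [r[0] for r in conn.execute(
    "SELECT First_Name FROM Student_Table "
    "WHERE First_Name LIKE 'S@%' ESCAPE '@'")]

print(wildcard)
print(literal)
```

Any character can serve as the escape character, as long as it does not otherwise appear in the pattern.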

Quiz – Turn off that Wildcard

Can you use the LIKE command to find the Last_Name of T_? (pronounced Tunderscore!)

This is a little trickier than you might think so be on your toes…. And get a haircut!

ANSWER – To Find that Wildcard

Can you use the LIKE command to find the Last_Name of T_? (pronounced Tunderscore!)

SELECT * FROM Student_Table WHERE TRIM(Last_Name) LIKE 'T@_' Escape '@' ;

You didn’t really need to get a full haircut, but just a TRIM Command and the Escape!

The Distinct Command

SELECT DISTINCT Class_Code
FROM Student_Table
ORDER BY 1 ;

The keyword DISTINCT won't allow duplicate Class_Code values to return.

   Class_Code
1  FR
2  JR
3  SO
4  SR
5  ?

DISTINCT eliminates duplicates from returning in the Answer Set.

Distinct vs. GROUP BY

SELECT Class_Code
FROM Student_Table
GROUP BY Class_Code ;

The GROUP BY statement is equivalent to the DISTINCT command seen previously.

   Class_Code
1  ?
2  FR
3  JR
4  SO
5  SR

Distinct and GROUP BY in the two examples return the same answer set.
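The equivalence can be checked with a short sketch. Python's built-in sqlite3 module stands in for the database here, which is an assumption of this example. Only the Student_ID and Class_Code columns are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student_Table (Student_ID INTEGER, Class_Code CHAR(2))")
conn.executemany("INSERT INTO Student_Table VALUES (?, ?)", [
    (423400, "FR"), (231222, "SO"), (280023, "JR"), (322133, "JR"),
    (125634, "FR"), (333450, "SO"), (324652, "SR"), (260000, None),
    (234121, "FR"), (123250, "SR"),
])

distinct_rows = conn.execute(
    "SELECT DISTINCT Class_Code FROM Student_Table").fetchall()
grouped_rows = conn.execute(
    "SELECT Class_Code FROM Student_Table GROUP BY Class_Code").fetchall()

# Compare as sets, because neither query promises an order without ORDER BY.
same = set(distinct_rows) == set(grouped_rows)
print(same, len(distinct_rows))
```

The NULL Class_Code counts as one value of its own, which is why both queries return five rows.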

Quiz – How many rows come back from the Distinct?

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

SELECT Distinct Class_Code, Grade_Pt
FROM Student_Table
ORDER BY Class_Code, Grade_Pt;

How many rows will come back from the above SQL?

Answer – How many rows come back from the Distinct?

SELECT Distinct Class_Code, Grade_Pt
FROM Student_Table
ORDER BY Class_Code, Grade_Pt ;

Class_Code  Grade_Pt
__________  ________
?           ?
FR          0.00
FR          2.88
FR          4.00
JR          1.90
JR          3.95
SO          2.00
SO          3.80
SR          3.00
SR          3.35

No Rows have the exact same values for both the Class_Code and Grade_Pt. Each row is Distinct!

How many rows will come back from the above SQL? 10. All rows came back. Why? Because there are no exact duplicates that contain a duplicate Class_Code and Duplicate Grade_Pt combined. Each row in the SELECT list is distinct.
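A quick sketch confirms the count. It uses Python's built-in sqlite3 module as a stand-in engine (an assumption of this example) and loads only the two columns that matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student_Table (Class_Code CHAR(2), Grade_Pt DECIMAL(5,2))")
conn.executemany("INSERT INTO Student_Table VALUES (?, ?)", [
    ("FR", 0.00), ("SO", 3.80), ("JR", 1.90), ("JR", 3.95), ("FR", 2.88),
    ("SO", 2.00), ("SR", 3.35), (None, None), ("FR", 4.00), ("SR", 3.00),
])

# DISTINCT applies to the whole SELECT list, not to each column by itself.
pairs = conn.execute(
    "SELECT DISTINCT Class_Code, Grade_Pt FROM Student_Table").fetchall()
codes = conn.execute(
    "SELECT DISTINCT Class_Code FROM Student_Table").fetchall()

print(len(pairs), len(codes))
```

With both columns in the SELECT list every row is unique, while Class_Code alone collapses to five values.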


Chapter 7 – Aggregation

“Vertica climbed Aggregate Mountain and delivered a better way to Sum It.” - Tera-Tom Coffing

Quiz – You calculate the Answer Set in your own Mind

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

SELECT Avg(Grade_Pt) AS "AVG"
      ,Count(Grade_Pt) AS "Count"
      ,Count(*) AS "Count *"
FROM Student_Table
WHERE Class_Code IS NULL ;

AVG    Count  Count *
_____  _____  _______

What would the result set be from the above query? The next slide shows answers!

Answer – You calculate the Answer Set in your own Mind

SELECT Avg(Grade_Pt) AS "AVG"
      ,Count(Grade_Pt) AS "Count"
      ,Count(*) AS "Count *"
FROM Student_Table
WHERE Class_Code IS NULL ;

Here are the correct answers:

AVG    Count  Count *
_____  _____  _______
?      0      1

Aggregates ignore Null values.

Quiz – You calculate the Answer Set in your own Mind

Aggregation_Table

Employee_No  Salary
___________  _________
423400       100000.00
423401       100000.00
423402       NULL

SELECT AVG(Salary) as "AVG"
      ,Count(Salary) as SalCnt
      ,Count(*) as RowCnt
FROM Aggregation_Table ;

AVG    SalCnt  RowCnt
_____  ______  ______

Please fill in the values you think will be in the Answer.

What would the result set be from the above query? The next slide shows answers!

Answer – You calculate the Answer Set in your own Mind

SELECT AVG(Salary) as "AVG"
      ,Count(Salary) as SalCnt
      ,Count(*) as RowCnt
FROM Aggregation_Table ;

Here are the correct answers:

AVG        SalCnt  RowCnt
_________  ______  ______
100000.00  2       3

Aggregates ignore Null values.

The 3 Rules of Aggregation

Aggregation_Table

Employee_No  Salary
___________  _________
423400       100000.00
423401       100000.00
423402       NULL

SELECT AVG(Salary) as "AVG"
      ,Count(Salary) as SalCnt
      ,Count(*) as RowCnt
FROM Aggregation_Table ;

1) Aggregates Ignore Null Values.
2) Aggregates WANT to come back in one row.
3) You CAN’T mix Aggregates with normal columns unless you use a GROUP BY.

AVG(Salary) = $100000.00
Count(Salary) = 2
Count(*) = 3

Follow the three rules of aggregation and you will make fewer mistakes.
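The first rule is the one that surprises people, so here is a small sketch of it. Python's built-in sqlite3 module is used as a stand-in engine, which is an assumption of this example. The second query, which aggregates only the NULL row, mirrors the earlier Student_Table quiz.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Aggregation_Table (Employee_No INTEGER, Salary DECIMAL(10,2))")
conn.executemany("INSERT INTO Aggregation_Table VALUES (?, ?)", [
    (423400, 100000.00), (423401, 100000.00), (423402, None),
])

# The NULL salary is skipped by AVG and COUNT(Salary), but COUNT(*) sees the row.
avg_sal, sal_cnt, row_cnt = conn.execute(
    "SELECT AVG(Salary), COUNT(Salary), COUNT(*) FROM Aggregation_Table"
).fetchone()

# When every value is NULL, AVG has nothing to average and returns NULL.
null_avg, null_cnt, null_rows = conn.execute(
    "SELECT AVG(Salary), COUNT(Salary), COUNT(*) "
    "FROM Aggregation_Table WHERE Salary IS NULL"
).fetchone()

print(avg_sal, sal_cnt, row_cnt)
print(null_avg, null_cnt, null_rows)
```

If the NULL were treated as zero, the average would have been 66666.67 instead of 100000.00.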

There are Five Aggregates

SELECT MIN (Salary)
      ,MAX (Salary)
      ,SUM (Salary)
      ,AVG (Salary)
      ,COUNT(*)
FROM Employee_Table ;

   MIN       MAX       SUM        AVG       COUNT
1  32800.50  64300.00  421039.38  46782.15  9

These are the five aggregates. Aggregates are designed to return a single row.

“Don’t count the days, make the days count.” – Muhammad Ali

The five aggregates are listed above. Muhammad Ali was way off in his quote. He meant to say, "Don't you count the days, but instead make the data count for you".
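The five aggregates can be run against the nine salaries of the Employee_Table with a short sketch. Python's built-in sqlite3 module is a stand-in engine here (an assumption of this example), and Salary is declared REAL so the values stay floating point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Employee_No INTEGER, Salary REAL)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?)", [
    (2000000, 32800.50), (1000234, 64300.00), (1232578, 48850.00),
    (1324657, 41888.88), (1333454, 48000.00), (2312225, 40200.00),
    (1121334, 54500.00), (2341218, 36000.00), (1256349, 54500.00),
])

rows = conn.execute(
    "SELECT MIN(Salary), MAX(Salary), SUM(Salary), AVG(Salary), COUNT(*) "
    "FROM Employee_Table").fetchall()

# Five columns come back, but only a single row.
min_sal, max_sal, sum_sal, avg_sal, row_cnt = rows[0]
print(len(rows), min_sal, max_sal, round(sum_sal, 2), round(avg_sal, 2), row_cnt)
```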

Quiz – How many rows come back?

SELECT MIN (Salary) AS Min
      ,MAX (Salary) AS Max
      ,SUM (Salary) AS Sum
      ,AVG (Salary) AS AVG
      ,COUNT(*) COUNT
FROM Employee_Table ;

How many rows will the above query produce in the result set?

1) How many columns return?
2) How many rows return?

Answer – How many rows come back?

SELECT MIN (Salary)
      ,MAX (Salary)
      ,SUM (Salary)
      ,AVG (Salary)
      ,COUNT(*)
FROM Employee_Table ;

   MIN       MAX       SUM        AVG       COUNT
1  32800.50  64300.00  421039.38  46782.15  9

These are the five aggregates. Aggregates are designed to return a single row.

How many rows will the above query produce in the result set? The answer is one.

Troubleshooting Aggregates

SELECT Dept_No
      ,MIN (Salary) AS Min
      ,MAX (Salary) AS Max
      ,SUM (Salary) AS Sum
      ,AVG (Salary) AS AVG
      ,COUNT(*) COUNT
FROM Employee_Table ;

What happens when you mix normal columns with columns that are aggregated?

If you have a normal column (not aggregated) in your query, you must have a corresponding GROUP BY statement.

GROUP BY when Aggregates and Normal Columns Mix

A NON-Aggregate column in the SELECT list means a GROUP BY is needed.

If you have a normal column (not aggregated) in your query, you must have a corresponding GROUP BY statement.

GROUP BY delivers one row per Group

SELECT Dept_No
      ,MIN (Salary)
      ,MAX (Salary)
      ,SUM (Salary)
      ,AVG (Salary)
      ,Count(*)
FROM Employee_Table
GROUP BY Dept_No
ORDER BY Dept_No ;

Dept_No is the NON-Aggregate column, so a GROUP BY is needed.

Dept_No  Min(Salary)  Max(Salary)  Sum(Salary)  AVG(Salary)  Count(*)
_______  ___________  ___________  ___________  ___________  ________
10       64300.00     64300.00     64300.00     64300.00     1
100      48850.00     48850.00     48850.00     48850.00     1
200      41888.88     48000.00     89888.88     44944.44     2
300      40200.00     40200.00     40200.00     40200.00     1
400      36000.00     54500.00     145000.00    48333.33     3
?        32800.50     32800.50     32800.50     32800.50     1

The GROUP BY Dept_No clause allows the Aggregates to be calculated per Dept_No. The data has also been sorted with the ORDER BY statement.

GROUP BY Dept_No or GROUP BY 1 the same thing

SELECT Dept_No
      ,MIN (Salary)
      ,MAX (Salary)
      ,SUM (Salary)
      ,AVG (Salary)
      ,Count(*)
FROM Employee_Table
GROUP BY Dept_No
ORDER BY Dept_No;

SELECT Dept_No
      ,MIN (Salary)
      ,MAX (Salary)
      ,SUM (Salary)
      ,AVG (Salary)
      ,Count(*)
FROM Employee_Table
GROUP BY 1
ORDER BY 1;

Both Queries are exactly the same.

Dept_No  Min(Salary)  Max(Salary)  Sum(Salary)  AVG(Salary)  Count(*)
_______  ___________  ___________  ___________  ___________  ________
?        32800.50     32800.50     32800.50     32800.50     1
10       64300.00     64300.00     64300.00     64300.00     1
100      48850.00     48850.00     48850.00     48850.00     1
200      41888.88     48000.00     89888.88     44944.44     2
300      40200.00     40200.00     40200.00     40200.00     1
400      36000.00     54500.00     145000.00    48333.33     3

Both queries above produce the same result. The GROUP BY allows you to either name the column or use its position number in the SELECT list, just like the ORDER BY.
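The two forms can be compared with a small sketch. It runs on Python's built-in sqlite3 module as a stand-in engine, which is an assumption of this example. Where the NULL Dept_No sorts (first or last) depends on the engine, so the position of that row may differ on your system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Dept_No INTEGER, Salary REAL)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?)", [
    (None, 32800.50), (10, 64300.00), (100, 48850.00), (200, 41888.88),
    (200, 48000.00), (300, 40200.00), (400, 54500.00), (400, 36000.00),
    (400, 54500.00),
])

select_list = "SELECT Dept_No, MIN(Salary), MAX(Salary), SUM(Salary), COUNT(*) "

# Group and sort by column name.
by_name = conn.execute(
    select_list + "FROM Employee_Table GROUP BY Dept_No ORDER BY Dept_No"
).fetchall()

# Group and sort by position in the SELECT list.
by_position = conn.execute(
    select_list + "FROM Employee_Table GROUP BY 1 ORDER BY 1"
).fetchall()

print(by_name == by_position, len(by_name))
```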

Limiting Rows and Improving Performance with WHERE

SELECT Dept_No
      ,MAX (Salary) AS Max
      ,SUM (Salary) AS Sum
      ,AVG (Salary) AS AVG
FROM Employee_Table
WHERE Dept_No IN (200, 400)
GROUP BY Dept_No ;

   Dept_No  MAX       SUM       AVG
1  200      48000.00  89888.88  44944.44
2  400      54500.00  145000    48333.33

Calculations are only done on rows WHERE Dept_No Equals 200 or 400.

Will Dept_No 300 be calculated? Of course you know it will…NOT!

WHERE Clause in Aggregation limits unneeded Calculations

SELECT Dept_No
      ,MAX (Salary) AS Max
      ,SUM (Salary) AS Sum
      ,AVG (Salary) AS AVG
FROM Employee_Table
WHERE Dept_No IN (200, 400)
GROUP BY Dept_No ;

The WHERE clause is a filter that will speed up the query.

The system avoids reading any Dept_No other than 200 and 400. This means that only rows with a Dept_No of 200 or 400 will come off the disk to be calculated.

Keyword HAVING tests Aggregates after they are totaled

SELECT Dept_No
      ,MAX (Salary) AS Max
      ,SUM (Salary) AS Sum
      ,AVG (Salary) AS AVG
FROM Employee_Table
WHERE Dept_No IN (200, 400)
GROUP BY Dept_No
HAVING AVG(Salary) > 45000 ;

   Dept_No  MAX       SUM     AVG
1  400      54500.00  145000  48333.33

HAVING filters rows after the rows are aggregated.

The HAVING Clause only works on Aggregate Totals. The WHERE filters rows to be excluded from calculation, but the HAVING filters the Aggregate totals after the calculations, thus eliminating certain Aggregate totals.
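The order of the two filters can be seen in a short sketch. Python's built-in sqlite3 module is a stand-in engine here, which is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Dept_No INTEGER, Salary REAL)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?)", [
    (None, 32800.50), (10, 64300.00), (100, 48850.00), (200, 41888.88),
    (200, 48000.00), (300, 40200.00), (400, 54500.00), (400, 36000.00),
    (400, 54500.00),
])

# WHERE removes rows before any totals are calculated.
before_having = conn.execute(
    "SELECT Dept_No, AVG(Salary) FROM Employee_Table "
    "WHERE Dept_No IN (200, 400) GROUP BY Dept_No ORDER BY Dept_No"
).fetchall()

# HAVING then removes whole groups based on their totals.
after_having = conn.execute(
    "SELECT Dept_No, AVG(Salary) FROM Employee_Table "
    "WHERE Dept_No IN (200, 400) GROUP BY Dept_No "
    "HAVING AVG(Salary) > 45000 ORDER BY Dept_No"
).fetchall()

print([dept for dept, _ in before_having])
print([dept for dept, _ in after_having])
```

Dept_No 200 survives the WHERE clause but is dropped by HAVING, because its average of 44944.44 is not above 45000.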

Keyword HAVING is like an Extra WHERE Clause for totals

SELECT Dept_No, MIN (Salary), MAX (Salary), SUM (Salary), AVG (Salary), COUNT(*)
FROM Employee_Table
WHERE Dept_No in (200, 400)
GROUP BY Dept_No
HAVING Count(*) > 2 ;

HAVING clause acts as a filter on all aggregates after they are totaled.

Previous Answer Set (without HAVING statement)

Dept_No  Min(Salary)  Max(Salary)  Sum(Salary)  AVG(Salary)  Count(*)
_______  ___________  ___________  ___________  ___________  ________
200      41888.88     48000.00     89888.88     44944.44     2
400      36000.00     54500.00     145000.00    48333.33     3

New Answer Set using the HAVING Statement

Dept_No  Min(Salary)  Max(Salary)  Sum(Salary)  AVG(Salary)  Count(*)
_______  ___________  ___________  ___________  ___________  ________
400      36000.00     54500.00     145000.00    48333.33     3

The HAVING Clause only works on Aggregate Totals, and in the above example, only groups with Count(*) > 2 can return.

Keyword HAVING tests Aggregates after they are totaled

SELECT Dept_No
      ,MIN (Salary) as "Min"
      ,MAX (Salary) as "Max"
      ,SUM (Salary) as "Sum"
      ,AVG (Salary) as "Avg"
      ,COUNT(*) as "Count"
FROM Employee_Table
WHERE Dept_No IN (200, 400)
GROUP BY Dept_No
HAVING AVG(Salary) > 45000
ORDER BY 1 ;

HAVING Clause acts as a filter on the totals after the Calculations are done.

Dept_No  Min       Max       Sum        Avg       Count
_______  ________  ________  _________  ________  _____
400      36000.00  54500.00  145000.00  48333.33  3

Dept_No 200 is totaled as well, but its average of 44944.44 is not greater than 45000, so the HAVING clause removes it from the answer set.

The HAVING Clause only works on Aggregate Totals. The WHERE filters rows to be excluded from calculation, but the HAVING filters the Aggregate totals after the calculations, thus eliminating certain Aggregate totals.

Getting the Average Values per Column

SELECT 'Product_ID' AS "Column Name"
      ,CAST(COUNT(*) / COUNT(DISTINCT(Product_ID)) as Decimal (5,2)) AS "Avg Rows"
      ,'Sale_Date' AS "Column Name2"
      ,CAST(COUNT(*) / COUNT(DISTINCT(Sale_Date)) as Decimal (5,2)) AS "Avg Rows2"
FROM Sales_Table ;

   Column Name  Avg Rows  Column Name2  Avg Rows2
1  Product_ID   7.00      Sale_Date     3.00

The query retrieved the average rows per value for the columns Product_ID and Sale_Date.
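The arithmetic behind the two numbers can be sketched as follows. The Sales_Table rows below are hypothetical (3 products on each of 7 dates), chosen only so the totals match the answer set. Python's built-in sqlite3 module is a stand-in engine, and because SQLite divides integers as integers, the count is multiplied by 1.0 in this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT)")

# Hypothetical data: 3 products sold on each of 7 dates gives 21 rows.
dates = ["2000-09-28", "2000-09-29", "2000-09-30", "2000-10-01",
         "2000-10-02", "2000-10-03", "2000-10-04"]
conn.executemany(
    "INSERT INTO Sales_Table VALUES (?, ?)",
    [(product, day) for product in (1000, 2000, 3000) for day in dates])

# Rows divided by distinct values = average number of rows per value.
avg_per_product, avg_per_date = conn.execute(
    "SELECT COUNT(*) * 1.0 / COUNT(DISTINCT Product_ID), "
    "       COUNT(*) * 1.0 / COUNT(DISTINCT Sale_Date) "
    "FROM Sales_Table").fetchone()

print(avg_per_product, avg_per_date)
```

Each Product_ID appears on 7 rows on average, and each Sale_Date appears on 3 rows.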

GROUP BY Rollup

SELECT Product_ID
      ,EXTRACT (MONTH FROM Sale_Date) AS MTH
      ,EXTRACT (YEAR FROM Sale_Date) AS YR
      ,SUM(Daily_Sales) AS SUM_Daily_Sales
FROM Sales_Table
GROUP BY ROLLUP (Product_ID, MTH, YR)
ORDER BY Product_ID Desc, MTH Desc, YR Desc;

GROUP BY ROLLUP (Product_ID, MTH, YR) totals the Daily_Sales at each level of the list, from right to left: for each Product_ID per month per year, for each Product_ID per month, for each Product_ID, plus a grand total.

GROUP BY Rollup Result Set

Product_ID  MTH  YR    SUM_Daily_Sales
__________  ___  ____  _______________
3000        10   2000  84908.06
3000        10   ?     84908.06
3000        9    2000  139679.76
3000        9    ?     139679.76
3000        ?    ?     224587.82
2000        10   2000  166872.90
2000        10   ?     166872.90
2000        9    2000  139738.91
2000        9    ?     139738.91
2000        ?    ?     306611.81
1000        10   2000  191854.03
1000        10   ?     191854.03
1000        9    2000  139350.69
1000        9    ?     139350.69
1000        ?    ?     331204.72
?           ?    ?     862404.35

This is the full result set from the previous GROUP BY ROLLUP query.
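The levels in that result set can be reproduced without the ROLLUP keyword. The sketch below is an emulation and not the Vertica syntax: it uses Python's built-in sqlite3 module, which has no ROLLUP, so each level is a separate GROUP BY glued together with UNION ALL. The Monthly_Sales table is hypothetical and holds one row per product and month, using the monthly totals from the result set above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one pre-summed row per product and month.
conn.execute(
    "CREATE TABLE Monthly_Sales "
    "(Product_ID INTEGER, MTH INTEGER, YR INTEGER, Sales REAL)")
conn.executemany("INSERT INTO Monthly_Sales VALUES (?, ?, ?, ?)", [
    (1000, 9, 2000, 139350.69), (1000, 10, 2000, 191854.03),
    (2000, 9, 2000, 139738.91), (2000, 10, 2000, 166872.90),
    (3000, 9, 2000, 139679.76), (3000, 10, 2000, 84908.06),
])

# ROLLUP (Product_ID, MTH, YR) is the union of four grouping levels.
rollup = conn.execute("""
    SELECT Product_ID, MTH, YR, SUM(Sales) FROM Monthly_Sales
    GROUP BY Product_ID, MTH, YR
    UNION ALL
    SELECT Product_ID, MTH, NULL, SUM(Sales) FROM Monthly_Sales
    GROUP BY Product_ID, MTH
    UNION ALL
    SELECT Product_ID, NULL, NULL, SUM(Sales) FROM Monthly_Sales
    GROUP BY Product_ID
    UNION ALL
    SELECT NULL, NULL, NULL, SUM(Sales) FROM Monthly_Sales
""").fetchall()

# Index the totals by their (Product_ID, MTH, YR) key for easy lookup.
totals = {(p, m, y): s for p, m, y, s in rollup}
print(len(rollup), round(totals[(None, None, None)], 2))
```

The NULLs in the key play the same role as the ? marks in the result set: they mark the column that has been rolled up.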


Chapter 8 – Join Functions

“When spider webs unite they can tie up a lion.” - African Proverb

A Two-Table Join Using Traditional Syntax

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

SELECT Customer_Table.Customer_Number
      ,Customer_Name
      ,Order_Number
      ,Order_Total
FROM Customer_Table,
     Order_Table
WHERE Customer_Table.Customer_Number = Order_Table.Customer_Number ;

The column Customer_Number is in both tables. It must be fully qualified with the table name or it errors.

Customer_Number is the column that has matching data in both tables. This is called the "Join Condition".

A Join combines columns on the report from more than one table. The example above joins the Customer_Table and the Order_Table together. The most complicated part of any join is the JOIN CONDITION. The JOIN CONDITION is which Column from each table is a match. In this case, Customer_Number is a match that establishes the relationship, so this join will happen on matching Customer_Number columns.

A two-table join using Non-ANSI Syntax with Table Alias

SELECT Cust.Customer_Number
      ,Customer_Name
      ,Order_Number
      ,Order_Total
FROM Customer_Table as Cust,
     Order_Table as ORD
WHERE Cust.Customer_Number = Ord.Customer_Number;

The column Customer_Number is in both tables. It must be fully qualified or it errors.

We alias the table names to shorten the typing when fully qualifying a column.

A Join combines columns on the report from more than one table. The example above joins the Customer_Table and the Order_Table together. The most complicated part of any join is the JOIN CONDITION. The JOIN CONDITION means which Column from each table is a match. In this case, Customer_Number is a match that establishes the relationship.

You Can Fully Qualify All Columns

SELECT Cust.Customer_Number
      ,Cust.Customer_Name
      ,Ord.Order_Number
      ,Ord.Order_Total
FROM Customer_Table as Cust,
     Order_Table as ORD
WHERE Cust.Customer_Number = Ord.Customer_Number ;

The column Customer_Number is in both tables. It must be fully qualified or it errors.

A good practice is to fully qualify all columns in the SELECT list for clarity to other users.

Whenever a column is in both tables, you must fully qualify it when doing a join. You don't have to fully qualify columns that are only in one of the tables because the system knows which table that particular column is in. You can choose to fully qualify every column if you like. This is a good practice because it is more apparent which columns belong to which tables for anyone else looking at your SQL.

A two-table join using ANSI Syntax

SELECT Cust.Customer_Name, Order_Number, Order_Total
FROM Customer_Table Cust
INNER JOIN
     Order_Table ORD
ON Cust.Customer_Number = Ord.Customer_Number ;

   Customer_Name        Order_Number  Order_Total
1  Billy's Best Choice  123456        12347.53
2  Billy's Best Choice  123512        8005.91
3  ACE Consulting       123552        5111.47
4  XYZ Plumbing         123777        23454.84
5  Databases N-U        123585        15231.62

This is the same join as the previous slide except it is using ANSI syntax. Both will return the same rows with the same performance. Rows are joined when the Customer_Number matches on both tables, but non-matches won’t return.

Both Queries have the same Results and Performance

Traditional Syntax

SELECT Cust.Customer_Number, Customer_Name, Order_Number, Order_Total
FROM Customer_Table as Cust,
     Order_Table as ORD
WHERE Cust.Customer_Number = Ord.Customer_Number ;

ANSI Syntax

SELECT Cust.Customer_Number, Customer_Name, Order_Number, Order_Total
FROM Customer_Table as Cust
INNER JOIN
     Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number ;

Both of these syntax techniques bring back the same result set and have the same performance. The INNER JOIN is considered ANSI. Which one does Outer Joins?
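That the two spellings return the same rows can be checked with a short sketch. Python's built-in sqlite3 module is a stand-in engine here (an assumption of this example), and the ORDER BY is added only so the two answer sets can be compared row by row. The sketch says nothing about performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT)")
conn.execute(
    "CREATE TABLE Order_Table "
    "(Order_Number INTEGER, Customer_Number INTEGER, Order_Total REAL)")
conn.executemany("INSERT INTO Customer_Table VALUES (?, ?)", [
    (11111111, "Billy's Best Choice"), (31313131, "Acme Products"),
    (31323134, "ACE Consulting"), (57896883, "XYZ Plumbing"),
    (87323456, "Databases N-U"),
])
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", [
    (123456, 11111111, 12347.53), (123512, 11111111, 8005.91),
    (123552, 31323134, 5111.47), (123585, 87323456, 15231.62),
    (123777, 57896883, 23454.84),
])

columns = "SELECT Cust.Customer_Number, Customer_Name, Order_Number, Order_Total "

# Traditional syntax: comma between tables, join condition in WHERE.
traditional = conn.execute(
    columns + "FROM Customer_Table AS Cust, Order_Table AS Ord "
    "WHERE Cust.Customer_Number = Ord.Customer_Number "
    "ORDER BY Order_Number").fetchall()

# ANSI syntax: INNER JOIN between tables, join condition in ON.
ansi = conn.execute(
    columns + "FROM Customer_Table AS Cust "
    "INNER JOIN Order_Table AS Ord "
    "ON Cust.Customer_Number = Ord.Customer_Number "
    "ORDER BY Order_Number").fetchall()

print(traditional == ansi, len(ansi))
```

Acme Products has no orders, so it appears in neither answer set.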

Quiz – Can You Finish the Join Syntax?

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table as E
INNER JOIN
     Department_Table as D
ON   (Finish the Join)

Finish this join by placing the missing SQL in the proper place!

Answer to Quiz – Can You Finish the Join Syntax?

SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table as E
INNER JOIN
     Department_Table as D
ON E.Dept_No = D.Dept_No ;

This query is ready to run.

Dept_No is the column that both tables have in common. This is called a Primary Key/Foreign Key relationship.

Quiz – Can You Find the Error?

SELECT First_Name
      ,Last_Name
      ,Dept_No
      ,Department_Name
FROM Employee_Table as E
INNER JOIN
     Department_Table as D
ON E.Dept_No = D.Dept_No ;

This query has an error! Can you find it?

Answer to Quiz – Can You Find the Error?

SELECT First_Name
      ,Last_Name
      ,E.Dept_No
      ,Department_Name
FROM Employee_Table as E
INNER JOIN
     Department_Table as D
ON E.Dept_No = D.Dept_No ;

The column Dept_No is in both tables. It needs to be fully qualified as E.Dept_No or D.Dept_No.

If a column in the SELECT list is in both tables, you must fully qualify it.
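The error is easy to trigger in a small sketch. Python's built-in sqlite3 module is the stand-in engine here, which is an assumption of this example, so the exact error text on your system will differ. Only a few rows are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee_Table "
    "(Employee_No INTEGER, Dept_No INTEGER, Last_Name TEXT, First_Name TEXT)")
conn.execute(
    "CREATE TABLE Department_Table (Dept_No INTEGER, Department_Name TEXT)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?, ?, ?)", [
    (1232578, 100, "Chambers", "Mandee"), (1324657, 200, "Coffing", "Billy"),
])
conn.executemany("INSERT INTO Department_Table VALUES (?, ?)", [
    (100, "Marketing"), (200, "Research and Dev"),
])

join = ("FROM Employee_Table AS E INNER JOIN Department_Table AS D "
        "ON E.Dept_No = D.Dept_No")

# Dept_No without a qualifier: the engine cannot tell which table is meant.
error = None
try:
    conn.execute("SELECT First_Name, Last_Name, Dept_No, Department_Name " + join)
except sqlite3.OperationalError as exc:
    error = str(exc)

# Qualifying the column as E.Dept_No removes the ambiguity.
fixed = conn.execute(
    "SELECT First_Name, Last_Name, E.Dept_No, Department_Name " + join
).fetchall()

print(error)
print(len(fixed))
```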

Super Quiz – Can You Find the Difficult Error?

SELECT First_Name
      ,Last_Name
      ,E.Dept_No
      ,Department_Name
FROM Employee_Table as E
INNER JOIN
     Department_Table as D
ON Employee_Table.Dept_No = D.Dept_No ;

This query has an error! Can you find it?

Answer to Super Quiz – Can You Find the Difficult Error?

SELECT First_Name, Last_Name, E.Dept_No
      ,Department_Name
FROM Employee_Table as E
INNER JOIN
     Department_Table as D
ON Employee_Table.Dept_No = D.Dept_No ;

Once you alias a table (as E), you must fully qualify with E.Dept_No (not Employee_Table.Dept_No). This query thinks there are three tables (E, D, and Employee_Table).

If a column in the SELECT list is in both tables, you must fully qualify it.
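A small sketch shows the alias rule in action. It uses Python's built-in sqlite3 module as a stand-in engine, which is an assumption of this example. This engine simply rejects the original table name once it has been aliased; other systems may react differently to the same mistake.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee_Table (Dept_No INTEGER, First_Name TEXT)")
conn.execute(
    "CREATE TABLE Department_Table (Dept_No INTEGER, Department_Name TEXT)")
conn.execute("INSERT INTO Employee_Table VALUES (100, 'Mandee')")
conn.execute("INSERT INTO Department_Table VALUES (100, 'Marketing')")

# The table was aliased as E, but the ON clause uses the original name.
error = None
try:
    conn.execute(
        "SELECT First_Name, E.Dept_No, Department_Name "
        "FROM Employee_Table AS E INNER JOIN Department_Table AS D "
        "ON Employee_Table.Dept_No = D.Dept_No")
except sqlite3.OperationalError as exc:
    error = str(exc)

# Using the alias everywhere fixes the query.
fixed = conn.execute(
    "SELECT First_Name, E.Dept_No, Department_Name "
    "FROM Employee_Table AS E INNER JOIN Department_Table AS D "
    "ON E.Dept_No = D.Dept_No").fetchall()

print(error)
print(fixed)
```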

Quiz – Which rows from both tables won’t return?

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT E.First_Name
      ,E.Last_Name
      ,D.Department_Name
FROM Employee_Table as E
INNER JOIN
     Department_Table as D
ON E.Dept_No = D.Dept_No ;

This inner join will return all rows that have a matching Dept_No in both tables. Which rows won't return?

An Inner Join returns matching rows, but did you know an Outer Join returns both matching rows and nonmatching rows? You will understand soon!


Answer to Quiz – Which rows from both tables won’t return?

SELECT E.First_Name ,E.Last_Name ,D.Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

1. Squiggy Jones has a NULL Dept_No.
2. Richard Smythe has an invalid Dept_No (10).
3. No employees work in Department 500.

The bottom line is that the three rows excluded did not have a matching Dept_No.
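One way to check the answer is to run the join against the sample rows. The sketch below is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for Vertica, loads just the columns the query touches, and the variable names are assumptions of this example rather than anything from the book.

```python
import sqlite3

# In-memory SQLite stands in for Vertica; only the columns the query needs are loaded.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee_Table (First_Name TEXT, Last_Name TEXT, Dept_No INTEGER);
CREATE TABLE Department_Table (Dept_No INTEGER, Department_Name TEXT);
INSERT INTO Employee_Table VALUES
  ('Squiggy', 'Jones', NULL), ('Richard', 'Smythe', 10),
  ('Mandee', 'Chambers', 100), ('Billy', 'Coffing', 200),
  ('John', 'Smith', 200), ('Loraine', 'Larkins', 300),
  ('Cletus', 'Strickling', 400), ('William', 'Reilly', 400),
  ('Herbert', 'Harrison', 400);
INSERT INTO Department_Table VALUES
  (100, 'Marketing'), (200, 'Research and Dev'), (300, 'Sales'),
  (400, 'Customer Support'), (500, 'Human Resources');
""")

# The inner join from the quiz: only rows with a matching Dept_No survive.
matched = conn.execute("""
    SELECT E.First_Name, E.Last_Name, D.Department_Name
    FROM Employee_Table AS E
    INNER JOIN Department_Table AS D
      ON E.Dept_No = D.Dept_No
""").fetchall()

# Employees the join leaves out (no department with the same Dept_No).
orphan_employees = conn.execute("""
    SELECT First_Name FROM Employee_Table AS E
    WHERE NOT EXISTS (SELECT 1 FROM Department_Table AS D
                      WHERE D.Dept_No = E.Dept_No)
    ORDER BY First_Name
""").fetchall()

# Departments the join leaves out (no employee with the same Dept_No).
empty_departments = conn.execute("""
    SELECT Department_Name FROM Department_Table AS D
    WHERE NOT EXISTS (SELECT 1 FROM Employee_Table AS E
                      WHERE E.Dept_No = D.Dept_No)
""").fetchall()

print(len(matched), orphan_employees, empty_departments)
```

Seven of the nine employees come back from the join. The two NOT EXISTS queries list the excluded rows on each side: Richard and Squiggy from Employee_Table, and Human Resources from Department_Table.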


LEFT OUTER JOIN

SELECT E.First_Name ,D.Department_Name
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

The 1st table after FROM is always the LEFT table. Since we are doing a Left Outer Join, the Employee_Table is referred to as the outer table.

This is a LEFT OUTER JOIN. That means that all rows from the LEFT table will appear in the report regardless of whether they find a match in the right table.


LEFT OUTER JOIN Results

SELECT E.First_Name ,D.Department_Name
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

First_Name  Department_Name
__________  ________________
Mandee      Marketing
Herbert     Customer Support
William     Customer Support
Loraine     Sales
Squiggy     ?
Richard     ?
Cletus      Customer Support
Billy       Research and Dev
John        Research and Dev

Nulls show mismatches.

The matching rows return just like an inner join, but orphaned rows from the Left table also return.

A LEFT Outer Join returns all rows from the LEFT table, including all matches. If a LEFT row can’t find a match, a NULL is placed in the right table's columns!
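The same result can be reproduced with a few lines of code. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for Vertica (the LEFT OUTER JOIN syntax is the same); the trimmed-down tables and the variable names are choices made for this example.

```python
import sqlite3

# SQLite is only a stand-in for Vertica here; the join syntax is unchanged.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee_Table (First_Name TEXT, Dept_No INTEGER);
CREATE TABLE Department_Table (Dept_No INTEGER, Department_Name TEXT);
INSERT INTO Employee_Table VALUES
  ('Squiggy', NULL), ('Richard', 10), ('Mandee', 100), ('Billy', 200),
  ('John', 200), ('Loraine', 300), ('Cletus', 400), ('William', 400),
  ('Herbert', 400);
INSERT INTO Department_Table VALUES
  (100, 'Marketing'), (200, 'Research and Dev'), (300, 'Sales'),
  (400, 'Customer Support'), (500, 'Human Resources');
""")

rows = conn.execute("""
    SELECT E.First_Name, D.Department_Name
    FROM Employee_Table AS E
    LEFT OUTER JOIN Department_Table AS D
      ON E.Dept_No = D.Dept_No
""").fetchall()

# A SQL NULL arrives in Python as None, so orphaned left rows are easy to pick out.
orphans = sorted(name for name, dept in rows if dept is None)
print(len(rows), orphans)
```

All nine employees come back, and the two without a matching department (Richard and Squiggy) carry a NULL Department_Name.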


RIGHT OUTER JOIN

SELECT E.First_Name ,D.Department_Name
FROM Employee_Table as E
RIGHT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

The 2nd table after FROM is always the RIGHT table. Since we are doing a Right Outer Join, the Department_Table is referred to as the outer table.

This is a RIGHT OUTER JOIN. That means that all rows from the RIGHT table will appear in the report regardless of whether they find a match in the LEFT table.


RIGHT OUTER JOIN Example and Results

SELECT E.First_Name ,D.Department_Name
FROM Employee_Table as E
RIGHT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

First_Name  Department_Name
__________  ________________
Mandee      Marketing
Herbert     Customer Support
William     Customer Support
Loraine     Sales
Cletus      Customer Support
Billy       Research and Dev
John        Research and Dev
?           Human Resources

Nulls show mismatches.

The matching rows return just like an inner join, but orphaned rows from the Right table also return.

All rows from the Right table were returned with matches, but since Dept_No 500 didn’t have a match, the system put a NULL value in the Left table's columns.


FULL OUTER JOIN

SELECT E.First_Name ,D.Department_Name
FROM Employee_Table as E
FULL OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

Since we are doing a Full Outer Join, both tables are outer tables.

This is a FULL OUTER JOIN. That means that all rows from both the RIGHT and LEFT tables will appear in the report regardless of whether they find a match.


FULL OUTER JOIN Results

SELECT E.First_Name ,D.Department_Name
FROM Employee_Table as E
FULL OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

First_Name  Department_Name
__________  ________________
Mandee      Marketing
Herbert     Customer Support
William     Customer Support
Loraine     Sales
Squiggy     ?
Richard     ?
Cletus      Customer Support
Billy       Research and Dev
John        Research and Dev
?           Human Resources

The FULL Outer Join returns all rows from both tables. NULLs show the flaws!

All rows return from both tables on a Full Outer Join.
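A Full Outer Join can also be thought of as a Left Outer Join plus the right-table rows that found no match. The sketch below spells that out. It is an emulation written for this example, not how Vertica executes the query: it runs on Python's built-in sqlite3 module (older SQLite builds do not accept FULL OUTER JOIN), so the UNION ALL form is a swapped-in technique. In Vertica you would write FULL OUTER JOIN exactly as shown above.

```python
import sqlite3

# SQLite stands in for Vertica; the FULL OUTER JOIN is emulated, not native.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee_Table (First_Name TEXT, Dept_No INTEGER);
CREATE TABLE Department_Table (Dept_No INTEGER, Department_Name TEXT);
INSERT INTO Employee_Table VALUES
  ('Squiggy', NULL), ('Richard', 10), ('Mandee', 100), ('Billy', 200),
  ('John', 200), ('Loraine', 300), ('Cletus', 400), ('William', 400),
  ('Herbert', 400);
INSERT INTO Department_Table VALUES
  (100, 'Marketing'), (200, 'Research and Dev'), (300, 'Sales'),
  (400, 'Customer Support'), (500, 'Human Resources');
""")

full_rows = conn.execute("""
    -- Part 1: every employee, matched or not (the LEFT OUTER JOIN).
    SELECT E.First_Name, D.Department_Name
    FROM Employee_Table AS E
    LEFT OUTER JOIN Department_Table AS D
      ON E.Dept_No = D.Dept_No
    UNION ALL
    -- Part 2: departments that no employee matched, padded with a NULL name.
    SELECT NULL, D.Department_Name
    FROM Department_Table AS D
    WHERE NOT EXISTS (SELECT 1 FROM Employee_Table AS E
                      WHERE E.Dept_No = D.Dept_No)
""").fetchall()

missing_department = [name for name, dept in full_rows if dept is None]
missing_employee = [dept for name, dept in full_rows if name is None]
print(len(full_rows), sorted(missing_department), missing_employee)
```

Ten rows come back: the seven matches, the two employees with no department, and the one department with no employees.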


Which Tables are the Left and which Tables are Right?

SELECT Cla.Claim_Id,
       Cla.Claim_Date,
       SUB.Last_Name,
       SUB.First_Name,
       "ADD".Phone,
       SER.Service_Pay,
       PRO.Provider_Code,
       PRO.Provider_Name
FROM CLAIMS Cla
LEFT OUTER JOIN PROVIDERS PRO
  ON Cla.Provider_No = PRO.Provider_Code
LEFT OUTER JOIN SERVICES SER
  ON Cla.Claim_Service = SER.Service_Code
LEFT OUTER JOIN SUBSCRIBERS SUB
  ON Cla.Subscriber_No = SUB.Subscriber_No
  AND Cla.Member_No = SUB.Member_No
LEFT OUTER JOIN ADDRESSES "ADD"
  ON SUB.Subscriber_No = "ADD".Subscriber_No;

Fill in the blank. Is the table a Left Table or a Right Table?

Claims       __________
Providers    __________
Services     __________
Subscribers  __________
Addresses    __________

Can you list which tables above are left tables and which tables are right tables?


Answer - Which Tables are the Left and which are the Right?

SELECT Cla.Claim_Id,
       Cla.Claim_Date,
       SUB.Last_Name,
       SUB.First_Name,
       "ADD".Phone,
       SER.Service_Pay,
       PRO.Provider_Code,
       PRO.Provider_Name
FROM CLAIMS Cla
LEFT OUTER JOIN PROVIDERS PRO
  ON Cla.Provider_No = PRO.Provider_Code
LEFT OUTER JOIN SERVICES SER
  ON Cla.Claim_Service = SER.Service_Code
LEFT OUTER JOIN SUBSCRIBERS SUB
  ON Cla.Subscriber_No = SUB.Subscriber_No
  AND Cla.Member_No = SUB.Member_No
LEFT OUTER JOIN ADDRESSES "ADD"
  ON SUB.Subscriber_No = "ADD".Subscriber_No;

Fill in the blank. Is the table a Left Table or a Right Table?

Claims       Left
Providers    Right
Services     Right
Subscribers  Right
Addresses    Right

There is always only one Left table (the first table after the FROM clause). All tables after the first table are each Right tables.

Tables are joined two at a time. The result of each join then serves as the Left table for the next join.


INNER JOIN with Additional AND Clause

SELECT First_Name ,Last_Name ,Department_Name
FROM Employee_Table as E, Department_Table as D
WHERE E.Dept_No = D.Dept_No
AND Department_Name like 'Marke%' ;

The additional AND is performed first in order to eliminate unwanted data, so the join is less intensive than joining everything first and then eliminating rows that don't qualify.


ANSI INNER JOIN with Additional AND Clause

SELECT First_Name ,Last_Name ,Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
AND Department_Name like 'Marke%' ;

The additional AND is performed first in order to eliminate unwanted data, so the join is less intensive than joining everything first and then eliminating rows afterward.


ANSI INNER JOIN with Additional WHERE Clause

SELECT First_Name ,Last_Name ,Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
WHERE Department_Name like 'Marke%' ;

The additional WHERE is performed first in order to eliminate unwanted data, so the join is less intensive than joining everything first and then eliminating rows afterward.


OUTER JOIN with Additional WHERE Clause

SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
WHERE E.Dept_No = 100 ;

First_Name  Department_Name
__________  _______________
Mandee      Marketing

Only Mandee Chambers is in Dept_No 100.

The additional WHERE clause is performed last on an Outer Join. All rows will be joined first and then the additional WHERE clause filters after the join takes place.


OUTER JOIN with Additional AND Clause

SELECT First_Name ,Department_Name AS Dname
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
AND E.Dept_No = 100 ;

The additional AND is performed in conjunction with the ON statement on Outer Joins. All rows will be evaluated with the ON clause and the AND combined.


OUTER JOIN with Additional AND Clause Results

SELECT First_Name ,Department_Name AS Dname
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
AND E.Dept_No = 100 ;

First_Name  Dname
__________  _________
Mandee      Marketing
Herbert     ?
William     ?
Loraine     ?
Squiggy     ?
Richard     ?
Cletus      ?
Billy       ?
John        ?

The additional AND is performed in conjunction with the ON statement on Outer Joins. This can surprise you. Only Mandee is in Dept_No 100, so she showed up as expected, but an outer join returns non-matches also.
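The difference between the WHERE form and the AND form is easiest to see side by side. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for Vertica; the trimmed-down tables and the names where_rows and and_rows are inventions of this example.

```python
import sqlite3

# SQLite stands in for Vertica; only First_Name and Dept_No are needed here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee_Table (First_Name TEXT, Dept_No INTEGER);
CREATE TABLE Department_Table (Dept_No INTEGER, Department_Name TEXT);
INSERT INTO Employee_Table VALUES
  ('Squiggy', NULL), ('Richard', 10), ('Mandee', 100), ('Billy', 200),
  ('John', 200), ('Loraine', 300), ('Cletus', 400), ('William', 400),
  ('Herbert', 400);
INSERT INTO Department_Table VALUES
  (100, 'Marketing'), (200, 'Research and Dev'), (300, 'Sales'),
  (400, 'Customer Support'), (500, 'Human Resources');
""")

# Filter in WHERE: applied after the outer join, so it removes rows.
where_rows = conn.execute("""
    SELECT First_Name, Department_Name
    FROM Employee_Table AS E
    LEFT OUTER JOIN Department_Table AS D
      ON E.Dept_No = D.Dept_No
    WHERE E.Dept_No = 100
""").fetchall()

# Filter in ON: part of the match test, so unmatched left rows still return.
and_rows = conn.execute("""
    SELECT First_Name, Department_Name AS Dname
    FROM Employee_Table AS E
    LEFT OUTER JOIN Department_Table AS D
      ON E.Dept_No = D.Dept_No
     AND E.Dept_No = 100
""").fetchall()

with_department = [row for row in and_rows if row[1] is not None]
print(where_rows, len(and_rows), with_department)
```

The WHERE version returns Mandee alone. The AND version returns all nine employees, and only Mandee has a non-NULL department name.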


Quiz – Why is this considered an INNER JOIN?

SELECT First_Name, Department_Name
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
WHERE D.Dept_No = 400 ;

This is considered an INNER JOIN because we are doing a LEFT OUTER JOIN on the Employee_Table and then filtering with the WHERE on a column in the right table! Unmatched employees carry a NULL D.Dept_No after the join, so the WHERE discards them and only matching rows remain. Had the condition been written as an additional AND in the ON clause, the unmatched employees would still return with a NULL Department_Name.
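Where a filter on a right-table column is placed decides whether a LEFT OUTER JOIN still behaves like an outer join. The sketch below runs both placements next to a plain inner join. It assumes Python's built-in sqlite3 module as a stand-in for Vertica, and the variable names are made up for this example.

```python
import sqlite3

# SQLite stands in for Vertica; standard outer-join semantics are all that is used.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee_Table (First_Name TEXT, Dept_No INTEGER);
CREATE TABLE Department_Table (Dept_No INTEGER, Department_Name TEXT);
INSERT INTO Employee_Table VALUES
  ('Squiggy', NULL), ('Richard', 10), ('Mandee', 100), ('Billy', 200),
  ('John', 200), ('Loraine', 300), ('Cletus', 400), ('William', 400),
  ('Herbert', 400);
INSERT INTO Department_Table VALUES
  (100, 'Marketing'), (200, 'Research and Dev'), (300, 'Sales'),
  (400, 'Customer Support'), (500, 'Human Resources');
""")

def run(join_and_filter):
    """Run the quiz query with the given join type and filter placement."""
    sql = ("SELECT First_Name, Department_Name FROM Employee_Table AS E "
           + join_and_filter + " ORDER BY First_Name")
    return conn.execute(sql).fetchall()

# Right-table filter in WHERE: NULL-extended rows fail the test and disappear.
left_where = run("LEFT OUTER JOIN Department_Table AS D "
                 "ON E.Dept_No = D.Dept_No WHERE D.Dept_No = 400")
# The plain inner join with the same filter, for comparison.
inner = run("INNER JOIN Department_Table AS D "
            "ON E.Dept_No = D.Dept_No WHERE D.Dept_No = 400")
# Right-table filter in ON: every left row is kept, matched or not.
left_and = run("LEFT OUTER JOIN Department_Table AS D "
               "ON E.Dept_No = D.Dept_No AND D.Dept_No = 400")

print(left_where == inner, len(left_where), len(left_and))
```

With the filter in the WHERE clause, the outer join and the inner join return the same three employees from department 400. With the filter in the ON clause, all nine employees return.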


Evaluation Order for Outer Queries

SELECT Cou.*, STU1.*
FROM COURSE_TABLE Cou
LEFT OUTER JOIN STUDENT_COURSE_TABLE STU
ON Cou.Course_Id = STU.Course_Id
LEFT OUTER JOIN STUDENT_TABLE STU1
ON STU.Student_Id = STU1.Student_Id;

The Order in which Vertica evaluates Outer Queries

1. The first ON clause in the query (reading from left to right).
2. Any ON clause applies to its immediately preceding join operation.
3. Parentheses can be used to override the natural left to right order.

When you perform an inner join, Vertica considers this to be both commutative and associative. That means the end result is the same no matter which order the tables are joined in, which allows the optimizer to select the best join order between tables. Outer Joins are different. They are evaluated in the order given by the three rules above.


The DREADED Product Join

SELECT First_Name ,Last_Name ,Department_Name
FROM Employee_Table as E, Department_Table as D
WHERE Department_Name like '%m%'
Order by 1, 2, 3;

No Join Condition Linking the Two Tables!

This query becomes a Product Join because it does not possess any JOIN Conditions (Join Keys). Every row from one table is compared to every row of the other table, and quite often, the data is not what you intended to get back.


The DREADED Product Join Results

No Join Condition Linking the Two Tables!

SELECT First_Name ,Last_Name ,Department_Name
FROM Employee_Table as E, Department_Table as D
WHERE Department_Name like '%m%'
Order by 1, 2, 3;

First_Name  Last_Name   Department_Name
__________  __________  ________________
Billy       Coffing     Customer Support
Billy       Coffing     Human Resources
Cletus      Strickling  Customer Support
Cletus      Strickling  Human Resources
Herbert     Harrison    Customer Support
Herbert     Harrison    Human Resources

Not all rows are displayed.

How can Billy Coffing work in 2 different departments?

18 rows came back: nine employees, each shown working in two different departments. This data is WRONG!

A Product Join is often a mistake! Two department rows had an ‘m’ in their name, so these were joined to every employee, and the information is worthless.
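The row count follows directly from the arithmetic: 9 employees times the 2 departments whose names contain a lowercase 'm' gives 18 rows. Here is a small sketch that confirms it, assuming Python's built-in sqlite3 module as a stand-in for Vertica. One substitution is made for this example only: SQLite's LIKE ignores case for ASCII letters, so the case-sensitive GLOB '*m*' is used in place of like '%m%' to get the case-sensitive match the slide relies on.

```python
import sqlite3

# SQLite stands in for Vertica. GLOB replaces LIKE because SQLite's LIKE is
# case-insensitive and would also match the capital M in 'Marketing'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee_Table (First_Name TEXT, Last_Name TEXT, Dept_No INTEGER);
CREATE TABLE Department_Table (Dept_No INTEGER, Department_Name TEXT);
INSERT INTO Employee_Table VALUES
  ('Squiggy', 'Jones', NULL), ('Richard', 'Smythe', 10),
  ('Mandee', 'Chambers', 100), ('Billy', 'Coffing', 200),
  ('John', 'Smith', 200), ('Loraine', 'Larkins', 300),
  ('Cletus', 'Strickling', 400), ('William', 'Reilly', 400),
  ('Herbert', 'Harrison', 400);
INSERT INTO Department_Table VALUES
  (100, 'Marketing'), (200, 'Research and Dev'), (300, 'Sales'),
  (400, 'Customer Support'), (500, 'Human Resources');
""")

# No join condition between E and D: every employee pairs with every
# department that passes the filter.
rows = conn.execute("""
    SELECT First_Name, Last_Name, Department_Name
    FROM Employee_Table AS E, Department_Table AS D
    WHERE Department_Name GLOB '*m*'
    ORDER BY 1, 2, 3
""").fetchall()

departments = sorted({dept for _, _, dept in rows})
print(len(rows), departments)
print(rows[:2])
```

The first two rows are Billy Coffing in Customer Support and Billy Coffing in Human Resources, which matches the top of the answer set above.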


The Horrifying Cartesian Product Join

SELECT First_Name ,Last_Name ,Department_Name
FROM Employee_Table as E, Department_Table as D ;

No WHERE Clause in the join!

A Cartesian Product Join is usually a big mistake.

This joins every row from one table to every row of another table. 9 rows multiplied by 5 rows = 45 rows of complete nonsense!


The ANSI Cartesian Join will ERROR

SELECT First_Name ,Last_Name ,Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D ;

No ON Clause in the join!

This query errors because the ANSI INNER JOIN syntax requires an ON clause. It won’t run unless a join condition is present.


Quiz – Do these Joins Return the Same Answer Set?

Query 1
SELECT First_Name, Department_Name
FROM Employee_Table
INNER JOIN Department_Table ;

Query 2
SELECT First_Name, Department_Name
FROM Employee_Table, Department_Table ;

Do these two queries produce the same result?


Answer – Do these Joins Return the Same Answer Set?

Query 1 (this query errors)
SELECT First_Name, Department_Name
FROM Employee_Table
INNER JOIN Department_Table ;

Query 2 (a Cartesian product join occurs)
SELECT First_Name, Department_Name
FROM Employee_Table, Department_Table ;

Do these two queries produce the same result? No. Query 1 errors because it uses the ANSI INNER JOIN syntax without an ON clause, but Query 2 Product Joins to bring back junk!


The CROSS JOIN

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

SELECT Customer_Name, Order_Number
FROM Customer_Table
CROSS JOIN Order_Table
WHERE Order_Number = 123456
ORDER BY 1 ;

A Cross Join is the ANSI equivalent to a Product Join. Only a WHERE will work. ON will NOT!

This query becomes a Product Join because a Cross Join is an ANSI Product Join. It will compare every row from the Customer_Table to Order_Number 123456 in the Order_Table. Check out the Answer Set that follows.


The CROSS JOIN Answer Set

SELECT Customer_Name, Order_Number
FROM Customer_Table
CROSS JOIN Order_Table
WHERE Order_Number = 123456
ORDER BY 1 ;

Answer Set

Customer_Name        Order_Number
___________________  ____________
Billy’s Best Choice  123456
Acme Products        123456
ACE Consulting       123456
XYZ Plumbing         123456
Databases N-U        123456

Quite often, a Cross Join produces information that just isn’t worth anything!
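The Cross Join can be reproduced in a few lines. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for Vertica; the rows are loaded with parameter binding only so the apostrophe in the first customer name needs no escaping, and the variable names belong to this example.

```python
import sqlite3

# SQLite stands in for Vertica; CROSS JOIN with a WHERE filter is standard syntax.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT)")
conn.execute("CREATE TABLE Order_Table (Order_Number INTEGER, "
             "Customer_Number INTEGER, Order_Total REAL)")
conn.executemany("INSERT INTO Customer_Table VALUES (?, ?)", [
    (11111111, "Billy's Best Choice"), (31313131, "Acme Products"),
    (31323134, "ACE Consulting"), (57896883, "XYZ Plumbing"),
    (87323456, "Databases N-U"),
])
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", [
    (123456, 11111111, 12347.53), (123512, 11111111, 8005.91),
    (123552, 31323134, 5111.47), (123585, 87323456, 15231.62),
    (123777, 57896883, 23454.84),
])

# Every customer is paired with the single order that passes the WHERE filter,
# whether or not that customer placed it.
rows = conn.execute("""
    SELECT Customer_Name, Order_Number
    FROM Customer_Table
    CROSS JOIN Order_Table
    WHERE Order_Number = 123456
    ORDER BY 1
""").fetchall()

order_numbers = {order_no for _, order_no in rows}
print(len(rows), order_numbers)
```

Five rows come back, one per customer, and every one of them carries order 123456 even though only Billy’s Best Choice placed that order.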


The Self Join

Employee_Table2

Employee_No  Dept_No  Last_Name   First_Name  Salary    Mgr
___________  _______  __________  __________  ________  ___
1232578      100      Chambers    Mandee      48850.00  Y
1256349      400      Harrison    Herbert     54500.00  N
2341218      400      Reilly      William     36000.00  Y
1121334      400      Strickling  Cletus      54500.00  N
2312225      300      Larkins     Loraine     40200.00  Y
2000000      ?        Jones       Squiggy     32800.50  N
1000234      10       Smythe      Richard     32800.00  N
1324657      200      Coffing     Billy       41888.88  N
1333454      200      Smith       John        48000.00  Y

SELECT Mgrs.Dept_No
      ,Mgrs.Last_Name as MgrName
      ,Mgrs.Salary as MgrSal
      ,Emps.Last_Name as EmpName
      ,Emps.Salary as Empsal
FROM Employee_Table2 as Emps, Employee_Table2 as Mgrs
WHERE Emps.Dept_No = Mgrs.Dept_No
AND Mgrs.Mgr = 'Y'
AND Emps.Salary > Mgrs.Salary ;

Which Workers make a bigger Salary than their Manager?

A Self Join gives the same table two different aliases, so it is treated as two different tables.


The Self Join with ANSI Syntax

SELECT Mgrs.Dept_No
      ,Mgrs.Last_Name as MgrName
      ,Mgrs.Salary as MgrSal
      ,Emps.Last_Name as EmpName
      ,Emps.Salary as Empsal
FROM Employee_Table2 as Emps
INNER JOIN Employee_Table2 as Mgrs
ON Emps.Dept_No = Mgrs.Dept_No
WHERE Mgrs.Mgr = 'Y'
AND Emps.Salary > Mgrs.Salary ;

Which Workers make a bigger Salary than their Manager?

A Self Join gives the same table two different aliases, so it is treated as two different tables.
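To see what the self join brings back, the sketch below runs the ANSI form against the Employee_Table2 rows. Two assumptions are baked in: Python's built-in sqlite3 module stands in for Vertica, and the pairing of each row with its Mgr flag and Salary is read off the flattened slide in row order. The ORDER BY is added here only to make the output order predictable.

```python
import sqlite3

# SQLite stands in for Vertica. The Mgr flags and salaries are paired with the
# rows in the order they appear on the slide (an assumption of this example).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee_Table2 (Dept_No INTEGER, Last_Name TEXT,
                              Salary REAL, Mgr TEXT);
INSERT INTO Employee_Table2 VALUES
  (100,  'Chambers',   48850.00, 'Y'), (400, 'Harrison',   54500.00, 'N'),
  (400,  'Reilly',     36000.00, 'Y'), (400, 'Strickling', 54500.00, 'N'),
  (300,  'Larkins',    40200.00, 'Y'), (NULL, 'Jones',     32800.50, 'N'),
  (10,   'Smythe',     32800.00, 'N'), (200, 'Coffing',    41888.88, 'N'),
  (200,  'Smith',      48000.00, 'Y');
""")

# The same table appears twice under two aliases: Emps (workers) and Mgrs (managers).
rows = conn.execute("""
    SELECT Mgrs.Dept_No, Mgrs.Last_Name AS MgrName, Mgrs.Salary AS MgrSal,
           Emps.Last_Name AS EmpName, Emps.Salary AS EmpSal
    FROM Employee_Table2 AS Emps
    INNER JOIN Employee_Table2 AS Mgrs
      ON Emps.Dept_No = Mgrs.Dept_No
    WHERE Mgrs.Mgr = 'Y'
      AND Emps.Salary > Mgrs.Salary
    ORDER BY EmpName
""").fetchall()

for row in rows:
    print(row)
```

Under those assumptions, two workers out-earn their manager: Harrison and Strickling in department 400 each make 54500.00 against Reilly's 36000.00.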


Quiz – Will both queries bring back the same Answer Set?

SELECT *
FROM Customer_Table as Cust
INNER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
WHERE Customer_Name like 'Billy%'
ORDER BY 1;

SELECT *
FROM Customer_Table as Cust
INNER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
AND Customer_Name like 'Billy%'
ORDER BY 1;

Will both queries bring back the same result set?


Answer – Will both queries bring back the same Answer Set?

SELECT *
FROM Customer_Table as Cust
INNER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
WHERE Customer_Name like 'Billy%'
ORDER BY 1;

SELECT *
FROM Customer_Table as Cust
INNER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
AND Customer_Name like 'Billy%'
ORDER BY 1;

Will both queries bring back the same result set? Yes! Because they’re both inner joins.


Quiz – Will both queries bring back the same Answer Set?

SELECT *
FROM Customer_Table as Cust
LEFT OUTER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
WHERE Customer_Name like 'Billy%'
ORDER BY 1;

SELECT *
FROM Customer_Table as Cust
LEFT OUTER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
AND Customer_Name like 'Billy%'
ORDER BY 1;

Will both queries bring back the same result set?


Answer – Will both queries bring back the same Answer Set?

SELECT *
FROM Customer_Table as Cust
LEFT OUTER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
WHERE Customer_Name like 'Billy%'
ORDER BY 1;

SELECT *
FROM Customer_Table as Cust
LEFT OUTER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
AND Customer_Name like 'Billy%'
ORDER BY 1;

Will both queries bring back the same result set? NO! The WHERE clause is performed last.
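Running the two outer joins shows how different the answer sets are. The sketch below is illustrative only: it assumes Python's built-in sqlite3 module as a stand-in for Vertica, selects two named columns instead of SELECT * to keep the output short, and uses variable names invented for this example.

```python
import sqlite3

# SQLite stands in for Vertica; the outer-join semantics shown are standard SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT)")
conn.execute("CREATE TABLE Order_Table (Order_Number INTEGER, "
             "Customer_Number INTEGER, Order_Total REAL)")
conn.executemany("INSERT INTO Customer_Table VALUES (?, ?)", [
    (11111111, "Billy's Best Choice"), (31313131, "Acme Products"),
    (31323134, "ACE Consulting"), (57896883, "XYZ Plumbing"),
    (87323456, "Databases N-U"),
])
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", [
    (123456, 11111111, 12347.53), (123512, 11111111, 8005.91),
    (123552, 31323134, 5111.47), (123585, 87323456, 15231.62),
    (123777, 57896883, 23454.84),
])

# WHERE runs after the join, so only Billy's rows survive.
where_rows = conn.execute("""
    SELECT Cust.Customer_Name, Ord.Order_Number
    FROM Customer_Table AS Cust
    LEFT OUTER JOIN Order_Table AS Ord
      ON Cust.Customer_Number = Ord.Customer_Number
    WHERE Customer_Name LIKE 'Billy%'
    ORDER BY 1, 2
""").fetchall()

# AND is part of the match test: other customers fail it, but being on the
# left side of the outer join they still return, with NULL order columns.
and_rows = conn.execute("""
    SELECT Cust.Customer_Name, Ord.Order_Number
    FROM Customer_Table AS Cust
    LEFT OUTER JOIN Order_Table AS Ord
      ON Cust.Customer_Number = Ord.Customer_Number
     AND Customer_Name LIKE 'Billy%'
    ORDER BY 1, 2
""").fetchall()

unmatched = [name for name, order_no in and_rows if order_no is None]
print(len(where_rows), len(and_rows), len(unmatched))
```

The WHERE version returns Billy's two orders. The AND version returns six rows: the same two orders plus the other four customers, each with NULL order columns, even though three of them do have orders in Order_Table.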

How would you join these two tables?

Course_Table

Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        V2R3 SQL Features         2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

Student_Table

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
423400      Larkins    Michael     FR          0.00
231222      Wilson     Susie       SO          3.80
280023      McRoberts  Richard     JR          1.90
322133      Bond       Jimmy       JR          3.95
125634      Hanson     Henry       FR          2.88
333450      Smith      Andy        SO          2.00
324652      Delaney    Danny       SR          3.35
260000      Johnson    Stanley     ?           ?
234121      Thomas     Wendy       FR          4.00
123250      Phillips   Martin      SR          3.00

How would you join these two tables together? You can't do it. There is no matching column with like data. There is no Primary Key/Foreign Key relationship between these two tables. That is why you are about to be introduced to a bridge table. It is formally called an Associative table or a Lookup table.

An Associative Table is a Bridge that Joins Two Tables

Course_Table and Student_Table are the same two tables shown on the previous page. The associative table sits between them:

Student_Course_Table (Associative Table)

Student_ID  Course_ID
__________  _________
280023      210
231222      210
125634      100
231222      220
125634      200
322133      220
125634      220
322133      300
324652      200
333450      500
260000      400
333450      400
234121      100
123250      100

The Associative Table is a bridge between the Course_Table and Student_Table.

Quiz – Can you write the 3-Table Join?

Use Course_Table, Student_Course_Table (the associative table), and Student_Table from the previous pages.

SELECT ALL Columns from the Course_Table and Student_Table and Join them.

Answer to quiz – Can you Write the 3-Table Join?

Student_Table:         Student_ID, Last_Name, First_Name, Class_Code, Grade_Pt
Student_Course_Table:  Student_ID, Course_ID
Course_Table:          Course_ID, Course_Name, Credits, Seats

SELECT S.*, C.*
FROM Student_Table as S,
     Course_Table as C,
     Student_Course_Table as SC
Where S.Student_ID = SC.Student_ID
AND C.Course_ID = SC.Course_ID ;

Notice the * technique of getting ALL columns from both tables!

The Associative Table is a bridge between the Course_Table and Student_Table, and its sole purpose is to join these two tables together.
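As a rough check of how the bridge works, the sketch below loads the key columns of the three tables into an in-memory SQLite database (a stand-in for Vertica, assumed only for portability) and runs the join in both the traditional and the ANSI form. The Student_ID/Course_ID pairs are read in order from the Student_Course_Table shown earlier, and only two columns per table are loaded to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student_Table (Student_ID INTEGER, Last_Name TEXT);
CREATE TABLE Course_Table (Course_ID INTEGER, Course_Name TEXT);
CREATE TABLE Student_Course_Table (Student_ID INTEGER, Course_ID INTEGER);
""")
conn.executemany("INSERT INTO Student_Table VALUES (?, ?)", [
    (423400, "Larkins"), (231222, "Wilson"), (280023, "McRoberts"),
    (322133, "Bond"), (125634, "Hanson"), (333450, "Smith"),
    (324652, "Delaney"), (260000, "Johnson"), (234121, "Thomas"),
    (123250, "Phillips"),
])
conn.executemany("INSERT INTO Course_Table VALUES (?, ?)", [
    (100, "Database Concepts"), (200, "Introduction to SQL"),
    (210, "Advanced SQL"), (220, "V2R3 SQL Features"),
    (300, "Physical Database Design"), (400, "Database Administration"),
])
conn.executemany("INSERT INTO Student_Course_Table VALUES (?, ?)", [
    (280023, 210), (231222, 210), (125634, 100), (231222, 220),
    (125634, 200), (322133, 220), (125634, 220), (322133, 300),
    (324652, 200), (333450, 500), (260000, 400), (333450, 400),
    (234121, 100), (123250, 100),
])

# Traditional syntax: tables in FROM, join conditions in WHERE.
traditional = conn.execute("""
    SELECT S.*, C.*
    FROM Student_Table AS S, Course_Table AS C, Student_Course_Table AS SC
    WHERE S.Student_ID = SC.Student_ID
      AND C.Course_ID = SC.Course_ID
    ORDER BY 1, 3""").fetchall()

# ANSI syntax: each INNER JOIN followed by its ON clause.
ansi = conn.execute("""
    SELECT S.*, C.*
    FROM Student_Table AS S
    INNER JOIN Student_Course_Table AS SC ON S.Student_ID = SC.Student_ID
    INNER JOIN Course_Table AS C ON C.Course_ID = SC.Course_ID
    ORDER BY 1, 3""").fetchall()

print(len(traditional), traditional == ansi)
```

Course 500 has no row in Course_Table and student 423400 has no row in the bridge table, so neither appears in the inner-join result.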

Quiz – Can you write the 3-Table Join to ANSI Syntax?

SELECT S.*, C.*
FROM Student_Table as S,
     Course_Table as C,
     Student_Course_Table as SC
Where S.Student_ID = SC.Student_ID
AND C.Course_ID = SC.Course_ID ;

Please re-write the above query using ANSI Syntax.

Answer – Can you write the 3-Table Join to ANSI Syntax?

Traditional Syntax

SELECT S.*, C.*
FROM Student_Table as S,
     Course_Table as C,
     Student_Course_Table as SC
Where S.Student_ID = SC.Student_ID
AND C.Course_ID = SC.Course_ID ;

ANSI Syntax

Select S.*, C.*
From Student_Table as S
INNER JOIN Student_Course_Table as SC
ON S.Student_ID = SC.Student_ID
INNER JOIN Course_Table as C
ON C.Course_ID = SC.Course_ID;

The above queries show both traditional and ANSI form for this three table join.

Quiz – Can you Place the ON Clauses at the End?

ANSI Syntax

Select S.*, C.*
From Student_Table as S
INNER JOIN Student_Course_Table as SC
ON S.Student_ID = SC.Student_ID
INNER JOIN Course_Table as C
ON C.Course_ID = SC.Course_ID;

Please re-write the above query and place both ON Clauses at the end.

Answer – Can you Place the ON Clauses at the End?

Select S.*, C.*
From Student_Table as S
INNER JOIN Student_Course_Table as SC
INNER JOIN Course_Table as C
ON C.Course_ID = SC.Course_ID
ON SC.Student_ID = S.Student_ID;

The trick is to put the first ON clause for the last join and go backwards.

This is tricky. The only way it works is to place the ON clauses backwards. The first ON Clause represents the last INNER JOIN and then moves backwards.

The 5-Table Join – Logical Insurance Model

Table        Key columns
___________  ____________________________________________________
Addresses    Subscriber_No
Subscribers  Subscriber_No, Member_No
Claims       Subscriber_No, Member_No, Claim_Service, Provider_No
Services     Service_Code
Providers    Provider_Code

Above is the logical model for the insurance tables showing the Primary Key and Foreign Key relationships (PK/FK).

Quiz - Write a Five Table Join Using ANSI Syntax

(Logical model: Addresses, Subscribers, Claims, Services and Providers, related through Subscriber_No, Member_No, Claim_Service/Service_Code and Provider_No/Provider_Code.)

Your mission is to write a five table join selecting all columns using ANSI syntax.

Answer - Write a Five Table Join Using ANSI Syntax

SELECT cla1.*, sub1.*, add1.*, pro1.*, ser1.*
FROM CLAIMS AS cla1
INNER JOIN SUBSCRIBERS AS sub1
ON cla1.Subscriber_No = sub1.Subscriber_No
AND cla1.Member_No = sub1.Member_No
INNER JOIN ADDRESSES AS add1
ON sub1.Subscriber_No = add1.Subscriber_No
INNER JOIN PROVIDERS AS pro1
ON cla1.Provider_No = pro1.Provider_Code
INNER JOIN SERVICES AS ser1
ON cla1.Claim_Service = ser1.Service_Code ;

Above is the example writing this five table join using ANSI syntax.

Quiz - Write a Five Table Join Using Non-ANSI Syntax

(Logical model: Addresses, Subscribers, Claims, Services and Providers, related through Subscriber_No, Member_No, Claim_Service/Service_Code and Provider_No/Provider_Code.)

Your mission is to write a five table join selecting all columns using Non-ANSI syntax.

Answer - Write a Five Table Join Using Non-ANSI Syntax

SELECT cla1.*, sub1.*, add1.*, pro1.*, ser1.*
FROM CLAIMS AS cla1,
     SUBSCRIBERS AS sub1,
     ADDRESSES AS add1,
     PROVIDERS AS pro1,
     SERVICES AS ser1
WHERE cla1.Subscriber_No = sub1.Subscriber_No
AND cla1.Member_No = sub1.Member_No
AND sub1.Subscriber_No = add1.Subscriber_No
AND cla1.Provider_No = pro1.Provider_Code
AND cla1.Claim_Service = ser1.Service_Code ;

Above is the example writing this five table join using Non-ANSI syntax.

Quiz – Re-Write this putting the ON clauses at the END

SELECT cla1.*, sub1.*, add1.*, pro1.*, ser1.*
FROM CLAIMS AS cla1
INNER JOIN SUBSCRIBERS AS sub1
ON cla1.Subscriber_No = sub1.Subscriber_No
AND cla1.Member_No = sub1.Member_No
INNER JOIN ADDRESSES AS add1
ON sub1.Subscriber_No = add1.Subscriber_No
INNER JOIN PROVIDERS AS pro1
ON cla1.Provider_No = pro1.Provider_Code
INNER JOIN SERVICES AS ser1
ON cla1.Claim_Service = ser1.Service_Code ;

Above is the five table join written in ANSI syntax. Re-write it so that all of the ON clauses come at the end.

Answer – Re-Write this putting the ON clauses at the END

SELECT cla1.*, sub1.*, add1.*, pro1.*, ser1.*
FROM PROVIDERS AS pro1
INNER JOIN ADDRESSES AS add1
INNER JOIN SUBSCRIBERS AS sub1
INNER JOIN SERVICES AS ser1
INNER JOIN CLAIMS as cla1
ON cla1.Claim_Service = ser1.Service_Code
ON cla1.Subscriber_No = sub1.Subscriber_No
AND cla1.Member_No = sub1.Member_No
ON sub1.Subscriber_No = add1.Subscriber_No
ON cla1.Provider_No = pro1.Provider_Code ;

Above is the example writing this five table join using ANSI syntax with the ON clauses at the end. We had to move the tables around also to make this happen. Notice that the first ON clause represents the last two tables being joined, and then it works backwards.

Chapter 9 – Date Functions

"An inch of time cannot be bought with an inch of gold." - Chinese Proverb

Current_Date

SELECT Current_Date AS ANSI_Date ;

ANSI_Date
__________
10/11/2015

The Current_Date will return today's date.

Current_Date, Current_Time and Current_Timestamp

SELECT Current_Date AS ANSI_Date
      ,Current_Time AS ANSI_Time
      ,Current_Timestamp(6) AS ANSI_Timestamp

ANSI_Date   ANSI_Time  ANSI_Timestamp
__________  _________  __________________________
10/11/2015  12:53:26   10/11/2015 12:53:26.423126

The timestamp is the date, then a space, then the time. The (6) asks for six digits of fractional seconds, which gives microseconds.

Above are the keywords you can utilize to get the date, time or timestamp. These are reserved words that the system will deliver to you when requested.

Timestamp Differences

SELECT Current_Timestamp(0) AS Col1
      ,Current_Timestamp(6) AS Col2

Col1                 Col2
___________________  __________________________
2014/03/22 10:34:44  2011/03/22 10:34:44.123456

A timestamp is the date, then a space, then the time. Current_Timestamp(0) returns whole seconds. In our second example we have asked for six digits of fractional seconds, which gives microseconds.
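The effect of the precision argument can be sketched outside the database. The snippet below is a Python illustration, not Vertica code: it formats one fixed, hypothetical timestamp at whole-second and at microsecond precision, which mirrors the layout of the two columns. Note that Python truncates when dropping the fraction; whether the database rounds or truncates is not shown here.

```python
from datetime import datetime

# A fixed sample value (hypothetical) so the output is repeatable.
ts = datetime(2014, 3, 22, 10, 34, 44, 123456)

col1 = ts.isoformat(sep=" ", timespec="seconds")       # like Current_Timestamp(0)
col2 = ts.isoformat(sep=" ", timespec="microseconds")  # like Current_Timestamp(6)

print(col1)  # 2014-03-22 10:34:44
print(col2)  # 2014-03-22 10:34:44.123456
```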

Getdate

This example uses the Getdate() function to return the timestamp.

SELECT Getdate() as "The Date";

The Date
______________________
03/30/2015 8:46:04.567

“Not all who wander are lost.” – J. R. R. Tolkien

The Getdate command will return today's date and time just like the Current_Timestamp command. This is not ANSI.

Date and Time Keywords

SELECT GETDATE() AS 'GETDATE'
     , CURRENT_TIMESTAMP AS 'CURRENT_TIMESTAMP'
     , GETUTCDATE() AS 'GETUTCDATE'

Column             Value                     Meaning
_________________  ________________________  __________________
GETDATE            03/30/2015 8:42:04.83352  Date and Time
CURRENT_TIMESTAMP  03/30/2015 8:42:04.83352  Date and Time ANSI
GETUTCDATE         03/30/2015 1:42:04.83352  Date and Time UTC

The above example shows another way to get the date and time. The GETDATE and CURRENT_TIMESTAMP are equivalent, but CURRENT_TIMESTAMP is ANSI compliant.

Using CAST in Literal Values

SELECT CAST('20150216' AS DATE) as "Date YMD";

Date YMD
__________
2015-02-16

This is an example of using the CAST function with a date literal.

Add or Subtract Days from a date

SELECT Order_Date
      ,Order_Date + 60 as "Due Date"
      ,Order_Total
      ,Order_Date + 50 as Disc_Date
      ,Order_Total *.98 as Disc_Amt
FROM Order_Table
ORDER BY 1 ;

Order_Date  Due Date    Order_Total  Disc_Date   Disc_Amt
__________  __________  ___________  __________  ________
05/04/1998  07/03/1998     12347.53  06/23/1998  12100.58
01/01/1999  03/02/1999      8005.91  02/20/1999   7845.79
09/09/1999  11/08/1999     23454.84  10/29/1999  22985.74
10/01/1999  11/30/1999      5111.47  11/20/1999   5009.24
10/10/1999  12/09/1999     15231.62  11/29/1999  14926.99

When you add or subtract from a Date you are adding/subtracting Days

Because Dates are stored internally on disk as integers, it makes it easy to add days to the calendar. In the query above we are adding 60 days to the Order_Date.
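The integer idea can be illustrated with a short Python sketch (not Vertica code). Python's date.toordinal() is used here as a stand-in for an internal day number, which is an assumption for illustration only; the first row of the answer set is reproduced, and the rounding of the discount to two decimals is also assumed.

```python
from datetime import date, timedelta
from decimal import Decimal, ROUND_HALF_UP

order_date = date(1998, 5, 4)
order_total = Decimal("12347.53")

# Adding an integer number of days to a date.
due_date = order_date + timedelta(days=60)   # Order_Date + 60
disc_date = order_date + timedelta(days=50)  # Order_Date + 50

# The same addition done on a plain day number, then converted back to a date.
due_via_integer = date.fromordinal(order_date.toordinal() + 60)

# Order_Total * .98, rounded to cents (rounding mode is an assumption).
disc_amt = (order_total * Decimal("0.98")).quantize(
    Decimal("0.01"), rounding=ROUND_HALF_UP)

print(due_date, disc_date, disc_amt)  # 1998-07-03 1998-06-23 12100.58
```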

Formatting Dates

HH - Hour of day (00-23)
HH12 - Hour of day (01-12)
HH24 - Hour of day (00-23)
MI - Minute (00-59)
SS - Second (00-59)
MS - Millisecond (000-999)
US - Microsecond (000000-999999)
SSSS - Seconds past midnight (0-86399)
AM or A.M. or PM or P.M. - (uppercase)
am or a.m. or pm or p.m. - (lowercase)
Y,YYY - Year with comma
YYYY - Year (4 and more digits)
YYY - Last 3 digits of year
YY - Last 2 digits of year
Y - Last digit of year
IYYY - ISO year (4 and more digits)
IYY - Last 3 digits of ISO year
IY - Last 2 digits of ISO year
I - Last digit of ISO year
BC or B.C. or AD or A.D. - (uppercase)
bc or b.c. or ad or a.d. - (lowercase)
MONTH - Full uppercase month name
Month - Full mixed-case month name
month - Full lowercase month name
MON - Uppercase month (3 chars)
Mon - Mixed-case month (3 chars)
mon - Lowercase month (3 chars)
MM - Month number (01-12)
DAY - Full uppercase day
Day - Full mixed-case day
day - Full lowercase day
DY - Uppercase day (3 chars)
Dy - Mixed-case day (3 chars)
dy - Lowercase day (3 chars)
DDD - Day of year (001-366)
DD - Day of month (01-31) for TIMESTAMP
D - Day of week (1-7) Sunday = 1
W - Week of month (1-5)
WW - Week number of year (1-53)
IW - ISO week number of year
CC - Century (2 digits)
J - Julian Day (days since Jan 1, 4712 BC)
Q - Quarter
RM - Month in Roman numerals
rm - Month in Roman numerals (lowercase)
TZ - Time-zone name (uppercase)
tz - Time-zone name (lowercase)

Vertica gives you many options for formatting dates. The next page will show an example.

Formatting Date Example

SELECT Order_Date
      ,TO_CHAR(Order_Date , 'YY-MM-DD') AS YMD
      ,TO_CHAR(Order_Date , 'MON, DD, YYYY') AS Month
      ,TO_CHAR(Order_Date , 'D, Mon DD, YY') AS DayofWeek
      ,Current_Time as Time
      ,TO_CHAR(Current_Time , 'HH24:MI:SS:MS') AS Micro
FROM Order_Table
WHERE EXTRACT(Year from Order_Date) = 1998 ;

Order_Date  YMD       Month          DayofWeek      Time      Micro
__________  ________  _____________  _____________  ________  ____________
05/04/1998  98-05-04  MAY, 04, 1998  2, May 04, 98  10:09:58  10:09:58:109

Above you can see an example of formatted dates using the TO_CHAR command. Note that the MS pattern formats milliseconds, so the column aliased Micro shows three fractional digits; the US pattern would show microseconds.

A Summary of Math Operations on Dates

1. DATE - DATE = Interval (days between dates)
2. DATE + or - Integer = Date

SELECT Order_Number
      ,Order_Total
      ,Order_Date
      ,Order_Date - 365 as Last_Year
      ,Current_Date - Order_Date as Days_Between
FROM Order_Table

Order_Number  Order_Total  Order_Date  Last_Year   Days_Between
____________  ___________  __________  __________  ____________
123456           12347.53  05/04/1998  05/04/1997          6221
123512            8005.91  01/01/1999  01/01/1998          5979
123552            5111.47  10/01/1999  10/01/1998          5706
123585           15231.62  10/10/1999  10/10/1998          5697
123777           23454.84  09/09/1999  09/09/1998          5728

A DATE – DATE is an interval of days between dates. A DATE + or – Integer = Date.
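The Days_Between column can be reproduced with plain date subtraction. The Python sketch below is only an arithmetic check, not Vertica code; the run date of 05/16/2015 is an assumption inferred from the answer set, since that is the date that yields 6221 days for the first order.

```python
from datetime import date

today = date(2015, 5, 16)  # assumed run date implied by the answer set

order_dates = {
    123456: date(1998, 5, 4),
    123512: date(1999, 1, 1),
    123552: date(1999, 10, 1),
    123585: date(1999, 10, 10),
    123777: date(1999, 9, 9),
}

# DATE - DATE: the number of days between the two dates.
days_between = {num: (today - d).days for num, d in order_dates.items()}

# DATE - Integer: still a date, 365 days earlier.
last_year = {num: date.fromordinal(d.toordinal() - 365)
             for num, d in order_dates.items()}

print(days_between[123456], last_year[123456])  # 6221 1997-05-04
```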

The ADD_MONTHS Command

Order_Table

Order_Number  Customer_Number  Order_Date  Order_Total
____________  _______________  __________  ___________
123456        11111111         05/04/1998     12347.53
123512        11111111         01/01/1999      8005.91
123552        31323134         10/01/1999      5111.47
123585        87323456         10/10/1999     15231.62
123777        57896883         09/09/1999     23454.84

SELECT Order_Date
      ,Add_Months (Order_Date,2) as "Due Date2"
      ,Order_Total
FROM Order_Table
ORDER BY 1 ;

Order_Date  Due Date2   Order_Total
__________  __________  ___________
05/04/1998  07/04/1998     12347.53
01/01/1999  03/01/1999      8005.91
09/09/1999  11/09/1999     23454.84
10/01/1999  12/01/1999      5111.47
10/10/1999  12/10/1999     15231.62

This is the Add_Months Command. What you can do with it is add a month or many months to your date columns. Can you convert this to one year? There is no ADD_YEAR command!

Using the ADD_MONTHS Command to Add 1 Year

SELECT Order_Date
      ,Add_Months (Order_Date,12) as "Due Date12"
      ,Order_Total
FROM Order_Table
ORDER BY 1 ;

Order_Date  Due Date12  Order_Total
__________  __________  ___________
05/04/1998  05/04/1999     12347.53
01/01/1999  01/01/2000      8005.91
09/09/1999  09/09/2000     23454.84
10/01/1999  10/01/2000      5111.47
10/10/1999  10/10/2000     15231.62

The Add_Months command adds months to any date. Above we used a great technique that would give us 1 year. Can you give me 5 years?

Using the ADD_MONTHS Command to Add 5 Years

SELECT Order_Date
      ,Add_Months (Order_Date,12 * 5) as "Due Date"
      ,Order_Total
FROM Order_Table
WHERE EXTRACT(Month from Order_Date) = 9 ;

Order_Date  Due Date    Order_Total
__________  __________  ___________
09/09/1999  09/09/2004     23454.84

Above you see a great technique for adding multiple years to a date.

The EXTRACT Command

SELECT Order_Date
      ,Add_Months (Order_Date,12 * 5) as "Due Date"
      ,Order_Total
FROM Order_Table
WHERE EXTRACT(Month from Order_Date) = 9 ;

Order_Date  Due Date    Order_Total
__________  __________  ___________
09/09/1999  09/09/2004     23454.84

This is the Extract command. It returns a date part, such as a day, month or year, from a timestamp value or expression. It can be used in the SELECT list, the WHERE Clause, or the ORDER BY Clause!

YEAR, MONTH, and DAY Functions

SELECT Order_Date
      ,Year(Order_Date) as "Yr"
      ,Month(Order_Date) as "Mo"
      ,Day(Order_Date) as "Day"
FROM Order_Table
ORDER BY 1 ;

Order_Date  Yr    Mo  Day
__________  ____  __  ___
1998-05-04  1998   5    4
1999-01-01  1999   1    1
1999-09-09  1999   9    9
1999-10-01  1999  10    1
1999-10-10  1999  10   10

The YEAR, MONTH and DAY functions are shorthand for the DATE_PART function.

A Better Technique for YEAR, MONTH, and DAY Functions

SELECT Order_Number, Customer_Number, Order_Date, Order_Total
FROM Order_Table
WHERE YEAR(order_date) = 1999
AND MONTH(order_date) = 10;

SELECT Order_Number, Customer_Number, Order_Date, Order_Total
FROM Order_Table
WHERE order_date >= '19991001'
AND order_date < '19991101'

The second approach is more efficient for Vertica, because the filter is applied to the stored column itself.

Both queries above do the same thing and deliver the same result set, but the bottom query could be much faster.

Order_Number  Customer_Number  Order_Date  Order_Total
____________  _______________  __________  ___________
123552        31323134         1999-10-01      5111.47
123585        87323456         1999-10-10     15231.62

Above is a tale of two queries. The top query wraps the filtered column in functions, so in most cases Vertica can't use the sorted order of the stored data (Vertica keeps data in sorted projections rather than traditional indexes) to skip rows. The bottom query uses a range filter on the bare column instead. Brilliant!
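That the two filters select the same rows can be confirmed with a small sketch. It uses Python's sqlite3 module rather than Vertica, so SQLite's strftime stands in for YEAR() and MONTH() and the dates are stored as ISO text; both are assumptions of this illustration, and it demonstrates equivalence of results only, not performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Order_Table (Order_Number INTEGER, Order_Date TEXT)")
conn.executemany("INSERT INTO Order_Table VALUES (?, ?)", [
    (123456, "1998-05-04"), (123512, "1999-01-01"), (123552, "1999-10-01"),
    (123585, "1999-10-10"), (123777, "1999-09-09"),
])

# Function applied to the filtered column.
by_function = conn.execute("""
    SELECT Order_Number FROM Order_Table
    WHERE strftime('%Y', Order_Date) = '1999'
      AND strftime('%m', Order_Date) = '10'
    ORDER BY 1""").fetchall()

# Half-open range on the bare column: on or after the first of the month,
# and strictly before the first of the next month.
by_range = conn.execute("""
    SELECT Order_Number FROM Order_Table
    WHERE Order_Date >= '1999-10-01'
      AND Order_Date <  '1999-11-01'
    ORDER BY 1""").fetchall()

print(by_function == by_range, by_range)
```

Using a half-open range (>= start, < next month's start) avoids having to know the last day of the month.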

Another Version of the EXTRACT Command

The EXTRACT command extracts portions of Date, Time, and Timestamp.

SELECT Order_Date
      ,Add_Months (Order_Date,12 * 5) as "Due Date"
      ,Order_Total
FROM Order_Table
WHERE EXTRACT(Month from Order_Date) = 9 ;

Below is another version of the Extract Command.

SELECT Order_Date
      ,Add_Months (Order_Date,12 * 5) as "Due Date"
      ,Order_Total
FROM Order_Table
WHERE Month (Order_Date) = 9 ;

Order_Date  Due Date    Order_Total
__________  __________  ___________
09/09/1999  09/09/2004     23454.84

Both examples above are equivalent, but beware! The EXTRACT command is the better form because it works with every date and time field, including Day, Year, Hour, Minute, Second and the time zone fields. The function form on the bottom won't work with all of them.

EXTRACT from DATES and TIME

SELECT Current_Date as Date
      ,EXTRACT(Year from Current_Date) as Yr
      ,EXTRACT(Month from Current_Date) as Mo
      ,EXTRACT(Day from Current_Date) as Da
      ,Current_Time as Time
      ,EXTRACT(Hour from Current_Time) as Hr
      ,EXTRACT(Minute from Current_Time) as Mn
      ,EXTRACT(Second from Current_Time) as Sc
      ,EXTRACT(TIMEZONE_HOUR from Current_Time) as Th
      ,EXTRACT(TimeZONE_MINUTE from Current_Time) as Tm ;

Answer Set

Date        Yr    Mo  Da  Time      Hr  Mn  Sc         Th  Tm
__________  ____  __  __  ________  __  __  _________  __  __
05/16/2015  2015   5  16  07:51:51   7  51  51.444636  -4   0

Just like the Add_Months, the EXTRACT Command is a Temporal Function or a Time-Based Function.

Why EXTRACT is a Better Form

SELECT Current_Date as Date
      ,EXTRACT(Year from Current_Date) as Yr
      ,EXTRACT(Month from Current_Date) as Mo
      ,EXTRACT(Day from Current_Date) as Da
      ,Current_Time as Time
      ,EXTRACT(Hour from Current_Time) as Hr
      ,EXTRACT(Minute from Current_Time) as Mn
      ,EXTRACT(Second from Current_Time) as Sc
      ,EXTRACT(TIMEZONE_HOUR from Current_Time) as Th
      ,EXTRACT(TimeZONE_MINUTE from Current_Time) as Tm ;

SELECT Current_Date as Date
      ,Year (Current_Date) as Yr
      ,Month (Current_Date) as Mo
      ,Day(Current_Date) as Da
      ,Current_Time as Time
      ,Hour(Current_Time) as Hr
      ,Minute(Current_Time) as Min
      ,Second(Current_Time) as Sec

Timezone_Hour (Current_Time) and Timezone_Minute (Current_Time) do not work with this format.

Most extracts are on the month or year. That can be done using either technique above.

EXTRACT with DATE and TIME Literals

SELECT EXTRACT(Year FROM Date '2000-10-01') AS "YR"
      ,EXTRACT(Month FROM Date '2000-10-01') AS "Mth"
      ,EXTRACT(DAY FROM Date '2000-10-01') AS 'Day'
      ,EXTRACT(HOUR FROM TIME '10:01:30') AS 'Hr'
      ,EXTRACT(MINUTE FROM TIME '10:01:30') AS 'Min'
      ,EXTRACT(SECOND FROM TIME '10:01:30') AS 'Sec'
      ,EXTRACT(MONTH FROM Current_Timestamp) AS ts_Mth
      ,EXTRACT(SECOND FROM Current_Timestamp) AS ts_Part

YR    Mth  Day  Hr  Min  Sec        ts_Mth  ts_Part
____  ___  ___  __  ___  _________  ______  ________
2000   10    1  10    1  30.000000       5  5.266518

Just like the Add_Months, the EXTRACT Command is a Temporal Function or a Time-Based Function. The query above is designed to show how to use it with literal values.

EXTRACT of the Month on Aggregate Queries

SELECT EXTRACT(Month FROM Order_Date)
      ,COUNT(*) as "Rows"
      ,AVG(Order_Total) as "AVG"
FROM Order_Table
GROUP BY 1
ORDER BY 1

date_part  Rows  AVG
_________  ____  ________
1          1      8005.91
5          1     12347.53
9          1     23454.84
10         2     10171.55

The above SELECT uses the EXTRACT to only display the month and also to control the number of aggregates displayed in the GROUP BY. Notice the Answer Set headers.
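The grouping can be reproduced on the five sample orders with the sketch below. It runs against an in-memory SQLite database instead of Vertica, so strftime with a CAST stands in for EXTRACT(Month FROM ...); that substitution and the ISO text dates are assumptions made only to keep the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Order_Table (Order_Date TEXT, Order_Total REAL)")
conn.executemany("INSERT INTO Order_Table VALUES (?, ?)", [
    ("1998-05-04", 12347.53), ("1999-01-01", 8005.91), ("1999-10-01", 5111.47),
    ("1999-10-10", 15231.62), ("1999-09-09", 23454.84),
])

# One output row per distinct month; the two October orders are averaged together.
rows = conn.execute("""
    SELECT CAST(strftime('%m', Order_Date) AS INTEGER) AS month_part
          ,COUNT(*)
          ,AVG(Order_Total)
    FROM Order_Table
    GROUP BY 1
    ORDER BY 1""").fetchall()

for month_part, row_count, average in rows:
    print(month_part, row_count, round(average, 2))
```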

AGE_IN_MONTHS

Order_Table

Order_Number  Customer_Number  Order_Date  Order_Total
____________  _______________  __________  ___________
123456        11111111         05/04/1998     12347.53
123512        11111111         01/01/1999      8005.91
123552        31323134         10/01/1999      5111.47
123585        87323456         10/10/1999     15231.62
123777        57896883         09/09/1999     23454.84

SELECT Order_Date
      ,Age_In_Months (Current_Date, Order_Date) as "Age_in_Months"
      ,Current_Date - Order_Date as Age_in_Days
      ,Current_Date as Todays_Date
FROM Order_Table
ORDER BY 1 ;

Order_Date  Age_in_Months  Age_in_Days  Todays_Date
__________  _____________  ___________  ___________
05/04/1998            204         6223  05/18/2015
01/01/1999            196         5981  05/18/2015
09/09/1999            188         5730  05/18/2015
10/01/1999            187         5708  05/18/2015
10/10/1999            187         5699  05/18/2015

Above you see a great technique for seeing the age in months between two dates or timestamps.

AGE_IN_YEARS

SELECT Order_Date
      ,Age_In_Months (Current_Date, Order_Date) as "Age_in_Months"
      ,Age_In_Years (Current_Date, Order_Date) as "Age_in_Years"
      ,Current_Date as Todays_Date
FROM Order_Table
ORDER BY 1 ;

Order_Date  Age_in_Months  Age_in_Years  Todays_Date
__________  _____________  ____________  ___________
05/04/1998            204            17  05/18/2015
01/01/1999            196            16  05/18/2015
09/09/1999            188            15  05/18/2015
10/01/1999            187            15  05/18/2015
10/10/1999            187            15  05/18/2015

Above you see a great technique for seeing the age in years between two dates or timestamps.
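One way to reason about these results is to count completed months and then divide by twelve. The Python sketch below uses hypothetical helper functions of the same names; it is an assumption about the arithmetic that happens to reproduce the answer sets above, not a description of how Vertica implements the functions, and it expects the later date first.

```python
from datetime import date

def age_in_months(later: date, earlier: date) -> int:
    """Whole months completed between two dates (assumes later >= earlier)."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the current month has not been completed yet
    return months

def age_in_years(later: date, earlier: date) -> int:
    """Whole years completed: twelve completed months per year."""
    return age_in_months(later, earlier) // 12

today = date(2015, 5, 18)  # Todays_Date from the answer set
for order_date in [date(1998, 5, 4), date(1999, 1, 1), date(1999, 9, 9),
                   date(1999, 10, 1), date(1999, 10, 10)]:
    print(order_date, age_in_months(today, order_date), age_in_years(today, order_date))
```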

DATE_TRUNC

SELECT Current_Date as "Today"
      ,DATE_TRUNC('Century', Current_Date) as "Century"
      ,DATE_TRUNC('Day', Current_Date) as "Day"

Today       Century                     Day
__________  __________________________  __________________________
05/18/2015  01/01/2001 12:00:00.000000  05/18/2015 12:00:00.000000

SELECT Current_Time as "Time"
      ,DATE_TRUNC('Minute', Current_Time) as "Minute"
      ,DATE_TRUNC('Hour', Current_Time) as "Hour"
      ,DATE_TRUNC('Microseconds', Current_Time) as "Micro"

Time      Minute    Hour      Micro
________  ________  ________  ________
20:24:25  20:24:00  20:00:00  20:24:25

The Date_Trunc function truncates date and time values as specified.
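Truncation simply resets every field below the chosen one to its lowest value. The Python sketch below mimics that idea with hypothetical helper functions (they are not Vertica functions) and reproduces the century, hour and minute results shown above; the century rule assumes centuries start in years ending in 01, as the answer set suggests.

```python
from datetime import datetime

def trunc_minute(ts: datetime) -> datetime:
    return ts.replace(second=0, microsecond=0)

def trunc_hour(ts: datetime) -> datetime:
    return ts.replace(minute=0, second=0, microsecond=0)

def trunc_day(ts: datetime) -> datetime:
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)

def trunc_century(ts: datetime) -> datetime:
    # Centuries are taken to start in years ending in 01 (2001 for the 21st century).
    first_year = (ts.year - 1) // 100 * 100 + 1
    return trunc_day(ts).replace(year=first_year, month=1, day=1)

sample = datetime(2015, 5, 18, 20, 24, 25)
print(trunc_century(sample))  # 2001-01-01 00:00:00
print(trunc_hour(sample))     # 2015-05-18 20:00:00
print(trunc_minute(sample))   # 2015-05-18 20:24:00
```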

DATEDIFF

SELECT Current_Date as "Date Today"
      ,Order_Date
      ,DATEDIFF(Year, Order_Date, Current_Date) as "Years"
      ,DATEDIFF(Quarter, Order_Date, Current_Date) as "Quarters"
      ,DATEDIFF(Month, Order_Date, Current_Date) as "Months"
      ,DATEDIFF(Day, Order_Date, Current_Date) as "Days"
      ,DATEDIFF(Week, Order_Date, Current_Date) as "Weeks"
      ,DATEDIFF(Hour, Order_Date, Current_Date) as "Hours"
FROM Order_Table
WHERE EXTRACT(Year from Order_Date) = 1998 ;

Date Today  Order_Date  Years  Quarters  Months  Days  Weeks  Hours
__________  __________  _____  ________  ______  ____  _____  ______
05/18/2015  05/04/1998     17        68     204  6223    889  149352

The DATEDIFF function returns the difference between two date or time values based on the specified start and ending arguments. The DATEDIFF function includes all of the above plus minute, second, millisecond and microsecond.

DAYOFWEEK

SELECT Current_Date as "Date Today"
      ,DAYOFWEEK(Current_Date)
      ,CASE DAYOFWEEK(Current_Date)
         WHEN 1 Then 'Sunday'
         WHEN 2 Then 'Monday'
         WHEN 3 Then 'Tuesday'
         WHEN 4 Then 'Wednesday'
         WHEN 5 Then 'Thursday'
         WHEN 6 Then 'Friday'
         WHEN 7 Then 'Saturday'
       END as WhatDayIsIt

Date Today  DAYOFWEEK  WhatDayIsIt
__________  _________  ___________
05/18/2015  2          Monday

The DAYOFWEEK function returns a 1 if the day of the week is Sunday, 2 if Monday and so on.
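The Sunday = 1 numbering can be mirrored in a few lines of Python. The helper below is a hypothetical illustration, not a Vertica function: it shifts Python's ISO numbering (Monday = 1 through Sunday = 7) so that Sunday becomes 1 and Saturday becomes 7, and a list lookup plays the role of the CASE expression.

```python
from datetime import date

DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]

def dayofweek(d: date) -> int:
    # isoweekday(): Monday=1 .. Sunday=7; shift so Sunday=1 .. Saturday=7.
    return d.isoweekday() % 7 + 1

def what_day_is_it(d: date) -> str:
    # Equivalent of the CASE expression: map 1..7 onto the day names.
    return DAY_NAMES[dayofweek(d) - 1]

today = date(2015, 5, 18)
print(dayofweek(today), what_day_is_it(today))  # 2 Monday
```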

Intervals for Date, Time and Timestamp

Interval Chart

Simple Intervals  More involved Intervals
________________  _______________________
YEAR              DAY TO HOUR
MONTH             DAY TO MINUTE
DAY               DAY TO SECOND
HOUR              HOUR TO MINUTE
MINUTE            HOUR TO SECOND
SECOND            MINUTE TO SECOND

“It’s not the size of the dog in the fight, but the size of the fight in the dog.” – Archie Griffin

Vertica has added INTERVAL processing; however, it is not ANSI compliant. Intervals are used to perform DATE, TIME and TIMESTAMP arithmetic and conversion.

Interval Data Types and the Bytes to Store Them

Interval Chart

Bytes  Data Type
_____  _________________________
2      INTERVAL YEAR
4      INTERVAL YEAR TO MONTH
2      INTERVAL MONTH
2      INTERVAL MONTH TO DAY
2      INTERVAL DAY
8      INTERVAL DAY TO MINUTE
10/12  INTERVAL DAY TO SECOND
2      INTERVAL HOUR
4      INTERVAL HOUR TO MINUTE
8      INTERVAL HOUR TO SECOND
2      INTERVAL MINUTE
6/8    INTERVAL MINUTE TO SECOND
6/8    INTERVAL SECOND

10/12 means 10 bytes for 32-bit systems and 12 for 64-bit systems. 6/8 means 6 bytes for 32-bit systems and 8 for 64-bit systems.

Above are the interval data types and the bytes to store them.

Using Intervals

SELECT Current_Date as Our_Date
      ,Current_Date + Interval '1' Day as Plus_1_Day
      ,Current_Date + Interval '3' Month as Plus_3_Months
      ,Current_Date + Interval '5' Year as Plus_5_Years

SELECT Current_Date as Our_Date
      ,CAST(Current_Date + Interval '1' Day as Date) as Plus_1_Day
      ,CAST(Current_Date + Interval '3' Month as Date) as Plus_3_Months
      ,CAST(Current_Date + Interval '5' Year as Date) as Plus_5_Years

Our_Date    Plus_1_Day  Plus_3_Months  Plus_5_Years
__________  __________  _____________  ____________
05/16/2015  05/17/2015  08/16/2015     05/16/2020

Above we are using simple intervals. In the first example, adding an interval to a date returns a result that carries a time portion. The second example uses the CAST (Convert and Store) technique to turn each result back into a plain date, which is what the answer set shows. Either way, the intervals have been added.

How a Simple Interval Handles Leap Year

SELECT Date '2012-01-29' as Our_Date
      ,Date '2012-01-29' + INTERVAL '1' Month as Leap_Year

Our_Date    Leap_Year
__________  __________________________
01/29/2012  02/29/2012 12:00:00.000000

SELECT Date '2011-01-29' as Our_Date
      ,Date '2011-01-29' + INTERVAL '1' Month as Leap_Year

Our_Date    Leap_Year
__________  __________________________
01/29/2011  02/28/2011 12:00:00.000000

The first example works because we added 1 month to the date '2012-01-29' and we got '2012-02-29'. Because this was leap year, there actually is a date of February 29, 2012. The next example is the real point. We have a date of '2011-01-29' and we add 1-month to that, but there is no February 29th in 2011, so the query places the day at 02/28/2011.
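The clamping rule described here (keep the day of the month when it exists, otherwise fall back to the last day of the target month) can be sketched in Python. The add_months helper below is hypothetical and only models that rule; it is not Vertica's implementation and does not claim to cover any other end-of-month behavior the database may have.

```python
import calendar
from datetime import date

def add_months(d: date, months: int) -> date:
    """Add whole months, clamping the day to the last day of the target month."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    last_day = calendar.monthrange(year, month_index + 1)[1]
    return date(year, month_index + 1, min(d.day, last_day))

print(add_months(date(2012, 1, 29), 1))  # 2012-02-29 (leap year, the day exists)
print(add_months(date(2011, 1, 29), 1))  # 2011-02-28 (no Feb 29, so clamp)
print(add_months(date(1999, 9, 9), 12 * 5))  # 2004-09-09 (five years as 60 months)
```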

Interval Arithmetic Results

DATE and TIME arithmetic results using intervals:

DATE      - DATE               = Interval
TIME      - TIME               = Interval
TIMESTAMP - TIMESTAMP          = Interval

DATE      - or + Interval      = DATE
TIME      - or + Interval      = TIME
TIMESTAMP - or + Interval      = TIMESTAMP

Interval  - or + Interval      = Interval

“Once the game is over, the king and the pawn go back in the same box.” - Italian Proverb

To use DATE and TIME arithmetic, it is important to keep in mind the results of various operations. The above chart is your Interval guide.

A Time Interval Example

SELECT (TIME '12:45:01' - TIME '10:10:01') HOUR AS "Hours"
      ,(TIME '12:45:01' - TIME '10:10:01') MINUTE AS "Minutes"
      ,(TIME '12:45:01' - TIME '10:10:01') SECOND AS "Seconds"

Hours  Minutes  Seconds
_____  _______  ___________
2      155      9300.000000

The two times are 2 hours and 35 minutes apart. Each column expresses that same difference in a different unit: hours, minutes, or seconds.
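The unit conversions in this answer set are plain arithmetic on one time difference. As a rough cross-check, here is a small Python sketch (an illustration only, not how Vertica evaluates the query) that assumes the Hours column counts whole hours, as the answer set shows.

```python
from datetime import datetime

FMT = "%H:%M:%S"
later = datetime.strptime("12:45:01", FMT)
earlier = datetime.strptime("10:10:01", FMT)

diff = later - earlier                      # a timedelta of 2:35:00
total_seconds = int(diff.total_seconds())

seconds = total_seconds                     # the difference in seconds
minutes = total_seconds // 60               # the difference in whole minutes
hours = total_seconds // 3600               # the difference in whole hours
```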

A DATE Interval Example Going Back in Time

SELECT Current_Date as Date
      ,INTERVAL -'2' YEAR + CURRENT_DATE as Two_years_Ago ;

Date        Two_years_Ago
__________  __________________________
10/11/2015  10/11/2013 12:00:00.000000

“I know that you believe that you understand what you think I said, but I am not sure you realize that what you heard is not what I meant.” -Sign on Pentagon office wall

The above Interval example uses a -'2' to go back in time.

A Complex Time Interval Example using CAST

Below is the syntax for using the CAST with a date:

SELECT CAST ( AS INTERVAL ) FROM ;

The following converts an INTERVAL of 6 years and 2 months to an INTERVAL number of months:

SELECT CAST( (INTERVAL '6-02' YEAR TO MONTH) AS INTERVAL MONTH ) Mths

Mths
____
74

The CAST function (Convert and Store) is the ANSI method for converting data from one type to another. It can also be used to convert one INTERVAL to another INTERVAL representation. Although CAST is normally used in the SELECT list, it also works in the WHERE clause for comparisons.

A Complex Time Interval Example using CAST

SELECT CAST(INTERVAL '1300' MONTH AS INTERVAL YEAR TO MONTH) AS "Years & Months" ;

The above request converts 1300 months to show the number of years and months.

Years & Months
______________
108-04

The biggest advantage in using the INTERVAL processing is that SQL written on another system is now compatible.
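Both CAST examples come down to arithmetic on a 12-month year. The sketch below reproduces that arithmetic in Python; the function names are invented for this illustration and are not part of any database API.

```python
def year_month_to_months(years: int, months: int) -> int:
    """Collapse a YEAR TO MONTH interval into a plain month count."""
    return years * 12 + months


def months_to_year_month(total_months: int) -> str:
    """Split a month count into years and leftover months, shown as 'Y-MM'."""
    years, months = divmod(total_months, 12)
    return f"{years}-{months:02d}"


mths = year_month_to_months(6, 2)              # INTERVAL '6-02' YEAR TO MONTH
years_and_months = months_to_year_month(1300)  # INTERVAL '1300' MONTH
```

Six years and two months is 6 * 12 + 2 = 74 months, and 1300 months is 108 full years (1296 months) with 4 months left over.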

The OVERLAPS Command

Compatibility: Vertica Extension

The syntax of the OVERLAPS is:

SELECT WHERE (, ) OVERLAPS (, ) ;

SELECT 'The Dates Overlap' as Dater
WHERE (DATE '2001-01-01', DATE '2001-11-30')
      OVERLAPS (DATE '2001-10-15', DATE '2001-12-31');

Answer

Dater
_________________
The Dates Overlap

When working with dates and times, sometimes it is necessary to determine whether two different ranges have common points in time. Vertica provides a Boolean function to make this test for you. It is called OVERLAPS. It evaluates to true when the two ranges have more than a single point in common; otherwise it returns false. The literal is returned because both date ranges have October 15 through November 30 in common.

An OVERLAPS Example that Returns No Rows

SELECT 'The dates overlap' AS OverlapAnswer
WHERE (DATE '2001-01-01', DATE '2001-11-30')
      OVERLAPS (DATE '2001-11-30', DATE '2001-12-31') ;

OverlapAnswer
_____________
No rows returned so we know the dates did not overlap

The above SELECT example tests two literal date ranges and uses OVERLAPS to determine whether or not to display the character literal. The literal was not selected because the ranges do not overlap: the single common date of November 30 does not constitute an overlap. When dates are used, 2 days must be involved, and when time is used, 2 seconds must be contained in both ranges.

The OVERLAPS Command using TIME

SELECT 'The Times Overlap' As DoThey
WHERE (TIME '08:00:00', TIME '02:00:00')
      OVERLAPS (TIME '02:01:00', TIME '04:15:00') ;

DoThey
_________________
The Times Overlap

The above SELECT example tests two literal time ranges and uses OVERLAPS to determine whether or not to display the character literal. At first glance, it appears as if this answer is incorrect because 02:01:00 looks like it starts 1 minute after the first range ends. However, the system works on a 24-hour clock when a date and time (timestamp) are not used together. Therefore, the system considers the earlier time of 2 AM as the start and the later time of 8 AM as the end of the range. So not only do the ranges overlap, the second range is entirely contained in the first range.
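The three OVERLAPS examples follow two rules: reversed endpoints are put in order first, and sharing only a boundary point does not count. The Python sketch below models just those two rules as the examples show them. It is a simplified illustration (the function name overlaps is our own), not Vertica's implementation, and it ignores NULL endpoints and zero-length ranges.

```python
from datetime import date, time


def overlaps(start1, end1, start2, end2) -> bool:
    """Return True when two ranges share more than a single boundary point."""
    # Put each pair in order, so (08:00, 02:00) is treated as 02:00 to 08:00.
    s1, e1 = sorted((start1, end1))
    s2, e2 = sorted((start2, end2))
    # Strict comparisons: touching at one endpoint is not an overlap.
    return s1 < e2 and s2 < e1


dates_overlap = overlaps(date(2001, 1, 1), date(2001, 11, 30),
                         date(2001, 10, 15), date(2001, 12, 31))
dates_touch = overlaps(date(2001, 1, 1), date(2001, 11, 30),
                       date(2001, 11, 30), date(2001, 12, 31))
times_overlap = overlaps(time(8, 0, 0), time(2, 0, 0),
                         time(2, 1, 0), time(4, 15, 0))
```

The first and third calls report an overlap, and the second does not, which mirrors the three answer sets.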

Chapter 10 – OLAP Functions

“Don’t count the days, make the days count.” - Muhammad Ali

The Row_Number Command

SELECT Product_ID, Sale_Date, Daily_Sales,
       ROW_NUMBER() OVER (ORDER BY Product_ID, Sale_Date) AS Seq_Number
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Seq_Number
__________  __________  ___________  __________
1000        2000-09-28  48850.40     1
1000        2000-09-29  54500.22     2
1000        2000-09-30  36000.07     3
1000        2000-10-01  40200.43     4
1000        2000-10-02  32800.50     5
1000        2000-10-03  64300.00     6
1000        2000-10-04  54553.10     7
2000        2000-09-28  41888.88     8
2000        2000-09-29  48000.00     9
2000        2000-09-30  49850.03     10
2000        2000-10-01  54850.29     11

Not all rows are displayed

The ROW_NUMBER keyword caused Seq_Number to increase sequentially. Notice that this does NOT have a Rows Unbounded Preceding, and it still works!

Quiz – How did the Row_Number Reset?

SELECT Product_ID, Sale_Date, Daily_Sales,
       ROW_NUMBER() OVER (PARTITION BY Product_ID
                          ORDER BY Product_ID, Sale_Date) AS StartOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  StartOver
__________  __________  ___________  _________
1000        2000-09-28  48850.40     1
1000        2000-09-29  54500.22     2
1000        2000-09-30  36000.07     3
1000        2000-10-01  40200.43     4
1000        2000-10-02  32800.50     5
1000        2000-10-03  64300.00     6
1000        2000-10-04  54553.10     7
2000        2000-09-28  41888.88     1
2000        2000-09-29  48000.00     2
2000        2000-09-30  49850.03     3
2000        2000-10-01  54850.29     4
2000        2000-10-02  36021.93     5
2000        2000-10-03  43200.18     6
2000        2000-10-04  32800.50     7

What Keyword(s) caused StartOver to reset?

Answer: It is the PARTITION BY clause. The numbering starts over at 1 for each new Product_ID.
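To see why PARTITION BY makes the numbering start over, it can help to spell the steps out by hand: sort the rows, split them into one group per Product_ID, and count from 1 inside each group. The Python sketch below does exactly that on a handful of rows. It is an illustration of the idea only, and the function name is made up.

```python
from itertools import groupby

# (Product_ID, Sale_Date) in no particular order; ISO dates sort correctly as text.
rows = [
    (2000, "2000-09-29"), (1000, "2000-09-28"), (2000, "2000-09-28"),
    (1000, "2000-09-30"), (1000, "2000-09-29"),
]


def row_number_by_partition(rows):
    """Number rows 1, 2, 3... within each Product_ID, ordered by Sale_Date."""
    ordered = sorted(rows)  # ORDER BY Product_ID, Sale_Date
    numbered = []
    # groupby yields one group per Product_ID: this is the PARTITION BY step.
    for _, group in groupby(ordered, key=lambda r: r[0]):
        # enumerate restarts at 1 for every group, which is the "reset".
        for n, (product_id, sale_date) in enumerate(group, start=1):
            numbered.append((product_id, sale_date, n))
    return numbered


result = row_number_by_partition(rows)
```

Dropping the groupby loop and enumerating the whole sorted list would give the single unbroken sequence of the earlier Row_Number example.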

Using a Derived Table and Row_Number

WITH Results AS
( SELECT ROW_NUMBER() OVER(ORDER BY Product_ID, Sale_Date) AS RowNumber,
         Product_ID, Sale_Date
  FROM Sales_Table )
SELECT *
FROM Results
WHERE RowNumber BETWEEN 8 AND 14

RowNumber  Product_ID  Sale_Date
_________  __________  __________
8          2000        2000-09-28
9          2000        2000-09-29
10         2000        2000-09-30
11         2000        2000-10-01
12         2000        2000-10-02
13         2000        2000-10-03
14         2000        2000-10-04

In the example above we are using a derived table called Results and then using a WHERE clause to only take certain RowNumbers.

Finding the First Occurrence using a WITH Derived Table

WITH Derived_Tbl AS
(select Product_ID as Prod, Sale_Date, Daily_Sales,
        Row_Number() over (PARTITION BY product_id
                           ORDER BY Sale_Date ASC) AS Row_Num
 from sales_table)
Select *
from Derived_Tbl
Where Row_Num = 1 ;

Prod  Sale_Date   Daily_Sales  Row_Num
____  __________  ___________  _______
1000  09/28/2000  48850.40     1
2000  09/28/2000  41888.88     1
3000  09/28/2000  61301.77     1

By using the Row_Number ordered analytic, partitioning on Product_ID and sorting by Sale_Date ASC, we bring back only the first row for each product, the one with the earliest Sale_Date. This can be done because we are placing our query in a derived table and then selecting from that derived table using a WHERE clause.

Finding the Last Occurrence using a WITH Derived Table

WITH Derived_Tbl AS
(select Product_ID as Prod, Sale_Date, Daily_Sales,
        Row_Number() over (PARTITION BY product_id
                           ORDER BY Sale_Date Desc) AS Row_Num
 from sales_table)
Select *
from Derived_Tbl
Where Row_Num = 1 ;

Prod  Sale_Date   Daily_Sales  Row_Num
____  __________  ___________  _______
1000  10/04/2000  54553.10     1
2000  10/04/2000  32800.50     1
3000  10/04/2000  15675.33     1

By using the Row_Number ordered analytic, partitioning on Product_ID and sorting by Sale_Date DESC, we bring back only the first row for each product, which is now the one with the latest Sale_Date. This can be done because we are placing our query in a derived table and then selecting from that derived table using a WHERE clause.
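The "Row_Num = 1" pattern is really "sort inside each product, then keep the first row". The Python sketch below expresses the same idea directly. It is only an illustration of the logic: the function name is invented, and the rows are a few copied from the answer sets above.

```python
from datetime import date

# (Product_ID, Sale_Date, Daily_Sales)
sales = [
    (1000, date(2000, 10, 1), 40200.43),
    (2000, date(2000, 10, 4), 32800.50),
    (1000, date(2000, 9, 28), 48850.40),
    (2000, date(2000, 9, 28), 41888.88),
    (1000, date(2000, 10, 4), 54553.10),
]


def pick_per_product(rows, latest=False):
    """Keep one row per Product_ID: the earliest Sale_Date, or the latest."""
    chosen = {}
    # Sorting puts the wanted row first within each product (Row_Num = 1).
    for product_id, sale_date, daily_sales in sorted(rows, reverse=latest):
        # setdefault stores only the first row seen for each product.
        chosen.setdefault(product_id, (sale_date, daily_sales))
    return chosen


first_sales = pick_per_product(sales)              # ORDER BY Sale_Date ASC
last_sales = pick_per_product(sales, latest=True)  # ORDER BY Sale_Date DESC
```

Switching between first and last occurrence only flips the sort direction, which is exactly the difference between the two queries.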

Ordered Analytics OVER

SELECT Product_ID as Prod
      ,Sale_Date
      ,Daily_Sales
      ,SUM(Daily_Sales) OVER(PARTITION BY Sale_Date) AS Total
      ,AVG(Daily_Sales) OVER(PARTITION BY Sale_Date) AS Avg
      ,COUNT(Daily_Sales) OVER(PARTITION BY Sale_Date) AS Cnt
      ,MIN(Daily_Sales) OVER(PARTITION BY Sale_Date) AS Min
      ,MAX(Daily_Sales) OVER(PARTITION BY Sale_Date) AS Max
FROM Sales_Table

Prod  Sale_Date   Daily_Sales  Total      Avg       Cnt  Min       Max
____  __________  ___________  _________  ________  ___  ________  ________
1000  2000-09-28  48850.40     152041.05  50680.35  3    41888.88  61301.77
2000  2000-09-28  41888.88     152041.05  50680.35  3    41888.88  61301.77
3000  2000-09-28  61301.77     152041.05  50680.35  3    41888.88  61301.77
3000  2000-09-29  34509.13     137009.35  45669.78  3    34509.13  54500.22
2000  2000-09-29  48000.00     137009.35  45669.78  3    34509.13  54500.22
1000  2000-09-29  54500.22     137009.35  45669.78  3    34509.13  54500.22
1000  2000-09-30  36000.07     129718.96  43239.65  3    36000.07  49850.03
2000  2000-09-30  49850.03     129718.96  43239.65  3    36000.07  49850.03
3000  2000-09-30  43868.86     129718.96  43239.65  3    36000.07  49850.03

Not all rows are shown in the answer set

Above is an example of the Ordered Analytics that use the keyword OVER. Each aggregate is calculated across all rows that share the same Sale_Date, and the result is repeated on every row of that date.

RANK and DENSE RANK

SELECT Product_ID, Daily_Sales,
       RANK() OVER (ORDER BY Daily_Sales ASC) as "Rank",
       DENSE_RANK() OVER (ORDER BY Daily_Sales ASC) as "DenseRank"
FROM Sales_Table
WHERE Product_ID in (1000, 2000)

Product_ID  Daily_Sales  Rank  DenseRank
__________  ___________  ____  _________
2000        32800.50     1     1
1000        32800.50     1     1
1000        36000.07     3     2
2000        36021.93     4     3
1000        40200.43     5     4
2000        41888.88     6     5
2000        43200.18     7     6
2000        48000.00     8     7
1000        48850.40     9     8

Not all rows are displayed

Above is an example of the RANK and DENSE_RANK commands. Notice the difference after the tie: RANK skips to 3 for the next row, while DENSE_RANK continues with 2.
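The difference between the two functions is easiest to see by computing both in one pass over sorted values. The following Python sketch is a hand-rolled illustration (not database code): RANK takes the row's position whenever the value changes, while DENSE_RANK only ever moves up by one.

```python
def rank_and_dense_rank(values):
    """Return (value, rank, dense_rank) tuples for values in ascending order."""
    ordered = sorted(values)
    result = []
    rank = dense = 0
    previous = None
    for position, value in enumerate(ordered, start=1):
        if value != previous:
            rank = position   # RANK jumps to the row's position, leaving gaps after ties
            dense += 1        # DENSE_RANK advances by exactly one, leaving no gaps
            previous = value
        result.append((value, rank, dense))
    return result


daily_sales = [36000.07, 32800.50, 36021.93, 32800.50, 40200.43]
ranked = rank_and_dense_rank(daily_sales)
```

The two tied rows of 32800.50 both get 1 in either column; the next row is ranked 3 by RANK and 2 by DENSE_RANK, as in the answer set.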

RANK Defaults to Ascending Order

SELECT Product_ID, Sale_Date, Daily_Sales,
       RANK() OVER (ORDER BY Daily_Sales) AS Rank1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Rank1
__________  __________  ___________  _____
1000        10/02/2000  32800.50     1
2000        10/04/2000  32800.50     1
1000        09/30/2000  36000.07     3
2000        10/02/2000  36021.93     4
1000        10/01/2000  40200.43     5
2000        09/28/2000  41888.88     6
2000        10/03/2000  43200.18     7
2000        09/29/2000  48000.00     8
1000        09/28/2000  48850.40     9
2000        09/30/2000  49850.03     10
1000        09/29/2000  54500.22     11
1000        10/04/2000  54553.10     12
2000        10/01/2000  54850.29     13

Not all rows are displayed

The RANK OVER command defaults the Sort to ASC.

Getting RANK to Sort in DESC Order

SELECT Product_ID, Sale_Date, Daily_Sales,
       RANK() OVER (ORDER BY Daily_Sales DESC) AS Rank1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Rank1
__________  __________  ___________  _____
1000        10/03/2000  64300.00     1
2000        10/01/2000  54850.29     2
1000        10/04/2000  54553.10     3
1000        09/29/2000  54500.22     4
2000        09/30/2000  49850.03     5
1000        09/28/2000  48850.40     6
2000        09/29/2000  48000.00     7
2000        10/03/2000  43200.18     8
2000        09/28/2000  41888.88     9
1000        10/01/2000  40200.43     10
2000        10/02/2000  36021.93     11
1000        09/30/2000  36000.07     12
2000        10/04/2000  32800.50     13
1000        10/02/2000  32800.50     13

Use the DESC keyword in the ORDER BY clause to rank in descending order.

RANK OVER and PARTITION BY

SELECT Product_ID, Sale_Date, Daily_Sales,
       RANK() OVER (PARTITION BY Product_ID
                    ORDER BY Daily_Sales DESC) AS Rank1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Rank1
__________  __________  ___________  _____
1000        10/03/2000  64300.00     1
1000        10/04/2000  54553.10     2
1000        09/29/2000  54500.22     3
1000        09/28/2000  48850.40     4
1000        10/01/2000  40200.43     5
1000        09/30/2000  36000.07     6
1000        10/02/2000  32800.50     7
2000        10/01/2000  54850.29     1
2000        09/30/2000  49850.03     2
2000        09/29/2000  48000.00     3
2000        10/03/2000  43200.18     4
2000        09/28/2000  41888.88     5
2000        10/02/2000  36021.93     6
2000        10/04/2000  32800.50     7

What does the PARTITION BY clause in the RANK OVER do? It resets the rank for each new Product_ID.

PERCENT_RANK OVER

SELECT Product_ID, Sale_Date, Daily_Sales,
       PERCENT_RANK() OVER (PARTITION BY PRODUCT_ID
                            ORDER BY Daily_Sales DESC) AS PercentRank1
FROM Sales_Table
WHERE Product_ID in (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  PercentRank1
__________  __________  ___________  ____________
1000        2000-10-03  64300.00     0
1000        2000-10-04  54553.10     0.17
1000        2000-09-29  54500.22     0.33
1000        2000-09-28  48850.40     0.5
1000        2000-10-01  40200.43     0.67
1000        2000-09-30  36000.07     0.83
1000        2000-10-02  32800.50     1
2000        2000-10-01  54850.29     0
2000        2000-09-30  49850.03     0.17
2000        2000-09-29  48000.00     0.33
2000        2000-10-03  43200.18     0.5
2000        2000-09-28  41888.88     0.67
2000        2000-10-02  36021.93     0.83
2000        2000-10-04  32800.50     1

7 Rows in Calculation for 1000 Product_ID
7 Rows in Calculation for 2000 Product_ID

We now have added a PARTITION BY clause which resets on Product_ID, so each calculation uses the 7 rows of one Product_ID.

PERCENT_RANK OVER with 14 rows in Calculation

SELECT Product_ID, Sale_Date, Daily_Sales,
       PERCENT_RANK() OVER (ORDER BY Daily_Sales DESC) AS PercentRank1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  PercentRank1
__________  __________  ___________  ____________
1000        2000-10-03  64300.00     0
2000        2000-10-01  54850.29     0.08
1000        2000-10-04  54553.10     0.15
1000        2000-09-29  54500.22     0.23
2000        2000-09-30  49850.03     0.31
1000        2000-09-28  48850.40     0.38
2000        2000-09-29  48000.00     0.46
2000        2000-10-03  43200.18     0.54
2000        2000-09-28  41888.88     0.62
1000        2000-10-01  40200.43     0.69
2000        2000-10-02  36021.93     0.77
1000        2000-09-30  36000.07     0.85
2000        2000-10-04  32800.50     0.92
1000        2000-10-02  32800.50     0.92

14 Rows in calculation for both the 1000 and 2000 Product_IDs

Percent_Rank is just like RANK, except that it gives you the rank as a fraction between 0 and 1 (0% to 100%). It is measured against all the other rows in the calculation: the row's rank minus 1, divided by the number of rows minus 1. That is why the two tied rows at the bottom both show 0.92 (12 divided by 13) instead of 1.
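The values in the PercentRank1 column are consistent with the usual definition of PERCENT_RANK, which is (rank - 1) / (rows - 1). The Python sketch below applies that formula to the 14 Daily_Sales values shown above and rounds to two decimals the way the answer set displays them. It is a check of the arithmetic, not database code, and the function name is our own.

```python
def percent_rank(values, descending=True):
    """Return (value, percent_rank) pairs using (rank - 1) / (rows - 1)."""
    ordered = sorted(values, reverse=descending)
    n = len(ordered)
    result = []
    for value in ordered:
        # The position of the first equal value is the RANK, so ties share it.
        rank = ordered.index(value) + 1
        pct = 0.0 if n == 1 else (rank - 1) / (n - 1)
        result.append((value, round(pct, 2)))
    return result


daily_sales = [
    64300.00, 54850.29, 54553.10, 54500.22, 49850.03, 48850.40, 48000.00,
    43200.18, 41888.88, 40200.43, 36021.93, 36000.07, 32800.50, 32800.50,
]
pct_14_rows = [p for _, p in percent_rank(daily_sales)]
pct_7_rows = [p for _, p in percent_rank(
    [64300.00, 54553.10, 54500.22, 48850.40, 40200.43, 36000.07, 32800.50])]
```

With 14 rows each step is 1/13, with 7 rows it is 1/6, and with all 21 rows of the table it would be 1/20 (0.05), which matches the three PERCENT_RANK answer sets.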

PERCENT_RANK OVER with 21 rows in Calculation

SELECT Product_ID, Sale_Date, Daily_Sales,
       PERCENT_RANK() OVER (ORDER BY Daily_Sales DESC) AS PercentRank1
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  PercentRank1
__________  __________  ___________  ____________
1000        2000-10-03  64300.00     0
3000        2000-09-28  61301.77     0.05
2000        2000-10-01  54850.29     0.1
1000        2000-10-04  54553.10     0.15
1000        2000-09-29  54500.22     0.2
2000        2000-09-30  49850.03     0.25
1000        2000-09-28  48850.40     0.3
2000        2000-09-29  48000.00     0.35
3000        2000-09-30  43868.86     0.4
2000        2000-10-03  43200.18     0.45
2000        2000-09-28  41888.88     0.5
1000        2000-10-01  40200.43     0.55
2000        2000-10-02  36021.93     0.6
1000        2000-09-30  36000.07     0.65

Not all rows are displayed

21 Rows in Calculation for all of the Product_IDs

With no WHERE clause and no PARTITION BY, all 21 rows of the table take part in one calculation, so each row moves the percent rank up by 0.05.

Quiz – What Causes the Product_ID to Reset?

SELECT Product_ID, Sale_Date, Daily_Sales,
       PERCENT_RANK() OVER (PARTITION BY PRODUCT_ID
                            ORDER BY Daily_Sales DESC) AS PercentRank1
FROM Sales_Table
WHERE Product_ID in (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  PercentRank1
__________  __________  ___________  ____________
1000        2000-10-03  64300.00     0
1000        2000-10-04  54553.10     0.17
1000        2000-09-29  54500.22     0.33
1000        2000-09-28  48850.40     0.5
1000        2000-10-01  40200.43     0.67
1000        2000-09-30  36000.07     0.83
1000        2000-10-02  32800.50     1
2000        2000-10-01  54850.29     0
2000        2000-09-30  49850.03     0.17
2000        2000-09-29  48000.00     0.33
2000        2000-10-03  43200.18     0.5
2000        2000-09-28  41888.88     0.67
2000        2000-10-02  36021.93     0.83
2000        2000-10-04  32800.50     1

What caused the Product_IDs to be sorted?

Answer: It was the PARTITION BY clause.

Finding Gaps between Dates

SELECT Product_Id, Sale_Date,
       MIN(Sale_Date) OVER (PARTITION BY Product_Id ORDER BY Sale_Date
                            ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
           AS Date_Of_Next_Row
      ,MIN(Sale_Date) OVER (PARTITION BY Product_Id ORDER BY Sale_Date
                            ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
           - Sale_Date AS Days_To_Next_Row
FROM Sales_Table
WHERE Product_ID Between 1000 and 2000 ;

Product_ID  Sale_Date   Date_Of_Next_Row  Days_To_Next_Row
__________  __________  ________________  ________________
1000        10/04/2000  ?                 ?
1000        10/03/2000  10/04/2000        1
1000        10/02/2000  10/03/2000        1
1000        10/01/2000  10/02/2000        1
1000        09/30/2000  10/01/2000        1
1000        09/29/2000  09/30/2000        1
1000        09/28/2000  09/29/2000        1
2000        10/04/2000  ?                 ?
2000        10/03/2000  10/04/2000        1
Not all rows are displayed

The above query finds gaps in dates. For each row, the window covers every later row of the same product, so the MIN of Sale_Date in that window is the next sale date. Subtracting the current Sale_Date gives the number of days to the next row. The last row of each product has no following rows, so it returns a null (shown as ?). Any value greater than 1 in Days_To_Next_Row marks a gap.
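The windowed MIN in this query can be pictured as "look at every later date and take the smallest". Here is a minimal Python sketch of that idea for a single product. It is an illustration only: the function name is invented, and the three dates are hypothetical sample data chosen so that one real gap appears.

```python
from datetime import date


def days_to_next(dates):
    """For each date, find the next later date and the number of days to it."""
    ordered = sorted(dates)
    result = []
    for i, current in enumerate(ordered):
        following = ordered[i + 1:]                   # 1 FOLLOWING to UNBOUNDED FOLLOWING
        nxt = min(following) if following else None   # MIN(Sale_Date) over that window
        gap = (nxt - current).days if nxt is not None else None
        result.append((current, nxt, gap))
    return result


# Hypothetical sale dates with September 30 and October 1 missing.
sale_dates = [date(2000, 9, 28), date(2000, 9, 29), date(2000, 10, 2)]
gaps = days_to_next(sale_dates)
```

The middle row reports 3 days to the next sale, which flags the gap, and the last row has no next date at all.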

CSUM – Rows Unbounded Preceding Explained

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS CsumAnsi
FROM Sales_Table
WHERE Product_ID BETWEEN 1000 and 2000 ;

Product_ID  Sale_Date   Daily_Sales  CsumAnsi
__________  __________  ___________  _________
2000        2000-09-28  41888.88     41888.88
1000        2000-09-28  48850.40     90739.28
2000        2000-09-29  48000.00     138739.28
1000        2000-09-29  54500.22     193239.50
1000        2000-09-30  36000.07     229239.57
2000        2000-09-30  49850.03     279089.60
1000        2000-10-01  40200.43     319290.03
2000        2000-10-01  54850.29     374140.32
1000        2000-10-02  32800.50     406940.82
2000        2000-10-02  36021.93     442962.75

Not all rows are displayed in this answer set

The keywords Rows Unbounded Preceding determine that this is a cumulative sum (CSUM). There are only a few different windowing options, and Rows Unbounded Preceding is the main one. It means that each row's total starts at the first row of the sorted data and includes every row up to and including the current row.

CSUM – Making Sense of the Data

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS SUMOVER
FROM Sales_Table
WHERE Product_ID BETWEEN 1000 and 2000 ;

Product_ID  Sale_Date   Daily_Sales  SUMOVER
__________  __________  ___________  _________
2000        2000-09-28  41888.88     41888.88
1000        2000-09-28  48850.40     90739.28
2000        2000-09-29  48000.00     138739.28
1000        2000-09-29  54500.22     193239.50
1000        2000-09-30  36000.07     229239.57
2000        2000-09-30  49850.03     279089.60
1000        2000-10-01  40200.43     319290.03
2000        2000-10-01  54850.29     374140.32
1000        2000-10-02  32800.50     406940.82
2000        2000-10-02  36021.93     442962.75
1000        2000-10-03  64300.00     507262.75

Not all rows are displayed in this answer set

The second “SUMOVER” row is 90739.28. That is derived by adding the first row’s Daily_Sales (41888.88) to the SECOND row’s Daily_Sales (48850.40).

CSUM – Making Even More Sense of the Data

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS SUMOVER
FROM Sales_Table
WHERE Product_ID BETWEEN 1000 and 2000 ;

The query and the answer set are the same as in the previous example.

The third “SUMOVER” row is 138739.28. That is derived by taking the first row’s Daily_Sales (41888.88) and adding it to the SECOND row’s Daily_Sales (48850.40). Then, you add that total to the THIRD row’s Daily_Sales (48000.00).
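The row-by-row arithmetic described above is a running total. As a quick way to check the figures, the Python sketch below accumulates the first four Daily_Sales values in the order the answer set lists them. Decimal is used so the cents add up exactly; this is an illustration of the arithmetic, not database code.

```python
from decimal import Decimal
from itertools import accumulate

# Daily_Sales for the first four rows of the answer set, in sorted order.
daily_sales = [Decimal(s) for s in ("41888.88", "48850.40", "48000.00", "54500.22")]

# accumulate yields the sum of everything up to and including the current row,
# which is what ROWS UNBOUNDED PRECEDING asks for.
running_totals = list(accumulate(daily_sales))
```

The second entry is 41888.88 + 48850.40 = 90739.28 and the third adds 48000.00 to reach 138739.28, as in the SUMOVER column.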

CSUM – The Major and Minor Sort Key(s)

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS SumOVER
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  SumOVER
__________  __________  ___________  _________
1000        2000-09-28  48850.40     48850.40
1000        2000-09-29  54500.22     103350.62
1000        2000-09-30  36000.07     139350.69
1000        2000-10-01  40200.43     179551.12
1000        2000-10-02  32800.50     212351.62
1000        2000-10-03  64300.00     276651.62
1000        2000-10-04  54553.10     331204.72
2000        2000-09-28  41888.88     373093.60
2000        2000-09-29  48000.00     421093.60
2000        2000-09-30  49850.03     470943.63
2000        2000-10-01  54850.29     525793.92

Not all rows are displayed in this answer set

You can have more than one SORT KEY. In the query above, Product_ID is the MAJOR Sort and Sale_Date is the MINOR Sort. Remember, the data is sorted first and then the cumulative sum is calculated. That is why they are called Ordered Analytics.

The ANSI CSUM – Getting a Sequential Number

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) as SUMOVER,
       SUM(1) OVER (ORDER BY Product_ID, Sale_Date
                    ROWS UNBOUNDED PRECEDING) AS Seq_Number
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  SUMOVER    Seq_Number
__________  __________  ___________  _________  __________
1000        2000-09-28  48850.40     48850.40   1
1000        2000-09-29  54500.22     103350.62  2
1000        2000-09-30  36000.07     139350.69  3
1000        2000-10-01  40200.43     179551.12  4
1000        2000-10-02  32800.50     212351.62  5
1000        2000-10-03  64300.00     276651.62  6
1000        2000-10-04  54553.10     331204.72  7
2000        2000-09-28  41888.88     373093.60  8
2000        2000-09-29  48000.00     421093.60  9
2000        2000-09-30  49850.03     470943.63  10

Because the number 1 was placed where the column to be summed normally goes, the cumulative sum adds 1 for each row. That makes “Seq_Number” a sequential number.

Troubleshooting the ANSI OLAP on a GROUP BY

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS AnsiCsum
FROM Sales_Table
GROUP BY Product_ID ;

Error! Why?

The query groups by Product_ID, but Sale_Date and Daily_Sales are selected without being aggregated or listed in the GROUP BY. Do not use GROUP BY to reset a SUM OVER or any other ANSI syntax OLAP command. If you want to reset, use a PARTITION BY clause, never a GROUP BY.

Reset with a PARTITION BY Statement

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS SumANSI
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  SumANSI
__________  __________  ___________  _________
1000        2000-09-28  48850.40     48850.40
1000        2000-09-29  54500.22     103350.62
1000        2000-09-30  36000.07     139350.69
1000        2000-10-01  40200.43     179551.12
1000        2000-10-02  32800.50     212351.62
1000        2000-10-03  64300.00     276651.62
1000        2000-10-04  54553.10     331204.72
2000        2000-09-28  41888.88     41888.88
2000        2000-09-29  48000.00     89888.88
2000        2000-09-30  49850.03     139738.91

Not all rows are displayed in this answer set

CSUM Resets on Product_ID break

The PARTITION BY clause is how you reset in ANSI. It causes SumANSI to start over (reset) its calculation for each NEW Product_ID.

PARTITION BY only Resets a Single OLAP not ALL of them

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS Subtotal,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS GrandTotal
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  Subtotal   GrandTotal
__________  __________  ___________  _________  __________
1000        2000-09-28  48850.40     48850.40   48850.40
1000        2000-09-29  54500.22     103350.62  103350.62
1000        2000-09-30  36000.07     139350.69  139350.69
1000        2000-10-01  40200.43     179551.12  179551.12
1000        2000-10-02  32800.50     212351.62  212351.62
1000        2000-10-03  64300.00     276651.62  276651.62
1000        2000-10-04  54553.10     331204.72  331204.72
2000        2000-09-28  41888.88     41888.88   373093.60
2000        2000-09-29  48000.00     89888.88   421093.60
2000        2000-09-30  49850.03     139738.91  470943.63

Not all rows are displayed in this answer set

Above are two OLAP statements. Only one has PARTITION BY, so only it resets. The other continuously does a CSUM.
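The two columns behave like two independent accumulators walking the same sorted rows: one is set back to zero at every Product_ID break, the other never is. The Python sketch below makes that explicit using the ten rows shown in the answer set. It is a hand-written illustration with a made-up function name, not what the database does internally.

```python
from decimal import Decimal

# (Product_ID, Sale_Date, Daily_Sales) for the ten rows in the answer set.
rows = [
    (1000, "2000-09-28", Decimal("48850.40")), (1000, "2000-09-29", Decimal("54500.22")),
    (1000, "2000-09-30", Decimal("36000.07")), (1000, "2000-10-01", Decimal("40200.43")),
    (1000, "2000-10-02", Decimal("32800.50")), (1000, "2000-10-03", Decimal("64300.00")),
    (1000, "2000-10-04", Decimal("54553.10")), (2000, "2000-09-28", Decimal("41888.88")),
    (2000, "2000-09-29", Decimal("48000.00")), (2000, "2000-09-30", Decimal("49850.03")),
]


def subtotal_and_grand_total(rows):
    """Return (Product_ID, Sale_Date, Subtotal, GrandTotal) for each row."""
    out = []
    grand = Decimal("0")
    subtotal = Decimal("0")
    current = None
    for product_id, sale_date, amount in sorted(rows):
        if product_id != current:
            # PARTITION BY Product_ID: only the subtotal starts over.
            current, subtotal = product_id, Decimal("0")
        subtotal += amount
        grand += amount   # no partition here, so this one never resets
        out.append((product_id, sale_date, subtotal, grand))
    return out


totals = subtotal_and_grand_total(rows)
```

At the first row of product 2000 the Subtotal drops back to 41888.88 while the GrandTotal carries on to 373093.60.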

CURRENT ROW AND UNBOUNDED FOLLOWING

SELECT Product_ID, Sale_Date
      ,Daily_Sales
      ,SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
           AS CumulativeTotal
FROM Sales_Table
ORDER BY CumulativeTotal

Product_ID  Sale_Date   Daily_Sales  CumulativeTotal
__________  __________  ___________  _______________
3000        10/04/2000  15675.33     15675.33
3000        10/03/2000  21553.79     37229.12
3000        10/02/2000  19678.94     56908.06
3000        10/01/2000  28000.00     84908.06
3000        09/30/2000  43868.86     128776.92
3000        09/29/2000  34509.13     163286.05
3000        09/28/2000  61301.77     224587.82
2000        10/04/2000  32800.50     257388.32
2000        10/03/2000  43200.18     300588.50
2000        10/02/2000  36021.93     336610.43
2000        10/01/2000  54850.29     391460.72

Not all rows are displayed in this answer set

Above we used ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING to produce a CSUM that runs from the current row to the last row of the sorted data. The last row (Product_ID 3000 on the latest date) therefore has the smallest total. Because the answer set is ordered by CumulativeTotal, the Product_ID and the Sale_Date appear reversed: we see the Product_ID of 3000 first and the latest date first.
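A total from the current row to the end is a cumulative sum taken from the bottom up. The Python sketch below shows this for the seven rows of Product_ID 3000, which sort last in the query and so depend only on each other. It is an illustration of the arithmetic, not database code.

```python
from decimal import Decimal
from itertools import accumulate

# Daily_Sales for Product_ID 3000 in Sale_Date order, 09/28 through 10/04.
daily_sales = [Decimal(s) for s in (
    "61301.77", "34509.13", "43868.86", "28000.00",
    "19678.94", "21553.79", "15675.33",
)]

# Accumulate from the last row backwards, then flip the result back into
# Sale_Date order: each entry is the sum of its own row and all rows after it.
remaining_totals = list(accumulate(reversed(daily_sales)))[::-1]
```

The first date carries the largest total and the last date carries only its own Daily_Sales, so sorting by the total puts the latest date on top.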

Different Windowing Options

SELECT Product_ID, Sale_Date, Daily_Sales
      ,SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
           as Row_Preceding
      ,SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_Id, Sale_Date
                              ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING)
           as Row_Following
FROM Sales_Table

Product_ID  Sale_Date   Daily_Sales  Row_Preceding  Row_Following
__________  __________  ___________  _____________  _____________
1000        09/28/2000  48850.40     48850.40       103350.62
1000        09/29/2000  54500.22     103350.62      90500.29
1000        09/30/2000  36000.07     90500.29       76200.50
1000        10/01/2000  40200.43     76200.50       73000.93
1000        10/02/2000  32800.50     73000.93       97100.50
1000        10/03/2000  64300.00     97100.50       118853.10
1000        10/04/2000  54553.10     118853.10      54553.10
2000        09/28/2000  41888.88     41888.88       89888.88
2000        09/29/2000  48000.00     89888.88       97850.03

Not all rows are displayed in this answer set

The example above uses ROWS BETWEEN 1 PRECEDING AND CURRENT ROW in one column and ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING in the other. Row_Preceding adds the current row to the row before it, and Row_Following adds the current row to the row after it. Because of the PARTITION BY, the first row of a product has no preceding row and the last row has no following row, so those totals are just the row's own Daily_Sales.

Moving Sum has a Moving Window

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 Preceding) AS Sum3_ANSI
FROM Sales_Table ;

Calculate the Current Row and 2 rows preceding: a Moving Window of 3 rows

Product_ID  Sale_Date   Daily_Sales  Sum3_ANSI
__________  __________  ___________  _________
1000        2000-09-28  48850.40     48850.40
1000        2000-09-29  54500.22     103350.62
1000        2000-09-30  36000.07     139350.69
1000        2000-10-01  40200.43     130700.72
1000        2000-10-02  32800.50     109001.00
1000        2000-10-03  64300.00     137300.93
1000        2000-10-04  54553.10     151653.60
2000        2000-09-28  41888.88     160741.98
2000        2000-09-29  48000.00     144441.98

Not all rows are displayed in this answer set

The SUM() OVER allows you to get the moving SUM of a certain column. The moving window in ANSI form always includes the current row. A Rows 2 Preceding statement means the current row and two preceding, which is a moving window of 3.

How ANSI Moving SUM Handles the Sort

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 Preceding) AS Sum3_ANSI
FROM Sales_Table ;

Major and Minor Sort keys: Product_ID, Sale_Date

Product_ID  Sale_Date   Daily_Sales  Sum3_ANSI
__________  __________  ___________  _________
1000        2000-09-28  48850.40     48850.40
1000        2000-09-29  54500.22     103350.62
1000        2000-09-30  36000.07     139350.69
1000        2000-10-01  40200.43     130700.72
1000        2000-10-02  32800.50     109001.00
1000        2000-10-03  64300.00     137300.93
1000        2000-10-04  54553.10     151653.60
2000        2000-09-28  41888.88     160741.98
2000        2000-09-29  48000.00     144441.98
2000        2000-09-30  49850.03     139738.91
2000        2000-10-01  54850.29     152700.32
2000        2000-10-02  36021.93     140722.25

Not all rows are displayed in this answer set

In the SUM OVER, the sort keys are listed after the ORDER BY keywords.


Quiz – How is that Total Calculated?

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 PRECEDING) AS Sum3_ANSI
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  Sum3_ANSI
__________  __________  ___________  _________
1000        2000-09-28     48850.40   48850.40
1000        2000-09-29     54500.22  103350.62
1000        2000-09-30     36000.07  139350.69
1000        2000-10-01     40200.43  130700.72
1000        2000-10-02     32800.50  109001.00
1000        2000-10-03     64300.00  137300.93
1000        2000-10-04     54553.10  151653.60
2000        2000-09-28     41888.88  160741.98
2000        2000-09-29     48000.00  144441.98
2000        2000-09-30     49850.03  139738.91
2000        2000-10-01     54850.29  152700.32
2000        2000-10-02     36021.93  140722.25

Not all rows are displayed in this answer set.

With a Moving Window of 3, how is the 139350.69 amount derived in the Sum3_ANSI column in the third row?


Answer to Quiz – How is that Total Calculated?

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 PRECEDING) AS Sum3_ANSI
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  Sum3_ANSI
__________  __________  ___________  _________
1000        2000-09-28     48850.40   48850.40
1000        2000-09-29     54500.22  103350.62
1000        2000-09-30     36000.07  139350.69
1000        2000-10-01     40200.43  130700.72
1000        2000-10-02     32800.50  109001.00
1000        2000-10-03     64300.00  137300.93
1000        2000-10-04     54553.10  151653.60
2000        2000-09-28     41888.88  160741.98
2000        2000-09-29     48000.00  144441.98
2000        2000-09-30     49850.03  139738.91
2000        2000-10-01     54850.29  152700.32
2000        2000-10-02     36021.93  140722.25

Not all rows are displayed in this answer set.

With a Moving Window of 3, how is the 139350.69 amount derived in the Sum3_ANSI column in the third row? It is the sum of 48850.40, 54500.22 and 36000.07: the current row of Daily_Sales plus the previous two rows of Daily_Sales.


Moving SUM every 3-rows vs a Continuous Sum

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 PRECEDING) AS SUM3,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS Continuous
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  SUM3       Continuous
__________  __________  ___________  _________  __________
1000        2000-09-28     48850.40   48850.40    48850.40
1000        2000-09-29     54500.22  103350.62   103350.62
1000        2000-09-30     36000.07  139350.69   139350.69
1000        2000-10-01     40200.43  130700.72   179551.12
1000        2000-10-02     32800.50  109001.00   212351.62
1000        2000-10-03     64300.00  137300.93   276651.62
1000        2000-10-04     54553.10  151653.60   331204.72
2000        2000-09-28     41888.88  160741.98   373093.60
2000        2000-09-29     48000.00  144441.98   421093.60

Not all rows are displayed in this answer set.

The ROWS 2 PRECEDING gives the moving sum (MSUM) over every 3 rows. The ROWS UNBOUNDED PRECEDING gives the continuous (cumulative) sum.


PARTITION BY Resets an ANSI OLAP

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 PRECEDING) AS SUM3,
       SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS Continuous
FROM Sales_Table;

The PARTITION BY is the ANSI reset, much like a GROUP BY.

Product_ID  Sale_Date   Daily_Sales  SUM3       Continuous
__________  __________  ___________  _________  __________
1000        2000-09-28     48850.40   48850.40    48850.40
1000        2000-09-29     54500.22  103350.62   103350.62
1000        2000-09-30     36000.07  139350.69   139350.69
1000        2000-10-01     40200.43  130700.72   179551.12
1000        2000-10-02     32800.50  109001.00   212351.62
1000        2000-10-03     64300.00  137300.93   276651.62
1000        2000-10-04     54553.10  151653.60   331204.72
2000        2000-09-28     41888.88  160741.98    41888.88
2000        2000-09-29     48000.00  144441.98    89888.88

Not all rows are displayed.

Use a PARTITION BY statement to reset the ANSI OLAP. Notice it only resets the OLAP command containing the PARTITION BY statement, but not the other OLAPs.
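To see the reset on its own, here is a minimal sketch that runs the same cumulative SUM with and without PARTITION BY. Sample_Sales is a hypothetical inline row set holding two rows for each Product_ID, added here only so the query is self-contained. Because PARTITION BY Product_ID already separates the products, the sketch leaves Product_ID out of the partitioned ORDER BY, which should not change the result.

```sql
-- Sample_Sales is a hypothetical inline stand-in for four rows of Sales_Table.
WITH Sample_Sales AS (
    SELECT 1000 AS Product_ID, DATE '2000-09-28' AS Sale_Date, 48850.40 AS Daily_Sales
    UNION ALL SELECT 1000, DATE '2000-09-29', 54500.22
    UNION ALL SELECT 2000, DATE '2000-09-28', 41888.88
    UNION ALL SELECT 2000, DATE '2000-09-29', 48000.00
)
SELECT Product_ID, Sale_Date, Daily_Sales,
       -- No PARTITION BY: the total keeps growing across products.
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS Running_All,
       -- PARTITION BY: the total starts over at each new Product_ID.
       SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS Running_Per_Product
FROM Sample_Sales
ORDER BY Product_ID, Sale_Date;
```

For Product_ID 2000, Running_Per_Product should restart at 41888.88 and then reach 89888.88, the same values the Continuous column shows above.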


The Moving Window is Current Row and Preceding

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 PRECEDING) AS AVG_3_ANSI
FROM Sales_Table;

ROWS 2 PRECEDING: calculate the current row and the 2 rows preceding (a moving window of 3 rows).

Product_ID  Sale_Date   Daily_Sales  AVG_3_ANSI
__________  __________  ___________  __________
1000        2000-09-28     48850.40    48850.40
1000        2000-09-29     54500.22    51675.31
1000        2000-09-30     36000.07    46450.23
1000        2000-10-01     40200.43    43566.91
1000        2000-10-02     32800.50    36333.67
1000        2000-10-03     64300.00    45766.98
1000        2000-10-04     54553.10    50551.20
2000        2000-09-28     41888.88    53580.66
2000        2000-09-29     48000.00    48147.33
2000        2000-09-30     49850.03    46579.64

Not all rows are displayed.

The AVG() OVER allows you to get the moving average of a certain column. The ROWS 2 PRECEDING is a moving window of 3 in ANSI.


How Moving Average Handles the Sort

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 PRECEDING) AS AVG_3_ANSI
FROM Sales_Table;

Product_ID and Sale_Date are the major and minor sort keys.

Product_ID  Sale_Date   Daily_Sales  AVG_3_ANSI
__________  __________  ___________  __________
1000        2000-09-28     48850.40    48850.40
1000        2000-09-29     54500.22    51675.31
1000        2000-09-30     36000.07    46450.23
1000        2000-10-01     40200.43    43566.91
1000        2000-10-02     32800.50    36333.67
1000        2000-10-03     64300.00    45766.98
1000        2000-10-04     54553.10    50551.20
2000        2000-09-28     41888.88    53580.66
2000        2000-09-29     48000.00    48147.33
2000        2000-09-30     49850.03    46579.64
2000        2000-10-01     54850.29    50900.11
2000        2000-10-02     36021.93    46907.42

Not all rows are displayed.

Much like the SUM OVER command, the AVG OVER takes its sort keys from the ORDER BY keywords.


Moving Average

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 PRECEDING) AS AVG_3
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  AVG_3
__________  __________  ___________  ________
1000        2000-09-28     48850.40  48850.4
1000        2000-09-29     54500.22  51675.31
1000        2000-09-30     36000.07  46450.23
1000        2000-10-01     40200.43  43566.91
1000        2000-10-02     32800.50  36333.67
1000        2000-10-03     64300.00  45766.98
1000        2000-10-04     54553.10  50551.2
2000        2000-09-28     41888.88  53580.66
2000        2000-09-29     48000.00  48147.33
2000        2000-09-30     49850.03  46579.64

Not all rows are displayed.

Understand that in ANSI a ROWS 2 PRECEDING is considered a Moving Window of 3. That is because in ANSI it is considered the current row and the 2 preceding rows. The next example uses the CAST command to round the average to 0 decimal places.


Moving Average

SELECT Product_ID, Sale_Date, Daily_Sales,
       CAST(AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                                   ROWS 2 PRECEDING) AS DECIMAL(8,0)) AS AVG_3
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  AVG_3
__________  __________  ___________  _____
1000        2000-09-28     48850.40  48850
1000        2000-09-29     54500.22  51675
1000        2000-09-30     36000.07  46450
1000        2000-10-01     40200.43  43567
1000        2000-10-02     32800.50  36334
1000        2000-10-03     64300.00  45767
1000        2000-10-04     54553.10  50551
2000        2000-09-28     41888.88  53581
2000        2000-09-29     48000.00  48147
2000        2000-09-30     49850.03  46580

Not all rows are displayed.

Understand that in ANSI a ROWS 2 PRECEDING is considered a Moving Window of 3. That is because in ANSI it is considered the current row and the 2 preceding rows.


Quiz – How is that Total Calculated?

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 PRECEDING) AS AVG_3_ANSI
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  AVG_3_ANSI
__________  __________  ___________  __________
1000        2000-09-28     48850.40    48850.40
1000        2000-09-29     54500.22    51675.31
1000        2000-09-30     36000.07    46450.23
1000        2000-10-01     40200.43    43566.91
1000        2000-10-02     32800.50    36333.67
1000        2000-10-03     64300.00    45766.98
1000        2000-10-04     54553.10    50551.20
2000        2000-09-28     41888.88    53580.66
2000        2000-09-29     48000.00    48147.33
2000        2000-09-30     49850.03    46579.64
2000        2000-10-01     54850.29    50900.11
2000        2000-10-02     36021.93    46907.42

Not all rows are displayed.

With a Moving Window of 3, how is the 46450.23 amount derived in the AVG_3_ANSI column in the third row?


Answer to Quiz – How is that Total Calculated?

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 PRECEDING) AS AVG_3_ANSI
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  AVG_3_ANSI
__________  __________  ___________  __________
1000        2000-09-28     48850.40    48850.40
1000        2000-09-29     54500.22    51675.31
1000        2000-09-30     36000.07    46450.23
1000        2000-10-01     40200.43    43566.91
1000        2000-10-02     32800.50    36333.67
1000        2000-10-03     64300.00    45766.98
1000        2000-10-04     54553.10    50551.20
2000        2000-09-28     41888.88    53580.66
2000        2000-09-29     48000.00    48147.33
2000        2000-09-30     49850.03    46579.64
2000        2000-10-01     54850.29    50900.11
2000        2000-10-02     36021.93    46907.42

Not all rows are displayed.

With a Moving Window of 3, the 46450.23 amount in the third row is the average of 48850.40, 54500.22 and 36000.07.


Quiz – How is that 4th Row Calculated?

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 PRECEDING) AS AVG_3_ANSI
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  AVG_3_ANSI
__________  __________  ___________  __________
1000        2000-09-28     48850.40    48850.40
1000        2000-09-29     54500.22    51675.31
1000        2000-09-30     36000.07    46450.23
1000        2000-10-01     40200.43    43566.91
1000        2000-10-02     32800.50    36333.67
1000        2000-10-03     64300.00    45766.98
1000        2000-10-04     54553.10    50551.20
2000        2000-09-28     41888.88    53580.66
2000        2000-09-29     48000.00    48147.33
2000        2000-09-30     49850.03    46579.64
2000        2000-10-01     54850.29    50900.11
2000        2000-10-02     36021.93    46907.42

Not all rows are displayed.

With a Moving Window of 3, how is the 43566.91 amount derived in the AVG_3_ANSI column in the fourth row?


Answer to Quiz – How is that 4th Row Calculated?

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 PRECEDING) AS AVG_3_ANSI
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  AVG_3_ANSI
__________  __________  ___________  __________
1000        2000-09-28     48850.40    48850.40
1000        2000-09-29     54500.22    51675.31
1000        2000-09-30     36000.07    46450.23
1000        2000-10-01     40200.43    43566.91
1000        2000-10-02     32800.50    36333.67
1000        2000-10-03     64300.00    45766.98
1000        2000-10-04     54553.10    50551.20
2000        2000-09-28     41888.88    53580.66
2000        2000-09-29     48000.00    48147.33
2000        2000-09-30     49850.03    46579.64
2000        2000-10-01     54850.29    50900.11
2000        2000-10-02     36021.93    46907.42

Not all rows are displayed.

With a Moving Window of 3, how is the 43566.91 amount derived in the AVG_3_ANSI column in the fourth row? It is the average of 54500.22, 36000.07 and 40200.43 (130700.72 divided by 3): the current row plus the 2 preceding rows.


Moving Average every 3-rows vs a Continuous Average

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 PRECEDING) AS AVG3,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS Continuous
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  AVG3      Continuous
__________  __________  ___________  ________  __________
1000        2000-09-28     48850.40  48850.40    48850.40
1000        2000-09-29     54500.22  51675.31    51675.31
1000        2000-09-30     36000.07  46450.23    46450.23
1000        2000-10-01     40200.43  43566.91    44887.78
1000        2000-10-02     32800.50  36333.67    42470.32
1000        2000-10-03     64300.00  45766.98    46108.60
1000        2000-10-04     54553.10  50551.20    47314.96
2000        2000-09-28     41888.88  53580.66    46636.70
2000        2000-09-29     48000.00  48147.33    46788.18

Not all rows are displayed.

The ROWS 2 PRECEDING gives the moving average (MAVG) over every 3 rows. The ROWS UNBOUNDED PRECEDING gives the continuous (cumulative) average.


PARTITION BY Resets an ANSI OLAP

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 PRECEDING) AS AVG3,
       AVG(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS Continuous
FROM Sales_Table;

The PARTITION BY is the ANSI reset, much like a GROUP BY.

Product_ID  Sale_Date   Daily_Sales  AVG3      Continuous
__________  __________  ___________  ________  __________
1000        2000-09-28     48850.40  48850.40    48850.40
1000        2000-09-29     54500.22  51675.31    51675.31
1000        2000-09-30     36000.07  46450.23    46450.23
1000        2000-10-01     40200.43  43566.91    44887.78
1000        2000-10-02     32800.50  36333.67    42470.32
1000        2000-10-03     64300.00  45766.98    46108.60
1000        2000-10-04     54553.10  50551.20    47314.96
2000        2000-09-28     41888.88  53580.66    41888.88
2000        2000-09-29     48000.00  48147.33    44944.44

Not all rows are displayed.

Use a PARTITION BY statement to reset the ANSI OLAP. The PARTITION BY statement only resets the column using the statement. Notice that only Continuous resets.


Moving Difference using ANSI Syntax

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - SUM(Daily_Sales) OVER (ORDER BY Product_ID ASC, Sale_Date ASC
                         ROWS BETWEEN 4 PRECEDING AND 4 PRECEDING) AS "MDiff_ANSI"
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  MDiff_ANSI
__________  __________  ___________  __________
1000        2000-09-28     48850.40           ?
1000        2000-09-29     54500.22           ?
1000        2000-09-30     36000.07           ?
1000        2000-10-01     40200.43           ?
1000        2000-10-02     32800.50   -16049.90
1000        2000-10-03     64300.00     9799.78
1000        2000-10-04     54553.10    18553.03
2000        2000-09-28     41888.88     1688.45
2000        2000-09-29     48000.00    15199.50
2000        2000-09-30     49850.03   -14449.97
2000        2000-10-01     54850.29      297.19

Not all rows are displayed.

This is how you do an MDiff using the ANSI syntax with a width of 4. The window ROWS BETWEEN 4 PRECEDING AND 4 PRECEDING holds exactly one row, the row 4 rows back, so each Daily_Sales is compared with the Daily_Sales from 4 rows earlier. The first four rows have no such row, so they return a null (?).
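The same difference can also be written with the LAG analytic function, which reads a value from a fixed number of rows back. This is an alternative sketch, not the book's method. Sample_Sales is a hypothetical inline copy of the seven Product_ID 1000 rows, used only to keep the query self-contained. The two columns should agree row for row.

```sql
-- Sample_Sales is a hypothetical inline copy of the seven Product_ID 1000 rows.
WITH Sample_Sales AS (
    SELECT 1000 AS Product_ID, DATE '2000-09-28' AS Sale_Date, 48850.40 AS Daily_Sales
    UNION ALL SELECT 1000, DATE '2000-09-29', 54500.22
    UNION ALL SELECT 1000, DATE '2000-09-30', 36000.07
    UNION ALL SELECT 1000, DATE '2000-10-01', 40200.43
    UNION ALL SELECT 1000, DATE '2000-10-02', 32800.50
    UNION ALL SELECT 1000, DATE '2000-10-03', 64300.00
    UNION ALL SELECT 1000, DATE '2000-10-04', 54553.10
)
SELECT Product_ID, Sale_Date, Daily_Sales,
       -- One-row window: only the row that is 4 rows back.
       Daily_Sales - SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                         ROWS BETWEEN 4 PRECEDING AND 4 PRECEDING) AS MDiff_ANSI,
       -- LAG with an offset of 4 reads the same row directly.
       Daily_Sales - LAG(Daily_Sales, 4) OVER (ORDER BY Product_ID, Sale_Date) AS MDiff_Lag
FROM Sample_Sales
ORDER BY Product_ID, Sale_Date;
```

The first four rows should be null in both columns, and the last three should show -16049.90, 9799.78 and 18553.03, as in the answer set above.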


Moving Difference using ANSI Syntax with Partition By

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                         ORDER BY Product_ID ASC, Sale_Date ASC
                         ROWS BETWEEN 4 PRECEDING AND 4 PRECEDING) AS "MDiff_ANSI"
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  MDiff_ANSI
__________  __________  ___________  __________
1000        2000-09-28     48850.40           ?
1000        2000-09-29     54500.22           ?
1000        2000-09-30     36000.07           ?
1000        2000-10-01     40200.43           ?
1000        2000-10-02     32800.50   -16049.90
1000        2000-10-03     64300.00     9799.78
1000        2000-10-04     54553.10    18553.03
2000        2000-09-28     41888.88           ?
2000        2000-09-29     48000.00           ?
2000        2000-09-30     49850.03           ?
2000        2000-10-01     54850.29           ?
2000        2000-10-02     36021.93    -5866.95

Not all rows are displayed.

Wow! This is how you do an MDiff using the ANSI syntax with a width of 4 and with a PARTITION BY statement. Because the calculation resets at each Product_ID, the first four rows of every product return a null (?).


COUNT OVER for a Sequential Number

SELECT Product_ID, Sale_Date, Daily_Sales,
       COUNT(*) OVER (ORDER BY Product_ID, Sale_Date
                      ROWS UNBOUNDED PRECEDING) AS Seq_Number
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  Seq_Number
__________  __________  ___________  __________
1000        2000-09-28     48850.40           1
1000        2000-09-29     54500.22           2
1000        2000-09-30     36000.07           3
1000        2000-10-01     40200.43           4
1000        2000-10-02     32800.50           5
1000        2000-10-03     64300.00           6
1000        2000-10-04     54553.10           7
2000        2000-09-28     41888.88           8
2000        2000-09-29     48000.00           9
2000        2000-09-30     49850.03          10
2000        2000-10-01     54850.29          11

Not all rows are displayed.

This is the COUNT OVER. It will provide a sequential number starting at 1. The keywords ROWS UNBOUNDED PRECEDING cause Seq_Number to start at the beginning and increase sequentially to the end.
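A sequential number can also be produced with ROW_NUMBER(), which needs no ROWS clause at all. The sketch below puts the two side by side. Sample_Sales is a hypothetical inline row set (two rows per product) added only so the query runs by itself.

```sql
-- Sample_Sales is a hypothetical inline stand-in for four rows of Sales_Table.
WITH Sample_Sales AS (
    SELECT 1000 AS Product_ID, DATE '2000-09-28' AS Sale_Date, 48850.40 AS Daily_Sales
    UNION ALL SELECT 1000, DATE '2000-09-29', 54500.22
    UNION ALL SELECT 2000, DATE '2000-09-28', 41888.88
    UNION ALL SELECT 2000, DATE '2000-09-29', 48000.00
)
SELECT Product_ID, Sale_Date, Daily_Sales,
       -- Counts every row from the start of the sort up to the current row.
       COUNT(*) OVER (ORDER BY Product_ID, Sale_Date
                      ROWS UNBOUNDED PRECEDING) AS Seq_Count,
       -- Numbers the rows 1, 2, 3, ... in the same sort order.
       ROW_NUMBER() OVER (ORDER BY Product_ID, Sale_Date) AS Seq_RowNum
FROM Sample_Sales
ORDER BY Product_ID, Sale_Date;
```

Both columns should read 1 through 4. Adding PARTITION BY Product_ID to either OVER clause would make that column start over at 1 for each product.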


COUNT OVER without Rows Unbounded Preceding

SELECT Product_ID, Sale_Date, Daily_Sales,
       COUNT(*) OVER (ORDER BY Product_ID, Sale_Date) AS No_Seq
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  No_Seq
__________  __________  ___________  ______
1000        2000-09-28     48850.40       1
1000        2000-09-29     54500.22       2
1000        2000-09-30     36000.07       3
1000        2000-10-01     40200.43       4
1000        2000-10-02     32800.50       5
1000        2000-10-03     64300.00       6
1000        2000-10-04     54553.10       7
2000        2000-09-28     41888.88       8
2000        2000-09-29     48000.00       9
2000        2000-09-30     49850.03      10
2000        2000-10-01     54850.29      11
2000        2000-10-02     36021.93      12
2000        2000-10-03     43200.18      13
2000        2000-10-04     32800.50      14

14 rows came back.

Even without ROWS UNBOUNDED PRECEDING, this still works just fine.


Quiz – What caused the COUNT OVER to Reset?

SELECT Product_ID, Sale_Date, Daily_Sales,
       COUNT(*) OVER (PARTITION BY Product_ID
                      ORDER BY Product_ID, Sale_Date
                      ROWS UNBOUNDED PRECEDING) AS StartOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  StartOver
__________  __________  ___________  _________
1000        2000-09-28     48850.40          1
1000        2000-09-29     54500.22          2
1000        2000-09-30     36000.07          3
1000        2000-10-01     40200.43          4
1000        2000-10-02     32800.50          5
1000        2000-10-03     64300.00          6
1000        2000-10-04     54553.10          7
2000        2000-09-28     41888.88          1
2000        2000-09-29     48000.00          2
2000        2000-09-30     49850.03          3
2000        2000-10-01     54850.29          4
2000        2000-10-02     36021.93          5
2000        2000-10-03     43200.18          6
2000        2000-10-04     32800.50          7

What keyword(s) caused StartOver to reset?


Answer to Quiz – What caused the COUNT OVER to Reset?

SELECT Product_ID, Sale_Date, Daily_Sales,
       COUNT(*) OVER (PARTITION BY Product_ID
                      ORDER BY Product_ID, Sale_Date
                      ROWS UNBOUNDED PRECEDING) AS StartOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  StartOver
__________  __________  ___________  _________
1000        2000-09-28     48850.40          1
1000        2000-09-29     54500.22          2
1000        2000-09-30     36000.07          3
1000        2000-10-01     40200.43          4
1000        2000-10-02     32800.50          5
1000        2000-10-03     64300.00          6
1000        2000-10-04     54553.10          7
2000        2000-09-28     41888.88          1
2000        2000-09-29     48000.00          2
2000        2000-09-30     49850.03          3
2000        2000-10-01     54850.29          4
2000        2000-10-02     36021.93          5
2000        2000-10-03     43200.18          6
2000        2000-10-04     32800.50          7

What keyword(s) caused StartOver to reset? It is the PARTITION BY statement.


The MAX OVER Command

SELECT Product_ID, Sale_Date, Daily_Sales,
       MAX(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS MaxOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  MaxOver
__________  __________  ___________  ________
1000        2000-09-28     48850.40  48850.40
1000        2000-09-29     54500.22  54500.22
1000        2000-09-30     36000.07  54500.22
1000        2000-10-01     40200.43  54500.22
1000        2000-10-02     32800.50  54500.22
1000        2000-10-03     64300.00  64300.00
1000        2000-10-04     54553.10  64300.00
2000        2000-09-28     41888.88  64300.00
2000        2000-09-29     48000.00  64300.00
2000        2000-09-30     49850.03  64300.00
2000        2000-10-01     54850.29  64300.00
2000        2000-10-02     36021.93  64300.00
2000        2000-10-03     43200.18  64300.00
2000        2000-10-04     32800.50  64300.00

After the sort, the MAX OVER shows the maximum value up to that point.


MAX OVER with PARTITION BY Reset

SELECT Product_ID, Sale_Date, Daily_Sales,
       MAX(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS MaxOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  MaxOver
__________  __________  ___________  ________
1000        2000-09-28     48850.40  48850.40
1000        2000-09-29     54500.22  54500.22
1000        2000-09-30     36000.07  54500.22
1000        2000-10-01     40200.43  54500.22
1000        2000-10-02     32800.50  54500.22
1000        2000-10-03     64300.00  64300.00
1000        2000-10-04     54553.10  64300.00
2000        2000-09-28     41888.88  41888.88
2000        2000-09-29     48000.00  48000.00
2000        2000-09-30     49850.03  49850.03
2000        2000-10-01     54850.29  54850.29

Not all rows are displayed.

The largest value in the MaxOver column is 64300.00. It did not carry through to the end of the answer set because the PARTITION BY reset MaxOver when Product_ID changed to 2000.


MAX OVER without Rows Unbounded Preceding

SELECT Product_ID, Sale_Date, Daily_Sales,
       MAX(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date) AS MaxOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  MaxOver
__________  __________  ___________  ________
1000        2000-09-28     48850.40  48850.40
1000        2000-09-29     54500.22  54500.22
1000        2000-09-30     36000.07  54500.22
1000        2000-10-01     40200.43  54500.22
1000        2000-10-02     32800.50  54500.22
1000        2000-10-03     64300.00  64300.00
1000        2000-10-04     54553.10  64300.00
2000        2000-09-28     41888.88  64300.00
2000        2000-09-29     48000.00  64300.00
2000        2000-09-30     49850.03  64300.00

Not all rows are displayed.

You don't need the ROWS UNBOUNDED PRECEDING with the MAX OVER.


The MIN OVER Command

SELECT Product_ID, Sale_Date, Daily_Sales,
       MIN(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS MinOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  MinOver
__________  __________  ___________  ________
1000        2000-09-28     48850.40  48850.40
1000        2000-09-29     54500.22  48850.40
1000        2000-09-30     36000.07  36000.07
1000        2000-10-01     40200.43  36000.07
1000        2000-10-02     32800.50  32800.50
1000        2000-10-03     64300.00  32800.50
1000        2000-10-04     54553.10  32800.50
2000        2000-09-28     41888.88  32800.50
2000        2000-09-29     48000.00  32800.50
2000        2000-09-30     49850.03  32800.50
2000        2000-10-01     54850.29  32800.50
2000        2000-10-02     36021.93  32800.50
2000        2000-10-03     43200.18  32800.50
2000        2000-10-04     32800.50  32800.50

After the sort, the MIN() OVER shows the minimum value up to that point.


MIN OVER without Rows Unbounded Preceding

SELECT Product_ID, Sale_Date, Daily_Sales,
       MIN(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date) AS MinOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  MinOver
__________  __________  ___________  ________
1000        2000-09-28     48850.40  32800.50
1000        2000-09-29     54500.22  32800.50
1000        2000-09-30     36000.07  32800.50
1000        2000-10-01     40200.43  32800.50
1000        2000-10-02     32800.50  32800.50
1000        2000-10-03     64300.00  32800.50
1000        2000-10-04     54553.10  32800.50
2000        2000-09-28     41888.88  32800.50
2000        2000-09-29     48000.00  32800.50
2000        2000-09-30     49850.03  32800.50
2000        2000-10-01     54850.29  32800.50

Not all rows are displayed.

You don't need the ROWS UNBOUNDED PRECEDING with the MIN OVER.


Finding a Value of a Column in the Next Row with MIN

SELECT Product_ID, Sale_Date, Daily_Sales,
       MIN(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS NextSale
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  NextSale
__________  __________  ___________  ________
1000        09/28/2000     48850.40  54500.22
1000        09/29/2000     54500.22  36000.07
1000        09/30/2000     36000.07  40200.43
1000        10/01/2000     40200.43  32800.50
1000        10/02/2000     32800.50  64300.00
1000        10/03/2000     64300.00  54553.10
1000        10/04/2000     54553.10         ?
2000        09/28/2000     41888.88  48000.00
2000        09/29/2000     48000.00  49850.03
2000        09/30/2000     49850.03  54850.29
2000        10/01/2000     54850.29  36021.93

Not all rows are displayed.

The above example finds the value of a column in the next row for Daily_Sales. Notice it is partitioned, so there is a null value at the end of each Product_ID.
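The LEAD analytic function is another way to read the next row. It is shown here as an alternative sketch rather than the book's technique. Sample_Sales is a hypothetical inline row set (three rows for product 1000 and two for product 2000) that keeps the query self-contained. The two columns should match, with a null on the last row of each Product_ID.

```sql
-- Sample_Sales is a hypothetical inline stand-in for five rows of Sales_Table.
WITH Sample_Sales AS (
    SELECT 1000 AS Product_ID, DATE '2000-09-28' AS Sale_Date, 48850.40 AS Daily_Sales
    UNION ALL SELECT 1000, DATE '2000-09-29', 54500.22
    UNION ALL SELECT 1000, DATE '2000-09-30', 36000.07
    UNION ALL SELECT 2000, DATE '2000-09-28', 41888.88
    UNION ALL SELECT 2000, DATE '2000-09-29', 48000.00
)
SELECT Product_ID, Sale_Date, Daily_Sales,
       -- One-row window holding only the next row in the partition.
       MIN(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Sale_Date
                              ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS NextSale_Min,
       -- LEAD with its default offset of 1 reads that same next row.
       LEAD(Daily_Sales) OVER (PARTITION BY Product_ID
                               ORDER BY Sale_Date) AS NextSale_Lead
FROM Sample_Sales
ORDER BY Product_ID, Sale_Date;
```

Since the window holds a single row, MIN simply returns that row's value; MAX or SUM would return the same thing.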


The CSUM for Each Product_Id and the Next Start Date

SELECT ROW_NUMBER() OVER (PARTITION BY Product_ID
                          ORDER BY Sale_Date) AS Rnbr
      ,Product_ID AS Prod
      ,Sale_Date
      ,MIN(Sale_Date) OVER (PARTITION BY Product_ID
                            ORDER BY Sale_Date
                            ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS Next_Start_Dt
      ,Daily_Sales
      ,SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS To_Date_Revenue
FROM Sales_Table;

Rnbr  Prod  Sale_Date   Next_Start_Dt  Daily_Sales  To_Date_Revenue
____  ____  __________  _____________  ___________  _______________
1     1000  09/28/2000  09/29/2000        48850.40         48850.40
2     1000  09/29/2000  09/30/2000        54500.22        103350.62
3     1000  09/30/2000  10/01/2000        36000.07        139350.69
4     1000  10/01/2000  10/02/2000        40200.43        179551.12
5     1000  10/02/2000  10/03/2000        32800.50        212351.62
6     1000  10/03/2000  10/04/2000        64300.00        276651.62
7     1000  10/04/2000  ?                 54553.10        331204.72
1     2000  09/28/2000  09/29/2000        41888.88         41888.88

Not all rows are displayed.

The above example shows the cumulative SUM for the Daily_Sales and the next date on the same line.


Quiz – Fill in the Blank

SELECT Product_ID, Sale_Date, Daily_Sales,
       MIN(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS MinOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  MinOver
__________  __________  ___________  ________
1000        2000-09-28     48850.40  48850.40
1000        2000-09-29     54500.22  48850.40
1000        2000-09-30     36000.07  36000.07
1000        2000-10-01     40200.43  36000.07
1000        2000-10-02     32800.50  32800.50
1000        2000-10-03     64300.00  32800.50
1000        2000-10-04     54553.10  32800.50
2000        2000-09-28     41888.88  41888.88
2000        2000-09-29     48000.00  41888.88
2000        2000-09-30     49850.03  41888.88
2000        2000-10-01     54850.29  41888.88
2000        2000-10-02     36021.93  36021.93
2000        2000-10-03     43200.18
2000        2000-10-04     32800.50

The last two answers (MinOver) are blank, so you can fill in the blank.


Answer – Fill in the Blank

SELECT Product_ID, Sale_Date, Daily_Sales,
       MIN(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS MinOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  MinOver
__________  __________  ___________  ________
1000        2000-09-28     48850.40  48850.40
1000        2000-09-29     54500.22  48850.40
1000        2000-09-30     36000.07  36000.07
1000        2000-10-01     40200.43  36000.07
1000        2000-10-02     32800.50  32800.50
1000        2000-10-03     64300.00  32800.50
1000        2000-10-04     54553.10  32800.50
2000        2000-09-28     41888.88  41888.88
2000        2000-09-29     48000.00  41888.88
2000        2000-09-30     49850.03  41888.88
2000        2000-10-01     54850.29  41888.88
2000        2000-10-02     36021.93  36021.93
2000        2000-10-03     43200.18  36021.93
2000        2000-10-04     32800.50  32800.50

The last two answers (MinOver) are filled in.


How Ntile Works

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(4) OVER (ORDER BY Daily_Sales, Sale_Date) AS "Quartiles"
FROM Sales_Table
WHERE Product_ID = 1000;

Product_ID  Sale_Date   Daily_Sales  Quartiles
__________  __________  ___________  _________
1000        10/02/2000     32800.50          1
1000        09/30/2000     36000.07          1
1000        10/01/2000     40200.43          2
1000        09/28/2000     48850.40          2
1000        09/29/2000     54500.22          3
1000        10/04/2000     54553.10          3
1000        10/03/2000     64300.00          4

Changing the number passed to the Ntile function changes the number of partitions established. Each Ntile partition is assigned a number starting at 1 and increasing up to the number specified. So, with an Ntile of 4 the partitions are 1 through 4. Then, all the rows are distributed as evenly as possible into the partitions in the order of the OVER clause's ORDER BY, which here runs from the lowest Daily_Sales to the highest. When the rows do not divide evenly, the extra rows go to the lowest numbered partitions, which is why partition 4 above holds only one row.
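To check how the rows are spread when they do not divide evenly, the tiles can be counted. The sketch below is an illustration under stated assumptions: Sample_Sales is a hypothetical inline copy of the seven Product_ID 1000 rows, and Tiled, Rows_In_Tile, Low_Sale and High_Sale are names made up for this example.

```sql
-- Sample_Sales is a hypothetical inline copy of the seven Product_ID 1000 rows.
WITH Sample_Sales AS (
    SELECT 1000 AS Product_ID, DATE '2000-09-28' AS Sale_Date, 48850.40 AS Daily_Sales
    UNION ALL SELECT 1000, DATE '2000-09-29', 54500.22
    UNION ALL SELECT 1000, DATE '2000-09-30', 36000.07
    UNION ALL SELECT 1000, DATE '2000-10-01', 40200.43
    UNION ALL SELECT 1000, DATE '2000-10-02', 32800.50
    UNION ALL SELECT 1000, DATE '2000-10-03', 64300.00
    UNION ALL SELECT 1000, DATE '2000-10-04', 54553.10
),
Tiled AS (
    -- Assign each row to one of 4 tiles, lowest Daily_Sales first.
    SELECT Daily_Sales,
           NTILE(4) OVER (ORDER BY Daily_Sales, Sale_Date) AS Quartile
    FROM Sample_Sales
)
SELECT Quartile,
       COUNT(*)         AS Rows_In_Tile,
       MIN(Daily_Sales) AS Low_Sale,
       MAX(Daily_Sales) AS High_Sale
FROM Tiled
GROUP BY Quartile
ORDER BY Quartile;
```

With 7 rows and 4 tiles, tiles 1 through 3 should hold two rows each and tile 4 should hold one, matching the Quartiles column above.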


Ntile

SELECT Last_Name, Grade_Pt,
       NTILE(5) OVER (ORDER BY Grade_Pt) AS "Tile"
FROM Student_Table
ORDER BY "Tile" DESC;

Last_Name  Grade_Pt  Tile
_________  ________  ____
Bond           3.95     5
Thomas         4.00     5
Delaney        3.35     4
Wilson         3.80     4
Hanson         2.88     3
Phillips       3.00     3
McRoberts      1.90     2
Smith          2.00     2
Johnson           ?     1
Larkins        0.00     1

The Ntile function organizes rows into n number of groups. These groups are referred to as tiles. The tile number is returned. For example, the example above has 10 rows, so NTILE(5) splits the 10 rows into five equally sized tiles. There are 2 rows in each tile in the order of the OVER clause's ORDER BY.


Ntile Continued

SELECT Dept_No, EmployeeCount,
       NTILE(2) OVER (ORDER BY EmployeeCount) AS "Tile"
FROM (SELECT Dept_No, COUNT(*) AS EmployeeCount
      FROM Employee_Table
      GROUP BY Dept_No) AS Q
ORDER BY "Tile" DESC;

Dept_No  EmployeeCount  Tile
_______  _____________  ____
300                  1     2
200                  2     2
400                  3     2
?                    1     1
10                   1     1
100                  1     1

The Ntile function organizes rows into n number of groups. These groups are referred to as tiles. The tile number is returned. For example, the example above has 6 rows, so NTILE(2) splits the 6 rows into 2 equally sized tiles. There are 3 rows in each tile in the order of the OVER clause's ORDER BY.


Ntile Percentile

SELECT Claim_ID, Claim_Date, ClaimCount,
       NTILE(100) OVER (ORDER BY ClaimCount) AS Percentile
FROM (SELECT Claim_ID, Claim_Date, COUNT(*) AS ClaimCount
      FROM Claims
      GROUP BY Claim_ID, Claim_Date) AS Q
ORDER BY Percentile DESC;

Claim_ID  Claim_Date  ClaimCount  Percentile
________  __________  __________  __________
1302111   2003-03-01           4          26
4307444   2003-07-05           3          25
3306333   2003-06-28           3          24
1304111   2003-04-28           2          23
2303222   2003-03-12           2          22
4305444   2003-05-12           2          21
4303555   2004-03-01           2          20
3402222   2004-02-28           2          19
3308333   2003-08-01           2          18

Not all rows are displayed.

The Ntile function organizes rows into n number of groups. These groups are referred to as tiles. The tile number is returned. Above is a way to get the percentile.


Another Ntile Example

This example determines the percentile for every row in the Sales table based on the daily sales amount and sorts it into sequence by the value being categorized, which here is daily sales.

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(100) OVER (ORDER BY Daily_Sales) AS "Quantile"
FROM Sales_Table
WHERE Product_ID < 2000;

Product_ID  Sale_Date   Daily_Sales  Quantile
__________  __________  ___________  ________
1000        10/02/2000     32800.50         1
1000        09/30/2000     36000.07         2
1000        10/01/2000     40200.43         3
1000        09/28/2000     48850.40         4
1000        09/29/2000     54500.22         5
1000        10/04/2000     54553.10         6
1000        10/03/2000     64300.00         7

Above is another Ntile example.


Using Quartiles (Partitions of Four)

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(4) OVER (ORDER BY Daily_Sales, Sale_Date) AS "Quartiles"
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  Quartiles
__________  __________  ___________  _________
1000        10/02/2000     32800.50          1
2000        10/04/2000     32800.50          1
1000        09/30/2000     36000.07          1
2000        10/02/2000     36021.93          1
1000        10/01/2000     40200.43          2
2000        09/28/2000     41888.88          2
2000        10/03/2000     43200.18          2
2000        09/29/2000     48000.00          2
1000        09/28/2000     48850.40          3
2000        09/30/2000     49850.03          3
1000        09/29/2000     54500.22          3
1000        10/04/2000     54553.10          4
2000        10/01/2000     54850.29          4
1000        10/03/2000     64300.00          4

Instead of 100, the example above uses a quartile (an NTILE based on 4 partitions).


NTILE

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(4) OVER (ORDER BY Daily_Sales DESC) AS Bucket
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  Bucket
__________  __________  ___________  ______
1000        10/03/2000     64300.00       1
2000        10/01/2000     54850.29       1
1000        10/04/2000     54553.10       1
1000        09/29/2000     54500.22       1
2000        09/30/2000     49850.03       2
1000        09/28/2000     48850.40       2
2000        09/29/2000     48000.00       2
2000        10/03/2000     43200.18       2
2000        09/28/2000     41888.88       3
1000        10/01/2000     40200.43       3
2000        10/02/2000     36021.93       3
1000        09/30/2000     36000.07       4
1000        10/02/2000     32800.50       4
2000        10/04/2000     32800.50       4

The NTILE function divides the rows into buckets as evenly as possible. In this example, because PARTITION BY is omitted, the entire input will be sorted using the ORDER BY clause (here from the highest Daily_Sales to the lowest, so the largest sales land in bucket 1), and then divided into the number of buckets specified.


NTILE Using a Value of 10

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(10) OVER (ORDER BY Daily_Sales DESC) AS Bucket
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  Bucket
__________  __________  ___________  ______
1000        10/03/2000     64300.00       1
2000        10/01/2000     54850.29       1
1000        10/04/2000     54553.10       2
1000        09/29/2000     54500.22       2
2000        09/30/2000     49850.03       3
1000        09/28/2000     48850.40       3
2000        09/29/2000     48000.00       4
2000        10/03/2000     43200.18       4
2000        09/28/2000     41888.88       5
1000        10/01/2000     40200.43       6
2000        10/02/2000     36021.93       7
1000        09/30/2000     36000.07       8
1000        10/02/2000     32800.50       9
2000        10/04/2000     32800.50      10

The NTILE function divides the rows into buckets as evenly as possible. In this example, because PARTITION BY is omitted, the entire input will be sorted using the ORDER BY clause (here from the highest Daily_Sales to the lowest), and then divided into the number of buckets specified. This example uses a value of 10 in the NTILE. With 14 rows and 10 buckets, the first four buckets get two rows each and the remaining six get one.


NTILE with a Partition

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(3) OVER (PARTITION BY Product_ID ORDER BY Daily_Sales) AS Bucket
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Bucket
__________  __________  ___________  ______
1000        10/02/2000     32800.50  1
1000        09/30/2000     36000.07  1
1000        10/01/2000     40200.43  1
1000        09/28/2000     48850.40  2
1000        09/29/2000     54500.22  2
1000        10/04/2000     54553.10  3
1000        10/03/2000     64300.00  3
2000        10/04/2000     32800.50  1
2000        10/02/2000     36021.93  1
2000        09/28/2000     41888.88  1
2000        10/03/2000     43200.18  2
2000        09/29/2000     48000.00  2
2000        09/30/2000     49850.03  3
2000        10/01/2000     54850.29  3

The NTILE function divides the rows into buckets as evenly as possible. In this example, because PARTITION BY is listed, the data is first grouped by Product_ID, then sorted using the ORDER BY clause within each Product_ID, and then divided into the number of buckets specified. This example uses a value of 3 in the NTILE. Notice that the PARTITION BY clause causes the bucket numbering to restart at each Product_ID break.
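The restart of the bucket numbers at each Product_ID can be sketched in Python as follows. This is an emulation under stated assumptions, not how Vertica executes the query: the helper name ntile_partitioned is hypothetical, and the rows are the (Product_ID, Daily_Sales) pairs from the answer set above.

```python
from itertools import groupby

# (Product_ID, Daily_Sales) pairs taken from the answer set above.
rows = [
    (1000, 32800.50), (1000, 36000.07), (1000, 40200.43), (1000, 48850.40),
    (1000, 54500.22), (1000, 54553.10), (1000, 64300.00),
    (2000, 32800.50), (2000, 36021.93), (2000, 41888.88), (2000, 43200.18),
    (2000, 48000.00), (2000, 49850.03), (2000, 54850.29),
]


def ntile_partitioned(rows, buckets):
    """Emulate NTILE(buckets) OVER (PARTITION BY key ORDER BY value)."""
    out = []
    ordered = sorted(rows)  # sort by partition key, then by value
    for _, group in groupby(ordered, key=lambda r: r[0]):
        group = list(group)
        base, extra = divmod(len(group), buckets)
        position = 0
        for bucket in range(1, buckets + 1):
            # Leftover rows go to the lowest-numbered buckets of this partition.
            size = base + (1 if bucket <= extra else 0)
            for row in group[position:position + size]:
                out.append(row + (bucket,))
            position += size
    return out


result = ntile_partitioned(rows, 3)
for product_id, daily_sales, bucket in result:
    print(product_id, daily_sales, bucket)
```

Each partition has seven rows, so the three buckets hold three, two and two rows, and the numbering starts again at 1 for Product_ID 2000.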


Using FIRST_VALUE

SELECT Last_name, first_name, dept_no
      ,FIRST_VALUE(first_name)
         OVER (ORDER BY dept_no, last_name desc
               rows unbounded preceding) AS "First All"
      ,FIRST_VALUE(first_name)
         OVER (PARTITION BY dept_no
               ORDER BY dept_no, last_name desc
               rows unbounded preceding) AS "First Partition"
FROM Employee_Table;

LAST_NAME   FIRST_NAME  DEPT_NO  First All  First Partition
Jones       Squiggy     ?        Squiggy    Squiggy
Smythe      Richard     10       Squiggy    Richard
Chambers    Mandee      100      Squiggy    Mandee
Smith       John        200      Squiggy    John
Coffing     Billy       200      Squiggy    John
Larkins     Loraine     300      Squiggy    Loraine
Strickling  Cletus      400      Squiggy    Cletus
Reilly      William     400      Squiggy    Cletus
Harrison    Herbert     400      Squiggy    Cletus

The above example uses FIRST_VALUE to show you the very first first_name returned. It also uses PARTITION BY to show you the very first first_name returned in each department.


FIRST_VALUE

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - First_Value (Daily_Sales) OVER (ORDER BY Sale_Date) AS Delta_First
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Delta_First
__________  __________  ___________  ___________
1000        09/28/2000     48850.40         0.00
2000        09/28/2000     41888.88     -6961.52
1000        09/29/2000     54500.22      5649.82
2000        09/29/2000     48000.00      -850.40
1000        09/30/2000     36000.07    -12850.33
2000        09/30/2000     49850.03       999.63
1000        10/01/2000     40200.43     -8649.97
2000        10/01/2000     54850.29      5999.89
1000        10/02/2000     32800.50    -16049.90
2000        10/02/2000     36021.93    -12828.47
1000        10/03/2000     64300.00     15449.60
2000        10/03/2000     43200.18     -5650.22
1000        10/04/2000     54553.10      5702.70
2000        10/04/2000     32800.50    -16049.90

Above, after sorting the data by Sale_Date, we compute the difference between each row's Daily_Sales and the first row's Daily_Sales. Every row's Daily_Sales is compared with the first row's Daily_Sales, thus the name First_Value.


FIRST_VALUE after Sorting by the Highest Value

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - First_Value (Daily_Sales) OVER (ORDER BY Daily_Sales DESC) AS Delta_First
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Delta_First
__________  __________  ___________  ___________
1000        10/03/2000     64300.00         0.00
2000        10/01/2000     54850.29     -9449.71
1000        10/04/2000     54553.10     -9746.90
1000        09/29/2000     54500.22     -9799.78
2000        09/30/2000     49850.03    -14449.97
1000        09/28/2000     48850.40    -15449.60
2000        09/29/2000     48000.00    -16300.00
2000        10/03/2000     43200.18    -21099.82
2000        09/28/2000     41888.88    -22411.12
1000        10/01/2000     40200.43    -24099.57
2000        10/02/2000     36021.93    -28278.07
1000        09/30/2000     36000.07    -28299.93
1000        10/02/2000     32800.50    -31499.50
2000        10/04/2000     32800.50    -31499.50

Above, after sorting the data by Daily_Sales DESC, we compute the difference between each row's Daily_Sales and the first row's Daily_Sales. Every row's Daily_Sales is compared with the first row's Daily_Sales, thus the name First_Value. This example shows how much less each Daily_Sales is compared to 64,300.00 (our highest sale).


FIRST_VALUE with Partitioning

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - First_Value (Daily_Sales)
         OVER (PARTITION BY Product_ID ORDER BY Sale_Date) AS Delta_First
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Delta_First
__________  __________  ___________  ___________
1000        09/28/2000     48850.40         0.00
1000        09/29/2000     54500.22      5649.82
1000        09/30/2000     36000.07    -12850.33
1000        10/01/2000     40200.43     -8649.97
1000        10/02/2000     32800.50    -16049.90
1000        10/03/2000     64300.00     15449.60
1000        10/04/2000     54553.10      5702.70
2000        09/28/2000     41888.88         0.00
2000        09/29/2000     48000.00      6111.12
2000        09/30/2000     49850.03      7961.15
2000        10/01/2000     54850.29     12961.41
2000        10/02/2000     36021.93     -5866.95
2000        10/03/2000     43200.18      1311.30
2000        10/04/2000     32800.50     -9088.38

We are now comparing the Daily_Sales of the first Sale_Date for each Product_ID with the Daily_Sales of all other rows within the Product_ID partition. Each row is only compared with the first row (First_Value) in its partition.
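The per-partition comparison can be illustrated with a small Python sketch. It is an emulation only, not Vertica code: the function name delta_first is hypothetical, the rows are the first three days of each product from the answer set above, and the dates are rewritten as ISO strings so that they sort chronologically.

```python
# (Product_ID, Sale_Date, Daily_Sales); first three days of each product.
sales = [
    (1000, "2000-09-28", 48850.40), (1000, "2000-09-29", 54500.22),
    (1000, "2000-09-30", 36000.07),
    (2000, "2000-09-28", 41888.88), (2000, "2000-09-29", 48000.00),
    (2000, "2000-09-30", 49850.03),
]


def delta_first(rows):
    """Emulate Daily_Sales - FIRST_VALUE(Daily_Sales) per Product_ID, ordered by date."""
    first_by_product = {}
    out = []
    for product_id, sale_date, daily_sales in sorted(rows):
        # The first row seen for a product becomes that partition's FIRST_VALUE.
        first = first_by_product.setdefault(product_id, daily_sales)
        out.append((product_id, sale_date, round(daily_sales - first, 2)))
    return out


for row in delta_first(sales):
    print(row)
```

The first row of each product yields 0.00 because it is compared with itself, and the comparison value changes when the Product_ID changes.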


Using LAST_VALUE

SELECT Last_name, first_name, dept_no
      ,LAST_VALUE(first_name)
         OVER (ORDER BY dept_no, last_name desc
               rows unbounded preceding) AS "Last All"
      ,LAST_VALUE(first_name)
         OVER (PARTITION BY dept_no
               ORDER BY dept_no, last_name desc
               rows unbounded preceding) AS "Last Partition"
FROM sql_class.Employee_Table;

LAST_NAME   FIRST_NAME  DEPT_NO  Last All  Last Partition
Jones       Squiggy     ?        Squiggy   Squiggy
Smythe      Richard     10       Richard   Richard
Chambers    Mandee      100      Mandee    Mandee
Smith       John        200      John      John
Coffing     Billy       200      Billy     Billy
Larkins     Loraine     300      Loraine   Loraine
Strickling  Cletus      400      Cletus    Cletus
Reilly      William     400      William   William
Harrison    Herbert     400      Herbert   Herbert

FIRST_VALUE and LAST_VALUE are good to use anytime you need to propagate a value from one row to all or multiple rows based on a sorted sequence. However, the output from the LAST_VALUE function appears to be incorrect and is a little misleading until you understand a few concepts. The SQL request specifies "rows unbounded preceding", so each row's window runs from the first row up to the current row. LAST_VALUE returns the last row of that window, which is always the current row, and therefore the current row's own value appears in the output.
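The effect of the window frame on LAST_VALUE can be seen in a short Python sketch. Both function names are hypothetical, and the second one emulates a frame that extends to the end of the input (in standard SQL, ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), which is an assumption added here for contrast and does not appear in the query above.

```python
first_names = ["Squiggy", "Richard", "Mandee", "John", "Billy"]


def last_value_rows_unbounded_preceding(values):
    """Frame for row i is values[0..i], so its last element is always values[i]."""
    return [values[: i + 1][-1] for i in range(len(values))]


def last_value_full_frame(values):
    """Frame for every row is the whole input, so every row sees the final value."""
    return [values[-1] for _ in values]


print(last_value_rows_unbounded_preceding(first_names))
print(last_value_full_frame(first_names))
```

The first function simply echoes each row's own value, which is exactly the behavior seen in the "Last All" column.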


LAST_VALUE

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - LAST_Value (Daily_Sales) OVER (ORDER BY Sale_Date) AS Delta_Last
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Delta_Last
__________  __________  ___________  __________
1000        09/28/2000     48850.40        0.00
2000        09/28/2000     41888.88    -6961.52
1000        09/29/2000     54500.22        0.00
2000        09/29/2000     48000.00    -6500.22
1000        09/30/2000     36000.07        0.00
2000        09/30/2000     49850.03    13849.96
1000        10/01/2000     40200.43        0.00
2000        10/01/2000     54850.29    14649.86
1000        10/02/2000     32800.50        0.00
2000        10/02/2000     36021.93     3221.43
1000        10/03/2000     64300.00        0.00
2000        10/03/2000     43200.18   -21099.82
1000        10/04/2000     54553.10        0.00
2000        10/04/2000     32800.50   -21752.60

Above, after sorting the data by Sale_Date, we compute the difference between each row's Daily_Sales and the Daily_Sales of the last row in its window. Because no window frame is specified, the window ends with the rows that share the current row's Sale_Date, so each row is compared with the last row for the same Sale_Date. Since there are only two product totals for each day, one of the two rows is compared with itself and always shows 0.00.


Using LAG and LEAD

Compatibility: Vertica Extension

The LAG and LEAD functions allow you to compare different rows of a table by specifying an offset from the current row. You can use these functions to analyze change and variation.

Syntax for LAG and LEAD:

{LAG | LEAD} (expression [, offset [, default]])
  OVER ([PARTITION BY expression [,...]]
        ORDER BY expression [ASC | DESC] [,...] ) ;

The above provides information and the syntax for LAG and LEAD.


Using LEAD

SELECT Last_Name, Dept_No
      ,LEAD(Dept_No) OVER (ORDER BY Dept_No, Last_Name) as "Lead All"
      ,LEAD(Dept_No) OVER (PARTITION BY Dept_No
                           ORDER BY Dept_No, Last_Name) as "Lead Partition"
FROM Employee_Table;

LAST_NAME   DEPT_NO  Lead All  Lead Partition
Jones       ?        10        ?
Smythe      10       100       ?
Chambers    100      200       ?
Coffing     200      200       200
Smith       200      300       ?
Larkins     300      400       ?
Harrison    400      400       400
Reilly      400      400       400
Strickling  400      ?         ?

As you can see, the first LEAD brings back the value from the next row, except for the last row, which has no row following it. The second LEAD is partitioned by Dept_No, so it returns a value only when another row follows within the same department. The offset value was not specified in this example, so it defaulted to a value of 1 row.


Using LEAD with an Offset of 2

SELECT Last_Name, Dept_No
      ,LEAD(Dept_No,2) OVER (ORDER BY Dept_No, Last_Name) as "Lead All"
      ,LEAD(Dept_No,2) OVER (PARTITION BY Dept_No
                             ORDER BY Dept_No, Last_Name) as "Lead Partition"
FROM Employee_Table;

LAST_NAME   DEPT_NO  Lead All  Lead Partition
Jones       ?        100       ?
Smythe      10       200       ?
Chambers    100      200       ?
Coffing     200      300       ?
Smith       200      400       ?
Larkins     300      400       ?
Harrison    400      400       400
Reilly      400      ?         ?
Strickling  400      ?         ?

Above, each value in the first LEAD comes from the row 2 rows ahead. The partitioned LEAD returns a value only when a partition holds at least one more row than the offset value; here only Dept_No 400, with three rows, qualifies.
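Both LEAD columns can be reproduced with a short Python sketch. It is an emulation only: the helper name lead is hypothetical, None stands in for SQL NULL, and the rows are listed in the sort order shown above so that equal Dept_No values are adjacent.

```python
from itertools import groupby

# (Last_Name, Dept_No) in the sort order shown above; None stands in for NULL.
employees = [
    ("Jones", None), ("Smythe", 10), ("Chambers", 100), ("Coffing", 200),
    ("Smith", 200), ("Larkins", 300), ("Harrison", 400), ("Reilly", 400),
    ("Strickling", 400),
]


def lead(values, offset=1, default=None):
    """Value `offset` rows after each row, or `default` when no such row exists."""
    return [values[i + offset] if i + offset < len(values) else default
            for i in range(len(values))]


depts = [dept for _, dept in employees]
lead_all = lead(depts, 2)

lead_partition = []
for _, group in groupby(depts):  # consecutive equal Dept_No values form a partition
    lead_partition.extend(lead(list(group), 2))

print(lead_all)
print(lead_partition)
```

Only the first row of the three-row Dept_No 400 partition has a row two positions ahead inside its own partition, which is why the partitioned column is NULL everywhere else.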


LEAD

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - LEAD(Daily_Sales, 1, 0) OVER (ORDER BY Product_ID, Sale_Date) AS Lead1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Lead1
__________  __________  ___________  _________
1000        09/28/2000     48850.40   -5649.82
1000        09/29/2000     54500.22   18500.15
1000        09/30/2000     36000.07   -4200.36
1000        10/01/2000     40200.43    7399.93
1000        10/02/2000     32800.50  -31499.50
1000        10/03/2000     64300.00    9746.90
1000        10/04/2000     54553.10   12664.22
2000        09/28/2000     41888.88   -6111.12
2000        09/29/2000     48000.00   -1850.03
2000        09/30/2000     49850.03   -5000.26
2000        10/01/2000     54850.29   18828.36
2000        10/02/2000     36021.93   -7178.25
2000        10/03/2000     43200.18   10399.68
2000        10/04/2000     32800.50   32800.50

Above, we compute the difference between each row's Daily_Sales and the Daily_Sales of the next row in the sort order (Product_ID, then Sale_Date). The expression LEAD (Daily_Sales, 1, 0) tells LEAD () to evaluate the expression Daily_Sales on the row that is positioned one row following the current row. If there is no such row (as is the case on the last row of the partition or relation), then the default value of 0 is used. Because there is no partitioning, the last row of Product_ID 1000 is compared with the first row of Product_ID 2000.


LEAD With Partitioning

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - LEAD(Daily_Sales, 1, 0)
         OVER (PARTITION BY Product_ID ORDER BY Sale_Date) AS Lead1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Lead1
__________  __________  ___________  _________
1000        09/28/2000     48850.40   -5649.82
1000        09/29/2000     54500.22   18500.15
1000        09/30/2000     36000.07   -4200.36
1000        10/01/2000     40200.43    7399.93
1000        10/02/2000     32800.50  -31499.50
1000        10/03/2000     64300.00    9746.90
1000        10/04/2000     54553.10   54553.10
2000        09/28/2000     41888.88   -6111.12
2000        09/29/2000     48000.00   -1850.03
2000        09/30/2000     49850.03   -5000.26
2000        10/01/2000     54850.29   18828.36
2000        10/02/2000     36021.93   -7178.25
2000        10/03/2000     43200.18   10399.68
2000        10/04/2000     32800.50   32800.50

Above, we compute the difference between each row's Daily_Sales and the Daily_Sales of the next row in the sort order (Sale_Date). We also partitioned the data by Product_ID, so the last row of each product has no following row and the default value of 0 is used, leaving Lead1 equal to that row's Daily_Sales.


Using LAG

SELECT Last_Name, Dept_No
      ,LAG(Dept_No) OVER (ORDER BY Dept_No, Last_Name) as "Lag All"
      ,LAG(Dept_No) OVER (PARTITION BY Dept_No
                          ORDER BY Dept_No, Last_Name) as "Lag Partition"
FROM Employee_Table;

LAST_NAME   DEPT_NO  Lag All  Lag Partition
Jones       ?        ?        ?
Smythe      10       ?        ?
Chambers    100      10       ?
Coffing     200      100      ?
Smith       200      200      200
Larkins     300      200      ?
Harrison    400      300      ?
Reilly      400      400      400
Strickling  400      400      400

From the example above, you see that LAG uses the value from a previous row and makes it available in the next row. For LAG, the first row(s) will contain a null based on the value in the offset. Here it defaulted to 1. The first null comes from the function, whereas the second row gets its null from the first row, whose Dept_No is null.


Using LAG with an Offset of 2

SELECT Last_Name, Dept_No
      ,LAG(Dept_No,2) OVER (ORDER BY Dept_No, Last_Name) as "Lag All"
      ,LAG(Dept_No,2) OVER (PARTITION BY Dept_No
                            ORDER BY Dept_No, Last_Name) as "Lag Partition"
FROM Employee_Table;

LAST_NAME   DEPT_NO  Lag All  Lag Partition
Jones       ?        ?        ?
Smythe      10       ?        ?
Chambers    100      ?        ?
Coffing     200      10       ?
Smith       200      100      ?
Larkins     300      200      ?
Harrison    400      200      ?
Reilly      400      300      ?
Strickling  400      400      400

For this example, the first two rows have a null because there is no row two rows before them. The number of nulls produced by the function itself will always be the same as the offset value. There is a third null because the third row picks up Jones's Dept_No, which is null.


LAG

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - LAG(Daily_Sales, 1, 0) OVER (ORDER BY Product_ID, Sale_Date) AS Lag1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Lag1
__________  __________  ___________  _________
1000        09/28/2000     48850.40   48850.40
1000        09/29/2000     54500.22    5649.82
1000        09/30/2000     36000.07  -18500.15
1000        10/01/2000     40200.43    4200.36
1000        10/02/2000     32800.50   -7399.93
1000        10/03/2000     64300.00   31499.50
1000        10/04/2000     54553.10   -9746.90
2000        09/28/2000     41888.88  -12664.22
2000        09/29/2000     48000.00    6111.12
2000        09/30/2000     49850.03    1850.03
2000        10/01/2000     54850.29    5000.26
2000        10/02/2000     36021.93  -18828.36
2000        10/03/2000     43200.18    7178.25
2000        10/04/2000     32800.50  -10399.68

Above, we compute the difference between each row's Daily_Sales and the Daily_Sales of the previous row in the sort order (Product_ID, then Sale_Date). The expression LAG (Daily_Sales, 1, 0) tells LAG to evaluate the expression Daily_Sales on the row that is positioned one row before the current row. If there is no such row (as is the case on the first row of the partition or relation), then the default value of 0 is used.
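The role of the default value can be shown with a few lines of Python. This is a sketch rather than Vertica's implementation: the helper name lag is hypothetical, and the list holds the first four Daily_Sales values for Product_ID 1000 from the answer set above.

```python
# First four Daily_Sales values for Product_ID 1000, in Sale_Date order.
daily_sales = [48850.40, 54500.22, 36000.07, 40200.43]


def lag(values, offset=1, default=None):
    """Value `offset` rows before each row, or `default` when no such row exists."""
    return [values[i - offset] if i - offset >= 0 else default
            for i in range(len(values))]


# Emulates Daily_Sales - LAG(Daily_Sales, 1, 0).
lag1 = [round(current - previous, 2)
        for current, previous in zip(daily_sales, lag(daily_sales, 1, 0))]
print(lag1)
```

Because the default is 0, the first row's result is simply its own Daily_Sales. Without a default, the previous value would be NULL and the subtraction would yield NULL instead.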


LAG with Partitioning

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - LAG(Daily_Sales, 1, 0)
         OVER (PARTITION BY Product_ID ORDER BY Sale_Date) AS Lag1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Lag1
__________  __________  ___________  _________
1000        09/28/2000     48850.40   48850.40
1000        09/29/2000     54500.22    5649.82
1000        09/30/2000     36000.07  -18500.15
1000        10/01/2000     40200.43    4200.36
1000        10/02/2000     32800.50   -7399.93
1000        10/03/2000     64300.00   31499.50
1000        10/04/2000     54553.10   -9746.90
2000        09/28/2000     41888.88   41888.88
2000        09/29/2000     48000.00    6111.12
2000        09/30/2000     49850.03    1850.03
2000        10/01/2000     54850.29    5000.26
2000        10/02/2000     36021.93  -18828.36
2000        10/03/2000     43200.18    7178.25
2000        10/04/2000     32800.50  -10399.68

Above, we compute the difference between each row's Daily_Sales and the Daily_Sales of the previous row within the same Product_ID partition, ordered by Sale_Date. The expression LAG (Daily_Sales, 1, 0) tells LAG to evaluate the expression Daily_Sales on the row that is positioned one row before the current row. If there is no such row (as is the case on the first row of each partition), then the default value of 0 is used.


MEDIAN with Partitioning

SELECT Last_Name, Dept_No, Salary,
       MEDIAN(Salary) OVER (PARTITION BY Dept_No) AS MEDIAN
FROM Employee_Table as e
WHERE Dept_No in (200, 400) ;

Last_Name   Dept_No  Salary    MEDIAN
__________  _______  ________  ________
Coffing     200      41888.88  44944.44
Smith       200      48000.00  44944.44
Reilly      400      36000.00  54500
Harrison    400      54500.00  54500
Strickling  400      54500.00  54500

The Median is the value that separates the higher half of the values in a window from the lower half. After sorting all values from lowest to highest, it picks the middle one. If there is an even number of values, then there is no single middle value, so the median is considered to be the mean (average) of the two middle values.
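The two cases (odd and even number of values) can be checked with Python's standard statistics.median function. The dictionary below is a hypothetical stand-in for the two Dept_No partitions, filled with the salaries from the answer set above.

```python
from statistics import median

# Salaries per Dept_No partition, taken from the answer set above.
salaries_by_dept = {
    200: [41888.88, 48000.00],            # even count: average of the two middle values
    400: [36000.00, 54500.00, 54500.00],  # odd count: the single middle value
}

medians = {dept: round(median(values), 2)
           for dept, values in salaries_by_dept.items()}
print(medians)
```

Department 200 has two salaries, so its median is their average; department 400 has three, so its median is the middle value after sorting.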


CUME_DIST

SELECT Product_ID, Sale_Date, Daily_Sales,
       CUME_DIST() OVER (ORDER BY Daily_Sales DESC) AS CDist
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  CDist
__________  __________  ___________  _____
1000        10/03/2000     64300.00   0.07
2000        10/01/2000     54850.29   0.14
1000        10/04/2000     54553.10   0.21
1000        09/29/2000     54500.22   0.29
2000        09/30/2000     49850.03   0.36
1000        09/28/2000     48850.40   0.43
2000        09/29/2000     48000.00   0.50
2000        10/03/2000     43200.18   0.57
2000        09/28/2000     41888.88   0.64
1000        10/01/2000     40200.43   0.71
2000        10/02/2000     36021.93   0.79
1000        09/30/2000     36000.07   0.86
1000        10/02/2000     32800.50   1.00
2000        10/04/2000     32800.50   1.00

CUME_DIST is a cumulative distribution function that assigns a relative rank to each row, based on a formula. That formula is (number of rows preceding or peer with current row) / (total rows). We order by Daily_Sales DESC, so that each row is ranked by cumulative distribution. The distribution is represented relatively, by floating point numbers greater than 0 and up to 1. When there is only one row in a partition, it is assigned 1. When there is more than one row, each is assigned a cumulative distribution ranking, and the last row (together with any peers) is assigned 1. The two rows tied at 32800.50 are peers, so both receive 14/14 = 1.00.
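The formula can be applied directly in a short Python sketch. It is an illustration of the arithmetic only; the function name cume_dist is hypothetical, and the list holds the 14 Daily_Sales values from the answer set above.

```python
# The 14 Daily_Sales values from the answer set above.
daily_sales = [
    64300.00, 54850.29, 54553.10, 54500.22, 49850.03, 48850.40, 48000.00,
    43200.18, 41888.88, 40200.43, 36021.93, 36000.07, 32800.50, 32800.50,
]


def cume_dist(values, descending=True):
    """(rows preceding or peer with the current row) / (total rows)."""
    total = len(values)
    if descending:
        # With a DESC sort, rows with a greater or equal value precede or tie.
        return [sum(1 for other in values if other >= v) / total for v in values]
    return [sum(1 for other in values if other <= v) / total for v in values]


cdist = [round(x, 2) for x in cume_dist(daily_sales)]
print(cdist)
```

The tie at 32800.50 is what makes the last two results identical: each of those rows counts the other as a peer.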


CUME_DIST with a Partition

SELECT Product_ID, Sale_Date, Daily_Sales,
       CUME_DIST() OVER (PARTITION by Product_ID ORDER BY Daily_Sales DESC) AS CDist
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  CDist
__________  __________  ___________  _____
1000        10/03/2000     64300.00   0.14
1000        10/04/2000     54553.10   0.29
1000        09/29/2000     54500.22   0.43
1000        09/28/2000     48850.40   0.57
1000        10/01/2000     40200.43   0.71
1000        09/30/2000     36000.07   0.86
1000        10/02/2000     32800.50   1.00
2000        10/01/2000     54850.29   0.14
2000        09/30/2000     49850.03   0.29
2000        09/29/2000     48000.00   0.43
2000        10/03/2000     43200.18   0.57
2000        09/28/2000     41888.88   0.71
2000        10/02/2000     36021.93   0.86
2000        10/04/2000     32800.50   1.00

CUME_DIST is a cumulative distribution function that assigns a relative rank to each row, based on a formula. That formula is (number of rows preceding or peer with current row) / (total rows). We PARTITION BY Product_ID and then ORDER BY Daily_Sales DESC, so that each row is ranked by cumulative distribution within its partition.


SUM (SUM (n))

SELECT Product_ID, SUM(Daily_Sales) as Summy,
       SUM(SUM(Daily_Sales)) OVER (ORDER BY Sum(Daily_Sales)) AS Prod_Sales_Running_Sum
FROM Sales_Table
GROUP BY Product_ID ;

Product_ID  Summy      Prod_Sales_Running_Sum
__________  _________  ______________________
3000        224587.82               224587.82
2000        306611.81               531199.63
1000        331204.72               862404.35

Window functions can compute aggregates of aggregates, as in the example above. The inner SUM(Daily_Sales) is the GROUP BY total for each Product_ID, and the outer SUM ... OVER accumulates those totals in ascending order.
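The outer, windowed SUM can be emulated in Python with itertools.accumulate. The sketch starts from the per-product totals in the Summy column (the result of the inner SUM and GROUP BY) and the variable names are hypothetical.

```python
from itertools import accumulate

# Per-product totals from the Summy column (the result of the inner SUM).
product_totals = {3000: 224587.82, 2000: 306611.81, 1000: 331204.72}

# ORDER BY SUM(Daily_Sales): sort the groups by their totals, ascending.
ordered = sorted(product_totals.items(), key=lambda item: item[1])

# Outer SUM(...) OVER (ORDER BY ...): running sum across the ordered totals.
running = [round(total, 2) for total in accumulate(t for _, t in ordered)]

for (product_id, summy), running_sum in zip(ordered, running):
    print(product_id, summy, running_sum)
```

The grouping happens first and the window function then operates on one row per group, which is why the running sum has three rows rather than one per sale.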


Chapter 11 – Temporary Tables

“I cannot imagine any condition which would cause this ship to founder. Modern shipbuilding has gone beyond that.” - E. I. Smith, Captain of the Titanic


There are three types of Temporary Tables

Derived Table
• Exists only within a query
• Materialized by a SELECT Statement inside a query
• Space comes from the User’s Spool space
• Deleted when the query ends

Local Temporary Table
• Created by the User and materialized with an INSERT/SELECT
• Table and Data are deleted only after a User Logs off the session
• Can be session specific or seen across different sessions

Global Temporary Table
• Table definition is created by a User and the table definition is permanent
• Materialized with an INSERT/SELECT
• When User logs off the session the data is deleted, but the table definition stays
• Many Users can populate the same Global table, but each has their own copy
• Global temporary tables are created in the public schema, with the data contents private to the transaction or session through which data is inserted.

The three types of Temporary tables are Derived, Local Temporary and Global Temporary Tables.


CREATING A Derived Table
• Exists only within a query
• Materialized by a SELECT Statement inside a query
• Space comes from the User’s Spool space
• Deleted when the query ends

SELECT * FROM (SELECT AVG(salary) FROM Employee_Table) AS TeraTom(AVGSAL) ;

The SELECT inside the parentheses is a query within a query.

Answer Set

AVGSAL
________
46782.15

The SELECT Statement that creates and populates the Derived table is always inside Parentheses.


Naming the Derived Table

SELECT * FROM (SELECT AVG(salary) FROM Employee_Table) AS TeraTom(AVGSAL) ;

The name of the Derived Table is TeraTom.

Answer Set

AVGSAL
________
46782.15

In the example above, TeraTom is the name we gave the Derived Table. You must always name the derived table; otherwise the query returns an error.


Aliasing the Column Names in The Derived Table

SELECT * FROM (SELECT AVG(salary) FROM Employee_Table) AS TeraTom(AVGSAL) ;

AVGSAL is the Column Name in the derived table named TeraTom.

Answer Set

AVGSAL
________
46782.15

AVGSAL is the name we gave to the column in our Derived Table that we call TeraTom. Our SELECT (which builds the columns) shows we are only going to have one column in our derived table, and we have named that column AVGSAL.


Multiple Ways to Alias the Columns in a Derived Table

SELECT * FROM (SELECT AVG(salary) as avgsal FROM Employee_Table) AS TeraTom

The derived table's name is TeraTom.

AVGSAL
________
46782.15

A derived table only lasts for the lifetime of the query and then it is deleted.

You can alias the column name within the SQL query that materializes the derived table.


CREATING a Derived Table using the WITH Command

WITH TeraTom(AVGSAL) AS
  (SELECT AVG(salary) FROM Employee_Table)
SELECT * FROM TeraTom ;

You create the derived table first using a WITH statement. You must then include the derived table in a final SELECT query.

AVGSAL
________
46782.15

When using the WITH Command, we can CREATE our Derived table before running the main query. The only restriction is that a query can have only one WITH.


The Same Derived Query shown Three Different Ways

1  SELECT * FROM (SELECT AVG(salary) FROM Employee_Table) TeraTom (AVGSAL) ;

2  SELECT * FROM (SELECT AVG(salary) as AVGSAL FROM Employee_Table) TeraTom ;

3  WITH TeraTom(AVGSAL) AS (SELECT AVG(salary) FROM Employee_Table)
   SELECT * FROM TeraTom ;

The column alias CAN be done after the derived table's name (query 1) or inside the SELECT that builds the derived table (query 2).


Most Derived Tables Are Used To Join To Other Tables

SELECT E.*, AVGSAL
FROM Employee_Table as E
INNER JOIN
  (SELECT Dept_No, AVG(salary)
   FROM Employee_Table
   GROUP BY Dept_No) AS TeraTom (Dept_No, AVGSAL)
ON E.Dept_No = TeraTom.Dept_No
ORDER BY E.Dept_No ;

The SELECT in parentheses materializes the Derived Table. The derived table name is TeraTom, and its columns are aliased as Dept_No and AVGSAL.

Employee_No  Dept_No  Last_Name   First_Name  Salary    AVGSAL
___________  _______  __________  __________  ________  ________
1000234      10       Smythe      Richard     64300.00  64300.00
1232578      100      Chambers    Mandee      48850.00  48850.00
1324657      200      Coffing     Billy       41888.88  44944.44
1333454      200      Smith       John        48000.00  44944.44
2312225      300      Larkins     Loraine     40200.00  40200.00
1121334      400      Strickling  Cletus      54500.00  48333.33
1256349      400      Harrison    Herbert     54500.00  48333.33
2341218      400      Reilly      William     36000.00  48333.33

The first five columns in the Answer Set came from the Employee_Table. AVGSAL came from the derived table named TeraTom.


The Three Components of a Derived Table

SELECT E.*, Salary - AVGSAL as PlusMinAvg
FROM Employee_Table as E
INNER JOIN
  (SELECT Dept_No, AVG(salary) as AVGSAL
   FROM Employee_Table
   GROUP BY Dept_No) AS TeraTom
ON E.Dept_No = TeraTom.Dept_No
ORDER BY E.Dept_No ;

TeraTom (the derived table lives in memory)

Dept_No  AVGSAL
_______  ________
?        32800.50
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33

1  A derived table will always have a SELECT query to materialize the derived table with data. The SELECT query always starts with an open parenthesis and ends with a close parenthesis.

2  The derived table must be given a name. Above we called our derived table TeraTom.

3  You will need to define (alias) the columns in the derived table. Above we allowed Dept_No to default to Dept_No, but we had to specifically alias AVG(Salary) as AVGSAL.

Every derived table must have the three components listed above.


Visualize This Derived Table

SELECT E.*, Salary - AVGSAL as PlusMinAvg
FROM Employee_Table as E
INNER JOIN
  (SELECT Dept_No, AVG(salary) as AVGSAL
   FROM Employee_Table
   GROUP BY Dept_No) AS TeraTom
ON E.Dept_No = TeraTom.Dept_No
ORDER BY E.Dept_No ;

TeraTom (the derived table is built first)

Dept_No  AVGSAL
_______  ________
?        32800.50
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33

Employee_No  Dept_No  Last_Name   First_Name  Salary    PlusMinAvg
___________  _______  __________  __________  ________  __________
1000234      10       Smythe      Richard     64300.00        0.00
1232578      100      Chambers    Mandee      48850.00        0.00
1324657      200      Coffing     Billy       41888.88    -3055.56
1333454      200      Smith       John        48000.00     3055.56
2312225      300      Larkins     Loraine     40200.00        0.00
1121334      400      Strickling  Cletus      54500.00     6166.67
1256349      400      Harrison    Herbert     54500.00     6166.67
2341218      400      Reilly      William     36000.00   -12333.33

Our example above shows the data in the derived table named TeraTom. This query allows us to see each employee and how far above or below the average salary of their department their own salary is.
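The two steps (build the derived table, then join to it) can be mirrored in a short Python sketch. It is an analogy under stated assumptions, not what the database does internally: the variable names are hypothetical, and the employee rows are the eight rows of the answer set above. The employee with a null Dept_No is left out because, in SQL, an equality join on a null value never matches.

```python
from collections import defaultdict

# (Last_Name, Dept_No, Salary) from the answer set above.
employees = [
    ("Smythe", 10, 64300.00), ("Chambers", 100, 48850.00),
    ("Coffing", 200, 41888.88), ("Smith", 200, 48000.00),
    ("Larkins", 300, 40200.00), ("Strickling", 400, 54500.00),
    ("Harrison", 400, 54500.00), ("Reilly", 400, 36000.00),
]

# Step 1: build the "derived table" (Dept_No -> AVGSAL).
salaries_by_dept = defaultdict(list)
for _, dept_no, salary in employees:
    salaries_by_dept[dept_no].append(salary)
tera_tom = {dept_no: sum(values) / len(values)
            for dept_no, values in salaries_by_dept.items()}

# Step 2: join each employee to the department average and subtract.
plus_min_avg = {name: round(salary - tera_tom[dept_no], 2)
                for name, dept_no, salary in employees}

for name, delta in plus_min_avg.items():
    print(name, delta)
```

Employees who are alone in their department show 0.00 because their salary is the department average.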


Our Join Example with a Different Column Aliasing Style

SELECT E.*, AVGSAL
FROM Employee_Table as E
INNER JOIN
  (SELECT Dept_No as Dept_No, AVG(salary) as AVGSAL
   FROM Employee_Table
   GROUP BY Dept_No) AS TeraTom
ON E.Dept_No = TeraTom.Dept_No
ORDER BY E.Dept_No ;

Dept_No: I don't need to alias this because it can default to its current name.
AVG(salary): I must alias this column because it is an aggregate.

Employee_No  Dept_No  Last_Name   First_Name  Salary    AVGSAL
___________  _______  __________  __________  ________  ________
1000234      10       Smythe      Richard     64300.00  64300.00
1232578      100      Chambers    Mandee      48850.00  48850.00
1324657      200      Coffing     Billy       41888.88  44944.44
1333454      200      Smith       John        48000.00  44944.44
2312225      300      Larkins     Loraine     40200.00  40200.00
1121334      400      Strickling  Cletus      54500.00  48333.33
1256349      400      Harrison    Herbert     54500.00  48333.33
2341218      400      Reilly      William     36000.00  48333.33

Our example above aliases the column Dept_No, but it doesn’t need an alias. It will default to Dept_No, but the aggregate must be aliased.


Column Aliasing Can Default for Normal Columns

SELECT E.*, AVGSAL
FROM Employee_Table as E
INNER JOIN
  (SELECT Dept_No, AVG(salary) as AVGSAL
   FROM Employee_Table
   GROUP BY Dept_No) AS TeraTom
ON E.Dept_No = TeraTom.Dept_No
ORDER BY E.Dept_No ;

Dept_No: I don't need to alias this because it can default to its current name.

TeraTom (the derived table is built first)

Dept_No  AVGSAL
_______  ________
?        32800.50
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33

In a derived table, you will always have a SELECT query in parentheses, and you will always name the table. You have options when aliasing the columns. As in the example above, you can let normal columns default to their current name.


A Derived example Using the WITH Syntax

WITH TeraTom AS
  (SELECT Dept_No, AVG(salary) as AVGSAL
   FROM Employee_Table
   GROUP BY Dept_No)
SELECT e.dept_no, e.last_name, e.first_name, e.salary, AVGSAL
FROM TeraTom
JOIN Employee_Table as E
ON E.Dept_No = TeraTom.Dept_No
WHERE Salary > AVGSAL

We have to alias the aggregate, but dept_no defaults to its own name.

dept_no  last_name   first_name  salary    avgsal
200      Smith       John        48000.00  44944.44
400      Strickling  Cletus      54500.00  48333.33
400      Harrison    Herbert     54500.00  48333.33

Most derived tables involve calculations, aggregations or ordered analytics. This allows tables and derived columns to mix well on the final report. Above, we are finding all employees who make a salary that is greater than the average salary within their own department. We created a derived table that holds all departments and the average salary within the department. We then join the derived table (named TeraTom) to the employee_table where we can check the salary vs. the avg (salary).


Quiz - Answer the Questions

SELECT Dept_No, First_Name, Last_Name, AVGSAL
FROM Employee_Table
INNER JOIN
  (SELECT Dept_No, AVG(Salary)
   FROM Employee_Table
   GROUP BY Dept_No) as TeraTom (Depty, AVGSAL)
ON Dept_No = Depty ;

1) What is the name of the derived table? __________
2) How many columns are in the derived table? _______
3) What are the names of the derived table columns? ______
4) Is there more than one row in the derived table? _______
5) What common keys join the Employee and Derived? _______
6) Why were the join keys named differently? ______________

Answer the questions above and you will fully understand the components of a derived table.


Answer to Quiz - Answer the Questions

SELECT Dept_No, First_Name, Last_Name, AVGSAL
FROM Employee_Table
INNER JOIN
  (SELECT Dept_No, AVG(Salary)
   FROM Employee_Table
   GROUP BY Dept_No) as TeraTom (Depty, AVGSAL)
ON Dept_No = Depty ;

1) What is the name of the derived table? TeraTom
2) How many columns are in the derived table? 2
3) What are the names of the derived table columns? Depty and AVGSAL
4) Is there more than one row in the derived table? Yes
5) What keys join the tables? Dept_No and Depty
6) Why were the join keys named differently? If both were named Dept_No, the query would error unless we fully qualified the column names.

Great job!


Clever Tricks on Aliasing Columns in a Derived Table

1  SELECT Dept_No, First_Name, Last_Name, AVGSAL
   FROM Employee_Table
   INNER JOIN
     (SELECT Dept_No as Depty, AVG(Salary) as AVGSAL
      FROM Employee_Table
      GROUP BY Dept_No) as TeraTom
   ON Dept_No = Depty ;

   Alias here: Dept_No is renamed to Depty inside the derived table.

2  SELECT E.Dept_No, First_Name, Last_Name, AVGSAL
   FROM Employee_Table as E
   INNER JOIN
     (SELECT Dept_No, AVG(Salary) as AVGSAL
      FROM Employee_Table
      GROUP BY Dept_No) as TeraTom
   ON E.Dept_No = TeraTom.Dept_No ;

   Alias here: Employee_Table is aliased as E, so Dept_No can be qualified as E.Dept_No and TeraTom.Dept_No.

Check out a few clever tricks to help you with derived tables.

A Derived Table lives only for the lifetime of a single query

Begin Transaction ;

1) First query

WITH T (Dept_No, AVGSAL) AS
    (SELECT Dept_No, AVG(Salary)
     FROM Employee_Table
     GROUP BY Dept_No)
SELECT T.Dept_No, First_Name, Last_Name, AVGSAL
FROM Employee_Table as E
INNER JOIN T
ON E.Dept_No = T.Dept_No ;

2) Second query

SELECT * FROM T ;

Error – Query Fails. T does not exist.

END Transaction ;

We tried everything to see if the derived table would live past the current query. Notice above that we started with a BEGIN TRANSACTION statement. Then we ran our query that materialized our derived table named T. Then we attempted to run another query (within the same transaction) that did a SELECT * FROM T, and the query failed.
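Although T disappears as soon as the statement finishes, it can be referenced more than once inside that one statement. The following is a minimal sketch, assuming the chapter's Employee_Table; the alias D is only illustrative. It lists the departments whose average salary is above the average of all the department averages.

```sql
-- T exists only while this single statement runs,
-- but within the statement it can be referenced more than once.
WITH T (Dept_No, AVGSAL) AS
    (SELECT Dept_No, AVG(Salary)
     FROM Employee_Table
     GROUP BY Dept_No)
SELECT D.Dept_No, D.AVGSAL
FROM T AS D
WHERE D.AVGSAL > (SELECT AVG(AVGSAL) FROM T)   -- second reference to T
ORDER BY D.Dept_No ;
```

If the aggregated rows are needed by a later statement, a temporary table is the tool to reach for rather than a derived table.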

An Example of Two Derived Tables in a Single Query

WITH T (Dept_No, AVGSAL) AS
    (SELECT Dept_No, AVG(Salary)
     FROM Employee_Table
     GROUP BY Dept_No)
SELECT T.Dept_No, First_Name, Last_Name, AVGSAL, Counter
FROM Employee_Table as E
INNER JOIN T
ON E.Dept_No = T.Dept_No
INNER JOIN
    (SELECT Employee_No,
            SUM(1) OVER (PARTITION BY Dept_No
                         ORDER BY Dept_No, Last_Name
                         Rows Unbounded Preceding)
     FROM Employee_Table) as S (Employee_No, Counter)
ON E.Employee_No = S.Employee_No
ORDER BY T.Dept_No;

Above we have built two different derived tables. The first is named T and the second is named S. Notice that we materialized T using a WITH statement and we built S right after the INNER JOIN keywords.

Example of Two Derived Tables in a Single WITH Statement

WITH E AS
    (SELECT Dept_No, Last_Name, Salary
     FROM Employee_Table)
,D AS
    (SELECT Dept_No, Department_Name
     FROM Department_Table)
SELECT E.*, department_name
FROM E INNER JOIN D
ON E.Dept_No = D.Dept_No
WHERE E.Dept_No = 100

Separate multiple Derived tables in a WITH by using a comma

Result 1

e.dept_no  e.last_name  e.salary   department_name
100        Chambers     48850.00   Marketing

Above we have built two different derived tables within a single WITH statement. The first is named E and the second is named D. There is only one WITH statement, but the tables and definitions are separated with a comma.

Finding the First Occurrence of a Row using WITH

WITH Derived_Tbl AS
    (select Product_ID as Prod, Sale_Date, Daily_Sales,
            Row_Number() over (PARTITION BY product_id
                               ORDER BY Sale_Date ASC) AS Row_Num
     from sales_table)
Select * from Derived_Tbl
Where Row_Num = 1 ;

Result 1

Prod  Sale_Date   Daily_Sales  Row_Num
1000  09/28/2000  48850.40     1
2000  09/28/2000  41888.88     1
3000  09/28/2000  61301.77     1

Using the Row_Number ordered analytic, partitioning by Product_ID and sorting by Sale_Date ASC, we bring back only the first occurrence of a row for each product, based on the earliest Sale_Date. This can be done because we are placing our query in a derived table and then selecting from that derived table using a WHERE clause.

Finding the Last Occurrence of a Row using WITH

WITH Derived_Tbl AS
    (select Product_ID as Prod, Sale_Date, Daily_Sales,
            Row_Number() over (PARTITION BY product_id
                               ORDER BY Sale_Date Desc) AS Row_Num
     from sales_table)
Select * from Derived_Tbl
Where Row_Num = 1 ;

Result 1

Prod  Sale_Date   Daily_Sales  Row_Num
1000  10/04/2000  54553.10     1
2000  10/04/2000  32800.50     1
3000  10/04/2000  15675.33     1

Using the Row_Number ordered analytic, partitioning by Product_ID and sorting by Sale_Date DESC, we bring back only the last occurrence of a row for each product, based on the latest Sale_Date. This can be done because we are placing our query in a derived table and then selecting from that derived table using a WHERE clause.
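The derived table is needed because an ordered analytic such as Row_Number() cannot be tested in the WHERE clause of the same query block that computes it; the number has to exist before it can be filtered. The same pattern extends naturally from the first or last row to the first or last N rows. Here is a small sketch, assuming the same Sales_Table, that keeps the two earliest sales per product (switch ASC to DESC for the two latest):

```sql
WITH Derived_Tbl AS
    (SELECT Product_ID AS Prod, Sale_Date, Daily_Sales,
            ROW_NUMBER() OVER (PARTITION BY Product_ID
                               ORDER BY Sale_Date ASC) AS Row_Num
     FROM Sales_Table)
SELECT *
FROM Derived_Tbl
WHERE Row_Num <= 2          -- keep the two earliest rows per product
ORDER BY Prod, Row_Num ;
```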

Syntax for Temporary Tables

CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } ] TABLE [schema-name].table-name
    { ( column-definition [ , ... ] ) | [ column-name-list ] }
    [ ON COMMIT { DELETE | PRESERVE } ROWS ]
    [ AS [ AT EPOCH LATEST ] | [ AT TIME 'timestamp' ] query ]
    [ [ ORDER BY table-column [ , ... ] ]
      [ ENCODED BY column-definition [ , ... ] ]
      [ hash-segmentation-clause | range-segmentation-clause
        | UNSEGMENTED { NODE node | ALL NODES } ]
      [ KSAFE [ k-num ] ]
    | [ NO PROJECTION ] ]

The syntax above is for creating temporary tables. The definition of a Global table can be seen outside the session and persists until the table is dropped; only its data is private to the session that inserted it. Global is the default. Local tables can only be seen inside the session that created them and persist until the end of that session.

Temporary Tables Explained

Global Temporary Tables - The definition of a global temporary table is permanent in the database catalogs until explicitly removed by using the DROP TABLE command. Global temporary tables are created in the public schema, and they are visible to all users and sessions. But the contents (data) of a global table are private to the transaction or session in which the data was inserted. Data is automatically removed when the transaction commits, rolls back, or the session ends. This allows two users to use the same temporary table, but each only sees the data specific to his or her own transactions for the duration of those transactions or sessions.

Local Temporary Tables - A local temporary table is created in the V_TEMP_SCHEMA namespace and is inserted into the user's search path automatically. It can only be seen by the user who created the table, and it lasts for only the duration of the session in which it is created. When the session ends, the table definition is automatically dropped from the database catalogs. Local Temporary Tables can be dropped explicitly.

Above are the major differences between Global and Local Temporary tables.

Key Temporary Table Terms

Global - [Optional] Means that the table definition is visible to all sessions. Temporary table data is visible only to the session that materializes (inserts) the data into the table. Temporary tables in Vertica default to global.

Local - [Optional] Means that the table definition is visible only to the session in which it is created. Because temporary tables default to global, LOCAL must be stated explicitly.

On Commit Preserve|Delete rows - PRESERVE keeps the rows until the session ends and then truncates the table. DELETE truncates the rows after each COMMIT.

AT EPOCH LATEST | AT TIME - Used with AS query to query historical data. You can specify AT EPOCH LATEST to include data from the latest committed transaction, or specify a specific epoch based on a specific time stamp.

Above are the key terms you will want to know when creating a temporary table.
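The difference between the two ON COMMIT options is easiest to see side by side. The following is a minimal sketch, not taken from the book: the two table names are hypothetical, and the comments describe what the definitions above lead us to expect rather than captured output.

```sql
-- Hypothetical demo tables; the names are illustrative only.
CREATE LOCAL TEMPORARY TABLE Tmp_Delete_Demo (n INTEGER)
ON COMMIT DELETE ROWS ;

CREATE LOCAL TEMPORARY TABLE Tmp_Preserve_Demo (n INTEGER)
ON COMMIT PRESERVE ROWS ;

INSERT INTO Tmp_Delete_Demo VALUES (1) ;
INSERT INTO Tmp_Preserve_Demo VALUES (1) ;
COMMIT ;

-- After the COMMIT, the DELETE ROWS table should be empty ...
SELECT COUNT(*) FROM Tmp_Delete_Demo ;

-- ... while the PRESERVE ROWS table should still hold its row.
SELECT COUNT(*) FROM Tmp_Preserve_Demo ;

-- Local temporary tables can be dropped explicitly.
DROP TABLE Tmp_Delete_Demo ;
DROP TABLE Tmp_Preserve_Demo ;
```

This is why the examples that follow all specify ON COMMIT PRESERVE ROWS: the rows need to survive the commit so they can be queried for the rest of the session.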

Creating and Populating a Local Temporary Table

1) Create the table:

CREATE LOCAL Temporary TABLE Dept_Agg_Local
( Dept_no     Integer
 ,AVG_Salary  Decimal(10,2)
) ON COMMIT PRESERVE ROWS ;

2) Populate the table. Local tables are Materialized with an Insert/Select statement:

INSERT INTO Dept_Agg_Local
SELECT Dept_no ,AVG(Salary)
FROM Employee_Table
GROUP BY Dept_no ;

3) Query the table:

SELECT * FROM Dept_Agg_Local
ORDER BY 1;

Dept_No  AVG_Salary
?        32800.50
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33

1) A USER creates a Local Temporary Table and then 2) populates the Temporary Table with an INSERT/SELECT Statement. Now, 3) the user can query this table all session long. When the session is logged off, both the table definition and the data are automatically removed.

Using a Local Temporary Table

CREATE LOCAL Temporary TABLE Dept_Agg_Local2
( Dept_no     Integer
 ,AVG_Salary  Decimal(10,2)
) ON COMMIT PRESERVE ROWS ;

INSERT INTO Dept_Agg_Local2
SELECT Dept_no ,AVG(Salary)
FROM Employee_Table
GROUP BY Dept_no ;

SELECT E.*, AVG_Salary
FROM Employee_Table as E
INNER JOIN Dept_Agg_Local2
ON E.Dept_No = Dept_Agg_Local2.Dept_No
AND Salary > AVG_Salary ;

Employee_No  Dept_No  Last_Name   First_Name  Salary    AVG_Salary
1333454      200      Smith       John        48000.00  44944.44
1256349      400      Harrison    Herbert     54500.00  48333.33
1121334      400      Strickling  Cletus      54500.00  48333.33

We created the Local Temporary Table, materialized it and then used it in a join. The above query finds all employees making a greater salary than the AVG (Salary) within their own dept_no.

Creating and Populating a Global Temporary Table

1) Create the table:

CREATE Global Temporary TABLE Dept_Agg_Global
( Dept_no     Integer
 ,AVG_Salary  Decimal(10,2)
) ON COMMIT PRESERVE ROWS ;

2) Populate the table. Global tables are Materialized with an Insert/Select statement:

INSERT INTO Dept_Agg_Global
SELECT Dept_no ,AVG(Salary)
FROM Employee_Table
GROUP BY Dept_no ;

3) Query the table:

SELECT * FROM Dept_Agg_Global
ORDER BY 1;

Dept_No  AVG_Salary
?        32800.50
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33

1) A USER creates a Global Temporary Table once, and the table definition will persist permanently until it is dropped. Users can then 2) populate the Global Temporary Table with an INSERT/SELECT Statement. Now, 3) the user can query this table all session long. When the session is logged off, the table definition stays, but the data is automatically deleted (Truncated). Many different users can populate the table, but each sees only the rows they materialized.

Creating and Populating a Global Temporary Table

CREATE Global Temporary TABLE Dept_Agg_Global
( Dept_no     Integer
 ,AVG_Salary  Decimal(10,2)
) ON COMMIT PRESERVE ROWS ;

User 1:

INSERT INTO Dept_Agg_Global
SELECT Dept_no ,AVG(Salary)
FROM Employee_Table
WHERE Dept_No in (100, 200)
GROUP BY Dept_no ;

User n:

INSERT INTO Dept_Agg_Global
SELECT Dept_no ,AVG(Salary)
FROM Employee_Table
GROUP BY Dept_no
HAVING AVG(Salary) > 46000;

Both users above can only see the data they populated

The two users above have materialized the same Global Temporary table, but each sees only their own rows. Users cannot share the data in a Global Temporary table, only the definition.

Some Great Examples of Creating a Temporary Table Quickly

This table is created from the Sales_Table

CREATE TEMP TABLE Sales_Agg
ON COMMIT PRESERVE ROWS AS
SELECT Product_ID ,SUM(Daily_Sales) AS Sum_Sales
FROM Sales_Table
Group by Product_ID;

This table is materialized from a join

CREATE TEMP TABLE Emp_Dept
ON COMMIT PRESERVE ROWS AS
SELECT E.*, Department_Name, Budget
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No;

Above are two great examples to quickly CREATE a temporary Table from another table. In the first example the aggregate is given an alias (Sum_Sales, a name chosen here for illustration) so that the new table's column has a usable name.

Creating a Temporary Table That is sorted

This table is sorted by Order_Date and Customer_Number

CREATE GLOBAL TEMP TABLE Temp_Orders
( Order_Number     INTEGER
 ,Customer_Number  INTEGER
 ,Order_Date       Date
 ,Order_Total      Decimal(8,2)
) ON COMMIT PRESERVE ROWS
ORDER BY Order_Date, Customer_Number;

INSERT INTO Temp_Orders
SELECT * FROM Order_Table;

SELECT * FROM Temp_Orders;

A great reason to create a temporary table is to have it sorted.
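One point worth keeping in mind: the ORDER BY in the CREATE TABLE statement describes how the rows are stored, which can help queries that filter or sort on those columns. As in any SQL database, a SELECT is only guaranteed to return rows in a particular order when the query carries its own ORDER BY. A short sketch, assuming the Temp_Orders table created above:

```sql
-- Ask for the order explicitly; because the stored data is already
-- sorted on these columns, this sort should be inexpensive.
SELECT *
FROM Temp_Orders
ORDER BY Order_Date, Customer_Number ;
```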

A Temp Table That Populates some of the Rows

Create a Temporary Table with orders from September

CREATE Temp TABLE Order_Vol
ON COMMIT PRESERVE ROWS AS
(SELECT * FROM Order_Table
 WHERE Extract(Month from Order_Date) = 9);

Above is an example of creating a temporary table that is not an exact copy. It is only populating the table with orders from the month of September.

A Temporary Table with Some of the Columns

This creates a table with only three columns

CREATE Temporary TABLE Order_Vol5
ON COMMIT PRESERVE ROWS AS
(SELECT Customer_Number ,Order_Date, Order_Total
 FROM Order_Table) ;

Above is an example of creating a Temporary table with three columns. The original table had four columns.

Chapter 12 – Sub-query Functions

“An invasion of Armies can be resisted, but not an idea whose time has come.” - Victor Hugo

An IN List is much like a Subquery

SELECT *
FROM Employee_Table
WHERE Dept_No IN (100, 200) ;

Result 1

Employee_No  Dept_No  Last_Name  First_Name  Salary
1232578      100      Chambers   Mandee      48850.00
1324657      200      Coffing    Billy       41888.88
1333454      200      Smith      John        48000.00

This query is easy to understand. It uses an IN List to find all Employees who are in Dept_No 100 or Dept_No 200.

An IN List Never has Duplicates – Just like a Subquery

SELECT *
FROM Employee_Table
WHERE Dept_No IN (100, 100, 200, 200) ;

Duplicates in an IN-List are silly

Result 1

Employee_No  Dept_No  Last_Name  First_Name  Salary
1232578      100      Chambers   Mandee      48850.00
1324657      200      Coffing    Billy       41888.88
1333454      200      Smith      John        48000.00

The answer still only produced three rows

What is going on with this IN List? Why in the world are there duplicates in there? Will this query even work? What will the result set look like? Duplicate values are ignored here. We got the same rows back as before, and it is as if the system ignored the duplicate values in the IN List. That is exactly what happened.
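One way to see why the duplicates make no difference: an IN list is a membership test, logically the same as a chain of OR comparisons. Each employee row is tested once and either qualifies or does not, so repeating a value cannot make a row qualify twice. A sketch of the equivalent query, assuming the same Employee_Table:

```sql
-- Logically equivalent to: WHERE Dept_No IN (100, 100, 200, 200)
SELECT *
FROM Employee_Table
WHERE Dept_No = 100
   OR Dept_No = 100   -- the repeated test adds nothing
   OR Dept_No = 200
   OR Dept_No = 200 ;
```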

The Subquery

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

There is a Top Query and a Bottom Query!

SELECT *
FROM Employee_Table
WHERE Dept_No IN (
    SELECT Dept_No
    FROM Department_Table) ;

Which Query Runs First?

The query above is a Subquery, which means there are multiple queries in the same SQL statement. The bottom query runs first, and its purpose in life is to build a distinct list of values that it passes to the top query. The top query then returns the result set. This query solves the problem: Show all Employees in Valid Departments!

The Three Steps of How a Basic Subquery Works

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT *
FROM Employee_Table
WHERE Dept_No IN (
    SELECT Dept_No
    FROM Department_Table) ;

1) The Bottom Query runs first! It returns the Dept_No values 100, 200, 300, 400, 500.

2) The result is passed to the top query!

3) The top query runs using the bottom query answer set:

SELECT *
FROM Employee_Table
WHERE Dept_No IN (100, 200, 300, 400, 500) ;

The bottom query runs first and builds a distinct IN list. Then the top query runs using the list.

These are Equivalent Queries

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

1)

SELECT *
FROM Employee_Table
WHERE Dept_No IN (
    SELECT Dept_No
    FROM Department_Table) ;

2)

SELECT *
FROM Employee_Table
WHERE Dept_No IN (100, 200, 300, 400, 500) ;

Both queries above are the same. Query 2 has values in an IN list. Query 1 runs a subquery to build the values in the IN list.

The Final Answer Set from the Subquery

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

Notice that no employees are in dept 500

SELECT *
FROM Employee_Table
WHERE Dept_No IN (
    SELECT Dept_No
    FROM Department_Table) ;

Employee_No  Dept_No  Last_Name   First_Name  Salary
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1256349      400      Harrison    Herbert     54500.00
2341218      400      Reilly      William     36000.00
1121334      400      Strickling  Cletus      54500.00

Remember that columns from the subquery are never returned in the final answer set.

Quiz- Answer the Difficult Question

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

How are Subqueries similar to Joins between two tables?

A great question was asked above. Do you know the key to answering? Turn the page!

Answer to Quiz- Answer the Difficult Question

Employee_Table (Dept_No is the Foreign Key)

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table (Dept_No is the Primary Key)

Dept_No  Department_Name
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

How are Subqueries similar to Joins between two tables?

A Subquery between two tables or a Join between two tables will each need a common key that represents the relationship. This is called a Primary Key/Foreign Key relationship.

A Subquery will use a common key linking the two tables together, very similar to a join! When subquerying between two tables, look for the common link between the two tables. Most of the time they both have a column with the same name, but not always.

Should you use a Subquery or a Join?

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

When do I Subquery?

SELECT *
FROM Employee_Table
WHERE Dept_No IN (
    SELECT Dept_No
    FROM Department_Table) ;

When do I perform a Join?

SELECT E.*, Department_Name
FROM Employee_Table as E
Inner Join Department_Table as D
ON E.Dept_No = D.Dept_No;

If the final result set needs columns from only one table, use a Subquery. If the report needs columns from both tables, you have to do a Join.

Quiz- Write the Subquery

Customer_Table

Customer_Number  Customer_Name
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

Write the Subquery

Select all columns in the Customer_Table if the customer has placed an order!

Here is your opportunity to show how smart you are. Write a Subquery that will bring back everything from the Customer_Table if the customer has placed an order in the Order_Table. Good luck! Advice: Look for the common key among both tables!

Answer to Quiz- Write the Subquery

SELECT *
FROM Customer_Table
WHERE Customer_Number IN
    (SELECT Customer_Number
     FROM Order_Table) ;

Result 1

Customer_Number  Customer_Name        Phone_Number
11111111         Billy's Best Choice  555-1234
31323134         ACE Consulting       555-1212
57896883         XYZ Plumbing         347-8954
87323456         Databases N-U        322-1012

The common key among both tables is Customer_Number. The bottom query runs first and delivers a distinct list of Customer_Number values which the top query uses in the IN List!
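That distinct list is a practical difference from a join. A sketch under the sample data shown above, where customer 11111111 has two orders: the subquery returns each qualifying customer once, while a plain inner join returns one row per matching order, so that customer would appear twice unless DISTINCT is added. The aliases C and O are only illustrative.

```sql
-- Subquery: each qualifying customer appears once.
SELECT *
FROM Customer_Table
WHERE Customer_Number IN
    (SELECT Customer_Number
     FROM Order_Table) ;

-- Join: one row per matching order, so a customer with
-- two orders shows up twice.
SELECT C.*
FROM Customer_Table AS C
INNER JOIN Order_Table AS O
ON C.Customer_Number = O.Customer_Number ;

-- DISTINCT collapses the repeats back to one row per customer.
SELECT DISTINCT C.*
FROM Customer_Table AS C
INNER JOIN Order_Table AS O
ON C.Customer_Number = O.Customer_Number ;
```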

Quiz- Write the More Difficult Subquery

Customer_Table

Customer_Number  Customer_Name
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

Write the Subquery

Select all columns in the Customer_Table if the customer has placed an order over $10,000.00!

Here is your opportunity to show how smart you are. Write a Subquery that will bring back everything from the Customer_Table if the customer has placed an order in the Order_Table that is greater than $10,000.00.

Answer to Quiz- Write the More Difficult Subquery

SELECT *
FROM Customer_Table
WHERE Customer_Number IN (
    SELECT Customer_Number
    FROM Order_Table
    WHERE Order_Total > 10000.00) ;

Result 1

Customer_Number  Customer_Name        Phone_Number
11111111         Billy's Best Choice  555-1234
57896883         XYZ Plumbing         347-8954
87323456         Databases N-U        322-1012

Here is your answer!

Quiz – Write the Extreme Subquery

Course_Table

Course_ID  Course_Name               Credits  Seats
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        V2R3 SQL Features         2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

Student_Course_Table

Student_ID  Course_ID
280023      210
231222      210
125634      100
231222      220
125634      200
322133      220
125634      220
322133      300
324652      200
333450      500
260000      400
333450      400
234121      100
123250      100

Student_Table

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
423400      Larkins    Michael     FR          0.00
231222      Wilson     Susie       SO          3.80
280023      McRoberts  Richard     JR          1.90
322133      Bond       Jimmy       JR          3.95
125634      Hanson     Henry       FR          2.88
333450      Smith      Andy        SO          2.00
324652      Delaney    Danny       SR          3.35
260000      Johnson    Stanley     ?           ?
234121      Thomas     Wendy       FR          4.00
123250      Phillips   Martin      SR          3.00

Write SQL that will bring back an answer set that selects all columns from the Student_Table if that student is taking a course that has four (4) credits.

Use a subquery to get the answer set requested above. The answer is on the next page.

Answer to Quiz- Write the Extreme Subquery

SELECT S.*
FROM Student_Table as S
WHERE Student_ID IN
    (SELECT Student_ID
     FROM Student_Course_Table
     WHERE Course_ID IN
        (SELECT Course_ID
         FROM Course_Table
         WHERE Credits=4))

Result 1

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
260000      Johnson    Stanley     ?           ?
322133      Bond       Jimmy       JR          3.95
333450      Smith      Andy        SO          2.00

Above is something to enjoy and learn from.
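A nested subquery like this can be read from the inside out, exactly as the basic subquery was: the innermost query builds a list, the middle query uses it to build another list, and the top query uses that. The sketch below unrolls the three levels by hand; the literal values in the IN lists are what the sample rows shown in the quiz would produce, so they are tied to that data.

```sql
-- Step 1: the innermost query finds the four-credit courses
-- (Course_ID 300 and 400 in the sample data).
SELECT Course_ID
FROM Course_Table
WHERE Credits = 4 ;

-- Step 2: the middle query finds the students taking those courses.
SELECT Student_ID
FROM Student_Course_Table
WHERE Course_ID IN (300, 400) ;

-- Step 3: the top query returns those students' rows.
SELECT S.*
FROM Student_Table AS S
WHERE Student_ID IN (260000, 322133, 333450) ;
```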

Quiz- Write the Subquery with an Aggregate

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Write the Subquery

Select all columns in the Employee_Table if the employee makes a greater Salary than the AVERAGE Salary.

Another opportunity knocking! Would someone please answer the query door?

Answer to Quiz- Write the Subquery with an Aggregate

SELECT *
FROM Employee_Table
WHERE Salary > (
    SELECT AVG(Salary)
    FROM Employee_Table) ;

Result 1

Employee_No  Dept_No  Last_Name   First_Name  Salary
1000234      10       Smythe      Richard     64300.00
1121334      400      Strickling  Cletus      54500.00
1232578      100      Chambers    Mandee      48850.00
1333454      200      Smith       John        48000.00
1256349      400      Harrison    Herbert     54500.00

Notice that we are no longer using an IN clause, but instead a greater than sign.
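The greater than sign works here because the bottom query uses an aggregate with no GROUP BY, so it hands back exactly one value rather than a list. A sketch of the two steps run by hand, assuming the nine sample rows shown in the quiz (their salaries total 421039.38, which makes the average roughly 46782.15):

```sql
-- Step 1: the bottom query returns a single value.
SELECT AVG(Salary)
FROM Employee_Table ;

-- Step 2: the top query compares each Salary to that one value.
-- 46782.15 is the approximate average for the sample rows only.
SELECT *
FROM Employee_Table
WHERE Salary > 46782.15 ;
```

A comparison operator such as > expects one value on its right-hand side; when the bottom query can return several values, IN is the operator to use.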

Quiz- Write the Correlated Subquery

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Write the Correlated Subquery

Select all columns in the Employee_Table if the employee makes a greater Salary than the AVERAGE Salary (within their own Department).

Another opportunity knocking! This is a tough one, and only the best get this written correctly.

Answer to Quiz- Write the Correlated Subquery

SELECT *
FROM Employee_Table as EE
WHERE Salary > (
    SELECT AVG(Salary)
    FROM Employee_Table as EEEE
    WHERE EEEE.Dept_No = EE.Dept_No) ;

The last WHERE clause co-relates, or correlates, the top query to the bottom.

Result 1

Employee_No  Dept_No  Last_Name   First_Name  Salary
1121334      400      Strickling  Cletus      54500.00
1333454      200      Smith       John        48000.00
1256349      400      Harrison    Herbert     54500.00

A Correlated subquery uses a column from the top query in the WHERE clause of the bottom query. This correlates the top and bottom queries, thus the name correlated subquery. Since we wanted to see all salaries greater than the average salary within their own Dept_No, the correlating column is Dept_No. Both tables are aliased so that the WHERE clause can tell the top query's Dept_No from the bottom query's Dept_No.

The Basics of a Correlated Subquery The Top Query is Co-Related (Correlated) with the Bottom Query. The table name from the top query and the table name from the bottom query are given a different alias.

The bottom query WHERE clause co-relates Dept_No from Top and Bottom. The top query is run first. The bottom query is run one time for each distinct value delivered from the top query. SELECT * FROM Employee_Table as EE WHERE Salary > ( SELECT AVG(Salary) FROM Employee_Table as EEEE WHERE EE.Dept_No = EEEE.Dept_No) ;

A correlated subquery breaks all the rules. It is the top query that runs first. Then, the bottom query is run one time for each distinct value of the correlating column in the bottom WHERE clause. In our example, this is the column Dept_No, because that is the column the bottom WHERE clause compares. After the top query runs and brings back its rows, the bottom query will run one time for each distinct Dept_No. If this is confusing, it is not you. These take a little time to understand, but I have a plan to make you an expert. Keep reading!

The Top Query always runs first in a Correlated Subquery

SELECT *
FROM Employee_Table as EE
WHERE Salary > (
    SELECT AVG(Salary)
    FROM Employee_Table as EEEE
    WHERE EE.Dept_No = EEEE.Dept_No)

The top query (SELECT * FROM Employee_Table as EE) runs first:

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50   (Null is skipped)
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

The bottom query runs 1 time for each distinct Dept_No (EE.Dept_No = EEEE.Dept_No):

Dept_No  AVGSAL
_______  ________
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33

Only these three employees make more than the AVG salary within their own department:

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
1333454      200      Smith       John        48000.00
1256349      400      Harrison    Herbert     54500.00
1121334      400      Strickling  Cletus      54500.00

The top query runs first and then the bottom query is only run once per distinct Dept_No.

Correlated Subquery Example vs. a Join with a Derived Table

Correlated Subquery

SELECT Last_Name, Dept_No, Salary
FROM Employee_Table as EE
WHERE Salary > (
    SELECT AVG(Salary)
    FROM Employee_Table as EEEE
    WHERE EE.Dept_No = EEEE.Dept_No) ;

Last_Name   Dept_No  Salary
__________  _______  ________
Smith       200      48000.00
Harrison    400      54500.00
Strickling  400      54500.00

Join with a Derived Table

SELECT Last_Name, Dept_No, Salary, AVGSAL
FROM Employee_Table as E
INNER JOIN
    (SELECT Dept_No, AVG(Salary)
     FROM Employee_Table
     GROUP BY Dept_No) as TeraTom (Depty, AVGSAL)
ON Dept_No = Depty
AND Salary > AVGSAL ;

Last_Name   Dept_No  Salary    AVGSAL
__________  _______  ________  ________
Smith       200      48000.00  44944.44
Harrison    400      54500.00  48333.33
Strickling  400      54500.00  48333.33

Both queries above will bring back all employees making a salary that is greater than the average salary in their department. The biggest difference is that the Join with the Derived Table also shows the Average Salary in the result set.
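The comparison can be tried out with a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for Vertica (an assumption of this example, not something the book uses). The rows are the chapter's Employee_Table. SQLite does not accept the column-alias list after the derived table name, so the aliases Depty and AVGSAL are moved inside the derived SELECT, and the short alias T replaces TeraTom.

```python
import sqlite3

# Rows copied from the chapter's Employee_Table; None stands for the Null Dept_No.
EMPLOYEES = [
    (2000000, None, "Jones", "Squiggy", 32800.50),
    (1000234, 10, "Smythe", "Richard", 64300.00),
    (1232578, 100, "Chambers", "Mandee", 48850.00),
    (1324657, 200, "Coffing", "Billy", 41888.88),
    (1333454, 200, "Smith", "John", 48000.00),
    (2312225, 300, "Larkins", "Loraine", 40200.00),
    (1121334, 400, "Strickling", "Cletus", 54500.00),
    (2341218, 400, "Reilly", "William", 36000.00),
    (1256349, 400, "Harrison", "Herbert", 54500.00),
]

conn = sqlite3.connect(":memory:")  # SQLite is only a stand-in for Vertica
conn.execute(
    "CREATE TABLE Employee_Table ("
    "Employee_No INTEGER, Dept_No INTEGER, Last_Name TEXT, "
    "First_Name TEXT, Salary REAL)"
)
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?, ?, ?, ?)", EMPLOYEES)

# Correlated subquery: the bottom WHERE clause refers to the top query's Dept_No.
correlated = conn.execute("""
    SELECT Last_Name, Dept_No, Salary
    FROM Employee_Table AS EE
    WHERE Salary > (
        SELECT AVG(Salary)
        FROM Employee_Table AS EEEE
        WHERE EE.Dept_No = EEEE.Dept_No)
    ORDER BY Last_Name
""").fetchall()

# Join with a derived table: the averages are computed once, then joined.
derived = conn.execute("""
    SELECT E.Last_Name, E.Dept_No, E.Salary, T.AVGSAL
    FROM Employee_Table AS E
    INNER JOIN (
        SELECT Dept_No AS Depty, AVG(Salary) AS AVGSAL
        FROM Employee_Table
        GROUP BY Dept_No) AS T
    ON E.Dept_No = T.Depty AND E.Salary > T.AVGSAL
    ORDER BY E.Last_Name
""").fetchall()

for row in derived:
    print(row)
```

Both result lists hold the same three employees. The derived-table version carries the department average as an extra fourth column.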

Quiz- A Second Chance to Write a Correlated Subquery

Sales_Table (all rows are NOT displayed)

Product_ID  Sale_Date   Daily_Sales
__________  __________  ___________
1000        10/02/2000  32800.50
1000        09/30/2000  36000.07
1000        10/01/2000  40200.43
2000        10/04/2000  32800.50
2000        10/02/2000  36021.93
2000        09/28/2000  41888.88
3000        10/04/2000  15675.33
3000        10/02/2000  19678.94
3000        10/03/2000  21553.79

Write the Correlated Subquery Select all columns in the Sales_Table if the Daily_Sales column is greater than the Average Daily_Sales within its own Product_ID. Another opportunity knocking! This is your second chance. I will even give you a third chance.

Answer - A Second Chance to Write a Correlated Subquery

Select all columns in the Sales_Table if the Daily_Sales column is greater than the Average Daily_Sales within its own Product_ID.

SELECT *
FROM Sales_Table as TopS
WHERE Daily_Sales > (
    SELECT AVG(Daily_Sales)
    FROM Sales_Table as BotS
    WHERE TopS.Product_ID = BotS.Product_ID)
ORDER BY Product_ID, Sale_Date ;

Answer Set

Product_ID  Sale_Date   Daily_Sales
__________  __________  ___________
1000        09/28/2000  48850.40
1000        09/29/2000  54500.22
1000        10/03/2000  64300.00
1000        10/04/2000  54553.10
2000        09/29/2000  48000.00
2000        09/30/2000  49850.03
2000        10/01/2000  54850.29
3000        09/28/2000  61301.77
3000        09/29/2000  34509.13
3000        09/30/2000  43868.86

Notice that it is the Product_ID in the bottom WHERE clause.

Quiz- A Third Chance to Write a Correlated Subquery

Sales_Table (all rows are NOT displayed)

Product_ID  Sale_Date   Daily_Sales
__________  __________  ___________
1000        10/02/2000  32800.50
1000        09/30/2000  36000.07
1000        10/01/2000  40200.43
2000        10/04/2000  32800.50
2000        10/02/2000  36021.93
2000        09/28/2000  41888.88
3000        10/04/2000  15675.33
3000        10/02/2000  19678.94
3000        10/03/2000  21553.79

Write the Correlated Subquery Select all columns in the Sales_Table if the Daily_Sales column is greater than the Average Daily_Sales within its own Sale_Date. Another opportunity knocking! There is just one minor adjustment and you are home free.

Answer - A Third Chance to Write a Correlated Subquery

Select all columns in the Sales_Table if the Daily_Sales column is greater than the Average Daily_Sales within its own Sale_Date.

SELECT *
FROM Sales_Table as TopS
WHERE Daily_Sales > (
    SELECT AVG(Daily_Sales)
    FROM Sales_Table as BotS
    WHERE TopS.Sale_Date = BotS.Sale_Date)
ORDER BY Sale_Date ;

Answer Set

Product_ID  Sale_Date   Daily_Sales
__________  __________  ___________
3000        09/28/2000  61301.77
2000        09/29/2000  48000.00
1000        09/29/2000  54500.22
3000        09/30/2000  43868.86
2000        09/30/2000  49850.03
2000        10/01/2000  54850.29
2000        10/02/2000  36021.93
1000        10/02/2000  32800.50
2000        10/03/2000  43200.18
1000        10/03/2000  64300.00
1000        10/04/2000  54553.10

Notice that it is the Sale_Date in the bottom WHERE clause. Plus, we threw in an ORDER BY that is outside of the subquery.

Quiz- Last Chance to Write a Correlated Subquery

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

Write the Correlated Subquery Select all columns in the Student_Table if the Grade_Pt column is greater than the Average Grade_Pt within its own Class_Code. Another opportunity knocking! There is just one minor adjustment and you are home free.

Answer – Last Chance to Write a Correlated Subquery

Select all columns in the Student_Table if the Grade_Pt column is greater than the Average Grade_Pt within its own Class_Code.

SELECT *
FROM Student_Table as TopS
WHERE Grade_Pt > (
    SELECT AVG(Grade_Pt)
    FROM Student_Table as BotS
    WHERE TopS.Class_Code = BotS.Class_Code )
ORDER BY Class_Code ;

Answer Set

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
234121      Thomas      Wendy       FR          4.00
125634      Hanson      Henry       FR          2.88
322133      Bond        Jimmy       JR          3.95
231222      Wilson      Susie       SO          3.80
324652      Delaney     Danny       SR          3.35

Quiz – Write the Extreme Correlated Subquery

Course_Table

Course_ID  Course_Name              Credits  Seats
_________  _______________________  _______  _____
100        Database Concepts        3        50
200        Introduction to SQL      3        20
210        Advanced SQL             3        22
220        V2R3 SQL Features        2        25
300        Physical Database Design 4        20
400        Database Administration  4        16

Student_Course_Table

Student_ID  Course_ID
__________  _________
280023      210
231222      210
125634      100
231222      220
125634      200
322133      220
125634      220
322133      300
324652      200
333450      500
260000      400
333450      400
234121      100
123250      100

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

Write a correlated subquery that will bring back an answer set that returns all columns from the Course_Table if that course is being taken by a student who has a greater than average grade point within their own class code.

Use a subquery to get the answer set requested above. The answer is on the next page.

Answer To Quiz – Write the Extreme Correlated Subquery

SELECT *
FROM Course_Table
WHERE Course_ID IN
   (SELECT Course_ID
    FROM Student_Course_Table
    WHERE Student_ID IN
       (SELECT Student_ID
        FROM Student_Table AS s1
        WHERE Grade_Pt >
           (SELECT AVG(Grade_Pt)
            FROM Student_Table AS s2
            WHERE s1.Class_Code = s2.Class_Code) ) );

Course_ID  Course_Name              Credits  Seats
_________  _______________________  _______  _____
200        Introduction to SQL      3        20
100        Vertica Concepts         3        50
220        V2R3 SQL Features        2        25
300        Physical Database Design 4        20
210        Advanced SQL             3        22

Above is something to enjoy and learn from.

Quiz- Write the NOT Subquery

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

Write the Subquery

Select all columns in the Customer_Table if the Customer has NOT placed an order.

Another opportunity knocking! Write the above query!

Answer to Quiz- Write the NOT Subquery

SELECT *
FROM Customer_Table
WHERE Customer_Number NOT IN
   (SELECT Customer_Number
    FROM Order_Table
    WHERE Customer_Number IS NOT NULL) ;

Use this technique to get rid of Nulls.

Customer_Number  Customer_Name  Phone_Number
_______________  _____________  ____________
31313131         Acme Products  555-1111

When the list handed to a NOT IN contains a NULL value, the query returns nothing. The bottom query passes its Customer_Number values up to the top query, and the system cannot tell whether a customer is different from an unknown (NULL) value, so no row can ever qualify. That is why we used IS NOT NULL in the bottom WHERE clause.

Quiz- Write the Subquery using a WHERE Clause

Write the Subquery

Select all columns in the Order_Table that were placed by a customer with ‘Bill’ anywhere in their name.

Write the above query and then check out the results on the next page.

Answer - Write the Subquery using a WHERE Clause

SELECT *
FROM Order_Table
WHERE Customer_Number IN
   (SELECT Customer_Number
    FROM Customer_Table
    WHERE Customer_Name ilike '%Bill%') ;

Order_Number  Customer_Number  Order_Date  Order_Total
____________  _______________  __________  ___________
123456        11111111         05/04/1998  12347.53
123512        11111111         01/01/1999  8005.91

Great job on writing your query just like the above.

Quiz- Write the Subquery with Two Parameters

Write the Subquery

What is the highest dollar order for each Customer? This Subquery will involve two parameters!

Get ready to be amazed at either yourself or the Answer on the next page!

Answer to Quiz- Write the Subquery with Two Parameters

SELECT Customer_Number, Order_Number, Order_Total
FROM Order_Table
WHERE (Customer_Number, Order_Total) IN
   (SELECT Customer_Number, MAX(Order_Total)
    FROM Order_Table
    GROUP BY Customer_Number) ;

Customer_Number  Order_Number  Order_Total
_______________  ____________  ___________
57896883         123777        23454.84
11111111         123456        12347.53
31323134         123552        5111.47
87323456         123585        15231.62

This is how you utilize multiple parameters in a Subquery! Turn the page for more.

How the Double Parameter Subquery Works

SELECT Customer_Number, Order_Number, Order_Total
FROM Order_Table
WHERE (Customer_Number, Order_Total) IN
   (SELECT Customer_Number, MAX(Order_Total)
    FROM Order_Table
    GROUP BY 1) ;

The bottom query runs first, returning two columns:

Customer_Number  Max(Order_Total)
_______________  ________________
11111111         12347.53
31323134         5111.47
87323456         15231.62
57896883         23454.84

These 4 rows are sent to the top query. Next page for more info!

More on how the Double Parameter Subquery Works

Conceptually, the top query now uses the IN list built from the four rows returned by the bottom query:

SELECT Customer_Number, Order_Number, Order_Total
FROM Order_Table
WHERE (Customer_Number, Order_Total) IN (
    (11111111, 12347.53),
    (31323134,  5111.47),
    (87323456, 15231.62),
    (57896883, 23454.84) ) ;

The IN list is built and the top query can now process for the final Answer Set.
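The two steps described above can be sketched in code. This is a minimal illustration, assuming Python's built-in sqlite3 module as a stand-in for Vertica (an assumption of this example). The bottom query is run on its own to build the list of pairs, the top query's filter is then applied in Python, and the result is compared with the single-statement form. The single-statement form relies on row-value IN, which SQLite supports from version 3.15.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite is only a stand-in for Vertica
conn.execute(
    "CREATE TABLE Order_Table ("
    "Order_Number INTEGER, Customer_Number INTEGER, Order_Total REAL)"
)
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", [
    (123456, 11111111, 12347.53),
    (123512, 11111111, 8005.91),
    (123552, 31323134, 5111.47),
    (123585, 87323456, 15231.62),
    (123777, 57896883, 23454.84),
])

# Step 1: the bottom query runs first and returns one
# (Customer_Number, MAX(Order_Total)) pair per customer.
in_list = conn.execute(
    "SELECT Customer_Number, MAX(Order_Total) FROM Order_Table "
    "GROUP BY Customer_Number"
).fetchall()

# Step 2: the top query keeps each order whose pair appears in that list.
pairs = set(in_list)
two_step = [
    row for row in conn.execute(
        "SELECT Customer_Number, Order_Number, Order_Total "
        "FROM Order_Table ORDER BY Customer_Number, Order_Number")
    if (row[0], row[2]) in pairs
]

# The same logic as one statement, as written in the chapter.
one_statement = conn.execute(
    "SELECT Customer_Number, Order_Number, Order_Total FROM Order_Table "
    "WHERE (Customer_Number, Order_Total) IN ("
    "  SELECT Customer_Number, MAX(Order_Total) FROM Order_Table "
    "  GROUP BY Customer_Number) "
    "ORDER BY Customer_Number, Order_Number"
).fetchall()

print(two_step == one_statement)
```

Order 123512 drops out because its pair (11111111, 8005.91) is not in the list. Customer 11111111 contributes only the pair with the larger total.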

Quiz – Write the Triple Subquery

Write the Subquery

What is the Customer_Name of the customer who has the highest dollar order among all customers? Good luck in writing this. Remember that this will involve multiple Subqueries.

Answer to Quiz – Write the Triple Subquery

SELECT Customer_Name                  -- this runs third
FROM Customer_Table
WHERE Customer_Number IN
   (SELECT Customer_Number            -- runs second
    FROM Order_Table
    WHERE Order_Total IN
       (SELECT Max(Order_Total)       -- runs first
        FROM Order_Table))

Customer_Name
_____________
XYZ Plumbing

The answer is XYZ Plumbing.
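The inside-out run order can be made visible by executing each level as its own statement and feeding its result to the next one. The sketch below assumes Python's built-in sqlite3 module as a stand-in for Vertica; the variable names (max_total, customer_numbers, names) are invented for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite is only a stand-in for Vertica
conn.execute(
    "CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT)")
conn.execute(
    "CREATE TABLE Order_Table ("
    "Order_Number INTEGER, Customer_Number INTEGER, Order_Total REAL)")
conn.executemany("INSERT INTO Customer_Table VALUES (?, ?)", [
    (11111111, "Billy's Best Choice"),
    (31313131, "Acme Products"),
    (31323134, "ACE Consulting"),
    (57896883, "XYZ Plumbing"),
    (87323456, "Databases N-U"),
])
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", [
    (123456, 11111111, 12347.53),
    (123512, 11111111, 8005.91),
    (123552, 31323134, 5111.47),
    (123585, 87323456, 15231.62),
    (123777, 57896883, 23454.84),
])

# Runs first: the innermost query finds the highest order total.
max_total = conn.execute(
    "SELECT MAX(Order_Total) FROM Order_Table").fetchone()[0]

# Runs second: which customer(s) placed an order with that total?
customer_numbers = [r[0] for r in conn.execute(
    "SELECT Customer_Number FROM Order_Table WHERE Order_Total IN (?)",
    (max_total,))]

# Runs third: look up the names for those customer numbers.
marks = ", ".join("?" for _ in customer_numbers)
names = [r[0] for r in conn.execute(
    f"SELECT Customer_Name FROM Customer_Table "
    f"WHERE Customer_Number IN ({marks})", customer_numbers)]

# The nested form from the chapter gives the same answer in one statement.
nested = [r[0] for r in conn.execute(
    "SELECT Customer_Name FROM Customer_Table WHERE Customer_Number IN ("
    "  SELECT Customer_Number FROM Order_Table WHERE Order_Total IN ("
    "    SELECT MAX(Order_Total) FROM Order_Table))")]

print(names)
```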

Quiz – How many rows return on a NOT IN with a NULL?

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table (we added a Null value to the Order_Table)

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84
000099        NULL             9999.99

SELECT Customer_Name
FROM Customer_Table
WHERE Customer_Number NOT IN
   (SELECT Customer_Number
    FROM Order_Table ) ;

How many rows return from the query now that a NULL value is in a Customer_Number?

We really didn’t place a new row inside the Order_Table with a NULL value for the Customer_Number column, but in theory, if we had, how many rows would return?

Answer – How many rows return on a NOT IN with a NULL?

We added a Null value to the Order_Table.

SELECT Customer_Name
FROM Customer_Table
WHERE Customer_Number NOT IN
   (SELECT Customer_Number
    FROM Order_Table ) ;

How many rows return from the query now that a NULL value is in a Customer_Number? ZERO rows will return.

The answer is no rows come back. When a NOT IN list contains a NULL value, the system doesn’t know the value of NULL, so it can never confirm that a Customer_Number is absent from the list, and it returns nothing.

How to handle a NOT IN with Potential NULL Values

We added a Null value to the Order_Table.

SELECT Customer_Name
FROM Customer_Table
WHERE Customer_Number NOT IN
   (SELECT Customer_Number
    FROM Order_Table
    WHERE Customer_Number IS NOT NULL) ;

How many rows return NOW from the query? 1 (Acme Products)

You can utilize a WHERE clause that tests to make sure Customer_Number IS NOT NULL. This should be used when a NOT IN could encounter a NULL.
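The effect of the extra IS NOT NULL test can be checked with a small sketch, assuming Python's built-in sqlite3 module as a stand-in for Vertica (an assumption of this example; SQLite follows the same standard NULL rules for NOT IN). The Order_Table below includes the hypothetical order 000099 with a NULL Customer_Number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite is only a stand-in for Vertica
conn.execute(
    "CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT)")
conn.execute(
    "CREATE TABLE Order_Table ("
    "Order_Number INTEGER, Customer_Number INTEGER, Order_Total REAL)")
conn.executemany("INSERT INTO Customer_Table VALUES (?, ?)", [
    (11111111, "Billy's Best Choice"),
    (31313131, "Acme Products"),
    (31323134, "ACE Consulting"),
    (57896883, "XYZ Plumbing"),
    (87323456, "Databases N-U"),
])
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", [
    (123456, 11111111, 12347.53),
    (123512, 11111111, 8005.91),
    (123552, 31323134, 5111.47),
    (123585, 87323456, 15231.62),
    (123777, 57896883, 23454.84),
    (99, None, 9999.99),  # the added row with a NULL Customer_Number
])

# NOT IN against a list that contains a NULL: no row can qualify.
unfiltered = conn.execute(
    "SELECT Customer_Name FROM Customer_Table "
    "WHERE Customer_Number NOT IN ("
    "  SELECT Customer_Number FROM Order_Table)").fetchall()

# Filtering the NULL out of the bottom query restores the expected answer.
filtered = conn.execute(
    "SELECT Customer_Name FROM Customer_Table "
    "WHERE Customer_Number NOT IN ("
    "  SELECT Customer_Number FROM Order_Table "
    "  WHERE Customer_Number IS NOT NULL)").fetchall()

print(len(unfiltered), filtered)
```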

IN is equivalent to =ANY

SELECT *
FROM Customer_Table
WHERE Customer_Number = ANY
   (SELECT Customer_Number
    FROM Order_Table ) ;

= ANY is the same as IN.

Customer_Number  Customer_Name        Phone_Number
_______________  ___________________  ____________
11111111         Billy's Best Choice  555-1234
31323134         ACE Consulting       555-1212
57896883         XYZ Plumbing         347-8954
87323456         Databases N-U        322-1012

Instead of using IN, you can use = ANY. The two forms work the SAME: replacing = ANY with IN in the query above produces the same result set.

Using a Correlated Exists

SELECT *
FROM Customer_Table Top1
WHERE EXISTS
   (SELECT *
    FROM Order_Table Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number ) ;

EXISTS is a Boolean that is either true or false.

Customer_Number  Customer_Name        Phone_Number
_______________  ___________________  ____________
11111111         Billy's Best Choice  555-1234
31323134         ACE Consulting       555-1212
57896883         XYZ Plumbing         347-8954
87323456         Databases N-U        322-1012

The EXISTS command returns a Boolean: True if the bottom query finds at least one row, False if it finds none. If a customer placed an order, a matching row EXISTS, so with the Correlated Exists statement only customers who have placed an order will return in the answer set.

How a Correlated Exists matches up

Acme Products (Customer_Number 31313131) does not exist in the Order_Table.

SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE EXISTS
   (SELECT *
    FROM Order_Table as Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number ) ;

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Only customers who placed an order return with the above Correlated EXISTS.
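The matching can be pictured as a true/false test made once per customer row. The sketch below spells that out, assuming Python's built-in sqlite3 module as a stand-in for Vertica (an assumption of this example). For each customer it runs the bottom query with that customer's number and keeps the row only if something comes back, then compares the result with the single EXISTS statement. A real optimizer is free to execute the statement differently; the loop only mirrors the logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite is only a stand-in for Vertica
conn.execute(
    "CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT)")
conn.execute(
    "CREATE TABLE Order_Table ("
    "Order_Number INTEGER, Customer_Number INTEGER, Order_Total REAL)")
conn.executemany("INSERT INTO Customer_Table VALUES (?, ?)", [
    (11111111, "Billy's Best Choice"),
    (31313131, "Acme Products"),
    (31323134, "ACE Consulting"),
    (57896883, "XYZ Plumbing"),
    (87323456, "Databases N-U"),
])
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", [
    (123456, 11111111, 12347.53),
    (123512, 11111111, 8005.91),
    (123552, 31323134, 5111.47),
    (123585, 87323456, 15231.62),
    (123777, 57896883, 23454.84),
])

customers = conn.execute(
    "SELECT Customer_Number, Customer_Name FROM Customer_Table "
    "ORDER BY Customer_Number").fetchall()

# Row-by-row version: the bottom query is asked one true/false question
# per customer ("is there at least one order for this number?").
row_by_row = []
for number, name in customers:
    match = conn.execute(
        "SELECT 1 FROM Order_Table WHERE Customer_Number = ? LIMIT 1",
        (number,)).fetchone()
    if match is not None:
        row_by_row.append((number, name))

# The correlated EXISTS from the chapter, as a single statement.
exists_rows = conn.execute(
    "SELECT Customer_Number, Customer_Name FROM Customer_Table AS Top1 "
    "WHERE EXISTS ("
    "  SELECT * FROM Order_Table AS Bot1 "
    "  WHERE Top1.Customer_Number = Bot1.Customer_Number) "
    "ORDER BY Customer_Number").fetchall()

print(row_by_row == exists_rows)
```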

The Correlated NOT Exists

SELECT *
FROM Customer_Table Top1
WHERE NOT EXISTS
   (SELECT *
    FROM Order_Table Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number )

NOT EXISTS is a Boolean that is either true or false.

Customer_Number  Customer_Name  Phone_Number
_______________  _____________  ____________
31313131         Acme Products  555-1111

The EXISTS command returns a Boolean that is either True or False. If a customer has not placed an order, no matching row EXISTS, so with the Correlated NOT EXISTS statement only customers who have not placed an order will return in the answer set. Null values do not affect a NOT EXISTS statement like they do a NOT IN statement.

The Correlated NOT Exists Answer Set

Use NOT EXISTS to find which Customers have NOT placed an Order.

SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
   (SELECT *
    FROM Order_Table as Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number ) ;

Customer_Number  Customer_Name
_______________  _____________
31313131         Acme Products

The only customer who did NOT place an order was Acme Products.

Quiz – How many rows come back from this NOT Exists?

We added a Null value to the Order_Table (Order_Number 000099, Customer_Number NULL, Order_Total 9999.99).

SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
   (SELECT *
    FROM Order_Table as Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number ) ;

How many rows return from the query?

A NULL value in a list for queries with NOT IN returned nothing, but you must now decide if that is also true for the NOT EXISTS. How many rows will return?

Answer – How many rows come back from this NOT Exists?

We added a Null value to the Order_Table (Order_Number 000099, Customer_Number NULL, Order_Total 9999.99).

SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
   (SELECT *
    FROM Order_Table as Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number ) ;

How many rows return from the query? One row (Acme Products)

NOT EXISTS is unaffected by a NULL in the list. That’s why it is more flexible!
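A quick check of that claim, again as a sketch that assumes Python's built-in sqlite3 module as a stand-in for Vertica: the Order_Table below contains the hypothetical NULL row, and the NOT EXISTS query still returns Acme Products. The NULL Customer_Number simply never equals any Top1.Customer_Number, so it cannot make the bottom query return a row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite is only a stand-in for Vertica
conn.execute(
    "CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT)")
conn.execute(
    "CREATE TABLE Order_Table ("
    "Order_Number INTEGER, Customer_Number INTEGER, Order_Total REAL)")
conn.executemany("INSERT INTO Customer_Table VALUES (?, ?)", [
    (11111111, "Billy's Best Choice"),
    (31313131, "Acme Products"),
    (31323134, "ACE Consulting"),
    (57896883, "XYZ Plumbing"),
    (87323456, "Databases N-U"),
])
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", [
    (123456, 11111111, 12347.53),
    (123512, 11111111, 8005.91),
    (123552, 31323134, 5111.47),
    (123585, 87323456, 15231.62),
    (123777, 57896883, 23454.84),
    (99, None, 9999.99),  # the added row with a NULL Customer_Number
])

# NOT EXISTS with a NULL present in Order_Table.Customer_Number.
not_exists_rows = conn.execute(
    "SELECT Customer_Number, Customer_Name FROM Customer_Table AS Top1 "
    "WHERE NOT EXISTS ("
    "  SELECT * FROM Order_Table AS Bot1 "
    "  WHERE Top1.Customer_Number = Bot1.Customer_Number)").fetchall()

print(not_exists_rows)
```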


Chapter 13 – Strings

“It’s always been and always will be the same in the world: the horse does the work and the coachman is tipped.” - Anonymous

The LENGTH Command Counts Characters

SELECT First_Name
      ,LENGTH (First_Name) AS Lnth
FROM Employee_Table
WHERE LENGTH (First_Name) < 7
ORDER BY 1;

first_name  Lnth
__________  ____
Billy       5
Cletus      6
John        4
Mandee      6

The LENGTH command counts the number of characters. If ‘Tom’ was in the Employee_Table, his length would be 3.

The LENGTH Command – Spaces can Count too

SELECT 'T o m' AS First_Name
      ,LENGTH('T o m') AS Lnth

There are spaces in between each letter.

First_Name  Lnth
__________  ____
T o m       5

Spaces in between count.

If ‘T o m’ was in the Employee_Table, his length would be 5. Yes, spaces do count as characters.

The LENGTH Command and Character Data

Last_Name is defined as CHAR (20).

SELECT Last_Name
      ,LENGTH(Last_Name) AS Lnth
FROM Employee_Table
ORDER BY 1;

Last_Name   Lnth
__________  ____
Chambers    8
Coffing     7
Harrison    8
Jones       5
Larkins     7
Reilly      6
Smith       5
Smythe      6
Strickling  10

Even though Last_Name is a CHAR (20), the LENGTH command in Vertica automatically trims the trailing pad spaces before counting, so only the characters of the name itself are counted.
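What the trimming means can be shown with a tiny sketch in plain Python (no database involved). It assumes a fixed-width CHAR (20) value is held blank-padded on the right, and contrasts the padded width with the trimmed length that the answer set above reports. The variable names are invented for this illustration.

```python
# 'Jones' padded on the right with blanks to a width of 20,
# the way a fixed-width CHAR (20) value is assumed to be held here.
padded = "Jones".ljust(20)

# Counting every character includes the 15 pad spaces.
raw_length = len(padded)

# Trimming the trailing spaces first gives the figure shown in the answer set.
trimmed_length = len(padded.rstrip(" "))

print(raw_length, trimmed_length)
```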

LENGTH and CHARACTER_LENGTH Are Equivalent

Query 1

SELECT First_Name
      ,LENGTH(First_Name) AS C_Length
FROM Employee_Table ;

Query 2

SELECT First_Name
      ,CHARACTER_Length(First_Name) AS C_Length
FROM Employee_Table ;

These two queries will get you the SAME EXACT answer set in your report.

OCTET_LENGTH

Query 1

SELECT First_Name
      ,LENGTH(First_Name) AS C_Length
FROM Employee_Table ;

Query 2

SELECT First_Name
      ,CHARACTER_Length(First_Name) AS C_Length
FROM Employee_Table ;

Query 3

SELECT First_Name
      ,Octet_Length (First_Name) AS C_Length
FROM Employee_Table ;

You can also use the OCTET_LENGTH command. OCTET_LENGTH counts bytes (octets) rather than characters, so these three queries get the same exact answer sets here because every First_Name is made up of single-byte characters. Query 2 and 3 are ANSI Standard.
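The difference between counting characters and counting octets only shows up with multi-byte characters. The sketch below uses plain Python strings rather than a database and assumes UTF-8 encoding; the sample name with an accented letter is invented for illustration and is not in the Employee_Table.

```python
# (name, character count, byte count when encoded as UTF-8)
samples = ["Tom", "T o m", "Jos\u00e9"]  # the last one ends in an accented e

lengths = [(name, len(name), len(name.encode("utf-8"))) for name in samples]

for name, chars, octets in lengths:
    print(name, chars, octets)
```

For the plain ASCII names the two counts agree. The accented letter takes two bytes in UTF-8, so that name has 4 characters but 5 octets.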

UPPER and LOWER Commands

SELECT Last_Name AS "Name_Normal"
      ,UPPER (Last_Name) AS "Name_Upper"
      ,LOWER (Last_name) AS "Name_Lower"
FROM Employee_Table
WHERE Last_Name LIKE 'S%' ;

Name_Normal  Name_Upper  Name_Lower
___________  __________  __________
Smythe       SMYTHE      smythe
Strickling   STRICKLING  strickling
Smith        SMITH       smith

UPPER converts text to uppercase and LOWER converts text to lowercase.

Using the LOWER Command

SELECT LOWER('AbCdE') as "Go Low"
FROM Order_Table
Limit 1 ;

Go Low
______
abcde

The LOWER function converts all letters in a specified string to lowercase letters. If there are characters in the string that are not letters, they are not affected by the LOWER command.

A LOWER Command Example

SELECT 'They match' as "Do They Match?"
FROM Order_Table
WHERE LOWER('ABCDE') = 'abcde'
Limit 1 ;

Do They Match?
______________
They match

The LOWER function converts all letters in a specified string to lowercase letters. If there are characters in the string that are not letters, they are not affected by the LOWER command. Above, we compare a LOWER 'ABCDE' = 'abcde' and they are now equivalent because we have lowercased the 'ABCDE'.

Using the UPPER Command

SELECT UPPER('AbCdE') as "Go upper"
FROM Order_Table
Limit 1 ;

Go upper
________
ABCDE

The UPPER function converts all letters in a specified string to uppercase letters. If there are characters in the string that are not letters, they are not affected by the UPPER command.

An UPPER Command Example

SELECT 'They match' AS "Do They Match?"
FROM Order_Table
WHERE 'ABCDE' = UPPER('abcde')
LIMIT 1 ;

Do They Match?
______________
They match

Above, we compare 'ABCDE' to UPPER('abcde'). The two are equivalent because UPPER has uppercased 'abcde', so the row returns.

Non-Letters are Unaffected by UPPER and LOWER

SELECT LOWER('ABCDE1') AS "Number Stays"
      ,UPPER('abCdE2') AS "Numbers Hold"
FROM Order_Table
LIMIT 1 ;

Number Stays  Numbers Hold
____________  ____________
abcde1        ABCDE2

The UPPER and LOWER functions convert all letters in a specified string to either upper or lower case letters. If there are characters in the string that are not letters, they are not affected by the UPPER or LOWER commands. Notice in our example that the numbers 1 and 2 were unaffected by the LOWER and UPPER commands.
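The same rule can be checked outside the database. The short Python sketch below is only an illustration: it uses Python's built-in `str.upper` and `str.lower` as stand-ins for the SQL functions, an assumption that holds for plain ASCII text like the examples here. The helper names `sql_upper` and `sql_lower` are hypothetical.

```python
# Stand-ins for SQL UPPER and LOWER (assumption: plain ASCII input).
def sql_upper(text: str) -> str:
    return text.upper()


def sql_lower(text: str) -> str:
    return text.lower()


# Digits pass through unchanged, as in the "Number Stays" example.
print(sql_lower("ABCDE1"))
print(sql_upper("abCdE2"))

# Lowercasing one side makes the comparison succeed.
print(sql_lower("ABCDE") == "abcde")
```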

The TRIM Command Trims Both Leading and Trailing Spaces

Query 1

SELECT Last_Name
      ,TRIM(Last_Name) AS No_Spaces
FROM Employee_Table ;

Query 2

SELECT Last_Name
      ,TRIM(BOTH FROM Last_Name) AS No_Spaces
FROM Employee_Table ;

Both queries above do exactly the same thing. They remove the spaces from the beginning and the end of the column Last_Name.

Trim Combined with the LENGTH Function

SELECT '  Rodriquez  ' AS "Name"
      ,LENGTH (TRIM ('  Rodriquez  ')) AS No_Spaces ;

The literal '  Rodriquez  ' has 2 front spaces and 2 back spaces.

Name           No_Spaces
_____________  _________
  Rodriquez            9

The character count is only 9 because both the leading and the trailing spaces have been cut.

How to TRIM only the Trailing Spaces

SELECT '  Rodriquez  '
      ,LENGTH (TRIM (TRAILING FROM '  Rodriquez  ')) AS Front_Spaces ;

The literal '  Rodriquez  ' has 2 front spaces and 2 back spaces.

'  Rodriquez  '  Front_Spaces
_______________  ____________
  Rodriquez                11

TRAILING FROM tells TRIM to remove only the spaces behind the name. We get a character count of 11 because only the 2 trailing spaces are cut and the 2 leading spaces remain.
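The character counts above can be reproduced with a small model. This is a sketch in Python, not Vertica code: the hypothetical helper `sql_trim` imitates TRIM with BOTH, LEADING, or TRAILING for a single trim character, using Python's `strip`, `lstrip`, and `rstrip`.

```python
def sql_trim(text: str, side: str = "BOTH", char: str = " ") -> str:
    """Model of TRIM([BOTH | LEADING | TRAILING] [char] FROM text) for one character."""
    if side == "BOTH":
        return text.strip(char)
    if side == "LEADING":
        return text.lstrip(char)
    if side == "TRAILING":
        return text.rstrip(char)
    raise ValueError("side must be BOTH, LEADING, or TRAILING")


name = "  Rodriquez  "                      # 2 front spaces, 2 back spaces
print(len(name))                            # 13
print(len(sql_trim(name)))                  # 9
print(len(sql_trim(name, "TRAILING")))      # 11
```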

A Visual of the TRIM Command Using Concatenation

Concatenation without Trim and with Trim

SELECT Last_Name
      ,First_Name
      ,Last_Name || First_Name AS NameBackwards
      ,TRIM(Last_Name) || First_Name AS TrimNameBackwards
FROM Employee_Table

Last_Name   First_Name  NameBackwards                TrimNameBackwards
__________  __________  ___________________________  _________________
Jones       Squiggy     Jones               Squiggy  JonesSquiggy
Smith       John        Smith               John     SmithJohn
Smythe      Richard     Smythe              Richard  SmytheRichard
Harrison    Herbert     Harrison            Herbert  HarrisonHerbert
Chambers    Mandee      Chambers            Mandee   ChambersMandee
Strickling  Cletus      Strickling          Cletus   StricklingCletus
Reilly      William     Reilly              William  ReillyWilliam
Coffing     Billy       Coffing             Billy    CoffingBilly
Larkins     Loraine     Larkins             Loraine  LarkinsLoraine

When you use the TRIM command on a column, that column will have all beginning and ending spaces removed. Without TRIM, the spaces that pad Last_Name sit between the two names.

Trim and Trailing is Case Sensitive

SELECT First_Name
      ,TRIM(TRAILING 'Y' FROM First_Name) AS No_Y
      ,TRIM(TRAILING 'y' FROM First_Name) AS Success
FROM Employee_Table
ORDER BY 1 ;

First_Name is a VARCHAR. No_Y trims a capital 'Y', and Success trims a lowercase 'y'.

First_Name  No_Y      Success
__________  ________  __________
Billy       Billy     Bill
Cletus      Cletus    Cletus
Herbert     Herbert   Herbert
John        John      John
Loraine     Loraine   Loraine
Mandee      Mandee    Mandee
Richard     Richard   Richard
Squiggy     Squiggy   Squigg
William     William   William

For LEADING and TRAILING TRIM commands, case sensitivity is important.

How to TRIM Trailing Letters

SELECT First_Name
      ,TRIM(TRAILING 'y' FROM First_Name) AS No_Y
      ,Last_Name
      ,TRIM(TRAILING 'g' FROM (TRIM (Last_Name))) AS No_G
FROM Employee_Table ;

First_Name is a VARCHAR. Last_Name is a CHAR(20), so its spaces are trimmed first. Otherwise, the last character would be a space and not a 'g'.

First_Name  No_Y      Last_Name   No_G
__________  ________  __________  __________
Squiggy     Squigg    Jones       Jones
John        John      Smith       Smith
Richard     Richard   Smythe      Smythe
Herbert     Herbert   Harrison    Harrison
Mandee      Mandee    Chambers    Chambers
Cletus      Cletus    Strickling  Stricklin
William     William   Reilly      Reilly
Billy       Bill      Coffing     Coffin
Loraine     Loraine   Larkins     Larkins

The above example removed the trailing 'y' from the First_Name and the trailing 'g' from the Last_Name. Remember that this is case sensitive.
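A minimal Python sketch of the trailing-letter trim may help here. It assumes `str.rstrip` with a single character is a fair stand-in for TRIM(TRAILING 'x' FROM ...), and it pads the last name to 20 characters to imitate a CHAR(20) column. The helper name `trim_trailing` is hypothetical.

```python
def trim_trailing(text: str, char: str) -> str:
    """Model of TRIM(TRAILING char FROM text); the match is case sensitive."""
    return text.rstrip(char)


first_name = "Billy"
last_name = "Coffing".ljust(20)             # imitates a CHAR(20) column

print(trim_trailing(first_name, "Y"))       # Billy (capital Y does not match)
print(trim_trailing(first_name, "y"))       # Bill

# On the padded column the last character is a space, so no 'g' is removed.
print(trim_trailing(last_name, "g") == last_name)

# Trimming the spaces first exposes the trailing 'g'.
print(trim_trailing(trim_trailing(last_name, " "), "g"))    # Coffin
```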

The SUBSTRING Command

SELECT First_Name
      ,SUBSTRING (First_Name FROM 2 FOR 3) AS Quiz
FROM Employee_Table ;

FROM 2 means start in position 2. FOR 3 means go for 3 positions.

First_Name  Quiz
__________  ______
Squiggy     qui
John        ohn
Richard     ich
Herbert     erb
Mandee      and
Cletus      let
William     ill
Billy       ill
Loraine     ora

This is a SUBSTRING. After the column name, the substring is passed two parameters: the starting position in the string and the number of positions to return (from the starting position). The above example will start in position 2 and go for 3 positions!

SUBSTRING and SUBSTR are equal, but use different syntax

Query 1 with Substring

SELECT First_Name
      ,SUBSTRING(First_Name FROM 2 FOR 3) AS Quiz
FROM Employee_Table ;

Query 2 with Substr

SELECT First_Name
      ,SUBSTR (First_Name, 2, 3) AS Quiz2
FROM Employee_Table ;

Both queries above are going to yield the same results! SUBSTR is just a different way of doing a substring. Both take the starting position and the number of characters to return.

How SUBSTRING Works with NO ENDING POSITION

SELECT First_Name
      ,SUBSTRING (First_Name FROM 2) AS GoToEnd
FROM Employee_Table ;

Start in position 2.

First_Name  GoToEnd
__________  _________
Squiggy     quiggy
John        ohn
Richard     ichard
Herbert     erbert
Mandee      andee
Cletus      letus
William     illiam
Billy       illy
Loraine     oraine

If you don't tell the Substring the end position, it will go all the way to the end.

Using SUBSTRING to move backwards

SELECT First_Name
      ,SUBSTRING (First_Name FROM 0 FOR 6) AS Before1
FROM Employee_Table ;

Start in position 0 (one space before the first character).

First_Name  Before1
__________  ________
Squiggy     Squig
John        John
Richard     Richa
Herbert     Herbe
Mandee      Mande
Cletus      Cletu
William     Willi
Billy       Billy
Loraine     Lorai

A starting position of zero moves one space in front of the beginning. Notice that our FOR length is 6, so the six positions counted are 0 through 5, and 'Squiggy' turns into 'Squig'. The point being made here is that both the starting position and the ending position can move backwards, which will come in handy as you see other examples.

How SUBSTRING Works with a Starting Position of -1

SELECT First_Name
      ,SUBSTRING (First_Name FROM -1 FOR 3) AS Before2
FROM Employee_Table ;

Start in position -1. This is two spaces before the first character.

First_Name  Before2
__________  ________
Squiggy     S
John        J
Richard     R
Herbert     H
Mandee      M
Cletus      C
William     W
Billy       B
Loraine     L

A starting position of -1 moves two spaces in front of the beginning. Notice that our FOR length is 3, so the three positions counted are -1, 0 and 1, and each name delivers only the first initial. The point being made here is that both the starting position and the ending position can move backwards, which will come in handy as you see other examples.

How SUBSTRING Works with an Ending Position of 0

SELECT First_Name
      ,SUBSTRING (First_Name FROM 3 FOR 0) AS WhatsUp
FROM Employee_Table ;

Go for 0 positions.

First_Name  WhatsUp
__________  ________
Squiggy
John
Richard
Herbert
Mandee
Cletus
William
Billy
Loraine

In our example above, we start in position 3, but we go for zero positions, so nothing is delivered in the column. That is what's up!
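The window arithmetic behind the last few SUBSTRING examples can be sketched in a few lines of Python. This is a model of the behavior described in the text, not Vertica's implementation, and the name `sql_substring` is hypothetical. Whether a given database release accepts a starting position below 1 is not confirmed here; some engines reject it with an error.

```python
from typing import Optional


def sql_substring(text: str, start: int, length: Optional[int] = None) -> str:
    """Model of SUBSTRING(text FROM start [FOR length]) with 1-based positions."""
    if length is None:
        end = len(text) + 1          # no FOR clause: go to the end
    else:
        if length < 0:
            raise ValueError("length must not be negative")
        end = start + length         # first position after the window
    low = max(start, 1)              # positions before 1 hold no characters
    high = min(end, len(text) + 1)
    return text[low - 1:high - 1] if high > low else ""


print(sql_substring("Squiggy", 2, 3))     # qui
print(sql_substring("Squiggy", 2))        # quiggy
print(sql_substring("Squiggy", 0, 6))     # Squig
print(sql_substring("Squiggy", -1, 3))    # S
print(repr(sql_substring("Squiggy", 3, 0)))   # ''
```

The window always covers `length` positions counted from `start`. Positions that fall before the first character simply contribute nothing, which is why FROM 0 FOR 6 returns five characters and FROM -1 FOR 3 returns one.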

An Example using SUBSTRING, TRIM and LENGTH Together

SELECT Last_Name
      ,SUBSTRING(Last_Name FROM LENGTH(TRIM (TRAILING FROM Last_Name)) - 1 FOR 2) AS Letters
FROM Employee_Table ;

Last_Name is a CHAR(20).

Last_Name   Letters
__________  _______
Jones       es
Smith       th
Smythe      he
Harrison    on
Chambers    rs
Strickling  ng
Reilly      ly
Coffing     ng
Larkins     ns

The SQL above brings back the last two letters of each Last_Name. The tricky part is that the last names are different lengths. We first trimmed the trailing spaces off of the Last_Name. Then, we counted the characters in the Last_Name. Then, we subtracted one from the Last_Name character length and passed it to our substring as the starting position. For 'Jones', the length is 5, so the substring starts in position 4 and goes for 2 positions.
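The starting-position arithmetic can be traced step by step. The Python sketch below is an illustration only: it pads each name to 20 characters to imitate a CHAR(20) column and assumes every name has at least two letters. The function name `last_two_letters` is hypothetical.

```python
def last_two_letters(padded_name: str) -> str:
    """Mirror of SUBSTRING(name FROM LENGTH(TRIM(TRAILING FROM name)) - 1 FOR 2)."""
    trimmed_length = len(padded_name.rstrip(" "))   # LENGTH(TRIM(TRAILING FROM ...))
    start = trimmed_length - 1                      # 1-based starting position
    return padded_name[start - 1:start - 1 + 2]     # FOR 2


for name in ("Jones", "Strickling", "Coffing"):
    print(name, last_two_letters(name.ljust(20)))
```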

The POSITION Command finds a Letter's Position

SELECT Last_Name
      ,POSITION ('e' IN Last_Name) AS Find_The_E
      ,POSITION ('f' IN Last_Name) AS Find_The_F
FROM Employee_Table ;

Last_Name   Find_The_E  Find_The_F
__________  __________  __________
Jones       4           0            (e is in the 4th position)
Smith       0           0
Smythe      6           0
Harrison    0           0            (no f is in the name)
Chambers    6           0
Strickling  0           0
Reilly      2           0            (e is in the 2nd position)
Coffing     0           3            (the 1st f is in the 3rd position)
Larkins     0           0

This is the position counter. It tells you what position a letter is in. Why did Jones have a 4 in the result set? The 'e' was in the 4th position. Why did Smith get a zero for both columns? There is no 'e' in Smith and no 'f' in Smith. If there are two 'f's, only the first occurrence is reported.
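As a rough cross-check, the same numbers fall out of Python's `str.find`, which returns a 0-based index or -1 when nothing is found. Adding 1 gives the 1-based position, with 0 meaning "not found". The helper name `sql_position` is hypothetical, and this is a model of the behavior shown above rather than database code.

```python
def sql_position(needle: str, haystack: str) -> int:
    """Model of POSITION(needle IN haystack): 1-based, 0 when absent."""
    return haystack.find(needle) + 1


for last_name in ("Jones", "Smith", "Reilly", "Coffing"):
    print(last_name, sql_position("e", last_name), sql_position("f", last_name))
```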

Quiz – Find that SUBSTRING Starting Position

SELECT DISTINCT Department_Name AS Dept_Name
      ,SUBSTRING(Department_Name FROM POSITION(' ' IN Department_Name) + 1) AS Word2
FROM Department_Table
WHERE POSITION(' ' IN TRIM(Department_Name)) > 0 ;

Dept_Name             Word2
____________________  ___________
Customer Support      Support
Human Resources       Resources
Research and Develop  and Develop

What is the starting position of the Substring in the above query? Hint: This only looks for a Dept_Name that has two words or more.

Answer to Quiz – Find that SUBSTRING Starting Position

SELECT DISTINCT Department_Name AS Dept_Name
      ,SUBSTRING(Department_Name FROM POSITION(' ' IN Department_Name) + 1) AS Word2
FROM Department_Table
WHERE POSITION(' ' IN TRIM(Department_Name)) > 0 ;

Dept_Name             Word2
____________________  ___________
Customer Support      Support
Human Resources       Resources
Research and Develop  and Develop

What is the starting position of the Substring in the above query? The starting position is calculated by finding the position of the first SPACE and then adding 1.

Customer Support      (FROM 10)
Human Resources       (FROM 7)
Research and Develop  (FROM 10)

Using the SUBSTRING to Find the Second Word On

SELECT DISTINCT Department_Name AS Dept_Name
      ,SUBSTRING(Department_Name FROM POSITION(' ' IN Department_Name) + 1) AS Word2
FROM Department_Table
WHERE POSITION(' ' IN TRIM(Department_Name)) > 0 ;

Dept_Name             Word2
____________________  ___________
Customer Support      Support
Human Resources       Resources
Research and Develop  and Develop

Notice we only had three rows come back. That is because our WHERE looks only for a Department_Name that has multiple words. Then, notice that the starting position of the Substring is a nested POSITION function that looks for the first space. Then, it adds 1 to that position, and we have a starting position for the 2nd word. We don't give a FOR length parameter, so it goes to the end.

Quiz – Why did only one Row Return

SELECT Department_Name
      ,SUBSTRING(Department_Name FROM
          POSITION(' ' IN Department_Name) + 1 +
          POSITION(' ' IN SUBSTRING(Department_Name FROM
              POSITION(' ' IN Department_Name) + 1))) AS Third_Word
FROM Department_Table
WHERE POSITION(' ' IN TRIM(SUBSTRING(Department_Name FROM
          POSITION(' ' IN Department_Name) + 1))) > 0

Dept_Name             Third_Word
____________________  __________
Research and Develop  Develop

Why did only one row come back?

Answer to Quiz – Why Did only one Row Return

SELECT Department_Name
      ,SUBSTRING(Department_Name FROM
          POSITION(' ' IN Department_Name) + 1 +
          POSITION(' ' IN SUBSTRING(Department_Name FROM
              POSITION(' ' IN Department_Name) + 1))) AS Third_Word
FROM Department_Table
WHERE POSITION(' ' IN TRIM(SUBSTRING(Department_Name FROM
          POSITION(' ' IN Department_Name) + 1))) > 0

Dept_Name             Third_Word
____________________  __________
Research and Develop  Develop        (it has 3 words)

Why did only one row come back? It's the only Department_Name with three words. The SUBSTRING and the WHERE clause both look for the first space, and if they find it, they look for the second space. The SUBSTRING then adds 1 to the position of the second space, and its starting position is the third word. There is no FOR length, so it defaults to "go to the end".
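The nested POSITION and SUBSTRING calls are easier to follow when each piece is given a name. The Python sketch below models the second-word and third-word queries under the assumption that POSITION returns 0 when no space is found. All function names are hypothetical, and only the three department names shown above are used.

```python
def position(needle: str, text: str) -> int:
    """POSITION(needle IN text): 1-based, 0 when absent."""
    return text.find(needle) + 1


def substring_from(text: str, start: int) -> str:
    """SUBSTRING(text FROM start) for a start of 1 or more."""
    return text[start - 1:]


def second_word_on(name: str) -> str:
    return substring_from(name, position(" ", name) + 1)


def has_third_word(name: str) -> bool:
    # Mirrors the WHERE clause of the third-word query.
    return position(" ", second_word_on(name).strip(" ")) > 0


def third_word_on(name: str) -> str:
    rest = second_word_on(name)
    return substring_from(name, position(" ", name) + 1 + position(" ", rest))


for dept in ("Customer Support", "Human Resources", "Research and Develop"):
    if has_third_word(dept):
        print(dept, "->", third_word_on(dept))
```

For 'Research and Develop', the first space is in position 9 and the space inside 'and Develop' is in position 4, so the starting position is 9 + 1 + 4 = 14.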

Concatenation

Two Pipe Symbols together (no space) mean concatenate.

SELECT First_Name
      ,Last_Name
      ,First_Name || ' ' || Last_Name AS Full_Name
FROM Employee_Table
WHERE First_Name = 'Squiggy'

The literal ' ' between the pipe symbols is a space.

First_Name  Last_Name  Full_Name
__________  _________  _____________
Squiggy     Jones      Squiggy Jones

Two pipe symbols represent concatenation. That allows you to combine multiple columns into one column. The || (Pipe Symbol) on your keyboard is just above the ENTER key. Don’t put a space in between, just put two Pipe Symbols together. In this example, we have combined the first name, then a single space and then the last name to get a new column called Full_Name.

Concatenation and SUBSTRING

SELECT First_Name
      ,Last_Name
      ,SUBSTRING(First_Name, 1, 1) || '. ' || Last_Name AS Full_Name
FROM Employee_Table
WHERE First_Name = 'Squiggy' ;

The literal '. ' is a period (.) and a space.

First_Name  Last_Name  Full_Name
__________  _________  _________
Squiggy     Jones      S. Jones

Of the three items being concatenated together, what is the first item of concatenation in the example above? The first initial of the First_Name. Then, we concatenated a literal period and a space. Then, we concatenated the Last_Name.

Four Concatenations Together

SELECT First_Name
      ,Last_Name
      ,TRIM(Last_Name) || ' ' || SUBSTRING(First_Name, 1, 1) || '.' AS Last_Name_1st
FROM Employee_Table
WHERE First_Name = 'Squiggy' ;

First_Name is a VARCHAR(12). Last_Name is a CHAR(20).

First_Name  Last_Name  Last_Name_1st
__________  _________  _____________
Squiggy     Jones      Jones S.

Why did we TRIM the Last_Name? To get rid of the spaces, otherwise the output would have looked odd. How many items are being concatenated in the example above? There are 4 items concatenated. We start with the Last_Name (after we trim it), then we have a single space, then we have the first initial of the First_Name, and then we have a period.
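To see why the TRIM matters, the four pieces can be glued together in Python, where `+` plays the role of `||`. This is a sketch under one assumption: the last name is padded to 20 characters to imitate the CHAR(20) column. The variable names are hypothetical.

```python
last_name = "Jones".ljust(20)     # imitates CHAR(20): 'Jones' plus 15 spaces
first_name = "Squiggy"
initial = first_name[:1]          # SUBSTRING(First_Name, 1, 1)

# Without TRIM, the padding ends up in the middle of the result.
untrimmed = last_name + " " + initial + "."

# With TRIM, the four items sit next to each other.
trimmed = last_name.strip(" ") + " " + initial + "."

print(repr(untrimmed))
print(repr(trimmed))
```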

Troubleshooting Concatenation

ERROR: There should never be spaces between the pipe symbols.

SELECT First_Name
      ,Last_Name
      ,TRIM (Last_Name) | | First_Name AS LastFirst
FROM Employee_Table
WHERE First_Name = 'Squiggy' ;

This is now perfect:

SELECT First_Name
      ,Last_Name
      ,TRIM (Last_Name) || First_Name AS LastFirst
FROM Employee_Table
WHERE First_Name = 'Squiggy' ;

First_Name  Last_Name  LastFirst
__________  _________  ____________
Squiggy     Jones      JonesSquiggy

What happened above to cause the error? Can you see it? The Pipe Symbols in the first query have a space between them, like | |, when it should be ||. It is a tough one to spot, so be careful.


Chapter 14 – Interrogating the Data

"The difference between genius and stupidity is that genius has its limits" - Albert Einstein

Numeric Manipulation Functions

SELECT -10       AS Neg10
      ,Cos(90)   AS Cos     -- Trigonometric cosine of an angle
      ,Sin(90)   AS Sin     -- Trigonometric sine of an angle
      ,Tan(90)   AS Tan     -- Trigonometric tangent of an angle
      ,Exp(6)    AS Exp     -- Exponential value of a number
      ,Sqrt(16)  AS Sqrt    -- Square root of a number
FROM Order_Table
LIMIT 1 ;

Neg10  Cos    Sin   Tan  Exp     Sqrt
_____  _____  ____  ___  ______  ____
-10    -0.45  0.89  -2   403.43  4

The functions above are often used for algebraic, trigonometric, or geometric calculations. The angle is read in radians, not degrees, which is why Cos(90) returns -0.45 and not 0.
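The result row can be reproduced with Python's standard `math` module, which also works in radians. This is only a cross-check of the arithmetic, not database code.

```python
import math

angle = 90  # treated as radians, as in the query above

print(round(math.cos(angle), 2))   # -0.45
print(round(math.sin(angle), 2))   # 0.89
print(round(math.tan(angle)))      # -2
print(round(math.exp(6), 2))       # 403.43
print(math.sqrt(16))               # 4.0

# If 90 degrees was intended, convert to radians first.
print(abs(math.cos(math.radians(90))) < 1e-12)   # True
```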

Finding the Cube Root

SELECT cbrt(27.0) cube_root_27
      ,cbrt(216)  cube_root_216
FROM Order_Table
LIMIT 1 ;

cbrt finds the cube root. This example picks any table for the FROM clause and uses LIMIT 1 so only one row returns.

cube_root_27  cube_root_216
____________  _____________
3             6

Find the cube root with the cbrt function.

Ceiling Gets the Smallest Integer Not Smaller Than X

SELECT ceil(-0.1)        AS Ceil_1
      ,ceil(3.333)       AS Ceil_2
      ,order_total
      ,ceil(order_total) AS Ceiling_Total
FROM order_table
LIMIT 1 ;

ceil finds the smallest integer NOT smaller than X.

ceil_1  ceil_2  order_total  ceiling_total
______  ______  ___________  _____________
0       4       15231.62     15232

Find the smallest integer not smaller than x by using the ceil command. The name stands for a number's integer ceiling.

Floor Finds the Largest Integer Not Greater Than X

SELECT floor(-0.1)        AS Floor_1
      ,floor(3.333)       AS Floor_2
      ,order_total
      ,floor(order_total) AS Floor_Total
FROM order_table
LIMIT 1 ;

floor finds the largest integer NOT greater than X.

Floor_1  Floor_2  order_total  Floor_Total
_______  _______  ___________  ___________
-1       3        15231.62     15231

Find the largest integer not greater than x by using the floor command. The name stands for a number's integer floor.

The Round Function and Precision

SELECT Customer_Number
      ,Order_Total
      ,round(Order_Total, 0) AS no_decimals
      ,round(Order_Total, 1) AS one_decimal
FROM Order_Table

The second argument sets the precision: 0 rounds to no decimal places, and 1 rounds to 1 decimal place.

customer_number  order_total  no_decimals  one_decimal
_______________  ___________  ___________  ___________
87323456         15231.62     15232        15231.6
57896883         23454.84     23455        23454.8
31323134         5111.47      5111         5111.5
11111111         12347.53     12348        12347.5
11111111         8005.91      8006         8005.9

Use the round function to round to the precision you need.
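The ceil, floor, and round results shown in this chapter can be checked with a short Python sketch. Two assumptions are made: values are held as `Decimal` to avoid binary floating-point surprises, and ties are rounded half up. None of the sample values lands exactly on a half, so the tie rule does not affect the numbers here. The helper name `round_half_up` is hypothetical.

```python
import math
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value: str, places: int) -> Decimal:
    """Round a decimal string to `places` digits after the decimal point."""
    quantum = Decimal(1).scaleb(-places)      # 1, 0.1, 0.01, ...
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


print(math.ceil(-0.1), math.ceil(3.333), math.ceil(15231.62))      # 0 4 15232
print(math.floor(-0.1), math.floor(3.333), math.floor(15231.62))   # -1 3 15231

for total in ("15231.62", "23454.84", "5111.47", "12347.53", "8005.91"):
    print(total, round_half_up(total, 0), round_half_up(total, 1))
```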

Quiz – What would the Answer be?

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

SELECT Class_Code
      ,Grade_Pt / (Grade_Pt * 2) AS Math1
FROM Student_Table
ORDER BY 1, 2 ;

Can you guess what would return in the Answer Set? Using the Student_Table above, try to predict what the answer will be if this query is run on the system.

Answer to Quiz – What would the Answer be?

SELECT Class_Code
      ,Grade_Pt / (Grade_Pt * 2) AS Math1
FROM Student_Table
ORDER BY 1, 2 ;

Class_Code  Math1
__________  ___________________________
FR          0
FR          0.5000000000000000000000000
FR          0.5000000000000000000000000
JR          0.5000000000000000000000000
JR          0.5000000000000000000000000
SO          0.5000000000000000000000000
SO          0.5000000000000000000000000
SR          0.5000000000000000000000000
SR          0.5000000000000000000000000
?           ?

Above are your answers.

The NULLIFZERO Command

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

SELECT Class_Code
      ,Grade_Pt / (NULLIFZERO (Grade_Pt) * 2) AS Math1
FROM Student_Table
ORDER BY 1, 2 ;

If you have a calculation where a ZERO is not desired, you can use the NULLIFZERO command to convert any zero value to a null value. Turn the page and see the results.

The NULLIFZERO vs. Zeroes

Query 1 (without NULLIFZERO)

SELECT Class_Code AS Class
      ,Grade_Pt / (Grade_Pt * 2) AS Math1
FROM Student_Table
ORDER BY 1, 2 ;

Class  Math1
_____  ___________________________
FR     0
FR     0.5000000000000000000000000
FR     0.5000000000000000000000000
JR     0.5000000000000000000000000
JR     0.5000000000000000000000000
SO     0.5000000000000000000000000
SO     0.5000000000000000000000000
SR     0.5000000000000000000000000
SR     0.5000000000000000000000000
?      ?

Query 2 (with NULLIFZERO)

SELECT Class_Code AS Class
      ,Grade_Pt / (NULLIFZERO (Grade_Pt) * 2) AS Math1
FROM Student_Table
ORDER BY 1, 2 ;

Class  Math1
_____  ___________________________
FR     ?
FR     0.5000000000000000000000000
FR     0.5000000000000000000000000
JR     0.5000000000000000000000000
JR     0.5000000000000000000000000
SO     0.5000000000000000000000000
SO     0.5000000000000000000000000
SR     0.5000000000000000000000000
SR     0.5000000000000000000000000
?      ?

If you have a calculation where a ZERO is not desired, you can use the NULLIFZERO command to convert any zero value to a null value.
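How the null travels through the NULLIFZERO calculation can be sketched in Python, with `None` standing in for SQL NULL. This is a model under the usual SQL rule that arithmetic on a null yields a null. The function names `nullifzero` and `math1` are hypothetical, and only the NULLIFZERO version of the query is modeled.

```python
from typing import Optional


def nullifzero(value: Optional[float]) -> Optional[float]:
    """Model of NULLIFZERO: a zero becomes null, everything else is unchanged."""
    return None if value == 0 else value


def math1(grade_pt: Optional[float]) -> Optional[float]:
    """Model of Grade_Pt / (NULLIFZERO(Grade_Pt) * 2) with null propagation."""
    divisor = nullifzero(grade_pt)
    if grade_pt is None or divisor is None:
        return None                      # any null operand makes the result null
    return grade_pt / (divisor * 2)


for grade in (0.00, 3.80, 4.00, None):
    print(grade, "->", math1(grade))
```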

Quiz – Fill in the Blank Values in the Answer Set

Sample_Table

Cust_No  Acc_Balance  Location
_______  ___________  ________
0        ?            3

SELECT NULLIFZERO (Cust_No)     AS Cust_No
      ,NULLIFZERO (Acc_Balance) AS Acc_Balance
      ,NULLIFZERO (Location)    AS Location
FROM Sample_Table ;

Cust_No  Acc_Balance  Location
_______  ___________  ________


Fill in the Answer Set above after looking at the table and the query.

Okay! Time to show me your brilliance! What would the Answer Set produce?

Answer to Quiz – Fill in the Blank Values in the Answer Set

Sample_Table

Cust_No  Acc_Balance  Location
_______  ___________  ________
0        ?            3

SELECT NULLIFZERO (Cust_No)     AS Cust_No
      ,NULLIFZERO (Acc_Balance) AS Acc_Balance
      ,NULLIFZERO (Location)    AS Location
FROM Sample_Table ;

Cust_No  Acc_Balance  Location
_______  ___________  ________
?        ?            3

Here is the answer set! How did you do? The NULLIFZERO command found a zero in Cust_No, so it made it Null. The others were not zero, so they retained their value. The only time NULLIFZERO changes data is if it finds a zero, and then it changes it to null.

Quiz – Fill in the Answers for the NULLIF Command

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
123250      Phillips    Martin      SR          3.00
234121      Thomas      Wendy       FR          4.00

SELECT Last_Name
      ,NULLIF(Grade_Pt, 0)   AS GP1
      ,NULLIF(Grade_Pt, 3.0) AS GP2
      ,NULLIF(Grade_Pt, 4.0) AS GP3
FROM Student_Table
WHERE Student_ID IN (423400, 123250, 234121)
ORDER BY Last_Name ;

Fill in the Answer Set below after looking at the table and the query.

Last_Name   GP1   GP2   GP3
__________  ____  ____  ____
Larkins
Phillips
Thomas

What would the above Answer Set produce from your analysis?

Answer – Fill in the Answers for the NULLIF Command

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
123250      Phillips    Martin      SR          3.00
234121      Thomas      Wendy       FR          4.00

SELECT Last_Name
      ,NULLIF(Grade_Pt, 0)   AS GP1
      ,NULLIF(Grade_Pt, 3.0) AS GP2
      ,NULLIF(Grade_Pt, 4.0) AS GP3
FROM Student_Table
WHERE Student_ID IN (423400, 123250, 234121)
ORDER BY Last_Name ;

Last_Name   GP1   GP2   GP3
__________  ____  ____  ____
Larkins     ?     0.00  0.00
Phillips    3.00  ?     3.00
Thomas      4.00  4.00  ?

Look at the answers above, and if it doesn't make sense, go over it again until it does.
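If the pattern in the answer set is still unclear, a tiny Python model may help. NULLIF returns null when its two arguments are equal and returns the first argument otherwise. The function name `nullif` is hypothetical, and `None` stands in for NULL.

```python
from typing import Optional


def nullif(value: Optional[float], compare: float) -> Optional[float]:
    """Model of NULLIF(value, compare): null when the two are equal."""
    if value is None:
        return None
    return None if value == compare else value


grades = {"Larkins": 0.00, "Phillips": 3.00, "Thomas": 4.00}

for last_name, grade_pt in grades.items():
    print(last_name, nullif(grade_pt, 0), nullif(grade_pt, 3.0), nullif(grade_pt, 4.0))
```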

The ZEROIFNULL Command

Sample_Table

Cust_No  Acc_Balance  Location
_______  ___________  ________
0        ?            3

Notice the Null! We're turning it into a 0 shortly!

SELECT ZEROIFNULL (Cust_No)     AS Cust
      ,ZEROIFNULL (Acc_Balance) AS Balance
      ,ZEROIFNULL (Location)    AS Location
FROM Sample_Table ;

Cust   Balance    Location
_____  _________  _________


Fill in the Answer Set above after looking at the table and the query.

This is the ZEROIFNULL. What it will do is put a zero into a place where a NULL shows up. Fill in what you think the answer set will be.

Answer to the ZEROIFNULL Question

Sample_Table

Cust_No  Acc_Balance  Location
_______  ___________  ________
0        ?            3

SELECT ZEROIFNULL (Cust_No)     AS Cust
      ,ZEROIFNULL (Acc_Balance) AS Balance
      ,ZEROIFNULL (Location)    AS Location
FROM Sample_Table ;

Cust   Balance    Location
_____  _________  _________
0      0          3

The answer set placed a zero in the place of the NULL Acc_Balance, but the other values didn't change because they were NOT Null.
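ZEROIFNULL is the mirror image of NULLIFZERO, which a short Python sketch makes easy to see side by side. `None` stands in for NULL, the function names are hypothetical, and the single row is the Sample_Table row shown above.

```python
from typing import Optional


def zeroifnull(value: Optional[int]) -> int:
    """Model of ZEROIFNULL: a null becomes 0, everything else is unchanged."""
    return 0 if value is None else value


def nullifzero(value: Optional[int]) -> Optional[int]:
    """Model of NULLIFZERO: a 0 becomes null, everything else is unchanged."""
    return None if value == 0 else value


row = (0, None, 3)    # Cust_No, Acc_Balance, Location

print(tuple(zeroifnull(v) for v in row))    # (0, 0, 3)
print(tuple(nullifzero(v) for v in row))    # (None, None, 3)
```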

The COALESCE Command

Sample_Table

Last_Name   Home_Phone  Work_Phone  Cell_Phone
__________  __________  __________  __________
Jones       555-1234    444-1234    ?
Patel       ?           456-7890    454-6789
Gonzales    ?           ?           354-0987
Nguyen      ?           ?           ?

SELECT Last_Name
      ,COALESCE (Home_Phone, Work_Phone, Cell_Phone) AS Phone
FROM Sample_Table ;

Last_Name   Phone
__________  ________


Fill in the Answer Set above after looking at the table and the query.

Coalesce returns the first non-Null value in a list, and if all values are Null, returns Null.

The COALESCE Answer Set

Sample_Table

Last_Name   Home_Phone  Work_Phone  Cell_Phone
__________  __________  __________  __________
Jones       555-1234    444-1234    ?
Patel       ?           456-7890    454-6789
Gonzales    ?           ?           354-0987
Nguyen      ?           ?           ?

SELECT Last_Name
      ,COALESCE (Home_Phone, Work_Phone, Cell_Phone) AS Phone
FROM Sample_Table ;

Last_Name   Phone
__________  ________
Jones       555-1234
Patel       456-7890
Gonzales    354-0987
Nguyen      ?

Coalesce returns the first non-Null value in a list, and if all values are Null, returns Null.

The Coalesce Quiz

Sample_Table

Last_Name   Home_Phone  Work_Phone  Cell_Phone
__________  __________  __________  __________
Jones       555-1234    444-1234    ?
Patel       ?           456-7890    454-6789
Gonzales    ?           ?           354-0987
Nguyen      ?           ?           ?

SELECT Last_Name
      ,COALESCE (Home_Phone, Work_Phone, Cell_Phone, 'No Phone') AS Phone
FROM Sample_Table ;

Last_Name   Phone
__________  ________


Fill in the Answer Set above after looking at the table and the query.

Coalesce returns the first non-Null value in a list, and if all values are Null, returns Null. Since we decided in the above query that we don't want NULLs, notice we have placed a literal 'No Phone' in the list. How will this affect the Answer Set?

Answer – The Coalesce Quiz

Sample_Table

Last_Name   Home_Phone  Work_Phone  Cell_Phone
__________  __________  __________  __________
Jones       555-1234    444-1234    ?
Patel       ?           456-7890    454-6789
Gonzales    ?           ?           354-0987
Nguyen      ?           ?           ?

SELECT Last_Name
      ,COALESCE (Home_Phone, Work_Phone, Cell_Phone, 'No Phone') AS Phone
FROM Sample_Table ;

Last_Name   Phone
__________  ________
Jones       555-1234
Patel       456-7890
Gonzales    354-0987
No Phone is returned for Nguyen:
Nguyen      No Phone

Answers are above! We put a literal in the list so there's no chance of NULL returning.
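The "first non-Null value" rule is short enough to write out. The Python sketch below models COALESCE with `None` standing in for NULL and replays the four Sample_Table rows, with and without the 'No Phone' literal. The function name `coalesce` is hypothetical.

```python
from typing import Optional


def coalesce(*values: Optional[str]) -> Optional[str]:
    """Model of COALESCE: the first non-null argument, or null if all are null."""
    return next((v for v in values if v is not None), None)


sample_table = {
    "Jones":    ("555-1234", "444-1234", None),
    "Patel":    (None, "456-7890", "454-6789"),
    "Gonzales": (None, None, "354-0987"),
    "Nguyen":   (None, None, None),
}

for last_name, phones in sample_table.items():
    print(last_name, coalesce(*phones), coalesce(*phones, "No Phone"))
```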

The COALESCE Command – Fill In the Answers

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00

SELECT Last_Name
      ,Grade_Pt
      ,Student_ID
      ,COALESCE (Grade_Pt, Student_ID) AS ValidStudents
FROM Student_Table
WHERE Last_Name IN ('Johnson', 'Larkins', 'Thomas')
ORDER BY 1 ;

Fill in the Answer Set below after looking at the table and the query.

Last_Name   Grade_Pt  Student_ID  ValidStudents
__________  ________  __________  _____________
Johnson     ?         260000
Larkins     0.00      423400
Thomas      4.00      234121

Coalesce returns the first non-Null value in a list, and if all values are Null, returns Null.

The COALESCE Answer Set

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00

SELECT Last_Name
      ,Grade_Pt
      ,Student_ID
      ,COALESCE (Grade_Pt, Student_ID) AS ValidStudents
FROM Student_Table
WHERE Last_Name IN ('Johnson', 'Larkins', 'Thomas')
ORDER BY 1 ;

Last_Name   Grade_Pt  Student_ID  ValidStudents
__________  ________  __________  _____________
Johnson     ?         260000      260000.00
Larkins     0.00      423400      0.00
Thomas      4.00      234121      4.00

Coalesce returns the first non-Null value in a list, and if all values are Null, returns Null.

COALESCE is Equivalent to This CASE Statement

SELECT Last_Name
      ,Grade_Pt
      ,Class_Code
      ,COALESCE (Grade_Pt, Student_ID) AS ValidStudents
FROM Student_Table ;

SELECT Last_Name
      ,Grade_Pt
      ,Class_Code
      ,CASE WHEN Grade_Pt IS NOT NULL THEN Grade_Pt
            WHEN Student_ID IS NOT NULL THEN Student_ID
            ELSE NULL
       END AS ValidStudents
FROM Student_Table ;

Coalesce returns the first non-Null value in a list, and if all values are Null, returns Null. Above are two queries that return the exact same answer set. These examples are designed to give you a better idea of how Coalesce works.


Some Great CAST (Convert and Store) Examples

[Screenshot: Nexus Chameleon query window. System: Vertica. Database: SQL Class.]

SELECT CAST('ABCDE' AS CHAR(1) ) AS Trunc
      ,CAST(128 AS CHAR(3) ) AS OK
      ,CAST(127 AS INTEGER ) AS Bigger

TRUNC  OK   BIGGER
_____  ___  ______
A      128     127

The first CAST truncates the five characters, keeping only the leftmost one, to form the single character ‘A’. In the second CAST, the integer 128 is converted to three characters and left-justified in the output. In the third CAST, the 127 is converted to an INTEGER and is right-justified in the output as a numeric value. Note that Vertica treats SMALLINT and INTEGER as synonyms for the same 64-bit integer type, so this last conversion does not change how the value is stored.


Some Great CAST (Convert and Store) Examples

SELECT CAST(121.53 AS SMALLINT) AS Whole
      ,CAST(121.53 AS DECIMAL(3,0)) AS Rounder ;

Whole  Rounder
_____  _______
  122      122

SELECT CAST(121.49 AS SMALLINT) AS Whole
      ,CAST(121.53 AS DECIMAL(3,0)) AS Rounder ;

Whole  Rounder
_____  _______
  121      122

SELECT CAST(121.50 AS SMALLINT) AS Whole
      ,CAST(121.53 AS DECIMAL(3,0)) AS Rounder ;

Whole  Rounder
_____  _______
  122      122

The value 121.53 starts out as a DECIMAL with 5 total digits, 2 of them to the right of the decimal point. CAST then converts it to a SMALLINT, which removes the decimal positions. The conversion does not simply strip off the decimal portion; it rounds, and because .53 is greater than .50, the result rounds up to 122. The second column, called Rounder, is converted to a DECIMAL with 3 digits and no decimal places, so it also rounds. Since .53 is greater than .5, it is rounded up to 122.
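The rounding described above can be modeled with a short sketch. It uses Python's decimal module with ROUND_HALF_UP as an assumed model of the behavior shown on this page; it is not Vertica's implementation, and the helper name cast_to_whole is made up for the illustration.

```python
from decimal import Decimal, ROUND_HALF_UP


def cast_to_whole(literal: str) -> int:
    """Round a decimal literal to a whole number; ties round away from zero."""
    return int(Decimal(literal).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# The three Whole values from the queries above.
for literal in ("121.53", "121.49", "121.50"):
    print(literal, "->", cast_to_whole(literal))
```

The three literals come out as 122, 121, and 122, matching the Whole column in each query.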


A Rounding Example

SELECT CAST(.014  AS Decimal(3,2)) AS ".014"
      ,CAST(.016  AS Decimal(3,2)) AS ".016"
      ,CAST(.015  AS Decimal(3,2)) AS ".015"
      ,CAST(.0150 AS Decimal(3,2)) AS ".0150"
      ,CAST(.0250 AS Decimal(3,2)) AS ".0250"
      ,CAST(.0159 AS Decimal(3,2)) AS ".0159"

.014  .016  .015  .0150  .0250  .0159
____  ____  ____  _____  _____  _____
0.01  0.02  0.02   0.02   0.03   0.02

Rounding isn't always intuitive as you can see from the examples above.
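One reason rounding can surprise people is that different systems break ties differently. The sketch below is an illustration in Python's decimal module, not Vertica code. It rounds the same six literals to two decimal places twice: once with ties rounding up (which matches the answers shown in this example) and once with "banker's rounding", where ties go to the even digit. The function name to_decimal_3_2 is made up for the sketch.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

LITERALS = [".014", ".016", ".015", ".0150", ".0250", ".0159"]
HUNDREDTHS = Decimal("0.01")  # two decimal places, like Decimal(3,2)


def to_decimal_3_2(literal: str, rounding: str) -> Decimal:
    """Round a literal to two decimal places using the given tie rule."""
    return Decimal(literal).quantize(HUNDREDTHS, rounding=rounding)


half_up = [str(to_decimal_3_2(x, ROUND_HALF_UP)) for x in LITERALS]
half_even = [str(to_decimal_3_2(x, ROUND_HALF_EVEN)) for x in LITERALS]

for literal, up, even in zip(LITERALS, half_up, half_even):
    print(f"{literal:>6}  half-up: {up}  half-even: {even}")
```

The two rules disagree only on .0250, which is an exact tie: half-up gives 0.03, while half-even gives 0.02.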


Some Great CAST (Convert and Store) Examples

SELECT Order_Number as OrdNo
      ,Customer_Number as CustNo
      ,Order_Date
      ,Order_Total
      ,CAST(Order_Total as integer) as Chopped
      ,CAST(Order_Total as Decimal(5,0)) as Rounded
FROM Order_Table
ORDER BY Order_Date ;

OrdNo   CustNo    Order_Date  Order_Total  Chopped  Rounded
______  ________  __________  ___________  _______  _______
123456  11111111  05/04/1998     12347.53    12348    12348
123512  11111111  01/01/1999      8005.91     8006     8006
123777  57896883  09/09/1999     23454.84    23455    23455
123552  31323134  10/01/1999      5111.47     5111     5111
123585  87323456  10/10/1999     15231.62    15232    15232

Notice how the rounding did not take place as you might have expected.


Quiz - The Basics of the CASE Statements

Course_Table

Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        SQL Features              2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

SELECT Course_Name
      ,CASE Credits
            WHEN 1 THEN 'One Credit'
            WHEN 2 THEN 'Two Credits'
            WHEN 3 THEN 'Three Credits'
       END AS CreditAlias
FROM Course_Table
WHERE Course_ID IN (220, 300) ;

Course_Name               CreditAlias
________________________  ___________
Physical Database Design
SQL Features

This is a CASE statement, which allows you to evaluate a column in your table and, from that, come up with a new answer for your report. Every CASE begins with the keyword CASE and must end with a corresponding END. What would the answer be?


Answer to Quiz - The Basics of the CASE Statements

Course_Table

Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        SQL Features              2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

SELECT Course_Name
      ,CASE Credits
            WHEN 1 THEN 'One Credit'
            WHEN 2 THEN 'Two Credits'
            WHEN 3 THEN 'Three Credits'
       END AS CreditAlias
FROM Course_Table
WHERE Course_ID IN (220, 300) ;

Course_Name               CreditAlias
________________________  ___________
Physical Database Design  ?
SQL Features              Two Credits

The answer for the Physical Database Design class is Null. This is because it fell through the CASE statement: the course has 4 credits, none of the WHEN clauses checks for 4, and there is no ELSE. The answer for the SQL Features course is Two Credits. Once a CASE statement gets a match, it leaves the statement and gets the next row.
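Here is a minimal sketch of the fall-through, assuming Python's built-in sqlite3 module as a stand-in for Vertica. The ORDER BY is an addition so the rows come back in a predictable order. Python shows a Null as None.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Vertica (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Course_Table ("
    " Course_ID INTEGER, Course_Name TEXT, Credits INTEGER, Seats INTEGER)"
)
conn.executemany(
    "INSERT INTO Course_Table VALUES (?, ?, ?, ?)",
    [
        (100, "Database Concepts", 3, 50),
        (200, "Introduction to SQL", 3, 20),
        (210, "Advanced SQL", 3, 22),
        (220, "SQL Features", 2, 25),
        (300, "Physical Database Design", 4, 20),
        (400, "Database Administration", 4, 16),
    ],
)

rows = conn.execute(
    """
    SELECT Course_Name,
           CASE Credits
                WHEN 1 THEN 'One Credit'
                WHEN 2 THEN 'Two Credits'
                WHEN 3 THEN 'Three Credits'
           END AS CreditAlias
    FROM Course_Table
    WHERE Course_ID IN (220, 300)
    ORDER BY Course_Name
    """
).fetchall()

# No WHEN matches 4 credits and there is no ELSE, so that row gets NULL.
for row in rows:
    print(row)
```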


Using an ELSE in the Case Statement

Course_Table

Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        SQL Features              2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

SELECT Course_Name
      ,CASE Credits
            WHEN 1 THEN 'One Credit'
            WHEN 2 THEN 'Two Credits'
            WHEN 3 THEN 'Three Credits'
            ELSE 'Four Credits'
       END AS CreditAlias
FROM Course_Table
WHERE Course_ID IN (220, 300) ;

Course_Name               CreditAlias
________________________  ____________
Physical Database Design  Four Credits
SQL Features              Two Credits

Now that we have an ELSE in our case statement we are guaranteed that nothing will fall through.


Using an ELSE as a Safety Net

Course_Table

Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        SQL Features              2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

SELECT Course_Name
      ,CASE Credits
            WHEN 1 THEN 'One Credit'
            WHEN 2 THEN 'Two Credits'
            WHEN 3 THEN 'Three Credits'
            WHEN 4 THEN 'Four Credits'
            ELSE 'Do not know'
       END AS CreditAlias
FROM Course_Table ;

Now that we have an ELSE in our case statement we are guaranteed that nothing will fall through. An ELSE should be used in case you forgot a possibility and there was no match.
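The safety net is easiest to see when a value shows up that nobody planned for. This sketch assumes Python's built-in sqlite3 module as a stand-in for Vertica, and it adds one hypothetical 5-credit course (not in the book's table) so that the ELSE actually fires. The ORDER BY is also an addition for predictable output.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Vertica (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Course_Table ("
    " Course_ID INTEGER, Course_Name TEXT, Credits INTEGER, Seats INTEGER)"
)
conn.executemany(
    "INSERT INTO Course_Table VALUES (?, ?, ?, ?)",
    [
        (100, "Database Concepts", 3, 50),
        (200, "Introduction to SQL", 3, 20),
        (210, "Advanced SQL", 3, 22),
        (220, "SQL Features", 2, 25),
        (300, "Physical Database Design", 4, 20),
        (400, "Database Administration", 4, 16),
        # Hypothetical row (not in the book): a credit value no WHEN covers.
        (500, "Hypothetical Seminar", 5, 10),
    ],
)

rows = conn.execute(
    """
    SELECT Course_Name,
           CASE Credits
                WHEN 1 THEN 'One Credit'
                WHEN 2 THEN 'Two Credits'
                WHEN 3 THEN 'Three Credits'
                WHEN 4 THEN 'Four Credits'
                ELSE 'Do not know'
           END AS CreditAlias
    FROM Course_Table
    ORDER BY Course_ID
    """
).fetchall()

# Every row gets a label; the unplanned 5-credit course hits the ELSE.
for row in rows:
    print(row)
```

Without the ELSE, the hypothetical seminar would come back as a Null instead of 'Do not know'.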


Rules for a Valued Case Statement

SELECT Course_Name
      ,CASE Credits
            WHEN 1 THEN 'One Credit'
            WHEN 2 THEN 'Two Credits'
            WHEN 3 THEN 'Three Credits'
            ELSE 'Credits not found'
       END AS CreditAlias
FROM Course_Table ;

The column Credits follows the word CASE. This is a valued CASE statement. The value is the column Credits.

Rules for a Valued CASE:
1. You can only check for equality
2. You can only check the value of the column Credits

There are two types of CASE statements. There is the Valued CASE and the Searched CASE. Above are the rules for the Valued CASE statement.


Rules for a Searched Case Statement SELECT Course_Name No Value follows the ,CASE word CASE. This is WHEN Credits