One of the most exciting new database inventions is Columnar technology. HP has built one of the best columnar databases
English Pages 766 Year 2015
The Tera-Tom Video Series
Lessons with Tera-Tom Teradata Architecture and SQL Video Series These exciting videos make learning and certification much easier
Three ways to view them: 1. Safari (look up Coffing Studios) 2. CoffingDW.com (sign-up on our website) 3. Your company can buy them all for everyone to see (contact [email protected])
The Tera-Tom Genius Series
The Tera-Tom Genius Series consists of ten books. Each book is designed for a specific audience, and Teradata is explained to the level best suited for that audience. The books take a building block approach; always starting out simple, then each page builds upon the previous point. Order them all at www.CoffingDW.com.
Tera-Tom: Author of over 50 Books
Tera-Tom books have been the primary source of Teradata learning for over 20 years. They have helped to teach millions of people all aspects of Teradata. What people love most about the Tera-Tom books is how easy they are to understand. They are so easy that a seven-year-old boy (raised by wolves) can understand them!
The Best Query Tool Works on all Systems
When you possess a tool like Nexus, you have access to every system in your enterprise! The Nexus Query Chameleon is the only tool that works on all systems. Its Super Join Builder allows for the ERwin Logical Model to be loaded, and then Nexus shows tables and views visually. It then guides users to show what joins to what. As users choose the tables and columns they want in their report, Nexus builds the SQL for them with each click of the mouse. Nexus was designed for Teradata and Hadoop, but works on all platforms. Nexus even converts table structures between vendors, so querying and managing multi-vendor platforms is transparent. Even if you only work with one system, you will find that the Nexus is the best query tool you have ever used. If you work with multiple systems, you will be even more amazed. Download a free trial at www.CoffingDW.com.
Trademarks and Copyrights
Microsoft Windows, Windows 2003 Server, SQL Server 2012, SQL Server Compact Edition, .NET, PDW, SQL Server, T-SQL, Azure SQL Data Warehouse and Azure Cloud are trademarks of Microsoft. Teradata, NCR, BYNET and SQL Assistant are registered trademarks of Teradata Corporation, Dayton, Ohio, U.S.A. IBM, DB2 and Netezza are registered trademarks of IBM Corporation. ANSI is a registered trademark of the American National Standards Institute. Ethernet is a trademark of Xerox. UNIX is a trademark of The Open Group. Linux is a trademark of Linus Torvalds. Java and Oracle are trademarks of Oracle. ParAccel is a trademark of ParAccel. Kognitio is a trademark of Kognitio. Greenplum is a trademark of EMC and Dell Corporation. Vertica is a trademark of HP Corporation. Nexus Query Chameleon is a trademark of Coffing Data Warehousing.
Coffing Data Warehousing shall have neither liability nor responsibility to any person or entity with respect to any loss or damages arising from the information contained in this book or from the use of programs or program segments that are included. The manual is not a publication of HP Corporation, nor was it produced in conjunction with HP Corporation.
Copyright © December 2015 by Coffing Publishing. ISBN 978-1-940540-34-4. All rights reserved. No part of this book shall be reproduced, stored in a retrieval system, or transmitted by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission from the publisher. No patent liability is assumed with respect to the use of information contained herein. Although every precaution has been taken in the preparation of this book, the publisher and author assume no responsibility for errors or omissions, neither is any liability assumed for damages resulting from the use of information contained herein.
About Tom Coffing
Tom Coffing, better known as Tera-Tom, is the founder of Coffing Data Warehousing, where he has been CEO for the past 20 years. Tom has written over 50 books on all aspects of Teradata, Netezza, Kognitio, Redshift, ParAccel, Vertica, SQL Server, and Greenplum. Tom has taught over 1,000 Teradata classes in places such as India, Africa, Europe, China, Malaysia, and throughout North America.
Tom is also the owner and designer of the Nexus Query Chameleon, the most sophisticated enterprise query tool in the industry. The Nexus works on all platforms, including Hadoop, converts table structures between all systems, and allows companies to load their ERwin logical model inside Nexus. The Nexus guides users like a GPS system. Users point and click on any table or view from any system, and they are guided to what joins to what. As users choose the columns they want on their report, the SQL is built automatically.
In high school, Tom was the first athlete from his school to ever place at state. He was selected by his school to represent them at Buckeye Boys State, and Tom was inducted into the first class of the Lakota High School Hall of Fame. At the University of Arizona and University of Nevada Las Vegas, Tom was a two-time All-American wrestler, Sophomore Athlete of the Year, and a two-time winner of the 1980 Olympic wrestling trials. Tom graduated with a Bachelor’s degree in Speech Communications. After college, Tom became a state and national champion speech winner for Toastmasters and won two orchid awards as an actor. Tom is the proud father of three wonderful children and has been married for the past 32 years. You can contact Tom at 513 300-0341 or at [email protected].
About Leslie Nolander
Leslie Nolander has been the Chief Operating Officer (COO) at CoffingDW for the past ten years. She is responsible for running the business and has done a brilliant job of orchestrating the procedures and standards at CoffingDW. Leslie has personally negotiated hundreds of contracts both internationally and domestically and has overseen every financial transaction. Leslie is the author of multiple books and a key asset in the design and implementation of the Nexus Query Chameleon. Leslie has a Bachelor's Degree from the University of Arizona in Child Development and Family Relations and completed a fifth-year Teacher Certification Program. Leslie was a teacher for the first ten years of her career before joining the team at CoffingDW. Leslie resides in Phoenix, Arizona with her husband Jason, and they have two daughters.
Table of Contents
Chapter 1 – What is Columnar? ... 28
  What is Parallel Processing? ... 29
  Nothing Happens on Disk ... 30
  Data in Memory is fast as Lightning ... 31
  Parallel Processing Of Data ... 32
  The Problem with Row-Based Data ... 33
  Columnar Data Can Store Each Column in Their Own Block ... 34
  Why Columnar? ... 35
  Row Based Blocks vs. Columnar Based Blocks ... 36
  Visualize the Data – Rows vs. Columns ... 37
  The Architecture of Vertica ... 38
  Vertica Architecture Terms ... 39
  Vertica has Linear Scalability ... 40
Chapter 2 – Vertica Data Distribution ... 42
  Distribution Strategy 1 - Segmented By Hash ... 43
  Distribution Strategy 2 - Unsegmented ... 44
  Sorting the Data in a Table CREATE Statement ... 45
  Even Distribution ... 46
  Uneven Distribution Where the Data is Non-Unique ... 47
  Matching Distribution Keys for Co-Location of Joins ... 48
  Big Table / Small Table Joins ... 49
  Fact and Dimension Table Distribution Key Designs ... 50
  Why a Sort Key Improves Performance ... 51
  Sort Keys Help Group By, Order By and Window Functions ... 52
Chapter 3 – Clever Features of Vertica ... 54
  Super Projections ... 55
  Vertica Projections ... 56
  The Five Advantages of Projections ... 57
  Creating a Projection ... 58
  Read-Optimized Store (ROS)/Write-Optimized Store (WOS) ... 59
  Write-Optimized Store (WOS) is Memory Resident ... 60
  Updates are collected in Time-Based Buckets called Epochs ... 61
  Vertica Does Not Support In-Place Updates ... 62
  K-Safety ... 63
  K-Safety of 2 ... 64
  The Five Data Isolation Modes ... 65
  Import/Export between Multiple Vertica Systems ... 66
  Roles ... 67
  Compression ... 68
  Runlength encoding ... 69
  LZO Encoding ... 70
  Delta Encoding ... 71
  Block Based Dictionary Encoding for Character Data ... 72
Chapter 4 - Nexus ... 74
  Nexus is Available on the Cloud ... 75
  Nexus Queries Every Major System ... 76
  How to Use Nexus ... 77
  Why is Nexus Special? Visualization and Automatic SQL ... 78
  Why is Nexus Special? Cross-System Joins ... 79
  Why is Nexus Special? The Amazing Hub System ... 80
  Why is Nexus Special? Save Answer Sets as Tables ... 81
  Why is Nexus Special? Automated Data Movement ... 82
  Why is Nexus Special? Nexus makes the Servers Talk Directly ... 83
  What Makes Nexus Special? The Garden of Analysis ... 84
  The Garden of Analysis Grouping Sets Tab ... 85
  The Garden of Analysis - Grouping Sets Answer Sets ... 86
  The Garden of Analysis – Join Tab (1 of 4) ... 87
  The Garden of Analysis – Join Tab (2 of 4) ... 88
  The Garden of Analysis – Join Tab (3 of 4) ... 89
  The Garden of Analysis – Join Tab (4 of 4) ... 90
  The Garden of Analysis – Charts/Graphs Tab (1 of 4) ... 91
  The Garden of Analysis – Charts/Graphs Tab (2 of 4) ... 92
  The Garden of Analysis – Charts/Graphs Tab (3 of 4) ... 93
  The Garden of Analysis – Charts/Graphs Tab (4 of 4) ... 94
  The Garden of Analysis – Dynamic Charts Tab (1 of 4) ... 95
  The Garden of Analysis – Dynamic Charts Tab (2 of 4) ... 96
  The Garden of Analysis – Dynamic Charts Tab (3 of 4) ... 97
  The Garden of Analysis – Dynamic Charts Tab (4 of 4) ... 98
  The Garden of Analysis – Dashboard Tab (1 of 5) ... 99
  The Garden of Analysis – Dynamic Charts Tab (2 of 5) ... 100
  The Garden of Analysis – Dynamic Charts Tab (3 of 5) ... 101
  The Garden of Analysis – Dynamic Charts Tab (4 of 5) ... 102
  The Garden of Analysis – Dynamic Charts Tab (5 of 5) ... 103
  Getting to the Super Join Builder ... 104
  The Super Join Builder is the First Entry in the Menu ... 105
  The Super Join Builder Shows Tables Visually ... 106
  Using the Add Join Button ... 107
  What to Do When No Tables are Joinable? ... 108
  Drag a Joinable Object into the Super Join Builder ... 109
  You will see the Add Custom Join Window ... 110
  Defining the Join Columns ... 111
  Your Tables Will Appear Together ... 112
  Select the Columns You Want on the Report ... 113
  Check out the SQL Tab to See the SQL that has been built ... 114
  SQL Tab ... 115
  Hit Execute to get the Report inside the Super Join Builder ... 116
  The Report is delivered inside the Super Join Builder ... 117
  Let's Join Two Tables Again (1 of 6) ... 118
  Let's Join Two Tables Again (2 of 6) ... 119
  Let's Join Two Tables Again (3 of 6) ... 120
  Let's Join Two Tables Again (4 of 6) ... 121
  Let's Join Two Tables Again (5 of 6) ... 122
  Let's Join Two Tables Again (6 of 6) ... 123
  The Tabs of the Super Join Builder Philosophy – One Query ... 124
  The Tabs of the Super Join Builder – Objects Tab ... 125
  The Tabs of the Super Join Builder – Columns Tab ... 126
  The Tabs of the Super Join Builder – Sorting Tab ... 127
  The Tabs of the Super Join Builder – Joins Tab ... 128
  The Tabs of the Super Join Builder – SQL Tab ... 129
  The Tabs of the Super Join Builder – Metadata Tab ... 130
  The Tabs of the Super Join Builder – Analytics Tab ... 131
  The Tabs of the SJB – Analytics Tab – OLAP Screen ... 132
  Getting a Simple CSUM in the Analytics Tab – OLAP ... 133
  Getting a Simple CSUM – The SQL Automatically Generated ... 134
  The Answer Set of the CSUM ... 135
  Getting all of the OLAP functions in the Analytics Tab ... 136
  A Five Table Join Using the Menu ... 137
  The First Table is placed in the Super Join Builder ... 138
  Using the Add Join Cascading Menu ... 139
  All Five Tables Are In the Super Join Builder ... 140
  A Five Table Join Two Steps (Cube) ... 141
  Choose Cube with Columns from the Left Top of the Table ... 142
  All Tables are Cubed (Joined Together Instantly) ... 143
  Choose Cube and then Choose Your Columns ... 144
  Create Cube - Tables Are Joined Without Columns Selected ... 145
  Create Cube – Select the Columns You Want on the Report ... 146
  How to join Vertica, Oracle and SQL Server Tables ... 147
  The Vertica Table is now in the Super Join Builder ... 148
  Drag the Joining Oracle Table to the Super Join Builder ... 149
  Defining the Join Columns ... 150
  Choose the Columns You Want on Your Report ... 151
  Let's Add a SQL Server Table to our Vertica and Oracle Join ... 152
  Defining the Join Columns ... 153
  All Three Tables are now in the Super Join Builder ... 154
  Change the Hub and Run the Join on Oracle ... 155
  Change the Hub and Run the Join on SQL Server ... 156
  Simply Amazing - Change the Hub to the Garden of Analysis ... 157
  Have the Answer Set Saved Automatically to Any System ... 158
  Saving the Answer Set to an Oracle or SQL Server System ... 159
  Saving the Answer Set to a Vertica System ... 160
  Saving the Answer Set to a Teradata System ... 161
Chapter 5 – The Basics of SQL ... 163
  Introduction ... 164
  Setting your Path ... 165
  Setting Your Default Database ... 166
  SELECT * (All Columns) in a Table ... 167
  Fully Qualifying a Database, Schema and Table ... 168
  SELECT Specific Columns in a Table ... 169
  Commas in the Front or Back? ... 170
  Place your Commas in front for better Debugging Capabilities ... 171
  Sort the Data with the ORDER BY Keyword ... 172
  ORDER BY Defaults to Ascending ... 173
  Use the Name or the Number in your ORDER BY Statement ... 174
  Two Examples of ORDER BY using Different Techniques ... 175
  Changing the ORDER BY to Descending Order ... 176
  NULL Values sort First in Ascending Mode (Default) ... 177
  NULL Values sort Last in Descending Mode (DESC) ... 178
  Major Sort vs. Minor Sorts ... 179
  Multiple Sort Keys using Names vs. Numbers ... 180
  Sorts are Alphabetical, NOT Logical ... 181
  Using A CASE Statement to Sort Logically ... 182
  How to ALIAS a Column Name ... 183
  A Missing Comma can by Mistake become an Alias ... 184
  Aliasing a Column Name with Spaces or Reserved Words ... 185
  Comments using Double Dashes are Single Line Comments ... 186
  Comments for Multi-Lines ... 187
  Comments for Multi-Lines as Double Dashes per Line ... 188
  Formatting Number ... 189
  Formatting Number Examples ... 190
  Formatting Dates ... 191
  Formatting Date Example ... 192
Chapter 6 – The WHERE Clause ... 194
  The WHERE Clause limits Returning Rows ... 195
  Double Quoted Aliases are for Reserved Words and Spaces ... 196
  Character Data needs Single Quotes in the WHERE Clause ... 197
  Character Data needs Single Quotes, but Numbers Don’t ... 198
Comparisons against a Null Value ..... 199
NULL means UNKNOWN DATA so Equal (=) won’t Work ..... 200
Use IS NULL or IS NOT NULL when dealing with NULLs ..... 201
NULL is UNKNOWN DATA so NOT Equal won’t Work ..... 202
Use IS NULL or IS NOT NULL when dealing with NULLs ..... 203
Using Greater Than or Equal To (>=) ..... 204
AND in the WHERE Clause ..... 205
Troubleshooting AND ..... 206
OR in the WHERE Clause ..... 207
Troubleshooting OR ..... 208
Troubleshooting Character Data ..... 209
Using Different Columns in an AND Statement ..... 210
Quiz – How many rows will return? ..... 211
Answer to Quiz – How many rows will return? ..... 212
What is the Order of Precedence? ..... 213
Using Parentheses to change the Order of Precedence ..... 214
Using an IN List in place of OR ..... 215
The IN List is an Excellent Technique ..... 216
IN List vs. OR brings the same Results ..... 217
The IN List Can Use Character Data ..... 218
Using a NOT IN List ..... 219
Null Values in a NOT IN List Bring Back No Rows ..... 220
A Technique for Handling Nulls with a NOT IN List ..... 221
BETWEEN is Inclusive ..... 222
NOT BETWEEN is Also Inclusive ..... 223
LIKE uses Wildcards Percent ‘%’ and Underscore ‘_’ ..... 224
LIKE command Underscore is Wildcard for one Character ..... 225
LIKE Command Works Differently on Char Vs Varchar ..... 226
LIKE Command on Character Data Auto Trims ..... 227
Quiz – What Data is Left Justified and what is Right? ..... 228
Numbers are Right Justified and Character Data is Left ..... 229
Answer – What Data is Left Justified and what is Right? ..... 230
An Example of Data with Left and Right Justification ..... 231
A Visual of CHARACTER Data vs. VARCHAR Data ..... 232
Use the TRIM command to remove spaces on CHAR Data ..... 233
Escape Character in the LIKE Command changes Wildcards ..... 234
Escape Characters Turn off Wildcards in the LIKE Command ..... 235
Quiz – Turn off that Wildcard ..... 236
ANSWER – To Find that Wildcard ..... 237
The Distinct Command ..... 238
Distinct vs. GROUP BY ..... 239
Quiz – How many rows come back from the Distinct? ..... 240
Answer – How many rows come back from the Distinct? ..... 241

Chapter 7 – Aggregation ..... 243
Quiz – You calculate the Answer Set in your own Mind ..... 244
Answer – You calculate the Answer Set in your own Mind ..... 245
Quiz – You calculate the Answer Set in your own Mind ..... 246
Answer – You calculate the Answer Set in your own Mind ..... 247
The 3 Rules of Aggregation ..... 248
There are Five Aggregates ..... 249
Quiz – How many rows come back? ..... 250
Answer – How many rows come back? ..... 251
Troubleshooting Aggregates ..... 252
GROUP BY when Aggregates and Normal Columns Mix ..... 253
GROUP BY delivers one row per Group ..... 254
GROUP BY Dept_No or GROUP BY 1 the same thing ..... 255
Limiting Rows and Improving Performance with WHERE ..... 256
WHERE Clause in Aggregation limits unneeded Calculations ..... 257
Keyword HAVING tests Aggregates after they are totaled ..... 258
Keyword HAVING is like an Extra WHERE Clause for totals ..... 259
Keyword HAVING tests Aggregates after they are totaled ..... 260
Getting the Average Values per Column ..... 261
GROUP BY Rollup ..... 262
GROUP BY Rollup Result Set ..... 263

Chapter 8 – Join Functions ..... 265
A Two-Table Join Using Traditional Syntax ..... 266
A two-table join using Non-ANSI Syntax with Table Alias ..... 267
You Can Fully Qualify All Columns ..... 268
A two-table join using ANSI Syntax ..... 269
Both Queries have the same Results and Performance ..... 270
Quiz – Can You Finish the Join Syntax? ..... 271
Answer to Quiz – Can You Finish the Join Syntax? ..... 272
Quiz – Can You Find the Error? ..... 273
Answer to Quiz – Can You Find the Error? ..... 274
Super Quiz – Can You Find the Difficult Error? ..... 275
Answer to Super Quiz – Can You Find the Difficult Error? ..... 276
Quiz – Which rows from both tables won’t return? ..... 277
Answer to Quiz – Which rows from both tables Won’t Return? ..... 278
LEFT OUTER JOIN ..... 279
LEFT OUTER JOIN Results ..... 280
RIGHT OUTER JOIN ..... 281
RIGHT OUTER JOIN Example and Results ..... 282
FULL OUTER JOIN ..... 283
FULL OUTER JOIN Results ..... 284
Which Tables are the Left and which Tables are Right? ..... 285
Answer – Which Tables are the Left and which are the Right? ..... 286
INNER JOIN with Additional AND Clause ..... 287
ANSI INNER JOIN with Additional AND Clause ..... 288
ANSI INNER JOIN with Additional WHERE Clause ..... 289
OUTER JOIN with Additional WHERE Clause ..... 290
OUTER JOIN with Additional AND Clause ..... 291
OUTER JOIN with Additional AND Clause Results ..... 292
Quiz – Why is this considered an INNER JOIN? ..... 293
Evaluation Order for Outer Queries ..... 294
The DREADED Product Join ..... 295
The DREADED Product Join Results ..... 296
The Horrifying Cartesian Product Join ..... 297
The ANSI Cartesian Join will ERROR ..... 298
Quiz – Do these Joins Return the Same Answer Set? ..... 299
Answer – Do these Joins Return the Same Answer Set? ..... 300
The CROSS JOIN ..... 301
The CROSS JOIN Answer Set ..... 302
The Self Join ..... 303
The Self Join with ANSI Syntax ..... 304
Quiz – Will both queries bring back the same Answer Set? ..... 305
Answer – Will both queries bring back the same Answer Set? ..... 306
Quiz – Will both queries bring back the same Answer Set? ..... 307
Answer – Will both queries bring back the same Answer Set? ..... 308
How would you join these two tables? ..... 309
An Associative Table is a Bridge that Joins Two Tables ..... 310
Quiz – Can you write the 3-Table Join? ..... 311
Answer to Quiz – Can you Write the 3-Table Join? ..... 312
Quiz – Can you write the 3-Table Join to ANSI Syntax? ..... 313
Answer – Can you write the 3-Table Join to ANSI Syntax? ..... 314
Quiz – Can you Place the ON Clauses at the End? ..... 315
Answer – Can you Place the ON Clauses at the End? ..... 316
The 5-Table Join – Logical Insurance Model ..... 317
Quiz – Write a Five Table Join Using ANSI Syntax ..... 318
Answer – Write a Five Table Join Using ANSI Syntax ..... 319
Quiz – Write a Five Table Join Using Non-ANSI Syntax ..... 320
Answer – Write a Five Table Join Using Non-ANSI Syntax ..... 321
Quiz – Re-Write this putting the ON clauses at the END ..... 322
Answer – Re-Write this putting the ON clauses at the END ..... 323

Chapter 9 – Date Functions ..... 325
Current_Date ..... 326
Current_Date, Current_Time and Current_Timestamp ..... 327
Timestamp Differences ..... 328
Getdate ..... 329
Date and Time Keywords ..... 330
Using CAST in Literal Values ..... 331
Add or Subtract Days from a date ..... 332
Formatting Dates ..... 333
Formatting Date Example ..... 334
A Summary of Math Operations on Dates ..... 335
The ADD_MONTHS Command ..... 336
Using the ADD_MONTHS Command to Add 1 Year ..... 337
Using the ADD_MONTHS Command to Add 1 Year ..... 338
Using the ADD_MONTHS Command to Add 5 Years ..... 339
The EXTRACT Command ..... 340
YEAR, MONTH, and DAY Functions ..... 341
A Better Technique for YEAR, MONTH, and DAY Functions ..... 342
Another Version of the EXTRACT Command ..... 343
EXTRACT from DATES and TIME ..... 344
Why EXTRACT is a Better Form ..... 345
EXTRACT with DATE and TIME Literals ..... 346
EXTRACT of the Month on Aggregate Queries ..... 347
AGE_IN_MONTHS ..... 348
AGE_IN_YEARS ..... 349
DATE_TRUNC ..... 350
DATEDIFF ..... 351
DAYOFWEEK ..... 352
Intervals for Date, Time and Timestamp ..... 353
Interval Data Types and the Bytes to Store Them ..... 354
Using Intervals ..... 355
How a Simple Interval Handles Leap Year ..... 356
Interval Arithmetic Results ..... 357
A Time Interval Example ..... 358
A DATE Interval Example Going Back in Time ..... 359
A Complex Time Interval Example using CAST ..... 360
A Complex Time Interval Example using CAST ..... 361
The OVERLAPS Command ..... 362
An OVERLAPS Example that Returns No Rows ..... 363
The OVERLAPS Command using TIME ..... 364

Chapter 10 – OLAP Functions ..... 366
The Row_Number Command ..... 367
Quiz – How did the Row_Number Reset? ..... 368
Quiz – How did the Row_Number Reset? ..... 369
Using a Derived Table and Row_Number ..... 370
Finding the First Occurrence using a WITH Derived Table ..... 371
Finding the Last Occurrence using a WITH Derived Table ..... 372
Ordered Analytics OVER ..... 373
RANK and DENSE RANK ..... 374
RANK Defaults to Ascending Order ..... 375
Getting RANK to Sort in DESC Order ..... 376
RANK OVER and PARTITION BY ..... 377
PERCENT_RANK OVER ..... 378
PERCENT_RANK OVER with 14 rows in Calculation ..... 379
PERCENT_RANK OVER with 21 rows in Calculation ..... 380
Quiz – What Causes the Product_ID to Reset? ..... 381
Answer to Quiz – What Causes the Product_ID to Reset? ..... 382
Finding Gaps between Dates ..... 383
CSUM – Rows Unbounded Preceding Explained ..... 384
CSUM – Making Sense of the Data ..... 385
CSUM – Making Even More Sense of the Data ..... 386
CSUM – The Major and Minor Sort Key(s) ..... 387
The ANSI CSUM – Getting a Sequential Number ..... 388
Troubleshooting the ANSI OLAP on a GROUP BY ..... 389
Reset with a PARTITION BY Statement ..... 390
PARTITION BY only Resets a Single OLAP not ALL of them ..... 391
CURRENT ROW AND UNBOUNDED FOLLOWING ..... 392
Different Windowing Options ..... 393
Moving Sum has a Moving Window ..... 394
How ANSI Moving SUM Handles the Sort ..... 395
Quiz – How is that Total Calculated? ..... 396
Answer to Quiz – How is that Total Calculated? ..... 397
Moving SUM every 3-rows Vs a Continuous Average ..... 398
PARTITION BY Resets an ANSI OLAP ..... 399
The Moving Window is Current Row and Preceding ..... 400
How Moving Average Handles the Sort ..... 401
Table of Contents Moving Average..................................................................................................................................................... 402 Moving Average..................................................................................................................................................... 403 Quiz – How is that Total Calculated? .................................................................................................................... 404 Answer to Quiz – How is that Total Calculated? .................................................................................................. 405 Quiz – How is that 4th Row Calculated? ................................................................................................................ 406 Answer to Quiz – How is that 4th Row Calculated? .............................................................................................. 407 Moving Average every 3-rows vs a Continuous Average ..................................................................................... 408 PARTITION BY Resets an ANSI OLAP .............................................................................................................. 409 Moving Difference using ANSI Syntax ................................................................................................................. 410 Moving Difference using ANSI Syntax with Partition By .................................................................................... 411 COUNT OVER for a Sequential Number ............................................................................................................. 412 COUNT OVER without Rows Unbounded Preceding .......................................................................................... 413 Quiz – What caused the COUNT OVER to Reset? ............................................................................................... 
Answer to Quiz – What caused the COUNT OVER to Reset?
The MAX OVER Command
MAX OVER with PARTITION BY Reset
MAX OVER without Rows Unbounded Preceding
The MIN OVER Command
MIN OVER without Rows Unbounded Preceding
Finding a Value of a Column in the Next Row with MIN
The CSUM for Each Product_Id and the Next Start Date
Quiz – Fill in the Blank
Answer – Fill in the Blank
How Ntile Works
Ntile
Ntile Continued
Ntile Percentile
Another Ntile Example
Using Tertiles (Partitions of Four)
NTILE
NTILE Using a Value of 10
NTILE with a Partition
Using FIRST_VALUE
FIRST_VALUE
FIRST_VALUE after Sorting by the Highest Value
FIRST_VALUE with Partitioning
Using LAST_VALUE
LAST_VALUE
Using LAG and LEAD
Using LEAD
Using LEAD With an Offset of 2
LEAD
LEAD With Partitioning
Using LAG
Using LAG with an Offset of 2
LAG
LAG with Partitioning
MEDIAN with Partitioning
CUME_DIST
CUME_DIST with a Partition
SUM (SUM (n))
Chapter 11 – Temporary Tables
There are three types of Temporary Tables
CREATING A Derived Table
Naming the Derived Table
Aliasing the Column Names in The Derived Table
Multiple Ways to Alias the Columns in a Derived Table
CREATING a Derived Table using the WITH Command
The Same Derived Query shown Three Different Ways
Most Derived Tables Are Used To Join To Other Tables
The Three Components of a Derived Table
Visualize This Derived Table
Our Join Example with a Different Column Aliasing Style
Column Aliasing Can Default for Normal Columns
A Derived example Using the WITH Syntax
Quiz - Answer the Questions
Answer to Quiz - Answer the Questions
Clever Tricks on Aliasing Columns in a Derived Table
A Derived Table lives only for the lifetime of a single query
An Example of Two Derived Tables in a Single Query
Example of Two Derived Tables in a Single WITH Statement
Finding the First Occurrence of a Row using WITH
Finding the Last Occurrence of a Row using WITH
Syntax for Temporary Tables
Temporary Tables Explained
Key Temporary Table Terms
Creating and Populating a Local Temporary Table
Using a Local Temporary Table
Creating and Populating a Global Temporary Table
Creating and Populating a Global Temporary Table
Some Great Examples of Creating a Temporary Table Quickly
Creating a Temporary Table That is sorted
A Temp Table That Populates some of the Rows
A Temporary Table with Some of the Columns
Chapter 12 – Sub-query Functions
An IN List is much like a Subquery
An IN List Never has Duplicates – Just like a Subquery
The Subquery
The Three Steps of How a Basic Subquery Works
These are Equivalent Queries
The Final Answer Set from the Subquery
Quiz - Answer the Difficult Question
Answer to Quiz - Answer the Difficult Question
Should you use a Subquery or a Join?
Quiz - Write the Subquery
Answer to Quiz - Write the Subquery
Quiz - Write the More Difficult Subquery
Answer to Quiz - Write the More Difficult Subquery
Quiz – Write the Extreme Subquery
Answer to Quiz - Write the Extreme Subquery
Quiz - Write the Subquery with an Aggregate
Answer to Quiz - Write the Subquery with an Aggregate
Quiz - Write the Correlated Subquery
Answer to Quiz - Write the Correlated Subquery
The Basics of a Correlated Subquery
The Top Query always runs first in a Correlated Subquery
Correlated Subquery Example vs. a Join with a Derived Table
Quiz - A Second Chance to Write a Correlated Subquery
Answer - A Second Chance to Write a Correlated Subquery
Quiz - A Third Chance to Write a Correlated Subquery
Answer - A Third Chance to Write a Correlated Subquery
Quiz - Last Chance to Write a Correlated Subquery
Answer – Last Chance to Write a Correlated Subquery
Quiz – Write the Extreme Correlated Subquery
Answer To Quiz – Write the Extreme Correlated Subquery
Quiz - Write the NOT Subquery
Answer to Quiz - Write the NOT Subquery
Quiz - Write the Subquery using a WHERE Clause
Answer - Write the Subquery using a WHERE Clause
Quiz - Write the Subquery with Two Parameters
Answer to Quiz - Write the Subquery with Two Parameters
How the Double Parameter Subquery Works
More on how the Double Parameter Subquery Works
Quiz – Write the Triple Subquery
Answer to Quiz – Write the Triple Subquery
Quiz – How many rows return on a NOT IN with a NULL?
Answer – How many rows return on a NOT IN with a NULL?
How to handle a NOT IN with Potential NULL Values
IN is equivalent to =ANY
Using a Correlated Exists
How a Correlated Exists matches up
The Correlated NOT Exists
The Correlated NOT Exists Answer Set
Quiz – How many rows come back from this NOT Exists?
Answer – How many rows come back from this NOT Exists?
Chapter 13 – Strings
The LENGTH Command Counts Characters
The LENGTH Command – Spaces can Count too
The LENGTH Command and Character Data
LENGTH and CHARACTER_LENGTH Are Equivalent
OCTET_LENGTH
UPPER and LOWER Commands
Using the LOWER Command
A LOWER Command Example
Using the UPPER Command
An UPPER Command Example
Non-Letters are Unaffected by UPPER and LOWER
The TRIM Command trims both Leading and Trailing Spaces
Trim Combined with the CHARACTERS Command
How to TRIM only the Trailing Spaces
A Visual of the TRIM Command Using Concatenation
Trim and Trailing is Case Sensitive
How to TRIM Trailing Letters
The SUBSTRING Command
SUBSTRING and SUBSTR are equal, but use different syntax
How SUBSTRING Works with NO ENDING POSITION
Using SUBSTRING to move backwards
How SUBSTRING Works with a Starting Position of -1
How SUBSTRING Works with an Ending Position of 0
An Example using SUBSTRING, TRIM and CHAR Together
The POSITION Command finds a Letter's Position
Quiz – Find that SUBSTRING Starting Position
Answer to Quiz – Find that SUBSTRING Starting Position
Using the SUBSTRING to Find the Second Word On
Quiz – Why did only one Row Return
Answer to Quiz – Why Did only one Row Return
Concatenation
Concatenation and SUBSTRING
Four Concatenations Together
Troubleshooting Concatenation
Chapter 14 – Interrogating the Data
Numeric Manipulation Functions
Finding the Cube Root
Ceiling Gets the Smallest Integer Not Smaller Than X
Floor Finds the Largest Integer Not Greater Than X
The Round Function and Precision
Quiz – What would the Answer be?
Answer to Quiz – What would the Answer be?
The NULLIFZERO Command
The NULLIFZERO vs. Zeroes
Quiz – Fill in the Blank Values in the Answer Set
Answer to Quiz – Fill in the Blank Values in the Answer Set
Quiz – Fill in the Answers for the NULLIF Command
Answer – Fill in the Answers for the NULLIF Command
The ZEROIFNULL Command
Answer to the ZEROIFNULL Question
The COALESCE Command
The COALESCE Answer Set
The Coalesce Quiz
Answer – The Coalesce Quiz
The COALESCE Command – Fill In the Answers
The COALESCE Answer Set
COALESCE is Equivalent to This CASE Statement
Some Great CAST (Convert and Store) Examples
Some Great CAST (Convert and Store) Examples
A Rounding Example
Some Great CAST (Convert and Store) Examples
Quiz - The Basics of the CASE Statements
Answer to Quiz - The Basics of the CASE Statements
Using an ELSE in the Case Statement
Using an ELSE as a Safety Net
Rules for a Valued Case Statement
Rules for a Searched Case Statement
The Basics of the CASE Statements
The Basics of the CASE Statement
Valued Case vs. a Searched Case
Quiz - Valued Case Statement
Answer - Valued Case Statement
Quiz - Searched Case Statement
Answer - Searched Case Statement
Quiz - When NO ELSE is present in CASE Statement
Answer - When NO ELSE is present in CASE Statement
When an ELSE is present in CASE Statement
Answer - When an ELSE is present in CASE Statement
The CASE Challenge
The CASE Challenge Answer
Combining Searched Case and Valued Case
A Trick for getting a Horizontal Case
Nested Case
Put a CASE in the ORDER BY
Chapter 15 – View Functions
The Fundamentals of Views
Creating a Simple View to Restrict Sensitive Columns
629 You SELECT From a View ................................................................................................................................... 630 Creating a Simple View to Restrict Rows ............................................................................................................. 631 A View Provides Security for Columns and Rows ................................................................................................ 632 Basic Rules for Views ............................................................................................................................................ 633
Table of Contents How to Modify a View .......................................................................................................................................... 634 An Exception to the ORDER BY Rule inside a View ........................................................................................... 635 Views Are Sometimes CREATED for Formatting ................................................................................................ 636 Creating a View to Join Tables Together............................................................................................................... 637 How to Alias Columns in a View CREATE .......................................................................................................... 638 The Standard Way Most Aliasing is done ............................................................................................................. 639 What Happens When Both Aliasing Options Are Present .................................................................................... 640 Resolving Aliasing Problems in a View CREATE ............................................................................................... 641 Answer to Resolving Aliasing Problems in a View CREATE .............................................................................. 642 Aggregates on View Aggregates............................................................................................................................ 643 Altering A Table After a View Has Been Created ................................................................................................ 644 A View that Errors after An ALTER ..................................................................................................................... 645 Chapter 16 – Set Operators Functions ...................................................................................................................... 
647 Rules of Set Operators ........................................................................................................................................... 648 INTERSECT Explained Logically......................................................................................................................... 649 INTERSECT Explained Logically......................................................................................................................... 650 UNION Explained Logically ................................................................................................................................. 651 UNION Explained Logically ................................................................................................................................. 652 UNION ALL Explained Logically ........................................................................................................................ 653 UNION ALL Explained Logically ........................................................................................................................ 654 EXCEPT Explained Logically ............................................................................................................................... 655 EXCEPT Explained Logically ............................................................................................................................... 656 Minus Explained Logically .................................................................................................................................... 657 Minus Explained Logically .................................................................................................................................... 658 Testing Your Knowledge ....................................................................................................................................... 
659 Answer - Testing Your Knowledge ....................................................................................................................... 660 Testing Your Knowledge ....................................................................................................................................... 661 Answer - Testing Your Knowledge ....................................................................................................................... 662
Table of Contents An Equal Amount of Columns in both SELECT List ........................................................................................... 663 Columns in the SELECT list should be from the same Domain ........................................................................... 664 The Top Query handles all Aliases ........................................................................................................................ 665 The Bottom Query does the ORDER BY (a Number) .......................................................................................... 666 Great Trick: Place your Set Operator in a Derived Table..................................................................................... 667 UNION Vs UNION ALL ....................................................................................................................................... 668 Using UNION ALL and Literals ........................................................................................................................... 669 A Great Example of how EXCEPT works ............................................................................................................ 670 USING Multiple SET Operators in a Single Request............................................................................................ 671 Changing the Order of Precedence with Parentheses ............................................................................................ 672 Using UNION ALL for speed in Merging Data Sets ............................................................................................ 673 Chapter 17 – Table Create and Data Types .............................................................................................................. 675 Distribution Strategy 1 - Segmented By Hash ....................................................................................................... 
676 Distribution Strategy 2 - Unsegmented.................................................................................................................. 677 Sorting the Data in a Table CREATE Statement ................................................................................................... 678 Even Distribution ................................................................................................................................................... 679 Uneven Distribution Where the Data is Non-Unique ............................................................................................ 680 Matching Distribution Keys for Co-Location of Joins .......................................................................................... 681 Big Table / Small Table Joins ................................................................................................................................ 682 Fact and Dimension Table Distribution Key Designs ........................................................................................... 683 Why a Sort Key Improves Performance ................................................................................................................ 684 Sort Keys Help GROUP BY, ORDER BY and Window Functions ..................................................................... 685 Syntax for Temporary Tables ................................................................................................................................ 686 Temporary Tables Explained ................................................................................................................................. 687 Key Temporary Table Terms ................................................................................................................................. 688 Creating and Populating a Local Temporary Table ............................................................................................... 
689 Using a Local Temporary Table ............................................................................................................................ 690 Creating and Populating a Global Temporary Table ............................................................................................. 691
Table of Contents Creating and Populating a Global Temporary Table ............................................................................................. 692 Some Great Examples of Creating a Temporary Table Quickly ........................................................................... 693 Creating a Temporary Table That is sorted ........................................................................................................... 694 A Temp Table That Populates Some of the Rows ................................................................................................. 695 A Temporary Table with Some of the Columns .................................................................................................... 696 Chapter 18 – Data Manipulation Language (DML) ................................................................................................. 698 INSERT Syntax # 1 ................................................................................................................................................ 699 INSERT example with Syntax 1 ............................................................................................................................ 700 INSERT Syntax # 2 ................................................................................................................................................ 701 INSERT example with Syntax 2 ............................................................................................................................ 702 INSERT/SELECT Command ................................................................................................................................ 703 INSERT/SELECT example using All Columns (*) .............................................................................................. 704 INSERT/SELECT example with Less Columns ................................................................................................... 
705 Two UPDATE Examples ....................................................................................................................................... 706 Subquery UPDATE Command Syntax .................................................................................................................. 707 Example of Subquery UPDATE Command .......................................................................................................... 708 Join UPDATE Command Syntax .......................................................................................................................... 709 Example of an UPDATE Join Command .............................................................................................................. 710 Fast UPDATE ........................................................................................................................................................ 711 Example of Subquery DELETE Command ........................................................................................................... 712 Chapter 19 – Statistical Aggregate Functions........................................................................................................... 714 The Stats Table ....................................................................................................................................................... 715 The STDDEV_POP Function ................................................................................................................................ 716 A STDDEV_POP Example ................................................................................................................................... 717 The STDDEV_SAMP Function............................................................................................................................. 
718 A STDDEV_SAMP Example ................................................................................................................................ 719 The VAR_POP Function ....................................................................................................................................... 720
Table of Contents A VAR_POP Example ........................................................................................................................................... 721 The VAR_SAMP Function .................................................................................................................................... 722 A VAR_SAMP Example ....................................................................................................................................... 723 The VARIANCE Function..................................................................................................................................... 724 A VARIANCE Example ........................................................................................................................................ 725 The CORR Function .............................................................................................................................................. 726 A CORR Example .................................................................................................................................................. 727 Another CORR Example so you can compare ...................................................................................................... 728 The COVAR_POP Function .................................................................................................................................. 729 A COVAR_POP Example ..................................................................................................................................... 730 Another COVAR_POP Example so you can compare .......................................................................................... 731 The COVAR_SAMP Function .............................................................................................................................. 
732 A COVAR_SAMP Example .................................................................................................................................. 733 Another COVAR_SAMP Example so you can compare ...................................................................................... 734 The REGR_INTERCEPT Function ....................................................................................................................... 735 A REGR_INTERCEPT Example .......................................................................................................................... 736 Another REGR_INTERCEPT Example so you can compare ............................................................................... 737 The REGR_SLOPE Function ................................................................................................................................ 738 REGR_SLOPE Example ........................................................................................................................................ 739 Another REGR_SLOPE Example so you can compare ........................................................................................ 740 The REGR_AVGX Function ............................................................................................................................... 741 A REGR_AVGX Example .................................................................................................................................. 742 Another REGR_AVGX Example so you can compare ......................................................................................... 743 The REGR_AVGY Function ............................................................................................................................... 744 A REGR_AVGY Example .................................................................................................................................... 
745 Another REGR_AVGY Example so you can compare ......................................................................................... 746 The REGR_COUNT Function ............................................................................................................................. 747 A REGR_COUNT Example .................................................................................................................................. 748 The REGR_R2 Function ........................................................................................................................................ 749
Table of Contents A REGR_R2 Example ........................................................................................................................................... 750 The REGR_SXX Function..................................................................................................................................... 751 A REGR_SXX Example ........................................................................................................................................ 752 The REGR_SXY Function..................................................................................................................................... 753 A REGR_SXY Example ........................................................................................................................................ 754 The REGR_SYY Function..................................................................................................................................... 755 A REGR_SYY Example ........................................................................................................................................ 756 Using GROUP BY ................................................................................................................................................. 757
Chapter 1 – What is Columnar?
“When you go into court, you are putting your fate into the hands of twelve people who weren’t smart enough to get out of jury duty.” – Norm Crosby
What is Parallel Processing?

"After enlightenment, the laundry" - Zen Proverb
Tera-Tom's Parallel Processing Wash and Dry
"After parallel processing the laundry, enlightenment!" -Matrix Zen Proverb
Two guys were having fun on a Saturday night when one said, “I’ve got to go and do my laundry.” The other said, “What!?” The first man explained that if he went to the laundromat the next morning, he would be lucky to get one machine and be there all day. But if he went on Saturday night, he could get all the machines. Then, he could do all his wash and dry in two hours. Now that's parallel processing mixed in with a little dry humor!
Nothing Happens on Disk

[Figure: a CPU and Memory above a disk that holds a block of the Orders table.]

"How are we doing on orders today?"

Orders (on disk)
Order_No  Customer_No  Order_Date  Order_Total
100       21345679     01/01/2013     12347.53
200       32456733     01/01/2013      8005.91
300       31323134     01/01/2013      5111.47
400       87323456     01/01/2013     15231.62

Disk: "How would I know? I'm just a disk. I need to transfer the block of data to the memory, and that is a slow process."
“When you are courting a nice girl, an hour seems like a second. When you sit on a red-hot cinder, a second seems like an hour. That’s relativity.” – Albert Einstein
Data on disk does absolutely nothing. When data is requested, the computer moves the data one block at a time from disk into memory. Once the data is in memory, it is processed by the CPU at lightning speed. All computers work this way. The "Achilles Heel" of every computer is the slow process of moving data from disk to memory. The real theory of relativity is to find out how to get blocks of data from the disk into memory faster!
Data in Memory is fast as Lightning

[Figure: the Orders block on disk has been copied into Memory, where the CPU processes it.]

Orders
Order_No  Customer_No  Order_Date  Order_Total
100       21345679     01/01/2013     12347.53
200       32456733     01/01/2013      8005.91
300       31323134     01/01/2013      5111.47
400       87323456     01/01/2013     15231.62
“You can observe a lot by watching.” – Yogi Berra
Once the data block is moved off of the disk and into memory, the processing of that block happens as fast as lightning. It is the movement of the block from disk into memory that slows down every computer. Data being processed in memory is so fast that even Yogi Berra couldn't catch it!
Parallel Processing Of Data

[Figure: four Parallel Processes, each with its own Memory and its own block of the Orders table. Each block holds four rows.]

Parallel Process  Cust_No   Order_Date  Order_Total
1                 21345679  01/01/2013     12347.53
1                 32456733  01/01/2013      8005.91
1                 31323134  01/01/2013      5111.47
1                 87323456  01/01/2013     15231.62
2                 34345699  01/01/2013     13347.51
2                 41456543  01/01/2013     13005.91
2                 51323154  01/01/2013      7611.57
2                 67823486  01/01/2013     11671.92
3                 87945679  01/01/2013      8347.53
3                 98756733  01/01/2013     17005.91
3                 35623134  01/01/2013      3451.47
3                 97873456  01/01/2013     19871.62
4                 44445679  01/01/2013     12447.53
4                 32547733  01/01/2013      8055.66
4                 57497134  01/01/2013      5651.47
4                 87768956  01/01/2013       231.62

"If the facts don't fit the theory, change the facts." - Albert Einstein
Big Data is all about parallel processing. Parallel processing is all about taking the rows of a table and spreading them among many parallel processing units. Above, we can see a table called Orders. There are 16 rows in the table. Each parallel processor holds four rows. Now they can process the data in parallel and be four times as fast. What Albert Einstein meant to say was, “If the theory doesn't fit the dimension table, change it to a fact."
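The idea can be sketched in a few lines of Python. This is only an illustration, not Vertica code: the sixteen order totals come from the figure, while the round-robin placement and the names `split_rows` and `parallel_total` are assumptions made for the example. Python threads are used only to show how the work is divided; they do not by themselves make the sum four times faster.

```python
from concurrent.futures import ThreadPoolExecutor

# The 16 Order_Total values shown in the figure.
ORDER_TOTALS = [
    12347.53, 8005.91, 5111.47, 15231.62,
    13347.51, 13005.91, 7611.57, 11671.92,
    8347.53, 17005.91, 3451.47, 19871.62,
    12447.53, 8055.66, 5651.47, 231.62,
]


def split_rows(rows, units):
    """Deal the rows out round-robin so every unit holds an equal share."""
    return [rows[i::units] for i in range(units)]


def parallel_total(rows, units=4):
    """Each unit sums only its own rows; the partial sums are then combined."""
    slices = split_rows(rows, units)
    with ThreadPoolExecutor(max_workers=units) as pool:
        partials = list(pool.map(sum, slices))
    return sum(partials), [len(s) for s in slices]


total, rows_per_unit = parallel_total(ORDER_TOTALS)
print(rows_per_unit)  # [4, 4, 4, 4]
print(round(total, 2))
```

Each unit touches only a quarter of the rows, which is where the "four times as fast" in the text comes from when the units really do run side by side.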
The Problem with Row-Based Data

[Figure: a Parallel Process moves a whole row-based block from disk into Memory.]

The entire block must be placed into memory just to calculate one column

Cust_No   Customer_Name        Phone     Order_No  Order_Date  Order_Total
31323134  ACE Consulting       555-1212  123552    10/01/1999      5111.47
57896883  XYZ Plumbing         347-8954  123777    09/09/1999     23454.84
11111111  Billy's Best Choice  555-1234  123456    05/04/1998     12347.53
11111111  Billy's Best Choice  555-1234  123512    01/01/1999      8005.91
87323456  Databases N-U        322-1012  123585    10/10/1999     15231.62

SELECT AVG(Order_Total) FROM Row_Based_Table;
Nothing happens on disk. For data to be processed, the block of disk data must be copied and moved into memory. The problem with row-based data is that the entire block must be moved into memory even when the query only needs to analyze a single column. When queries only need a few columns, moving the entire block is a lot of wasted energy.
Columnar Data Can Store Each Column in Its Own Block

[Figure: a Parallel Process moves only the Order_Total block from disk into Memory.]

Columnar data is designed to only move the columns needed to satisfy the query

In Memory (Order_Total block only)
Order_Total
5111.47
23454.84
12347.53
8005.91
15231.62

On disk (one block per column)
Cust_No   Customer_Name        Phone     Order_No  Order_Date  Order_Total
31323134  ACE Consulting       555-1212  123552    10/01/1999      5111.47
57896883  XYZ Plumbing         347-8954  123777    09/09/1999     23454.84
11111111  Billy's Best Choice  555-1234  123456    05/04/1998     12347.53
11111111  Billy's Best Choice  555-1234  123512    01/01/1999      8005.91
87323456  Databases N-U        322-1012  123585    10/10/1999     15231.62

SELECT AVG(Order_Total) FROM Row_Based_Table;
Columnar systems can store each column of a table in its own individual block. This is extremely efficient when a query needs only a few of the table's columns. Our query above only needs the Order_Total column to get the average Order_Total. Only one small block on each parallel process is moved into memory. Wow, that was fast!
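To make the difference concrete, here is a small Python sketch (an illustration only, not how Vertica is implemented). It stores the five rows from the figure once as a row-based block and once as one block per column, then counts how many values have to be moved to compute the average Order_Total. The function names and the "values moved" measure are assumptions made for the example.

```python
COLUMNS = ["Cust_No", "Customer_Name", "Phone", "Order_No", "Order_Date", "Order_Total"]

# The five rows shown in the figure.
ROWS = [
    (31323134, "ACE Consulting", "555-1212", 123552, "10/01/1999", 5111.47),
    (57896883, "XYZ Plumbing", "347-8954", 123777, "09/09/1999", 23454.84),
    (11111111, "Billy's Best Choice", "555-1234", 123456, "05/04/1998", 12347.53),
    (11111111, "Billy's Best Choice", "555-1234", 123512, "01/01/1999", 8005.91),
    (87323456, "Databases N-U", "322-1012", 123585, "10/10/1999", 15231.62),
]

# Row-based layout: a single block holds whole rows.
row_block = ROWS

# Columnar layout: one block per column, values kept in the same row order.
column_blocks = {name: [row[i] for row in ROWS] for i, name in enumerate(COLUMNS)}


def avg_from_row_block(block, column):
    """The whole block is moved, so every value of every row comes along."""
    idx = COLUMNS.index(column)
    values_moved = sum(len(row) for row in block)
    values = [row[idx] for row in block]
    return sum(values) / len(values), values_moved


def avg_from_column_blocks(blocks, column):
    """Only the block for the requested column is moved."""
    values = blocks[column]
    return sum(values) / len(values), len(values)


row_avg, row_moved = avg_from_row_block(row_block, "Order_Total")
col_avg, col_moved = avg_from_column_blocks(column_blocks, "Order_Total")
print(row_moved, col_moved)  # 30 5
```

Both layouts give the same average; the columnar layout simply gets there after moving 5 values instead of 30.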
Why Columnar?
“Everyone is kneaded out of the same dough but not baked in the same oven.” – Yiddish Proverb
[Figure: each column of the table is stored in its own data block.]

Emp_No  Dept_No  First_Name  Last_Name  Salary
1001    100      Rafael      Minal      90000
1002    200      Maria       Gomez      80000
1003    300      Charl       Kertzel    70000
1004    400      Kyle        Stover     60000
1005    400      Rob         Rivers     50000
1006    300      Inna        Kinski     50000
1007    200      Sushma      Davis      50000
1008    100      Mo          Khan       60000
1009    300      Mo          Swartz     70000
Each data block holds a single column. The row can be rebuilt because everything is aligned perfectly. If someone runs a query that would return the average salary, then only one small data block is moved into memory. The salary block moves into memory where it is processed as fast as lightning. We just cut down on moving large blocks by 80%! Why columnar? Because like our Yiddish Proverb says, "All data is not kneaded on every query, so that is why it costs so much dough."
Row Based Blocks vs. Columnar Based Blocks

“Two roads diverged in a wood and I took the one less traveled by, and that has made all the difference.” – Robert Frost

[Figure: the same table shown as a Row based design and as a Columnar Design.]
Both designs have the same amount of data. Both take up just as much space. In this example, both have nine rows and five columns. If a query needs to analyze all of the rows or return most of the columns, then the row-based design is faster and more efficient. However, if the query only needs to analyze a few rows or merely a few columns, then the columnar design is much lighter because not all of the data is moved into memory. Just one or two columns move. Take the road less traveled.
Visualize the Data – Rows vs. Columns

[Figure, top: 24 rows (five columns) stored in 6 blocks in this row-based system]

[Figure, bottom: 24 rows (five columns) stored in 15 blocks (each column is its own block)]
Both examples above have the same data and the same amount of data. If your applications tend to analyze the majority of columns or read the entire table, then a row-based system (top example) moves more of the needed data into memory with each block. Columnar tables are advantageous when only a few columns need to be read. This is just one of the reasons why analytics goes with columnar like bread goes with butter. A row-based system must move the entire block into memory even if it only needs to read one row or even a single column. If a user above needed to analyze the Salary, the columnar system would move 80% less block mass.
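The 80% figure is simple arithmetic, sketched below in Python. The sketch assumes that every column takes up the same share of the data, so the 15 columnar blocks split evenly into three blocks per column; real columns differ in width, so treat the numbers as an illustration. The function name `fraction_of_data_moved` is made up for the example.

```python
from fractions import Fraction


def fraction_of_data_moved(columns_needed, total_columns, columnar):
    """Share of the table's data copied from disk to memory for one query.

    Assumes every column holds an equal share of the data.
    """
    if columnar:
        return Fraction(columns_needed, total_columns)
    # A row-based block carries every column, needed or not.
    return Fraction(1)


# The query only needs Salary: one column out of five.
row_based = fraction_of_data_moved(1, 5, columnar=False)
columnar = fraction_of_data_moved(1, 5, columnar=True)

saving = 1 - columnar / row_based
print(f"{float(saving):.0%}")  # 80%

# With 15 columnar blocks in total, only the Salary blocks are read.
columnar_blocks_read = 15 * columnar
print(columnar_blocks_read)  # 3
```

Reading one of five equally sized columns moves one fifth of the data, which is the same as saying four fifths (80%) stays on disk.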
The Architecture of Vertica

[Figure: Compute Node 1 through Compute Node n, each made up of several Segments.]

“Be the change that you want to see in the world.” - Mahatma Gandhi
Vertica is a shared nothing architecture, designed as a collection of Linux cluster nodes connected by a TCP/IP network. Storage can be directly attached to each node, or SAN-based. This technology is relatively inexpensive. It might not "be the change you want to see in the world", but it will help your company "keep the change" because costs are low.
Vertica Architecture Terms

Host - A server with a 64-bit processor, memory, hard disk and TCP/IP network interface. Hosts share neither disk space nor main memory with each other. Vertica is a shared nothing MPP architecture.

Instance - An instance is a node running the Vertica process and disk storage (catalog and data) on a host. Only one instance of HP Vertica can be running on each host. Multiple instances make up a cluster.

Node - A host that is configured to run an instance of Vertica. It is a member of a database cluster, which consists of one or more nodes working together in parallel. HP has recoverability for multiple nodes (minimum 3 – recommended 4) so a database can recover from a node failure.

Cluster - A collection of nodes bound to a database.

Database - A cluster of nodes that perform distributed data storage and SQL statement execution as a single unit.

Above are the key terms for the Vertica architecture.
Vertica has Linear Scalability

[Figure: rows of Vertica Segments connected by an Ethernet Network.]
"A Journey of a thousand miles begins with a single step." - Lao Tzu
Vertica was born to be parallel. With each query, a single step is performed in parallel by each Segment. A Vertica system consists of a series of nodes that work in parallel to store and process your data. This design allows you to start small and grow infinitely. If your Vertica system provides you with an excellent Return on Investment (ROI), then continue to invest by purchasing more nodes, which adds Segments. Most companies start small, but after seeing what Vertica can do, they continue to grow their ROI from the single step of implementing a Vertica system to millions of dollars in profits. Double the Segments and double the speeds... forever. Vertica actually provides a journey of a thousand smiles!
Chapter 2 – Vertica Data Distribution
“Fall seven times, stand up eight.” – Japanese Proverb
Distribution Strategy 1 - Segmented By Hash

CREATE TABLE Employee_Table
( Employee_No integer NOT NULL,
  Dept_No     integer,
  Last_Name   char(20),
  First_Name  varchar(12)
)
SEGMENTED BY HASH(Employee_No) ALL NODES;

Emp_No  Dept_No  First_Name  Last_Name
1001    100      Rafael      Minal
1002    200      Maria       Gomez
1003    300      Charl       Kertzel
1004    400      Kyle        Stover
1005    400      Rob         Rivers
1006    300      Inna        Kinski
1007    200      Sushma      Davis
1008    100      Mo          Khan
1009    300      Mo          Swartz

[Figure: the Hash Key places each row on one of three segments.]

Segment 1
1001  100  Rafael  Minal
1008  100  Mo      Khan
1009  300  Mo      Swartz

Segment 2
1002  200  Maria   Gomez
1007  200  Sushma  Davis
1005  400  Rob     Rivers

Segment 3
1003  300  Charl   Kertzel
1006  300  Inna    Kinski
1004  400  Kyle    Stover
The entire row of a table is on a segment, but each column in the row is in a separate block. Vertica spreads the rows of a table evenly across the nodes. A good Distribution Key is the key to good distribution!
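One way to check how evenly the rows actually landed is to count the stored rows per node. The query below is a minimal sketch, assuming your user can read the v_monitor.projection_storage system table and that the Employee_Table above has been created and loaded.

```sql
-- Sketch: rows stored on each node for Employee_Table.
-- ILIKE is used so the match does not depend on the case the table was created with.
SELECT node_name
      ,projection_name
      ,SUM(row_count) AS rows_on_node
FROM v_monitor.projection_storage
WHERE anchor_table_name ILIKE 'employee_table'
GROUP BY node_name, projection_name
ORDER BY node_name, projection_name;
```

If the counts per node are close to each other, the Distribution Key is doing its job.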
Distribution Strategy 2 - Unsegmented
CREATE TABLE Emp_Table
(Emp_No     INTEGER,
 Dept_No    INTEGER,
 First_name VARCHAR(12),
 Last_name  CHAR(20))
UNSEGMENTED ALL NODES;
[Figure: Segment 1 through Segment n, each marked Unsegmented (Replicated) and each holding a full copy of all nine Emp_Table rows.]
Unsegmented means the table is copied in its entirety to all segments.
When Unsegmented is chosen for distribution, the entire table is copied to each segment. This is often termed replicated. The general idea is to Segment by Hash all large tables and to use Unsegmented on smaller tables.
Sorting the Data in a Table
CREATE TABLE Order_Table
(Order_No    INTEGER NOT NULL,
 Cust_No     INTEGER,
 Order_Date  DATE,
 Order_Total DECIMAL(8,2))
ORDER BY Order_Date
SEGMENTED BY HASH(Order_No) ALL NODES;
Order_No  Cust_No  Order_Date  Order_Total
________  _______  __________  ___________
1         100      1-1-2015    1000
2         200      1-1-2015    2000
3         300      1-2-2015    3000
4         400      1-3-2015    1000
5         400      1-3-2015    3000
6         300      1-4-2015    4000
7         200      1-5-2015    1000
8         100      1-6-2015    2000
9         300      1-6-2015    1000
[Figure: the nine rows hashed on Order_No across three Segments, three rows per Segment, with each Segment's rows stored in sort key (Order_Date) order.]
We have chosen the Order_Date column as the sort key and the Order_No column as the Hash Key.
Even Distribution
CREATE TABLE Order_Table3
(Order_No    INTEGER NOT NULL,
 Cust_No     INTEGER,
 Order_Date  DATE,
 Order_Total DECIMAL(8,2))
SEGMENTED BY HASH(Order_No) ALL NODES;
Order_No  Cust_No  Order_Date  Order_Total
________  _______  __________  ___________
1         100      1-1-2015    1000
2         200      1-1-2015    2000
3         300      1-2-2015    3000
4         400      1-3-2015    1000
5         400      1-3-2015    3000
6         300      1-4-2015    4000
7         200      1-5-2015    1000
8         100      1-6-2015    2000
9         300      1-6-2015    1000
[Figure: the nine Order_Table3 rows hashed on Order_No across three Segments, three rows on each.]
The data has spread evenly among the segments for this table. Do you know why? The Hash Key is Order_No and it is a unique value. Hashing unique values results in near perfect distribution every single time.
Uneven Distribution Where the Data is Non-Unique
CREATE TABLE Order_Table4
(Order_No    INTEGER NOT NULL,
 Cust_No     INTEGER,
 Order_Date  DATE,
 Order_Total DECIMAL(8,2))
SEGMENTED BY HASH(Cust_No) ALL NODES;
Order_No  Cust_No  Order_Date  Order_Total
________  _______  __________  ___________
1         100      1-1-2015    1000
2         200      1-1-2015    2000
3         300      1-2-2015    3000
4         400      1-3-2015    1000
5         400      1-3-2015    3000
6         300      1-4-2015    4000
7         200      1-5-2015    1000
8         100      1-6-2015    2000
9         300      1-6-2015    1000
[Figure: the nine Order_Table4 rows hashed on Cust_No across three Segments. Rows that share a Cust_No land on the same Segment, so the Segments hold unequal numbers of rows.]
The data did not spread evenly among the segments for this table. Do you know why? The Hash Key is Cust_No. All like values went to the same Node. This distribution isn't perfect, but it is reasonable, so it is an acceptable practice.
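Before choosing a Hash Key, it can help to compare how many distinct values each candidate column has. The queries below are a simple sketch in standard SQL against the Order_Table4 example; nothing here is specific to Vertica.

```sql
-- Sketch: compare the uniqueness of two candidate Hash Key columns.
SELECT COUNT(*)                 AS total_rows
      ,COUNT(DISTINCT Order_No) AS distinct_order_no
      ,COUNT(DISTINCT Cust_No)  AS distinct_cust_no
FROM Order_Table4;

-- Sketch: see how many rows travel together for each Cust_No value.
SELECT Cust_No
      ,COUNT(*) AS rows_per_key
FROM Order_Table4
GROUP BY Cust_No
ORDER BY rows_per_key DESC, Cust_No;
```

For the nine sample rows, the first query returns 9 total rows, 9 distinct Order_No values and only 4 distinct Cust_No values. The second query shows that Cust_No 300 carries three rows while the other customers carry two each, which is why the Segments cannot come out perfectly even.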
Matching Distribution Keys for Co-Location of Joins
CREATE TABLE Employee_table
(Emp_No     INTEGER NULL,
 Dept_No    INTEGER NULL,
 Last_name  CHAR(20) NULL,
 First_name VARCHAR(12) NULL)
SEGMENTED BY HASH(Dept_No) ALL NODES;
CREATE TABLE Department_table
(Dept_No   INTEGER NULL,
 Dept_Name CHAR(20) NULL,
 Mgr_No    INTEGER,
 Budget    DECIMAL(10,2))
SEGMENTED BY HASH(Dept_No) ALL NODES;
[Figure: three Segments, each holding the Employee_Table rows and the Department_Table rows that share the same Dept_No values: Dept_No 100 (Fin) on one Segment, Dept_No 200 (HR) and 400 (IT) on another, and Dept_No 300 (Mrkt) on the third.]
Notice that both tables are distributed by Hash on the column Dept_No. When these two tables are joined on Dept_No (WHERE Employee_table.Dept_No = Department_table.Dept_No), the rows with matching department numbers are on the same segment. This is called Co-Location. It makes joins efficient and fast.
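A join that benefits from this design might look like the sketch below. It uses only the two tables defined above; the aliases e and d are introduced here for readability.

```sql
-- Sketch: join the two tables on their shared Hash Key (Dept_No).
-- Because both tables are segmented by HASH(Dept_No), matching rows
-- already sit on the same node.
SELECT e.Emp_No
      ,e.Last_name
      ,e.First_name
      ,d.Dept_Name
FROM Employee_table AS e
INNER JOIN Department_table AS d
        ON e.Dept_No = d.Dept_No
ORDER BY e.Emp_No;
```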
Big Table / Small Table Joins
[Figure: three Segments. Each holds its own share of the Employee_Table (Segmented by Hash, three rows apiece) plus a full Replicated copy of the four-row Department_Table shown below.]
Dept_No  Dept_Name  Mgr_No  Budget
_______  _________  ______  ______
100      Fin        1008    90000
200      HR         1002    500000
300      Mrkt       1006    500000
400      IT         1005    600000
Notice that the Department_Table has only four rows. Those four rows are copied to every segment because the table is distributed as UNSEGMENTED. Now, the Department_Table can be joined to the Employee_Table with a guarantee that matching rows are co-located. They are co-located because the smaller table has copied ALL of its rows to each Node. When two joining tables consist of one large table (fact table) and one small table (dimension table), use the UNSEGMENTED keyword to distribute the smaller table. This technique is also called a "Big Table / Small Table Join".
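The replicated version of the small table could be declared as in the sketch below. The column list follows the Department_table definition used earlier in this chapter; only the distribution clause changes.

```sql
-- Sketch: the small (dimension) table is copied in full to every node.
CREATE TABLE Department_Table
(Dept_No   INTEGER
,Dept_Name CHAR(20)
,Mgr_No    INTEGER
,Budget    DECIMAL(10,2))
UNSEGMENTED ALL NODES;

-- The big table can be hashed on any column and the join stays local,
-- because every node already holds all four department rows.
SELECT e.Emp_No
      ,e.Last_name
      ,d.Dept_Name
      ,d.Budget
FROM Employee_Table AS e
INNER JOIN Department_Table AS d
        ON e.Dept_No = d.Dept_No;
```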
Fact and Dimension Table Distribution Key Designs
Make the Part_Key the Distribution Key for the two largest tables
Table                  Key          Distribution
_____________________  ___________  _________________
Line_Order_Fact_Table  LO_Part_Key  Segmented by Hash
Part_Table             P_Part_Key   Segmented by Hash
Customer_Table         C_Cust_Key   REPLICATED
Supplier_Table         S_Supp_Key   REPLICATED
Date_Table             D_Date_Key   REPLICATED
Line_Order_Fact_Table columns: LO_Order_Key, LO_Line_Number, LO_Cust_Key, LO_Part_Key, LO_Ship_Priority, LO_Quantity, LO_Extended_Price, LO_Supp_Key, LO_Order_Total_Price, LO_Discount, LO_Tax, LO_Order_Date, LO_Supply_Cost, LO_Revenue, LO_Ship_Mode
The fact table (Line_Order_Fact_Table) is the largest table, and the Part_Table is the largest dimension table. That is why you make Part_Key the distribution key for both tables. Now, when these two tables are joined together, the matching Part_Key rows are on the same Node. You can then make the other dimension tables UNSEGMENTED, which replicates them to each node, so each of those tables has all of its rows on every Node. Now, everything that joins to the fact table is co-located!
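The design above might be expressed in DDL roughly as follows. This is only a sketch: the fact table is trimmed to a few of its columns, all data types are assumptions, and the P_Name and C_Name columns are hypothetical placeholders that do not appear in the original design.

```sql
-- Sketch: the two largest tables share the Part_Key as their Hash Key.
CREATE TABLE Line_Order_Fact_Table
(LO_Order_Key   INTEGER NOT NULL
,LO_Line_Number INTEGER
,LO_Cust_Key    INTEGER
,LO_Part_Key    INTEGER
,LO_Supp_Key    INTEGER
,LO_Revenue     DECIMAL(12,2))       -- data types are assumptions
SEGMENTED BY HASH(LO_Part_Key) ALL NODES;

CREATE TABLE Part_Table
(P_Part_Key INTEGER NOT NULL
,P_Name     VARCHAR(40))             -- hypothetical column
SEGMENTED BY HASH(P_Part_Key) ALL NODES;

-- Sketch: a smaller dimension table is replicated to every node.
CREATE TABLE Customer_Table
(C_Cust_Key INTEGER NOT NULL
,C_Name     VARCHAR(40))             -- hypothetical column
UNSEGMENTED ALL NODES;
```

The Supplier_Table and Date_Table would follow the same UNSEGMENTED pattern as the Customer_Table.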
Why a Sort Key Improves Performance
CREATE TABLE Order_Table
(Order_No    INTEGER NOT NULL,
 Cust_No     INTEGER,
 Order_Date  DATE,
 Order_Total DECIMAL(8,2))
ORDER BY Order_Date
SEGMENTED BY HASH(Order_No) ALL NODES;
[Figure: Segment 1, Segment 2 and Segment 3 each hold their part of the Order_Table in blocks sorted by month: JAN FEB, then MAR APR, then MAY JUN.]
There are three basic reasons to specify a sort key (the ORDER BY clause) when creating a table.
1) If recent data is queried most frequently, specify the timestamp or date column as the leading column for the sort key.
2) If you do frequent range filtering or equality filtering on one column, specify that column as the sort key.
3) If you frequently join a (dimension) table, specify the join column as the sort key.
Above, you can see we have made our sort key the Order_Date column. Look how the data is sorted!
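A range filter on the sort key is the kind of query that gains the most. The sketch below runs against the Order_Table defined above and assumes the sample dates are written month-day-year, so that 1-3-2015 means January 3, 2015.

```sql
-- Sketch: range filtering on the sort key column (Order_Date).
-- Because each node stores its rows in Order_Date order, the qualifying
-- rows sit next to each other on disk.
SELECT Order_No
      ,Cust_No
      ,Order_Date
      ,Order_Total
FROM Order_Table
WHERE Order_Date BETWEEN DATE '2015-01-03' AND DATE '2015-01-05'
ORDER BY Order_Date;
```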
Sort Keys Help Group By, Order By and Window Functions
CREATE TABLE Order_Table
(Order_Number    INTEGER NOT NULL,
 Customer_Number INTEGER,
 Order_Date      DATE,
 Order_Total     DECIMAL(8,2))
ORDER BY Customer_Number, Order_Date
SEGMENTED BY HASH(Order_Number) ALL NODES;
SELECT Customer_Number
      ,SUM(Order_Total) as "Order Sum"
      ,AVG(Order_Total) as "Avg Order"
FROM Order_Table
GROUP BY Customer_Number
ORDER BY Customer_Number ;
SELECT Customer_Number
      ,Order_Date
      ,Order_Total
      ,SUM(Order_Total) OVER (Partition By Customer_Number
                              Order By Customer_Number, Order_Date
                              Rows Unbounded Preceding) as "Cumulative Sum"
FROM Order_Table ;
When data is sorted on a strategic column, it improves GROUP BY and ORDER BY operations and window functions (PARTITION BY and ORDER BY operations), and it even serves as a means of optimizing compression. But, as new rows are incrementally loaded, these new rows are sorted but they reside temporarily in a separate region on disk. To maintain fully sorted storage, Vertica's Tuple Mover merges these regions in the background at regular intervals (mergeout). You will also need to keep statistics current with ANALYZE_STATISTICS.
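In Vertica, optimizer statistics are gathered with the ANALYZE_STATISTICS function. A minimal sketch, assuming the Order_Table above exists in your current schema:

```sql
-- Sketch: refresh optimizer statistics for one table after a load.
SELECT ANALYZE_STATISTICS('Order_Table');
```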
Chapter 3 – Clever Features of Vertica
“Always remember that you are unique just like everyone else.” - Anonymous
Super Projections
Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        31323134         15231.62
[Figure: Node 1 through Node n. The Customer_Table superprojection is Unsegmented, so all three rows are replicated on all nodes. The Order_Table superprojection is Segmented (Hashed): orders 123456 and 123512 are on Node 1, and orders 123552 and 123585 are on Node n.]
A superprojection contains all columns of a single table by default
Vertica creates a default superprojection for each table in the database so that all SQL queries can be answered. A superprojection consists of all columns in the table, and it is created by default when the data is first loaded or inserted. Notice that each superprojection above is either replicated across all nodes or has its rows hashed to different nodes.
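To see which projections exist for a table and which of them are superprojections, you can query the catalog. The sketch below assumes the v_catalog.projections system table and the two example tables above.

```sql
-- Sketch: list the projections anchored on the two example tables.
SELECT anchor_table_name
      ,projection_name
      ,is_super_projection
      ,is_segmented
FROM v_catalog.projections
WHERE anchor_table_name ILIKE 'customer_table'
   OR anchor_table_name ILIKE 'order_table'
ORDER BY anchor_table_name, projection_name;
```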
Vertica Projections
Original Data
Stu_ID  First_Name  Last_Name  Class_Code  Grade_Pt
______  __________  _________  __________  ________
1       Thomas      Wendy      FR          4.00
2       Smith       Andy       SO          2.00
3       McRoberts   Richard    JR          1.90
4       Phillips    Martin     SR          3.00
Physically Stored as Columns
Stu_ID: 1, 2, 3, 4
First_Name: Thomas, Smith, McRoberts, Phillips
Last_Name: Wendy, Andy, Richard, Martin
Class_Code: FR, SO, JR, SR
Grade_Pt: 4.00, 2.00, 1.90, 3.00
Split into Multiple Projections
Student_Names: Stu_ID, First_Name, Last_Name
Student_Grades: Stu_ID, Class_Code, Grade_Pt
Vertica stores data physically in views called projections. Each projection contains a subset of the columns, but each subset can be sorted differently. Projections can even contain columns from multiple tables, like a materialized view. Every data element in a table will appear in at least one projection. Tables occupy no physical storage! It is only the projections that are stored. This allows Vertica to group columns used most often together right next to each other in the physical storage.
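The two projections in the picture could be created along the lines of the sketch below. The anchor table name Student_Table is hypothetical (the original does not name it), and the sort orders are chosen only for illustration.

```sql
-- Sketch: two projections over one table, each with its own columns and sort order.
CREATE PROJECTION Student_Names AS
SELECT Stu_ID, First_Name, Last_Name
FROM Student_Table                   -- hypothetical table name
ORDER BY Last_Name, First_Name
SEGMENTED BY HASH(Stu_ID) ALL NODES;

CREATE PROJECTION Student_Grades AS
SELECT Stu_ID, Class_Code, Grade_Pt
FROM Student_Table                   -- hypothetical table name
ORDER BY Class_Code, Grade_Pt
SEGMENTED BY HASH(Stu_ID) ALL NODES;

-- If the table already holds data, new projections must be refreshed
-- before the optimizer can use them.
SELECT START_REFRESH();
```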
The Five Advantages of Projections
1. The Vertica query optimizer automatically picks the best projections to use for any query, so no user interaction is required.
2. Projections compress and encode data to greatly reduce the space required for storing data.
3. Vertica operates on the encoded data when it can in order to avoid the cost of decoding.
4. Because Vertica uses a combination of both compression and encoding, this ensures the smallest disk space possible and yet it still maximizes query performance.
5. Projections also provide high availability and recovery by duplicating table columns on at least K+1 nodes within its cluster. If a node fails, the database continues to operate by using duplicate data on a buddy node(s).
Vertica projections store data in an encoded format designed for automatic performance tuning. Think of projections as similar to Join Indexes in Teradata or materialized views in Oracle. Projections are really result sets that are stored on disk. Instead of computing these results each time, queries can reuse the stored results. Projection results are automatically refreshed whenever data values are inserted, deleted, updated or copied.
Creating a Projection
CREATE PROJECTION Coffing.SQL_Class.Order_Projection
( Order_Number_P ENCODING RLE
 ,Customer_Number_P ENCODING RLE
 ,Order_Date_P
 ,Order_Total_P )
AS SELECT Order_Number
         ,Customer_Number
         ,Order_Date
         ,Order_Total
FROM Coffing.SQL_Class.Order_Table
ORDER BY Order_Date
SEGMENTED BY HASH(Customer_Number) ALL NODES
OFFSET 1;
In the projection name, Coffing is the Database, SQL_Class is the Schema and Order_Projection is the Projection Name. The list in parentheses holds the Projection Columns, which are matched by position to the columns in the SELECT list. ORDER BY determines how the data is sorted on each node. SEGMENTED BY determines how the data will be distributed to the nodes.
Above, we created a projection using all columns in the table, and we determined how we wanted the data sorted.
Read-Optimized Store (ROS)/Write-Optimized Store (WOS)
[Figure: a Node. In the memory cache, the Write-Optimized Storage (WOS) holds the Current Epoch plus older buckets marked "Ready for Transfer". On the disks for permanent storage sits the Read-Optimized Storage (ROS). Both can be queried.]
Periodically, the Tuple Mover migrates the recent updates that are "Ready for Transfer" to the permanent storage in the Read-Optimized Storage (ROS).
Vertica caches all updates in a main-memory area called the Write-Optimized Store (WOS), which, by the way, is queryable. The WOS puts the data into projections in collection buckets that are uncompressed and unsorted, but are in update order. The Tuple Mover then migrates the recent updates during certain periods to the permanent disk storage in the Read-Optimized Store (ROS). The data in the ROS is sorted, compressed and packed into variable length disk blocks.
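To see how much of a table currently sits in each store, you can compare the row counts per projection. This is a sketch that assumes a Vertica release that still has the WOS and exposes the wos_row_count and ros_row_count columns in v_monitor.projection_storage; the table name Order_Table is taken from earlier examples.

```sql
-- Sketch: rows waiting in memory (WOS) versus rows on disk (ROS).
SELECT node_name
      ,projection_name
      ,wos_row_count
      ,ros_row_count
FROM v_monitor.projection_storage
WHERE anchor_table_name ILIKE 'order_table'
ORDER BY node_name, projection_name;
```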
Write-Optimized Store (WOS) is Memory Resident
[Figure: the Write-Optimized Storage (WOS) holds the Current Epoch plus older buckets marked "Ready for Transfer", which feed the Read-Optimized Storage (ROS).]
Vertica's Write Optimized Store (WOS) is always memory-resident, and it is a buffer for INSERT, UPDATE, DELETE, and COPY operations. To support very fast data load speeds, the WOS stores records without data compression or indexing. A projection in the WOS is sorted only when it is queried. It remains sorted as long as no further data is inserted into it. The WOS organizes data by epoch and holds both committed and uncommitted transaction data. Both the Read Optimized Store (ROS) and the Write Optimized Store (WOS) are arranged by projections. This technique allows for continuous loading throughout the day without having a major impact on read queries.
Updates are collected in Time-Based Buckets called Epochs
[Figure: the Write-Optimized Storage (WOS) holds the Current Epoch plus older Epochs marked "Ready for Transfer", which feed the Read-Optimized Storage (ROS).]
Vertica caches all updates in a main-memory area called the Write-Optimized Store (WOS), which, by the way, is queryable. The WOS is designed so updates can be collected in time-based buckets. At fixed intervals, Vertica closes the current Epoch and begins a new Epoch. The non-current Epochs are queryable and are marked for migration by the Tuple Mover to the permanent disk storage called the Read-Optimized Storage (ROS). This design gives the majority of users, who only need to read data, an open gateway; it also allows for near real-time data warehouses with high append data volumes.
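Epochs can be inspected from SQL. The statements below are a sketch using Vertica's epoch functions and the AT EPOCH clause; Order_Table is simply the example table from earlier chapters.

```sql
-- Sketch: which epoch is open right now, and which is the last good one?
SELECT GET_CURRENT_EPOCH();
SELECT GET_LAST_GOOD_EPOCH();

-- Sketch: a historical query that reads committed data up to,
-- but not including, the current epoch.
AT EPOCH LATEST SELECT COUNT(*) FROM Order_Table;
```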
Vertica Does Not Support In-Place Updates
[Figure: a Node whose Write-Optimized Storage (WOS) holds the Current Epoch plus Epochs marked "Ready for Transfer". The Tuple Mover updates the Read-Optimized Storage (ROS) by deleting and re-inserting rows.]
Vertica's Tuple Mover updates by deleting and re-inserting rows. Appended data is added to the end of a column-store block, and updated data (in the middle of a block) is deleted and re-inserted. Vertica does not support in-place updates.
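One way to watch this behavior is to run an update and then look at the rows Vertica has marked as deleted. This is a sketch that assumes the v_monitor.delete_vectors system table, the Order_Table from the earlier sort key examples, and projection names that begin with the table name.

```sql
-- Sketch: an UPDATE is carried out as a delete plus a re-insert.
UPDATE Order_Table
SET Order_Total = 1500.00
WHERE Order_No = 4;
COMMIT;

-- Sketch: the old version of the row shows up as a deleted row
-- until it is purged.
SELECT node_name
      ,projection_name
      ,deleted_row_count
FROM v_monitor.delete_vectors
WHERE projection_name ILIKE 'order_table%';
```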
K-Safety
[Figure: a five-node cluster, Node 1 through Node 5, in which each node also stores a backup copy of a neighboring node's data.]
The K in K-Safety means how many duplicate copies are stored. In this example K = 1. This example is not designed to represent mirroring, but in effect each node has a buddy node that keeps a backup copy of its data in case of a failure. Node 1 holds the backup for node 2, node 2 holds the backup for node 3, and so on.
You can view a list of critical nodes in your database by running the query below from your Nexus Chameleon.
SELECT * FROM v_monitor.critical_nodes;
Any one of the nodes in the cluster example above could fail, and the database would still be able to continue to perform. The performance would be lower because one node would have to handle its own workload and the workload of the failed node.
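K-Safety can also be requested when a table is created. The sketch below uses the KSAFE clause of CREATE TABLE; the table name Order_Table_K1 is hypothetical and is used only so the example does not collide with earlier tables.

```sql
-- Sketch: ask for one extra (buddy) copy of this table's data.
CREATE TABLE Order_Table_K1          -- hypothetical table name
(Order_No    INTEGER NOT NULL
,Cust_No     INTEGER
,Order_Date  DATE
,Order_Total DECIMAL(8,2))
ORDER BY Order_Date
SEGMENTED BY HASH(Order_No) ALL NODES
KSAFE 1;
```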
K-Safety of 2
[Figure: a five-node cluster, Node 1 through Node 5, in which each node is a buddy to the nodes before and after it.]
To see the K-Safety numbers just run the query below from your Nexus Chameleon.
SELECT current_fault_tolerance FROM system ;
Any two nodes in the cluster example above could fail, and the database would still be able to continue to perform. Each node is a buddy to the nodes before and after it.
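It is often useful to compare the K-Safety the database was designed for with what it can tolerate right now. A small sketch, assuming the v_monitor.system table exposes these columns in your release:

```sql
-- Sketch: designed versus current fault tolerance, plus node counts.
SELECT designed_fault_tolerance
      ,current_fault_tolerance
      ,node_count
      ,node_down_count
FROM v_monitor.system;
```

If current_fault_tolerance is lower than designed_fault_tolerance, at least one node is down and the cluster is running on buddy copies.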
The Five Data Isolation Modes
1. Snapshot – Queries and updates do not interfere with each other, so read-only queries do not require locking.
2. Serializable – Transactions run in serial order. Locks are acquired for both read and write operations, which ensures that any successive SELECT commands within a single transaction always produce the same results.
3. Repeatable Read – Auto-converts to SERIALIZABLE.
4. Read Committed – SELECT queries see a snapshot of the committed data at the start of the transaction and any results of updates run within their own transaction, even if they have not been committed.
5. Read Uncommitted (Read Without Integrity) – Auto-converts to READ COMMITTED.
By default, Vertica uses the READ COMMITTED isolation level.
Vertica supports all types of database isolation. Database isolation refers to how the concurrent users of data affect each other as they read and change data in the database. The key question comes down to integrity of data vs. concurrency. Although the optimizer understands all five standard SQL isolation levels, internally Vertica uses only two isolation levels. They are "Read Committed" and "Serializable". So, you may not get the other isolations you request. Vertica automatically translates "Read Uncommitted" to "Read Committed" and "Repeatable Read" to "Serializable".
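The isolation level can be checked and changed for a session. The statements below are a sketch using standard Vertica session commands; the comments restate what the paragraph above says about translation and are not captured output.

```sql
-- Sketch: see the isolation level currently in effect for this session.
SHOW TRANSACTION_ISOLATION;

-- Sketch: switch the session to the stricter level.
SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- Sketch: requesting REPEATABLE READ is accepted, but as described above
-- Vertica treats it as SERIALIZABLE internally.
SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SHOW TRANSACTION_ISOLATION;
```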
Import/Export between Multiple Vertica Systems
[Figure: Vertica System 1 with Node 1 through Node 4, and Vertica System 2 with Node 1 through Node 3 on SAN Storage.]
Entire databases, or certain portions of databases, can be moved from one Vertica system to another by using a simple SQL statement. Notice that the instances do not need to be the same size or have the same storage requirements. The data is automatically re-segmented to match the new configuration, and projections are resorted based on queries being run.
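Moving one table between two Vertica systems could look like the sketch below. The database name target_db, the host name, the user and the password are all hypothetical placeholders, and the destination table is assumed to exist already on the target system.

```sql
-- Sketch: open a connection from this Vertica system to another one.
-- target_db, dbadmin, 'secret' and 'target-host' are hypothetical values.
CONNECT TO VERTICA target_db USER dbadmin PASSWORD 'secret' ON 'target-host', 5433;

-- Sketch: copy the rows of one table into the matching table on the target.
EXPORT TO VERTICA target_db.SQL_Class.Order_Table
AS SELECT * FROM SQL_Class.Order_Table;

-- Close the connection when the export is finished.
DISCONNECT target_db;
```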
Roles
[Figure: 1,000 Users are each told "You’ve been given the Mrkt_User_Role". The Mrkt_User_Role carries the grant "I Grant thee SELECT (read) privilege to all tables in the database Mrkt", whose tables are Customers, Products, Orders and Sales.]
Roles simplify database administration. Access rights to tables and other objects are assigned to a role, and then groups of people with similar job functions (or roles) can access these objects through it. It is as simple as creating different roles for different job functions and responsibilities, granting specific privileges (access rights) on database objects to these roles, and then granting a role or roles to the users who share the same privileges. Vertica database security supports roles conforming to the SQL:2008 specification. This type of security is essential for management of data access across large organizations.
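The steps above can be sketched in SQL as follows. Mrkt is treated here as a schema, which is an assumption, and some_user is a hypothetical user name.

```sql
-- Sketch: create the role and give it read access to everything in Mrkt.
CREATE ROLE Mrkt_User_Role;
GRANT USAGE ON SCHEMA Mrkt TO Mrkt_User_Role;
GRANT SELECT ON ALL TABLES IN SCHEMA Mrkt TO Mrkt_User_Role;

-- Sketch: hand the role to a user (repeat for each of the 1,000 users).
GRANT Mrkt_User_Role TO some_user;   -- hypothetical user name

-- The user then turns the role on in their own session.
SET ROLE Mrkt_User_Role;
```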
Compression
Vertica compresses data in order to save space. Here are the facts:
• Vertica can utilize over twelve different compression options.
• The compression depends on the data.
• Vertica will choose which compression option to apply.
• NULLs take up no space on Vertica because they are compressed.
• Vertica will compress data on average 70%.
• HP Vertica queries data in encoded form.
• When similar data is grouped, you have even more options.
One of the key advantages to columnar storage is the ability to compress column data. When column stores are compressed, they can store more data, provide more projections and use less hardware. This can allow up to 50% more historical data to be stored and queried. The following pages show just some of the compression techniques utilized. Encoding is the process of converting data into a standard format. Encoded data can be directly processed; however, compressed data cannot. Vertica operates on encoded data when it can to avoid the heavy costs of decoding.
Runlength encoding
Encoding Type: Run-length
Encoding Keyword: RLE
Data Types: All
Original Data  Original size (bytes)  Compressed Value  Compressed Size
_____________  _____________________  ________________  _______________
Ohio           4                      4, Ohio           5
Ohio           4                                        0
Ohio           4                                        0
Ohio           4                                        0
California     10                     3, California     11
California     10                                       0
California     10                                       0
Michigan       8                      2, Michigan       9
Michigan       8                                        0
Runlength encoding replaces a value that is repeated consecutively with a token that consists of the value and a count of the number of consecutive occurrences (the length of the run). This is where the name Runlength comes into play. A separate dictionary of unique values is created for each block of column values on disk. This encoding is best suited to a table in which data values are often repeated consecutively, for example, when the table is sorted by those values.
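Run-length encoding pays off when the column is also the leading sort column, so equal values sit in long runs. The sketch below uses a hypothetical Store_Table to show both ideas; the ENCODING RLE keyword is the one used in the CREATE PROJECTION example earlier in this book.

```sql
-- Sketch: sort by State so repeated values are stored consecutively,
-- then run-length encode that column.
CREATE TABLE Store_Table             -- hypothetical table
(Store_No INTEGER NOT NULL
,State    VARCHAR(20) ENCODING RLE)
ORDER BY State, Store_No
SEGMENTED BY HASH(Store_No) ALL NODES;

-- Sketch: for a column that is fully sorted, each value forms one run,
-- so this shows the (value, run length) pairs the encoding would store.
SELECT State
      ,COUNT(*) AS run_length
FROM Store_Table
GROUP BY State
ORDER BY State;
```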
LZO Encoding
Encoding Type: LZO
Encoding Keyword: LZO
Data Types: All except BOOLEAN, REAL, and DOUBLE PRECISION
• Designed to work best with Char and Varchar data that store long character strings
• Is a portable lossless data compression library written in ANSI C
• Offers fast compression but extremely fast decompression
• Includes slower compression levels achieving a quite competitive compression ratio while still decompressing at this very high speed
• Often implemented with a tool called LZOP
Lempel–Ziv–Oberhumer (LZO) is a lossless data compression algorithm that is focused on decompression speed. LZO encoding provides a high compression ratio with good performance. LZO encoding is designed to work well with character data. It is especially good for CHAR and VARCHAR columns that store very long character strings, especially free form text such as product descriptions, user comments or JSON strings.
Delta Encoding
Encoding Type  Encoding Keyword  Data Types
_____________  ________________  ________________________________________________
Delta          DELTA             SMALLINT, INT, BIGINT, DATE, TIMESTAMP, DECIMAL
Delta          DELTA32K          INT, BIGINT, DATE, TIMESTAMP, DECIMAL
Uncompressed Data (4-byte integers): 1 2 3 4 5 6 7 8
Delta Encoding Compression: 0001 1 1 1 1 1 1 1
The first row is a 4-byte integer (plus one flag byte).
Each following row is one byte with the number 1, because each value is 1 greater than the previous value.
The Delta encodings are very useful for date and time columns. Delta encoding compresses data by recording the difference between values that follow each other in the column. These differences are recorded in a separate dictionary for each block of column values on disk. If the column contains 10 integers in sequence from 1 to 10, the first will be stored as a 4-byte integer (plus a 1-byte flag), and the next 9 will each be stored as a byte with the value 1, indicating that it is one greater than the previous value. Delta encoding comes in two variations. DELTA records the differences as 1-byte values (8-bit integers), and DELTA32K records differences as 2-byte values (16-bit integers).
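The differences that delta encoding records can be reproduced with an ordinary window function. The sketch below builds the eight sample values inline, so it needs no table; it illustrates the arithmetic only and is not how Vertica performs the encoding internally.

```sql
-- Sketch: compute the difference between each value and the one before it.
SELECT n
      ,n - LAG(n) OVER (ORDER BY n) AS delta
FROM (SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
      UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8) AS t
ORDER BY n;
```

The first row has no previous value, so its delta is NULL (that is the row stored in full). Every other row has a delta of 1, which fits in a single byte.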
Block Based Dictionary Encoding for Character Data
Encoding Type: Block Based dictionary
Encoding Keyword: Block Based
Data Types: Character
Uncompressed Data  Compressed Data
_________________  _______________
Ohio               1
California         2
Minnesota          3
Alaska             4
Oregon             5
Ohio               1
California         2
Minnesota          3
Alaska             4
Dictionary
1 - Ohio
2 - California
3 - Minnesota
4 - Alaska
5 - Oregon
Block Based Dictionary Encoding utilizes a separate dictionary of unique values for each block of column values on disk. Remember, each Vertica disk block occupies 1 MB. The dictionary contains up to 256 one-byte values that are stored as indexes to the original data values. If more than 256 values are stored in a single block, the extra values are written into the block in raw, uncompressed form. The process repeats for each disk block. This encoding is very effective when a column contains a limited number of unique values, and it is especially effective when there are fewer than 256 unique values.
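In Vertica DDL, this kind of encoding is requested with the BLOCK_DICT keyword on a column. The sketch below uses a hypothetical Address_Table and then counts the distinct values, which is a quick way to judge whether a column is a good candidate.

```sql
-- Sketch: request block dictionary encoding for a low-cardinality column.
CREATE TABLE Address_Table           -- hypothetical table
(Address_No INTEGER NOT NULL
,State      VARCHAR(20) ENCODING BLOCK_DICT)
SEGMENTED BY HASH(Address_No) ALL NODES;

-- Sketch: a small number of distinct values (well under 256) suggests
-- the dictionary will cover every value in a block.
SELECT COUNT(DISTINCT State) AS distinct_states
FROM Address_Table;
```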
Chapter 4 - Nexus
“We envisioned the cloud a decade before its arrival and that is why the Nexus is the most dominating tool in the industry today” - Tera-Tom Coffing
Nexus is Available on the Cloud
Why the Nexus Chameleon should be your query tool of choice:
1) Queries every major system
2) Provides visualization and automatically writes the SQL
3) Can perform cross-system joins with a few clicks of the mouse
4) Converts table structures and moves the table and data between systems
5) Compares and synchronizes databases
6) Can move an entire database of tables or views between systems
7) Has the "Garden of Analysis" to re-query answer sets inside your PC
8) Provides a dashboard of graphs and charts for answer sets
Download the Nexus for a free trial at www.CoffingDW.com and use Nexus in-house or on the cloud. Nexus is on the Amazon (AWS) cloud, the Microsoft Azure cloud and the Century Link cloud.
Nexus Queries Every Major System
[Screenshot: the Nexus Chameleon window connected to System Vertica, Database SQL Class. The Systems tree lists Aster Data, Azure Cloud, DB2, Excel, Greenplum, Hadoop, Kognitio, Netezza, Oracle, Matrix, Redshift, SQL Server, Sybase, Teradata and Vertica. The query window holds the SQL below, and the Result 1 tab shows its answer set.]
SELECT * FROM Employee_table Where Dept_No = 400 ;
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
1256349      400      Harrison    Herbert     54500.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
Nexus is designed to work with every system in your enterprise, whether on-premises or in the cloud. And you can query them all simultaneously.
Nexus is designed to work with every system in your enterprise, on-premises systems and cloud systems. Nexus works with traditional systems, such as DB2, Oracle, SQL Server and Teradata. Nexus also works with newer systems, such as Netezza, Greenplum, Kognitio, Hana, Matrix, Aster Data and Vertica. Nexus also works with your top cloud systems, such as Amazon Redshift, Microsoft Azure SQL Data Warehouse and Hadoop.
How to Use Nexus
[Screenshot: the Nexus Chameleon window (System: Vertica, Database: SQL Class) with the History, Sandbox, EXECUTE, ? and New Query buttons, the Systems tree and the Query 1 tab. The callouts on the screenshot read as follows.]
• Your query history
• The system you're connected to
• Click on the plus sign and see all the databases and objects for this particular system
• Your current database
• Runs a query, or press F5
• Clears the SQL from your screen
• Provides the EXPLAIN plan, or hit F6
• Hit F2 to see SQL syntax
• Opens up another query window for any system you have in your systems tree
The important buttons and function keys you need to know about are listed above.
Why is Nexus Special? Visualization and Automatic SQL
[Screenshot: the Super Join Builder in Nexus Chameleon (System: Vertica, Database: SQL Class, Join Hub System: Teradata) with its Objects, Columns, Sorting, Joins, WHERE, SQL, Metadata and Analytics tabs. Two Vertica tables are shown side by side, each with an "Add Join" menu and a "Select *" checkbox.]
Customer_Table: Customer_Number Integer, Customer_Name Varchar(20), Phone_Number Char(8)
Order_Table: Order_Number Integer, Customer_Number Integer, Order_Date Date, Order_Total Decimal(10,2)
Right click on any table or view in your Nexus system tree and choose "Super Join Builder". Your table, or view, will be shown visually, along with its columns and their data types. Press the "Add Join" drop down menu and see what other tables or views can be joined. Click on the columns you want on your report and a checkmark appears. Nexus automatically writes the SQL for you. You now have the ability to develop at record speeds.
Why is Nexus Special? Cross-System Joins
[Screenshot: the Super Join Builder in Nexus Chameleon (System: Vertica, Database: SQL Class) with the Join Hub System set to Teradata. Three tables from three systems are shown, each with an "Add Join" menu and a "Select *" checkbox.]
ADDRESSES (T): Street Varchar(30), City Varchar(20), State Char(2), Zip Integer, AreaCode Smallint, Phone Char(15), Subscriber_No Integer
SUBSCRIBERS (O): Last_Name Varchar(20), First_Name Varchar(20), Gender Char(1), SSN Integer, Member_No Smallint, Subscriber_No Integer
CLAIMS (V): Claim_Id Integer, Claim_Date Date, Subscriber_No Integer, Member_No Smallint, Claim_Amt Decimal(12,2), Provider_No Integer, Claim_Service Integer
Did you ever even imagine that you would be able to join tables from different systems? Above, we are joining a Teradata table to an Oracle table to a Vertica table. Just checkmark the columns you want on your report and Nexus handles everything behind the scenes. Nexus builds the SQL, converts the table structures and moves the tables to the Hub system. The report comes flying back with no intervention from the user.
Why is Nexus Special? The Amazing Hub System
[Screenshot: the Super Join Builder in Nexus Chameleon (System: Vertica, Database: SQL Class) showing the ADDRESSES (T), SUBSCRIBERS (O) and CLAIMS (V) tables with their columns, with the Join Hub System set to Vertica.]
Nexus allows the user to select which system they want to process the data. We just changed the Hub to Vertica (above). Now, the tables from the three systems will be converted, moved and joined on the Vertica system. Nexus allows you to join tables from any system in your enterprise (both on-premises and cloud), but then tops it off by allowing you to process the joins on any system in your enterprise. Need extra processing power? Spin up a server on the cloud and make it the hub. Simply amazing!
Why is Nexus Special? Save Answer Sets as Tables
[Screenshot: the same three tables (ADDRESSES, SUBSCRIBERS and CLAIMS) in the Super Join Builder, with the Join Hub System set to SQL Server and the Create Table button next to Execute.]
Nexus allows you to run a join or any query in the Super Join Builder and save the answer set as a table on any system in your enterprise (on-premises or cloud). This is a fantastic way to create a data mart or to save data to your sandbox. It is also great for building a test system. And Nexus makes it so easy. Just click on the "Create Table" button and tell Nexus the system and the table name you desire, and that new table and its data will be there when the query is finished.
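Saving an answer set as a table is conceptually close to a CREATE TABLE ... AS SELECT statement. The sketch below shows that idea with Python's sqlite3 module. The table name Sales_By_Product is hypothetical, the column types for Sales_Table are assumed, and the six rows are the Product_ID, Sale_Date and Daily_Sales values that appear in this chapter's Garden of Analysis examples. The source does not state which statement Nexus actually issues.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL)"
)
conn.executemany(
    "INSERT INTO Sales_Table VALUES (?, ?, ?)",
    [
        (3000, "09/28/2000", 61301.77),
        (1000, "09/28/2000", 48850.40),
        (3000, "09/29/2000", 34509.13),
        (1000, "09/29/2000", 54500.22),
        (3000, "09/30/2000", 43868.86),
        (1000, "09/30/2000", 36000.07),
    ],
)

# Run a query and keep its answer set as a new table in one statement.
# Sales_By_Product is a made-up name for the saved answer set.
conn.execute("""
    CREATE TABLE Sales_By_Product AS
    SELECT Product_ID, SUM(Daily_Sales) AS Total_Sales
    FROM Sales_Table
    GROUP BY Product_ID
""")

# The new table and its data are there when the query is finished.
saved = conn.execute(
    "SELECT Product_ID, ROUND(Total_Sales, 2) FROM Sales_By_Product ORDER BY Product_ID"
).fetchall()
```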
Why is Nexus Special? Automated Data Movement
[Screenshot: the data movement window (Database Movement, Table Movement, Options and Log) with an Execute button. Source: System Teradata V15, Database SQL_Class. Target: System Vertica, Database SQL_Sandbox.]
Source Tables (SQL_Class): Addresses, Claims, Course_Table, Customer_Table, Employee_Table, Order_Table, Providers, Sales_Table
Target Tables (SQL_Sandbox): Addresses (OK), Claims (OK), Course_Table (OK), Customer_Table, Employee_Table, Order_Table, Providers, Sales_Table
Do you know how long it normally takes to convert table structures and move data between systems? Sometimes this is a month-long project, but not with Nexus. Just right click on any database in your systems tree and choose "Move Data". Pick the target system to which you want to move the tables and data, and then checkmark the tables you want to move. Press the blue arrow and the tables with a checkmark move to the Target system. Press Execute and watch the tables light up in green with every successful move. If there is a problem, the table(s) will light up in red.
Why is Nexus Special? Nexus makes the Servers Talk Directly
[Diagram: Vertica, Teradata, Hadoop, Oracle and Redshift servers.]
Whenever Nexus converts and moves data from one system to another, the data never lands in a landing zone. When Nexus performs cross-system joins, the data moves directly from server to server. Whether you install Nexus on your desktop, laptop or Citrix server, or access Nexus via a remote desktop on the cloud, no data passes through Nexus. All data movement, conversions and cross-system joins are done directly between the big systems. Nexus uses Progress DataDirect drivers to establish a connection between systems, and this allows the big systems to talk directly. This is the secret sauce that allows Nexus to move billions of rows from one system to another!
What Makes Nexus Special? The Garden of Analysis
[Screenshot: the Garden of Analysis with the tabs Welcome, Aggregate, OLAP, Rank, Grouping Sets, Quantiles, Top, Sort, Join, Charts/Graphs, Dynamic Charts and Dashboard, and four answer set tabs, Result 1 to Result 4. Result 4 is shown:]
Product_ID  Sale_Date   Daily_Sales
3000        09/28/2000  61301.77
1000        09/28/2000  48850.40
3000        09/29/2000  34509.13
1000        09/29/2000  54500.22
3000        09/30/2000  43868.86
1000        09/30/2000  36000.07
Nexus can query all of the systems in your enterprise, and each answer set is automatically available in the Garden of Analysis. This is a unique concept because any and all answer sets can be re-queried by Nexus inside your desktop. Above, we have four answer sets. We right clicked on answer set 4 (Result 4 – in red). Now, we can click on any tab above and get Aggregates, OLAP, Rank, Grouping Sets, Quantiles, Top, Sort, Join, Charts/Graphs, Dynamic Charts or a Dashboard. All you have to do is press on a table and drag columns in the answer set to the templates, and you immediately see the new data.
The Garden of Analysis – Grouping Sets Tab
[Screenshot: the Grouping Sets tab with three drop down menus (Products, Date Column (Optional) and Sum), each listing Product_Id, Sale_Date and Daily_Sales; Options for Grouping Sets, Rollup and Cube; Date Extraction choices Year and Month; and a Create button. Result 4 (Product_ID, Sale_Date, Daily_Sales) is the active result set. Numbered callouts 1 to 5 mark the steps below: "Right click on Result 4 tab and choose Set as active Result Set", "Choose the Product_ID from the drop down", "Choose the date column from the drop down" and "Choose the column you want to sum from the drop down".]
In five easy steps we got three new reports. 1) We made Result 4 the active result set by right clicking on the Result 4 tab and choosing "Set as active Result Set". We then chose the Grouping Sets tab at the top (red circle). 2) We clicked on the Products drop down menu and chose the column Product_Id. 3) We clicked on the Date Column drop down menu and chose Sale_Date. 4) We clicked on the Sum drop down menu and chose Daily_Sales. Our options already had Grouping Sets, Rollup and Cube pre-selected. 5) We hit the Create button. Watch what happens on the next page!
The Garden of Analysis – Grouping Sets Answer Sets
[Screenshot: three new answer set tabs (Grouping sets, Cube and Rollup) next to Result 1 to Result 4. Callouts: "This report depicts the actual results of the Grouping Sets" and "3 new reports were created with the PC doing all the work". The Grouping sets answer set is shown, with ? marking a null:]
Product_ID  MTH  YR    Sum_Daily_Sales
3000        ?    ?     224587.82
2000        ?    ?     306611.81
1000        ?    ?     331204.72
?           10   ?     443634.99
?           9    ?     418769.36
?           ?    2000  862404.35
We instantly received three more answer sets (in yellow and pink) for Grouping Sets, Group by Cube and Group by Rollup. What is truly intelligent is that we re-queried Result 4 to get the three new reports and did so inside Nexus. The data warehouse was not re-queried, but instead the Nexus used the processor and memory inside the PC to calculate the analytics. All answer sets are saved to the Garden of Analysis so users can get additional reports by merely clicking on the appropriate tab and selecting the columns they want in the varying templates. Why not have your own data warehouse inside Nexus?
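To make the three report types concrete, here is a minimal pure-Python sketch of how Grouping Sets, Rollup and Cube totals can be computed from an answer set that is already on the PC. It is an illustration of the general technique, not Nexus's own code. It uses only the six Result 4 rows shown earlier, so its totals are smaller than the figures in the screenshot, which evidently come from a larger answer set. The function names and the MTH and YR column names (taken from the screenshot) are choices made for this example.

```python
from collections import defaultdict
from itertools import combinations

COLUMNS = ("Product_ID", "MTH", "YR")

# The six Result 4 rows: Product_ID, Sale_Date (MM/DD/YYYY), Daily_Sales.
ROWS = [
    (3000, "09/28/2000", 61301.77),
    (1000, "09/28/2000", 48850.40),
    (3000, "09/29/2000", 34509.13),
    (1000, "09/29/2000", 54500.22),
    (3000, "09/30/2000", 43868.86),
    (1000, "09/30/2000", 36000.07),
]


def aggregate(rows, grouping_sets):
    """Sum Daily_Sales once per grouping set; ungrouped columns become None."""
    totals = defaultdict(float)
    for product_id, sale_date, daily_sales in rows:
        values = {
            "Product_ID": product_id,
            "MTH": int(sale_date[0:2]),
            "YR": int(sale_date[6:10]),
        }
        for cols in grouping_sets:
            key = tuple(values[c] if c in cols else None for c in COLUMNS)
            totals[key] += daily_sales
    return {key: round(total, 2) for key, total in totals.items()}


def rollup(cols):
    """(a, b, c) becomes the sets (a, b, c), (a, b), (a,) and ()."""
    return [cols[:n] for n in range(len(cols), -1, -1)]


def cube(cols):
    """Every subset of the columns, from all of them down to none."""
    return [s for n in range(len(cols), -1, -1) for s in combinations(cols, n)]


# One total per individual column, like GROUPING SETS (Product_ID, MTH, YR).
grouping_sets_report = aggregate(ROWS, [(c,) for c in COLUMNS])
rollup_report = aggregate(ROWS, rollup(COLUMNS))
cube_report = aggregate(ROWS, cube(COLUMNS))
```

Each key is a (Product_ID, MTH, YR) tuple in which None plays the role of the ? shown in the screenshot; for example, `rollup_report[(None, None, None)]` is the grand total.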
The Garden of Analysis – Join Tab (1 of 4)
[Screenshot: the Garden of Analysis showing two answer sets. Callouts: "Left click on the Join Tab" and "If you hit the blue button you will leave the Garden and be back to your main Nexus screen".]
Result 1
Order_Number  Customer_Number  Order_Date  Order_Total
123512        11111111         01/01/1999  8005.91
123456        11111111         05/04/1998  12347.53
123552        31323134         10/01/1999  5111.47
123777        57896883         09/09/1999  23454.84
123585        87323456         10/10/1999  15231.62
Result 2
Customer_Number  Customer_Name        Phone_Number
11111111         Billy's Best Choice  555-1234
31313131         Acme Products        555-1111
31323134         ACE Consulting       555-1212
57896883         XYZ Plumbing         347-8954
87323456         Databases N-U        322-1012
To begin joining answer sets together, click on the Join tab (at the top). Notice that we have two answer sets that are named Result 1 and Result 2. Turn the page and see what happens next.
The Garden of Analysis – Join Tab (2 of 4)
[Screenshot: the Join tab with drop down menus for Left Table, Right Table, Left Table Join Column(s) and Right Table Join Column(s); an Options area with the Join Type choices Inner Join, Left (Outer) Join, Right (Outer) Join and Full Join; and Clear and Create buttons. Numbered callouts 1 to 7 mark the steps below.]
In seven easy steps you can join two answer sets. 1) Choose the Join tab at the top (red circle). 2) Choose the Left Table from the Left Table drop down (one of the answer sets). 3) Choose the Right Table from the Right Table drop down (one of the answer sets). 4) Choose the Left Table Join Column from the menu. 5) Choose the Right Table Join Column from the menu. 6) Pick the join type you want from the Options menu. 7) Hit the Create button. Turn the page to see the actual choices.
The Garden of Analysis – Join Tab (3 of 4)
[Screenshot: the Join tab filled in. Left Table: Result 1. Right Table: Result 2. Left Table Join Column(s): Customer_Number. Right Table Join Column(s): Customer_Number. Join Type options: Inner Join, Left (Outer) Join, Right (Outer) Join and Full Join. Buttons: Clear and Create.]
We chose the Result 1 answer set for the Left Table from the Left Table drop down menu. We chose the Result 2 answer set for the Right Table from the Right Table drop down menu. We chose the Customer_Number column from the Left Table Join Column drop down menu. We chose the Customer_Number column for the Right Table Join Column from the Right Table Join Columns drop down menu. We kept the Join Type of Inner Join. We hit the Create button. Turn the page to see the results.
The Garden of Analysis – Join Tab (4 of 4)
[Screenshot: a new answer set tab named Join next to Result 1 to Result 4:]
Order_Number  Customer_Number  Order_Date  Order_Total  Customer_Number2  Customer_Name        Phone_Number
123552        31323134         10/01/1999  5111.47      31323134          ACE Consulting       555-1212
123777        57896883         09/09/1999  23454.84     57896883          XYZ Plumbing         347-8954
123512        11111111         01/01/1999  8005.91      11111111          Billy's Best Choice  555-1234
123456        11111111         05/04/1998  12347.53     11111111          Billy's Best Choice  555-1234
123585        87323456         10/10/1999  15231.62     87323456          Databases N-U        322-1012
The two answer sets are now joined together. You can join answer sets from different systems just as easily.
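For readers who want to see what joining two answer sets on the client involves, the following is a minimal hash-join sketch in Python. It uses the Result 1 and Result 2 rows from this example and mimics the Customer_Number2 naming visible in the joined answer set. The function name and its parameters are invented for illustration and are not part of Nexus.

```python
RESULT_1 = [
    {"Order_Number": 123512, "Customer_Number": 11111111, "Order_Date": "01/01/1999", "Order_Total": 8005.91},
    {"Order_Number": 123456, "Customer_Number": 11111111, "Order_Date": "05/04/1998", "Order_Total": 12347.53},
    {"Order_Number": 123552, "Customer_Number": 31323134, "Order_Date": "10/01/1999", "Order_Total": 5111.47},
    {"Order_Number": 123777, "Customer_Number": 57896883, "Order_Date": "09/09/1999", "Order_Total": 23454.84},
    {"Order_Number": 123585, "Customer_Number": 87323456, "Order_Date": "10/10/1999", "Order_Total": 15231.62},
]
RESULT_2 = [
    {"Customer_Number": 11111111, "Customer_Name": "Billy's Best Choice", "Phone_Number": "555-1234"},
    {"Customer_Number": 31313131, "Customer_Name": "Acme Products", "Phone_Number": "555-1111"},
    {"Customer_Number": 31323134, "Customer_Name": "ACE Consulting", "Phone_Number": "555-1212"},
    {"Customer_Number": 57896883, "Customer_Name": "XYZ Plumbing", "Phone_Number": "347-8954"},
    {"Customer_Number": 87323456, "Customer_Name": "Databases N-U", "Phone_Number": "322-1012"},
]


def join_answer_sets(left, right, left_col, right_col, how="inner"):
    """Hash join two lists of dict rows; how is "inner" or "left"."""
    # Index the right-hand rows by their join value.
    index = {}
    for row in right:
        index.setdefault(row[right_col], []).append(row)
    right_cols = list(right[0]) if right else []

    joined = []
    for lrow in left:
        matches = index.get(lrow[left_col], [])
        if not matches and how == "left":
            # Left outer join: keep the left row and pad the right side with None.
            matches = [dict.fromkeys(right_cols)]
        for rrow in matches:
            # Rename clashing right-hand columns, e.g. Customer_Number2.
            renamed = {(k + "2" if k in lrow else k): v for k, v in rrow.items()}
            joined.append({**lrow, **renamed})
    return joined


joined = join_answer_sets(RESULT_1, RESULT_2, "Customer_Number", "Customer_Number")
```

With the inner join, Acme Products (31313131) drops out because it has no orders, and Billy's Best Choice appears twice because it has two, which matches the five rows in the joined answer set above.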
The Garden of Analysis – Charts/Graphs Tab (1 of 4)
[Screenshot: Result 4 (Product_ID, Sale_Date, Daily_Sales) with its right-click menu: Set as Active Result Set, Rename Tab, Print Result Set, Save Result Set to Garden, Remove Result Set, Export Result Set. Callouts: 1) "Right click on any Result tab you want to work with and choose "Set as Active Result Set"", 2) "Left click on the Charts/Graphs Tab", and "If you hit the blue button you will leave the Garden and be back to your main Nexus screen".]
To get charts and graphs for an answer set do the following two steps. 1) Right click on the answer set you want to work with and choose "Set as Active Result Set". That answer set is now ready to be placed into graphs and charts. 2) Click on the Charts/Graphs Tab (at the top). Turn the page and see what happens next.
The Garden of Analysis – Charts/Graphs Tab (2 of 4)
[Screenshot: the Charts/Graphs tab with drop down menus for Values (Y Axis), Labels (X axis), Sort By (Optional) and Partition By (Optional); an Options area with Basic Chart Types (Pie Chart, Line Chart, Column Chart, All Types), Advanced, Visual Type (Flat (2D), 3D) and Basic Chart Values (Sum); and Clear and Create buttons. Numbered callouts 1 to 8 mark the steps below.]
In eight easy steps you can graph and chart an answer set with 35 different graphs and charts. 1) Choose the Charts/Graphs tab at the top (red circle). 2) Choose the column you want as the Y axis from the Values (Y Axis) drop down menu. 3) Choose the column you want as the X axis from the Labels (X axis) drop down menu. 4) Choose the column you want to sort by from the Sort By (Optional) drop down menu. 5) Choose the optional Partition By column from the drop down menu. 6) Pick the type of chart from the Basic Chart Types options. 7) Pick the Visual Type of chart you want. 8) Hit the Create button. Turn the page to see the actual choices.
The Garden of Analysis – Charts/Graphs Tab (3 of 4)
[Screenshot: the Charts/Graphs tab filled in. The drop down selections shown are Daily_Sales, Sale_Date, Sale_Date (ASC) and Product_ID (ASC). In the Options area, All Types and All Charts are chosen, and Basic Chart Values is Sum.]
You can see the columns above that we chose. Because we chose All Types and All Charts in the options tab we got 35 charts and each was placed in the Dashboard. Turn the page to see one of the many charts.
The Garden of Analysis – Charts/Graphs Tab (4 of 4)
[Chart: y-axis values from 15675.3 to 64300.0, x-axis dates from 9/28/2000 to 10/4/2000, legend entries 1000 and 2000.]
The chart above represents the answer set. This chart has been placed in the dashboard along with 35 other charts depicting the same answer set and parameters.
The Garden of Analysis – Dynamic Charts Tab (1 of 4)
[Screenshot: Result 4 (Product_ID, Sale_Date, Daily_Sales) with its right-click menu: Set as Active Result Set, Rename Tab, Print Result Set, Save Result Set to Garden, Remove Result Set, Export Result Set. Callouts: 1) "Right click on any Result tab you want to work with and choose "Set as Active Result Set"" and 2) "Left click on the Dynamic Charts Tab".]
To get Dynamic Charts for an answer set do the following two steps. 1) Right click on the answer set you want to work with and choose "Set as Active Result Set". That answer set is now ready to be placed in a Dynamic Chart. 2) Click on the Dynamic Charts Tab (at the top). Turn the page and see what happens next.
The Garden of Analysis – Dynamic Charts Tab (2 of 4)
[Screenshot: the Dynamic Charts tab with a Load Visualization button.]
You need to take two steps. 1) Make sure you have selected the Dynamic Charts tab at the top (red circle). 2) Hit the Load Visualization button. Turn the page to see what happens next.
The Garden of Analysis – Dynamic Charts Tab (3 of 4)
[Screenshot: the Dynamic Charts workspace with its own menu bar (File, Edit, View, Data, Layout, Help), an Execute button and a Close Session button. The Data pane lists Sale_Date under Attributes and Product_ID, Daily_Sales and Values under Measures. The drop areas are labelled Pages, Rows, Columns, Filters, Measures and Level of Detail, and an Encodings area includes Label, Color and Size.]
Dynamic Charts allow you to drag and drop columns from the Attributes and Measures area. As you drag and drop, the charts dynamically change. Above, you can see the Sale_Date (in red), the Product_ID (in blue) and the Daily_Sales column (in pink). The next page will show how we drag and drop these Attributes and Measures to get our chart.
The Garden of Analysis – Dynamic Charts Tab (4 of 4)
[Screenshot: the Dynamic Charts workspace after Sale_Date, Product_ID and Daily_Sales have been dragged onto the drop areas, with Daily_Sales in Level of Detail. The resulting chart plots Daily_Sales (y-axis, 10K to 160K) by Sale_Date (x-axis, 9/28/2000 to 10/4/2000).]
Notice how we dragged the attributes and measures to the varying parts of the application. Because we put the Daily_Sales column in the level of detail, the actual Daily_Sales will appear if you hover your mouse over any of the bars.
The Garden of Analysis – Dashboard Tab (1 of 5)
[Screenshot: the Garden of Analysis tabs with the callout "Left click on the Dashboard Tab".]
Once you create and save graphs and charts to the dashboard you can view your graphs and charts in varying ways. Just left click on the Dashboard Tab at the top. Turn the page and see what happens next.
The Garden of Analysis – Dashboard Tab (2 of 5)
[Screenshot: the Dashboard with the view buttons Slideshow, Thumbnails, Scroll and Compare. In the Slideshow view, Seconds to Display is set to 1 and a Pause button is shown. Callout: "Every 1 second another slide displays, until you hit Pause". The slide on display is a chart with y-axis values from 15675.3 to 64300.0, dates from 9/28/2000 to 10/4/2000 and legend entries 1000 and 2000.]
The slideshow will display the many graphs one at a time in intervals of seconds. Hit the pause button to stop and examine any particular slide.
The Garden of Analysis – Dashboard Tab (3 of 5)
[Screenshot: the Dashboard in the Scroll view, with a Pause button and a chart (legend entries 1000, 2000 and 3000) moving across the screen. Callout: "The graphs and charts will scroll across the screen".]
The Scroll view moves the graphs and charts across the screen from right to left. Hit the Pause button to stop and examine any particular graph or chart. Use the speed bar to speed up or slow down the scrolling.
The Garden of Analysis – Dashboard Tab (4 of 5)
[Screenshot: the Dashboard in the Compare view. Two drop down menus select the charts to compare: "Result 6 Column Chart" with "Result 6 Pie Chart (3D)". The column chart (legend entries 1000, 2000 and 3000) is shown next to a pie chart titled Sales with slices 1st Qtr, 2nd Qtr, 3rd Qtr and 4th Qtr.]
The Compare view allows you to compare two different charts. Use the drop down menus to pick chart 1 and chart 2.
The Garden of Analysis – Dashboard Tab (5 of 5)
[Screenshot: the Dashboard in the Thumbnails view, showing small versions of the saved charts. The menu at the top offers Send Selected Graphs to Garden Tabs, Compare Selected Graphs and Delete Selected Graphs.]
The thumbnails show all of the graphs and charts in your dashboard. This gives you a broad view that allows you to double click on any thumbnail and see it in actual size. You can also use the menu (at the top) to send selected graphs to Garden Tabs, Compare Selected Graphs, or Delete Selected Graphs.
Getting to the Super Join Builder
[Screenshot: the main Nexus Chameleon window (System: Vertica, Database: SQL_Class) with the Query 1, Messages and Garden of Analysis panes. The systems tree shows Vertica, SQL_Class, Tables: Addresses, Claims, Customer_Table, Employee_Table, Department_Table, Order_Table, Providers, Sales_Table, Services. Callout: "Right Click on any table and choose Super Join Builder".]
Right click on any table in your systems tree and choose Super Join Builder. You will be placed inside the Super Join Builder.
The Super Join Builder is the First Entry in the Menu
[Screenshot: the right-click menu for a table in the systems tree. Menu items: Super Join Builder, Quick Select, View DDL, Move data to Oracle, Move data to SQL Server, Move data to Teradata, Move data to Azure SQL Data Warehouse, SmartScript, Hound Dog, Compression, Compare/Sync Data. Callout: "Choose Super Join Builder from the menu".]
Right click on any table in your systems tree and a menu will appear. Choose Super Join Builder (top menu item) and the table you selected will be placed inside the Super Join Builder.
The Super Join Builder Shows Tables Visually
[Screenshot: the Super Join Builder opened on Customer_Table (Vertica icon V). Above the workspace are the Execute, Create Table and Preview SQL in Nexus buttons, the tabs Objects, Columns, Sorting, Joins, WHERE, SQL, Metadata and Analytics, and the Join Hub System selector (set to Teradata). The table box has an Add Join button and a checkbox for Select * and for every column.]
Customer_Table: Customer_Number Integer, Customer_Name Varchar(20), Phone_Number Char(8)
This is exactly what you will see when you first enter the Super Join Builder. You will see your table, its columns and the data types of each column. Notice the table name (Customer_Table) and the Vertica icon of V. Turn the page for more.
Using the Add Join Button
[Screenshot: Customer_Table in the Super Join Builder. Callout: "Press the Add Join button to see what this table can Join to".]
One of the greatest features of the Super Join Builder is the Add Join drop down menu. Press the drop down and the menu will show you what other tables or views can be joined to this table.
What to Do When No Tables are Joinable?
[Screenshot: Customer_Table in the Super Join Builder with the Add Join menu open. The menu reads "Could not identify any joins to this object." Callout: "When no joins have been defined yet, you will get the above message in the menu".]
When you click on the Add Join drop down menu, you might receive a message that says, "Could not identify any joins to this object". This means that you haven't yet told Nexus what joins to this table. We are about to fix that. The next page shows you how to define joins so that the menu will recognize them.
Drag a Joinable Object into the Super Join Builder
[Screenshot: Customer_Table in the Super Join Builder next to the systems tree. Callout: "Left click on the table you want to join and drag it into the Super Join Builder".]
If you want to define how a table joins to another table in the Super Join Builder, you merely left click on the table in the tree and drag it into the Super Join Builder area.
You will see the Add Custom Join Window
[Screenshot: the Add Custom Join window opened over the Super Join Builder after Order_Table is dragged in. Join Type: Inner. The two tables are shown side by side, and the join text so far reads "SQL_Class.Customer_Table cus INNER JOIN SQL_Class.Order_Table ord". Buttons: Reset and Add Join.]
SQL_Class.Customer_Table: Customer_Number Integer, Customer_Name Varchar(20), Phone_Number Char(8)
SQL_Class.Order_Table: Order_Number Integer, Customer_Number Integer, Order_Date Date, Order_Total Decimal(10,2)
When you drag a second table into the Super Join Builder the Add Custom Join window appears. This is the window where you will define the join conditions. Turn the page and watch how we define the columns that join the tables together.
Defining the Join Columns
[Screenshot: the Add Custom Join window (Join Type: Inner) with SQL_Class.Customer_Table and SQL_Class.Order_Table side by side. Numbered callouts mark the four steps:]
1) Left click the join column on the first table
2) Left click the join column on the second table
3) Left click the blue arrow to establish the join condition
4) Hit the Add Join Button
The join text in the window now reads:
SQL_Class.Customer_Table cus INNER JOIN SQL_Class.Order_Table ord ON cus.Customer_Number = Ord.Customer_Number
In four easy steps you can define the join conditions. Nexus will remember this join the next time you use the table.
Your Tables Will Appear Together
[Screenshot: Customer_Table and Order_Table (both with the Vertica icon V) side by side in the Super Join Builder, connected by a line between their join columns.]
Customer_Table: Customer_Number Integer, Customer_Name Varchar(20), Phone_Number Char(8)
Order_Table: Order_Number Integer, Customer_Number Integer, Order_Date Date, Order_Total Decimal(10,2)
Now that you have defined your tables and the join conditions the tables will appear together in the Super Join Builder. Notice the line connecting the tables points to the columns from both tables that are their respective join conditions.
Select the Columns You Want on the Report
[Screenshot: Customer_Table and Order_Table in the Super Join Builder with column checkboxes. Callout: "Click on the columns you want on the report and Nexus will build the SQL automatically".]
Put a checkmark in the column boxes for all columns you want on the report and Nexus will build the SQL automatically. We have checked the Customer_Number, Customer_Name, Order_Date and Order_Total columns.
Check out the SQL Tab to See the SQL that has been built
[Screenshot: Customer_Table and Order_Table in the Super Join Builder. Callouts: "You are currently in the Objects Tab" and "Click on the SQL Tab to see the SQL".]
When you see your tables in the Super Join Builder you are in the Objects Tab. Now that you have put a checkmark on the columns you want on your report, click on the SQL tab to see the SQL that Nexus has automatically built for you.
SQL Tab
[Screenshot: the SQL tab of the Super Join Builder, showing the generated statement:]
SELECT cus.Customer_Number, cus.Customer_Name, ord.Order_Date, ord.Order_Total
FROM SQL_Class.customer_table cus
INNER JOIN SQL_Class.order_table ord
ON cus.Customer_Number = Ord.Customer_Number ;
The SQL has automatically been built for you because you put a checkmark on the columns you wanted to see on the report. You can hit the Execute button (above the Objects tab) or you can Preview SQL in Nexus. We will show you both options next.
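As a rough sketch of what "building the SQL from checkmarks" involves, the Python function below assembles a two-table SELECT from a list of checked columns and a stored join definition. The function name, its parameters and its data shapes are hypothetical and chosen only for this illustration; the source does not describe how Nexus generates SQL internally.

```python
def build_select(checked, left, right, join_columns, join_type="INNER JOIN"):
    """Build a two-table SELECT from checked columns and a join definition.

    checked:      list of (alias, column) pairs, in the order they were checked.
    left, right:  (qualified_table_name, alias) pairs.
    join_columns: (left_column, right_column) used in the ON clause.
    """
    select_list = ", ".join(f"{alias}.{column}" for alias, column in checked)
    (left_table, left_alias), (right_table, right_alias) = left, right
    left_col, right_col = join_columns
    return (
        f"SELECT {select_list} "
        f"FROM {left_table} {left_alias} "
        f"{join_type} {right_table} {right_alias} "
        f"ON {left_alias}.{left_col} = {right_alias}.{right_col} ;"
    )


# The four columns checked in this chapter's example.
sql = build_select(
    checked=[
        ("cus", "Customer_Number"),
        ("cus", "Customer_Name"),
        ("ord", "Order_Date"),
        ("ord", "Order_Total"),
    ],
    left=("SQL_Class.customer_table", "cus"),
    right=("SQL_Class.order_table", "ord"),
    join_columns=("Customer_Number", "Customer_Number"),
)
```

The resulting string has the same shape as the statement on the SQL tab: a select list made of the checked columns, followed by the join that was defined in the Add Custom Join window.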
Hit Execute to get the Report inside the Super Join Builder
[Screenshot: the SQL tab showing the same SELECT statement that joins SQL_Class.customer_table to SQL_Class.order_table, with the Execute button above the Objects tab.]
If you click the Execute button (above the Objects tab) the report will come back inside the Super Join Builder. Turn to the next page and see the report.
The Report is delivered inside the Super Join Builder
[Screenshot: the Super Join Builder query tab with the generated SELECT statement at the top and the answer set (Result 1) below it, next to the Messages and Garden of Analysis tabs:]
   Customer_Number  Customer_Name        Order_Date  Order_Total
1  31323134         ACE Consulting       10/01/1999  5111.47
2  57896883         XYZ Plumbing         09/09/1999  23454.84
3  11111111         Billy's Best Choice  05/04/1998  12347.53
4  11111111         Billy's Best Choice  01/01/1999  8005.91
5  87323456         Databases N-U        10/10/1999  15231.62
The report is delivered.
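If you want to try the generated statement without a Vertica system at hand, one option is Python's built-in sqlite3 module. This is a stand-in under stated assumptions: SQLite is not Vertica, the attached in-memory database named SQL_Class exists only so that the schema qualifier resolves, and the rows are the Customer_Table and Order_Table values that appear in this chapter's answer sets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database so the SQL_Class qualifier resolves.
conn.execute("ATTACH DATABASE ':memory:' AS SQL_Class")
conn.execute("""
    CREATE TABLE SQL_Class.customer_table (
        Customer_Number INTEGER, Customer_Name VARCHAR(20), Phone_Number CHAR(8))
""")
conn.execute("""
    CREATE TABLE SQL_Class.order_table (
        Order_Number INTEGER, Customer_Number INTEGER,
        Order_Date DATE, Order_Total DECIMAL(10,2))
""")
conn.executemany(
    "INSERT INTO SQL_Class.customer_table VALUES (?, ?, ?)",
    [
        (11111111, "Billy's Best Choice", "555-1234"),
        (31313131, "Acme Products", "555-1111"),
        (31323134, "ACE Consulting", "555-1212"),
        (57896883, "XYZ Plumbing", "347-8954"),
        (87323456, "Databases N-U", "322-1012"),
    ],
)
conn.executemany(
    "INSERT INTO SQL_Class.order_table VALUES (?, ?, ?, ?)",
    [
        (123512, 11111111, "01/01/1999", 8005.91),
        (123456, 11111111, "05/04/1998", 12347.53),
        (123552, 31323134, "10/01/1999", 5111.47),
        (123777, 57896883, "09/09/1999", 23454.84),
        (123585, 87323456, "10/10/1999", 15231.62),
    ],
)

# The statement from the SQL tab, unchanged apart from line breaks.
report = conn.execute("""
    SELECT cus.Customer_Number, cus.Customer_Name, ord.Order_Date, ord.Order_Total
    FROM SQL_Class.customer_table cus
    INNER JOIN SQL_Class.order_table ord
    ON cus.Customer_Number = Ord.Customer_Number
""").fetchall()
```

The statement has no ORDER BY, so the row order is not guaranteed; sort the list if you want to compare it with the report above.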
Let's Join Two Tables Again (1 of 6)
[Screenshot: the main Nexus Chameleon window (System: Vertica, Database: SQL_Class) with the systems tree showing the SQL_Class tables. Callout: "Right Click on any table and choose Super Join Builder".]
Right click on the table you previously defined in your systems tree and choose Super Join Builder. You will be placed inside the Super Join Builder. The next page will show the Right Click menu. Do you remember where the Super Join Builder is?
Page 118
Let's Join Two Tables Again (2 of 6)
[Screenshot: the right-click menu on a table in the Vertica SQL_Class systems tree, with the callout "Choose Super Join Builder from the menu". Super Join Builder is the top item, followed by Quick Select, View DDL and several "Move data to" options (Oracle, SQL Server, Teradata, Azure SQL Data Warehouse).]
The Super Join Builder is always the top menu item. Select that and you will be placed inside the Super Join Builder.
Page 119
Let's Join Two Tables Again (3 of 6)
[Screenshot: the Objects tab of the Super Join Builder, showing Customer_Table (V icon for Vertica) with its Add Join menu and Select * check box.]
Customer_Table: Customer_Number Integer, Customer_Name Varchar(20), Phone_Number Char(8)
You once again see your table, its columns and the data types of each column. Watch what happens when we hit the Add Join drop down menu now.
Page 120
Let's Join Two Tables Again (4 of 6)
[Screenshot: the Objects tab with Customer_Table and the callout "Press the Add Join button to see what this table can join to".]
What do you think will be in the Add Join drop down menu this time? I bet you already guessed it. Turn the page.
Page 121
Let's Join Two Tables Again (5 of 6)
[Screenshot: the Add Join drop down menu on Customer_Table now lists SQL_Class.Order_Table, with the callout "You now have the Order_Table in the menu. Choose it."]
Since you previously defined the relationship between the Customer_Table and the Order_Table, the Add Join drop down menu will list these tables as joinable. Just use the menu to select the Order_Table and both tables will appear side by side in the Super Join Builder. Turn the page and see for yourself.
Page 122
Let's Join Two Tables Again (6 of 6)
[Screenshot: Customer_Table and Order_Table side by side in the Objects tab, connected by a line.]
Customer_Table: Customer_Number Integer, Customer_Name Varchar(20), Phone_Number Char(8)
Order_Table: Order_Number Integer, Customer_Number Integer, Order_Date Date, Order_Total Decimal(10,2)
Now that you have defined your tables and the join conditions, the tables will appear together in the Super Join Builder. Notice that the line connecting the tables points to the columns used in the join condition.
Page 123
The Tabs of the Super Join Builder Philosophy – One Query
[Screenshot: the Super Join Builder toolbar (Execute, Create Table, Preview SQL in Nexus, Join Hub System: Vertica) and its tabs: Objects, Columns, Sorting, Joins, WHERE, SQL, Metadata, Analytics.]
The tabs above work as a team on a single query. Each tab is designed for a different purpose. The next series of slides will explain how to use each tab effectively, in order to build a single query quickly and efficiently. Each time you change something in a tab, the SQL being built is changed to build the query as you desire.
The Super Join Builder is one of the most intricate pieces of commercial software ever built. Each tab above performs a different function so that you can quickly build the most efficient query possible.
Page 124
The Tabs of the Super Join Builder – Objects Tab
[Screenshot: the Objects tab with Customer_Table and Order_Table, each with an Add Join menu and a Select * check box. The Cube menu on Customer_Table offers "Create cube" and "Create cube w/ columns". Callout: "The Objects tab shows your objects (tables and views) and provides a menu of what other objects are joinable."]
Customer_Table: Customer_Number Integer, Customer_Name Varchar(20), Phone_Number Char(8)
Order_Table: Order_Number Integer, Customer_Number Integer, Order_Date Date, Order_Total Decimal(10,2)
The Objects tab is the first screen you will always see when you right click on a table and choose "Super Join Builder". It shows the table you right clicked on in your systems tree (visually). You can then select from the Add Join drop down to see a menu of what other objects can be joined. If you click on an object in the Add Join drop down it will be joined. You can also click the Cube drop down menu to automatically select all objects that are joinable (instantly). Each time you checkmark a column in an object the SQL is automatically built and that column will be on your report. Of course you can always check all columns in an object by putting a checkmark in the Select * check box of an object. Above, we are joining the Customer_Table to the Order_Table and have put a checkmark on the Customer_Name, Order_Date and Order_Total.
Page 125
The Tabs of the Super Join Builder – Columns Tab
[Screenshot: the Columns tab, with a trashcan ("Drag and drop object here to remove them from the report") and two column lists.]
Report Columns: Customer_Name, Order_Date, Order_Total
Additional Columns: Customer_Number, Phone_Number, Order_Number, Customer_Number
The Columns tab allows you to see the columns on your report, and to change their order. It shows the columns you have selected on the report (at the top), because you placed a checkmark in them from the Objects tab. It also shows you the columns you did not checkmark (at the bottom). You can move the columns around, throw them in the trash, bring up columns that were not selected, and the SQL will change to reflect exactly what you want your report columns to look like.
The Columns tab allows you to rearrange columns on your report. You can drag and drop columns to change their order. Notice that the columns are color coded to match the color of the table (in the Objects tab) that they came from. Columns at the top are the ones you selected. The columns at the bottom are the ones you did not select. You can drag a column from the top into the trash can to remove it from your report, and it will reappear at the bottom. You can even move columns from the bottom up to the top and they will then be on your report.
Page 126
The Tabs of the Super Join Builder – Sorting Tab
[Screenshot: the Sorting tab, with Order_Date chosen as the sort column (Sort Type ASC) and a trashcan ("Drag and drop object here to remove them from the report"). Callout: "The Sorting tab is for the ORDER BY clause. Double click on any column below and it will be used to sort the report."]
Report Columns: Customer_Name, Order_Date, Order_Total
Additional Columns: Customer_Number, Phone_Number, Order_Number, Customer_Number
The Sorting tab allows you to double-click on any column that you want to use in the ORDER BY clause of the SQL. This sorts the report. The Report Columns list shows the columns you selected to be on your report. The Additional Columns list shows the columns you did not select to be on your report. You can, however, choose any of these columns to be a sort key. Double click on a column, or click-and-drag it up, and it will be a sort key. You can have multiple sort keys, and you can always choose either ASC or DESC mode for each one.
Page 127
The Tabs of the Super Join Builder – Joins Tab
[Screenshot: the Joins tab. The Join Type drop down menu offers INNER, LEFT, RIGHT and FULL, with INNER selected. Callouts: "The Joins tab allows you to change from Inner joins to outer joins." and "The Joins tab also gives you a visual of the joining tables and the columns in the ON Clause".]
Join Objects: SQL_Class.Customer_Table Cus joins to SQL_Class.Order_Table ORD ON Cus.Customer_Number = ORD.Customer_Number
Use the Joins tab if you want to change your joins from Inner to Outer joins. The drop down menu (red arrow) allows you to easily adjust your SQL to utilize the outer join of your choice. It is also designed to show you the tables being joined and the join column conditions in the ON CLAUSE. Above, we have decided to keep the default INNER JOIN.
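To see what the Join Type drop down actually changes, here is a minimal sketch that runs the same join once as INNER and once as LEFT. It uses Python's built-in sqlite3 module as a stand-in for Vertica, so the table layouts are trimmed down. The Order_Number values and the customer named "No Orders Yet" are invented for illustration and do not come from the SQL_Class database.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the Vertica system.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name VARCHAR(20));
CREATE TABLE Order_Table (Order_Number INTEGER, Customer_Number INTEGER,
                          Order_Total DECIMAL(10,2));
-- 99999999 is a hypothetical customer with no orders.
INSERT INTO Customer_Table VALUES
    (11111111, 'Billy''s Best Choice'),
    (31323134, 'ACE Consulting'),
    (99999999, 'No Orders Yet');
-- Order_Number values are made up; the totals come from the report above.
INSERT INTO Order_Table VALUES
    (1, 11111111, 12347.53),
    (2, 11111111, 8005.91),
    (3, 31323134, 5111.47);
""")


def run_join(join_type):
    """Run the Customer/Order join with the given join type (INNER or LEFT)."""
    sql = f"""
        SELECT Cus.Customer_Name, ORD.Order_Total
        FROM Customer_Table Cus
        {join_type} JOIN Order_Table ORD
          ON Cus.Customer_Number = ORD.Customer_Number
        ORDER BY Cus.Customer_Number, ORD.Order_Number
    """
    return conn.execute(sql).fetchall()


inner_rows = run_join("INNER")  # only customers that have orders
left_rows = run_join("LEFT")    # every customer, with None where no order exists

for row in left_rows:
    print(row)
```

The INNER join returns three rows. The LEFT join returns a fourth row for the customer without orders, with an empty Order_Total. That extra row is the only difference the drop down makes in this example.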
Page 128
The Tabs of the Super Join Builder – SQL Tab
[Screenshot: the SQL tab of the Super Join Builder, showing the generated query.]
SELECT Cus.Customer_Name, ORD.Order_Date, ORD.Order_Total
FROM SQL_Class.Customer_Table Cus
INNER JOIN SQL_Class.Order_Table ORD
ON Cus.Customer_Number = ORD.Customer_Number
ORDER BY ORD.Order_Date ASC ;
The SQL tab shows you the SQL that Nexus has automatically generated. The SQL begins being built the first time you checkmark a column in the objects tab, and changes with each change you request from any of the other tabs. We originally requested Customer_Name, Order_Date and Order_Total in the Objects tab. We sorted by Order_Date ASC in the Sorting Tab. Nexus always generates the SQL perfectly for all systems.
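The idea that several tabs feed one query can be sketched in a few lines of Python. This is only an illustration of the concept, not how Nexus is implemented: the function build_query and its parameters are hypothetical names, and the join between the two tables is hard-coded to keep the sketch short.

```python
ALLOWED_JOIN_TYPES = ("INNER", "LEFT", "RIGHT", "FULL")


def build_query(columns, sort_keys=(), join_type="INNER"):
    """Assemble one SELECT from choices made in separate 'tabs'.

    columns   -- checked columns (Objects tab), in report order (Columns tab)
    sort_keys -- (column, direction) pairs (Sorting tab)
    join_type -- INNER, LEFT, RIGHT or FULL (Joins tab)
    """
    if join_type not in ALLOWED_JOIN_TYPES:
        raise ValueError(f"unsupported join type: {join_type}")
    sql = (
        f"SELECT {', '.join(columns)} "
        "FROM SQL_Class.Customer_Table Cus "
        f"{join_type} JOIN SQL_Class.Order_Table ORD "
        "ON Cus.Customer_Number = ORD.Customer_Number"
    )
    if sort_keys:
        order_by = ", ".join(f"{col} {direction}" for col, direction in sort_keys)
        sql += f" ORDER BY {order_by}"
    return sql + " ;"


# The same choices described above: three columns, sorted by Order_Date ASC.
sql = build_query(
    ["Cus.Customer_Name", "ORD.Order_Date", "ORD.Order_Total"],
    sort_keys=[("ORD.Order_Date", "ASC")],
)
print(sql)
```

Changing any one argument, such as passing join_type="LEFT", rebuilds the whole statement, which mirrors the way each tab change is reflected in the SQL tab.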
Page 129
The Tabs of the Super Join Builder – Metadata Tab
[Screenshot: the Metadata tab with an Explain option. Callout: "The Metadata tab shows you the size of each table in your join. This becomes a strategic asset for Cross-System joins."]
Customer_Table: Table Size 5 KB, Row Count 6
Order_Table: Table Size 5 KB, Row Count 6
Explain output shown in the screenshot:
This request is eligible for incremental planning and execution (IPE) but does not meet cost thresholds. The following is the static plan for the request.
1) First, we lock a distinct SQL_CLASS."pseudo table" for read on a RowHash to prevent global deadlock for SQL_CLASS.ORD.
2) Next, we lock SQL_CLASS.ORD for read.
3) We do an all-AMPs RETRIEVE step from SQL_CLASS.ORD by way of an all-rows scan with a condition of ("SQL_CLASS.ORD.Customer_Number = 11111111") into Spool 2 (one-amp), which is redistributed by the hash code of (11111111) to all AMPs. Then we do a SORT to order Spool 2 by row hash. The size of Spool 2 is estimated with low confidence to be 2 rows (60 bytes). The estimated time for this step is 0.01 seconds.
The Metadata tab will show you the size of your table, including row counts. This will be extremely important when you begin performing cross-system joins. Nexus always thinks about performance tuning first because Nexus has been used (in production) by many of the largest companies in the world. The Metadata tab will also show you the optimizer's plan, which is often called the Explain plan, if you request it by clicking on the magnifying glass (red circle above).
Page 130
The Tabs of the Super Join Builder – Analytics Tab
[Screenshot: the Analytics tab with Sales_Table (V icon for Vertica) in the Super Join Builder. Callout: "Start the Analytics by bringing in a table to the Super Join Builder, but make sure you checkmark the columns you will be using in the Analytics tab."]
Sales_Table: Product_ID Integer, Sale_Date Date, Daily_Sales Char(8)
The Analytics tab will allow a user to quickly build the SQL needed for Ordered Analytics (OLAP), Rank, and Grouping Sets. These analytics can also be done in the Garden of Analysis after an answer set returns, but on extremely large data sets it can be advantageous to have analytics performed by the data warehouse. The Analytics tab will build the SQL for the user so the user can submit that SQL to the data warehouse to receive an answer set. Start by right clicking on a table in the system tree and choosing Super Join Builder from the right click menu. Then, checkmark the columns you will want on your report, and then go to the Analytics tab.
Page 131
The Tabs of the SJB – Analytics Tab – OLAP Screen
[Screenshot: the OLAP subtab of the Analytics tab, next to the Rank and Grouping Sets subtabs. It has OLAP, Sorting and Partitioning windows, a Moving Window setting (6), and the callouts "Drag and drop these columns" and "Place a checkmark on the OLAPs below".]
Report Columns: Product_ID, Sale_Date, Daily_Sales
OLAP Function: Select *, CSUM, MSUM, MAVG, MDIFF, COUNT, MAX, MIN
With Partitioning: Select * (P), CSUM (P), MSUM (P), MAVG (P), MAVG (C), MDIFF (P), COUNT (P), MAX (P), MIN (P)
This is usually the exact screen you see by default when you enter the Analytics tab. We are in the OLAP subtab. You will drag and drop the columns below to the appropriate window, change the Moving Window parameter, and then check the OLAP functions you desire. You can select all of the OLAP functions by checking the Select * checkbox, and you can select the With Partitioning OLAP functions as well. The With Partitioning functions reset the calculations each time the value in the Partitioning column changes.
Page 132
Getting a Simple CSUM in the Analytics Tab – OLAP
[Screenshot: the OLAP subtab with Daily_Sales in the OLAP window, Product_ID and Sale_Date in the Sorting window, Moving Window 6, and the callout "Place a checkmark on the OLAPs below".]
Report Columns: Product_ID, Sale_Date, Daily_Sales
OLAP Function: Select *, CSUM, MSUM, MAVG, MDIFF, COUNT, MAX, MIN
With Partitioning: Select * (P), CSUM (P), MSUM (P), MAVG (P), MAVG (C), MDIFF (P), COUNT (P), MAX (P), MIN (P)
In the example above, we dragged the Daily_Sales column to the OLAP box because that is the column we want to perform the calculations on. We dragged the Product_ID column to the Sorting window first, and then we dragged the Sale_Date column there as well. This means we will sort the data by Product_ID and then by Sale_Date. We also checked the CSUM OLAP function. We didn't touch the Partitioning or Moving Window information. The next page shows the SQL tab and the SQL generated.
Page 133
Getting a Simple CSUM – The SQL Automatically Generated
[Screenshot: the SQL tab of the Super Join Builder, showing the generated query.]
SELECT Sal.Product_ID, Sal.Sale_Date, Sal.Daily_Sales,
SUM(Sal.Daily_Sales) OVER (ORDER BY Sal.Product_ID ASC, Sal.Sale_Date ASC
ROWS UNBOUNDED PRECEDING) AS CSUM_Sale_Date_Sal
FROM SQL_CLASS.Sales_table Sal ;
The SQL above was automatically generated by the previous OLAP screen and was done so the second we checked the CSUM checkbox. The above query will OLAP the column Daily_Sales, but only after first sorting the data by Product_ID, Sale_Date. The Rows Unbounded Preceding will generate a Cumulative Sum. Let's check out the report on the next slide.
Page 134
The Answer Set of the CSUM
SELECT Product_ID, Sale_Date, Daily_Sales,
SUM(Daily_Sales) OVER (ORDER BY Product_ID ASC, Sale_Date ASC
ROWS UNBOUNDED PRECEDING) AS CSUM_Sale_Date_Sal
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  CSUM_Sale_Date_Sal
1000        2000-09-28     48850.40            48850.40
1000        2000-09-29     54500.22           103350.62
1000        2000-09-30     36000.07           139350.69
1000        2000-10-01     40200.43           179551.12
1000        2000-10-02     32800.50           212351.62
1000        2000-10-03     64300.00           276651.62
1000        2000-10-04     54553.10           331204.72
2000        2000-09-28     41888.88           373093.60
2000        2000-09-29     48000.00           421093.60
2000        2000-09-30     49850.03           470943.63
2000        2000-10-01     54850.29           525793.92
(Not all rows are displayed in this answer set.)

The Sales_Table was first sorted by Product_ID, Sale_Date. Then, the Cumulative Sum (CSUM) began. We made 48850.40 for the first row, so 48850.40 is our first CSUM value. Then, we made 54500.22, so 48850.40 + 54500.22 equals 103350.62. The value from each Daily_Sales entry is added to the running total in turn.
Page 135
Getting all of the OLAP functions in the Analytics Tab
[Screenshot: the OLAP subtab with Daily_Sales in the OLAP window, Product_ID and Sale_Date in the Sorting window, Product_ID in the Partitioning window, Moving Window 3, and the callout "Place a checkmark on the OLAPs below".]
Report Columns: Product_ID, Sale_Date, Daily_Sales
OLAP Function: Select *, CSUM, MSUM, MAVG, MDIFF, COUNT, MAX, MIN
With Partitioning: Select * (P), CSUM (P), MSUM (P), MAVG (P), MAVG (C), MDIFF (P), COUNT (P), MAX (P), MIN (P)
In the example above, we will OLAP the Daily_Sales column after first sorting by Product_ID, Sale_Date. We dragged the Product_ID column to the partitioning window. We changed the moving window to 3 and checked all of the OLAP and OLAP with partitioning boxes by merely clicking on SELECT *. The next page shows the SQL tab and the SQL generated.
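As a rough idea of what partitioning and a moving window of 3 mean in SQL terms, here is a hedged sketch. It is not the SQL that Nexus generates. It shows one common way to write a moving sum and moving average in standard SQL, using PARTITION BY Product_ID and ROWS 2 PRECEDING (the current row plus the two before it). It runs on SQLite 3.25 or later through Python's sqlite3 module as a stand-in for Vertica. The column aliases MSUM_3 and MAVG_3 are made up, and only a few Sales_Table rows are loaded.

```python
import sqlite3

# A subset of the Sales_Table rows shown earlier in the chapter.
SALES = [
    (1000, "2000-09-28", 48850.40), (1000, "2000-09-29", 54500.22),
    (1000, "2000-09-30", 36000.07), (1000, "2000-10-01", 40200.43),
    (2000, "2000-09-28", 41888.88), (2000, "2000-09-29", 48000.00),
]

conn = sqlite3.connect(":memory:")  # stand-in for the Vertica system
conn.execute(
    "CREATE TABLE Sales_Table "
    "(Product_ID INTEGER, Sale_Date DATE, Daily_Sales DECIMAL(9,2))"
)
conn.executemany("INSERT INTO Sales_Table VALUES (?, ?, ?)", SALES)

rows = conn.execute("""
    SELECT Product_ID, Sale_Date, Daily_Sales,
           SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                                  ORDER BY Sale_Date ASC
                                  ROWS 2 PRECEDING) AS MSUM_3,
           AVG(Daily_Sales) OVER (PARTITION BY Product_ID
                                  ORDER BY Sale_Date ASC
                                  ROWS 2 PRECEDING) AS MAVG_3
    FROM Sales_Table
    ORDER BY Product_ID, Sale_Date
""").fetchall()

# Round to two decimals because SQLite stores these values as floats.
msums = [round(row[3], 2) for row in rows]
mavgs = [round(row[4], 2) for row in rows]
for row, msum, mavg in zip(rows, msums, mavgs):
    print(row[0], row[1], row[2], msum, mavg)
```

The fourth row sums only the last three days for Product_ID 1000, and the fifth row starts over for Product_ID 2000. That restart is what the Partitioning window adds.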
Page 136
A Five Table Join Using the Menu
[Screenshot: the right-click menu on a table in the Vertica SQL_Class systems tree, with the callout "Right click and choose Super Join Builder from the menu". Super Join Builder is the top menu item.]
You are about to see how the menu system of the Super Join Builder is designed to work. We will first use the menu of the Super Join Builder and then show you an even quicker way (using the Cube method).
Page 137
The First Table is placed in the Super Join Builder
[Screenshot: the Objects tab showing the Addresses table (V icon for Vertica), with the callout "Get ready to select from the Add Join drop down menu".]
Addresses: Street Varchar(30), City Varchar(20), State Char(2), Zip Integer, AreaCode Smallint, Phone Integer, Subscriber_No Integer
This is exactly what you will see when you first enter the Super Join Builder. You will see your table, its columns and the data types of each column. Notice the table name (Addresses) and the V for the Vertica icon. Turn the page for more.
Page 138
Using the Add Join Cascading Menu
[Screenshot: the Add Join menu on the Addresses table cascading through Subscribers, Claims, Providers and Services, with the callout "Left Click on the final table (Services)".]
The Add Join menu drop down shows that the Addresses table joins to the Subscribers table. Keep cascading down the menu and you see that the Subscribers table joins to the Claims table. Keep cascading all the way until you get to the final table, which is the Services table. Left click on the Services table and all five tables will be in the Super Join Builder.
Page 139
All Five Tables Are In the Super Join Builder
[Screenshot: Addresses, Subscribers, Claims, Services and Providers in the Objects tab, each with an Add Join menu and a Select * check box.]
Addresses: Street Varchar(30), City Varchar(20), State Char(2), Zip Integer, AreaCode Smallint, Phone Integer, Subscriber_No Integer
Subscribers: Last_Name Varchar(20), First_Name Varchar(20), Gender Char(1), SSN Integer, Member_No Smallint, Subscriber_No Integer
Claims: Claim_Id Integer, Claim_Date Date, Subscriber_No Integer, Member_No Smallint, Claim_Amt Decimal(9,2)
Services: Service_Code Integer, Service_Desc Varchar(20), Service_Pay Decimal(7,2)
Providers: Provider_Code Integer, Prov_Name Varchar(20), Error_Rate Decimal(4,2)
Now that all five tables are present in the Super Join Builder, all you have to do is checkmark the columns you want on the report. The SQL is built automatically with each mouse click. When you are done selecting the columns, just hit Execute.
Page 140
A Five Table Join Two Steps (Cube)
[Screenshot: the right-click menu on a table in the Vertica SQL_Class systems tree, with the callout "Right click and choose Super Join Builder from the menu". Super Join Builder is the top menu item.]
Be prepared to be amazed. We are about to do the two-step! These two steps will allow a user to join many tables in an instant. Watch the two steps that it takes to join a table to everything possible. Pick any table in the join of many tables.
Page 141
Choose Cube with Columns from the Left Top of the Table
[Screenshot: the Cube drop down menu on the Addresses table, offering Create Cube and Create Cube with Columns, with the callout "Choose Create Cube with Columns From the Cube drop down menu".]
Addresses: Street Varchar(30), City Varchar(20), State Char(2), Zip Integer, AreaCode Smallint, Phone Integer, Subscriber_No Integer
On the left side of the table is the Cube drop down menu. Choose the Create Cube with Columns option (highlighted above). The Nexus will join every table possible in the entire lineage instantly, and choose all of the columns. Turn the page!
Page 142
All Tables are Cubed (Joined Together Instantly)
[Screenshot: Addresses, Subscribers, Claims, Services and Providers joined together in the Objects tab, each with an Add Join menu and a Select * check box.]
There were five total tables that were joinable and the Create Cube with Columns choice instantly joined them together, including all of the columns. The SQL has been built automatically (in 2 seconds) and you can hit Execute to get your report.
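One way to picture what "join every table possible in the entire lineage" means is a walk over the defined relationships. The sketch below is an illustration of that idea only, not Nexus's actual algorithm. The relationship list is an assumption based on the cascading Add Join menu shown earlier, and joinable_tables is a hypothetical helper name.

```python
from collections import deque

# Assumed relationships, following the order of the cascading Add Join menu.
# The Customer_Table/Order_Table pair is included to show that unrelated
# tables are left out of the cube.
RELATIONSHIPS = [
    ("Addresses", "Subscribers"),
    ("Subscribers", "Claims"),
    ("Claims", "Providers"),
    ("Providers", "Services"),
    ("Customer_Table", "Order_Table"),
]


def joinable_tables(start, relationships):
    """Return every table reachable from `start` through defined joins."""
    neighbors = {}
    for left, right in relationships:
        neighbors.setdefault(left, []).append(right)
        neighbors.setdefault(right, []).append(left)

    seen = [start]           # keeps discovery order
    queue = deque([start])   # breadth-first walk over the relationships
    while queue:
        table = queue.popleft()
        for other in neighbors.get(table, []):
            if other not in seen:
                seen.append(other)
                queue.append(other)
    return seen


cube = joinable_tables("Addresses", RELATIONSHIPS)
print(cube)
```

Starting from any of the five tables (for example "Claims") reaches the same five, which matches the advice to pick any table in the join.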
Page 143
Choose Cube and then Choose Your Columns
[Screenshot: the Cube drop down menu on the Addresses table, offering Create Cube and Create Cube with Columns, with the callout "Choose Create Cube From the Cube drop down menu".]
Addresses: Street Varchar(30), City Varchar(20), State Char(2), Zip Integer, AreaCode Smallint, Phone Integer, Subscriber_No Integer
On the left side of the table is the Cube drop down menu. Choose the Create Cube option (highlighted above). The Nexus will join every table possible in the entire lineage instantly, but you can decide what columns you want on the report.
Page 144
Create Cube - Tables Are Joined Without Columns Selected
[Screenshot: Addresses, Subscribers, Claims, Services and Providers in the Objects tab, each with an Add Join menu and a Select * check box.]
All of the tables joinable are present in the Super Join Builder, but none of the columns are selected. You can now select the columns you want on the report. There will also be an X on the top of the tables, so you can delete any table you don't need.
Page 145
Create Cube – Select the Columns You Want on the Report
[Screenshot: Addresses, Subscribers, Claims, Services and Providers in the Objects tab, each with an Add Join menu and a Select * check box.]
Notice that we have checked the columns we want on the report, but that not all columns were selected. The SQL is built automatically with each check or uncheck of a column box. When you are finished choosing the columns you want on the report, please hit Execute (above) or Preview SQL in Nexus, where you can hit Execute in the main Nexus screen.
Page 146
How to join Vertica, Oracle and SQL Server Tables
[Screenshot: the systems tree listing the Oracle, SQL Server and Vertica systems, with the right-click menu open on a table under Vertica SQL_Class and the callout "Choose Super Join Builder from the menu". Super Join Builder is the top menu item.]
We are about to do a three-table join, but the incredible part is that one table is from Vertica, another from Oracle and the third table is from SQL Server. We will start with the Vertica table. We will right click on the Addresses table from the Vertica systems tree and a menu will appear. We choose the Super Join Builder (top menu item) and begin the process.
Page 147
The Vertica Table is now in the Super Join Builder
[Screenshot: the Addresses table in the Objects tab, with a callout noting that the V icon "Stands for Vertica".]
Addresses: Subscriber_No Integer, Street Varchar(30), City Varchar(20), State Char(2), Zip Integer, AreaCode Smallint, Phone Integer
The Addresses Table (from the Vertica system) is now in the Super Join Builder. You will see the table, its columns and the data types of each column. Notice that the table name (Addresses) has an icon of V for Vertica in the upper right corner. Now is the time to open up our Oracle system tree. We will do that on the next slide.
Page 148
Drag the Joining Oracle Table to the Super Join Builder
[Screenshot: the Oracle SQL_Class systems tree expanded next to the Super Join Builder, which holds the Vertica Addresses table. Callout: "Left click on the Oracle table you want to join and drag it into the Super Join Builder".]
Open up your Oracle Systems Tree and left click on the Oracle table you want to join and drag it into the Super Join Builder.
Defining the Join Columns
[Screenshot: Add Custom Join dialog with Join Type Inner. Left table: SQL_Class.Addresses (V icon). Right table: SQL_Sandbox.Subscribers931827O (O icon).]
Addresses columns: Subscriber_No Integer, Street Varchar(30), City Varchar(20), State Char(2), Zip Integer, AreaCode Smallint, Phone Integer
Subscribers931827O columns: SSN Number(38,0), Gender Char(1), First_Name Varchar(20), Last_Name Char(20), Member_No Number(38,0), Subscriber_No Number(38,0)
1) Left click the join column on the first table (it highlights in blue).
2) Left click the join column on the second table.
3) Left click the blue arrow to establish the join condition.
4) Hit the Add Join button.
Settings icon: use it if you want to change the data movement options.
Join condition shown in the dialog:
SQL_Class.Addresses Add INNER JOIN SQL_Sandbox.Subscribers931827O SUB ON Add.Subscriber_No = SUB.Subscriber_No
In four easy steps you can define the join conditions. Notice two things about the Subscribers Table from Oracle. First, notice the Oracle icon in the table's right hand corner. Second, notice the number behind the Subscribers name (pink). In the above example, the name is Subscribers931827O. The table will be moved to Vertica temporarily for the life of the join.
Choose the Columns You Want on Your Report
[Screenshot: Super Join Builder with the Join Hub System set to Vertica. It holds the Vertica table Addresses (V icon) and the Oracle table Subscribers (O icon), each listing its columns with a checkbox.]
The Vertica and Oracle tables are in the Super Join Builder, the relationships have been defined and so have the data movement strategies. All you need to do now is to checkmark the columns you want from both tables on the report. We have placed a checkmark on the Subscriber_No, State, Last_Name and First_Name columns. The SQL has already been built (automatically), and if you hit the Execute button, the report will return.
Let's Add a SQL Server Table to our Vertica and Oracle Join
[Screenshot: the SQL Server SQL_Class tree is expanded in the Systems pane and lists dbo.Addresses, dbo.Claims, dbo.Customer_Table, dbo.Employee_Table, dbo.Department_Table, dbo.Order_Table, dbo.Providers, dbo.Services and dbo.Subscribers. The Super Join Builder holds Addresses (V icon) and Subscribers (O icon).]
Open up your SQL Server Systems Tree and left click on the table you want to join, and drag it into the Super Join Builder.
Defining the Join Columns
[Screenshot: Add Custom Join dialog with Join Type Inner. Existing table picked from the drop down menu: SQL_Sandbox.Subscribers931827O (O icon). New table: SQL_Sandbox.Claims222583S, which comes from SQL Server.]
Subscribers931827O columns: SSN Number(38,0), Gender Char(1), First_Name Varchar(20), Last_Name Char(20), Member_No Number(38,0), Subscriber_No Number(38,0)
Claims222583S columns: Claim_Id Integer, Claim_Date Date, Claim_Service Smallint, Subscriber_No Integer, Member_No Smallint, Claim_Amt Decimal(12,2), Provider_No Smallint
Settings icon: use it if you want to change the data movement or working database options.
Join condition shown in the dialog:
SQL_Sandbox.Subscribers931827O SUB INNER JOIN SQL_Sandbox.Claims222583S CLA ON SUB.Subscriber_No = CLA.Subscriber_No
1) Choose the correct table that the new table joins with from the table drop down menu.
2) Choose the joining column(s) from the left table.
3) Choose the joining column(s) from the new table on the right.
4) Hit the blue arrow to actually define the join conditions (the SQL will change below to reflect the join).
5) Hit the Add Join button.
All Three Tables are now in the Super Join Builder
[Screenshot: Super Join Builder with the Join Hub System set to Vertica. It holds Addresses (Vertica), Subscribers (Oracle) and Claims (SQL Server), each with its column list.]
You now have three tables (from different systems) in the Super Join Builder. You have already defined the joining columns and your data movement strategies. Just click on the columns you want on the report and hit Execute. The SQL has already been built and the Oracle and SQL Server tables will be moved temporarily to the Vertica system, where they will be joined. Because we only selected Last_Name and First_Name from the Oracle table and Claim_Date and Claim_Amt from the SQL Server table, only those columns (plus the join condition column – Subscriber_No) will be moved to Vertica.
Change the Hub and Run the Join on Oracle
From the Join Hub System drop down menu choose Oracle
[Screenshot: Super Join Builder holding Addresses, Subscribers and Claims, with the Join Hub System set to Oracle.]
Nexus allows you to determine on which system the processing will take place. We call this the Hub. Above, we have changed the Join Hub System to Oracle. This means that the Vertica and SQL Server tables will be moved to Oracle, where all three tables will be joined. You can change the Hub to any system in your enterprise.
Change the Hub and Run the Join on SQL Server
From the Join Hub System drop down menu choose SQL Server
[Screenshot: Super Join Builder holding Addresses, Subscribers and Claims, with the Join Hub System set to SQL Server.]
Nexus allows you to determine on which system the processing will take place. We call this the Hub. Above, we have changed the Join Hub System to SQL Server. This means that the Vertica and Oracle tables will be moved to the SQL Server system, where all three tables will be joined. You can change the Hub to any system in your enterprise.
Simply Amazing - Change the Hub to the Garden of Analysis
From the Join Hub System drop down menu choose Garden of Analysis
[Screenshot: Super Join Builder holding Addresses, Subscribers and Claims, with the Join Hub System set to Garden of Analysis.]
Nexus allows you to determine on which system the processing will take place. We call this the Hub. Above, we have changed the Join Hub System to the Garden of Analysis. This should be done when the joining tables are not huge. Now, all of the tables will be queried separately, and then joined transparently inside the user's PC. It is as fast as lightning! Brilliant!
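To make the idea of a join on the user's PC concrete, here is a minimal Python sketch. It is not how Nexus is implemented, which the book does not describe. It assumes each system has already returned its own small answer set, and it joins those row lists in local memory on Subscriber_No. The rows and the helper names (`index_by_key`, `inner_join_three`) are hypothetical; only the column names come from the book.

```python
# Hypothetical answer sets, as if each system had been queried on its own.
# The rows are made up for illustration; only the column names come from the book.
vertica_addresses = [          # (Subscriber_No, State)
    (1001, "OH"),
    (1002, "CA"),
    (1003, "NY"),
]
oracle_subscribers = [         # (Subscriber_No, Last_Name, First_Name)
    (1001, "Smith", "Ann"),
    (1002, "Jones", "Bob"),
]
sqlserver_claims = [           # (Subscriber_No, Claim_Date, Claim_Amt)
    (1001, "2015-01-15", 120.00),
    (1001, "2015-02-20", 75.50),
    (1003, "2015-03-01", 40.00),
]


def index_by_key(rows):
    """Group rows by their first field (the join column Subscriber_No)."""
    index = {}
    for row in rows:
        index.setdefault(row[0], []).append(row[1:])
    return index


def inner_join_three(addresses, subscribers, claims):
    """Inner join the three row lists on Subscriber_No, entirely in local memory."""
    subs = index_by_key(subscribers)
    clms = index_by_key(claims)
    joined = []
    for sub_no, state in addresses:
        for last_name, first_name in subs.get(sub_no, []):
            for claim_date, claim_amt in clms.get(sub_no, []):
                joined.append(
                    (sub_no, state, last_name, first_name, claim_date, claim_amt)
                )
    return joined


report = inner_join_three(vertica_addresses, oracle_subscribers, sqlserver_claims)
for row in report:
    print(row)
```

Only subscriber 1001 appears in all three row lists, so the report holds one row per claim for that subscriber. Building the lookup dictionaries first keeps the join cheap for small answer sets, which fits the book's advice to use this hub when the joining tables are not huge.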
Have the Answer Set Saved Automatically to Any System
Choose the Create Table option
[Screenshot: Super Join Builder holding Addresses, Subscribers and Claims, with the Create Table option next to Execute. The Join Hub System is Vertica.]
Nexus allows you to create a table on any system in your enterprise with an answer set from the Super Join Builder. Once the Create Table option (above) is selected, you will be asked on which system you want the answer set saved, which database or schema and the table name. The answer set won't return to your screen, but instead will be saved as a table to the system you have chosen.
Saving the Answer Set to an Oracle or SQL Server System
[Screenshot: two Create Table dialogs, each with Create Table and Cancel buttons.]
Oracle (left)                       SQL Server (right)
System:     Oracle Cloud            System:     SQL Server Test System
Database:   SQL_Sandbox             Database:   SQL_Sandbox
Table Name: Addresses_SJB1          Schema:     dbo
                                    Table Name: Addresses_SJB_Test
Above are the screens you will have to fill in if you want to save your answer set to Oracle (on the left) or SQL Server (on the right). Once you hit the Create Table button you will be returned to the Super Join Builder. When you hit Execute, the answer set will be created as a table using the above details.
Saving the Answer Set to a Vertica System
[Screenshot: Create Table dialog with Create Table and Cancel buttons.]
System:           Vertica
Database:         SQL_Sandbox
Table Name:       Addresses_SJB_Test
Distribution Key: chosen from SER.Service_Code, SER.Service_Desc, SER.Service_Pay
Above is the screen you will have to fill in if you want to save your answer set to a Vertica system. Once you hit the Create Table button you will be returned to the Super Join Builder. When you hit Execute, the answer set will be created as a table using the above details.
Saving the Answer Set to a Teradata System
[Screenshot: Create Table dialog with Create Table and Cancel buttons.]
System:     Teradata V15
Database:   SQL_Sandbox
Table Name: Addresses_SJB_Test
Table Type: MultiSet or Set. Multiset tables allow duplicate rows. A Set table kicks out duplicate rows.
Index Type: Non-Unique Primary Index. The Primary Index is the Distribution Key.
Columns to choose from: SER.Service_Code, SER.Service_Desc, SER.Service_Pay
Above is the screen you will have to fill in if you want to save your answer set to a Teradata system. Once you hit the Create Table button, you will be returned to the Super Join Builder. When you hit Execute, the answer set will be created as a table using the above details. In Teradata, a table has either a Unique Primary Index, a Non-Unique Primary Index or No Primary Index (NoPI). Above, we have a Non-Unique Primary Index on SER.Service_Code.
Chapter 5 – The Basics of SQL
“As I would not be a slave, so I would not be a master.” - Abraham Lincoln
Introduction
Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

The Student_Table above will be used in our early SQL Examples
This is a pictorial of the Student_Table which we will use to present some basic examples of SQL and get some hands-on experience with querying this table. This book attempts to show you the table, show you the query and show you the result set.
Setting your Path
[Screenshot: Nexus query window connected to System: Vertica, Database: SQL_Class. The Systems tree lists Aster Data, Azure Cloud, DB2, Excel, Greenplum, Hadoop, Kognitio, Netezza, Oracle, Matrix, Redshift, SQL Server, Sybase, Teradata and Vertica.]
set search_path to SQL_Class
The example above shows you how to set your path to include a database where your tables can be queried directly.
Setting Your Default Database
set search_path to SQL_Class ;            -- We have set our default schema to be SQL_Class
SELECT * FROM Student_Table ;             -- The schema is assumed to be SQL_Class
SELECT * FROM SQL_Class.Student_Table ;   -- We have specified our schema to be SQL_Class
Vertica allows you to set your default schema through the search path. Above, we have set our default schema to SQL_Class. If we run a query without specifying the schema, then Vertica will assume the schema is SQL_Class.
SELECT * (All Columns) in a Table
SELECT * FROM Student_Table ;

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
125634      Hanson      Henry       FR          2.88
280023      McRoberts   Richard     JR          1.90
260000      Johnson     Stanley     ?           ?
231222      Wilson      Susie       SO          3.80
234121      Thomas      Wendy       FR          4.00
324652      Delaney     Danny       SR          3.35
123250      Phillips    Martin      SR          3.00
322133      Bond        Jimmy       JR          3.95
333450      Smith       Andy        SO          2.00

An asterisk (*) means you want to see ALL columns in the table on your report
Almost every SQL query will consist of a SELECT and a FROM. You SELECT the columns you want to see on your report, and an asterisk (*) means you want to see all columns in the table on the returning answer set!
Fully Qualifying a Database, Schema and Table
Database.Schema.TableName

CREATE TABLE Coffing.SQL_Class.DEPT
(DEPT_NO          SMALLINT,
 DEPARTMENT_NAME  CHARACTER(30),
 MGR_NO           INT,
 BUDGET           DECIMAL(10,2) );
We just created a table called Dept inside the Coffing Database in the SQL_Class schema.

SELECT * FROM DEPT;
SELECT * FROM Dept;
If SQL_Class is your default schema, then these statements are both valid because Vertica is NOT case sensitive.

SELECT * FROM Coffing.SQL_Class.DEPT ;
This is fully qualified

A fully qualified name on Vertica uses three-level naming, which consists of the database, the schema and the object (table or view etc.). The last example (Coffing.SQL_Class.DEPT) is fully qualified. If you leave off the schema, the system supplies it by searching the schemas in the current search path.
SELECT Specific Columns in a Table
SELECT First_Name
      ,Last_Name
      ,Class_Code
      ,Grade_Pt
FROM Student_Table ;

First_Name  Last_Name   Class_Code  Grade_Pt
__________  __________  __________  ________
Michael     Larkins     FR          0.00
Henry       Hanson      FR          2.88
Richard     McRoberts   JR          1.90
Stanley     Johnson     ?           ?
Susie       Wilson      SO          3.80
Wendy       Thomas      FR          4.00
Danny       Delaney     SR          3.35
Martin      Phillips    SR          3.00
Jimmy       Bond        JR          3.95
Andy        Smith       SO          2.00
This is a great way to show the columns you are selecting from the Table_Name.
Commas in the Front or Back?
Example 1 (commas in front):
SELECT First_Name
      ,Last_Name
      ,Class_Code
      ,Grade_Pt
FROM Student_Table ;

Example 2 (commas in back):
SELECT First_Name,
       Last_Name,
       Class_Code,
       Grade_Pt
FROM Student_Table ;

First_Name  Last_Name   Class_Code  Grade_Pt
__________  __________  __________  ________
Michael     Larkins     FR          0.00
Henry       Hanson      FR          2.88
Richard     McRoberts   JR          1.90
Stanley     Johnson     ?           ?
Susie       Wilson      SO          3.80
Wendy       Thomas      FR          4.00
Danny       Delaney     SR          3.35
Martin      Phillips    SR          3.00
Jimmy       Bond        JR          3.95
Andy        Smith       SO          2.00
Why is the first example (commas in front) better even though the two are functionally equivalent? Errors are easier to spot, and commenting out a column line is less likely to cause an error.
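The kind of error that leading commas help you spot can be reproduced with a short, hedged sketch. It uses Python's built-in sqlite3 module as a stand-in for Vertica (an assumption made only so the example runs anywhere), a one-row copy of Student_Table, and a hypothetical helper named `runs_ok`. The first query has a comma left dangling before FROM and the second has commas in front.

```python
import sqlite3

# sqlite3 is only a stand-in here; the book's queries run against Vertica.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student_Table "
    "(First_Name TEXT, Last_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.execute("INSERT INTO Student_Table VALUES ('Jimmy', 'Bond', 'JR', 3.95)")

# Commas at the end: the comma after Grade_Pt was overlooked.
trailing = """
SELECT First_Name,
       Last_Name,
       Class_Code,
       Grade_Pt,
FROM Student_Table
"""

# Commas in front: a dangling comma would stand out at the start of a line.
leading = """
SELECT First_Name
      ,Last_Name
      ,Class_Code
      ,Grade_Pt
FROM Student_Table
"""


def runs_ok(sql):
    """Return True if the statement parses and runs, False on a syntax error."""
    try:
        conn.execute(sql).fetchall()
        return True
    except sqlite3.OperationalError:
        return False


trailing_ok = runs_ok(trailing)
leading_ok = runs_ok(leading)
print("trailing commas ran:", trailing_ok)
print("leading commas ran:", leading_ok)
```

The stand-in engine rejects the first statement with a syntax error and accepts the second. The exact error message differs from system to system.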
Place your Commas in front for better Debugging Capabilities
Sometimes if you Add or Remove a COLUMN you can overlook an ending Comma!

Error!
SELECT First_Name,
       Last_Name,
       Class_Code,
       Grade_Pt,
FROM Student_Table ;

Successful
SELECT First_Name
      ,Last_Name
      ,Class_Code
      ,Grade_Pt
FROM Student_Table ;

"A life filled with love may have some thorns, but a life empty of love will have no roses." - Anonymous
Having commas in front to separate column names makes it easier to debug. Remember our quote above. "A query filled with commas at the end just might fill you with thorns, but a query filled with commas in the front will allow you to always come up smelling like roses."
Sort the Data with the ORDER BY Keyword
Sorts the Answer Set in Ascending order by default
SELECT * FROM Student_Table ORDER BY Last_Name ;

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
322133      Bond        Jimmy       JR          3.95
324652      Delaney     Danny       SR          3.35
125634      Hanson      Henry       FR          2.88
260000      Johnson     Stanley     ?           ?
423400      Larkins     Michael     FR          0.00
280023      McRoberts   Richard     JR          1.90
123250      Phillips    Martin      SR          3.00
333450      Smith       Andy        SO          2.00
234121      Thomas      Wendy       FR          4.00
231222      Wilson      Susie       SO          3.80
Rows typically come back to the report in no guaranteed order. To order the result set, you must use an ORDER BY. When you order by a column, it will sort in ASCENDING order by default. This is called the Major Sort!
ORDER BY Defaults to Ascending
Sorts the Answer Set In Ascending Order By Last_Name
SELECT * FROM Student_Table ORDER BY Last_Name ;

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
322133      Bond        Jimmy       JR          3.95
324652      Delaney     Danny       SR          3.35
125634      Hanson      Henry       FR          2.88
260000      Johnson     Stanley     ?           ?
423400      Larkins     Michael     FR          0.00
280023      McRoberts   Richard     JR          1.90
123250      Phillips    Martin      SR          3.00
333450      Smith       Andy        SO          2.00
234121      Thomas      Wendy       FR          4.00
231222      Wilson      Susie       SO          3.80
Rows typically come back to the report in random order, but we decided to use the ORDER BY statement. Now, the data comes back ordered by Last_Name.
Use the Name or the Number in your ORDER BY Statement
SELECT * FROM Student_Table ORDER BY 2 ;
Sorts the Answer Set by Column 2, which is Last_Name (the 2nd column coming back on the report)

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
322133      Bond        Jimmy       JR          3.95
324652      Delaney     Danny       SR          3.35
125634      Hanson      Henry       FR          2.88
260000      Johnson     Stanley     ?           ?
423400      Larkins     Michael     FR          0.00
280023      McRoberts   Richard     JR          1.90
123250      Phillips    Martin      SR          3.00
333450      Smith       Andy        SO          2.00
234121      Thomas      Wendy       FR          4.00
231222      Wilson      Susie       SO          3.80
The ORDER BY can use a number to represent the sort column. The number 2 represents the second column on the report.
Two Examples of ORDER BY using Different Techniques
SELECT * FROM Student_Table ORDER BY 5 ;
SELECT * FROM Student_Table ORDER BY Grade_Pt ;
Same Query

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
260000      Johnson     Stanley     ?           ?
423400      Larkins     Michael     FR          0.00
280023      McRoberts   Richard     JR          1.90
333450      Smith       Andy        SO          2.00
125634      Hanson      Henry       FR          2.88
123250      Phillips    Martin      SR          3.00
324652      Delaney     Danny       SR          3.35
231222      Wilson      Susie       SO          3.80
322133      Bond        Jimmy       JR          3.95
234121      Thomas      Wendy       FR          4.00
Notice that the answer set is sorted in ascending order based on the column Grade_Pt. Also, notice that Grade_Pt is the fifth column coming back on the report. That is why the SQL in both statements is ordering by Grade_Pt. Did you notice that the null value came back first? Nulls sort first in ascending order and last in descending order.
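The equivalence of the two techniques can be checked with a small sketch. It loads the book's Student_Table rows into Python's built-in sqlite3 module, which is used here purely as a stand-in for Vertica (an assumption, so the example is runnable anywhere), and compares ORDER BY 5 with ORDER BY Grade_Pt. The stand-in engine also happens to place NULLs first in ascending order, which matches the behavior described above, but that is a property of the stand-in and not a demonstration of Vertica itself.

```python
import sqlite3

# Student_Table rows copied from the book; None plays the role of the NULLs shown as "?".
ROWS = [
    (423400, "Larkins", "Michael", "FR", 0.00),
    (231222, "Wilson", "Susie", "SO", 3.80),
    (280023, "McRoberts", "Richard", "JR", 1.90),
    (322133, "Bond", "Jimmy", "JR", 3.95),
    (125634, "Hanson", "Henry", "FR", 2.88),
    (333450, "Smith", "Andy", "SO", 2.00),
    (324652, "Delaney", "Danny", "SR", 3.35),
    (260000, "Johnson", "Stanley", None, None),
    (234121, "Thomas", "Wendy", "FR", 4.00),
    (123250, "Phillips", "Martin", "SR", 3.00),
]

conn = sqlite3.connect(":memory:")  # stand-in engine, not Vertica
conn.execute(
    "CREATE TABLE Student_Table (Student_ID INTEGER, Last_Name TEXT, "
    "First_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?, ?, ?)", ROWS)

# Sort by the position of the column on the report, then by its name.
by_number = conn.execute("SELECT * FROM Student_Table ORDER BY 5").fetchall()
by_name = conn.execute("SELECT * FROM Student_Table ORDER BY Grade_Pt").fetchall()

print("same order:", by_number == by_name)
for row in by_name:
    print(row)
```

Sorting by position is shorter to type, while sorting by name keeps working if columns are later added or reordered in the SELECT list.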
Changing the ORDER BY to Descending Order
Sorts the Answer Set In DESC Order By Last_Name
SELECT * FROM Student_Table ORDER BY Last_Name DESC;

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
231222      Wilson      Susie       SO          3.80
234121      Thomas      Wendy       FR          4.00
333450      Smith       Andy        SO          2.00
123250      Phillips    Martin      SR          3.00
280023      McRoberts   Richard     JR          1.90
423400      Larkins     Michael     FR          0.00
260000      Johnson     Stanley     ?           ?
125634      Hanson      Henry       FR          2.88
324652      Delaney     Danny       SR          3.35
322133      Bond        Jimmy       JR          3.95
Notice that the answer set is sorted in descending order based on the column Last_Name. Also, notice that Last_Name is the second column coming back on the report. We could have done an Order By 2. If you spell out the word DESCENDING the query will fail, so you must remember to just use DESC.
NULL Values sort First in Ascending Mode (Default)
SELECT * FROM Student_Table ORDER BY 5 ;
SELECT * FROM Student_Table ORDER BY Grade_Pt ;

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
260000      Johnson     Stanley     ?           ?
423400      Larkins     Michael     FR          0.00
280023      McRoberts   Richard     JR          1.90
333450      Smith       Andy        SO          2.00
125634      Hanson      Henry       FR          2.88
123250      Phillips    Martin      SR          3.00
324652      Delaney     Danny       SR          3.35
231222      Wilson      Susie       SO          3.80
322133      Bond        Jimmy       JR          3.95
234121      Thomas      Wendy       FR          4.00

Nulls sort first in ASC Order
Did you notice that the null value came back first? Nulls sort first in ascending order and last in descending order.
NULL Values sort Last in Descending Mode (DESC)
SELECT * FROM Student_Table ORDER BY 5 DESC ;
SELECT * FROM Student_Table ORDER BY Grade_Pt DESC ;

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
234121      Thomas      Wendy       FR          4.00
322133      Bond        Jimmy       JR          3.95
231222      Wilson      Susie       SO          3.80
324652      Delaney     Danny       SR          3.35
123250      Phillips    Martin      SR          3.00
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
280023      McRoberts   Richard     JR          1.90
423400      Larkins     Michael     FR          0.00
260000      Johnson     Stanley     ?           ?

Nulls sort Last in DESC Order
You can ORDER BY in descending order by putting a DESC after the column name or its corresponding number. Null Values will sort Last in DESC order.
Major Sort vs. Minor Sorts
SELECT * FROM Student_Table ORDER BY Class_Code DESC, Grade_Pt ASC;

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
123250      Phillips    Martin      SR          3.00
324652      Delaney     Danny       SR          3.35
333450      Smith       Andy        SO          2.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
423400      Larkins     Michael     FR          0.00
125634      Hanson      Henry       FR          2.88
234121      Thomas      Wendy       FR          4.00
260000      Johnson     Stanley     ?           ?

Major sorts first (Class_Code descending)
Minor sorts on ties (Minor Sort on Grade_Pt Ascending)
Major sort is the first sort. There can only be one major sort. A minor sort kicks in if there are Major Sort ties. There can be zero or more minor sorts.
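As a rough illustration of a major sort with one minor sort, the sketch below runs the same ORDER BY against a trimmed, three-column copy of Student_Table. Python's built-in sqlite3 module stands in for Vertica here (an assumption made so the example is runnable), and its placement of the NULL row last in descending order agrees with the answer set shown above.

```python
import sqlite3

# Trimmed copy of the book's Student_Table: (Last_Name, Class_Code, Grade_Pt).
# None plays the role of the NULLs shown as "?".
ROWS = [
    ("Larkins", "FR", 0.00), ("Wilson", "SO", 3.80), ("McRoberts", "JR", 1.90),
    ("Bond", "JR", 3.95), ("Hanson", "FR", 2.88), ("Smith", "SO", 2.00),
    ("Delaney", "SR", 3.35), ("Johnson", None, None), ("Thomas", "FR", 4.00),
    ("Phillips", "SR", 3.00),
]

conn = sqlite3.connect(":memory:")  # stand-in engine, not Vertica
conn.execute(
    "CREATE TABLE Student_Table (Last_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?)", ROWS)

# Major sort: Class_Code descending. Minor sort: Grade_Pt ascending, used only on ties.
sorted_rows = conn.execute(
    "SELECT Last_Name, Class_Code, Grade_Pt FROM Student_Table "
    "ORDER BY Class_Code DESC, Grade_Pt ASC"
).fetchall()

last_names = [row[0] for row in sorted_rows]
for row in sorted_rows:
    print(row)
```

Within each Class_Code group the rows climb by Grade_Pt, which is the minor sort breaking the ties left by the major sort.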
Multiple Sort Keys using Names vs. Numbers
SELECT * FROM Employee_Table ORDER BY Dept_No DESC ,Salary ASC ,Last_Name ASC;
SELECT * FROM Employee_Table ORDER BY 2 DESC, 5, 3 ASC ;
These queries sort identically

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00
1121334      400      Strickling  Cletus      54500.00
2312225      300      Larkins     Loraine     40200.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
1232578      100      Chambers    Mandee      48850.00
1000234      10       Smythe      Richard     64300.00
2000000      ?        Jones       Squiggy     32800.50
In the example above, the Dept_No is the major sort and we have two minor sorts. The minor sorts are on the Salary and the Last_Name columns. Both Queries above have an equivalent Order by statement and sort exactly the same.
Sorts are Alphabetical, NOT Logical
SELECT * FROM Student_Table ORDER BY Class_Code ;

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
125634      Hanson      Henry       FR          2.88
423400      Larkins     Michael     FR          0.00
322133      Bond        Jimmy       JR          3.95
280023      McRoberts   Richard     JR          1.90
231222      Wilson      Susie       SO          3.80
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
123250      Phillips    Martin      SR          3.00
This sorts alphabetically. Can you change the sort so the Freshmen come first, followed by the Sophomores, Juniors, Seniors and then the Null?
Can you change the query to Order BY Class_Code logically (FR, SO, JR, SR, ?)?
Using A CASE Statement to Sort Logically
SELECT * FROM Student_Table
ORDER BY CASE Class_Code
           WHEN 'FR' THEN 1
           WHEN 'SO' THEN 2
           WHEN 'JR' THEN 3
           WHEN 'SR' THEN 4
           ELSE 5
         END;
CASE in the ORDER BY Statement

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
234121      Thomas      Wendy       FR          4.00
125634      Hanson      Henry       FR          2.88
423400      Larkins     Michael     FR          0.00
333450      Smith       Andy        SO          2.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
123250      Phillips    Martin      SR          3.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?

This is the way the pros do it.
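The logical sort with CASE can be tried out in a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Vertica (an assumption for runnability) and a two-column copy of Student_Table. The CASE expression maps each Class_Code to a rank, and the NULL row falls through to ELSE 5.

```python
import sqlite3

# Two-column copy of the book's Student_Table: (Last_Name, Class_Code).
# None plays the role of the NULL shown as "?".
ROWS = [
    ("Larkins", "FR"), ("Wilson", "SO"), ("McRoberts", "JR"), ("Bond", "JR"),
    ("Hanson", "FR"), ("Smith", "SO"), ("Delaney", "SR"), ("Johnson", None),
    ("Thomas", "FR"), ("Phillips", "SR"),
]

conn = sqlite3.connect(":memory:")  # stand-in engine, not Vertica
conn.execute("CREATE TABLE Student_Table (Last_Name TEXT, Class_Code TEXT)")
conn.executemany("INSERT INTO Student_Table VALUES (?, ?)", ROWS)

# Rank the class codes in school order instead of alphabetical order.
ordered = conn.execute(
    """
    SELECT Last_Name, Class_Code
    FROM Student_Table
    ORDER BY CASE Class_Code
               WHEN 'FR' THEN 1
               WHEN 'SO' THEN 2
               WHEN 'JR' THEN 3
               WHEN 'SR' THEN 4
               ELSE 5
             END
    """
).fetchall()

class_codes = [row[1] for row in ordered]
print(class_codes)
```

Rows that share a Class_Code come back in no particular order here, because the CASE expression is the only sort key. Adding a second key such as Last_Name would make the order within each class predictable.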
How to ALIAS a Column Name
[Screenshot: Nexus query window connected to System: Vertica, Database: SQL_Class.]
SELECT First_Name, Last_Name
      ,Class_Code "class code"
      ,Grade_Pt AS "AVG"
      ,Student_ID AS STU_ID
FROM Student_Table
WHERE Class_Code = 'JR'

Double quotes are used because of spaces or reserved words.
The Keyword AS is optional.
You need single quotes in a WHERE clause for character data, but you cannot use single quotes to alias.

first_name  last_name   class code  avg   stu_id
__________  __________  __________  ____  ______
Richard     McRoberts   JR          1.90  280023
Jimmy       Bond        JR          3.95  322133
When you ALIAS a column, you give it a new name for the report header. You should always reference the column using the ALIAS everywhere else in the query. You never need Double Quotes in SQL unless you are Aliasing.
A Missing Comma can by Mistake become an Alias
SELECT First_Name, Last_Name, Class_Code Grade_Pt FROM Student_Table ;
Missing a Comma (between Class_Code and Grade_Pt)

First_Name  Last_Name   Grade_Pt
__________  __________  ________
Michael     Larkins     FR
Susie       Wilson      SO
Richard     McRoberts   JR
Jimmy       Bond        JR
Henry       Hanson      FR
Andy        Smith       SO
Danny       Delaney     SR
Stanley     Johnson     ?
Wendy       Thomas      FR
Martin      Phillips    SR

Aliased as Grade_Pt
Column names must be separated by commas. Notice in this example, there is a comma missing between Class_Code and Grade_Pt. What this will result in is only three columns appearing on your report with one being aliased wrong.
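A quick way to see the accidental alias is to look at the column headers the engine reports. The sketch below is a minimal illustration that uses Python's built-in sqlite3 module as a stand-in for Vertica (an assumption for runnability) and a one-row copy of Student_Table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine, not Vertica
conn.execute(
    "CREATE TABLE Student_Table "
    "(First_Name TEXT, Last_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.execute("INSERT INTO Student_Table VALUES ('Jimmy', 'Bond', 'JR', 3.95)")

# The comma between Class_Code and Grade_Pt is missing, so Grade_Pt is read as an alias.
cursor = conn.execute(
    "SELECT First_Name, Last_Name, Class_Code Grade_Pt FROM Student_Table"
)
headers = [column[0] for column in cursor.description]
rows = cursor.fetchall()

print(headers)
print(rows)
```

The statement runs without any error. Three headers come back, and the one labeled Grade_Pt actually holds the class code, which is why this mistake is easy to miss.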
Aliasing a Column Name with Spaces or Reserved Words
SELECT Employee_No AS Emp_Number
      ,Dept_No Dept_Number
      ,Last_name "Last Name"
      ,First_name AS 'First Name'
      ,Salary AS "Employee Pay"
FROM Employee_table

For an Alias with a space or one that is also a reserved word use either:
1. Double quotes
2. Single quotes

Emp_Number  Dept_Number  Last Name   First_Name  Employee Pay
__________  ___________  __________  __________  ____________
1000234     10           Smythe      Richard     64300.00
1121334     400          Strickling  Cletus      54500.00
1232578     100          Chambers    Mandee      48850.00
1236548     400          Mays        Mary        50000.00
1256349     400          Harrison    Herbert     54500.00
1324657     200          Coffing     Billy       41888.88
1333454     200          Smith       John        48000.00
2000000     ?            Jones       Squiggy     32800.50
2312225     300          Larkins     Loraine     40200.00
2341218     400          Reilly      William     36000.00
When you ALIAS a column, you give it a new name for the report header. If your alias is a reserved word or has a space in it you can still use it, but you must use either double quotes or single quotes.
Comments using Double Dashes are Single Line Comments
[Screenshot: Nexus query window connected to System: Vertica, Database: SQL_Class.]
-- This is how you can make a comment
SELECT * FROM Employee_Table
WHERE Dept_No = 400 -- We only want Department 400 rows

Two dashes comment out the rest of the line

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
1256349      400      Harrison    Herbert     55000.00
1121334      400      Strickling  Cletus      55000.00
2341218      400      Reilly      William     72000.00
Double dashes make a single line comment that will be ignored by the system.
Comments for Multi-Lines
[Screenshot: Nexus query window connected to System: Vertica, Database: SQL_Class.]
/* This is how you can make multi-line comments
   to express what is going on in the code. */
SELECT * FROM Employee_Table
WHERE Dept_No = 400

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
1256349      400      Harrison    Herbert     55000.00
1121334      400      Strickling  Cletus      55000.00
2341218      400      Reilly      William     72000.00
Slash Asterisk starts a multi-line comment and Asterisk Slash ends the comment.
Comments for Multi-Lines as Double Dashes per Line

-- This is how you can make multi-line comments
-- to express what is going on in the code.
SELECT *
FROM Employee_Table
WHERE Dept_No = 400

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
    1256349      400  Harrison    Herbert     55000.00
    1121334      400  Strickling  Cletus      55000.00
    2341218      400  Reilly      William     72000.00
Double dashes in front of both lines comment both lines out, so both are ignored.
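The two comment styles can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Vertica (an assumption made only so the example is self-contained), and the four sample rows are a cut-down, hypothetical copy of Employee_Table. All three statements should return the same rows, because the comments are ignored.

```python
import sqlite3

# Stand-in database: SQLite in memory, not Vertica.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER, Last_Name TEXT)"
)
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?, ?)",
    [
        (1256349, 400, "Harrison"),
        (1121334, 400, "Strickling"),
        (2341218, 400, "Reilly"),
        (1324657, 200, "Coffing"),
    ],
)

no_comment = "SELECT * FROM Employee_Table WHERE Dept_No = 400"

# Double dashes comment out the rest of a single line.
dash_comment = """-- This is how you can make a comment
SELECT * FROM Employee_Table
WHERE Dept_No = 400 -- We only want Department 400 rows
"""

# Slash-asterisk ... asterisk-slash can span several lines.
block_comment = """/* This is how you can make multi-line comments
   to express what is going on in the code. */
SELECT * FROM Employee_Table
WHERE Dept_No = 400
"""

results = [conn.execute(sql).fetchall() for sql in (no_comment, dash_comment, block_comment)]
print(results[0] == results[1] == results[2])
```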
Formatting Number

9 - Value with the specified number of digits
0 - Value with leading zeros
. (period) - Decimal point
, (comma) - Group (thousand) separator
PR - Negative value in angle brackets
S - Sign anchored to number (uses locale)
L - Currency symbol (uses locale)
D - Decimal point (uses locale)
G - Group separator (uses locale)
MI - Minus sign in specified position (if number < 0)
PL - Plus sign in specified position (if number > 0)
SG - Plus/minus sign in specified position
RN - Roman numeral (input between 1 and 3999)
TH or th - Ordinal number suffix
V - Shift specified number of digits (see notes)
Vertica gives you many options for formatting numbers. The next page will show an example.
Formatting Number Examples

SELECT Salary
,TO_CHAR(Salary , 'L999,999.99') AS Dollarsign
,TO_CHAR(Salary , 'SL999999') AS Anchored
,TO_CHAR(Salary , '0000000') AS LeadingZ
,TO_CHAR(Salary , '99999999999.99') AS Float9
,TO_CHAR(Salary , '999,999,999.99') AS Commas
FROM Employee_Table
WHERE Dept_No = 200 ;

Salary    Dollarsign    Anchored  LeadingZ  Float9    Commas
________  ____________  ________  ________  ________  _________
41888.88  $  41,888.88  $ +41889  0041889   41888.88  41,888.88
48000.00  $  48,000.00  $ +48000  0048000   48000.00  48,000.00
Above you can see an example of formatted numbers using the TO_CHAR command.
Formatting Dates

HH - Hour of day (00-23)
HH12 - Hour of day (01-12)
HH24 - Hour of day (00-23)
MI - Minute (00-59)
SS - Second (00-59)
MS - Millisecond (000-999)
US - Microsecond (000000-999999)
SSSS - Seconds past midnight (0-86399)
AM or A.M. or PM or P.M. - (uppercase)
am or a.m. or pm or p.m. - (lowercase)
Y,YYY - Year with comma
YYYY - Year (4 and more digits)
YYY - Last 3 digits of year
YY - Last 2 digits of year
Y - Last digit of year
IYYY - ISO year (4 and more digits)
IYY - Last 3 digits of ISO year
IY - Last 2 digits of ISO year
I - Last digit of ISO year
BC or B.C. or AD or A.D. - (uppercase)
bc or b.c. or ad or a.d. - (lowercase)
MONTH - Full uppercase month name
Month - Full mixed-case month name
month - Full lowercase month name
MON - Uppercase month (3 chars)
Mon - Mixed-case month (3 chars)
mon - Lowercase month (3 chars)
MM - Month number (01-12)
DAY - Full uppercase day
Day - Full mixed-case day
day - Full lowercase day
DY - Uppercase day (3 chars)
Dy - Mixed-case day (3 chars)
dy - Lowercase day (3 chars)
DDD - Day of year (001-366)
DD - Day of month (01-31) for TIMESTAMP
D - Day of week (1-7) Sunday = 1
W - Week of month (1-5)
WW - Week number of year (1-53)
IW - ISO week number of year
CC - Century (2 digits)
J - Julian Day (days since Jan 1, 4712 BC)
Q - Quarter
RM - Month in Roman numerals
rm - Month in Roman numerals (lowercase)
TZ - Time-zone name (uppercase)
tz - Time-zone name (lowercase)
Vertica gives you many options for formatting dates. The next page will show an example.
Formatting Date Example

SELECT Order_Date
,TO_CHAR(Order_Date , 'YY-MM-DD') AS YMD
,TO_CHAR(Order_Date , 'MON, DD, YYYY') AS Month
,TO_CHAR(Order_Date , 'D, Mon DD, YY') AS DayofWeek
,Current_Time as Time
,TO_CHAR(Current_Time , 'HH24:MI:SS:MS') AS Micro
FROM Order_Table
WHERE EXTRACT(Year from Order_Date) = 1998 ;

Order_Date  YMD       Month          DayofWeek      Time      Micro
__________  ________  _____________  _____________  ________  ____________
05/04/1998  98-05-04  MAY, 04, 1998  2, May 04, 98  10:09:58  10:09:58:109
Above you can see an example of formatted dates using the TO_CHAR command.
Chapter 6 – The WHERE Clause
“I saw the angel in the marble and carved until I set him free.” - Michelangelo
The WHERE Clause limits Returning Rows

SELECT *
FROM Employee_table
Where Dept_No = 400 ;

The WHERE clause limits the rows.

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
    1256349      400  Harrison    Herbert     55000.00
    1121334      400  Strickling  Cletus      55000.00
    2341218      400  Reilly      William     72000.00
The WHERE Clause here filters how many ROWS are coming back. In this example, I am asking the report to return only rows WHERE the Dept_No is 400.
Double Quoted Aliases are for Reserved Words and Spaces

SELECT First_Name AS Fname
,Last_Name Lname
,Class_Code "Class Code"
,Grade_Pt AS "AVG"
,Student_ID
FROM Student_Table
ORDER BY "AVG" ;

The AS keyword is always optional.
If spaces are in the Alias, you must use double quotes.
If Double Quotes are used, then use the Double Quotes throughout the SQL.
“Write a wise saying and your name will live forever.”
- Anonymous
When you ALIAS a column you give it a new name for the report header, but a good rule of thumb is to refer to the column by the alias throughout the query. Whoever wrote the above quote was way off. "Write a wise alias and it will live until the query ends – bummer".
Character Data needs Single Quotes in the WHERE Clause

SELECT *
FROM Student_table
WHERE First_Name = 'Henry' ;

Character data needs single quotes.

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    125634  Hanson     Henry       FR              2.88
In the WHERE clause, if you search for Character data such as first name, you need single quotes around it. You don’t single-quote integers.
Character Data needs Single Quotes, but Numbers Don’t

SELECT *
FROM Student_table
WHERE Grade_Pt = 0.00 ;

Numeric data never needs quotes.

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    423400  Larkins    Michael     FR              0.00
Character data (letters) need single quotes, but you need NO Single Quotes for Integers (numbers). Remember you never use double quotes except for aliasing.
Comparisons against a Null Value

[Table: Col_A values 1 through 10 and NULL, combined with operators including +, /, *, > and >=; the remaining cells of this table were not recoverable.]

SELECT *
FROM Student_table
WHERE Grade_PT >= 3.0;

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    123250  Phillips   Martin      SR              3.00
    231222  Wilson     Susie       SO              3.80
    234121  Thomas     Wendy       FR              4.00
    322133  Bond       Jimmy       JR              3.95
    324652  Delaney    Danny       SR              3.35
The WHERE Clause doesn’t just deal with ‘Equals’. You can look for things that are GREATER or LESSER THAN along with asking for things that are GREATER/LESSER THAN or EQUAL to.
AND in the WHERE Clause

SELECT *
FROM Student_table
WHERE Class_Code = 'FR'
AND First_Name = 'Henry' ;

Both conditions must be met.

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    125634  Hanson     Henry       FR              2.88
Notice the WHERE statement and the word AND. In this example, qualifying rows must have a Class_Code = ‘FR’ and also must have a First_Name of ‘Henry’. Notice how the WHERE and the AND clause are on their own line. Good practice!
Troubleshooting AND

SELECT *
FROM Student_Table
WHERE Grade_Pt = 3.0
AND Grade_Pt = 4.0 ;

Both conditions must be met.

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
(no rows)

No rows are returned because no student can have two different Grade_Pt values.
What is going wrong here? You are using an AND to check the same column. What you are basically asking with this syntax is to see the rows that have BOTH a Grade_Pt of 3.0 and a 4.0. No rows will be returned.
OR in the WHERE Clause

SELECT *
FROM Student_Table
WHERE Grade_Pt = 3.0
OR Grade_Pt = 4.0 ;

Either condition can be met.

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    123250  Phillips   Martin      SR              3.00
    234121  Thomas     Wendy       FR              4.00
Notice above in the WHERE Clause we use OR. OR allows for either of the parameters to be TRUE in order for the data to qualify and return.
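The difference between AND and OR on the same column can be tried out in a few lines. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for Vertica, loads the ten Student_Table rows shown in this chapter, and the helper name student_ids is made up for the illustration.

```python
import sqlite3

# Rows copied from the chapter's Student_Table (None stands for NULL).
STUDENTS = [
    (423400, "Larkins", "Michael", "FR", 0.00),
    (231222, "Wilson", "Susie", "SO", 3.80),
    (280023, "McRoberts", "Richard", "JR", 1.90),
    (322133, "Bond", "Jimmy", "JR", 3.95),
    (125634, "Hanson", "Henry", "FR", 2.88),
    (333450, "Smith", "Andy", "SO", 2.00),
    (324652, "Delaney", "Danny", "SR", 3.35),
    (260000, "Johnson", "Stanley", None, None),
    (234121, "Thomas", "Wendy", "FR", 4.00),
    (123250, "Phillips", "Martin", "SR", 3.00),
]

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Vertica
conn.execute(
    "CREATE TABLE Student_Table (Student_ID INTEGER, Last_Name TEXT, "
    "First_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?, ?, ?)", STUDENTS)


def student_ids(where_clause):
    """Return the sorted Student_ID values of rows matching the WHERE clause."""
    sql = "SELECT Student_ID FROM Student_Table WHERE " + where_clause
    return sorted(row[0] for row in conn.execute(sql))


# AND on the same column: no single row holds both values, so nothing returns.
and_ids = student_ids("Grade_Pt = 3.0 AND Grade_Pt = 4.0")

# OR on the same column: a row qualifies when either comparison is true.
or_ids = student_ids("Grade_Pt = 3.0 OR Grade_Pt = 4.0")

print(and_ids)
print(or_ids)
```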
Troubleshooting Or

Student_Table

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    423400  Larkins    Michael     FR              0.00
    231222  Wilson     Susie       SO              3.80
    280023  McRoberts  Richard     JR              1.90
    322133  Bond       Jimmy       JR              3.95
    125634  Hanson     Henry       FR              2.88
    333450  Smith      Andy        SO              2.00
    324652  Delaney    Danny       SR              3.35
    260000  Johnson    Stanley     ?                  ?
    234121  Thomas     Wendy       FR              4.00
    123250  Phillips   Martin      SR              3.00

Error:
SELECT * FROM Student_Table
WHERE Grade_Pt = 3.0 OR 4.0;

Perfect:
SELECT * FROM Student_Table
WHERE Grade_Pt = 3.0 OR Grade_Pt = 4.0;
Notice above in the WHERE Clause we use OR. OR allows for either of the parameters to be TRUE in order for the data to qualify and return. The first example errors and is a common mistake. The second example is perfect.
Troubleshooting Character Data

SELECT *
FROM Student_Table
WHERE Grade_Pt = 3.0
AND Class_Code = SR ;

Error!!! Why?
This query errors! What is WRONG with this syntax? No Single quotes around SR.
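A quick way to see the missing-quotes mistake is to run both versions of the query. The sketch below uses Python's built-in sqlite3 module as a stand-in for Vertica, so the exact error text is SQLite's and will differ from what Vertica prints. The two sample rows are taken from the Student_Table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Vertica
conn.execute(
    "CREATE TABLE Student_Table (Student_ID INTEGER, Last_Name TEXT, "
    "First_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany(
    "INSERT INTO Student_Table VALUES (?, ?, ?, ?, ?)",
    [
        (123250, "Phillips", "Martin", "SR", 3.00),
        (324652, "Delaney", "Danny", "SR", 3.35),
    ],
)

# Without quotes, SR is read as a column name, and no such column exists.
try:
    conn.execute(
        "SELECT * FROM Student_Table WHERE Grade_Pt = 3.0 AND Class_Code = SR"
    )
    unquoted_error = None
except sqlite3.OperationalError as exc:
    unquoted_error = str(exc)

# With single quotes, 'SR' is character data and the query runs.
quoted_rows = conn.execute(
    "SELECT * FROM Student_Table WHERE Grade_Pt = 3.0 AND Class_Code = 'SR'"
).fetchall()

print(unquoted_error)
print(quoted_rows)
```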
Using Different Columns in an AND Statement

SELECT *
FROM Student_table
WHERE Grade_Pt = 3.0
AND Class_Code = 'SR' ;

Character data needs single quotes.

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    123250  Phillips   Martin      SR              3.00
Notice that AND separates two different columns, and the data will come back if both are TRUE.
Quiz – How many rows will return?

Student_Table

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    423400  Larkins    Michael     FR              0.00
    231222  Wilson     Susie       SO              3.80
    280023  McRoberts  Richard     JR              1.90
    322133  Bond       Jimmy       JR              3.95
    125634  Hanson     Henry       FR              2.88
    333450  Smith      Andy        SO              2.00
    324652  Delaney    Danny       SR              3.35
    260000  Johnson    Stanley     ?                  ?
    234121  Thomas     Wendy       FR              4.00
    123250  Phillips   Martin      SR              3.00

SELECT *
FROM Student_Table
WHERE Grade_Pt = 4.0 OR Grade_Pt = 3.0
AND Class_Code = 'SR' ;

Which Seniors have a 3.0 or a 4.0 Grade_Pt average? How many rows will return?

A) 2
B) 1
C) Error
D) 3
Answer to Quiz – How many rows will return?

SELECT *
FROM Student_Table
WHERE Grade_Pt = 4.0 OR Grade_Pt = 3.0
AND Class_Code = 'SR' ;

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    234121  Thomas     Wendy       FR              4.00
    123250  Phillips   Martin      SR              3.00
We had two rows return! Isn’t that a mystery? Why?
What is the Order of Precedence?

1. ()
2. NOT
3. AND
4. OR

SELECT *
FROM Student_Table
WHERE Grade_Pt = 4.0 OR Grade_Pt = 3.0
AND Class_Code = 'SR' ;

Syntax has an ORDER OF PRECEDENCE. It will read anything with parentheses around it first. Then, it will read all the NOT statements. Next, the AND statements. FINALLY, the OR Statements. This is why the last query came out odd. Let’s fix it and bring back the right answer set.
Using Parentheses to change the Order of Precedence

SELECT *
FROM Student_Table
WHERE (Grade_Pt = 3.0 OR Grade_Pt = 4.0)
AND Class_Code = 'SR' ;

Parentheses are evaluated first.

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    123250  Phillips   Martin      SR              3.00
This is the proper way of looking for rows that have a Grade_Pt of either 3.0 or 4.0 AND also have a Class_Code of ‘SR’. Only ONE row comes back. Parentheses are evaluated first, so this allows you to direct exactly what you want to work first.
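The order of precedence is easy to confirm by running the quiz query with and without parentheses. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for Vertica (the AND-before-OR rule is the same in both); the helper name student_ids is invented for the example.

```python
import sqlite3

# Rows copied from the chapter's Student_Table (None stands for NULL).
STUDENTS = [
    (423400, "Larkins", "Michael", "FR", 0.00),
    (231222, "Wilson", "Susie", "SO", 3.80),
    (280023, "McRoberts", "Richard", "JR", 1.90),
    (322133, "Bond", "Jimmy", "JR", 3.95),
    (125634, "Hanson", "Henry", "FR", 2.88),
    (333450, "Smith", "Andy", "SO", 2.00),
    (324652, "Delaney", "Danny", "SR", 3.35),
    (260000, "Johnson", "Stanley", None, None),
    (234121, "Thomas", "Wendy", "FR", 4.00),
    (123250, "Phillips", "Martin", "SR", 3.00),
]

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Vertica
conn.execute(
    "CREATE TABLE Student_Table (Student_ID INTEGER, Last_Name TEXT, "
    "First_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?, ?, ?)", STUDENTS)


def student_ids(where_clause):
    """Return the sorted Student_ID values of rows matching the WHERE clause."""
    sql = "SELECT Student_ID FROM Student_Table WHERE " + where_clause
    return sorted(row[0] for row in conn.execute(sql))


# AND binds tighter than OR, so this reads as:
#   Grade_Pt = 4.0 OR (Grade_Pt = 3.0 AND Class_Code = 'SR')
without_parens = student_ids(
    "Grade_Pt = 4.0 OR Grade_Pt = 3.0 AND Class_Code = 'SR'"
)

# Parentheses force the OR to be evaluated first.
with_parens = student_ids(
    "(Grade_Pt = 3.0 OR Grade_Pt = 4.0) AND Class_Code = 'SR'"
)

print(without_parens)  # two students: the 4.0 freshman sneaks in
print(with_parens)     # one student: the senior with a 3.0
```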
Using an IN List in place of OR

SELECT *
FROM Student_Table
WHERE Grade_Pt IN (3.0, 4.0)
AND Class_Code = 'SR' ;

This IN list means any Grade_Pt with a 3.0 or 4.0.

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    123250  Phillips   Martin      SR              3.00
Using an IN List is a great way of looking for rows that have a Grade_Pt of either 3.0 or 4.0 AND also have a Class_Code of ‘SR’. Only ONE row comes back.
The IN List is an Excellent Technique

SELECT *
FROM Student_Table
WHERE Grade_Pt IN (2.0, 3.0, 4.0) ;

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    123250  Phillips   Martin      SR              3.00
    234121  Thomas     Wendy       FR              4.00
    333450  Smith      Andy        SO              2.00
The IN Statement avoids retyping the same column name separated by an OR. The IN allows you to search the same column for a list of values. The IN list is a nice way to keep things easy and organized.
IN List vs. OR brings the same Results

An IN list is a better technique:

SELECT *
FROM Student_Table
WHERE Grade_Pt IN (2.0, 3.0, 4.0) ;

Both examples produce the same results:

SELECT *
FROM Student_Table
WHERE Grade_Pt = 2.0
OR Grade_Pt = 3.0
OR Grade_Pt = 4.0 ;
The IN Statement avoids retyping the same column name separated by an OR. The IN allows you to search the same column for a list of values. Both queries above are equal, but the IN list is a nice way to keep things easy and organized.
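That the IN list and the chain of ORs are equivalent can be checked directly. The sketch below assumes Python's built-in sqlite3 module as a stand-in for Vertica and uses the chapter's Student_Table rows; student_ids is a made-up helper name.

```python
import sqlite3

# Rows copied from the chapter's Student_Table (None stands for NULL).
STUDENTS = [
    (423400, "Larkins", "Michael", "FR", 0.00),
    (231222, "Wilson", "Susie", "SO", 3.80),
    (280023, "McRoberts", "Richard", "JR", 1.90),
    (322133, "Bond", "Jimmy", "JR", 3.95),
    (125634, "Hanson", "Henry", "FR", 2.88),
    (333450, "Smith", "Andy", "SO", 2.00),
    (324652, "Delaney", "Danny", "SR", 3.35),
    (260000, "Johnson", "Stanley", None, None),
    (234121, "Thomas", "Wendy", "FR", 4.00),
    (123250, "Phillips", "Martin", "SR", 3.00),
]

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Vertica
conn.execute(
    "CREATE TABLE Student_Table (Student_ID INTEGER, Last_Name TEXT, "
    "First_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?, ?, ?)", STUDENTS)


def student_ids(where_clause):
    """Return the sorted Student_ID values of rows matching the WHERE clause."""
    sql = "SELECT Student_ID FROM Student_Table WHERE " + where_clause
    return sorted(row[0] for row in conn.execute(sql))


in_ids = student_ids("Grade_Pt IN (2.0, 3.0, 4.0)")
or_ids = student_ids("Grade_Pt = 2.0 OR Grade_Pt = 3.0 OR Grade_Pt = 4.0")

print(in_ids == or_ids)
print(in_ids)
```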
The IN List Can Use Character Data

SELECT *
FROM Student_Table
WHERE Last_Name IN ('Larkins', 'Bond') ;

Single quotes are used for character data.

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    322133  Bond       Jimmy       JR              3.95
    423400  Larkins    Michael     FR              0.00
The IN Statement avoids retyping the same column name separated by an OR. The IN allows you to search the same column for a list of values. This works with character data as long as you use single quotes.
Using a NOT IN List

SELECT *
FROM Student_Table
WHERE Grade_Pt NOT IN (2.0, 3.0, 4.0) ;

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    125634  Hanson     Henry       FR              2.88
    231222  Wilson     Susie       SO              3.80
    280023  McRoberts  Richard     JR              1.90
    322133  Bond       Jimmy       JR              3.95
    324652  Delaney    Danny       SR              3.35
    423400  Larkins    Michael     FR              0.00
“First you imitate, then you innovate.” - Miles Davis
You can also ask to see the results that ARE NOT IN your parameter list. That requires the column name and a NOT IN. Neither the IN nor NOT IN can search for NULLs! Miles Davis got this IT quote all wrong. First you innovate, and then you sue anyone who imitates. Please make a note of it!
Null Values in a NOT IN List Bring Back No Rows

SELECT *
FROM Student_Table
WHERE Grade_Pt NOT IN (2.0, 3.0, 4.0, NULL) ;

Warning: When a NOT IN statement encounters a NULL, NO DATA RETURNS.

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
(no rows)
Few people know that when a NOT IN is used and a null value is encountered in the list, no data returns. This is because any comparison with a null value evaluates to unknown rather than true, so no row can ever be confirmed as NOT IN the list.
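The NULL trap can be reproduced in a few lines. The sketch assumes Python's built-in sqlite3 module as a stand-in for Vertica; it follows the same standard three-valued logic for NOT IN. The helper name student_ids is invented for the illustration.

```python
import sqlite3

# Rows copied from the chapter's Student_Table (None stands for NULL).
STUDENTS = [
    (423400, "Larkins", "Michael", "FR", 0.00),
    (231222, "Wilson", "Susie", "SO", 3.80),
    (280023, "McRoberts", "Richard", "JR", 1.90),
    (322133, "Bond", "Jimmy", "JR", 3.95),
    (125634, "Hanson", "Henry", "FR", 2.88),
    (333450, "Smith", "Andy", "SO", 2.00),
    (324652, "Delaney", "Danny", "SR", 3.35),
    (260000, "Johnson", "Stanley", None, None),
    (234121, "Thomas", "Wendy", "FR", 4.00),
    (123250, "Phillips", "Martin", "SR", 3.00),
]

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Vertica
conn.execute(
    "CREATE TABLE Student_Table (Student_ID INTEGER, Last_Name TEXT, "
    "First_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?, ?, ?)", STUDENTS)


def student_ids(where_clause):
    """Return the sorted Student_ID values of rows matching the WHERE clause."""
    sql = "SELECT Student_ID FROM Student_Table WHERE " + where_clause
    return sorted(row[0] for row in conn.execute(sql))


# Without a NULL in the list, six students qualify.
plain_not_in = student_ids("Grade_Pt NOT IN (2.0, 3.0, 4.0)")

# With a NULL in the list, "Grade_Pt <> NULL" is unknown for every row,
# so the NOT IN is never true and nothing comes back.
not_in_with_null = student_ids("Grade_Pt NOT IN (2.0, 3.0, 4.0, NULL)")

print(len(plain_not_in))
print(not_in_with_null)
```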
A Technique for Handling Nulls with a NOT IN List

SELECT *
FROM Student_Table
WHERE Grade_Pt NOT IN (2.0, 3.0, 4.0)
OR Grade_Pt IS NULL ;

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    423400  Larkins    Michael     FR              0.00
    231222  Wilson     Susie       SO              3.80
    280023  McRoberts  Richard     JR              1.90
    322133  Bond       Jimmy       JR              3.95
    125634  Hanson     Henry       FR              2.88
    324652  Delaney    Danny       SR              3.35
    260000  Johnson    Stanley     ?                  ?

The null row now comes back.

This is a great technique to look for a NULL when using a NOT IN List.
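The IS NULL technique above can be verified the same way. This sketch again assumes Python's built-in sqlite3 module as a stand-in for Vertica, and student_ids is a hypothetical helper. The student with the null Grade_Pt (Student_ID 260000) should only appear once the IS NULL test is added.

```python
import sqlite3

# Rows copied from the chapter's Student_Table (None stands for NULL).
STUDENTS = [
    (423400, "Larkins", "Michael", "FR", 0.00),
    (231222, "Wilson", "Susie", "SO", 3.80),
    (280023, "McRoberts", "Richard", "JR", 1.90),
    (322133, "Bond", "Jimmy", "JR", 3.95),
    (125634, "Hanson", "Henry", "FR", 2.88),
    (333450, "Smith", "Andy", "SO", 2.00),
    (324652, "Delaney", "Danny", "SR", 3.35),
    (260000, "Johnson", "Stanley", None, None),
    (234121, "Thomas", "Wendy", "FR", 4.00),
    (123250, "Phillips", "Martin", "SR", 3.00),
]

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Vertica
conn.execute(
    "CREATE TABLE Student_Table (Student_ID INTEGER, Last_Name TEXT, "
    "First_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?, ?, ?)", STUDENTS)


def student_ids(where_clause):
    """Return the sorted Student_ID values of rows matching the WHERE clause."""
    sql = "SELECT Student_ID FROM Student_Table WHERE " + where_clause
    return sorted(row[0] for row in conn.execute(sql))


not_in_only = student_ids("Grade_Pt NOT IN (2.0, 3.0, 4.0)")
not_in_or_null = student_ids(
    "Grade_Pt NOT IN (2.0, 3.0, 4.0) OR Grade_Pt IS NULL"
)

print(260000 in not_in_only)     # the null row is missing here
print(260000 in not_in_or_null)  # and comes back here
```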
BETWEEN is Inclusive

SELECT *
FROM Student_Table
WHERE Grade_Pt BETWEEN 2.0 AND 4.0 ;

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    125634  Hanson     Henry       FR              2.88
    231222  Wilson     Susie       SO              3.80
    324652  Delaney    Danny       SR              3.35
    322133  Bond       Jimmy       JR              3.95
    234121  Thomas     Wendy       FR              4.00
    333450  Smith      Andy        SO              2.00
    123250  Phillips   Martin      SR              3.00
2.0 and 4.0 come back in the answer set. The BETWEEN statement is therefore inclusive.
This is a BETWEEN. What this allows you to do is see if a column falls in a range. It is inclusive, meaning that in our example we will also get the rows that have a 2.0 or a 4.0 in their column!
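Here is a small sketch of the inclusive behavior, assuming Python's built-in sqlite3 module as a stand-in for Vertica. It also shows that BETWEEN gives the same rows as a >= and <= pair, and that NOT BETWEEN drops the boundary values as well. The helper name student_ids is made up for the example.

```python
import sqlite3

# Rows copied from the chapter's Student_Table (None stands for NULL).
STUDENTS = [
    (423400, "Larkins", "Michael", "FR", 0.00),
    (231222, "Wilson", "Susie", "SO", 3.80),
    (280023, "McRoberts", "Richard", "JR", 1.90),
    (322133, "Bond", "Jimmy", "JR", 3.95),
    (125634, "Hanson", "Henry", "FR", 2.88),
    (333450, "Smith", "Andy", "SO", 2.00),
    (324652, "Delaney", "Danny", "SR", 3.35),
    (260000, "Johnson", "Stanley", None, None),
    (234121, "Thomas", "Wendy", "FR", 4.00),
    (123250, "Phillips", "Martin", "SR", 3.00),
]

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Vertica
conn.execute(
    "CREATE TABLE Student_Table (Student_ID INTEGER, Last_Name TEXT, "
    "First_Name TEXT, Class_Code TEXT, Grade_Pt REAL)"
)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?, ?, ?, ?)", STUDENTS)


def student_ids(where_clause):
    """Return the sorted Student_ID values of rows matching the WHERE clause."""
    sql = "SELECT Student_ID FROM Student_Table WHERE " + where_clause
    return sorted(row[0] for row in conn.execute(sql))


between_ids = student_ids("Grade_Pt BETWEEN 2.0 AND 4.0")
range_ids = student_ids("Grade_Pt >= 2.0 AND Grade_Pt <= 4.0")
not_between_ids = student_ids("Grade_Pt NOT BETWEEN 2.0 AND 4.0")

print(333450 in between_ids, 234121 in between_ids)  # the 2.0 and 4.0 rows
print(between_ids == range_ids)
print(not_between_ids)
```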
NOT BETWEEN is Also Inclusive

SELECT *
FROM Student_Table
WHERE Grade_Pt NOT BETWEEN 2.0 AND 4.0 ;

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    280023  McRoberts  Richard     JR              1.90
    423400  Larkins    Michael     FR              0.00

NOT BETWEEN is also inclusive.
"The difference between genius and stupidity is that genius has its limits." Albert Einstein
This is a NOT BETWEEN example. What this allows you to do is see if a column does not fall in a range. Because the range is inclusive, we will get no rows where the Grade_Pt is between 2.0 and 4.0, and the rows with exactly 2.0 or 4.0 will not return either.
LIKE uses Wildcards Percent ‘%’ and Underscore ‘_’

SELECT *
FROM Student_Table
WHERE Last_Name LIKE 'Sm%' ;

% is a wildcard for any number of characters when used with the LIKE command.

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    333450  Smith      Andy        SO              2.00

Any Last_Name that starts with 'Sm' returns.
The percent sign (%) is a wildcard for any number of characters. We are looking for anyone whose Last_Name starts with 'Sm'. In this example, the only row that would come back is ‘Smith’. The next page will show an example of underscore.
LIKE command Underscore is Wildcard for one Character

SELECT *
FROM Student_Table
WHERE Last_Name LIKE '_a%' ;

Any Last_Name that has an 'a' in the second character qualifies.
An Underscore _ is a wildcard for a single character when used with the LIKE command.

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    125634  Hanson     Henry       FR              2.88
    423400  Larkins    Michael     FR              0.00
The underscore (_) is a wildcard for any single character. We are looking for anyone who has an 'a' as the second letter of their last name.
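Both wildcards can be tried against the chapter's last names. This is a sketch that assumes Python's built-in sqlite3 module as a stand-in for Vertica. One difference to keep in mind: SQLite's LIKE ignores case for ASCII letters by default, while Vertica's LIKE is case-sensitive, so results could differ on data with mixed case. The helper name last_names is invented for the example.

```python
import sqlite3

# Last names copied from the chapter's Student_Table.
LAST_NAMES = [
    "Larkins", "Wilson", "McRoberts", "Bond", "Hanson",
    "Smith", "Delaney", "Johnson", "Thomas", "Phillips",
]

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Vertica
conn.execute("CREATE TABLE Student_Table (Last_Name TEXT)")
conn.executemany(
    "INSERT INTO Student_Table VALUES (?)", [(name,) for name in LAST_NAMES]
)


def last_names(pattern):
    """Return the sorted Last_Name values that match the LIKE pattern."""
    sql = "SELECT Last_Name FROM Student_Table WHERE Last_Name LIKE ?"
    return sorted(row[0] for row in conn.execute(sql, (pattern,)))


starts_with_sm = last_names("Sm%")  # % matches any number of characters
second_letter_a = last_names("_a%")  # _ matches exactly one character

print(starts_with_sm)
print(second_letter_a)
```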
LIKE Command Works Differently on Char Vs Varchar

SELECT *
FROM Student_Table
WHERE First_Name LIKE '%y' ;

Any First_Name that ends in 'y'.

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    125634  Hanson     Henry       FR              2.88
    234121  Thomas     Wendy       FR              4.00
    260000  Johnson    Stanley     ?                  ?
    322133  Bond       Jimmy       JR              3.95
    324652  Delaney    Danny       SR              3.35
    333450  Smith      Andy        SO              2.00
It is important that you know the data type of the column you are using with your LIKE command. VARCHAR and CHAR data differ slightly, but Vertica handles them both wisely.
LIKE Command on Character Data Auto Trims

SELECT *
FROM Student_Table
WHERE Last_Name LIKE '%n' ;

Any Last_Name that ends in 'n'.

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    125634  Hanson     Henry       FR              2.88
    231222  Wilson     Susie       SO              3.80
    260000  Johnson    Stanley     ?                  ?
This is a CHAR (20) data type. That usually means that any word under 20 characters is padded with trailing spaces until it reaches 20 characters. On many systems you would not get any rows back from this example because technically no value ends in an ‘n’, but instead ends in a space. Vertica handles this for you so rows come back.
Quiz – What Data is Left Justified and what is Right?

SELECT *
FROM Sample_Table
WHERE Column1 IS NULL
AND Column2 IS NULL ;

Answer Set

Column1  Column2
_______  _______
      ?  ?

Column1 is Right Justified. Integers are Right Justified!
Column2 is Left Justified. Character Data is Left Justified!
Which Column from the Answer Set could have a DATA TYPE of INTEGER, and which could have Character Data?
Numbers are Right Justified and Character Data is Left

SELECT *
FROM Sample_Table
WHERE Column1 IS NULL
AND Column2 IS NULL ;

Answer Set

Column1  Column2
_______  _______
      ?  ?

Column1 is Right Justified. Integers are Right Justified!
Column2 is Left Justified. Character Data is Left Justified!
All Integers will start from the right and move left. Thus, Column1 was defined during the table create statement to hold an INTEGER. The next page shows a clear example.
Answer – What Data is Left Justified and what is Right?

SELECT Employee_No, First_Name
FROM Employee_Table
WHERE Employee_No = 2000000;

Answer Set

Employee_No   First_Name
____________  __________
     2000000  Squiggy

Integers are Right justified!
Characters are Left justified!
All Integers will start from the right and move left. All Character data will start from the left and move to the right.
An Example of Data with Left and Right Justification

SELECT Student_ID, Last_Name
FROM Student_Table ;

Student_ID  Last_Name
__________  _________
    423400  Larkins
    125634  Hanson
    280023  McRoberts
    260000  Johnson
    231222  Wilson
    234121  Thomas
    324652  Delaney
    123250  Phillips
    322133  Bond
    333450  Smith

Integers are Right justified!
Characters are Left justified!
This is how a standard result set will look. Notice that the integer type in Student_ID starts from the right and goes left. Character data type in Last_Name moves left to right like we are used to seeing when reading English.
A Visual of CHARACTER Data vs. VARCHAR Data

Character Data on Disk: Last_Name as a Char(20), spaces padded at the end

Jones _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Hanson _ _ _ _ _ _ _ _ _ _ _ _ _ _
McRoberts _ _ _ _ _ _ _ _ _ _ _
Johnson _ _ _ _ _ _ _ _ _ _ _ _ _

Varchar Data on Disk: Last_Name as a Varchar(20), 2-byte VLI (Variable Length Indicator), no spaces

0 5 Jones
0 6 Hanson
0 9 McRoberts
0 7 Johnson
Character data pads spaces to the right and Varchar uses a 2-byte VLI instead.
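The two layouts in the visual can be imitated in a few lines of Python. This is purely an illustration of the idea (fixed-width padding versus a 2-byte length prefix), not Vertica's actual on-disk format; the function names as_char and as_varchar are hypothetical, and the big-endian byte order is an assumption chosen to match the "0 5 Jones" picture.

```python
import struct


def as_char(value, width=20):
    """Fixed-width layout: pad with trailing spaces up to the declared width."""
    return value.ljust(width).encode("ascii")


def as_varchar(value):
    """Variable-width layout: a 2-byte length indicator, then the characters."""
    data = value.encode("ascii")
    return struct.pack(">H", len(data)) + data  # ">H" = big-endian 2-byte integer


for name in ("Jones", "Hanson", "McRoberts", "Johnson"):
    fixed = as_char(name)
    variable = as_varchar(name)
    # CHAR(20) always takes 20 bytes; VARCHAR takes 2 bytes plus the name.
    print(name, len(fixed), len(variable), list(variable[:2]))
```

The fixed layout always costs the full declared width, while the variable layout costs two bytes of overhead plus the actual length of the value.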
Use the TRIM command to remove spaces on CHAR Data

Character Data on Disk: Last_Name as a Char(20), spaces padded at the end

Jones _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Hanson _ _ _ _ _ _ _ _ _ _ _ _ _ _
Wilson _ _ _ _ _ _ _ _ _ _ _ _ _ _
Johnson _ _ _ _ _ _ _ _ _ _ _ _ _

SELECT Last_Name
FROM Student_Table
WHERE TRIM (Last_Name) LIKE '%n' ;

Trim removes spaces at the front and back.

Last_Name
__________
Hanson
Wilson
Johnson

Last_Name has a Data Type of CHAR (20).
By using the TRIM command on the Last_Name column you are able to trim off any spaces from the end. Once we use the TRIM on Last_Name we have eliminated any spaces at the end, so now we are set to bring back anyone with a Last_Name that truly ends in ‘n’!
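The effect of TRIM on padded data can be sketched as follows. The example assumes Python's built-in sqlite3 module as a stand-in for Vertica. SQLite does not pad CHAR columns by itself, so the trailing spaces are added by hand with ljust(20) to imitate a CHAR(20) column on a system that compares the padded value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Vertica
conn.execute("CREATE TABLE Student_Table (Last_Name TEXT)")

# Pad every name to 20 characters by hand to imitate CHAR(20) storage.
conn.executemany(
    "INSERT INTO Student_Table VALUES (?)",
    [(name.ljust(20),) for name in ("Jones", "Hanson", "Wilson", "Johnson")],
)

# The padded values end in a space, so nothing ends in 'n'.
padded_match = conn.execute(
    "SELECT Last_Name FROM Student_Table WHERE Last_Name LIKE '%n'"
).fetchall()

# TRIM strips the spaces first, so the names that truly end in 'n' qualify.
trimmed_match = [
    row[0]
    for row in conn.execute(
        "SELECT TRIM(Last_Name) FROM Student_Table "
        "WHERE TRIM(Last_Name) LIKE '%n' ORDER BY 1"
    )
]

print(padded_match)
print(trimmed_match)
```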
Escape Character in the LIKE Command changes Wildcards

Student_Table

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
    423400  Larkins    Michael     FR              0.00
    125634  Hanson     Henry       FR              2.88
    280023  McRoberts  Richard     JR              1.90
    260000  Johnson    Stanley     ?                  ?
    231222  Wilson     Susie       SO              3.80
    234121  Thomas     Wendy       FR              4.00
    324652  Delaney    Danny       SR              3.35
    123250  Phillips   Martin      SR              3.00
    322133  Bond       Jimmy       JR              3.95
    333450  Smith      Andy        SO              2.00
    999999  T_         S%          FR              1.90

/* We just pretended to add a new row to the Student_Table */
/* Can you use the LIKE command to find S% above? */

Here you will have to utilize a Wildcard Escape Character. Turn the page for more.
Escape Characters Turn off Wildcards in the LIKE Command

Can you use the LIKE command to find S% above?

SELECT *
FROM Student_Table
WHERE First_Name LIKE 'S@%' Escape '@';
We can pick our Escape character and we have chosen a @ sign character. This turns the wildcard off for 1 character, so we find ‘S%’ without bringing back Stanley or Susie.
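A minimal sketch of the difference the escape character makes, run through Python's sqlite3 module. SQLite is only a stand-in for Vertica here (an assumption), and the helper first_names is a hypothetical convenience function for this example; the table holds just a few of the First_Name values from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (First_Name TEXT)")
conn.executemany(
    "INSERT INTO Student_Table VALUES (?)",
    [("Stanley",), ("Susie",), ("S%",), ("Andy",)],
)

def first_names(where):
    """Hypothetical helper: run a WHERE clause and return matching names."""
    sql = (
        "SELECT First_Name FROM Student_Table "
        f"WHERE {where} ORDER BY First_Name"
    )
    return [row[0] for row in conn.execute(sql)]

# '%' acts as a wildcard: every name starting with S matches.
wildcard = first_names("First_Name LIKE 'S%'")

# '@' turns the wildcard off for the next character only,
# so '%' is matched as a literal percent sign.
literal = first_names("First_Name LIKE 'S@%' ESCAPE '@'")

print(wildcard)
print(literal)
```

The same pattern with '_' in place of '%' handles the underscore wildcard.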
Quiz – Turn off that Wildcard
Can you use the LIKE command to find the Last_Name of T_? (pronounced Tunderscore!)
This is a little trickier than you might think so be on your toes…. And get a haircut!
ANSWER – To Find that Wildcard

Can you use the LIKE command to find the Last_Name of T_? (pronounced Tunderscore!)

SELECT *
FROM Student_Table
WHERE TRIM(Last_Name) LIKE 'T@_' Escape '@' ;
You didn’t really need to get a full haircut, but just a TRIM Command and the Escape!
The Distinct Command

SELECT DISTINCT Class_Code
FROM Student_Table
ORDER BY 1 ;

The keyword DISTINCT won't allow duplicate Class_Code values to return

Class_Code
__________
FR
JR
SO
SR
?
DISTINCT eliminates duplicates from returning in the Answer Set.
Distinct vs. GROUP BY

SELECT Class_Code
FROM Student_Table
GROUP BY Class_Code ;

The GROUP BY statement is equivalent to the DISTINCT command seen previously

Class_Code
__________
?
FR
JR
SO
SR
Distinct and GROUP BY in the two examples return the same answer set.
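The equivalence can be checked with a short sketch. It runs both statements through Python's sqlite3 module, with SQLite standing in for Vertica as an assumption, and loads only the Class_Code column of the ten students.

```python
import sqlite3

# Class_Code values of the ten Student_Table rows; None is the NULL.
class_codes = ["FR", "SO", "JR", "JR", "FR", "SO", "SR", None, "FR", "SR"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Table (Class_Code TEXT)")
conn.executemany(
    "INSERT INTO Student_Table VALUES (?)", [(c,) for c in class_codes]
)

distinct_rows = conn.execute(
    "SELECT DISTINCT Class_Code FROM Student_Table"
).fetchall()
group_rows = conn.execute(
    "SELECT Class_Code FROM Student_Table GROUP BY Class_Code"
).fetchall()

# Neither query promises an order without ORDER BY, so compare as sets.
same_answer = set(distinct_rows) == set(group_rows)

print(len(distinct_rows), len(group_rows), same_answer)
```

Both forms return five rows, one per Class_Code, and the NULL counts as one of the five values.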
Quiz – How many rows come back from the Distinct?

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

SELECT Distinct Class_Code, Grade_Pt
FROM Student_Table
ORDER BY Class_Code, Grade_Pt;
How many rows will come back from the above SQL?
Answer – How many rows come back from the Distinct?

SELECT Distinct Class_Code, Grade_Pt
FROM Student_Table
ORDER BY Class_Code, Grade_Pt ;

Class_Code  Grade_Pt
__________  ________
?           ?
FR          0.00
FR          2.88
FR          4.00
JR          1.90
JR          3.95
SO          2.00
SO          3.80
SR          3.00
SR          3.35
No Rows have the exact same values for both the Class_Code and Grade_Pt. Each row is Distinct!
How many rows will come back from the above SQL? 10. All rows came back. Why? Because there are no exact duplicates that contain a duplicate Class_Code and Duplicate Grade_Pt combined. Each row in the SELECT list is distinct.
Chapter 7 – Aggregation
“Vertica climbed Aggregate Mountain and delivered a better way to Sum It.” - Tera-Tom Coffing
Quiz – You calculate the Answer Set in your own Mind

Student_Table

Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

SELECT Avg(Grade_Pt)   AS "AVG"
      ,Count(Grade_Pt) AS "Count"
      ,Count(*)        AS "Count *"
FROM Student_Table
WHERE Class_Code IS NULL

AVG    Count  Count *
_____  _____  _______
What would the result set be from the above query? The next slide shows answers!
Answer – You calculate the Answer Set in your own Mind

SELECT Avg(Grade_Pt)   AS "AVG"
      ,Count(Grade_Pt) AS "Count"
      ,Count(*)        AS "Count *"
FROM Student_Table
WHERE Class_Code IS NULL

Here are the correct answers

AVG    Count  Count *
_____  _____  _______
?      0      1

Aggregates ignore Null values
Quiz – You calculate the Answer Set in your own Mind

Aggregation_Table

Employee_No  Salary
___________  _________
423400       100000.00
423401       100000.00
423402       NULL

SELECT AVG(Salary) as "AVG"
      ,Count(Salary) as SalCnt
      ,Count(*) as RowCnt
FROM Aggregation_Table ;

AVG    SalCnt  RowCnt
_____  ______  ______

Please fill in the values you think will be in the Answer.
What would the result set be from the above query? The next slide shows answers!
Answer – You calculate the Answer Set in your own Mind

Aggregation_Table

Employee_No  Salary
___________  _________
423400       100000.00
423401       100000.00
423402       NULL

Aggregates ignore Null values

SELECT AVG(Salary) as "AVG"
      ,Count(Salary) as SalCnt
      ,Count(*) as RowCnt
FROM Aggregation_Table ;

Here are the correct answers

AVG        SalCnt  RowCnt
_________  ______  ______
100000.00  2       3
The 3 Rules of Aggregation

Aggregation_Table

Employee_No  Salary
___________  _________
423400       100000.00
423401       100000.00
423402       NULL

SELECT AVG(Salary) as "AVG"
      ,Count(Salary) as SalCnt
      ,Count(*) as RowCnt
FROM Aggregation_Table ;

1) Aggregates Ignore Null Values.
2) Aggregates WANT to come back in one row.
3) You CAN’T mix Aggregates with normal columns unless you use a GROUP BY.

AVG(Salary) = $100000.00
Count(Salary) = 2
Count(*) = 3

Follow the three rules of aggregation and you will make fewer mistakes.
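The first two rules can be confirmed with a small sketch that loads the three Aggregation_Table rows and runs the same query. It uses Python's sqlite3 module, with SQLite standing in for Vertica as an assumption of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Aggregation_Table (Employee_No INTEGER, Salary REAL)"
)
conn.executemany(
    "INSERT INTO Aggregation_Table VALUES (?, ?)",
    [(423400, 100000.00), (423401, 100000.00), (423402, None)],
)

rows = conn.execute(
    "SELECT AVG(Salary), COUNT(Salary), COUNT(*) FROM Aggregation_Table"
).fetchall()

# Rule 2: aggregates without GROUP BY come back in a single row.
avg_salary, sal_cnt, row_cnt = rows[0]

# Rule 1: the NULL salary is skipped by AVG and COUNT(Salary),
# while COUNT(*) counts rows and therefore still sees it.
print(avg_salary, sal_cnt, row_cnt)
```

The average is taken over the two non-NULL salaries only, which is why it is 100000 and not two thirds of that.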
There are Five Aggregates

SELECT MIN (Salary)
      ,MAX (Salary)
      ,SUM (Salary)
      ,AVG (Salary)
      ,COUNT(*)
FROM Employee_Table ;

These are the five aggregates

MIN       MAX       SUM        AVG       COUNT
________  ________  _________  ________  _____
32800.50  64300.00  421039.38  46782.15  9

Aggregates are designed to return a single row

“Don’t count the days, make the days count.” – Muhammad Ali

The five aggregates are listed above. Muhammad Ali was way off in his quote. He meant to say, "Don't you count the days, but instead make the data count for you".
Quiz – How many rows come back?

SELECT MIN (Salary) AS Min
      ,MAX (Salary) AS Max
      ,SUM (Salary) AS Sum
      ,AVG (Salary) AS AVG
      ,COUNT(*) COUNT
FROM Employee_Table ;

1) How many columns return?
2) How many rows return?

How many rows will the above query produce in the result set?
Answer – How many rows come back?

SELECT MIN (Salary)
      ,MAX (Salary)
      ,SUM (Salary)
      ,AVG (Salary)
      ,COUNT(*)
FROM Employee_Table ;

These are the five aggregates

MIN       MAX       SUM        AVG       COUNT
________  ________  _________  ________  _____
32800.50  64300.00  421039.38  46782.15  9

Aggregates are designed to return a single row

How many rows will the above query produce in the result set? The answer is one.
Troubleshooting Aggregates

SELECT Dept_No
      ,MIN (Salary) AS Min
      ,MAX (Salary) AS Max
      ,SUM (Salary) AS Sum
      ,AVG (Salary) AS AVG
      ,COUNT(*) COUNT
FROM Employee_Table ;

What happens when you mix normal columns with columns that are aggregated?
If you have a normal column (not aggregated) in your query, you must have a corresponding GROUP BY statement.
GROUP BY when Aggregates and Normal Columns Mix
NON-Aggregate
Group By Needed
If you have a normal column (not aggregated) in your query, you must have a corresponding GROUP BY statement.
GROUP BY delivers one row per Group

SELECT Dept_No          <- NON-Aggregate
      ,MIN (Salary)
      ,MAX (Salary)
      ,SUM (Salary)
      ,AVG (Salary)
      ,Count(*)
FROM Employee_Table
GROUP BY Dept_No        <- Group By Needed
ORDER BY Dept_No ;

Dept_No   Min(Salary)  Max(Salary)  Sum(Salary)  AVG(Salary)  Count(*)
________  ___________  ___________  ___________  ___________  ________
10        64300.00     64300.00     64300.00     64300.00     1
100       48850.00     48850.00     48850.00     48850.00     1
200       41888.88     48000.00     89888.88     44944.44     2
300       40200.00     40200.00     40200.00     40200.00     1
400       36000.00     54500.00     145000.00    48333.33     3
?         32800.50     32800.50     32800.50     32800.50     1

The GROUP BY Dept_No clause allows the Aggregates to be calculated per Dept_No. The data has also been sorted with the ORDER BY statement.
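As a rough check of the per-group totals, the sketch below loads the Dept_No and Salary values of the nine Employee_Table rows and runs the grouped query through Python's sqlite3 module. SQLite is a stand-in for Vertica (an assumption), and two differences are worth labeling: SQLite sorts the NULL group first in ascending order, and ROUND is added here only to keep the average readable.

```python
import sqlite3

# (Dept_No, Salary) pairs from the book's Employee_Table; None is NULL.
employees = [
    (None, 32800.50), (10, 64300.00), (100, 48850.00),
    (200, 41888.88), (200, 48000.00), (300, 40200.00),
    (400, 54500.00), (400, 36000.00), (400, 54500.00),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Dept_No INTEGER, Salary REAL)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?)", employees)

rows = conn.execute(
    "SELECT Dept_No, MIN(Salary), MAX(Salary), SUM(Salary), "
    "       ROUND(AVG(Salary), 2), COUNT(*) "
    "FROM Employee_Table "
    "GROUP BY Dept_No "
    "ORDER BY Dept_No"
).fetchall()

# One row per distinct Dept_No, with the NULL group counted as a group.
for row in rows:
    print(row)
```

Nine input rows collapse into six output rows, one for each Dept_No value including the NULL.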
GROUP BY Dept_No or GROUP BY 1: the same thing

SELECT Dept_No
      ,MIN (Salary)
      ,MAX (Salary)
      ,SUM (Salary)
      ,AVG (Salary)
      ,Count(*)
FROM Employee_Table
GROUP BY Dept_No
ORDER BY Dept_No;

SELECT Dept_No
      ,MIN (Salary)
      ,MAX (Salary)
      ,SUM (Salary)
      ,AVG (Salary)
      ,Count(*)
FROM Employee_Table
GROUP BY 1
ORDER BY 1;

Both Queries are exactly the same

Dept_No   Min(Salary)  Max(Salary)  Sum(Salary)  AVG(Salary)  Count(*)
________  ___________  ___________  ___________  ___________  ________
?         32800.50     32800.50     32800.50     32800.50     1
10        64300.00     64300.00     64300.00     64300.00     1
100       48850.00     48850.00     48850.00     48850.00     1
200       41888.88     48000.00     89888.88     44944.44     2
300       40200.00     40200.00     40200.00     40200.00     1
400       36000.00     54500.00     145000.00    48333.33     3

Both queries above produce the same result. GROUP BY lets you either name the column or use its position number in the SELECT list, just like ORDER BY.
Limiting Rows and Improving Performance with WHERE

SELECT Dept_No
      ,MAX (Salary) AS Max
      ,SUM (Salary) AS Sum
      ,AVG (Salary) AS AVG
FROM Employee_Table
WHERE Dept_No IN (200, 400)
GROUP BY Dept_No ;

Calculations are only done on rows WHERE Dept_No Equals 200 or 400

Dept_No  MAX       SUM        AVG
_______  ________  _________  ________
200      48000.00  89888.88   44944.44
400      54500.00  145000.00  48333.33

Will Dept_No 300 be calculated? Of course you know it will…NOT!
WHERE Clause in Aggregation limits unneeded Calculations

SELECT Dept_No
      ,MAX (Salary) AS Max
      ,SUM (Salary) AS Sum
      ,AVG (Salary) AS AVG
FROM Employee_Table
WHERE Dept_No IN (200, 400)
GROUP BY Dept_No ;

The WHERE clause is a filter that will speed up the query.

Dept_No  MAX       SUM        AVG
_______  ________  _________  ________
200      48000.00  89888.88   44944.44
400      54500.00  145000.00  48333.33

The system skips every Dept_No other than 200 and 400. This means that only rows with a Dept_No of 200 or 400 will come off the disk to be calculated.
Keyword HAVING tests Aggregates after they are totaled

SELECT Dept_No
      ,MAX (Salary) AS Max
      ,SUM (Salary) AS Sum
      ,AVG (Salary) AS AVG
FROM Employee_Table
WHERE Dept_No IN (200, 400)
GROUP BY Dept_No
HAVING AVG(Salary) > 45000 ;

HAVING filters rows after the rows are aggregated.

Dept_No  MAX       SUM        AVG
_______  ________  _________  ________
400      54500.00  145000.00  48333.33
The HAVING Clause only works on Aggregate Totals. The WHERE filters rows to be excluded from calculation, but the HAVING filters the Aggregate totals after the calculations, thus eliminating certain Aggregate totals.
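The division of labor between WHERE and HAVING can be sketched by running the grouped query twice, once without and once with the HAVING test. The example uses Python's sqlite3 module with SQLite as a stand-in for Vertica (an assumption) and loads only the Dept_No and Salary columns of Employee_Table.

```python
import sqlite3

# (Dept_No, Salary) pairs from the book's Employee_Table; None is NULL.
employees = [
    (None, 32800.50), (10, 64300.00), (100, 48850.00),
    (200, 41888.88), (200, 48000.00), (300, 40200.00),
    (400, 54500.00), (400, 36000.00), (400, 54500.00),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Dept_No INTEGER, Salary REAL)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?)", employees)

# WHERE removes rows before any aggregate is calculated.
base = (
    "SELECT Dept_No, COUNT(*) FROM Employee_Table "
    "WHERE Dept_No IN (200, 400) GROUP BY Dept_No"
)

without_having = conn.execute(base + " ORDER BY Dept_No").fetchall()

# HAVING removes whole groups after their totals are known.
with_having = conn.execute(
    base + " HAVING AVG(Salary) > 45000 ORDER BY Dept_No"
).fetchall()

print(without_having)
print(with_having)
```

Dept_No 200 survives the WHERE clause but its average of 44944.44 fails the HAVING test, so only Dept_No 400 remains.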
Keyword HAVING is like an Extra WHERE Clause for totals

SELECT Dept_No, MIN (Salary), MAX (Salary), SUM (Salary), AVG (Salary), COUNT(*)
FROM Employee_Table
WHERE Dept_No in (200, 400)
GROUP BY Dept_No
HAVING Count(*) > 2 ;

HAVING clause acts as a filter on all aggregates after they are totaled.

Previous Answer Set (without HAVING statement)

Dept_No   Min(Salary)  Max(Salary)  Sum(Salary)  AVG(Salary)  Count(*)
________  ___________  ___________  ___________  ___________  ________
200       41888.88     48000.00     89888.88     44944.44     2
400       36000.00     54500.00     145000.00    48333.33     3

New Answer Set using the HAVING Statement

Dept_No   Min(Salary)  Max(Salary)  Sum(Salary)  AVG(Salary)  Count(*)
________  ___________  ___________  ___________  ___________  ________
400       36000.00     54500.00     145000.00    48333.33     3

The HAVING Clause only works on Aggregate Totals, and in the above example, only groups with Count(*) > 2 can return.
Keyword HAVING tests Aggregates after they are totaled

SELECT Dept_No
      ,MIN (Salary) as "Min"
      ,MAX (Salary) as "Max"
      ,SUM (Salary) as "Sum"
      ,AVG (Salary) as "Avg"
      ,COUNT(*) as "Count"
FROM Employee_Table
WHERE Dept_No IN (200, 400)
GROUP BY Dept_No
HAVING AVG(Salary) > 45000
Order by 1 ;

HAVING Clause acts as a filter on the totals after the Calculations are done

Aggregate totals before the HAVING test is applied:

Dept_No   Min       Max       Sum        AVG       Count
________  ________  ________  _________  ________  _____
200       41888.88  48000.00  89888.88   44944.44  2
400       36000.00  54500.00  145000.00  48333.33  3

Only Dept_No 400 has an AVG above 45000, so the Dept_No 200 row is eliminated from the answer set.
The HAVING Clause only works on Aggregate Totals. The WHERE filters rows to be excluded from calculation, but the HAVING filters the Aggregate totals after the calculations, thus eliminating certain Aggregate totals.
Getting the Average Values per Column

SELECT 'Product_ID' AS "Column Name"
      ,CAST(COUNT(*) / COUNT(DISTINCT(Product_ID)) as Decimal (5,2)) AS "Avg Rows"
      ,'Sale_Date' AS "Column Name2"
      ,CAST(COUNT(*) / COUNT(DISTINCT(Sale_Date)) as Decimal (5,2)) AS "Avg Rows2"
FROM Sales_Table ;

Column Name  Avg Rows  Column Name2  Avg Rows2
___________  ________  ____________  _________
Product_ID   7.00      Sale_Date     3.00
The query retrieved the average rows per value for the columns Product_ID and Sale_Date.
GROUP BY Rollup

SELECT Product_ID
      ,EXTRACT (MONTH FROM Sale_Date) AS MTH
      ,EXTRACT (YEAR FROM Sale_Date) AS YR
      ,SUM(Daily_Sales) AS SUM_Daily_Sales
FROM Sales_Table
GROUP BY ROLLUP (Product_ID, MTH, YR)
ORDER BY Product_ID Desc, MTH Desc, YR Desc;

GROUP BY ROLLUP displays the Daily_Sales total for each Product_ID per month per year, for each Product_ID per month, and for each Product_ID, plus a grand total. Each subtotal level drops the rightmost remaining column in the ROLLUP list and shows it as a Null.
GROUP BY Rollup Result Set

Product_ID  MTH   YR    SUM_Daily_Sales
__________  ____  ____  _______________
3000        10    2000  84908.06
3000        10    ?     84908.06
3000        9     2000  139679.76
3000        9     ?     139679.76
3000        ?     ?     224587.82
2000        10    2000  166872.90
2000        10    ?     166872.90
2000        9     2000  139738.91
2000        9     ?     139738.91
2000        ?     ?     306611.81
1000        10    2000  191854.03
1000        10    ?     191854.03
1000        9     2000  139350.69
1000        9     ?     139350.69
1000        ?     ?     331204.72
?           ?     ?     862404.35
This is the full result set from the previous GROUP BY ROLLUP query.
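To see where each level of the result set comes from, the sketch below spells ROLLUP (Product_ID, MTH, YR) out as four separate GROUP BY queries glued together with UNION ALL. This is an emulation, not how Vertica executes ROLLUP. It runs on SQLite through Python's sqlite3 module because SQLite has no ROLLUP keyword, and the four sales rows are hypothetical values invented for the example, not the book's Sales_Table data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales_Table "
    "(Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL)"
)
# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO Sales_Table VALUES (?, ?, ?)",
    [
        (1000, "2000-09-28", 100.0),
        (1000, "2000-10-01", 200.0),
        (2000, "2000-09-28", 300.0),
        (2000, "2000-10-01", 400.0),
    ],
)

# Each SELECT is one ROLLUP level; the dropped columns become NULL.
rollup_sql = """
WITH s AS (
    SELECT Product_ID,
           CAST(strftime('%m', Sale_Date) AS INTEGER) AS MTH,
           CAST(strftime('%Y', Sale_Date) AS INTEGER) AS YR,
           Daily_Sales
    FROM Sales_Table
)
SELECT Product_ID, MTH, YR, SUM(Daily_Sales) FROM s
GROUP BY Product_ID, MTH, YR
UNION ALL
SELECT Product_ID, MTH, NULL, SUM(Daily_Sales) FROM s
GROUP BY Product_ID, MTH
UNION ALL
SELECT Product_ID, NULL, NULL, SUM(Daily_Sales) FROM s
GROUP BY Product_ID
UNION ALL
SELECT NULL, NULL, NULL, SUM(Daily_Sales) FROM s
"""

rows = conn.execute(rollup_sql).fetchall()
for row in rows:
    print(row)
```

With two products and two months in one year, the four levels contribute 4, 4, 2 and 1 rows, the same shape as the 16-row result set above on a smaller scale.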
Chapter 8 – Join Functions
“When spider webs unite they can tie up a lion.” - African Proverb
A Two-Table Join Using Traditional Syntax

Customer_Table

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

SELECT Customer_Table.Customer_Number
      ,Customer_Name
      ,Order_Number
      ,Order_Total
FROM Customer_Table,
     Order_Table
WHERE Customer_Table.Customer_Number = Order_Table.Customer_Number ;

The column Customer_Number is in both tables. It must be fully qualified with the table name or it errors.

Customer_Number is the column that has matching data in both tables. This is called the "Join Condition".
A Join combines columns on the report from more than one table. The example above joins the Customer_Table and the Order_Table together. The most complicated part of any join is the JOIN CONDITION. The JOIN CONDITION is which Column from each table is a match. In this case, Customer_Number is a match that establishes the relationship, so this join will happen on matching Customer_Number columns.
A two-table join using Non-ANSI Syntax with Table Alias

SELECT Cust.Customer_Number
      ,Customer_Name
      ,Order_Number
      ,Order_Total
FROM Customer_Table as Cust,
     Order_Table as ORD
WHERE Cust.Customer_Number = Ord.Customer_Number;

The column Customer_Number is in both tables. It must be fully qualified or it errors.

We alias the table names to shorten the typing when fully qualifying a column.
A Join combines columns on the report from more than one table. The example above joins the Customer_Table and the Order_Table together. The most complicated part of any join is the JOIN CONDITION. The JOIN CONDITION means which Column from each table is a match. In this case, Customer_Number is a match that establishes the relationship.
You Can Fully Qualify All Columns

SELECT Cust.Customer_Number
      ,Cust.Customer_Name
      ,Ord.Order_Number
      ,Ord.Order_Total
FROM Customer_Table as Cust,
     Order_Table as ORD
WHERE Cust.Customer_Number = Ord.Customer_Number ;

The column Customer_Number is in both tables. It must be fully qualified or it errors.

A good practice is to fully qualify all columns in the SELECT list for clarity to other users.
Whenever a column is in both tables, you must fully qualify it when doing a join. You don't have to fully qualify columns that are in only one of the tables because the system knows which table that particular column is in. You can choose to fully qualify every column if you like. This is a good practice because it makes it more apparent which columns belong to which tables for anyone else looking at your SQL.
A two-table join using ANSI Syntax

SELECT Cust.Customer_Name, Order_Number, Order_Total
FROM Customer_Table Cust
INNER JOIN Order_Table ORD
ON Cust.Customer_Number = Ord.Customer_Number ;

Customer_Name        Order_Number  Order_Total
___________________  ____________  ___________
Billy's Best Choice  123456        12347.53
Billy's Best Choice  123512        8005.91
ACE Consulting       123552        5111.47
XYZ Plumbing         123777        23454.84
Databases N-U        123585        15231.62
This is the same join as the previous slide except it is using ANSI syntax. Both will return the same rows with the same performance. Rows are joined when the Customer_Number matches on both tables, but non-matches won’t return.
Both Queries have the same Results and Performance

Traditional Syntax

SELECT Cust.Customer_Number, Customer_Name, Order_Number, Order_Total
FROM Customer_Table as Cust,
     Order_Table as ORD
WHERE Cust.Customer_Number = Ord.Customer_Number ;

ANSI Syntax

SELECT Cust.Customer_Number, Customer_Name, Order_Number, Order_Total
FROM Customer_Table as Cust
INNER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number ;
Both of these syntax techniques bring back the same result set and have the same performance. The INNER JOIN is considered ANSI. Which one does Outer Joins?
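A small sketch can confirm that the two syntaxes return the same rows. It loads the Customer_Table and Order_Table data from the slides and runs both joins through Python's sqlite3 module. SQLite stands in for Vertica here (an assumption), so the sketch says nothing about performance, only about the answer set.

```python
import sqlite3

customers = [
    (11111111, "Billy's Best Choice"), (31313131, "Acme Products"),
    (31323134, "ACE Consulting"), (57896883, "XYZ Plumbing"),
    (87323456, "Databases N-U"),
]
orders = [
    (123456, 11111111, 12347.53), (123512, 11111111, 8005.91),
    (123552, 31323134, 5111.47), (123585, 87323456, 15231.62),
    (123777, 57896883, 23454.84),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT)"
)
conn.execute(
    "CREATE TABLE Order_Table "
    "(Order_Number INTEGER, Customer_Number INTEGER, Order_Total REAL)"
)
conn.executemany("INSERT INTO Customer_Table VALUES (?, ?)", customers)
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", orders)

columns = "Cust.Customer_Number, Customer_Name, Order_Number, Order_Total"

# Traditional syntax: comma-separated tables, join condition in WHERE.
traditional = conn.execute(
    f"SELECT {columns} FROM Customer_Table AS Cust, Order_Table AS Ord "
    "WHERE Cust.Customer_Number = Ord.Customer_Number ORDER BY Order_Number"
).fetchall()

# ANSI syntax: INNER JOIN with the join condition in ON.
ansi = conn.execute(
    f"SELECT {columns} FROM Customer_Table AS Cust "
    "INNER JOIN Order_Table AS Ord "
    "ON Cust.Customer_Number = Ord.Customer_Number ORDER BY Order_Number"
).fetchall()

print(traditional == ansi, len(ansi))
```

Acme Products has no order, so it appears in neither answer set.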
Quiz – Can You Finish the Join Syntax?

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON          <- Finish the Join

Finish this join by placing the missing SQL in the proper place!
Answer to Quiz – Can You Finish the Join Syntax?

SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

This query is ready to run.

Dept_No is the column that both tables have in common. This is called a Primary Key/Foreign Key relationship: Dept_No is the Primary Key of the Department_Table and a Foreign Key in the Employee_Table.
Quiz – Can You Find the Error?

SELECT First_Name
      ,Last_Name
      ,Dept_No
      ,Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

This query has an error! Can you find it?
Answer to Quiz – Can You Find the Error?

The column Dept_No is in both tables. It needs to be fully qualified as E.Dept_No or D.Dept_No

SELECT First_Name
      ,Last_Name
      ,E.Dept_No
      ,Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;
If a column in the SELECT list is in both tables, you must fully qualify it.
Super Quiz – Can You Find the Difficult Error?

SELECT First_Name
      ,Last_Name
      ,E.Dept_No
      ,Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON Employee_Table.Dept_No = D.Dept_No ;

This query has an error! Can you find it?
Answer to Super Quiz – Can You Find the Difficult Error?

SELECT First_Name, Last_Name, E.Dept_No
      ,Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON Employee_Table.Dept_No = D.Dept_No ;

Once you alias a table (as E), you must fully qualify with E.Dept_No (not Employee_Table.Dept_No). This query thinks there are three tables (E, D, and Employee_Table).
If a column in the SELECT list is in both tables, you must fully qualify it.
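Both qualification mistakes from the last two quizzes can be tried out in a short sketch. It uses Python's sqlite3 module, and the behavior shown is SQLite's, not necessarily Vertica's: SQLite rejects both faulty queries outright, whereas the text above describes the engine treating the unaliased name as a third table. The helper try_query is a hypothetical convenience function for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (First_Name TEXT, Dept_No INTEGER)")
conn.execute(
    "CREATE TABLE Department_Table (Dept_No INTEGER, Department_Name TEXT)"
)

def try_query(sql):
    """Hypothetical helper: return 'ok' or the engine's error message."""
    try:
        conn.execute(sql).fetchall()
        return "ok"
    except sqlite3.OperationalError as exc:
        return str(exc)

join = "FROM Employee_Table AS E INNER JOIN Department_Table AS D ON "

# Dept_No is in both tables and is not qualified in the SELECT list.
unqualified = try_query(f"SELECT Dept_No {join} E.Dept_No = D.Dept_No")

# The table was aliased as E, yet the ON clause uses the original name.
wrong_name = try_query(
    f"SELECT E.Dept_No {join} Employee_Table.Dept_No = D.Dept_No"
)

# Every shared column is qualified with the alias.
correct = try_query(f"SELECT E.Dept_No {join} E.Dept_No = D.Dept_No")

print(unqualified)
print(wrong_name)
print(correct)
```

Whatever the engine does with the faulty queries, the fix is the same: qualify shared columns, and use the alias once one exists.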
Quiz – Which rows from both tables won’t return?

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT E.First_Name
      ,E.Last_Name
      ,D.Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

This inner join will return all rows that have a matching Dept_No in both tables. Which rows won't return?
An Inner Join returns matching rows, but did you know an Outer Join returns both matching rows and nonmatching rows? You will understand soon!
Answer to Quiz – Which rows from both tables Won’t Return?

SELECT E.First_Name
      ,E.Last_Name
      ,D.Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

1) Squiggy Jones has a NULL Dept_No
2) Richard Smythe has an invalid Dept_No 10
3) No Employees work in Department 500
The bottom line is that the three rows excluded did not have a matching Dept_No.
LEFT OUTER JOIN

1st Table after FROM is always the LEFT Table

SELECT E.First_Name
      ,D.Department_Name
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

Since we are doing a Left Outer Join, the Employee_Table is referred to as the outer table.

This is a LEFT OUTER JOIN. That means that all rows from the LEFT Table will appear in the report regardless of whether they find a match in the right table.
LEFT OUTER JOIN Results

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT E.First_Name, D.Department_Name
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

First_Name  Department_Name
__________  ________________
Mandee      Marketing
Herbert     Customer Support
William     Customer Support
Loraine     Sales
Squiggy     ?
Richard     ?
Cletus      Customer Support
Billy       Research and Dev
John        Research and Dev

Nulls show mismatches.

The matching rows return just like an inner join, but orphaned rows from the Left table also return.

A LEFT Outer Join returns all rows from the LEFT table, including all matches. If a LEFT row can't find a match, a NULL is placed in the right-table columns.
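The NULLs placed in the right-table columns can be replaced with a readable label when the report is for people rather than for debugging. This is a small sketch using the standard COALESCE function; the label text 'No Department' is an illustrative choice, not something from the book.

```sql
-- Same left outer join, but unmatched employees show a label instead of NULL.
SELECT E.First_Name,
       COALESCE(D.Department_Name, 'No Department') AS Department_Name
FROM Employee_Table AS E
LEFT OUTER JOIN Department_Table AS D
ON E.Dept_No = D.Dept_No;
```

With the sample data, Squiggy and Richard would show 'No Department' while every other employee keeps the real department name.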
RIGHT OUTER JOIN

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

The 2nd table after FROM is always the RIGHT table.

SELECT E.First_Name, D.Department_Name
FROM Employee_Table as E
RIGHT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

Since we are doing a Right Outer Join, the Department_Table is referred to as the outer table.

This is a RIGHT OUTER JOIN. That means that all rows from the RIGHT table will appear in the report, whether or not they find a match in the LEFT table.
RIGHT OUTER JOIN Example and Results

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT E.First_Name, D.Department_Name
FROM Employee_Table as E
RIGHT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

First_Name  Department_Name
__________  ________________
Mandee      Marketing
Herbert     Customer Support
William     Customer Support
Loraine     Sales
Cletus      Customer Support
Billy       Research and Dev
John        Research and Dev
?           Human Resources

Nulls show mismatches.

The matching rows return just like an inner join, but orphaned rows from the Right table also return.

All rows from the Right table were returned. Since Dept_No 500 didn't have a match, the system put a NULL value in the Left table's columns for that row.
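A right outer join can always be rewritten as a left outer join by swapping the order of the two tables. The sketch below, which assumes the same two sample tables, should return the same rows as the RIGHT OUTER JOIN above.

```sql
-- Department_Table is now the first (left) table, so it is the outer table.
SELECT E.First_Name, D.Department_Name
FROM Department_Table AS D
LEFT OUTER JOIN Employee_Table AS E
ON E.Dept_No = D.Dept_No;
```

Many teams standardize on LEFT OUTER JOIN for this reason: the preserved table is always the one that appears first.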
FULL OUTER JOIN

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT E.First_Name, D.Department_Name
FROM Employee_Table as E
FULL OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

Since we are doing a Full Outer Join, both tables are outer tables.

This is a FULL OUTER JOIN. That means that all rows from both the LEFT and RIGHT tables will appear in the report, whether or not they find a match.
FULL OUTER JOIN Results

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT E.First_Name, D.Department_Name
FROM Employee_Table as E
FULL OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No ;

First_Name  Department_Name
__________  ________________
Mandee      Marketing
Herbert     Customer Support
William     Customer Support
Loraine     Sales
Squiggy     ?
Richard     ?
Cletus      Customer Support
Billy       Research and Dev
John        Research and Dev
?           Human Resources

The FULL Outer Join returns all rows from both tables. NULLs show the flaws!

All rows return from both tables on a Full Outer Join.
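Because a full outer join can produce NULLs on either side, it can help to label which side failed to match. The following is a hedged sketch: the Match_Status column and its label text are illustrative, and it assumes Employee_No and Department_Table.Dept_No are never NULL in their own tables, so a NULL there can only come from a failed match.

```sql
-- Label each row of the full outer join by which side (if any) had no match.
SELECT E.First_Name,
       D.Department_Name,
       CASE
           WHEN D.Dept_No IS NULL     THEN 'Employee without department'
           WHEN E.Employee_No IS NULL THEN 'Department without employees'
           ELSE 'Match'
       END AS Match_Status
FROM Employee_Table AS E
FULL OUTER JOIN Department_Table AS D
ON E.Dept_No = D.Dept_No;
```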
Which Tables are the Left and which Tables are Right?

SELECT Cla.Claim_Id,
       Cla.Claim_Date,
       SUB.Last_Name,
       SUB.First_Name,
       "ADD".Phone,
       SER.Service_Pay,
       PRO.Provider_Code,
       PRO.Provider_Name
FROM CLAIMS Cla
LEFT OUTER JOIN PROVIDERS PRO
ON Cla.Provider_No = PRO.Provider_Code
LEFT OUTER JOIN SERVICES SER
ON Cla.Claim_Service = SER.Service_Code
LEFT OUTER JOIN SUBSCRIBERS SUB
ON Cla.Subscriber_No = SUB.Subscriber_No
AND Cla.Member_No = SUB.Member_No
LEFT OUTER JOIN ADDRESSES "ADD"
ON SUB.Subscriber_No = "ADD".Subscriber_No;

Fill in the blank. Is the table a Left Table or a Right Table?

Claims       __________
Providers    __________
Services     __________
Subscribers  __________
Addresses    __________

Can you list which tables above are left tables and which tables are right tables?
Answer – Which Tables are the Left and which are the Right?

SELECT Cla.Claim_Id,
       Cla.Claim_Date,
       SUB.Last_Name,
       SUB.First_Name,
       "ADD".Phone,
       SER.Service_Pay,
       PRO.Provider_Code,
       PRO.Provider_Name
FROM CLAIMS Cla
LEFT OUTER JOIN PROVIDERS PRO
ON Cla.Provider_No = PRO.Provider_Code
LEFT OUTER JOIN SERVICES SER
ON Cla.Claim_Service = SER.Service_Code
LEFT OUTER JOIN SUBSCRIBERS SUB
ON Cla.Subscriber_No = SUB.Subscriber_No
AND Cla.Member_No = SUB.Member_No
LEFT OUTER JOIN ADDRESSES "ADD"
ON SUB.Subscriber_No = "ADD".Subscriber_No;

Fill in the blank. Is the table a Left Table or a Right Table?

Claims       Left
Providers    Right
Services     Right
Subscribers  Right
Addresses    Right

There is always only one Left table (the first table after the FROM clause). Every table after the first is a Right table.

Tables are joined two at a time. The result of each join becomes the Left table for the next join, so the result of joining the first two tables is the Left table when the third table is joined.
INNER JOIN with Additional AND Clause

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table as E, Department_Table as D
WHERE E.Dept_No = D.Dept_No
AND Department_Name like 'Marke%' ;

Because this is an inner join, the additional AND can be applied first to eliminate unwanted data, so the join is less intensive than joining everything first and then eliminating rows that don't qualify. The answer set is the same either way.
ANSI INNER JOIN with Additional AND Clause

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
AND Department_Name like 'Marke%' ;

Because this is an inner join, the additional AND can be applied first to eliminate unwanted data, so the join is less intensive than joining everything first and then eliminating rows afterward.
ANSI INNER JOIN with Additional WHERE Clause

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
WHERE Department_Name like 'Marke%' ;

Because this is an inner join, the additional WHERE can be applied first to eliminate unwanted data, so the join is less intensive than joining everything first and then eliminating rows afterward.
OUTER JOIN with Additional WHERE Clause

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
WHERE E.Dept_No = 100 ;

First_Name  Last_Name  Department_Name
__________  _________  _______________
Mandee      Chambers   Marketing

Only Mandee Chambers is in Dept_No 100.

The additional WHERE clause is performed last on an Outer Join. All rows are joined first, and then the WHERE clause filters the joined result.
OUTER JOIN with Additional AND Clause

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT First_Name, Department_Name AS Dname
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
AND E.Dept_No = 100 ;

On Outer Joins, the additional AND is part of the ON condition. Every row is evaluated against the ON clause and the AND combined; the AND decides which rows match, not which rows are returned.
OUTER JOIN with Additional AND Clause Results

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

OUTER Join with additional AND Clause

SELECT First_Name, Department_Name AS Dname
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
AND E.Dept_No = 100 ;

First_Name  Dname
__________  _________
Mandee      Marketing
Herbert     ?
William     ?
Loraine     ?
Squiggy     ?
Richard     ?
Cletus      ?
Billy       ?
John        ?

On Outer Joins, the additional AND is part of the ON condition. This can surprise you. Only Mandee is in Dept_No 100, so she matched as expected, but an outer join also returns the non-matching rows, so every other employee comes back with a NULL Dname.
Quiz – Why is this considered an INNER JOIN?

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

SELECT First_Name, Department_Name
FROM Employee_Table as E
LEFT OUTER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No
WHERE D.Dept_No = 400 ;

This is considered an INNER JOIN because we are doing a LEFT OUTER JOIN on the Employee_Table and then filtering in the WHERE clause on a column from the right table! Unmatched Employee_Table rows carry a NULL in D.Dept_No, so the WHERE clause throws them away and only matching rows remain. (Had the same condition been written as an AND inside the ON clause, all nine employees would still return.)
Evaluation Order for Outer Joins

SELECT Cou.*, STU1.*
FROM COURSE_TABLE Cou
LEFT OUTER JOIN STUDENT_COURSE_TABLE STU
ON Cou.Course_Id = STU.Course_Id
LEFT OUTER JOIN STUDENT_TABLE STU1
ON STU.Student_Id = STU1.Student_Id;

The order in which Vertica evaluates outer joins:

1. The first ON clause in the query (reading from left to right).
2. Any ON clause applies to its immediately preceding join operation.
3. Parentheses can be used to override the natural left-to-right order.

When you perform an inner join, Vertica considers it to be both commutative and associative. That means the tables can be joined in any order and the end result will be the same, which allows the optimizer to select the best join order between tables. Outer Joins are different: changing the order can change the answer set, so they follow the three rules above for evaluation order.
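To illustrate the third rule, the sketch below groups the second join in parentheses so that it is evaluated before the outer join to COURSE_TABLE. It assumes the database accepts standard parenthesized join syntax and reuses the table names from the query above; switching the inner pair to an INNER JOIN is an illustrative choice made here to show that the grouping changes the meaning.

```sql
-- The parenthesized join runs first: enrollments are paired with real students.
-- Only then is every course outer joined to that combined result.
SELECT Cou.*, STU1.*
FROM COURSE_TABLE Cou
LEFT OUTER JOIN
    (STUDENT_COURSE_TABLE STU
     INNER JOIN STUDENT_TABLE STU1
     ON STU.Student_Id = STU1.Student_Id)
ON Cou.Course_Id = STU.Course_Id;
```

Every course still returns, but student columns are filled in only for enrollments that point to an existing student.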
The DREADED Product Join

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

No Join Condition Linking the Two Tables!

SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table as E, Department_Table as D
WHERE Department_Name like '%m%'
Order by 1, 2, 3;

This query becomes a Product Join because it does not possess any JOIN Conditions (Join Keys). Every row from one table is compared to every row of the other table, and quite often, the data is not what you intended to get back.
The DREADED Product Join Results

No Join Condition Linking the Two Tables!

SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table as E, Department_Table as D
WHERE Department_Name like '%m%'
Order by 1, 2, 3;

First_Name  Last_Name   Department_Name
__________  __________  ________________
Billy       Coffing     Customer Support
Billy       Coffing     Human Resources
Cletus      Strickling  Customer Support
Cletus      Strickling  Human Resources
Herbert     Harrison    Customer Support
Herbert     Harrison    Human Resources

Not all rows are displayed.

How can Billy Coffing work in 2 different departments?

18 rows came back: nine employees, each shown working in two different departments. This data is WRONG!

A Product Join is often a mistake! Two department rows had an 'm' in their name, so these were joined to every employee, and the information is worthless.
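The usual cure for an accidental product join is to add the missing join condition. The sketch below is one way to repair the query above, assuming Dept_No is the intended join key as in the earlier examples in this chapter.

```sql
-- Same query with the join key restored, so each employee pairs only
-- with his or her own department before the LIKE filter is applied.
SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table AS E, Department_Table AS D
WHERE E.Dept_No = D.Dept_No
AND Department_Name LIKE '%m%'
ORDER BY 1, 2, 3;
```

Assuming a case-sensitive LIKE (which the 18-row result above implies), only the Customer Support employees should come back, because Human Resources has no employees.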
The Horrifying Cartesian Product Join

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

No WHERE Clause in the join!

SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table as E, Department_Table as D ;

A Cartesian Product Join is usually a big mistake.

This joins every row from one table to every row of another table. 9 rows multiplied by 5 rows = 45 rows of complete nonsense!
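The row arithmetic can be confirmed with a count before anyone runs the full query. This is a minimal sketch against the two sample tables; the Row_Count alias is just an illustrative name.

```sql
-- No join condition, so every employee pairs with every department.
SELECT COUNT(*) AS Row_Count
FROM Employee_Table AS E, Department_Table AS D;
-- With 9 employees and 5 departments, Row_Count is 9 * 5 = 45.
```

Running a COUNT(*) first is a cheap safety check on large tables, where an accidental Cartesian product can return billions of rows.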
The ANSI Cartesian Join will ERROR

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

No ON Clause in the join!

SELECT First_Name, Last_Name, Department_Name
FROM Employee_Table as E
INNER JOIN Department_Table as D ;

Error

This query errors because the ANSI INNER JOIN syntax forbids joins without an ON clause. ANSI won't let this run unless a join condition is present.
Quiz – Do these Joins Return the Same Answer Set?

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

Query 1
SELECT First_Name, Department_Name
FROM Employee_Table
INNER JOIN Department_Table ;

Query 2
SELECT First_Name, Department_Name
FROM Employee_Table, Department_Table ;

Do these two queries produce the same result?
Answer – Do these Joins Return the Same Answer Set?

Employee_Table
Employee_No  Dept_No  Last_Name   First_Name  Salary
___________  _______  __________  __________  ________
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table
Dept_No  Department_Name
_______  ________________
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

Query 1 (this query errors)
SELECT First_Name, Department_Name
FROM Employee_Table
INNER JOIN Department_Table ;

Query 2 (a Cartesian product join occurs)
SELECT First_Name, Department_Name
FROM Employee_Table, Department_Table ;

Do these two queries produce the same result? No. Query 1 errors because it uses ANSI syntax with no ON clause, but Query 2 runs as a Product Join and brings back junk!
The CROSS JOIN

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy's Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

A Cross Join is the ANSI equivalent to a Product Join.

Only a WHERE will work. ON will NOT!

SELECT Customer_Name, Order_Number
FROM Customer_Table
CROSS JOIN Order_Table
WHERE Order_Number = 123456
ORDER BY 1 ;

This query becomes a Product Join because a Cross Join is an ANSI Product Join. It will compare every row from the Customer_Table to Order_Number 123456 in the Order_Table. Check out the Answer Set on the next page.
The CROSS JOIN Answer Set

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy's Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

SELECT Customer_Name, Order_Number
FROM Customer_Table
CROSS JOIN Order_Table
WHERE Order_Number = 123456
ORDER BY 1 ;

Answer Set

Customer_Name        Order_Number
___________________  ____________
Billy's Best Choice  123456
Acme Products        123456
ACE Consulting       123456
XYZ Plumbing         123456
Databases N-U        123456

Quite often, a Cross Join like this produces information that just isn't worth anything! Every customer is paired with order 123456, even though only one customer placed it.
The Self Join

Employee_Table2
Employee_No  Dept_No  Last_Name   First_Name  Salary    Mgr
___________  _______  __________  __________  ________  ___
1232578      100      Chambers    Mandee      48850.00  Y
1256349      400      Harrison    Herbert     54500.00  N
2341218      400      Reilly      William     36000.00  Y
1121334      400      Strickling  Cletus      54500.00  N
2312225      300      Larkins     Loraine     40200.00  Y
2000000      ?        Jones       Squiggy     32800.50  N
1000234      10       Smythe      Richard     32800.00  N
1324657      200      Coffing     Billy       41888.88  N
1333454      200      Smith       John        48000.00  Y

Which Workers make a bigger Salary than their Manager?

SELECT Mgrs.Dept_No
      ,Mgrs.Last_Name as MgrName
      ,Mgrs.Salary as MgrSal
      ,Emps.Last_Name as EmpName
      ,Emps.Salary as Empsal
FROM Employee_Table2 as Emps, Employee_Table2 as Mgrs
WHERE Emps.Dept_No = Mgrs.Dept_No
AND Mgrs.Mgr = 'Y'
AND Emps.Salary > Mgrs.Salary ;

A Self Join gives the same table 2 different aliases, so it is then seen as two different tables.
The Self Join with ANSI Syntax

Employee_Table2
Employee_No  Dept_No  Last_Name   First_Name  Salary    Mgr
___________  _______  __________  __________  ________  ___
1232578      100      Chambers    Mandee      48850.00  Y
1256349      400      Harrison    Herbert     54500.00  N
2341218      400      Reilly      William     36000.00  Y
1121334      400      Strickling  Cletus      54500.00  N
2312225      300      Larkins     Loraine     40200.00  Y
2000000      ?        Jones       Squiggy     32800.50  N
1000234      10       Smythe      Richard     32800.00  N
1324657      200      Coffing     Billy       41888.88  N
1333454      200      Smith       John        48000.00  Y

Which Workers make a bigger Salary than their Manager?

SELECT Mgrs.Dept_No
      ,Mgrs.Last_Name as MgrName
      ,Mgrs.Salary as MgrSal
      ,Emps.Last_Name as EmpName
      ,Emps.Salary as Empsal
FROM Employee_Table2 as Emps
INNER JOIN Employee_Table2 as Mgrs
ON Emps.Dept_No = Mgrs.Dept_No
WHERE Mgrs.Mgr = 'Y'
AND Emps.Salary > Mgrs.Salary ;

A Self Join gives the same table 2 different aliases, so it is then seen as two different tables.
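The same two-alias technique works with an outer join. The sketch below is a variation, not the book's query: it lists every non-manager beside the manager of his or her department, and keeps workers whose department has no manager row. It assumes Employee_Table2 has the Mgr column with 'Y'/'N' values as shown above.

```sql
-- Emps is the outer table, so every non-manager returns.
-- The Mgr = 'Y' test sits in the ON clause so that it limits which
-- rows can match, without discarding workers who have no manager.
SELECT Emps.Last_Name AS EmpName,
       Emps.Dept_No,
       Mgrs.Last_Name AS MgrName
FROM Employee_Table2 AS Emps
LEFT OUTER JOIN Employee_Table2 AS Mgrs
ON Emps.Dept_No = Mgrs.Dept_No
AND Mgrs.Mgr = 'Y'
WHERE Emps.Mgr = 'N';
```

Workers in a department with no manager row (or with a NULL Dept_No) would come back with a NULL MgrName.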
Quiz – Will both queries bring back the same Answer Set?

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy's Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

SELECT *
FROM Customer_Table as Cust
INNER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
WHERE Customer_Name like 'Billy%'
ORDER BY 1;

SELECT *
FROM Customer_Table as Cust
INNER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
AND Customer_Name like 'Billy%'
ORDER BY 1;

Will both queries bring back the same result set?
Answer – Will both queries bring back the same Answer Set?

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy's Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

SELECT *
FROM Customer_Table as Cust
INNER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
WHERE Customer_Name like 'Billy%'
ORDER BY 1;

SELECT *
FROM Customer_Table as Cust
INNER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
AND Customer_Name like 'Billy%'
ORDER BY 1;

Will both queries bring back the same result set? Yes! Because they're both inner joins, the filter gives the same answer whether it sits in the WHERE clause or in the ON clause.
Quiz – Will both queries bring back the same Answer Set?

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy's Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

SELECT *
FROM Customer_Table as Cust
LEFT OUTER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
WHERE Customer_Name like 'Billy%'
ORDER BY 1;

SELECT *
FROM Customer_Table as Cust
LEFT OUTER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
AND Customer_Name like 'Billy%'
ORDER BY 1;

Will both queries bring back the same result set?
Answer – Will both queries bring back the same Answer Set?

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy's Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

SELECT *
FROM Customer_Table as Cust
LEFT OUTER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
WHERE Customer_Name like 'Billy%'
ORDER BY 1;

SELECT *
FROM Customer_Table as Cust
LEFT OUTER JOIN Order_Table as ORD
ON Cust.Customer_Number = Ord.Customer_Number
AND Customer_Name like 'Billy%'
ORDER BY 1;

Will both queries bring back the same result set? NO! In the first query the WHERE clause is performed last, after the join, so only the rows for customers whose name starts with 'Billy' return. In the second query the AND is part of the ON condition, so every customer returns, and only Billy's rows carry order data.
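Because the WHERE clause runs after the outer join, it can also be used on purpose to keep only the non-matching rows. The following sketch finds customers who have never placed an order; it assumes Order_Number is never NULL in Order_Table, so a NULL there can only come from a failed match.

```sql
-- Keep every customer, then filter down to the ones with no matching order.
SELECT Cust.Customer_Number, Cust.Customer_Name
FROM Customer_Table AS Cust
LEFT OUTER JOIN Order_Table AS Ord
ON Cust.Customer_Number = Ord.Customer_Number
WHERE Ord.Order_Number IS NULL;
```

With the sample data, only customer 31313131 (Acme Products) has no row in Order_Table, so that is the one row expected back.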
How would you join these two tables?

Course_Table
Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        V2R3 SQL Features         2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

Student_Table
Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
423400      Larkins    Michael     FR          0.00
231222      Wilson     Susie       SO          3.80
280023      McRoberts  Richard     JR          1.90
322133      Bond       Jimmy       JR          3.95
125634      Hanson     Henry       FR          2.88
333450      Smith      Andy        SO          2.00
324652      Delaney    Danny       SR          3.35
260000      Johnson    Stanley     ?           ?
234121      Thomas     Wendy       FR          4.00
123250      Phillips   Martin      SR          3.00

How would you join these two tables together? You can't do it. There is no matching column with like data. There is no Primary Key/Foreign Key relationship between these two tables. That is why you are about to be introduced to a bridge table. It is formally called an Associative table or a Lookup table.
An Associative Table is a Bridge that Joins Two Tables

Course_Table
Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        V2R3 SQL Features         2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

Student_Course_Table (Associative Table)
Student_ID  Course_ID
__________  _________
280023      210
231222      210
125634      100
231222      220
125634      200
322133      220
125634      220
322133      300
324652      200
333450      500
260000      400
333450      400
234121      100
123250      100

Student_Table
Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
423400      Larkins    Michael     FR          0.00
231222      Wilson     Susie       SO          3.80
280023      McRoberts  Richard     JR          1.90
322133      Bond       Jimmy       JR          3.95
125634      Hanson     Henry       FR          2.88
333450      Smith      Andy        SO          2.00
324652      Delaney    Danny       SR          3.35
260000      Johnson    Stanley     ?           ?
234121      Thomas     Wendy       FR          4.00
123250      Phillips   Martin      SR          3.00

The Associative Table is a bridge between the Course_Table and Student_Table.
Quiz – Can you write the 3-Table Join?

Course_Table
Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        V2R3 SQL Features         2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

Student_Course_Table (Associative Table)
Student_ID  Course_ID
__________  _________
280023      210
231222      210
125634      100
231222      220
125634      200
322133      220
125634      220
322133      300
324652      200
333450      500
260000      400
333450      400
234121      100
123250      100

Student_Table
Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
__________  _________  __________  __________  ________
423400      Larkins    Michael     FR          0.00
231222      Wilson     Susie       SO          3.80
280023      McRoberts  Richard     JR          1.90
322133      Bond       Jimmy       JR          3.95
125634      Hanson     Henry       FR          2.88
333450      Smith      Andy        SO          2.00
324652      Delaney    Danny       SR          3.35
260000      Johnson    Stanley     ?           ?
234121      Thomas     Wendy       FR          4.00
123250      Phillips   Martin      SR          3.00

SELECT ALL columns from the Course_Table and Student_Table and join them.
Answer to Quiz – Can you Write the 3-Table Join?

Student_Table (Student_ID, Last_Name, First_Name, Class_Code, Grade_Pt)
Student_Course_Table (Student_ID, Course_ID)
Course_Table (Course_ID, Course_Name, Credits, Seats)

SELECT S.*, C.*
FROM Student_Table AS S,
     Course_Table AS C,
     Student_Course_Table AS SC
WHERE S.Student_ID = SC.Student_ID
AND   C.Course_ID = SC.Course_ID ;

Notice the * technique of getting ALL columns from both tables!

The Associative Table is a bridge between the Course_Table and Student_Table, and its sole purpose is to join these two tables together.
Quiz – Can you write the 3-Table Join to ANSI Syntax?

SELECT S.*, C.*
FROM Student_Table AS S,
     Course_Table AS C,
     Student_Course_Table AS SC
WHERE S.Student_ID = SC.Student_ID
AND   C.Course_ID = SC.Course_ID ;

Please re-write the above query using ANSI syntax.
Answer – Can you write the 3-Table Join to ANSI Syntax?

Traditional Syntax

SELECT S.*, C.*
FROM Student_Table AS S,
     Course_Table AS C,
     Student_Course_Table AS SC
WHERE S.Student_ID = SC.Student_ID
AND   C.Course_ID = SC.Course_ID ;

ANSI Syntax

SELECT S.*, C.*
FROM Student_Table AS S
INNER JOIN Student_Course_Table AS SC
ON S.Student_ID = SC.Student_ID
INNER JOIN Course_Table AS C
ON C.Course_ID = SC.Course_ID ;

The above queries show both the traditional and the ANSI form of this three-table join.
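To check that the two forms really return the same rows, here is a minimal sketch that runs both against an in-memory SQLite database from Python. SQLite stands in for Vertica only so that the example runs anywhere; the cut-down tables (two columns each, three rows each) are an assumption made for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for Vertica (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student_Table (Student_ID INTEGER, Last_Name TEXT);
CREATE TABLE Course_Table (Course_ID INTEGER, Course_Name TEXT);
CREATE TABLE Student_Course_Table (Student_ID INTEGER, Course_ID INTEGER);
INSERT INTO Student_Table VALUES
    (231222, 'Wilson'), (280023, 'McRoberts'), (423400, 'Larkins');
INSERT INTO Course_Table VALUES
    (210, 'Advanced SQL'), (220, 'V2R3 SQL Features'), (300, 'Physical Database Design');
INSERT INTO Student_Course_Table VALUES
    (280023, 210), (231222, 210), (231222, 220);
""")

# Traditional syntax: tables listed in FROM, join conditions in WHERE.
TRADITIONAL = """
SELECT S.*, C.*
FROM Student_Table AS S, Course_Table AS C, Student_Course_Table AS SC
WHERE S.Student_ID = SC.Student_ID
AND   C.Course_ID = SC.Course_ID
ORDER BY S.Student_ID, C.Course_ID
"""

# ANSI syntax: each INNER JOIN carries its own ON clause.
ANSI = """
SELECT S.*, C.*
FROM Student_Table AS S
INNER JOIN Student_Course_Table AS SC ON S.Student_ID = SC.Student_ID
INNER JOIN Course_Table AS C ON C.Course_ID = SC.Course_ID
ORDER BY S.Student_ID, C.Course_ID
"""

traditional_rows = conn.execute(TRADITIONAL).fetchall()
ansi_rows = conn.execute(ANSI).fetchall()
```

Larkins has no row in the bridge table and course 300 has no students, so neither shows up in either result. An inner join through the Associative Table only returns students and courses that are paired there.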
Quiz – Can you Place the ON Clauses at the End?

ANSI Syntax

SELECT S.*, C.*
FROM Student_Table AS S
INNER JOIN Student_Course_Table AS SC
ON S.Student_ID = SC.Student_ID
INNER JOIN Course_Table AS C
ON C.Course_ID = SC.Course_ID ;

Please re-write the above query and place both ON clauses at the end.
Answer – Can you Place the ON Clauses at the End?

SELECT S.*, C.*
FROM Student_Table AS S
INNER JOIN Student_Course_Table AS SC
INNER JOIN Course_Table AS C
ON C.Course_ID = SC.Course_ID
ON SC.Student_ID = S.Student_ID ;

This is tricky. The only way it works is to place the ON clauses backwards. The first ON clause belongs to the last INNER JOIN, and each following ON clause moves back one join.
The 5-Table Join – Logical Insurance Model

[Figure: logical model of the five insurance tables. The join columns shown in the diagram are:]
Addresses.Subscriber_No = Subscribers.Subscriber_No
Subscribers.Subscriber_No = Claims.Subscriber_No
Subscribers.Member_No = Claims.Member_No
Services.Service_Code = Claims.Claim_Service
Providers.Provider_Code = Claims.Provider_No

Above is the logical model for the insurance tables showing the Primary Key and Foreign Key relationships (PK/FK).
Quiz - Write a Five Table Join Using ANSI Syntax

Using the logical insurance model above (Addresses, Subscribers, Claims, Services, and Providers), your mission is to write a five-table join selecting all columns using ANSI syntax.
Answer - Write a Five Table Join Using ANSI Syntax

SELECT cla1.*, sub1.*, add1.*, pro1.*, ser1.*
FROM CLAIMS AS cla1
INNER JOIN SUBSCRIBERS AS sub1
ON  cla1.Subscriber_No = sub1.Subscriber_No
AND cla1.Member_No = sub1.Member_No
INNER JOIN ADDRESSES AS add1
ON  sub1.Subscriber_No = add1.Subscriber_No
INNER JOIN PROVIDERS AS pro1
ON  cla1.Provider_No = pro1.Provider_Code
INNER JOIN SERVICES AS ser1
ON  cla1.Claim_Service = ser1.Service_Code ;

Above is the five-table join written using ANSI syntax.
Quiz - Write a Five Table Join Using Non-ANSI Syntax

Using the same logical insurance model (Addresses, Subscribers, Claims, Services, and Providers), your mission is to write a five-table join selecting all columns using Non-ANSI syntax.
Answer - Write a Five Table Join Using Non-ANSI Syntax

SELECT cla1.*, sub1.*, add1.*, pro1.*, ser1.*
FROM  CLAIMS AS cla1, SUBSCRIBERS AS sub1, ADDRESSES AS add1,
      PROVIDERS AS pro1, SERVICES AS ser1
WHERE cla1.Subscriber_No = sub1.Subscriber_No
AND   cla1.Member_No = sub1.Member_No
AND   sub1.Subscriber_No = add1.Subscriber_No
AND   cla1.Provider_No = pro1.Provider_Code
AND   cla1.Claim_Service = ser1.Service_Code ;

Above is the five-table join written using Non-ANSI syntax.
Quiz – Re-Write this putting the ON clauses at the END

SELECT cla1.*, sub1.*, add1.*, pro1.*, ser1.*
FROM CLAIMS AS cla1
INNER JOIN SUBSCRIBERS AS sub1
ON  cla1.Subscriber_No = sub1.Subscriber_No
AND cla1.Member_No = sub1.Member_No
INNER JOIN ADDRESSES AS add1
ON  sub1.Subscriber_No = add1.Subscriber_No
INNER JOIN PROVIDERS AS pro1
ON  cla1.Provider_No = pro1.Provider_Code
INNER JOIN SERVICES AS ser1
ON  cla1.Claim_Service = ser1.Service_Code ;

Above is the five-table join written using ANSI syntax. Re-write it so that all of the ON clauses come at the end.
Answer – Re-Write this putting the ON clauses at the END

SELECT cla1.*, sub1.*, add1.*, pro1.*, ser1.*
FROM PROVIDERS AS pro1
INNER JOIN ADDRESSES AS add1
INNER JOIN SUBSCRIBERS AS sub1
INNER JOIN SERVICES AS ser1
INNER JOIN CLAIMS AS cla1
ON  cla1.Claim_Service = ser1.Service_Code
ON  cla1.Subscriber_No = sub1.Subscriber_No
AND cla1.Member_No = sub1.Member_No
ON  sub1.Subscriber_No = add1.Subscriber_No
ON  cla1.Provider_No = pro1.Provider_Code ;

Above is the five-table join written using ANSI syntax with the ON clauses at the end. We also had to move the tables around to make this happen. Notice that the first ON clause represents the last two tables being joined, and then it works backwards.
Chapter 9 – Date Functions
"An inch of time cannot be bought with an inch of gold." - Chinese Proverb
Current_Date

SELECT Current_Date AS ANSI_Date ;

ANSI_Date
__________
10/11/2015

The Current_Date will return today's date.
Current_Date, Current_Time and Current_Timestamp

SELECT Current_Date AS ANSI_Date
      ,Current_Time AS ANSI_Time
      ,Current_Timestamp(6) AS ANSI_Timestamp ;

ANSI_Date   ANSI_Time  ANSI_Timestamp
__________  _________  __________________________
10/11/2015  12:53:26   10/11/2015 12:53:26.423126

The timestamp is the date, then a space, then the time. The (6) requests six fractional-second digits (microseconds).

Above are the keywords you can utilize to get the date, time or timestamp. These are reserved words that the system will deliver to you when requested.
Timestamp Differences

SELECT Current_Timestamp(0) AS Col1
      ,Current_Timestamp(6) AS Col2 ;

Col1                 Col2
___________________  __________________________
2014/03/22 10:34:44  2011/03/22 10:34:44.123456

A timestamp is the date, then a space, then the time. In the second column we asked for a precision of 6, which returns six fractional-second digits (microseconds).
Getdate

This example uses the Getdate() function to return the timestamp.

SELECT Getdate() as "The Date";

The Date
______________________
03/30/2015 8:46:04.567

“Not all those who wander are lost.” – J. R. R. Tolkien

The Getdate command returns today's date and time, just like the Current_Timestamp command. Getdate is not ANSI.
Date and Time Keywords

SELECT GETDATE() AS 'GETDATE'
     , CURRENT_TIMESTAMP AS 'CURRENT_TIMESTAMP'
     , GETUTCDATE() AS 'GETUTCDATE'

GETDATE                   CURRENT_TIMESTAMP         GETUTCDATE
________________________  ________________________  ________________________
03/30/2015 8:42:04.83352  03/30/2015 8:42:04.83352  03/30/2015 1:42:04.83352
(Date and Time)           (Date and Time, ANSI)     (Date and Time, UTC)

The above example shows another way to get the date and time. GETDATE and CURRENT_TIMESTAMP are equivalent, but CURRENT_TIMESTAMP is ANSI compliant.
Using CAST in Literal Values

SELECT CAST('20150216' AS DATE) as "Date YMD";

Date YMD
__________
2015-02-16

This is an example of using the CAST function to turn a literal value into a date.
Add or Subtract Days from a date

SELECT Order_Date
      ,Order_Date + 60 as "Due Date"
      ,Order_Total
      ,Order_Date + 50 as Disc_Date
      ,Order_Total * .98 as Disc_Amt
FROM Order_Table
ORDER BY 1 ;

Order_Date  Due Date    Order_Total  Disc_Date   Disc_Amt
__________  __________  ___________  __________  ________
05/04/1998  07/03/1998  12347.53     06/23/1998  12100.58
01/01/1999  03/02/1999  8005.91      02/20/1999  7845.79
09/09/1999  11/08/1999  23454.84     10/29/1999  22985.74
10/01/1999  11/30/1999  5111.47      11/20/1999  5009.24
10/10/1999  12/09/1999  15231.62     11/29/1999  14926.99

When you add to or subtract from a Date, you are adding or subtracting Days.

Because Dates are stored internally on disk as integers, it is easy to add days to the calendar. In the query above we are adding 60 days to the Order_Date.
Formatting Dates

HH - Hour of day (00-23)
HH12 - Hour of day (01-12)
HH24 - Hour of day (00-23)
MI - Minute (00-59)
SS - Second (00-59)
MS - Millisecond (000-999)
US - Microsecond (000000-999999)
SSSS - Seconds past midnight (0-86399)
AM or A.M. or PM or P.M. - Meridian indicator (uppercase)
am or a.m. or pm or p.m. - Meridian indicator (lowercase)
Y,YYY - Year with comma
YYYY - Year (4 and more digits)
YYY - Last 3 digits of year
YY - Last 2 digits of year
Y - Last digit of year
IYYY - ISO year (4 and more digits)
IYY - Last 3 digits of ISO year
IY - Last 2 digits of ISO year
I - Last digit of ISO year
BC or B.C. or AD or A.D. - Era indicator (uppercase)
bc or b.c. or ad or a.d. - Era indicator (lowercase)
MONTH - Full uppercase month name
Month - Full mixed-case month name
month - Full lowercase month name
MON - Uppercase month (3 chars)
Mon - Mixed-case month (3 chars)
mon - Lowercase month (3 chars)
MM - Month number (01-12)
DAY - Full uppercase day
Day - Full mixed-case day
day - Full lowercase day
DY - Uppercase day (3 chars)
Dy - Mixed-case day (3 chars)
dy - Lowercase day (3 chars)
DDD - Day of year (001-366)
DD - Day of month (01-31) for TIMESTAMP
D - Day of week (1-7), Sunday = 1
W - Week of month (1-5)
WW - Week number of year (1-53)
IW - ISO week number of year
CC - Century (2 digits)
J - Julian Day (days since Jan 1, 4712 BC)
Q - Quarter
RM - Month in Roman numerals (uppercase)
rm - Month in Roman numerals (lowercase)
TZ - Time-zone name (uppercase)
tz - Time-zone name (lowercase)

Vertica gives you many options for formatting dates. The next page will show an example.
Formatting Date Example

SELECT Order_Date
      ,TO_CHAR(Order_Date, 'YY-MM-DD') AS YMD
      ,TO_CHAR(Order_Date, 'MON, DD, YYYY') AS Month
      ,TO_CHAR(Order_Date, 'D, Mon DD, YY') AS DayofWeek
      ,Current_Time as Time
      ,TO_CHAR(Current_Time, 'HH24:MI:SS:MS') AS Micro
FROM Order_Table
WHERE EXTRACT(Year from Order_Date) = 1998 ;

Order_Date  YMD       Month          DayofWeek      Time      Micro
__________  ________  _____________  _____________  ________  ____________
05/04/1998  98-05-04  MAY, 04, 1998  2, May 04, 98  10:09:58  10:09:58:109

Above you can see an example of formatted dates using the TO_CHAR command.
A Summary of Math Operations on Dates

1. DATE - DATE = Interval (days between dates)
2. DATE + or - Integer = Date

SELECT Order_Number
      ,Order_Total
      ,Order_Date
      ,Order_Date - 365 as Last_Year
      ,Current_Date - Order_Date as Days_Between
FROM Order_Table

Order_Number  Order_Total  Order_Date  Last_Year   Days_Between
____________  ___________  __________  __________  ____________
123456        12347.53     05/04/1998  05/04/1997  6221
123512        8005.91      01/01/1999  01/01/1998  5979
123552        5111.47      10/01/1999  10/01/1998  5706
123585        15231.62     10/10/1999  10/10/1998  5697
123777        23454.84     09/09/1999  09/09/1998  5728

A DATE – DATE is an interval of days between dates. A DATE + or – Integer = Date.
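The two rules can be checked outside the database. The sketch below uses Python's datetime module as a stand-in for the SQL date arithmetic. The run date of 05/16/2015 is an assumption, inferred from the Days_Between value of 6221 in the first result row.

```python
from datetime import date, timedelta

order_date = date(1998, 5, 4)

# DATE + integer = DATE: adding 60 means adding 60 days.
due_date = order_date + timedelta(days=60)

# DATE - integer = DATE: 365 days earlier. This lands on the same calendar
# day one year back only when no February 29 falls inside the span.
last_year = order_date - timedelta(days=365)

# DATE - DATE = the number of days between the two dates.
# The run date below is an assumed value, not taken from the book.
assumed_run_date = date(2015, 5, 16)
days_between = (assumed_run_date - order_date).days
```

Subtracting 365 is a day count, not a calendar year, so a span that crosses a leap day comes back one day off from the same date last year.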
The ADD_MONTHS Command

Order_Table
Order_Number  Customer_Number  Order_Date  Order_Total
____________  _______________  __________  ___________
123456        11111111         05/04/1998  12347.53
123512        11111111         01/01/1999  8005.91
123552        31323134         10/01/1999  5111.47
123585        87323456         10/10/1999  15231.62
123777        57896883         09/09/1999  23454.84

SELECT Order_Date
      ,Add_Months (Order_Date, 2) as "Due Date2"
      ,Order_Total
FROM Order_Table
ORDER BY 1 ;

Order_Date  Due Date2   Order_Total
__________  __________  ___________
05/04/1998  07/04/1998  12347.53
01/01/1999  03/01/1999  8005.91
09/09/1999  11/09/1999  23454.84
10/01/1999  12/01/1999  5111.47
10/10/1999  12/10/1999  15231.62

This is the Add_Months command. It adds one month or many months to your date columns. Can you convert this to one year? There is no ADD_YEAR command!
Using the ADD_MONTHS Command to Add 1 Year

SELECT Order_Date
      ,Add_Months (Order_Date, 12) as "Due Date12"
      ,Order_Total
FROM Order_Table
ORDER BY 1 ;

Order_Date  Due Date12  Order_Total
__________  __________  ___________
05/04/1998  05/04/1999  12347.53
01/01/1999  01/01/2000  8005.91
09/09/1999  09/09/2000  23454.84
10/01/1999  10/01/2000  5111.47
10/10/1999  10/10/2000  15231.62

The Add_Months command adds months to any date. Above we added 12 months to get 1 year. Can you give me 5 years?
Using the ADD_MONTHS Command to Add 5 Years

SELECT Order_Date
      ,Add_Months (Order_Date, 12 * 5) as "Due Date"
      ,Order_Total
FROM Order_Table
WHERE EXTRACT(Month from Order_Date) = 9 ;

Order_Date  Due Date    Order_Total
__________  __________  ___________
09/09/1999  09/09/2004  23454.84

Above you see a great technique for adding multiple years to a date.
The EXTRACT Command

SELECT Order_Date
      ,Add_Months (Order_Date, 12 * 5) as "Due Date"
      ,Order_Total
FROM Order_Table
WHERE EXTRACT(Month from Order_Date) = 9 ;

Order_Date  Due Date    Order_Total
__________  __________  ___________
09/09/1999  09/09/2004  23454.84

This is the Extract command. It returns a date part, such as a day, month or year, from a timestamp value or expression. It can be used in the SELECT list, the WHERE Clause, or the ORDER BY Clause!
YEAR, MONTH, and DAY Functions

SELECT Order_Date
      ,Year(Order_Date) as "Yr"
      ,Month(Order_Date) as "Mo"
      ,Day(Order_Date) as "Day"
FROM Order_Table
ORDER BY 1 ;

Order_Date  Yr    Mo  Day
__________  ____  __  ___
1998-05-04  1998  5   4
1999-01-01  1999  1   1
1999-09-09  1999  9   9
1999-10-01  1999  10  1
1999-10-10  1999  10  10

The YEAR, MONTH and DAY functions are abbreviations for the DATEPART function.
A Better Technique for YEAR, MONTH, and DAY Functions

SELECT Order_Number, Customer_Number, Order_Date, Order_Total
FROM Order_Table
WHERE YEAR(order_date) = 1999
AND   MONTH(order_date) = 10;

SELECT Order_Number, Customer_Number, Order_Date, Order_Total
FROM Order_Table
WHERE order_date >= '19991001'
AND   order_date < '19991101'

This second approach is more efficient for Vertica. Indexes can take advantage of this technique!

Both queries above do the same thing and deliver the same result set, but the bottom query could be much faster.

Order_Number  Customer_Number  Order_Date  Order_Total
____________  _______________  __________  ___________
123552        31323134         1999-10-01  5111.47
123585        87323456         1999-10-10  15231.62

Above is a tale of two queries. The top query applies a function to the filtered column, and in most cases Vertica can't use an index efficiently with that technique. The bottom query uses a range filter on the bare column instead. Brilliant!
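The equivalence of the two filters can be checked with a small sketch. It runs against an in-memory SQLite database from Python, so it demonstrates only that the results match, not the speed difference. SQLite's strftime() stands in for the YEAR and MONTH functions, and the dates are stored as ISO text; both are assumptions of this illustration rather than Vertica behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Order_Table (Order_Number INTEGER, Order_Date TEXT, Order_Total REAL)"
)
conn.executemany(
    "INSERT INTO Order_Table VALUES (?, ?, ?)",
    [
        (123456, "1998-05-04", 12347.53),
        (123512, "1999-01-01", 8005.91),
        (123552, "1999-10-01", 5111.47),
        (123585, "1999-10-10", 15231.62),
        (123777, "1999-09-09", 23454.84),
    ],
)

# Filter 1: a function is applied to the column, so it must be evaluated per row.
by_function = conn.execute(
    "SELECT Order_Number FROM Order_Table "
    "WHERE strftime('%Y', Order_Date) = '1999' AND strftime('%m', Order_Date) = '10' "
    "ORDER BY Order_Number"
).fetchall()

# Filter 2: a half-open range on the bare column (>= first day, < first day of next month).
by_range = conn.execute(
    "SELECT Order_Number FROM Order_Table "
    "WHERE Order_Date >= '1999-10-01' AND Order_Date < '1999-11-01' "
    "ORDER BY Order_Number"
).fetchall()
```

The upper bound is written as "less than the first day of the next month" so the filter never needs to know how many days the month has.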
Another Version of the EXTRACT Command

The EXTRACT command extracts portions of Date, Time, and Timestamp.

SELECT Order_Date
      ,Add_Months (Order_Date, 12 * 5) as "Due Date"
      ,Order_Total
FROM Order_Table
WHERE EXTRACT(Month from Order_Date) = 9 ;

Below is another version of the Extract command.

SELECT Order_Date
      ,Add_Months (Order_Date, 12 * 5) as "Due Date"
      ,Order_Total
FROM Order_Table
WHERE Month (Order_Date) = 9 ;

Order_Date  Due Date    Order_Total
__________  __________  ___________
09/09/1999  09/09/2004  23454.84

Both examples above are equivalent, but beware! The EXTRACT command is the better form because it also works on Day, Year, Hour, Minute and Second. The bottom example won't work with all of them.
EXTRACT from DATES and TIME

SELECT Current_Date as Date
      ,EXTRACT(Year from Current_Date) as Yr
      ,EXTRACT(Month from Current_Date) as Mo
      ,EXTRACT(Day from Current_Date) as Da
      ,Current_Time as Time
      ,EXTRACT(Hour from Current_Time) as Hr
      ,EXTRACT(Minute from Current_Time) as Mn
      ,EXTRACT(Second from Current_Time) as Sc
      ,EXTRACT(TIMEZONE_HOUR from Current_Time) as Th
      ,EXTRACT(TIMEZONE_MINUTE from Current_Time) as Tm ;

Answer Set
Date        Yr    Mo  Da  Time      Hr  Mn  Sc         Th  Tm
__________  ____  __  __  ________  __  __  _________  __  __
05/16/2015  2015  5   16  07:51:51  7   51  51.444636  -4  0

Just like Add_Months, the EXTRACT command is a Temporal Function, or a Time-Based Function.
Why EXTRACT is a Better Form

SELECT Current_Date as Date
      ,EXTRACT(Year from Current_Date) as Yr
      ,EXTRACT(Month from Current_Date) as Mo
      ,EXTRACT(Day from Current_Date) as Da
      ,Current_Time as Time
      ,EXTRACT(Hour from Current_Time) as Hr
      ,EXTRACT(Minute from Current_Time) as Mn
      ,EXTRACT(Second from Current_Time) as Sc
      ,EXTRACT(TIMEZONE_HOUR from Current_Time) as Th
      ,EXTRACT(TIMEZONE_MINUTE from Current_Time) as Tm ;

SELECT Current_Date as Date
      ,Year (Current_Date) as Yr
      ,Month (Current_Date) as Mo
      ,Day (Current_Date) as Da
      ,Current_Time as Time
      ,Hour (Current_Time) as Hr
      ,Minute (Current_Time) as Min
      ,Second (Current_Time) as Sec

Timezone_Hour (Current_Time) and Timezone_Minute (Current_Time) do not work with this second format.

Most extracts are on the month or year. That can be done using either technique above.
EXTRACT with DATE and TIME Literals

SELECT EXTRACT(Year FROM Date '2000-10-01') AS "YR"
      ,EXTRACT(Month FROM Date '2000-10-01') AS "Mth"
      ,EXTRACT(DAY FROM Date '2000-10-01') AS 'Day'
      ,EXTRACT(HOUR FROM TIME '10:01:30') AS 'Hr'
      ,EXTRACT(MINUTE FROM TIME '10:01:30') AS 'Min'
      ,EXTRACT(SECOND FROM TIME '10:01:30') AS 'Sec'
      ,EXTRACT(MONTH FROM Current_Timestamp) AS ts_Mth
      ,EXTRACT(SECOND FROM Current_Timestamp) AS ts_Part

YR    Mth  Day  Hr  Min  Sec        ts_Mth  ts_Part
____  ___  ___  __  ___  _________  ______  ________
2000  10   1    10  1    30.000000  5       5.266518

Just like Add_Months, the EXTRACT command is a Temporal Function, or a Time-Based Function. The query above is designed to show how to use it with literal values.
EXTRACT of the Month on Aggregate Queries

SELECT EXTRACT(Month FROM Order_Date)
      ,COUNT(*) as "Rows"
      ,AVG(Order_Total) as "AVG"
FROM Order_Table
GROUP BY 1
ORDER BY 1

date_part  Rows  AVG
_________  ____  ________
1          1     8005.91
5          1     12347.53
9          1     23454.84
10         2     10171.55

The above SELECT uses EXTRACT to display only the month and also to control the number of aggregates displayed through the GROUP BY. Notice the Answer Set headers.
AGE_IN_MONTHS

SELECT Order_Date
      ,Age_In_Months (Current_Date, Order_Date) as "Age_in_Months"
      ,Current_Date - Order_Date as Age_in_Days
      ,Current_Date as Todays_Date
FROM Order_Table
ORDER BY 1 ;

Order_Date  Age_in_Months  Age_in_Days  Todays_Date
__________  _____________  ___________  ___________
05/04/1998  204            6223         05/18/2015
01/01/1999  196            5981         05/18/2015
09/09/1999  188            5730         05/18/2015
10/01/1999  187            5708         05/18/2015
10/10/1999  187            5699         05/18/2015

Above you see a great technique for seeing the age in months between two dates or timestamps.
AGE_IN_YEARS

SELECT Order_Date
      ,Age_In_Months (Current_Date, Order_Date) as "Age_in_Months"
      ,Age_In_Years (Current_Date, Order_Date) as "Age_in_Years"
      ,Current_Date as Todays_Date
FROM Order_Table
ORDER BY 1 ;

Order_Date  Age_in_Months  Age_in_Years  Todays_Date
__________  _____________  ____________  ___________
05/04/1998  204            17            05/18/2015
01/01/1999  196            16            05/18/2015
09/09/1999  188            15            05/18/2015
10/01/1999  187            15            05/18/2015
10/10/1999  187            15            05/18/2015

Above you see a great technique for seeing the age in years between two dates or timestamps.
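The numbers in the answer set are consistent with counting whole elapsed months and whole elapsed years. The sketch below reproduces them in Python. The functions age_in_months and age_in_years are hypothetical helpers written for this illustration, and the rule they use (drop the last month if its day has not been reached yet) is an assumption about how such a count works, not Vertica's documented algorithm.

```python
from datetime import date

def age_in_months(later: date, earlier: date) -> int:
    """Whole months elapsed from `earlier` to `later` (assumes later >= earlier)."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the current month is not yet complete
    return months

def age_in_years(later: date, earlier: date) -> int:
    """Whole years elapsed, derived from the whole-month count."""
    return age_in_months(later, earlier) // 12

today = date(2015, 5, 18)  # the Todays_Date value shown in the answer set
order_dates = [
    date(1998, 5, 4),
    date(1999, 1, 1),
    date(1999, 9, 9),
    date(1999, 10, 1),
    date(1999, 10, 10),
]
months = [age_in_months(today, d) for d in order_dates]
years = [age_in_years(today, d) for d in order_dates]
```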
DATE_TRUNC

SELECT Current_Date as "Today"
      ,DATE_TRUNC('Century', Current_Date) as "Century"
      ,DATE_TRUNC('Day', Current_Date) as "Day"

Today       Century                     Day
__________  __________________________  __________________________
05/18/2015  01/01/2001 12:00:00.000000  05/18/2015 12:00:00.000000

SELECT Current_Time as "Time"
      ,DATE_TRUNC('Minute', Current_Time) as "Minute"
      ,DATE_TRUNC('Hour', Current_Time) as "Hour"
      ,DATE_TRUNC('Microseconds', Current_Time) as "Micro"

Time      Minute    Hour      Micro
________  ________  ________  ________
20:24:25  20:24:00  20:00:00  20:24:25

The Date_Trunc function truncates date and time values as specified.
DATEDIFF

SELECT Current_Date as "Date Today"
      ,Order_Date
      ,DATEDIFF(Year, Order_Date, Current_Date) as "Years"
      ,DATEDIFF(Quarter, Order_Date, Current_Date) as "Quarters"
      ,DATEDIFF(Month, Order_Date, Current_Date) as "Months"
      ,DATEDIFF(Day, Order_Date, Current_Date) as "Days"
      ,DATEDIFF(Week, Order_Date, Current_Date) as "Weeks"
      ,DATEDIFF(Hour, Order_Date, Current_Date) as "Hours"
FROM Order_Table
WHERE EXTRACT(Year from Order_Date) = 1998 ;

Date Today  Order_Date  Years  Quarters  Months  Days  Weeks  Hours
__________  __________  _____  ________  ______  ____  _____  ______
05/18/2015  05/04/1998  17     68        204     6223  889    149352

The DATEDIFF function returns the difference between two date or time values based on the specified start and ending arguments. In addition to the date parts shown above, DATEDIFF also accepts minute, second, millisecond and microsecond.
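The row above can be reproduced with a short Python sketch. It assumes that DATEDIFF counts how many date-part boundaries are crossed between the two values, rather than how many complete units have elapsed. That assumption fits the numbers shown here but should be confirmed against the Vertica documentation. The datediff function below is a hypothetical stand-in and covers only a few date parts.

```python
from datetime import date

def datediff(unit: str, start: date, end: date) -> int:
    """Count unit boundaries crossed going from `start` to `end` (assumed semantics)."""
    if unit == "year":
        return end.year - start.year
    if unit == "quarter":
        start_q = (start.month - 1) // 3
        end_q = (end.month - 1) // 3
        return (end.year - start.year) * 4 + (end_q - start_q)
    if unit == "month":
        return (end.year - start.year) * 12 + (end.month - start.month)
    if unit == "day":
        return (end - start).days
    if unit == "hour":
        # Two plain dates are both at midnight, so hours are days * 24.
        return (end - start).days * 24
    raise ValueError(f"unsupported unit: {unit}")

order_date = date(1998, 5, 4)
today = date(2015, 5, 18)  # the "Date Today" value shown in the answer set
result = {u: datediff(u, order_date, today) for u in ("year", "quarter", "month", "day", "hour")}
```

Under this assumption, December 31 to January 1 counts as a difference of one year even though only one day has passed, which is why DATEDIFF and the AGE_IN functions can disagree.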
DAYOFWEEK

SELECT Current_Date as "Date Today"
      ,DAYOFWEEK(Current_Date)
      ,CASE DAYOFWEEK(Current_Date)
          WHEN 1 Then 'Sunday'
          WHEN 2 Then 'Monday'
          WHEN 3 Then 'Tuesday'
          WHEN 4 Then 'Wednesday'
          WHEN 5 Then 'Thursday'
          WHEN 6 Then 'Friday'
          WHEN 7 Then 'Saturday'
       END as WhatDayIsIt

Date Today  DAYOFWEEK  WhatDayIsIt
__________  _________  ___________
05/18/2015  2          Monday

The DAYOFWEEK function returns a 1 if the day of the week is Sunday, 2 if Monday and so on.
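The same Sunday = 1 numbering can be sketched in Python to double-check the answer set. The dayofweek helper and the DAY_NAMES list are hypothetical names for this illustration; Python's own isoweekday() numbers Monday as 1, so the result is shifted.

```python
from datetime import date

DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]

def dayofweek(d: date) -> int:
    """Return 1 for Sunday through 7 for Saturday."""
    # isoweekday(): Monday=1 ... Sunday=7. Shift so that Sunday=1 ... Saturday=7.
    return d.isoweekday() % 7 + 1

def what_day_is_it(d: date) -> str:
    """Mirror of the CASE expression: map the number to a day name."""
    return DAY_NAMES[dayofweek(d) - 1]
```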
Intervals for Date, Time and Timestamp

Interval Chart

Simple Intervals  More Involved Intervals
________________  _______________________
YEAR              DAY TO HOUR
MONTH             DAY TO MINUTE
DAY               DAY TO SECOND
HOUR              HOUR TO MINUTE
MINUTE            HOUR TO SECOND
SECOND            MINUTE TO SECOND

“It’s not the size of the dog in the fight, but the size of the fight in the dog.” – Archie Griffin

Vertica has added INTERVAL processing; however, it is not ANSI compliant. Intervals are used to perform DATE, TIME and TIMESTAMP arithmetic and conversion.
Interval Data Types and the Bytes to Store Them

Interval Chart

Data Type                   Bytes
__________________________  _____
INTERVAL YEAR               2
INTERVAL YEAR TO MONTH      4
INTERVAL MONTH              2
INTERVAL MONTH TO DAY       2
INTERVAL DAY                2
INTERVAL DAY TO MINUTE      8
INTERVAL DAY TO SECOND      10/12 (10 for 32-bit systems and 12 for 64-bit systems)
INTERVAL HOUR               2
INTERVAL HOUR TO MINUTE     4
INTERVAL HOUR TO SECOND     8
INTERVAL MINUTE             2
INTERVAL MINUTE TO SECOND   6/8 (6 for 32-bit systems and 8 for 64-bit systems)
INTERVAL SECOND             6/8 (6 for 32-bit systems and 8 for 64-bit systems)

Above are the interval data types and the bytes to store them.
Using Intervals

SELECT Current_Date as Our_Date
      ,Current_Date + Interval '1' Day as Plus_1_Day
      ,Current_Date + Interval '3' Month as Plus_3_Months
      ,Current_Date + Interval '5' Year as Plus_5_Years

SELECT Current_Date as Our_Date
      ,CAST(Current_Date + Interval '1' Day as Date) as Plus_1_Day
      ,CAST(Current_Date + Interval '3' Month as Date) as Plus_3_Months
      ,CAST(Current_Date + Interval '5' Year as Date) as Plus_5_Years

Our_Date    Plus_1_Day  Plus_3_Months  Plus_5_Years
__________  __________  _____________  ____________
05/16/2015  05/17/2015  08/16/2015     05/16/2020

Above we are using simple intervals. Notice that the first example returns its results with a time portion attached. The second example uses the CAST (Convert and Store) technique to bring each result back to a date. Either way, the intervals have been added.
How a Simple Interval Handles Leap Year

SELECT Date '2012-01-29' as Our_Date
      ,Date '2012-01-29' + INTERVAL '1' Month as Leap_Year

Our_Date    Leap_Year
__________  __________________________
01/29/2012  02/29/2012 12:00:00.000000

SELECT Date '2011-01-29' as Our_Date
      ,Date '2011-01-29' + INTERVAL '1' Month as Leap_Year

Our_Date    Leap_Year
__________  __________________________
01/29/2011  02/28/2011 12:00:00.000000

The first example works because we added 1 month to the date '2012-01-29' and got '2012-02-29'. Because 2012 was a leap year, there actually is a date of February 29, 2012. The next example is the real point. We start with '2011-01-29' and add 1 month, but there is no February 29th in 2011, so the query places the day at 02/28/2011.
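The clamping behavior described above can be sketched in a few lines of Python. The add_months function here is a hypothetical helper, not Vertica's implementation: it moves the month forward and, if the target month is too short, falls back to that month's last day. Any further end-of-month rules a database may apply are not modeled.

```python
import calendar
from datetime import date

def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the last day of the target month."""
    # Work in "months since year 0" so that year rollover is handled by divmod.
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]  # number of days in the target month
    return date(year, month, min(d.day, last_day))

leap_result = add_months(date(2012, 1, 29), 1)      # February 2012 has 29 days
non_leap_result = add_months(date(2011, 1, 29), 1)  # February 2011 has only 28 days
```

The same helper covers the years-as-months technique: adding 12 * 5 months moves a date forward five years.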
Interval Arithmetic Results

DATE and TIME arithmetic results using intervals:

DATE       -       DATE       = Interval
TIME       -       TIME       = Interval
TIMESTAMP  -       TIMESTAMP  = Interval

DATE       - or +  Interval   = DATE
TIME       - or +  Interval   = TIME
TIMESTAMP  - or +  Interval   = TIMESTAMP

Interval   - or +  Interval   = Interval

“Once the game is over, the king and the pawn go back in the same box.” - Italian Proverb

To use DATE and TIME arithmetic, it is important to keep in mind the results of various operations. The above chart is your Interval guide.
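The result-type rules in the chart have a close analog in Python's datetime module, which makes them easy to experiment with. In the sketch below, timedelta plays the role of an Interval, date the role of DATE, and datetime the role of TIMESTAMP. This is an analogy chosen for illustration, not a model of Vertica's type system, and Python's plain time type is left out because it does not support arithmetic.

```python
from datetime import date, datetime, timedelta

interval = timedelta(days=30)
order_date = date(1998, 5, 4)
order_ts = datetime(1998, 5, 4, 10, 34, 44)

# DATE - DATE = Interval
date_minus_date = date(2015, 5, 18) - order_date

# TIMESTAMP - TIMESTAMP = Interval
ts_minus_ts = datetime(2015, 5, 18, 10, 34, 44) - order_ts

# DATE + Interval = DATE
date_plus_interval = order_date + interval

# TIMESTAMP - Interval = TIMESTAMP
ts_minus_interval = order_ts - interval

# Interval + Interval = Interval
interval_plus_interval = interval + interval
```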
A Time Interval Example

SELECT (TIME '12:45:01' - TIME '10:10:01') HOUR AS "Hours"
      ,(TIME '12:45:01' - TIME '10:10:01') MINUTE AS "Minutes"
      ,(TIME '12:45:01' - TIME '10:10:01') SECOND AS "Seconds"

Hours  Minutes  Seconds
_____  _______  ___________
2      155      9300.000000

Time intervals work as you can see from the example above.
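The three values are the same elapsed time (2 hours 35 minutes) expressed in different units. A minimal Python sketch of the arithmetic, with variable names chosen only for this illustration:

```python
from datetime import datetime

fmt = "%H:%M:%S"
# Parse both times onto the same (default) day so they can be subtracted.
elapsed = datetime.strptime("12:45:01", fmt) - datetime.strptime("10:10:01", fmt)

total_seconds = int(elapsed.total_seconds())
hours = total_seconds // 3600   # whole hours in the interval
minutes = total_seconds // 60   # the whole interval expressed in minutes
seconds = total_seconds         # the whole interval expressed in seconds
```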
A DATE Interval Example Going Back in Time

SELECT Current_Date as Date
      ,INTERVAL -'2' YEAR + CURRENT_DATE as Two_years_Ago ;

Date        Two_years_Ago
__________  __________________________
10/11/2015  10/11/2013 12:00:00.000000

“I know that you believe that you understand what you think I said, but I am not sure you realize that what you heard is not what I meant.” - Sign on Pentagon office wall

The above Interval example uses a -'2' to go back in time.
A Complex Time Interval Example using CAST

Below is the syntax for using the CAST with a date:

SELECT CAST ( AS INTERVAL ) FROM ;

The following converts an INTERVAL of 6 years and 2 months to an INTERVAL number of months:

SELECT CAST( (INTERVAL '6-02' YEAR TO MONTH) AS INTERVAL MONTH ) Mths

Mths
____
74

The CAST function (Convert and Store) is the ANSI method for converting data from one type to another. It can also be used to convert one INTERVAL to another INTERVAL representation. Although the CAST is normally used in the SELECT list, it also works in the WHERE clause for comparisons.
Page 360
Chapter 9
Date Functions
A Complex Time Interval Example using CAST

SELECT CAST(INTERVAL '1300' MONTH AS INTERVAL YEAR TO MONTH) AS "Years & Months" ;

The above request converts 1300 months to show the number of years and months.

Years & Months
______________
108-04
The biggest advantage in using the INTERVAL processing is that SQL written on another system is now compatible.
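The two CAST examples above are plain arithmetic on months, and it can help to check them outside the database. The following is only a sketch in Python, not Vertica code; the function names are made up for illustration, and it assumes non-negative intervals written in the 'years-months' form shown in the answer sets.

```python
def year_month_to_months(interval: str) -> int:
    """Convert a 'Y-MM' YEAR TO MONTH interval string to a number of months."""
    years, months = interval.split("-")
    return int(years) * 12 + int(months)


def months_to_year_month(total_months: int) -> str:
    """Convert a number of months to the 'Y-MM' form shown in the answer set."""
    years, months = divmod(total_months, 12)
    return f"{years}-{months:02d}"


# Hypothetical helper names; the inputs are the two intervals used in the slides.
print(year_month_to_months("6-02"))   # 6 * 12 + 2
print(months_to_year_month(1300))     # 1300 = 108 * 12 + 4
```

The first call matches the Mths answer of 74, and the second matches the "Years & Months" answer of 108-04.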
The OVERLAPS Command

Compatibility: Vertica Extension

The syntax of the OVERLAPS is:

SELECT WHERE (, ) OVERLAPS (, ) ;

SELECT 'The Dates Overlap' as Dater
WHERE (DATE '2001-01-01', DATE '2001-11-30') OVERLAPS (DATE '2001-10-15', DATE '2001-12-31');

Answer

Dater
_________________
The Dates Overlap

When working with dates and times, it is sometimes necessary to determine whether two different ranges have common points in time. Vertica provides a Boolean test to do this for you. It is called OVERLAPS; it evaluates to true if multiple points are in common, otherwise it returns false. The literal is returned because both date ranges have October 15 through November 30 in common.
An OVERLAPS Example that Returns No Rows

SELECT 'The dates overlap' AS OverlapAnswer
WHERE (DATE '2001-01-01', DATE '2001-11-30') OVERLAPS (DATE '2001-11-30', DATE '2001-12-31') ;

OverlapAnswer
_____________
No rows returned, so we know the dates did not overlap
The above SELECT example tests two literal dates and uses the OVERLAPS to determine whether or not to display the character literal. The literal was not selected because the ranges do not overlap. So, the common single date of November 30 does not constitute an overlap. When dates are used, 2 days must be involved, and when time is used, 2 seconds must be contained in both ranges.
The OVERLAPS Command using TIME

SELECT 'The Times Overlap' As DoThey
WHERE (TIME '08:00:00', TIME '02:00:00') OVERLAPS (TIME '02:01:00', TIME '04:15:00') ;

DoThey
_________________
The Times Overlap

The above SELECT example tests two literal times and uses the OVERLAPS to determine whether or not to display the character literal. At first glance, it appears as if this answer is incorrect because 02:01:00 looks like it starts 1 minute after the first range ends. However, the system works on a 24-hour clock when a date and time (timestamp) is not used together. Therefore, the system considers the earlier time of 2 AM as the start and the later time of 8 AM as the end of the range. So not only do they overlap, the second range is entirely contained in the first range.
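To see why the three OVERLAPS examples come out the way they do, here is a small Python sketch. It is a simplified model and an assumption on our part, not Vertica's implementation: each pair is put in start/end order first, and a shared end point alone does not count as an overlap. The function name overlaps is ours.

```python
from datetime import date, time


def overlaps(start1, end1, start2, end2) -> bool:
    """Simplified model of (start1, end1) OVERLAPS (start2, end2)."""
    # The earlier value of each pair is treated as the start of the range.
    s1, e1 = sorted((start1, end1))
    s2, e2 = sorted((start2, end2))
    # Strict comparisons: touching at a single end point is not an overlap.
    return s1 < e2 and s2 < e1


# October 15 through November 30 is common to both ranges.
print(overlaps(date(2001, 1, 1), date(2001, 11, 30),
               date(2001, 10, 15), date(2001, 12, 31)))

# Only November 30 is shared, so no row would be returned.
print(overlaps(date(2001, 1, 1), date(2001, 11, 30),
               date(2001, 11, 30), date(2001, 12, 31)))

# 08:00 to 02:00 is reordered to 02:00 through 08:00 before comparing.
print(overlaps(time(8, 0), time(2, 0), time(2, 1), time(4, 15)))
```

The reordering step is what makes the TIME example return a row, because the first range becomes 2 AM through 8 AM.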
Chapter 10 – OLAP Functions
“Don’t count the days, make the days count.” - Muhammad Ali
The Row_Number Command

SELECT Product_ID, Sale_Date, Daily_Sales,
       ROW_NUMBER() OVER (ORDER BY Product_ID, Sale_Date) AS Seq_Number
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Seq_Number
__________  __________  ___________  __________
1000        2000-09-28     48850.40           1
1000        2000-09-29     54500.22           2
1000        2000-09-30     36000.07           3
1000        2000-10-01     40200.43           4
1000        2000-10-02     32800.50           5
1000        2000-10-03     64300.00           6
1000        2000-10-04     54553.10           7
2000        2000-09-28     41888.88           8
2000        2000-09-29     48000.00           9
2000        2000-09-30     49850.03          10
2000        2000-10-01     54850.29          11

Not all rows are displayed
The ROW_NUMBER Keyword(s) caused Seq_Number to increase sequentially. Notice that this does NOT have a Rows Unbounded Preceding, and it still works!
Quiz – How did the Row_Number Reset?

SELECT Product_ID, Sale_Date, Daily_Sales,
       ROW_NUMBER() OVER (PARTITION BY Product_ID
                          ORDER BY Product_ID, Sale_Date) AS StartOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  StartOver
__________  __________  ___________  _________
1000        2000-09-28     48850.40          1
1000        2000-09-29     54500.22          2
1000        2000-09-30     36000.07          3
1000        2000-10-01     40200.43          4
1000        2000-10-02     32800.50          5
1000        2000-10-03     64300.00          6
1000        2000-10-04     54553.10          7
2000        2000-09-28     41888.88          1
2000        2000-09-29     48000.00          2
2000        2000-09-30     49850.03          3
2000        2000-10-01     54850.29          4
2000        2000-10-02     36021.93          5
2000        2000-10-03     43200.18          6
2000        2000-10-04     32800.50          7

What Keyword(s) caused StartOver to reset?
Answer to Quiz – How did the Row_Number Reset?

What Keyword(s) caused StartOver to reset? It is the PARTITION BY statement.
Using a Derived Table and Row_Number

WITH Results AS
 ( SELECT ROW_NUMBER() OVER (ORDER BY Product_ID, Sale_Date) AS RowNumber,
          Product_ID, Sale_Date
   FROM Sales_Table )
SELECT *
FROM Results
WHERE RowNumber BETWEEN 8 AND 14

RowNumber  Product_ID  Sale_Date
_________  __________  __________
        8  2000        2000-09-28
        9  2000        2000-09-29
       10  2000        2000-09-30
       11  2000        2000-10-01
       12  2000        2000-10-02
       13  2000        2000-10-03
       14  2000        2000-10-04
In the example above we are using a derived table called Results and then using a WHERE clause to only take certain RowNumbers.
Finding the First Occurrence using a WITH Derived Table

WITH Derived_Tbl AS
 (select Product_ID as Prod, Sale_Date, Daily_Sales,
         Row_Number() over (PARTITION BY product_id
                            ORDER BY Sale_Date ASC) AS Row_Num
  from sales_table)
Select *
from Derived_Tbl
Where Row_Num = 1 ;

Prod  Sale_Date   Daily_Sales  Row_Num
____  __________  ___________  _______
1000  09/28/2000     48850.40        1
2000  09/28/2000     41888.88        1
3000  09/28/2000     61301.77        1

By using the Row_Number ordered analytic, partitioning by Product_ID, and sorting by Sale_Date ASC, we bring back only the first row for each product, which is the row with the earliest Sale_Date. This can be done because we place the query in a derived table and then select from that derived table using a WHERE clause. An ordered analytic cannot be filtered in the WHERE clause of the same SELECT that calculates it.
Finding the Last Occurrence using a WITH Derived Table

WITH Derived_Tbl AS
 (select Product_ID as Prod, Sale_Date, Daily_Sales,
         Row_Number() over (PARTITION BY product_id
                            ORDER BY Sale_Date Desc) AS Row_Num
  from sales_table)
Select *
from Derived_Tbl
Where Row_Num = 1 ;

Prod  Sale_Date   Daily_Sales  Row_Num
____  __________  ___________  _______
1000  10/04/2000     54553.10        1
2000  10/04/2000     32800.50        1
3000  10/04/2000     15675.33        1

By using the Row_Number ordered analytic, partitioning by Product_ID, and sorting by Sale_Date DESC, we bring back only the last occurrence for each product, which is the row with the latest Sale_Date. It is numbered 1 because the sort is descending. This can be done because we place the query in a derived table and then select from that derived table using a WHERE clause.
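If you do not have a Vertica system handy, the first and last occurrence technique can be tried with Python's built-in sqlite3 module. This is only a sketch under a few assumptions: SQLite 3.25 or later (the release that added window functions), a small hand-picked subset of Sales_Table rows, and a helper named occurrence that is ours, not part of any product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL)"
)
# A small subset of the book's rows, deliberately loaded out of date order.
conn.executemany(
    "INSERT INTO Sales_Table VALUES (?, ?, ?)",
    [
        (1000, "2000-10-01", 40200.43),
        (1000, "2000-09-28", 48850.40),
        (1000, "2000-10-04", 54553.10),
        (2000, "2000-09-30", 49850.03),
        (2000, "2000-10-04", 32800.50),
        (2000, "2000-09-28", 41888.88),
    ],
)


def occurrence(direction: str):
    """Return one row per product: earliest sale for 'ASC', latest for 'DESC'."""
    if direction not in ("ASC", "DESC"):
        raise ValueError("direction must be 'ASC' or 'DESC'")
    sql = f"""
        WITH Derived_Tbl AS (
            SELECT Product_ID AS Prod, Sale_Date, Daily_Sales,
                   ROW_NUMBER() OVER (PARTITION BY Product_ID
                                      ORDER BY Sale_Date {direction}) AS Row_Num
            FROM Sales_Table)
        SELECT Prod, Sale_Date, Daily_Sales
        FROM Derived_Tbl
        WHERE Row_Num = 1
        ORDER BY Prod
    """
    return conn.execute(sql).fetchall()


first_rows = occurrence("ASC")
last_rows = occurrence("DESC")
print(first_rows)
print(last_rows)
```

Only the sort direction changes between the two calls; the WHERE Row_Num = 1 filter on the derived table stays the same.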
Ordered Analytics OVER

SELECT Product_ID as Prod
      ,Sale_Date
      ,Daily_Sales
      ,SUM(Daily_Sales)   OVER(PARTITION BY Sale_Date) AS Total
      ,AVG(Daily_Sales)   OVER(PARTITION BY Sale_Date) AS Avg
      ,COUNT(Daily_Sales) OVER(PARTITION BY Sale_Date) AS Cnt
      ,MIN(Daily_Sales)   OVER(PARTITION BY Sale_Date) AS Min
      ,MAX(Daily_Sales)   OVER(PARTITION BY Sale_Date) AS Max
FROM Sales_Table

Prod  Sale_Date   Daily_Sales  Total      Avg       Cnt  Min       Max
____  __________  ___________  _________  ________  ___  ________  ________
1000  2000-09-28     48850.40  152041.05  50680.35    3  41888.88  61301.77
2000  2000-09-28     41888.88  152041.05  50680.35    3  41888.88  61301.77
3000  2000-09-28     61301.77  152041.05  50680.35    3  41888.88  61301.77
3000  2000-09-29     34509.13  137009.35  45669.78    3  34509.13  54500.22
2000  2000-09-29     48000.00  137009.35  45669.78    3  34509.13  54500.22
1000  2000-09-29     54500.22  137009.35  45669.78    3  34509.13  54500.22
1000  2000-09-30     36000.07  129718.96  43239.65    3  36000.07  49850.03
2000  2000-09-30     49850.03  129718.96  43239.65    3  36000.07  49850.03
3000  2000-09-30     43868.86  129718.96  43239.65    3  36000.07  49850.03

Not all rows are shown in the answer set

Above is an example of the Ordered Analytics that use the keyword OVER. Each aggregate is calculated across all rows that share the same Sale_Date, and the result is repeated on every row of that partition.
RANK and DENSE RANK

SELECT Product_ID, Daily_Sales,
       RANK() OVER (ORDER BY Daily_Sales ASC) as "Rank",
       DENSE_RANK() OVER (Order By Daily_Sales ASC) as "DenseRank"
FROM Sales_Table
WHERE Product_ID in (1000, 2000)

Product_ID  Daily_Sales  Rank  DenseRank
__________  ___________  ____  _________
2000           32800.50     1          1
1000           32800.50     1          1
1000           36000.07     3          2
2000           36021.93     4          3
1000           40200.43     5          4
2000           41888.88     6          5
2000           43200.18     7          6
2000           48000.00     8          7
1000           48850.40     9          8

Not all rows are displayed

Above is an example of the RANK and DENSE_RANK commands. Notice the difference in the ties and the next ranking. After the two rows tied at 1, RANK skips to 3, while DENSE_RANK continues with 2.
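The difference between the two commands is easy to reproduce on a handful of rows. The sketch below runs the same RANK and DENSE_RANK expressions through Python's built-in sqlite3 module; it assumes SQLite 3.25 or later for window function support and uses only the five lowest Daily_Sales rows from the answer set above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales_Table (Product_ID INTEGER, Daily_Sales REAL)")
# The five lowest Daily_Sales rows from the slide, including the tie.
conn.executemany(
    "INSERT INTO Sales_Table VALUES (?, ?)",
    [
        (2000, 32800.50),
        (1000, 32800.50),
        (1000, 36000.07),
        (2000, 36021.93),
        (1000, 40200.43),
    ],
)

ranked = conn.execute("""
    SELECT Daily_Sales,
           RANK()       OVER (ORDER BY Daily_Sales ASC) AS "Rank",
           DENSE_RANK() OVER (ORDER BY Daily_Sales ASC) AS "DenseRank"
    FROM Sales_Table
    ORDER BY Daily_Sales
""").fetchall()

ranks = [row[1] for row in ranked]        # leaves a gap after the tie
dense_ranks = [row[2] for row in ranked]  # no gap after the tie
print(ranks)
print(dense_ranks)
```

Both lists start with the tie at 1; only RANK leaves out the value 2 afterwards.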
RANK Defaults to Ascending Order

SELECT Product_ID, Sale_Date, Daily_Sales,
       RANK() OVER (ORDER BY Daily_Sales) AS Rank1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Rank1
__________  __________  ___________  _____
1000        10/02/2000     32800.50      1
2000        10/04/2000     32800.50      1
1000        09/30/2000     36000.07      3
2000        10/02/2000     36021.93      4
1000        10/01/2000     40200.43      5
2000        09/28/2000     41888.88      6
2000        10/03/2000     43200.18      7
2000        09/29/2000     48000.00      8
1000        09/28/2000     48850.40      9
2000        09/30/2000     49850.03     10
1000        09/29/2000     54500.22     11
1000        10/04/2000     54553.10     12
2000        10/01/2000     54850.29     13

Not all rows are displayed

The RANK OVER command defaults the Sort to ASC.
Getting RANK to Sort in DESC Order

SELECT Product_ID, Sale_Date, Daily_Sales,
       RANK() OVER (ORDER BY Daily_Sales DESC) AS Rank1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Rank1
__________  __________  ___________  _____
1000        10/03/2000     64300.00      1
2000        10/01/2000     54850.29      2
1000        10/04/2000     54553.10      3
1000        09/29/2000     54500.22      4
2000        09/30/2000     49850.03      5
1000        09/28/2000     48850.40      6
2000        09/29/2000     48000.00      7
2000        10/03/2000     43200.18      8
2000        09/28/2000     41888.88      9
1000        10/01/2000     40200.43     10
2000        10/02/2000     36021.93     11
1000        09/30/2000     36000.07     12
2000        10/04/2000     32800.50     13
1000        10/02/2000     32800.50     13
Utilize the DESC keyword in the ORDER BY statement to rank in descending order.
RANK OVER and PARTITION BY

SELECT Product_ID, Sale_Date, Daily_Sales,
       RANK() OVER (PARTITION BY Product_ID
                    ORDER BY Daily_Sales DESC) AS Rank1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Rank1
__________  __________  ___________  _____
1000        10/03/2000     64300.00      1
1000        10/04/2000     54553.10      2
1000        09/29/2000     54500.22      3
1000        09/28/2000     48850.40      4
1000        10/01/2000     40200.43      5
1000        09/30/2000     36000.07      6
1000        10/02/2000     32800.50      7
2000        10/01/2000     54850.29      1
2000        09/30/2000     49850.03      2
2000        09/29/2000     48000.00      3
2000        10/03/2000     43200.18      4
2000        09/28/2000     41888.88      5
2000        10/02/2000     36021.93      6
2000        10/04/2000     32800.50      7
What does the PARTITION Statement in the RANK OVER do? It resets the rank.
PERCENT_RANK OVER

SELECT Product_ID, Sale_Date, Daily_Sales,
       PERCENT_RANK() OVER (PARTITION BY PRODUCT_ID
                            ORDER BY Daily_Sales DESC) AS PercentRank1
FROM Sales_Table
WHERE Product_ID in (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  PercentRank1
__________  __________  ___________  ____________
1000        2000-10-03     64300.00          0
1000        2000-10-04     54553.10          0.17
1000        2000-09-29     54500.22          0.33
1000        2000-09-28     48850.40          0.5
1000        2000-10-01     40200.43          0.67
1000        2000-09-30     36000.07          0.83
1000        2000-10-02     32800.50          1
2000        2000-10-01     54850.29          0
2000        2000-09-30     49850.03          0.17
2000        2000-09-29     48000.00          0.33
2000        2000-10-03     43200.18          0.5
2000        2000-09-28     41888.88          0.67
2000        2000-10-02     36021.93          0.83
2000        2000-10-04     32800.50          1

7 Rows in Calculation for 1000 Product_ID
7 Rows in Calculation for 2000 Product_ID
We now have added a Partition statement which resets on Product_ID so this produces 7 rows for each of our Product_IDs.
PERCENT_RANK OVER with 14 rows in Calculation

SELECT Product_ID, Sale_Date, Daily_Sales,
       PERCENT_RANK() OVER (ORDER BY Daily_Sales DESC) AS PercentRank1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  PercentRank1
__________  __________  ___________  ____________
1000        2000-10-03     64300.00          0
2000        2000-10-01     54850.29          0.08
1000        2000-10-04     54553.10          0.15
1000        2000-09-29     54500.22          0.23
2000        2000-09-30     49850.03          0.31
1000        2000-09-28     48850.40          0.38
2000        2000-09-29     48000.00          0.46
2000        2000-10-03     43200.18          0.54
2000        2000-09-28     41888.88          0.62
1000        2000-10-01     40200.43          0.69
2000        2000-10-02     36021.93          0.77
1000        2000-09-30     36000.07          0.85
2000        2000-10-04     32800.50          0.92
1000        2000-10-02     32800.50          0.92

14 Rows in calculation for both the 1000 and 2000 Product_IDs
Percent_Rank is just like RANK; however, it gives you the rank as a percent of all the other rows, up to 100%. It is calculated as (RANK - 1) / (rows in the calculation - 1), so with 14 rows the second row is 1/13, or 0.08.
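The percent values in these answer sets can be checked with a few lines of Python. This is a sketch of the standard PERCENT_RANK arithmetic, (RANK - 1) / (rows - 1), written by hand rather than taken from Vertica; the function name percent_rank and its descending parameter are ours.

```python
def percent_rank(values, descending=True):
    """Return (RANK - 1) / (rows - 1) for each value, in the input order."""
    ordered = sorted(values, reverse=descending)
    row_count = len(ordered)
    result = []
    for value in values:
        # Tied values share the rank of their first position in the sort.
        rank = ordered.index(value) + 1
        result.append((rank - 1) / (row_count - 1) if row_count > 1 else 0.0)
    return result


# Daily_Sales for Product_ID 1000, as in the partitioned example (7 rows).
product_1000 = [64300.00, 54553.10, 54500.22, 48850.40, 40200.43, 36000.07, 32800.50]
print([round(p, 2) for p in percent_rank(product_1000)])

# A made-up list with a tie at the bottom: both tied rows get the same percent.
print([round(p, 2) for p in percent_rank([30, 20, 10, 10])])
```

With 7 rows each step is 1/6, which is where the 0.17, 0.33, 0.5 sequence comes from. In the tied case the last two rows share a value below 1, just as the two 32800.50 rows share 0.92 in the 14-row answer set.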
PERCENT_RANK OVER with 21 rows in Calculation

SELECT Product_ID, Sale_Date, Daily_Sales,
       PERCENT_RANK() OVER (ORDER BY Daily_Sales DESC) AS PercentRank1
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  PercentRank1
__________  __________  ___________  ____________
1000        2000-10-03     64300.00          0
3000        2000-09-28     61301.77          0.05
2000        2000-10-01     54850.29          0.1
1000        2000-10-04     54553.10          0.15
1000        2000-09-29     54500.22          0.2
2000        2000-09-30     49850.03          0.25
1000        2000-09-28     48850.40          0.3
2000        2000-09-29     48000.00          0.35
3000        2000-09-30     43868.86          0.4
2000        2000-10-03     43200.18          0.45
2000        2000-09-28     41888.88          0.5
1000        2000-10-01     40200.43          0.55
2000        2000-10-02     36021.93          0.6
1000        2000-09-30     36000.07          0.65

Not all rows are displayed

21 Rows in Calculation for all of the Product_IDs

Percent_Rank is just like RANK; however, it gives you the rank as a percent of all the other rows, up to 100%. With no WHERE clause, all 21 rows are in the calculation, so each row moves up by 1/20, or 0.05.
Quiz – What Causes the Product_ID to Reset?

SELECT Product_ID, Sale_Date, Daily_Sales,
       PERCENT_RANK() OVER (PARTITION BY PRODUCT_ID
                            ORDER BY Daily_Sales DESC) AS PercentRank1
FROM Sales_Table
WHERE Product_ID in (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  PercentRank1
__________  __________  ___________  ____________
1000        2000-10-03     64300.00          0
1000        2000-10-04     54553.10          0.17
1000        2000-09-29     54500.22          0.33
1000        2000-09-28     48850.40          0.5
1000        2000-10-01     40200.43          0.67
1000        2000-09-30     36000.07          0.83
1000        2000-10-02     32800.50          1
2000        2000-10-01     54850.29          0
2000        2000-09-30     49850.03          0.17
2000        2000-09-29     48000.00          0.33
2000        2000-10-03     43200.18          0.5
2000        2000-09-28     41888.88          0.67
2000        2000-10-02     36021.93          0.83
2000        2000-10-04     32800.50          1

What caused the Product_IDs to be sorted?
Answer to Quiz – What Causes the Product_ID to Reset?

What caused the Product_IDs to be sorted? It was the PARTITION BY statement.
Finding Gaps between Dates

SELECT Product_Id, Sale_Date,
       MIN(Sale_Date) OVER (PARTITION BY Product_Id ORDER BY Sale_Date
                            ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
            AS Date_Of_Next_Row
      ,MIN(Sale_Date) OVER (PARTITION BY Product_Id ORDER BY Sale_Date
                            ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
            - Sale_Date AS Days_To_Next_Row
FROM Sales_Table
WHERE Product_ID Between 1000 and 2000 ;

Product_ID  Sale_Date   Date_Of_Next_Row  Days_To_Next_Row
__________  __________  ________________  ________________
1000        10/04/2000  ?                 ?
1000        10/03/2000  10/04/2000        1
1000        10/02/2000  10/03/2000        1
1000        10/01/2000  10/02/2000        1
1000        09/30/2000  10/01/2000        1
1000        09/29/2000  09/30/2000        1
1000        09/28/2000  09/29/2000        1
2000        10/04/2000  ?                 ?
2000        10/03/2000  10/04/2000        1

Not all rows are displayed

The above query finds gaps in dates. For each row, it finds the next Sale_Date for the same product and subtracts the current Sale_Date, so any Days_To_Next_Row greater than 1 is a gap. The last row for each product has no following row, so it shows a null (?).
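The book's Sales_Table has a sale on every day, so every Days_To_Next_Row above is 1. To watch the query actually catch a gap, here is a sketch that uses Python's built-in sqlite3 module with made-up rows that skip two days. Several things are assumptions here: SQLite 3.25 or later, ISO-formatted date strings, and the julianday() function, which is SQLite's way of subtracting dates and is not Vertica syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT)")
# Hypothetical rows: no sales on 2000-09-30 and 2000-10-01.
conn.executemany(
    "INSERT INTO Sales_Table VALUES (?, ?)",
    [
        (1000, "2000-09-28"),
        (1000, "2000-09-29"),
        (1000, "2000-10-02"),
        (1000, "2000-10-03"),
    ],
)

gaps = conn.execute("""
    WITH Next_Dates AS (
        SELECT Product_ID, Sale_Date,
               MIN(Sale_Date) OVER (PARTITION BY Product_ID ORDER BY Sale_Date
                   ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
                   AS Date_Of_Next_Row
        FROM Sales_Table)
    SELECT Sale_Date, Date_Of_Next_Row,
           -- julianday() is SQLite-specific date arithmetic
           CAST(julianday(Date_Of_Next_Row) - julianday(Sale_Date) AS INTEGER)
               AS Days_To_Next_Row
    FROM Next_Dates
    ORDER BY Sale_Date
""").fetchall()

for row in gaps:
    print(row)
```

The row for 2000-09-29 reports 3 days to the next row, which is the gap. The last row has nothing following it, so both derived columns come back as None.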
CSUM – Rows Unbounded Preceding Explained

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS CsumAnsi
FROM Sales_Table
WHERE Product_ID BETWEEN 1000 and 2000 ;

Product_ID  Sale_Date   Daily_Sales  CsumAnsi
__________  __________  ___________  _________
2000        2000-09-28     41888.88   41888.88
1000        2000-09-28     48850.40   90739.28
2000        2000-09-29     48000.00  138739.28
1000        2000-09-29     54500.22  193239.50
1000        2000-09-30     36000.07  229239.57
2000        2000-09-30     49850.03  279089.60
1000        2000-10-01     40200.43  319290.03
2000        2000-10-01     54850.29  374140.32
1000        2000-10-02     32800.50  406940.82
2000        2000-10-02     36021.93  442962.75

Not all rows are displayed in this answer set

The keywords Rows Unbounded Preceding determine that this is a cumulative sum (CSUM). There are only a few different statements and Rows Unbounded Preceding is the main one. It means start calculating at the beginning row, and continue adding each row to the running total until the last row.
CSUM – Making Sense of the Data

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS SUMOVER
FROM Sales_Table
WHERE Product_ID BETWEEN 1000 and 2000 ;

Product_ID  Sale_Date   Daily_Sales  SUMOVER
__________  __________  ___________  _________
2000        2000-09-28     41888.88   41888.88
1000        2000-09-28     48850.40   90739.28
2000        2000-09-29     48000.00  138739.28
1000        2000-09-29     54500.22  193239.50
1000        2000-09-30     36000.07  229239.57
2000        2000-09-30     49850.03  279089.60
1000        2000-10-01     40200.43  319290.03
2000        2000-10-01     54850.29  374140.32
1000        2000-10-02     32800.50  406940.82
2000        2000-10-02     36021.93  442962.75
1000        2000-10-03     64300.00  507262.75

Not all rows are displayed in this answer set
The second “SUMOVER” row is 90739.28. That is derived by the first row’s Daily_Sales (41888.88) added to the SECOND row’s Daily_Sales (48850.40).
CSUM – Making Even More Sense of the Data
The third “SUMOVER” row is 138739.28. That is derived by taking the first row’s Daily_Sales (41888.88) and adding it to the SECOND row’s Daily_Sales (48850.40). Then, you add that total to the THIRD row’s Daily_Sales (48000.00).
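The running total described above is the same thing Python's itertools.accumulate produces. The sketch below uses the first four Daily_Sales values in the order the answer set shows them; Decimal is used so the cents add up exactly. It only illustrates the arithmetic and is not how the database does it.

```python
from decimal import Decimal
from itertools import accumulate

# First four Daily_Sales values, in the order shown in the answer set.
daily_sales = [Decimal(s) for s in ("41888.88", "48850.40", "48000.00", "54500.22")]

# Each entry is the current row plus every row before it.
running_total = list(accumulate(daily_sales))

for sale, total in zip(daily_sales, running_total):
    print(sale, total)
```

The second entry is the 90739.28 from the previous page and the third is the 138739.28 explained above.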
CSUM – The Major and Minor Sort Key(s)

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS SumOVER
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  SumOVER
__________  __________  ___________  _________
1000        2000-09-28     48850.40   48850.40
1000        2000-09-29     54500.22  103350.62
1000        2000-09-30     36000.07  139350.69
1000        2000-10-01     40200.43  179551.12
1000        2000-10-02     32800.50  212351.62
1000        2000-10-03     64300.00  276651.62
1000        2000-10-04     54553.10  331204.72
2000        2000-09-28     41888.88  373093.60
2000        2000-09-29     48000.00  421093.60
2000        2000-09-30     49850.03  470943.63
2000        2000-10-01     54850.29  525793.92

Not all rows are displayed in this answer set

You can have more than one SORT KEY. In the query above, Product_ID is the MAJOR Sort and Sale_Date is the MINOR Sort. Remember, the data is sorted first and then the cumulative sum is calculated. That is why they are called Ordered Analytics.
The ANSI CSUM – Getting a Sequential Number

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) as SUMOVER,
       SUM(1) OVER (ORDER BY Product_ID, Sale_Date
                    ROWS UNBOUNDED PRECEDING) AS Seq_Number
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  SUMOVER    Seq_Number
__________  __________  ___________  _________  __________
1000        2000-09-28     48850.40   48850.40           1
1000        2000-09-29     54500.22  103350.62           2
1000        2000-09-30     36000.07  139350.69           3
1000        2000-10-01     40200.43  179551.12           4
1000        2000-10-02     32800.50  212351.62           5
1000        2000-10-03     64300.00  276651.62           6
1000        2000-10-04     54553.10  331204.72           7
2000        2000-09-28     41888.88  373093.60           8
2000        2000-09-29     48000.00  421093.60           9
2000        2000-09-30     49850.03  470943.63          10

Because you placed the number 1 in the area that calculates the cumulative sum, “Seq_Number” continuously adds 1 to the answer for each row.
Troubleshooting the ANSI OLAP on a GROUP BY

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS AnsiCsum
FROM Sales_Table
GROUP BY Product_ID ;

Error! Why?

Never use a GROUP BY to reset a SUM Over or any other ANSI Syntax OLAP command. Here the query fails because Sale_Date and Daily_Sales are selected but are neither in the GROUP BY nor aggregated. If you want to reset, use a PARTITION BY Statement, but never a GROUP BY.
Reset with a PARTITION BY Statement

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS SumANSI
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  SumANSI
__________  __________  ___________  _________
1000        2000-09-28     48850.40   48850.40
1000        2000-09-29     54500.22  103350.62
1000        2000-09-30     36000.07  139350.69
1000        2000-10-01     40200.43  179551.12
1000        2000-10-02     32800.50  212351.62
1000        2000-10-03     64300.00  276651.62
1000        2000-10-04     54553.10  331204.72
2000        2000-09-28     41888.88   41888.88
2000        2000-09-29     48000.00   89888.88
2000        2000-09-30     49850.03  139738.91

Not all rows are displayed in this answer set

CSUM Resets on Product_ID break
The PARTITION Statement is how you reset in ANSI. This will cause the SUMANSI to start over (reset) on its calculating for each NEW Product_ID.
PARTITION BY only Resets a Single OLAP not ALL of them

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS Subtotal,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS GrandTotal
FROM Sales_Table ;

Product_ID  Sale_Date   Daily_Sales  Subtotal   GrandTotal
__________  __________  ___________  _________  __________
1000        2000-09-28     48850.40   48850.40    48850.40
1000        2000-09-29     54500.22  103350.62   103350.62
1000        2000-09-30     36000.07  139350.69   139350.69
1000        2000-10-01     40200.43  179551.12   179551.12
1000        2000-10-02     32800.50  212351.62   212351.62
1000        2000-10-03     64300.00  276651.62   276651.62
1000        2000-10-04     54553.10  331204.72   331204.72
2000        2000-09-28     41888.88   41888.88   373093.60
2000        2000-09-29     48000.00   89888.88   421093.60
2000        2000-09-30     49850.03  139738.91   470943.63

Not all rows are displayed in this answer set
Above are two OLAP statements. Only one has PARTITION BY, so only it resets. The other continuously does a CSUM.
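Here is a way to try the two OLAP statements side by side without a Vertica system. The sketch uses Python's built-in sqlite3 module and assumes SQLite 3.25 or later for window functions. It loads only three rows for Product_ID 1000 and two for 2000, so the GrandTotal values for product 2000 are smaller than in the answer set above, which includes all seven rows for product 1000.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL)"
)
# A five-row subset of the book's data.
conn.executemany(
    "INSERT INTO Sales_Table VALUES (?, ?, ?)",
    [
        (1000, "2000-09-28", 48850.40),
        (1000, "2000-09-29", 54500.22),
        (1000, "2000-09-30", 36000.07),
        (2000, "2000-09-28", 41888.88),
        (2000, "2000-09-29", 48000.00),
    ],
)

rows = conn.execute("""
    SELECT Product_ID, Sale_Date,
           SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                                  ORDER BY Product_ID, Sale_Date
                                  ROWS UNBOUNDED PRECEDING) AS Subtotal,
           SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                                  ROWS UNBOUNDED PRECEDING) AS GrandTotal
    FROM Sales_Table
    ORDER BY Product_ID, Sale_Date
""").fetchall()

# Round in Python so floating-point noise does not show in the cents.
subtotals = [round(row[2], 2) for row in rows]
grand_totals = [round(row[3], 2) for row in rows]
print(subtotals)
print(grand_totals)
```

Subtotal starts over at 41888.88 on the first Product_ID 2000 row, while GrandTotal keeps climbing because its OVER clause has no PARTITION BY.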
CURRENT ROW AND UNBOUNDED FOLLOWING

SELECT Product_ID, Sale_Date
      ,Daily_Sales
      ,SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
            AS CumulativeTotal
FROM Sales_Table
ORDER BY CumulativeTotal

Product_ID  Sale_Date   Daily_Sales  CumulativeTotal
__________  __________  ___________  _______________
3000        10/04/2000     15675.33         15675.33
3000        10/03/2000     21553.79         37229.12
3000        10/02/2000     19678.94         56908.06
3000        10/01/2000     28000.00         84908.06
3000        09/30/2000     43868.86        128776.92
3000        09/29/2000     34509.13        163286.05
3000        09/28/2000     61301.77        224587.82
2000        10/04/2000     32800.50        257388.32
2000        10/03/2000     43200.18        300588.50
2000        10/02/2000     36021.93        336610.43
2000        10/01/2000     54850.29        391460.72

Not all rows are displayed in this answer set

Above we used the ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING to produce a CSUM, but notice that the Product_ID and the Sale_Date are reversed. Each row adds itself to all of the rows that follow it, so the last row in the sort has the smallest total. Because the answer set is ordered by CumulativeTotal, we see the Product_ID of 3000 first and the latest date first.
Different Windowing Options

SELECT Product_ID, Sale_Date, Daily_Sales
      ,SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) as Row_Preceding
      ,SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_Id, Sale_Date
                              ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) as Row_Following
FROM Sales_Table

Product_ID  Sale_Date   Daily_Sales  Row_Preceding  Row_Following
__________  __________  ___________  _____________  _____________
1000        09/28/2000     48850.40       48850.40      103350.62
1000        09/29/2000     54500.22      103350.62       90500.29
1000        09/30/2000     36000.07       90500.29       76200.50
1000        10/01/2000     40200.43       76200.50       73000.93
1000        10/02/2000     32800.50       73000.93       97100.50
1000        10/03/2000     64300.00       97100.50      118853.10
1000        10/04/2000     54553.10      118853.10       54553.10
2000        09/28/2000     41888.88       41888.88       89888.88
2000        09/29/2000     48000.00       89888.88       97850.03

Not all rows are displayed in this answer set

The example above uses ROWS BETWEEN 1 PRECEDING AND CURRENT ROW and then uses a different example with ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING. Notice how the report came out? Row_Preceding adds each row to the row before it, and Row_Following adds each row to the row after it. Because of the PARTITION BY, neither window crosses from Product_ID 1000 into Product_ID 2000.
Moving Sum has a Moving Window

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 Preceding) AS Sum3_ANSI
FROM Sales_Table ;

ROWS 2 Preceding: calculate the Current Row and 2 rows preceding (Moving Window of 3 rows)

Product_ID  Sale_Date   Daily_Sales  Sum3_ANSI
__________  __________  ___________  _________
1000        2000-09-28     48850.40   48850.40
1000        2000-09-29     54500.22  103350.62
1000        2000-09-30     36000.07  139350.69
1000        2000-10-01     40200.43  130700.72
1000        2000-10-02     32800.50  109001.00
1000        2000-10-03     64300.00  137300.93
1000        2000-10-04     54553.10  151653.60
2000        2000-09-28     41888.88  160741.98
2000        2000-09-29     48000.00  144441.98

Not all rows are displayed in this answer set

The SUM () Over allows you to get the moving SUM of a certain column. The moving window in ANSI form always includes the current row. A Rows 2 Preceding statement means the current row and two preceding, which is a moving window of 3.
How ANSI Moving SUM Handles the Sort

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 Preceding) AS Sum3_ANSI
FROM Sales_Table ;

ORDER BY Product_ID, Sale_Date: Major and Minor Sort keys

Product_ID  Sale_Date   Daily_Sales  Sum3_ANSI
__________  __________  ___________  _________
1000        2000-09-28     48850.40   48850.40
1000        2000-09-29     54500.22  103350.62
1000        2000-09-30     36000.07  139350.69
1000        2000-10-01     40200.43  130700.72
1000        2000-10-02     32800.50  109001.00
1000        2000-10-03     64300.00  137300.93
1000        2000-10-04     54553.10  151653.60
2000        2000-09-28     41888.88  160741.98
2000        2000-09-29     48000.00  144441.98
2000        2000-09-30     49850.03  139738.91
2000        2000-10-01     54850.29  152700.32
2000        2000-10-02     36021.93  140722.25

Not all rows are displayed in this answer set

The SUM OVER places the sort after the ORDER BY.
Quiz – How is that Total Calculated?
With a Moving Window of 3, how is the 139350.69 amount derived in the Sum3_ANSI column in the third row?
Answer to Quiz – How is that Total Calculated?
With a Moving Window of 3, how is the 139350.69 amount derived in the Sum3_ANSI column in the third row? It is the sum of 48850.40, 54500.22 and 36000.07. The current row of Daily_Sales plus the previous two rows of Daily_Sales.
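You can confirm the quiz answer by running the same ROWS 2 Preceding window on the first five Product_ID 1000 rows. The sketch below uses Python's built-in sqlite3 module and assumes SQLite 3.25 or later, which accepts this window syntax; nothing here is specific to Vertica.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL)"
)
# The first five rows for Product_ID 1000 from the answer set.
conn.executemany(
    "INSERT INTO Sales_Table VALUES (?, ?, ?)",
    [
        (1000, "2000-09-28", 48850.40),
        (1000, "2000-09-29", 54500.22),
        (1000, "2000-09-30", 36000.07),
        (1000, "2000-10-01", 40200.43),
        (1000, "2000-10-02", 32800.50),
    ],
)

rows = conn.execute("""
    SELECT Sale_Date, Daily_Sales,
           SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                                  ROWS 2 PRECEDING) AS Sum3_ANSI
    FROM Sales_Table
    ORDER BY Product_ID, Sale_Date
""").fetchall()

# Round in Python so floating-point noise does not show in the cents.
sum3 = [round(row[2], 2) for row in rows]
print(sum3)
```

The first two rows have fewer than two rows preceding them, so they sum only what is available. From the third row on, the window is always three rows wide and slides down one row at a time.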
Moving SUM every 3-rows Vs a Continuous Sum

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 Preceding) AS SUM3,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED Preceding) AS Continuous
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  SUM3       Continuous
__________  __________  ___________  _________  __________
1000        2000-09-28     48850.40   48850.40    48850.40
1000        2000-09-29     54500.22  103350.62   103350.62
1000        2000-09-30     36000.07  139350.69   139350.69
1000        2000-10-01     40200.43  130700.72   179551.12
1000        2000-10-02     32800.50  109001.00   212351.62
1000        2000-10-03     64300.00  137300.93   276651.62
1000        2000-10-04     54553.10  151653.60   331204.72
2000        2000-09-28     41888.88  160741.98   373093.60
2000        2000-09-29     48000.00  144441.98   421093.60

Not all rows are displayed in this answer set
The ROWS 2 Preceding gives the MSUM for every 3 rows. The ROWS UNBOUNDED Preceding gives the continuous MSUM.
PARTITION BY Resets an ANSI OLAP

SELECT Product_ID, Sale_Date, Daily_Sales,
       SUM(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 Preceding) AS SUM3,
       SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED Preceding) AS Continuous
FROM Sales_Table;

PARTITION BY Product_ID: ANSI reset, much like a GROUP BY

Product_ID  Sale_Date   Daily_Sales  SUM3       Continuous
__________  __________  ___________  _________  __________
1000        2000-09-28     48850.40   48850.40    48850.40
1000        2000-09-29     54500.22  103350.62   103350.62
1000        2000-09-30     36000.07  139350.69   139350.69
1000        2000-10-01     40200.43  130700.72   179551.12
1000        2000-10-02     32800.50  109001.00   212351.62
1000        2000-10-03     64300.00  137300.93   276651.62
1000        2000-10-04     54553.10  151653.60   331204.72
2000        2000-09-28     41888.88  160741.98    41888.88
2000        2000-09-29     48000.00  144441.98    89888.88

Not all rows are displayed
Use a PARTITION BY Statement to Reset the ANSI OLAP. Notice it only resets the OLAP command containing the Partition By statement, but not the other OLAPs.
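One way to picture the reset is to restart the running total whenever Product_ID changes. The Python sketch below does that with itertools.groupby. The function name running_sum_by_partition is hypothetical, and the rows are the nine displayed rows, assumed to be sorted by Product_ID and Sale_Date already.

```python
from decimal import Decimal
from itertools import accumulate, groupby

# (Product_ID, Daily_Sales) pairs, already sorted by Product_ID, Sale_Date.
rows = (
    [(1000, Decimal(v)) for v in ("48850.40", "54500.22", "36000.07", "40200.43",
                                  "32800.50", "64300.00", "54553.10")]
    + [(2000, Decimal(v)) for v in ("41888.88", "48000.00")]
)

def running_sum_by_partition(rows):
    """Running total that starts over for every new Product_ID."""
    out = []
    for _, group in groupby(rows, key=lambda r: r[0]):
        out.extend(accumulate(sales for _, sales in group))
    return out

continuous = running_sum_by_partition(rows)
print(continuous[6], continuous[7])  # last row of 1000, first row of 2000
```

The SUM3 column has no PARTITION BY, so its window keeps sliding straight across the Product_ID break.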
The Moving Window is Current Row and Preceding

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 Preceding) AS AVG_3_ANSI
FROM Sales_Table;

Moving Window of 3 rows: calculate the Current Row and 2 rows preceding.

Product_ID  Sale_Date   Daily_Sales  AVG_3_ANSI
__________  __________  ___________  __________
1000        2000-09-28  48850.40     48850.40
1000        2000-09-29  54500.22     51675.31
1000        2000-09-30  36000.07     46450.23
1000        2000-10-01  40200.43     43566.91
1000        2000-10-02  32800.50     36333.67
1000        2000-10-03  64300.00     45766.98
1000        2000-10-04  54553.10     50551.20
2000        2000-09-28  41888.88     53580.66
2000        2000-09-29  48000.00     48147.33
2000        2000-09-30  49850.03     46579.64

Not all rows are displayed
The AVG () Over allows you to get the moving average of a certain column. The Rows 2 Preceding is a moving window of 3 in ANSI.
How Moving Average Handles the Sort

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 Preceding) AS AVG_3_ANSI
FROM Sales_Table;

Product_ID and Sale_Date are the Major and Minor Sort keys.

Product_ID  Sale_Date   Daily_Sales  AVG_3_ANSI
__________  __________  ___________  __________
1000        2000-09-28  48850.40     48850.40
1000        2000-09-29  54500.22     51675.31
1000        2000-09-30  36000.07     46450.23
1000        2000-10-01  40200.43     43566.91
1000        2000-10-02  32800.50     36333.67
1000        2000-10-03  64300.00     45766.98
1000        2000-10-04  54553.10     50551.20
2000        2000-09-28  41888.88     53580.66
2000        2000-09-29  48000.00     48147.33
2000        2000-09-30  49850.03     46579.64
2000        2000-10-01  54850.29     50900.11
2000        2000-10-02  36021.93     46907.42

Not all rows are displayed
Much like the SUM OVER Command, the Average OVER places the sort keys via the ORDER BY keywords.
Moving Average

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 Preceding) AS AVG_3
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  AVG_3
__________  __________  ___________  __________
1000        2000-09-28  48850.40     48850.4
1000        2000-09-29  54500.22     51675.31
1000        2000-09-30  36000.07     46450.23
1000        2000-10-01  40200.43     43566.91
1000        2000-10-02  32800.50     36333.67
1000        2000-10-03  64300.00     45766.98
1000        2000-10-04  54553.10     50551.2
2000        2000-09-28  41888.88     53580.66
2000        2000-09-29  48000.00     48147.33
2000        2000-09-30  49850.03     46579.64

Not all rows are displayed
Understand that in ANSI a ROWS 2 PRECEDING is considered a Moving Window of 3. That is because in ANSI it is considered the Current Row and 2 preceding. The next page will use the CAST command to provide a precision of 0 decimal places.
Moving Average

SELECT Product_ID, Sale_Date, Daily_Sales,
       CAST(AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                                   ROWS 2 Preceding) AS Decimal(8,0)) AS AVG_3
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  AVG_3
__________  __________  ___________  _____
1000        2000-09-28  48850.40     48850
1000        2000-09-29  54500.22     51675
1000        2000-09-30  36000.07     46450
1000        2000-10-01  40200.43     43567
1000        2000-10-02  32800.50     36334
1000        2000-10-03  64300.00     45767
1000        2000-10-04  54553.10     50551
2000        2000-09-28  41888.88     53581
2000        2000-09-29  48000.00     48147
2000        2000-09-30  49850.03     46580

Not all rows are displayed
Understand that in ANSI a ROWS 2 PRECEDING is considered a Moving Window of 3. That is because in ANSI it is considered the Current Row and 2 preceding.
Quiz – How is that Total Calculated?

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 Preceding) AS AVG_3_ANSI
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  AVG_3_ANSI
__________  __________  ___________  __________
1000        2000-09-28  48850.40     48850.40
1000        2000-09-29  54500.22     51675.31
1000        2000-09-30  36000.07     46450.23
1000        2000-10-01  40200.43     43566.91
1000        2000-10-02  32800.50     36333.67
1000        2000-10-03  64300.00     45766.98
1000        2000-10-04  54553.10     50551.20
2000        2000-09-28  41888.88     53580.66
2000        2000-09-29  48000.00     48147.33
2000        2000-09-30  49850.03     46579.64
2000        2000-10-01  54850.29     50900.11
2000        2000-10-02  36021.93     46907.42

Not all rows are displayed
With a Moving Window of 3, how is the 46450.23 amount derived in the AVG_3_ANSI column in the third row?
Answer to Quiz – How is that Total Calculated?

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 Preceding) AS AVG_3_ANSI
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  AVG_3_ANSI
__________  __________  ___________  __________
1000        2000-09-28  48850.40     48850.40
1000        2000-09-29  54500.22     51675.31
1000        2000-09-30  36000.07     46450.23
1000        2000-10-01  40200.43     43566.91
1000        2000-10-02  32800.50     36333.67
1000        2000-10-03  64300.00     45766.98
1000        2000-10-04  54553.10     50551.20
2000        2000-09-28  41888.88     53580.66
2000        2000-09-29  48000.00     48147.33
2000        2000-09-30  49850.03     46579.64
2000        2000-10-01  54850.29     50900.11
2000        2000-10-02  36021.93     46907.42

Not all rows are displayed

Third row: AVG of 48850.40, 54500.22, and 36000.07
With a Moving Window of 3, the 46450.23 amount derived in the third row is the average of 48850.40, 54500.22 and 36000.07.
Quiz – How is that 4th Row Calculated?

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 Preceding) AS AVG_3_ANSI
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  AVG_3_ANSI
__________  __________  ___________  __________
1000        2000-09-28  48850.40     48850.40
1000        2000-09-29  54500.22     51675.31
1000        2000-09-30  36000.07     46450.23
1000        2000-10-01  40200.43     43566.91
1000        2000-10-02  32800.50     36333.67
1000        2000-10-03  64300.00     45766.98
1000        2000-10-04  54553.10     50551.20
2000        2000-09-28  41888.88     53580.66
2000        2000-09-29  48000.00     48147.33
2000        2000-09-30  49850.03     46579.64
2000        2000-10-01  54850.29     50900.11
2000        2000-10-02  36021.93     46907.42

Not all rows are displayed
With a Moving Window of 3, how is the 43566.91 amount derived in the AVG_3_ANSI column in the fourth row?
Answer to Quiz – How is that 4th Row Calculated?

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 Preceding) AS AVG_3_ANSI
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  AVG_3_ANSI
__________  __________  ___________  __________
1000        2000-09-28  48850.40     48850.40
1000        2000-09-29  54500.22     51675.31
1000        2000-09-30  36000.07     46450.23
1000        2000-10-01  40200.43     43566.91
1000        2000-10-02  32800.50     36333.67
1000        2000-10-03  64300.00     45766.98
1000        2000-10-04  54553.10     50551.20
2000        2000-09-28  41888.88     53580.66
2000        2000-09-29  48000.00     48147.33
2000        2000-09-30  49850.03     46579.64
2000        2000-10-01  54850.29     50900.11
2000        2000-10-02  36021.93     46907.42

Not all rows are displayed

Fourth row: AVG of 54500.22, 36000.07 and 40200.43
With a Moving Window of 3, how is the 43566.91 amount derived in the AVG_3_ANSI column in the fourth row? The current row plus Rows 2 Preceding.
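The two quiz answers can be verified by hand. The Python sketch below averages the current row with up to two preceding rows. The name moving_avg is illustrative, and rounding to two decimal places (half up) is an assumption made here so the figures line up with the answer set; it is not a statement about how Vertica rounds internally.

```python
from decimal import Decimal, ROUND_HALF_UP

# First four Daily_Sales values for Product_ID 1000, in Sale_Date order.
daily_sales = [Decimal(v) for v in ("48850.40", "54500.22", "36000.07", "40200.43")]

def moving_avg(values, preceding):
    """Average each row with up to `preceding` rows before it."""
    out = []
    for i in range(len(values)):
        frame = values[max(0, i - preceding):i + 1]
        avg = sum(frame) / len(frame)  # short frames divide by fewer rows
        out.append(avg.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))
    return out

avg3 = moving_avg(daily_sales, 2)
print(avg3[2], avg3[3])  # third and fourth rows
```

Note that the second row divides by 2, not 3, because only one row precedes it.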
Moving Average every 3-rows vs a Continuous Average

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 Preceding) AS AVG3,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED Preceding) AS Continuous
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  AVG3      Continuous
__________  __________  ___________  ________  __________
1000        2000-09-28  48850.40     48850.40  48850.40
1000        2000-09-29  54500.22     51675.31  51675.31
1000        2000-09-30  36000.07     46450.23  46450.23
1000        2000-10-01  40200.43     43566.91  44887.78
1000        2000-10-02  32800.50     36333.67  42470.32
1000        2000-10-03  64300.00     45766.98  46108.60
1000        2000-10-04  54553.10     50551.20  47314.96
2000        2000-09-28  41888.88     53580.66  46636.70
2000        2000-09-29  48000.00     48147.33  46788.18

Not all rows are displayed
The ROWS 2 Preceding gives the MAVG for every 3 rows. The ROWS UNBOUNDED Preceding gives the continuous MAVG.
PARTITION BY Resets an ANSI OLAP

SELECT Product_ID, Sale_Date, Daily_Sales,
       AVG(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS 2 Preceding) AS AVG3,
       AVG(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED Preceding) AS Continuous
FROM Sales_Table;

PARTITION BY is the ANSI reset, much like a GROUP BY.

Product_ID  Sale_Date   Daily_Sales  AVG3      Continuous
__________  __________  ___________  ________  __________
1000        2000-09-28  48850.40     48850.40  48850.40
1000        2000-09-29  54500.22     51675.31  51675.31
1000        2000-09-30  36000.07     46450.23  46450.23
1000        2000-10-01  40200.43     43566.91  44887.78
1000        2000-10-02  32800.50     36333.67  42470.32
1000        2000-10-03  64300.00     45766.98  46108.60
1000        2000-10-04  54553.10     50551.20  47314.96
2000        2000-09-28  41888.88     53580.66  41888.88
2000        2000-09-29  48000.00     48147.33  44944.44

Not all rows are displayed
Use a PARTITION BY Statement to Reset the ANSI OLAP. The Partition By statement only resets the column using the statement. Notice that only Continuous resets.
Moving Difference using ANSI Syntax

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - SUM(Daily_Sales) OVER (ORDER BY Product_ID ASC, Sale_Date ASC
                                            ROWS BETWEEN 4 PRECEDING AND 4 PRECEDING)
           AS "MDiff_ANSI"
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  MDiff_ANSI
__________  __________  ___________  __________
1000        2000-09-28  48850.40     ?
1000        2000-09-29  54500.22     ?
1000        2000-09-30  36000.07     ?
1000        2000-10-01  40200.43     ?
1000        2000-10-02  32800.50     -16049.90
1000        2000-10-03  64300.00     9799.78
1000        2000-10-04  54553.10     18553.03
2000        2000-09-28  41888.88     1688.45
2000        2000-09-29  48000.00     15199.50
2000        2000-09-30  49850.03     -14449.97
2000        2000-10-01  54850.29     297.19

Not all rows are displayed
This is how you do a MDiff using the ANSI Syntax with a moving window of 4.
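The frame ROWS BETWEEN 4 PRECEDING AND 4 PRECEDING contains exactly one row, the one four rows back, so the SUM over it is just that row's value. A small Python sketch of the same idea follows. The name moving_diff is hypothetical, None stands in for the NULL (?) the answer set shows when no row exists four back, and the values are the seven Product_ID 1000 rows.

```python
from decimal import Decimal

# Daily_Sales for Product_ID 1000, in Sale_Date order.
daily_sales = [Decimal(v) for v in ("48850.40", "54500.22", "36000.07", "40200.43",
                                    "32800.50", "64300.00", "54553.10")]

def moving_diff(values, lag):
    """Current row minus the row `lag` positions earlier; None when there is none."""
    return [None if i < lag else values[i] - values[i - lag]
            for i in range(len(values))]

mdiff = moving_diff(daily_sales, 4)
print(mdiff[4])  # fifth row: 32800.50 minus the first row's 48850.40
```

The first four rows have no row four positions back, which is why they come back as NULL.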
Moving Difference using ANSI Syntax with Partition By

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - SUM(Daily_Sales) OVER (PARTITION BY Product_ID
                                            ORDER BY Product_ID ASC, Sale_Date ASC
                                            ROWS BETWEEN 4 PRECEDING AND 4 PRECEDING)
           AS "MDiff_ANSI"
FROM Sales_Table;

Product_ID  Sale_Date   Daily_Sales  MDiff_ANSI
__________  __________  ___________  __________
1000        2000-09-28  48850.40     ?
1000        2000-09-29  54500.22     ?
1000        2000-09-30  36000.07     ?
1000        2000-10-01  40200.43     ?
1000        2000-10-02  32800.50     -16049.90
1000        2000-10-03  64300.00     9799.78
1000        2000-10-04  54553.10     18553.03
2000        2000-09-28  41888.88     ?
2000        2000-09-29  48000.00     ?
2000        2000-09-30  49850.03     ?
2000        2000-10-01  54850.29     ?
2000        2000-10-02  36021.93     -5866.95

Not all rows are displayed
Wow! This is how you do a MDiff using the ANSI Syntax with a moving window of 4 and with a PARTITION BY statement.
COUNT OVER for a Sequential Number

SELECT Product_ID, Sale_Date, Daily_Sales,
       COUNT(*) OVER (ORDER BY Product_ID, Sale_Date
                      ROWS UNBOUNDED PRECEDING) AS Seq_Number
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  Seq_Number
__________  __________  ___________  __________
1000        2000-09-28  48850.40     1
1000        2000-09-29  54500.22     2
1000        2000-09-30  36000.07     3
1000        2000-10-01  40200.43     4
1000        2000-10-02  32800.50     5
1000        2000-10-03  64300.00     6
1000        2000-10-04  54553.10     7
2000        2000-09-28  41888.88     8
2000        2000-09-29  48000.00     9
2000        2000-09-30  49850.03     10
2000        2000-10-01  54850.29     11

Not all rows are displayed

This is the COUNT OVER. It will provide a sequential number starting at 1. The Keyword(s) ROWS UNBOUNDED PRECEDING causes Seq_Number to start at the beginning and increase sequentially to the end.
COUNT OVER without Rows Unbounded Preceding

SELECT Product_ID, Sale_Date, Daily_Sales,
       COUNT(*) OVER (ORDER BY Product_ID, Sale_Date) AS No_Seq
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  No_Seq
__________  __________  ___________  ______
1000        2000-09-28  48850.40     1
1000        2000-09-29  54500.22     2
1000        2000-09-30  36000.07     3
1000        2000-10-01  40200.43     4
1000        2000-10-02  32800.50     5
1000        2000-10-03  64300.00     6
1000        2000-10-04  54553.10     7
2000        2000-09-28  41888.88     8
2000        2000-09-29  48000.00     9
2000        2000-09-30  49850.03     10
2000        2000-10-01  54850.29     11
2000        2000-10-02  36021.93     12
2000        2000-10-03  43200.18     13
2000        2000-10-04  32800.50     14

14 rows came back

When you don't have a ROWS UNBOUNDED PRECEDING this still works just fine.
Quiz – What caused the COUNT OVER to Reset?

SELECT Product_ID, Sale_Date, Daily_Sales,
       COUNT(*) OVER (PARTITION BY Product_ID
                      ORDER BY Product_ID, Sale_Date
                      ROWS UNBOUNDED PRECEDING) AS StartOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  StartOver
__________  __________  ___________  _________
1000        2000-09-28  48850.40     1
1000        2000-09-29  54500.22     2
1000        2000-09-30  36000.07     3
1000        2000-10-01  40200.43     4
1000        2000-10-02  32800.50     5
1000        2000-10-03  64300.00     6
1000        2000-10-04  54553.10     7
2000        2000-09-28  41888.88     1
2000        2000-09-29  48000.00     2
2000        2000-09-30  49850.03     3
2000        2000-10-01  54850.29     4
2000        2000-10-02  36021.93     5
2000        2000-10-03  43200.18     6
2000        2000-10-04  32800.50     7

What Keyword(s) caused StartOver to reset?
Answer to Quiz – What caused the COUNT OVER to Reset?

SELECT Product_ID, Sale_Date, Daily_Sales,
       COUNT(*) OVER (PARTITION BY Product_ID
                      ORDER BY Product_ID, Sale_Date
                      ROWS UNBOUNDED PRECEDING) AS StartOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  StartOver
__________  __________  ___________  _________
1000        2000-09-28  48850.40     1
1000        2000-09-29  54500.22     2
1000        2000-09-30  36000.07     3
1000        2000-10-01  40200.43     4
1000        2000-10-02  32800.50     5
1000        2000-10-03  64300.00     6
1000        2000-10-04  54553.10     7
2000        2000-09-28  41888.88     1
2000        2000-09-29  48000.00     2
2000        2000-09-30  49850.03     3
2000        2000-10-01  54850.29     4
2000        2000-10-02  36021.93     5
2000        2000-10-03  43200.18     6
2000        2000-10-04  32800.50     7

What Keyword(s) caused StartOver to reset? It is the PARTITION BY statement.
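To see the difference between the two COUNT OVER forms side by side, here is a small Python sketch. The function count_over and its partitioned flag are invented for illustration. The input is just the Product_ID column of the 14 sorted rows.

```python
from itertools import groupby

# Product_ID column of the 14 rows, already sorted by Product_ID, Sale_Date.
product_ids = [1000] * 7 + [2000] * 7

def count_over(keys, partitioned):
    """Running row count; restarts at each new key when `partitioned` is True."""
    if not partitioned:
        return list(range(1, len(keys) + 1))
    out = []
    for _, group in groupby(keys):
        out.extend(range(1, len(list(group)) + 1))
    return out

seq_number = count_over(product_ids, partitioned=False)  # like Seq_Number
start_over = count_over(product_ids, partitioned=True)   # like StartOver
print(seq_number[7], start_over[7])  # first Product_ID 2000 row in each column
```

Without the partition the eighth row is numbered 8. With it, the count starts again at 1.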
The MAX OVER Command

SELECT Product_ID, Sale_Date, Daily_Sales,
       MAX(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS MaxOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  MaxOver
__________  __________  ___________  ________
1000        2000-09-28  48850.40     48850.40
1000        2000-09-29  54500.22     54500.22
1000        2000-09-30  36000.07     54500.22
1000        2000-10-01  40200.43     54500.22
1000        2000-10-02  32800.50     54500.22
1000        2000-10-03  64300.00     64300.00
1000        2000-10-04  54553.10     64300.00
2000        2000-09-28  41888.88     64300.00
2000        2000-09-29  48000.00     64300.00
2000        2000-09-30  49850.03     64300.00
2000        2000-10-01  54850.29     64300.00
2000        2000-10-02  36021.93     64300.00
2000        2000-10-03  43200.18     64300.00
2000        2000-10-04  32800.50     64300.00
After the sort, the Max Over shows the Max Value up to that point.
MAX OVER with PARTITION BY Reset

SELECT Product_ID, Sale_Date, Daily_Sales,
       MAX(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS MaxOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  MaxOver
__________  __________  ___________  ________
1000        2000-09-28  48850.40     48850.40
1000        2000-09-29  54500.22     54500.22
1000        2000-09-30  36000.07     54500.22
1000        2000-10-01  40200.43     54500.22
1000        2000-10-02  32800.50     54500.22
1000        2000-10-03  64300.00     64300.00
1000        2000-10-04  54553.10     64300.00
2000        2000-09-28  41888.88     41888.88
2000        2000-09-29  48000.00     48000.00
2000        2000-09-30  49850.03     49850.03
2000        2000-10-01  54850.29     54850.29

Not all rows are displayed
The largest value is 64300.00 in the column MaxOver. Once it was evaluated, it did not continue until the end because of the PARTITION BY reset.
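A running maximum that resets per Product_ID can be sketched in Python with itertools.accumulate and the built-in max. The helper name running_max_by_partition is hypothetical. The rows are the eleven displayed rows, assumed sorted by Product_ID and Sale_Date.

```python
from itertools import accumulate, groupby

# (Product_ID, Daily_Sales) pairs, already sorted by Product_ID, Sale_Date.
rows = (
    [(1000, s) for s in (48850.40, 54500.22, 36000.07, 40200.43,
                         32800.50, 64300.00, 54553.10)]
    + [(2000, s) for s in (41888.88, 48000.00, 49850.03, 54850.29)]
)

def running_max_by_partition(rows):
    """Largest value seen so far, starting over for each Product_ID."""
    out = []
    for _, group in groupby(rows, key=lambda r: r[0]):
        out.extend(accumulate((sales for _, sales in group), max))
    return out

max_over = running_max_by_partition(rows)
print(max_over[6], max_over[7])  # last row of 1000, first row of 2000
```

The 64300.00 from Product_ID 1000 does not carry into Product_ID 2000, whose running maximum begins again at 41888.88.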
MAX OVER without Rows Unbounded Preceding

SELECT Product_ID, Sale_Date, Daily_Sales,
       MAX(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date) AS MaxOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  MaxOver
__________  __________  ___________  ________
1000        2000-09-28  48850.40     48850.40
1000        2000-09-29  54500.22     54500.22
1000        2000-09-30  36000.07     54500.22
1000        2000-10-01  40200.43     54500.22
1000        2000-10-02  32800.50     54500.22
1000        2000-10-03  64300.00     64300.00
1000        2000-10-04  54553.10     64300.00
2000        2000-09-28  41888.88     64300.00
2000        2000-09-29  48000.00     64300.00
2000        2000-09-30  49850.03     64300.00

Not all rows are displayed

You don't need the Rows Unbounded Preceding with the MAX OVER.
The MIN OVER Command

SELECT Product_ID, Sale_Date, Daily_Sales,
       MIN(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS MinOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  MinOver
__________  __________  ___________  ________
1000        2000-09-28  48850.40     48850.40
1000        2000-09-29  54500.22     48850.40
1000        2000-09-30  36000.07     36000.07
1000        2000-10-01  40200.43     36000.07
1000        2000-10-02  32800.50     32800.50
1000        2000-10-03  64300.00     32800.50
1000        2000-10-04  54553.10     32800.50
2000        2000-09-28  41888.88     32800.50
2000        2000-09-29  48000.00     32800.50
2000        2000-09-30  49850.03     32800.50
2000        2000-10-01  54850.29     32800.50
2000        2000-10-02  36021.93     32800.50
2000        2000-10-03  43200.18     32800.50
2000        2000-10-04  32800.50     32800.50

After the sort, the MIN () Over shows the Min Value up to that point.
MIN OVER without Rows Unbounded Preceding

SELECT Product_ID, Sale_Date, Daily_Sales,
       MIN(Daily_Sales) OVER (ORDER BY Product_ID, Sale_Date) AS MinOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  MinOver
__________  __________  ___________  ________
1000        2000-09-28  48850.40     32800.50
1000        2000-09-29  54500.22     32800.50
1000        2000-09-30  36000.07     32800.50
1000        2000-10-01  40200.43     32800.50
1000        2000-10-02  32800.50     32800.50
1000        2000-10-03  64300.00     32800.50
1000        2000-10-04  54553.10     32800.50
2000        2000-09-28  41888.88     32800.50
2000        2000-09-29  48000.00     32800.50
2000        2000-09-30  49850.03     32800.50
2000        2000-10-01  54850.29     32800.50

Not all rows are displayed

You don't need the Rows Unbounded Preceding with the MIN OVER.
Finding a Value of a Column in the Next Row with MIN

SELECT Product_ID, Sale_Date, Daily_Sales,
       MIN(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS BETWEEN 1 Following AND 1 Following) AS NextSale
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  NextSale
__________  __________  ___________  ________
1000        09/28/2000  48850.40     54500.22
1000        09/29/2000  54500.22     36000.07
1000        09/30/2000  36000.07     40200.43
1000        10/01/2000  40200.43     32800.50
1000        10/02/2000  32800.50     64300.00
1000        10/03/2000  64300.00     54553.10
1000        10/04/2000  54553.10     ?
2000        09/28/2000  41888.88     48000.00
2000        09/29/2000  48000.00     49850.03
2000        09/30/2000  49850.03     54850.29
2000        10/01/2000  54850.29     36021.93

Not all rows are displayed
The above example finds the value of a column in the next row for Daily_Sales. Notice it is partitioned so there is a Null value at the end of each Product_ID.
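Because the frame holds only the single row after the current one, MIN simply returns that row's value. The Python sketch below mimics this lookup. The function next_value is a made-up name, None stands in for the NULL (?) in the answer set, and the data is all 14 rows for the two products in Sale_Date order.

```python
# (Product_ID, Daily_Sales) pairs, already sorted by Product_ID, Sale_Date.
rows = (
    [(1000, s) for s in (48850.40, 54500.22, 36000.07, 40200.43,
                         32800.50, 64300.00, 54553.10)]
    + [(2000, s) for s in (41888.88, 48000.00, 49850.03, 54850.29,
                           36021.93, 43200.18, 32800.50)]
)

def next_value(rows):
    """Value from the following row, or None when the partition ends."""
    out = []
    for i, (product_id, _) in enumerate(rows):
        following = rows[i + 1] if i + 1 < len(rows) else None
        same_partition = following is not None and following[0] == product_id
        out.append(following[1] if same_partition else None)
    return out

next_sale = next_value(rows)
print(next_sale[6], next_sale[7])  # last row of 1000, first row of 2000
```

The last row of each Product_ID has no following row inside its partition, so it gets no value.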
The CSUM for Each Product_Id and the Next Start Date

SELECT ROW_NUMBER() OVER (PARTITION BY Product_ID
                          ORDER BY Sale_Date) AS Rnbr
      ,Product_Id AS PROD
      ,Sale_Date
      ,MIN(Sale_Date) OVER (PARTITION BY Product_ID ORDER BY Sale_Date
                            ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS Next_Start_Dt
      ,Daily_Sales
      ,SUM(Daily_Sales) OVER (PARTITION BY Product_ID ORDER BY Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS To_Date_Revenue
FROM Sales_Table;

Rnbr  Prod  Sale_Date   Next_Start_Dt  Daily_Sales  To_Date_Revenue
____  ____  __________  _____________  ___________  _______________
1     1000  09/28/2000  09/29/2000     48850.40     48850.40
2     1000  09/29/2000  09/30/2000     54500.22     103350.62
3     1000  09/30/2000  10/01/2000     36000.07     139350.69
4     1000  10/01/2000  10/02/2000     40200.43     179551.12
5     1000  10/02/2000  10/03/2000     32800.50     212351.62
6     1000  10/03/2000  10/04/2000     64300.00     276651.62
7     1000  10/04/2000  ?              54553.10     331204.72
1     2000  09/28/2000  09/29/2000     41888.88     41888.88

Not all rows are displayed
The above example shows the cumulative SUM for the Daily_Sales and the next date on the same line.
Quiz – Fill in the Blank

SELECT Product_ID, Sale_Date, Daily_Sales,
       MIN(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS MinOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  MinOver
__________  __________  ___________  ________
1000        2000-09-28  48850.40     48850.40
1000        2000-09-29  54500.22     48850.40
1000        2000-09-30  36000.07     36000.07
1000        2000-10-01  40200.43     36000.07
1000        2000-10-02  32800.50     32800.50
1000        2000-10-03  64300.00     32800.50
1000        2000-10-04  54553.10     32800.50
2000        2000-09-28  41888.88     41888.88
2000        2000-09-29  48000.00     41888.88
2000        2000-09-30  49850.03     41888.88
2000        2000-10-01  54850.29     41888.88
2000        2000-10-02  36021.93     36021.93
2000        2000-10-03  43200.18
2000        2000-10-04  32800.50

The last two answers (MinOver) are blank, so you can fill in the blank.
Answer – Fill in the Blank

SELECT Product_ID, Sale_Date, Daily_Sales,
       MIN(Daily_Sales) OVER (PARTITION BY Product_ID
                              ORDER BY Product_ID, Sale_Date
                              ROWS UNBOUNDED PRECEDING) AS MinOver
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  MinOver
__________  __________  ___________  ________
1000        2000-09-28  48850.40     48850.40
1000        2000-09-29  54500.22     48850.40
1000        2000-09-30  36000.07     36000.07
1000        2000-10-01  40200.43     36000.07
1000        2000-10-02  32800.50     32800.50
1000        2000-10-03  64300.00     32800.50
1000        2000-10-04  54553.10     32800.50
2000        2000-09-28  41888.88     41888.88
2000        2000-09-29  48000.00     41888.88
2000        2000-09-30  49850.03     41888.88
2000        2000-10-01  54850.29     41888.88
2000        2000-10-02  36021.93     36021.93
2000        2000-10-03  43200.18     36021.93
2000        2000-10-04  32800.50     32800.50

The last two answers (MinOver) are filled in.
How Ntile Works

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(4) OVER (ORDER BY Daily_Sales, Sale_Date) AS "Quartiles"
FROM Sales_Table
WHERE Product_ID = 1000;

Product_ID  Sale_Date   Daily_Sales  Quartiles
__________  __________  ___________  _________
1000        10/02/2000  32800.50     1
1000        09/30/2000  36000.07     1
1000        10/01/2000  40200.43     2
1000        09/28/2000  48850.40     2
1000        09/29/2000  54500.22     3
1000        10/04/2000  54553.10     3
1000        10/03/2000  64300.00     4

Changing the number passed to the Ntile function changes the number of partitions (tiles) established. Each tile is assigned a number starting at 1 and increasing to the number specified. So, with an Ntile of 4 the tiles are 1 through 4. The rows are then distributed as evenly as possible across the tiles in the order given by the ORDER BY, which here runs from the lowest to the highest Daily_Sales. When the rows do not divide evenly, the extra rows go to the lowest-numbered tiles. That is why tiles 1 through 3 hold two rows each and tile 4 holds one.
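The distribution rule can be sketched in a few lines of Python. The helper name ntile_numbers is invented for this illustration. It only produces the tile number for each row position after sorting, and it assumes the usual NTILE behavior in which leftover rows are handed to the lowest-numbered tiles.

```python
def ntile_numbers(row_count, tiles):
    """Tile number for each of `row_count` sorted rows, split into `tiles` groups."""
    base, extra = divmod(row_count, tiles)
    out = []
    for tile in range(1, tiles + 1):
        size = base + (1 if tile <= extra else 0)  # first `extra` tiles get one more row
        out.extend([tile] * size)
    return out

quartiles = ntile_numbers(7, 4)   # 7 rows for Product_ID 1000, NTILE(4)
print(quartiles)
```

With 7 rows and 4 tiles, each tile gets one row and the three leftover rows go to tiles 1, 2 and 3. If the tile count is larger than the row count, every row lands in its own tile and the higher tile numbers are never used.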
Ntile

SELECT Last_Name, Grade_Pt,
       NTILE(5) OVER (ORDER BY Grade_Pt) AS "Tile"
FROM Student_Table
ORDER BY "Tile" DESC;

Last_Name  Grade_Pt  Tile
_________  ________  ____
Bond       3.95      5
Thomas     4.00      5
Delaney    3.35      4
Wilson     3.80      4
Hanson     2.88      3
Phillips   3.00      3
McRoberts  1.90      2
Smith      2.00      2
Johnson    ?         1
Larkins    0.00      1
The Ntile function organizes rows into n number of groups. These groups are referred to as tiles. The tile number is returned. For example, the example above has 10 rows, so NTILE (5) splits the 10 rows into five equally sized tiles. There are 2 rows in each tile in the order of the OVER clause's ORDER BY.
Ntile Continued

SELECT Dept_No, EmployeeCount,
       NTILE(2) OVER (ORDER BY EmployeeCount) AS "Tile"
FROM (SELECT Dept_No, COUNT(*) AS EmployeeCount
      FROM Employee_Table
      GROUP BY Dept_No) AS Q
ORDER BY "Tile" DESC;

Dept_No  EmployeeCount  Tile
_______  _____________  ____
300      1              2
200      2              2
400      3              2
?        1              1
10       1              1
100      1              1

The Ntile function organizes rows into n number of groups. These groups are referred to as tiles. The tile number is returned. The example above has 6 rows, so NTILE (2) splits the 6 rows into 2 equally sized tiles. There are 3 rows in each tile in the order of the OVER clause's ORDER BY.
Ntile Percentile

SELECT Claim_ID, Claim_Date, ClaimCount,
       NTILE(100) OVER (ORDER BY ClaimCount) AS Percentile
FROM (SELECT Claim_ID, Claim_Date, COUNT(*) AS ClaimCount
      FROM Claims
      GROUP BY Claim_ID, Claim_Date) AS Q
ORDER BY Percentile DESC;

Claim_ID  Claim_Date  ClaimCount  Percentile
________  __________  __________  __________
1302111   2003-03-01  4           26
4307444   2003-07-05  3           25
3306333   2003-06-28  3           24
1304111   2003-04-28  2           23
2303222   2003-03-12  2           22
4305444   2003-05-12  2           21
4303555   2004-03-01  2           20
3402222   2004-02-28  2           19
3308333   2003-08-01  2           18

Not all rows are displayed

The Ntile function organizes rows into n number of groups. These groups are referred to as tiles. The tile number is returned. Above is a way to get the percentile.
Another Ntile Example

This example determines the percentile for every row in the Sales table based on the daily sales amount and sorts it into sequence by the value being categorized, which here is daily sales.

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(100) OVER (ORDER BY Daily_Sales) AS "Quantile"
FROM Sales_Table
WHERE Product_ID < 2000;

Product_ID  Sale_Date   Daily_Sales  Quantile
__________  __________  ___________  ________
1000        10/02/2000  32800.50     1
1000        09/30/2000  36000.07     2
1000        10/01/2000  40200.43     3
1000        09/28/2000  48850.40     4
1000        09/29/2000  54500.22     5
1000        10/04/2000  54553.10     6
1000        10/03/2000  64300.00     7

Above is another Ntile example.
Using Quartiles (Partitions of Four)

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(4) OVER (ORDER BY Daily_Sales, Sale_Date) AS "Quartiles"
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  Quartiles
__________  __________  ___________  _________
1000        10/02/2000  32800.50     1
2000        10/04/2000  32800.50     1
1000        09/30/2000  36000.07     1
2000        10/02/2000  36021.93     1
1000        10/01/2000  40200.43     2
2000        09/28/2000  41888.88     2
2000        10/03/2000  43200.18     2
2000        09/29/2000  48000.00     2
1000        09/28/2000  48850.40     3
2000        09/30/2000  49850.03     3
1000        09/29/2000  54500.22     3
1000        10/04/2000  54553.10     4
2000        10/01/2000  54850.29     4
1000        10/03/2000  64300.00     4

Instead of 100, the example above uses a quartile (NTILE based on 4 partitions).
NTILE

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(4) OVER (ORDER BY Daily_Sales) AS Bucket
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  Bucket
__________  __________  ___________  ______
1000        10/03/2000  64300.00     1
2000        10/01/2000  54850.29     1
1000        10/04/2000  54553.10     1
1000        09/29/2000  54500.22     1
2000        09/30/2000  49850.03     2
1000        09/28/2000  48850.40     2
2000        09/29/2000  48000.00     2
2000        10/03/2000  43200.18     2
2000        09/28/2000  41888.88     3
1000        10/01/2000  40200.43     3
2000        10/02/2000  36021.93     3
1000        09/30/2000  36000.07     4
1000        10/02/2000  32800.50     4
2000        10/04/2000  32800.50     4
The NTILE function divides the rows into buckets as evenly as possible. In this example, because PARTITION BY is omitted, the entire input will be sorted using the ORDER BY clause, and then divided into the number of buckets specified.
NTILE Using a Value of 10

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(10) OVER (ORDER BY Daily_Sales) AS Bucket
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  Bucket
__________  __________  ___________  ______
1000        10/03/2000  64300.00     1
2000        10/01/2000  54850.29     1
1000        10/04/2000  54553.10     2
1000        09/29/2000  54500.22     2
2000        09/30/2000  49850.03     3
1000        09/28/2000  48850.40     3
2000        09/29/2000  48000.00     4
2000        10/03/2000  43200.18     4
2000        09/28/2000  41888.88     5
1000        10/01/2000  40200.43     6
2000        10/02/2000  36021.93     7
1000        09/30/2000  36000.07     8
1000        10/02/2000  32800.50     9
2000        10/04/2000  32800.50     10
The NTILE function divides the rows into buckets as evenly as possible. In this example, because PARTITION BY is omitted, the entire input will be sorted using the ORDER BY clause, and then divided into the number of buckets specified. This example uses a value of 10 in the NTILE.
NTILE with a Partition

SELECT Product_ID, Sale_Date, Daily_Sales,
       NTILE(3) OVER (PARTITION BY Product_ID
                      ORDER BY Daily_Sales) AS Bucket
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  Bucket
__________  __________  ___________  ______
1000        10/02/2000  32800.50     1
1000        09/30/2000  36000.07     1
1000        10/01/2000  40200.43     1
1000        09/28/2000  48850.40     2
1000        09/29/2000  54500.22     2
1000        10/04/2000  54553.10     3
1000        10/03/2000  64300.00     3
2000        10/04/2000  32800.50     1
2000        10/02/2000  36021.93     1
2000        09/28/2000  41888.88     1
2000        10/03/2000  43200.18     2
2000        09/29/2000  48000.00     2
2000        09/30/2000  49850.03     3
2000        10/01/2000  54850.29     3
The NTILE function divides the rows into buckets as evenly as possible. In this example, because PARTITION BY is listed, the data will first be sorted by Product_ID and then sorted using the ORDER BY clause (within Product_ID), and then divided into the number of buckets specified. This example uses a value of 3 in the NTILE. Notice that the PARTITION BY statement causes the answer set to reset on Product_ID breaks.
Using FIRST_VALUE

SELECT Last_name, first_name, dept_no
      ,FIRST_VALUE(first_name) OVER (ORDER BY dept_no, last_name DESC
                                     ROWS UNBOUNDED PRECEDING) AS "First All"
      ,FIRST_VALUE(first_name) OVER (PARTITION BY dept_no
                                     ORDER BY dept_no, last_name DESC
                                     ROWS UNBOUNDED PRECEDING) AS "First Partition"
FROM Employee_Table;

LAST_NAME   FIRST_NAME  DEPT_NO  First All  First Partition
__________  __________  _______  _________  _______________
Jones       Squiggy     ?        Squiggy    Squiggy
Smythe      Richard     10       Squiggy    Richard
Chambers    Mandee      100      Squiggy    Mandee
Smith       John        200      Squiggy    John
Coffing     Billy       200      Squiggy    John
Larkins     Loraine     300      Squiggy    Loraine
Strickling  Cletus      400      Squiggy    Cletus
Reilly      William     400      Squiggy    Cletus
Harrison    Herbert     400      Squiggy    Cletus
The above example uses FIRST_VALUE to show you the very first first_name returned. It also uses the keyword Partition to show you the very first first_name returned in each department.
FIRST_VALUE

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - First_Value(Daily_Sales) OVER (ORDER BY Sale_Date) AS Delta_First
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  Delta_First
__________  __________  ___________  ___________
1000        09/28/2000  48850.40     0.00
2000        09/28/2000  41888.88     -6961.52
1000        09/29/2000  54500.22     5649.82
2000        09/29/2000  48000.00     -850.40
1000        09/30/2000  36000.07     -12850.33
2000        09/30/2000  49850.03     999.63
1000        10/01/2000  40200.43     -8649.97
2000        10/01/2000  54850.29     5999.89
1000        10/02/2000  32800.50     -16049.90
2000        10/02/2000  36021.93     -12828.47
1000        10/03/2000  64300.00     15449.60
2000        10/03/2000  43200.18     -5650.22
1000        10/04/2000  54553.10     5702.70
2000        10/04/2000  32800.50     -16049.90
Above, after sorting the data by Sale_Date, we compute the difference between the first row's Daily_Sales and the Daily_Sales of each following row. All rows Daily_Sales are compared with the first row's Daily_Sales, thus the name First_Value.
FIRST_VALUE after Sorting by the Highest Value

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - First_Value(Daily_Sales) OVER (ORDER BY Daily_Sales DESC) AS Delta_First
FROM Sales_Table
WHERE Product_ID IN (1000, 2000);

Product_ID  Sale_Date   Daily_Sales  Delta_First
__________  __________  ___________  ___________
1000        10/03/2000  64300.00     0.00
2000        10/01/2000  54850.29     -9449.71
1000        10/04/2000  54553.10     -9746.90
1000        09/29/2000  54500.22     -9799.78
2000        09/30/2000  49850.03     -14449.97
1000        09/28/2000  48850.40     -15449.60
2000        09/29/2000  48000.00     -16300.00
2000        10/03/2000  43200.18     -21099.82
2000        09/28/2000  41888.88     -22411.12
1000        10/01/2000  40200.43     -24099.57
2000        10/02/2000  36021.93     -28278.07
1000        09/30/2000  36000.07     -28299.93
1000        10/02/2000  32800.50     -31499.50
2000        10/04/2000  32800.50     -31499.50

Above, after sorting the data by Daily_Sales DESC, we compute the difference between the first row's Daily_Sales and the Daily_Sales of each following row. Every row's Daily_Sales is compared with the first row's Daily_Sales, thus the name First_Value. This example shows how much less each Daily_Sales is than 64,300.00 (our highest sale).
FIRST_VALUE with Partitioning

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - First_Value (Daily_Sales)
          OVER (PARTITION BY Product_ID ORDER BY Sale_Date) AS Delta_First
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Delta_First
1000        09/28/2000  48850.40     0.00
1000        09/29/2000  54500.22     5649.82
1000        09/30/2000  36000.07     -12850.33
1000        10/01/2000  40200.43     -8649.97
1000        10/02/2000  32800.50     -16049.90
1000        10/03/2000  64300.00     15449.60
1000        10/04/2000  54553.10     5702.70
2000        09/28/2000  41888.88     0.00
2000        09/29/2000  48000.00     6111.12
2000        09/30/2000  49850.03     7961.15
2000        10/01/2000  54850.29     12961.41
2000        10/02/2000  36021.93     -5866.95
2000        10/03/2000  43200.18     1311.30
2000        10/04/2000  32800.50     -9088.38

We are now comparing the Daily_Sales of the first Sale_Date for each Product_ID with the Daily_Sales of all other rows within the Product_ID partition. Each row is only compared with the first row (First_Value) in its partition.
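If you want to try the partitioned comparison without a Vertica system, here is a minimal sketch in Python that runs the same kind of query through the standard library's sqlite3 module. It assumes a SQLite build of 3.25 or later (the first release with window functions). The six sample rows, the ISO-style date strings and the variable names are our own stand-ins, and SQLite may differ from Vertica in other details.

```python
import sqlite3

# Hypothetical miniature of Sales_Table: three days for each of two products,
# with Daily_Sales values copied from the answer set above.
rows = [
    (1000, "2000-09-28", 48850.40),
    (1000, "2000-09-29", 54500.22),
    (1000, "2000-09-30", 36000.07),
    (2000, "2000-09-28", 41888.88),
    (2000, "2000-09-29", 48000.00),
    (2000, "2000-09-30", 49850.03),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL)"
)
conn.executemany("INSERT INTO Sales_Table VALUES (?, ?, ?)", rows)

# FIRST_VALUE restarts at the first Sale_Date of every Product_ID partition.
query = """
SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - FIRST_VALUE(Daily_Sales)
           OVER (PARTITION BY Product_ID ORDER BY Sale_Date) AS Delta_First
FROM Sales_Table
ORDER BY Product_ID, Sale_Date
"""

# Round in Python so floating-point noise does not clutter the comparison.
deltas = [round(row[3], 2) for row in conn.execute(query)]
print(deltas)
```

The first row of each product comes back as 0.0, and the other rows are measured only against the first row of their own product.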
Using LAST_VALUE

SELECT Last_name, first_name, dept_no
      ,LAST_VALUE(first_name) OVER (ORDER BY dept_no, last_name desc
                                    rows unbounded preceding) AS "Last All"
      ,LAST_VALUE(first_name) OVER (PARTITION BY dept_no
                                    ORDER BY dept_no, last_name desc
                                    rows unbounded preceding) AS "Last Partition"
FROM sql_class.Employee_Table;

LAST_NAME   FIRST_NAME  DEPT_NO  Last All  Last Partition
Jones       Squiggy     ?        Squiggy   Squiggy
Smythe      Richard     10       Richard   Richard
Chambers    Mandee      100      Mandee    Mandee
Smith       John        200      John      John
Coffing     Billy       200      Billy     Billy
Larkins     Loraine     300      Loraine   Loraine
Strickling  Cletus      400      Cletus    Cletus
Reilly      William     400      William   William
Harrison    Herbert     400      Herbert   Herbert

The FIRST_VALUE and LAST_VALUE are good to use anytime you need to propagate a value from one row to all or multiple rows based on a sorted sequence. However, the output from the LAST_VALUE function appears to be incorrect and is a little misleading until you understand a few concepts. The SQL request specifies "rows unbounded preceding", so the window for each row runs from the first row through the current row, and LAST_VALUE looks at the last row of that window. The current row is always the last row of its own window, and therefore its own value appears in the output.
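The effect of the window frame on LAST_VALUE can be sketched with Python's standard sqlite3 module. This is an illustration under stated assumptions, not Vertica output: it assumes SQLite 3.25 or later, uses only the three department 400 employees, spells the frame out in its long form (ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW), and adds a second column with a frame that reaches UNBOUNDED FOLLOWING, which the example above does not use. The column names last_so_far and last_overall are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee_Table (Last_Name TEXT, First_Name TEXT, Dept_No INTEGER)"
)
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?, ?)",
    [
        ("Strickling", "Cletus", 400),
        ("Reilly", "William", 400),
        ("Harrison", "Herbert", 400),
    ],
)

query = """
SELECT First_Name,
       -- Frame stops at the current row, so the "last" row is the row itself.
       LAST_VALUE(First_Name) OVER (
           ORDER BY Last_Name DESC
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS last_so_far,
       -- Frame covers every row, so the "last" row is the final row of the sort.
       LAST_VALUE(First_Name) OVER (
           ORDER BY Last_Name DESC
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_overall
FROM Employee_Table
ORDER BY Last_Name DESC
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

In this sketch last_so_far simply repeats each row's own first name, while last_overall returns Herbert on every row, because Harrison sorts last when Last_Name is in descending order.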
LAST_VALUE

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - LAST_Value (Daily_Sales)
          OVER (ORDER BY Sale_Date) AS Delta_Last
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Delta_Last
1000        09/28/2000  48850.40     0.00
2000        09/28/2000  41888.88     -6961.52
1000        09/29/2000  54500.22     0.00
2000        09/29/2000  48000.00     -6500.22
1000        09/30/2000  36000.07     0.00
2000        09/30/2000  49850.03     13849.96
1000        10/01/2000  40200.43     0.00
2000        10/01/2000  54850.29     14649.86
1000        10/02/2000  32800.50     0.00
2000        10/02/2000  36021.93     3221.43
1000        10/03/2000  64300.00     0.00
2000        10/03/2000  43200.18     -21099.82
1000        10/04/2000  54553.10     0.00
2000        10/04/2000  32800.50     -21752.60

Above, after sorting the data by Sale_Date, we subtract the last row's Daily_Sales from each row's Daily_Sales. Because no window frame is specified, the window ends with the current row and the rows that share its Sale_Date, so the last row is the last one for that same Sale_Date. Since there are only two product totals for each day, there is always a 0.00 for one of the rows.
Using LAG and LEAD

Compatibility: Vertica Extension

The LAG and LEAD functions allow you to compare different rows of a table by specifying an offset from the current row. You can use these functions to analyze change and variation.

Syntax for LAG and LEAD:

{LAG | LEAD} (expression [, offset [, default]])
OVER ([PARTITION BY expression [,...]]
      ORDER BY expression [ASC | DESC] [,...] ) ;
The above provides information and the syntax for LAG and LEAD.
Using LEAD

SELECT Last_Name, Dept_No
      ,LEAD(Dept_No) OVER (ORDER BY Dept_No, Last_Name) as "Lead All"
      ,LEAD(Dept_No) OVER (PARTITION BY Dept_No
                           ORDER BY Dept_No, Last_Name) as "Lead Partition"
FROM Employee_Table;

LAST_NAME   DEPT_NO  Lead All  Lead Partition
Jones       ?        10        ?
Smythe      10       100       ?
Chambers    100      200       ?
Coffing     200      200       200
Smith       200      300       ?
Larkins     300      400       ?
Harrison    400      400       400
Reilly      400      400       400
Strickling  400      ?         ?
As you can see, the first LEAD brings back the value from the next row except for the last which has no row following it. The offset value was not specified in this example, so it defaulted to a value of 1 row.
Using LEAD with an Offset of 2

SELECT Last_Name, Dept_No
      ,LEAD(Dept_No,2) OVER (ORDER BY Dept_No, Last_Name) as "Lead All"
      ,LEAD(Dept_No,2) OVER (PARTITION BY Dept_No
                             ORDER BY Dept_No, Last_Name) as "Lead Partition"
FROM Employee_Table;

LAST_NAME   DEPT_NO  Lead All  Lead Partition
Jones       ?        100       ?
Smythe      10       200       ?
Chambers    100      200       ?
Coffing     200      300       ?
Smith       200      400       ?
Larkins     300      400       ?
Harrison    400      400       400
Reilly      400      ?         ?
Strickling  400      ?         ?

Above, each value in the first LEAD comes from the row 2 rows ahead. The partitioned LEAD returns a value only when a partition holds at least one more row than the offset value. Department 400 has three rows, so only its first row finds a row two ahead.
LEAD

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - LEAD(Daily_Sales, 1, 0)
          OVER (ORDER BY Product_ID, Sale_Date) AS Lead1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Lead1
1000        09/28/2000  48850.40     -5649.82
1000        09/29/2000  54500.22     18500.15
1000        09/30/2000  36000.07     -4200.36
1000        10/01/2000  40200.43     7399.93
1000        10/02/2000  32800.50     -31499.50
1000        10/03/2000  64300.00     9746.90
1000        10/04/2000  54553.10     12664.22
2000        09/28/2000  41888.88     -6111.12
2000        09/29/2000  48000.00     -1850.03
2000        09/30/2000  49850.03     -5000.26
2000        10/01/2000  54850.29     18828.36
2000        10/02/2000  36021.93     -7178.25
2000        10/03/2000  43200.18     10399.68
2000        10/04/2000  32800.50     32800.50
Above, we compute the difference between a product's Daily_Sales and that of the next Daily_Sales in the sort order (which will be the next row's Daily_Sales, or one whose Daily_Sales is the same). The expression LEAD (Daily_Sales, 1, 0) tells LEAD () to evaluate the expression Daily_Sales on the row that is positioned one row following the current row. If there is no such row (as is the case on the last row of the partition or relation), then the default value of 0 is used.
LEAD With Partitioning

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - LEAD(Daily_Sales, 1, 0)
          OVER (PARTITION BY Product_ID ORDER BY Sale_Date) AS Lead1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Lead1
1000        09/28/2000  48850.40     -5649.82
1000        09/29/2000  54500.22     18500.15
1000        09/30/2000  36000.07     -4200.36
1000        10/01/2000  40200.43     7399.93
1000        10/02/2000  32800.50     -31499.50
1000        10/03/2000  64300.00     9746.90
1000        10/04/2000  54553.10     54553.10
2000        09/28/2000  41888.88     -6111.12
2000        09/29/2000  48000.00     -1850.03
2000        09/30/2000  49850.03     -5000.26
2000        10/01/2000  54850.29     18828.36
2000        10/02/2000  36021.93     -7178.25
2000        10/03/2000  43200.18     10399.68
2000        10/04/2000  32800.50     32800.50
Above, we compute the difference between a product's Daily_Sales and that of the next Daily_Sales in the sort order (which will be the next row's Daily_Sales, or one whose Daily_Sales is the same). We also partitioned the data by Product_ID.
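The difference between the partitioned and unpartitioned LEAD shows up at the boundary between two products. The sketch below illustrates that boundary with Python's standard sqlite3 module, assuming SQLite 3.25 or later. The three sample rows, the ISO-style dates and the column names next_all and next_in_product are hypothetical stand-ins chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL)"
)
conn.executemany(
    "INSERT INTO Sales_Table VALUES (?, ?, ?)",
    [
        (1000, "2000-10-03", 64300.00),
        (1000, "2000-10-04", 54553.10),  # last row of product 1000
        (2000, "2000-09-28", 41888.88),  # first row of product 2000
    ],
)

query = """
SELECT Daily_Sales,
       -- No partition: the next row may belong to a different product.
       LEAD(Daily_Sales, 1, 0)
           OVER (ORDER BY Product_ID, Sale_Date) AS next_all,
       -- Partitioned: the last row of each product falls back to the default 0.
       LEAD(Daily_Sales, 1, 0)
           OVER (PARTITION BY Product_ID ORDER BY Sale_Date) AS next_in_product
FROM Sales_Table
ORDER BY Product_ID, Sale_Date
"""
lead_rows = conn.execute(query).fetchall()
for row in lead_rows:
    print(row)
```

For the 10/04/2000 row of product 1000, the unpartitioned LEAD reaches into product 2000 and finds 41888.88, while the partitioned LEAD finds no following row and uses the default 0. That is why Lead1 for that row is 12664.22 in the first answer set and 54553.10 in the partitioned one.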
Using LAG

SELECT Last_Name, Dept_No
      ,LAG(Dept_No) OVER (ORDER BY Dept_No, Last_Name) as "Lag All"
      ,LAG(Dept_No) OVER (PARTITION BY Dept_No
                          ORDER BY Dept_No, Last_Name) as "Lag Partition"
FROM Employee_Table;

LAST_NAME   DEPT_NO  Lag All  Lag Partition
Jones       ?        ?        ?
Smythe      10       ?        ?
Chambers    100      10       ?
Coffing     200      100      ?
Smith       200      200      200
Larkins     300      200      ?
Harrison    400      300      ?
Reilly      400      400      400
Strickling  400      400      400

From the example above, you see that LAG uses the value from a previous row and makes it available in the next row. For LAG, the first row(s) will contain a null based on the value in the offset. Here it defaulted to 1. In "Lag All", the first null comes from the function because no row precedes Jones, whereas the second row gets its null from Jones's null Dept_No.
Using LAG with an Offset of 2

SELECT Last_Name, Dept_No
      ,LAG(Dept_No,2) OVER (ORDER BY Dept_No, Last_Name) as "Lag All"
      ,LAG(Dept_No,2) OVER (PARTITION BY Dept_No
                            ORDER BY Dept_No, Last_Name) as "Lag Partition"
FROM Employee_Table;

LAST_NAME   DEPT_NO  Lag All  Lag Partition
Jones       ?        ?        ?
Smythe      10       ?        ?
Chambers    100      ?        ?
Coffing     200      10       ?
Smith       200      100      ?
Larkins     300      200      ?
Harrison    400      200      ?
Reilly      400      300      ?
Strickling  400      400      400

For this example, the first two rows have a null because there is no row two rows before them. The number of leading nulls is always at least the offset value. There is a third null because Jones's Dept_No is null.
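The three nulls in "Lag All" can be reproduced with a small sketch using Python's standard sqlite3 module, assuming SQLite 3.25 or later. Only the first five employees are loaded, and the column name lag_all is a hypothetical stand-in. SQLite happens to sort NULL first in ascending order, which matches the answer set above; other systems may place NULL differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Last_Name TEXT, Dept_No INTEGER)")
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?)",
    [
        ("Jones", None),  # Dept_No is null
        ("Smythe", 10),
        ("Chambers", 100),
        ("Coffing", 200),
        ("Smith", 200),
    ],
)

query = """
SELECT Last_Name, Dept_No,
       LAG(Dept_No, 2) OVER (ORDER BY Dept_No, Last_Name) AS lag_all
FROM Employee_Table
ORDER BY Dept_No, Last_Name
"""
lag_rows = conn.execute(query).fetchall()
for row in lag_rows:
    print(row)
```

Jones and Smythe get a null because no row sits two rows before them. Chambers gets a null for a different reason: the row two before it is Jones, whose Dept_No is itself null.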
LAG

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - LAG(Daily_Sales, 1, 0)
          OVER (ORDER BY Product_ID, Sale_Date) AS Lag1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Lag1
1000        09/28/2000  48850.40     48850.40
1000        09/29/2000  54500.22     5649.82
1000        09/30/2000  36000.07     -18500.15
1000        10/01/2000  40200.43     4200.36
1000        10/02/2000  32800.50     -7399.93
1000        10/03/2000  64300.00     31499.50
1000        10/04/2000  54553.10     -9746.90
2000        09/28/2000  41888.88     -12664.22
2000        09/29/2000  48000.00     6111.12
2000        09/30/2000  49850.03     1850.03
2000        10/01/2000  54850.29     5000.26
2000        10/02/2000  36021.93     -18828.36
2000        10/03/2000  43200.18     7178.25
2000        10/04/2000  32800.50     -10399.68

Above, we compute the difference between a product's Daily_Sales and the Daily_Sales of the previous row in the sort order. The expression LAG (Daily_Sales, 1, 0) tells LAG to evaluate the expression Daily_Sales on the row that is positioned one row before the current row. If there is no such row (as is the case on the first row of the partition or relation), then the default value of 0 is used. Because this query has no partition, the first row of Product_ID 2000 is compared with the last row of Product_ID 1000, which gives -12664.22.
LAG with Partitioning

SELECT Product_ID, Sale_Date, Daily_Sales,
       Daily_Sales - LAG(Daily_Sales, 1, 0)
          OVER (PARTITION BY Product_ID ORDER BY Sale_Date) AS Lag1
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  Lag1
1000        09/28/2000  48850.40     48850.40
1000        09/29/2000  54500.22     5649.82
1000        09/30/2000  36000.07     -18500.15
1000        10/01/2000  40200.43     4200.36
1000        10/02/2000  32800.50     -7399.93
1000        10/03/2000  64300.00     31499.50
1000        10/04/2000  54553.10     -9746.90
2000        09/28/2000  41888.88     41888.88
2000        09/29/2000  48000.00     6111.12
2000        09/30/2000  49850.03     1850.03
2000        10/01/2000  54850.29     5000.26
2000        10/02/2000  36021.93     -18828.36
2000        10/03/2000  43200.18     7178.25
2000        10/04/2000  32800.50     -10399.68

Above, we compute the difference between a product's Daily_Sales and the Daily_Sales of the previous row in the sort order, this time within each Product_ID partition. The expression LAG (Daily_Sales, 1, 0) tells LAG to evaluate the expression Daily_Sales on the row that is positioned one row before the current row. If there is no such row (as is the case on the first row of each partition), then the default value of 0 is used, so the first row of each Product_ID shows its own Daily_Sales.
MEDIAN with Partitioning

SELECT Last_Name, Dept_No, Salary,
       MEDIAN(Salary) OVER (PARTITION BY Dept_No) AS MEDIAN
FROM Employee_Table as e
WHERE Dept_No in (200, 400)

Last_Name   Dept_No  Salary    MEDIAN
Coffing     200      41888.88  44944.44
Smith       200      48000.00  44944.44
Reilly      400      36000.00  54500
Harrison    400      54500.00  54500
Strickling  400      54500.00  54500
The Median is a numerical value of an expression in an answer set within a window that separates the higher half of a sample from the lower half. After sorting all values from lowest value to highest, it then picks the middle one. If there is an even number of values, then there is no single middle value, so the median is considered to be the mean (average) of the two middle values.
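The two cases described above (an even and an odd number of values) can be checked by hand with a short Python sketch. It uses the standard library's statistics.median, which follows the same rule, and the salaries from the answer set above. The dictionary names are our own.

```python
from statistics import median

# Salaries per department, copied from the answer set above.
salaries_by_dept = {
    200: [41888.88, 48000.00],            # even count: average the two middle values
    400: [36000.00, 54500.00, 54500.00],  # odd count: take the single middle value
}

medians = {
    dept: round(median(salaries), 2)
    for dept, salaries in salaries_by_dept.items()
}
print(medians)
```

Department 200 has two salaries, so its median is their mean. Department 400 has three, so its median is the middle salary once the values are sorted.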
CUME_DIST

SELECT Product_ID, Sale_Date, Daily_Sales,
       CUME_DIST() OVER (ORDER BY Daily_Sales DESC) AS CDist
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  CDist
1000        10/03/2000  64300.00     0.07
2000        10/01/2000  54850.29     0.14
1000        10/04/2000  54553.10     0.21
1000        09/29/2000  54500.22     0.29
2000        09/30/2000  49850.03     0.36
1000        09/28/2000  48850.40     0.43
2000        09/29/2000  48000.00     0.50
2000        10/03/2000  43200.18     0.57
2000        09/28/2000  41888.88     0.64
1000        10/01/2000  40200.43     0.71
2000        10/02/2000  36021.93     0.79
1000        09/30/2000  36000.07     0.86
1000        10/02/2000  32800.50     1.00
2000        10/04/2000  32800.50     1.00
The CUME_DIST is a cumulative distribution function that assigns a relative rank to each row, based on a formula. That formula is (number of rows preceding or peer with current row) / (total rows). We order by Daily_Sales DESC, so that each row is ranked by cumulative distribution. The distribution is represented relatively, by floating point numbers from 0 to 1. When there is only one row in a partition, it is assigned 1. When there is more than one row, each is assigned a cumulative distribution ranking, ranging from 0 to 1.
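The formula above can be worked through in a few lines of Python. This is a sketch of the arithmetic only, not of how Vertica evaluates the function; the helper name cume_dist_desc is hypothetical, and the values are the 14 Daily_Sales amounts from the answer set.

```python
def cume_dist_desc(values):
    """Cumulative distribution for a descending sort.

    For each value, count the rows that sort before it or tie with it
    (every row whose value is greater than or equal to it), then divide
    by the total number of rows.
    """
    total = len(values)
    return [sum(1 for other in values if other >= v) / total for v in values]


daily_sales = [
    64300.00, 54850.29, 54553.10, 54500.22, 49850.03, 48850.40, 48000.00,
    43200.18, 41888.88, 40200.43, 36021.93, 36000.07, 32800.50, 32800.50,
]

cdist = [round(x, 2) for x in cume_dist_desc(daily_sales)]
print(cdist)
```

The top sale gets 1/14, which rounds to 0.07. The two rows with 32800.50 are peers, so both count all 14 rows and both receive 1.00.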
CUME_DIST with a Partition

SELECT Product_ID, Sale_Date, Daily_Sales,
       CUME_DIST() OVER (PARTITION by Product_ID
                         ORDER BY Daily_Sales DESC) AS CDist
FROM Sales_Table
WHERE Product_ID IN (1000, 2000) ;

Product_ID  Sale_Date   Daily_Sales  CDist
1000        10/03/2000  64300.00     0.14
1000        10/04/2000  54553.10     0.29
1000        09/29/2000  54500.22     0.43
1000        09/28/2000  48850.40     0.57
1000        10/01/2000  40200.43     0.71
1000        09/30/2000  36000.07     0.86
1000        10/02/2000  32800.50     1.00
2000        10/01/2000  54850.29     0.14
2000        09/30/2000  49850.03     0.29
2000        09/29/2000  48000.00     0.43
2000        10/03/2000  43200.18     0.57
2000        09/28/2000  41888.88     0.71
2000        10/02/2000  36021.93     0.86
2000        10/04/2000  32800.50     1.00
The CUME_DIST is a cumulative distribution function that assigns a relative rank to each row, based on a formula. That formula is (number of rows preceding or peer with current row) / (total rows). We Partition by Product_ID and then ORDER BY Daily_Sales DESC, so that each row is ranked by cumulative distribution within its partition.
SUM (SUM (n))

SELECT Product_ID, SUM(Daily_Sales) as Summy,
       SUM(SUM(Daily_Sales)) OVER (ORDER BY Sum(Daily_Sales) )
          AS Prod_Sales_Running_Sum
FROM Sales_Table
GROUP BY Product_ID ;

Product_ID  Summy      Prod_Sales_Running_Sum
3000        224587.82  224587.82
2000        306611.81  531199.63
1000        331204.72  862404.35
Window functions can compute aggregates of aggregates, as in the example above.
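An aggregate of an aggregate is easier to picture as two steps: first the GROUP BY totals, then a running sum over those totals. The Python sketch below walks through both steps with the Product_ID 1000 and 2000 rows shown earlier in this chapter. Product_ID 3000 is left out because its daily rows are not listed here, so the running sum differs from the answer set above. The variable names are our own, and Decimal is used only to keep the cents exact.

```python
from collections import defaultdict
from decimal import Decimal
from itertools import accumulate

# (Product_ID, Daily_Sales) pairs taken from the earlier answer sets.
sales = [
    (1000, "48850.40"), (1000, "54500.22"), (1000, "36000.07"), (1000, "40200.43"),
    (1000, "32800.50"), (1000, "64300.00"), (1000, "54553.10"),
    (2000, "41888.88"), (2000, "48000.00"), (2000, "49850.03"), (2000, "54850.29"),
    (2000, "36021.93"), (2000, "43200.18"), (2000, "32800.50"),
]

# Step 1: the inner SUM(Daily_Sales) with GROUP BY Product_ID.
totals = defaultdict(Decimal)
for product_id, amount in sales:
    totals[product_id] += Decimal(amount)

# Step 2: the outer SUM(...) OVER (ORDER BY SUM(Daily_Sales)),
# a running sum over the per-product totals, smallest total first.
ordered = sorted(totals.items(), key=lambda item: item[1])
running = accumulate(total for _, total in ordered)
report = [(pid, total, run) for (pid, total), run in zip(ordered, running)]

for line in report:
    print(line)
```

The per-product totals match the Summy column above (306611.81 and 331204.72), and the running sum adds them in ascending order of the total.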
Chapter 11 – Temporary Tables
“I cannot imagine any condition which would cause this ship to founder. Modern shipbuilding has gone beyond that.” - E. J. Smith, Captain of the Titanic
There are three types of Temporary Tables

Derived Table
• Exists only within a query
• Materialized by a SELECT Statement inside a query
• Space comes from the User’s Spool space
• Deleted when the query ends

Local Temporary Table
• Created by the User and materialized with an INSERT/SELECT
• Table and Data are deleted only after a User Logs off the session
• Can be session specific or seen across different sessions

Global Temporary Table
• Table definition is created by a User and the table definition is permanent
• Materialized with an INSERT/SELECT
• When User logs off the session the data is deleted, but the table definition stays
• Many Users can populate the same Global table, but each has their own copy
• Global temporary tables are created in the public schema, with the data contents private to the transaction or session through which data is inserted.
The three types of Temporary tables are Derived, Local Temporary and Global Temporary Tables.
CREATING A Derived Table

• Exists only within a query
• Materialized by a SELECT Statement inside a query
• Space comes from the User’s Spool space
• Deleted when the query ends

SELECT * FROM
   (SELECT AVG(salary) FROM Employee_Table) AS TeraTom(AVGSAL) ;

A query within a query.

Answer Set

AVGSAL
________
46782.15
The SELECT Statement that creates and populates the Derived table is always inside Parentheses.
Naming the Derived Table

SELECT * FROM
   (SELECT AVG(salary) FROM Employee_Table) AS TeraTom(AVGSAL) ;

The name of the Derived Table is TeraTom

Answer Set

AVGSAL
________
46782.15

In the example above, TeraTom is the name we gave the Derived Table. Naming the derived table is mandatory; without a name the query errors.
Aliasing the Column Names in The Derived Table

SELECT * FROM
   (SELECT AVG(salary) FROM Employee_Table) AS TeraTom(AVGSAL) ;

AVGSAL is the Column Name in the derived table named TeraTom

Answer Set

AVGSAL
________
46782.15
AVGSAL is the name we gave to the column in our Derived Table that we call TeraTom. Our SELECT (which builds the columns) shows we are only going to have one column in our derived table, and we have named that column AVGSAL.
Multiple Ways to Alias the Columns in a Derived Table

(Nexus Chameleon screenshot. System: Vertica, Database: SQL Class)

SELECT * FROM
   (SELECT AVG(salary) as avgsal
    FROM Employee_Table) AS TeraTom

The derived table's name is TeraTom

Result 1

AVGSAL
________
46782.15

A derived table only lasts for the lifetime of the query and then it is deleted
You can alias the column name within the SQL query that materializes the derived table.
CREATING a Derived Table using the WITH Command

(Nexus Chameleon screenshot. System: Vertica, Database: SQL Class)

WITH TeraTom(AVGSAL) AS
   (SELECT AVG(salary)
    FROM Employee_Table)
SELECT *
FROM TeraTom ;

You create the derived table first using a WITH statement
You must then include the derived table in a final SELECT query

Result 1

AVGSAL
________
46782.15

When using the WITH Command, we can CREATE our Derived table before running the main query. The only issue here is that a query can have only 1 WITH, although that one WITH can define several derived tables separated by commas.
The Same Derived Query shown Three Different Ways

1  SELECT * FROM
      (SELECT AVG(salary) FROM Employee_Table) TeraTom (AVGSAL) ;

2  SELECT * FROM
      (SELECT AVG(salary) as AVGSAL FROM Employee_Table) TeraTom ;

3  WITH TeraTom(AVGSAL) AS
      (SELECT AVG(salary) FROM Employee_Table)
   SELECT * FROM TeraTom ;

Alias CAN be done after the derived table name (as in 1) or inside the derived SELECT (as in 2)
Most Derived Tables Are Used To Join To Other Tables

SELECT E.*, AVGSAL
FROM Employee_Table as E
INNER JOIN
   (SELECT Dept_No, AVG(salary)
    FROM Employee_Table
    GROUP BY Dept_No) AS TeraTom (Dept_No, AVGSAL)
ON E.Dept_No = TeraTom.Dept_No
ORDER BY E.Dept_No ;

The SELECT materializes the Derived Table
The derived table name is TeraTom
The columns are aliased

Employee_No  Dept_No  Last_Name   First_Name  Salary    AVGSAL
1000234      10       Smythe      Richard     64300.00  64300.00
1232578      100      Chambers    Mandee      48850.00  48850.00
1324657      200      Coffing     Billy       41888.88  44944.44
1333454      200      Smith       John        48000.00  44944.44
2312225      300      Larkins     Loraine     40200.00  40200.00
1121334      400      Strickling  Cletus      54500.00  48333.33
1256349      400      Harrison    Herbert     54500.00  48333.33
2341218      400      Reilly      William     36000.00  48333.33
The first five columns in the Answer Set came from the Employee_Table. AVGSAL came from the derived table named TeraTom.
The Three Components of a Derived Table

SELECT E.*, Salary - AVGSAL as PlusMinAvg
FROM Employee_Table as E
INNER JOIN
   (SELECT Dept_No, AVG(salary) as AVGSAL
    FROM Employee_Table
    GROUP BY Dept_No) AS TeraTom
ON E.Dept_No = TeraTom.Dept_No
ORDER BY E.Dept_No ;

TeraTom (the derived table lives in memory)

Dept_No  AVGSAL
?        32800.50
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33

1  A derived table will always have a SELECT query to materialize the derived table with data. The SELECT query always starts with an open parenthesis and ends with a close parenthesis.

2  The derived table must be given a name. Above we called our derived table TeraTom.

3  You will need to define (alias) the columns in the derived table. Above we allowed Dept_No to default to Dept_No, but we had to specifically alias AVG(Salary) as AVGSAL.
Every derived table must have the three components listed above.
Visualize This Derived Table

SELECT E.*, Salary - AVGSAL as PlusMinAvg
FROM Employee_Table as E
INNER JOIN
   (SELECT Dept_No, AVG(salary) as AVGSAL
    FROM Employee_Table
    GROUP BY Dept_No) AS TeraTom
ON E.Dept_No = TeraTom.Dept_No
ORDER BY E.Dept_No ;

TeraTom (the derived table is built first)

Dept_No  AVGSAL
?        32800.50
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33

Answer Set

Employee_No  Dept_No  Last_Name   First_Name  Salary    PlusMinAvg
1000234      10       Smythe      Richard     64300.00  0.00
1232578      100      Chambers    Mandee      48850.00  0.00
1324657      200      Coffing     Billy       41888.88  -3055.56
1333454      200      Smith       John        48000.00  3055.56
2312225      300      Larkins     Loraine     40200.00  0.00
1121334      400      Strickling  Cletus      54500.00  6166.67
1256349      400      Harrison    Herbert     54500.00  6166.67
2341218      400      Reilly      William     36000.00  -12333.33

Our example above shows the data in the derived table named TeraTom. This query allows us to see each employee and how far their salary falls above or below the average salary of their department.
Our Join Example with a Different Column Aliasing Style

SELECT E.*, AVGSAL
FROM Employee_Table as E
INNER JOIN
   (SELECT Dept_No as Dept_No, AVG(salary) as AVGSAL
    FROM Employee_Table
    GROUP BY Dept_No) AS TeraTom
ON E.Dept_No = TeraTom.Dept_No
ORDER BY E.Dept_No ;

Dept_No as Dept_No: I don't need to alias this because it can default to its current name
AVG(salary) as AVGSAL: I must alias this column because it is an aggregate

Employee_No  Dept_No  Last_Name   First_Name  Salary    AVGSAL
1000234      10       Smythe      Richard     64300.00  64300.00
1232578      100      Chambers    Mandee      48850.00  48850.00
1324657      200      Coffing     Billy       41888.88  44944.44
1333454      200      Smith       John        48000.00  44944.44
2312225      300      Larkins     Loraine     40200.00  40200.00
1121334      400      Strickling  Cletus      54500.00  48333.33
1256349      400      Harrison    Herbert     54500.00  48333.33
2341218      400      Reilly      William     36000.00  48333.33

Our example above aliases the column Dept_No, but it doesn’t need an alias. It will default to Dept_No, but the aggregate must be aliased.
Column Aliasing Can Default for Normal Columns

SELECT E.*, AVGSAL
FROM Employee_Table as E
INNER JOIN
   (SELECT Dept_No, AVG(salary) as AVGSAL
    FROM Employee_Table
    GROUP BY Dept_No) AS TeraTom
ON E.Dept_No = TeraTom.Dept_No
ORDER BY E.Dept_No ;

Dept_No: I don't need to alias this because it can default to its current name

TeraTom (the derived table is built first)

Dept_No  AVGSAL
?        32800.50
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33

In a derived table, you will always have a SELECT query in parentheses, and you will always name the table. You have options when aliasing the columns. As in the example above, you can let normal columns default to their current name.
A Derived example Using the WITH Syntax

(Nexus Chameleon screenshot. System: Vertica, Database: SQL Class)

WITH TeraTom AS
   (SELECT Dept_No, AVG(salary) as AVGSAL
    FROM Employee_Table
    GROUP BY Dept_No)
SELECT e.dept_no, e.last_name, e.first_name, e.salary, AVGSAL
FROM TeraTom
JOIN Employee_Table as E
ON E.Dept_No = TeraTom.Dept_No
WHERE Salary > AVGSAL

We have to alias the aggregate, but dept_no defaults to its own name

Result 1

dept_no  last_name   first_name  salary    avgsal
200      Smith       John        48000.00  44944.44
400      Strickling  Cletus      54500.00  48333.33
400      Harrison    Herbert     54500.00  48333.33
Most derived tables involve calculations, aggregations or ordered analytics. This allows tables and derived columns to mix well on the final report. Above, we are finding all employees who make a salary that is greater than the average salary within their own department. We created a derived table that holds all departments and the average salary within the department. We then join the derived table (named TeraTom) to the employee_table where we can check the salary vs. the avg (salary).
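The above-average query can be tried end to end with Python's standard sqlite3 module. This is a sketch under stated assumptions: the table holds only the department 200 and 400 employees from the answer sets above, First_Name is left out for brevity, the ORDER BY is added so the row order is predictable, and the rounding is done in Python. SQLite stands in for Vertica here and may differ in other details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee_Table (Dept_No INTEGER, Last_Name TEXT, Salary REAL)"
)
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?, ?)",
    [
        (200, "Coffing", 41888.88),
        (200, "Smith", 48000.00),
        (400, "Strickling", 54500.00),
        (400, "Harrison", 54500.00),
        (400, "Reilly", 36000.00),
    ],
)

# The WITH clause builds the derived table of per-department averages;
# the final SELECT joins it back to the employees and filters on it.
query = """
WITH TeraTom AS
  (SELECT Dept_No, AVG(Salary) AS AVGSAL
   FROM Employee_Table
   GROUP BY Dept_No)
SELECT E.Dept_No, E.Last_Name, E.Salary, AVGSAL
FROM TeraTom
JOIN Employee_Table AS E ON E.Dept_No = TeraTom.Dept_No
WHERE E.Salary > AVGSAL
ORDER BY E.Dept_No, E.Last_Name DESC
"""
above_avg = [
    (dept, name, salary, round(avg, 2))
    for dept, name, salary, avg in conn.execute(query)
]
for row in above_avg:
    print(row)
```

Only Smith, Strickling and Harrison earn more than their department's average, the same three employees as in the result above.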
Quiz - Answer the Questions

SELECT Dept_No, First_Name, Last_Name, AVGSAL
FROM Employee_Table
INNER JOIN
   (SELECT Dept_No, AVG(Salary)
    FROM Employee_Table
    GROUP BY Dept_No) as TeraTom (Depty, AVGSAL)
ON Dept_No = Depty ;

1) What is the name of the derived table? __________
2) How many columns are in the derived table? _______
3) What are the names of the derived table columns? ______
4) Is there more than one row in the derived table? _______
5) What common keys join the Employee and Derived tables? _______
6) Why were the join keys named differently? ______________

Answer the questions above and you will fully understand the components of a derived table.
Answer to Quiz - Answer the Questions

SELECT Dept_No, First_Name, Last_Name, AVGSAL
FROM Employee_Table
INNER JOIN
   (SELECT Dept_No, AVG(Salary)
    FROM Employee_Table
    GROUP BY Dept_No) as TeraTom (Depty, AVGSAL)
ON Dept_No = Depty ;

1) What is the name of the derived table? TeraTom
2) How many columns are in the derived table? 2
3) What are the names of the derived table columns? Depty and AVGSAL
4) Is there more than one row in the derived table? Yes
5) What keys join the tables? Dept_No and Depty
6) Why were the join keys named differently? If both were named Dept_No, we would error unless we fully qualified.
Great job!
Clever Tricks on Aliasing Columns in a Derived Table

1  SELECT Dept_No, First_Name, Last_Name, AVGSAL
   FROM Employee_Table
   INNER JOIN
      (SELECT Dept_No as Depty, AVG(Salary) as AVGSAL       <- Alias Here
       FROM Employee_Table
       GROUP BY Dept_No) as TeraTom
   ON Dept_No = Depty ;

2  SELECT E.Dept_No, First_Name, Last_Name, AVGSAL
   FROM Employee_Table as E
   INNER JOIN
      (SELECT Dept_No, AVG(Salary) as AVGSAL                <- Alias Here
       FROM Employee_Table
       GROUP BY Dept_No) as TeraTom
   ON E.Dept_No = TeraTom.Dept_No ;
Check out a few clever tricks to help you with derived tables.
A Derived Table lives only for the lifetime of a single query

Begin Transaction ;

1  First query

   WITH T (Dept_No, AVGSAL) AS
      (SELECT Dept_No, AVG(Salary)
       FROM Employee_Table
       GROUP BY Dept_No)
   SELECT T.Dept_No, First_Name, Last_Name, AVGSAL
   FROM Employee_Table as E
   INNER JOIN T
   ON E.Dept_No = T.Dept_No ;

2  Second query

   SELECT * FROM T ;

   Error: Query fails. T does not exist.

END Transaction;

We tried everything to see if the derived table would live past the current query. Notice above, we started with a BEGIN TRANSACTION statement. Then we ran our query that materialized our derived table named T. Then, we attempted to run another query (within the same transaction) that did a SELECT * FROM T, and the query failed.
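The same experiment can be sketched with Python's standard sqlite3 module. SQLite is only a stand-in for Vertica here, but it treats a WITH table the same way: the name exists for one statement only. The two sample rows and the variable names first and second_error are hypothetical; isolation_level=None is passed so that the BEGIN and COMMIT statements in the sketch control the transaction themselves.

```python
import sqlite3

# isolation_level=None lets us issue BEGIN and COMMIT ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Employee_Table (Dept_No INTEGER, Salary REAL)")
conn.executemany(
    "INSERT INTO Employee_Table VALUES (?, ?)",
    [(200, 41888.88), (200, 48000.00)],
)

conn.execute("BEGIN")

# First query: T exists while this statement runs.
first = conn.execute("""
    WITH T (Dept_No, AVGSAL) AS
      (SELECT Dept_No, AVG(Salary) FROM Employee_Table GROUP BY Dept_No)
    SELECT Dept_No FROM T
""").fetchall()

# Second query, same transaction: T is already gone.
try:
    conn.execute("SELECT * FROM T")
    second_error = None
except sqlite3.OperationalError as exc:
    second_error = str(exc)

conn.execute("COMMIT")

print(first)
print(second_error)
```

The first query returns a row for department 200, and the second query fails because no table named T exists once the first statement has finished.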
An Example of Two Derived Tables in a Single Query

WITH T (Dept_No, AVGSAL) AS
   (SELECT Dept_No, AVG(Salary)
    FROM Employee_Table
    GROUP BY Dept_No)
SELECT T.Dept_No, First_Name, Last_Name, AVGSAL, Counter
FROM Employee_Table as E
INNER JOIN T
ON E.Dept_No = T.Dept_No
INNER JOIN
   (SELECT Employee_No,
           SUM(1) OVER(PARTITION BY Dept_No
                       ORDER BY Dept_No, Last_Name
                       Rows Unbounded Preceding)
    FROM Employee_Table) as S (Employee_No, Counter)
ON E.Employee_No = S.Employee_No
ORDER BY T.Dept_No;
Above we have built two different derived tables. The first is named T and the second is named S. Notice that we materialized T using a WITH statement and we build S right after the INNER JOIN keywords.
Example of Two Derived Tables in a Single WITH Statement

(Nexus Chameleon screenshot. System: Vertica, Database: SQL Class)

WITH E AS
   (SELECT Dept_No, Last_Name, Salary
    FROM Employee_Table)
,D AS
   (SELECT Dept_No, Department_Name
    FROM Department_Table)
SELECT E.*, department_name
FROM E INNER JOIN D
ON E.Dept_No = D.Dept_No
WHERE E.Dept_No = 100

Separate multiple Derived tables in a WITH by using a comma

Result 1

e.dept_no  e.last_name  e.salary  department_name
100        Chambers     48850.00  Marketing
Above we have built two different derived tables within a single WITH statement. The first is named E and the second is named D. There is only one WITH statement, but the tables and definitions are separated with a comma.
Finding the First Occurrence of a Row using WITH

(Nexus Chameleon screenshot. System: Vertica, Database: SQL Class)

WITH Derived_Tbl AS
   (select Product_ID as Prod, Sale_Date, Daily_Sales,
           Row_Number() over (PARTITION BY product_id
                              ORDER BY Sale_Date ASC) AS Row_Num
    from sales_table)
Select * from Derived_Tbl
Where Row_Num = 1 ;

Result 1

Prod  Sale_Date   Daily_Sales  Row_Num
1000  09/28/2000  48850.40     1
2000  09/28/2000  41888.88     1
3000  09/28/2000  61301.77     1

By using the Row_Number ordered analytic, partitioning by Product_ID and sorting by Sale_Date ASC, we bring back only the first occurrence of a row for each product, based on the earliest Sale_Date. This works because we place the ordered analytic in a derived table and then select from that derived table using a WHERE clause.
Finding the Last Occurrence of a Row using WITH

WITH Derived_Tbl AS
(select Product_ID as Prod, Sale_Date, Daily_Sales,
 Row_Number() over (PARTITION BY product_id
                    ORDER BY Sale_Date Desc) AS Row_Num
 from sales_table)
Select * from Derived_Tbl
Where Row_Num = 1 ;

Prod  Sale_Date   Daily_Sales  Row_Num
1000  10/04/2000  54553.10     1
2000  10/04/2000  32800.50     1
3000  10/04/2000  15675.33     1
By using the Row_Number ordered analytic, partitioning by Product_ID, and sorting by Sale_Date DESC, we bring back only the last occurrence of each product: the row with the latest Sale_Date. This works because the query is placed in a derived table, and the outer query can then filter on Row_Num in its WHERE clause.
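The first-occurrence and last-occurrence queries differ only in the sort direction inside the OVER clause. The sketch below illustrates that with Python's standard sqlite3 module standing in for Vertica (an assumption; it needs an SQLite build with window functions, version 3.25 or later). The rows are a subset of the Sales_Table rows shown in this book, with dates rewritten in ISO form so that they sort correctly as text. The helper name occurrence is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales_Table (Product_ID INTEGER, Sale_Date TEXT, Daily_Sales REAL);
INSERT INTO Sales_Table VALUES
  (1000, '2000-09-28', 48850.40), (1000, '2000-10-01', 40200.43),
  (1000, '2000-10-04', 54553.10), (2000, '2000-09-28', 41888.88),
  (2000, '2000-10-02', 36021.93), (2000, '2000-10-04', 32800.50),
  (3000, '2000-09-28', 61301.77), (3000, '2000-10-03', 21553.79),
  (3000, '2000-10-04', 15675.33);
""")

def occurrence(direction):
    """Return one row per product: the earliest (ASC) or latest (DESC) sale."""
    if direction not in ("ASC", "DESC"):
        raise ValueError("direction must be ASC or DESC")
    sql = f"""
    WITH Derived_Tbl AS
    (SELECT Product_ID AS Prod, Sale_Date, Daily_Sales,
            ROW_NUMBER() OVER (PARTITION BY Product_ID
                               ORDER BY Sale_Date {direction}) AS Row_Num
     FROM Sales_Table)
    SELECT Prod, Sale_Date, Daily_Sales FROM Derived_Tbl
    WHERE Row_Num = 1
    ORDER BY Prod
    """
    return conn.execute(sql).fetchall()

first_rows = occurrence("ASC")   # earliest Sale_Date per product
last_rows = occurrence("DESC")   # latest Sale_Date per product
print(first_rows)
print(last_rows)
```

Filtering on Row_Num has to happen in the outer query, because the window function is evaluated after the inner query's WHERE clause.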
Syntax for Temporary Tables

CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } ] TABLE [schema-name].table-name
   { ( column-definition [ , ... ] ) | [ column-name-list ] }
   [ ON COMMIT { DELETE | PRESERVE } ROWS ]
   [ AS [ AT EPOCH LATEST ] | [ AT TIME 'timestamp' ] query ]
   [ [ ORDER BY table-column [ , ... ] ]
     [ ENCODED BY column-definition [ , ... ] ]
     [ hash-segmentation-clause | range-segmentation-clause
       | UNSEGMENTED { NODE node | ALL NODES } ]
     [ KSAFE [ k-num ] ]
     | [ NO PROJECTION ] ]
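The syntax above is for creating temporary tables. The definition of a global temporary table can be seen outside the session that created it and stays until the table is dropped, while its data lasts no longer than the session. Global is the default. A local temporary table can only be seen inside the session that created it, and it persists until the end of that session.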
Temporary Tables Explained

Global Temporary Tables - The definition of a global temporary table is permanent in the database catalogs until explicitly removed by using the DROP TABLE command. Global temporary tables are created in the public schema, and they are visible to all users and sessions. But, the contents (data) of a global table are private to the transaction or session in which the data was inserted. Data is automatically removed when the transaction commits, rolls back, or the session ends. This allows two users to use the same temporary table, but each only sees the data specific to his or her own transactions for the duration of those transactions or sessions.

Local Temporary Tables - A local temporary table is created in the V_TEMP_SCHEMA namespace and is inserted into the user's search path automatically. It can only be seen by the user who created the table, and it lasts for only the duration of the session in which it is created. When the session ends, the table definition is automatically dropped from the database catalogs. Local Temporary Tables can be dropped explicitly.

Above are the major differences between Global and Local Temporary tables.
Key Temporary Table Terms

Global - [Optional] The table definition is visible to all sessions. Temporary table data is visible only to the session that materializes (inserts) the data into the table. Temporary tables default to global.

Local - [Optional] The table definition is visible only to the session in which it is created. Because temporary tables default to global, LOCAL must be specified explicitly.

On Commit Preserve | Delete Rows - PRESERVE keeps the rows until the session ends and then truncates the table. DELETE truncates the rows after each COMMIT.

AT EPOCH LATEST | AT TIME - Used with AS query to query historical data. You can specify AT EPOCH LATEST to include data from the latest committed transaction, or AT TIME to specify an epoch based on a specific timestamp.
Above are the key terms you will want to know when creating a temporary table.
Creating and Populating a Local Temporary Table

1) Create the table:

CREATE LOCAL Temporary TABLE Dept_Agg_Local
( Dept_no     Integer
 ,AVG_Salary  Decimal(10,2)
) ON COMMIT PRESERVE ROWS ;

2) Populate the table:

INSERT INTO Dept_Agg_Local
SELECT Dept_no ,AVG(Salary)
FROM Employee_Table
GROUP BY Dept_no ;

3) Query the table:

SELECT * FROM Dept_Agg_Local
ORDER BY 1;

Local tables are materialized with an INSERT/SELECT statement.

Dept_No  AVG_Salary
?        32800.50
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33
1) A user creates a Local Temporary Table, 2) populates it with an INSERT/SELECT statement, and 3) queries it. The user can query this table all session long. When the session is logged off, both the table definition and its data are dropped automatically.
Using a Local Temporary Table

CREATE LOCAL Temporary TABLE Dept_Agg_Local2
( Dept_no     Integer
 ,AVG_Salary  Decimal(10,2)
) ON COMMIT PRESERVE ROWS ;

INSERT INTO Dept_Agg_Local2
SELECT Dept_no ,AVG(Salary)
FROM Employee_Table
GROUP BY Dept_no ;

SELECT E.*, AVG_Salary
FROM Employee_Table as E
INNER JOIN Dept_Agg_Local2
ON E.Dept_No = Dept_Agg_Local2.Dept_No
AND Salary > AVG_Salary

Employee_No  Dept_No  Last_Name   First_Name  Salary    AVG_Salary
1333454      200      Smith       John        48000.00  44944.44
1256349      400      Harrison    Herbert     54500.00  48333.33
1121334      400      Strickling  Cletus      54500.00  48333.33
We created the Local Temporary Table, materialized it, and then used it in a join. The above query finds all employees making a greater salary than the AVG(Salary) within their own Dept_No.
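The create, populate, and join steps can be tried end to end with the sketch below. It uses Python's standard sqlite3 module as a stand-in for Vertica, which is an assumption: SQLite has no LOCAL keyword and no ON COMMIT clause, so a plain TEMP table that lasts for the connection plays the role of the Local Temporary Table.

```python
import sqlite3

EMPLOYEES = [
    (2000000, None, "Jones", "Squiggy", 32800.50),
    (1000234, 10, "Smythe", "Richard", 64300.00),
    (1232578, 100, "Chambers", "Mandee", 48850.00),
    (1324657, 200, "Coffing", "Billy", 41888.88),
    (1333454, 200, "Smith", "John", 48000.00),
    (2312225, 300, "Larkins", "Loraine", 40200.00),
    (1121334, 400, "Strickling", "Cletus", 54500.00),
    (2341218, 400, "Reilly", "William", 36000.00),
    (1256349, 400, "Harrison", "Herbert", 54500.00),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER, "
             "Last_Name TEXT, First_Name TEXT, Salary REAL)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?, ?, ?, ?)", EMPLOYEES)

# Step 1: create the temporary table (SQLite syntax, not Vertica's).
conn.execute("CREATE TEMP TABLE Dept_Agg_Local2 (Dept_No INTEGER, AVG_Salary REAL)")

# Step 2: materialize it with an INSERT/SELECT.
conn.execute("INSERT INTO Dept_Agg_Local2 "
             "SELECT Dept_No, AVG(Salary) FROM Employee_Table GROUP BY Dept_No")

# Step 3: join back to find employees paid more than their department average.
rows = conn.execute("""
SELECT E.Last_Name, E.Dept_No, E.Salary, A.AVG_Salary
FROM Employee_Table AS E
INNER JOIN Dept_Agg_Local2 AS A
ON E.Dept_No = A.Dept_No AND E.Salary > A.AVG_Salary
ORDER BY E.Dept_No, E.Last_Name
""").fetchall()

for last_name, dept_no, salary, avg_salary in rows:
    print(last_name, dept_no, salary, round(avg_salary, 2))
```

The employee with a null Dept_No never matches the join condition, so that row cannot appear in the result.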
Creating and Populating a Global Temporary Table

1) Create the table:

CREATE Global Temporary TABLE Dept_Agg_Global
( Dept_no     Integer
 ,AVG_Salary  Decimal(10,2)
) ON COMMIT PRESERVE ROWS ;

2) Populate the table:

INSERT INTO Dept_Agg_Global
SELECT Dept_no ,AVG(Salary)
FROM Employee_Table
GROUP BY Dept_no ;

3) Query the table:

SELECT * FROM Dept_Agg_Global
ORDER BY 1;

Global tables are materialized with an INSERT/SELECT statement.

Dept_No  AVG_Salary
?        32800.50
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33
1) A user creates a Global Temporary Table once, and the table definition persists until it is dropped. Users can then 2) populate the Global Temporary Table with an INSERT/SELECT statement and 3) query it all session long. When the session is logged off, the table definition stays, but the data is automatically deleted (truncated). Many different users can populate the table, but each sees only the rows that he or she inserted.
Creating and Populating a Global Temporary Table

CREATE Global Temporary TABLE Dept_Agg_Global
( Dept_no     Integer
 ,AVG_Salary  Decimal(10,2)
) ON COMMIT PRESERVE ROWS ;

User 1

INSERT INTO Dept_Agg_Global
SELECT Dept_no ,AVG(Salary)
FROM Employee_Table
WHERE Dept_No in (100, 200)
GROUP BY Dept_no ;

User n

INSERT INTO Dept_Agg_Global
SELECT Dept_no ,AVG(Salary)
FROM Employee_Table
GROUP BY Dept_no
HAVING AVG(Salary) > 46000;

Both users above can only see the data they populated.
The two users above have materialized the same Global Temporary Table, but each sees only his or her own rows. Users share only the definition of a Global Temporary Table, never its data.
Some Great Examples of Creating a Temporary Table Quickly

This table is created from the Sales_Table:

CREATE TEMP TABLE Sales_Agg
ON COMMIT PRESERVE ROWS
AS SELECT Product_ID
         ,SUM(Daily_Sales)
FROM Sales_Table
Group by Product_ID;

This table is materialized from a join:

CREATE TEMP TABLE Emp_Dept
ON COMMIT PRESERVE ROWS
AS SELECT E.*, Department_Name, Budget
FROM Employee_Table as E
INNER JOIN Department_Table as D
ON E.Dept_No = D.Dept_No;
Above are two great examples to quickly CREATE a temporary Table from another table.
Creating a Temporary Table That is Sorted

This table is sorted by Order_Date and then by Customer_Number:

CREATE GLOBAL TEMP TABLE Temp_Orders
( Order_Number     INTEGER
 ,Customer_Number  INTEGER
 ,Order_Date       Date
 ,Order_Total      Decimal(8,2))
ON COMMIT PRESERVE ROWS
ORDER BY Order_Date, Customer_Number;

INSERT INTO Temp_Orders
SELECT * FROM Order_Table;

SELECT * FROM Temp_Orders;

A great reason to create a temporary table is to have it sorted.
A Temp Table That Populates Some of the Rows

Create a Temporary Table with orders from September:

CREATE Temp TABLE Order_Vol
ON COMMIT PRESERVE ROWS
AS (SELECT * FROM Order_Table
    WHERE Extract(Month from Order_Date) = 9);
Above is an example of creating a temporary table that is not an exact copy. It is only populating the table with orders from the month of September.
A Temporary Table with Some of the Columns

This creates a table with only three columns:

CREATE Temporary TABLE Order_Vol5
ON COMMIT PRESERVE ROWS
AS (SELECT Customer_Number
          ,Order_Date, Order_Total
    FROM Order_Table) ;
Above is an example of creating a Temporary table with three columns. The original table had four columns.
Chapter 12 – Sub-query Functions
“An invasion of Armies can be resisted, but not an idea whose time has come.” - Victor Hugo
An IN List is much like a Subquery

SELECT *
FROM Employee_Table
WHERE Dept_No IN (100, 200) ;

Employee_No  Dept_No  Last_Name  First_Name  Salary
1232578      100      Chambers   Mandee      48850.00
1324657      200      Coffing    Billy       41888.88
1333454      200      Smith      John        48000.00
This query is easy to understand. It uses an IN List to find all Employees who are in Dept_No 100 or Dept_No 200.
An IN List Never has Duplicates - Just like a Subquery

SELECT *
FROM Employee_Table
WHERE Dept_No IN (100, 100, 200, 200) ;

Duplicates in an IN List are silly.

Employee_No  Dept_No  Last_Name  First_Name  Salary
1232578      100      Chambers   Mandee      48850.00
1324657      200      Coffing    Billy       41888.88
1333454      200      Smith      John        48000.00

The answer still only produced three rows.
What is going on with this IN List? Why in the world are there duplicates in there? Will this query even work? What will the result set look like? The query works, and we got the same rows back as before. The system simply ignored the duplicate values in the IN List.
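A quick way to convince yourself that duplicates in an IN List change nothing is to run both forms and compare the answer sets. The sketch below does that with Python's standard sqlite3 module standing in for Vertica (an assumption made to keep the example self-contained).

```python
import sqlite3

EMPLOYEES = [
    (2000000, None, "Jones", "Squiggy", 32800.50),
    (1000234, 10, "Smythe", "Richard", 64300.00),
    (1232578, 100, "Chambers", "Mandee", 48850.00),
    (1324657, 200, "Coffing", "Billy", 41888.88),
    (1333454, 200, "Smith", "John", 48000.00),
    (2312225, 300, "Larkins", "Loraine", 40200.00),
    (1121334, 400, "Strickling", "Cletus", 54500.00),
    (2341218, 400, "Reilly", "William", 36000.00),
    (1256349, 400, "Harrison", "Herbert", 54500.00),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER, "
             "Last_Name TEXT, First_Name TEXT, Salary REAL)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?, ?, ?, ?)", EMPLOYEES)

# The same query, once with a clean IN List and once with duplicated values.
without_dupes = conn.execute(
    "SELECT Last_Name, Dept_No FROM Employee_Table "
    "WHERE Dept_No IN (100, 200) ORDER BY Employee_No").fetchall()
with_dupes = conn.execute(
    "SELECT Last_Name, Dept_No FROM Employee_Table "
    "WHERE Dept_No IN (100, 100, 200, 200) ORDER BY Employee_No").fetchall()

print(without_dupes)
print(with_dupes)
```

IN is a membership test on each row, so listing a value twice cannot make a row qualify twice.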
The Subquery

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

There is a Top Query and a Bottom Query!

SELECT *
FROM Employee_Table
WHERE Dept_No IN (
SELECT Dept_No
FROM Department_Table) ;

Which Query Runs First?
The query above is a Subquery which means there are multiple queries in the same SQL. The bottom query runs first, and its purpose in life is to build a distinct list of values that it passes to the top query. The top query then returns the result set. This query solves the problem: Show all Employees in Valid Departments!
The Three Steps of How a Basic Subquery Works

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

1) The Bottom Query runs first!

SELECT *
FROM Employee_Table
WHERE Dept_No IN (
SELECT Dept_No
FROM Department_Table) ;

2) The result is passed to the top query!

100  200  300  400  500

3) The top query runs using the bottom query answer set.

SELECT *
FROM Employee_Table
WHERE Dept_No IN (100, 200, 300, 400, 500) ;
The bottom query runs first and builds a distinct IN list. Then the top query runs using the list.
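The three steps can be acted out by hand. The sketch below runs the bottom query on its own, turns its distinct answer set into an IN List, and then runs the top query with that list. It uses Python's standard sqlite3 module as a stand-in for Vertica (an assumption), and it mirrors the mental model described here rather than how any particular optimizer actually executes a subquery.

```python
import sqlite3

EMPLOYEES = [
    (2000000, None, "Jones", "Squiggy", 32800.50),
    (1000234, 10, "Smythe", "Richard", 64300.00),
    (1232578, 100, "Chambers", "Mandee", 48850.00),
    (1324657, 200, "Coffing", "Billy", 41888.88),
    (1333454, 200, "Smith", "John", 48000.00),
    (2312225, 300, "Larkins", "Loraine", 40200.00),
    (1121334, 400, "Strickling", "Cletus", 54500.00),
    (2341218, 400, "Reilly", "William", 36000.00),
    (1256349, 400, "Harrison", "Herbert", 54500.00),
]
DEPARTMENTS = [(100, "Marketing"), (200, "Research and Dev"), (300, "Sales"),
               (400, "Customer Support"), (500, "Human Resources")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER, "
             "Last_Name TEXT, First_Name TEXT, Salary REAL)")
conn.execute("CREATE TABLE Department_Table (Dept_No INTEGER, Department_Name TEXT)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?, ?, ?, ?)", EMPLOYEES)
conn.executemany("INSERT INTO Department_Table VALUES (?, ?)", DEPARTMENTS)

# Step 1: the bottom query runs first and builds a distinct list.
bottom = [row[0] for row in conn.execute(
    "SELECT DISTINCT Dept_No FROM Department_Table ORDER BY Dept_No")]

# Step 2: the result is passed to the top query as an IN List.
placeholders = ",".join("?" * len(bottom))

# Step 3: the top query runs using that list.
top = conn.execute(
    f"SELECT Employee_No FROM Employee_Table "
    f"WHERE Dept_No IN ({placeholders}) ORDER BY Employee_No", bottom).fetchall()

# The single-statement subquery should return the same employees.
sub = conn.execute(
    "SELECT Employee_No FROM Employee_Table "
    "WHERE Dept_No IN (SELECT Dept_No FROM Department_Table) "
    "ORDER BY Employee_No").fetchall()

print(bottom)
print(top == sub, len(top))
```

The employees in Dept_No 10 and in the null department drop out because neither value is in the list built from Department_Table.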
These are Equivalent Queries

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

Query 1:

SELECT *
FROM Employee_Table
WHERE Dept_No IN (
SELECT Dept_No
FROM Department_Table) ;

Query 2:

SELECT *
FROM Employee_Table
WHERE Dept_No IN (100, 200, 300, 400, 500) ;
Both queries above are the same. Query 2 has values in an IN list. Query 1 runs a subquery to build the values in the IN list.
The Final Answer Set from the Subquery

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

Notice that no employees are in Dept_No 500.

SELECT *
FROM Employee_Table
WHERE Dept_No IN (
SELECT Dept_No
FROM Department_Table) ;

Employee_No  Dept_No  Last_Name   First_Name  Salary
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1256349      400      Harrison    Herbert     54500.00
2341218      400      Reilly      William     36000.00
1121334      400      Strickling  Cletus      54500.00

Remember that columns from the subquery are never returned in the final answer set.
Quiz - Answer the Difficult Question

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

How are Subqueries similar to Joins between two tables?
A great question was asked above. Do you know the key to answering? Turn the page!
Answer to Quiz - Answer the Difficult Question

Employee_Table (Dept_No is the Foreign Key)

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table (Dept_No is the Primary Key)

Dept_No  Department_Name
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

How are Subqueries similar to Joins between two tables?
A Subquery between two tables or a Join between two tables will each need a common key that represents the relationship. This is called a Primary Key/Foreign Key relationship.
A Subquery will use a common key linking the two tables together very similar to a join! When subquerying between two tables, look for the common link between the two tables. Most of the time they both have a column with the same name but not always.
Should you use a Subquery or a Join?

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Department_Table

Dept_No  Department_Name
100      Marketing
200      Research and Dev
300      Sales
400      Customer Support
500      Human Resources

When do I Subquery?

SELECT *
FROM Employee_Table
WHERE Dept_No IN (
SELECT Dept_No
FROM Department_Table) ;

When do I perform a Join?

SELECT E.*, Department_Name
FROM Employee_Table as E
Inner Join
Department_Table as D
ON E.Dept_No = D.Dept_No;
If the final result set needs columns from only one table, use a Subquery. If the report needs columns from both tables, you have to do a Join.
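The difference shows up in the column list of each answer set. The sketch below runs both forms and prints the column names each one returns. Python's standard sqlite3 module stands in for Vertica here, which is an assumption made so the example can run anywhere.

```python
import sqlite3

EMPLOYEES = [
    (2000000, None, "Jones", "Squiggy", 32800.50),
    (1000234, 10, "Smythe", "Richard", 64300.00),
    (1232578, 100, "Chambers", "Mandee", 48850.00),
    (1324657, 200, "Coffing", "Billy", 41888.88),
    (1333454, 200, "Smith", "John", 48000.00),
    (2312225, 300, "Larkins", "Loraine", 40200.00),
    (1121334, 400, "Strickling", "Cletus", 54500.00),
    (2341218, 400, "Reilly", "William", 36000.00),
    (1256349, 400, "Harrison", "Herbert", 54500.00),
]
DEPARTMENTS = [(100, "Marketing"), (200, "Research and Dev"), (300, "Sales"),
               (400, "Customer Support"), (500, "Human Resources")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER, "
             "Last_Name TEXT, First_Name TEXT, Salary REAL)")
conn.execute("CREATE TABLE Department_Table (Dept_No INTEGER, Department_Name TEXT)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?, ?, ?, ?)", EMPLOYEES)
conn.executemany("INSERT INTO Department_Table VALUES (?, ?)", DEPARTMENTS)

# Subquery: only Employee_Table columns can appear in the answer set.
cur = conn.execute(
    "SELECT * FROM Employee_Table "
    "WHERE Dept_No IN (SELECT Dept_No FROM Department_Table)")
sub_cols = [d[0] for d in cur.description]
sub_rows = cur.fetchall()

# Join: columns from both tables are available.
cur = conn.execute(
    "SELECT E.*, Department_Name FROM Employee_Table AS E "
    "INNER JOIN Department_Table AS D ON E.Dept_No = D.Dept_No")
join_cols = [d[0] for d in cur.description]
join_rows = cur.fetchall()

print(sub_cols)
print(join_cols)
```

Both forms return the same seven employees with this data. Only the join can show Department_Name next to them.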
Quiz - Write the Subquery

Customer_Table

Customer_Number  Customer_Name
11111111         Billy's Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

Write the Subquery

Select all columns in the Customer_Table if the customer has placed an order!
Here is your opportunity to show how smart you are. Write a Subquery that will bring back everything from the Customer_Table if the customer has placed an order in the Order_Table. Good luck! Advice: Look for the common key among both tables!
Answer to Quiz - Write the Subquery

SELECT *
FROM Customer_Table
WHERE Customer_Number IN
(SELECT Customer_Number FROM Order_Table) ;

Customer_Number  Customer_Name        Phone_Number
11111111         Billy's Best Choice  555-1234
31323134         ACE Consulting       555-1212
57896883         XYZ Plumbing         347-8954
87323456         Databases N-U        322-1012
The common key among both tables is Customer_Number. The bottom query runs first and delivers a distinct list of Customer_Number values which the top query uses in the IN List!
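The customer quiz can be checked with the sketch below, which loads the Customer_Table and Order_Table rows from the quiz and runs the answer. Python's standard sqlite3 module stands in for Vertica (an assumption), and only the columns shown on the quiz page are loaded.

```python
import sqlite3

CUSTOMERS = [
    (11111111, "Billy's Best Choice"),
    (31313131, "Acme Products"),
    (31323134, "ACE Consulting"),
    (57896883, "XYZ Plumbing"),
    (87323456, "Databases N-U"),
]
ORDERS = [
    (123456, 11111111, 12347.53),
    (123512, 11111111, 8005.91),
    (123552, 31323134, 5111.47),
    (123585, 87323456, 15231.62),
    (123777, 57896883, 23454.84),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT)")
conn.execute("CREATE TABLE Order_Table (Order_Number INTEGER, "
             "Customer_Number INTEGER, Order_Total REAL)")
conn.executemany("INSERT INTO Customer_Table VALUES (?, ?)", CUSTOMERS)
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", ORDERS)

# Customer_Number is the common key between the two tables.
customers_with_orders = conn.execute("""
SELECT Customer_Number, Customer_Name
FROM Customer_Table
WHERE Customer_Number IN (SELECT Customer_Number FROM Order_Table)
ORDER BY Customer_Number
""").fetchall()

for number, name in customers_with_orders:
    print(number, name)
```

Customer 11111111 has two orders but appears once in the answer set, and Acme Products, which has no orders, does not appear at all.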
Quiz - Write the More Difficult Subquery

Customer_Table

Customer_Number  Customer_Name
11111111         Billy's Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table

Order_Number  Customer_Number  Order_Total
123456        11111111         12347.53
123512        11111111         8005.91
123552        31323134         5111.47
123585        87323456         15231.62
123777        57896883         23454.84

Write the Subquery

Select all columns in the Customer_Table if the customer has placed an order over $10,000.00!
Here is your opportunity to show how smart you are. Write a Subquery that will bring back everything from the Customer_Table if the customer has placed an order in the Order_Table that is greater than $10,000.00.
Answer to Quiz - Write the More Difficult Subquery

SELECT *
FROM Customer_Table
WHERE Customer_Number IN (
SELECT Customer_Number
FROM Order_Table
WHERE Order_Total > 10000.00) ;

Customer_Number  Customer_Name        Phone_Number
11111111         Billy's Best Choice  555-1234
57896883         XYZ Plumbing         347-8954
87323456         Databases N-U        322-1012

Here is your answer!
Quiz - Write the Extreme Subquery

Course_Table

Course_ID  Course_Name               Credits  Seats
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        V2R3 SQL Features         2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

Student_Course_Table

Student_ID  Course_ID
280023      210
231222      210
125634      100
231222      220
125634      200
322133      220
125634      220
322133      300
324652      200
333450      500
260000      400
333450      400
234121      100
123250      100

Student_Table

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
423400      Larkins    Michael     FR          0.00
231222      Wilson     Susie       SO          3.80
280023      McRoberts  Richard     JR          1.90
322133      Bond       Jimmy       JR          3.95
125634      Hanson     Henry       FR          2.88
333450      Smith      Andy        SO          2.00
324652      Delaney    Danny       SR          3.35
260000      Johnson    Stanley     ?           ?
234121      Thomas     Wendy       FR          4.00
123250      Phillips   Martin      SR          3.00
Write SQL that will bring back an answer set that selects all columns from the Student_Table if that student is taking a course that has four (4) credits.
Use a subquery to get the answer set requested above. The answer is on the next page.
Answer to Quiz - Write the Extreme Subquery

SELECT S.*
FROM Student_Table as S
WHERE Student_ID IN
(SELECT Student_ID FROM Student_Course_Table
 WHERE Course_ID IN
 (SELECT Course_ID FROM Course_Table
  WHERE Credits=4))

Student_ID  Last_Name  First_Name  Class_Code  Grade_Pt
260000      Johnson    Stanley     ?           ?
322133      Bond       Jimmy       JR          3.95
333450      Smith      Andy        SO          2.00

Above is something to enjoy and learn from.
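The nested subquery walks through three tables, innermost query first. The sketch below reproduces it with Python's standard sqlite3 module standing in for Vertica (an assumption). To keep it short, only the columns the query touches are loaded from the quiz tables.

```python
import sqlite3

COURSES = [(100, 3), (200, 3), (210, 3), (220, 2), (300, 4), (400, 4)]
STUDENT_COURSES = [
    (280023, 210), (231222, 210), (125634, 100), (231222, 220),
    (125634, 200), (322133, 220), (125634, 220), (322133, 300),
    (324652, 200), (333450, 500), (260000, 400), (333450, 400),
    (234121, 100), (123250, 100),
]
STUDENTS = [
    (423400, "Larkins"), (231222, "Wilson"), (280023, "McRoberts"),
    (322133, "Bond"), (125634, "Hanson"), (333450, "Smith"),
    (324652, "Delaney"), (260000, "Johnson"), (234121, "Thomas"),
    (123250, "Phillips"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Course_Table (Course_ID INTEGER, Credits INTEGER)")
conn.execute("CREATE TABLE Student_Course_Table (Student_ID INTEGER, Course_ID INTEGER)")
conn.execute("CREATE TABLE Student_Table (Student_ID INTEGER, Last_Name TEXT)")
conn.executemany("INSERT INTO Course_Table VALUES (?, ?)", COURSES)
conn.executemany("INSERT INTO Student_Course_Table VALUES (?, ?)", STUDENT_COURSES)
conn.executemany("INSERT INTO Student_Table VALUES (?, ?)", STUDENTS)

# Innermost query: courses worth 4 credits.
# Middle query: students enrolled in one of those courses.
# Top query: the matching rows from Student_Table.
rows = conn.execute("""
SELECT S.*
FROM Student_Table AS S
WHERE Student_ID IN
  (SELECT Student_ID FROM Student_Course_Table
   WHERE Course_ID IN
     (SELECT Course_ID FROM Course_Table WHERE Credits = 4))
ORDER BY Student_ID
""").fetchall()

student_ids = [r[0] for r in rows]
print(rows)
```

Student_Course_Table acts as the bridge: it shares Course_ID with Course_Table and Student_ID with Student_Table.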
Quiz - Write the Subquery with an Aggregate

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Write the Subquery

Select all columns in the Employee_Table if the employee makes a greater Salary than the AVERAGE Salary.

Another opportunity knocking! Would someone please answer the query door?
Answer to Quiz - Write the Subquery with an Aggregate

SELECT *
FROM Employee_Table
WHERE Salary > (
SELECT AVG(Salary)
FROM Employee_Table) ;

Employee_No  Dept_No  Last_Name   First_Name  Salary
1000234      10       Smythe      Richard     64300.00
1121334      400      Strickling  Cletus      54500.00
1232578      100      Chambers    Mandee      48850.00
1333454      200      Smith       John        48000.00
1256349      400      Harrison    Herbert     54500.00

Notice that we are no longer using an IN clause, but instead a greater than sign.
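Because the bottom query returns a single value here, it can be compared with a greater than sign instead of IN. The sketch below runs the bottom query by itself to show that value, then runs the full statement. Python's standard sqlite3 module stands in for Vertica, which is an assumption of the example.

```python
import sqlite3

EMPLOYEES = [
    (2000000, None, "Jones", "Squiggy", 32800.50),
    (1000234, 10, "Smythe", "Richard", 64300.00),
    (1232578, 100, "Chambers", "Mandee", 48850.00),
    (1324657, 200, "Coffing", "Billy", 41888.88),
    (1333454, 200, "Smith", "John", 48000.00),
    (2312225, 300, "Larkins", "Loraine", 40200.00),
    (1121334, 400, "Strickling", "Cletus", 54500.00),
    (2341218, 400, "Reilly", "William", 36000.00),
    (1256349, 400, "Harrison", "Herbert", 54500.00),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER, "
             "Last_Name TEXT, First_Name TEXT, Salary REAL)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?, ?, ?, ?)", EMPLOYEES)

# The bottom query on its own: one row, one column.
overall_avg = conn.execute("SELECT AVG(Salary) FROM Employee_Table").fetchone()[0]

# The full statement compares each Salary with that single value.
names = [row[0] for row in conn.execute("""
SELECT Last_Name
FROM Employee_Table
WHERE Salary > (SELECT AVG(Salary) FROM Employee_Table)
ORDER BY Last_Name
""")]

print(round(overall_avg, 2))
print(names)
```

If the bottom query could return more than one row, a comparison operator such as the greater than sign would no longer be appropriate, and IN would be the tool to reach for.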
Quiz - Write the Correlated Subquery

Employee_Table

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

Write the Correlated Subquery

Select all columns in the Employee_Table if the employee makes a greater Salary than the AVERAGE Salary (within their own Department).

Another opportunity knocking! This is a tough one, and only the best get this written correctly.
Answer to Quiz - Write the Correlated Subquery

SELECT *
FROM Employee_Table as EE
WHERE Salary > (
SELECT AVG(Salary)
FROM Employee_Table as EEEE
WHERE EEEE.Dept_No = EE.Dept_No) ;

The bottom WHERE clause co-relates, or correlates, the top query to the bottom query.

Employee_No  Dept_No  Last_Name   First_Name  Salary
1121334      400      Strickling  Cletus      54500.00
1333454      200      Smith       John        48000.00
1256349      400      Harrison    Herbert     54500.00
A correlated subquery uses a column from the top query in the WHERE clause of the bottom query. This co-relates the top and bottom queries, thus the name correlated subquery. Since we wanted to see all salaries greater than the average salary within their own Dept_No, the correlating column is Dept_No. Both references to Employee_Table are aliased (EE and EEEE) so that the bottom WHERE clause can tell them apart.
The Basics of a Correlated Subquery

The Top Query is Co-Related (Correlated) with the Bottom Query.
The table name from the top query and the table name from the bottom query are given a different alias.
The bottom query WHERE clause co-relates Dept_No from Top and Bottom.
The top query is run first.
The bottom query is run one time for each distinct value delivered from the top query.

SELECT *
FROM Employee_Table as EE
WHERE Salary > (
SELECT AVG(Salary)
FROM Employee_Table as EEEE
WHERE EE.Dept_No = EEEE.Dept_No) ;
A correlated subquery breaks all the rules. It is the top query that runs first. Then, the bottom query is run one time for each distinct value of the correlating column in the bottom WHERE clause. In our example, that column is Dept_No. After the top query runs and brings back its rows, the bottom query will run one time for each distinct Dept_No. If this is confusing, it is not you. These take a little time to understand, but I have a plan to make you an expert. Keep reading!
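The run-order described above can be imitated by hand: fetch the top query's rows, run the bottom query once for each distinct Dept_No, then keep the rows that beat their department average. The sketch below does that and compares the outcome with the real correlated subquery. Python's standard sqlite3 module stands in for Vertica (an assumption), and the loop follows the mental model in the text rather than the plan a database optimizer would necessarily choose.

```python
import sqlite3

EMPLOYEES = [
    (2000000, None, "Jones", "Squiggy", 32800.50),
    (1000234, 10, "Smythe", "Richard", 64300.00),
    (1232578, 100, "Chambers", "Mandee", 48850.00),
    (1324657, 200, "Coffing", "Billy", 41888.88),
    (1333454, 200, "Smith", "John", 48000.00),
    (2312225, 300, "Larkins", "Loraine", 40200.00),
    (1121334, 400, "Strickling", "Cletus", 54500.00),
    (2341218, 400, "Reilly", "William", 36000.00),
    (1256349, 400, "Harrison", "Herbert", 54500.00),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER, "
             "Last_Name TEXT, First_Name TEXT, Salary REAL)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?, ?, ?, ?)", EMPLOYEES)

# The top query runs first.
top_rows = conn.execute(
    "SELECT Last_Name, Dept_No, Salary FROM Employee_Table").fetchall()

# The bottom query runs one time per distinct Dept_No (the null is skipped).
avg_by_dept = {}
for _, dept_no, _ in top_rows:
    if dept_no is not None and dept_no not in avg_by_dept:
        avg_by_dept[dept_no] = conn.execute(
            "SELECT AVG(Salary) FROM Employee_Table WHERE Dept_No = ?",
            (dept_no,)).fetchone()[0]

manual = sorted(name for name, dept_no, salary in top_rows
                if dept_no in avg_by_dept and salary > avg_by_dept[dept_no])

# The same question asked as a single correlated subquery.
correlated = sorted(row[0] for row in conn.execute("""
SELECT Last_Name
FROM Employee_Table AS EE
WHERE Salary > (SELECT AVG(Salary)
                FROM Employee_Table AS EEEE
                WHERE EE.Dept_No = EEEE.Dept_No)
"""))

print(sorted(avg_by_dept))
print(manual, correlated)
```

Nine rows come back from the top query, but the bottom query runs only five times, once for each of the distinct departments 10, 100, 200, 300, and 400.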
The Top Query always runs first in a Correlated Subquery

SELECT *
FROM Employee_Table as EE
WHERE Salary > (
SELECT AVG(Salary)
FROM Employee_Table as EEEE
WHERE EE.Dept_No = EEEE.Dept_No)

The Top Query runs first:

SELECT * FROM Employee_Table as EE

Employee_No  Dept_No  Last_Name   First_Name  Salary
2000000      ?        Jones       Squiggy     32800.50   (Null is skipped)
1000234      10       Smythe      Richard     64300.00
1232578      100      Chambers    Mandee      48850.00
1324657      200      Coffing     Billy       41888.88
1333454      200      Smith       John        48000.00
2312225      300      Larkins     Loraine     40200.00
1121334      400      Strickling  Cletus      54500.00
2341218      400      Reilly      William     36000.00
1256349      400      Harrison    Herbert     54500.00

The bottom Query runs 1 time for each distinct Dept_No (EE.Dept_No = EEEE.Dept_No):

Dept_No  AVGSAL
10       64300.00
100      48850.00
200      44944.44
300      40200.00
400      48333.33

Only these three employees make more than the AVG salary within their own department:

Employee_No  Dept_No  Last_Name   First_Name  Salary
1333454      200      Smith       John        48000.00
1256349      400      Harrison    Herbert     54500.00
1121334      400      Strickling  Cletus      54500.00
The top query runs first and then the bottom query is only run once per distinct Dept_No.
Correlated Subquery Example vs. a Join with a Derived Table

Correlated Subquery

SELECT Last_Name, Dept_No, Salary
FROM Employee_Table as EE
WHERE Salary > (
SELECT AVG(Salary)
FROM Employee_Table as EEEE
WHERE EE.Dept_No = EEEE.Dept_No) ;

Last_Name   Dept_No  Salary
Smith       200      48000.00
Harrison    400      54500.00
Strickling  400      54500.00

Join with a Derived Table

SELECT Last_Name, Dept_No, Salary, AVGSAL
FROM Employee_Table as E
INNER JOIN
(SELECT Dept_No, AVG(Salary)
 FROM Employee_Table
 GROUP BY Dept_No) as TeraTom (Depty, AVGSAL)
ON Dept_No = Depty
AND Salary > AVGSAL ;

Last_Name   Dept_No  Salary    AVGSAL
Smith       200      48000.00  44944.44
Harrison    400      54500.00  48333.33
Strickling  400      54500.00  48333.33
Both queries above will bring back all employees making a salary that is greater than the average salary in their department. The biggest difference is that the Join with the Derived Table also shows the Average Salary in the result set.
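The two approaches can be run side by side to confirm that they select the same employees. The sketch below uses Python's standard sqlite3 module as a stand-in for Vertica, which is an assumption. One adjustment follows from that: SQLite does not accept a column list after a derived table alias, so Depty and AVGSAL are named with AS inside the derived table instead.

```python
import sqlite3

EMPLOYEES = [
    (2000000, None, "Jones", "Squiggy", 32800.50),
    (1000234, 10, "Smythe", "Richard", 64300.00),
    (1232578, 100, "Chambers", "Mandee", 48850.00),
    (1324657, 200, "Coffing", "Billy", 41888.88),
    (1333454, 200, "Smith", "John", 48000.00),
    (2312225, 300, "Larkins", "Loraine", 40200.00),
    (1121334, 400, "Strickling", "Cletus", 54500.00),
    (2341218, 400, "Reilly", "William", 36000.00),
    (1256349, 400, "Harrison", "Herbert", 54500.00),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee_Table (Employee_No INTEGER, Dept_No INTEGER, "
             "Last_Name TEXT, First_Name TEXT, Salary REAL)")
conn.executemany("INSERT INTO Employee_Table VALUES (?, ?, ?, ?, ?)", EMPLOYEES)

# Correlated subquery: the average is not available to the SELECT list.
correlated = conn.execute("""
SELECT Last_Name, Dept_No, Salary
FROM Employee_Table AS EE
WHERE Salary > (SELECT AVG(Salary)
                FROM Employee_Table AS EEEE
                WHERE EE.Dept_No = EEEE.Dept_No)
ORDER BY Dept_No, Last_Name
""").fetchall()

# Join with a derived table: the average comes back as a column.
derived = conn.execute("""
SELECT Last_Name, E.Dept_No, Salary, AVGSAL
FROM Employee_Table AS E
INNER JOIN (SELECT Dept_No AS Depty, AVG(Salary) AS AVGSAL
            FROM Employee_Table
            GROUP BY Dept_No) AS TeraTom
ON E.Dept_No = Depty AND Salary > AVGSAL
ORDER BY E.Dept_No, Last_Name
""").fetchall()

print(correlated)
print([(name, dept, salary, round(avg, 2)) for name, dept, salary, avg in derived])
```

The first three columns match row for row. The derived-table version simply carries the department average along as a fourth column.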
Quiz - A Second Chance to Write a Correlated Subquery

Sales_Table (all rows are NOT displayed)

Product_ID  Sale_Date   Daily_Sales
1000        10/02/2000  32800.50
1000        09/30/2000  36000.07
1000        10/01/2000  40200.43
2000        10/04/2000  32800.50
2000        10/02/2000  36021.93
2000        09/28/2000  41888.88
3000        10/04/2000  15675.33
3000        10/02/2000  19678.94
3000        10/03/2000  21553.79

Write the Correlated Subquery

Select all columns in the Sales_Table if the Daily_Sales column is greater than the Average Daily_Sales within its own Product_ID.

Another opportunity knocking! This is your second chance. I will even give you a third chance.
Answer - A Second Chance to Write a Correlated Subquery

Select all columns in the Sales_Table if the Daily_Sales column is greater than the Average Daily_Sales within its own Product_ID.

SELECT * FROM Sales_Table as TopS
WHERE Daily_Sales > (
SELECT AVG(Daily_Sales)
FROM Sales_Table as BotS
WHERE TopS.Product_ID = BotS.Product_ID)
ORDER BY Product_ID, Sale_Date ;

Answer Set

Product_ID  Sale_Date   Daily_Sales
1000        09/28/2000  48850.40
1000        09/29/2000  54500.22
1000        10/03/2000  64300.00
1000        10/04/2000  54553.10
2000        09/29/2000  48000.00
2000        09/30/2000  49850.03
2000        10/01/2000  54850.29
3000        09/28/2000  61301.77
3000        09/29/2000  34509.13
3000        09/30/2000  43868.86

Notice that it is the Product_ID in the bottom WHERE clause.
Quiz - A Third Chance to Write a Correlated Subquery

Sales_Table (All Rows are NOT Displayed)
Product_ID  Sale_Date   Daily_Sales
__________  __________  ___________
1000        10/02/2000     32800.50
1000        09/30/2000     36000.07
1000        10/01/2000     40200.43
2000        10/04/2000     32800.50
2000        10/02/2000     36021.93
2000        09/28/2000     41888.88
3000        10/04/2000     15675.33
3000        10/02/2000     19678.94
3000        10/03/2000     21553.79

Write the Correlated Subquery
Select all columns in the Sales_Table if the Daily_Sales column is greater than the Average Daily_Sales within its own Sale_Date.

Another opportunity knocking! There is just one minor adjustment and you are home free.
Answer - A Third Chance to Write a Correlated Subquery

Select all columns in the Sales_Table if the Daily_Sales column is greater than the Average Daily_Sales within its own Sale_Date.

SELECT * FROM Sales_Table as TopS
WHERE Daily_Sales >
   (SELECT AVG(Daily_Sales)
    FROM Sales_Table as BotS
    WHERE TopS.Sale_Date = BotS.Sale_Date)
ORDER BY Sale_Date ;

Answer Set
Product_ID  Sale_Date   Daily_Sales
__________  __________  ___________
3000        09/28/2000     61301.77
2000        09/29/2000     48000.00
1000        09/29/2000     54500.22
3000        09/30/2000     43868.86
2000        09/30/2000     49850.03
2000        10/01/2000     54850.29
2000        10/02/2000     36021.93
1000        10/02/2000     32800.50
2000        10/03/2000     43200.18
1000        10/03/2000     64300.00
1000        10/04/2000     54553.10

Notice that it is the Sale_Date in the bottom WHERE clause. Plus, we threw in an ORDER BY that is outside of the subquery.
Quiz - Last Chance to Write a Correlated Subquery

Student_Table
Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

Write the Correlated Subquery
Select all columns in the Student_Table if the Grade_Pt column is greater than the Average Grade_Pt within its own Class_Code.

Another opportunity knocking! There is just one minor adjustment and you are home free.
Answer – Last Chance to Write a Correlated Subquery

Select all columns in the Student_Table if the Grade_Pt column is greater than the Average Grade_Pt within its own Class_Code.

SELECT * FROM Student_Table as TopS
WHERE Grade_Pt >
   (SELECT AVG(Grade_Pt)
    FROM Student_Table as BotS
    WHERE TopS.Class_Code = BotS.Class_Code)
ORDER BY Class_Code ;

Answer Set
Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
234121      Thomas      Wendy       FR          4.00
125634      Hanson      Henry       FR          2.88
322133      Bond        Jimmy       JR          3.95
231222      Wilson      Susie       SO          3.80
324652      Delaney     Danny       SR          3.35
Quiz – Write the Extreme Correlated Subquery

Course_Table
Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
100        Database Concepts         3        50
200        Introduction to SQL       3        20
210        Advanced SQL              3        22
220        V2R3 SQL Features         2        25
300        Physical Database Design  4        20
400        Database Administration   4        16

Student_Course_Table
Student_ID  Course_ID
__________  _________
280023      210
231222      210
125634      100
231222      220
125634      200
322133      220
125634      220
322133      300
324652      200
333450      500
260000      400
333450      400
234121      100
123250      100

Student_Table
Student_ID  Last_Name   First_Name  Class_Code  Grade_Pt
__________  __________  __________  __________  ________
423400      Larkins     Michael     FR          0.00
231222      Wilson      Susie       SO          3.80
280023      McRoberts   Richard     JR          1.90
322133      Bond        Jimmy       JR          3.95
125634      Hanson      Henry       FR          2.88
333450      Smith       Andy        SO          2.00
324652      Delaney     Danny       SR          3.35
260000      Johnson     Stanley     ?           ?
234121      Thomas      Wendy       FR          4.00
123250      Phillips    Martin      SR          3.00

Write a correlated subquery that returns all columns from the Course_Table if that course is being taken by a student who has a greater than average grade point within their own class code.

Use a subquery to get the answer set requested above. The answer is on the next page.
Answer To Quiz – Write the Extreme Correlated Subquery

SELECT * FROM Course_Table
WHERE Course_ID IN
   (SELECT Course_ID
    FROM Student_Course_Table
    WHERE Student_ID IN
       (SELECT Student_ID
        FROM Student_Table AS s1
        WHERE Grade_Pt >
           (SELECT AVG(Grade_Pt)
            FROM Student_Table AS s2
            WHERE s1.Class_Code = s2.Class_Code) ) );

Course_ID  Course_Name               Credits  Seats
_________  ________________________  _______  _____
200        Introduction to SQL       3        20
100        Vertica Concepts          3        50
220        V2R3 SQL Features         2        25
300        Physical Database Design  4        20
210        Advanced SQL              3        22

Above is something to enjoy and learn from.
Quiz - Write the NOT Subquery

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111            12347.53
123512        11111111             8005.91
123552        31323134             5111.47
123585        87323456            15231.62
123777        57896883            23454.84

Write the Subquery
Select all columns in the Customer_Table if the Customer has NOT placed an order.

Another opportunity knocking! Write the above query!
Answer to Quiz - Write the NOT Subquery

SELECT * FROM Customer_Table
WHERE Customer_Number NOT IN
   (SELECT Customer_Number
    FROM Order_Table
    WHERE Customer_Number IS NOT NULL) ;

Use this technique to get rid of Nulls.

Result
Customer_Number  Customer_Name  Phone_Number
_______________  _____________  ____________
31313131         Acme Products  555-1111

When the list built for a NOT IN contains a NULL value, the query returns nothing. The bottom query passes its Customer_Number values up to the top query, so if any Customer_Number in the Order_Table is NULL, the top query returns no rows. That is why we test for IS NOT NULL in the bottom WHERE clause.
Quiz - Write the Subquery using a WHERE Clause

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111            12347.53
123512        11111111             8005.91
123552        31323134             5111.47
123585        87323456            15231.62
123777        57896883            23454.84

Write the Subquery
Select all columns in the Order_Table that were placed by a customer with ‘Bill’ anywhere in their name.

Write the above query and then check out the results on the next page.
Answer - Write the Subquery using a WHERE Clause

SELECT * FROM Order_Table
WHERE Customer_Number IN
   (SELECT Customer_Number
    FROM Customer_Table
    WHERE Customer_Name ilike '%Bill%') ;

Result
Order_Number  Customer_Number  Order_Date  Order_Total
____________  _______________  __________  ___________
123456        11111111         05/04/1998     12347.53
123512        11111111         01/01/1999      8005.91

Great job on writing your query just like the above.
Quiz - Write the Subquery with Two Parameters

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111            12347.53
123512        11111111             8005.91
123552        31323134             5111.47
123585        87323456            15231.62
123777        57896883            23454.84

Write the Subquery
What is the highest dollar order for each Customer? This Subquery will involve two parameters!

Get ready to be amazed at either yourself or the Answer on the next page!
Answer to Quiz - Write the Subquery with Two Parameters

SELECT Customer_Number, Order_Number, Order_Total
FROM Order_Table
WHERE (Customer_Number, Order_Total) IN
   (SELECT Customer_Number, MAX(Order_Total)
    FROM Order_Table
    GROUP BY Customer_Number) ;

Result
Customer_Number  Order_Number  Order_Total
_______________  ____________  ___________
57896883         123777           23454.84
11111111         123456           12347.53
31323134         123552            5111.47
87323456         123585           15231.62

This is how you utilize multiple parameters in a Subquery! Turn the page for more.
How the Double Parameter Subquery Works

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111            12347.53
123512        11111111             8005.91
123552        31323134             5111.47
123585        87323456            15231.62
123777        57896883            23454.84

SELECT Customer_Number, Order_Number, Order_Total
FROM Order_Table
WHERE (Customer_Number, Order_Total) IN
   (SELECT Customer_Number, MAX(Order_Total)
    FROM Order_Table
    GROUP BY 1) ;

Customer_Number  Max(Order_Total)
_______________  ________________
11111111                 12347.53
31323134                  5111.47
87323456                 15231.62
57896883                 23454.84

These 4 rows are sent to the top query.

The bottom query runs first returning two columns. Next page for more info!
More on how the Double Parameter Subquery Works

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111            12347.53
123512        11111111             8005.91
123552        31323134             5111.47
123585        87323456            15231.62
123777        57896883            23454.84

The top query now uses the In-list:

SELECT Customer_Number, Order_Number, Order_Total
FROM Order_Table
WHERE (Customer_Number, Order_Total) IN
   ( 11111111 , 12347.53
     31323134 ,  5111.47
     87323456 , 15231.62
     57896883 , 23454.84 ) ;

The IN list is built and the top query can now process for the final Answer Set.
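The two-step picture above can be sketched in a few lines. This is only an illustration: it assumes Python's built-in sqlite3 module as a stand-in for Vertica, and a SQLite build of 3.15 or later, which is needed for the (column, column) IN (subquery) form. The rows are the five orders shown in the Order_Table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Order_Table (Order_Number INTEGER, Customer_Number INTEGER, Order_Total REAL)"
)
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", [
    (123456, 11111111, 12347.53), (123512, 11111111, 8005.91),
    (123552, 31323134, 5111.47), (123585, 87323456, 15231.62),
    (123777, 57896883, 23454.84),
])

# Step 1: the bottom query on its own returns one (customer, max total) pair per customer.
bottom = conn.execute("""
    SELECT Customer_Number, MAX(Order_Total)
    FROM Order_Table
    GROUP BY Customer_Number
    ORDER BY Customer_Number
""").fetchall()

# Step 2: the top query keeps only the orders whose pair appears in that list.
top = conn.execute("""
    SELECT Customer_Number, Order_Number, Order_Total
    FROM Order_Table
    WHERE (Customer_Number, Order_Total) IN
        (SELECT Customer_Number, MAX(Order_Total)
         FROM Order_Table
         GROUP BY Customer_Number)
    ORDER BY Customer_Number
""").fetchall()

print(bottom)
print(top)
```

Customer 11111111 has two orders, but only order 123456 matches both values of its pair, so order 123512 drops out.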
Quiz – Write the Triple Subquery

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111            12347.53
123512        11111111             8005.91
123552        31323134             5111.47
123585        87323456            15231.62
123777        57896883            23454.84

Write the Subquery
Which Customer_Name has the highest dollar order among all customers?

Good luck in writing this. Remember that this query will involve multiple Subqueries.
Answer to Quiz – Write the Triple Subquery

SELECT Customer_Name                    -- This runs third
FROM Customer_Table
WHERE Customer_Number IN
   (SELECT Customer_Number              -- Runs second
    FROM Order_Table
    WHERE Order_Total IN
       (SELECT Max(Order_Total)         -- Runs first
        FROM Order_Table)) ;

Result
Customer_Name
_____________
XYZ Plumbing

The answer is XYZ Plumbing.
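To see the inside-out order for yourself, here is a small sketch that runs the three levels one at a time and then as a single nested query. It assumes Python's built-in sqlite3 module as a stand-in for Vertica, loaded with the Customer_Table and Order_Table rows shown in the quiz. The step-by-step split is only a teaching aid; a real optimizer is free to pick its own plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT)")
conn.execute(
    "CREATE TABLE Order_Table (Order_Number INTEGER, Customer_Number INTEGER, Order_Total REAL)"
)
conn.executemany("INSERT INTO Customer_Table VALUES (?, ?)", [
    (11111111, "Billy's Best Choice"), (31313131, "Acme Products"),
    (31323134, "ACE Consulting"), (57896883, "XYZ Plumbing"),
    (87323456, "Databases N-U"),
])
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", [
    (123456, 11111111, 12347.53), (123512, 11111111, 8005.91),
    (123552, 31323134, 5111.47), (123585, 87323456, 15231.62),
    (123777, 57896883, 23454.84),
])

# Runs first: the highest order total.
max_total = conn.execute("SELECT MAX(Order_Total) FROM Order_Table").fetchone()[0]

# Runs second: the customer number(s) that placed an order with that total.
customer_numbers = [row[0] for row in conn.execute(
    "SELECT Customer_Number FROM Order_Table WHERE Order_Total = ?", (max_total,)
)]

# Runs third, written here as the full nested query from the answer.
names = conn.execute("""
    SELECT Customer_Name
    FROM Customer_Table
    WHERE Customer_Number IN
        (SELECT Customer_Number
         FROM Order_Table
         WHERE Order_Total IN
             (SELECT MAX(Order_Total) FROM Order_Table))
""").fetchall()

print(max_total, customer_numbers, names)
```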
Quiz – How many rows return on a NOT IN with a NULL?

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111            12347.53
123512        11111111             8005.91
123552        31323134             5111.47
123585        87323456            15231.62
123777        57896883            23454.84
000099        NULL                 9999.99

We added a Null Value to the Order_Table.

SELECT Customer_Name
FROM Customer_Table
WHERE Customer_Number NOT IN
   (SELECT Customer_Number FROM Order_Table) ;

How many rows return from the query now that a NULL value is in a Customer_Number?

We really didn’t place a new row inside the Order_Table with a NULL value for the Customer_Number column, but in theory, if we had, how many rows would return?
Answer – How many rows return on a NOT IN with a NULL?

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111            12347.53
123512        11111111             8005.91
123552        31323134             5111.47
123585        87323456            15231.62
123777        57896883            23454.84
000099        NULL                 9999.99

We added a Null Value to the Order_Table.

SELECT Customer_Name
FROM Customer_Table
WHERE Customer_Number NOT IN
   (SELECT Customer_Number FROM Order_Table) ;

How many rows return from the query now that a NULL value is in a Customer_Number? ZERO rows will return.

The answer is that no rows come back. When a NOT IN list contains a NULL, the system cannot tell whether a Customer_Number is different from that unknown value, so the NOT IN test is never true and nothing is returned.
How to handle a NOT IN with Potential NULL Values

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111            12347.53
123512        11111111             8005.91
123552        31323134             5111.47
123585        87323456            15231.62
123777        57896883            23454.84
000099        NULL                 9999.99

We added a Null Value to the Order_Table.

SELECT Customer_Name
FROM Customer_Table
WHERE Customer_Number NOT IN
   (SELECT Customer_Number
    FROM Order_Table
    WHERE Customer_Number IS NOT NULL) ;

How many rows return NOW from the query? 1 (Acme Products)

You can utilize a WHERE clause that tests to make sure Customer_Number IS NOT NULL. This should be used when a NOT IN could encounter a NULL.
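The before-and-after can be checked with a short sketch. It assumes Python's built-in sqlite3 module as a stand-in for Vertica; SQLite applies the same NULL rule to NOT IN. The tables hold the rows shown above, including the hypothetical order 000099 (stored here as the integer 99) with a NULL Customer_Number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT)")
conn.execute(
    "CREATE TABLE Order_Table (Order_Number INTEGER, Customer_Number INTEGER, Order_Total REAL)"
)
conn.executemany("INSERT INTO Customer_Table VALUES (?, ?)", [
    (11111111, "Billy's Best Choice"), (31313131, "Acme Products"),
    (31323134, "ACE Consulting"), (57896883, "XYZ Plumbing"),
    (87323456, "Databases N-U"),
])
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", [
    (123456, 11111111, 12347.53), (123512, 11111111, 8005.91),
    (123552, 31323134, 5111.47), (123585, 87323456, 15231.62),
    (123777, 57896883, 23454.84),
    (99, None, 9999.99),  # the added order with a NULL Customer_Number
])

# NOT IN against a list that contains a NULL: no row can pass the test.
unguarded = conn.execute("""
    SELECT Customer_Name FROM Customer_Table
    WHERE Customer_Number NOT IN
        (SELECT Customer_Number FROM Order_Table)
""").fetchall()

# Same query with the NULLs filtered out of the bottom query.
guarded = conn.execute("""
    SELECT Customer_Name FROM Customer_Table
    WHERE Customer_Number NOT IN
        (SELECT Customer_Number FROM Order_Table
         WHERE Customer_Number IS NOT NULL)
""").fetchall()

print(unguarded)
print(guarded)
```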
IN is equivalent to =ANY

SELECT * FROM Customer_Table
WHERE Customer_Number = ANY
   (SELECT Customer_Number FROM Order_Table) ;

= ANY is the same as IN.

Result
Customer_Number  Customer_Name        Phone_Number
_______________  ___________________  ____________
11111111         Billy's Best Choice  555-1234
31323134         ACE Consulting       555-1212
57896883         XYZ Plumbing         347-8954
87323456         Databases N-U        322-1012

Instead of using IN, you can use = ANY. The two forms work the SAME, so this query produces the same result set as the IN version.
Using a Correlated Exists

SELECT * FROM Customer_Table Top1
WHERE EXISTS
   (SELECT * FROM Order_Table Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number) ;

EXISTS is a Boolean that is either true or false.

Result
Customer_Number  Customer_Name        Phone_Number
_______________  ___________________  ____________
11111111         Billy's Best Choice  555-1234
31323134         ACE Consulting       555-1212
57896883         XYZ Plumbing         347-8954
87323456         Databases N-U        322-1012

The EXISTS command will determine via a Boolean if something is True or False. If a customer placed an order, it EXISTS, and using the Correlated Exists statement, only customers who have placed an order will return in the answer set.
How a Correlated Exists matches up

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products          (Does not Exist in Order_Table)
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111            12347.53
123512        11111111             8005.91
123552        31323134             5111.47
123585        87323456            15231.62
123777        57896883            23454.84

SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE EXISTS
   (SELECT * FROM Order_Table as Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number) ;

Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Only customers who placed an order return with the above Correlated EXISTS.
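A minimal sketch of the match-up, assuming Python's built-in sqlite3 module as a stand-in for Vertica and the Customer_Table and Order_Table rows shown above. The ORDER BY is added only so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT)")
conn.execute(
    "CREATE TABLE Order_Table (Order_Number INTEGER, Customer_Number INTEGER, Order_Total REAL)"
)
conn.executemany("INSERT INTO Customer_Table VALUES (?, ?)", [
    (11111111, "Billy's Best Choice"), (31313131, "Acme Products"),
    (31323134, "ACE Consulting"), (57896883, "XYZ Plumbing"),
    (87323456, "Databases N-U"),
])
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", [
    (123456, 11111111, 12347.53), (123512, 11111111, 8005.91),
    (123552, 31323134, 5111.47), (123585, 87323456, 15231.62),
    (123777, 57896883, 23454.84),
])

# For each customer row, EXISTS is true when at least one order row matches it.
with_orders = conn.execute("""
    SELECT Customer_Number, Customer_Name
    FROM Customer_Table AS Top1
    WHERE EXISTS
        (SELECT * FROM Order_Table AS Bot1
         WHERE Top1.Customer_Number = Bot1.Customer_Number)
    ORDER BY Customer_Number
""").fetchall()

print(with_orders)
```

Billy's Best Choice has two matching orders but appears once, because EXISTS only asks whether a match exists, not how many.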
The Correlated NOT Exists

SELECT * FROM Customer_Table Top1
WHERE NOT EXISTS
   (SELECT * FROM Order_Table Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number)

NOT EXISTS is a Boolean that is either true or false.

Result
Customer_Number  Customer_Name  Phone_Number
_______________  _____________  ____________
31313131         Acme Products  555-1111

The EXISTS command will determine via a Boolean if something is True or False. If a customer has not placed an order, no matching row EXISTS, and using the Correlated NOT EXISTS statement, only customers who have not placed an order will return in the answer set. Null values do not affect a NOT EXISTS statement like they do a NOT IN statement.
The Correlated NOT Exists Answer Set

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111            12347.53
123512        11111111             8005.91
123552        31323134             5111.47
123585        87323456            15231.62
123777        57896883            23454.84

Use NOT EXISTS to find which Customers have NOT placed an Order.

SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
   (SELECT * FROM Order_Table as Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number) ;

Customer_Number  Customer_Name
_______________  _____________
31313131         Acme Products

The only customer who did NOT place an order was Acme Products.
Quiz – How many rows come back from this NOT Exists?

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111            12347.53
123512        11111111             8005.91
123552        31323134             5111.47
123585        87323456            15231.62
123777        57896883            23454.84
000099        NULL                 9999.99

We added a Null Value to the Order_Table.

SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
   (SELECT * FROM Order_Table as Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number) ;

How many rows return from the query?

A NULL value in a list for queries with NOT IN returned nothing, but you must now decide if that is also true for the NOT EXISTS. How many rows will return?
Answer – How many rows come back from this NOT Exists?

Customer_Table
Customer_Number  Customer_Name
_______________  ___________________
11111111         Billy’s Best Choice
31313131         Acme Products
31323134         ACE Consulting
57896883         XYZ Plumbing
87323456         Databases N-U

Order_Table
Order_Number  Customer_Number  Order_Total
____________  _______________  ___________
123456        11111111            12347.53
123512        11111111             8005.91
123552        31323134             5111.47
123585        87323456            15231.62
123777        57896883            23454.84
000099        NULL                 9999.99

We added a Null Value to the Order_Table.

SELECT Customer_Number, Customer_Name
FROM Customer_Table as Top1
WHERE NOT EXISTS
   (SELECT * FROM Order_Table as Bot1
    WHERE Top1.Customer_Number = Bot1.Customer_Number) ;

How many rows return from the query? One row (Acme Products)

NOT EXISTS is unaffected by a NULL in the Order_Table. The order row with the NULL Customer_Number simply never matches any customer, so it cannot change the answer. That’s why NOT EXISTS is more flexible!
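Here is a quick sketch to confirm the answer. It assumes Python's built-in sqlite3 module as a stand-in for Vertica, with the tables above plus the hypothetical order 000099 (stored as the integer 99) carrying a NULL Customer_Number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer_Table (Customer_Number INTEGER, Customer_Name TEXT)")
conn.execute(
    "CREATE TABLE Order_Table (Order_Number INTEGER, Customer_Number INTEGER, Order_Total REAL)"
)
conn.executemany("INSERT INTO Customer_Table VALUES (?, ?)", [
    (11111111, "Billy's Best Choice"), (31313131, "Acme Products"),
    (31323134, "ACE Consulting"), (57896883, "XYZ Plumbing"),
    (87323456, "Databases N-U"),
])
conn.executemany("INSERT INTO Order_Table VALUES (?, ?, ?)", [
    (123456, 11111111, 12347.53), (123512, 11111111, 8005.91),
    (123552, 31323134, 5111.47), (123585, 87323456, 15231.62),
    (123777, 57896883, 23454.84),
    (99, None, 9999.99),  # the added order with a NULL Customer_Number
])

# The NULL order row never satisfies the equality, so it cannot hide Acme Products.
no_orders = conn.execute("""
    SELECT Customer_Number, Customer_Name
    FROM Customer_Table AS Top1
    WHERE NOT EXISTS
        (SELECT * FROM Order_Table AS Bot1
         WHERE Top1.Customer_Number = Bot1.Customer_Number)
""").fetchall()

print(no_orders)
```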
Chapter 13 – Strings
“It’s always been and always will be the same in the world: the horse does the work and the coachman is tipped.” - Anonymous
The LENGTH Command Counts Characters

SELECT First_Name
      ,LENGTH (First_Name) AS Lnth
FROM Employee_Table
WHERE LENGTH (First_Name) < 7
ORDER BY 1;

Result
first_name  Lnth
__________  ____
Billy       5
Cletus      6
John        4
Mandee      6

The LENGTH command counts the number of characters. If ‘Tom’ was in the Employee_Table, his length would be 3.
The LENGTH Command – Spaces can Count too

SELECT 'T o m' AS First_Name
      ,LENGTH('T o m') AS Lnth

There are spaces in between each letter.

Result
First_Name  Lnth
__________  ____
T o m       5

Spaces in between count.

If ‘T o m’ was in the Employee_Table, his length would be 5. Yes, spaces do count as characters.
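The two counts are easy to verify. This sketch assumes Python's built-in sqlite3 module as a stand-in for Vertica; its LENGTH function also counts embedded spaces as characters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 'Tom' has 3 characters; 'T o m' has 3 letters plus 2 spaces.
lengths = conn.execute("SELECT LENGTH('Tom'), LENGTH('T o m')").fetchone()

print(lengths)
```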
The LENGTH Command and Character Data

Last_Name is a CHAR (20).

SELECT Last_Name
      ,LENGTH(Last_Name) AS Lnth
FROM Employee_Table
ORDER BY 1;

Last_Name   Lnth
__________  ____
Chambers    8
Coffing     7
Harrison    8
Jones       5
Larkins     7
Reilly      6
Smith       5
Smythe      6
Strickling  10

Even though Last_Name is a CHAR (20), the LENGTH command in Vertica does not count the trailing pad spaces, so you get the length of the name itself.
LENGTH and CHARACTER_LENGTH Are Equivalent

Query 1
SELECT First_Name
      ,LENGTH(First_Name) AS C_Length
FROM Employee_Table ;

Query 2
SELECT First_Name
      ,CHARACTER_Length(First_Name) AS C_Length
FROM Employee_Table ;

These two queries will get you the SAME EXACT answer set in your report.
OCTET_LENGTH

Query 1
SELECT First_Name
      ,LENGTH(First_Name) AS C_Length
FROM Employee_Table ;

Query 2
SELECT First_Name
      ,CHARACTER_Length(First_Name) AS C_Length
FROM Employee_Table ;

Query 3
SELECT First_Name
      ,Octet_Length (First_Name) AS C_Length
FROM Employee_Table ;

You can also use the OCTET_LENGTH command. For the names in this table, these three queries get the same exact answer sets! OCTET_LENGTH counts bytes rather than characters, so the numbers match only when every character takes a single byte. Query 2 and 3 are ANSI Standard.
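The difference between characters and octets only shows up with multi-byte characters. Here is a small sketch of the idea, assuming Python's built-in sqlite3 module as a stand-in for Vertica. Older SQLite builds have no OCTET_LENGTH function, so the sketch counts bytes with LENGTH(CAST(... AS BLOB)) instead; the helper name char_and_octet_length and the name 'José' are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory databases default to UTF-8 text

def char_and_octet_length(text):
    """Return (character count, byte count) for a string, as computed by SQLite."""
    return conn.execute(
        "SELECT LENGTH(?), LENGTH(CAST(? AS BLOB))", (text, text)
    ).fetchone()

# Plain ASCII: one byte per character, so both counts agree.
ascii_name = char_and_octet_length("Billy")

# 'José' written with an escape so the accented letter is a single code point.
# That letter needs two bytes in UTF-8, so the byte count is one higher.
accented_name = char_and_octet_length("Jos\u00e9")

print(ascii_name, accented_name)
```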
UPPER and LOWER Commands

SELECT Last_Name AS "Name_Normal"
      ,UPPER (Last_Name) AS "Name_Upper"
      ,LOWER (Last_name) AS "Name_Lower"
FROM Employee_Table
WHERE Last_Name LIKE 'S%' ;

Result
Name_Normal  Name_Upper  Name_Lower
___________  __________  __________
Smythe       SMYTHE      smythe
Strickling   STRICKLING  strickling
Smith        SMITH       smith

UPPER converts text to uppercase and LOWER converts text to lowercase.
Using the LOWER Command

SELECT LOWER('AbCdE') as "Go Low"
FROM Order_Table
Limit 1 ;

Result
Go Low
______
abcde

The LOWER function converts all letters in a specified string to lowercase letters. If there are characters in the string that are not letters, they are not affected by the LOWER command.
A LOWER Command Example

SELECT 'They match' as "Do They Match?"
FROM Order_Table
WHERE LOWER('ABCDE') = 'abcde'
Limit 1 ;

Result
Do They Match?
______________
They match

The LOWER function converts all letters in a specified string to lowercase letters. If there are characters in the string that are not letters, they are not affected by the LOWER command. Above, we compare LOWER('ABCDE') = 'abcde' and they are now equivalent because we have lowercased the 'ABCDE'.
Using the UPPER Command

SELECT UPPER('AbCdE') as "Go upper"
FROM Order_Table
Limit 1 ;

Result
Go upper
________
ABCDE

The UPPER function converts all letters in a specified string to uppercase letters. If there are characters in the string that are not letters, they are not affected by the UPPER command.
An UPPER Command Example

SELECT 'They match' as "Do They Match?"
FROM Order_Table
WHERE 'ABCDE' = UPPER('abcde')
LIMIT 1 ;

Result
Do They Match?
______________
They match

The UPPER function converts all letters in a specified string to uppercase letters. If there are characters in the string that are not letters, they are not affected by the UPPER command. Above, we compare 'ABCDE' = UPPER('abcde') and they are now equivalent because we have uppercased the 'abcde'.
Non-Letters are Unaffected by UPPER and LOWER

SELECT LOWER('ABCDE1') as "Number Stays"
      ,UPPER('abCdE2') as "Numbers Hold"
FROM Order_Table
LIMIT 1 ;

Result
Number Stays  Numbers Hold
____________  ____________
abcde1        ABCDE2

The UPPER and LOWER functions convert all letters in a specified string to either upper or lower case letters. If there are characters in the string that are not letters, they are not affected by the UPPER or LOWER commands. Notice in our example that the numbers 1 and 2 were unaffected by the LOWER and UPPER commands.
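The examples above can be run in one go. This sketch assumes Python's built-in sqlite3 module as a stand-in for Vertica; note that SQLite's UPPER and LOWER only convert ASCII letters, which is all these examples need. SQLite reports a true comparison as the integer 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Letters change case, the digits 1 and 2 stay put,
# and the lowercased 'ABCDE' compares equal to 'abcde'.
row = conn.execute("""
    SELECT LOWER('ABCDE1'),
           UPPER('abCdE2'),
           LOWER('ABCDE') = 'abcde'
""").fetchone()

print(row)
```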
The TRIM Command trims both Leading and Trailing Spaces

Query 1
SELECT Last_Name
      ,Trim(Last_Name) AS No_Spaces
FROM Employee_Table ;

Query 2
SELECT Last_Name
      ,Trim(Both from Last_Name) AS No_Spaces
FROM Employee_Table ;

Both queries above do the exact same thing. They remove the leading and trailing spaces from the column Last_Name.
Trim Combined with the LENGTH Command

SELECT '  Rodriquez  ' as "Name"
      ,LENGTH (Trim ('  Rodriquez  ')) AS No_Spaces ;

The literal '  Rodriquez  ' has 2 front spaces and 2 back spaces.

Name           No_Spaces
_____________  _________
  Rodriquez    9

This will allow for the character count to only be 9 because both the leading and trailing spaces have been cut.
How to TRIM only the Trailing Spaces

SELECT '  Rodriquez  '
      ,LENGTH (Trim (Trailing FROM '  Rodriquez  ')) AS Front_Spaces ;

The literal '  Rodriquez  ' has 2 front spaces and 2 back spaces.

'  Rodriquez  '  Front_Spaces
_______________  ____________
  Rodriquez      11

The TRAILING FROM option allows you to TRIM only the spaces behind the string. We get a character count of 11 because we are only cutting off the 2 trailing spaces and not the 2 beginning spaces.
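Here is a quick check of the arithmetic, assuming Python's built-in sqlite3 module as a stand-in for Vertica. SQLite does not accept the Trim (Trailing FROM ...) wording, so the sketch uses its RTRIM function for the trailing-only case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

name = "  Rodriquez  "  # 2 front spaces + 9 letters + 2 back spaces

# Untrimmed, trimmed on both sides, and trimmed on the trailing side only.
counts = conn.execute(
    "SELECT LENGTH(?), LENGTH(TRIM(?)), LENGTH(RTRIM(?))",
    (name, name, name),
).fetchone()

print(counts)
```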
A Visual of the TRIM Command Using Concatenation

Concatenation without Trim and with Trim (|| is the concatenate operator)

SELECT Last_Name
      ,First_Name
      ,Last_Name || First_Name as NameBackwards
      ,TRIM(Last_Name) || First_Name as TrimNameBackwards
FROM Employee_Table

Last_Name   First_Name  NameBackwards                TrimNameBackwards
__________  __________  ___________________________  _________________
Jones       Squiggy     Jones               Squiggy  JonesSquiggy
Smith       John        Smith               John     SmithJohn
Smythe      Richard     Smythe              Richard  SmytheRichard
Harrison    Herbert     Harrison            Herbert  HarrisonHerbert
Chambers    Mandee      Chambers            Mandee   ChambersMandee
Strickling  Cletus      Strickling          Cletus   StricklingCletus
Reilly      William     Reilly              William  ReillyWilliam
Coffing     Billy       Coffing             Billy    CoffingBilly
Larkins     Loraine     Larkins             Loraine  LarkinsLoraine

When you use the TRIM command on a column, that column will have all beginning and ending spaces removed.
Trim and Trailing is Case Sensitive

First_Name is a VARCHAR.

SELECT First_Name,
       Trim(trailing 'Y' from First_Name) AS No_Y,       -- Capital 'Y'
       Trim(trailing 'y' from First_Name) AS Success     -- Lower Case 'y'
FROM Employee_Table
ORDER BY 1;

First_Name  No_Y     Success
__________  _______  _______
Billy       Billy    Bill
Cletus      Cletus   Cletus
Herbert     Herbert  Herbert
John        John     John
Loraine     Loraine  Loraine
Mandee      Mandee   Mandee
Richard     Richard  Richard
Squiggy     Squiggy  Squigg
William     William  William

For LEADING and TRAILING TRIM commands, case sensitivity is important.
How to TRIM Trailing Letters

SELECT First_Name                                            -- First_Name is a VARCHAR
      ,TRIM(TRAILING 'y' FROM First_Name) AS No_Y
      ,Last_Name                                             -- Last_Name is a CHAR(20)
      ,TRIM(TRAILING 'g' FROM (TRIM(Last_Name))) AS No_G
FROM Employee_Table ;

First_Name   No_Y       Last_Name    No_G
__________   ________   __________   __________
Squiggy      Squigg     Jones        Jones
John         John       Smith        Smith
Richard      Richard    Smythe       Smythe
Herbert      Herbert    Harrison     Harrison
Mandee       Mandee     Chambers     Chambers
Cletus       Cletus     Strickling   Stricklin
William      William    Reilly       Reilly
Billy        Bill       Coffing      Coffin
Loraine      Loraine    Larkins      Larkins

The above example removed the trailing 'y' from the First_Name and the trailing 'g' from the Last_Name. Because Last_Name is a CHAR(20) padded with spaces, the inner TRIM removes the padding first so that the 'g' really is the last character. Remember that this is case sensitive.
The SUBSTRING Command

SELECT First_Name
      ,SUBSTRING(First_Name FROM 2 FOR 3) AS Quiz   -- Start in position 2, go for 3 positions
FROM Employee_Table ;

First_Name   Quiz
__________   ______
Squiggy      qui
John         ohn
Richard      ich
Herbert      erb
Mandee       and
Cletus       let
William      ill
Billy        ill
Loraine      ora

This is a SUBSTRING. The substring is passed two parameters, and they are the starting position of the string and the number of positions to return (from the starting position). The above example will start in position 2 and go for 3 positions!
SUBSTRING and SUBSTR are equal, but use different syntax

Query 1 with Substring

SELECT First_Name
      ,SUBSTRING(First_Name FROM 2 FOR 3) AS Quiz
FROM Employee_Table ;

Query 2 with Substr

SELECT First_Name
      ,SUBSTR(First_Name, 2, 3) AS Quiz2
FROM Employee_Table ;

Both queries above are going to yield the same results! SUBSTR is just a different way of doing a substring. Both have two parameters, which are starting position and number of characters to return.
How SUBSTRING Works with NO ENDING POSITION

SELECT First_Name
      ,SUBSTRING(First_Name FROM 2) AS GoToEnd   -- Start in position 2
FROM Employee_Table ;

First_Name   GoToEnd
__________   _________
Squiggy      quiggy
John         ohn
Richard      ichard
Herbert      erbert
Mandee       andee
Cletus       letus
William      illiam
Billy        illy
Loraine      oraine

If you don't give the Substring a FOR length, it will go all the way to the end.
Using SUBSTRING to move backwards

SELECT First_Name
      ,SUBSTRING(First_Name FROM 0 FOR 6) AS Before1   -- Start in position 0 (one space before)
FROM Employee_Table ;

First_Name   Before1
__________   ________
Squiggy      Squig
John         John
Richard      Richa
Herbert      Herbe
Mandee       Mande
Cletus       Cletu
William      Willi
Billy        Billy
Loraine      Lorai

A starting position of zero moves one space in front of the beginning. Notice that our FOR length is 6, but 'Squiggy' turns into 'Squig', which is only 5 characters. One of the 6 positions is spent on position 0, where there is no character. The point being made here is that both the starting position and ending positions can move backwards, which will come in handy as you see other examples.
How SUBSTRING Works with a Starting Position of -1

SELECT First_Name
      ,SUBSTRING(First_Name FROM -1 FOR 3) AS Before2   -- Start in position -1 (two spaces before)
FROM Employee_Table ;

First_Name   Before2
__________   ________
Squiggy      S
John         J
Richard      R
Herbert      H
Mandee       M
Cletus       C
William      W
Billy        B
Loraine      L

A starting position of -1 moves two spaces in front of the beginning. Notice that our FOR length is 3, so each name delivers only the first initial. The point being made here is that both the starting position and ending positions can move backwards, which will come in handy as you see other examples.
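The arithmetic behind a starting position of 0 or -1 can be written out explicitly. The sketch below uses the literal 'Squiggy' in place of the First_Name column, and the column aliases are illustrative only. The substring covers positions start through start + length - 1, and any position below 1 holds no character.

```sql
-- Positions covered = start .. start + length - 1; positions below 1 return nothing.
SELECT SUBSTRING('Squiggy' FROM  1 FOR 3) AS From_1       -- positions  1..3 -> 'Squ'
      ,SUBSTRING('Squiggy' FROM  0 FOR 6) AS From_0       -- positions  0..5 -> 'Squig'
      ,SUBSTRING('Squiggy' FROM -1 FOR 3) AS From_Neg1 ;  -- positions -1..1 -> 'S'
```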
How SUBSTRING Works with an Ending Position of 0

SELECT First_Name
      ,SUBSTRING(First_Name FROM 3 FOR 0) AS WhatsUp   -- Go for 0 positions
FROM Employee_Table ;

First_Name   WhatsUp
__________   ________
Squiggy
John
Richard
Herbert
Mandee
Cletus
William
Billy
Loraine

In our example above, we start in position 3, but we go for zero positions, so nothing is delivered in the column. That is what's up!
An Example using SUBSTRING, TRIM and CHAR Together

SELECT Last_Name                                   -- Last_Name is a CHAR(20)
      ,SUBSTRING(Last_Name FROM
           LENGTH(TRIM(TRAILING FROM Last_Name)) - 1 FOR 2) AS Letters
FROM Employee_Table ;

Last_Name    Letters
__________   _______
Jones        es
Smith        th
Smythe       he
Harrison     on
Chambers     rs
Strickling   ng
Reilly       ly
Coffing      ng
Larkins      ns

The SQL above brings back the last two letters of each Last_Name. The tricky part is that the last names are different lengths. We first trimmed the trailing spaces off of the Last_Name. Then, we counted the characters that remained. Then, we subtracted one from that character length and passed the result to our substring as the starting position. For 'Jones', the length is 5, so the substring starts in position 4 and goes for 2 positions, which returns 'es'.
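Where a RIGHT function is available, the same last-two-letters result can usually be written more directly. The sketch below assumes the platform provides RIGHT(string, length); check your system's function list before relying on it.

```sql
-- Assumption: RIGHT(string, n) exists on the target system and returns the last n characters.
SELECT Last_Name
      ,RIGHT(TRIM(TRAILING FROM Last_Name), 2) AS Letters
FROM Employee_Table ;
```

The TRIM is still needed because the column is a padded CHAR(20); without it, the last two characters would be spaces.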
The POSITION Command finds a Letter's Position

SELECT Last_Name
      ,POSITION('e' IN Last_Name) AS Find_The_E
      ,POSITION('f' IN Last_Name) AS Find_The_F
FROM Employee_Table ;

Last_Name    Find_The_E   Find_The_F
__________   __________   __________
Jones        4            0            (e is in 4th position)
Smith        0            0
Smythe       6            0
Harrison     0            0            (No f is in the name)
Chambers     6            0
Strickling   0            0
Reilly       2            0            (e is 2nd position in name)
Coffing      0            3            (1st f is in 3rd position)
Larkins      0            0

This is the position counter. What it will do is tell you what position a letter is in. Why did Jones have a 4 in the result set? The 'e' was in the 4th position. Why did Smith get a zero for both columns? There is no 'e' in Smith and no 'f' in Smith. If there are two 'f's, only the first occurrence is reported.
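Because POSITION returns 0 when the letter is missing, it can also be used as a filter. The sketch below is an illustrative variation on the query above, not an example from the book: it keeps only the last names that contain an 'e'.

```sql
-- POSITION returns 0 when the letter is absent, so > 0 keeps only the matches.
SELECT Last_Name
      ,POSITION('e' IN Last_Name) AS Find_The_E
FROM Employee_Table
WHERE POSITION('e' IN Last_Name) > 0
ORDER BY Find_The_E ;
```

Against the Employee_Table shown above, this should return only Reilly, Jones, Smythe, and Chambers.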
Quiz - Find that SUBSTRING Starting Position

SELECT DISTINCT Department_Name AS Dept_Name
      ,SUBSTRING(Department_Name FROM
           POSITION(' ' IN Department_Name) + 1) AS Word2
FROM Department_Table
WHERE POSITION(' ' IN TRIM(Department_Name)) > 0 ;

Dept_Name              Word2
____________________   ___________
Customer Support       Support
Human Resources        Resources
Research and Develop   and Develop

What is the Starting Position here? What is the starting position of the Substring in the above query? Hint: This only looks for a Dept_Name that has two words or more.
Answer to Quiz - Find that SUBSTRING Starting Position

SELECT DISTINCT Department_Name AS Dept_Name
      ,SUBSTRING(Department_Name FROM
           POSITION(' ' IN Department_Name) + 1) AS Word2
FROM Department_Table
WHERE POSITION(' ' IN TRIM(Department_Name)) > 0 ;

Dept_Name              Word2         Starting Position
____________________   ___________   _________________
Customer Support       Support       (FROM 10)
Human Resources        Resources     (FROM 7)
Research and Develop   and Develop   (FROM 10)

What is the Starting Position here? The Starting Position is calculated by finding the position of the first SPACE and then adding 1.
Using the SUBSTRING to Find the Second Word On

SELECT DISTINCT Department_Name AS Dept_Name
      ,SUBSTRING(Department_Name FROM
           POSITION(' ' IN Department_Name) + 1) AS Word2
FROM Department_Table
WHERE POSITION(' ' IN TRIM(Department_Name)) > 0 ;

Dept_Name              Word2
____________________   ____________
Customer Support       Support
Human Resources        Resources
Research and Develop   and Develop

Notice we only had three rows come back. That is because our WHERE looks for only a Department_Name that has multiple words. Then, notice that the starting position of the Substring is calculated by a nested POSITION function that looks for the first space. Then, it adds 1 to that position, and we have a starting position for the 2nd word. We don't give a FOR length parameter, so it goes to the end.
Quiz - Why did only one Row Return

SELECT Department_Name
      ,SUBSTRING(Department_Name FROM
           POSITION(' ' IN Department_Name) + 1 +
           POSITION(' ' IN SUBSTRING(Department_Name FROM
                POSITION(' ' IN Department_Name) + 1))) AS Third_Word
FROM Department_Table
WHERE POSITION(' ' IN TRIM(SUBSTRING(Department_Name FROM
           POSITION(' ' IN Department_Name) + 1))) > 0 ;

Dept_Name              Third_Word
____________________   __________
Research and Develop   Develop

Why did only one row come back?
Answer to Quiz - Why Did only one Row Return

SELECT Department_Name
      ,SUBSTRING(Department_Name FROM
           POSITION(' ' IN Department_Name) + 1 +
           POSITION(' ' IN SUBSTRING(Department_Name FROM
                POSITION(' ' IN Department_Name) + 1))) AS Third_Word
FROM Department_Table
WHERE POSITION(' ' IN TRIM(SUBSTRING(Department_Name FROM
           POSITION(' ' IN Department_Name) + 1))) > 0 ;

Dept_Name              Third_Word
____________________   __________
Research and Develop   Develop      (It has 3 words)

Why did only one row come back? It's the only Department Name with three words. The SUBSTRING and the WHERE clause both look for the first space, and if they find it, they look for the second space. If they find that, they add 1 to it, and their Starting Position is the third word. There is no FOR length, so it defaults to "go to the end".
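Nested POSITION calls get hard to read quickly, so it can help to compare them with a word-splitting function. The sketch below assumes the platform provides SPLIT_PART(string, delimiter, field) and that the words are separated by single spaces; neither assumption comes from the book.

```sql
-- Assumption: SPLIT_PART(string, delimiter, field) is available and
-- yields nothing usable when the requested field does not exist.
SELECT Department_Name
      ,SPLIT_PART(TRIM(Department_Name), ' ', 3) AS Third_Word
FROM Department_Table
WHERE SPLIT_PART(TRIM(Department_Name), ' ', 3) <> '' ;
```

Unlike the SUBSTRING version, which returns everything from the third word to the end, this form returns the third word only.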
Concatenation

Two Pipe Symbols together (no space) mean concatenate

SELECT First_Name
      ,Last_Name
      ,First_Name || ' ' || Last_Name AS Full_Name   -- ' ' is a space
FROM Employee_Table
WHERE First_Name = 'Squiggy' ;

First_Name   Last_Name   Full_Name
__________   _________   _____________
Squiggy      Jones       Squiggy Jones

Two pipe symbols represent concatenation. That allows you to combine multiple columns into one column. The || (Pipe Symbol) on your keyboard is just above the ENTER key. Don't put a space in between, just put two Pipe Symbols together. In this example, we have combined the first name, then a single space and then the last name to get a new column called Full_Name.
Concatenation and SUBSTRING

SELECT First_Name
      ,Last_Name
      ,SUBSTRING(First_Name, 1, 1) || '. ' || Last_Name AS Full_Name   -- '. ' is a period (.) and a space
FROM Employee_Table
WHERE First_Name = 'Squiggy' ;

First_Name   Last_Name   Full_Name
__________   _________   _________
Squiggy      Jones       S. Jones

Of the three items being concatenated together, what is the first item of concatenation in the example above? The first initial of the First_Name. Then, we concatenated a literal period and a space. Then, we concatenated the Last_Name.
Four Concatenations Together

SELECT First_Name                                     -- First_Name is a VARCHAR(12)
      ,Last_Name                                      -- Last_Name is a CHAR(20)
      ,TRIM(Last_Name) || ' ' || SUBSTRING(First_Name, 1, 1) || '.' AS Last_Name_1st
FROM Employee_Table
WHERE First_Name = 'Squiggy' ;

First_Name   Last_Name   Last_Name_1st
__________   _________   _____________
Squiggy      Jones       Jones S.

Why did we TRIM the Last_Name? To get rid of the spaces, otherwise the output would have looked odd. How many items are being concatenated in the example above? There are 4 items concatenated. We start with the Last_Name (after we trim it), then we have a single space, then we have the First Initial of the First Name, and then we have a Period.
Troubleshooting Concatenation

ERROR: There should never be spaces between the pipe symbols

SELECT First_Name
      ,Last_Name
      ,TRIM(Last_Name) | | First_Name AS LastFirst
FROM Employee_Table
WHERE First_Name = 'Squiggy' ;

This is now perfect

SELECT First_Name
      ,Last_Name
      ,TRIM(Last_Name) || First_Name AS LastFirst
FROM Employee_Table
WHERE First_Name = 'Squiggy' ;

First_Name   Last_Name   LastFirst
__________   _________   ____________
Squiggy      Jones       JonesSquiggy

What happened above to cause the error? Can you see it? The Pipe Symbols || have a space between them like | |, when it should be ||. It is a tough one to spot, so be careful.
Chapter 14 - Interrogating the Data

"The difference between genius and stupidity is that genius has its limits." - Albert Einstein
Numeric Manipulation Functions

SELECT -10 AS Neg10
      ,Cos(90) AS Cos     -- Trigonometric cosine of an angle
      ,Sin(90) AS Sin     -- Trigonometric sine of an angle
      ,Tan(90) AS Tan     -- Trigonometric tangent of an angle
      ,Exp(6) AS Exp      -- Exponential value of a number
      ,Sqrt(16) AS Sqrt   -- Square root of a number
FROM Order_Table
LIMIT 1 ;

Neg10   Cos     Sin    Tan   Exp      Sqrt
_____   _____   ____   ___   ______   ____
-10     -0.45   0.89   -2    403.43   4

The functions above are often used for algebraic, trigonometric, or geometric calculations. The results show that the angle is read in radians, not degrees: the cosine of 90 radians is about -0.45, whereas the cosine of 90 degrees would be 0.
Finding the Cube Root

SELECT cbrt(27.0) cube_root_27     -- cbrt finds the Cube Root
      ,cbrt(216) cube_root_216
FROM Order_Table                   -- pick any table and LIMIT 1 so a single row returns
LIMIT 1 ;

cube_root_27   cube_root_216
____________   _____________
3              6

Find the cube root with the cbrt function.
Ceiling Gets the Smallest Integer Not Smaller Than X

SELECT ceil(-0.1) AS Ceil_1
      ,ceil(3.333) AS Ceil_2
      ,order_total
      ,ceil(order_total) AS Ceiling_Total
FROM order_table
LIMIT 1 ;

ceil_1   ceil_2   order_total   ceiling_total
______   ______   ___________   _____________
0        4        15231.62      15232

Find the smallest integer not smaller than x by using the ceil command. This stands for a number's integer ceiling.
Floor Finds the Largest Integer Not Greater Than X

SELECT floor(-0.1) AS Floor_1
      ,floor(3.333) AS Floor_2
      ,order_total
      ,floor(order_total) AS Floor_Total
FROM order_table
LIMIT 1 ;

Floor_1   Floor_2   order_total   Floor_Total
_______   _______   ___________   ___________
-1        3         15231.62      15231

Find the largest integer not greater than x by using the floor command. This stands for a number's integer floor.
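Ceiling and floor are easiest to tell apart on a negative number, where "rounding up" moves toward zero and "rounding down" moves away from it. The literals and aliases below are illustrative and are not taken from the Order_Table.

```sql
-- ceil moves toward positive infinity, floor toward negative infinity.
SELECT CEIL(3.333)   AS Ceil_Pos     --  4
      ,FLOOR(3.333)  AS Floor_Pos    --  3
      ,CEIL(-3.333)  AS Ceil_Neg     -- -3
      ,FLOOR(-3.333) AS Floor_Neg ;  -- -4
```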
The Round Function and Precision

SELECT Customer_Number
      ,Order_Total
      ,round(Order_Total, 0) AS no_decimals
      ,round(Order_Total, 1) AS one_decimal   -- Round 1 decimal place
FROM Order_Table ;

customer_number   order_total   no_decimals   one_decimal
_______________   ___________   ___________   ___________
87323456          15231.62      15232         15231.6
57896883          23454.84      23455         23454.8
31323134          5111.47       5111          5111.5
11111111          12347.53      12348         12347.5
11111111          8005.91       8006          8005.9

Use the round function to round to the precision you need.
Quiz - What would the Answer be?

Student_Table

Student_ID   Last_Name    First_Name   Class_Code   Grade_Pt
__________   __________   __________   __________   ________
423400       Larkins      Michael      FR           0.00
231222       Wilson       Susie        SO           3.80
280023       McRoberts    Richard      JR           1.90
322133       Bond         Jimmy        JR           3.95
125634       Hanson       Henry        FR           2.88
333450       Smith        Andy         SO           2.00
324652       Delaney      Danny        SR           3.35
260000       Johnson      Stanley      ?            ?
234121       Thomas       Wendy        FR           4.00
123250       Phillips     Martin       SR           3.00

SELECT Class_Code
      ,Grade_Pt / (Grade_Pt * 2) AS Math1
FROM Student_Table
ORDER BY 1, 2 ;

Can you guess what would return in the Answer Set? Using the Student_Table above, try to predict what the answer will be if this query runs on the system.
Answer to Quiz - What would the Answer be?

SELECT Class_Code
      ,Grade_Pt / (Grade_Pt * 2) AS Math1
FROM Student_Table
ORDER BY 1, 2 ;

Class_Code   Math1
__________   ___________________________
FR           0
FR           0.5000000000000000000000000
FR           0.5000000000000000000000000
JR           0.5000000000000000000000000
JR           0.5000000000000000000000000
SO           0.5000000000000000000000000
SO           0.5000000000000000000000000
SR           0.5000000000000000000000000
SR           0.5000000000000000000000000
?            ?

Above are your answers.
The NULLIFZERO Command

Student_Table

Student_ID   Last_Name    First_Name   Class_Code   Grade_Pt
__________   __________   __________   __________   ________
423400       Larkins      Michael      FR           0.00
231222       Wilson       Susie        SO           3.80
280023       McRoberts    Richard      JR           1.90
322133       Bond         Jimmy        JR           3.95
125634       Hanson       Henry        FR           2.88
333450       Smith        Andy         SO           2.00
324652       Delaney      Danny        SR           3.35
260000       Johnson      Stanley      ?            ?
234121       Thomas       Wendy        FR           4.00
123250       Phillips     Martin       SR           3.00

SELECT Class_Code
      ,Grade_Pt / (NULLIFZERO(Grade_Pt) * 2) AS Math1
FROM Student_Table
ORDER BY 1, 2 ;

If you have a calculation where a ZERO is not desired, you can use the NULLIFZERO command to convert any zero value to a null value. The results follow.
The NULLIFZERO vs. Zeroes

Without NULLIFZERO:

SELECT Class_Code AS Class
      ,Grade_Pt / (Grade_Pt * 2) AS Math1
FROM Student_Table
ORDER BY 1, 2 ;

Class   Math1
_____   ___________________________
FR      0
FR      0.5000000000000000000000000
FR      0.5000000000000000000000000
JR      0.5000000000000000000000000
JR      0.5000000000000000000000000
SO      0.5000000000000000000000000
SO      0.5000000000000000000000000
SR      0.5000000000000000000000000
SR      0.5000000000000000000000000
?       ?

With NULLIFZERO:

SELECT Class_Code AS Class
      ,Grade_Pt / (NULLIFZERO(Grade_Pt) * 2) AS Math1
FROM Student_Table
ORDER BY 1, 2 ;

Class   Math1
_____   ___________________________
FR      ?
FR      0.5000000000000000000000000
FR      0.5000000000000000000000000
JR      0.5000000000000000000000000
JR      0.5000000000000000000000000
SO      0.5000000000000000000000000
SO      0.5000000000000000000000000
SR      0.5000000000000000000000000
SR      0.5000000000000000000000000
?       ?

If you have a calculation where a ZERO is not desired, you can use the NULLIFZERO command to convert any zero value to a null value.
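NULLIFZERO is a shorthand, and it can help to see it next to the more general forms it stands for. The sketch below writes the same divisor three ways; the Math2 and Math3 aliases are illustrative, and all three columns are expected to agree. Guarding the divisor this way also matters on systems that raise a division-by-zero error instead of returning a value.

```sql
SELECT Class_Code
      ,Grade_Pt / (NULLIFZERO(Grade_Pt) * 2) AS Math1   -- shorthand
      ,Grade_Pt / (NULLIF(Grade_Pt, 0) * 2)  AS Math2   -- NULLIF with a zero literal
      ,Grade_Pt / (CASE WHEN Grade_Pt = 0 THEN NULL
                        ELSE Grade_Pt END * 2) AS Math3 -- the CASE both forms stand for
FROM Student_Table
ORDER BY 1, 2 ;
```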
Quiz - Fill in the Blank Values in the Answer Set

Sample_Table

Cust_No   Acc_Balance   Location
_______   ___________   ________
0         ?             3

SELECT NULLIFZERO(Cust_No) AS Cust_No
      ,NULLIFZERO(Acc_Balance) AS Acc_Balance
      ,NULLIFZERO(Location) AS Location
FROM Sample_Table ;

Cust_No   Acc_Balance   Location
_______   ___________   ________


Fill in the Answer Set above after looking at the table and the query.

Okay! Time to show me your brilliance! What would the Answer Set produce?
Answer to Quiz - Fill in the Blank Values in the Answer Set

Sample_Table

Cust_No   Acc_Balance   Location
_______   ___________   ________
0         ?             3

SELECT NULLIFZERO(Cust_No) AS Cust_No
      ,NULLIFZERO(Acc_Balance) AS Acc_Balance
      ,NULLIFZERO(Location) AS Location
FROM Sample_Table ;

Cust_No   Acc_Balance   Location
_______   ___________   ________
?         ?             3

Here is the answer set! How did you do? The NULLIFZERO command found a zero in Cust_No, so it made it Null. The others were not zero, so they retained their value: Acc_Balance was already Null and stays Null, and Location stays 3. The only time NULLIFZERO changes data is if it finds a zero, and then it changes it to null.
Quiz - Fill in the Answers for the NULLIF Command

Student_Table

Student_ID   Last_Name    First_Name   Class_Code   Grade_Pt
__________   __________   __________   __________   ________
423400       Larkins      Michael      FR           0.00
123250       Phillips     Martin       SR           3.00
234121       Thomas       Wendy        FR           4.00

SELECT Last_Name
      ,NULLIF(Grade_Pt, 0) AS GP1
      ,NULLIF(Grade_Pt, 3.0) AS GP2
      ,NULLIF(Grade_Pt, 4.0) AS GP3
FROM Student_Table
WHERE Student_ID IN (423400, 123250, 234121)
ORDER BY Last_Name ;

Last_Name    GP1    GP2    GP3
__________   ____   ____   ____
Larkins
Phillips
Thomas

Fill in the Answer Set above after looking at the table and the query. What would the above Answer Set produce from your analysis?
Answer - Fill in the Answers for the NULLIF Command

Student_Table

Student_ID   Last_Name    First_Name   Class_Code   Grade_Pt
__________   __________   __________   __________   ________
423400       Larkins      Michael      FR           0.00
123250       Phillips     Martin       SR           3.00
234121       Thomas       Wendy        FR           4.00

SELECT Last_Name
      ,NULLIF(Grade_Pt, 0) AS GP1
      ,NULLIF(Grade_Pt, 3.0) AS GP2
      ,NULLIF(Grade_Pt, 4.0) AS GP3
FROM Student_Table
WHERE Student_ID IN (423400, 123250, 234121)
ORDER BY Last_Name ;

Last_Name    GP1    GP2    GP3
__________   ____   ____   ____
Larkins      ?      0.00   0.00
Phillips     3.00   ?      3.00
Thomas       4.00   4.00   ?

Look at the answers above, and if it doesn't make sense, go over it again until it does.
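If the NULLIF answers still look puzzling, it may help to read NULLIF(a, b) as "return Null when a equals b, otherwise return a". The sketch below spells out one of the columns as a CASE expression; the GP2_Case alias is illustrative, and both columns are expected to match row for row.

```sql
SELECT Last_Name
      ,NULLIF(Grade_Pt, 3.0) AS GP2
      ,CASE WHEN Grade_Pt = 3.0 THEN NULL
            ELSE Grade_Pt END AS GP2_Case   -- same rule written as a CASE
FROM Student_Table
WHERE Student_ID IN (423400, 123250, 234121)
ORDER BY Last_Name ;
```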
The ZEROIFNULL Command

Sample_Table

Cust_No   Acc_Balance   Location
_______   ___________   ________
0         ?             3

Notice the Null! We're turning it into a 0 shortly!

SELECT ZEROIFNULL(Cust_No) AS Cust
      ,ZEROIFNULL(Acc_Balance) AS Balance
      ,ZEROIFNULL(Location) AS Location
FROM Sample_Table ;

Cust    Balance   Location
_____   _______   ________


Fill in the Answer Set above after looking at the table and the query.

This is the ZEROIFNULL. What it will do is put a zero into a place where a NULL shows up. Fill in what you think the answer set will be.
Answer to the ZEROIFNULL Question

Sample_Table

Cust_No   Acc_Balance   Location
_______   ___________   ________
0         ?             3

SELECT ZEROIFNULL(Cust_No) AS Cust
      ,ZEROIFNULL(Acc_Balance) AS Balance
      ,ZEROIFNULL(Location) AS Location
FROM Sample_Table ;

Cust    Balance   Location
_____   _______   ________
0       0         3

The answer set placed a zero in the place of the NULL Acc_Balance, but the other values didn't change because they were NOT Null.
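ZEROIFNULL can also be read as a special case of COALESCE with a zero as the fallback value. The sketch below shows the two side by side on the same Sample_Table column; the Balance_Coalesce alias is illustrative, and both columns are expected to return 0 for the Null balance.

```sql
SELECT ZEROIFNULL(Acc_Balance)  AS Balance
      ,COALESCE(Acc_Balance, 0) AS Balance_Coalesce   -- same result, more portable spelling
FROM Sample_Table ;
```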
The COALESCE Command

Sample_Table

Last_Name    Home_Phone   Work_Phone   Cell_Phone
__________   __________   __________   __________
Jones        555-1234     444-1234     ?
Patel        ?            456-7890     454-6789
Gonzales     ?            ?            354-0987
Nguyen       ?            ?            ?

SELECT Last_Name
      ,COALESCE(Home_Phone, Work_Phone, Cell_Phone) AS Phone
FROM Sample_Table ;

Last_Name    Phone
__________   ________


Fill in the Answer Set above after looking at the table and the query.

Coalesce returns the first non-Null value in a list, and if all values are Null, returns Null.
The COALESCE Answer Set

Sample_Table

Last_Name    Home_Phone   Work_Phone   Cell_Phone
__________   __________   __________   __________
Jones        555-1234     444-1234     ?
Patel        ?            456-7890     454-6789
Gonzales     ?            ?            354-0987
Nguyen       ?            ?            ?

SELECT Last_Name
      ,COALESCE(Home_Phone, Work_Phone, Cell_Phone) AS Phone
FROM Sample_Table ;

Last_Name    Phone
__________   ________
Jones        555-1234
Patel        456-7890
Gonzales     354-0987
Nguyen       ?

Coalesce returns the first non-Null value in a list, and if all values are Null, returns Null.
The Coalesce Quiz

Sample_Table

Last_Name    Home_Phone   Work_Phone   Cell_Phone
__________   __________   __________   __________
Jones        555-1234     444-1234     ?
Patel        ?            456-7890     454-6789
Gonzales     ?            ?            354-0987
Nguyen       ?            ?            ?

SELECT Last_Name
      ,COALESCE(Home_Phone, Work_Phone, Cell_Phone, 'No Phone') AS Phone
FROM Sample_Table ;

Last_Name    Phone
__________   ________


Fill in the answer set above after looking at the table and the query.

Coalesce returns the first non-Null value in a list, and if all values are Null, returns Null. Since we decided in the above query we don't want NULLs, notice we have placed a literal 'No Phone' in the list. How will this affect the Answer Set?
Answer - The Coalesce Quiz

Sample_Table

Last_Name    Home_Phone   Work_Phone   Cell_Phone
__________   __________   __________   __________
Jones        555-1234     444-1234     ?
Patel        ?            456-7890     454-6789
Gonzales     ?            ?            354-0987
Nguyen       ?            ?            ?

SELECT Last_Name
      ,COALESCE(Home_Phone, Work_Phone, Cell_Phone, 'No Phone') AS Phone
FROM Sample_Table ;

Last_Name    Phone
__________   ________
Jones        555-1234
Patel        456-7890
Gonzales     354-0987
Nguyen       No Phone

Answers are above! We put a literal in the list so there's no chance of NULL returning.
The COALESCE Command - Fill In the Answers

Student_Table

Student_ID   Last_Name    First_Name   Class_Code   Grade_Pt
__________   __________   __________   __________   ________
423400       Larkins      Michael      FR           0.00
260000       Johnson      Stanley      ?            ?
234121       Thomas       Wendy        FR           4.00

SELECT Last_Name
      ,Grade_Pt
      ,Student_ID
      ,COALESCE(Grade_Pt, Student_ID) AS ValidStudents
FROM Student_Table
WHERE Last_Name IN ('Johnson', 'Larkins', 'Thomas')
ORDER BY 1 ;

Last_Name    Grade_Pt   Student_ID   ValidStudents
__________   ________   __________   _____________
Johnson      ?          260000
Larkins      0.00       423400
Thomas       4.00       234121

Fill in the Answer Set above after looking at the table and the query.

Coalesce returns the first non-Null value in a list, and if all values are Null, returns Null.
The COALESCE Answer Set

Student_Table

Student_ID   Last_Name    First_Name   Class_Code   Grade_Pt
__________   __________   __________   __________   ________
423400       Larkins      Michael      FR           0.00
260000       Johnson      Stanley      ?            ?
234121       Thomas       Wendy        FR           4.00

SELECT Last_Name
      ,Grade_Pt
      ,Student_ID
      ,COALESCE(Grade_Pt, Student_ID) AS ValidStudents
FROM Student_Table
WHERE Last_Name IN ('Johnson', 'Larkins', 'Thomas')
ORDER BY 1 ;

Last_Name    Grade_Pt   Student_ID   ValidStudents
__________   ________   __________   _____________
Johnson      ?          260000       260000.00
Larkins      0.00       423400       0.00
Thomas       4.00       234121       4.00

Coalesce returns the first non-Null value in a list, and if all values are Null, returns Null.
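The ValidStudents column displays 260000.00 rather than 260000, which suggests that every value in the COALESCE list is returned in one common data type. When the fallback is text instead of a number, a reasonable approach is to CAST the column first so all arguments are character data. The sketch below is an assumption-labeled illustration: the 'No Grade' literal, the VARCHAR(10) length, and the Grade_Or_Text alias are not from the book.

```sql
-- Cast the numeric column to text so both COALESCE arguments share a type.
SELECT Last_Name
      ,COALESCE(CAST(Grade_Pt AS VARCHAR(10)), 'No Grade') AS Grade_Or_Text
FROM Student_Table
WHERE Last_Name IN ('Johnson', 'Larkins', 'Thomas')
ORDER BY 1 ;
```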
COALESCE is Equivalent to This CASE Statement

SELECT Last_Name
      ,Grade_Pt
      ,Class_Code
      ,COALESCE(Grade_Pt, Student_ID) AS ValidStudents
FROM Student_Table ;

SELECT Last_Name
      ,Grade_Pt
      ,Class_Code
      ,CASE WHEN Grade_Pt IS NOT NULL THEN Grade_Pt
            WHEN Student_ID IS NOT NULL THEN Student_ID
            ELSE NULL
       END AS ValidStudents
FROM Student_Table ;

Coalesce returns the first non-Null value in a list, and if all values are Null, returns Null. Above are two queries that return the exact same answer set. These examples are designed to give you a better idea of how Coalesce works.
Some Great CAST (Convert and Store) Examples

SELECT CAST('ABCDE' AS CHAR(1)) AS Trunc
      ,CAST(128 AS CHAR(3)) AS OK
      ,CAST(127 AS INTEGER) AS Bigger ;

TRUNC   OK    BIGGER
_____   ___   ______
A       128   127

The first CAST truncates the five characters (left to right) to form the single character 'A'. In the second CAST, the integer 128 is converted to three characters and left justified in the output. The 127 was initially stored in a SMALLINT (5 digits, up to 32767) and then converted to an INTEGER. Hence, it uses 11 character positions for its display, ten numeric digits and a sign (positive assumed), and is right justified as numeric.
Some Great CAST (Convert and Store) Examples

SELECT CAST(121.53 AS SMALLINT) AS Whole
      ,CAST(121.53 AS DECIMAL(3,0)) AS Rounder ;

Whole   Rounder
_____   _______
122     122

SELECT CAST(121.49 AS SMALLINT) AS Whole
      ,CAST(121.53 AS DECIMAL(3,0)) AS Rounder ;

Whole   Rounder
_____   _______
121     122

SELECT CAST(121.50 AS SMALLINT) AS Whole
      ,CAST(121.53 AS DECIMAL(3,0)) AS Rounder ;

Whole   Rounder
_____   _______
122     122

The value of 121.53 was initially stored as a DECIMAL with 5 total digits, 2 of them to the right of the decimal point. Then, it is converted to a SMALLINT using CAST to remove the decimal positions. The decimal portion is stripped off, but the value is rounded rather than simply truncated: it rounds up because .53 is greater than .50, while 121.49 rounds down to 121. The CAST in the next column, called Rounder, converts to a DECIMAL with 3 digits and no decimals, so it will also round data values. Since .53 is greater than .5, it is rounded up to 122.
A Rounding Example

SELECT CAST(.014  AS Decimal(3,2)) AS ".014"
      ,CAST(.016  AS Decimal(3,2)) AS ".016"
      ,CAST(.015  AS Decimal(3,2)) AS ".015"
      ,CAST(.0150 AS Decimal(3,2)) AS ".0150"
      ,CAST(.0250 AS Decimal(3,2)) AS ".0250"
      ,CAST(.0159 AS Decimal(3,2)) AS ".0159" ;

.014   .016   .015   .0150   .0250   .0159
____   ____   ____   _____   _____   _____
0.01   0.02   0.02   0.02    0.03    0.02

Rounding isn't always intuitive as you can see from the examples above.
Some Great CAST (Convert and Store) Examples

SELECT Order_Number AS OrdNo
      ,Customer_Number AS CustNo
      ,Order_Date
      ,Order_Total
      ,CAST(Order_Total AS INTEGER) AS Chopped
      ,CAST(Order_Total AS DECIMAL(5,0)) AS Rounded
FROM Order_Table
ORDER BY Order_Date ;

OrdNo    CustNo     Order_Date   Order_Total   Chopped   Rounded
______   ________   __________   ___________   _______   _______
123456   11111111   05/04/1998   12347.53      12348     12348
123512   11111111   01/01/1999   8005.91       8006      8006
123777   57896883   09/09/1999   23454.84      23455     23455
123552   31323134   10/01/1999   5111.47       5111      5111
123585   87323456   10/10/1999   15231.62      15232     15232

Notice how the rounding did not take place as you might have expected. The column named Chopped is not chopped at all: the CAST to INTEGER rounds the value, just as the CAST to DECIMAL(5,0) does.
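If the goal really is to chop off the decimals rather than round them, one option is to truncate before casting. The sketch below assumes the platform provides a TRUNC function for numeric values; the Rounded_Int alias is illustrative.

```sql
-- Assumption: TRUNC(number) drops the decimal portion without rounding.
SELECT Order_Number
      ,Order_Total
      ,CAST(Order_Total AS INTEGER)        AS Rounded_Int   -- 12347.53 -> 12348
      ,CAST(TRUNC(Order_Total) AS INTEGER) AS Chopped       -- 12347.53 -> 12347
FROM Order_Table
ORDER BY Order_Date ;
```

For positive totals such as these, FLOOR gives the same result as TRUNC; the two differ only for negative values.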
Quiz - The Basics of the CASE Statements

Course_Table

Course_ID   Course_Name                Credits   Seats
_________   ________________________   _______   _____
100         Database Concepts          3         50
200         Introduction to SQL        3         20
210         Advanced SQL               3         22
220         SQL Features               2         25
300         Physical Database Design   4         20
400         Database Administration    4         16

SELECT Course_Name
      ,CASE Credits
            WHEN 1 THEN 'One Credit'
            WHEN 2 THEN 'Two Credits'
            WHEN 3 THEN 'Three Credits'
       END AS CreditAlias
FROM Course_Table
WHERE Course_ID IN (220, 300) ;

Course_Name                CreditAlias
________________________   ____________
Physical Database Design
SQL Features

This is a CASE STATEMENT which allows you to evaluate a column in your table, and from that, come up with a new answer for your report. Every CASE begins with a CASE, and they all must end with a corresponding END. What would the answer be?
Answer to Quiz - The Basics of the CASE Statements

Course_Table

Course_ID   Course_Name                Credits   Seats
_________   ________________________   _______   _____
100         Database Concepts          3         50
200         Introduction to SQL        3         20
210         Advanced SQL               3         22
220         SQL Features               2         25
300         Physical Database Design   4         20
400         Database Administration    4         16

SELECT Course_Name
      ,CASE Credits
            WHEN 1 THEN 'One Credit'
            WHEN 2 THEN 'Two Credits'
            WHEN 3 THEN 'Three Credits'
       END AS CreditAlias
FROM Course_Table
WHERE Course_ID IN (220, 300) ;

Course_Name                CreditAlias
________________________   ____________
Physical Database Design   ?
SQL Features               Two Credits

The answer for the Physical Database Design class is null. This is because it fell through the case statement: it has 4 credits, and no WHEN checks for 4. The answer for the SQL Features course is Two Credits. Once a case statement gets a match, it leaves the statement and gets the next row.
Using an ELSE in the Case Statement

Course_Table

Course_ID   Course_Name                Credits   Seats
_________   ________________________   _______   _____
100         Database Concepts          3         50
200         Introduction to SQL        3         20
210         Advanced SQL               3         22
220         SQL Features               2         25
300         Physical Database Design   4         20
400         Database Administration    4         16

SELECT Course_Name
      ,CASE Credits
            WHEN 1 THEN 'One Credit'
            WHEN 2 THEN 'Two Credits'
            WHEN 3 THEN 'Three Credits'
            ELSE 'Four Credits'
       END AS CreditAlias
FROM Course_Table
WHERE Course_ID IN (220, 300) ;

Course_Name                CreditAlias
________________________   ____________
Physical Database Design   Four Credits
SQL Features               Two Credits

Now that we have an ELSE in our case statement we are guaranteed that nothing will fall through.
Using an ELSE as a Safety Net

Course_Table

Course_ID   Course_Name                Credits   Seats
_________   ________________________   _______   _____
100         Database Concepts          3         50
200         Introduction to SQL        3         20
210         Advanced SQL               3         22
220         SQL Features               2         25
300         Physical Database Design   4         20
400         Database Administration    4         16

SELECT Course_Name
      ,CASE Credits
            WHEN 1 THEN 'One Credit'
            WHEN 2 THEN 'Two Credits'
            WHEN 3 THEN 'Three Credits'
            WHEN 4 THEN 'Four Credits'
            ELSE 'Do not know'
       END AS CreditAlias
FROM Course_Table ;

Now that we have an ELSE in our case statement we are guaranteed that nothing will fall through. An ELSE should be used in case you forgot a possibility and there was no match.
Rules for a Valued Case Statement

SELECT Course_Name
      ,CASE Credits
            WHEN 1 THEN 'One Credit'
            WHEN 2 THEN 'Two Credits'
            WHEN 3 THEN 'Three Credits'
            ELSE 'Credits not found'
       END AS CreditAlias
FROM Course_Table ;

The column Credits follows the word CASE. This is a valued case statement. The value is the column Credits.

Rules for a Valued CASE:
1. You can only check for equality
2. You can only check the value of the column Credits

There are two types of CASE statements. There is the Valued CASE and the Searched CASE. Above are the rules for the Valued CASE statement.
Rules for a Searched Case Statement SELECT Course_Name No Value follows the ,CASE word CASE. This is WHEN Credits