Thursday, March 16, 2023

The SELECT Statement in Oracle

The SELECT statement is used to retrieve information from a database. Following the execution of a SELECT statement, a tabular result is held in a result table (called a result set). It allows you to select the table columns depending on a set of criteria.

SELECT statements might be as straightforward as selecting a few columns from a single table. They can also be complex, with several columns and criteria and multiple tables. In this article, I will introduce the basic SELECT statement fetching data from just one table.

SELECT Syntax:

The SELECT statement contains 4 Parts:

SELECT >> COLUMNS >> FROM >>WHERE

The SELECT clause defines which table columns are retrieved. The FROM clause defines which tables are queried. The WHERE clause determines which table rows should be selected and if it is omitted, all table rows are selected.

The basic syntax looks like this:

SELECT Column_Name1, Column_Name2, Column_Name
FROM Table_Name 
WHERE Criteria;

A database consists of one or more tables. Each table is given a distinct name (e.g., “Employee”, “Department”, “Address”). Data records are stored in tables (rows).

Here’s an example of an Employee table: You can use the following SQL script to create Employee and load the data.

SQL script to create the  “Employee” table.

CREATE TABLE Employee (
    Employeeid Varchar(15) Primary Key,
    Name       Varchar(15),
    LastName   Varchar(15),
    FirstName  Varchar(15),
    Department Varchar(15),
    Dob        Date,
    Salary     Number(8,0)
);

The following INSERT statements will load data into the Employee table for the demos in this article.

INSERT INTO Employee (EmployeeId, Name, LastName, FirstName, Department, DOB, Salary)
VALUES ('1001', 'Allen Cox', 'Cox', 'Allen', 'IT', TO_DATE('01-July-1982', 'DD-MM-YYYY'),75000);
INSERT INTO Employee (EmployeeId, Name, LastName, FirstName, Department, DOB, Salary)
VALUES ('1002', 'Dave Carry', 'Carry', 'Dave', 'Sales', TO_DATE('01-Jan-1992', 'DD-MM-YYYY'),70000);
INSERT INTO Employee (EmployeeId, Name, LastName, FirstName, Department, DOB, Salary)
VALUES ('1003', 'Amit Singh','Singh', 'Amit', 'Sales', TO_DATE('20-Oct-1991', 'DD-MM-YYYY'),50000);
INSERT INTO Employee (EmployeeId, Name, LastName, FirstName, Department, DOB, Salary)
VALUES ('1004', 'Rhonda Grant', 'Grant', 'Rhonda', 'Marketing', TO_DATE('15-May-1980', 'DD-MM-YYYY'),60000);
INSERT INTO Employee (EmployeeId, Name, LastName, FirstName, Department, DOB, Salary)
VALUES ('1005', 'Marvin Cox', 'Cox', 'Marvin', 'Marketing', TO_DATE('15-Jan-1985', 'DD-MM-YYYY'),62000);
INSERT INTO Employee (EmployeeId, Name, LastName, FirstName, Department, DOB, Salary)
VALUES ('1006', 'Dave Grant', 'Grant', 'Dave', 'Sales', TO_DATE('05-Oct-1980', 'DD-MM-YYYY'),90000);

This is how the data in the table looks.

EmployeeId

Name

LastName

FirstName

Department

DOB

Salary

1001

Allen Cox

Cox

Allen

IT

01-JUL-82

75000

1002

Dave Carry

Carry

Dave

Sales

01-JAN-92

70000

1003

Amit Singh

Singh

Amit

Sales

20-OCT-91

50000

1005

Marvin Cox

Cox

Marvin

Marketing

15-JAN-85

62000

1006

Dave Grant

Grant

Dave

Sales

05-OCT-80

90000

Let’s write SELECT statements on the above table.

Retrieve Data from a Single Table

The SELECT statement can retrieve data from single or multiple tables; however, all my examples in this article use a single table to focus on the parts of the SELECT statement itself. In a future article, I will cover multiple table queries and how JOIN operators can be used to connect data from multiple tables and then shaped into a single tabular data stream.

Select all columns

An asterisk following the word “SELECT” indicates that all fields that are available should be retrieved (columns).

Using SELECT * in SQL queries is great for doing quick ad hoc work, but for code that will be reused in production, this is a poor practice that should be avoided. Instead of SELECT *, it is recommended to provide the column names you want in the SELECT query. Using column names not only improves efficiency but also makes SQL code easier to understand.

Furthermore, if you use SQL with * in an application such as PeopleSoft code or .NET code and add any new columns to the table or reorder the existing columns, your code may break. It is however, what most people use when exploring data.

The following SELECT statement gets all the columns and rows from the given table. Note that the FROM clause is where you specify the set of data that your query will be using. I will discuss this more when covering JOIN operations.

SELECT  *  
FROM Employee;

This returns:

Graphical user interface, table Description automatically generated

Selecting one or more columns by name

Any database object such as a column name can be named using either quoted or unquoted identifiers. Unquoted identifiers are not case-sensitive. Oracle classifies them as uppercase.  Case is important when using quoted IDs. By enclosing names in double quotation marks, you can give the following names to different objects in the same namespace.

  • “employees”
  • “Employees”
  • “EMPLOYEES”

It is important to note that Oracle reads the following names identically, therefore they cannot be used for various objects in the same namespace:

  • employees
  • EMPLOYEES
  • “EMPLOYEES”

I prefer object names without quotation marks because they are easier for me to type. This implies that when I create a table, I don’t use quotation marks so that the columns can be entered in either capital or lowercase characters in a select statement The decision is entirely personal in this case. However, consistency is important. I like uppercase keywords and PASCAL case for things like column and table names, but it is up to you to define your style.

When selecting only certain columns from the table:

  • Specify each column’s name when selecting multiple columns.
  • Column names must be separated by commas.

For example, the following SELECT statement retrieves particular columns and rows from the given table.

SELECT EmployeeId, Name, LastName, FirstName 
FROM Employee;

This query, using the table provided, returns:

Graphical user interface, text Description automatically generated with medium confidence

Sorting data: ORDER BY

The ORDER BY clause is used to sort the result set in ascending or descending order. When you execute a query with no ORDER BY clause, the order of the output is not guaranteed to be in any order. 

For example, assume you wish to sort the data from lowest to highest date of birth.

SELECT  *  
FROM Employee 
ORDER BY DOB;

This returns the following, note that the data in the DOB column is sorted when the results are returned:

Graphical user interface, table Description automatically generated

By default, ORDER BY sorts the rows in ascending order. Although it is the default, you may optionally add ASC to the end of the ORDER BY clause to explicitly state the sort is ascending. By adding DESC at the end of the ORDER BY clause, you can instruct SQL to return results sorted by that column in descending order (For numbers 100 to 1 and letters Z to A).

As a quick example, if you want to sort the data from highest to lowest date of birth, add DESC as a suffix to DOB in the ORDER BY clause.

SELECT *  
FROM Employee 
ORDER BY DOB DESC;

Now you can see the DOB data goes from most recent to oldest data:

Graphical user interface, table Description automatically generated with medium confidence

Sorting by Multiple Columns:

You can use more than one sort of level to order your data. The primary sort order is defined by the first field name, the secondary sort the second, and so on. You simply add more columns to the ORDER BY clause. Each column may be ascending or descending, as you wish.

SELECT Column_Name1, Column_Name2, Column_Name3 
FROM Tablename 
ORDER BY Column_Name1 <desc>, Column_Name2 <desc>, 
         Column_Name3 <desc>;

For example, you may sort your data by salary and then by date of birth within salary:

SELECT * 
FROM Employee 
ORDER BY Salary DESC, DOB DESC;

There are no duplicated values in our dataset, but this is the output.

Graphical user interface, application Description automatically generated

Filtering rows from the output

The SQL SELECT statement’s WHERE clause specifies which rows to (or not to) return from your table or view by filtering rows based on a given condition.

When referring to literal character values in the WHERE clause, one must put the values in single quotes, for example, DEPARTMENT='Sales'; Other datatypes have different rules for how they are represented in code. For example, numeric values should not be surrounded by quotations.

This is the basic syntax:

SELECT <Column Name List> 
FROM   Tablename 
WHERE  <Comparison Expression>
ORDER BY Column_Name <desc>;

The <comparison expression> in the WHERE clause can most any expression returning a Boolean value. For example, you might compare a column value to a literal (Salary = 10000), a column value to another column (CurrentSalary = StartingSalary), or even if a column value has a NULL value (Salary IS NULL). A NULL value generally means that a value is unknown, in that you don’t have the value.

For comparing two values, most comparison operators can be used with a WHERE clause:

Operator

Description

IN

Equal to any item on a list

NOT IN

Not equal to any item on a list

AND

All conditions are true

OR

Any Condition is true

=

Equal

<>

Not equal

>

Greater than

<

Less than

>=

Greater than or equal

<=

Less than or equal

BETWEEN

Between a range

LIKE

Search for a pattern

IS [NOT] NULL

Checks to see if the column value IS NULL

The WHERE clause may also be quite complex depending on the necessary criteria, something that I will cover in a later article.

WHERE Clause Examples

In these next subsections, I will give a few examples of WHERE clauses using different comparison operators.

= (Equal) Comparison Operator

The most common comparison operator is =. It simply is used to compare that two values are equal. For example:

SELECT  *  
FROM Employee 
WHERE Department ='Sales';

This returns the following, where the employee is in the Sales department:

Table Description automatically generated

<> Not equal Comparison Operator

The following query shows all employees from Department Sales.

SELECT  *  
FROM Employee 
WHERE DEPARTMENT <> 'Sales';

This returns:

Graphical user interface, application, table Description automatically generated

There are also comparison operators for > (Greater than), < (Less than), >= (Greater than equal to), <= (Less than equal to), and more that work in a similar way.

IN Comparison Operator

The following query returns all Employee rows with EmployeeId values in the range of ‘1001’,1002, or ‘1003’.

SELECT * 
FROM Employee 
WHERE EmployeeId IN ('1001',1002,'1003');

This returns:

Graphical user interface, application Description automatically generated

If you want the rows that are not in a certain range, you need to use NOT IN. For example, the following query returns all Employee rows with EmployeeId values, not in the range of ‘1001’,1002, or ‘1003’.

SELECT *
FROM Employee 
WHERE EmployeeId NOT IN ('1001',1002,'1003');

This returns the other rows in the table:

Graphical user interface Description automatically generated with medium confidence

Note: be slightly careful with the NOT IN operator when NULL values could be involved. Column NOT IN (NULL, 1, 2) will never return any rows. NULL comparisons return NULL, and only rows with a TRUE comparison will be returned.

Graphical user interface, table Description automatically generated

AND Boolean Operator

Boolean operators are used to create more complex filters by tying 2 or more conditions together. For example, say you need to find all employees from the “Sales” Department with the last name “Grant”.

The two conditions are equality operations, and then we tie them together with AND:

SELECT  *
FROM Employee 
WHERE Department ='Sales' 
  AND LastName='Grant';

The result is:

OR Operator

Much like and, but in this case, we want to find where any condition is met. For example, the following query finds all employees from the “Sales” Department or last name is “Grant”

SELECT  *  
FROM Employee 
WHERE DEPARTMENT='Sales' 
   OR LastName= 'Grant';

This returns:

Graphical user interface, table Description automatically generated

It should be noted that when using AND and OR Boolean expression you need to take care when combining them. Use parenthesis to group together items. For example, if you want all employees from the “Sales” Department with the last name “Grant” as well as all employees not in the “Sales” Department, you will use the following:

WHERE (Department ='Sales' 
       AND LastName='Grant')
    OR (Department <> 'Sales')

Between Comparison Operator

The SQL BETWEEN operator is used to filter data based on a range of values. It allows you to select rows from a table where a column’s value falls within a specified range.

Note that the BETWEEN operator is inclusive, meaning it includes both the lower and upper bounds in the range. The following query returns all employees whose salaries are between $50000 and $70000.

SELECT  *  
FROM Employee 
WHERE Salary BETWEEN 50000 And 70000;

This returns:

Graphical user interface, table Description automatically generated

In the output you can see that the two rows with 50000 and 70000 are in the output. If you want to exclude the bounds, you need to be specific in your comparison. For example, the previous BETWEEN expression could be re-written:

WHERE Salary >= 50000 AND Salary <=70000;

If you didn’t want the endpoints, you could rewrite it as:

WHERE Salary > 50000 AND Salary <70000;

And you may need to vary that based on exactly what you are trying to achieve. Some people do not like to use BETWEEN because it isn’t as expressive as using individual comparison operators.

The LIKE Comparison Operator

A very important operator that I want to introduce is the LIKE operator. It lets you find more complex matches using simple comparison operators.

The LIKE operator does a pattern match comparison, which means it compares a string value to a pattern string that contains wildcard characters.

Wildcards can be defined using a “%” or “_” symbol both before and after the pattern

In Oracle SQL, the escape character can be used in a LIKE statement to escape special characters and match their literal value instead of their special meaning. The escape character is specified using the ESCAPE clause, and its default value is the backslash (\) character. For more information about escape character or wild cards please refer to the Oracle link

Note: Oracle also supports a regular expressions LIKE operator too. For more information about this, check out this article.

The percent sign (%) represents many, multiple, or zero characters, while the underscore symbol (_) one and only one value.

For example. the SQL query below will return employees whose first names begin with the letter ‘A’:

SELECT  *  
FROM Employee 
WHERE Name like 'A%';

This returns the following:

Graphical user interface Description automatically generated with low confidence

The SQL query below will return employees whose first names end with the letter ‘t’:

SELECT  *  
FROM Employee 
WHERE Name like '%t';

Which you can see in the results:

Table Description automatically generated with medium confidence

The SQL query below will return employees whose first names include the letter ‘C’:

SELECT *
FROM Employee 
WERE name like '%C%';

This returns the following:

Graphical user interface, application Description automatically generated

Note: Be cautious when using LIKE expressions with leading wildcard values because normal indexes may not be particularly beneficial due to the way indexes are structured and prevent efficient index utilization that uses a leading wildcard. As a result, LIKE expressions with leading wildcards (for example, ‘%list’) should be avoided where possible.

Finally, the underscore symbol (_) specifies a single character. Hence, the SQL query below will return employees with the letter “l” in the second position in their first names.

SELECT *
FROM   Employee 
WHERE  name like '_l%';

This returns the following:

Limiting the number of Rows in the output

It is sometimes a good idea or a requirement to display only a certain number of rows from a query. This can help improve performance when working with large tables containing millions of rows of data.

For instance, you may be creating a search application but just want to display a portion of the results. If your search query returns hundreds of rows but you just want to display ten at a time, you would use limit clauses to narrow down your set of results.

Another example would be if you wanted to show the top 10 or 20 highest-paid people from the employee table when your result set returned a thousand rows. The following clauses can be used with the Select statement to limit the number of rows.

FETCH FIRST or FETCH NEXT:

  • Specify the NEXT or FIRST row or rows to be returned.
  • Specify the number of rows to be returned.

The integer literal value must be 1 or greater for the fetch first clause. If the literal is missing, the value defaults to 1. This is the base syntax:

SELECT ColumnName1, ColumnName2, ColumnNameN 
FROM Tablename 
ORDER BY ColumnName1 FETCH FIRST/NEXT number ROWS/ROWS ONLY;

The ORDER BY clause is not required here, but it may be helpful if you want to control the order in which the data is returned. This is especially important when retrieving large datasets.

The following SELECT statement returns the first row from the query of the employee table, sorted by the EmployeeId column.

SELECT EmployeeId, Name, LastName, FirstName  
FROM Employee 
ORDER BY EmployeeId FETCH FIRST 1 ROW ONLY;

The result is the one row:

Next, the following SELECT statement returns the first 3 rows from the employee table, sorted by the EmployeeId columns:

SELECT EmployeeId, Name, LastName, FirstName  
FROM Employee 
ORDER BY EmployeeId FETCH FIRST 3 ROWS ONLY;

Which returns:

Table Description automatically generated

In the next example, the SELECT statement returns the next 2 rows and skips the first-row employee table, sorted by the EmployeeId columns.

SELECT EmployeeId, Name, LastName, FirstName, Department, Dob, Salary  
FROM Employee 
ORDER BY EmployeeId FETCH NEXT 2 ROWS ONLY;

In the result, you can see that this is still just the first 2 rows. The value of this will become more obvious in the next sections.

Graphical user interface, application Description automatically generated

OFFSET:

The OFFSET clause allows you to skip the first N rows in a result set before returning any rows.

The value of the integer literal (or the dynamic parameter?) for the result offset clause must be either 0 (default if no clause is specified) or positive. If the number of rows is more than the number of rows in the main result set, no rows are returned.

SELECT ColumnName1, ColumnName2, ColumnName N 
FROM Tablename 
ORDER BY ColumnName1 OFFSET number ROWS/ROWS;

The following SELECT statement skips the first two rows and returns the remaining rows sorted by the EmployeeId columns.

SELECT EmployeeId, Name, LastName, FirstName, 
       Department, Dob, Salary 
FROM Employee 
ORDER BY EmployeeId OFFSET  2 ROWS;

Now you can see in the following rows that 1001 and 1002 are not in the output, since they were the first two rows based on the sort order:

Graphical user interface, application, table Description automatically generated

Using OFFSET together with NEXT/FIRST

The OFFSET clause may be used in combination with the NEXT or FIRST clauses to restrict the number of rows returned in the result set.

For example, the following SQL skips the first employee, employees are ordered by lowest salary to highest salary, and returns the next two employees.

SELECT EmployeeId, Name, LastName, FirstName, Department, 
        Dob, Salary  
FROM Employee 
ORDER BY SALARY OFFSET  1 Rows FETCH NEXT 2 ROWS ONLY ;

Result:

A picture containing application Description automatically generated

Eliminate duplicates using DISTINCT:

The SELECT command retrieves data from table columns. But what if we only want unique data? A column in a database frequently has many duplicate values, thus there are instances when you just want to show the unique values.

For example, a business user may require a list of the unique countries of birth of all employees of a company to assess the organization’s diversity. Because many employees may have the same birth country, we may prefer a separate count of employees by country of birth rather than a total count of employees. In this case, we can use DISTINCT keyword with a SELECT statement.

The DISTINCT keyword specifies that only distinct (unique) data should be returned. Using DISTINCT will remove duplicates from the results table.

SELECT DISTINCT  ColumnName 
FROM Tablename;

The following SELECT statement retrieves unique Department values from the Employee table. There are several employees with the same department, and you as a business user may require a distinct list of departments where employees work.

SELECT DISTINCT Department 
FROM Employee;

Graphical user interface, text, application Description automatically generated

While it is beyond this introduction to the SELECT statement topic, beware of using DISTINCT to cover duplicated data that you don’t understand. When querying more than one table it often is used to hide issues that are either a problem for the performance or correctness of a query.

Grouping Data for Aggregation

The GROUP BY clause in SQL is used to group rows of data in a result set based on the values in one or more columns. The purpose of using the GROUP BY clause is to aggregate data and obtain summary information.

SELECT fieldname1, fieldname2, ... fieldname_n,   
       aggregate_function (aggregate_fieldname)  
FROM tables  
WHERE conditions  
GROUP BY fieldname1, fieldname2, ... fieldname_n;

The aggregate function (aggregate fieldname) represents an the aggregate function that you want to apply to the data, such as SUM, AVG, MIN, MAX, or COUNT. This function will operate on the values in the aggregate fieldname column.

The data will be formed into groups based on the columns in the GROUP BY clause. The values that are not in the GROUP BY clause then are collapsed into scalar values by the aggregate function. Rows with identical values in the grouping columns are grouped. In the query result, each group is interpreted as a single row and aggregate functions (like SUM, MIN, MAX) can be applied to the results For example, if the grouping had three rows with the same value, you could do COUNT(*) as an aggregate function and it would return 3.

GROUP BY is an optional part of a query. It goes after the WHERE clause or the FROM clause if the WHERE clause is not needed.

For instance, you want to know the total salary of a department. You should use the SUM aggregate function and group the data by departments.

SELECT Department ,Sum(Salary) FROM Employee  GROUP BY Department;

In our data set, this will return:

Table Description automatically generated

Filtering Groups of Data

The HAVING clause in SQL is used in conjunction with the GROUP BY clause to filter groups based on a specific condition. The HAVING clause operates on the results of the aggregate function, whereas the WHERE” clause operates on individual rows.

SELECT fieldname1, fieldname2, ... fieldname_n,   
       aggregate_function (aggregate_fieldname)  
FROM tables  
WHERE conditions  
GROUP BY fieldname1, fieldname2, ... fieldname_n 
HAVING aggregate_condition;

The aggregate condition is the condition that you want to apply to the groups of data, using the results of the aggregate function. For example, you could specify HAVING AVG(fieldname2) > value to only show the groups where the average of fieldname2 is greater than a certain value.

For an example, the following query returns the average salary for all departments that have an average over 61000 using the HAVING clause.

SELECT Department, AVG(salary)
FROM Employee
GROUP BY Department HAVING AVG(Salary) > 61000;

The result is:

Table Description automatically generated with medium confidence

Summary

In this introduction to the SELECT statement in Oracle, we have looked at how to query a single table, return only certain columns, put that data into a specified order, filter it down to a set of (if desired, distinct) rows, and then group that data and do aggregates.

It is a lot in one introduction, but the SELECT statement is the one statement that pretty much every Oracle user of any level will use. In the future, I will look at querying multiple tables at the same time, as well as how to create and remove data from a database.

The post The SELECT Statement in Oracle appeared first on Simple Talk.



from Simple Talk https://ift.tt/5nhEOdA
via

No comments:

Post a Comment