6 Querying a Database
Contents
- Querying a Database in Julia
- Querying a Database in SAS
- Querying a Database in R
- Querying a Database in Python
Once you are connected to a database, you can query the tables that are there or the ones you have written. To review writing tables, see the previous chapter on writing to a database.
In the code blocks below, we assume you’ve already connected to your database and have named the connection as shown in the third chapter on connecting to a database.
In this chapter, we discuss the logistics of passing a query onto the database and reading the output using each language, rather than the basics of how to use SQL A brief introduction to SQL can be found in the background chapter. If you’re new to SQL and want more resources, here are some references we recommend:
- SQL Keywords Reference by w3schools: where there are multiple options (i.e. MySQL or SQL Server / Oracle / MS Access) make sure you follow instructions for SQL Server
- SQL Tutorial by w3schools
- SQL Server Basics by sqlservertutorial
6.1 Julia
To query a database in Julia, we use the ODBC.query
function. This takes in the connection dsn
and a raw string for the query. This can be a one-line string, but we recommend using tripple quotes to pass in a multi-line string so you can use pretty SQL syntax.
= ODBC.query(dsn,"""
res select cyl, drat
from mtcars
where disp > 200
""")
This can also be done LINQ style, using ODBC.Query
, with a capital Q.
=@from i in ODBC.Query(dsn,"select * from mtcars") begin
res@where i.disp > 200
@select {Cylinder=i.cyl, Drat=i.drat}
@collect table
end
dyplr/tidyverse style
= ODBC.query(dsn,"select * from mtcars") |>
res @filter( _.disp > 200) |>
@select( :cyl, :drat) |>
DataFrame
6.2 SAS
In SAS, we query using proc sql
. The following code block references the connection libname created in [the connections chapter][#connect_sas].
proc sql;
create table mtcars as
select * from conn2.mtcars ;
quit;
libname conn2 clear;
In this example, we don’t use the libname. This works just the same, but the syntax is a bit more convoluted.
proc sql;
connect to odbc as conn
required="Driver={ODBC Driver 17 for SQL Server};Address=24.205.251.117;Database=NickDb;UID=&id;PWD=&pass";
create table event as
select * from connection to conn
(select * from events) ;
disconnect from conn;
quit;
6.3 R
In R, you can choose to read a full table or query a subset with a sql query. Both options will return a data.frame
.
Read full table
dbReadTable(conn, name = "iris")
data <-head(data)
Query a table
dbGetQuery(
species <-
conn, 'select distinct Species from iris'
) species
6.4 Python
6.4.1 Standard Python
In standard python, we pass a select
statement to cursor.execute
. To fetch the results as a list of rows, we use cursor.fetchall
. Each row in this output is of the type pyodbc.Row
, and with this data type we can access values by the column names of the table using dot notation.
"select * from iris")
cursor.execute(= cursor.fetchall()
iris_out
print(type(iris_out[1]))
= iris_out[1].SepalLength first_sepal_length
6.4.2 Pandas
With pandas, we can use the read_sql
function to query tables. This function returns the query results as a pandas DataFrame, which is frequently a lot easier to work with than the pyodbc.Row
data type mentioned above. We recommend using pandas for this, and will continue to use pandas for the following chapters.
import pandas as pd
= pd.read_sql("select * from iris", conn) iris_df_out