6 Querying a Database

Contents

  1. Querying a Database in Julia
  2. Querying a Database in SAS
  3. Querying a Database in R
  4. Querying a Database in Python

Once you are connected to a database, you can query the tables that are there or the ones you have written. To review writing tables, see the previous chapter on writing to a database.

In the code blocks below, we assume you’ve already connected to your database and have named the connection as shown in the third chapter on connecting to a database.

In this chapter, we discuss the logistics of passing a query onto the database and reading the output using each language, rather than the basics of how to use SQL A brief introduction to SQL can be found in the background chapter. If you’re new to SQL and want more resources, here are some references we recommend:

6.1 Julia

To query a database in Julia, we use the ODBC.query function. This takes in the connection dsn and a raw string for the query. This can be a one-line string, but we recommend using tripple quotes to pass in a multi-line string so you can use pretty SQL syntax.

res = ODBC.query(dsn,"""
  select cyl, drat 
    from mtcars 
    where disp > 200
""") 

This can also be done LINQ style, using ODBC.Query, with a capital Q.

res=@from i in ODBC.Query(dsn,"select * from mtcars") begin 
    @where i.disp > 200
    @select {Cylinder=i.cyl, Drat=i.drat}
    @collect table 
end 

dyplr/tidyverse style

res = ODBC.query(dsn,"select * from mtcars") |> 
@filter( _.disp > 200) |>
@select( :cyl, :drat) |>
DataFrame

6.2 SAS

In SAS, we query using proc sql. The following code block references the connection libname created in [the connections chapter][#connect_sas].

proc sql;
  create table mtcars as
  select * from conn2.mtcars ;
  quit;
  
libname conn2 clear;

In this example, we don’t use the libname. This works just the same, but the syntax is a bit more convoluted.

proc sql;
  connect to odbc as conn
  required="Driver={ODBC Driver 17 for SQL Server};Address=24.205.251.117;Database=NickDb;UID=&id;PWD=&pass";

  create table event as
  select * from connection to conn
  (select * from events) ;

  disconnect from conn;
  
quit;

6.3 R

In R, you can choose to read a full table or query a subset with a sql query. Both options will return a data.frame.

Read full table

data <- dbReadTable(conn, name = "iris")
head(data)

Query a table

species <- dbGetQuery(
  conn, 
  'select distinct Species from iris'
)
species

6.4 Python

6.4.1 Standard Python

In standard python, we pass a select statement to cursor.execute. To fetch the results as a list of rows, we use cursor.fetchall. Each row in this output is of the type pyodbc.Row, and with this data type we can access values by the column names of the table using dot notation.

cursor.execute("select * from iris")
iris_out = cursor.fetchall()

print(type(iris_out[1]))

first_sepal_length = iris_out[1].SepalLength

6.4.2 Pandas

With pandas, we can use the read_sql function to query tables. This function returns the query results as a pandas DataFrame, which is frequently a lot easier to work with than the pyodbc.Row data type mentioned above. We recommend using pandas for this, and will continue to use pandas for the following chapters.

import pandas as pd

iris_df_out = pd.read_sql("select * from iris", conn)