Connect to DBs

R & SQLite

This section will cover how to connect to DBs from within RStudio with a combination of R & SQL

Using .sql file

World_Economies

conn=DBI::dbConnect(RSQLite::SQLite(),'D:/data/GDP/World_Economies.db')

After you setup the connection you have to reference it and instruct RStudio to use SQL with this
{sql connection = conn}
as the first line in the SQL code chunk

Preview

SELECT  *
FROM    Countries_by_GDP

Displaying records 1 - 10
Country	GDP_USD_billions
United States	26854.60
China	19373.59
Japan	4409.74
Germany	4308.85
India	3736.88
United Kingdom	3158.94
France	2923.49
Italy	2169.74
Canada	2089.67
Brazil	2081.24

List Tables

SELECT name FROM sqlite_master WHERE type='table';

1 records
name
Countries_by_GDP

Table Attributes

PRAGMA table_info([table_name])

INSTRUCTOR

When updating a db make sure you commit() as some require it. In this example the data can only be accessed from the connection that created and edited the table. When I tried to access it from other connections it would not display any information. I went back to the original connection and conn.commit(). and it worked fine after that.

conn2=DBI::dbConnect(RSQLite::SQLite(),'D:/data/sqlite3/INSTRUCTOR.db')

List Tables

SELECT name FROM sqlite_master WHERE type='table';

4 records
name
Sc_table
Sc_table2
Instructor
Scores

Preview

SELECT   *
FROM     Sc_table
LIMIT    10

0 records
country	first_name	last_name	test_score

STAFF db

conn4=DBI::dbConnect(RSQLite::SQLite(),'D:/data/STAFF.db')

List all Tables

SELECT name FROM sqlite_master WHERE type='table';

1 records
name
INSTRUCTOR

Preview

SELECT *
FROM   INSTRUCTOR
LIMIT  10

Displaying records 1 - 10
ID	FNAME	LNAME	CITY	CCODE
1	Rav	Ahuja	TORONTO	CA
2	Raul	Chong	Markham	CA
3	Hima	Vasudevan	Chicago	US
4	John	Thomas	Illinois	US
5	Alice	James	Illinois	US
6	Steve	Wells	Illinois	US
7	Santosh	Kumar	Illinois	US
8	Ahmed	Hussain	Illinois	US
9	Nancy	Allen	Illinois	US
10	Mary	Thomas	Illinois	US

R & SQLite

Using .r file

library(RSQLite)

# create connection
conn5 <- dbConnect(RSQLite::SQLite(),'D:/data/sqlite3/INSTRUCTOR.db' )

# List Tables
dbListTables(conn5)

[1] "Instructor" "Sc_table"   "Sc_table2"  "Scores"

preview <- dbGetQuery(conn5, "SELECT * FROM Instructor")
preview

  ID FNAME     LNAME      CITY CCODE
1  1   Rav     Ahuja MOOSETOWN    CA
2  2  Raul     Chong   Markham    CA
3  3  Hima Vasudevan   Chicago    US

Python & SQLite

Refer to Data/SQLite3 page for more details. Here is a summary table: # Python & SQL Summary

SQLite

Topic	Syntax	Description
connect()	`sqlite3.connect()`	Create a new database and open a database connection to allow sqlite3 to work with it. Call `sqlite3.connect()` to create a connection to the database INSTRUCTOR.db in the current working directory, implicitly creating it if it does not exist.
cursor()	`con.cursor()`	To execute SQL statements and fetch results from SQL queries, use a database cursor. Call `con.cursor()` to create the Cursor.
execute()	`cursor_obj.execute()`	The `execute` method in Python’s SQLite library allows to perform SQL commands, including retrieving data from a table using a query like “Select * from table_name.” When you execute this command, the result is obtained as a collection of table data stored in an object, typically in the form of a list of lists.
fetchall()	`cursor_obj.fetchall()`	The `fetchall()` method in Python retrieves all the rows from the result set of a query and presents them as a list of tuples.
fetchmany()	`cursor_obj.fetchmany()`	The `fetchmany()` method retrieves the subsequent group of rows from the result set of a query rather than just a single row. To fetch a few rows from the table, use fetchmany(numberofrows) and mention how many rows you want to fetch.
read_sql_query()	`read_sql_query()`	`read_sql_query()` is a function provided by the Pandas library in Python, and it is not specific to MySQL. It is a generic function used for executing SQL queries on various database systems, including MySQL, and retrieving the results as a Pandas DataFrame.
shape	`dataframe.shape`	It provides a tuple indicating the shape of a DataFrame or Series, represented as (number of rows, number of columns).
close()	`con.close()`	`con.close()` is a method used to close the connection to a MySQL database. When called, it terminates the connection, releasing any associated resources and ensuring the connection is no longer active. This is important for managing database connections efficiently and preventing resource leaks in your MySQL database interactions.
CREATE TABLE	`CREATE TABLE table_name ( column1 datatype constraints, column2 datatype constraints, ... );`	The `CREATE TABLE` statement is used to define and create a new table within a database. It specifies the table’s name, the structure of its columns (including data types and constraints), and any additional properties such as indexes. This statement essentially sets up the blueprint for organizing and storing data in a structured format within the database.
barplot()	`seaborn.barplot(x="x-axis_variable", y="y-axis_variable", data=data)`	`seaborn.barplot()` is a function in the Seaborn Python data visualization library used to create a bar plot, also known as a bar chart. It is particularly used to display the relationship between a categorical variable and a numeric variable by showing the average value for each category.
read_csv()	`df = pd.read_csv('file_path.csv')`	`read_csv()` is a function in Python’s Pandas library used for reading data from a Comma-Separated Values (CSV) file and loading it into a Pandas DataFrame. It’s a common method for working with tabular data stored in CSV format
to_sql()	`df.to_sql('table_name', index=False)`	`df.to_sql()` is a method in Pandas, a Python data manipulation library used to write the contents of a DataFrame to a SQL database. It allows to take data from a DataFrame and store it structurally within a SQL database table.
read_sql()	`df = pd.read_sql(sql_query, conn)`	`read_sql()` is a function provided by the Pandas library in Python for executing SQL queries and retrieving the results into a DataFrame from an SQL database. It’s a convenient way to integrate SQL database interactions into your data analysis workflows.

Db2

Topic	Syntax	Description
connect()	`conn = ibm_db.connect('DATABASE=dbname; HOST=hostname;PORT=port;UID=username; PWD=password;', '', '')`	`ibm_db.connect()` is a Python function provided by the ibm_db library, which is used for establishing a connection to an IBM Db2 or IBM Db2 Warehouse database. It’s commonly used in applications that need to interact with IBM Db2 databases from Python.
server_info()	`ibm_db.server_info()`	`ibm_db.server_info(conn)` is a Python function provided by the `ibm_db` library, which is used to retrieve information about the IBM Db2 server to which you are connected.
close()	`con.close()`	`con.close()` is a method used to close the connection to a db2 database. When called, it terminates the connection, releasing any associated resources and ensuring the connection is no longer active. This is important for managing database connections efficiently and preventing resource leaks in your db2 database interactions.
exec_immediate()	`sql_statement = "SQL statement goes here"stmt = ibm_db.exec_immediate(conn, sql_statement)`	`ibm_db.exec_immediate()` is a Python function provided by the ibm_db library, which is used to execute an SQL statement immediately without the need to prepare or bind it. It’s commonly used for executing SQL statements that don’t require input parameters or don’t need to be prepared in advance.

Python & DB2

Note: cannot use Jupyter Notebook from outside Db2 cloud with the instructions detailed below. To access IBM Db2 from outside need to load different packages from the ones specified below

In order to get the connection credentials:

Go to cloud.ibm.com/resources
Find Databases dropdown on the resources table
Choose Db2-xx
Service credentials form the left menu
The first time this page will be empty
Click > New Credentials
Next page will have Name: … - Leave it
Role: Manager - Leave it
Click > Add
Now you’ll see the credential listed
Check the box on the left and
Click on Down Arrow to view the credential details

So as you see above we’ve already created a Db2 instance and we’ve retrieved our credentials in Data/Connect Script page.

Import ibm_db Library

The ibm_db API provides a variety of useful Python functions for accessing and manipulating data in an IBM® data server database, including functions for connecting to a database, preparing and issuing SQL statements, fetching rows from result sets, calling stored procedures, committing and rolling back transactions, handling errors, and retrieving metadata.

library(reticulate)
#py_install("ibm_db")

import os
os.add_dll_directory(r'c:\users\emhrc\onedrive\docume~1\virtua~1\r-reti~1\lib\site-packages')

import ibm_db

Define Credentials

dsn_hostname = "6667d8e9-9d4d-4ccb-ba32-21da3bb5aafc.c1ogj3sd0tgtu0lqde00.databases.appdomain.cloud" 
dsn_uid = "nzx60439"        
dsn_pwd = "XqJtPBH1PGw3rnjF"     

dsn_driver = "{IBM DB2 ODBC DRIVER}"
dsn_database = "bludb"            
dsn_port = "30376"                 
dsn_protocol = "TCPIP"            
dsn_security = "SSL"

Create Connection

Ibm_db API uses the IBM Data Server Driver for ODBC and CLI APIs to connect to IBM DB2 and Informix.

Lets build the dsn connection string using the credentials entered above

#Create the dsn connection string
dsn = (
    "DRIVER={0};"
    "DATABASE={1};"
    "HOSTNAME={2};"
    "PORT={3};"
    "PROTOCOL={4};"
    "UID={5};"
    "PWD={6};"
    "SECURITY={7};").format(dsn_driver, dsn_database, dsn_hostname, dsn_port, dsn_protocol, dsn_uid, dsn_pwd,dsn_security)

#print the connection string to check correct values are specified
print(dsn)

Establish Connection

try:
    conn = ibm_db.connect(dsn, "", "")
    print ("Connected to database: ", dsn_database, "as user: ", dsn_uid, "on host: ", dsn_hostname)

except:
    print ("Unable to connect: ", ibm_db.conn_errormsg() )

Close Connection

ibm_db.close(conn)

# OUTPUT
TRUE

R Studio & SQLite3

Here is a way to connect to a local SQLite3 db using RStudio

File > New File > SQL Script
The new file will open this this line on top

-- !preview conn=DBI::dbConnect(RSQLite::SQLite(),'D:/data/sqlite3/INSTRUCTOR.db')

Obviously as you see above it setup a connection to the db already

List Tables in db

SELECT name
FROM   sqlite_master
WHERE  type='table'

Preview the Results

As you save the code, you’ll see in the lower pane (next to console, terminal tabs) a SQL Results tab
Open that tab and you’ll see the results of the SQL Query there, in this case the list of tables in the db
If you wanted to run it here in quarto make sure you supply the conn to the sql code chunk
Add #| connection = conn

# This is R code
conn=DBI::dbConnect(RSQLite::SQLite(),'D:/data/sqlite3/INSTRUCTOR.db')

/* To supply a connection access to the sql chunk put this as first line: #| connection = conn */
SELECT name
FROM   sqlite_master
WHERE  type='table'

4 records
name
Sc_table
Sc_table2
Instructor
Scores

Close Connection

Make sure to always close connection to db

DBI::dbDisconnect(conn)