=DBI::dbConnect(RSQLite::SQLite(),'D:/data/GDP/World_Economies.db') conn
Connect to DBs
R & SQLite
This section will cover how to connect to DBs from within RStudio with a combination of R & SQL
Using .sql file
World_Economies
- After you setup the connection you have to reference it and instruct RStudio to use SQL with this
{sql connection = conn}
- as the first line in the SQL code chunk
Preview
SELECT *
FROM Countries_by_GDP
Country | GDP_USD_billions |
---|---|
United States | 26854.60 |
China | 19373.59 |
Japan | 4409.74 |
Germany | 4308.85 |
India | 3736.88 |
United Kingdom | 3158.94 |
France | 2923.49 |
Italy | 2169.74 |
Canada | 2089.67 |
Brazil | 2081.24 |
List Tables
SELECT name FROM sqlite_master WHERE type='table';
name |
---|
Countries_by_GDP |
Table Attributes
PRAGMA table_info([table_name])
INSTRUCTOR
When updating a db make sure you commit() as some require it. In this example the data can only be accessed from the connection that created and edited the table. When I tried to access it from other connections it would not display any information. I went back to the original connection and
conn.commit().
and it worked fine after that.
=DBI::dbConnect(RSQLite::SQLite(),'D:/data/sqlite3/INSTRUCTOR.db') conn2
List Tables
SELECT name FROM sqlite_master WHERE type='table';
name |
---|
Sc_table |
Sc_table2 |
Instructor |
Scores |
Preview
SELECT *
FROM Sc_table
LIMIT 10
country | first_name | last_name | test_score |
---|
STAFF db
=DBI::dbConnect(RSQLite::SQLite(),'D:/data/STAFF.db') conn4
List all Tables
SELECT name FROM sqlite_master WHERE type='table';
name |
---|
INSTRUCTOR |
Preview
SELECT *
FROM INSTRUCTOR
LIMIT 10
ID | FNAME | LNAME | CITY | CCODE |
---|---|---|---|---|
1 | Rav | Ahuja | TORONTO | CA |
2 | Raul | Chong | Markham | CA |
3 | Hima | Vasudevan | Chicago | US |
4 | John | Thomas | Illinois | US |
5 | Alice | James | Illinois | US |
6 | Steve | Wells | Illinois | US |
7 | Santosh | Kumar | Illinois | US |
8 | Ahmed | Hussain | Illinois | US |
9 | Nancy | Allen | Illinois | US |
10 | Mary | Thomas | Illinois | US |
R & SQLite
Using .r file
library(RSQLite)
# create connection
<- dbConnect(RSQLite::SQLite(),'D:/data/sqlite3/INSTRUCTOR.db' )
conn5
# List Tables
dbListTables(conn5)
[1] "Instructor" "Sc_table" "Sc_table2" "Scores"
<- dbGetQuery(conn5, "SELECT * FROM Instructor")
preview preview
ID FNAME LNAME CITY CCODE
1 1 Rav Ahuja MOOSETOWN CA
2 2 Raul Chong Markham CA
3 3 Hima Vasudevan Chicago US
Python & SQLite
Refer to Data/SQLite3 page for more details. Here is a summary table: # Python & SQL Summary
SQLite
Topic | Syntax | Description |
---|---|---|
connect() | sqlite3.connect() |
Create a new database and open a database connection to allow sqlite3 to work with it. Call sqlite3.connect() to create a connection to the database INSTRUCTOR.db in the current working directory, implicitly creating it if it does not exist. |
cursor() | con.cursor() |
To execute SQL statements and fetch results from SQL queries, use a database cursor. Call con.cursor() to create the Cursor. |
execute() | cursor_obj.execute() |
The execute method in Python’s SQLite library allows to perform SQL commands, including retrieving data from a table using a query like “Select * from table_name.” When you execute this command, the result is obtained as a collection of table data stored in an object, typically in the form of a list of lists. |
fetchall() | cursor_obj.fetchall() |
The fetchall() method in Python retrieves all the rows from the result set of a query and presents them as a list of tuples. |
fetchmany() | cursor_obj.fetchmany() |
The fetchmany() method retrieves the subsequent group of rows from the result set of a query rather than just a single row. To fetch a few rows from the table, use fetchmany(numberofrows) and mention how many rows you want to fetch. |
read_sql_query() | read_sql_query() |
read_sql_query() is a function provided by the Pandas library in Python, and it is not specific to MySQL. It is a generic function used for executing SQL queries on various database systems, including MySQL, and retrieving the results as a Pandas DataFrame. |
shape | dataframe.shape |
It provides a tuple indicating the shape of a DataFrame or Series, represented as (number of rows, number of columns). |
close() | con.close() |
con.close() is a method used to close the connection to a MySQL database. When called, it terminates the connection, releasing any associated resources and ensuring the connection is no longer active. This is important for managing database connections efficiently and preventing resource leaks in your MySQL database interactions. |
CREATE TABLE | CREATE TABLE table_name ( column1 datatype constraints, column2 datatype constraints, ... ); |
The CREATE TABLE statement is used to define and create a new table within a database. It specifies the table’s name, the structure of its columns (including data types and constraints), and any additional properties such as indexes. This statement essentially sets up the blueprint for organizing and storing data in a structured format within the database. |
barplot() | seaborn.barplot(x="x-axis_variable", y="y-axis_variable", data=data) |
seaborn.barplot() is a function in the Seaborn Python data visualization library used to create a bar plot, also known as a bar chart. It is particularly used to display the relationship between a categorical variable and a numeric variable by showing the average value for each category. |
read_csv() | df = pd.read_csv('file_path.csv') |
read_csv() is a function in Python’s Pandas library used for reading data from a Comma-Separated Values (CSV) file and loading it into a Pandas DataFrame. It’s a common method for working with tabular data stored in CSV format |
to_sql() | df.to_sql('table_name', index=False) |
df.to_sql() is a method in Pandas, a Python data manipulation library used to write the contents of a DataFrame to a SQL database. It allows to take data from a DataFrame and store it structurally within a SQL database table. |
read_sql() | df = pd.read_sql(sql_query, conn) |
read_sql() is a function provided by the Pandas library in Python for executing SQL queries and retrieving the results into a DataFrame from an SQL database. It’s a convenient way to integrate SQL database interactions into your data analysis workflows. |
Db2
Topic | Syntax | Description | Example |
---|---|---|---|
connect() | conn = ibm_db.connect('DATABASE=dbname; HOST=hostname;PORT=port;UID=username; PWD=password;', '', '') |
ibm_db.connect() is a Python function provided by the ibm_db library, which is used for establishing a connection to an IBM Db2 or IBM Db2 Warehouse database. It’s commonly used in applications that need to interact with IBM Db2 databases from Python. |
|
server_info() | ibm_db.server_info() |
ibm_db.server_info(conn) is a Python function provided by the ibm_db library, which is used to retrieve information about the IBM Db2 server to which you are connected. |
|
close() | con.close() |
con.close() is a method used to close the connection to a db2 database. When called, it terminates the connection, releasing any associated resources and ensuring the connection is no longer active. This is important for managing database connections efficiently and preventing resource leaks in your db2 database interactions. |
|
exec_immediate() | sql_statement = "SQL statement goes here"stmt = ibm_db.exec_immediate(conn, sql_statement) |
ibm_db.exec_immediate() is a Python function provided by the ibm_db library, which is used to execute an SQL statement immediately without the need to prepare or bind it. It’s commonly used for executing SQL statements that don’t require input parameters or don’t need to be prepared in advance. |
Python & DB2
Note: cannot use Jupyter Notebook from outside Db2 cloud with the instructions detailed below. To access IBM Db2 from outside need to load different packages from the ones specified below
In order to get the connection credentials:
- Go to cloud.ibm.com/resources
- Find Databases dropdown on the resources table
- Choose Db2-xx
- Service credentials form the left menu
- The first time this page will be empty
- Click > New Credentials
- Next page will have Name: … - Leave it
- Role: Manager - Leave it
- Click > Add
- Now you’ll see the credential listed
- Check the box on the left and
- Click on Down Arrow to view the credential details
So as you see above we’ve already created a Db2 instance and we’ve retrieved our credentials in Data/Connect Script page.
Import ibm_db Library
The ibm_db
API provides a variety of useful Python functions for accessing and manipulating data in an IBM® data server database, including functions for connecting to a database, preparing and issuing SQL statements, fetching rows from result sets, calling stored procedures, committing and rolling back transactions, handling errors, and retrieving metadata.
library(reticulate)
#py_install("ibm_db")
import os
r'c:\users\emhrc\onedrive\docume~1\virtua~1\r-reti~1\lib\site-packages')
os.add_dll_directory(
import ibm_db
Define Credentials
= "6667d8e9-9d4d-4ccb-ba32-21da3bb5aafc.c1ogj3sd0tgtu0lqde00.databases.appdomain.cloud"
dsn_hostname = "nzx60439"
dsn_uid = "XqJtPBH1PGw3rnjF"
dsn_pwd
= "{IBM DB2 ODBC DRIVER}"
dsn_driver = "bludb"
dsn_database = "30376"
dsn_port = "TCPIP"
dsn_protocol = "SSL" dsn_security
Create Connection
Ibm_db API uses the IBM Data Server Driver for ODBC and CLI APIs to connect to IBM DB2 and Informix.
- Lets build the dsn connection string using the credentials entered above
#Create the dsn connection string
= (
dsn "DRIVER={0};"
"DATABASE={1};"
"HOSTNAME={2};"
"PORT={3};"
"PROTOCOL={4};"
"UID={5};"
"PWD={6};"
"SECURITY={7};").format(dsn_driver, dsn_database, dsn_hostname, dsn_port, dsn_protocol, dsn_uid, dsn_pwd,dsn_security)
#print the connection string to check correct values are specified
print(dsn)
Establish Connection
try:
= ibm_db.connect(dsn, "", "")
conn print ("Connected to database: ", dsn_database, "as user: ", dsn_uid, "on host: ", dsn_hostname)
except:
print ("Unable to connect: ", ibm_db.conn_errormsg() )
Close Connection
ibm_db.close(conn)
# OUTPUT
TRUE
R Studio & SQLite3
Here is a way to connect to a local SQLite3 db using RStudio
- File > New File > SQL Script
- The new file will open this this line on top
-- !preview conn=DBI::dbConnect(RSQLite::SQLite(),'D:/data/sqlite3/INSTRUCTOR.db')
- Obviously as you see above it setup a connection to the db already
List Tables in db
SELECT name
FROM sqlite_master='table' WHERE type
Preview the Results
- As you save the code, you’ll see in the lower pane (next to console, terminal tabs) a SQL Results tab
- Open that tab and you’ll see the results of the SQL Query there, in this case the list of tables in the db
- If you wanted to run it here in quarto make sure you supply the conn to the sql code chunk
- Add
#| connection = conn
# This is R code
=DBI::dbConnect(RSQLite::SQLite(),'D:/data/sqlite3/INSTRUCTOR.db') conn
/* To supply a connection access to the sql chunk put this as first line: #| connection = conn */
SELECT name
FROM sqlite_master
WHERE type='table'
name |
---|
Sc_table |
Sc_table2 |
Instructor |
Scores |
Close Connection
- Make sure to always close connection to db
::dbDisconnect(conn) DBI