Thank you Sponsors
Solving Common DBA Problems With R
Steve Williams
R Overview
General Overview
- Programming language that is an evolution of S
- Strengths are in statistics, anomaly detection, graphics, and data wrangling
- Extended easily using packages
- Integrated with the database engine in SQL 2016
- Power BI can run R Scripts for visuals or data mod
Advantages of R
- Syntax is faster to write than T-SQL
- Faster performance for some things
- Whatever you are trying to do, someone can help
- Many great packages to use
  - My favorites are: httr, tidyr, dplyr, readr, data.table, purrr, tibble, tidyjson, roxygen, shiny, lubridate, jsonlite, RODBC, RevoScaleR, stringi, stringr, magrittr, digest, devtools
- Run R code inside of SQL Server stored procs
Easy things to do in R
- Convert to and from JSON with very little syntax
- Manipulate strings quickly and easily
- Transpose/Pivot
- Iterate over columns and rows
- Import files, export files
- Do things you would normally turn to SSIS for
- Read/write database
- Run your computer out of RAM
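A few of the items above (pivot, iterating over columns and rows, string manipulation) can be sketched in base R with no extra packages. This is only an illustration: the `sizes` data, the server names, and the `.example.local` suffix are made up for the example and are not from the demo.

```r
# Hypothetical sample data: one row per server per month (illustrative names only).
sizes <- data.frame(
  server = c("sql01", "sql01", "sql02", "sql02"),
  month  = c("Jan", "Feb", "Jan", "Feb"),
  gb     = c(120, 135, 80, 95),
  stringsAsFactors = FALSE
)

# Pivot from long to wide: one row per server, one column per month.
# reshape() names the new columns gb.Jan and gb.Feb.
wide <- reshape(sizes, idvar = "server", timevar = "month", direction = "wide")

# Iterate over columns: collect the class of every column.
col_classes <- sapply(wide, class)

# Iterate over rows: growth from Jan to Feb for each server.
growth <- mapply(function(jan, feb) feb - jan, wide$gb.Jan, wide$gb.Feb)
names(growth) <- wide$server

# String manipulation: upper-case the server names and append a made-up domain suffix.
fqdn <- paste0(toupper(wide$server), ".example.local")

print(wide)
print(growth)
```

The same steps are usually shorter with tidyr, dplyr, and stringr from the package list, but the base R version runs on any install.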
Intermediate/Advanced things to do in R
- Call an API and exchange data
- Write your own package
- Upload/Download documents with DocDb
- Execute JavaScript using the V8 package
- Run advanced scripts from Power BI
- Create a website (seriously! Search for “Shiny”)
- Join SQL Server, MySQL, and DocDb data
- Integrate Azure Key Vault with your systems
- Feed a data warehouse
Demo Time
My rules for creating Stored Procs with R
- You must return 1 data.frame from R code
  - Exactly one data.frame
  - Other data.frames can be written directly from R to the database instead of being returned as a result set to the proc
  - Other data can be returned to the stored proc, but not another data.frame
- Your output must be named “OutputDataSet”, or you must change the expected name
- Put as little code as possible in the proc. Keep all of the coding work in RStudio or RTVS
- You must install all necessary packages in the system library. I highly recommend eliminating personal libraries on production servers.
- All stored procs are executed as one of the 20 logins (by default) that are created during install. They are all members of the SQLRUserGroup. Even if you execute a stored proc as yourself (domain login), it will run under one of these users.
- Common gotcha: when a stored proc runs R code that then attempts to connect to a data source using a trusted connection/integrated security, it will NOT be using your login.
  - This means that accessing things like DSNs or writing to a directory might require that you add appropriate permissions for these users. In this case, write access to one folder for input/output, and read permissions on the ODBC registry hive
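As a rough sketch of the first three rules, the R portion of a proc might look like the following. `InputDataSet` and `OutputDataSet` are the default names used by `sp_execute_external_script`; the `summarise_sizes` helper, the column names, and the sample data are hypothetical and exist only so the script can be tried outside SQL Server.

```r
# Stand-in for the data frame SQL Server passes in. Inside a proc, InputDataSet
# is supplied by sp_execute_external_script and this block does nothing.
if (!exists("InputDataSet")) {
  InputDataSet <- data.frame(
    database_name = c("Sales", "Sales", "HR"),
    size_mb       = c(1024, 2048, 512),
    stringsAsFactors = FALSE
  )
}

# Hypothetical helper. In practice this would live in a package or script
# maintained in RStudio/RTVS, not in the proc text itself.
summarise_sizes <- function(df) {
  totals <- aggregate(size_mb ~ database_name, data = df, FUN = sum)
  totals[order(totals$database_name), ]
}

# The proc body stays tiny: one call, exactly one data.frame, default output name.
OutputDataSet <- summarise_sizes(InputDataSet)
stopifnot(is.data.frame(OutputDataSet))
```

Keeping the logic in a function means it can be developed and tested interactively, and the proc only has to load it and assign the result to `OutputDataSet`.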
Thanks for attending
Q & A