Back in the good old days, when I had more time and ambition, I really wanted to make a “generic oData feed”… built into Excel. The idea was that since Power Pivot could read oData… if I could make it super easy to convert any data source into an OData feed, then we could easily consume almost any data source. For some reason I was always excited to feed the results of a PowerShell script into PowerPivot… probably because PowerShell is basically my favorite thing evar.
Well, with the inclusion of R Script as a data source in Power BI… my wish has come true for “I can consume almost anything in Power BI, including PowerShell!” and it required no effort on my part. #winning!
Step 1 – Prepare R Environment
I already forgot how I did this. I mean… I installed R, and I probably installed RStudio, and then… did a funny dance or something? Maybe I had to set options in Power BI desktop so it knew where my R install was? Heck, I dunno.
Step 2 – Prepare an R Script
The extent of my R knowledge is writing a blog post last week on custom visuals… after reading the R For Dummies book. So, I can only (barely) string together a few simple commands to get what I want.
extensions <- system("powershell dir c:\\windows\\system32 | select Length, Extension", intern=TRUE)
valid <- grep("([[:digit:]]+)([[:space:]]+)([^[:space:]]*)", extensions, value = TRUE)
clean <- sub("([[:digit:]]+)([[:space:]]+)([^[:space:]]+).*", "\\1 \\3", valid)
final <- strsplit(clean, " ")
frame <- as.data.frame(matrix(unlist(final), ncol = 2, byrow=TRUE))
names(frame) <- c('length', 'extension')
I wrote and debugged this in RStudio (later I will just copy/paste it into Power BI). Line 1 is where I tell R to call PowerShell to execute something stupid as a proof of concept. The intern=TRUE is required for the standard output to be returned, so that we can stuff it into a variable.
What comes back is an array or list or vector or something… I dunno, R data types are confusing, but it is just a collection of strings, each full line of PowerShell output is 1 string. So, I ran it through grep/sub/strsplit (like a crazy person that doesn’t know R), using regular expressions to split the text output into something I can get into a data frame.
And this is important.
The way Power BI knows what to bring in from R… is through data frames. So, that last line that builds a data frame with the horrible name of “frame”, that is required. No, not the name. It should be better.
Step 3 – Run R Script in Power BI
Under “Get Data” in the “Other” section, you will find R script. When you select that it will bring up a cute dialog box.
After you paste in your script, and click OK… you will then be presented with a Navigator:Here you will see all your data frames… like my perfectly name one called “frame”… and a preview of the data. I then suggest you click “Edit” to bring up “The Feature That Is Not Called Power Query” and make sure you are happy with your data types and such (my length column came across as Text, so I flipped it to whole number).
Step 4 – Dance
Cuz… That’s It! Your data is in… go do your thing. Now, obviously this is a terrible example… but *I* still think it’s awesome 🙂
- It gives us a way to bring in many data sources not natively supported by Power BI. My example here was PowerShell… but it could have been Survey Monkey or Zen Desk or a rich powershell script that leverages .NET… or… just about anything
- It gives us a new and repeatable way to do ETL. Currently we can share around M script, but I suspect there are a lot more R scripts running around for interesting data sources – and I suspect some types of cleaning are just easier in R, compared to M.
- It lets you leverage the bazillion packages in R for doing interesting statistics, machine learning, etc.
- It lets you put R on your resume. Kidding 🙂