How we connected Amazon S3 to Microsoft PowerBI… in 5 minutes…
Last Updated on February 14, 2022 by Editorial Team
Author(s): Ilyas Iyoob, PhD, and Umar Iyoob
PowerBI has quickly become the default dashboarding tool for many business users, while S3 remains the default object store for technical users. Connecting the two sounds like it should be effortless, yet it is surprisingly awkward. Let us spare you the pain and take you right to our 5-minute solution.
Why PowerBI?
Microsoft PowerBI is quickly growing as the dashboarding tool of choice because:
- desktop version is free,
- sharing dashboards only costs $10/user/mo,
- library of (free) crowdsourced visualizations is growing rapidly (such as word clouds and box plots), and
- data querying capabilities are comprehensive enough to replace the need for a separate data engineering tool.
Why S3?
Amazon S3 is the most popular object store for small and medium businesses because:
- uploading files to S3 incurs no data transfer charge,
- storing and retrieving files only costs ~$0.02/GB/mo,
- access management is easy to set up, and
- programmatic access capabilities of S3 make it convenient enough to be embedded within enterprise-ready applications.
How would we connect the two?
While most online solutions suggest setting up a database connection to Redshift or Athena, here is a workaround that is much simpler:
Step 1:
Create an Amazon S3 bucket in your AWS account.
Step 2:
Store your data file in this bucket.
Step 3:
Create access credentials in Amazon IAM to access the data.
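The article does not say which permissions these credentials need. As a rough sketch, assuming the PowerBI script only ever reads objects from a single bucket, a least-privilege policy document could be generated as follows. The helper name `read_only_policy` and the bucket name `my-powerbi-data` are placeholders, not part of the original setup.

```python
import json


def read_only_policy(bucket, prefix=""):
    """Build an IAM policy document that only allows reading objects under bucket/prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # s3:GetObject is the permission needed to download an object.
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            }
        ],
    }


# "my-powerbi-data" is a placeholder bucket name; replace it with your own.
print(json.dumps(read_only_policy("my-powerbi-data"), indent=2))
```

The printed JSON can be attached to the IAM user whose access keys PowerBI will use, so that a leaked key cannot write to or delete from the bucket.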
Step 4:
Open PowerBI Desktop and select Get data > Other > Python script.
Step 5:
Insert the code below, specifying your AWS keys, bucket name and file path.
Note that embedding credentials within code is not ideal (even though this is within PowerBI), so in practice it would be better to save the credentials securely as environment variables.
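A minimal sketch of such a script, assuming the data file is a CSV and that credentials are supplied through environment variables as suggested above. The variable names `PBI_S3_BUCKET` and `PBI_S3_KEY` and the helper `read_object_bytes` are hypothetical; `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the names boto3 itself looks for. Both boto3 and pandas have to be installed (for example with pip) in the Python installation that PowerBI Desktop is configured to use.

```python
import io
import os


def read_object_bytes(s3_client, bucket, key):
    """Return the raw bytes of one S3 object, using any boto3-style client."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return response["Body"].read()


# Hypothetical variable names; set them before launching PowerBI Desktop.
bucket = os.environ.get("PBI_S3_BUCKET")
key = os.environ.get("PBI_S3_KEY")

if bucket and key:
    # boto3 and pandas must be installed in the Python environment PowerBI uses.
    import boto3
    import pandas as pd

    # With no keys passed explicitly, boto3 looks up AWS_ACCESS_KEY_ID and
    # AWS_SECRET_ACCESS_KEY in the environment.
    s3 = boto3.client("s3")

    # Assumption: the object is a CSV file. PowerBI offers every pandas
    # DataFrame defined by the script as a table to load.
    dataset = pd.read_csv(io.BytesIO(read_object_bytes(s3, bucket, key)))
```

For other file types, the `pd.read_csv` call would be swapped for the matching pandas reader, such as `pd.read_excel` or `pd.read_json`.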
Step 6:
Hit OK and watch your data file come through as a data table.
From here on, your PowerBI queries should work as they would with any data table. Enjoy!
How we connected Amazon S3 to Microsoft PowerBI… in 5 minutes… was originally published in Towards AI on Medium.
Published via Towards AI
Nori
Hey, this is a great tutorial, thank you for it. I’ve run into a problem and thought you might have already found a solution for it: I can’t schedule auto-refresh using this method. The Power BI service won’t allow it. Have you come across this problem before? Thanks!
Abirami
Hey, yes, I have the same issue. Is there a fix for it? Is there a way to set up auto-refresh with this approach?
Gaurav Kumar Singh
I tried to connect and it says that the module boto3 was not found.
Please suggest a fix.