When you use the UI to import files from SharePoint, you’ll end up with the SharePoint.Files function. This function can become very slow when you use it on large SharePoint sites, because it retrieves metadata for ALL files on the site – meaning the root site whose URL you have to enter as the function argument. So I’ve developed a better way to import files from SharePoint.
Alternative
A faster alternative is the function SharePoint.Contents. It reads much less metadata, and that seems to make it faster. But it comes with a different navigation experience: it basically only allows you to select files from one folder at a time.
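For illustration (with a made-up site URL), both source functions take the site URL as their only required argument – the difference lies in how much metadata they pull:

```m
// Retrieves metadata for ALL files on the whole site – can be very slow on large sites
SlowSource = SharePoint.Files("https://contoso.sharepoint.com/sites/MySite", [ApiVersion = 15]),

// Returns only the top-level contents of the site – much less metadata up front
FastSource = SharePoint.Contents("https://contoso.sharepoint.com/sites/MySite", [ApiVersion = 15])
```

Both calls return navigation tables; only the amount of metadata retrieved up front differs.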
Therefore I’ve created two functions that overcome those limitations.
SharePoint.GetAllFilesInFolder and Sharepoint.GetFile
Sharepoint.GetAllFilesInFolder lets you enter a folder path and triggers the import of all files from that folder, including all subfolders. So you don’t have to navigate to each of them individually or create multiple queries for them.
The functions reference a parameter or query named “RootPath”. This needs to exist or be created in your file. Unfortunately it cannot be passed as an ordinary function parameter, as that would cause refresh problems in the service. The root path is basically the name of your SP site – generally everything before the 5th slash (“/”) in the URL. To get the syntax right, you should retrieve the folder name from the details section like so:
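As a sketch with a made-up URL, the RootPath is everything before the 5th slash, which you could derive like this:

```m
// Hypothetical example values
FullPath = "https://contoso.sharepoint.com/sites/MySite/Shared Documents/Reports/Sales.xlsx",
// Text before the 5th "/" (0-based occurrence index 4)
RootPath = Text.BeforeDelimiter(FullPath, "/", 4)
// "https://contoso.sharepoint.com/sites/MySite"
```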
The second function, Sharepoint.GetFile, is convenient if you want to import one specific file: it lets you simply enter the full path to the desired file, with no need for manual navigation. This can also speed up the load process, as manually created navigation steps can slow down the import as well.
Function Code
Get all files from folder
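The published code isn’t reproduced here, so as a rough, illustrative sketch only – step names and details are my assumptions, not the original code – the approach looks something like this (it assumes the “RootPath” query exists in the file):

```m
// Illustrative sketch only – not the original published function.
// Assumes a query or parameter named RootPath exists in the file.
(FullPath as text) =>
let
    // Fast connector: reads only the top-level contents of the site
    StaticRoot = SharePoint.Contents(RootPath, [ApiVersion = 15]),
    // Folder names between the site root and the target folder
    // (the original also URI-unescapes names, e.g. "%20" -> " ")
    SubfoldersList = Text.Split(Text.AfterDelimiter(FullPath, RootPath & "/"), "/"),
    // Navigate down the folder hierarchy one level at a time
    NavigateIn = List.Accumulate(
        SubfoldersList,
        StaticRoot,
        (current, folderName) => current{[Name = folderName]}[Content]
    ),
    // Recursively pull the files out of all subfolders
    GetFiles = (tbl as table) as table =>
        let
            Files = Table.SelectRows(tbl, each [Content] is binary),
            Folders = Table.SelectRows(tbl, each [Content] is table),
            SubFiles = List.Transform(Folders[Content], each @GetFiles(_)),
            Combined = Table.Combine({Files} & SubFiles)
        in
            Combined,
    AllFiles = GetFiles(NavigateIn)
in
    AllFiles
```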
Get single file
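Again only as an illustrative sketch (not the original published code), the single-file variant drills down the same way, with the last path element being the file name:

```m
// Illustrative sketch only – not the original published function.
// Assumes a query or parameter named RootPath exists in the file.
(FullPath as text) =>
let
    StaticRoot = SharePoint.Contents(RootPath, [ApiVersion = 15]),
    // Path elements below the site root, e.g. {"Shared Documents", "Reports", "Sales.xlsx"}
    Elements = Text.Split(Text.AfterDelimiter(FullPath, RootPath & "/"), "/"),
    // One navigation step per element; the final step returns the file binary
    FileBinary = List.Accumulate(
        Elements,
        StaticRoot,
        (current, name) => current{[Name = name]}[Content]
    )
in
    FileBinary
```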
How to use
Considerations
Both functions use the fast SharePoint.Contents function (my queries ran in half the time compared to before) and some dynamic logic to retrieve the data. But as a side effect, the formula firewall seems to kick in sooner. So some query redesign might be necessary if you’re going to use these functions on existing models.
These functions are pretty new and I haven’t tested them extensively yet. The potential performance gain will probably depend strongly on the size of the SharePoint sites you’re running them on.
I’d be curious to hear your experiences with these functions.
Enjoy & stay queryious 😉
“Unfortunately it [the root path] cannot be passed as an ordinary function parameter, as it would cause refresh problems in the service.”
Do you have enough information to explain that? That’s hard to grasp.
Hi Frank,
unfortunately I have zero information about this apart from my own experience when trying to publish (and I’ve tried it a couple of times :))
/Imke
Hallo Imke! Thanks for sharing! Have you had an experience like this?
An error occurred in the ‘’ query. DataSource.Error: Microsoft.Mashup.Engine1.Library.Resources.HttpResource: Request failed:
OData Version: 3 and 4, Error: The remote server returned an error: (404) Not Found. (Not Found)
OData Version: 4, Error: The remote server returned an error: (404) Not Found. (Not Found)
OData Version: 3, Error: The remote server returned an error: (404) Not Found. (Not Found)
Details:
DataSourceKind=SharePoint
DataSourcePath=https://gusta.sharepoint.com/sites/MaxHelpBI2/Shared%20Documents/DadosParquet
SPRequestGuid=c5a68e9f-1047-0000-723c-982e6e3a9ab1, c5a68e9f-c04a-0000-5ecf-fd9d000ca298, c5a68e9f-804d-0000-7aa3-b1c94e710ee7
@Gustavo
Did you use “https://gusta.sharepoint.com/sites/MaxHelpBI2” explicitly as the RootPath?
Thank you very much for sharing, Imke
Hi Imke,
Thank you! Good work as always!
I am having problems with using the Sharepoint.GetAllFilesInFolder.pq function on Sharepoint subsites.
Using the following example of a main site (https://xyz.sharepoint.com/sites/MainSite), let’s assume the files are located on a subsite (https://xyz.sharepoint.com/sites/MainSite/SubSite).
RootPath = https://xyz.sharepoint.com/sites/MainSite/SubSite
StaticRoot = SubSite Contents
ExtractRoot = MAIN SITE path (fourth slash) (generates an error) (https://xyz.sharepoint.com/sites/MainSite).
NonRootFolders = “SubSite” & folders
SubfoldersList = {“SubSite”,…}
GetRootContent = SubSite Contents
NavigateIn = error (Can’t find “SubSite” folder – not present in subsite……..).
For ExtractRoot, would the following work?:
Text.BeforeDelimiter(StaticRoot[Folder Path]{0}, "/", {0, RelativePosition.FromEnd})
Gets Folder from StaticRoot and removes last slash.
Don’t need fnUriUnescapeString
StaticRoot may need Table.Buffer – Not sure…….
instead of:
fnUriUnescapeString(Text.BeforeDelimiter(FullPath, "/", 4))?
I assume the same would work for single file function.
Thank you!
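For reference, the two indexing modes of Text.BeforeDelimiter discussed above behave like this (made-up path):

```m
Path = "https://contoso.sharepoint.com/sites/MySite/Shared Documents/",
// Counted from the start: text before the 5th "/" (index 4)
FromStart = Text.BeforeDelimiter(Path, "/", 4),
// "https://contoso.sharepoint.com/sites/MySite"
// Counted from the end: text before the last "/" (drops the trailing slash)
FromEnd = Text.BeforeDelimiter(Path, "/", {0, RelativePosition.FromEnd})
// "https://contoso.sharepoint.com/sites/MySite/Shared Documents"
```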
Sorry that you’ve run into trouble with the function, and thanks for sharing your workaround.
Just keep in mind that we still need this UriUnescapeString function for the NonRootFolders step in case the subfolders have spaces in them (they don’t get autocorrected like the name of the site).
StaticRoot wouldn’t benefit from a buffer here, as it will only be evaluated once as it is.
/Imke
Hello Imke,
I just tried the GetSharePointFile function for the single file, and it already feels much faster for me.
Two questions:
a) I would need to adjust the RootPath Parameter for each file coming from a different source, right?
I.e. if I pull files from 4 different SharePoint sites, I would need to have four different values in that parameter, changing it between the queries?
b) forgive me if this is a totally stupid question, but in this second function, why can’t I reuse the FullPath with a formula to get the base instead of having to rely on the RootPath parameter?
I.e. for our sharepoint site I would write
StaticRoot = SharePoint.Contents(Text.Start(FullPath, Text.PositionOf(FullPath, "/", 5){4}+1), [ApiVersion = 15]),
instead of
StaticRoot = SharePoint.Contents(RootPath, [ApiVersion = 15]),
What am I missing here?
And you see I am new, so I didn’t know the Text.BeforeDelimiter function – the part extracting the RootPath could be written as Text.BeforeDelimiter(FullPath, "/", 4)
Hello Ralph,
a) yes, that’s correct
b) Yes, you can hardcode your specific path in the function itself, that’s not a problem.
Hi Imke,
thanks for the fast reply.
for b) – If I can do that – i. e. replace the RootPath parameter in line 42 with the FullPath (parameter passed to the function), why would I still need the RootPath?
Wouldn’t it be enough to switch the line 42 and 43 and replace the parameter RootPath with ExtractRoot to then have
ExtractRoot = fnUriUnescapeString(Text.BeforeDelimiter(FullPath, "/", 4)),
StaticRoot = SharePoint.Contents(ExtractRoot, [ApiVersion = 15]),
Why would this not work? [and sorry again if I am being dense here]
Hi Imke,
I can’t begin to tell you how much this helped! We have what we are now guessing are large SharePoint sites (not sure how to quantify that in relation to others), and this cut down not only refresh times but also the time spent simply working in the file. I had been searching for a solution many times previously and found this the other day.
One file, where I used the folder connector above, we would literally walk away from while it did a refresh (25 minutes or so); it is now under 4.
Another file (also folder connector) that was 2+ minutes is now 40 seconds.
And finally, one where I used the single file connector above, which varied between 4-10 minutes, is now literally seconds (this last one is simple, easy spreadsheets with just combining 9 tabs, so not complex at all – but the query previously had to sort through thousands of SharePoint files first…).
Simply…THANKS.
God bless you! Finally found a reliable working solution here to extract from Sharepoint (or Sh@tpoint) to PowerBI Service without the crazy click-fest of repeated failed refresh attempts. We have 30,000 tiny files of 4KB CSV data that the default Sharepoint Folder just could no longer handle reliably at all. About 6 months ago, it could load 7500 files after a few attempts, but now can’t even handle 3000 tiny files. What a waste of 100+ hours troubleshooting this … MS innovation? … yeah, right!
Hi Imke,
Back again 🙂
Just a small suggestion for the GetAllFiles function – to replace the “Date created” and “Date modified” columns as it drills down the tree, just like what you have done for the “Extension” column. At the moment, the function only shows the dates of the top-level parent node instead of the dates from the child objects. I just modified your script in the “ListGenerate” step to get the desired dates.
Thanks again for this awesome solution.
Woop woop!
How do you do that, especially when I am combining files from multiple folders with different file depths within the folder?
Works very well. Thank you for sharing this with us.
Keep your good work up.
Cheers!
Works nicely! One question though: if I want to add more columns (such as Author etc.), where should I add these in the code?
thank you !
Dear Imke, great content! Thank you.
I got the same error you got when you clicked to expand the table: “Formula.Firewall: Query ‘Invoked Function’ (step ‘Expanded Table Column1’) references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.”
How to overcome this issue?
Cheers
Hi Gustavo, I have the same problem. Is there a solution to this?
I got the same error. Delete the last step in the Invoked Function and than in the previous step expand the table. It worked.
Thanks for posting this. Never had this problem, but good to know an alternative!
Hi, I was using your methods and they worked great! A side-case question regarding authentication:
We have setup some folders for customers
where our root is
rootdomain/[NameOfCustomer]/powerbidata
The problem is that, authentication-wise, Power BI now expects the user to have access to the root domain with their user account, but I only gave them access to the full path.
Any chance I can somehow force the authentication against the full URL instead of the root domain?
Hi,
in Power BI Desktop it works perfectly fine. But after publishing, I get the following error message when trying to refresh the dataset:
“[Unable to combine data] Section1/Aufgerufene Funktion/Geänderter Typ references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.. The exception was raised by the IDbCommand interface. Table: …”
Do you know how to deal with this issue?
Once you have invoked the function and followed all the steps, go back and delete the last two steps in the query. Your last step should now be ‘Removed other columns1’ – the step BEFORE the table is expanded. Then create a new query which references the one with the invoked function, expand the table there, and do all the calculations on that table. Basically you are splitting the import into two parts: you first import the data in the query with the invoked function, and once the data is imported, you do all the transformations within the second query.
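The splitting pattern described in this workaround can be sketched like this (query and step names are illustrative, not from the original post):

```m
// Query 1 – "Staging": the only query that touches the data source.
// It ends BEFORE the table/binary column is expanded.
let
    Source = Sharepoint.GetAllFilesInFolder(FullPath),
    RemovedOtherColumns = Table.SelectColumns(Source, {"Name", "Content"})
in
    RemovedOtherColumns

// Query 2 – "Transform": references the staging query and does everything else,
// so the firewall never sees the data-source step mixed with other queries.
let
    Source = Staging,
    // expand / combine / transform the files here, away from the source step
    Filtered = Table.SelectRows(Source, each Text.EndsWith([Name], ".csv"))
in
    Filtered
```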
Thanks a lot, now it works fine!
Thanks for the query ..
I am hitting a roadblock: in my CSV files I have columns called User Principal Name and Display Name. These two columns somehow come up not as the actual values but as a string of characters which looks like a GUID.
Any idea why this might be happening?
Hi,
My RootPath is https://arriumcloud.sharepoint.com/teams/Corporate/GFGGlobalServices
and FullPath is https://arriumcloud.sharepoint.com/teams/Corporate/GFGGlobalServices/AP%20Workflow%20Reports/GBS%20DashBoards/Power%20BI%20Reports/AP%20Supplementary%20Data
I’ve got the below error when I Invoked the Function
An error occurred in the ‘’ query. DataSource.Error: SharePoint: Request failed: The remote server returned an error: (500) Internal Server Error. (The attempted operation is prohibited because it exceeds the list view threshold.)
Details:
DataSourceKind=SharePoint
DataSourcePath=https://arriumcloud.sharepoint.com/teams/Corporate/GFGGlobalServices
SPRequestGuid=23d603a0-8094-0000-dbf8-6193eb9518c4
Url=https://arriumcloud.sharepoint.com/teams/Corporate/GFGGlobalServices/_api/Web/GetFolderByServerRelativePath(decodedurl=’/teams/Corporate/GFGGlobalServices/AP Workflow Reports’)/Folders
Does that mean that it reads the AP Workflow Reports folder and not the sub folder AP Supplementary Data?
Can you help me modify the function to be able to read all the way down to the last sub folder, AP Supplementary Data?
Thanks,
Uyen Dao
Hi Uyen,
sorry to hear that – but I haven’t experienced this error yet.
Not sure what to do here, unfortunately.
/Imke
Hi,
Can I use this code to get data from an SP list?
No, I’m afraid, this will only work for files in SP folders.
hi cubedfluidity,
Did you find a solution to connect to all files on a subsite?
Thanks,
Hi Doug,
not sure I understand.
Because you can currently use the path string of any folder level, I would say it is already possible.
Have you tried it?
/Imke
Hi Imke,
Thanks for the response and for this awesome solution!
The code works when my files are in a SharePoint Library that is housed on a “main site”:
https://xyz.sharepoint.com/sites/MainSite/Library/FolderName
But it doesn’t seem to work when a subsite is introduced in the hierarchy:
https://xyz.sharepoint.com/sites/MainSite/Subsite/Library/FolderName
Hm, will check out at the weekend.
I changed the code in row 40 to consider the additional level and it worked:
ExtractRoot = fnUriUnescapeString(Text.BeforeDelimiter(FullPath, "/", 5)),
Imke- You have been a great source of info for me. I appreciate all that you share.
Consistent with this blog, I am a big believer in replacing SharePoint.Files() with SharePoint.Contents() wherever possible. However, for some sites I encounter an issue authenticating when using SharePoint.Contents(). If I revert back to SharePoint.Files() I can authenticate just fine. Have you experienced this?
Hi Brent, sorry – haven’t come across this.
I had this issue.
It turned out to be caused by my permission on the SharePoint site. I only had one folder shared with me, and, for some reason, SharePoint.Contents() could not authenticate with those permissions, but SharePoint.Files() worked fine. Once I made myself owner of the SharePoint site, I could authenticate with SharePoint.Contents() and everything worked okay.
Hey Kieran, Does it work if you have Design permissions? which is just below Owner of sharepoint site. I have edit issues and it doesn’t work.
What is the best way to not include subfolders in the query?
Imke,
Please forgive my ignorance, but is your new method for single SharePoint Excel file extraction faster than this?
Excel.Workbook(Web.Contents(FullPathURL), null, true)
-Justin
Hello Imke,
I need your help. I’m getting this error message:
An error occurred in the ‘’ query. Expression.Error: The key didn’t match any rows in the table.
Details:
Key=
Name=Shared Documents
Table=[Table]
What can I do? Any tips?
Tks, Alexandre.
Just to keep you informed: I found the problem some months ago – it was permissions. I didn’t have enough permission to see the whole content inside that Sharepoint/Teams channel.
Hi Alexandre
This error in PBI is normally seen when something has changed in the data source since the query was originally created. It could be things like the list/library name or a column name which is used in steps after the source step of the M query. It might be worth looking at the query in the Advanced Editor to identify this … most probably in the second or third step.
Nothing was changed on the data source.
https://company.sharepoint.com/teams/Namesite/Shared Documents/Subname/Foldername/Foldername/Foldername/Foldername
The only change that I’ve made was to change the query “Get all files from folder” on line 40 from: ExtractRoot = fnUriUnescapeString(Text.BeforeDelimiter(FullPath, "/", 4)),
to: ExtractRoot = fnUriUnescapeString(Text.BeforeDelimiter(FullPath, "/", 5)),
but only as a test.
And still get the same error.
Do you have anymore suggestion?
Many Tks in advance.
Hi Alexandre,
one reason for the error message could be lack of authorization.
Are you using the same credentials in Power BI as when picking up the file path?
/Imke
oh yeah… I use Office 365 and I have authorization to access the path.
You are right. It was permission that blocked my access to the whole content.
Thank you for the code that pulls a list of all files in all subfolders! It works in my tests, but I have a library of folders named after people, so some have apostrophes in the names, and that is breaking the query. I’m having a difficult time figuring out where in the code I might escape apostrophes in the subfolder name. Any suggestions?
Hi,
since two days all of a sudden I get the following error message when trying to refresh the dataset:
“We cannot convert the value “[Table]” to type Table.. Value = [Table].”
In Power BI Desktop the report refreshes without any issues; it just doesn’t work anymore in the Power BI service. Until 3 days ago, everything had worked perfectly fine for a couple of months.
Hopefully someone knows how to handle this.
Many thanks in advance!
Hi Andreas,
very sorry to hear that. I haven’t heard about such an error before (and I’m using this function in many active solutions in the service as well).
One thing I could imagine is that you have to update your gateway (if you’re running your queries through it).
BR,
Imke
Thank you for the quick reply.
There is no gateway used for the queries, as all data sources are cloud-based.
BR,
Andreas
Then you could just try to troubleshoot by re-publishing the report, for example.
Also, copying the query to a dataflow and trying to refresh there might be a good idea
… or a workaround, if you cannot get it running in the dataset at all.
I found the solution – it was just logging in with my credentials to every single data source in the Power BI service.
Sometimes solutions are simple, but not easy to find.
Thank you for your help.
Thank you so much for this. While I did not use your full code, the bit about switching from SharePoint.Files to SharePoint.Contents was enough to dramatically speed up my query. Very grateful.
Hello,
Thank you for posting this, it has been very helpful. But when I try to load the contents from the folder and select the sheet, I receive an error:
“Formula.Firewall: Query ‘Invoked Function’ (step ‘Expanded Table Column1’) references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.”
I am unable to load the data from the file. I can see the Transform query working and the Transform Sample File working, but it is not loading in the invoked function.
Dear All, just FYI: for me it seems that those functions only work when you are an admin of the SharePoint site.
Hello,
I have a problem when I try to get a single file. I pasted the code for the single file, and when I paste the path of the CSV/Excel file into the FullPath of the query and invoke it, there is an error message in the Invoked Function:
An error occurred in the ‘’ query. Expression.Error: There weren’t enough elements in the enumeration to complete the operation.
Details:
[List]
what should I do?
Works perfectly. Thanks for your work.
Stupid question, but where possible, wouldn’t syncing the SharePoint folder to your drive alleviate all these issues – and then you could just use Csv.Document, as an example, to import?
When I use the get-multiple-files-from-folder function and then try to load the contents, I get a notification saying “evaluating query”; then it goes away and nothing happens.
If I click on the binary contents of the file, here is the error I see:
DataSource.Error: Web.Contents failed to get contents from ‘https://my-work.xxxxxxxxxx.com/sites/xxxxxxx/_api/web/getfilebyserverrelativeurl(‘/sites/xxxxxxxx/xxxxxx.xlsx’)/$value’ (404): Not Found
Details:
DataSourceKind=SharePoint
DataSourcePath=https://my-work.xxxxxxxx.com/sites/xxxxxx
Url=https://my-work.xxxxxxxxx.com/sites/xxxxxxxx/_api/web/getfilebyserverrelativeurl(‘/sites/xxxxxx/xxxxxxxx/xxxxxxxxxx.xlsx’)/$value
Thank you Imke, this is a gamechanger workaround! I am having an issue though, as I believe it’s due to a character that is not expected in the expression (one of the sub-folders has an apostrophe in it). The returned error states:
“DataSource.Error: SharePoint: Request failed: The remote server returned an error: (400) Bad Request. (The expression “Web/GetFolderByServerRelativePath(decodedurl=’/sites/[RemovedForPostingOnline]/_02 Import/BEP’s’)/Folders” is not valid.)”
Any idea how I can work around this? I suppose I could replace the special character with something else, but I am not sure exactly where and how in the function I should include this step. Can anyone help? Or is there any other workaround (that doesn’t involve renaming the files/folders)?
Thank you in advance everyone!
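In OData/SharePoint REST calls, a single quote inside a string literal is escaped by doubling it. So one possible fix (untested here, and exactly where it belongs depends on the function’s internals) would be to double any apostrophes in the folder names before they reach the REST call:

```m
// Hypothetical step: escape apostrophes for the REST string literal
FolderName = "BEP's",
EscapedFolderName = Text.Replace(FolderName, "'", "''")
// "BEP''s"
```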
Hey,
first of all, thank you for that amazing code!
But I have a little problem with the folder-code.
I followed all the steps and I see the files in Power BI, but I can’t transform the binary to select the right data.
It says that it can’t find the type of the first entry and that I should try Worksheets. I have 4 subfolders and in every subfolder there is only 1 Excel file.
What am I doing wrong?
Hi Imke, same as Laura above, do you know why the function SharePoint.GetAllFilesInFolder doesn’t work with Combine Files for Excel files stored in Sharepoint folder?
It seems the Content column stores binary files, but when clicking on Combine Files I get the error “We didn’t recognize the format of your first file(). Please filter the list of files so it contains only supported types (Text, CSV, Excel, etc) and try again”.
This doesn’t happen when I use Get Data > From Sharepoint Folder.
Clicking on the Binary record on your function shows a file with no Excel icon, while clicking on the Binary record when using From Sharepoint folder, will show an Excel icon.
This is an issue because then, instead of Combine Files, I have to use the File Path and Sheet Name with a custom function Excel.CurrentWorkbook(Web.Contents([Content])) to fetch each Excel file and combine them. It creates more work.
Works as advertised for me. As I’m running it on an extremely large SharePoint site, this can cut load times for my models (which contain individual file and folder loads) down from nearly 2 hours to 1 minute.
Incredible stuff!
The only thing that would make it better would be a slightly more detailed walkthrough of the code, but I’m sure I can work it out!
Thank you so much Michael for this kind comment 🙂
It is very motivating, as it was a lot of work to come up with the solution.
As I am very busy currently, I let ChatGPT do the work of code commenting. I found it fairly good, so I hope it helps you as well:
// Inline comments made by ChatGPT 4.0 (September 25 version)
(FullPath as text) =>
let
// Define a helper function to unescape URI-encoded strings
fnUriUnescapeString =
(data as text) as text =>
let
// Convert the input text to a list of characters and buffer it in memory
ToList = List.Buffer(Text.ToList(data)),
// Accumulate a list of bytes, converting hexadecimal sequences
Accumulate = List.Accumulate(
ToList,
[Bytes = {}],
(state, current) =>
let
HexString = state[HexString]?,
NextHexString = HexString & current,
NextState =
if HexString <> null then
if Text.Length(NextHexString) = 2 then
[
Bytes
= state[Bytes]
& Binary.ToList(Binary.FromText(NextHexString, BinaryEncoding.Hex))
]
else
[HexString = NextHexString, Bytes = state[Bytes]]
else if current = "%" then
[HexString = "", Bytes = state[Bytes]]
else
[Bytes = state[Bytes] & {Character.ToNumber(current)}]
in
NextState
),
// Convert the accumulated bytes to text
FromBinary = Text.FromBinary(Binary.FromList(Accumulate[Bytes]))
in
FromBinary,
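The snippet above covers the fnUriUnescapeString helper; as a quick usage example, it turns percent-encoded sequences back into plain characters:

```m
Unescaped = fnUriUnescapeString("Shared%20Documents/BEP%27s")
// "Shared Documents/BEP's"
```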
Hello!
Great work on these functions – they really make it a lot easier to use data from SharePoint. My only question was why there is a need to have RootPath defined as a parameter or fixed value?
I changed the code for the queries from:
StaticRoot = SharePoint.Contents(RootPath, [ApiVersion = 15]),
ExtractRoot = fnUriUnescapeString(Text.BeforeDelimiter(FullPath, "/", 4)),
to:
ExtractRoot = fnUriUnescapeString(Text.BeforeDelimiter(FullPath, "/", 4)),
StaticRoot = SharePoint.Contents( ExtractRoot, [ApiVersion = 15]),
The functions seemed to work just fine with this change. Or perhaps I have not run into a problem yet caused by this. 🙂
Cheers, Chris W.
I’ve used a function to get files from a folder, but I’ve noticed that for some of the files, the last column isn’t showing up in Power Query. Do you know why this might be happening?
Hi,
Thanks for sharing this tip; however, I tried it and it didn’t work, unfortunately. I tried the get-all-files-from-a-folder function. When I get to the point of expanding the folders, it gives me an error:
“Invoked Function
Query ‘Invoked Function’ (step ‘Expanded Table Column1’) references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.”
It seemed really straightforward, but it just got stuck loading the data in Power BI.
It would have been helpful to also cover the loading part further, because it can be misleading.
Thanks