SQL-query folding bug still alive and sucking in PowerBI and PowerQuery in Excel

Edit: There is currently no interest at Microsoft to change the current behaviour: http://community.powerbi.com/t5/Issues/Bug-Query-folding-not-working-with-non-SQL-datasource/idi-p/225100#M11611 .

I found that this workaround works also pretty well (faster in most cases), as long as your filter-tables are not too long: http://community.powerbi.com/t5/Desktop/Parameterized-SQL-Query-with-query-folding/td-p/171503

Are your SQL queries that filter with a non-SQL-table slow in PowerBI and PowerQuery in Excel? Then this might be of interest for you:

Nearly 2 years ago when I published my first blogpost about the bug that prevents query folding on SQL-sources when filtered by non-SQL-sources, PowerBI was still so new that I even didn’t mention it in my blogpost. I had enormous problems with some clients work with performance and spent 2-digit hours on figuring out the reason with some good help from the TechNet-Forum.

Today I was notified that “this bug was deemed as not high-pri enough to warrant a fix”. So it’s well alive and sucking: In Excel as well as in PowerBI. Considering how many people read my article, I found it worth to mention again and provide an improved solution. It is a function that’s easy to implement and takes the name of the tables and its key-columns as parameters:

Code to fix Query-Folding bug in PowerBi and PowerQuery

Code to fix Query Folding Bug in PowerBI and PowerQuery

And the code to copy: FilterSQLTable.txt

Hope this prevents you from wasting precious time.

Comments (14) Write a comment

Imke, I’ve added a link to this in the Excel uservoice idea at https://excel.uservoice.com/forums/304921-excel-for-windows-desktop-application/suggestions/16852819-power-query-cache-shared-nodes. Personally, I’d think that this should be rolled up into a big “perf patch” for Power Query. The refresh times are freaking awful and, despite having a workaround here, should really be MS’s focus right now.

Reply
- - Admin
  - 20.07.17 @ 9:05 PM
  Thank you very much Ken, I totally agree.
  
  Reply
Pingback: #Excel Super Links #103 – shared by David Hager | Excel For You
- Jason M Oates
- 13.06.18 @ 4:10 PM
How do I apply this code to my query? Should this be placed at the top before the LET statement?

Reply
- - Admin
  - 17.06.18 @ 8:40 PM
  Hi Jason,
  this is a function that you should copy into a separate query (open the advanced editor and replace the existing code with it).
  /Imke
  
  Reply
  - Jason M Oates
    
    19.06.18 @ 4:32 PM
    
    Thank you for your response. Once I have this saved as a separate query, how do I apply this function to my filter query? Below are the steps in my query:
    let
    Source = Oracle.Database(“prod”, [HierarchicalNavigation=true]),
    PENTA = Source{[Schema=”PENTA”]}[Data],
    JOB_LAB_RATE_SCHD1 = PENTA{[Name=”JOB_LAB_RATE_SCHD”]}[Data],
    #”Removed Other Columns” = Table.SelectColumns(JOB_LAB_RATE_SCHD1,{“JOB_ID”, “PR_CLASS_NUM”, “EFF_DATE”, “REG_RATE”, “TH_RATE”, “DT_RATE”, “LOW_PAY_RT_NUM”, “HIGH_PAY_RT_NUM”}),
    #”Filter on JOB ID FILTER” = Table.SelectRows(#”Removed Other Columns”, each List.Contains(JOB_ID_FILTER_TBL,[JOB_ID]))
    in
    #”Filter on JOB ID FILTER”
    
    Reply
    - Admin
      
      20.06.18 @ 9:04 PM
      
      Hi Jason,
      you have to transform your filter list (“JOB_ID_FILTER_TBL”) into a table. Name the column “Column1” and your function “TheFunction”, then your code would be this:
      
      let Source = Oracle.Database(“prod”, [HierarchicalNavigation=true]), PENTA = Source{[Schema=”PENTA”]}[Data], JOB_LAB_RATE_SCHD1 = PENTA{[Name=”JOB_LAB_RATE_SCHD”]}[Data], #”Removed Other Columns” = Table.SelectColumns(JOB_LAB_RATE_SCHD1,{“JOB_ID”, “PR_CLASS_NUM”, “EFF_DATE”, “REG_RATE”, “TH_RATE”, “DT_RATE”, “LOW_PAY_RT_NUM”, “HIGH_PAY_RT_NUM”}), #”Filter on JOB ID FILTER” = TheFunction(#”Removed Other Columns”, "JOB_ID", JOB_ID_FILTER_TBL, "Column1") in #”Filter on JOB ID FILTER”
      
      Reply
- Jason M Oates
- 27.06.18 @ 11:10 PM
I copied the function txt as you provided and pasted it into a new query then saved the query/function as Thefunction. I changed the column heading to Column1 in the JOB ID FILTER. I also update my query as you show above. The query runs, however it is showing null for each column of data. Any thoughts? Your assistance is greatly appreciated.

JOB_ID_FILTER_TABLE:
let
Source = Excel.CurrentWorkbook(){[Name=”PARENT_JOB_IDS”]}[Content],
#”Filtered Rows” = Table.SelectRows(Source, each [FILTERED] > 0),
#”Merged Queries” = Table.NestedJoin(#”Filtered Rows”,{“PAR_JOB_ID”},BI45_JOB_MASTER,{“PAR_JOB_ID”},”BI45_JOB_MASTER”,JoinKind.LeftOuter),
#”Expanded BI45_JOB_MASTER” = Table.ExpandTableColumn(#”Merged Queries”, “BI45_JOB_MASTER”, {“JOB_CUS_NAME”, “JOB_JOB_ID”, “JOBID NAME”}, {“JOB_CUS_NAME”, “JOB_JOB_ID”, “JOBID NAME”}),
#”Renamed Columns” = Table.RenameColumns(#”Expanded BI45_JOB_MASTER”,{{“JOB_JOB_ID”, “Column1″}}),
#”Filtered Rows1″ = Table.SelectRows(#”Renamed Columns”, each ([Column1] = “51L18-0002.001″))
in
#”Filtered Rows1”

LABOR RATE QUERY:
let
Source = Oracle.Database(“prod”, [HierarchicalNavigation=true]),
PENTA = Source{[Schema=”PENTA”]}[Data],
JOB_LAB_RATE_SCHD1 = PENTA{[Name=”JOB_LAB_RATE_SCHD”]}[Data],
#”Removed Other Columns” = Table.SelectColumns(JOB_LAB_RATE_SCHD1,{“JOB_ID”, “PR_CLASS_NUM”, “EFF_DATE”, “REG_RATE”, “TH_RATE”, “DT_RATE”, “LOW_PAY_RT_NUM”, “HIGH_PAY_RT_NUM”}),
#”Filter on JOB ID FILTER” = TheFunction(#”Removed Other Columns”, “JOB_ID”, JOB_ID_FILTER_TBL, “Column1″)
in
#”Filter on JOB ID FILTER”

TheFunction:
(SQLTable as table, keysql as text, filtertable as table, keyfilter as text) =>

let
// 1 Splitting filter table into chuncks of 200 lines and add index
Quelle = filtertable ,
AddIndex = Table.AddIndexColumn(Quelle, “Index”, 1, 1),
FilterTable = Table.AddColumn(AddIndex, “Group”, each Number.RoundUp([Index]/200)),

// 2 Performing the filter operation in a function
function = (group)=>
let
Source = SQLTable,
Result = Table.Buffer(Table.NestedJoin(Source,{keysql},Table.SelectRows(FilterTable, each [Group] = group),{keyfilter},”NewColumn”,JoinKind.Inner))
in
Result,

// 3 Main query: Create table with groups to call function from, call function and expand (thereby stitching the results back together)
Source = {1..Number.RoundUp(List.Max(FilterTable[Group]))},
ConvertToTable = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
AddFunctionCol = Table.AddColumn(ConvertToTable, “Custom”, each function([Column1])),
Remove = Table.RemoveColumns(AddFunctionCol ,{“Column1″}),
#”Expanded Custom” = Table.ExpandTableColumn(Remove, “Custom”, Table.ColumnNames(Remove[Custom]{0}))
in
#”Expanded Custom”

Reply
- - Admin
  - 28.06.18 @ 8:46 AM
  Hi Jason,
  are you aware that you’ve filtered your JOB_ID_FILTER_TABLE down to just one (distinct) value (last step: #”Filtered Rows1″)?
  Otherwise, please change the last row of the function from “#”Expanded Custom”” to “AddFunctionCol” to see the returned table before expansion and try to expand manually.
  /Imke
  
  Reply
- Manrique
- 22.11.18 @ 6:20 PM
This is great! Thank you! Have you tried something similar for doing a group by operation? I’m trying to do a distinct count and it’s not folding either. For now, just trying to adjust part 2 of the function for this.

Reply
- - Admin
  - 22.11.18 @ 7:00 PM
  Thank you! Don’t believe that my function helps with distinctcount unfortunately…
  
  Reply
- Manrique
- 22.11.18 @ 6:21 PM
This is great. Thank you!

Reply
- George
- 19.03.20 @ 7:02 PM
I have been trying to use this for my two queries and it is still taking a long time to load.
I am attempting to filter a table from an Oracle database (about 2 million rows) to only show the values that include an ID from my filter table, ID_TABLE ( about 4000 rows of unique ID codes).
The filter table is brought in from a named range in the excel sheet which makes this process difficult.
Doing this in oracle is not possible unless you split up the rows into groups of 1000 which is cumbersome in power query.
I attempted using your code which I think joins them 200 at a time, but it still has super long lag times.
Am I doing anything wrong? I posted your query directly into my power query, it seemed to load fine, then I enter
ORACLETABLE for SQLTable
ORACLEID for keysql
ID_TABLE for filtertable
Column1 for keyfilter

The invoked function takes longer than just joining them normally for some reason.
All the values in my Oracle table are text values and I have not anything with the Oracle table to break query folding.
The filter table is just 1 column of alphanumeric ID numbers formatted as text.

My end goal is to join the two or at least filter the oracle table and bring them into excel without refreshing the entire 2 million row oracle table.

If you know of anything that I am doing wrong with your function or any other tips you might have I would greatly appreciate it.
I am fairly new to using power query and welcome all help of any kind.
Thanks so much!!

Reply
- - Imke Feldmann
  - 19.03.20 @ 9:37 PM
  Hi George,
  have you checked the alternative I’ve linked to at the beginning of this post?: http://community.powerbi.com/t5/Desktop/Parameterized-SQL-Query-with-query-folding/td-p/171503
  
  Reply

New Generation Finance – Accounting – Controlling using Microsoft BI stack

SQL-query folding bug still alive and sucking in PowerBI and PowerQuery in Excel

Code to fix Query-Folding bug in PowerBi and PowerQuery

Like this:

Comments (14) Write a comment

Leave a ReplyCancel reply

Code to fix Query-Folding bug in PowerBi and PowerQuery

Share this:

Like this:

Comments (14) Write a comment

Leave a ReplyCancel reply