From time to time Huang Caiguang sends me some geeky M-code that turns out to be very useful (like this for example). Today it’s utilizing java script code for RegEx (regular expressions): The Web.Page function can execute JS code and you can pass strings from you M-environment to it with simple string concatenation. So escape the js-code and use the ampersand (“&”) to reference the string. As a function it looks like so:
letfx=(x,y)=>Web.Page(“<script>var x=”&x&”;var y=new RegExp(‘”&y&”‘,’g’);var b=x.match(y);document.write(b);</script>”)[Data]{0}[Children]{0}[Children]{1}[Text]{0}in fx(“””hello012中国1235″””,”\\d+”)
This code extracts all decimals (“\\d+”) from the string provided as the first argument and concatenates them with a comma. But just try for yourself by pasting the code into the advanced editor.
This doesn’t work in PowerBI service, so if you need it there, you can switch to an R-script that can be used with a personal gateway. But for Excel users, this is the way to go.
Please vote for a native implementation in M here: https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/8892295-regular-expressions-support-in-power-query
Enjoy and stay queryious 😉
Too complicated =)
====
let
fx=(s, reg)=>Web.Page(“document.write(‘”&s&”‘.match(/”®&”/));”),
tbl = fx(“qwer1234asdf”,”\d+”),
result = tbl{0}[Data]{0}[Children]{1}[Children]{0}[Text]
in
result
Cool!
Thx Sergey 🙂
How do you add this as a step in power query to apply this on a column?
Wow now that’s exploitable 🙂
For those that want to get the quotes right have a look at
https://gist.github.com/Hugoberry/4948d96b45d6799c47b4b9fa1b08eadf
Thx Igor!
(posted the article from my mobile…)
Hi Imke, great article. This led me to create a template where you can develop any JavaScript function to consume it in Power Query. Download the file: https://www.mlf.net.br/wp-content/uploads/arquivos-youtube/035.zip
Cool!
Thx a lot Felipe!!
How to concatenate all the decimals with a comma?
The concatenation is done by JS by default.
I was referring to Felipe’s template. All the decimals were concatenated without any comma in the file. Sorry for the misunderstanding caused.
It is because I’m using x.match(pattern).join(“”) instead of x.match(pattern) in the JavaScript function.
In JS, when you ommit the join method, id displays separated by commas.
Thanks for your kind explanation, Felipe. It’s true as you said.
Pingback: Power BI Lingo, RegEx, Data and BI Summit and more... (April 26, 2018) | Guy in a Cube
Nice. I was using R for my Regex. Still baffles me why SQL and other languages don’t add this as standard as they add so much power and would not be difficult to include.
True.
Please vote for it here if you haven’t already: https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/8892295-regular-expressions-support-in-power-query
Yep already voted
I am trying to do something similar, but I am too new to Power Query / M to get anything working.
I have a table with a column called Title with some text. I am trying to find any row that has Visio as a word within the title, but exclude anything such as vision, visionary, …
I can’t get the example above to work. Is it possible to provide a sample file with a custom function, along with the appropriate parameters ([Field] as text, RegEx), returning true or false (if found or not found).
Hello Daniel,
if my understanding is correct, this is fairly easy to implement. Please paste the following code into the advanced editor and see what happens:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCssszsxXSCxOSVPSUUosVorVgYrlwQTTk8CCIJ5CGUgGKFaclQYVzIGJ5hWDDEgtqkpUio0FAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Title = _t, #"Other columns" = _t]),
#"Filtered Rows" = Table.SelectRows(Source, each List.Contains(Text.Split([Title], " "), "Visio", Comparer.OrdinalIgnoreCase))
in
#"Filtered Rows"
This is a case-insensitive implementation. If you need a case sensitive, then use Comparer.Ordinal instead.
/Imke
Thanks for the quick reply.
The following row from your code seems to return a table with any row containing visio (case insensitive). I would like to see this in a custom column with a TRUE (contains visio) or FALSE (does not contain visio).
Table.SelectRows(Source, each List.Contains(Text.Split([Title], ” “), “Visio”, Comparer.OrdinalIgnoreCase))
I want to provide this to some employees who can then simply filter on a few columns (I will have a column for Visio, Project, Adobe, Antidote, …
It would be cool, if those could all be in the same column with the software as the result instead of true/false, but that might be a phase 2 – I would be very happy to have the 1st part working. The difficulty with phase 2 is that I may have multiple software in the same title (ie: Install visio and project). In phase 1, I plan on doing this as multiple columns with visio true/false and project true/false). Phase 2, I would like “VISIO,PROJECT” as the result in a custom column.
Hi Daniel,
phase 1 is no problem, just take the filter logic to a new column instead: Table.AddColumn(Source, “FilterColumns”, each List.Contains(Text.Split([Title], ” “), “Visio”, Comparer.OrdinalIgnoreCase))
I don’t have a clear picture what you want to achieve in phase 2. So please provide a link to an Excel file where you’ve outlined the situation as is and the desired outcome.
Thanks, Imke
Hi, I managed to do what I wanted using multiple columns. I have provided a number of rows below to show what I would like.
Sample Data…Column Title is what we check and Custom_Software_Column would return specific words found in the title column separated by a comma or other delimiter.
(At the moment I am specifically looking for Visio, Project, Adobe, Antidote but this could change).
Title Custom_Software_Column
Install Visio Visio
Install Visio and Project Visio, Project
Other issue1
Install Project Project
Other issue2
Remove Antidote Antidote
I managed a solution by using your code (one column per keyword) and then using another column with “if then else” logic. The is probably a better way that would eliminate the interim columns, but the solution works.
Thanks for the heads up, Daniel.
I faced an issue with special non printable characters in the input strings, I removed them from the input string first using Text.Clean function to bypass it:
let
fx=(InputString,RegexPattern)=>
let
CleanText = Text.Clean(InputString),
Match = Web.Page(
”
var x=(‘”&CleanText&”‘);var y=new RegExp(‘”&RegexPattern&”‘,’g’);var b=x.match(y);document.write(b);
“)[Data]{0}[Children]{0}[Children]{1}[Text]{0}
in Match
in fx
Hope this helps.
Great – thanks for sharing!
Pingback: Regexp in Power Query using JavaScript - Tutorial Guruji
Pingback: [SOLVED] Regexp in Power Query using JavaScript – BugsFixing
Has anyone successfully implemented regex in power query (BI) M language using the .replace instead of just the .match
I got the .replace to work using grouping and replacing with groups, the code for the function is below:
et fx=(text,regex,repl)=>
Web.Page(
”
var x='”&text&”‘;
var z='”&repl&”‘;
var y=new RegExp(‘”®ex&”‘,’g’);
var b=x.replace(y, z);
document.write(b);
“)[Data]{0}[Children]{0}[Children]{1}[Text]{0}
in
fx
In PQ a M code example of the usage is below:
#"Added Custom4" = Table.AddColumn(#"Added Custom3", "Clean_First", each Regex_Replace([First], " \\d.\\d+| \\d", " "))
in
#”Added Custom4″
How to overcome this error –> Expression.Error: The name ‘Regex_Replace’ wasn’t recognized. Make sure it’s spelled correctly.
That error is because you first have to create function with the code and name the function Regex_Replace
Then you can call the function as in the example m code
I tried this approach, but the result is a piece of code and not the expected string value. 🙁
on the first line of the code the “L” in let was omitted when copying the code for the Function