In this article I’m going to present a method about transforming multiple columns at once in a fast way. This method also allows you to reference columns that exist in your table already. As I have described in a previous article, this cannot be done using the native Table.TransformColumns function that will be applied if you do column transformations using the UI in Power Query. The function I am sharing here allows you to enter a list of column names to be transformed and a function that defines the transformation itself. So you have to be familiar with defining custom functions to use this approach.
Failed attempts
A very easy way to do such a task is to unpivot all columns that shall be transformed, do the transformation (once) and then pivot back. But this will only work with small datasets or datasets with little transformation in a performant way.
So my next approach to was to apply the Table.ReplaceValue-method from the before mentioned article and use it in a List.Accumulate function to be recursively applied to all affected columns. But that turned out to be very slow as well. At the end, I used a “drop and recreate”-approach that runs much faster:
Function for transforming multiple columns at once
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
let func = | |
// fnTransformMultipleColumns | |
let | |
fnTransformMultipleColumns = ( | |
SourceTable as table, | |
FieldList, | |
TransformationFunction, | |
optional FunctionArgumentsList, | |
optional newType | |
) => | |
let | |
FunctionArguments = if FunctionArgumentsList = null then {} else FunctionArgumentsList, | |
Source = Table.AddColumn( | |
SourceTable, | |
"**tempColumn**", | |
each Record.FromList( | |
List.Transform( | |
FieldList, | |
(l) => | |
Function.Invoke( | |
TransformationFunction, | |
{Record.Field(_, l)} & {_} & FunctionArguments | |
) | |
), | |
FieldList | |
) | |
), | |
RemoveOldColumns = Table.RemoveColumns(Source, FieldList), | |
ExpandNewColumns = Table.ExpandRecordColumn( | |
RemoveOldColumns, | |
"**tempColumn**", | |
FieldList | |
), | |
RestoreType = Value.ReplaceType( | |
Table.ReorderColumns(ExpandNewColumns, Table.ColumnNames(SourceTable)), | |
Value.Type(SourceTable) | |
), | |
ApplyNewTypes = Table.TransformColumnTypes( | |
ExpandNewColumns, | |
List.Transform(FieldList, each {_, newType}) | |
), | |
Result = | |
if Value.Type(newType) = type type then | |
ApplyNewTypes | |
else if newType = null or newType = true then | |
RestoreType | |
else | |
ExpandNewColumns | |
in | |
Result | |
in | |
fnTransformMultipleColumns , | |
documentation = [ | |
Documentation.Name = " Table_TransformMultipleColumns ", | |
Documentation.Description = " Transforms multiple columns of a <code>SourceTable</code> using the specified <code>TransformationFunction</code>. ", | |
Documentation.LongDescription = " Transforms multiple columns (defined in <code>FieldList</code>) of a <code>SourceTable</code> using the specified <code>TransformationFunction</code>. Optional parameters allow to pass addional function arguments and define a new type. ", | |
Documentation.Category = " Table ", | |
Documentation.Source = " www.TheBIcountant.com ", | |
Documentation.Version = " 1.0 ", | |
Documentation.Author = " Imke Feldmann ", | |
Documentation.Examples = {[Description = " ", | |
Code = " let | |
SourceTable = #table( type table [Product = Text.Type, Rate = Number.Type, Sales = Int64.Type, CoS = Int64.Type], | |
// Product| Rate| Sales| CoS| | |
{//——–|—–|——|—-| | |
{ ""A"", 1.2, 100, 40 }, | |
{ ""B"", 0.9, 200, 70 } } ), | |
FieldList = {""Sales"", ""CoS""}, | |
TransformationFunction = (currentItem, _) => currentItem * _[Rate], | |
FunctionCall = Table_TransformMultipleColumns(SourceTable, FieldList, TransformationFunction) | |
in | |
FunctionCall ", | |
Result = " #table( type table [Product = Text.Type, Rate = Number.Type, Sales = Int64.Type, CoS = Int64.Type], | |
// Product| Rate| Sales| CoS| | |
{//——–|—–|——|—-| | |
{ ""A"", 1.2, 120, 48 }, | |
{ ""B"", 0.9, 180, 63 } } ) | |
"]}] | |
in | |
Value.ReplaceType(func, Value.ReplaceMetadata(Value.Type(func), documentation)) |
Function parameters
- SourceTable: Is the table you want to apply this function on
- FieldList: Is a list of column names that shall be transformed
- TransformationFunction: The function that shall be applied to the columns from FieldList. This function can have as many parameters as you need, but the first the first 2 parameters are mandatory and must hold these 2 items: The first parameter must represent the item to be transformed (one of the columns from FieldList) and the second parameter must represent the current record/row. From there on you are free to add further parameters that you want to use in your function. (Example see next paragraph)
- optional FunctionArgumentsList: If your TransformationFunction uses additional parameters, you must pass them in here as a list.
- optional newType: By default, the function will automatically restore the same types for the transformed column that they had before. But if the column transformation shall also change the type, you can define it here. Just specify a type.
Example
Starting from a table that has Sales and CoS, these columns shall be multiplied by the values in column “Rate”:
So for the first function parameter, you reference this table. For the second function parameter, you would pass in this list of column names: {“Sales”, “CoS”}.
The TransformationFunction for the 3rd argument for this will look like so:
(currentItem, _) => currentItem * _[Rate]
It has 2 arguments: “currentItem” stands for the columns that shall be transformed: “Sales” and “CoS” and the underscore “_” stands for the current record. So every column to be transformed will be multiplied by the value from column “Rate” within the current row.
I you paste the function code from the GitHub sample above into the advanced editor window of a blank query, you will see a sample code for this function that you can paste into a new query and explore all function arguments in separate steps.
Enjoy & stay queryious 😉
Will this preserve query folding?
I see many cases Table.AddColumn approach kept the query folding, but others dont.
Hi Tommy,
I doubt it will.
Currently only using it on csv files and haven’t tested on SQL Server.
/Imke
Looks really good, I made some alterations to make it behave more like Table.TransformColumns, except with access to the entire record for each row. Have a look at the below (apologies if formatting gets screwed up). Thanks!
let
func = // fnTransformMultipleColumns
let
fnTransformMultipleColumns = ( SourceTable as table, transformOperations as list ) as table =>
let
Transformations
= // will accept a single transformation or a list of transformations, this standardizes the format
if transformOperations{0} is list then
transformOperations
else
{ transformOperations },
ColumnsToTransform = List.Transform ( Transformations, each _{0} ), // list of column names to replace
TempColumnName = Text.NewGuid(), // unique name for new temporary column
TableWithTempColumn = Table.AddColumn (
// creates a new temporary column that holds a record of all new column values
SourceTable,
TempColumnName,
each Record.Combine (
// record of all new column values
List.Transform (
Transformations,
( Transformation as list ) as record =>
Record.AddField (
[],
Transformation{0},
Transformation{1}( _ )
)
)
),
Type.Union (
// types of new columns
List.Transform (
Transformations,
each Type.ForRecord (
Record.AddField (
[],
_{0},
[ Type = try _{2} otherwise Any.Type, Optional = false ]
),
false
)
)
)
),
TableWithoutOldColumns = Table.RemoveColumns (
// removes old columns
TableWithTempColumn,
ColumnsToTransform,
MissingField.Ignore
),
NewTableUnsorted = Table.ExpandRecordColumn (
// expands out new columns
TableWithoutOldColumns,
TempColumnName,
ColumnsToTransform
),
Result = Table.ReorderColumns (
// sorts table in original sort order
NewTableUnsorted,
Table.ColumnNames ( SourceTable )
)
in
Result
in
fnTransformMultipleColumns,
documentation = [
Documentation.Name = " Table_TransformMultipleColumns ",
Documentation.Description
= "Applies transformation(s) of the form { column, transform, type }.",
Documentation.LongDescription
= "Returns a table from the input <code>table</code> by applying the transform operation to the column specified in the parameter <code>transformOperations</code> (where format is { column name, transformation, optional type }). If the column doesn't exist, a new column with that name will be created.",
Documentation.Category = " Table ",
Documentation.Source = " www.TheBIcountant.com ",
Documentation.Version = " 1.1 ",
Documentation.Author = " Imke Feldmann, modified by Alex Groberman "
]
in
Value.ReplaceType ( func, Value.ReplaceMetadata ( Value.Type ( func ), documentation ) )
Hi Alex,
that’s a nice twist – thanks for sharing!
/Imke
Pingback: Multi-Column Transformations in Power Query – Curated SQL
You can do this in place (without creating new columns and deleting old ones) using Record.TransformFields.
Here’s the basic approach leaving out types and optional arguments:
let
fnTransformMultipleColumns = (
SourceTable as table,
FieldList,
TransformationFunction
) =>
let
Source = Table.FromRecords(
Table.TransformRows(
SourceTable,
(row) =>
Record.TransformFields(
row,
List.Transform(
FieldList,
(field) => {field, each TransformationFunction(Record.Field(row, field), row)}
)
)
)
)
in
Source
in
fnTransformMultipleColumns
Hi Alexis,
that’s also a nice approach, but I very much doubt that it will be faster than my formula.
But if you show me that this is actually faster, I will swap the code (giving credits to you, of course) 🙂
Thanks and cheers,
Imke
Yeah, I don’t think it’s faster. I’m curious about your intuition though. What is your basic reasoning about why one would be faster than the other?
The Table.FromXXX have been slow for me so far, although I mostly use .FromRows or .FromColumns.
Basically splitting up the table into its components and then re-assembling it.
At risk of overdoing it, I’ve cleaned up my above code one more time, I think the below should run optimally:
let
func = ( SourceTable as table, transformOperations as list ) as table =>
let
Transformations = if transformOperations{0} is list then transformOperations else { transformOperations },
TransformNames = List.Transform ( Transformations, each _{0} ),
TransformFunctions = List.Transform ( Transformations, each _{1} ),
TransformTypes = List.Transform ( Transformations, each [ Type = _{2}? ?? type any, Optional = false ] ),
TempColumnType = Type.ForRecord ( Record.FromList ( TransformTypes, TransformNames ), false ),
TempColumnName = "1cb70c9c-8c2e-40ea-b04c-b741ab1bee64",
TableWithTempColumn = Table.AddColumn (
SourceTable,
TempColumnName,
each Record.FromList ( List.Transform ( TransformFunctions, ( Function as function ) as any => Function ( _ ) ), TransformNames ),
TempColumnType
),
TableWithoutOldColumns = Table.RemoveColumns ( TableWithTempColumn, TransformNames, MissingField.Ignore ),
NewTableUnsorted = Table.ExpandRecordColumn ( TableWithoutOldColumns, TempColumnName, TransformNames ),
Result = Table.ReorderColumns ( NewTableUnsorted, Table.ColumnNames ( SourceTable ) )
in
Result,
documentation = [
Documentation.Name = " Table_TransformMultipleColumns ",
Documentation.Description = "Applies transformation(s) of the form { column, transform, type }.",
Documentation.LongDescription
= "Returns a table from the input
table
by applying the transform operation to the column specified in the parametertransformOperations
(where format is { column name, transformation, optional type }). If the column doesn't exist, a new column with that name will be created.",Documentation.Category = " Table ",
Documentation.Source = " www.TheBIcountant.com ",
Documentation.Version = " 1.2 ",
Documentation.Author = " Imke Feldmann, modified by Alex Groberman "
]
in
Value.ReplaceType ( func, Value.ReplaceMetadata ( Value.Type ( func ), documentation ) )
Ok, last one,but THIS ONE FOLDS!
let
func = ( SourceTable as table, transformOperations as list ) as table =>
let
Transformations = if transformOperations{0} is list then transformOperations else { transformOperations },
TempPrefix = “1cb70c9c-8c2e-40ea-b04c-b741ab1bee64″,
ColumnNames = List.Transform ( Transformations, each _{0} ),
PrefixedColumnNames = List.Transform ( ColumnNames, each TempPrefix & _ ),
CreateNewColumns = List.Accumulate (
Transformations,
SourceTable,
( CurrentTable, CurrentColumn ) =>
Table.AddColumn ( CurrentTable, TempPrefix & CurrentColumn{0}, each CurrentColumn{1}( _ ), CurrentColumn{2}? ?? type any )
),
RemoveOldColumns = Table.RemoveColumns ( CreateNewColumns, ColumnNames, MissingField.Ignore ),
RenameNewColumns = Table.RenameColumns ( RemoveOldColumns, List.Zip ( { PrefixedColumnNames, ColumnNames } ) ),
Result = Table.ReorderColumns ( RenameNewColumns, Table.ColumnNames ( SourceTable ) )
in
Result,
documentation = [
Documentation.Name = ” Table_TransformMultipleColumns “,
Documentation.Description = “Applies transformation(s) of the form { column, transform, type }.”,
Documentation.LongDescription
= “Returns a table from the input
table
by applying the transform operation to the column specified in the parametertransformOperations
(where format is { column name, transformation, optional type }). If the column doesn’t exist, a new column with that name will be created.”,Documentation.Category = ” Table “,
Documentation.Source = ” http://www.TheBIcountant.com “,
Documentation.Version = ” 1.3 “,
Documentation.Author = ” Imke Feldmann, modified by Alex Groberman ”
]
in
Value.ReplaceType ( func, Value.ReplaceMetadata ( Value.Type ( func ), documentation ) )
Here’s another surprise, when testing against an Oracle DB, this custom function actually folds even when an identical call using the built-in Table.TransformColumns does not fold. Pretty nifty!