Transform table column using own function in Power Query

Very often we come to situation when we need to somehow transform list elements or, what is more often, table column.

And often standard transform functions do not provide a solution.


Assume we have following table


Typically, what we would like to do in such case – remove numbers.

Usual way to solve this for column “Shapeshift”

  1. add column with function Text.Start and cut first 5 characters.
  2. Then remove original column.
  3. Then Rename new column (if we want to have same name as removed column)

It makes me sick to go this way. My intuition says that if we work with huge table, adding column is not a memory efficient operation.

There should be easier solution, just because it is Power Query ;). If you do something, there always should be a more effective solution.

Would be great to transform column in place to avoid new column adding and renaming.

Let’s try to do this.

First idea, that come in my mind – we can use one of Transform function and replace it with my own function.

Trying


OK, we got formula


Change Text.Trim to Text.Start.

Stop. Text.Start requires two arguments – text and quantity of characters.

1 try


2 try


Wow, it works!

RTFM? No 🙂 Who read boring manuals? Only trials and errors.

I wonder. When I made Trim from UI, PQ generated formula

= Table.TransformColumns(#”Changed Type”,{{“Shapeshift”, Text.Trim}})

There is no “each” word. Nevertheless, PQ understands what to do.

If I add “each”-word I get


PQ puts Text.Trim in each cell as Function object.

However, if I provide parameter to function – it works again


If initially I had received this formula, it would be obvious how to come with own function.

OK. Got that lesson. Understanding of where to use “each” word and where to use underscore sign will save some time for us on future projects.

Now, let’s make slightly more complex task.

Second column contains number of document and issue date.

Our task – get only numbers, we don’t need date.

But length of document number differs. Therefore, previous trick will not help, we need to determine length for each row separately.

In this case, it is quite easy – we need to take start of string from the beginning to the first found “space” character.

Again, we can start from Trim transformation to get initial formula


Then change formula to

= Table.TransformColumns(#”Trimmed Text”,{{“Get me if can”, each Text.Start(_, Text.PositionOf(_, ” “))}})


Text.PositionOf(_, ” “)

Applied for each row it gives us position of space.

There is small difference from same trick in Excel, where we use FIND or SEARCH function to find specific character in string.

FIND and SEARCH count from 1, when Text.PositionOf counts from 0. Therefore, when use Text.PositionOf, we don’t need to subtract 1 to exclude found space.

We are ready for more complex example. Usage of really own function.

Recently Ken Puls published great function PowerTrim. Unfortunately, standard PQ library doesn’t contain function that remove multiple spaces like TRIM in Excel, despite it is very often scenario of data cleaning.

Define

PowerTrim = (text as text, optional char_to_trim as text) =>

let

char = if char_to_trim = null then ” ” else char_to_trim,

split = Text.Split(text, char),

removeblanks = List.Select(split, each _ <> “”),

result=Text.Combine(removeblanks, char)

in

result

Prepare table for tests


Query this table and go to Advanced Editor


Add PowerTrim in code


Now we ready to repeat same trick with Table.TransformColumns, but with PowerTrim is this time


We get


Replace function with PowerTrim


Cheers! It works! Spaces gone!

What is interesting, PowerTrim is defined as function with “1.5” parameters. Second in optional.

PowerTrim = (text as text, optional char_to_trim as text) =>

When I call this function, I don’t provide any parameter.

Therefore we can conclude, that such way of function call works for functions with one parameter. Or, in other words, if function has one parameter, we can omit it.

Both following expressions work, and give same result

= Table.TransformColumns(#”Changed Type”,{}, PowerTrim)

= Table.TransformColumns(#”Changed Type”,{}, each PowerTrim(_))

But if we want to use second parameter, than only option with “each” word works

= Table.TransformColumns(#”Changed Type”,{}, each PowerTrim(_, “t”))


These scenarios are very often, at least in my experience. I’m sure, this trick will save you a lot of time.

For those who use GitHub I included PowerTrim function in my repository

https://github.com/IvanBond/pquery/blob/master/Text.PowerTrim.m

Currently Load.m function offers to load function from local or network drive, but who denies to load function directly from GitHub raw file? More about this way in next post.

Sample workbook can be downloaded here.

UPD: there is a new post on this topic in my blog.

Advertisements

3 thoughts on “Transform table column using own function in Power Query

  1. Dane Chris 2017-06-27 / 18:59

    Hi,

    For this line result=Text.Combine(removeblanks, char)
    I do think this will add an extra space (if char is null) after the last word so applying powertrim to “how are you?” will result to “how are you ?”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s