# Power Query: Split string, manipulate items and combine



## Jon von der Heyden (Feb 24, 2016)

Hi

I'm trying to get to grips with M.

I have an alphanumeric string, e.g AB123C123D12

I wish to split it into {A;B;1;2;3;C;1;2;3;D;1;2}

Such that I can then manipulate elements e.g:

Return Asc code for each A-Z character so it becomes: {65;66;1;2;3;67;1;2;3;68;1;2}

Then I want to combine it and split it into single 1 char lengths again, so that it becomes {6;5;6;6;1;2;3;6;7;1;2;3;6;8;1;2}

And then finally add it up to get 65

I'm not necessarily asking for the whole function, rather just some pointers.  Unsure if I have this correct but here's what I figure will split the string initially;

```
Split=Splitter.SplitTextByRepeatedLengths(1),
Chars = Split(MyString)
```

This is all to be contained in a function so MyString will be the function argument.

Assuming the Split function I have chosen works, the bit I am struggling with is to loop through each item to perform the manipulation.

Thanks
Jon


----------



## ImkeF (Feb 24, 2016)

A quick & dirty one here:


```
(Textstring)=>
let
    Source = Table.FromValue(Textstring),
    SplitText = Table.AddColumn(Source, "Custom", each Text.ToList([Value])),
    ExpandList = Table.ExpandListColumn(SplitText, "Custom"),
    TransferToCharNum = Table.AddColumn(ExpandList, "Custom.1", each try Number.From([Custom]) otherwise Character.ToNumber([Custom])),
    SplitAndTranfToNumber = Table.AddColumn(TransferToCharNum, "Custom.2", each List.Transform(Text.ToList(Text.From([Custom.1])), each Number.From(_))),
    Sum = List.Sum(Table.ExpandListColumn(SplitAndTranfToNumber, "Custom.2")[Custom.2])
in
    Sum
```

Contains a lot of text-number-conversion - for very large dataset it might make sense to optimize the code


----------



## Jon von der Heyden (Feb 24, 2016)

Thanks Imke, very much!

I'm going to try and get to grips with this now.  I'm not sure if I have the scope to add/expand/transform columns from a function, but I guess I will find that out shortly.  A few questions please, given that you have referenced these and I haven't yet found guidance on these on MSDN M formula reference:

What is => referred to as and what exactly does it do?  I see it referenced when defining a function, but I don't actually have the foggiest otherwise.

What does underscore do / mean?

From an optimisation perspective, will adding columns add much overhead?  And if so can it be done in memory?

Sorry for silly questions - I'm only just getting to grips with M.  I think (hope) I'll manage to understand the rest.


----------



## ImkeF (Feb 24, 2016)

This function returns a single number and is supposed to run in a column of the table that contains your text strings in another column. Running it would mean to populate this column with the result in each row. So you add a Custom Column to this table like this:

FunctionCall = Table.AddColumn(Source, "Sum", fnFunction([ColumnNameThatContainsYourString])

Provided that the query with your function is called "fnFunction".

Good blogpost on functions: http://www.mattmasson.com/2014/11/iterating-over-an-unknown-number-of-pages-in-power-query/

"What is => referred to as and what exactly does it do? I see it referenced when defining a function, but I don't actually have the foggiest otherwise. "
Answer: It follows this list of parameters/variables that are passed to the function. In this example it is only one: Your Textstring. So it actually creates the function type.

"What does underscore do / mean?"
Answer: It references the current environment and is a shorthand (sorry, cannot find a link where this is described, but maybe in here: http://pqreference.azurewebsites.net/PowerQueryFormulaLanguageSpecificationAugust2015.pdf). 
I've used it in  List.Transform
Example: List.Transform ({1, 2}, each _ + 1) equals { 2, 3 } Here it means that for every item in that list (1, 2) a transformation will be executed that takes the (current) item and adds 1. 
In our example we use it to transform to number format : List.Transform (..ourList.., each Number.From(_)) - so for every item in the list, the Number.From-operation will take place

" From an optimisation perspective, will adding columns add much overhead? And if so can it be done in memory?"
Answer: Sorry, don't understand these questions.

hope this helps a bit


----------



## Jon von der Heyden (Feb 24, 2016)

Helps tremendously Imke.  I'll be sure to follow up once I've got the function working.


----------



## ImkeF (Feb 24, 2016)

Have a look: https://www.dropbox.com/s/m0il4joal4mpy1t/ME_SplitStringAddChar.xlsx?dl=0


----------



## Jon von der Heyden (Mar 24, 2016)

ImkeF said:


> A quick & dirty one here:
> 
> 
> ```
> ...


Hi again Imke

I finally had a chance to revisit this.

I still have a wee little more to do on it.  SplitAndTranfToNumber is producing a list of each numeric character in the record.  E.g. 16 becomes {1,6}.  And you did this so that you could solve the Sum question.  For my next lesson I wonder if rather than doing that, whether or not we can concatenate all of the digital values into a string again?

AB123
into...
{A,B,1,2,3}
and then convert that into...
{65,66,1,2,3}
and then convert that into...
6566123

It's the concatenation bit that I am stuck with...  I thought this was best described as 'concatenating record values', but web searches don't provide me any good references.


----------



## ImkeF (Mar 24, 2016)

Sure, no prob: Instead of List.Sum(... you take:

Text.Combine(Table.TransformColumnTypes(SplitAndTranfToNumber,{{"Custom.2", type text}})[Custom.2])

So you basically combine your numbers, but as this command doesn't work on numbers, but on text only, you have to convert your numbers to text first. We can nest that step in.


----------



## Jon von der Heyden (Mar 24, 2016)

Absolutely spot on, thank you Imke


----------

