tjdickinson
Board Regular
- Joined
- Jun 26, 2021
- Messages
- 61
- Office Version
- 365
- Platform
- Windows
The fact that Excel is unable to identify and correctly sort hierarchical numbering (ex. 1.1, 1.2, 1.2.1, 1.2.2, 1.3...) is an age-old problem which has been widely discussed in many fora. There are numerous workarounds, but they are often project-specific or formula-based. At the moment, I have a long, ugly, suboptimal, and restrictive formula, but I would like to have a UDF which accomplishes the same thing more efficiently, and which I can call whenever I need it (rather than having to reuse and adapt the formula each time).
The formula I have is based on a workaround posted here by Steve K, and demonstrated in the YouTube link posted by Nehemiah. It takes a number (ex. 1.1.1) and converts it into the format 00.##.##.## (thus, ex. 01.01.01). In the following sheet, the formula is in column G and is independent of all columns except A. Columns B-F contain segments of the formula to visualise how it works: it decomposes the input (column A) into hierarchical levels with the format 00 (columns B-E), and then joins them together with the delimiter "." (column F). In the final formula, then, is also a function to trim the string, removing the final ".0". (I suppose it's not necessary to begin with; it's an artefact of the 'sort by columns B-E' method described in the aforementioned YouTube link. But the formula is so unwieldy that I can't be bothered to take it out.) Note that all cells have the format 'text'.
The following table shows the result of A-Z sorting column A (incorrect sequence) and column G (correct sequence).
So, the formula works, but there are loads of problems with it:
Thank you very much in advance for your help. I really appreciate it!
The formula I have is based on a workaround posted here by Steve K, and demonstrated in the YouTube link posted by Nehemiah. It takes a number (ex. 1.1.1) and converts it into the format 00.##.##.## (thus, ex. 01.01.01). In the following sheet, the formula is in column G and is independent of all columns except A. Columns B-F contain segments of the formula to visualise how it works: it decomposes the input (column A) into hierarchical levels with the format 00 (columns B-E), and then joins them together with the delimiter "." (column F). In the final formula, then, is also a function to trim the string, removing the final ".0". (I suppose it's not necessary to begin with; it's an artefact of the 'sort by columns B-E' method described in the aforementioned YouTube link. But the formula is so unwieldy that I can't be bothered to take it out.) Note that all cells have the format 'text'.
Book3.xlsx | |||||||||
---|---|---|---|---|---|---|---|---|---|
A | B | C | D | E | F | G | |||
1 | Numbering | Level 1 | Level 2 | Level 3 | Level 4 | Joined | Trimmed | ||
2 | 1 | 01 | 0 | 01.0 | 01 | ||||
3 | 1.1 | 01 | 01 | 0 | 01.01.0 | 01.01 | |||
4 | 1.1.0 | 01 | 01 | 0 | 0 | 01.01.0.0 | 01.01.0 | ||
5 | 1.1.1 | 01 | 01 | 01 | 0 | 01.01.01.0 | 01.01.01 | ||
6 | 1.2 | 01 | 02 | 0 | 01.02.0 | 01.02 | |||
7 | 1.10 | 01 | 10 | 0 | 01.10.0 | 01.10 | |||
8 | 1.10.0 | 01 | 10 | 0 | 0 | 01.10.0.0 | 01.10.0 | ||
9 | 1.10.1 | 01 | 10 | 01 | 0 | 01.10.01.0 | 01.10.01 | ||
10 | 1.10.10 | 01 | 10 | 10 | 0 | 01.10.10.0 | 01.10.10 | ||
11 | 2.1.1 | 02 | 01 | 01 | 0 | 02.01.01.0 | 02.01.01 | ||
12 | 10.1.1 | 10 | 01 | 01 | 0 | 10.01.01.0 | 10.01.01 | ||
Sheet1 |
Cell Formulas | ||
---|---|---|
Range | Formula | |
B2:B12 | B2 | =IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))=0,IF(LEN(A2)>1,A2,"0"&A2),IF(LEN(LEFT(A2,SEARCH(".",A2)-1))>1,LEFT(A2,LEN(LEFT(A2,SEARCH(".",A2)-1))),"0"&LEFT(A2,LEN(LEFT(A2,SEARCH(".",A2)-1))))) |
C2:C12 | C2 | =IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))=0,"0",IFERROR(IF(LEN(MID(A2,SEARCH(".",A2)+1,FIND("~",SUBSTITUTE(A2,".","~",2))-(SEARCH(".",A2)+1)))=1,"0"&MID(A2,SEARCH(".",A2)+1,FIND("~",SUBSTITUTE(A2,".","~",2))-(SEARCH(".",A2)+1)),MID(A2,SEARCH(".",A2)+1,FIND("~",SUBSTITUTE(A2,".","~",2))-(SEARCH(".",A2)+1))),IF(LEN(RIGHT(A2,LEN(A2)-SEARCH(".",A2)))=1,"0"&RIGHT(A2,1),RIGHT(A2,LEN(A2)-SEARCH(".",A2))))) |
D2:D12 | D2 | =IF(IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))=0,"",IF(IFERROR(FIND("~",SUBSTITUTE(A2,".","~",2)),"")="","0",IF(LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))=1,"0"&RIGHT(A2,LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))),RIGHT(A2,LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))))))="00","0",IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))=0,"",IF(IFERROR(FIND("~",SUBSTITUTE(A2,".","~",2)),"")="","0",IF(LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))=1,"0"&RIGHT(A2,LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))),RIGHT(A2,LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))))))) |
E2:E12 | E2 | =IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))<2,"","0") |
F2:F12 | F2 | =TEXTJOIN(".",TRUE,IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))=0,IF(LEN(A2)>1,A2,"0"&A2),IF(LEN(LEFT(A2,SEARCH(".",A2)-1))>1,LEFT(A2,LEN(LEFT(A2,SEARCH(".",A2)-1))),"0"&LEFT(A2,LEN(LEFT(A2,SEARCH(".",A2)-1))))),IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))=0,"0",IFERROR(IF(LEN(MID(A2,SEARCH(".",A2)+1,FIND("~",SUBSTITUTE(A2,".","~",2))-(SEARCH(".",A2)+1)))=1,"0"&MID(A2,SEARCH(".",A2)+1,FIND("~",SUBSTITUTE(A2,".","~",2))-(SEARCH(".",A2)+1)),MID(A2,SEARCH(".",A2)+1,FIND("~",SUBSTITUTE(A2,".","~",2))-(SEARCH(".",A2)+1))),IF(LEN(RIGHT(A2,LEN(A2)-SEARCH(".",A2)))=1,"0"&RIGHT(A2,1),RIGHT(A2,LEN(A2)-SEARCH(".",A2))))),IF(IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))=0,"",IF(IFERROR(FIND("~",SUBSTITUTE(A2,".","~",2)),"")="","0",IF(LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))=1,"0"&RIGHT(A2,LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))),RIGHT(A2,LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))))))="00","0",IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))=0,"",IF(IFERROR(FIND("~",SUBSTITUTE(A2,".","~",2)),"")="","0",IF(LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))=1,"0"&RIGHT(A2,LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))),RIGHT(A2,LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))))))),IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))<2,"","0")) |
G2:G12 | G2 | =LEFT(TEXTJOIN(".",TRUE,IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))=0,IF(LEN(A2)>1,A2,"0"&A2),IF(LEN(LEFT(A2,SEARCH(".",A2)-1))>1,LEFT(A2,LEN(LEFT(A2,SEARCH(".",A2)-1))),"0"&LEFT(A2,LEN(LEFT(A2,SEARCH(".",A2)-1))))),IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))=0,"0",IFERROR(IF(LEN(MID(A2,SEARCH(".",A2)+1,FIND("~",SUBSTITUTE(A2,".","~",2))-(SEARCH(".",A2)+1)))=1,"0"&MID(A2,SEARCH(".",A2)+1,FIND("~",SUBSTITUTE(A2,".","~",2))-(SEARCH(".",A2)+1)),MID(A2,SEARCH(".",A2)+1,FIND("~",SUBSTITUTE(A2,".","~",2))-(SEARCH(".",A2)+1))),IF(LEN(RIGHT(A2,LEN(A2)-SEARCH(".",A2)))=1,"0"&RIGHT(A2,1),RIGHT(A2,LEN(A2)-SEARCH(".",A2))))),IF(IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))=0,"",IF(IFERROR(FIND("~",SUBSTITUTE(A2,".","~",2)),"")="","0",IF(LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))=1,"0"&RIGHT(A2,LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))),RIGHT(A2,LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))))))="00","0",IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))=0,"",IF(IFERROR(FIND("~",SUBSTITUTE(A2,".","~",2)),"")="","0",IF(LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))=1,"0"&RIGHT(A2,LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))),RIGHT(A2,LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))))))),IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))<2,"","0")),LEN(TEXTJOIN(".",TRUE,IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))=0,IF(LEN(A2)>1,A2,"0"&A2),IF(LEN(LEFT(A2,SEARCH(".",A2)-1))>1,LEFT(A2,LEN(LEFT(A2,SEARCH(".",A2)-1))),"0"&LEFT(A2,LEN(LEFT(A2,SEARCH(".",A2)-1))))),IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))=0,"0",IFERROR(IF(LEN(MID(A2,SEARCH(".",A2)+1,FIND("~",SUBSTITUTE(A2,".","~",2))-(SEARCH(".",A2)+1)))=1,"0"&MID(A2,SEARCH(".",A2)+1,FIND("~",SUBSTITUTE(A2,".","~",2))-(SEARCH(".",A2)+1)),MID(A2,SEARCH(".",A2)+1,FIND("~",SUBSTITUTE(A2,".","~",2))-(SEARCH(".",A2)+1))),IF(LEN(RIGHT(A2,LEN(A2)-SEARCH(".",A2)))=1,"0"&RIGHT(A2,1),RIGHT(A2,LEN(A2)-SEARCH(".",A2))))),IF(IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))=0,"",IF(IFERROR(FIND("~",SUBSTITUTE(A2,".","~",2)),"")="","0",IF(LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))=1,"0"&RIGHT(A2,LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))),RIGHT(A2,LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))))))="00","0",IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))=0,"",IF(IFERROR(FIND("~",SUBSTITUTE(A2,".","~",2)),"")="","0",IF(LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))=1,"0"&RIGHT(A2,LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))),RIGHT(A2,LEN(A2)-FIND("~",SUBSTITUTE(A2,".","~",2))))))),IF(LEN(A2)-LEN(SUBSTITUTE(A2,".",""))<2,"","0")))-2) |
The following table shows the result of A-Z sorting column A (incorrect sequence) and column G (correct sequence).
Sorted-A | Sorted-G |
1 | 01 |
1.1 | 01.01 |
1.1.0 | 01.01.0 |
1.1.1 | 01.01.01 |
1.10 | 01.02 |
1.10.0 | 01.10 |
1.10.1 | 01.10.0 |
1.10.10 | 01.10.01 |
1.2 | 01.10.10 |
10.1.1 | 02.01.01 |
2.1.1 | 10.01.01 |
So, the formula works, but there are loads of problems with it:
- it is restricted to three levels of hierarchy, and two digits per level
- it is extremely repetitive and thus suboptimal
- it contains many references to cell A2, which makes it cumbersome to reuse in other sheets (when A2 isn't the desired reference)
- it is difficult to modify and debug
- an efficient function
- that converts an input from a cell or array
- from a format '##.##.##'...
- even better if it can identify any non-numeric (or non-alphanumeric) character as a delimiter; thus it would also process inputs such as: '##/##-##.##|##'
- to a format '00.00.00'...
- all delimiters are converted to "."
- ensuring that each value in a given segment/hierarchy level has the same number of digits
- In other words: if a hierarchy level contains only single digits, then the format is '0'; if it contains any double digits, then the format is '00'; if it contains any triple digits, then the format is '000', etc. Thus, each segment/hierarchy level can have different numbers of digits, but it needs to be consistent down the column (or the sorting won't work).
- and that the returned result is displayed as text (not as a number)
Thank you very much in advance for your help. I really appreciate it!