Replace comma with pipe in .csv file

fddekker

Board Regular
Joined
Jun 30, 2008
Messages
86
Office Version
  1. 365
Platform
  1. Windows
I have searched the internet but have not find the proper solution; however it seems that several users have the same problem. My current client receives a .csv comma delimited files from the source. In these files, some fields contains commas that appear in a text file in double quotes. Editing takes place in Excel, and the files get re-saved and then uploaded to a different software system. In order not to manipulate the files other than making necessary changes, I would like to run a macro on the csv file to replace commas that serves as delimiters with pipes, and then remove the double quotes (leaving the commas that do not serve as delimiters). I know how to delete the double quotes, but am having problems with replacing the correct commas with pipes. The reason I would like to do this as an Excel macro, is because I would like to add it to a ribbon that already contains other macros for them.

I found this on VBA express (please let me know if this was solved here on Mr Excel):
Code_zpsefvlc0oj.png

In a similar posting on Stockoverflow, some users suggest Imports System, Text, RegularExpressions. Can this be imported into VBA code in Excel?

Any help approciated
 
My approach would be to replace the commas in the Excel file before exporting (e.g. with §), then replacing the § in the output file with a comma. That way, you probably won't even have any double quotes to deal with, either.

Hi Paul, The OP commented earlier that the .txt files already exist.

Perhaps a macro that uses a hybrid of those suggestions that would avoid the need to change the Windows separator character could be:
1. Read the csv file into Excel
2. Replace the literal commas with an unused character (e.g. with §).
3. Export as csv
4. Use relatively simple VBA IO to replace commas with pipes then § with commas.
 
Upvote 0

Excel Facts

What does custom number format of ;;; mean?
Three semi-colons will hide the value in the cell. Although most people use white font instead.
Going back a step, though, the OP said:
My current client receives a .csv comma delimited files from the source. In these files, some fields contains commas that appear in a text file in double quotes. Editing takes place in Excel, and the files get re-saved and then uploaded to a different software system.
It's before the re-saving that I'm suggesting the commas should be replaced and then, before uploading to the other system, the pipe-delimited text file gets edited to restore them.
 
Upvote 0
Going back a step, though, the OP said:

It's before the re-saving that I'm suggesting the commas should be replaced and then, before uploading to the other system, the pipe-delimited text file gets edited to restore them.

Hmmm....That is what it says, but I believe there's more to it. If the users have already opened the comma delimited file and their list separators have been changed to a pipe, they could just save the file as .txt and the file should be ready to be uploaded to the other system. (The fields with the literal commas won't be wrapped in quotes because commas are not the user's list separator).

I think there's some constraint against doing that that we are missing.
 
Upvote 0
I use Perl for complex text manipulations and avoid VB. But I did come up with a regex I think works.

You want to replace all commas that have an even number of quotation marks after them. You want to save the commas that have an odd number of quotation marks after them.

This is the comma you want to replace with the pipe character '|':
Code:
,(?=(?:[^"]|"[^"]*")*$)
Here's the explanation:

, — The comma you want to replace, if...
(?= — it is followed by (start of positive look-ahead)
(?: — start of non-capturing group
[^"] — a single character that is not a quotation mark​
| Or
" — a quotation mark followed by
[^"]* — zero or more characters that are not quotation marks
" — followed by one quotation mark
) — end of non-capturing group
* — We want the non-captured group any number of times, including zero times
$ — to the end of the line​
) — end of look-ahead​

Tested in perl with these strings:
Code:
foo,"bar,baz",foo
"bar,baz",foo,foo
foo,foo,"bar,baz"
foo,foo,foo
"bar,baz","bar,baz","bar,baz"
 
Upvote 0
Looking over my previous post, I noticed I didn't test for an empty line and I didn't test for a line of commas only. My regex worked on both of these in Perl.

I also did not perhaps make clear, in my explanation of the regex, that the alternation, the pipe character in `,(?=(?:[^"]|"[^"]*")*$)`, only extends from the single character immediately preceding the pipe to the single character immediately following the pipe.
 
Upvote 0
Macropod, that is my current 'manual' way of getting around it. The problem is that using any special character leaves me exposed when somebody for one or other reason enters that in text line.
 
Upvote 0
Thanks everyone (and old man with not so old brain)
I solved another complicated search with regex, so I had high hopes for this one.

Below is my original attempt (after going through this page: https://msdn.microsoft.com/en-us/library/ms974570.aspx#scripting05_topic5
and the code you suggested. Somewhere I am making a syntax error and I just can't figure it out. Is it something stupid that I am overlooking?

Code2_zpswoevvdkh.png


If I can figure out the syntax error, I can test the scenarios.

Thanks in advance!!
 
Last edited:
Upvote 0
Looks like the double quote needs an escape (also double quote), but still gets error message. Replacing the quotes with Chr(34) seems to work.
 
Upvote 0
thisoldman .... and others..... is there a difference with regexp and PERL in handling the script? This is the results I have so far:
Code4_zpsunwgtnwc.png


It looks like the code that works in PERL does not work in regexp, or am I doing something wrong?

Thanks
 
Upvote 0
An alternative approach for you to consider:
Code:
Sub Demo()
Dim r As Long, c As Long, StrData As String, DataFile As String
With ActiveSheet.UsedRange
  For r = 1 To .Cells.SpecialCells(xlCellTypeLastCell).Row
    For c = 1 To .Cells.SpecialCells(xlCellTypeLastCell).Column
      With .Cells(r, c)
        If InStr(.Value, ",") > 0 Then
          If (Left(.Value, 1) <> Chr(34)) Or (Right(.Value, 1) <> Chr(34)) Then
            StrData = StrData & Chr(34) & .Value & Chr(34) & "|"
          Else
            StrData = StrData & .Value & "|"
          End If
        Else
          StrData = StrData & .Value & "|"
        End If
      End With
    Next
    StrData = StrData & vbCrLf
  Next
  StrData = Replace(StrData, "|" & vbCrLf, vbCrLf)
  StrData = Left(StrData, Len(StrData) - 1)
End With
DataFile = ThisWorkbook.Path & "\Data.txt"
Open DataFile For Output As #1
Print #1, StrData
Close #1
End Sub
 
Upvote 0

Forum statistics

Threads
1,225,392
Messages
6,184,699
Members
453,254
Latest member
topeb

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top