Convert JSON format file to CSV

majdi_marmas

New Member
Joined
Dec 4, 2015
Messages
17
I have a json data file which I would like to convert to CSV or excel.
The data in JSON file looks like this:

[
{
"car": "Audi",
"price": 40000,
"color": "blue"
}, {
"car": "BMW",
"price": 35000,
"color": "black",
"damaged": "false"
"available": "true"
}, {
"car": "Porsche",
"price": 60000,
"color": "green"
}
];


I would like it to be displayed as:

car price colordamagedavailable
Audi40000blue
BMW35000blackFALSETRUE
Porsche60000green

<tbody>
</tbody>








The tricky par things are:
1- I have more than 120,000 lines, most of the online convertors can handle only 1000 record
2- As you can see in some line there are more fields.

Thanks for all of this fantastic forum for your help (you've saved me several times)
I would love to donate if there is such an option to keep this forum running as is.

Many thanks in advance
 

Excel Facts

Which came first: VisiCalc or Lotus 1-2-3?
Dan Bricklin and Bob Frankston debuted VisiCalc in 1979 as a Visible Calculator. Lotus 1-2-3 debuted in the early 1980's, from Mitch Kapor.
Provided that format is strictly upheld then this will parse the data for you:

Code:
Public Sub ParseJsonFile()

Dim myCols As New Collection
Dim fh As Integer
Dim jsonLine As String
Dim nextRow As Long
Dim keyValue As Variant
Dim nextCol As Long
Dim thisCol As Long

On Error Resume Next

fh = FreeFile
Open "c:\sandbox\cars.json" For Input As #fh ' Change to be the path to your file
nextRow = 1
nextCol = 0
Do Until EOF(fh)
    Line Input #fh, jsonLine
    If Right(Trim(jsonLine), 1) = "{" Then nextRow = nextRow + 1
    If Left$(jsonLine, 1) = """" Then
        keyValue = Split(jsonLine, """: ")
        keyValue(0) = Mid(keyValue(0), 2)
        If Left(keyValue(1), 1) = """" Then keyValue(1) = Mid(keyValue(1), 2)
        If Right(keyValue(1), 1) = "," Then keyValue(1) = Left(keyValue(1), Len(keyValue(1)) - 1)
        If Right(keyValue(1), 1) = """" Then keyValue(1) = Left(keyValue(1), Len(keyValue(1)) - 1)
        thisCol = 0
        thisCol = myCols(keyValue(0))
        If thisCol = 0 Then
            nextCol = nextCol + 1
            thisCol = nextCol
            myCols.Add thisCol, keyValue(0)
            Cells(1, thisCol) = keyValue(0)
        End If
        Cells(nextRow, thisCol) = keyValue(1)
    End If
Loop

Close #fh

End Sub

WBD
 
Upvote 0
Sorry for the confusion. I added these new rows as example to make more clear: It does work with the format I provided above.
But my input is exactly like below: Is it still easy to modify the code to change this?
[{"car": "Audi","price": 40000,"color": "blue"}, {"car": "BMW","price": 35000,"color": "black","damaged": "false","available": "true"}, {"car": "Porsche","price": 60000,"color": "green"}]

Thank you very much for you help
 
Upvote 0
Sorry for the confusion. I added these new rows as example to make more clear: It does work with the format I provided above.
But my input is exactly like below: Is it still easy to modify the code to change this?
[{"car": "Audi","price": 40000,"color": "blue"}, {"car": "BMW","price": 35000,"color": "black","damaged": "false","available": "true"}, {"car": "Porsche","price": 60000,"color": "green"}]

Thank you very much for you help


Hi,

Copy and paste this Json format data in cell A1 and run this macro:

Code:
Sub json()
Dim j       As Long
Dim Rng     As Range
Dim ele     As Variant
Dim Sp      As Variant
Dim ar      As Variant

j = 1
Sp = Split(Range("A1"), ", ")

For Each ele In Sp
    j = j + 1
    ar = Split(ele, ",")
    Range("C" & j).Resize(, UBound(ar) + 1) = ar
Next

Set Rng = Range("C1").CurrentRegion

On Error Resume Next

Rng.Replace "car", "": Rng.Replace "price", ""
Rng.Replace "color", "": Rng.Replace "damaged", "": Rng.Replace "available", ""
Rng.Replace """", "": Rng.Replace "{", "": Rng.Replace "}", ""
Rng.Replace "[", "": Rng.Replace "]", "": Rng.Replace ":", ""

Range("C1:G1") = Array("Car", "Price", "Color", "Damaged", "Available")

End Sub
 
Upvote 0
Hi

I don't understand the format. There's information missing.

Can you clarify?

- In the files that you receive are there only those 5 fields ("Car", "Price", "Color", "Damaged", "Available") or are there sometimes more or different ones?

In case there are more do you have the complete list of all the fields that can appear or sometimes do you receive files with new fields?

- Do they always appear in the same order?

For ex. can you have a sequence like:

("Car", "Price", "Color", "Damaged", "Available"), ("Car", "Price", "Available", "Damaged", "Color")


Also please test the solutions posted and tell us if you already have a solution that solves you problem.
 
Upvote 0
Hi

I wrote some code to parse a JSON string like the one posted and write it in a worksheet in a table format.

I considered strings with a variable number of records, each record with a variable number of fields and we don't have a list with all the possible field names.

Remark: I did this quickly just for fun, it's not optimised for efficiency when parsing a big string
I post the assumptions at the end of the post

You can test it with your file.


Code:
Option Explicit

' PGC 2016 Parse a JSON and write it in a table format
' records like { field1, field 2, ... fieldN}, each field with name and value separated by ":"
' Ex: [{"F1": "Yes", "F2": 123}, {"F1": "No", "F3": "True", "F2" : 123}]
Sub TestJSON()
Dim sPathname As String, sText As String
Dim iFile As Integer
Dim vRecords As Variant
Dim dicFields As Object

sPathname = "c:\tmp\excel\JSON.txt"

' read the file into a string
iFile = FreeFile
Open sPathname For Input As #iFile
sText = Input$(LOF(iFile), #iFile)
Close #iFile
sText = Replace(Replace(sText, vbCr, ""), vbLf, "")

' get a dictionary object to hold the fields names and indices
Set dicFields = CreateObject("Scripting.Dictionary")

' load all the information in a table format into an array
JSONArray sText, vRecords, dicFields

' write the information in the active worksheet
Range("A1").Resize(1, dicFields.Count).Value = dicFields.keys
Range("A2").Resize(UBound(vRecords, 1), UBound(vRecords, 2)).Value = vRecords
Columns(1).Resize(, dicFields.Count).AutoFit

End Sub

Sub JSONArray(sText As String, vRecords As Variant, dicFields As Object)
Dim lRecord As Long, lField As Long
Dim regex As Object, regexRMatches As Object, regexFmatches As Object

Set regex = CreateObject("VBScript.RegExp")
regex.Pattern = "\{([^}]+)\}" ' to match a record
regex.Global = True
Set regexRMatches = regex.Execute(sText) ' gets all the records
ReDim vRecords(1 To regexRMatches.Count, 1 To 1)  ' initialize the array

regex.Pattern = """([^""]+)""\s*:\s*(""([^""]+)""|\d+(\.\d+)?)" ' to match a field

For lRecord = 1 To regexRMatches.Count
    Set regexFmatches = regex.Execute(regexRMatches(lRecord - 1)) ' gets the fields in the current record
    
    For lField = 1 To regexFmatches.Count
        With regexFmatches(lField - 1)
            If Not dicFields.exists(.submatches(0)) Then ' if first time field appears add it to the dictionary
                dicFields.Add .submatches(0), dicFields.Count + 1
                ReDim Preserve vRecords(1 To UBound(vRecords, 1), 1 To dicFields.Count)
            End If
            vRecords(lRecord, dicFields(.submatches(0))) = .submatches(1)
        End With
    Next lField
Next lRecord

End Sub


This is the data in the file that I used to test:

[
{
"car": "Audi",
"price": 40000,
"color": "blue"
}, {
"car": "BMW",
"price": 35000,
"color": "black",
"damaged": "false"
"available": "true"
},
{"car": "Porsche","price": 60000,"color": "green"},
{"car": "Porsche","price": 60000,"model": "Macan S", "height(m)" : 1.624},
{"car": "Porsche","price": 40000,"color": "green"},
{"car": "Land Rover","price": 80000,"model": "Range Rover", "engine" : "LR-TDV6 3.0 litre Diesel"},
{"car": "Land Rover","price": 60000,"model": "Range Rover Sport"},
{"car": "Porsche","price": 60000,"model": "Macan S", "height(m)" : 1.624},
{"car": "Volkswagen","price": 20000,"model": "Golf", "damaged": "false","color": "black"},
]


These are my assumptions:

- the JSON string has no syntax errors
- like in the example, contains a number of records, like [ record1, record2, ... recordN]
- each record is enclosed in curly braces and contains a number or fields, like
{ fieldName1 : fieldValue1, fieldName2 : field2Value2, ... fieldNameK : fieldValueK}
- a field name is a double quoted string, like "This name"
- a field value can be either a double quoted string like "Yes" or a number value, no thousands separator, eventually with decimals, like 77 or 6543.21 (in the case of decimals with a ".")

... and

- we don't know beforehand how many records there are
- the records can have a variable number of fields in any order
- we don't have beforehand a list with the possible names of the fields

... but
- in a record a field will only appear 1 time
 
Last edited:
Upvote 0
PGC This is an awsome solution. It works like a dream with a 240K line JSON file. It was extremely easy to modify (the bath to the json file) and use.
I don't think there is a need to optimize further, it took less than 45 seconds to convert the 240K records. If you did this for fun then how would it look like if you work seriously :))

Thank you very much,appreciate your valuable time! Next time you're in Europe anywhere around Austria let me know and I'll pay the beers :)
 
Upvote 0
You're welcome. Thanks for the feedback.

Next time you're in Europe anywhere around Austria let me know and I'll pay the beers

Austria has so many great beers!

I think that choosing the beer will be a more serious problem than processing JSON. That that will require a much harder algorithm. :)
 
Upvote 0

Forum statistics

Threads
1,221,580
Messages
6,160,625
Members
451,659
Latest member
honggamthienha

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top