Hi
I wrote some code to parse a JSON string like the one posted and write it in a worksheet in a table format.
I considered strings with a variable number of records, each record with a variable number of fields and we don't have a list with all the possible field names.
Remark: I did this quickly just for fun, it's not optimised for efficiency when parsing a big string
I post the assumptions at the end of the post
You can test it with your file.
Code:
Option Explicit
' PGC 2016 Parse a JSON and write it in a table format
' records like { field1, field 2, ... fieldN}, each field with name and value separated by ":"
' Ex: [{"F1": "Yes", "F2": 123}, {"F1": "No", "F3": "True", "F2" : 123}]
Sub TestJSON()
Dim sPathname As String, sText As String
Dim iFile As Integer
Dim vRecords As Variant
Dim dicFields As Object
sPathname = "c:\tmp\excel\JSON.txt"
' read the file into a string
iFile = FreeFile
Open sPathname For Input As #iFile
sText = Input$(LOF(iFile), #iFile)
Close #iFile
sText = Replace(Replace(sText, vbCr, ""), vbLf, "")
' get a dictionary object to hold the fields names and indices
Set dicFields = CreateObject("Scripting.Dictionary")
' load all the information in a table format into an array
JSONArray sText, vRecords, dicFields
' write the information in the active worksheet
Range("A1").Resize(1, dicFields.Count).Value = dicFields.keys
Range("A2").Resize(UBound(vRecords, 1), UBound(vRecords, 2)).Value = vRecords
Columns(1).Resize(, dicFields.Count).AutoFit
End Sub
Sub JSONArray(sText As String, vRecords As Variant, dicFields As Object)
Dim lRecord As Long, lField As Long
Dim regex As Object, regexRMatches As Object, regexFmatches As Object
Set regex = CreateObject("VBScript.RegExp")
regex.Pattern = "\{([^}]+)\}" ' to match a record
regex.Global = True
Set regexRMatches = regex.Execute(sText) ' gets all the records
ReDim vRecords(1 To regexRMatches.Count, 1 To 1) ' initialize the array
regex.Pattern = """([^""]+)""\s*:\s*(""([^""]+)""|\d+(\.\d+)?)" ' to match a field
For lRecord = 1 To regexRMatches.Count
Set regexFmatches = regex.Execute(regexRMatches(lRecord - 1)) ' gets the fields in the current record
For lField = 1 To regexFmatches.Count
With regexFmatches(lField - 1)
If Not dicFields.exists(.submatches(0)) Then ' if first time field appears add it to the dictionary
dicFields.Add .submatches(0), dicFields.Count + 1
ReDim Preserve vRecords(1 To UBound(vRecords, 1), 1 To dicFields.Count)
End If
vRecords(lRecord, dicFields(.submatches(0))) = .submatches(1)
End With
Next lField
Next lRecord
End Sub
This is the data in the file that I used to test:
[
{
"car": "Audi",
"price": 40000,
"color": "blue"
}, {
"car": "BMW",
"price": 35000,
"color": "black",
"damaged": "false"
"available": "true"
},
{"car": "Porsche","price": 60000,"color": "green"},
{"car": "Porsche","price": 60000,"model": "Macan S", "height(m)" : 1.624},
{"car": "Porsche","price": 40000,"color": "green"},
{"car": "Land Rover","price": 80000,"model": "Range Rover", "engine" : "LR-TDV6 3.0 litre Diesel"},
{"car": "Land Rover","price": 60000,"model": "Range Rover Sport"},
{"car": "Porsche","price": 60000,"model": "Macan S", "height(m)" : 1.624},
{"car": "Volkswagen","price": 20000,"model": "Golf", "damaged": "false","color": "black"},
]
These are my assumptions:
- the JSON string has no syntax errors
- like in the example, contains a number of records, like [ record1, record2, ... recordN]
- each record is enclosed in curly braces and contains a number or fields, like
{ fieldName1 : fieldValue1, fieldName2 : field2Value2, ... fieldNameK : fieldValueK}
- a field name is a double quoted string, like "This name"
- a field value can be either a double quoted string like "Yes" or a number value, no thousands separator, eventually with decimals, like 77 or 6543.21 (in the case of decimals with a ".")
... and
- we don't know beforehand how many records there are
- the records can have a variable number of fields in any order
- we don't have beforehand a list with the possible names of the fields
... but
- in a record a field will only appear 1 time