Apologies if this has been posted before. Tried googling and searching here without luck.
So for part of my job, I'm responsible for reviewing Facebook Messenger data. Facebook provides the data in PDF or HTML format, which is pretty messy and time-consuming to review. What I'm trying to do is get the data into an Excel sheet so I can better organize, search, filter, and cull through the thousands of messages. Here is the format it is typically provided in:
Author Joe Smith
Sent 2017-03-30 22:24:03 UTC
Body are u coming to the party?
Author Jane Smith
Sent 2017-03-30 22:27:03 UTC
Body Yepper!
Author Joe Smith
Sent 2017-03-30 22:29:03 UTC
Body Great, I'll bring the soda!!!!!!!!!!!!!
Author Jane Smith
Sent 2017-03-30 22:31:03 UTC
Body You better, I'm bringing the ice!
and otherwise I'll charge you!!
And on, and on (typical returns from FB can be over 10,000 pages). I can "save as" the PDF or HTML as a txt file. When I import it into Excel, the repeating field labels (Author, Sent, Body, etc.) land in column A (A1, A2, A3, and so on) and the corresponding data lands in column B (B1, B2, B3, etc.). The catch is that if it's a long message (as in the last example above), the line wrap puts the first half of the message in the correct cell (say B3 for this example), but the other half drops down into A4.
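To make the wrap-around problem concrete, here is a rough Python sketch of one way the txt file could be regrouped before it ever reaches Excel. It assumes every message starts with an "Author" line and that any line not starting with one of the labels from the sample above belongs to the field before it. The function name `parse_messages` and the label list are just placeholders for illustration, not anything Facebook or Excel provides.

```python
import re

# Assumed labels, taken from the sample above; a real export may contain more.
FIELD_RE = re.compile(r"^(Author|Sent|Body)(?:\s+(.*))?$")


def parse_messages(lines):
    """Group 'Author / Sent / Body' lines into one dict per message.

    Lines that do not start with a known label are treated as wrapped
    continuations and appended to the most recent field.
    Limitation: a wrapped line that happens to begin with 'Author ',
    'Sent ' or 'Body ' would be misread as a new field.
    """
    messages = []
    current = None      # dict for the message being built
    last_field = None   # name of the field seen most recently
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = FIELD_RE.match(line)
        if match:
            field, value = match.group(1), match.group(2) or ""
            if field == "Author":
                # Assumption: an Author line always opens a new message.
                current = {"Author": value, "Sent": "", "Body": ""}
                messages.append(current)
            elif current is not None:
                current[field] = value
            last_field = field
        elif current is not None and last_field is not None:
            # Wrapped text: glue it onto the previous field with a space.
            current[last_field] = (current[last_field] + " " + line).strip()
    return messages
```

Called as `parse_messages(open("export.txt", encoding="utf-8"))` (the file name is hypothetical), this would return one dict per message, with the wrapped "and otherwise I'll charge you!!" line joined back onto its Body.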
What I'd like to be able to do is import the txt file into Excel and convert it to something like this:
Author     | Sent                    | Body
Joe Smith  | 2017-03-30 22:24:03 UTC | are u coming to the party?
Jane Smith | 2017-03-30 22:27:03 UTC | Yepper!
Joe Smith  | 2017-03-30 22:29:03 UTC | Great, I'll bring the soda!!!!!!!!!!!!!
Jane Smith | 2017-03-30 22:31:03 UTC | You better, I'm bringing the ice! and otherwise I'll charge you!!
Where A1 would be Author, B1 would be Sent, and C1 would be Body, and the corresponding data would be populated in the columns below those headers.
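For anyone who wants to try this outside of Excel first, the layout described here (headers in row 1, one message per row) could be produced as a CSV file that Excel opens directly. This is only a sketch using Python's standard `csv` module; the function name `write_messages_csv` and the output file name are made up for illustration, and it assumes each message has already been collected into a dict with the keys Author, Sent, and Body.

```python
import csv


def write_messages_csv(messages, path):
    """Write one row per message under the headers Author, Sent, Body."""
    # "utf-8-sig" adds a byte-order mark, which helps Excel detect UTF-8.
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=["Author", "Sent", "Body"])
        writer.writeheader()          # row 1: Author, Sent, Body
        writer.writerows(messages)    # rows 2..n: one message each


if __name__ == "__main__":
    # Two rows copied from the sample data above.
    sample = [
        {"Author": "Joe Smith", "Sent": "2017-03-30 22:24:03 UTC",
         "Body": "are u coming to the party?"},
        {"Author": "Jane Smith", "Sent": "2017-03-30 22:27:03 UTC",
         "Body": "Yepper!"},
    ]
    write_messages_csv(sample, "messages.csv")  # hypothetical output name
```

The csv module quotes any Body text that contains commas, so a message like "Great, I'll bring the soda" should stay in a single cell when the file is opened.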
Thanks in advance if anyone has a solution for this!