Extract specific data from a folder of .eml files?

tpir72

New Member
Joined
Sep 25, 2011
Messages
43
Hi,
I have a folder of .eml files. They are an exported backup of my gmail account. Each .eml file is a separate email. They can be easily viewed with notepad.
I need to extract these four items. The format is always consistent mixed with other header and body info not needed.

Date:
Member logged in:
Email:
IP:

Date: Sat, 25 Feb 2012 19:25:24 -0700
Member logged in: somemember123
Email: name@gmail.com
IP: 83.121.245.221

Is there a way to scan this folder, have each email (one per row), with column data:

A1=Date:
A2=Member logged in:
A3=Email:
A4=IP:

Some emails may only have the Date: information. Is there a way to automate the process to scan the folder, only extract info if all four bits of information are available, add this data to one email message per row and ignore all other emails without all four bits of data present?

This is way over my head. I know some of you can come up with a formula in your sleep to do this.

I have thousands of emails to extract this data from.

Any help is sincerely appreciated.

Regards,

Terry
 

Excel Facts

Spell Check in Excel
Press F7 to start spell check in Excel. Be careful, by default, Excel does not check Capitalized Werds (whoops)
The following macro will loop through each .eml file in the specified folder, and list in the active worksheet the desired data for files that meet the criteria, starting at Column A. Here's a sample of the output...

A1:D3

<TABLE style="WIDTH: 410pt; BORDER-COLLAPSE: collapse" border=0 cellSpacing=0 cellPadding=0 width=546><COLGROUP><COL style="WIDTH: 148pt; mso-width-source: userset; mso-width-alt: 7204" width=197><COL style="WIDTH: 93pt; mso-width-source: userset; mso-width-alt: 4534" width=124><COL style="WIDTH: 96pt; mso-width-source: userset; mso-width-alt: 4681" width=128><COL style="WIDTH: 73pt; mso-width-source: userset; mso-width-alt: 3547" width=97><TBODY><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; WIDTH: 148pt; HEIGHT: 15pt; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64 height=20 width=197>Date</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; WIDTH: 93pt; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64 width=124>Member Logged In</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; WIDTH: 96pt; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64 width=128>Email</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; WIDTH: 73pt; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64 width=97>IP</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64 height=20>Sat, 25 Feb 2012 19:25:24 -0700</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64>somemember456</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64>name2@gmail.com</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64>83.121.245.222</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64 height=20>Sat, 25 Feb 2012 19:25:24 -0700</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64>somemember789</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65>name3@gmail.com</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64>83.121.245.225</TD></TR></TBODY></TABLE>

Here's the code, which needs to be placed in a regular module (VBE > Insert > Module)...

Code:
[FONT=Courier New][COLOR=darkblue]Option[/COLOR] [COLOR=darkblue]Explicit[/COLOR][/FONT]
 
[FONT=Courier New][COLOR=darkblue]Sub[/COLOR] test()[/FONT]
 
[FONT=Courier New][COLOR=darkblue]Dim[/COLOR] MyPath [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]String[/COLOR][/FONT]
[FONT=Courier New][COLOR=darkblue]Dim[/COLOR] MyFile [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]String[/COLOR][/FONT]
[FONT=Courier New][COLOR=darkblue]Dim[/COLOR] MyArray() [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]String[/COLOR][/FONT]
[FONT=Courier New][COLOR=darkblue]Dim[/COLOR] strData [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]String[/COLOR][/FONT]
[FONT=Courier New][COLOR=darkblue]Dim[/COLOR] strDate [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]String[/COLOR][/FONT]
[FONT=Courier New][COLOR=darkblue]Dim[/COLOR] strMember [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]String[/COLOR][/FONT]
[FONT=Courier New][COLOR=darkblue]Dim[/COLOR] strEmail [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]String[/COLOR][/FONT]
[FONT=Courier New][COLOR=darkblue]Dim[/COLOR] strIP [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]String[/COLOR][/FONT]
[FONT=Courier New][COLOR=darkblue]Dim[/COLOR] LastRow [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]Long[/COLOR][/FONT]
[FONT=Courier New][COLOR=darkblue]Dim[/COLOR] NextRow [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]Long[/COLOR][/FONT]
[FONT=Courier New][COLOR=darkblue]Dim[/COLOR] CountOfFields [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]Long[/COLOR][/FONT]
[FONT=Courier New][COLOR=darkblue]Dim[/COLOR] CountOfEmails [COLOR=darkblue]As[/COLOR] [COLOR=darkblue]Long[/COLOR][/FONT]
 
[FONT=Courier New]MyPath = "C:\Users\Domenic\Desktop\" [COLOR=green]'change the path to the folder accordingly[/COLOR][/FONT]
 
[FONT=Courier New]MyFile = Dir(MyPath & "*.eml", vbNormal)[/FONT]
 
[FONT=Courier New]CountOfEmails = 0[/FONT]
 
[FONT=Courier New][COLOR=darkblue]Do[/COLOR] [COLOR=darkblue]While[/COLOR] Len(MyFile) > 0[/FONT]
 
[FONT=Courier New]   [COLOR=darkblue]Open[/COLOR] MyPath & MyFile [COLOR=darkblue]For[/COLOR] [COLOR=darkblue]Input[/COLOR] [COLOR=darkblue]As[/COLOR] #1[/FONT]
[FONT=Courier New]       CountOfFields = 0[/FONT]
[FONT=Courier New]       [COLOR=darkblue]Do[/COLOR] [COLOR=darkblue]Until[/COLOR] EOF(1)[/FONT]
[FONT=Courier New]           Line [COLOR=darkblue]Input[/COLOR] #1, strData[/FONT]
[FONT=Courier New]           [COLOR=darkblue]If[/COLOR] UCase(Left(strData, 5)) = "DATE:" [COLOR=darkblue]Then[/COLOR][/FONT]
[FONT=Courier New]               CountOfFields = CountOfFields + 1[/FONT]
[FONT=Courier New]               strDate = Trim(Mid(strData, 6))[/FONT]
[FONT=Courier New]           [COLOR=darkblue]ElseIf[/COLOR] UCase(Left(strData, 17)) = "MEMBER LOGGED IN:" [COLOR=darkblue]Then[/COLOR][/FONT]
[FONT=Courier New]               CountOfFields = CountOfFields + 1[/FONT]
[FONT=Courier New]               strMember = Trim(Mid(strData, 18))[/FONT]
[FONT=Courier New]           [COLOR=darkblue]ElseIf[/COLOR] UCase(Left(strData, 6)) = "EMAIL:" [COLOR=darkblue]Then[/COLOR][/FONT]
[FONT=Courier New]               CountOfFields = CountOfFields + 1[/FONT]
[FONT=Courier New]               strEmail = Trim(Mid(strData, 7))[/FONT]
[FONT=Courier New]           [COLOR=darkblue]ElseIf[/COLOR] UCase(Left(strData, 3)) = "IP:" [COLOR=darkblue]Then[/COLOR][/FONT]
[FONT=Courier New]               CountOfFields = CountOfFields + 1[/FONT]
[FONT=Courier New]               strIP = Trim(Mid(strData, 4))[/FONT]
[FONT=Courier New]           [COLOR=darkblue]End[/COLOR] [COLOR=darkblue]If[/COLOR][/FONT]
[FONT=Courier New]           [COLOR=darkblue]If[/COLOR] CountOfFields = 4 [COLOR=darkblue]Then[/COLOR][/FONT]
[FONT=Courier New]               CountOfEmails = CountOfEmails + 1[/FONT]
[FONT=Courier New]               [COLOR=darkblue]ReDim[/COLOR] [COLOR=darkblue]Preserve[/COLOR] MyArray(1 [COLOR=darkblue]To[/COLOR] 4, 1 To CountOfEmails)[/FONT]
[FONT=Courier New]               MyArray(1, CountOfEmails) = strDate[/FONT]
[FONT=Courier New]               MyArray(2, CountOfEmails) = strMember[/FONT]
[FONT=Courier New]               MyArray(3, CountOfEmails) = strEmail[/FONT]
[FONT=Courier New]               MyArray(4, CountOfEmails) = strIP[/FONT]
[FONT=Courier New]               [COLOR=darkblue]Exit[/COLOR] [COLOR=darkblue]Do[/COLOR][/FONT]
[FONT=Courier New]           [COLOR=darkblue]End[/COLOR] [COLOR=darkblue]If[/COLOR][/FONT]
[FONT=Courier New]       [COLOR=darkblue]Loop[/COLOR][/FONT]
[FONT=Courier New]   [COLOR=darkblue]Close[/COLOR] #1[/FONT]
 
[FONT=Courier New]   MyFile = Dir[/FONT]
 
[FONT=Courier New][COLOR=darkblue]Loop[/COLOR][/FONT]
 
[FONT=Courier New][COLOR=darkblue]If[/COLOR] CountOfEmails > 0 [COLOR=darkblue]Then[/COLOR][/FONT]
[FONT=Courier New]   LastRow = Cells(Rows.Count, "a").End(xlUp).Row[/FONT]
[FONT=Courier New]   [COLOR=darkblue]If[/COLOR] LastRow = 1 [COLOR=darkblue]Then[/COLOR][/FONT]
[FONT=Courier New]       Range("a1").Resize(, 4).Value = Array("Date", "Member Logged In", "Email", "IP")[/FONT]
[FONT=Courier New]   [COLOR=darkblue]End[/COLOR] [COLOR=darkblue]If[/COLOR][/FONT]
[FONT=Courier New]   Cells(LastRow + 1, "a").Resize(UBound(MyArray, 2), UBound(MyArray, 1)).Value = WorksheetFunction.Transpose(MyArray)[/FONT]
[FONT=Courier New][COLOR=darkblue]Else[/COLOR][/FONT]
[FONT=Courier New]   MsgBox "No data was available...", vbInformation[/FONT]
[FONT=Courier New][COLOR=darkblue]End[/COLOR] [COLOR=darkblue]If[/COLOR][/FONT]
 
[FONT=Courier New][COLOR=darkblue]End[/COLOR] [COLOR=darkblue]Sub[/COLOR][/FONT]
 
Upvote 0
Thank you very much Dominic!

I inserted the module with this path:

MyPath = "E:\EmailBackup" 'change the path to the folder accordingly

then saved the Macro enabled workbook.

When I hit run I instantly got the popup:

Microsoft Excel
No data was available...


Did the workbook need to be in the same folder as the .eml files?

Regards,

Terry
 
Upvote 0
Did the workbook need to be in the same folder as the .eml files?

No, the workbook does not need to be in the same folder as the .eml files. I'm assuming that the four text lines of interest each start with "Date:", "Member logged in:", "Email:", and "IP:", respectively, correct?
 
Upvote 0
Can you post a representative sample of the actual text, as it appears in the .eml file? Of course, replace any personal information with fake data.
 
Upvote 0
OK, here is a random sample of the email that needs to be scanned for those four bits of data. I have removed sensitive info and replaced all with phoney IPs and email addresses.

Thank you very much for the help.

Terry

Delivered-To: name@gmail.com
Received: by 11.210.199.188 with SMTP id k38cs45339145yhg;
Sat, 1 Jan 2011 00:23:49 -0800 (PST)
Received: by 11.210.199.188 with SMTP id n10mr11296524icz.117.1293870228493;
Sat, 01 Jan 2011 00:23:48 -0800 (PST)
Return-Path: <nobody@email.com>
Received: from at2.securitysite.com (at2.securitysite.com [11.210.199.188])
by mx.google.com with ESMTPS id m1si93133789icp.124.2011.09.01.00.12.47
(version=TLSv1/SSLv3 cipher=RC4-MD5);
Sat, 01 Jan 2011 00:23:48 -0800 (PST)
Received-SPF: pass (google.com: best guess record for domain of nobody@somesite.com

designates 11.210.199.188 as permitted sender) client-ip=11.210.199.188;
Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for

domain of nobody@email.com designates 11.210.199.188 as permitted sender)

smtp.mail=nobody@email.com
Received: from name@gmail.com (stuff.whatever.com [127.0.0.1])
by name@gmail.com (8.13.8/8.13.8) with ESMTP id p018Nlna727126
for <things@gmail.com>; Sat, 1 Jan 2011 01:23:47 -0700
Received: (from nobody@localhost)
by name@gmail.com (8.13.8/8.13.8/Submit) id p018NlB9033925
for anothersite.com; Sat, 1 Jan 2011 01:23:47 -0700
Date: Sat, 1 Jan 2011 01:23:47 -0700
Message-Id: <201101022423.p018NlB2222125@server2.gmail.com>
To: me@gmail.com
Subject: Automated Name Here: member logged in
From: Whatever <nobody@asite.com>
Reply-To: nobody@asite.com

Member logged in: xx-twentyfive
Email: personalemail@address.com
IP: 11.210.199.188

PROFILE:
 
Upvote 0
Based on the sample data that you provided, the code I offered should return the desired result. Maybe the files use a LF instead of a CRLF to indicate the end of a line. If so, the FileSystemObject can be used instead to read the files. Does the following code return the desired result?

Code:
[font=Courier New][color=darkblue]Option[/color] [color=darkblue]Explicit[/color]

[color=darkblue]Sub[/color] test()

    [color=darkblue]Dim[/color] FSO [color=darkblue]As[/color] [color=darkblue]Object[/color]
    [color=darkblue]Dim[/color] TS [color=darkblue]As[/color] [color=darkblue]Object[/color]
    [color=darkblue]Dim[/color] MyPath [color=darkblue]As[/color] [color=darkblue]String[/color]
    [color=darkblue]Dim[/color] MyFile [color=darkblue]As[/color] [color=darkblue]String[/color]
    [color=darkblue]Dim[/color] MyArray() [color=darkblue]As[/color] [color=darkblue]String[/color]
    [color=darkblue]Dim[/color] strData [color=darkblue]As[/color] [color=darkblue]String[/color]
    [color=darkblue]Dim[/color] strDate [color=darkblue]As[/color] [color=darkblue]String[/color]
    [color=darkblue]Dim[/color] strMember [color=darkblue]As[/color] [color=darkblue]String[/color]
    [color=darkblue]Dim[/color] strEmail [color=darkblue]As[/color] [color=darkblue]String[/color]
    [color=darkblue]Dim[/color] strIP [color=darkblue]As[/color] [color=darkblue]String[/color]
    [color=darkblue]Dim[/color] LastRow [color=darkblue]As[/color] [color=darkblue]Long[/color]
    [color=darkblue]Dim[/color] NextRow [color=darkblue]As[/color] [color=darkblue]Long[/color]
    [color=darkblue]Dim[/color] CountOfFields [color=darkblue]As[/color] [color=darkblue]Long[/color]
    [color=darkblue]Dim[/color] CountOfEmails [color=darkblue]As[/color] [color=darkblue]Long[/color]
    
    [color=darkblue]Set[/color] FSO = CreateObject("Scripting.FileSystemObject")
    
    MyPath = "C:\Users\Domenic\Desktop\" [color=green]'change the path to the folder accordingly[/color]
    
    MyFile = Dir(MyPath & "*.eml", vbNormal)
    
    CountOfEmails = 0
    
    [color=darkblue]Do[/color] [color=darkblue]While[/color] Len(MyFile) > 0
    
        [color=darkblue]Set[/color] TS = FSO.OpenTextFile(MyPath & MyFile, 1, [color=darkblue]False[/color], -2)
        
        CountOfFields = 0
        [color=darkblue]Do[/color] [color=darkblue]Until[/color] TS.AtEndOfStream
            strData = TS.ReadLine
            [color=darkblue]If[/color] UCase(Left(strData, 5)) = "DATE:" [color=darkblue]Then[/color]
                CountOfFields = CountOfFields + 1
                strDate = Trim(Mid(strData, 6))
            [color=darkblue]ElseIf[/color] UCase(Left(strData, 17)) = "MEMBER LOGGED IN:" [color=darkblue]Then[/color]
                CountOfFields = CountOfFields + 1
                strMember = Trim(Mid(strData, 18))
            [color=darkblue]ElseIf[/color] UCase(Left(strData, 6)) = "EMAIL:" [color=darkblue]Then[/color]
                CountOfFields = CountOfFields + 1
                strEmail = Trim(Mid(strData, 7))
            [color=darkblue]ElseIf[/color] UCase(Left(strData, 3)) = "IP:" [color=darkblue]Then[/color]
                CountOfFields = CountOfFields + 1
                strIP = Trim(Mid(strData, 4))
            [color=darkblue]End[/color] [color=darkblue]If[/color]
            [color=darkblue]If[/color] CountOfFields = 4 [color=darkblue]Then[/color]
                CountOfEmails = CountOfEmails + 1
                [color=darkblue]ReDim[/color] [color=darkblue]Preserve[/color] MyArray(1 [color=darkblue]To[/color] 4, 1 To CountOfEmails)
                MyArray(1, CountOfEmails) = strDate
                MyArray(2, CountOfEmails) = strMember
                MyArray(3, CountOfEmails) = strEmail
                MyArray(4, CountOfEmails) = strIP
                [color=darkblue]Exit[/color] [color=darkblue]Do[/color]
            [color=darkblue]End[/color] [color=darkblue]If[/color]
        [color=darkblue]Loop[/color]
        
        TS.Close
                
        MyFile = Dir
        
    [color=darkblue]Loop[/color]
    
    [color=darkblue]If[/color] CountOfEmails > 0 [color=darkblue]Then[/color]
        LastRow = Cells(Rows.Count, "a").End(xlUp).Row
        [color=darkblue]If[/color] LastRow = 1 [color=darkblue]Then[/color]
            Range("a1").Resize(, 4).Value = Array("Date", "Member Logged In", "Email", "IP")
        [color=darkblue]End[/color] [color=darkblue]If[/color]
        Cells(LastRow + 1, "a").Resize(UBound(MyArray, 2), UBound(MyArray, 1)).Value = WorksheetFunction.Transpose(MyArray)
    [color=darkblue]Else[/color]
        MsgBox "No data was available...", vbInformation
    [color=darkblue]End[/color] [color=darkblue]If[/color]
    
[color=darkblue]End[/color] [color=darkblue]Sub[/color]
[/font]
 
Upvote 0
Hi Dominic,
Well I tried the updated code with correct path. Opened Excel 2010, Alt.F11, inserted it in a module, closed the window, clicked on run and got the same no data found.

I really need this to work. What am I doing wrong?

Thanks for the help.

Regards,

Terry
 
Upvote 0
Hi Dominic,
I tried it and it didn't work. Then I deleted the code, re-pasted, saved, run then it worked!

Thank you very much!

I sincerely appreciate it.

Regards,

Terry
 
Upvote 0

Forum statistics

Threads
1,223,691
Messages
6,173,851
Members
452,535
Latest member
berdex

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top