Extracting @mentions and #hashtags from column A to Columns B and C.

MiliJJ

New Member
Joined
Nov 30, 2012
Messages
5
I have a really large database of tweets. Most of the tweets have multiple #hashtags and @mentions. I want all the #hashtags separated with a space in one column and all the @mentions in another column. I already know how to extract the first occurrence of a #hashtag and a @mention. But I don't know to get them all? Some of the tweets have as much as 8 #hashtags. Manually going through the tweets and copy/pasting the #hashtags and @mentions seem an impossible task for over 5,000 tweets.

Here is an example of what I want. I have Column A and I want a macro that would populate columns B and C. (I'm on Windows &, Excel 2010)

[TABLE="class: grid, width: 1405"]
<tbody>[TR]
[TD] <style> <!--table {mso-displayed-decimal-separator:"\."; mso-displayed-thousand-separator:"\,";} @page {margin:1.0in .75in 1.0in .75in; mso-header-margin:.5in; mso-footer-margin:.5in;} .font5 {color:windowtext; font-size:12.0pt; font-weight:400; font-style:normal; text-decoration:none; font-family:Calibri; mso-generic-font-family:auto; mso-font-charset:0;} .font6 {color:windowtext; font-size:12.0pt; font-weight:400; font-style:normal; text-decoration:none; font-family:Calibri; mso-generic-font-family:auto; mso-font-charset:0;} td {padding-top:1px; padding-right:1px; padding-left:1px; mso-ignore:padding; color:black; font-size:12.0pt; font-weight:400; font-style:normal; text-decoration:none; font-family:Calibri, sans-serif; mso-font-charset:0; mso-number-format:General; text-align:general; vertical-align:bottom; border:none; mso-background-source:auto; mso-pattern:auto; mso-protection:locked visible; white-space:nowrap; mso-rotate:0;} .xl64 {color:windowtext; font-family:Calibri; mso-generic-font-family:auto; mso-font-charset:0;} .xl65 {color:windowtext; font-family:"Helvetica Neue"; mso-generic-font-family:auto; mso-font-charset:0;} .xl66 {color:windowtext; font-family:Calibri; mso-generic-font-family:auto; mso-font-charset:0;} --> </style> [TABLE="width: 1054"]
<tbody>[TR]
[TD="class: grid, width: 924"]Column A[/TD]
[TD="class: grid, width: 65"]Column B[/TD]
[TD="class: grid, width: 65"]Column C[/TD]
[/TR]
[TR="class: grid"]
[TD="width: 924"]Dear <s style="text-decoration:initial">#</s>DavidStern, @spurs put a quality team on the floor and should have beat the @heat. Leave <s style="text-decoration:initial">#</s>Pop alone. <s style="text-decoration:initial">#</s>Spurs a classy organization.[/TD]
[TD="width: 65"]#DavidStern #Pop #Spurs[/TD]
[TD="width: 65"]@spurs @heat[/TD]
[/TR]
[TR="class: grid"]
[TD]Live broadcast from @Nacho_xtreme: "Papelucho Radio"http://mixlr.com/nachoxtreme-radio … <s style="text-decoration:initial">#</s>mixlr <s style="text-decoration: initial">#</s>pop <s style="text-decoration:initial">#</s>dance[/TD]
[TD]#mixlr #pop #dance[/TD]
[TD]@Nacho_xtreme[/TD]
[/TR]
[TR="class: grid"]
[TD]"Since You Left" by @EmilNow now playing on KGUP 106.5FM. Listen now on The Emerge Radio Networks <s style="text-decoration:initial">#</s>Pop <s style="text-decoration:initial">#</s>Rock[/TD]
[TD]#Pop #Rock[/TD]
[TD]@EmilNow[/TD]
[/TR]
[TR="class: grid"]
[TD]Family Night <s style="text-decoration:initial">#</s>battleofthegenerations Dad has the <s style="text-decoration:initial">#</s>Monkeys Mom has <s style="text-decoration:initial">#</s>DonnieOsman @michaelbuble for me <s style="text-decoration:initial">#</s>Dubstep for the boys<s style="text-decoration:initial">#</s>Pop for sissy[/TD]
[TD]#battleofthegenerations #Monkeys #DonnieOsman #Dubstep #Pop[/TD]
[TD]@michaelbuble[/TD]
[/TR]
[TR="class: grid"]
[TD]@McKinzeepowell @m0ore21 I love that the PNW and the Midwest are on the same page!! <s style="text-decoration:initial">#</s>Pop[/TD]
[TD]#pop[/TD]
[TD]@McKinzeepowell @m0ore21[/TD]
[/TR]
</tbody>[/TABLE]
[/TD]
[TD][/TD]
[/TR]
</tbody>[/TABLE]
 

Excel Facts

Why are there 1,048,576 rows in Excel?
The Excel team increased the size of the grid in 2007. There are 2^20 rows and 2^14 columns for a total of 17 billion cells.
this should do not the nicest code but...

Code:
Sub miliJJ()

lastrow = Range("A" & Rows.Count).End(xlUp).Row

For Each cell In Range("A1:A" & lastrow)
    Range(cell.Address).Offset(0, 2).Value = get_text(Range(cell.Address), "@")
    Range(cell.Address).Offset(0, 1).Value = get_text(Range(cell.Address), "#")
Next cell
End Sub

Function get_text(rng As Range, CHR As String)

Dim Text1 As String
Count1 = Len(rng) - Len(Replace(rng, CHR, ""))
Dim j As Long
j = 1
For i = 1 To Count1
sCh = InStr(j, rng, CHR)
eCh = InStr(InStr(j, rng, CHR), rng, " ")
If eCh = 0 Then eCh = Len(rng) + 1

If Text1 = "" Then
Text1 = Mid(rng, sCh, eCh - sCh) & " "
Else
Text1 = Text1 & " " & Mid(rng, sCh, eCh - sCh)
End If
j = j + eCh - j
Next i
get_text = Text1
End Function
 
Upvote 0
Guys, I have no idea where to put this code! I have to do this for work, and really don't have a clue! :(
 
Upvote 0
Thank you so much for this Macros! It's helped me out immensely. I was wondering if you could tell me what to do for tweets where there is no space between the hashtags or @handles. For example have have a tweet that just has #AltonSterling#PhilandoCastile and when the Macros gets to that line it gives me an error. Thank you so much for your help with this already. If you have any info on how I could fix this I'd be most grateful!
-Paige
 
Upvote 0
this should do not the nicest code but...

Code:
Sub miliJJ()

lastrow = Range("A" & Rows.Count).End(xlUp).Row

For Each cell In Range("A1:A" & lastrow)
    Range(cell.Address).Offset(0, 2).Value = get_text(Range(cell.Address), "@")
    Range(cell.Address).Offset(0, 1).Value = get_text(Range(cell.Address), "#")
Next cell
End Sub

Function get_text(rng As Range, CHR As String)

Dim Text1 As String
Count1 = Len(rng) - Len(Replace(rng, CHR, ""))
Dim j As Long
j = 1
For i = 1 To Count1
sCh = InStr(j, rng, CHR)
eCh = InStr(InStr(j, rng, CHR), rng, " ")
If eCh = 0 Then eCh = Len(rng) + 1

If Text1 = "" Then
Text1 = Mid(rng, sCh, eCh - sCh) & " "
Else
Text1 = Text1 & " " & Mid(rng, sCh, eCh - sCh)
End If
j = j + eCh - j
Next i
get_text = Text1
End Function
Hello i have tried your code it works until a line without a hashtag then it gives a debug :/ can you please fix that issue? much appreciated
 
Upvote 0

Forum statistics

Threads
1,223,904
Messages
6,175,295
Members
452,632
Latest member
jladair

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top