Extracting data from multiple sheets

pthwaites

New Member
Joined
Apr 14, 2020
Messages
13
Office Version
  1. 365
Platform
  1. Windows
Hi all,

I'm looking for some help with a time-consuming issue. I have some psycholinguistic data which is contained across numerous sheets. These are outputs from a program called Psychopy. I need to extract that some parts of that data from those sheets and insert it, transposed and in some cases having mathematical calculations applied to it, into another sheet.

An example is attached. What we have in the source sheets is one cue word (i.e. words which respondents saw) per row, and then responses (I words that respondents wrote) in the same row. For "brave" there were no responses; for "wish" the responses were "god" and "rich", for example. I need these to be transposed so that we have one cue and one response per row in the new table. In the next column in the output sheet we have "iteration". This can be left blank as it's easy to fill in manually. Then we've got RespN. This is the order in which this word was given in response to its cue, and it corresponds to the number in the "typed_word_" column in the source sheets. I.e. if a word received 7 responses, like "trip" in the source sheet, then there should be 7 rows for that cue, numbered 1-7 in the order of their occurrence.

Then we've got RespStartActual and RespEndActual. These correspond simply to the "Start_Time_" and "Submit_Time_" values for that response in the source sheet. Next is RespStartRelative and RespEndRelative. These reflect when the participant started and finished their response, relative to the time that they finished their previous response to that cue only. So the RespStartRelative and RespEndRelative values for the first response to each cue are exactly the same as the RespStartActual and RespEndActual values; but subsequent responses need to have these columns calculated: RespStartRelative = RespStartActual minus the RespEndActual of the previous response (e.g., in the output sheet, G4=E4-F3), and RespEndRelative = RespEndActual minus the RespEndActual of the previous response (e.g. H4=F4-F3). Again, this calculation ONLY needs to happen for the 2nd response and above to each cue. You'll note that the "Start_Time_" and "Submit_Time_" values in the source sheets are not straightforwardly ordered - we don't have "Start_Time_1", "Start_Time_2", "Start_Time_3" etc, and then "Submit_Time_1", "Submit_Time_2", "Submit_Time_3" etc, but instead the columns are bit jumbled. This is because the software only creates a new column when it needs it, so if the first response gets only two responses (as in the example), then we'll have "Start_Time_1", "Start_Time_2", "Submit_Time_1", "Submit_Time_2" before we get "Start_Time_3" etc. So whatever solution anyone might come up with needs to be able to handle that.

Then we have "date". That can straightforwardly be taken from the "date" column of the source sheet.

I'll say again - the data is contained in multiple sheets, so I need a solution that can pull data from different sources, either all at once or in batches, and place it into one target output sheet.

I don't know a huge amount about Excel, but I figure this is going to be a sizeable task because it seems (to me at least) to be pretty complex, so if this is more than anyone would work on on a voluntary basis, maybe someone could point me in the direction of a paid service where I could get this done without too much expense?

Thanks a lot,

Peter

Source sheet:

Source sheet example.xlsx
ABCDEFGHIJKLMNOPQRSTUVWXYZAAABACADAEAFAG
1prompttrials.thisRepNtrials.thisTrialNtrials.thisNtrials.thisIndextyped_word_1typed_word_2typed_word_3typed_word_4typed_word_5typed_word_6typed_word_7start_time_1start_time_2submit_time_1submit_time_2start_time_3start_time_4start_time_5start_time_6submit_time_3submit_time_4submit_time_5submit_time_6start_time_7start_time_8submit_time_7participantsessiondateexpNamepsychopyVersionframeRate
2brave000312019_Sep_03_1124ContinuousWA3.1.560.0766634
3wish0112godrich0.0161748611.69169847.70551033712.3421038312019_Sep_03_1124ContinuousWA3.1.560.0766634
4raise0221animalfoodhousemotherfatherweakness2.983911677.188053525.1857176728.2892119214.862752516.247128218.032290524.271789815.7127750817.2309596418.816185826.4736844512019_Sep_03_1124ContinuousWA3.1.560.0766634
5ahead0334forsee2.899188645.70196840412019_Sep_03_1124ContinuousWA3.1.560.0766634
6trip0440travelhardnesslandscapehotelfriendsfamilycousin1.532526554.518811933.600333166.5702712077.9550056214.64627420.601695526.407384410.5408390418.2987599323.587166928.5589432128.842863934.031901729.776679512019_Sep_03_1124ContinuousWA3.1.560.0766634
ContinuousWA_2019_S


Output sheet:
Output sheet Example.xlsx
ABCDEFGHIJ
1CueResponseIterationRespNRespStartActualRespEndActualRespStartRelativeRespEndRelativeDate
2brave12019_Sep_03_1124
3wishgod110.0161748637.7055103370.0161748637.7055103372019_Sep_03_1124
4wishrich1211.6916983612.342103833.9861880234.6365934932019_Sep_03_1124
5raiseanimal112.9839116725.1857176722.9839116725.1857176722019_Sep_03_1124
6raisefood127.1880535218.289211922.0023358493.1034942482019_Sep_03_1124
7raisehouse1314.8627525315.712775086.573540617.423563162019_Sep_03_1124
8raisemother1416.2471281917.230959640.534353111.518184562019_Sep_03_1124
9raisefather1518.0322905318.81618580.801330891.585226162019_Sep_03_1124
10raiseweakness1624.2717898126.473684455.455604017.657498652019_Sep_03_1124
11aheadforsee112.8991886435.7019684042.8991886435.7019684042019_Sep_03_1124
12triptravel111.5325265533.600333161.5325265533.600333162019_Sep_03_1124
13triphardness124.5188119326.5702712070.9184787722.9699380472019_Sep_03_1124
14triplandscape137.95500562310.540839041.3847344163.9705678332019_Sep_03_1124
15triphotel1414.6462740218.298759934.105434987.757920892019_Sep_03_1124
16tripfriends1520.6016954923.58716692.302935565.288406972019_Sep_03_1124
17tripfamily1626.4073843928.558943212.820217494.971776312019_Sep_03_1124
18tripcousin1728.8428638729.77667950.283920661.217736292019_Sep_03_1124
19traveltrip212.50914124.19236032.50914124.19236032019_Sep_05_1108
20travelfamily225.70991278.47582961.51755244.28346932019_Sep_05_1108
21travelfood239.126095210.4758940.65026562.00006442019_Sep_05_1108
22travelmountain2410.992835614.75813850.51694164.28224452019_Sep_05_1108
23travelriver2515.392671317.30880490.63453282.55066642019_Sep_05_1108
24travelexperience2619.025560122.8751271.71675525.56632212019_Sep_05_1108
25travelchurch2724.492473626.30789071.61734663.43276372019_Sep_05_1108
26fatherfavorite215.03316448.73215275.03316448.73215272019_Sep_05_1108
27fatherlove2211.050029812.89911822.31787714.16696552019_Sep_05_1108
28fatherconsiderable2315.532946919.69900972.63382876.79989152019_Sep_05_1108
29fatherkind2421.116156523.6990291.41714684.00001932019_Sep_05_1108
30fatherlearnable2525.399169832.11442191.70014088.41539292019_Sep_05_1108
31fatherinsurance2634.532186538.11478912.41776466.00036722019_Sep_05_1108
32familymother212.01747893.60233592.01747893.60233592019_Sep_05_1108
Sheet1
Cell Formulas
RangeFormula
G4,G27:G31,G20:G25,G13:G18,G6:G10G4=E4-F3
H4,H27:H31,H20:H25,H13:H18,H6:H10H4=F4-F3
 
Hi Peter:
Extracting data with added sheet.xlsm
AAABACADAEAF
1submit_time_6submit_time_7typed_word_9typed_word_10start_time_9start_time_10
Sheet2

On sheet2, in columns AC you have mixed the columns "typed_word" with the columns "start_time" and "submit_time" and you did not mention that in the original requirement.
I do not know your data, I only follow your instructions, but if you do not mention how your data is, then obviously the macro simply will not work.
Done, I modified the code to search "word", "start" and "submit" without order.

Try this:
VBA Code:
Sub Extracting_data()
  Dim arr As Variant, w As Variant, a() As Variant, RespN As Variant
  Dim sh As Worksheet, f As Range, vacio As Boolean
  Dim i As Long, j As Long, k As Long, lr As Long, lc As Long, m As Long, n As Long
  Dim colW As Long, colD As Long
 
  arr = Array("Sheet1", "Sheet2")
  Sheets("Output").Range("A2:I" & Rows.Count).ClearContents
 
  For Each w In arr
    Set sh = Sheets(w)
    Erase a
    lr = sh.Range("A" & Rows.Count).End(3).Row
    lc = sh.Cells(1, Columns.Count).End(1).Column
    a = sh.Range("A1", sh.Cells(lr, lc)).Value2
   
    colW = WorksheetFunction.CountIf(sh.Rows(1), "typed_word*")
    Set f = sh.Range("A1", sh.Cells(1, lc)).Find("date", , xlValues, xlPart, xlByColumns, xlPrevious, False)
    colD = f.Column
    ReDim b(1 To colW * (lr - 1), 1 To 9)
    k = 1
    For i = 2 To UBound(a, 1)
      RespN = ""
      b(k, 1) = a(i, 1)           'Cue
      b(k, 9) = a(i, colD)        'Date
      vacio = True
      For j = 1 To colW
        Set f = sh.Rows(1).Find("typed_word_" & j, , xlValues, xlWhole)
        If Not f Is Nothing Then
          If sh.Cells(i, f.Column) <> "" Then
            RespN = Val(RespN) + 1
            b(k, 1) = a(i, 1)     'Cue
            b(k, 9) = a(i, colD)  'Date
            b(k, 4) = RespN       'RespN
            '
            For m = 6 To lc
              Select Case LCase(a(1, m))
                Case LCase("typed_word_") & RespN:  b(k, 2) = a(i, m)
                Case LCase("start_time_") & RespN:  b(k, 5) = a(i, m)
                Case LCase("submit_time_") & RespN: b(k, 6) = a(i, m)
              End Select
            Next m
            '
            If RespN = 1 Then
              b(k, 7) = b(k, 5)
              b(k, 8) = b(k, 6)
            Else
              b(k, 7) = b(k, 5) - b(k - 1, 6)
              b(k, 8) = b(k, 6) - b(k - 1, 6)
            End If
            '
            k = k + 1
            vacio = False
          Else
            If vacio Then k = k + 1
            Exit For
          End If
        End If
      Next j
    Next i
    Sheets("Output").Range("A" & Rows.Count).End(3)(2).Resize(k, 9).Value = b
  Next w
  '
  MsgBox "Done"
End Sub

It is important that the column structures retain what you mentioned at the beginning. That is, if you decide to put a sheet with the "date" in the first column, in the second column a "word", then "start", then 20 empty columns, then "submit", then combined cells. You need to comment on it.
 
Upvote 0

Excel Facts

Which Excel functions can ignore hidden rows?
The SUBTOTAL and AGGREGATE functions ignore hidden rows. AGGREGATE can also exclude error cells and more.
Thank you SO much for this Dante Amor. You've saved me dozens of hours of work. The sheet now works perfectly. Apologies for the oversight on the second sheet. The source sheets were all automatically generated by the software I was using, and vary as a result of the participants' actions during their tests. But I should have checked them more closely before I posted my issue. Sorry for the inconvenience, and many thanks once again.

Peter
 
Upvote 0

Forum statistics

Threads
1,225,743
Messages
6,186,777
Members
453,370
Latest member
juliewar

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top