Find positions in a main sequence not covered by internal sequence

ehansen

New Member
Joined
Nov 18, 2009
Messages
5
Dear Mr Excel,

I have a dna sequence, one word of 10 letters (aattggccaa)

Furthermore I have two columns; the start and end position of a number of internal sequences within the main sequence shown above.

Ex
2 5 means the sequence from 2 to 5 (attg)
6 6 means the 6th position (g)

I need a way to extract the positions of the main sequence that is not:warning: covered by any of the shorter internal sequences. In this case positions 1 and 7-10.

Can you please help me. I looks easy but in the reality the main sequence is 250-300 letters and the internal represents a myriade of sequences with different lenght and starting points.

Big thanks,

Erik
 

Excel Facts

Which lookup functions find a value equal or greater than the lookup value?
MATCH uses -1 to find larger value (lookup table must be sorted ZA). XLOOKUP uses 1 to find values greater and does not need to be sorted.
see if this helps,
Excel Workbook
ABCD
1Main seqStartend
2aattggccaa25agccaa
3aaaccaggataaat59aaactaaat
4aaacccaaaccggtac711aaacccggtac
Sheet1
Excel 2003
Cell Formulas
RangeFormula
D2=REPLACE(A2,B2,C2-B2+1,"")
 
Upvote 0
Dear Sankar, thanks for your reply.

What I need is the positions (in numbers) of the letters in the main sequence not covered by any of the internal sequences.
Ex
The main sequence (10 letters) has two internal sequences with the positions 2-5 and 8-9. I would like to get an output telling me that the positions 1, 6-7 and 10 is not covered by any internal stretch of letters.

Thanks,

Erik
 
Upvote 0
could you post few lines of sample data and the expected results
 
Upvote 0
Here are a sample

The main sequence is 10 letters long: aattggccaa
It has two "internal sequences"
position 2-5; here shown in bold aattggccaa
position 7-9; here shown in bold aattggccaa

I would like to have an output file showing the positions of those letters not covered by any of the internal sequences.
The positions should be 1, 6 and 10; in bold aattggccaa;

Thus the inverse of the sum projection of the internal sequences

Thanks,

Erik
 
Upvote 0
Erik

Some questions:

1. How is your data presented/ And how do you want the results presented? Is it like my sample below? If not, please post a small screen shot or clarify in words.

2. Are you looking for a formula approach? Or a macro approach? Or either?

3. For the sample below, are my results correct?

Excel Workbook
ABCDE
1DataResults
2SequenceStartEndStartEnd
3abcdefghijklmnopq1157
4241415
58121717
61313
71616
8
Seq
 
Upvote 0

Forum statistics

Threads
1,223,231
Messages
6,170,884
Members
452,364
Latest member
springate

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top