I wrote a solution to this challenge based on this answer. It successfully handles the example case given, but not the actual case.
Challenge:
Given two strings
sandt,tis a substring ofsiftis contained as a contiguous collection of symbols ins(as a result,tmust be no longer thans).The position of a symbol in a string is the total number of symbols found to its left, including itself (e.g., the positions of all occurrences of 'U' in "AUGCUUCAGAAAGGUCUUACG" are 2, 5, 6, 15, 17, and 18). The symbol at position
iofsis denoted bys[i].A substring of
scan be represented ass[j:k], wherejandkrepresent the starting and ending positions of the substring ins; for example, ifs = "AUGCUUCAGAAAGGUCUUACG", thens[2:5] = "UGCU".The location of a substring
s[j:k]is its beginning positionj; note thattwill have multiple locations insif it occurs more than once as a substring ofs(see the Sample below).
Given:
Two DNA strings
sandt(each of length at most 1 kbp).
Return:
All locations of
tas a substring ofs.
Sample Dataset:
GATATATGCATATACTT
ATAT
Sample Output:
2 4 10
For the sample, it works. Sure, you need to trim the formatting by hand, but that's only a couple of seconds of work.
The actual data and my generated output are not accepted.
Actual Dataset:
CAAATAGTCACACAATAGTCGGCTAAATAGTCAATAGTCAAATAGTCAGAGCTAATAGTCTAAATAGTCGAAAAATAGTCATCAATAGTCTAAATAGTCAATAGTCGGAATAGTCAAAATAGTCAATAGTCAATAGTCAATAGTCGACTAAATAGTCCCAATAGTCTCAGAAATAGTCAATAGTCGTAATAGTCAATAGTCTAATAGTCTAATAGTCCAATAGTCTGTCAAATAGTCAATAGTCCAATAGTCGTTTAATAGTCCCCTTTACCAATAGTCAATAGTCCGAATAGTCAGGAATAGTCAGCACTAATAGTCAATAGTCCTAATAGTCCCAATAGTCAAAATAGTCAATAGTCTAAATAATAGTCCTAGCAGAAGAATAGTCTAATAGTCGGCAATAGTCAATAGTCAAATAGTCAGAATAGTCAAATAGTCGAAATAGTCAATAGTCAATAGTCAAATAGTCAAATAGTCAATAGTCAAATAGTCAAATAGTCAAATAGTCGAATAGTCTGTAATAGTCAATAGTCCTTCAATAGTCTAATAGTCATTCAATAGTCAAGAAATAGTCGGGGGAATAGTCCGAATAGTCAAATAGTCAATAGTCGAATAGTCTAATAGTCAATAGTCTAATAGTCTGATAATAGTCAAATAGTCAATAGTCTAAATAGTCGCCTATGCCAATAGTCTTATCAAATAGTCTCTTAATAGTCTAATAGTCAATAGTCAATAGTCTAATAGTCATAATAGTCAATAGTCAAGGAATAGTCCCATAATAGTCAATAGTCTTAATAGTCCAAACGAAATAGTCTTAATAGTCCCTAATAGTCACTAATAGTCGTAATAGTCATAATAGTCCAATAGTCTAAATAGTCTGCAATAGTCAAATAGTCAAATAGTCCGTACAATAGTCTTAATAGTCTTTGCGGCTCAATAGTCTCATAATAGTC
AATAGTCAA
Actual Output (trimmed):
26 33 93 109 118 125 132
Code:
def find_substring_locations(long, short)
mpos = []
re = Regexp.new(short)
m = i = 0
m = re.match( long, i ) { |k| j = k.begin(0); i = j + 1; mpos << j } while m
return mpos
end
def plus_one(input)
arr = []
for i in input
arr << (i += 1)
end
arr
end
main_string = gets.chomp
sub_string = gets.chomp
plus_one(find_substring_locations(main_string, sub_string))
Where did I go wrong? It appears to be in order.
EDIT: Turns out the problem was a hiccup in the environment. After a reboot the problem could not be reproduced.