Replace adjacent identical tokens that match a regex

Sorry, I was working on this answer before I saw your first comment. If it doesn't answer your question, let me know and I'll remove it or try to modify it accordingly.

For the simple input provided in the question (stored in the my_string variable in the code below), you could try a different approach: walk the list of words and, for each run of adjacent identical words, keep a "bucket" of <matching_word, num_of_occurrences>:

my_string = "xyz abc abc zzq ak9 ak9 ak9 foo abc"
my_splitted_string = my_string.split(' ')
occurrences = []
print("my_splitted_string is a %s now containing: %s"
      % (type(my_splitted_string), my_splitted_string))

# Start the first bucket with the first word; each bucket is [word, count].
current_bucket = [my_splitted_string[0], 1]
occurrences.append(current_bucket)
for i in range(1, len(my_splitted_string)):
    current_word = my_splitted_string[i]
    print("Does %s match %s?" % (current_word, current_bucket[0]))
    if current_word == current_bucket[0]:
        # Same word as the previous one: extend the current run.
        current_bucket[1] += 1
        print("It does. Aggregating")
    else:
        # Different word: start a new run.
        current_bucket = [current_word, 1]
        occurrences.append(current_bucket)
        print("It doesn't. Creating a new 'bucket'")

print("Collected occurrences: %s" % occurrences)
# Now re-collect:
re_collected_str = ""
for occurrence in occurrences:
    if occurrence[1] > 1:
        re_collected_str += "%s*%d " % (occurrence[0], occurrence[1])
    else:
        re_collected_str += "%s " % occurrence[0]
print("Compressed string: '%s'" % re_collected_str)

This outputs (under Python 2, hence the <type 'list'> on the first line):

my_splitted_string is a <type 'list'> now containing: ['xyz', 'abc', 'abc', 'zzq', 'ak9', 'ak9', 'ak9', 'foo', 'abc']
Does abc match xyz?
It doesn't. Creating a new 'bucket'
Does abc match abc?
It does. Aggregating
Does zzq match abc?
It doesn't. Creating a new 'bucket'
Does ak9 match zzq?
It doesn't. Creating a new 'bucket'
Does ak9 match ak9?
It does. Aggregating
Does ak9 match ak9?
It does. Aggregating
Does foo match ak9?
It doesn't. Creating a new 'bucket'
Does abc match foo?
It doesn't. Creating a new 'bucket'
Collected occurrences: [['xyz', 1], ['abc', 2], ['zzq', 1], ['ak9', 3], ['foo', 1], ['abc', 1]]
Compressed string: 'xyz abc*2 zzq ak9*3 foo abc '

(Beware of the trailing space at the end of the compressed string; call re_collected_str.rstrip() if you need to remove it.)
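
If you want something shorter, the same bucket idea can be sketched with itertools.groupby from the standard library, which groups adjacent equal items for you. This is only a sketch, assuming Python 3.4 or newer: the function name compress_adjacent and the optional pattern argument are names I made up for illustration. When a pattern is given, only runs whose token fully matches that regex are collapsed. Note that it splits on any whitespace (split() rather than split(' ')) and builds the result with ' '.join, so there is no trailing space.

```python
import re
from itertools import groupby


def compress_adjacent(text, pattern=None):
    """Collapse runs of adjacent identical tokens into 'token*count'.

    `pattern` is an optional, illustrative regex: when given, only tokens
    that fully match it are collapsed; other runs are left untouched.
    """
    regex = re.compile(pattern) if pattern else None
    pieces = []
    # groupby yields one (word, run) pair per run of adjacent equal words.
    for word, run in groupby(text.split()):
        count = len(list(run))
        if count > 1 and (regex is None or regex.fullmatch(word)):
            pieces.append("%s*%d" % (word, count))
        else:
            # Keep runs that are not collapsed exactly as they were.
            pieces.extend([word] * count)
    return " ".join(pieces)


print(compress_adjacent("xyz abc abc zzq ak9 ak9 ak9 foo abc"))
# xyz abc*2 zzq ak9*3 foo abc
print(compress_adjacent("xyz abc abc zzq ak9 ak9 ak9 foo abc", r"[a-z]+\d"))
# xyz abc abc zzq ak9*3 foo abc
```

The second call uses a made-up pattern (letters followed by one digit) just to show the filter: only the ak9 run is collapsed, while the abc run is left alone.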

