Mean length of largest palindromic substring

Here’s a quick one. The problem is to find out what is the mean length of the largest palindromic substring from a x-character string. (I saw the problem at this Google+ post, where it’s stated with numbers, not exactly strings.)

Solving the problem analytically is definitely complicated. So I wanted to make a program to actually calculate the statistics of the length of the largest palindromic substrings, to check out the theories.

The difficult part of the program is finding out the length of the largest palindromic substring of the given strings. Here is how it works: I start with a list of all the 1-digit and 2-digit palindromic substrings from the input. The 1-digit ones are each character. The 2-digit ones are the simply the characters whose following one is equal to it. The list has maximum size of 2*x-1 entries. Each entry is a tuple, with the length of the substring, and the index of the first digit.

After this list is initialized, we grow each of these substrings in both directions if the previous and following characters are equal. If the substring reaches the limits of the string, or if the next characters are different from each other, the substring stops growing.

At each iteration with build a new list from the previous, so the smaller palindromic substring get dropped. At each iteration we record the length of the current largest substring, and that is the output of the routine.

I pasted the code at http://pastebin.com/ZyfmtYt9 too.

from pylab import *

def largest_palindrome(x,num):
    ## Initialize  palindrome candidates list with the unit palindromes
    seeds = [(1,k) for k in range(x)]
    ## Initial "even" palindrome middles
    seeds += [(2,k) for k in range(x-1) if num[k]==num[k+1]]

    while seeds != []:
        output = seeds[-1][0]
        newseeds = []
        for l,k in seeds:
            if k>0 and k+l<x-1:
                if num[k-1]==num[k+l]:
                    newseeds.append((l+2,k-1))
        seeds = newseeds

    return output


def iteration(x):
    num = randint(0,10,x)
    return largest_palindrome(x,num)


if __name__ == '__main__':
    ion()

    Nsmp = 10000
    Lx = [3,4,5,6,7,8,9,10,32,100,316,1000]

    suptitle('Cumulative probabilities of largest palindromes', size=16, fontweight='bold')
    axes([.1,.2,.8,.7])
    kk=0
    Lll=[]
    meanz = []
    for x in Lx:
        kk+=0.04
        cdf = sort([iteration(x) for k in range(Nsmp)])
        p=mgrid[1:Nsmp+1]/(Nsmp+1.)
        ll=plot(cdf+kk,p,'-')[0]
        Lll.append(ll)
        meanz.append(mean(cdf))

    figlegend(Lll,Lx, 'lower center', ncol=6, title='Values of x (number of digits)')
    grid()
    plot([0,10],[.5,.5],'k--')
    xlim(0,10)
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s