Square-free word - Wikiwand

In combinatorics, a square-free word is a word (a sequence of symbols) that does not contain any squares. A square is a word of the form $XX$ , where $X$ is not empty. Thus, a square-free word can also be defined as a word that avoids the pattern $XX$ .

Finite square-free words

Summarize

Perspective

Binary alphabet

Over a binary alphabet $\{0,1\}$ , the only square-free words are the empty word $\epsilon ,0,1,01,10,010$ , and $101$ .

Ternary alphabet

Over a ternary alphabet $\{0,1,2\}$ , there are infinitely many square-free words. It is possible to count the number $c(n)$ of ternary square-free words of length $n$ .

More information ⁠

...

The number of ternary square-free words of length n^[1]
$n$	0	1	2	3	4	5	6	7	8	9	10	11	12
⁠ $c(n)$ ⁠	1	3	6	12	18	30	42	60	78	108	144	204	264

This number is bounded by $c(n)=\Theta (\alpha ^{n})$ , where ${\textstyle 1.3017597<\alpha <1.3017619}$ .^[2] The upper bound on $\alpha$ can be found via Fekete's Lemma and approximation by automata. The lower bound can be found by finding a substitution that preserves square-freeness.^[2]

Alphabet with more than three letters

Since there are infinitely many square-free words over three-letter alphabets, this implies there are also infinitely many square-free words over an alphabet with more than three letters.

The following table shows the exact growth rate of the $k$ -ary square-free words, rounded off to 7 digits after the decimal point, for $k$ in the range from 4 to 15:^[2]

More information alphabet size (k), growth rate ...

Growth rate of the $k$ -ary square-free words
alphabet size ( $k$ )	4	5	6	7	8	9
growth rate	2.6215080	3.7325386	4.7914069	5.8284661	6.8541173	7.8729902
alphabet size ( $k$ )	10	11	12	13	14	15
growth rate	8.8874856	9.8989813	10.9083279	11.9160804	12.9226167	13.9282035

2-dimensional words

Consider a map ${\textbf {w}}$ from $\mathbb {N} ^{2}$ to $A$ , where $A$ is an alphabet and ${\textbf {w}}$ is called a 2-dimensional word. Let $w_{m,n}$ be the entry ${\textbf {w}}(m,n)$ . A word ${\textbf {x}}$ is a line of ${\textbf {w}}$ if there exists $i_{1},i_{2},j_{1},j_{2}$ such that ${\text{gcd}}(j_{1},j_{2})=1$ , and for $t\geq 0,x_{t}=w_{{i_{1}}+{j_{1}t},{i_{2}}+{j_{2}t}}$ .^[3]

Carpi^[4] proves that there exists a 2-dimensional word ${\textbf {w}}$ over a 16-letter alphabet such that every line of ${\textbf {w}}$ is square-free. A computer search shows that there are no 2-dimensional words ${\textbf {w}}$ over a 7-letter alphabet, such that every line of ${\textbf {w}}$ is square-free.

Generating finite square-free words

Shur^[5] proposes an algorithm called R2F (random-t(w)o-free) that can generate a square-free word of length $n$ over any alphabet with three or more letters. This algorithm is based on a modification of entropy compression: it randomly selects letters from a k-letter alphabet to generate a ⁠ $(k+1)$ ⁠-ary square-free word.

algorithm R2F is
    input: alphabet size  $k\geq 2$ ,
           word length  $n>1$ 
    output: a ⁠ $(k+1)$ ⁠-ary square-free word  $w$  of length  $n$ .

    (Note that  ${\textstyle \color {gray}\Sigma _{k+1}}$  is the alphabet with letters  $\color {gray}\{1,...,k+1\}$ .)
    (For a word  $\color {gray}w\in \Sigma _{k}$ ,  $\color {gray}\chi _{w}$  is the permutation of  $\color {gray}\Sigma _{k}$  such that  $a$  precedes  $b$  in  $\color {gray}\chi _{w}$  if the 
     right most position of  $a$  in  $w$  is to the right of the rightmost position of  $b$  in  $w$ .
     For example,   $\color {gray}w=136263163\in \Sigma _{6}$  has  $\color {gray}\chi _{w}=361245$ .)

    choose  $w[1]$  in  ${\textstyle \Sigma _{k+1}}$  uniformly at random 
    set  $\chi _{w}$  to  $w[1]$  followed by all other letters of  ${\textstyle \Sigma _{k+1}}$  in increasing order
    set the number  $N$  of iterations to 0

    while  $|w|<n$  do
        choose  $j$  in  ${\textstyle \Sigma _{k}}$  uniformly at random
        append  $a=\chi _{w}[j+1]$  to the end of  $w$ 
        update  $\chi _{w}$  shifting the first  $j$  elements to the right and setting  $\chi _{w}[1]=a$ 
        increment  $N$  by  $1$ 
        if  $w$  ends with a square of rank  $r$  then
            delete the last  $r$  letters of  $w$ 

    return  $w$

Every (k+1)-ary square-free word can be the output of Algorithm R2F, because on each iteration it can append any letter except for the last letter of $w$ .

The expected number of random k-ary letters used by Algorithm R2F to construct a ⁠ $(k+1)$ ⁠-ary square-free word of length $n$ is $N=n\left(1+{\frac {2}{k^{2}}}+{\frac {1}{k^{3}}}+{\frac {4}{k^{4}}}+O\left({\frac {1}{k^{5}}}\right)\right)+O(1).$ Note that there exists an algorithm that can verify the square-freeness of a word of length $n$ in $O(n\log n)$ time. Apostolico and Preparata^[6] give an algorithm using suffix trees. Crochemore^[7] uses partitioning in his algorithm. Main and Lorentz^[8] provide an algorithm based on the divide-and-conquer method. A naive implementation may require $O(n^{2})$ time to verify the square-freeness of a word of length $n$ .

Remove ads

Infinite square-free words

Summarize

Perspective

There exist infinitely long square-free words in any alphabet with three or more letters, as proved by Axel Thue.^[9]

Examples

First difference of the Thue–Morse sequence

One example of an infinite square-free word over an alphabet of size 3 is the word over the alphabet $\{-1,0,+1\}$ obtained by taking the first difference of the Thue–Morse sequence.^[9] That is, from the Thue–Morse sequence

0,1,1,0,1,0,0,1,1,0,0,1,0,1,1,0...

one forms a new sequence in which each term is the difference of two consecutive terms of the Thue–Morse sequence. The resulting square-free word is

1,0,-1,1,-1,0,1,0,-1,0,1,-1,1,0,-1,...

(sequence A029883 in the OEIS).

Leech's morphism

Another example found by John Leech^[10] is defined recursively over the alphabet $\{0,1,2\}$ . Let ⁠ $w_{1}$ ⁠ be any square-free word starting with the letter 0. Define the words $\{w_{i}\mid i\in \mathbb {N} \}$ recursively as follows: the word $w_{i+1}$ is obtained from ⁠ $w_{i}$ ⁠ by replacing each 0 in ⁠ $w_{i}$ ⁠ with 0121021201210, each 1 with 1202102012021, and each 2 with 2010210120102. It is possible to prove that the sequence converges to the infinite square-free word

0121021201210120210201202120102101201021202102012021...

Generating infinite square-free words

Infinite square-free words can be generated by square-free morphism. A morphism is called square-free if the image of every square-free word is square-free. A morphism is called k–square-free if the image of every square-free word of length k is square-free.

Crochemore^[11] proves that a uniform morphism $h$ is square-free if and only if it is 3-square-free. In other words, $h$ is square-free if and only if $h(w)$ is square-free for all square-free $w$ of length 3. It is possible to find a square-free morphism by brute-force search.

algorithm square-free_morphism is
    output: a square-free morphism with the lowest possible rank  $k$ .

    set  $k=3$ 
    while True do
        set k_sf_words to the list of all square-free words of length  $k$  over a ternary alphabet
        for each  $h(0)$  in k_sf_words do
            for each  $h(1)$  in k_sf_words do
                for each  $h(2)$  in k_sf_words do
                    if  $h(1)=h(2)$  then
                        break from the current loop (advance to next  $h(1)$ )
                    if  $h(0)\neq h(1)$  and  $h(2)\neq h(0)$  then
                        if  $h(w)$  is square-free for all square-free  $w$  of length 3 then
                            return  $h(0),h(1),h(2)$ 
        increment  $k$  by 1

Over a ternary alphabet, there are exactly 144 uniform square-free morphisms of rank 11 and no uniform square-free morphisms with a lower rank than 11.

To obtain an infinite square-free words, start with any square-free word such as 0, and successively apply a square-free morphism $h$ to it. The resulting words preserve the property of square-freeness. For example, let $h$ be a square-free morphism, then as $w\to \infty$ , $h^{w}(0)$ is an infinite square-free word.

Note that, if a morphism over a ternary alphabet is not uniform, then this morphism is square-free if and only if it is 5-square-free.^[11]

Remove ads

Letter combinations in square-free words

Summarize

Perspective

Thumb — Extending a square-free word to avoid $ab$ .

Avoid two-letter combinations

Over a ternary alphabet, a square-free word of length more than 13 contains all the square-free two-letter combinations.^[12]

This can be proved by constructing a square-free word without the two-letter combination $ab$ . As a result, $bcba$ $cbca$ $cbaca$ is the longest square-free word without the combination $ab$ and its length is equal to 13.

Note that over a more than three-letter alphabet there are square-free words of any length without an arbitrary two-letter combination.

Avoid three-letter combinations

Over a ternary alphabet, a square-free word of length more than 36 contains all the square-free three-letter combinations.^[12]

Note that over a more than three-letter alphabet there are square-free words of any length without an arbitrary three-letter combination.

Density of a letter

The density of a letter $a$ in a finite word $w$ is defined as ${\frac {|w|_{a}}{|w|}}$ where $|w|_{a}$ is the number of occurrences of $a$ in $w$ and $|w|$ is the length of the word. The density of a letter $a$ in an infinite word is $\liminf _{l\to \infty }{\frac {|w_{l}|_{a}}{|w_{l}|}}$ where $w_{l}$ is the prefix of the word $w$ of length $l$ .^[13]

The minimal density of a letter $a$ in an infinite ternary square-free word is equal to ${\frac {883}{3215}}\approx 0.2747$ .^[13]

The maximum density of a letter $a$ in an infinite ternary square-free word is equal to ${\frac {255}{653}}\approx 0.3905$ .^[14]

Remove ads

Notes

Loading content...

References

Loading content...

Loading related searches...

Wikiwand - on

Seamless Wikipedia browsing. On steroids.

Remove ads