Arnoux-Rauzy-Poincaré sequences

## Arnoux-Rauzy-Poincaré sequences

26 février 2015 | Catégories: sage | View Comments

In a recent article with Valérie Berthé [BL15], we provided a multidimensional continued fraction algorithm called Arnoux-Rauzy-Poincaré (ARP) to construct, given any vector $v\in\mathbb{R}_+^3$, an infinite word $w\in\{1,2,3\}^\mathbb{N}$ over a three-letter alphabet such that the frequencies of letters in $w$ exists and are equal to $v$ and such that the number of factors (i.e. finite block of consecutive letters) of length $n$ appearing in $w$ is linear and less than $\frac{5}{2}n+1$. We also conjecture that for almost all $v$ the contructed word describes a discrete path in the positive octant staying at a bounded distance from the euclidean line of direction $v$.

In Sage, you can construct this word using the next version of my package slabbe-0.2 (not released yet, email me to press me to finish it). The one with frequencies of letters proportionnal to $(1, e, \pi)$ is:

sage: from slabbe.mcf import algo
sage: D = algo.arp.substitutions()
sage: it = algo.arp.coding_iterator((1,e,pi))
sage: w = words.s_adic(it, repeat(1), D)
word: 1232323123233231232332312323123232312323...


The factor complexity is close to 2n+1 and the balance is often less or equal to three:

sage: w[:10000].number_of_factors(100)
202
sage: w[:100000].number_of_factors(1000)
2002
sage: w[:1000].balance()
3
sage: w[:2000].balance()
3


Note that bounded distance from the euclidean line almost surely was proven in [DHS2013] for Brun algorithm, another MCF algorithm.

Other approaches: Standard model and billiard sequences

Other approaches have been proposed to construct such discrete lines.

One of them is the standard model of Eric Andres [A03]. It is also equivalent to billiard sequences in the cube. It is well known that the factor complexity of billiard sequences is quadratic $p(n)=n^2+n+1$ [AMST94]. Experimentally, we can verify this. We first create a billiard word of some given direction:

sage: from slabbe import BilliardCube
sage: v = vector(RR, (1, e, pi))
sage: b = BilliardCube(v)
sage: b
Cubic billiard of direction (1.00000000000000, 2.71828182845905, 3.14159265358979)
sage: w = b.to_word()
sage: w
word: 3231232323123233213232321323231233232132...


We create some prefixes of $w$ that we represent internally as char*. The creation is slow because the implementation of billiard words in my optional package is in Python and is not that efficient:

sage: p3 = Word(w[:10^3], alphabet=[1,2,3], datatype='char')
sage: p4 = Word(w[:10^4], alphabet=[1,2,3], datatype='char') # takes 3s
sage: p5 = Word(w[:10^5], alphabet=[1,2,3], datatype='char') # takes 32s
sage: p6 = Word(w[:10^6], alphabet=[1,2,3], datatype='char') # takes 5min 20s


We see below that exactly $n^2+n+1$ factors of length $n<20$ appears in the prefix of length 1000000 of $w$:

sage: A = ['n'] + range(30)
sage: c3 = ['p_(w[:10^3])(n)'] + map(p3.number_of_factors, range(30))
sage: c4 = ['p_(w[:10^4])(n)'] + map(p4.number_of_factors, range(30))
sage: c5 = ['p_(w[:10^5])(n)'] + map(p5.number_of_factors, range(30)) # takes 4s
sage: c6 = ['p_(w[:10^6])(n)'] + map(p6.number_of_factors, range(30)) # takes 49s
sage: ref = ['n^2+n+1'] + [n^2+n+1 for n in range(30)]
sage: T = table(columns=[A,c3,c4,c5,c6,ref])
sage: T
n    p_(w[:10^3])(n)   p_(w[:10^4])(n)   p_(w[:10^5])(n)   p_(w[:10^6])(n)   n^2+n+1
+----+-----------------+-----------------+-----------------+-----------------+---------+
0    1                 1                 1                 1                 1
1    3                 3                 3                 3                 3
2    7                 7                 7                 7                 7
3    13                13                13                13                13
4    21                21                21                21                21
5    31                31                31                31                31
6    43                43                43                43                43
7    52                55                56                57                57
8    63                69                71                73                73
9    74                85                88                91                91
10   87                103               107               111               111
11   100               123               128               133               133
12   115               145               151               157               157
13   130               169               176               183               183
14   144               195               203               211               211
15   160               223               232               241               241
16   176               253               263               273               273
17   192               285               296               307               307
18   208               319               331               343               343
19   224               355               368               381               381
20   239               392               407               421               421
21   254               430               448               463               463
22   268               470               491               507               507
23   282               510               536               553               553
24   296               552               583               601               601
25   310               596               632               651               651
26   324               642               683               703               703
27   335               687               734               757               757
28   345               734               787               813               813
29   355               783               842               871               871


Billiard sequences generate paths that are at a bounded distance from an euclidean line. This is equivalent to say that the balance is finite. The balance is defined as the supremum value of difference of the number of apparition of a letter in two factors of the same length. For billiard sequences, the balance is 2:

sage: p3.balance()
2
sage: p4.balance() # takes 2min 37s
2


Other approaches: Melançon and Reutenauer

Melançon and Reutenauer [MR13] also suggested a method that generalizes Christoffel words in higher dimension. The construction is based on the application of two substitutions generalizing the construction of sturmian sequences. Below we compute the factor complexity and the balance of some of their words over a three-letter alphabet.

On a three-letter alphabet, the two morphisms are:

sage: L = WordMorphism('1->1,2->13,3->2')
sage: R = WordMorphism('1->13,2->2,3->3')
sage: L
WordMorphism: 1->1, 2->13, 3->2
sage: R
WordMorphism: 1->13, 2->2, 3->3


Example 1: periodic case $LRLRLRLRLR\dots$. In this example, the factor complexity seems to be around $p(n)=2.76n$ and the balance is at least 28:

sage: from itertools import repeat, cycle
sage: W
word: 1213122121313121312212212131221213131213...
sage: map(W[:10000].number_of_factors, [10,20,40,80])
[27, 54, 110, 221]
sage: [27/10., 54/20., 110/40., 221/80.]
[2.70000000000000, 2.70000000000000, 2.75000000000000, 2.76250000000000]
sage: W[:1000].balance()  # takes 1.6s
21
sage: W[:2000].balance()  # takes 6.4s
28


Example 2: $RLR^2LR^4LR^8LR^{16}LR^{32}LR^{64}LR^{128}\dots$ taken from the conclusion of their article. In this example, the factor complexity seems to be $p(n)=3n$ and balance at least as high (=bad) as $122$:

sage: W = words.s_adic([R,L,R,R,L,R,R,R,R,L]+[R]*8+[L]+[R]*16+[L]+[R]*32+[L]+[R]*64+[L]+[R]*128,'1')
sage: W.length()
330312
sage: map(W.number_of_factors, [10, 20, 100, 200, 300, 1000])
[29, 57, 295, 595, 895, 2981]
sage: [29/10., 57/20., 295/100., 595/200., 895/300., 2981/1000.]
[2.90000000000000,
2.85000000000000,
2.95000000000000,
2.97500000000000,
2.98333333333333,
2.98100000000000]
sage: W[:1000].balance()  # takes 1.6s
122
sage: W[:2000].balance()  # takes 6s
122


Example 3: some random ones. The complexity $p(n)/n$ occillates between 2 and 3 for factors of length $n=1000$ in prefixes of length 100000:

sage: for _ in range(10):
....:     W = words.s_adic([choice((L,R)) for _ in range(50)],'1')
....:     print W[:100000].number_of_factors(1000)/1000.
2.02700000000000
2.23600000000000
2.74000000000000
2.21500000000000
2.78700000000000
2.52700000000000
2.85700000000000
2.33300000000000
2.65500000000000
2.51800000000000


For ten randomly generated words, the balance goes from 6 to 27 which is much more than what is obtained for billiard words or by our approach:

sage: for _ in range(10):
....:     W = words.s_adic([choice((L,R)) for _ in range(50)],'1')
....:     print W[:1000].balance(), W[:2000].balance()
12 15
8 24
14 14
5 11
17 17
14 14
6 6
19 27
9 16
12 12


# References

 [BL15] V. Berthé, S. Labbé, Factor Complexity of S-adic words generated by the Arnoux-Rauzy-Poincaré Algorithm, Advances in Applied Mathematics 63 (2015) 90-130. http://dx.doi.org/10.1016/j.aam.2014.11.001
 [DHS2013] Delecroix, Vincent, Tomás Hejda, and Wolfgang Steiner. “Balancedness of Arnoux-Rauzy and Brun Words.” In Combinatorics on Words, 119–31. Springer, 2013. http://link.springer.com/chapter/10.1007/978-3-642-40579-2_14.
 [A03] E. Andres, Discrete linear objects in dimension n: the standard model, Graphical Models 65 (2003) 92-111.
 [AMST94] P. Arnoux, C. Mauduit, I. Shiokawa, J. I. Tamura, Complexity of sequences defined by billiards in the cube, Bull. Soc. Math. France 122 (1994) 1-12.
 [MR13] G. Melançon, C. Reutenauer, On a class of Lyndon words extending Christoffel words and related to a multidimensional continued fraction algorithm. J. Integer Seq. 16, No. 9, Article 13.9.7, 30 p., electronic only (2013). https://cs.uwaterloo.ca/journals/JIS/VOL16/Reutenauer/reut3.html