Motif sequence generation
MotifSequenceGenerator — Module
This module generates random sequences of motifs, under the constraint that the sequence has some total length ℓ such that q - δq ≤ ℓ ≤ q + δq. All main functionality is provided by the function random_sequence.
MotifSequenceGenerator.random_sequence — Function
random_sequence(motifs::Vector{M}, q, limits, translate, δq = 0; kwargs...)
Create a random sequence of motifs of type M, under the constraint that the sequence has "length" ℓ satisfying q - δq ≤ ℓ ≤ q + δq. Return the sequence itself as well as the vector of indices of the motifs used to create it. A vector of probabilities weights can be given as a keyword argument, which then dictates the sampling probability for each entry of motifs for the initial sequence created.
"length" here means an abstracted length defined by the struct M, based on the limits and translate functions. It does not refer to the number of elements!
M can be anything, given the two functions
limits(motif): Some function that, given the motif, returns the (start, fine) of the motif in the same units as q. This function establishes a measure of length, which is simply fine - start.
translate(motif, t): Some function that, given the motif, returns a new motif which is translated by t (either negative or positive), with respect to the same units as q.
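To make the two required functions concrete, here is a minimal sketch for a made-up motif type. The names Segment, seglimits, segtranslate and motiflength are hypothetical and exist only for this illustration; they are not part of MotifSequenceGenerator. The sketch shows that the abstract length is fine - start and that translating a motif does not change it.

```julia
# Hypothetical motif type, used only for illustration (not part of the package).
struct Segment
    label::String
    start::Float64
    stop::Float64
end

# limits: return (start, fine) in the same units as q.
seglimits(s::Segment) = (s.start, s.stop)

# translate: return a new motif shifted by t (t may be negative).
segtranslate(s::Segment, t) = Segment(s.label, s.start + t, s.stop + t)

# The abstract length of a motif is fine - start.
motiflength(m, limits) = limits(m)[2] - limits(m)[1]

seg = Segment("a", 2.0, 3.5)
len = motiflength(seg, seglimits)

# Shifting the motif so that it starts at 0 leaves its length unchanged.
shifted = segtranslate(seg, -seglimits(seg)[1])
println(len, " ", motiflength(shifted, seglimits))
```

Any pair of functions with this behavior should be usable as the limits and translate arguments, regardless of how the motif type stores its data.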
Other Keywords
Please see the source code (use @which) for a full description of the algorithm.
tries = 5: Up to how many initial random sequences are accepted.
taulcut = 2: Up to how many times an element is dropped from the initial guess.
summands = 3: Up to how many motifs may be combined as a sum to complete a sequence.
Simple Example
This example illustrates how the module MotifSequenceGenerator works using a simple struct. For a more realistic, and much more complex example, see the example using music notes.
Let's say that we want to create a random sequence of "shouts", which are described by the following struct:
struct Shout
shout::String
start::Int
end
Let's first create a vector of shouts that will be used as the pool of possible motifs that will create the random sequence:
using Random
shouts = [Shout(uppercase(randstring(rand(3:5))), rand(1:100)) for k in 1:5]
5-element Array{Main.ex-shout.Shout,1}:
Main.ex-shout.Shout("ANFH", 45)
Main.ex-shout.Shout("KU8JA", 82)
Main.ex-shout.Shout("AXY", 82)
Main.ex-shout.Shout("0MPU", 96)
Main.ex-shout.Shout("5ZYNB", 44)
Notice that at the moment the values of the .start field of Shout are irrelevant. MotifSequenceGenerator will translate all motifs to start point 0 while operating.
Now, to create a random sequence, we need to define the two functions that random_sequence expects, limits and translate:
shoutlimits(s::Shout) = (s.start, s.start + length(s.shout) + 1);
shouttranslate(s::Shout, n) = Shout(s.shout, s.start + n);
shouttranslate (generic function with 1 method)
This means that we accept that the temporal length of a Shout is length(s.shout) + 1.
We can now create random sequences of shouts that have a total length of exactly q:
using MotifSequenceGenerator
q = 30
sequence, idxs = random_sequence(shouts, q, shoutlimits, shouttranslate)
sequence
6-element Array{Main.ex-shout.Shout,1}:
Main.ex-shout.Shout("KU8JA", 0)
Main.ex-shout.Shout("0MPU", 6)
Main.ex-shout.Shout("0MPU", 11)
Main.ex-shout.Shout("KU8JA", 16)
Main.ex-shout.Shout("AXY", 22)
Main.ex-shout.Shout("AXY", 26)
sequence, idxs = random_sequence(shouts, q, shoutlimits, shouttranslate)
sequence
6-element Array{Main.ex-shout.Shout,1}:
Main.ex-shout.Shout("5ZYNB", 0)
Main.ex-shout.Shout("AXY", 6)
Main.ex-shout.Shout("ANFH", 10)
Main.ex-shout.Shout("KU8JA", 15)
Main.ex-shout.Shout("ANFH", 21)
Main.ex-shout.Shout("AXY", 26)
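As a sanity check, the total length of a returned sequence can be recomputed from the limits function alone. The sketch below copies the first printed sequence by hand (so it does not call the package) and confirms that the motifs are laid end to end and that the total length equals q = 30. That consecutive motifs touch is an observation from the printed output, not a documented guarantee.

```julia
struct Shout
    shout::String
    start::Int
end

shoutlimits(s::Shout) = (s.start, s.start + length(s.shout) + 1)

# The first sequence printed above, copied by hand.
sequence = [
    Shout("KU8JA", 0), Shout("0MPU", 6), Shout("0MPU", 11),
    Shout("KU8JA", 16), Shout("AXY", 22), Shout("AXY", 26),
]

# Check that each motif starts where the previous one ends.
contiguous = all(shoutlimits(sequence[i])[2] == shoutlimits(sequence[i+1])[1]
                 for i in 1:length(sequence)-1)

# Total length: end of the last motif minus start of the first.
ℓ = shoutlimits(sequence[end])[2] - shoutlimits(sequence[1])[1]

println("contiguous: ", contiguous, ", total length: ", ℓ)
# prints "contiguous: true, total length: 30"
```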
Notice that it is impossible to create a sequence of length e.g. 7 with the above pool. Doing random_sequence(shouts, 7, shoutlimits, shouttranslate) would throw an error.
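The reason is purely arithmetic: the shouts in the pool have 3 to 5 characters, so their lengths are 4, 5 or 6, and no combination of those sums to 7. The following sketch checks which totals are reachable with a small dynamic-programming pass. The helper reachable_totals is hypothetical and is not how the package searches for sequences; it only illustrates the arithmetic.

```julia
# Lengths of the motifs in the pool above: length(s.shout) + 1 for 3 to 5 characters.
motif_lengths = [4, 5, 6]

# reachable[n + 1] is true when some multiset of motif lengths sums to exactly n.
function reachable_totals(lengths, qmax)
    reachable = falses(qmax + 1)
    reachable[1] = true                      # the empty sequence has length 0
    for n in 1:qmax, len in lengths
        if len <= n && reachable[n - len + 1]
            reachable[n + 1] = true
        end
    end
    return reachable
end

reachable = reachable_totals(motif_lengths, 30)
println("q = 7 reachable:  ", reachable[7 + 1])    # false
println("q = 30 reachable: ", reachable[30 + 1])   # true
```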
Floating point lengths
The lengths of the motifs do not have to be integers. When using motifs with floating-point lengths, it is advised to give a nonzero δq to random_sequence, since a sum of floating-point lengths will rarely equal q exactly. The following example modifies the Shout struct and shows how this can be done with floating-point lengths.
struct FloatShout
shout::String
dur::Float64
start::Float64
end
rs(x) = uppercase(randstring(x))
shouts = [FloatShout(rs(rand(3:5)), rand()+1, rand()) for k in 1:5]
shoutlimits(s::FloatShout) = (s.start, s.start + s.dur);
shouttranslate(s::FloatShout, n) = FloatShout(s.shout, s.dur, s.start + n);
q = 10.0
δq = 1.0
r, s = random_sequence(shouts, q, shoutlimits, shouttranslate, δq)
r
6-element Array{Main.ex-shout.FloatShout,1}:
Main.ex-shout.FloatShout("JPN", 1.3536562534518157, 0.0)
Main.ex-shout.FloatShout("JPN", 1.3536562534518157, 1.3536562534518157)
Main.ex-shout.FloatShout("JPN", 1.3536562534518157, 2.7073125069036315)
Main.ex-shout.FloatShout("JQHG", 1.9769545158514747, 4.060968760355447)
Main.ex-shout.FloatShout("D7X5P", 1.079242757688183, 6.0379232762069215)
Main.ex-shout.FloatShout("JQHG", 1.9769545158514747, 7.117166033895105)
s
6-element Array{Int64,1}:
5
5
5
1
4
1
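The tolerance check q - δq ≤ ℓ ≤ q + δq can be verified by hand for the sequence printed above. This sketch copies only the first and last motifs from that output (it does not call the package) and recomputes the total length from shoutlimits.

```julia
struct FloatShout
    shout::String
    dur::Float64
    start::Float64
end

shoutlimits(s::FloatShout) = (s.start, s.start + s.dur)

q, δq = 10.0, 1.0

# First and last motifs of the sequence printed above, copied by hand.
first_motif = FloatShout("JPN", 1.3536562534518157, 0.0)
last_motif  = FloatShout("JQHG", 1.9769545158514747, 7.117166033895105)

# Total length: end of the last motif minus start of the first.
ℓ = shoutlimits(last_motif)[2] - shoutlimits(first_motif)[1]
ok = q - δq <= ℓ <= q + δq
println("total length ≈ ", round(ℓ, digits = 3), ", within tolerance: ", ok)
```

The total is roughly 9.094, which differs from q = 10.0 but lies inside the accepted interval [9.0, 11.0].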