Dependencies
The software depends on the following Python packages for calculations and displaying results:

NumPy: a library for array manipulation and numerical calculations [14];

SciPy: the Python scientific library, from which we use the gamma function and the nonlinear least-squares fitting function [14];

Matplotlib: a Matlab-like plotting library used to generate graphs of the results [15].
After installation, the package's functions become available using the following import command within the interpreter or a Python script:
from Rheology import *
Data presentation, data format
Instead of defining individual classes for each data type (MSD, creep compliance and complex shear modulus), an alternative technique in Python is to employ generalized containers, the so-called dictionaries or ‘dicts’. A dict is a container holding data indexed by arbitrary keys (marked by quotes in this paper). This is a very general and flexible data structure, which we use for all data in the Rheology package. For example, MSD dicts contain the keys “tau” and “MSD”, referring to arrays holding the lag time and the mean square displacement data, respectively.
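For illustration, a minimal MSD dict of this kind might look as follows (the numerical values are invented for the example):

```python
import numpy as np

# A minimal MSD dict as described above: arbitrary string keys
# index the data arrays directly (values here are made up).
msd = {
    "tau": np.array([0.1, 0.2, 0.4]),       # lag times, s
    "MSD": np.array([0.010, 0.019, 0.041])  # mean square displacement, um^2
}

# any key gives direct access to its array
first_lag = msd["tau"][0]
```

Because dicts carry their keys with them, functions of the package can accept such structures without a fixed class hierarchy.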
Particle trajectories
Particle tracking microrheology starts with the imaging experiments and the image processing in a strict sense; however, obtaining particle trajectories from video microscopy is well described in the literature [16–20], including the statistical difficulties of the process [21–23]. There are various implementations of particle tracking algorithms available in IDL (see the same website as for the rheology code) [11, 24], Matlab [12, 25], LabView [20, 26] or C++ [27]. An implementation of the Grier method [16] has also been translated to Python [28].
Thus, we start our discussion by extracting rheological information from the particle trajectories, leaving the implementation of data input/output to the user. As a good starting template, we recommend the ReadTable() and SaveTable() functions in the ProcessRheology.py example script.
Drift correction
Several experimental systems show a drift: a slow, directed motion of the sample relative to the imaging frame, caused by various factors in the experimental apparatus. To remove this drift, which is independent of the sample, there are multiple possibilities one may consider. If a reference bead bound to the sample is available, its trajectory provides the drift itself. If multiple particles are tracked in the same time interval, an average trajectory may be calculated and used as a reference. If these possibilities are not available, one has to consider whether the long-time motion is due to drift or is a characteristic of the investigated sample, because subtracting it changes the resulting long-lag-time (low-frequency) part of the viscoelastic characteristics.
The simplest way of treating the data in such cases is to calculate a smoothed trajectory, e.g. using a running average, and either subtract this smoothed data or, in one further step, fit a low-order polynomial to the smoothed data and subtract the fitted values from the trajectory.
A very simple implementation is available for two dimensional data sets as:
GetData(timestamps, poslist, indx=0,
order=3, resolution=0.1434, Nd=1000)
This function takes a position list (parameter poslist), which is a list of dicts, each describing the positions of a particle; the indx parameter selects one of them. A position dict contains “X” and “Y”, which are arrays of the x and y positions. The dict also contains an index array under the key “indx”, identifying the image index (serial number) of the given positions. Using this index allows the tracking algorithm to miss individual frames (e.g. when the particle drifts out of focus). This index is also used to define the time point of a position, either by identifying the corresponding time stamps, provided in seconds in the timestamps array, or, if this variable is set to None, by multiplying the index by the optional tscale parameter. The Nd parameter gives the number of points used in the running average, and the order parameter sets the order of the polynomial fitted for drift correction. If Nd is set to −1, the running averaging is off, and if order is −1, the drift correction is turned off. The resolution parameter is used to scale the coordinates to micrometers (the same value for both coordinates).
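The smoothing-and-subtraction scheme described above can be sketched in plain NumPy. Note that remove_drift is our hypothetical helper, a simplified one-dimensional stand-in for the package's GetData(), not its actual implementation:

```python
import numpy as np

def remove_drift(t, x, Nd=51, order=3):
    # Sketch of the drift correction described in the text:
    # 1) smooth the trajectory with a running average of Nd points,
    # 2) fit a low-order polynomial to the smoothed data,
    # 3) subtract the fitted drift from the raw trajectory.
    smooth = np.convolve(x, np.ones(Nd) / Nd, mode="valid")
    tc = t[Nd // 2 : Nd // 2 + len(smooth)]   # centers of the windows
    coeffs = np.polyfit(tc, smooth, order)
    drift = np.polyval(coeffs, t)
    return x - drift
```

For a trajectory with a purely polynomial drift of degree not exceeding order, this procedure removes the drift essentially exactly; for noisy data, the running average keeps the polynomial fit from chasing fast fluctuations.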
Mean square displacement (MSD)
Characterizing a soft sample using particle tracking microrheology relies strongly on the determination of the mean square displacement. Calculating the MSD involves two possible sets of assumptions. 1.) Ensemble averages are based on recording many tracer particles and assuming homogeneity across the sample. This is the averaging method considered in theories, and it has the advantage of allowing estimation of the time-dependent aging of the system. Technically this can be achieved using video-recording-based particle tracking, where the number of tracers can be increased to the order of tens. 2.) Assuming ergodicity, one can switch from ensemble averaging to time averaging. This is very important for systems which are not homogeneous and for cases where only a few particles (1–5) can be observed at a time. For samples of biological origin, time averaging is more suitable because these samples are seldom homogeneous.
Calculating the time average is done by splitting the trajectory into non-overlapping parts and averaging their displacements. Because the number of intervals decreases with increasing lag time, this method has very high error at large lag time values. Alternatively, it is also possible to split the trajectory into overlapping segments and then perform the averaging. The resulting statistical errors follow a non-normal distribution, but it has been shown that using overlapping segments the resulting MSD may show improved accuracy [29, 30] for lag time values up to about a quarter of the measurement length (N/4 for N data points). The results of a test calculation, using positions randomly drawn from a normal distribution with standard deviation σ_X = σ_Y = 0.15 μm in both the X and Y directions, are presented in Figure 2. The MSD oscillates around the theoretical value of <Δr(τ)²> = 2(σ_X² + σ_Y²) = 0.09 μm², as expected, but the data calculated using overlapping intervals show visibly better accuracy.
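The two averaging schemes can be sketched as follows (msd_overlap and msd_nonoverlap are our illustrative helper names; the package's MSD() function selects between the schemes with its overlap switch):

```python
import numpy as np

def msd_overlap(pos, tau):
    # time average over all (overlapping) index pairs separated by tau;
    # pos is an (N, d) array of d-dimensional positions
    d = pos[tau:] - pos[:-tau]
    return np.mean(np.sum(d ** 2, axis=1))

def msd_nonoverlap(pos, tau):
    # time average over independent, non-overlapping segments:
    # displacements between positions 0, tau, 2*tau, ...
    sub = pos[::tau]
    d = sub[1:] - sub[:-1]
    return np.mean(np.sum(d ** 2, axis=1))
```

For tau = 1 the two estimates coincide; for larger lags the overlapping variant averages over many more pairs, which is where its improved accuracy comes from.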
msd = MSD(positionlist, tau=0)
MSD() is the function calculating the mean square displacement from a single trajectory. The data points are passed as a two-dimensional array positionlist, containing the coordinates ((x, y) or (x, y, z)) in each row.
The second parameter (tau) is used to generate the lag time values (index steps) for which the MSD is calculated. This parameter can take various values: if an array or list of integer values is given, those are used as index step values. If a single integer between 0 and the number of positions is given, that many index steps (lag time values) are generated in an equidistant manner. If nothing, or 0, is given, the values 1…N/4 are used.
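The described semantics of tau can be sketched as follows (make_lag_steps is our hypothetical helper, not part of the package; we assume the equidistant steps span the same 1…N/4 range as the default case, which the text does not state explicitly):

```python
import numpy as np

def make_lag_steps(tau, N):
    # Sketch of the described tau semantics for N positions.
    if tau is None or (np.isscalar(tau) and tau == 0):
        # default: every index step from 1 to N/4
        return np.arange(1, N // 4 + 1)
    if np.isscalar(tau):
        # a single integer: that many equidistant steps
        # (assumed to span 1 ... N/4, as in the default case)
        return np.unique(np.linspace(1, N // 4, int(tau)).astype(int))
    # an explicit array or list of index steps
    return np.asarray(tau, dtype=int)
```

The np.unique call removes duplicates that can arise when the requested number of steps exceeds the number of distinct integers in the range.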
Each tau value results in M = M(tau) pairs, for which the step r[i + tau] − r[i] is calculated. M also depends on whether the set was generated using overlapping intervals (if overlap=True is set).
If an array of time values (in seconds) is provided for the position data using the optional tvector parameter, the algorithm checks the time step between each data pair used. It calculates the mean value of these time steps, and every value outside mean·(1 ± tolerance) is ignored (by default tolerance = 0.1). This process eliminates jumps in the data caused by computer latency during recording.
The function returns a dictionary containing the keys “tau”, “dtau”, “MSD” and “DMSD”. If time values were not provided, “tau” holds the index steps between the positions and “dtau” is not used.
Creep compliance
Assuming that the generalized Stokes-Einstein relation holds, the creep compliance is linearly proportional to the MSD [6]. The MSD_to_J() function calculates the creep compliance using equation (1).
J = MSD_to_J(msd, t0=0.1, tend=150,
T=25.0, a=1.0)
The calculation requires a dict structure (msd) having the time values under the “tau” key and the MSD values under the “MSD” key (error values are optional). Further parameters are the temperature T of the experiment in degrees Celsius, the radius a of the applied tracer particle in micrometers, and optionally the dimensionality of the motion D (denoted as N_D in equation (1)), which is set to D = 2 by default.
Calculating the frequency dependent shear modulus from J(t) with the numerical method proposed by Evans et al. requires values extrapolated to the zero time point and to infinite time. These values are estimated here, allowing the user to override them before they are used to calculate the complex shear modulus G^{∗}(ω). The zero time value J_{0} = J(t = 0) is extrapolated from a linear fit in the t < t0 region, and the end extrapolation is obtained from a linear fit to the tend < t part. The slope of the extrapolated end part is 1/η, where η is the steady state viscosity [31].
The function returns a new dict containing: “J” (in 1/Pa), “tau”, “eta”, “J0”, “const”, “dJ”, and the fit parameters as “a0”, “b0” for the first part and “a1”, “b1” for the end part, where the linear equation J = a_i t + b_i (i = 0, 1) holds.
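The MSD-to-compliance conversion itself is a simple rescaling. The following sketch (msd_to_J is our hypothetical helper, not the package's MSD_to_J(), and it omits the extrapolation fits) assumes the commonly used GSER form J(t) = 3πa⟨Δr²(t)⟩/(N_D k_B T); the exact prefactor is fixed by equation (1) of the paper, which is not reproduced in this section, so the prefactor here should be checked against it:

```python
import numpy as np

KB = 1.380649e-23  # Boltzmann constant, J/K

def msd_to_J(tau, msd_um2, T=25.0, a=1.0, ND=2):
    # Sketch of the MSD -> creep compliance rescaling.
    # Assumed GSER prefactor: 3*pi*a / (ND * kB * T).
    Tk = T + 273.15                        # temperature in kelvin
    a_m = a * 1e-6                         # tracer radius in meters
    msd_m2 = np.asarray(msd_um2) * 1e-12   # MSD converted to m^2
    J = 3.0 * np.pi * a_m * msd_m2 / (ND * KB * Tk)  # compliance, 1/Pa
    return {"tau": np.asarray(tau), "J": J}
```

Whatever the precise prefactor, the linearity of the relation means the creep compliance inherits the time dependence of the MSD directly.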
Calculating the frequency dependent complex shear modulus
While the connection presented by equation (2) is simple, there is a major problem with determining \tilde{J}(\omega) numerically. It is well known in numerical analysis that applying a numerical Fourier transform amplifies the experimental noise enormously [32]. In microrheology, there are four commonly applied methods to address this problem. The first two handle the noise of the Fourier transform directly, by averaging or by fitting; the other two were suggested in the last decade to improve the transform itself [6, 9, 31–33].
In a homogeneous system where multiple particles can be tracked, converting their MSDs to creep compliance and then to G^{∗}(ω) using a discrete Fourier transform allows one to average the converted values and thus decrease the noise [32, 33]. For cases when the creep compliance can be modeled in an analytical form, the Fourier transform of the fitted analytical function may be calculated and used to estimate G^{∗}(ω) [9].
As we have discussed above, samples of biological origin are often not homogeneous and their MSD does not follow a well-described analytical function. However, model calculations suggest, in agreement with experiments, that biopolymers and many polymer gels show power law behavior in various time ranges [34–36]. The third and fourth conversion approaches have been suggested for such systems in the microrheology literature. One uses a power law approximation of the MSD or the creep compliance [6] (which we shall call the Mason method), and the other calculates a linear interpolation between the data points and applies a discrete Fourier transform to it [31] (which we shall cite as the Evans method).
Because these methods have their strengths and weaknesses, we summarize them and their implementation below. Recently we have shown that the accuracy of the Evans method can be greatly improved by using local interpolation of the data in a spline-like manner, without forcing a single function to be fitted to the whole data set. This improvement is also included in the Rheology framework and is discussed below.
The Mason method
This is a fast conversion method based on the Fourier transform of a power function, which has been used in various works in the last decade [6–8, 37–39]. Briefly, let us consider a generalized diffusion process, where the MSD follows a power law: <Δr(t)²> = 2N_D D t^α [40], where D is the generalized diffusion coefficient and 0 < α ≤ 2. In the case of α = 1 we speak of regular Brownian diffusion, while α < 1 describes subdiffusion and α > 1 superdiffusion, the latter indicating active forces participating in the process. Using the generalized Stokes-Einstein equation (1), we consider Brownian and subdiffusive processes, thus α ≤ 1 [5, 40]. In this case the creep compliance is also described by a power function:
J\left(t\right)={J}_{0}\,{t}^{\alpha}
(3)
The complex shear modulus can be directly calculated using a gamma function (Figure 3), in the form of:
{G}^{\ast}\left(\omega \right)={e}^{i\pi \alpha /2}\,\frac{{\omega}^{\alpha}}{{J}_{0}\,\Gamma (1+\alpha )}=\frac{{e}^{i\pi \alpha /2}}{J(t=1/\omega )\,\Gamma (1+\alpha )}.
(4)
Usually this equation is presented in terms of the MSD [41], but the key feature, the symmetry between the time-dependent creep compliance and the frequency-dependent complex shear modulus, is more apparent in this form. The method generalizes this symmetry between ω ↔ t as ω = 1/t, and assumes it holds for the whole measured time range [6], even when the exponent α of the power law varies with time.
However, this is a highly specific case, and the symmetry does not hold for most functions [41]. To improve the fit quality, a slightly more complicated version of this formula, based on empirical corrections, has been proposed by Dasgupta et al. [38].
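For a pure power-law creep compliance, equation (4) can be evaluated directly. A scalar sketch (mason_G is our illustrative helper; the package's J_to_G_Mason() additionally performs the local fitting to estimate α):

```python
import cmath
import math

def mason_G(omega, J0, alpha):
    # equation (4): G*(omega) for J(t) = J0 * t**alpha,
    # evaluated at a single circular frequency omega
    return (cmath.exp(1j * math.pi * alpha / 2) * omega ** alpha
            / (J0 * math.gamma(1 + alpha)))
```

As a sanity check, for α = 1 and J0 = 1/η (a Newtonian fluid) the expression reduces to G* = iωη, i.e. a purely viscous response.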
The J_to_G_Mason() function implements both methods based on references [6, 38] and the leastsq() function of SciPy, which is a modified Levenberg-Marquardt minimization algorithm.
G = J_to_G_Mason(J, N=30, advanced=True,
logstep=True)
The algorithm takes a creep compliance dict (J), fits a power function locally in the form of equation (3) to estimate α, and calculates the complex shear modulus at ω = 1/t using equation (4). The algorithm uses a Gaussian function to weight the neighbors in the local fit, as described by Mason et al. [6]. N defines the desired number of resulting data points, which are created by equidistant sampling in linear or logarithmic space between 1/t_max … 1/t_min. The logarithmic sampling is activated by the logstep parameter, which is set by default. The resulting dict contains “omega” and “f” for the circular frequency and frequency, respectively, and a “G” array storing the corresponding complex shear moduli.
There are several further parameters controlling the process, of which the advanced parameter forces the use of the method proposed by Dasgupta et al. instead of the original Mason method, and the verbose switch provides graphical feedback on how the local fitting proceeds. A test example is presented in Figure 3, using a creep compliance in the form of equation (3) converted both numerically and analytically. Because this method is accurate for power law creep compliances, the conversion matches within machine precision.
The Evans method
The fourth method, based on the work published by Evans et al. [31], we again discuss in detail. It considers a linear interpolation between the N data points and provides a conversion in the form of equation (5) [41].
{G}^{\ast}\left(\omega \right)=\frac{i\omega}{i\omega {J}_{0}+\sum _{k=0}^{N}({A}_{k+1}-{A}_{k}){e}^{-i\omega {t}_{k}}},
(5)
where the A_{k} values are defined by:
{A}_{k}=\frac{{J}_{k}-{J}_{k-1}}{{t}_{k}-{t}_{k-1}},\;\text{where}\;0<k\le N,\quad {A}_{0}=0,\quad {A}_{N+1}=\frac{1}{\eta}.
(6)
J_{0} and η are already estimated by the linear fits in the MSD_to_J() function. Using equation (5) is straightforward and allows the calculation of G^{∗}(ω) at any ω value. The natural selection of a suitable frequency range is from 0 to Nπ/T (in units of 1/s) in N/2 steps, as is common for the discrete Fourier transform of equally sampled data [32]. The corresponding function is:
G = J_to_G(J)
The algorithm generates a linear array of frequencies, but the number of points is limited to a maximum of 1000, which is usually more than sufficient (MSD and creep compliance data arrays may hold several thousand points). The result is a dictionary similar to that of the Mason method, and can be tested using a Maxwell fluid, which has a linear creep compliance of the form J(t) = 1/E + t/η (Figure 4).
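Equations (5) and (6) can be sketched directly (evans_G is our illustrative helper, not the package's J_to_G(); scalar omega for clarity, and we use the negative-exponent DFT convention of the reconstructed equation (5)):

```python
import numpy as np

def evans_G(omega, t, J, J0, eta):
    # equations (5)-(6): G*(omega) from the sampled creep compliance
    # J[0..N] at times t[0..N], with extrapolated J0 and viscosity eta
    N = len(t) - 1
    A = np.empty(N + 2)
    A[0] = 0.0                             # A_0 = 0
    A[1:N + 1] = np.diff(J) / np.diff(t)   # A_k for 0 < k <= N
    A[N + 1] = 1.0 / eta                   # A_{N+1} = 1/eta
    denom = 1j * omega * J0 + np.sum((A[1:] - A[:-1])
                                     * np.exp(-1j * omega * t))
    return 1j * omega / denom
```

For the Maxwell fluid J(t) = 1/E + t/η all the A_k equal 1/η, the sum collapses to a single term at t_0 = 0, and the conversion is exact, in line with Figure 4.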
There are various details worth mentioning about this method which may affect the accuracy of the result in general cases. It is clear from equation (5) that the method is sensitive to the A_{k+1} − A_k terms, which, in extreme cases, may be either very small for a nearly linear part of J(t) or very large for sudden jumps in the experimental data. In order to reduce round-off errors, one may eliminate the close-to-zero values (when A_{k+1} − A_k < ε) by activating the filter parameter.
This numerical conversion method has two basic problems. First, from equation (5) it is apparent that the complex shear modulus is directly related to the numerical Fourier transform of the A_{k+1} − A_k set, and is thus very sensitive to the noise of these data. Second, the limited bandwidth causes the A_k values to be a poor representation of the local derivative of the creep compliance, resulting in a strong deviation (usually an unphysical cutoff) of the calculated shear modulus. This latter problem is well demonstrated by a test example of a Kelvin-Voigt solid characterized by a Young's modulus of E = 10 Pa and viscosity η = 0.2 Pa·s (Figure 5).
The conversion can be improved by increasing the bandwidth of the data, or by decreasing the frequency range where G^{∗}(ω) is calculated (using the omax parameter). A third alternative is using model interpolations to increase the bandwidth numerically. Because most experimental data cannot be fitted with a single analytical function for all τ values, we have developed a method to fit the MSD at consecutive intervals in a spline-like manner [41].
Adaptive splinelike fitting
Biopolymer samples commonly show a monotonic MSD, frequently described by power laws in various time segments. This makes the power function a good candidate for the local fitting. To account for deviations from power laws at short time values, the fitting system also allows the use of a Kelvin-Voigt solid as an alternative model at short time scales, in the case of more elastic samples. Including a Maxwell fluid as an option was not necessary, since the Evans method provides perfect fits for linear MSDs (see Figure 4).
Because this algorithm has not been published previously, we describe it here in detail. The procedure is designed to run automatically, controlled only by the selected start interval of the data and a scaler parameter, which defines how fast the fitting range should increase with time. The key steps are:

1. define the data range using indices i_0 = 0 and i_1 = i(t_0), containing at least 4 data points;

2. fit the function on i_0…i_1, calculate the squared error of each point and estimate the average error;

3. find the last point i_2 around i_1 where χ²(i_2) ≤ χ²_mean, and redefine i_1 as this point (i_1 = i_2), stretching or shrinking the fitting range to a reasonable maximum;

4. recalculate the fit and errors, and store them as the current segment;

5. if we have reached the end (or are close to it), finish the cycle and return the results;

6. otherwise, define the new fitting range using the scaler parameter (by default i_0…scaler × i_0);

7. return to step 2 (see Additional file 1).
The exact procedure calculating the new range in each step may vary depending on mode, an optional keyword parameter, allowing some control for functions which show fast changes, or slow changes with noise. The default method (mode=“scaler”) described above works well for most MSDs with some noise but monotonic trends. (More details are available in the help of the library and in the config.example text file in the Examples folder of the package.)
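A heavily simplified sketch of this loop is given below. Here segmented_power_fits is our hypothetical helper, not the package's MSD_power_fit(): it fits a power law on log-log axes over windows that grow by the scaler factor, and omits the χ²-based boundary adjustment of step 3 as well as the mode variants:

```python
import numpy as np

def segmented_power_fits(t, y, first=4, scaler=3.0):
    # Simplified sketch of the segmented fitting loop (steps 1-7).
    fits, i0, i1 = [], 0, max(first, 4)       # step 1: at least 4 points
    while True:
        i1 = min(i1, len(t))
        lt, ly = np.log(t[i0:i1]), np.log(y[i0:i1])
        alpha, logJ0 = np.polyfit(lt, ly, 1)  # step 2: local power-law fit
        fits.append({"range": (i0, i1),
                     "J0": np.exp(logJ0), "alpha": alpha})
        if i1 >= len(t):                      # step 5: reached the end
            break
        # step 6: new range via the scaler (keep at least 2 points)
        i0, i1 = i1, max(int(scaler * i1), i1 + 2)
    return fits
```

Each entry of the returned list plays the role of one fitted segment; for data that follow a single power law exactly, every segment recovers the same exponent and prefactor.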
The corresponding function call is:
fits = MSD_power_fit(msd, t0=0.2,
scaler=10.0)
After completing the above algorithm, in a next step the program checks where the fits of neighboring ranges would cross each other, and readjusts them to the crossing point if it lies within the union of the two ranges. This step helps to maintain a smoother approximation of the experimental data. Turning verbosity on with verbose=True, one may see details of how the algorithm operates.
Dynamic and static interpolation
The return value (fits) is a list of dicts, each containing the fitting parameters of one segment. This list can be directly used to calculate the interpolation of the original data using:
msd2 = MSD_interpolate(msd, fits, smoothing=True,
Nsmoothing=10, insert=10, factor=3)
The interpolation, together with the insertion of new points before the first data point, increases the bandwidth of the original data. The factor parameter controls the oversampling of the original data (here it is set to 3× oversampling), and the insertion of new data points between 0 and the first time point τ_0 is specified by insert = 10.
Estimating how many new points are required during this resampling procedure is a difficult question in general. The above example uses a static approach, simply inserting factor points between every two original data points, which results in a very large, equidistant data set. Alternatively, we provide a dynamic method, which is controlled by an error parameter.
Investigating the form of equations (2) and (6), one can see that the accuracy of the Evans method is strongly related to how accurately equation (6) approximates the local derivative of J(t). Knowing the analytical form of the interpolating functions, this error can be approximated using the function (f) and its second derivative (f″) [32, 41]. Based on this approximation and a specified local error ε, the minimal step size h(t) at any time point can be estimated as:
h=\epsilon \sqrt{\frac{{f}_{k}\left({t}_{k}\right)}{{f}_{k}^{\prime\prime}\left({t}_{k}\right)}}.
(7)
Using h as a minimal step size in each fitted segment, the program can dynamically interpolate the data and increase the number of points only where the creep compliance changes faster. This results in a non-equally sampled data set, which (after calculating the creep compliance) is well handled by both the MSD_to_J() function and, subsequently, the numerical conversion method J_to_G(), resulting in an improved complex shear modulus. To eliminate further bandwidth problems, the maximal desired circular frequency ω_max can be forced to Nπ/T using the omax parameter.
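For a power-law segment f(t) = J_0 t^α, equation (7) takes a particularly simple form, since f/f″ = t²/(α(α − 1)) independently of J_0 (min_step_powerlaw is our illustrative helper; we take the absolute value under the square root, as α(α − 1) is negative for subdiffusive exponents):

```python
import math

def min_step_powerlaw(eps, alpha, t):
    # equation (7) evaluated for f(t) = J0 * t**alpha:
    # f''(t) = J0*alpha*(alpha-1)*t**(alpha-2),
    # hence f/f'' = t**2 / (alpha*(alpha-1)), independent of J0
    return eps * t / math.sqrt(abs(alpha * (alpha - 1.0)))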
The consequence of this fitting and interpolation procedure is an increased bandwidth and decreased noise in the interpolated data set. The resulting error may be some orders of magnitude larger than the specified ε, but is strongly related to it; the user therefore has to estimate a suitable ε for the given problem. For example, inserting 10 new points between 0 and t_0 and requesting an accuracy of ε = 5×10^{−4}, the algorithm corrected the errors of Figure 5 to about 1% relative error on average (Figure 6). The Mason method produces about 100% relative error for the loss modulus of the same conversion, originating from the failure of its power law assumption, and cannot be improved by interpolation [41].
The advantage of employing non-equally sampled data as the result of the dynamic interpolation becomes clear if we consider the smallest required time step produced by this method. The results indicate h_min ≈ 2 × 10^{−6} s, which would increase the number of data points about 5000-fold in the case of equidistant sampling. In comparison, the dynamic method increases the number of data points by only about 20%.
For data sets following different trends in different time segments (where the fitting procedure thus identifies multiple fitted regions), the transition between the regions can be smoothed using a linear interpolation between the two fitted functions. The range of this smoothing is defined by the Nsmoothing parameter. The range is identified in the original data, but then applied to the refined data set. The result is an MSD in which sudden jumps are reduced, minimizing the presence of oscillatory artifacts in the resulting complex shear modulus G^{∗}(ω).