VIDA is written in the Julia programming language for a few reasons:
It is fast! Julia can achieve C and FORTRAN speeds while not having to deal with memory.
It's syntax is very simple. Unlike Themis which is written in C++ the development time is orders of magnitude lower in Julia.
So to install VIDA you first have to install the Julia binary on your system. The best bet is to go to the Julia site and install the official binary
To install the VIDA package you can run the following
using Pkg;Pkg.add("VIDA")
VIDA
is based on the idea of interpreting the image as a probability distribution. Namely since any image is integrable, the space of images is in one-to-one correspondence with a probability distribution, especially since the total flux of the image is already known a priori.
Therefore, our idea is very close to variational inference, hence the name Variational Image Domain Analysis. Namely, where we view the image as a distribution and we aim to find a approximation of the distribution given some parameteric family $f_\theta(x,y)$, which for our purproses we will typically call a filter.
The choice of filter, depends on the problem of interest, namely what features we are interested in. Typically for the EHT where our images tend to be rings, we are interested in
Radius r₀
Width or half width σ
Structural asymmetry τ
Flux asymmetry s
Position angle ξ
VIDA
then defines a series of filters parameterize these features.
Currently we have 5 filters defined, although they all belong to the same family. For an example on how to see the process for defining your own filter please see the readme.
The filters implemented are:
GaussianRing
which is a symmetric and circular Gaussian ring.
SlashedGaussianRing
which is a circular Gaussian ring with a flux gradient across its emission.
EllipticalGaussianRing
symmetric Gaussian elliptical ring, where the emission is constant across the ring, unlike with the SlashedGaussianRing.
GeneralGaussianRing
A combination of the two above where the ring is allowed to be elliptical and have a intensity gradient.
TIDAGaussianRing
The GeneralGaussianRing, but where the asymmetry and flux orienation are fixed relative to one another.
AsymGaussian
A asymmetric Gaussian blob. This can be useful if you image has a strong non-ring component in it.
Constant
Adds a constant flux floor to the image. This is very helpful for image reconstructions that tend to add small scale flux everywhere in the image.
CosineRing{N,M}
A ring filter where the azimuthal brightness and thickness is expressed using a cosine expansion.
In order to extract features we first need a cost function that penalized our parameterized distributions $f_\theta(x,y)$. Since we are considering the image as a probability distribution, one cost function would be the distance or divergence between two distributions. A probability divergence is just a functional that takes in two probability distributions p,q and is minimized iff $p\equiv q$.
Currently we have two divergences implemented in VIDA
Bhattacharyya divergence (Bh)
\[ Bh(f_\theta|I) = \int \sqrt{f_\theta(x,y)I(x,y)} dxdy. \]
KL divergence
\[ KL(f_\theta|I) = \int f_\theta(x,y)\log\left(\frac{f_\theta(x,y)}{I(x,y)}\right)dxdy. \]
Both divergences give very similar answers, although we found the BH to be easier to maximize.
VIDA
Using VIDA is based on constructing three items:
Data, i.e. an image that you want to extract features from.
Cost function, i.e. pick if you want to use the KL or BH divergence
Filter, i.e. construct the family of distributions or filters that you will use to approximate the image.
Then all you need to do is minimize the divergence and you will have extracted you image features.
Now lets runs through how that works
To load VIDA we follow the typical Julia flow. Note that to include plotting functionality you need to include Plots as well
using Plots using VIDA using InteractiveUtils
VIDA
currently only works with fits images. THe fits header is based off of what eht-imaging outputs. So as long as you stick to that standard you should be fine.
To read in an image we just use the load_fits
function which should work with any fits image from ehtim and clean
#Load the image img = load_fits("data/elliptical_gaussian_rot-45.00m87Scale_seed_23_simobs_netcal_scanavg-z0.6-s100-t0-v0-l0-p50-e0.000.fits") #To see what this img is lets print the type println(typeof(img)) #To plot the image we can just call plot. This uses recipes and the Plots.jl framework plot(img)
VIDA.EHTImage{Array{Float64,2}}
So from the output we see that img is a EHTImage type. The information in the curly brackets defines the parametric type information. What this means is that the image that is constrained in the EHTImage type is a Matrix whose elements are Float64's.
Julia isn't a traditional OOP language. Namely, methods/functions are first class and aren't members of a class. Instead how a function behaves is dependent on the type of argument inputs. This is known as something called multimethods or multiple dispatch where at run-time the type of functions called is determined by the arguments.
In some sense OOP is just a multiple dispatch type language where the our type of dispatch only depends on the class, i.e. the self argument in Python classes.
Now because of the lack of classes sometimes it can be difficult to figure out which functions will act of our datatypes, e.g. the EHTImage type. Fortunately, Julia has some convenience functions that let you see which functions/methods can act of the EHTImage type
#To see what functions can act on an EHTImage object just call methodswith(EHTImage)8-element Array{Method,1}:
From this list we see there are several methods that can act on EHTImage types. To see what a certain function does you can type ?inertia
in the terminal to see the help for the inertia method.
In order to find the optimal filter you need to first decide on your objective or cost function. In VIDA we use probaility divergences to measure differences between the filter and image. A divergence is defined as an abstract type AbstractDivergence
. In VIDA a divergence is a functor
. A functor is a type that has an anonymous function attached to it. That means it is both a type and a function. For instance we create a divergence by
bh = Bhattacharyya(img); kl = KullbackLeibler(img);
Now to evaluate the divergence we need to pass it a filter. This can be any filter your choose. The great thing about julia is that bh will use multiple dispatch to figure out which filter is being passed to the divergence.
For instance lets create a few different filters
gr = GaussianRing(r0=20.0, σ=5.0, x0=0.0, y0=0.0) ggr = GeneralGaussianRing(r0=20.0, σ = 5.0, τ = 0.2, ξτ = 0.78, s = 0.5, ξs = 0.78, x0=0.0, y0=0.0 ) # We can also plot both filters a = plot(gr, title="GaussianRing") b = plot(ggr, title="GeneralGaussianRing") plot(a, b, layout=(1,2), size=(600,300))
VIDA has a number of filters defined. These are all subtypes of the AbstractFilter type. To see which filters are implemented you can use the subtype method:
subtypes(VIDA.AbstractFilter)
11-element Array{Any,1}: VIDA.AddFilter VIDA.AsymGaussian VIDA.Constant VIDA.CosineRing VIDA.Disk VIDA.EllipticalGaussianRing VIDA.GaussianRing VIDA.GeneralGaussianRing VIDA.MulFilter VIDA.SlashedGaussianRing VIDA.TIDAGaussianRing
Note that the AddFilter and MulFilter are internal filters that allow the user to easily combine two filters, for example:
add = gr + 1.0*ggr
VIDA.AddFilter{VIDA.GaussianRing,VIDA.MulFilter{VIDA.GeneralGaussianRing,Fl oat64}}(VIDA.GaussianRing r0: Float64 20.0 σ: Float64 5.0 x0: Float64 0.0 y0: Float64 0.0 , VIDA.MulFilter{VIDA.GeneralGaussianRing,Float64} θ: VIDA.GeneralGaussianRing r0: Float64 20.0 σ: Float64 5.0 τ: Float64 0.2 ξτ: Float64 0.78 s: Float64 0.5 ξs: Float64 0.78 x0: Float64 0.0 y0: Float64 0.0 Irel: Float64 1.0 )
To evaluate the divergence between our filter and image we then just evaluate the divergence on the filter
@show bh(gr); @show bh(ggr); @show bh(add);
bh(gr) = 0.2850517886474983 bh(ggr) = 0.2922585743257235 bh(add) = 0.28198849078985727
Now neither filter is really a great approximation to the true image. For instance visually they look quite different, which can be checked with the triptic
function
a = triptic(img, gr) b = triptic(img, ggr) c = triptic(img, add) plot(a,b,c, layout=(3,1), size=(800,800))
To extract the optimal filter the first thing you need to do is define your ExtractProblem
. This requires your divergence, initial filter, and bounds.
lower = GaussianRing(r0=0.1, σ=0.01, x0=-60.0, y0=-60.0); upper = GaussianRing(r0=60.0, σ=20.0, x0=60.0, y0=60.0); initial = GaussianRing(r0=20.0, σ=5.0, x0=0.0, y0=0.0); prob = ExtractProblem(bh, initial, lower, upper);
Now to run the optimizers you just need to select which optimizer to use. Currently VIDA has three families of optimizers installed. Each one is a subtype of the VIDA.Optimizer abstract type
subtypes(VIDA.Optimizer)
3-element Array{Any,1}: VIDA.BBO VIDA.CMAES VIDA.Opt
Of the three implemented optimizers my suggestion would be to try the BBO one first. This uses the BlackBoxOptim.jl package. BBO has a number of options that can be changed. To see them please use the julia ?
mode, or see the documentation.
To optimize all you need to do is run the extractor function.
optfilt, divmin = extractor(prob, BBO()) triptic(img, optfilt)
Starting optimization with optimizer BlackBoxOptim.DiffEvoOpt{BlackBoxOptim .FitPopulation{Float64},BlackBoxOptim.RadiusLimitedSelector,BlackBoxOptim.A daptiveDiffEvoRandBin{3},BlackBoxOptim.RandomBound{BlackBoxOptim.Continuous RectSearchSpace}} 0.00 secs, 0 evals, 0 steps 0.50 secs, 2756 evals, 2607 steps, improv/step: 0.264 (last = 0.2635), fitn ess=0.170067861 1.00 secs, 5684 evals, 5537 steps, improv/step: 0.240 (last = 0.2191), fitn ess=0.165721076 1.50 secs, 8752 evals, 8605 steps, improv/step: 0.259 (last = 0.2924), fitn ess=0.165688939 2.00 secs, 11679 evals, 11532 steps, improv/step: 0.266 (last = 0.2880), fi tness=0.165688930 2.50 secs, 14750 evals, 14604 steps, improv/step: 0.270 (last = 0.2852), fi tness=0.165688930 3.00 secs, 17814 evals, 17669 steps, improv/step: 0.244 (last = 0.1181), fi tness=0.165688930 3.50 secs, 20890 evals, 20746 steps, improv/step: 0.211 (last = 0.0227), fi tness=0.165688930 4.00 secs, 23962 evals, 23818 steps, improv/step: 0.185 (last = 0.0104), fi tness=0.165688930 Optimization stopped after 24857 steps and 4.17 seconds Termination reason: Max number of function evaluations (25000) reached Steps per second = 5960.51 Function evals per second = 5995.04 Improvements/step = Inf Total function evaluations = 25001 Best candidate found: [18.7786, 4.23264, -5.85353, -2.86036] Fitness: 0.165688930
Well that seemed to do a terrible job. The reason is that a lot of these images tend to have some low level flux throughout the image. To account for this the filter tends to get very big to absorb some of this flux. To combat this you can add a constant background filter to the problem.
lower = GaussianRing(r0=0.1, σ=0.01, x0=-60.0, y0=-60.0) + 1e-10*Constant(); upper = GaussianRing(r0=60.0, σ=20.0, x0=60.0, y0=60.0) + 1.0*Constant(); initial = GaussianRing(r0=20.0, σ=5.0, x0=0.0, y0=0.0) + 0.1*Constant(); prob = ExtractProblem(bh, initial, lower, upper); optfilt, divmin = extractor(prob, BBO()) triptic(img, optfilt)
Starting optimization with optimizer BlackBoxOptim.DiffEvoOpt{BlackBoxOptim .FitPopulation{Float64},BlackBoxOptim.RadiusLimitedSelector,BlackBoxOptim.A daptiveDiffEvoRandBin{3},BlackBoxOptim.RandomBound{BlackBoxOptim.Continuous RectSearchSpace}} 0.00 secs, 0 evals, 0 steps 0.50 secs, 3177 evals, 3062 steps, improv/step: 0.256 (last = 0.2557), fitn ess=0.061815832 1.00 secs, 6327 evals, 6212 steps, improv/step: 0.261 (last = 0.2670), fitn ess=0.055471319 1.50 secs, 9484 evals, 9370 steps, improv/step: 0.272 (last = 0.2939), fitn ess=0.055460468 2.00 secs, 12579 evals, 12465 steps, improv/step: 0.276 (last = 0.2856), fi tness=0.055460461 2.50 secs, 15712 evals, 15600 steps, improv/step: 0.274 (last = 0.2667), fi tness=0.055460461 3.00 secs, 18836 evals, 18724 steps, improv/step: 0.263 (last = 0.2093), fi tness=0.055460461 3.50 secs, 21992 evals, 21880 steps, improv/step: 0.239 (last = 0.0979), fi tness=0.055460461 Optimization stopped after 24891 steps and 3.98 seconds Termination reason: Max number of function evaluations (25000) reached Steps per second = 6253.66 Function evals per second = 6281.30 Improvements/step = Inf Total function evaluations = 25001 Best candidate found: [18.7085, 3.50879, -5.88309, -2.76303, 0.0212716] Fitness: 0.055460461
Well that's much better! Now if you wanted to capture the asymmetry in the ring you can use other filters, for example the CosineRing filter. Note that this filter tends to be a little harder to fit.
lower = CosineRing{1,4}(r0=0.1, σ=[0.1, -1.0], ξσ = [-π], τ = 0.01, ξτ = -π, s = [0.01, -1.0, -1.0, -1.0], ξs = [-π,-π,-π,-π], x0=-60.0, y0=-60.0 ) + 1e-10*Constant(); upper = CosineRing{1,4}(r0=40.0, σ=[20.0, 1.0], ξσ = [π], τ = 0.999, ξτ = π, s = [0.999, 1.0, 1.0, 1.0], ξs = [π,π,π,π], x0=60.0, y0=60.0 ) + 1.0*Constant(); initial = CosineRing{1,4}(r0=20.0, σ=[5.0, 0.1], ξσ = [0.0], τ = 0.1, ξτ = 0.0, s = [0.1, 0.0, 0.0, 0.0], ξs = [0.0,0.0,0.0,0.0], x0=0.0, y0=0.0 ) + 1e-2*Constant(); prob = ExtractProblem(bh, initial, lower, upper); optfilt, divmin = extractor(prob, BBO(tracemode=:silent)); triptic(img, optfilt)
Well that looks pretty great!