
Code Walkthrough: Inverse Modeling

In previous articles (here and earlier here) I gave a cursory overview of my current research into the California Current and of the technique of Inverse Modeling, so today I want to delve into the actual tools and code that I’ve developed. To review:

An inverse model seeks to solve a series of linear equations (Soetaert and Van Oevelen 2014) that encode our understanding of the nutrient flows in the ecosystem under study. For example, in our study we will be looking at carbon flows between plankton, fish, and various organic carbon pools (e.g. microzoa, sardines and detritus). This series of linear equations is then codified as a set of matrix equations.

$$A_e\,\vec{x} = \vec{b_e} \qquad (1)$$
$$A_a\,\vec{x} \simeq \vec{b_a} \qquad (2)$$
$$G\,\vec{x} \geq \vec{h} \qquad (3)$$

where $\vec{x}$ is the vector of unknown flows.

Without delving too far into the linear algebra, which is beyond the scope of this article, these sets of equations represent, in the simplest case, the mass balance equations (1); the various field measurements (2); and a wide variety of physiological constraints, such as growth rates and efficiencies, along with other observations (3).
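To make that concrete, here is a minimal toy problem posed in the same three-part form and handed to limSolve’s xsample() (my own illustration with made-up numbers, not the research model): one exact mass balance, one approximate measurement, and non-negativity constraints on two flows.

library(limSolve)

E = matrix(c(1, 1), nrow=1); F = 10	## Exact (1): x1 + x2 = 10 (mass balance)
A = matrix(c(1, 0), nrow=1); B = 7	## Approximate (2): x1 ~ 7 (a measurement)
G = diag(2); H = c(0, 0)		## Inequalities (3): x1 >= 0, x2 >= 0

toy = xsample(A=A, B=B, E=E, F=F, G=G, H=H, sdB=1, iter=3000, burninlength=500, type="mirror")
colMeans(toy$X)		## Mean of the sampled solutions for x1 and x2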

From here on out the focus will be on the code, and any discussion of the theory, mathematics or oceanography involved in the research will be limited to what is directly applicable to the code at hand. Links and notes for relevant aspects will be provided wherever possible. Let’s get into it then.

To provide a little bit of an overview, which I think is important whenever details might obscure larger scale structure, here is a diagram of the code hierarchy.

Basic structure of the code base shown without libraries or interconnectedness.

The primary entry point to the code is a file called ‘AnalysisHead.r’, which allows us to run the model, analyze the results and perform other large-scale tasks through its subordinate scripts. It also handles the primary UI (command line) and the help functionality, so its code is rather unpleasant to step through. I’ve opted to include the entire source code of all the files in the Appendix at the end of the article, and throughout I will use code snippets and representations to illustrate the process.

We’ll start at the first commands to be executed and move on from there.

require(compiler)
enableJIT(3)
compilePKGS(enable=TRUE)

args <- commandArgs(trailingOnly = TRUE)			## Arguments of the form: <id> <options>

head(args)

The first calls all have to do with optimization[1] and are mainly of interest to the esoteric few who find that stuff interesting (among whom I count myself), but after those comes the real start. We take the args from the command line and pass them to the head() function, which processes our input and starts the whole machine running. Using command-line arguments saves us from hard-coding everything and makes trial-and-error turnaround much faster.

Here is a sample command-line call that initiates, -i, a new model run for data from cycle 3 and saves the results as 001.

Rscript AnalysisHead.r -i 3 001
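With trailingOnly=TRUE, the args vector for the call above contains exactly the trailing tokens, which head() then dispatches on:

args
## [1] "-i"  "3"   "001"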

This script calls MCMC.r, which does all the setup for the model: reading in spreadsheets, organizing data and ultimately doing everything up to the point where the simulation is actually running. The majority of the code here simply picks values off the spreadsheet and saves them in an organized vector or matrix; for example, take a look at the code below.

  ## Read in the Ae matrix from spreadsheet
  Ae = t(data.matrix(sheetA)[,1:exact_eq])

  ## Pull specific values for h vector
  h[[12]] = sheetB[58]		#Smz-mn2 = Small Gr Min Res (100-surf)
  h[[13]] = sheetB[60]		#Lmz-mn2 = Lg Gr Min Res (100-surf)
  h[[14]] = sheetB[62]		#Gel-mn2 = Gel pred Min Res (100-surf)
  h[[15]] = sheetB[57]		#DeepSmz-mn2 = Sm Gr Min Res (450-100m)
  h[[16]] = sheetB[59]		#DeepLmz-mn2 = Lg Gr Min Res (450-100m)
  h[[17]] = sheetB[61]		#DeepGel-mn2 = GelPred Min Res (450-100m)
  h[[18]] = sheetB[56]		#Sar = Sardine Min Res
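For context, the spreadsheet those values are picked from is imported with XLConnect a few lines earlier; condensed here from the full source in the Appendix:

  library(XLConnect)

  ## Load the model workbook and read the cycle measurements as a numeric matrix
  WB = loadWorkbook("./Spreadsheets/EndToEnd2Layer_TK105.xls")
  sheetB = data.matrix(readWorksheet(WB, sheet="CycleData", 2, 4, 68, 20, colTypes="numeric"))
  sheetB[is.na(sheetB)] = 0.0	## Treat missing measurements as zeros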

Once MCMC.r has read in the model’s structure, the matrices $Aa$, $Ae$ and $G$ are built along with their respective vectors $\vec{ba}$, $\vec{be}$ and $\vec{h}$, which together codify all the constraints of the inverse model. The matrices and vectors are then passed off to Xsample, which is the workhorse of the Monte Carlo method employed in my work.

sol = xsample(A=Aa,B=ba, E=Ae, F=be, G=Gg, H=h, sdB=sdb, iter=iter_count, outputlength=iter_out,
              jmp=jmp, burninlength=iter_burn, type="mirror")

After a matter of minutes or hours, the Xsample script returns a data-frame[2] of possible model solutions along with the mean and standard deviation of all the solutions. The quantity of output, typically thousands of vectors, is the main asset of using a Monte Carlo method to explore the solution space of the model.
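Each row of the returned matrix sol$X is one feasible flow vector, so summarizing the ensemble amounts to simple column-wise reductions; essentially these lines from the end of the xsample source in the Appendix:

sol$avg = apply(sol$X, 2, mean)	## Mean of each flow across all solutions
sol$sd  = apply(sol$X, 2, sd)	## Standard deviation of each flow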

Let’s make sure we’re all on the same page; here is a script that should give a pretty good idea of what a Monte Carlo method is.

MC = function(n) {
    ## A = pi*r^2
    ## First, let's plot a circle
    x = seq(0,1,0.01)
    plot(x=x, y = sqrt(1-x^2), xlim=c(0,1), ylim=c(0,1), type='l', ylab="y")
    
    ## Then generate points and plot them
    naccept = 0
    for (i in 1:n) {
        x = runif(1) ## Random x
        y = runif(1) ## Random y
        if (x^2+y^2 < 1) { ## Does point lie inside circle?
            naccept = naccept + 1
            points(x,y, pch=3)
        }
        else
            points(x,y,col="red",pch=3)
    }
    text(x=0,y=0.1,paste(naccept*4.0/n))
}

There are numerous formulas[3] to calculate pi, but don’t think that makes it easy. Here is a personal favorite, the Chudnovsky algorithm[4]:

$$\frac{1}{\pi} = 12\sum_{k=0}^{\infty}\frac{(-1)^k\,(6k)!\,(545140134k+13591409)}{(3k)!\,(k!)^3\,640320^{3k+3/2}}$$

I’m pretty sure you need a master’s in theoretical mathematics just to make sense of formulas like that, so I’m going to calculate it with some brute force instead (i.e. a Monte Carlo simulation). To do this, all you need is to check whether a randomly generated point lies inside or outside of a circle. If my high school geometry class was right[5], a point is inside the unit circle if $x^2+y^2 < 1$.

The estimate with 100 points is 3.36.

By running the script with $n=10$ we test 10 points, and taking the ratio of points that land inside the circle to the total number of points (multiplied by 4, since we only sample one quadrant) gives an estimate of $\pi$. When only using 10 points, the estimated value is $\pi = 3.6$, but taking more samples improves the estimate. With 100,000 samples the script finds $\pi=3.1416$, which is accurate to about 0.0002 percent. Not bad for a method that uses random numbers and some simple geometry.
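If you’d rather see the convergence without waiting on 100,000 calls to points(), here is a vectorized variant of the same estimator (my own sketch, with the plotting omitted):

est_pi = function(n) {
    x = runif(n)	## n random points in the unit square
    y = runif(n)
    4 * mean(x^2 + y^2 < 1)	## Fraction inside the quarter circle, times 4
}

sapply(c(1e2, 1e4, 1e6), est_pi)	## Estimates tighten as n grows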

For my work there is no simple geometric analog, since instead of working in 2-D it is in 105-D; but nevertheless it works in much the same way. Xsample also uses some fancy algorithms to negotiate the region and does its best to give us an idea of the solution’s shape and size[6].
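The key trick is the mirror step (see mirror() in the xsample source in the Appendix): a Gaussian proposal that would violate an inequality constraint is reflected back across the constraint plane rather than rejected. Here is a deliberately simplified 1-D illustration of the idea, not the production code:

mirror_step = function(x, jmp) {
    x2 = rnorm(1, x, jmp)	## Gaussian proposal around the current point
    if (x2 < 0) x2 = -x2	## Reflect across the constraint boundary x >= 0
    x2
}

walk = Reduce(function(x, i) mirror_step(x, jmp=0.5), 1:1000, accumulate=TRUE)
range(walk)	## The chain wanders freely but never leaves x >= 0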

The majority of the Xsample script was written by Karel Van den Meersche et al.[7] and can be found in the limSolve package on CRAN[8]; the version I present here is optimized for my usage. Once the Xsample script returns, the MCMC.r function simply saves the data and generates some human-readable spreadsheets before closing.

save(sol, Aa,Ae, ba, be,Gg,h,sdb,cond, file=paste("./data/", cond$hash, ".RData",sep='') )

createSheet(newwb, name="Data")
createSheet(newwb, name="Sol")
createSheet(newwb, name="G")
createSheet(newwb, name="h")
createSheet(newwb, name="Aa")
createSheet(newwb, name="ba")
createSheet(newwb, name="Ae")
createSheet(newwb, name="be")

Thinking through and figuring out ways to make accessing the output as painless as possible can be daunting, but once done the payoff is tenfold.

Just as important as saving the output of the simulation is analyzing it. This task falls to Analysis.r, which generates nearly a dozen publication-quality graphs and charts, and all it requires is a simple command:

Rscript AnalysisHead.r -a <name>
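As one example of what it produces, here is the flow-correlation plot, condensed from the analyze() function in the Appendix (it assumes the saved solution object sol has been loaded from ./data/<hash>.RData):

library(lattice)

rgb.palette = colorRampPalette(c("blue", "white", "red"), space="rgb")
thisplot = levelplot(cor(sol$X), main="Correlation Between Flows",
                     xlab="Flow", ylab="Flow",
                     col.regions=rgb.palette(250), cuts=500, at=seq(-1, 1, 0.01))
print(thisplot)	## Lattice plots must be printed explicitly inside scripts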

So where does this leave us? We can now run the model, save the output and generate the analyses, but we still need to know what questions to ask. A good question is the basis for any productive scientific endeavor; even Lonergan knew that[9].


Notes

  1. The compiler package in R permits Just-In-Time (JIT) compilation of code. This is immensely useful for looping when vectorization is not practical. For a complete discussion on the optimization of my code, please see here.
  2. A data-frame is a type of data structure in R. See this page for a description.
  3. Wikipedia has an excellent description of the history of the calculation of $\pi$.
  4. This formula, the Chudnovsky Algorithm, still amazes me every time I see it (wiki).
  5. Looks like geometry class was right after all (wikipedia).
  6. The algorithm is actually a random walk Monte Carlo simulation where the next solution is correlated with the previous one with a known probability. This more advanced form of MC is useful when a bias towards more realistic solutions is required.
  7. Van den Meersche, K., Soetaert K., van Oevelen, D. (2009). xsample(): An R Function for Sampling Linear Inverse Problems. Journal of Statistical Software, Code Snippets, vol 30, 1-15.
  8. CRAN is a central repository of R packages, and the limSolve package can be found here.
  9. Lonergan’s philosophy is quite apposite (link).

Appendix

Here is the entire source code of the files for reference.

# AnalysisHead.r - 2015 Thomas Bryce Kelly
#
# This script is designed to be the main entrance point into the inverse modeling system. This script can be used to call:
#   * MCMC2.r:mcmc(cond) : loads the worksheet and calculates the xsample solution based on <cond>.
#   * Analysis.r:analyze(hash, list) : loads data from ./data/<hash>.RData file and generates the plots specified in <list>
#   * continueRun.r:continueRun(hash, cond) : loads data from ./data/<hash>.RData and runs xsample based on <cond> AND from last solution of loaded data.
# 
# <cond> : data.frame(cycle=NULL, iter_count=NULL, iter_out=NULL, iter_burn=NULL, jmp=NULL, hash=NULL)
# <list> : c(0,0,0,0,0,0,0)

require(compiler)
enableJIT(3)
compilePKGS(enable=TRUE)

args <- commandArgs(trailingOnly = TRUE)		## Arguments of the form: <id> <options>


head = function(args) {
setwd("~/Dropbox/FSU/Inverse Modeling")
suppressMessages(library("digest"))
suppressMessages(source("Analysis.r"))
suppressMessages(source("MCMC.r"))
suppressMessages(source("continueRun.r"))

########### All of the functions follow:

help = function() {
	print("To use the program please follow the outlines below")
	print("    -i <cycle> <hash>		- Runs mcmc with the specified conditions")
	print("    -a <hash>				- Analyzes the data in <hash>")
	print("    -c <hashs>				- Continues the runs in <hashs> with new conditions")
	print("    -c2 <hashs>				- Continues the runs in <hashs> with old conditions")
}

init = function(cycle=3, hash=digest(algo='crc32', runif(1))) {
	iter = 10000000
	cond = data.frame(cycle=as.numeric(cycle), iter_count=iter, iter_out=1000, iter_burn=iter/10, jmp=4, hash=hash)
	mcmc(cond)
	print(cond$hash)
}

## Continue Runs for elements of <l> ##
continue = function(l) {
	cond = data.frame(cycle=4, iter_count=2500000, iter_out=1000, iter_burn=1000000, jmp=4, hash="00000000")
	for (i in c(1:length(l))){
		continueRun(l[i],cond)
	}
}

## Continue Runs for elements of <l> based on original cond ##
continue2 = function(l) {
	for (i in c(1:length(l))){
		print(paste("Running continue run 1 for", i))
		continueRun2(l[i])
	}
}

## Run Analysis for elements of <l> (superseded by ana() below) ##

#for (i in c(1:length(l))) {
#  analyze(l[i], c(1,1,1,1,1,1,1))
#}

## Analyze set for elements in <l>
anset = function(l) {
	analyzeSet(l,plots=c(1,1,1,1,1,1,1))
}

## Analyze run data
ana = function(l) {
	for ( i in 1:length(l)) {
		analyze(l[i], c(1,1,1,1,1,1,1))
	}
}

####### The Command structure


if(length(args) < 1) {
	print("Please specify an option")
	help()
	return(invisible(NULL))	## Bail out before args[1] is referenced below
}

if (args[1] == '-h') { help() }
if (args[1] == '-i') {
	if (length(args) == 3) {
		init(args[2], args[3])
	} else {
		print("Insufficient argumetns")
		help()
	}
}
if (args[1] == '-a') {
	if (length(args) == 2) ana(c(args[2]))
	else {
		print("Insufficient argumetns")
		help()
	}
}
}

head(args)


#' MCMC2.r - 2015 Thomas Bryce Kelly
#' 
#' This script is the first stage of the inverse modeling workflow. Here the excel model is imported and the matrices and arrays are generated. Finally xsample() is called using <cond>. The important function calls are:
#'   * XLConnect:loadWorkbook(path) : Loads the workbook
#'   * XLConnect:readWorksheet(workbook, sheet, ...) : Reads data from workbook.
#'   * misc:log(string, cond) : This uses the method specified in misc.r to save to the ./misc.csv file the <string> followed by a list of values within <cond>. Useful for later reference.
#'   * Twitter.r:up(title, cond, name) : This function uploads the file <name>, logs the upload, and notifies via twitter that <title> was uploaded.
#'   * Twitter.r:notify(title, cond, name) : This function notifies via twitter that <name> has been uploaded (must be done separately) and logs the event.
#'   * save() : Standard system save(). By default model data is saved in ./data/<cond$hash>.RData.
#'   * xsample.r:xsample( <see fn call below> ) : Uses the data loaded by load(). By default it uses the last solution found in sol$X to initialize x0. This can be changed.
#' 
#' <cond> : data.frame(cycle=NULL, iter_count=NULL, iter_out=NULL, iter_burn=NULL, jmp=NULL, hash=NULL)

mcmc = function(cond=data.frame(cycle=1, iter_count=100000, iter_out=100, iter_burn=10000, jmp=1, hash=NULL)) {
  library(XLConnect)
  library(Matrix)
  library(MASS)
  library(ggplot2)
  library(Hmisc)
  library(limSolve)
  library(digest)
  library(diagram)
  
  source("xsample.r")
  source("Twitter.r")
  source("misc.r")
  
  ## Setup environment
  WB = loadWorkbook("./Spreadsheets/EndToEnd2Layer_TK105.xls")
  log(paste("Loaded Workbook",WB@filename),cond)
  start_cell = c(5,6)
  exact_eq = 23
  approx_eq = 14
  inequal = 79
  flows = 105
  
  if(is.null(cond$hash))
  	cond$hash = digest(algo="crc32", runif(1))	## Fallback hash; used later in the output file names
  
  ## Read in the data
  sheetA = readWorksheet(WB, sheet="AllFlows", start_cell[1], start_cell[2] ,start_cell[1] + flows, start_cell[2]+approx_eq+exact_eq+inequal+1, colTypes="numeric" )
  sheetB = readWorksheet(WB, sheet="CycleData", 2, 4,68, 20, colTypes="numeric" )
  lab = as.vector(readWorksheet(WB, sheet="AllFlows", start_cell[1], 1,start_cell[1]+flows, 1, colTypes="character" ))
  sheetB = data.matrix(sheetB)
  
  cycle_CI = as.numeric(sheetB[,2*cond$cycle])/2.0
  sheetB = sheetB[,2*cond$cycle-1]
  cycle_CI[is.na(cycle_CI)] = 0.0
  sheetB[is.na(sheetB)] = 0.0
  
  ## Setup A matrices
  Ae = t(data.matrix(sheetA)[,1:exact_eq])
  Aa = t(data.matrix(sheetA)[,(exact_eq+1):(exact_eq+approx_eq)])
  
  ##G matrix includes diagonal (so h also needs values)
  temp = matrix(0.0, nrow=flows, ncol=flows)
  for (i in 1:flows) { temp[i,i] = 1.0 }
  
  Gg = data.matrix(sheetA)
  Gg = Gg[,(exact_eq+approx_eq+1):(approx_eq+exact_eq+inequal)] #38-116
  Gg = t(cbind(Gg, t(temp)))
  
  ## Setup vectors ba, be, h
  be=list()
  ba=list()
  h=list()
  sdb=list()
  for (i in 1:length(Ae[,1])){
    be[i] = 0.0
  }
  for (i in 1:length(Aa[,1])) {
    ba[i] = 0.0
    sdb[i] = 1.0
  }
  for (i in 1:length(Gg[,1])){
    h[i] = 0.0
  }
  #Mass balance equations
  be[1] = sheetB[31]*sheetB[1]/sheetB[2]		## Delta = d(Chl)/mu * C14
  
  #Equality equations
  ba[[1]] = sheetB[1]		 #CnppPhy = C-14PP
  sdb[1] = ba[[1]]*0.06		## 6% Rel Error
  ba[[2]] = sheetB[5]		## Mzoo gr =micro gr
  #sdb[2] = cycle_CI[5]		## Error from Spreadsheet
  sdb[2] = ba[[2]]*0.43		## 31% Rel Error
  ba[[3]] = sheetB[6]		## Meso gr <1 = MesoGr<1
  #sdb[3] = cycle_CI[6]
  sdb[3] = ba[[3]]*0.43		## 31% Rel Error
  ba[[4]] = sheetB[7]		## Meso gr >1 = MesoGr>1
  #sdb[4] = cycle_CI[7]
  sdb[4] = ba[[4]]*0.43		## 31% Rel Error
  ba[[5]] = sheetB[10]		## St Flux = SedTrap POC Flux
  sdb[5] = cycle_CI[10]
  ba[[6]] = sheetB[11]		## Thorium	 = Thorium POC Flux
  sdb[6] = cycle_CI[11]
  ba[[7]] = sheetB[12]		## BacProd = BacProd (0-100m)
  sdb[7] = cycle_CI[12]
  ba[[8]] = sheetB[55]		## EpiMycRes = VM Epi Res
  sdb[8] = cycle_CI[55]
  ba[[9]] = sheetB[52]		## DeepMycRes = VM Mesopel Res
  sdb[9] = cycle_CI[52]
  ba[[10]] = sheetB[53]		## MycPoop = VM Poop
  sdb[10] = cycle_CI[53]
  ba[[11]] = sheetB[54]		## MycMort = VM Mort
  sdb[11] = cycle_CI[54]
  ba[[12]] = sheetB[49]		## NVMMycRes = NM Res
  sdb[12] = cycle_CI[49]
  ba[[13]] = sheetB[50]		## NVMMycPoop = NM Poop
  sdb[13] = cycle_CI[50]
  ba[[14]] = sheetB[51]		## NVMMycPoop = NM Mort
  sdb[14] = cycle_CI[51]
  
  #Respiration (relation to biomass)
  h[[12]] = sheetB[58]		#Smz-mn2 = Small Gr Min Res (100-surf)
  h[[13]] = sheetB[60]		#Lmz-mn2 = Lg Gr Min Res (100-surf)
  h[[14]] = sheetB[62]		#Gel-mn2 = Gel pred Min Res (100-surf)
  h[[15]] = sheetB[57]		#DeepSmz-mn2 = Sm Gr Min Res (450-100m)
  h[[16]] = sheetB[59]		#DeepLmz-mn2 = Lg Gr Min Res (450-100m)
  h[[17]] = sheetB[61]		#DeepGel-mn2 = GelPred Min Res (450-100m)
  h[[18]] = sheetB[56]		#Sar = Sardine Min Res
  
  #Deep Bac Prod
  h[[75]] = sheetB[13] * 110.6	 #Bac-mn = Bac Prod @100m	 --	 110.6 integration value from spreadsheet
  h[[76]] = sheetB[13] * -400			#Bac-mx
  
  #Fecal Pellet Flux
  h[[77]] = sheetB[63]		#FP-mn = Recog FP Flux
  
  #Vert Migration
  h[[78]] = sheetB[64]		#Mes-mn = VertMig Min*
  h[[79]] = sheetB[65]		#Gel-mn = VertMig Chaeto
  
  #Verify data condition
  ba = as.numeric(ba)
  be = as.numeric(be)
  h = as.numeric(h)
  sdb = as.numeric(sdb)
  
  #Calculate stdev vector for xsample
  for (i in 1:length(sdb)) {
    if ( sdb[i] <= sqrt(ba[i]^2)*0.05+1.0 ) { ## Place a lower bound of 5% + 1.0 on the stdev
      sdb[i] = sqrt(ba[i]^2)*0.05+1.0
    }
  }
  
  #Run xsample
  log("Starting run", cond)
  #try(tweet(paste("@TBryceKelly Starting cycle", cond$cycle, "run for", cond$iter_count, "iterations. Jump size is", cond$jmp, "and burnin period is", cond$iter_burn)))
  
  sol = xsample(A=Aa,B=ba, E=Ae, F=be, G=Gg, H=h, sdB=sdb, iter=cond$iter_count, outputlength=cond$iter_out, jmp=cond$jmp, burninlength=cond$iter_burn, type="mirror")
  sol$labels = lab
  log("Solution found.", cond)
  
  ##Save Data
  log(paste("Saving data to ./data/", cond$hash, ".RData",sep=''), cond)
  save(sol, Aa,Ae, ba, be,Gg,h,sdb,cond, file=paste("./data/", cond$hash, ".RData",sep='') )
  
  library(XLConnect)
  newwb = loadWorkbook(paste("./data/Solutions-",cond$hash,".xlsx", sep=''), create=TRUE)
  createSheet(newwb, name="Data")
  createSheet(newwb, name="Raw")
  createSheet(newwb, name="Approx")

  Ra=NULL
  Ra$X = apply(sol$X, 1, function(x) Aa %*% x)
  Ra$avg = apply(Ra$X, 1, mean)
  Ra$sd = apply(Ra$X, 1, sd)

  writeWorksheet(newwb, cbind(sol$avg,sol$sd), sheet="Data")
  writeWorksheet(newwb, sol$X, sheet="Raw")
  writeWorksheet(newwb, cbind(Ra$X,Ra$avg,Ra$sd) , sheet="Approx")
  saveWorkbook(newwb)
  
  typ = "character"
  for (i in 1:30) {
	typ = c(typ, "numeric")
  }

  WB2 = loadWorkbook("./Spreadsheets/End2End2LayerCompartments.xlsx")
  WB2_data = readWorksheet(WB2, sheet="Sheet1", 2, 1,29, 30, colTypes=typ )
  boxes = WB2_data[,1]
  flow = matrix(0.0, nrow=27, ncol=27)
    
    for(i in 1:(length(WB2_data[1,])-1) ) { ## 1:27
      for (j in 1:length(WB2_data[,1])) {  ## 1:27
        if (!is.na(WB2_data[j,i+1])) {
          flow[j,i] = sol$avg[WB2_data[j,i+1]]
        }
      }
    }
rownames(flow) = boxes
colnames(flow) = boxes
    
newwb = loadWorkbook(paste("./data/Flows-",cond$hash,".xlsx", sep=''), create=TRUE)
createSheet(newwb, name="Data")
createSheet(newwb, name="Sol")
createSheet(newwb, name="G")
createSheet(newwb, name="h")
createSheet(newwb, name="Aa")
createSheet(newwb, name="ba")
createSheet(newwb, name="Ae")
createSheet(newwb, name="be")

writeWorksheet(newwb, cbind(boxes,flow), sheet="Data")
writeWorksheet(newwb, cbind(sol$avg,sol$sd), sheet="Sol")
writeWorksheet(newwb, Gg, sheet="G")
writeWorksheet(newwb, t(h), sheet="h")
writeWorksheet(newwb, Aa, sheet="Aa")
writeWorksheet(newwb, t(cbind(ba,sdb)), sheet="ba")
writeWorksheet(newwb, Ae, sheet="Ae")
writeWorksheet(newwb, t(be), sheet="be")

saveWorkbook(newwb)
}

if (interactive())
	mcmc()


#'#==============================================================================
#'# xsample	: Samples linear problem with equality and inequality constraints ##
#'#==============================================================================
xsample <- function(A=NULL, B=NULL, E=NULL, F=NULL, G=NULL, H=NULL,
										sdB=NULL, W=1, iter=3000, outputlength = iter,
										burninlength = NULL, type="mirror", jmp=NULL,
										tol=sqrt(.Machine$double.eps), x0=NULL,
										fulloutput=FALSE, test=TRUE)	 {
	
#####		1. definition of internal functions		 #####

## function ensuring that a jump from q1 to q2
## fulfills all inequality constraints formulated in g and h
## gq=h can be seen as equations for planes that are considered mirrors.
## when a jump crosses one or more of these mirrors,
## the vector describing the jump is deviated according to rules of mirroring.
## the resulting new vector q will always be within the subspace of R^n for
## which all inequalities are met.
## also are the requirements for a MCMC met: the probability in the subspace
## is constant, the probability out of the subspace is 0.
## q1 has to fulfill constraints by default!
## Karel Van den Meersche 20070921

mirror <- function(q1,g,h,k=length(q1),jmp)	 {
	
	##if (any((g%*%q1)<h)) stop("starting point of mirroring is not in feasible space")
	q2 <- rnorm(k,q1,jmp)
	if (!is.null(g))		{
		
		residual <- g%*%q2-h
		q10 <- q1
		
		while (any(residual<0))	 {							#mirror
			epsilon <- q2-q10											#vector from q1 to q2: our considered light-ray that will be mirrored at the boundaries of the space
			w <- which(residual<0)								#which mirrors are hit?
			alfa <- {{h-g%*%q10}/g%*%epsilon}[w]	#alfa: at which point does the light-ray hit the mirrors? g*(q1+alfa*epsilon)-h=0
			whichminalfa <- which.min(alfa)
			j <- w[whichminalfa]									#which smallest element of alfa: which mirror is hit first?
			d <- -residual[j]/sum(g[j,]^2)				#add to q2 a vector d*Z[j,] which is oriented perpendicular to the plane Z[j,]%*%x+p; the result is in the plane.
			q2 <- q2+2*d*g[j,]										#mirrored point
			residual <- g%*%q2-h
			q10 <- q10+alfa[whichminalfa]*epsilon #point of reflection
		}
	}
	q2
}


norm <- function(x) sqrt(x%*%x)




#### 2. the xsample function ####

## conversions vectors to matrices and checks
if (is.data.frame(A)) A <- as.matrix(A)
if (is.data.frame(E)) E <- as.matrix(E)
if (is.data.frame(G)) G <- as.matrix(G)
if (is.vector(A)) A <- t(A)
if (is.vector(E)) E <- t(E)
if (is.vector(G)) G <- t(G)

if ( !is.null(A) )	 {
	lb <- length(B)
	lx <- ncol(A)
	## system overdetermined?
	M <- rbind(cbind(A,B),cbind(E,F))
	overdetermined <- !qr(M)$rank<=lx
	
	if (overdetermined & is.null(sdB)) {
		warning("The given linear problem is overdetermined. A standard deviation for the data vector B is incorporated in the MCMC as a model parameter.")
		estimate_sdB=TRUE
		A <- A*W
		B <- B*W
	} else {
		estimate_sdB=FALSE
		if (overdetermined)
			warning("The given linear problem is overdetermined. Giving fixed standard deviations for the data vector B can lead to dubious results. Maybe you want to set sdB=NULL and estimate the data error.")
		if (!length(sdB)%in%c(1,lb))
			stop("sdB does not have the correct length")
		A <- A/sdB
		B <- B/sdB					 # set sd = 1 in Ax = N(B,sd)
		
	}
} else {	#is.null A
	estimate_sdB=FALSE
}

 ## find a particular solution x0
if (is.null(x0))	{
	l <- lsei(A=A,B=B,E=E,F=F,G=G,H=H,type=2)
	if (l$residualNorm>1e-6)
		stop("no particular solution found;incompatible constraints")
	else
		x0 <- l$X
}
lx <- length(x0)

## additional checks for equalities, hidden in inequalities... (Karline S.)
if (test && !is.null(G))	 {
	xv <- varranges(E,F,G,H,EqA=G)
	ii <- which (xv[,1]-xv[,2]==0)
	if (length(ii)>0) { # if they exist: add regular equalities !
		E	 <- rbind(E,G[ii,])
		F	 <- c(F,xv[ii,1])
		
		G	 <- G[-ii,]
		H	 <- H[-ii]
		if (length(H)==0)
			G <- H <- NULL
	}
	xr <- xranges(E,F,G,H)
	ii <- which (xr[,1]-xr[,2]==0)
	if (length(ii)>0) { # if they exist: add regular equalities !
		dia <- diag(nrow=nrow(xr))
		E	 <- rbind(E,dia[ii,])
		F	 <- c(F,xr[ii,1])
	}
}

## Z is an orthogonal matrix for which E%*%Z=0;
## it can serve as basis for the null space of E.
## all solutions for the equalities have the form x = x0 + Zq
## with q a random vector.
## the vector q is varied in a random walk, using a MCMC with
## acceptance rate = 1. The inequality constraints Gx>H
## can be rewritten as Gx0-H + (GZ)q >0

if (!is.null(E))	{
	Z <- Null(t(E)); Z[abs(Z)<tol] <- 0	 #x=x0+Zq ; EZ=0
} else { Z <- diag(lx) }

if (length(Z)==0)	 {
	warning("the problem has a single solution; this solution is returned as function value")
	return(x0)
}
k <- ncol(Z)

if (!is.null(G))	 {
	g <- G%*%Z
	h <- H-G%*%x0																						 #gq-h>=0
	g[abs(g)<tol] <- 0
	h[abs(h)<tol] <- 0
	
} else { g <- G; h <- H }


if (!is.null(A))	 {
	a <- A%*%Z
	b <- B-A%*%x0													 #aq-b~=0
	v <- svd(a,nv=k)$v										 #transformation q <- t(v)q for better convergence
	a <- a%*%v														 #transformation a <- av
	if (!is.null(G)) g <- g%*%v						 #transformation g <- gv
	Z <- Z%*%v														 #transformation Z <- Zv
	
	## if overdetermined, calculate posterior distribution of S in Ax=N(B,S)
	## Marko Laine 2008, thesis on adaptive mcmc
	## S = 1/sd^2 of model
	## prior n0=lb
	## prior SSR0=n0*s0^2=sum((Ax0-B)^2)=sum(b^2)
	## if underdetermined: S=1
	## if overdetermined: S is sampled from a
	## posterior gamma distribution (Laine 2008) and
	## standard deviations of data are S^-.5
	if (estimate_sdB)		 {
		q0 <- lsei(a,b)$X
		SS0 <- sum((a%*%q0-b)^2)
		S <- lb/SS0
	} else {
		S <- 1
	}
	SSR <- function(q) sum({a%*%q-b}^2)						 # sum of squared residuals
	test <- function(q2) exp(-1.0*{SSR(q2)-SSR(q1)}) > runif(1)
	
} else {
	prob <- function(q) 1
	test <- function(q2) TRUE
	S <- 1
	overdetermined <- FALSE
}

outputlength <- min (outputlength,iter)
ou <- ceiling(iter/outputlength)

q1 <- rep(0,k)
x <- matrix(nrow=outputlength,ncol=lx,dimnames=list(NULL,colnames(A)))
x[1,] <- x0
naccepted <- 1
p <- vector(length=outputlength) # probability distribution
p[1] <- 1.0 ## prob(q1): prob() is only defined when A is NULL

if (fulloutput)		{
	q <- matrix(nrow=outputlength, ncol=k)
	q[1,] <- q1
}

if (is.null(jmp)) jmp <- automatedjump(a,b,g,h) # automatedjump(g,h)
if (type=="mirror") newq <- mirror

## ##################
## the random walk ##
## ##################
print("Starting Burn-in")
print(Sys.time())
for (i in 1:burninlength)	 {
	q2 <- newq(q1,g,h,k,jmp)
	if (test(q2)) q1 <- q2
}

print("Starting Random Walk")
print(Sys.time())

for (i in 2:outputlength)	 {
	for (ii in 1:ou)	{
		#if (estimate_sdB) ## sd is estimated using Laine 2008
		#	S <- rgamma(1,shape=lb,rate=0.5*(SS0+SSR(q1)))
		q2 <- newq(q1,g,h,k,jmp)
		if (test(q2)) {
			q1 <- q2
			naccepted <- naccepted+1
		}
	}
	if (i %% 10 == 0) print(i)
	x[i,] <- x0+Z%*%q1
	p[i] <- 1.0 ## prob(q1)
}
print(Sys.time())

## ##################
## end random walk ##
## ##################

xnames <- colnames(A)
if (is.null(xnames)) xnames <- colnames(E)
if (is.null(xnames)) xnames <- colnames(G)
colnames (x) <- xnames

xsample <- list(X=x,acceptedratio=naccepted/iter,p=p,jmp=jmp)

xsample$avg = apply(xsample$X, 2, mean)
xsample$sd = apply(xsample$X,2,sd)
xsample$med = apply(xsample$X, 2, median)
xsample$iqr = apply(xsample$X, 2, IQR)
print(Sys.time())
print(naccepted*1.0/iter)
if (naccepted < 1000) print(paste("Warning: total number of accepted solutions was only", naccepted))
return(xsample)
}


# Analysis.r - 2015 Thomas Bryce Kelly
# 
# This script is designed to run and generate the analyses based on saved xsample data. The important function calls here are:
#   * load() : The standard system load function. By default it assumes data is located in ./data/cond$hash.RData. This can be changed.
#   * png(tempname) : All plots generated will be saved as png format in ./img/tempname.
#   * misc.r:save2(name, cond) : This call is designed to upload and log a file. By default it will use the full relative path, <name>, when saving.
#   * Twitter.r:up(title, cond, name) : This function uploads the file <name>, logs the upload, and notifies via twitter that <title> was uploaded.
#   * Twitter.r:notify(title, cond, name) : This function notifies via twitter that <name> has been uploaded (must be done separately) and logs the event.
# 
# analyze() 
# analyzeSet() 
#
# <cond> : data.frame(cycle=NULL, iter_count=NULL, iter_out=NULL, iter_burn=NULL, jmp=NULL, hash=NULL)
# <list> : c(0,0,0,0,0,0,0)



library("digest")
library("lattice")
library("XLConnect")
library("diagram")
source("Twitter.r")
source("misc.r")

analyze = function(hash,plots) {
  load(paste("./data/", hash, ".RData", sep=""))
  flows = length(sol$avg)
  
  if(plots[1] == 1) {
    tempname = paste("./img/Cor",hash, ".png", sep='')
    png(tempname)
    rgb.palette <- colorRampPalette(c("blue","white", "red"), space = "rgb")
    thisplot=levelplot(cor(sol$X[]), main=paste("Correlation Between Flows: cycle", cond$cycle), xlab="Flow", ylab="Flow", col.regions=rgb.palette(250), cuts=500, at=seq(-1,1,0.01))
    print(thisplot)
    dev.off()
    ##Save and log the image.
    save2(tempname, cond)
    #notify("Cor", cond$cycle, tempname)
  }
  
  if(plots[2] == 1) {
    #res = data.frame(Aa %*% as.numeric(ba), apply(sol$X, 1, function(x) Aa %*% x))
    tempname = paste("./img/SurSol",hash, ".png", sep='')
    png(tempname, height=1280, width=720, res=120)
    
    dotchart(x = as.vector(sol$avg[1:54]), labels=sol$lab$flow, xlab="Carbon Flux (mg C /m^2 /d)", main=paste("Flows in Upper Layer: cycle", cond$cycle) , pch=16, xlim=c(0,max(sol$avg[1:54]+sol$sd[1:54])) )
    segments(sol$avg[1:54]-sol$sd[1:54], 1:flows, sol$avg[1:54]+sol$sd[1:54], 1:flows)
    #points(sol$med, 1:flows, col="red")
    #segments(sol$med-sol$iqr/2, 1:flows, sol$med+sol$iqr/2, 1:flows, col="red")
    dev.off()
    ##Save and log the image.
    save2(tempname, cond)
    notify("SurSol", cond, tempname)
  }
  
  if(plots[3] == 1) {
    tempname = paste("./img/GPPHist",hash, ".png", sep='')
    png(tempname, res=72)
    plot(c(1:length(sol$X[,1])), sol$X[,1], xlim=c(1,length(sol$X[,1])) ,type='l', main=paste("History of GPP: cycle",cond$cycle) , xlab=paste("Solution number (x",cond$iter_count/cond$iter_out,")"), ylab="GPP to Phy" )
    #dotchart(x = as.vector(sol$X[1:50]), labels=sol$labels$flow[1:50], main=paste("Estimate: cond$cycle ",1), pch=16)
    dev.off()
    ##Save and log the image.
    save2(tempname, cond)
    #notify("GPPH", cond, tempname)
  }
  
  if(plots[4] == 1) {
    Ra=NULL
    Ra$X = apply(sol$X, 1, function(x) Aa %*% x)
    Ra$avg = apply(Ra$X, 1, mean)
    Ra$sd = apply(Ra$X, 1, sd)
    
    tempname = paste("./img/Approx",hash, ".png", sep='')
    png(tempname)
    par(mar=c(5,5,5,5))
    plot(Ra$avg, pch=16, ylim=c(min(ba, Ra$avg),max(ba+sdb, Ra$avg+Ra$sd)), ylab=expression(Carbon~Flux~(mg~C~m^{-2}~d^{-1})) ,xlab="Approximate Equation" ,main=paste("Comparison of Quantities: cycle", cond$cycle))
    axis(side=1, at=c(seq(1,13,2)))
    segments(1:14-0.02, Ra$avg+Ra$sd, 1:14-0.02, Ra$avg-Ra$sd)
    segments(1:14+0.05, Ra$avg+Ra$sd, 1:14-0.05, Ra$avg+Ra$sd, col="black")
    segments(1:14+0.05, Ra$avg-Ra$sd, 1:14-0.05, Ra$avg-Ra$sd, col="black")
    points(ba, pch=4, col="red")
    segments(1:14+0.02, ba+sdb, 1:14+0.02, ba-sdb, col="red")
    segments(1:14+0.05, ba+sdb, 1:14-0.05, ba+sdb, col="red")
    segments(1:14+0.05, ba-sdb, 1:14-0.05, ba-sdb, col="red")
    legend(11,1000,c("Model", "Measured"), pch=c(16,4), col=c("black","red"))
    #points(Ra$avg, pch=5, col="red")
    dev.off()
    ##Save and log the image.
    save2(tempname, cond)
    #notify("Approx", cond, tempname)
  }
  if(plots[4] == 1) {
    Ra=NULL
    Ra$X = apply(sol$X, 1, function(x) Aa %*% x)
    Ra$avg = apply(Ra$X, 1, mean)
    Ra$sd = apply(Ra$X, 1, sd)
    
    tempname = paste("./img/Approx_Zoom",hash, ".png", sep='')
    png(tempname)
    par(mar=c(5,5,5,5))
    plot(Ra$avg[5:14], pch=16, ylim=c(0,max(ba[5:14]+sdb[5:14], Ra$avg[5:14]+Ra$sd[5:14])), xaxt="n", ylab=expression(Carbon~Flux~(mg~C~m^{-2}~d^{-1})) ,xlab="Approximate Equation" ,main=paste("Comparison of Quantities: cycle", cond$cycle))
    axis(side=1, at=c(1:10), labels=c(5:14))
    segments(1:10-0.02, Ra$avg[5:14]+Ra$sd[5:14], 1:10-0.02, Ra$avg[5:14]-Ra$sd[5:14])
    segments(1:10+0.05, Ra$avg[5:14]+Ra$sd[5:14], 1:10-0.05, Ra$avg[5:14]+Ra$sd[5:14], col="black")
    segments(1:10+0.05, Ra$avg[5:14]-Ra$sd[5:14], 1:10-0.05, Ra$avg[5:14]-Ra$sd[5:14], col="black")
    points(ba[5:14], pch=4, col="red")
    segments(1:10+0.02, ba[5:14]+sdb[5:14], 1:10+0.02, ba[5:14]-sdb[5:14], col="red")
    segments(1:10+0.05, ba[5:14]+sdb[5:14], 1:10-0.05, ba[5:14]+sdb[5:14], col="red")
    segments(1:10+0.05, ba[5:14]-sdb[5:14], 1:10-0.05, ba[5:14]-sdb[5:14], col="red")
    #legend(11,100,c("Model", "Measured"), pch=c(16,4), col=c("black","red"))
    #points(Ra$avg, pch=5, col="red")
    dev.off()
    ##Save and log the image.
    save2(tempname, cond)
    #notify("Approx", cond, tempname)
  }
  
  if(plots[5] == 1) {
    #GGElab = list()
    #GGElab[1] = "Phy"
    t1 = as.vector((sol$X[,2]+sol$X[,7]+sol$X[,8] )/sol$X[,1]) ##GGE = (gppTophy-phyToRes) / gppTOphy
    #GGElab[2] = "Hnf"
    t2 = 1-(sol$X[,11]+sol$X[,12]+sol$X[,13] )/(sol$X[,3]+sol$X[,43]+sol$X[,48]+sol$X[,45])
    #GGElab[3] = "Mic"
    t3 = 1-(sol$X[,14]+sol$X[,17]+sol$X[,18])/(sol$X[,4] + sol$X[,9] +sol$X[,49] + sol$X[,44] + sol$X[,46])
    #GGElab[4] = "Smz"
    t4 = 1-(sol$X[,21]+sol$X[,22]+sol$X[,23] +sol$X[,70] + sol$X[,71] + sol$X[,72])/(sol$X[,5] + sol$X[,10] + sol$X[,15]+sol$X[,50] + sol$X[,64] + sol$X[,68] + sol$X[,100])
    #GGElab[5] = "Lmz"
    t5 = 1-(sol$X[,27] + sol$X[,28]+sol$X[,29]+sol$X[,74]+ sol$X[,75] + sol$X[,76])/(sol$X[,6] + sol$X[,19] + sol$X[,16] + sol$X[,51] +  sol$X[,73] + sol$X[,69] + sol$X[,101])
    #GGElab[6] = "Deep Hnf"
    t6 = 1- (sol$X[,60] + sol$X[,61] + sol$X[,62])/(sol$X[,93]  +sol$X[,95] + sol$X[,98])
    #GGElab[7] = "Deep Mic"
    t7 = 1- (sol$X[,65] + sol$X[,66] + sol$X[,67]) / (sol$X[,63] + sol$X[,94] + sol$X[,96] + sol$X[,99])
    #GGElab[8] = "Gel"
    t8 = 1- (sol$X[,30] + sol$X[,31] + sol$X[,32] + sol$X[,80] + sol$X[,81] + sol$X[,82])/(sol$X[,24] + sol$X[,77])
    #GGElab[9] = "Bac"
    t9 = 1- (sol$X[,42])/(sol$X[,53])
    #GGElab[10] = "Deep Bac"
    t10 = 1- (sol$X[,92])/(sol$X[,103])
    
    GGElab = c("Hnf", "Mic", "Smz", "Lmz", "Deep Hnf", "Deep Mic", "Gel", "Bac", "Deep Bac")
    GGE = data.frame(t2,t3,t4,t5,t6,t7,t8,t9,t10)
    names(GGE) = GGElab[1:9]
    GGEavg = apply(GGE, 2, mean)
    GGEsd = apply(GGE, 2, sd)
    
    Gmin = c( .1, .1,.1,.1,.1,.1,.1,.05,.05)
    Gmax = c( .4,.4,.4,.4,.4,.4,.4,.3,.3)
    
    tempname = paste("./img/GGE",hash, ".png", sep='')
    png(tempname,res=64, height=620, width=720)
    par(mar=c(8,6,6,5))
    plot(x=c(1:9), y=GGEavg,cex.lab=1.8, ylim=c(0,0.5), col="red",yaxt='n', xaxt="n", xlab="", ylab="GGE ± SD", main=paste("Gross Growth Efficiencies: cycle", cond$cycle))
    axis(side=1, labels=GGElab, cex.axis=1.8 ,at=c(1:9), las=2)
    axis(side=2, at=c(0.1,0.2, 0.3, 0.4, 0.5), cex.axis=2)
    rect(1:9-.2, Gmin, 1:9+.2, Gmax,density=5, col="white", border="black", lty=4)
    segments(1:9,GGEavg-GGEsd , 1:9, GGEavg+GGEsd, col="red", lty=1)
    legend(7.4,.53,c("Model","Literature"), lty=c(1,4), col=c("red","black"), bty="n")
    dev.off()
    ##Save and log the image.
    save2(tempname, cond)
    notify("GGE", cond, tempname)
  }

	## Food Web
	  
  if(plots[6] == 1) {
    tempname = paste("./img/Web1",hash, ".png", sep='')
    png(tempname)
    typ = "character"
    for (i in 1:30) {
      typ = c(typ, "numeric")
    }
    
    WB2 = loadWorkbook("./Spreadsheets/End2End2LayerCompartments.xlsx")
    WB2_data = readWorksheet(WB2, sheet="Sheet1", 2, 1,29, 30, colTypes=typ )
    
    boxes = WB2_data[,1]
    
    flow = matrix(0.0, nrow=27, ncol=27)
    
    for(i in 1:(length(WB2_data[1,])-1) ) { ## 1:27
      for (j in 1:length(WB2_data[,1])) {  ## 1:27
        if (!is.na(WB2_data[j,i+1])) {
          flow[j,i] = sol$avg[WB2_data[j,i+1]]
        }
      }
    }
    
    rownames(flow) = boxes
    colnames(flow) = boxes
    
    
    
    #flow[30,] = colSums(flow)
    #flow[,30] = t(rowSums(flow))
    #plotweb(flow[1:16,1:16], legend=FALSE)
    plotweb(flow[1:27,1:27], legend=FALSE)
    dev.off()
    ##Save and log the image.
    save2(tempname, cond)
    notify("Web1", cond, tempname)
  }
  
  
  if(plots[7] == 1) {
    tempname = paste("./img/Dist", hash, ".png",sep='')
    png(tempname)
    xx = density(sol$X[1:500,1])
    xy = density(sol$X[501:1000,1])
    #plot(density(xx/mean(sol$X[,1])), main=paste("Distribution of GPP values: cycle",cycle), xlab="Carbon Flux")
    plot(xy, main=paste("Distribution of GPP values: cycle", cond$cycle), xlab="Carbon Flux", ylim=c(0,max(xy$y,xx$y)), xlim=c(min(xy$x,xx$x),max(xy$x,xx$x)), cex.lab = 1.6)
    polygon(xy, col="black")
    polygon(xx, col="red")
    lines(xy, col="black")
    legend(max(max(xy$x)*.95, max(xx$x)*.95),max(max(xy$y),max(xx$y)),c("First Set","Second Set"), lty=c(1,1), col=c("red","black"), bty="n")
    #lines(density(sol$X[,flows]/mean(sol$X[,flows])), col="red")
    dev.off()
    save2(tempname, cond)
    #notify("Dist", cond, tempname)
  }
  
  if(plots[2] == 1) {
    #res = data.frame(Aa %*% as.numeric(ba), apply(sol$X, 1, function(x) Aa %*% x))
    tempname = paste("./img/DeepSol",hash, ".png", sep='')
    png(tempname, height=1280, width=720, res=120)
    
    dotchart(x = as.vector(sol$avg[54:59]), labels=sol$lab$flow[54:59], xlab="Carbon Flux (mg C /m^2 /d)", main=paste("Flows in Deep Layer: cycle", cond$cycle) , pch=16, xlim=c(0,max(sol$avg[54:59]+sol$sd[54:59])) )
    
    #dotchart(x = as.vector(sol$avg[61:flows]), labels=sol$lab$flow[16:flows], xlab="Carbon Flux (mg C /m^2 /d)", main=paste("Flows in Deep Layer: cycle", cond$cycle) , pch=16, xlim=c(0,max(sol$avg[61:flows]+sol$sd[61:flows])) )
    segments(sol$avg[54:59]-sol$sd[54:59], 1:6, sol$avg[54:59]+sol$sd[54:59], 1:6)
    #points(sol$med, 1:flows, col="red")
    #segments(sol$med-sol$iqr/2, 1:flows, sol$med+sol$iqr/2, 1:flows, col="red")
    dev.off()
    ##Save and log the image.
    save2(tempname, cond)
    notify("DeepSol", cond, tempname)
  }
  
  
  if(plots[1] == 1 && plots[6] == 1) { ## Requires <flow>, which is built in the food-web block above
    tempname = paste("./img/CorFlow",hash,".png", sep='')
    png(tempname)
    rgb.palette <- colorRampPalette(c("blue","white", "red"), space = "rgb")
    thisplot=levelplot(cor(flow[]), main=paste("Correlation Between Compartments: cycle", cond$cycle), xlab="Flow", ylab="Flow", col.regions=rgb.palette(250), cuts=500, at=seq(-1,1,0.05))
    print(thisplot)
    dev.off()
    ##Save and log the image.
    save2(tempname, cond)
    #notify("Cor", cond$cycle, tempname)
  }
  
  if(plots[1] == 1) {
    tempname = paste("./img/SDTFlows-",hash,".png",sep='')
    png(tempname)
    rgb.palette <- colorRampPalette(c("blue","white", "red"), space = "rgb")
    sur = c(47,45,46,54)
    thisplot = levelplot(cor(sol$X[,sur]), 
              main=list(paste("Correlation of SDT flows: cycle", cond$cycle),cex=1), 
              xlab=list(label="Flow",cex=1.5), 
              ylab=list(label="Flow",cex=1.5), 
              col.regions=rgb.palette(250), 
              cuts=500, 
              at=seq(-1,1,0.01),
              colorkey = list(label=list(cex=1.3)),
              scales=list(x=list(cex=1.3, 
                                 at=c(1:4), 
                                 labels=c('to DOC','to HNF', 'to MIC', 'to Deep')),
                          y=list(cex=1.3), 
                          at=c(1:4), 
                          labels=c('to DOC','to HNF', 'to MIC', 'to Deep'))
    )
    print(thisplot)	## lattice plots are not drawn unless printed
    dev.off()
    
    tempname = paste("./img/LDTFlows-",hash,".png",sep='')
    png(tempname)
    sur = c(52,48,49,50,51,55)
    thisplot = levelplot(cor(sol$X[,sur]), 
              main=list(paste("Correlation of LDT flows: cycle", cond$cycle),cex=1), 
              xlab=list(label="Flow",cex=1.3), 
              ylab=list(label="Flow",cex=1.3),
              col.regions=rgb.palette(250), 
              cuts=500, 
              at=seq(-1,1,0.01),
              colorkey = list(label=list(cex=1.3)),
              scales=list(x=list(cex=1.1, 
                                 at=c(1:6), 
                                 labels=c('to DOC','to HNF','to MIC','to SMZ', 'to LMZ', 'to Deep')),
                          y=list(cex=1.1), 
                          at=c(1:6), 
                          labels=c('to DOC','to HNF','to MIC','to SMZ', 'to LMZ', 'to Deep'))
    )
    print(thisplot)
    
    dev.off()
  }
}




analyzeSet = function(l=NULL, plots) {
  hash = digest(algo="crc32", runif(1))
  
  if (plots[1] == 1) { ## plot of npp in each cycle
    tempname = paste("./img/NPP",hash,".png",sep='')
    png(tempname)
    par(mar=c(5.1, 7.1, 4.1, 2.1))
    plot(x=c(0:10000), y=c(0:10000), type='l',lty=2 , pch=16, ylim=c(0,3000),xlim = c(0,3000),cex.lab=1.4, ylab=expression(Model~Carbon~Flux~(mg~C~m^{-2}~d^{-1})),xlab=expression(Measured~Carbon~Flux~(mg~C~m^{-2}~d^{-1})) ,main="Net Primary Production +/- SD")
    for (i in c(1:length(l))) {
      if (!is.na(l[i])) {
        load(paste("./data/", l[i], ".RData", sep=''))
        RX = Aa %*% sol$avg
        RS = Aa %*% sol$sd
        
        points(ba[1], RX[1])
        text(ba[1]-50, RX[1]*1.07+70, labels=paste(cond$cycle))
        segments(ba[1]+sdb[1], RX[1], ba[1]-sdb[1], RX[1])
        segments(ba[1]+sdb[1], RX[1]*0.98, ba[1]+sdb[1], RX[1]*1.02)
        segments(ba[1]-sdb[1], RX[1]*0.98, ba[1]-sdb[1], RX[1]*1.02)
        segments(ba[1], RX[1]+RS[1], ba[1], RX[1]-RS[1], col="red")
        segments(ba[1]*0.98, RX[1]+RS[1], ba[1]*1.02, RX[1]+RS[1], col="red")
        segments(ba[1]*0.98, RX[1]-RS[1], ba[1]*1.02, RX[1]-RS[1], col="red")
      }
    }
    #legend(2600,800,c("Model", "Measured"),lty=1, col=c("black","red"))
    #points(Ra$avg, pch=5, col="red")
    dev.off()
    ##Save and log the image.
    save2(tempname, cond)
    notify("Approx", cond, tempname)
  }
  
  if (plots[2] == 1) { ## plot of Grazing in each cycle
    tempname = paste("./img/GrM",hash,".png",sep='')
    png(tempname)
    par(mar=c(5.1, 7.1, 4.1, 2.1))
    plot(x=c(0:10000), y=c(0:10000), type='l',lty=2 , pch=16, ylim=c(0,1800),xlim = c(0,1800),cex.lab=1.4, ylab=expression(Model~Grazing~Rate~(mg~C~m^{-2}~d^{-1})),xlab=expression(Measured~Grazing~Rate~(mg~C~m^{-2}~d^{-1})) ,main="Mic Grazing Rate +/- SD")
    for (i in c(1:length(l))) {
      if (!is.na(l[i])) {
        load(paste("./data/", l[i], ".RData", sep=''))
        RX = Aa %*% sol$avg
        RS = Aa %*% sol$sd
        
        points(ba[2], RX[2])
        text(ba[2]+40, RX[2]-150, labels=paste(cond$cycle))
        segments(ba[2]+sdb[2], RX[2]*0.98, ba[2]+sdb[2], RX[2]*1.02)
        segments(ba[2]+sdb[2], RX[2], ba[2]-sdb[2], RX[2])
        segments(ba[2]-sdb[2], RX[2]*0.98, ba[2]-sdb[2], RX[2]*1.02)
        segments(ba[2], RX[2]+RS[2], ba[2], RX[2]-RS[2], col="red")
        segments(ba[2]*0.98, RX[2]+RS[2], ba[2]*1.02, RX[2]+RS[2], col="red")
        segments(ba[2]*0.98, RX[2]-RS[2], ba[2]*1.02, RX[2]-RS[2], col="red")
      }
    }
    #legend(1200,400,c("Model", "Measured"),lty=1, col=c("black","red"))
    #points(Ra$avg, pch=5, col="red")
    dev.off()
      ##Save and log the image.
    save2(tempname, cond)
    notify("Approx", cond, tempname)
  }
  
  if (plots[2] == 1) { ## plot of SMZ grazing in each cycle
    tempname = paste("./img/GrSMZ",hash,".png",sep='')
    png(tempname)
    par(mar=c(5.1, 7.1, 4.1, 2.1))
    plot(x=c(0:10000), y=c(0:10000), type='l',lty=2 , pch=16, ylim=c(0,3200),xlim = c(0,3200),cex.lab=1.4, ylab=expression(Model~Grazing~Rate~(mg~C~m^{-2}~d^{-1})),xlab=expression(Measured~Grazing~Rate~(mg~C~m^{-2}~d^{-1})) ,main="SMZ Grazing Rate +/- SD")
    for (i in c(1:length(l))) {
      if (!is.na(l[i])) {
        load(paste("./data/", l[i], ".RData", sep=''))
        RX = Aa %*% sol$avg
        RS = Aa %*% sol$sd
        
        points(ba[3], RX[3])
        text(ba[3]+50, RX[3]*0.9-70, labels=paste(cond$cycle))
        segments(ba[3]+sdb[3], RX[3]*0.98, ba[3]+sdb[3], RX[3]*1.02)
        segments(ba[3]+sdb[3], RX[3], ba[3]-sdb[3], RX[3])
        segments(ba[3]-sdb[3], RX[3]*0.98, ba[3]-sdb[3], RX[3]*1.02)
        segments(ba[3], RX[3]+RS[3], ba[3], RX[3]-RS[3], col="red")
        segments(ba[3]*0.98, RX[3]+RS[3], ba[3]*1.02, RX[3]+RS[3], col="red")
        segments(ba[3]*0.98, RX[3]-RS[3], ba[3]*1.02, RX[3]-RS[3], col="red")
      }
    }
    #legend(250,70,c("Model", "Measured"),lty=1, col=c("black","red"))
    #points(Ra$avg, pch=5, col="red")
    dev.off()
    ##Save and log the image.
    save2(tempname, cond)
    notify("Approx", cond, tempname)
  }
  
  if (plots[2] == 1) { ## plot of LMZ grazing in each cycle    
    tempname = paste("./img/GrLMZ",hash,".png",sep='')
    png(tempname)
    par(mar=c(5.1, 7.1, 4.1, 2.1))
    plot(x=c(0:10000), y=c(0:10000), type='l',lty=2 , pch=16, ylim=c(0,600),xlim = c(0,600),cex.lab=1.4, ylab=expression(Model~Grazing~Rate~(mg~C~m^{-2}~d^{-1})),xlab=expression(Measured~Grazing~Rate~(mg~C~m^{-2}~d^{-1})) ,main="LMZ Grazing Rate +/- SD")
    for (i in c(1:length(l))) {
      if (!is.na(l[i])) {
        load(paste("./data/", l[i], ".RData", sep=''))
        RX = Aa %*% sol$avg
        RS = Aa %*% sol$sd
        
        points(ba[4], RX[4])
        text(ba[4]-25, RX[4]*1.0+30, labels=paste(cond$cycle))
        segments(ba[4]+sdb[4], RX[4]*0.98, ba[4]+sdb[4], RX[4]*1.02)
        segments(ba[4]+sdb[4], RX[4], ba[4]-sdb[4], RX[4])
        segments(ba[4]-sdb[4], RX[4]*0.98, ba[4]-sdb[4], RX[4]*1.02)
        segments(ba[4], RX[4]+RS[4], ba[4], RX[4]-RS[4], col="red")
        segments(ba[4]*0.98, RX[4]+RS[4], ba[4]*1.02, RX[4]+RS[4], col="red")
        segments(ba[4]*0.98, RX[4]-RS[4], ba[4]*1.02, RX[4]-RS[4], col="red")
      }
    }
    #legend(300,120,c("Model", "Measured"),lty=1, col=c("black","red"))
    #points(Ra$avg, pch=5, col="red")
    dev.off()
    ##Save and log the image.
    save2(tempname, cond)
    notify("Approx", cond, tempname)
  }
  
  if (plots[3] == 1) { ## plot of surface solutions 
    tempname = paste("./img/SurSols-",hash,".png",sep='')
    
    pts = c(5, 3, 18, 11, 2, 9, 14, 12, 4, 7, 13, 16, 17)
    png(tempname, height=880, width=720, res=120)
    load(paste("./data/", l[1], ".RData", sep=''))
    par(mar=c(4,4,5,8))
    dotchart(x = as.vector(sol$avg[2:8]), labels=sol$lab$flow[2:8], xlab="Carbon Flux (mg C /m^2 /d)", main="Impact of constraints on Phy Flows" , pch=16, xlim=c(0,max(sol$avg[2:7]+sol$sd[2:7])*2) )
    #segments(sol$avg[1:54]-sol$sd[1:54], 1:flows, sol$avg[1:54]+sol$sd[1:54], 1:flows)
    for(i in c(1:6)) {
      segments(sol$avg[i+1],i,sol$avg[i+2],i+1, col="black")
    }
    ptcol = colorRampPalette(c("red", "green","blue"))(n=10)
    for (i in c(2:length(l))) {
      if (!is.na(l[i])) {
        load(paste("./data/", l[i], ".RData", sep=''))
        points(as.vector(sol$avg[2:7]), c(1:6), pch=16, col=ptcol[i])
        for(k in c(1:6)) {
          segments(sol$avg[k+1],k,sol$avg[k+2],k+1, col=ptcol[i])
        }
        #segments(sol$avg[1:54]-sol$sd[1:54], 1:flows, sol$avg[1:54]+sol$sd[1:54], 1:flows, col=ptcol[i])
      }
    }
    legend(2800,7.7,c("1","2","3","4","5","6","7","8","9"),lty=1, col=ptcol)
    #points(Ra$avg, pch=5, col="red")
    dev.off()
    ##Save and log the image.
    save2(tempname, cond)
    notify("SetSurface", cond, tempname)
  }
  
  if (plots[4] == 1) { ## barplot
    tempname = paste("./img/PhyFlows-",hash,".png",sep='')
    png(tempname)
    poop_sur = NULL
    poop_deep = NULL
    ext = NULL
    
    for (i in c(1:length(l))) {
      if (!is.na(l[i])) {
        load(paste("./data/", l[i], ".RData", sep=''))
        poop_sur[i] = sol$avg[32]+sol$avg[36]+sol$avg[40]
        poop_deep[i] = sol$avg[82]+sol$avg[86]+sol$avg[90]
        #ext[i] = sol$avg[53]+sol$avg[99]+sol$avg[103]
        ext[i] = sol$avg[41]+sol$avg[33]+sol$avg[37]+sol$avg[83]+sol$avg[87]+sol$avg[91]
      }
    }
    par(mar=c(5,5,5,4))
    barplot(rbind(poop_sur,poop_deep,ext), col=c("blue1","blue4", "lightblue3"), main="Source of Carbon Export", ylim=c(0,sol$avg[1]-sol$avg[2]) ,density=NULL, names.arg=c(1:length(ext)), xlab="Equation", ylab="Carbon Flux (mg C m^-2 d^-1)")
    par(mar=c(3,3,3,4))
    legend(5,max(poop_sur+poop_deep+ext)*1.2,c("Fecal Matter (surface)","Fecal Matter (Deep)", "to HTL"),lty=1, col=c("blue1","blue4","lightblue3"), lwd=15)
    #points(Ra$avg, pch=5, col="red")
    dev.off()
    ##Save and log the image.
    save2(tempname, cond)
    notify("SetPhyFlow", cond, tempname)
  }
  
  
}
